Balancing Efficiency And Effectiveness For Fusion-Based Search Engines In The 'Big Data' Environment

Information Research: An International Electronic Journal (2016)

Abstract
Introduction. In the big data age, we have to deal with a tremendous amount of information, which can be collected from various types of sources. For information search systems such as Web search engines or online digital libraries, the collection of documents keeps growing. For some queries, an information search system needs to retrieve a large number of documents. On the other hand, people are very often willing to visit only a few top-ranked documents. Therefore, how to develop an information search system with desirable efficiency and effectiveness is a research problem.

Method. In this paper, we focus on the data fusion approach to information search, in which each component search model contributes a result and all the results are combined by a fusion algorithm. Through empirical study, we are able to find a feasible combination method that balances effectiveness and efficiency in the context of data fusion.

Analysis. Balancing effectiveness and efficiency is a multi-objective optimisation problem. To address it, we need to understand how these two factors affect each other and to what extent.

Results. Using some groups of historical runs from TREC to carry out the experiment, we find that by using much less information (e.g., less than 10% of the documents in the experiment), good efficiency is achievable with marginal loss in effectiveness.

Conclusions. We consider that the findings from our experiment are informative and can be used as a guideline for providing a more efficient search service in the big data environment.
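The trade-off described in the abstract can be illustrated with a small sketch. The abstract does not name the fusion algorithm used, so the code below assumes CombSUM, a common score-based fusion method, with min-max score normalisation; the function names and the `depth` parameter are hypothetical and chosen only for illustration. Each component result is cut to its top `depth` documents before fusion, so less information is processed at the possible cost of some effectiveness.

```python
from collections import defaultdict


def min_max_normalise(run):
    """Scale the scores of one ranked list into [0, 1]."""
    if not run:
        return {}
    scores = [score for _, score in run]
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores are equal, so give every document the same weight.
        return {doc: 1.0 for doc, _ in run}
    return {doc: (score - lo) / (hi - lo) for doc, score in run}


def combsum(runs, depth=None):
    """Fuse several ranked lists by summing normalised scores (CombSUM).

    runs  : list of component results, each a list of (doc_id, score) pairs
    depth : assumed truncation parameter; only the top `depth` documents of
            each component result are used. None means use everything.
    """
    fused = defaultdict(float)
    for run in runs:
        ranked = sorted(run, key=lambda pair: pair[1], reverse=True)
        truncated = ranked if depth is None else ranked[:depth]
        # Normalisation sees only the truncated list, mirroring a system
        # that never retrieves the discarded documents.
        for doc, score in min_max_normalise(truncated).items():
            fused[doc] += score
    # Highest fused score first; ties are broken by document id.
    return sorted(fused.items(), key=lambda pair: (-pair[1], pair[0]))
```

Calling `combsum(runs)` fuses the complete lists, while `combsum(runs, depth=k)` fuses only the top `k` documents from each component. Comparing the two outputs under an effectiveness measure is one way to observe how much is lost as `depth` shrinks.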