Shuffling a Stacked Deck: The Case for Partially Randomized Ranking of Search Engine Results

VLDB '05: Proceedings of the 31st International Conference on Very Large Data Bases (2005)

Cited by 98 | Views 68
Abstract
In-degree, PageRank, number of visits and other measures of Web page popularity significantly influence the ranking of search results by modern search engines. The assumption is that popularity is closely correlated with quality, a more elusive concept that is difficult to measure directly. Unfortunately, the correlation between popularity and quality is very weak for newly-created pages that have yet to receive many visits and/or in-links. Worse, since discovery of new content is largely done by querying search engines, and because users usually focus their attention on the top few results, newly-created but high-quality pages are effectively "shut out," and it can take a very long time before they become popular. We propose a simple and elegant solution to this problem: the introduction of a controlled amount of randomness into search result ranking methods. Doing so offers new pages a chance to prove their worth, although clearly using too much randomness will degrade result quality and annul any benefits achieved. Hence there is a tradeoff between exploration to estimate the quality of new pages and exploitation of pages already known to be of high quality. We study this tradeoff both analytically and via simulation, in the context of an economic objective function based on aggregate result quality amortized over time. We show that a modest amount of randomness leads to improved search results.
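To make the exploration/exploitation idea concrete, below is a minimal sketch of one way a partially randomized ranker could be built. This is not the paper's algorithm or evaluation setup; the epsilon-style mixing rule, the field names ("score", "is_new"), the constants, and the function name are illustrative assumptions only.

```python
import random

# A minimal sketch of partially randomized ranking. Hypothetical assumptions:
# each candidate page carries a deterministic relevance/popularity score
# ("score") and a freshness flag ("is_new"); EPSILON and TOP_K are
# illustrative constants, not values from the paper.

EPSILON = 0.1   # fraction of top slots devoted to exploration
TOP_K = 10      # number of result slots users are assumed to inspect

def partially_randomized_ranking(pages, epsilon=EPSILON, top_k=TOP_K, seed=None):
    """Order results mostly by score (exploitation), but with probability
    `epsilon` fill a top slot with a randomly chosen new page (exploration),
    giving fresh content a chance to accumulate visits and in-links."""
    rng = random.Random(seed)
    by_score = sorted(pages, key=lambda p: p["score"], reverse=True)
    new_pool = [p for p in pages if p.get("is_new")]

    ranking, used = [], set()
    for _ in range(min(top_k, len(by_score))):
        fresh = [p for p in new_pool if id(p) not in used]
        if fresh and rng.random() < epsilon:
            pick = rng.choice(fresh)  # explore: promote a new page
        else:
            pick = next(p for p in by_score if id(p) not in used)  # exploit
        used.add(id(pick))
        ranking.append(pick)

    # Remaining results keep their deterministic score order.
    ranking.extend(p for p in by_score if id(p) not in used)
    return ranking

# Example usage with toy data:
pages = [
    {"url": "established.example/a", "score": 0.92, "is_new": False},
    {"url": "established.example/b", "score": 0.85, "is_new": False},
    {"url": "fresh.example/c",       "score": 0.10, "is_new": True},
]
print([p["url"] for p in partially_randomized_ranking(pages, seed=42)])
```

The mixing parameter plays the role the abstract describes: too little randomness and new pages stay shut out; too much and aggregate result quality degrades, which is the tradeoff the paper analyzes under its amortized-quality objective.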
Keywords
new page, aggregate result quality, high quality, result quality, improved search result, modern search engine, querying search engine, search result, search result ranking method, Web page popularity, randomized ranking, search engine result