A First Study on Temporal Dynamics of Topics on the Web.

WWW '16: 25th International World Wide Web Conference, Montréal, Québec, Canada, April 2016

Abstract
While much work has been devoted to understanding Web dynamics and using this knowledge to efficiently maintain the freshness of the indexes of generic search engines, the same is not true for domain-specific indexes constructed by focused crawlers. For the latter, the problem is compounded by the fact that it is important not only to keep already-crawled pages fresh, but also to identify new relevant content and expand the collection. In this paper, we discuss the challenges involved in this problem and describe our preliminary efforts in building a testbed to better understand the dynamics of specific topics and characterize how they evolve over time. We propose a data collection methodology and a set of experiments to answer important questions about the temporal dynamics and evolution of topics. We also present the results of an experimental analysis carried out on data collected over a period of four weeks for two distinct topics. These results suggest that topic-specific refreshing strategies can be beneficial for focused crawlers.