Automatic online news topic ranking using media focus and user attention based on aging theory

CIKM, pp. 1033-1042 (2008)


Abstract

News topics, which are constructed from news stories using the techniques of Topic Detection and Tracking (TDT), make it convenient for users to see what is going on through the Internet. However, it is almost impossible to view all the generated topics because there are so many. So it will be helpful if all topics are ranked ...

Introduction
  • A new problem arises: how should news topics be ranked so that the top ones, which are both timely and important, are shown with high priority?
  • More factors must be taken into consideration: (1) every news story of a topic contributes to its importance, but that contribution decays along the timeline; (2) topics that attract more user attention should be ranked higher.
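
The first factor above, story contributions that decay over time, can be illustrated with a minimal sketch. This is not the paper's formula, which does not appear on this page: the exponential form, the `half_life_hours` parameter, and the function name `decayed_importance` are all assumptions chosen for illustration.

```python
import math


def decayed_importance(story_times, now, half_life_hours=24.0):
    """Sum per-story contributions, each halved every `half_life_hours`.

    Assumption: exponential decay. The paper's aging-theory formula is not
    reproduced here.
    """
    rate = math.log(2) / half_life_hours
    # Stories published after `now` are ignored.
    return sum(math.exp(-rate * (now - t)) for t in story_times if t <= now)


# Two hypothetical topics with three stories each (times in hours).
old_topic = [0.0, 1.0, 2.0]
fresh_topic = [46.0, 47.0, 48.0]
now = 48.0

# The topic with recent stories scores higher despite equal story counts.
print(decayed_importance(old_topic, now) < decayed_importance(fresh_topic, now))  # prints True
```

With equal story counts, the topic whose stories are recent outranks the older one, which is the behavior factor (1) asks for.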
Highlights
  • News stories are gathered from many Websites and organized into news topics by practical Web applications like Google News
  • How much do users like to read news stories about the topic? This factor is called user attention. Both media focus and user attention vary as time goes on, so the effect of time on topic ranking is already captured by the two factors
  • Through investigation of characteristics of topics, we have found out that topic ranking is determined by two primary factors: media focus and user attention
  • We propose a novel automatic online algorithm for news topic ranking based on an aging theory, using both media focus and user attention
  • The main contributions of this paper are twofold: (1) we present a quantitative measure of the inconsistency between media focus and user attention, which provides a basis for topic ranking and experimental evidence that there is a gap between what the media provide and what users view
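
To make the idea of an inconsistency between media focus and user attention concrete, here is a rough sketch that compares the two topic orderings. It is only an illustration: the paper's own quantitative measure is not given on this page, and the rank-gap statistic below (a normalized Spearman footrule) along with the names `rank_positions` and `rank_inconsistency` are assumptions.

```python
def rank_positions(scores):
    """Map each topic to its 0-based rank, highest score first (ties by name)."""
    ordered = sorted(scores, key=lambda topic: (-scores[topic], topic))
    return {topic: pos for pos, topic in enumerate(ordered)}


def rank_inconsistency(media_focus, user_attention):
    """Mean rank disagreement scaled to [0, 1]; 0 means identical orderings.

    Assumption: a normalized Spearman footrule, not the paper's measure.
    """
    topics = sorted(set(media_focus) & set(user_attention))
    if len(topics) < 2:
        return 0.0
    media_rank = rank_positions({t: media_focus[t] for t in topics})
    user_rank = rank_positions({t: user_attention[t] for t in topics})
    total_gap = sum(abs(media_rank[t] - user_rank[t]) for t in topics)
    # The largest possible total gap (fully reversed order) is floor(n^2 / 2).
    max_gap = (len(topics) ** 2) // 2
    return total_gap / max_gap


# Hypothetical scores: the media and the users order the topics oppositely.
media = {"a": 3.0, "b": 2.0, "c": 1.0}
users = {"a": 1.0, "b": 2.0, "c": 3.0}
print(rank_inconsistency(media, users))  # prints 1.0
```

A value near 0 would mean the media and users agree on which topics matter; a value near 1 would indicate the kind of gap the bullet above describes.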
Methods
  • Preliminary experiments are first performed on a training dataset to find proper parameter values.
  • The inconsistency between media focus and user attention is then calculated and the results are analysed.
  • Finally, the topic ranking results are discussed.
Results
  • TestingSet is used to perform the automatic online news topic ranking experiment.
  • Topics are ranked in every time slot online automatically, using the method described in Section 4.3.
  • Snippets of the latest news story are shown as the summaries of topics.
  • In this way, the up-to-date status of a topic can be viewed.
  • It is worth noting that media focus and user attention curves are provided so that users can see the topic trends.
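
The per-time-slot online ranking described above can be sketched as an incremental update. The method of Section 4.3 is not reproduced on this page, so everything here is an assumption made for illustration: the class name `OnlineTopicRanker`, the multiplicative `decay` per slot, the linear `media_weight` combination, and the use of raw story and click counts (a real system would likely normalize the two signals first).

```python
from collections import defaultdict


class OnlineTopicRanker:
    """Keeps decayed media-focus and user-attention scores per topic."""

    def __init__(self, decay=0.8, media_weight=0.5):
        self.decay = decay                # assumed fraction of score kept per slot
        self.media_weight = media_weight  # assumed weight of media focus in the mix
        self.media = defaultdict(float)
        self.users = defaultdict(float)

    def advance_slot(self, new_stories, new_clicks):
        """Age every score by one time slot, then add this slot's counts."""
        topics = set(self.media) | set(self.users) | set(new_stories) | set(new_clicks)
        for topic in topics:
            self.media[topic] = self.decay * self.media[topic] + new_stories.get(topic, 0)
            self.users[topic] = self.decay * self.users[topic] + new_clicks.get(topic, 0)

    def ranking(self):
        """Return topics ordered by the weighted sum of both scores."""
        w = self.media_weight
        score = {t: w * self.media[t] + (1 - w) * self.users[t] for t in self.media}
        return sorted(score, key=lambda t: (-score[t], t))


ranker = OnlineTopicRanker(decay=0.5, media_weight=0.5)
ranker.advance_slot({"a": 4}, {"a": 2})  # slot 1: only topic "a" is active
ranker.advance_slot({"b": 2}, {"b": 2})  # slot 2: "a" ages, "b" appears
print(ranker.ranking())  # prints ['b', 'a']
```

Because scores are aged at every slot, no separate time term is needed in the ranking, which mirrors the summary's point that time is already reflected in the two factors.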
Conclusion
  • The authors propose a novel automatic online algorithm for news topic ranking based on an aging theory, using both media focus and user attention.
  • Both media focus and user attention vary as time goes on, so the effect of time on topic ranking is already included.
  • Empirical evaluation of the topic ranking results indicates that the proposed algorithm reflects the influence of time, the media and users.
Tables
  • Table 1: Contingency
  • Table 2: Top 10 topics on search engine related companies at 8:00 a.m., Oct 26, 2007
Related work
  • Topic detection and tracking (TDT) is intended to structure news stories from newswires and broadcasts into topics [1]. Approaches in TDT were mainly variants and improvements of the single-pass method and agglomerative clustering algorithms [2, 3, 7, 14, 15, 16, 19, 21, 22, 23]. Although [3] concluded that time information “did not help” improve new event detection results, some recent work has used the aging theory or timeline analysis and achieved good performance in TDT and hot topic extraction [4, 5]. State-of-the-art TDT techniques are used to generate topics from news stories in our work. We also apply the aging theory both in the TDT process and in the calculation of media focus and user attention. However, traditional TDT tasks [1] are not the main focus of our work.
Funding
  • This work is supported by the Chinese National Key Foundation Research & Development Plan (2004CB318108), Natural Science Foundation (60621062, 60503064, 60736044) and National 863 High Technology Project (2006AA01Z141)
Reference
  • http://www.nist.gov/speech/tests/tdt/
  • J. Allan, R. Papka, and V. Lavrenko. On-line new event detection and tracking. In Proceedings of the 21st Annual International ACM SIGIR Conference, Melbourne, Australia. ACM Press. 1998, 37-45.
  • T. Brants, F. Chen, and A. Farahat. A System for New Event Detection. In Proceedings of the 26th Annual International ACM SIGIR Conference, New York, NY, USA. ACM Press. 2003, 330-337.
  • C.C. Chen, Y.T. Chen, Y. Sun and M.C. Chen. Life Cycle Modeling of News Events Using Aging Theory. In Proceedings of 14th European Conference of Machine Learning (ECML ’03), pp. 47-59, 2003.
  • K. Y. Chen, L. Luesukprasert and S. T. Chou. Hot topic extraction based on timeline analysis and multi-dimensional sentence modeling. IEEE Trans. on Knowledge and Data Engineering, 2007, 19(8):1016-1025.
  • H. L. Chieu and Y. K. Lee. Query Based Event Extraction along a Timeline. In Proceedings of the 27th Annual International ACM SIGIR Conference, Sheffield, UK, ACM Press. 2004, 425-432.
  • M. Connell, A. Feng, G. Kumaran, H. Raghavan, C. Shah, and J. Allan. UMass at TDT 2004. In 2004 Topic Detection and Tracking Workshop (TDT’04), 2004.
  • G.P.C. Fung, J.X. Yu, H. Liu and P.S. Yu. Time-Dependent Event Hierarchy Construction. In Proceedings of KDD2007, pages 300-309, California, USA, 2007.
  • G.P.C. Fung, J.X. Yu, P.S. Yu and H. Liu. Parameter free bursty events detection in text streams. In Proceedings of the 31st VLDB Conference, pages 181-192, Trondheim, Norway, 2005.
  • Q. He, K. Chang, and E. P. Lim. Analyzing Feature Trajectories for Event Detection. In Proceedings of the 30th Annual International ACM SIGIR Conference, Amsterdam, the Netherlands. ACM Press. 2007, 207-214.
  • Q. He, K. Chang and E. P. Lim. Using Burstiness to Improve Clustering of Topics in News Streams. In Proceedings of the 7th IEEE International Conference on Data Mining, pp. 493-498, 2007.
  • T. He, G. Qu, S. Li, et al. Semi-automatic Hot Event Detection. In Proceedings of the 2nd International Conference on Advanced Data Mining and Applications. 2006, LNAI4093, 1008-1016.
  • J. Kleinberg. Authoritative sources in a hyperlinked environment. Proc. 9th ACM-SIAM Symposium on Discrete Algorithms, 1998.
  • G. Kumaran and J. Allan. Text Classification and Named Entities for New Event Detection. In Proceedings of the 27th Annual International ACM SIGIR Conference, Sheffield, UK, ACM Press. 2004, 297-304.
  • M. Spitters and W. Kraaij. TNO at TDT2001: Language Model-Based Topic Detection. Topic Detection and Tracking Workshop Report, 2001.
  • N. Stokes and J. Carthy. Combining Semantic and Syntactic Document Classifiers to Improve First Story Detection. In Proceedings of the 24th Annual International ACM SIGIR Conference, New Orleans. ACM Press. 2001, 424-425.
  • R. Swan and J. Allan. Extracting Significant Time Varying Features from Text. In Proceedings of the 8th Conference on Information and Knowledge Management, pages 38-45, 1999.
  • R. Swan and J. Allan. Automatic Generation of Overview Timelines. In Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 49-56, 2000.
  • D. Trieschnigg and W. Kraaij. Hierarchical topic detection in large digital news archives. In Proceedings of the 5th Dutch Belgian Information Retrieval workshop, 2005.
  • Y. Wang, Y. Liu, M. Zhang, and S. Ma. Identify Temporal Websites Based on User Behavior Analysis. In Proceedings of 3rd International Joint Conference on Natural Language Processing, Hyderabad, India, 2008.
  • C. Wang, M. Zhang, S. Ma and L. Ru. Automatic online news issue construction in Web environment. In Proceedings of the 17th International Conference on World Wide Web, 2008, 457-466.
  • Y. Yang, T. Pierce, and J. Carbonell. A Study of Retrospective and On-line Event Detection. In Proceedings of the 21st Annual International ACM SIGIR Conference, Melbourne, Australia. ACM Press. 1998, 28-36.
  • K. Zhang, J. Li, and G. Wu. New Event Detection Based on Indexing-tree and Named Entity. In Proceedings of the 30th Annual International ACM SIGIR Conference, Amsterdam, the Netherlands. ACM Press. 2007, 215-222.
  • Y. Zhao and G. Karypis. Criterion Functions for Document Clustering. Technical Report, 2005.