Active Information Retrieval for Linking Twitter Posts with Political Debates

2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA), 2015

Abstract
Users of microblogging social networks produce millions of short messages every day. Retrieving the information relevant to a particular event from this volume of data is not a trivial task. In this paper, we present a framework for retrieving Twitter posts that are relevant to a set of political debates. Our main contribution is a set of strategies for involving the user in the retrieval process: by presenting her with meaningful posts to label, the method achieves noticeably higher accuracy. The correct retrieval decisions or labels can be provided by an external information source such as a domain expert, or simulated with an oracle. A key aspect of active retrieval methods is to request labels for the instances that improve retrieval accuracy the most, while keeping the number of labeling requests to a minimum. The proposed strategies for selecting labeling requests make use of both the textual content of tweets and their structural information. The experimental results show the advantages of the proposed methods and the effectiveness of the selection strategies for involving the user in the retrieval process.
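The label-request loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: it assumes a bag-of-words cosine similarity as the text score, an uncertainty-style rule (pick the post whose score is closest to a decision threshold) as the selection strategy, and a simulated oracle for the labels. It ignores the structural information of tweets entirely. The names `active_retrieval`, `oracle`, `budget`, and `threshold` are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lower-cased token counts; a stand-in for real tweet features (assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def active_retrieval(query, tweets, oracle, budget, threshold=0.2):
    """Return (relevant indices, oracle answers) after at most `budget` label requests."""
    profile = bag_of_words(query)           # current description of the debate
    vectors = [bag_of_words(t) for t in tweets]
    labels = {}                             # index -> bool, as answered by the oracle

    for _ in range(budget):
        unlabeled = [i for i in range(len(tweets)) if i not in labels]
        if not unlabeled:
            break
        # Assumed selection strategy: the post whose score lies closest to the
        # decision threshold is treated as the most informative one to label.
        pick = min(unlabeled,
                   key=lambda i: abs(cosine(profile, vectors[i]) - threshold))
        labels[pick] = oracle(pick)
        if labels[pick]:
            # Fold confirmed-relevant text into the profile.
            profile.update(vectors[pick])

    relevant = set()
    for i, vec in enumerate(vectors):
        if i in labels:
            if labels[i]:
                relevant.add(i)
        elif cosine(profile, vec) >= threshold:
            relevant.add(i)
    return relevant, labels
```

Here `oracle` is any callable that maps a tweet index to `True` or `False`, which could wrap a human annotator or a set of gold labels. The `budget` parameter caps the number of labeling requests, reflecting the goal of keeping those requests to a minimum.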
Keywords
active learning,information retrieval,text similarity,feature extraction