Probabilistic Approaches to Controversy Detection
ACM International Conference on Information and Knowledge Management (2016)
Abstract
Recently, the problem of automated controversy detection has attracted a lot of interest in the information retrieval community. Existing approaches to this problem have set forth a number of detection algorithms, but there has been little effort to model controversy directly.
In this paper, we propose a probabilistic framework to detect controversy on the web, and investigate two models. We first recast a state-of-the-art controversy detection algorithm into a model in our framework. Based on insights from social science research, we also introduce a language modeling approach to this problem. We extensively evaluate different methods of creating controversy language models based on a diverse set of public datasets including Wikipedia, Web and News corpora.
Based on insights from social science research, we build language models of controversy and introduce formal models of both prior work and our own.
Our automatically derived language models show a significant relative improvement of 18% in AUC over prior work, and 23% over two manually curated lexicons.
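The abstract does not spell out how a controversy language model scores a document. As a minimal sketch, assuming a unigram model with additive smoothing and a log-likelihood ratio against a background model (both are assumptions for illustration, not necessarily the paper's formulation), the idea could look like this. The toy corpora and the names `unigram_lm` and `controversy_score` are hypothetical.

```python
import math
from collections import Counter

def unigram_lm(docs, vocab, alpha=1.0):
    """Estimate a unigram language model over `vocab` with additive smoothing."""
    counts = Counter(tok for d in docs for tok in d.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def controversy_score(doc, lm_controversy, lm_background):
    """Sum of per-token log-likelihood ratios; out-of-vocabulary tokens are skipped."""
    tokens = [t for t in doc.lower().split() if t in lm_controversy]
    return sum(math.log(lm_controversy[t]) - math.log(lm_background[t])
               for t in tokens)

# Toy corpora (hypothetical); real models would be built from large collections.
controversial_docs = ["abortion debate sparks protest",
                      "protest over gun control debate"]
background_docs = ["recipe for apple pie",
                   "weather forecast for the weekend"]

vocab = {t for d in controversial_docs + background_docs for t in d.lower().split()}
lm_c = unigram_lm(controversial_docs, vocab)
lm_bg = unigram_lm(background_docs, vocab)

score_pos = controversy_score("debate protest", lm_c, lm_bg)
score_neg = controversy_score("apple pie recipe", lm_c, lm_bg)
print(score_pos, score_neg)
```

A positive score means the document's vocabulary is more likely under the controversy model than under the background model; a decision threshold would have to be tuned on labeled data.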
Keywords
controversy detection, language model, information retrieval