Probabilistic Text Modeling With Orthogonalized Topics

SIGIR '14: The 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, Gold Coast, Queensland, Australia, July 2014

Cited by 13 | Viewed 178
Abstract
Topic models have been widely used for text analysis and have enjoyed great success in mining the latent topic structure of text documents. However, while much effort has gone into endowing the resulting document-topic distributions with various desirable properties, little attention has been paid to the resulting topic-word distributions. Since the topic-word distributions also play an important role in modeling performance, topic models that emphasize only the document-topic representations while neglecting the topic-term distributions are limited. In this paper, we propose the Orthogonalized Topic Model (OTM), which imposes an orthogonality constraint on the topic-term distributions. We also propose a novel model-fitting algorithm based on the generalized Expectation-Maximization algorithm and the Newton-Raphson method. A quantitative evaluation on text classification demonstrates that OTM outperforms other baseline models and indicates the important role played by topic orthogonalization.
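To make the orthogonality idea concrete, the following is a minimal illustrative sketch (not the paper's exact formulation): it measures how far a set of topic-word distributions is from mutual orthogonality by summing squared pairwise inner products between topic rows. The function name and the penalty form are assumptions for illustration; an orthogonality-constrained model would drive such a penalty toward zero during fitting.

```python
import numpy as np

def orthogonality_penalty(beta):
    """Sum of squared pairwise inner products between distinct topic rows.

    beta is a K x V matrix; row k is the word distribution of topic k.
    A value of 0 means the topic-word vectors are mutually orthogonal;
    larger values indicate overlapping topics.
    """
    gram = beta @ beta.T                      # K x K inner products
    off_diag = gram - np.diag(np.diag(gram))  # zero out the diagonal
    return float(np.sum(off_diag ** 2))

# Two topics with disjoint word supports are exactly orthogonal.
beta_orthogonal = np.array([[0.5, 0.5, 0.0, 0.0],
                            [0.0, 0.0, 0.5, 0.5]])
# Two topics sharing a word are not.
beta_overlapping = np.array([[0.5, 0.5, 0.0, 0.0],
                             [0.0, 0.5, 0.5, 0.0]])

print(orthogonality_penalty(beta_orthogonal))   # 0.0
print(orthogonality_penalty(beta_overlapping))  # 0.125
```

The overlapping pair shares the word at index 1, giving an inner product of 0.25 between the two rows and a nonzero penalty, whereas the disjoint pair scores zero.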
Keywords
Probabilistic Text Modeling,Latent Semantic Analysis,Text Classification