Focused Topic Models


引用 1|浏览34
暂无评分
Abstract
We present the focused topic model (FTM), a family of nonparametric Bayesian models for learning sparse topic mixture patterns. The FTM integrates desirable features from both the hierarchical Dirichlet process (HDP) and the Indian buffet process (IBP), allowing an unbounded number of topics for the entire corpus while each document maintains a sparse distribution over these topics. We observe that the HDP assumes correlation between the global prevalence of a topic and its prevalence within a document, and note that such a relationship may be undesirable. By using an IBP to select which topics contribute to a document, and an unnormalized Dirichlet process to determine how much of the document is generated by each selected topic, the FTM decouples these probabilities, allowing for more flexible modeling. Experimental results on three text corpora demonstrate superior performance over the hierarchical Dirichlet process topic model.
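The decoupling described in the abstract, separating which topics a document uses from how much of the document each one generates, can be sketched with a small simulation. The snippet below is a minimal, truncated illustration rather than the paper's model or inference algorithm; the Beta prior on the global popularities, the Gamma draw for the per-topic masses, and the names pi, b, phi, theta are all assumptions made here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

K = 10                      # truncation level standing in for an unbounded topic set
pi = rng.beta(1.0, 3.0, K)  # assumed global topic popularities (stick-breaking in the full model)
gamma_shape = 1.0           # assumed shape for the per-topic relative-mass prior

def sample_document_proportions():
    # Sparse selection: Bernoulli draws decide WHICH topics this document uses at all.
    b = rng.random(K) < pi
    if not b.any():                  # guard: keep at least one topic active
        b[rng.integers(K)] = True
    # Relative masses are drawn independently of the selection step,
    # so a globally rare topic can still dominate a document that selects it.
    phi = rng.gamma(gamma_shape, 1.0, K)
    # Document-specific proportions: masked weights, renormalized.
    theta = b * phi
    return theta / theta.sum()

theta = sample_document_proportions()
print("active topics:", np.flatnonzero(theta > 0))
print("proportions:  ", np.round(theta, 3))
```

In an HDP topic model the same global weights control both decisions, so a topic that appears in few documents is also forced to have low weight within them; here the two are sampled separately, which is the flexibility the abstract argues for.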