
Sparse Biterm Topic Model for Short Texts

Web and Big Data, APWeb-WAIM 2021, Part I (2021)

Abstract
Extracting meaningful and coherent topics from short texts is an important task for many real-world applications. The biterm topic model (BTM) is a popular topic model for short texts that explicitly models word co-occurrence patterns at the corpus level. However, BTM ignores the fact that a topic is usually described by only a few words in a given corpus. In other words, the topic-word distribution in a topic model should be highly sparse. Understanding the sparsity of the topic-word distribution may yield more coherent topics and improve the performance of BTM. In this paper, we propose a sparse biterm topic model (SparseBTM), which incorporates a spike-and-slab prior into BTM to explicitly model topic sparsity. Experiments on two short-text datasets show that our model achieves topic coherence scores comparable to those of BTM, and higher classification and clustering performance than BTM.
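
To make the phrase "word co-occurrence patterns at the corpus level" concrete, the following is a minimal sketch of how biterms (unordered word pairs from the same short text) could be extracted and pooled over a corpus. It assumes each whole short text serves as one co-occurrence window; the function names and the toy documents are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return all unordered word pairs (biterms) from one short text."""
    # Sort each pair so (a, b) and (b, a) count as the same biterm.
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def corpus_biterm_counts(docs):
    """Pool biterms over the whole corpus rather than per document."""
    counts = Counter()
    for tokens in docs:
        counts.update(extract_biterms(tokens))
    return counts


# Toy corpus (illustrative only).
docs = [
    ["apple", "iphone", "release"],
    ["apple", "iphone", "price"],
]
counts = corpus_biterm_counts(docs)
print(counts[("apple", "iphone")])  # 2
```

Pooling the pairs across documents is what lets a model of this kind sidestep the sparsity of word counts within any single short text.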
Keywords
Topic modeling, Short texts, Topic sparsity
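
As a rough illustration of what topic sparsity under a spike-and-slab prior can look like, the sketch below draws one topic-word distribution in which only a few words receive nonzero probability. This is a generic construction, not necessarily the exact formulation used in SparseBTM: the Bernoulli selector probability `pi`, the Dirichlet parameter `beta`, the function name, and the fallback that forces at least one selected word are all assumptions made for illustration.

```python
import random


def sample_sparse_topic(vocab_size, pi, beta, rng):
    """Draw one sparse topic-word distribution (illustrative spike-and-slab sketch)."""
    # Spike: a Bernoulli selector per word decides whether the topic may use it.
    selectors = [1 if rng.random() < pi else 0 for _ in range(vocab_size)]
    if sum(selectors) == 0:
        # Assumption: keep at least one word so the distribution is well defined.
        selectors[rng.randrange(vocab_size)] = 1
    # Slab: symmetric Dirichlet(beta) over the selected words only,
    # sampled as normalized Gamma(beta, 1) draws.
    weights = [rng.gammavariate(beta, 1.0) if s else 0.0 for s in selectors]
    total = sum(weights)
    return selectors, [w / total for w in weights]


rng = random.Random(0)
selectors, phi = sample_sparse_topic(vocab_size=20, pi=0.2, beta=1.0, rng=rng)
print(sum(selectors), "of", len(phi), "words have nonzero probability")
```

Words whose selector is 0 get exactly zero probability, which is the sense in which a topic is "described by a few words".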