Supervised Topic Models for Microblog Classification

IEEE International Conference on Data Mining (2015)

Abstract
In this paper we present a topic-model-based approach for classifying microblog posts into a given set of topics of interest. The short length of microblog posts makes it challenging to learn a classification model from them directly. To overcome this limitation, we use the content of the links embedded in these posts to improve topic learning. The hypothesis is that, since the link content is far richer than the content of the post itself, using link content along with the content of the post will help learning. However, how this link content can be used to construct features for classification remains a challenging issue. Furthermore, in previous methods, user-based information is utilized in an ad hoc manner that only works for certain types of classification, such as characterizing the content of microblogs. In this paper, we propose a supervised topic model, User-Labeled-LDA, and its nonparametric variant, which avoid the ad hoc feature construction task and model the topics in a discriminative way. Our experiments on a Twitter dataset show that modeling user interests and link information helps in learning quality topics for sparse tweets and also helps significantly in the classification task. Our experiments further show that modeling this information in a principled way through topic models helps more than simply adding this information through features.
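To make the link-content hypothesis concrete, the following is a minimal sketch of the simpler baseline the abstract contrasts against: pooling the words of a post with the words of its linked page into one bag-of-words feature vector. It is not User-Labeled-LDA. The function names, the tokenizer, and the sample texts are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

URL_PATTERN = re.compile(r"https?://\S+")


def tokenize(text):
    """Drop URLs, lowercase, and split on non-alphanumeric characters.

    A deliberately simple tokenizer, assumed here for illustration only.
    """
    text = URL_PATTERN.sub(" ", text)
    return re.findall(r"[a-z0-9]+", text.lower())


def augmented_bag_of_words(post_text, link_text=""):
    """Return word counts for a post, pooled with the text of its linked page."""
    counts = Counter(tokenize(post_text))
    counts.update(tokenize(link_text))
    return counts


# Hypothetical post and linked-page text (not taken from the paper's dataset).
post = "Great read on topic models http://example.com/a"
link = ("Topic models such as latent Dirichlet allocation "
        "describe documents as mixtures of topics.")

post_only = augmented_bag_of_words(post)
augmented = augmented_bag_of_words(post, link)

print(sum(post_only.values()), "tokens from the post alone")
print(sum(augmented.values()), "tokens after adding link content")
```

The post alone contributes only a handful of tokens, while the linked page adds many more, which is the sparsity gap the abstract describes. The abstract reports that modeling this information inside the topic model helps more than adding it as features in this way.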
Keywords
microblog classification, supervised topic model, User-Labeled-LDA, Twitter, tweets, topic classification, latent Dirichlet allocation