Ad-hoc Information Retrieval based on Boosted Latent Dirichlet Allocated Topics

2018 37th International Conference of the Chilean Computer Science Society (SCCC), 2018

Abstract
Latent Dirichlet Allocation (LDA) is a fundamental method in the text mining field. We propose strategies for topic and model selection based on LDA that exploit the semantic coherence of the inferred topics, boosting the quality of the models found. We then study how our boosted topic models perform in ad-hoc information retrieval tasks. Experimental results on four datasets show that our proposal improves the quality of the topics found, which favors document retrieval tasks. Our method outperforms traditional LDA-based methods, showing that model selection based on semantic coherence is useful for document modeling and information retrieval tasks.
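As a rough illustration of what coherence-based model selection can look like, the sketch below scores each candidate model by the mean coherence of its topics and keeps the best one. It is not the authors' procedure: the abstract does not say which coherence measure or selection rule the paper uses, so the UMass coherence measure, the function names `umass_coherence` and `select_model`, and the toy corpus are all assumptions made for this example.

```python
import math
from itertools import combinations


def umass_coherence(top_words, documents):
    """UMass coherence of one topic's ranked top words (higher is more coherent).

    Assumption: UMass is used here for illustration only; the paper's
    actual coherence measure is not stated in the abstract.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    # combinations() keeps ranking order, so `earlier` outranks `later`.
    for earlier, later in combinations(top_words, 2):
        denom = doc_freq(earlier)
        if denom == 0:
            continue  # skip words absent from the reference corpus
        score += math.log((doc_freq(earlier, later) + 1) / denom)
    return score


def select_model(candidate_models, documents):
    """Return the name of the candidate with the highest mean topic coherence.

    `candidate_models` maps a (hypothetical) model name to its topics,
    each topic being a ranked list of top words.
    """
    def mean_coherence(topics):
        return sum(umass_coherence(t, documents) for t in topics) / len(topics)

    return max(candidate_models, key=lambda name: mean_coherence(candidate_models[name]))


# Toy corpus and two hypothetical candidate models (illustrative data only).
documents = [
    ["topic", "model", "lda"],
    ["topic", "model"],
    ["retrieval", "query"],
    ["retrieval", "query", "ranking"],
]
candidates = {
    "coherent": [["topic", "model"], ["retrieval", "query"]],
    "mixed": [["topic", "query"], ["model", "retrieval"]],
}
best = select_model(candidates, documents)
```

In practice the candidate topics would come from LDA runs with different settings (for example, different numbers of topics), and the selected model would then supply the document representations used for retrieval.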
Keywords
Task analysis, Hidden Markov models, Information retrieval, Semantics, Smoothing methods, Coherence, Data models