Improving Probabilistic Latent Semantic Analysis with Principal Component Analysis.
Conference of the European Chapter of the Association for Computational Linguistics(2006)
摘要
Probabilistic Latent Semantic Analysis (PLSA) models have been shown to provide a better model for capturing polysemy and synonymy than Latent Semantic Analysis (LSA). However, the parameters of a PLSA model are trained using the Expectation Maximization (EM) algorithm, and as a result, the trained model is dependent on the initialization values so that performance can be highly variable. In this paper we present a method for using LSA analysis to initialize a PLSA model. We also investigated the performance of our method for the tasks of text segmentation and retrieval on personal-size corpora, and present results demonstrating the efficacy of our proposed approach.
更多查看译文
关键词
probabilistic latent semantic analysis,principal component analysis
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要