A Lexical Resource-Constrained Topic Model for Word Relatedness

IEEE Access (2019)

Abstract
Word relatedness computation is an important supporting technology for many natural language processing tasks. Traditionally, there have been two distinct strategies for measuring word relatedness: one uses corpus-based models, whereas the other leverages external lexical resources. Each strategy has its own strengths and weaknesses. In this paper, we propose a lexical resource-constrained topic model that integrates the two complementary strategies effectively. Our model is an extension of probabilistic latent semantic analysis that automatically learns word-level distributed representations for word relatedness measurement. Furthermore, we introduce a generalized expectation maximization (GEM) algorithm for statistical estimation. The proposed model not only inherits the advantage of conventional topic models in dimension reduction, but also refines parameter estimation by using word pairs that are known to be related. Experimental results in different languages demonstrate the effectiveness of our model in topic extraction and word relatedness measurement.
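To make the corpus-based half of the abstract concrete, the following is a minimal sketch of plain probabilistic latent semantic analysis (PLSA) fitted by standard EM, with each word then represented by its topic posterior P(z | w) and relatedness scored by cosine similarity. It is not the paper's model: the lexical-resource constraint and the GEM update are not specified in the abstract and are deliberately left out. The function names (`plsa`, `word_vectors`, `relatedness`), the use of P(z | w) as the word representation, and the cosine score are all assumptions made for illustration.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit plain PLSA by EM on a document-by-word count matrix."""
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    # Random, row-normalized initial parameters.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: posterior P(z | d, w), shape (docs, topics, words).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        post = joint / (joint.sum(axis=1, keepdims=True) + 1e-12)
        # M-step: re-estimate from expected counts n(d, w) * P(z | d, w).
        expected = counts[:, None, :] * post
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return p_z_d, p_w_z

def word_vectors(p_z_d, p_w_z, counts):
    """Represent each word by P(z | w), obtained with Bayes' rule (an assumption)."""
    p_d = counts.sum(axis=1) / counts.sum()
    p_z = p_d @ p_z_d                       # P(z) = sum_d P(d) P(z | d)
    joint = p_w_z * p_z[:, None]            # P(w, z), shape (topics, words)
    return (joint / (joint.sum(axis=0, keepdims=True) + 1e-12)).T

def relatedness(vectors, i, j):
    """Cosine similarity between the topic vectors of words i and j."""
    a, b = vectors[i], vectors[j]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Toy corpus: words 0 and 1 co-occur, words 2 and 3 co-occur.
toy_counts = np.array([[5, 4, 0, 0],
                       [4, 5, 0, 0],
                       [0, 0, 5, 4],
                       [0, 0, 4, 5]], dtype=float)
toy_p_z_d, toy_p_w_z = plsa(toy_counts, n_topics=2, n_iter=200)
toy_vectors = word_vectors(toy_p_z_d, toy_p_w_z, toy_counts)
```

On the toy corpus, `relatedness(toy_vectors, 0, 1)` can be compared with `relatedness(toy_vectors, 0, 2)` to see how co-occurrence shapes the score. A resource-constrained variant would additionally have to pull the representations of known related word pairs closer together during estimation, which this sketch does not attempt.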
Keywords
Natural language processing, unsupervised learning