Interpretable Topic Extraction and Word Embedding Learning Using Row-Stochastic DEDICOM.

CD-MAKE (2020)

Abstract
The DEDICOM algorithm provides a uniquely interpretable matrix factorization method for symmetric and asymmetric square matrices. We employ a new row-stochastic variation of DEDICOM on the pointwise mutual information matrices of text corpora to identify latent topic clusters within the vocabulary and simultaneously learn interpretable word embeddings. We introduce a method to efficiently train a constrained DEDICOM algorithm and a qualitative evaluation of its topic modeling and word embedding performance.
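As a rough illustration of the factorization described above, DEDICOM approximates a square matrix S (n × n) as A R Aᵀ, where A (n × k) assigns each word to k latent topics and R (k × k) captures relations between topics. The sketch below is not the authors' implementation. It assumes that the row-stochastic constraint is enforced by applying a row-wise softmax to an unconstrained parameter matrix, and that training is plain gradient descent on the squared reconstruction error. The function names, hyperparameters, and the toy matrix `S_toy` (standing in for a pointwise mutual information matrix) are all hypothetical.

```python
import numpy as np


def row_softmax(Z):
    """Map each row of Z to a probability distribution (row-stochastic)."""
    Z = Z - Z.max(axis=1, keepdims=True)  # shift for numerical stability
    exp_Z = np.exp(Z)
    return exp_Z / exp_Z.sum(axis=1, keepdims=True)


def fit_row_stochastic_dedicom(S, k, steps=2000, lr=0.01, seed=0):
    """Fit S ~ A R A^T with row-stochastic A by gradient descent.

    Assumption: A = row_softmax(Z) for an unconstrained Z. This is one
    possible way to impose the constraint, chosen here for illustration.
    """
    rng = np.random.default_rng(seed)
    n = S.shape[0]
    Z = 0.1 * rng.standard_normal((n, k))  # unconstrained parameters behind A
    R = 0.1 * rng.standard_normal((k, k))  # topic-to-topic relation matrix
    losses = []
    for _ in range(steps):
        A = row_softmax(Z)
        E = A @ R @ A.T - S                # reconstruction error
        losses.append(float((E ** 2).sum()))
        # Gradients of the squared Frobenius norm of E.
        grad_R = 2.0 * A.T @ E @ A
        grad_A = 2.0 * (E @ A @ R.T + E.T @ A @ R)
        # Backpropagate through the row-wise softmax.
        grad_Z = A * (grad_A - (grad_A * A).sum(axis=1, keepdims=True))
        Z -= lr * grad_Z
        R -= lr * grad_R
    return row_softmax(Z), R, losses


# Hypothetical toy input: two blocks of three mutually related "words".
S_toy = np.zeros((6, 6))
S_toy[:3, :3] = 1.0
S_toy[3:, 3:] = 1.0

A, R, losses = fit_row_stochastic_dedicom(S_toy, k=2)
print(np.round(A, 2))  # each row is a distribution over the 2 topics
print(np.round(R, 2))  # learned relations between the 2 topics
```

Because every row of A sums to one, a row can be read directly as a word's distribution over topics, which is the sense in which the resulting embeddings are interpretable.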
Keywords
Word embeddings, Topic analysis, Matrix factorization, Natural language processing