Metaconcepts: Isolating Context in Word Embeddings

2019 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR)

Cited by 2
Abstract
Word embeddings are commonly used to measure word-level semantic similarity in text, especially in direct word-to-word comparisons. However, relationships between words in the embedding space are often assumed to be approximately linear, so that concepts composed of multiple words are represented as a sort of linear combination of their word vectors. In this paper, we demonstrate that this assumption does not hold in general and show how these relationships can be captured more faithfully by leveraging the topology of the embedding space. We propose a technique for directly computing new vectors that represent multiple words, combining them naturally into a new, more consistent space in which distance correlates better with similarity. We show that this technique works well on natural language, even for multi-word inputs, on a simple task derived from WordNet synset descriptions and example usages of words. The generated vectors thus better represent complex concepts in the word embedding space.
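The abstract does not spell out the metaconcept construction itself, but the linear-combination assumption it argues against is standard practice and easy to make concrete. The sketch below (Python with NumPy; the toy vocabulary, random vectors, and the helper names `concept_vector` and `cosine` are illustrative assumptions, not the authors' code) shows the baseline: representing a multi-word concept as the mean of word2vec/GloVe-style word vectors and comparing by cosine similarity. It is this kind of composition that the paper claims distorts similarity.

```python
import numpy as np

# Toy embedding table standing in for pretrained word2vec/GloVe vectors;
# in practice these would be loaded from pretrained embedding files.
rng = np.random.default_rng(0)
embeddings = {w: rng.normal(size=50)
              for w in ["hot", "dog", "sausage", "animal", "warm"]}

def concept_vector(words, embeddings):
    """Linear-combination baseline: represent a multi-word concept
    as the mean of its constituent word vectors."""
    return np.mean([embeddings[w] for w in words], axis=0)

def cosine(u, v):
    """Cosine similarity, the usual similarity proxy in embedding space."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# The compositionality assumption under test: the combined vector for
# "hot dog" should land nearer vec("sausage") than vec("animal") or
# vec("warm") if linear combination preserved concept-level meaning.
hot_dog = concept_vector(["hot", "dog"], embeddings)
for candidate in ["sausage", "animal", "warm"]:
    print(candidate, cosine(hot_dog, embeddings[candidate]))
```

With real pretrained embeddings, this averaging baseline is what the proposed topology-aware vectors are meant to improve upon.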
Keywords
Word Embeddings, Contextualization, Similarity Measures, Word Sense Disambiguation, word2vec, GloVe