Automatic Idiom Recognition with Word Embeddings.

Communications in Computer and Information Science(2017)

引用 7|浏览14
暂无评分
摘要
Expressions, such as add fuel to the fire, can be interpreted literally or idiomatically depending on the context they occur in. Many Natural Language Processing applications could improve their performance if idiom recognition were improved. Our approach is based on the idea that idioms and their literal counterparts do not appear in the same contexts. We propose two approaches: (1) Compute inner product of context word vectors with the vector representing a target expression. Since literal vectors predict well local contexts, their inner product with contexts should be larger than idiomatic ones, thereby telling apart literals from idioms; and (2) Compute literal and idiomatic scatter (covariance) matrices from local contexts in word vector space. Since the scatter matrices represent context distributions, we can then measure the difference between the distributions using the Frobenius norm. For comparison, we implement [8,16,24] and apply them to our data. We provide experimental results validating the proposed techniques.
更多
查看译文
关键词
Idiom recognition,Vector space models,Distributional semantics,Word embeddings
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要