Learning Diachronic Word Embeddings with Iterative Stable Information Alignment

Lecture Notes in Artificial Intelligence (2019)

Abstract
Diachronic word embedding aims to reveal the semantic evolution of words over time. Previous works first learned word embeddings for each time period separately and then aligned all the embeddings into the same vector space. In contrast, we iteratively identify stable words, whose meanings remain acceptably stable across time periods, and use them as anchors to support both embedding learning and alignment. To learn word embeddings in the same vector space, two different cross-time constraints are applied during training. Initially, we identify the most obviously stable words with an unconstrained model and apply a hard constraint that ties them across the relevant stable time periods. In each subsequent iteration, we identify new stable words from the previously trained model and apply a soft constraint on them to fine-tune the model. We use the COHA dataset (https://corpus.byu.edu/coha/) [14], which consists of texts from the 1810s to the 2000s. Both qualitative and quantitative evaluations show that our model captures word meanings accurately within each single time period and models changes in word meaning over time. Experimental results indicate that our proposed model outperforms all baseline methods on diachronic text evaluation.
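Since the abstract only sketches the procedure, the following minimal Python sketch illustrates the two ingredients it names: identifying stable anchor words and applying a soft cross-time constraint. Everything here is an illustrative assumption rather than the paper's actual objective: the function names (find_stable_words, soft_anchor_penalty), the cosine-similarity threshold, and the squared-distance penalty are all hypothetical. A hard constraint would instead share a single vector per anchor word across the tied periods.

```python
# Hedged sketch of the iterative stable-word anchoring idea. All names,
# thresholds, and the penalty form are assumptions, not the paper's method.
import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))

def find_stable_words(emb_a, emb_b, vocab, threshold=0.8):
    # Words whose vectors barely move between two periods are treated as
    # stable anchors; using a cosine threshold is an assumed criterion.
    return [w for w in vocab if cosine(emb_a[w], emb_b[w]) >= threshold]

def soft_anchor_penalty(emb_a, emb_b, stable_words, lam=0.1):
    # Soft cross-time constraint: penalize drift of anchor vectors between
    # consecutive periods; added to each period's embedding objective.
    return lam * sum(float(np.sum((emb_a[w] - emb_b[w]) ** 2))
                     for w in stable_words)

# Toy usage with random vectors standing in for trained period embeddings.
rng = np.random.default_rng(0)
vocab = ["house", "gay", "broadcast"]
emb_1900 = {w: rng.normal(size=50) for w in vocab}
emb_1950 = {w: 0.9 * emb_1900[w] + 0.1 * rng.normal(size=50) for w in vocab}

anchors = find_stable_words(emb_1900, emb_1950, vocab)
print(anchors, soft_anchor_penalty(emb_1900, emb_1950, anchors))
```

In an iterative loop, anchors found from the previous round's embeddings would feed the penalty for the next round of fine-tuning, gradually pulling the per-period spaces into alignment.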
Keywords
Linguistic change, Diachronic word embedding, Lexical semantics