Class-Specific Word Embedding through Linear Compositionality

2018 IEEE International Conference on Big Data and Smart Computing (BigComp), 2018


Abstract

The English linguist John Rupert Firth famously said, "You shall know a word by the company it keeps." Most word representation learning models rest on this assumption: a word's semantic meaning can be learned from the context in which it occurs. The context is defined as a small, unordered set of words surrounding the target word. Research has shown that context alone provides limited information, because it contains only neighboring words, so the resulting word embeddings capture only local information. Some work tries to improve on this by drawing on outside information sources such as a knowledge base. We observe that the meaning of a word in a sentence can be better interpreted when the class information, or label, of the sentence is available. We propose three approaches for training class-specific embeddings that encode class information by exploiting the linear compositionality property of word embeddings. We also present a general framework for text classification tasks, consisting of a pair of convolutional neural networks, in which the learned class-specific embeddings serve as features. We evaluate our approaches and framework on topic classification of a disaster-focused Twitter dataset and on a benchmark Twitter sentiment classification dataset from SemEval 2013. Our results show a potential relative accuracy improvement of more than 5% over a recent baseline.
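
The abstract does not spell out the three training approaches, so the following is only a minimal sketch of the general idea behind linear compositionality: vectors can be added or averaged to compose meaning. It assumes a class vector built as the mean of the word vectors seen under a label, and a class-specific embedding formed by adding a scaled class vector to a generic word vector. The function names `class_vectors` and `class_specific`, the weight `alpha`, and the toy data are all hypothetical and are not taken from the paper.

```python
import numpy as np


def class_vectors(embeddings, docs, labels):
    """Average the embeddings of all known words seen under each label."""
    sums, counts = {}, {}
    for tokens, label in zip(docs, labels):
        for tok in tokens:
            if tok in embeddings:
                sums[label] = sums.get(label, 0) + embeddings[tok]
                counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def class_specific(embeddings, word, class_vec, alpha=0.5):
    """Additive composition (an assumption, not the paper's method):
    shift a generic word vector toward a class vector."""
    return embeddings[word] + alpha * class_vec


# Toy 2-d embeddings and labeled documents, invented for illustration.
embeddings = {
    "fire": np.array([1.0, 0.0]),
    "smoke": np.array([3.0, 0.0]),
    "flood": np.array([0.0, 1.0]),
    "water": np.array([0.0, 2.0]),
}
docs = [["fire", "smoke"], ["flood", "water"]]
labels = ["wildfire", "flood"]

cvecs = class_vectors(embeddings, docs, labels)
fire_in_wildfire = class_specific(embeddings, "fire", cvecs["wildfire"])
fire_in_flood = class_specific(embeddings, "fire", cvecs["flood"])
print(fire_in_wildfire, fire_in_flood)
```

In this sketch the same word, "fire", receives a different vector under each label, which is the kind of feature a downstream classifier could consume; how the paper actually trains and uses its embeddings may differ.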

Keywords

word embeddings, text classification, sentiment analysis
