Integrating Extra Knowledge into Word Embedding Models for Biomedical NLP Tasks

2017 International Joint Conference on Neural Networks (IJCNN), 2017

Cited by 33 | Viewed 54
Abstract
Word embedding has attracted increasing attention in the NLP area in recent years. The continuous bag-of-words model (CBOW) and the continuous Skip-gram model (Skip-gram) have been developed to learn distributed representations of words from large amounts of unlabeled text data. In this paper, we explore the idea of integrating extra knowledge into the CBOW and Skip-gram models and applying the new models to biomedical NLP tasks. The main idea is to construct a weighted graph from knowledge bases (KBs) to represent structured relationships among words/concepts. In particular, we propose a GCBOW model and a GSkip-gram model by integrating such a graph into the original CBOW and Skip-gram models, respectively, via graph regularization. Our experiments on four general-domain standard datasets show encouraging improvements with the new models. Further evaluations on two biomedical NLP tasks (a biomedical similarity/relatedness task and a biomedical Information Retrieval (IR) task) show that our methods outperform the baselines.
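The graph regularization described above adds a penalty that pulls together the embeddings of words connected in the KB graph. The sketch below illustrates one common form of such a penalty, R = λ · Σ w_ij · ||v_i − v_j||², and its gradient; the function names, the toy update loop, and the penalty form are illustrative assumptions, not the authors' actual code or objective.

```python
import numpy as np

def graph_regularization(emb, edges, lam):
    """Penalty lam * sum_ij w_ij * ||v_i - v_j||^2 and its gradient.

    emb:   (n_words, dim) embedding matrix
    edges: iterable of (i, j, w_ij) weighted KB-graph edges
    lam:   regularization strength (hypothetical hyperparameter)
    """
    grad = np.zeros_like(emb)
    penalty = 0.0
    for i, j, w in edges:
        diff = emb[i] - emb[j]
        penalty += w * (diff @ diff)
        # d/dv_i of w*||v_i - v_j||^2 is 2*w*(v_i - v_j), and the opposite for v_j
        grad[i] += 2.0 * lam * w * diff
        grad[j] -= 2.0 * lam * w * diff
    return lam * penalty, grad

# Toy usage: gradient steps on the penalty alone shrink it,
# moving KB-linked words closer in embedding space.
rng = np.random.default_rng(0)
emb = rng.normal(size=(5, 8))           # 5 words, 8-dimensional embeddings
edges = [(0, 1, 1.0), (1, 2, 0.5)]      # weighted edges from a (toy) KB graph
before, _ = graph_regularization(emb, edges, lam=0.1)
for _ in range(50):
    _, g = graph_regularization(emb, edges, lam=0.1)
    emb -= 0.1 * g                      # plain gradient descent on the penalty
after, _ = graph_regularization(emb, edges, lam=0.1)
```

In the full GCBOW/GSkip-gram training described in the paper, a term of this kind would be combined with the usual CBOW or Skip-gram prediction loss rather than minimized on its own.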
Keywords
knowledge integration,word embedding models,natural language processing,biomedical NLP,continuous bag-of-words model,CBOW,continuous Skip-gram model,weighted graph,knowledge bases,KB,information retrieval,IR