TFM: A Triple Fusion Module for Integrating Lexicon Information in Chinese Named Entity Recognition

Neural Processing Letters(2022)

引用 5|浏览3
暂无评分
摘要
Due to the characteristics of the Chinese writing system, character-based Chinese named entity recognition models ignore the word information in sentences, which harms their performance. Recently, many works try to alleviate the problem by integrating lexicon information into character-based models. These models, however, either simply concatenate word embeddings, or have complex structures which lead to low efficiency. Furthermore, word information is viewed as the only resource from lexicon, thus the value of lexicon is not fully explored. In this work, we observe another neglected information, i.e., character position in a word, which is beneficial for identifying character meanings. To fuse character, word and character position information, we modify the key-value memory network and propose a triple fusion module, termed as TFM. TFM is not limited to simple concatenation or suffers from complicated computation, compatibly working with the general sequence labeling model. Experimental evaluations show that our model has performance superiority. The F1-scores on Resume, Weibo and MSRA are 96.19%, 71.12% and 95.63% respectively.
更多
查看译文
关键词
Chinese named entity recognition, Lexicon information, Information fusion, Natural language processing
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要