A Supervised Named Entity Recognition Method Based on Pattern Matching and Semantic Verification

Wǎngjì wǎnglù jìshù xuékān(2020)

Abstract
Named entity recognition is a basic task in natural language processing and plays a pivotal role in tasks such as information extraction, machine translation, and knowledge graph construction. It has also received widespread attention in the financial, biological, and pharmaceutical industries. This paper proposes a weakly supervised learning method for recognizing complex named entities (CNEs) in a corpus. CNEs are commonly composed of sequences of multiple smaller entities, which makes their boundaries difficult to determine. To improve recognition accuracy, the proposed method, Masked-BiLSTM-CRF, separates the determination of contextual semantic relationships from the confirmation of entity boundaries. It has two components: (1) A semantic model based on CNE masking. Before training, the CNEs in the corpus are masked, and the masked corpus is then used to train a semantic model with BiLSTM-CRF. This model can verify whether the context is semantically consistent with an entity at the corresponding position. (2) A weakly supervised CNE boundary confirmation model based on sequential patterns. On a small-sample data set, a candidate set of target CNEs is found by combining a sliding window with sequence pattern matching, and the candidates are then screened and judged by the semantic model obtained in (1). Experimental results on weakly supervised named entity recognition in the financial field show that, compared with a method based directly on BiLSTM-CRF, the proposed method improves the F1 score on a small training sample set by nearly 9% and has some generalization ability.
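The two steps described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the `[CNE]` mask token, the coarse per-token tags, the regular-expression pattern over those tags, and the `score` callable (a stand-in for the trained BiLSTM-CRF semantic model) are all assumptions made for illustration.

```python
import re
from typing import Callable, List, Sequence, Tuple

Span = Tuple[int, int]  # (start, end), end exclusive
MASK = "[CNE]"          # hypothetical placeholder token, not from the paper


def mask_entities(tokens: Sequence[str], spans: Sequence[Span]) -> List[str]:
    """Replace each non-overlapping span with a single MASK token."""
    out: List[str] = []
    pos = 0
    for start, end in sorted(spans):
        out.extend(tokens[pos:start])
        out.append(MASK)
        pos = end
    out.extend(tokens[pos:])
    return out


def candidate_spans(tags: Sequence[str], pattern: "re.Pattern[str]",
                    max_len: int) -> List[Span]:
    """Slide windows of up to max_len tokens over the tag sequence and
    keep every window whose space-joined tags fully match the pattern."""
    found: List[Span] = []
    for start in range(len(tags)):
        last = min(start + max_len, len(tags))
        for end in range(start + 1, last + 1):
            if pattern.fullmatch(" ".join(tags[start:end])):
                found.append((start, end))
    return found


def verify(tokens: Sequence[str], spans: Sequence[Span],
           score: Callable[[List[str]], float],
           threshold: float = 0.5) -> List[Span]:
    """Keep candidates whose masked context the semantic model accepts."""
    return [s for s in spans
            if score(mask_entities(tokens, [s])) >= threshold]


# Toy data: coarse tags per token are assumed to come from an upstream tagger.
tokens = ["Shares", "of", "Acme", "Bank", "Shanghai", "Branch", "rose"]
tags = ["O", "O", "ORG", "ORG", "LOC", "ORG", "O"]
pattern = re.compile(r"ORG( (ORG|LOC))+")  # assumed sequential pattern

candidates = candidate_spans(tags, pattern, max_len=4)


def toy_score(masked: List[str]) -> float:
    """Stub for the semantic model: accepts only one known-good context."""
    return 1.0 if masked == ["Shares", "of", MASK, "rose"] else 0.0


accepted = verify(tokens, candidates, toy_score)
```

In this sketch, pattern matching deliberately over-generates candidates, and the context check prunes them. That mirrors the separation of boundary confirmation from semantic verification that the abstract describes, although the real system would replace `toy_score` with a learned model.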
Keywords
Named entity recognition, Weakly supervised learning, Deep learning, Pattern matching