A Multigranularity Text Driven Named Entity Recognition CGAN Model for Traditional Chinese Medicine Literatures

COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE (2022)

Abstract
Recognizing Traditional Chinese Medicine (TCM) entities in different types of literature is a challenging research problem, and it is the foundation for converting the large amount of TCM knowledge held in unstructured texts into structured formats. Because large-scale annotated data are lacking, conventional deep learning models perform unsatisfactorily in TCM text knowledge extraction. Other, unsupervised methods rely on auxiliary data such as domain dictionaries. We propose a multigranularity text-driven NER model based on a Conditional Generative Adversarial Network (MT-CGAN) to perform TCM NER with a small-scale annotated corpus. In the model, a multigranularity text features encoder (MTFE) is designed to extract rich semantic and grammatical information from multiple dimensions of TCM texts. By differentiating the conditional constraints of the generator and discriminator of MT-CGAN, the synchronization between the generated tag labels and the named entities is guaranteed. Furthermore, seeds of different TCM text types are introduced into our model to improve the precision of NER. We compare our method with other baseline methods on 4 kinds of gold-standard datasets to illustrate its effectiveness. The experimental results show that the standard precision, recall, and F1 score of our method are higher than those of the state-of-the-art methods by 0.24% to 8.97%, 0.89% to 12.74%, and 0.01% to 10.84%, respectively. MT-CGAN is able to extract entities from different types of TCM literature effectively. Our experimental results indicate that the proposed approach has a clear advantage in processing TCM texts with more entity types, higher sparsity, less regular features, and a small-scale corpus.
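The abstract reports precision, recall, and F1 score without stating how they are computed. As a minimal sketch, assuming the common entity-level convention in which a predicted entity counts as correct only when its boundaries and type both match the gold annotation, the metrics could be computed as below. The BIO tagging scheme, the function names `bio_to_spans` and `prf`, and the entity types `HERB` and `SYM` are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Set, Tuple

# (start, end_exclusive, entity_type); hypothetical representation
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence (assumed scheme)."""
    spans: Set[Span] = set()
    start, etype = None, None
    # A trailing "O" sentinel flushes an entity that ends the sentence.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def prf(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Entity-level precision, recall, and F1 over parallel tag sequences."""
    gold_spans, pred_spans = set(), set()
    for idx, (g, p) in enumerate(zip(gold, pred)):
        # Prefix each span with its sentence index so spans stay distinct.
        gold_spans |= {(idx, *s) for s in bio_to_spans(g)}
        pred_spans |= {(idx, *s) for s in bio_to_spans(p)}
    tp = len(gold_spans & pred_spans)  # exact boundary and type matches
    precision = tp / len(pred_spans) if pred_spans else 0.0
    recall = tp / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical tags: HERB and SYM are made-up entity types.
gold = [["B-HERB", "I-HERB", "O", "B-SYM"]]
pred = [["B-HERB", "I-HERB", "O", "O"]]
precision, recall, f1 = prf(gold, pred)
```

In this toy case the prediction finds one of the two gold entities and proposes no spurious ones, so precision is perfect while recall is halved, which is the kind of trade-off the three reported scores capture.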