Mulco: Recognizing Chinese Nested Named Entities through Multiple Scopes

PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023(2023)

引用 0|浏览21
暂无评分
摘要
Nested Named Entity Recognition (NNER), as a subarea of Named Entity Recognition, has presented longstanding challenges to researchers. In NNER, one entity may be part of a larger entity, which can occur at multiple levels. These nested structures prevent traditional sequence labeling methods from properly recognizing all entities. While recent research has focused on designing better recognition methods for NNER in various languages, Chinese Nested Named Entity Recognition (CNNER) is still underdeveloped, largely due to a lack of freely available CNNER benchmarks. To support CNNER research, in this paper, we introduce ChiNesE, a CNNER dataset comprising 20,000 sentences from online passages in multiple domains and containing 117,284 entities that fall into 10 categories, of which 43.8% are nested named entities. Based on ChiNesE, we propose Mulco, a novel method that can recognize named entities in nested structures through multiple scopes. Each scope uses a scope-based sequence labeling method that predicts an anchor and the length of a named entity to recognize it. Experimental results show that Mulco outperforms state-of-the-art baseline methods with different recognition schemes on ChiNesE and ACE 2005 Chinese corpus.
更多
查看译文
关键词
Datasets,Nested Named Entity Recognition,Chinese Nested Named Entity Recognition,Sequence Labeling
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要