Recognizing Compound Entity Phrases In Hybrid Academic Domains In View Of Community Division

International Conference on Computational Science (ICCS 2017), 2017

Abstract
Classifying compound named entities in academic domains, such as the names of papers, patents, and projects, plays an important role in enhancing applications such as knowledge discovery and intellectual property protection. However, there is very little work on this novel and hard problem. Prior mainstream approaches mainly focus on classifying basic named entities (e.g., person names, organization names, Twitter named entities, and simple entities in specific sci-tech domains). We use context templates to roughly extract candidate compound entities, which reduces the search space for text splitting. We reduce the text splitting problem to the community division problem, which we solve with a dynamic programming strategy. The construction of the indicative word set used in segment validation is reduced to the classical minimum set cover problem, which is also solved with dynamic programming. Experimental results on classifying real-world science and technology compound entities show that GenericSegVal achieves a sharp increase in both precision and recall compared with the supervised bidirectional LSTM approach. © 2017 The Authors. Published by Elsevier B.V.
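The abstract does not say how the minimum set cover instance is built or solved beyond "dynamic programming". As a rough illustration only, the sketch below solves a small minimum set cover exactly with a textbook dynamic program over bitmasks of covered elements. The function name `min_set_cover`, the toy data, and the reading that each candidate word "covers" the segments it helps validate are all assumptions made for this example, not the authors' formulation.

```python
from typing import Dict, Hashable, List, Optional, Set


def min_set_cover(
    universe: Set[Hashable], candidates: Dict[str, Set[Hashable]]
) -> Optional[List[str]]:
    """Return a smallest list of candidate names whose sets cover `universe`.

    Exact DP over bitmasks; exponential in len(universe), so only
    suitable for small instances. Returns None if no cover exists.
    """
    elements = sorted(universe)
    index = {e: i for i, e in enumerate(elements)}
    full = (1 << len(elements)) - 1

    # Encode each candidate set as a bitmask restricted to the universe.
    masks = {
        name: sum(1 << index[e] for e in members if e in index)
        for name, members in candidates.items()
    }

    # best[mask] = a shortest list of candidates covering exactly `mask`.
    best: Dict[int, List[str]] = {0: []}
    # Every transition strictly adds bits, so visiting masks in increasing
    # order guarantees best[mask] is final when it is expanded.
    for mask in range(full + 1):
        if mask not in best:
            continue
        for name, m in masks.items():
            new = mask | m
            if new != mask and (
                new not in best or len(best[mask]) + 1 < len(best[new])
            ):
                best[new] = best[mask] + [name]
    return best.get(full)


# Hypothetical toy instance: segments 1..5, and the segments that each
# (made-up) indicative word would validate.
toy_words = {"a": {1, 2, 3}, "b": {2, 4}, "c": {3, 4}, "d": {4, 5}}
cover = min_set_cover({1, 2, 3, 4, 5}, toy_words)
```

Because the state space grows as 2^n in the number of elements, a real system would likely need a restricted problem structure or an approximation; the abstract gives no detail on how the paper handles this.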
Keywords
Named Entity Recognition, Template Extraction, Minimum Set Cover Problem, Community Division