Named Entity recognition without gazetteers

EACL '99: Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics (1999)

Abstract
It is often claimed that Named Entity recognition systems need extensive gazetteers: lists of names of people, organisations, locations, and other named entities. Indeed, the compilation of such gazetteers is sometimes mentioned as a bottleneck in the design of Named Entity recognition systems. We report on a Named Entity recognition system which combines rule-based grammars with statistical (maximum entropy) models. We report on the system's performance with gazetteers of different types and different sizes, using test material from the MUC-7 competition. We show that, for the text type and task of this competition, it is sufficient to use relatively small gazetteers of well-known names, rather than large gazetteers of low-frequency names. We conclude with observations about the domain independence of the competition and of our experiments.
Keywords
rule-based grammar, large gazetteer, low-frequency name, maximum entropy, entity recognition system, domain independence, extensive gazetteer, different type, muc-7 competition, different size, rule based, maximum entropy model, low frequency