Customizing an Information Extraction System to a New Domain.

Mihai Surdeanu, David McClosky, Mason Smith, Andrey Gusev, Christopher D. Manning

RELMS '11: Proceedings of the ACL 2011 Workshop on Relational Models of Semantics (2011)

Abstract

We introduce several ideas that improve the performance of supervised information extraction systems with a pipeline architecture, when they are customized for new domains. We show that: (a) a combination of a sequence tagger with a rule-based approach for entity mention extraction yields better performance for both entity and relation mention extraction; (b) improving the identification of syntactic heads of entity mentions helps relation extraction; and (c) a deterministic inference engine captures some of the joint domain structure, even when introduced as a postprocessing step to a pipeline system. All in all, our contributions yield a 20% relative increase in F1 score in a domain significantly different from the domains used during the development of our information extraction system.

Keywords

entity mention extraction, information extraction system, relation extraction, relation mention extraction, supervised information extraction system, better performance, joint domain structure, new domain, pipeline architecture, pipeline system
