QA4IE: A Question Answering Based System for Document-Level General Information Extraction.

IEEE Access (2020)

Abstract
Information Extraction (IE) is the task of distilling structured information from unstructured text by identifying references to named entities and the relationships between those entities. Existing IE solutions, including Relation Extraction and Open IE, can hardly take cross-sentence information such as coreference into account, and they are severely restricted by limited relation types as well as informal relation specifications (e.g., free-text-based relation triples). To overcome these weaknesses, we propose a novel IE framework named QA4IE, which leverages flexible question answering approaches to produce high-quality relation triples across sentences. Based on this framework, we develop a real-time IE system that can perform general IE over an entire document. To train and evaluate our system, we build a large-scale IE benchmark using distant supervision, validated by human evaluation. We conduct both component analyses and pipeline experiments to evaluate our system. The results show that our system generalizes to unseen entities and relations and achieves significant improvements over existing IE systems.
Keywords
Knowledge acquisition, machine learning, natural language processing, neural networks