Pattern Discovery for Wide-Window Open Information Extraction in Biomedical Literature

2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (2018)

Abstract
Open information extraction (OpenIE) is an important task in the biomedical domain. The goal of OpenIE is to automatically extract structured information from unstructured text with little or no supervision. It aims to extract all relation tuples from a corpus without requiring pre-specified relation types. Existing tools may extract ill-structured or incomplete information, or fail on biomedical literature because of its long and complicated sentences. In this paper, we propose a novel pattern-based information extraction method for wide-window entities (WW-PIE). WW-PIE first uses dependency parsing to break down long sentences and then uses frequent textual patterns to extract high-quality information. Hierarchical pattern grouping organizes and structures the extractions so that they are straightforward and precise. Consequently, compared with existing OpenIE tools, WW-PIE produces structured output that can be used directly in downstream applications. WW-PIE is also capable of extracting n-ary and nested relation structures, which are less studied in existing methods. Extensive experiments on a real-world biomedical corpus of PubMed abstracts demonstrate the power of WW-PIE at extracting precise and well-structured information.
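The frequent-pattern step mentioned in the abstract can be illustrated with a minimal sketch. This is not the WW-PIE implementation: it skips dependency parsing and hierarchical grouping entirely, and the `<TYPE:name>` tagging format, the toy sentences, and the names `candidate_patterns`, `extract`, and `min_support` are all assumptions made for illustration. It only shows the general idea of keeping extractions whose surrounding textual pattern recurs often enough in the corpus.

```python
import re
from collections import Counter

# Toy sentences with entity mentions already tagged as <TYPE:name>.
# The tagging format and the sentences are hypothetical, for illustration only.
SENTENCES = [
    "<CHEMICAL:aspirin> inhibits <GENE:COX1> in platelets",
    "<CHEMICAL:ibuprofen> inhibits <GENE:COX2> in vitro",
    "<CHEMICAL:metformin> activates <GENE:AMPK> in hepatocytes",
    "<CHEMICAL:resveratrol> activates <GENE:SIRT1> in mice",
    "<CHEMICAL:caffeine> was given before <GENE:ADORA2A> sequencing",
]

ENTITY = re.compile(r"<([A-Z]+):([^>]+)>")


def candidate_patterns(sentence):
    """Yield (pattern, (arg1, arg2)) for each pair of adjacent entity mentions."""
    mentions = list(ENTITY.finditer(sentence))
    for left, right in zip(mentions, mentions[1:]):
        # The text between the two mentions becomes the relation phrase.
        between = sentence[left.end():right.start()].strip()
        # Replace each mention by its type to get a typed textual pattern.
        pattern = f"${left.group(1)} {between} ${right.group(1)}"
        yield pattern, (left.group(2), right.group(2))


def extract(sentences, min_support=2):
    """Keep only tuples whose pattern occurs at least min_support times."""
    candidates = [c for s in sentences for c in candidate_patterns(s)]
    support = Counter(pattern for pattern, _ in candidates)
    return [(p, args) for p, args in candidates if support[p] >= min_support]


if __name__ == "__main__":
    for pattern, args in extract(SENTENCES):
        print(pattern, args)
```

With the toy data, the two patterns that occur twice are kept, while the one-off phrase "was given before" is discarded as infrequent. A real system would need entity recognition, sentence simplification, and a far larger corpus before pattern frequency becomes a meaningful quality signal.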
Keywords
pattern-based information extraction method,nested relation structures,pattern hierarchical grouping,high-quality information,frequent textual patterns,WW-PIE utilizes,wide-window entities,complicated sentences,long sentences,incomplete information,ill-structured information,pre-specified relation types,relation tuples,unstructured text,Biomedical domain,Biomedical literature,wide-window open information extraction,pattern discovery