Open Domain Chinese Triples Hierarchical Extraction Method

APPLIED SCIENCES-BASEL (2020)

Abstract
Featured Application: This method can be applied to the automatic extraction of triples from unstructured text.

Open domain relation prediction is an important task in triples extraction. Constructing large-scale knowledge graph systems requires, in addition to structured data, the automatic extraction of triples from large amounts of unstructured text to expand entities and relations. Although many English open relation prediction methods have achieved good performance, a high-performance system for open domain Chinese triples extraction remains undeveloped because large-scale Chinese annotation corpora are lacking and Chinese language processing is difficult. In this paper, we propose an integrated open domain Chinese triples hierarchical extraction method (CTHE) to solve this problem, combining the advantages of Bi-LSTM-CRF and Att-Bi-GRU models built on the pre-trained BERT encoding model. The method recognizes named entities in Chinese sentences to establish entity pairs, and performs hierarchical extraction of specific and open relations based on a user-defined schema library and an attention mechanism. Experimental results demonstrate the effectiveness of the method, which achieved stable performance on the test dataset and better precision and F1-score than state-of-the-art Chinese open domain triples extraction methods. Furthermore, a large-scale annotated dataset for the Chinese named entity recognition (NER) task is established to support research on Chinese NER.
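The control flow described in the abstract (recognize entities, form entity pairs, then extract specific relations against a schema library and open relations otherwise) can be illustrated with a minimal sketch. This is not the authors' implementation. The schema contents, the function names, and the rule that open relation prediction runs only when no schema entry matches are all assumptions made for illustration, and the two model calls are passed in as plain callables standing in for the trained Bi-LSTM-CRF and Att-Bi-GRU components.

```python
from itertools import permutations

# Hypothetical user-defined schema library (an assumption, not from the paper):
# (head entity type, tail entity type) -> candidate specific relations.
SCHEMA = {
    ("PER", "ORG"): ["works_for", "founder_of"],
    ("ORG", "LOC"): ["located_in"],
}


def entity_pairs(entities):
    """Return every ordered pair of distinct recognized entities."""
    return list(permutations(entities, 2))


def extract_triples(entities, classify_specific, predict_open, schema=SCHEMA):
    """Schema-constrained relations first, open relations as a fallback.

    entities: list of (text, type) tuples, e.g. the output of an NER model.
    classify_specific(head, tail, candidates): returns a relation or None.
    predict_open(head, tail): returns a relation or None.
    """
    triples = []
    for head, tail in entity_pairs(entities):
        candidates = schema.get((head[1], tail[1]))
        if candidates:
            # A schema entry exists for this type pair: choose among its relations.
            relation = classify_specific(head, tail, candidates)
        else:
            # No schema entry: defer to the open relation predictor.
            relation = predict_open(head, tail)
        if relation is not None:
            triples.append((head[0], relation, tail[0]))
    return triples
```

Passing the models in as callables keeps the sketch independent of any particular framework; a real system would wrap the trained models behind these two functions.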
Keywords
named entity recognition, open relation prediction, information extraction, CTHE