A Hybrid Framework for Semantic Relation Extraction over Enterprise Data.
Int. J. Semantic Web Inf. Syst. (2015)
Abstract
Relation extraction from Web data has attracted a lot of attention in recent years. However, little work has been done on relation extraction from enterprise data, despite the urgent need for such work in real applications, e.g., E-discovery. One distinct characteristic of enterprise data in comparison with Web data is its low redundancy. Previous work on relation extraction from Web data relies largely on the data's high level of redundancy and thus cannot be applied effectively to enterprise data. This paper proposes an unsupervised hybrid framework called REACTOR. REACTOR combines a statistical method, classification, and clustering to automatically identify various types of relations among entities appearing in enterprise data. Furthermore, the authors explore applying pronominal anaphora resolution to extract more relations that are expressed across multiple sentences. They evaluate REACTOR on a real-world enterprise data set from HP that contains over three million pages, and the experimental results show the effectiveness of REACTOR.
Keywords
Anaphora Resolution, Enterprise Data, Information Extraction, Relation Extraction, Relation Tagging