SJoin: A Semantic Join Operator to Integrate Heterogeneous RDF Graphs

Database and Expert Systems Applications, DEXA 2017, Part I (2017)

Abstract
Semi-structured data models such as the Resource Description Framework (RDF) naturally allow the same real-world entity to be modeled in various ways. For example, different RDF vocabularies yield different RDF graphs that represent the same drug in Bio2RDF or DrugBank. Although semantically equivalent, these RDF graphs may be syntactically different, i.e., they may differ in graph structure, entity identifiers, or properties. Existing data-driven integration approaches consider only syntactic matching criteria or similarity measures when integrating RDF graphs, and such syntax-based approaches are unable to integrate heterogeneous RDF graphs semantically. We devise SJoin, a semantic similarity join operator that solves the problem of matching semantically equivalent RDF graphs, i.e., syntactically different graphs corresponding to the same real-world entity. We propose two physical implementations of SJoin that follow blocking or non-blocking data processing strategies, i.e., RDF graphs can be merged in a batch or incrementally. We empirically evaluate the effectiveness and efficiency of the SJoin physical operators with respect to baseline similarity join algorithms. Experimental results suggest that SJoin outperforms the baseline approaches: the non-blocking SJoin incrementally produces results faster, while the blocking SJoin accurately matches all semantically equivalent RDF graphs.
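The difference between the blocking and non-blocking strategies mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the SJoin algorithm from the paper: the graph representation (an identifier plus a set of labels), the function names `blocking_join` and `non_blocking_join`, and the Jaccard placeholder used as the similarity measure are all assumptions made for illustration. A real semantic join would plug in a semantic similarity measure instead.

```python
from typing import Callable, FrozenSet, Iterable, Iterator, List, Tuple

# Hypothetical simplification: an RDF graph is reduced to an identifier
# and a set of labels describing the entity.
Graph = Tuple[str, FrozenSet[str]]
Similarity = Callable[[FrozenSet[str], FrozenSet[str]], float]


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Placeholder similarity (set overlap); stands in for a semantic measure."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def blocking_join(left: Iterable[Graph], right: Iterable[Graph],
                  sim: Similarity, threshold: float) -> List[Tuple[str, str]]:
    """Batch strategy: consume both inputs fully, then emit all matching pairs."""
    left_all, right_all = list(left), list(right)
    return [(l_id, r_id)
            for l_id, l_labels in left_all
            for r_id, r_labels in right_all
            if sim(l_labels, r_labels) >= threshold]


def non_blocking_join(stream: Iterable[Tuple[str, Graph]],
                      sim: Similarity, threshold: float) -> Iterator[Tuple[str, str]]:
    """Incremental strategy: each arriving graph is probed against the graphs
    already seen from the other side, so matches are emitted as soon as
    both members of a pair have arrived."""
    seen = {"L": [], "R": []}
    for side, (g_id, g_labels) in stream:
        other = "R" if side == "L" else "L"
        for h_id, h_labels in seen[other]:
            if sim(g_labels, h_labels) >= threshold:
                # Always report pairs as (left identifier, right identifier).
                yield (g_id, h_id) if side == "L" else (h_id, g_id)
        seen[side].append((g_id, g_labels))
```

In this sketch, the blocking variant returns nothing until both inputs are exhausted, whereas the non-blocking variant is a generator that can yield a match after only part of each input has been read.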
Keywords
Open Connection, Resource Description Framework (RDF), DrugBank, Semantic Equivalence, Hash Join