Integrity Constraints for the Semantic Web: An OWL 2 DL Extension (2012)

Abstract
As Semantic Web applications have grown, an increasing amount of data has been published and consumed on the Semantic Web. With this proliferation of data, many managers and application developers now ask questions when they are given access to a collection of instance data. One key question is whether, and under what conditions, the data is ready to use. Answering this question requires data validation mechanisms that ensure data correctness and integrity. While syntax and semantics validation is supported by existing tools, the Semantic Web still lacks integrity constraint (IC) support.

The recommended standard knowledge representation language, the Web Ontology Language (OWL), has a standard semantics that adopts the Open World Assumption (OWA) and the non-Unique Name Assumption (nUNA). These two assumptions suit many distributed knowledge representation scenarios on the Semantic Web. In typical Semantic Web settings, knowledge about the domain often comes from distributed sources, and complete knowledge about the domain cannot be assumed. However, in many settings, certain parts of the domain are expected to have complete knowledge available, and integrity constraint validation is desirable. In these settings, the Closed World Assumption (CWA) and the Unique Name Assumption (UNA) would be expected to apply.

Each set of assumptions has its value and place. Any plan to deploy and use data is likely to include the tasks of (a) determining the settings, (b) deciding which set of assumptions (OWA and nUNA, or CWA and UNA) to use in each setting, and (c) deciding how to navigate between settings that may require different sets of assumptions. Since the sets of assumptions conflict, support for bridging between the settings can be valuable.

In this thesis, we propose an OWL 2 DL extension to support integrity constraints on the Semantic Web. With this extension, OWL 2 DL can adopt the CWA and the weak UNA, so it can be used not only for knowledge representation but also for integrity constraints. OWL can then be used in settings that call for open world reasoning as well as in settings that call for closed world constraint validation, with support for bridging between them. The main contributions of this thesis are as follows:

• An integrity constraint semantics for OWL 2 DL that adopts the CWA and the weak UNA, thus interpreting OWL axioms as integrity constraints.
• A sound and complete solution to integrity constraint validation by reduction to conjunctive query (DCQ^not) answering.
• A solution to explanation and repair of integrity constraint violations, based on explanations of answers to DCQ^not conjunctive queries.
• A prototype implementation based on the DLV rule engine, which we used to evaluate some relatively well-published Semantic Web instance data from real applications: a semantically enabled wine and food advisor using a long-lived wine and foods ontology (labeled Wine in this thesis), three natural science virtual observatory data sets (labeled MLSO, CEDAR, and BCODMO in this thesis), and one large open linked government data family (labeled Data-gov in this thesis).

This thesis addresses an important aspect of data validation on the Semantic Web. It supports the discovery and repair of defects in instance data, thus enabling quality improvement of Semantic Web data. It is important for the Semantic Web community in the following respects:

• It supports non-trivial violation detection, explanation, and repair in Semantic Web instance data, where the ICs and the inferences involved can be complex. This data integrity checking capability goes beyond the integrity constraints in relational databases. The work therefore provides a powerful solution to data integrity representation, validation, explanation, and repair for the Semantic Web.
• Our results affect data publishing on the Semantic Web. Data publication should include explicit encodings of assumptions about completeness and any requirements for the existence of data. This information can then be used to determine which axioms in the referenced ontologies should be used as ICs and which as standard axioms. If the IC axioms from the referenced ontologies do not provide enough restrictions for data integrity checking, additional ICs may need to be modeled. It is important to separate the ICs, both those from referenced ontologies and the additional ones, from the standard axioms.
• It gives an initial proposal for mapping ICs in databases to OWL IC axioms, a first step towards bridging the integrity constraints in databases with those in Semantic Web applications. With this proposal, IC migration from databases to the Semantic Web can be achieved straightforwardly, which facilitates data migration between databases and the Semantic Web.
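
The closed-world reading of an axiom described in the abstract can be illustrated with a small sketch. It is not the thesis's DLV-based implementation and performs no OWL inference. It only shows, on hypothetical toy data, how a constraint such as "every Product has some hasProducer value" turns into a query with negation as failure: report each individual known to be a Product for which no producer is known. Under the standard open-world semantics the same axiom would instead lead a reasoner to assume an unnamed producer exists, and nothing would be flagged. All names (`TRIPLES`, `violations_some_values`, `Product`, `hasProducer`) are assumptions made for illustration.

```python
# Toy instance data as (subject, predicate, object) triples.
# Every identifier here is hypothetical and chosen only for illustration.
TRIPLES = {
    ("p1", "type", "Product"),
    ("p1", "hasProducer", "acme"),
    ("p2", "type", "Product"),  # no producer recorded for p2
}


def instances_of(cls, triples):
    """Individuals explicitly asserted to belong to class `cls`."""
    return {s for (s, p, o) in triples if p == "type" and o == cls}


def has_value(subject, prop, triples):
    """True if at least one `prop` value is recorded for `subject`."""
    return any(s == subject and p == prop for (s, p, _) in triples)


def violations_some_values(cls, prop, triples):
    """Closed-world check of 'every `cls` has some `prop` value'.

    Returns the individuals x such that cls(x) is known and no
    prop(x, y) is known (negation as failure over the given data).
    """
    return sorted(
        x for x in instances_of(cls, triples)
        if not has_value(x, prop, triples)
    )


print(violations_some_values("Product", "hasProducer", TRIPLES))  # ['p2']
```

Adding the triple `("p2", "hasProducer", "acme")` to the data makes the violation list empty, which matches the repair step the abstract mentions: a violation is resolved by supplying the missing data or by retracting the assertion that triggered the constraint.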
Keywords
Semantic Web data, data correctness, integrity constraint validation, Semantic Web instance data, instance data, Semantic Web application, integrity constraint, Semantic Web, data integrity checking, DL extension, data integrity representation