Integrating Approximate Summarization with Provenance Capture.

Seokki Lee,Xing Niu,Bertram Ludäscher,Boris Glavic

TaPP（2017）

引用 23|浏览6

暂无评分

摘要

How to use provenance to explain why a query returns a result or why a result is missing has been studied extensively. Recently, we have demonstrated how to uniformly answer these types of provenance questions for first-order queries with negation and have presented an implementation of this approach in our PUG (Provenance Unification through Graphs) system. However, for realistically-sized databases, the provenance of answers and missing answers can be very large, overwhelming the user with too much information and wasting computational resources. In this paper, we introduce an (approximate) summarization technique that generates compact representations of why and why-not provenance. Our technique uses patterns as a summarized representation of sets of elements from the provenance, i.e., successful or failed derivations. We rank these patterns based on their descriptiveness (we use precision and recall as quality measures for patterns) and return only the top-k summaries. We demonstrate how this summarization technique can be integrated with provenance capture to compute summaries on demand and how sampling techniques can be employed to speed up both the summarization and capture steps. Our preliminary experiments demonstrate that this summarization technique scales to large instances of a real-world dataset.

查看译文

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要