Automated and non-intrusive provenance capture with UML2PROV

Carlos Sáenz-Adán,Francisco J. García-Izquierdo,Beatriz Pérez,Trung Dong Huynh,Luc Moreau

Computing（2021）

引用 0|浏览1

暂无评分

摘要

Data provenance is a form of knowledge graph providing an account of what a system performs, describing the data involved, and the processes carried out over them. It is crucial to ascertaining the origin of data, validating their quality, auditing applications behaviours, and, ultimately, making them accountable. However, instrumenting applications, especially legacy ones, to track the provenance of their operations remains a significant technical hurdle, hindering the adoption of provenance technology. UML2PROV is a software-engineering methodology that facilitates the instrumentation of provenance recording in applications designed with UML diagrams. It automates the generation of (1) templates for the provenance to be recorded and (2) the code to capture values required to instantiate those templates from an application at run time, both from the application’s UML diagrams. By so doing, UML2PROV frees application developers from manual instrumentation of provenance capturing while ensuring the quality of recorded provenance. In this paper, we present in detail UML2PROV’s approach to generating application code for capturing provenance values via the means of Bindings Generation Module (BGM). In particular, we propose a set of requirements for BGM implementations and describe an event-based design of BGM that relies on the Aspect-Oriented Programming (AOP) paradigm to automatically weave the generated code into an application. Finally, we present three different BGM implementations following the above design and analyze their pros and cons in terms of computing/storage overheads and implications to provenance consumers.

查看译文

关键词

Data provenance,Application logging,Technology audits

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要