Modeling Data Curation to Scientific Inquiry: A Case Study for Multimodal Data Integration.
ACM/IEEE Joint Conference on Digital Libraries (2020)
Abstract
Scientific data publications may include interactive data applications designed by scientists to explore a scientific problem. Such applications are defined as knowledge systems, and their development is complex when data are aggregated from multiple sources over time. Multimodal data are created, encoded, and maintained differently, and even when datasets report on identical phenomena, their fields and values may be inconsistent. To assure the validity and accuracy of the application, the data have to abide by curation requirements similar to those governing digital libraries. We present a novel, inquiry-driven curation approach that aims to optimize the curation of multimodal datasets and maximize data reuse by domain researchers. We demonstrate the method through the ASTRIAGraph project, in which multiple data sources about near-Earth space objects are aggregated into a central knowledge system. The process involves multidisciplinary collaboration and results in the design of a data model that serves as the backbone for both data curation and scientific inquiry. We demonstrate a) that data provenance information is needed to assess the uncertainty of the results of scientific inquiries involving multiple data sources, and b) that continuous curation of integrated datasets is facilitated when it is undertaken as an integral part of the research project. The approach provides the flexibility to support the expansion of scientific inquiries and data in the knowledge system, and it allows for transparent and explainable results.