Data Integration, Management, and Quality: From Basic Research to Industrial Application

Database and Expert Systems Applications, DEXA 2022 Workshops (2022)

Abstract
Data integration, data management, and data quality assurance are essential tasks in any data science project. However, these tasks are often not given the same priority as core data analytics tasks, such as the training of statistical models. One reason is that data analytics generates directly reportable results, whereas data management is only a precondition for them and has no clear notion of corporate value attached to it. Yet the success of the two is strongly connected, and in practice many data science projects fail because too little emphasis is put on the integration, management, and quality assurance of the data to be analyzed. In this paper, we motivate the importance of data integration, data management, and data quality by means of four industrial use cases that highlight key challenges in industrial applied-research projects. Based on the use cases, we present our approach to conducting such projects successfully: how to start a project by asking the right questions, and how to apply and develop appropriate tools that address these challenges. Finally, we summarize our lessons learned and the open research challenges to facilitate further research in this area.
Keywords
Data management, Data quality, Data integration, Metadata management, Applied research