Real-Time Data Quality Analysis

2020 IEEE Second International Conference on Cognitive Machine Intelligence (CogMI), 2020

Abstract
Data quality is critically important for big data and machine learning applications. Data quality systems can analyze data sets for quality and detect potential errors. They can also provide remediation to fix problems encountered while analyzing data sets. This paper discusses key features of data quality analysis systems. We also present new algorithms for efficiently maintaining up-to-date data quality metrics on changing data sets. Our algorithms consider anomalies in data regions when determining how much each region contributes to overall data metrics. We also make intelligent choices about which data metrics to update, and how frequently, in order to limit the overhead of metric updates.
Keywords
data quality, data analytics, real-time data analytics
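
The abstract mentions maintaining quality metrics incrementally and letting anomalous data regions influence how much each region contributes to an overall metric. The abstract does not give the algorithms themselves, so the following is only a minimal sketch of that general idea, not the authors' method. It assumes the metric is a missing-value rate, that a "region" is any named partition of the data, and that a region counts as anomalous once its own missing rate exceeds a cutoff. The class names, `anomaly_threshold`, and `anomaly_weight` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RegionStats:
    """Running counts for one data region (e.g. a partition or time window)."""
    total: int = 0
    missing: int = 0

    def add(self, value):
        # O(1) update per incoming value; no rescan of the region is needed.
        self.total += 1
        if value is None:
            self.missing += 1

    @property
    def missing_rate(self):
        return self.missing / self.total if self.total else 0.0


@dataclass
class IncrementalMissingRate:
    """Overall missing-value rate kept up to date as records arrive."""
    anomaly_threshold: float = 0.5  # hypothetical cutoff for an anomalous region
    anomaly_weight: float = 0.0     # hypothetical weight given to anomalous regions
    regions: dict = field(default_factory=dict)

    def update(self, region, value):
        self.regions.setdefault(region, RegionStats()).add(value)

    def overall(self):
        # Weighted combination of per-region counts; anomalous regions
        # contribute with anomaly_weight instead of full weight 1.0.
        weighted_missing = 0.0
        weighted_total = 0.0
        for stats in self.regions.values():
            anomalous = stats.missing_rate > self.anomaly_threshold
            weight = self.anomaly_weight if anomalous else 1.0
            weighted_missing += weight * stats.missing
            weighted_total += weight * stats.total
        return weighted_missing / weighted_total if weighted_total else 0.0
```

With the default `anomaly_weight` of 0.0, anomalous regions are excluded from the overall figure; setting it to 1.0 recovers the plain unweighted rate. Either choice is an illustration of the weighting idea only.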