Data Provenance Based System For Classification And Linear Regression In Distributed Machine Learning

STRUCTURED OBJECT-ORIENTED FORMAL LANGUAGE AND METHOD (SOFL+MSVL 2019)(2020)

引用 0|浏览51
暂无评分
摘要
Nowadays, data provenance is widely used to increase the accuracy of machine learning models. However, facing the difficulties in information heredity, these models produce data association. Most of the studies in the field of data provenance are focused on specific domains. And there are only a few studies on a machine learning (ML) framework with distinct emphasis on the accurate partition of coherent and physical activities with implementation of ML pipelines for provenance. This paper presents a novel approach to usage of data provenance which is also called data provenance based system for classification and linear regression in distributed machine learning (DPMLR). To develop the comprehensive approach for data analysis and visualization based on a collective set of functions for various algorithms and provide the ability to run large scale graph analysis, we apply StellarGraph as our primary ML structure. The preliminary results on the complex data stream structure showed that the overall overhead is no more than 20%. It opens up opportunities for designing an integrated system which performs dynamic scheduling and network bounded synchronization based on the ML algorithm.
更多
查看译文
关键词
Data provenance, Machine learning, StellarGraph, Pipeline, Distributed computing
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要