Asynchronous Dual Free Stochastic Dual Coordinate Ascent for Distributed Data Mining

2018 IEEE International Conference on Data Mining (ICDM), 2018

Abstract
Primal-dual distributed computational methods have broad applications in large-scale data mining. Previous primal-dual distributed methods are not applicable when the dual formulation is unavailable, e.g., for sum-of-non-convex objectives. Moreover, these algorithms and their theoretical analyses rest on the fundamental assumption that the machines in a cluster compute at similar speeds. In practice, however, the straggler problem is unavoidable in distributed systems because slow machines exist, so the total running time of a distributed optimization method depends heavily on the slowest machine. In this paper, we address both issues by proposing a novel distributed asynchronous dual-free stochastic dual coordinate ascent algorithm for distributed data mining. Our method does not require the dual formulation of the target problem. We tackle the straggler problem through asynchronous communication, which significantly alleviates the negative effect of slow machines. We also analyze the convergence of our method and prove a linear convergence rate even when the individual functions in the objective are non-convex. Experiments on both convex and non-convex loss functions validate our claims.
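To make the "dual-free" idea concrete, here is a minimal single-machine sketch of the commonly described dual-free SDCA update. It is not the asynchronous distributed algorithm of the paper, whose details the abstract does not give. Each example keeps a pseudo-dual vector alpha_i, and the primal iterate is maintained as w = (1/(lam*n)) * sum_i alpha_i, so no dual objective is ever evaluated. The function name `dual_free_sdca`, the ridge-regression objective, and the default step size are all assumptions made for illustration.

```python
import random

def dual_free_sdca(X, y, lam=0.1, eta=None, epochs=50, seed=0):
    """Single-machine dual-free SDCA sketch (illustrative, hypothetical helper).

    Minimizes (1/n) * sum_i 0.5 * (x_i . w - y_i)^2 + (lam / 2) * ||w||^2.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    if eta is None:
        # Assumption: a conservative step size picked for this sketch,
        # not a value taken from the paper. L bounds the smoothness of each f_i.
        L = max(sum(v * v for v in x) for x in X)
        eta = 1.0 / (4.0 * max(L, lam * n))
    alpha = [[0.0] * d for _ in range(n)]  # pseudo-dual vectors, one per example
    w = [0.0] * d                          # invariant: w = sum_i alpha_i / (lam * n)
    for _ in range(epochs * n):
        i = rng.randrange(n)
        residual = sum(xj * wj for xj, wj in zip(X[i], w)) - y[i]
        # v = grad f_i(w) + alpha_i, which vanishes at the optimum
        v = [residual * xj + aj for xj, aj in zip(X[i], alpha[i])]
        # Updating alpha_i and w with matching scalings preserves the invariant.
        alpha[i] = [aj - eta * lam * n * vj for aj, vj in zip(alpha[i], v)]
        w = [wj - eta * vj for wj, vj in zip(w, v)]
    return w, alpha
```

Because the update only needs the gradient of one f_i at the current w, the same step remains well defined when an individual f_i is non-convex. An asynchronous variant would let workers apply such steps to a possibly stale copy of w; that part is left out here.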
Keywords
Asynchronous Distributed Data Mining,Stochastic Dual Coordinate Ascent,Big Data Mining