Distributed Proximal Gradient Algorithm for Partially Asynchronous Computer Clusters.

Journal of Machine Learning Research (2018)

Cited 25 | Viewed 88
Abstract
With ever-growing data volumes and model sizes, an error-tolerant, communication-efficient, yet versatile distributed algorithm has become vital to the success of many large-scale machine learning applications. In this work we propose m-PAPG, an implementation of the flexible proximal gradient algorithm in model-parallel systems equipped with the partially asynchronous communication protocol. The worker machines communicate asynchronously subject to a controlled staleness bound s and operate at different frequencies. We characterize several convergence properties of m-PAPG: 1) under a general non-smooth and non-convex setting, we prove that every limit point of the sequence generated by m-PAPG is a critical point of the objective function; 2) under an error bound condition for convex objective functions, we prove that the optimality gap decays linearly over every s steps; 3) under the Kurdyka-Łojasiewicz inequality and a sufficient decrease assumption, we prove that the sequences generated by m-PAPG converge to the same critical point, provided that a proximal Lipschitz condition is satisfied.
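To make the core mechanism concrete, below is a minimal single-process Python sketch of a model-parallel proximal gradient step in which each block update reads a parameter snapshot that may be several iterations stale, up to the staleness bound s. The toy l1-regularized least-squares objective, the prox_l1 helper, and the random staleness schedule are all illustrative assumptions for this sketch, not the paper's actual cluster implementation.

# Single-process sketch of the partially asynchronous, model-parallel
# proximal gradient idea. The objective, helper names, and staleness
# schedule are illustrative assumptions, not the authors' code.
import numpy as np

rng = np.random.default_rng(0)

# Toy objective: 0.5 * ||A x - b||^2 + lam * ||x||_1
n, d = 50, 20
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)
lam = 0.1
eta = 1.0 / np.linalg.norm(A, 2) ** 2  # step size <= 1/L (L = spectral norm squared)

def prox_l1(v, t):
    # Proximal operator of t * ||.||_1 (soft-thresholding).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# Model parallelism: each "worker" owns one block of coordinates.
blocks = [np.arange(0, d // 2), np.arange(d // 2, d)]
s = 3  # staleness bound: reads may lag by fewer than s iterations

history = [np.zeros(d)]  # shared-parameter snapshots, one per iteration
for k in range(200):
    x = history[-1].copy()
    for blk in blocks:
        # Each worker computes its gradient from a possibly stale snapshot,
        # mimicking the partially asynchronous communication protocol.
        lag = int(rng.integers(0, min(s, len(history))))
        x_stale = history[-1 - lag]
        g = A.T @ (A @ x_stale - b)  # full gradient at the stale point
        # Proximal gradient update restricted to this worker's block.
        x[blk] = prox_l1(x_stale[blk] - eta * g[blk], eta * lam)
    history.append(x)

print("final objective:",
      0.5 * np.sum((A @ history[-1] - b) ** 2) + lam * np.sum(np.abs(history[-1])))

With lag fixed to 0 this reduces to an ordinary synchronous proximal gradient method; the staleness bound s is what the paper's convergence guarantees (critical-point limits, linear decay of the optimality gap over every s steps) are stated against.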
Keywords
proximal gradient, distributed system, model parallel, partially asynchronous, machine learning