On Convergence of Model Parallel Proximal Gradient Algorithm for Stale Synchronous Parallel System.

JMLR Workshop and Conference Proceedings (2016)

Abstract
With ever-growing data volume and model size, an error-tolerant, communication-efficient, yet versatile parallel algorithm has become vital to the success of many large-scale applications. In this work we propose msPG, an extension of the flexible proximal gradient algorithm to the model-parallel and stale-synchronous setting. The worker machines of msPG operate asynchronously as long as they are not too far apart, and they communicate efficiently through a dedicated parameter server. On the theoretical side, we provide a rigorous analysis of the convergence properties of msPG. A salient feature of this analysis is its generality: it covers both nonsmooth and nonconvex functions. Under mild conditions, we prove that the whole iterate sequence of msPG converges to a critical point (which is optimal under convexity assumptions). We further provide an economical implementation of msPG that completely bypasses the need to keep a full local model. We confirm our theoretical findings through numerical experiments.
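To make the setting in the abstract concrete, the following is a minimal single-process sketch of a block-wise proximal gradient iteration with bounded staleness. It is not the authors' msPG implementation. The objective (least squares plus an l1 penalty), the random delay model, and every name in it (`soft_threshold`, `grad_block`, `stale_block_prox_grad`, `staleness`) are assumptions chosen for illustration. Each simulated worker owns one block of coordinates and computes its partial gradient from a view in which the other blocks may be up to `staleness` rounds old.

```python
import random


def soft_threshold(v, t):
    """Proximal operator of t * |.| applied to a scalar."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def grad_block(A, b, x, block):
    """Partial gradient of 0.5 * ||A x - b||^2 for the coordinates in `block`."""
    rows, cols = len(A), len(x)
    residual = [sum(A[i][j] * x[j] for j in range(cols)) - b[i] for i in range(rows)]
    return [sum(A[i][j] * residual[i] for i in range(rows)) for j in block]


def stale_block_prox_grad(A, b, lam, blocks, step, staleness, iters, seed=0):
    """Toy simulation (assumed setup, not msPG itself).

    Minimizes 0.5 * ||A x - b||^2 + lam * ||x||_1, where each entry of
    `blocks` is the list of coordinates owned by one simulated worker.
    """
    rng = random.Random(seed)
    n = len(A[0])
    history = [[0.0] * n]  # history[t] is the global model after t rounds
    for t in range(iters):
        current = history[-1]
        new = list(current)
        for block in blocks:
            # Other workers' blocks are read with a delay of at most `staleness` rounds.
            delay = rng.randint(0, min(staleness, t))
            view = list(history[t - delay])
            for j in block:
                view[j] = current[j]  # a worker always sees its own block fresh
            g = grad_block(A, b, view, block)
            # Proximal gradient step restricted to the worker's own block.
            for j, gj in zip(block, g):
                new[j] = soft_threshold(current[j] - step * gj, step * lam)
        history.append(new)
    return history[-1]
```

Setting `staleness=0` recovers an ordinary synchronous block proximal gradient step, which makes it easy to compare the two regimes on a small problem, for example `stale_block_prox_grad([[1.0, 0.2], [0.2, 1.0]], [1.0, -0.5], 0.1, [[0], [1]], 0.1, 3, 2000)`. The sketch keeps the full model history in memory for simplicity, so it does not reflect the economical implementation mentioned in the abstract.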