Parallelizing Machine Learning Optimization Algorithms on Distributed Data-Parallel Platforms with Parameter Server

2018 IEEE 24th International Conference on Parallel and Distributed Systems (ICPADS)(2018)

Cited by 5
Abstract
In the big data era, machine learning optimization algorithms usually need to be designed and implemented on widely-used distributed computing platforms, such as Apache Hadoop, Spark, and Flink. However, these general distributed computing platforms do not themselves focus on parallelizing machine learning optimization algorithms. In this paper, we present a parallel optimization algorithm framework for scalable machine learning, and empirically evaluate synchronous Elastic Averaging SGD (EASGD) and other distributed SGD-based optimization algorithms. First, we design a distributed machine learning optimization algorithm framework based on Apache Spark by adopting the parameter server. Then, we design and implement the widely-used distributed synchronous EASGD and several other popular SGD-based optimization algorithms, such as Adadelta and Adam, on top of the framework. In addition, we evaluate the performance of synchronous distributed EASGD against the other distributed optimization algorithms on the same framework. Finally, to explore the optimal setting of mini-batch size in large-scale distributed optimization, we further analyze the empirical linear scaling rule originally proposed for the single-node environment. Experimental results show that our parallel optimization algorithm framework achieves good flexibility and scalability. Moreover, the distributed synchronous EASGD running on the proposed framework achieves competitive convergence performance and is about 5.7% faster than the other distributed SGD-based optimization algorithms. Experimental results also verify that the empirical linear scaling rule holds only until the mini-batch size exceeds a certain threshold on large-scale benchmarks in the distributed environment.
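The core update the abstract refers to, synchronous EASGD, couples each worker's SGD step to a shared center variable held by the parameter server. The following is a minimal single-process sketch of one synchronous round on a toy quadratic loss; the step sizes, the coupling strength `rho`, and the loss itself are illustrative assumptions, not the paper's actual Spark implementation.

```python
# Hedged sketch of synchronous EASGD (Zhang et al., 2015), the update
# parallelized via a parameter server in the paper. Toy values only.

def easgd_round(workers, center, grad_fn, eta=0.1, rho=0.5):
    """One synchronous round: each worker takes an SGD step plus an
    elastic pull toward the center; the center then moves toward the
    workers' (pre-update) positions."""
    alpha = eta * rho  # elastic coupling step size
    new_workers = [x - eta * (grad_fn(x) + rho * (x - center))
                   for x in workers]
    center = center + alpha * sum(x - center for x in workers)
    return new_workers, center

# Toy loss f(x) = 0.5 * x**2, so grad(x) = x; the minimum is at 0.
grad = lambda x: x
workers = [4.0, -2.0, 6.0]   # three simulated workers
center = 0.0                 # parameter-server center variable
for _ in range(50):
    workers, center = easgd_round(workers, center, grad)
print(center)  # center ends up near the optimum 0.0
```

In a real parameter-server deployment, the worker update runs on the data partitions while only the center update needs the aggregated worker states, which is what makes the synchronous variant map naturally onto a Spark reduce step.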
Keywords
Optimization,Servers,Machine learning,Training,Machine learning algorithms,Sparks,Computational modeling