Training Acceleration for Deep Neural Networks: A Hybrid Parallelization Strategy

58th ACM/IEEE Design Automation Conference (DAC), 2021

Abstract
Deep Neural Networks (DNNs) are widely investigated due to their striking performance in various applications of artificial intelligence. However, as DNNs become larger and deeper, the computing resources of a single hardware accelerator are insufficient to meet the training requirements of popular DNNs. Hence, they must be trained on multiple accelerators in a distributed setting. To better utilize the accelerators and speed up training, the whole process must be partitioned into segments that can run in parallel. However, in this context, intra-layer parallelization techniques (i.e., data and model parallelism) often face communication and memory bottlenecks, while the performance and resource utilization of inter-layer parallelization techniques (i.e., pipelining) depend on how the model can be partitioned. We present EffTra, a synchronous hybrid parallelization strategy that combines intra-layer and inter-layer parallelism to realize distributed training of DNNs. EffTra employs dynamic programming to search for the optimal partitioning of a DNN model and assigns devices to the resulting partitions. Our evaluation shows that EffTra accelerates training by up to 2.0x and 1.78x compared to state-of-the-art inter-layer (i.e., GPipe) and intra-layer (i.e., data parallelism) parallelization techniques, respectively.
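The abstract only sketches how EffTra's planner works. As an illustration (not the paper's actual algorithm), the following Python snippet shows the kind of chain-partitioning dynamic program such a planner could build on: it splits a chain of layers with assumed per-layer time estimates into a given number of contiguous pipeline stages so that the slowest stage (the pipeline bottleneck) is as fast as possible. The function name, the per-layer cost inputs, and the simplified cost model are hypothetical placeholders.

```python
# Minimal sketch, assuming per-layer time estimates are available:
# a dynamic program that cuts a chain of layers into `num_stages`
# contiguous pipeline stages, minimizing the slowest stage's time.
from functools import lru_cache


def partition_layers(layer_costs, num_stages):
    """Return (bottleneck_time, stage_start_indices) for a layer chain."""
    n = len(layer_costs)
    prefix = [0.0]
    for c in layer_costs:
        prefix.append(prefix[-1] + c)

    def span(i, j):
        # Total cost of layers i..j-1 placed on a single stage.
        return prefix[j] - prefix[i]

    @lru_cache(maxsize=None)
    def best(j, s):
        # best(j, s): minimal bottleneck when the first j layers use s stages.
        if s == 1:
            return span(0, j), (0,)
        result = (float("inf"), ())
        for k in range(s - 1, j):  # last stage holds layers k..j-1
            sub_bottleneck, cuts = best(k, s - 1)
            candidate = max(sub_bottleneck, span(k, j))
            if candidate < result[0]:
                result = (candidate, cuts + (k,))
        return result

    return best(n, num_stages)


if __name__ == "__main__":
    # Hypothetical example: 8 layers with uneven costs on 3 accelerators.
    costs = [4.0, 2.0, 1.0, 3.0, 5.0, 2.0, 2.0, 1.0]
    print(partition_layers(costs, 3))
```

A full planner along the lines described in the abstract would additionally need to model activation-communication costs between stages, per-device memory limits, and how many data-parallel replicas to assign to each stage when combining pipelining with intra-layer parallelism.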
Keywords
training acceleration,deep neural networks,artificial intelligence,single hardware accelerator,multiple accelerators,distributed setting,intra-layer parallelization techniques,memory bottlenecks,resource utilization,inter-layer parallelization techniques,EffTra,synchronous hybrid parallelization strategy,distributed training,optimal partitioning,DNN model,data parallelism,dynamic programming