Cambricon-Q: A Hybrid Architecture for Efficient Training

2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA)

Cited 16 | Viewed 67
Abstract
Deep neural network (DNN) training is notoriously time-consuming, and quantization is a promising way to improve training efficiency by reducing bandwidth/storage requirements and computation costs. However, state-of-the-art quantized algorithms with negligible training accuracy loss, which require on-the-fly statistic-based quantization over a large amount of data (e.g., neurons and weights) and high-precision weight updates, cannot be effectively deployed on existing DNN accelerators. To address this problem, we propose the first customized architecture for efficient quantized training with negligible accuracy loss, named Cambricon-Q. Cambricon-Q features a hybrid architecture consisting of an ASIC acceleration core and a near-data-processing (NDP) engine. The acceleration core mainly targets improving the efficiency of statistic-based quantization, with specialized computing units for both statistical analysis (e.g., determining the maximum) and data reformatting, while the NDP engine avoids transferring the high-precision weights from the off-chip memory to the acceleration core. Experimental results show that on the evaluated benchmarks, Cambricon-Q improves the energy efficiency of DNN training by 6.41× and 1.62×, and performance by 4.20× and 1.70×, compared to GPU and TPU, respectively, with only ≤ 0.4% accuracy degradation compared with full-precision training.
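As a minimal sketch of the two mechanisms the abstract describes, the snippet below illustrates (1) statistic-based quantization, where a scale is derived from the tensor's maximum absolute value before reformatting the data to a low-precision type, and (2) a high-precision master copy of the weights that is updated in full precision and only re-quantized for compute. The symmetric int8 format and all function names here are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def quantize_int8(x: np.ndarray) -> tuple[np.ndarray, float]:
    """Statistic-based symmetric quantization: scan the tensor for its
    maximum absolute value, derive a scale, then reformat to int8."""
    max_abs = np.abs(x).max()                    # statistical analysis step
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)  # data reformatting
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

# High-precision weight update: the FP32 master copy stays near the data
# (in Cambricon-Q, handled by the NDP engine beside off-chip memory), so
# only the quantized copy travels to the compute-heavy acceleration core.
master_w = np.random.randn(256, 256).astype(np.float32)
grad = np.random.randn(256, 256).astype(np.float32)
lr = 1e-2

master_w -= lr * grad                            # update in full precision
q_w, w_scale = quantize_int8(master_w)           # re-quantize for the next step
```

In this sketch the max-scan and the rounding/clipping pass correspond to the statistical-analysis and data-reformatting units of the acceleration core, and the full-precision update loop corresponds to the work the NDP engine performs without round-tripping FP32 weights through the core.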
Keywords
hybrid architecture,neural network training,computation costs,negligible training accuracy loss,DNN accelerators,customized architecture,efficient quantized training,negligible accuracy loss,ASIC acceleration core,near-data-processing engine,statistic-based quantization,specialized computing units,statistical analysis,NDP engine,high-precision weights,energy efficiency,DNN training,accuracy degradation,precision training,quantized algorithms