SubMac: Exploiting the subword-based computation in RRAM-based CNN accelerator for energy saving and speedup.

Integration (2019)

Abstract
Although CMOS-based CNN accelerators have achieved impressive progress, the memory wall and high power density remain major barriers to substantial improvements in energy efficiency and throughput. As an attractive alternative, Resistive RAM (RRAM)-based accelerators have recently delivered significant breakthroughs by leveraging in-situ computation. However, challenges remain, including high computation complexity and the large energy overhead of the analog/digital interface circuits. In this work, we take advantage of subword-based computation in the RRAM-based accelerator to achieve energy saving and speedup. First, an encoding method is proposed for the weights and activations to reduce both the energy consumption of the in-situ computation and the resolution requirement of the ADC. The ADC resolution is then further optimized based on the distribution of the subword computation results. Furthermore, a dynamic quantization scheme is proposed that skips 67%–87% of the subword computations, outperforming conventional quantization schemes. We fully investigate how the encoding scheme and the layer-wise quantization range scaling influence the performance of dynamic quantization. Finally, we demonstrate the effectiveness of the proposed algorithms under different hardware configurations and network complexities. A dedicated architecture, SubMac, is proposed to implement the above schemes. Experimental results show that energy efficiency and throughput are improved by 2.8–5.7 times and 2.5–7.9 times, respectively, compared with state-of-the-art RRAM-based accelerators.
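The general idea of subword-based computation can be illustrated with a minimal sketch. It is not the SubMac encoding or its dynamic quantization scheme, neither of which is specified in the abstract. The sketch assumes unsigned 8-bit weights and activations split into 2-bit subwords, and the names split_subwords, subword_dot, and min_shift are hypothetical. It shows that a dot product can be rebuilt from shifted subword partial sums, and that dropping the least significant partial sums reduces the number of subword computations at the cost of an approximate result.

```python
def split_subwords(value, subword_bits=2, num_subwords=4):
    """Split an unsigned integer into subwords, least significant first."""
    mask = (1 << subword_bits) - 1
    return [(value >> (i * subword_bits)) & mask for i in range(num_subwords)]


def subword_dot(weights, activations, subword_bits=2, num_subwords=4, min_shift=0):
    """Dot product assembled from shifted subword partial sums.

    Partial sums whose combined bit shift is below `min_shift` are skipped.
    This threshold is a hypothetical stand-in for a skipping policy, not the
    paper's dynamic quantization. Returns (result, partial_sums_computed).
    """
    w_sub = [split_subwords(w, subword_bits, num_subwords) for w in weights]
    a_sub = [split_subwords(a, subword_bits, num_subwords) for a in activations]
    total = 0
    computed = 0
    for i in range(num_subwords):          # weight subword index
        for j in range(num_subwords):      # activation subword index
            shift = (i + j) * subword_bits
            if shift < min_shift:
                continue                   # skip low-significance partial sum
            # One low-resolution multiply-accumulate over the whole vector.
            partial = sum(w[i] * a[j] for w, a in zip(w_sub, a_sub))
            total += partial << shift
            computed += 1
    return total, computed


weights = [200, 17, 93, 255]
activations = [3, 128, 64, 250]
exact = sum(w * a for w, a in zip(weights, activations))
full, n_full = subword_dot(weights, activations)
approx, n_approx = subword_dot(weights, activations, min_shift=6)
print(exact, full, n_full, approx, n_approx)
```

With min_shift=0 every subword pair is evaluated and the result matches the ordinary dot product. Raising min_shift trades accuracy for fewer subword computations, and each remaining partial sum involves only small operands, which is the property that lets a lower-resolution ADC suffice in an analog crossbar setting.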
Keywords
Convolutional neural network, Resistive RAM, Subword encoding, Dynamic quantization, Computation reduction