A Configurable Multi-Precision CNN Computing Framework Based on Single Bit RRAM

Proceedings of the 56th Annual Design Automation Conference (DAC 2019)

Citations: 86 | Views: 159
Abstract
Convolutional Neural Networks (CNNs) play a vital role in machine learning. Emerging resistive random-access memories (RRAMs) and RRAM-based Processing-In-Memory architectures have demonstrated great potential for boosting both the performance and energy efficiency of CNNs. However, restricted by immature process technology, it is hard to implement and fabricate a CNN accelerator chip based on multi-bit RRAM devices. In addition, existing single-bit RRAM based CNN accelerators focus only on binary or ternary CNNs, which suffer more than 10% accuracy loss compared with full-precision CNNs. This paper proposes a configurable multi-precision CNN computing framework based on single-bit RRAM, which consists of an RRAM computing overhead aware network quantization algorithm and a configurable multi-precision CNN computing architecture built on single-bit RRAM. The proposed method achieves accuracy equivalent to full-precision CNNs while reducing storage consumption and latency through multi-precision quantization. The designed architecture supports accelerating multi-precision CNNs, even when precision varies across layers. Experimental results show that the proposed framework reduces computing area by 70% and computing energy by 75% on average, with nearly no accuracy loss. The equivalent energy efficiency is 1.6~8.6× that of existing RRAM-based architectures, with only 1.07% area overhead.
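To make the core idea concrete, the sketch below shows one common way multi-bit weights can be computed with single-bit storage cells: each weight matrix is decomposed into binary bit-planes, every plane is mapped to its own single-bit crossbar, and the per-plane partial sums are combined by shift-and-add. A layer quantized to fewer bits then occupies fewer crossbars, which is the general mechanism that lets per-layer precision trade accuracy against area and energy. This is a minimal illustrative sketch of bit-sliced computation, not the paper's actual circuit or quantization algorithm; all function names and the idealized crossbar model are assumptions for illustration.

```python
# Hypothetical sketch: bit-sliced matrix-vector multiply on single-bit RRAM cells.
# The crossbar model is idealized (no ADC/DAC, noise, or device variation),
# and the helper names are illustrative assumptions, not the paper's API.
import numpy as np

def to_bit_planes(weights, n_bits):
    """Decompose unsigned fixed-point weights into n_bits binary planes.

    Each plane contains only 0/1 values, so it can be stored directly in
    single-bit RRAM cells (conceptually, one crossbar per plane)."""
    w = weights.astype(np.int64)
    return [(w >> b) & 1 for b in range(n_bits)]      # LSB plane first

def crossbar_mvm(binary_weights, inputs):
    """Idealized analog dot product of one single-bit crossbar:
    bit-line current ~ sum over rows of (input * cell conductance)."""
    return binary_weights.T @ inputs

def multi_precision_mvm(weights, inputs, n_bits):
    """Shift-and-add accumulation of per-plane partial sums.

    Plane b contributes with weight 2**b, so a layer quantized to fewer
    bits simply uses fewer crossbars and fewer accumulation steps."""
    acc = np.zeros(weights.shape[1], dtype=np.int64)
    for b, plane in enumerate(to_bit_planes(weights, n_bits)):
        acc += crossbar_mvm(plane, inputs) << b       # scale plane by 2**b
    return acc

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Example layer: 4-bit unsigned weights, binary input activations.
    W = rng.integers(0, 16, size=(8, 4))              # 8 inputs x 4 outputs
    x = rng.integers(0, 2, size=8)                    # binary input vector
    assert np.array_equal(multi_precision_mvm(W, x, n_bits=4), W.T @ x)
```

Under this decomposition the result is exact for integer weights, which is why the accuracy question in the paper reduces to choosing per-layer bit-widths (the overhead-aware quantization step) rather than to the single-bit mapping itself.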
Keywords
configurable multi-precision CNN computing framework, single-bit RRAM, RRAM computing overhead aware network quantization algorithm, CNN accelerator chip, multi-bit RRAM devices, RRAM-based Processing-In-Memory architectures, convolutional neural networks, machine learning, resistive random-access memories, energy efficiency