NNPIM: A Processing In-Memory Architecture for Neural Network Acceleration

IEEE Transactions on Computers (2019)

Abstract
Neural networks (NNs) have shown great ability in emerging applications such as speech recognition, language recognition, image classification, video segmentation, and gaming. It is therefore important to make NNs efficient. Although attempts have been made to reduce NNs' computation cost, the data movement between memory and processing cores remains the main bottleneck for NNs' energy consumption and execution time. This makes the implementation of NNs significantly slower on traditional CPU/GPU cores. In this paper, we propose a novel processing in-memory architecture, called NNPIM, that significantly accelerates the neural network inference phase inside the memory. First, we design a crossbar memory architecture that supports fast addition, multiplication, and search operations inside the memory. Second, we introduce simple optimization techniques that significantly improve NNs' performance and reduce the overall energy consumption. We also map all NN functionalities onto parallel in-memory components. To further improve efficiency, our design supports weight sharing to reduce the number of computations in memory and consequently speed up NNPIM computation. We compare the efficiency of our proposed NNPIM with GPU and state-of-the-art PIM architectures. Our evaluation shows that our design achieves 131.5× higher energy efficiency and is 48.2× faster as compared to an NVIDIA GTX 1080 GPU. Compared to state-of-the-art neural network accelerators, NNPIM achieves on average 3.6× higher energy efficiency and is 4.6× faster, while providing the same classification accuracy.
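The abstract only describes weight sharing at a high level; as an illustrative sketch of the general idea (not NNPIM's actual in-memory implementation), the Python snippet below clusters a layer's weights into a small codebook of shared values, so that far fewer distinct multiplications need to be evaluated. The function name share_weights, the pure-NumPy 1-D k-means loop, and the choice of 16 clusters are assumptions made purely for illustration.

```python
import numpy as np

def share_weights(weights, num_clusters=16, iters=20):
    """Cluster a weight tensor into a small set of shared values (1-D k-means).

    Returns the quantized weights and the codebook of shared values.
    Fewer distinct weight values means fewer distinct multiplications
    that an in-memory accelerator would have to evaluate.
    """
    flat = weights.ravel()
    # Initialize cluster centers evenly across the weight range.
    centers = np.linspace(flat.min(), flat.max(), num_clusters)
    for _ in range(iters):
        # Assign each weight to its nearest center.
        idx = np.abs(flat[:, None] - centers[None, :]).argmin(axis=1)
        # Recompute each center as the mean of its assigned weights.
        for k in range(num_clusters):
            members = flat[idx == k]
            if members.size:
                centers[k] = members.mean()
    idx = np.abs(flat[:, None] - centers[None, :]).argmin(axis=1)
    return centers[idx].reshape(weights.shape), centers

if __name__ == "__main__":
    # Example: quantize a random 64x64 layer to at most 16 shared values.
    rng = np.random.default_rng(0)
    w = rng.normal(size=(64, 64)).astype(np.float32)
    w_shared, codebook = share_weights(w, num_clusters=16)
    print("distinct values before:", np.unique(w).size)
    print("distinct values after: ", np.unique(w_shared).size)
```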
Keywords
Artificial neural networks, Biological neural networks, Memory management, Acceleration, Computational modeling, Neurons