Efficient neural network acceleration on GPGPU using content addressable memory.

Mohsen Imani, Daniel Peroni, Yeseong Kim, Abbas Rahimi, Tajana Rosing

DATE (2017)

Citations: 60 | Views: 85


Abstract

Recently, neural networks have been shown to be effective models for image processing, video segmentation, speech recognition, computer vision, and gaming. However, high energy consumption and low performance are the primary bottlenecks in running neural networks. In this paper, we propose an energy- and performance-efficient neural network acceleration technique for the general-purpose GPU (GPGPU) architecture. It uses specialized resistive nearest content addressable memory blocks, called NNCAM, and exploits the computation locality of learning algorithms. NNCAM stores highly frequent patterns corresponding to neural network operations and searches for the most similar patterns to reuse the computation results. To improve NNCAM computation efficiency and accuracy, we propose layer-based associative update and selective approximation techniques. The layer-based update improves the data locality of NNCAM blocks by filling NNCAM values based on the frequent computation patterns of each neural network layer. To guarantee an appropriate level of computation accuracy while maximizing energy savings, our design adaptively allocates neural network operations to either NNCAM or the GPGPU floating point units (FPUs). The selective approximation relaxes computation on neural network layers by considering the impact on accuracy. In our evaluation, we integrate NNCAM blocks into the modern AMD Southern Islands GPU architecture. Our experimental evaluation shows that the enhanced GPGPU achieves 68% energy savings and a 40% speedup when running four popular convolutional neural networks (CNNs), with an acceptable quality loss of less than 2%.
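
The store-and-reuse idea described in the abstract can be illustrated with a small software sketch. This is not the authors' hardware design: the function names (`build_table`, `nearest_lookup`, `approx_multiply`), the L1 distance used as the similarity measure, and the `max_distance` threshold that falls back to exact computation are all assumptions made for illustration. The abstract does not specify the similarity metric or how operations are assigned to NNCAM versus the FPUs.

```python
from collections import Counter


def build_table(operand_pairs, capacity):
    """Keep the `capacity` most frequent (a, b) operand pairs and precompute a * b.

    Hypothetical stand-in for filling a memory block with frequent patterns.
    """
    counts = Counter(operand_pairs)
    return {pair: pair[0] * pair[1] for pair, _ in counts.most_common(capacity)}


def nearest_lookup(table, a, b):
    """Return (distance, stored_result) for the stored pair closest to (a, b).

    L1 distance is an assumption; the abstract only says "most similar".
    """
    best_pair = min(table, key=lambda p: abs(p[0] - a) + abs(p[1] - b))
    distance = abs(best_pair[0] - a) + abs(best_pair[1] - b)
    return distance, table[best_pair]


def approx_multiply(table, a, b, max_distance):
    """Reuse a stored result when the nearest pattern is close enough.

    Otherwise compute exactly, mimicking a fallback to a floating point unit.
    Returns (result, reused_flag).
    """
    distance, result = nearest_lookup(table, a, b)
    if distance <= max_distance:
        return result, True
    return a * b, False
```

In this sketch, a larger `max_distance` reuses stored results more often at the cost of accuracy, which loosely mirrors the trade-off between energy savings and quality loss that the abstract describes.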


Keywords

GPGPU, energy/performance-efficient neural network acceleration technique, general purpose GPU architecture, specialized resistive nearest content addressable memory blocks, NNCAM, computation locality, learning algorithms, neural network operations, NNCAM computation efficiency, layer-based associative update technique, selective approximation techniques, data locality, computation patterns, energy saving, GPGPU floating point units, FPU, AMD Southern Islands GPU architecture, convolutional neural networks, CNN
