A 2.9–33.0 TOPS/W Reconfigurable 1-D/2-D Compute-Near-Memory Inference Accelerator in 10-nm FinFET CMOS

IEEE Solid-State Circuits Letters (2020)

Abstract
A 10-nm compute-near-memory (CNM) accelerator augments SRAM with multiply accumulate (MAC) units to reduce interconnect energy and achieve 2.9 8b-TOPS/W for matrix–vector computation. The CNM provides high memory bandwidth by accessing SRAM subarrays to enable low-latency, real-time inference in fully connected and recurrent neural networks with small mini-batch sizes. For workloads with greater a...
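To make the workload concrete, below is a minimal sketch (an assumption for illustration, not the paper's implementation) of the 8-bit matrix–vector multiply-accumulate computation a CNM accelerator of this kind targets: each MAC unit sums int8 weight × activation products into a wider accumulator, one accumulator per output element.

```python
def matvec_mac_int8(weights, x):
    """8b x 8b multiply-accumulate into one wide accumulator per output row.

    weights: list of rows of int8 values; x: int8 input vector.
    Hypothetical reference model, not the accelerator's actual datapath.
    """
    out = []
    for row in weights:
        acc = 0  # a 32-bit accumulator in hardware; Python ints are unbounded
        for w, a in zip(row, x):
            assert -128 <= w <= 127 and -128 <= a <= 127  # int8 operands
            acc += w * a  # one MAC operation
        out.append(acc)
    return out

W = [[1, -2, 3], [0, 5, -1]]
x = [10, 20, 30]
print(matvec_mac_int8(W, x))  # → [60, 70]
```

In a CNM design, each SRAM subarray would hold a slice of `W` and feed a nearby MAC unit, so the weight values never traverse a long interconnect; the sketch above only models the arithmetic, not that placement.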
Keywords
Compute-near-memory (CNM), deep learning ASIC, deep learning inference, reconfigurable systolic array, variable-precision