A 28 nm 16 Kb Bit-Scalable Charge-Domain Transpose 6T SRAM In-Memory Computing Macro

IEEE Transactions on Circuits and Systems I: Regular Papers (2023)

Abstract
This article presents a compact, robust, and transposable SRAM in-memory computing (IMC) macro that supports both feedforward (FF) and backpropagation (BP) computation within a single macro. The transpose macro is built on a clustering structure in which eight 6T bitcells share one charge-domain computing unit (CCU) to deploy DNN weights efficiently. The normalized area overhead of the clustering structure relative to a standard 6T SRAM cell is only 0.37. During computation, the CCU performs robust charge-domain operations on the parasitic capacitances of the local bitlines in the IMC cluster. In FF mode, the proposed design supports 128-input 1b XNOR and 1b AND multiply-and-accumulate (MAC) operations. The 1b AND operation can be extended to multi-bit MAC via bit-serial (BS) mapping, supporting DNNs of various precisions. A power-gated auto-zero Flash analog-to-digital converter (ADC) reduces the input offset voltage while maintaining the overall energy efficiency and throughput. The proposed macro is prototyped in a 28-nm CMOS process. It demonstrates a 1b energy efficiency of 166|257 TOPS/W in FF-XNOR|AND mode and 31.8 TOPS/W in BP mode, respectively. The macro achieves 80.26%|85.07% classification accuracy on the CIFAR-10 dataset with 1b|4b CNN models. In addition, 95.50% classification accuracy on the MNIST dataset (versus 95.66% software accuracy) is achieved by the BP mode of the proposed transpose IMC macro.
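The two FF-mode operations described above can be illustrated at the behavioral level. The sketch below is a functional model only (it does not capture the charge-domain circuit or ADC quantization, and the function names and bit widths are illustrative): a 1b XNOR MAC where {0, 1} bits encode {-1, +1} activations and weights, and a bit-serial mapping that builds an unsigned multi-bit dot product from repeated 1b AND MACs with shift-and-add.

```python
import numpy as np

def xnor_mac(x_bits, w_bits):
    """1b XNOR MAC: bits {0,1} encode values {-1,+1}.

    For N inputs, dot = N - 2 * popcount(x XOR w), since matching
    bits (XNOR = 1) contribute +1 and mismatches contribute -1.
    """
    x = np.asarray(x_bits)
    w = np.asarray(w_bits)
    return int(x.size - 2 * np.sum(x ^ w))

def bit_serial_mac(inputs, weights, x_bits=4, w_bits=4):
    """Multi-bit MAC composed from 1b AND partial sums (bit-serial mapping).

    Each input bit-plane is ANDed with each stored weight bit-plane;
    the 1b products are accumulated (as a 128-input 1b AND MAC would),
    then shifted by the combined bit significance and added digitally.
    Unsigned operands are assumed for simplicity.
    """
    inputs = np.asarray(inputs, dtype=np.int64)
    weights = np.asarray(weights, dtype=np.int64)
    acc = 0
    for j in range(x_bits):          # input bit-plane, applied serially
        x_plane = (inputs >> j) & 1
        for i in range(w_bits):      # weight bit-plane stored in the array
            w_plane = (weights >> i) & 1
            partial = int(np.sum(x_plane & w_plane))  # 1b AND MAC
            acc += partial << (i + j)                 # binary-weighted shift-add
    return acc
```

Both functions reproduce the exact integer dot product, e.g. `bit_serial_mac(x, w)` equals `x @ w` for any unsigned 4b vectors, which is why bit-serial mapping preserves DNN accuracy at the cost of one 1b MAC cycle per input/weight bit-plane pair.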
Keywords
Computer architecture, Energy efficiency, Transistors, Capacitors, Microprocessors, Training, Throughput, Transposable, In-memory computing (IMC), Backpropagation (BP), Charge domain, Clustering structure