Polyomino: A 3D-SRAM-Centric Accelerator for Randomly Pruned Matrix Multiplication With Simple Reordering Algorithm and Efficient Compression Format in 180-nm CMOS

IEEE Transactions on Circuits and Systems I: Regular Papers (2023)

Abstract
We present a sparse matrix reordering algorithm together with Polyomino, a novel 3D-SRAM-centric accelerator that efficiently processes the reordered matrices for parameter compression. Reordering randomly pruned, irregularly structured sparse matrices into regularly structured ones increases both the data compression ratio and the efficiency of hardware processing. The reordering algorithm is simple to implement because it can be reduced to the widely known k-sum problem. We also developed a compression format for storing the reordered matrices and show that the regular structure obtained by reordering reduces the required memory by 63% compared with the conventional method. The proposed Polyomino accelerator processes reordered matrices efficiently by using 3D stacked SRAM, an external memory that offers random access and low latency. Measurement results from a test chip fabricated in a 180-nm CMOS process demonstrate that the proposed accelerator achieves high area efficiency and high energy efficiency and scales well with the pruning rate.
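The abstract names the k-sum problem but does not describe how the reordering maps onto it. As a generic illustration of the problem only, and not the authors' algorithm, the sketch below searches for k entries that add up to a target value. The function name `find_k_sum`, the brute-force search, and the interpretation of the inputs as per-row nonzero counts are all assumptions made for illustration.

```python
from itertools import combinations


def find_k_sum(values, k, target):
    """Return indices of k entries of `values` that sum to `target`, or None.

    Brute-force search over all k-element index combinations; this is a
    textbook statement of k-sum, not the method used in the paper.
    """
    for idx in combinations(range(len(values)), k):
        if sum(values[i] for i in idx) == target:
            return idx
    return None


# Hypothetical input: nonzero counts of six randomly pruned rows.
nonzeros_per_row = [3, 1, 4, 2, 5, 1]

# Look for three rows whose nonzero counts together fill a block of size 8
# (the block size is an assumed value chosen for this example).
group = find_k_sum(nonzeros_per_row, k=3, target=8)
print(group)
```

Under this assumed reading, rows whose nonzero counts sum to a fixed size could be grouped into one regular block; the paper itself should be consulted for the actual formulation.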
Keywords
Sparse matrices, Memory management, Hardware, Computer architecture, Random access memory, Neural networks, Transformers, 3D integration, compressed sparse matrix format, deep neural networks (DNNs), pruning, static random access memory (SRAM), vision transformer