TT-CIM: Tensor Train Decomposition for Neural Network in RRAM-Based Compute-in-Memory Systems

IEEE Transactions on Circuits and Systems I: Regular Papers (2023)

Abstract
Compute-in-Memory (CIM) implemented with Resistive-Random-Access-Memory (RRAM) crossbars is a promising approach for accelerating Convolutional Neural Network (CNN) computations. The growing number of parameters in state-of-the-art CNN models, however, creates challenges for on-chip weight storage in CIM implementations, making CNN compression a crucial topic of exploration. Tensor Train (TT) decomposition can decompose a large tensor into smaller ones with fewer parameters, at the cost of an increased number of computations. In this work we propose a technique to minimize intermediate operations across the full convolution operation and improve hardware utilization when implementing TT-CNNs in CIM systems. We first use an iterative decompose-and-fine-tune method to prepare TT-CNNs. We then propose an inter-convolutional-step reuse scheme to reduce the required operation count and the post-mapping RRAM count for TT-CNN implementation in a tiled-CIM architecture. We demonstrate that, through proper mapping, pipelining, and reuse, effective compression ratios of 12 and 20 can be achieved with 0.8% and 1.4% accuracy drops, respectively, for WRN, and effective compression ratios of 6 and 11 with 0.9% and 1.2% accuracy drops for VGG8. We also show that around 30% higher hardware utilization than the original CNN format can be achieved using the proposed TT-CIM approaches.
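For intuition on the compression that motivates TT-CNNs, the sketch below factorizes a small weight tensor into TT cores using a sequential truncated SVD (a standard TT-SVD procedure, not the paper's own code) and compares parameter counts. The tensor shape, rank cap, and function name are illustrative assumptions, not details taken from the paper.

# Illustrative sketch: TT-SVD factorization of a 4-way weight tensor.
# Shapes and the rank cap below are hypothetical examples.
import numpy as np

def tt_svd(tensor, max_rank):
    # Decompose `tensor` into a list of TT cores via sequential truncated SVDs.
    dims = tensor.shape
    cores = []
    rank_prev = 1
    mat = tensor.reshape(dims[0], -1)
    for k in range(len(dims) - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        rank = min(max_rank, len(s))
        # Core k holds the truncated left factor, shaped (r_{k-1}, n_k, r_k).
        cores.append(u[:, :rank].reshape(rank_prev, dims[k], rank))
        # Carry the remainder forward and fold in the next mode dimension.
        mat = (np.diag(s[:rank]) @ vt[:rank]).reshape(rank * dims[k + 1], -1)
        rank_prev = rank
    cores.append(mat.reshape(rank_prev, dims[-1], 1))
    return cores

# Example: a 256x256 weight matrix viewed as a 16x16x16x16 tensor (65,536 parameters).
w = np.random.randn(16, 16, 16, 16)
cores = tt_svd(w, max_rank=8)
tt_params = sum(c.size for c in cores)
print("original:", w.size, "TT:", tt_params)  # TT cores store far fewer parameters

As the abstract notes, this parameter reduction comes at the cost of extra intermediate computations when the cores are contracted at inference time, which is the overhead the proposed reuse and mapping schemes target.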
Keywords
Compute-in-memory, deep neural network, convolutional neural network, neural network compression, tensor train decomposition