I²CU: A Dedicated Im2col Hardware Unit

Tao Zhongyu, Wang Yuanfeng, Zhang Huaisheng

2022 19th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP) (2022)

Abstract

In Convolutional Neural Networks (CNNs), the convolution of a feature map with a weight map is usually implemented with the im2col + GEMM method. The conventional method, however, must first expand the feature map into a large feature matrix in one kernel function, according to the convolution parameters (i.e., filter size, padding, and stride); the matrix multiplication then takes place in a separate function. The conventional method therefore generates a large amount of data transfer, and the large feature matrix requires enormous storage space, which makes it hardware-unfriendly. We design I²CU (Im2Col Unit), a dedicated hardware unit that implements im2col in a hardware-friendly way. I²CU dynamically expands the loaded 4D-Block returned from the texture unit and writes the destination matrix back to shared memory. I²CU reduces the storage space needed for the feature matrix and allows im2col + GEMM to be implemented in one kernel function.

Keywords

Convolution Neural Network, I2CU, Feature Matrix, Shared memory
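
The conventional im2col + GEMM flow that the abstract contrasts against can be illustrated with a small software sketch. This is not the I²CU design or the authors' code: it is a minimal pure-Python reference, assuming a single image in C x H x W layout and filters in F x C x kh x kw layout. The function names `im2col`, `gemm`, and `conv2d_im2col` are hypothetical and chosen only for illustration.

```python
def im2col(fmap, kh, kw, stride=1, pad=0):
    """Expand a C x H x W feature map (nested lists) into a feature matrix.

    Each column holds one receptive field, so the result has
    C*kh*kw rows and out_h*out_w columns. Padded positions stay 0.
    """
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    out_h = (h + 2 * pad - kh) // stride + 1
    out_w = (w + 2 * pad - kw) // stride + 1
    cols = [[0] * (out_h * out_w) for _ in range(c * kh * kw)]
    for ch in range(c):
        for i in range(kh):
            for j in range(kw):
                row = (ch * kh + i) * kw + j
                for oy in range(out_h):
                    for ox in range(out_w):
                        y = oy * stride + i - pad
                        x = ox * stride + j - pad
                        if 0 <= y < h and 0 <= x < w:
                            cols[row][oy * out_w + ox] = fmap[ch][y][x]
    return cols, out_h, out_w


def gemm(a, b):
    """Plain matrix multiply: (M x K) times (K x N) gives (M x N)."""
    return [[sum(x * y for x, y in zip(a_row, b_col)) for b_col in zip(*b)]
            for a_row in a]


def conv2d_im2col(fmap, weights, stride=1, pad=0):
    """Convolution as two separate steps: im2col, then GEMM."""
    kh, kw = len(weights[0][0]), len(weights[0][0][0])
    cols, out_h, out_w = im2col(fmap, kh, kw, stride, pad)
    # Flatten each filter into one row of length C*kh*kw.
    w_mat = [[v for ch in f for r in ch for v in r] for f in weights]
    return gemm(w_mat, cols), out_h, out_w


# One 3x3 channel, one 2x2 filter of ones, stride 1, no padding.
fmap = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]
weights = [[[[1, 1], [1, 1]]]]
cols, out_h, out_w = im2col(fmap, 2, 2)
out, _, _ = conv2d_im2col(fmap, weights)
print(cols)  # 4 x 4 feature matrix built from a 3 x 3 input
print(out)   # one output row per filter
```

Even in this toy case the expanded feature matrix holds 16 values for a 9-value input, because overlapping receptive fields are duplicated. That duplication is the storage and data-transfer overhead the abstract attributes to the conventional two-step method.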
