CAESAR: A CNN Accelerator Exploiting Sparsity and Redundancy Pattern

Seongwook Kim, Yongjun Kim, Gwangeun Byeon, Seokin Hong

2023 International Technical Conference on Circuits/Systems, Computers, and Communications (ITC-CSCC) (2023)

Abstract
Convolutional Neural Networks (CNNs) have shown outstanding performance in many computer vision applications. However, CNN inference on mobile and edge devices is challenging because of its high computation demands. Many prior studies have addressed this challenge by reducing data precision with quantization techniques, which leads to abundant redundancy in CNN models. This paper proposes CAESAR, a CNN accelerator that eliminates redundant computations to reduce the computation demands of CNN inference. By analyzing the computation pattern of the convolution layer, CAESAR predicts where redundant computations occur and removes them from execution. CAESAR then remaps the remaining effectual computations onto the processing elements originally assigned to the redundant computations, so that all processing elements are fully utilized. In our evaluation with a cycle-level microarchitecture simulator, CAESAR achieves an overall speedup of up to 2.13x and reduces energy consumption by 78% relative to a TPU-like baseline accelerator.
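The abstract does not describe CAESAR's prediction or remapping logic, so the following is only a minimal software sketch of the general idea it builds on: with low-precision (quantized) operands, many multiplications in a dot product are either ineffectual (a zero operand) or repeat an earlier operand pair. The function names, the product cache, and the sample values are all hypothetical and are not taken from the paper; a hardware accelerator would realize this very differently.

```python
def dot_naive(weights, activations):
    """Baseline: one multiplication per weight/activation pair."""
    total = 0
    mults = 0
    for w, a in zip(weights, activations):
        total += w * a
        mults += 1
    return total, mults


def dot_with_reuse(weights, activations):
    """Illustrative only: skip zero operands and reuse repeated products.

    This is an assumed software analogue, not CAESAR's mechanism.
    """
    cache = {}  # (weight, activation) -> product computed earlier
    total = 0
    mults = 0
    for w, a in zip(weights, activations):
        if w == 0 or a == 0:
            continue  # ineffectual computation: contributes nothing
        key = (w, a)
        if key not in cache:
            cache[key] = w * a  # the only place a multiplication happens
            mults += 1
        total += cache[key]
    return total, mults


# Hypothetical quantized operands with zeros and repeated values.
weights = [0, 3, 3, -1, 0, 3, -1, 2]
activations = [5, 2, 2, 4, 7, 2, 4, 0]

print(dot_naive(weights, activations))
print(dot_with_reuse(weights, activations))
```

Both functions return the same sum, while the second performs fewer multiplications on this input. In hardware, the skipped slots would leave processing elements idle unless the remaining work is remapped onto them, which is the utilization problem the abstract says CAESAR addresses.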
Keywords
Accelerator, Convolution Neural Network, Computation Reuse