RASHT: A Partially Reconfigurable Architecture for Efficient Implementation of CNNs

IEEE Transactions on Very Large Scale Integration (VLSI) Systems(2022)

引用 4|浏览9
暂无评分
摘要
Convolutional neural networks (CNNs) are widely used in machine learning (ML) applications such as image processing. CNN requires heavy computations to provide significant accuracy for many ML tasks. Therefore, the efficient implementations of CNNs to improve performance using limited resources without accuracy reduction is a challenge for ML systems. One of the architectures for the efficient execution of CNNs is the array-based accelerator, that consists of an array of similar processing elements (PEs). The array accelerators are popular as high-performance architecture using the features of parallel computing and data reuse. These accelerators are optimized for a set of CNN layers, not for individual layers. Using the same accelerator dimension size to compute all CNN layers with varying shapes and sizes leads to the resource underutilization problem. We propose a flexible and scalable architecture for array-based accelerator that increases resource utilization by resizing PEs to better match the different shapes of CNN layers. The low-cost partial reconfiguration improves resource utilization and performance, resulting in a 23.2% reduction in computational times of GoogLeNet compared to the state-of-the-art accelerators. The proposed architecture decreases the on-chip memory access rate by 26.5% with no accuracy loss.
更多
查看译文
关键词
Array accelerator,convolutional neural network (CNN),image processing and computer vision,machine learning (ML),reconfigurable hardware
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要