Adapting Skip Connections for Resource-Efficient FPGA Inference

FPGA (2023)

Abstract
Deep neural networks employ skip connections (identity functions that combine the outputs of different layers) to improve training convergence; however, these skip connections are costly to implement in hardware. In particular, for inference accelerators on resource-limited platforms, they require extra buffers, increasing not only on- and off-chip memory utilization but also memory bandwidth requirements. Thus, a network that has skip connections costs more to deploy in hardware than one that has none. We argue that, for certain classification tasks, a network's skip connections are needed for the network to learn but are not necessary for inference after convergence. We thus explore removing skip connections from a fully trained network to mitigate their hardware cost. From this investigation, we introduce a fine-tuning/retraining method that adapts a network's skip connections (by either removing or shortening them) to make them fit better in hardware with minimal to no loss in accuracy. With these changes, we decrease resource utilization by up to 34% for BRAMs, 7% for FFs, and 12% for LUTs when implemented on an FPGA.
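
To make the remove-then-fine-tune idea concrete, the sketch below shows one plausible way to express it in PyTorch. The ResidualBlock class, the use_skip flag, and the training snippet are illustrative assumptions, not the authors' implementation; the point is only that the identity addition can be switched off after convergence and the network briefly retrained without it.

    # Minimal sketch (assumed names, not the paper's code): a residual block
    # whose skip connection can be disabled after training, followed by a
    # brief fine-tuning step so the network recovers accuracy without it.
    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        def __init__(self, channels: int):
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
            self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
            self.relu = nn.ReLU()
            self.use_skip = True  # toggled off after convergence

        def forward(self, x):
            out = self.conv2(self.relu(self.conv1(x)))
            if self.use_skip:
                # Identity skip: in hardware this input must be buffered
                # until the addition, costing memory and bandwidth.
                out = out + x
            return self.relu(out)

    def remove_skips(model: nn.Module) -> None:
        """Disable every skip connection in a fully trained model."""
        for m in model.modules():
            if isinstance(m, ResidualBlock):
                m.use_skip = False

    # Usage: train normally with skips enabled, then remove and fine-tune.
    model = nn.Sequential(ResidualBlock(16), ResidualBlock(16))
    # ... train to convergence with skips enabled ...
    remove_skips(model)
    opt = torch.optim.SGD(model.parameters(), lr=1e-3)
    x = torch.randn(8, 16, 32, 32)    # dummy batch for illustration
    loss = model(x).pow(2).mean()     # placeholder loss
    opt.zero_grad()
    loss.backward()
    opt.step()

Shortening a skip rather than removing it (re-routing the identity to a nearer layer so its buffer lives for fewer cycles) would follow the same pattern, changing only the source of the identity term before fine-tuning.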