SqueezeJet-3: An HLS-based Accelerator for Edge CNN Applications on SoC FPGAs


Most FPGA-based Convolutional Neural Network (CNN) hardware accelerators target the datacenter rather than edge processing units. To help fill this gap, this work presents SqueezeJet-3 and the corresponding design flow of a novel FPGA-based embedded system, consisting of software and hardware, for accelerating edge CNN inference. SqueezeJet-3 is optimized for accelerating small ImageNet-class CNNs, such as SqueezeNet v1.1 and ZynqNet, on low-end, low-cost SoC FPGA devices. SqueezeJet-3 is evaluated against the DietChai accelerator, which is part of Xilinx's ChaiDNN v2 framework, in terms of performance, resource utilization, power, and accuracy; the results demonstrate that, for the acceleration of SqueezeNet v1.1, SqueezeJet-3 outperforms DietChai in all four categories. Our evaluation results also show that, by using the presented design framework, a developer can implement FPGA accelerators for larger CNNs, such as VGG16, with performance similar to that of accelerators designed with the Angel-Eye and fpgaConvNet frameworks, which are optimized for VGG16-like CNNs.
Key words
Algorithm-to-HLS Workflow, High-Level Synthesis, FPGA CNN Accelerator, Deep Learning Application, Mobile Embedded Systems