Sparq: A Custom RISC-V Vector Processor for Efficient Sub-Byte Quantized Inference

CoRR (2023)

Abstract
Convolutional Neural Networks (CNNs) are used in a wide range of applications, with full-precision CNNs achieving high accuracy at the expense of portability. Recent progress in quantization techniques has demonstrated that sub-byte Quantized Neural Networks (QNNs) achieve comparable or superior accuracy while significantly reducing the computational cost and memory footprint. However, sub-byte computation on commodity hardware is sub-optimal due to the lack of support for such precision. In this paper, we introduce Sparq, a Sub-byte vector Processor designed for the AcceleRation of QNN inference. The processor is based on a modified version of Ara, an open-source 64-bit RISC-V "V"-compliant vector processor. Sparq is implemented in GlobalFoundries 22FDX FD-SOI technology and extends the Instruction Set Architecture (ISA) with a new multiply-shift-accumulate instruction to improve sub-byte computation efficiency. The floating-point unit is also removed to minimize area and power usage. To demonstrate Sparq's performance, we implement an ultra-low-precision (1-bit to 4-bit) vectorized conv2d operation that takes advantage of the dedicated hardware. We show that Sparq significantly accelerates sub-byte computation, achieving 3.2 times and 1.7 times acceleration over an optimized 16-bit 2D convolution for 2-bit and 4-bit quantization, respectively.
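The abstract does not specify the semantics of the new multiply-shift-accumulate instruction, so the following is only a minimal scalar sketch of how such an operation might behave on sub-byte data: unpack signed 2-bit weights from a packed 64-bit word, multiply each by an activation, apply a rescaling right shift, and accumulate into a wider register. The function name `msac2`, the field width, the signedness, and the shift semantics are all assumptions for illustration, not the paper's specification.

```c
#include <stdint.h>
#include <stdio.h>

/* Sign-extend a 2-bit field (0..3 map to 0, 1, -2, -1). */
static int64_t sext2(uint64_t field) {
    return (int64_t)(field << 62) >> 62;
}

/* Hypothetical reference model: accumulate dot(weights, acts) where
 * `packed_w` holds 32 signed 2-bit weights and `acts` holds 32 8-bit
 * activations; `shift` rescales each product before accumulation. */
static int64_t msac2(uint64_t packed_w, const int8_t acts[32],
                     unsigned shift, int64_t acc) {
    for (int i = 0; i < 32; i++) {
        int64_t w = sext2((packed_w >> (2 * i)) & 0x3u);
        acc += (w * (int64_t)acts[i]) >> shift;
    }
    return acc;
}

int main(void) {
    int8_t acts[32];
    for (int i = 0; i < 32; i++) acts[i] = (int8_t)(i - 16);

    /* Pack the alternating weight pattern {+1, -1, +1, -1, ...}
     * as 2-bit fields into a single 64-bit word. */
    uint64_t packed_w = 0;
    for (int i = 0; i < 32; i++) {
        uint64_t w2 = (i % 2 == 0) ? 0x1u : 0x3u;  /* +1 or -1 */
        packed_w |= w2 << (2 * i);
    }

    printf("acc = %lld\n", (long long)msac2(packed_w, acts, 0, 0));
    return 0;
}
```

In an actual vectorized conv2d kernel this step would be applied per vector element on packed weight words streamed from memory, with the accumulator held in a wider-element vector register; the sketch above only captures the per-word arithmetic.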
Keywords
RISC-V, Vector ISA, Sub-byte, Convolution