Design and Analysis of Posit Quire Processing Engine for Neural Network Applications

2023 36th International Conference on VLSI Design and 2023 22nd International Conference on Embedded Systems (VLSID), 2023

Abstract
Multiply-accumulate (MAC) operations dominate the computation in deep learning networks and must be hardware efficient while delivering high inference accuracy for a given set of network layers. Typically, the processing engines (PEs) that comprise the MAC units are the most resource-intensive elements in a Deep Neural Network (DNN) accelerator. This paper proposes efficient PEs and presents the architectural design, accuracy, and resource-cost analysis of a hardware-based, parameterized Posit Quire Processing Engine (PQPE) supporting P(8,0), P(16,1), P(32,2), and P(64,3). PQPE performs vector computations (dot products) on posit numbers using a wide quire accumulator and is operated through instructions, commands, and responses. PQPE was modelled in Python to evaluate inference accuracy across DNN kernels and datasets: LeNet-5, AlexNet, and custom 5-layer and 4-layer CNN models were evaluated on the MNIST, CIFAR-10, and alphabet-recognition datasets, with PQPE performing the matrix computations in the convolutions. P(16,1) without the quire provides the same accuracy as float32 with fewer hardware resources; P(16,1) with the quire matches the accuracy without it, so the quire offers no advantage at larger posit data widths. P(8,0) with/without the quire incurs an insignificant accuracy loss ranging from 0.8%/3.28% (LeNet-5 + MNIST) to 3.8%/10.1% (AlexNet + CIFAR-10) compared to P(16,1)/float32. Further, PQPE was modelled in Verilog HDL and synthesized for a Xilinx Virtex-7 FPGA using Xilinx Vivado. PQPE(8,0) achieved an $f_{max}$ of 372 MHz and used 77.6% fewer hardware resources than float32, at the cost of an insignificant accuracy loss on CIFAR-10 (0.8% on LeNet and 3.8% on AlexNet). PQPE(16,1) used 6.5% fewer resources and achieved accuracy equivalent to float32.
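To make the quire mechanism concrete, a minimal Python sketch follows. It assumes float inputs as stand-ins for posit operands and uses Python's Fraction as a software stand-in for the wide fixed-point quire register; posit encode/decode and the PQPE instruction interface are omitted, so this illustrates only the accumulation idea, not the paper's implementation.

```python
from fractions import Fraction

def quire_dot(a, b):
    # Quire-style dot product: every product is accumulated exactly,
    # with no intermediate rounding; rounding happens once at the end.
    # Fraction emulates the hardware's wide fixed-point quire register.
    q = Fraction(0)
    for x, y in zip(a, b):
        q += Fraction(x) * Fraction(y)  # exact multiply-accumulate
    return float(q)                     # single final rounding

def naive_dot(a, b):
    # Conventional MAC: the running sum is rounded after every step.
    s = 0.0
    for x, y in zip(a, b):
        s += x * y
    return s

# A cancellation-heavy case where per-step rounding destroys the result.
a = [1e16, 1.0, -1e16]
b = [1.0, 1.0, 1.0]
print(naive_dot(a, b))  # 0.0: the 1.0 is absorbed, then cancelled
print(quire_dot(a, b))  # 1.0: exact accumulation preserves it
```

In hardware, the quire is simply a fixed-point register wide enough that posit products of bounded count accumulate without intermediate rounding; only the final conversion back to a posit rounds, which is why the abstract reports accuracy with and without the quire separately.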
Keywords
Posit MAC Engine, Posit Quire Accumulators, Posit for AI/ML, Neural Network Processing Engine, Inference Accuracy