Data multiplexed and hardware reused architecture for deep neural network accelerator

Neurocomputing (2022)

Abstract
Despite decades of research on high-performance Deep Neural Network (DNN) accelerators, their massive computational demand still requires resource-efficient, optimized, and parallel architectures for computational acceleration. Contemporary hardware implementations of DNNs suffer from excess area requirements due to resource-intensive elements such as multipliers and non-linear Activation Functions (AFs). This paper proposes a DNN architecture that reuses the hardware-costly AF by multiplexing data through a shift register. On-chip quantized log2-based memory addressing with an optimized access scheme is used to fetch input features, weights, and biases; this reduces the external memory bandwidth requirement and allows it to be adjusted dynamically across DNNs. Further, high-throughput and resource-efficient memory elements for the sigmoid activation function are extracted using the Taylor series, whose expansion order has been tuned for better test accuracy. The performance is validated and compared with previous works on the MNIST dataset. In addition, the digital design of the AF is synthesized at the 45 nm technology node and its physical parameters are compared with previous works. The proposed hardware-reused architecture is verified for a 16:16:10:4 neural network using 8-bit dynamic fixed-point arithmetic and implemented on a Xilinx Zynq xc7z010clg400 SoC with a 100 MHz clock. The implemented architecture uses 25% fewer hardware resources and consumes 12% less power than other state-of-the-art implementations without performance loss; low resource usage and power consumption are especially valuable for edge computing solutions.
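As a rough illustration of the AF approximation described in the abstract, the sketch below precomputes 8-bit fixed-point memory entries for the sigmoid from its truncated Taylor series and measures the resulting error. This is a minimal Python model, not the paper's RTL design; the Q1.6 format, the [-2, 2] input window, and the fifth-order truncation are illustrative assumptions rather than values taken from the paper.

```python
import numpy as np

# Illustrative 8-bit fixed-point format: 1 sign bit, 1 integer bit,
# 6 fraction bits (Q1.6). The paper's dynamic fixed-point split may differ.
FRAC_BITS = 6
SCALE = 1 << FRAC_BITS

def sigmoid_taylor(x, order=5):
    """Truncated Taylor series of sigmoid about x = 0:
    sigma(x) ~= 1/2 + x/4 - x^3/48 + x^5/480 (only odd powers beyond the constant)."""
    y = 0.5 + x / 4 - x**3 / 48
    if order >= 5:
        y += x**5 / 480
    return y

def quantize(y):
    """Round-to-nearest signed 8-bit fixed point with saturation."""
    return np.clip(np.round(y * SCALE), -128, 127).astype(np.int8)

# Build the on-chip memory contents over a narrow window where the series holds.
x = np.linspace(-2.0, 2.0, 33)
lut = quantize(sigmoid_taylor(x, order=5))

# Compare the stored (dequantized) values against the exact sigmoid.
exact = 1.0 / (1.0 + np.exp(-x))
err = np.max(np.abs(lut / SCALE - exact))
print(f"max abs error over [-2, 2]: {err:.4f}")
```

Raising the truncation order shrinks the series error near zero but not the quantization floor of one half LSB, which is presumably why the paper tunes the expansion order jointly with test accuracy rather than in isolation.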
Keywords
Activation function, Embedded system design, Hardware reused architecture, Deep neural network, Data multiplexing, Programmable logic, Processing system