Data Footprint Reduction in DNN Inference by Sensitivity-Controlled Approximations with Online Arithmetic

Euromicro Symposium on Digital Systems Design (2020)

Abstract
In deep neural network (DNN) inference, researchers have sought to reduce the number of computations and connections without degrading performance, moving from bit-parallel to bit-serial arithmetic. In this regard, the literature has adopted approximations expressed as among-layer mixed-precision profiles on bit-serial architectures. However, within-layer mixed precision through controlled approximations for low-latency DNN architectures has yet to be studied. In this study, we apply online arithmetic to DNN inference. This unconventional computation technique generates results serially, most significant digit first (MSDF), and terminates computation once the required precision is reached. Specifically, a Taylor expansion-based sensitivity analysis guides the within-layer mixed-precision method in choosing the approximation intensity (desired bits) for the weights and activations of convolutional layers. In turn, the within-layer mixed-precision method drives the termination of the convolution operation, which is carried out using an online multiplier. We thus aim to reduce the data footprint through early terminations enabled by within-layer, rather than among-layer, mixed precision for online convolution. In this manner, convolution operations compute no more most significant digits than necessary, easing the data-footprint bottleneck for in-demand edge computing devices.
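The idea of sensitivity-guided early termination can be illustrated with a small sketch. This is not the authors' implementation: it assumes unsigned operands in [0, 1), uses a plain binary expansion in place of the redundant signed-digit representation that a real online multiplier relies on, and scores sensitivity with the first-order Taylor term |g * w|. All function names (`msdf_digits`, `truncate_msdf`, `assign_bits`, `approx_dot`) and the two-level bit budget are hypothetical.

```python
from fractions import Fraction


def msdf_digits(x, n_bits):
    """Yield the binary digits of x in [0, 1), most significant digit first."""
    x = Fraction(x)
    for _ in range(n_bits):
        x *= 2
        digit = int(x >= 1)
        x -= digit
        yield digit


def truncate_msdf(x, n_bits):
    """Value rebuilt from only the first n_bits digits (early termination)."""
    value = Fraction(0)
    for i, d in enumerate(msdf_digits(x, n_bits), start=1):
        value += Fraction(d, 2 ** i)
    return value


def assign_bits(weights, grads, low_bits, high_bits, keep_fraction):
    """Assumed policy: the most sensitive weights within a layer get
    high_bits, the rest get low_bits. Sensitivity is |g * w|."""
    scores = [abs(g * w) for w, g in zip(weights, grads)]
    n_high = round(keep_fraction * len(weights))
    ranked = sorted(range(len(weights)), key=lambda i: scores[i], reverse=True)
    bits = [low_bits] * len(weights)
    for i in ranked[:n_high]:
        bits[i] = high_bits
    return bits


def approx_dot(weights, activations, bits):
    """Dot product in which each weight is cut off after its own bit budget."""
    return sum(truncate_msdf(w, b) * a
               for w, a, b in zip(weights, activations, bits))
```

In this sketch each weight is consumed digit by digit and abandoned after its budget, so the truncation error per weight stays below 2^-n for a budget of n digits. A hardware online multiplier would additionally emit output digits MSDF after a short online delay, which this sketch does not model.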
Keywords
Approximate computing, bit-serial, convolution, DNN inference, mixed precision, MSDF, online arithmetic