Stochastic Quantization for Learning Accurate Low-Bit Deep Neural Networks

International Journal of Computer Vision (2019)

Cited by 18 | Views: 195
Abstract
Low-bit deep neural networks (DNNs) have become critical for embedded applications due to their low storage requirements and high computing efficiency. However, they suffer from a non-negligible accuracy drop. This paper proposes the stochastic quantization (SQ) algorithm for learning accurate low-bit DNNs. The motivation stems from the following observation. Existing training algorithms approximate the real-valued weights with a low-bit representation all together in each iteration. The quantization error may be small for some elements/filters but remarkable for others, which leads to inappropriate gradient directions during training and thus brings a notable accuracy drop. Instead, SQ quantizes a portion of the elements/filters to low-bit values with a stochastic probability inversely proportional to the quantization error, while keeping the other portion unchanged at full precision. The quantized and full-precision portions are updated with their corresponding gradients separately in each iteration. The SQ ratio, which measures the proportion of quantized weights among all weights, is gradually increased until the whole network is quantized. This procedure greatly compensates for the quantization error and thus yields better accuracy for low-bit DNNs. Experiments show that SQ consistently and significantly improves accuracy for different low-bit DNNs across various datasets and network structures, regardless of whether activation values are quantized.
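The selection step the abstract describes can be illustrated with a short sketch. Below is a minimal NumPy example, assuming binary quantization (the sign of each filter scaled by a per-filter factor, one possible low-bit code) and a hypothetical helper named sq_partition; the exact error measure and sampling scheme in the paper may differ from this simple inverse-error weighting.

import numpy as np

def sq_partition(weights, sq_ratio):
    # weights: array of shape (num_filters, filter_size); sq_ratio in [0, 1].
    # Binary low-bit approximation: scale the sign of each filter by the
    # mean of its absolute values (an illustrative quantizer choice).
    alpha = np.mean(np.abs(weights), axis=1, keepdims=True)
    quantized = alpha * np.sign(weights)

    # Relative quantization error per filter.
    err = (np.linalg.norm(weights - quantized, axis=1)
           / (np.linalg.norm(weights, axis=1) + 1e-12))

    # Selection probability inversely proportional to the error:
    # filters that quantize well are more likely to be quantized.
    inv_err = 1.0 / (err + 1e-12)
    prob = inv_err / inv_err.sum()

    # Stochastically quantize an sq_ratio fraction of the filters.
    num_q = int(round(sq_ratio * len(weights)))
    idx = np.random.choice(len(weights), size=num_q, replace=False, p=prob)

    mixed = weights.copy()
    mixed[idx] = quantized[idx]
    # The forward pass uses `mixed`; per the abstract, the quantized and
    # full-precision portions are updated with their own gradients.
    return mixed, idx

Per the abstract, sq_ratio would be gradually increased over the course of training until it reaches 1 and the whole network is quantized.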
Keywords
Compression, Efficient deep neural networks, Low-bit, Stochastic quantization