Smart Ternary Quantization

arXiv (2021)

Abstract
Neural network models are resource hungry. Low-bit quantization, such as binary and ternary quantization, is a common approach to alleviating these resource requirements. Ternary quantization yields a more flexible model and often beats binary quantization in terms of accuracy, but it doubles memory usage and increases computation cost. Mixed-quantization-depth models, on the other hand, allow a trade-off between accuracy and memory footprint. In such models, the quantization depth is often chosen manually (a tedious task) or tuned with a separate optimization routine (which requires training a quantized network multiple times). Here, we propose Smart Ternary Quantization (STQ), in which we modify the quantization depth directly through an adaptive regularization function, so that the model is trained only once. The method jumps between binary and ternary quantization during training. We demonstrate its application on image classification.
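The abstract does not include code, but a minimal sketch may make the idea concrete. The snippet below assumes a straight-through-estimator training setup; the names `ternary_quantize` and `BinaryTernaryRegularizer`, the threshold `delta`, the penalty form, and the shape parameter `beta` are illustrative assumptions, not the paper's actual formulation.

```python
import torch

def ternary_quantize(w: torch.Tensor, delta: float) -> torch.Tensor:
    """Quantize weights to {-1, 0, +1}; entries with |w| <= delta map to 0.

    With delta = 0 the zero level disappears and the output is binary
    {-1, +1}, so a single threshold spans both quantization depths.
    """
    q = torch.sign(w) * (w.abs() > delta).float()
    # Straight-through estimator: forward with q, but route gradients
    # to the latent full-precision weights w.
    return w + (q - w).detach()


class BinaryTernaryRegularizer:
    """Illustrative adaptive penalty (form and names are assumptions).

    Each weight pays the cheaper of two attractions: toward the binary
    levels {-1, +1} or toward 0. A learnable coefficient beta rescales
    the zero attraction, so a layer can drift between effectively binary
    (beta small: zero is expensive) and ternary (beta large: zero is
    cheap) behavior within a single training run.
    """

    def __init__(self, lam: float = 1e-4):
        self.lam = lam  # overall regularization strength

    def __call__(self, w: torch.Tensor, beta: torch.Tensor) -> torch.Tensor:
        dist_binary = (w.abs() - 1.0) ** 2   # squared distance to {-1, +1}
        dist_zero = w ** 2                   # squared distance to 0
        penalty = torch.minimum(dist_binary, dist_zero / beta.clamp_min(1e-8))
        return self.lam * penalty.sum()
```

In training, this penalty would be added to the task loss, with one `beta` per layer updated by the optimizer alongside the weights, which is what lets the quantization depth change adaptively rather than being fixed up front.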
Keywords
neural network models, low bit quantization, memory footprint, adaptive binary-ternary quantization, smart quantization, MNIST, CIFAR10, SQ, separate optimization routine