An Empirical Comparison of Quantization, Pruning and Low-rank Neural Network Compression using the LC Toolkit
2021 International Joint Conference on Neural Networks (IJCNN), 2021
Abstract
Compression of machine learning models, and of neural networks in particular, has become an essential problem for practitioners. Many different approaches, including quantization, pruning, and low-rank and tensor decompositions, have been proposed in the literature to solve it. Despite this, an important question remains unanswered: what is the best compression scheme for a given model? As a step towards answering this question objectively and fairly, we empirically compare quantization, pruning, and low-rank compressions on the common algorithmic footing of the Learning-Compression (LC) framework. This allows us to explore the compression schemes systematically and perform an apples-to-apples comparison along the entire error-compression tradeoff curves. We describe our methodology, the framework, and the experimental setup, and present our comparisons. Based on our experiments, we conclude that the choice of compression is strongly model-dependent: for example, VGG16 is better compressed with pruning, while quantization is more suitable for ResNets. This, once again, underlines the need for a common benchmark of compression schemes with fair and objective comparisons on the models of interest.
Keywords
empirical comparison, low-rank neural network compression, LC toolkit, machine learning models, neural networks, quantization, pruning, tensor decompositions, compression scheme, low-rank compressions, Learning-Compression framework, apples-to-apples comparison, error-compression tradeoff curves, fair and objective comparisons