An Empirical Comparison of Quantization, Pruning and Low-rank Neural Network Compression using the LC Toolkit

2021 International Joint Conference on Neural Networks (IJCNN), 2021

Cited by 3 | Viewed 13
Abstract
Compression of machine learning models, and of neural networks in particular, has become an essential problem for practitioners. Many different approaches, including quantization, pruning, and low-rank and tensor decompositions, have been proposed in the literature to solve it. Despite this, an important question remains unanswered: what is the best compression scheme for a given model? As a step towards answering this question objectively and fairly, we empirically compare quantization, pruning, and low-rank compression on the common algorithmic footing of the Learning-Compression (LC) framework. This allows us to explore the compression schemes systematically and to perform an apples-to-apples comparison along the entire error-compression tradeoff curve. We describe our methodology, the framework, and the experimental setup, and present our comparisons. Based on our experiments, we conclude that the choice of compression is strongly model-dependent: for example, VGG16 is better compressed with pruning, while quantization is more suitable for the ResNets. This once again underlines the need for a common benchmark of compression schemes with fair and objective comparisons on the models of interest.
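For reference, the sketch below illustrates the Learning-Compression idea the paper builds on: alternate an L step (retrain the weights with a quadratic penalty pulling them toward their compressed form) with a C step (compress the current weights by quantization, pruning, or a low-rank projection), while increasing the penalty parameter. This is a minimal NumPy illustration under a toy quadratic loss; the function names (l_step, c_step_quantize, c_step_prune, c_step_lowrank) and all hyperparameters are assumptions for exposition and do not reproduce the actual LC toolkit API.

```python
# Minimal sketch of the Learning-Compression (LC) alternation, assuming a toy
# quadratic loss. Function names and settings are illustrative, not the LC toolkit API.
import numpy as np

def l_step(w, theta, mu, grad_loss, lr=0.1, steps=50):
    """L step: gradient descent on loss(w) + (mu/2) * ||w - theta||^2."""
    for _ in range(steps):
        w = w - lr * (grad_loss(w) + mu * (w - theta))
    return w

def c_step_quantize(w, k=4):
    """C step for quantization: 1-D k-means (Lloyd's algorithm) on the weights."""
    codebook = np.quantile(w, np.linspace(0, 1, k))
    for _ in range(20):
        assign = np.argmin(np.abs(w[:, None] - codebook[None, :]), axis=1)
        for j in range(k):
            if np.any(assign == j):
                codebook[j] = w[assign == j].mean()
    return codebook[assign]

def c_step_prune(w, keep_ratio=0.3):
    """C step for pruning: keep the largest-magnitude weights, zero the rest."""
    k = max(1, int(keep_ratio * w.size))
    thresh = np.sort(np.abs(w))[-k]
    return np.where(np.abs(w) >= thresh, w, 0.0)

def c_step_lowrank(W, rank=2):
    """C step for low-rank compression: truncated SVD of a weight matrix."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]

# Toy run: compress the minimizer of a quadratic loss with the pruning C step.
rng = np.random.default_rng(0)
w_star = rng.normal(size=100)                # "trained" reference weights
grad_loss = lambda w: w - w_star             # gradient of 0.5 * ||w - w_star||^2
w, theta = w_star.copy(), np.zeros_like(w_star)
for mu in [1e-2, 1e-1, 1.0, 10.0]:           # increasing penalty schedule
    w = l_step(w, theta, mu, grad_loss)
    theta = c_step_prune(w, keep_ratio=0.3)
print("loss of compressed model:", 0.5 * np.sum((theta - w_star) ** 2))
```

In this toy run only the pruning C step is exercised; swapping in c_step_quantize or c_step_lowrank (and sweeping k, keep_ratio, or rank) is how one would trace out the kind of error-compression tradeoff curves the paper compares.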
Keywords
empirical comparison,low-rank neural network compression,LC toolkit,machine learning models,neural networks,different approaches including quantization,tensor decompositions,compression scheme,pruning,low-rank compressions,learning-compression framework,apples-to-apples comparison,error-compression tradeoff curves,fair comparisons,objective comparisons