An Empirical Comparison of Quantization, Pruning and Low-rank Neural Network Compression using the LC Toolkit

2021 International Joint Conference on Neural Networks (IJCNN), 2021

Cited by 3 | Viewed 13
Abstract
Compression of machine learning models, and of neural networks in particular, has become an essential problem for practitioners. Many different approaches, including quantization, pruning, and low-rank and tensor decompositions, have been proposed in the literature to solve it. Despite this, an important question remains unanswered: what is the best compression scheme for a given model? As a step towards answering this question objectively and fairly, we empirically compare quantization, pruning, and low-rank compression on the common algorithmic footing of the Learning-Compression (LC) framework. This allows us to explore the compression schemes systematically and to perform an apples-to-apples comparison along the entire error-compression tradeoff curve. We describe our methodology, the framework, and the experimental setup, and present our comparisons. Based on our experiments, we conclude that the choice of compression is strongly model-dependent: for example, VGG16 is better compressed with pruning, while quantization is more suitable for the ResNets. This once again underlines the need for a common benchmark of compression schemes with fair and objective comparisons on the models of interest.
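For reference, the sketch below illustrates the Learning-Compression idea the paper builds on: alternate an L step (retrain the weights with a quadratic penalty pulling them toward their compressed form) with a C step (compress the current weights by quantization, pruning, or a low-rank projection), while increasing the penalty parameter. This is a minimal NumPy illustration under a toy quadratic loss; the function names (l_step, c_step_quantize, c_step_prune, c_step_lowrank) and all hyperparameters are assumptions for exposition and do not reproduce the actual LC toolkit API.

```python
# Minimal sketch of the Learning-Compression (LC) alternation, assuming a toy
# quadratic loss. Function names and settings are illustrative, not the LC toolkit API.
import numpy as np

def l_step(w, theta, mu, grad_loss, lr=0.1, steps=50):
    """L step: gradient descent on loss(w) + (mu/2) * ||w - theta||^2."""
    for _ in range(steps):
        w = w - lr * (grad_loss(w) + mu * (w - theta))
    return w

def c_step_quantize(w, k=4):
    """C step for quantization: 1-D k-means (Lloyd's algorithm) on the weights."""
    codebook = np.quantile(w, np.linspace(0, 1, k))
    for _ in range(20):
        assign = np.argmin(np.abs(w[:, None] - codebook[None, :]), axis=1)
        for j in range(k):
            if np.any(assign == j):
                codebook[j] = w[assign == j].mean()
    return codebook[assign]

def c_step_prune(w, keep_ratio=0.3):
    """C step for pruning: keep the largest-magnitude weights, zero the rest."""
    k = max(1, int(keep_ratio * w.size))
    thresh = np.sort(np.abs(w))[-k]
    return np.where(np.abs(w) >= thresh, w, 0.0)

def c_step_lowrank(W, rank=2):
    """C step for low-rank compression: truncated SVD of a weight matrix."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]

# Toy run: compress the minimizer of a quadratic loss with the pruning C step.
rng = np.random.default_rng(0)
w_star = rng.normal(size=100)                # "trained" reference weights
grad_loss = lambda w: w - w_star             # gradient of 0.5 * ||w - w_star||^2
w, theta = w_star.copy(), np.zeros_like(w_star)
for mu in [1e-2, 1e-1, 1.0, 10.0]:           # increasing penalty schedule
    w = l_step(w, theta, mu, grad_loss)
    theta = c_step_prune(w, keep_ratio=0.3)
print("loss of compressed model:", 0.5 * np.sum((theta - w_star) ** 2))
```

In this toy run only the pruning C step is exercised; swapping in c_step_quantize or c_step_lowrank (and sweeping k, keep_ratio, or rank) is how one would trace out the kind of error-compression tradeoff curves the paper compares.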
Keywords
empirical comparison,low-rank neural network compression,LC toolkit,machine learning models,neural networks,different approaches including quantization,tensor decompositions,compression scheme,pruning,low-rank compressions,learning-compression framework,apples-to-apples comparison,error-compression tradeoff curves,fair comparisons,objective comparisons