Dynamic Unit Surgery for Deep Neural Network Compression and Acceleration
2019 International Joint Conference on Neural Networks (IJCNN)
Abstract
Successful deep neural network models tend to possess millions of parameters. Reducing the size of such models by pruning parameters has recently earned significant interest from the research community, yielding more compact models with a similar performance level. While pruning individual parameters usually results in large sparse weight tensors that do not easily translate into proportional gains in computational efficiency, pruning filters or entire units allows readily available off-the-shelf libraries to harness the benefits of a smaller architecture. A well-known property of network pruning is that the final retained performance improves when the pruning process is made more gradual. Most existing techniques smooth the process by repeating it (multi-pass) at increasing pruning ratios, or by applying it in a layer-wise fashion. In this paper, we introduce Dynamic Unit Surgery (DUS), which smooths the process in a novel way by using decaying mask values instead of multi-pass or layer-wise treatment. Whereas multi-pass schemes entirely discard network components pruned at an early stage, DUS allows such components to recover. We empirically show that DUS achieves competitive performance against existing state-of-the-art pruning techniques on multiple image classification tasks. On CIFAR-10, we prune a VGG16 network to use 5% of the parameters and 23% of the FLOPs while achieving a 6.65% error rate, with no degradation from the original network. We also explore the method's application to a transfer-learning setting for fine-grained image classification and report its competitiveness against a state-of-the-art baseline.
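The decaying-mask mechanism can be pictured concretely. Below is a minimal sketch, assuming a multiplicative per-unit mask that shrinks for units whose importance falls below a threshold and regrows for units that recover; the class name `DecayingUnitMask`, the `importance` scores, and the `decay_rate`/`threshold` parameters are illustrative assumptions, not the paper's actual algorithm.

```python
# Hypothetical sketch of a decaying unit mask, in the spirit of DUS.
# All names and hyperparameters here are our own assumptions.
import numpy as np

class DecayingUnitMask:
    def __init__(self, num_units, decay_rate=0.9, threshold=0.1):
        self.mask = np.ones(num_units)   # 1.0 = fully active unit (filter)
        self.decay_rate = decay_rate     # multiplicative decay per step
        self.threshold = threshold       # importance cutoff for pruning

    def step(self, importance):
        """Decay masks of low-importance units; let recovering units grow back."""
        low = importance < self.threshold
        # Units below the threshold decay gradually instead of being cut at once.
        self.mask[low] *= self.decay_rate
        # Units whose importance rises again recover toward 1.0 (capped at 1).
        self.mask[~low] = np.minimum(self.mask[~low] / self.decay_rate, 1.0)
        return self.mask

# Usage: multiply each unit's output by its mask value during training;
# a unit whose mask has decayed to ~0 can then be removed structurally,
# shrinking the architecture rather than leaving sparse weight tensors.
mask = DecayingUnitMask(num_units=4)
for imp in [np.array([0.5, 0.05, 0.3, 0.02])] * 3:
    print(mask.step(imp))
```

Because the mask decays multiplicatively rather than being zeroed outright, a unit flagged for pruning early in training can still regain full strength if its importance rises later, which is the recovery property the abstract contrasts with multi-pass schemes.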
Keywords
Neural Network, Pruning, Network Compression, Network Acceleration, Deep Learning, Image Classification