Resolution Adaptive Networks for Efficient Inference

CVPR, pp. 2366-2375, 2020.

DOI: https://doi.org/10.1109/CVPR42600.2020.00244

Abstract:

Recently, adaptive inference is gaining increasing attention due to its high computational efficiency. Different from existing works, which mainly exploit architecture redundancy for adaptive network design, in this paper, we focus on spatial redundancy of input samples, and propose a novel Resolution Adaptive Network (RANet). Our motiv...

Introduction
  • Advances in computer hardware have enabled the training of very deep convolutional neural networks (CNNs), such as ResNet [8] and DenseNet [14]; however, the high computational cost of deep CNNs is still unaffordable in many applications.
  • It has been shown that the intrinsic classification difficulty of different samples varies drastically: some can be correctly classified by smaller models with fewer layers or channels, while others require larger networks [24, 37, 12, 36].
  • By exploiting this fact, many adaptive inference methods have been proposed recently.
  • For example, the Multi-Scale Dense Network (MSDNet) [12] allows samples to exit at auxiliary classifiers conditioned on their prediction confidence.
Highlights
  • Advances in computer hardware have enabled the training of very deep convolutional neural networks (CNNs), such as ResNet [8] and DenseNet [14]; however, the high computational cost of deep CNNs is still unaffordable in many applications
  • As the spatial redundancy of input images has been demonstrated in recent work [4], this paper proposes a novel adaptive learning model that exploits both the structural redundancy of a neural network and the spatial redundancy of input samples; a conceptual sketch of the resulting inference procedure follows this list
  • Multi-Scale Dense Network substantially outperforms other baseline models, and Resolution Adaptive Network is superior to Multi-Scale Dense Network, especially when the computational budget is low
  • On CIFAR-10 (CIFAR-100), the accuracies of different classifiers for Resolution Adaptive Network are over 1% (2%-5%) higher than those of Multi-Scale Dense Network when the computational budget ranges from 0.1 × 10^8 to 0.5 × 10^8 FLOPs
  • The accuracies of Resolution Adaptive Network are 2% and 5% higher than those of Multi-Scale Dense Network on CIFAR-10 and CIFAR-100, respectively
  • We proposed a novel resolution adaptive neural network based on a multi-scale dense connection architecture, which we refer to as Resolution Adaptive Network
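The sketch below is a minimal, hypothetical illustration of the coarse-to-fine inference idea described in these highlights; it is not the authors' implementation and omits RANet's multi-scale dense connections and feature reuse across sub-networks. The names `sub_networks`, `resolutions`, and `threshold` are illustrative placeholders: a cascade of classifiers processes progressively higher-resolution inputs and stops as soon as one of them is confident enough.

```python
import torch
import torch.nn.functional as F

def resolution_adaptive_predict(image, sub_networks, resolutions, threshold=0.9):
    """Classify a single image (a 1 x C x H x W tensor) with a coarse-to-fine
    cascade. Each sub-network sees the input resized to its own (height, width)
    resolution, and the cascade exits as soon as a classifier's softmax
    confidence reaches `threshold`. `sub_networks` and `resolutions` are
    assumed to be ordered from the lowest to the highest resolution.
    """
    probs = None
    with torch.no_grad():
        for net, size in zip(sub_networks, resolutions):
            # Resize the input to the resolution this sub-network expects.
            x = F.interpolate(image, size=size, mode='bilinear', align_corners=False)
            probs = F.softmax(net(x), dim=1)
            confidence, prediction = probs.max(dim=1)
            # Early exit: "easy" samples are decided from coarse features alone.
            if confidence.item() >= threshold:
                return prediction, probs
    # No classifier was confident enough; fall back to the finest prediction.
    return probs.argmax(dim=1), probs
```

In the actual RANet, coarse features are reused by the finer sub-networks through the multi-scale dense connections rather than being recomputed from scratch as in this sketch.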
Methods
  • The authors first introduce the idea of adaptive inference and then present the overall architecture and network details of the proposed RANet (Section 3.1).
  • The authors use the largest softmax probability as the confidence measure: the final output is the prediction of the first classifier whose largest softmax output exceeds a given threshold ε.
  • This exit rule can be written as k* = min{k | max_c p_c^k ≥ ε}, where p_c^k denotes the softmax probability assigned to class c by the k-th classifier, as sketched below.
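A minimal sketch of this exit rule, assuming `softmax_outputs` is a list of per-classifier probability vectors (ordered from the earliest to the final exit) and `epsilon` is the confidence threshold:

```python
def first_confident_exit(softmax_outputs, epsilon):
    """Return k* = min{k | max_c p_c^k >= epsilon}: the index of the first
    classifier whose largest softmax probability reaches the threshold.
    Falls back to the final classifier if none is confident enough.
    """
    for k, probs in enumerate(softmax_outputs):
        if max(probs) >= epsilon:
            return k
    return len(softmax_outputs) - 1
```

For example, `first_confident_exit([[0.4, 0.6], [0.1, 0.9]], 0.8)` returns 1, because the first classifier's top probability (0.6) falls short of the threshold while the second classifier's (0.9) does not.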
Results
  • The authors report classification accuracies of all individual classifiers in the model and other baselines.
  • MSDNet substantially outperforms other baseline models, and RANet is superior to MSDNet, especially when the computational budget is low.
  • The authors plot the classification accuracy of each MSDNet and RANet in a gray and a light-yellow curve, respectively.
  • The results on the two CIFAR datasets show that RANets consistently outperform MSDNets and other baseline models across all budgets.
  • Even though RANet and MSDNet show close performance on CIFAR-10 when the computational budget ranges from 0.2 × 10^8 to 0.3 × 10^8 FLOPs, the classification accuracies of RANets are consistently higher (by roughly 1%) than those of MSDNets on CIFAR-100.
Conclusion
  • The authors proposed a novel resolution adaptive neural network based on a multi-scale dense connection architecture, which the authors refer to as RANet.
  • Samples with high prediction confidence exit early from the network, and larger-scale features with finer details are only computed for non-typical images whose predictions in earlier sub-networks are unreliable.
  • This resolution adaptation mechanism and the depth adaptation in each sub-network of RANet guarantee its high computational efficiency.
  • On three image classification benchmarks, the experiments demonstrate the effectiveness of the proposed RANet in both the anytime prediction setting and the budgeted batch classification setting; a simplified sketch of threshold calibration for the budgeted setting follows this list.
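The budgeted batch setting is only named in this summary, so the following is just one simplified way such a confidence threshold could be calibrated on a validation set; it is not the authors' exact procedure, and all names (`val_confidences`, `exit_costs`, `budget_per_sample`, `candidate_thresholds`) are hypothetical. The idea: pick the largest threshold whose average exit cost stays within a per-sample FLOPs budget.

```python
def calibrate_exit_threshold(val_confidences, exit_costs, budget_per_sample,
                             candidate_thresholds):
    """Choose a confidence threshold for budgeted batch classification.

    `val_confidences[i][k]` is the max softmax probability of classifier k on
    validation sample i, and `exit_costs[k]` is the cumulative cost (e.g. FLOPs)
    of exiting at classifier k. Thresholds are tried from the largest down, and
    the first one whose average cost fits the budget is returned.
    """
    for t in sorted(candidate_thresholds, reverse=True):
        total_cost = 0.0
        for confs in val_confidences:
            # Exit at the first classifier that clears the threshold,
            # otherwise pay for the final classifier.
            k = next((j for j, c in enumerate(confs) if c >= t), len(confs) - 1)
            total_cost += exit_costs[k]
        if total_cost / len(val_confidences) <= budget_per_sample:
            return t  # largest threshold that still meets the budget
    return min(candidate_thresholds)  # budget very tight; fall back to the smallest threshold
```

Higher thresholds push more samples to deeper (more accurate but more expensive) exits, so the largest threshold that still fits the budget is a natural heuristic choice.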
Related work
  • Efficient inference for deep networks. Many previous works explore variants of deep networks to speed up network inference. One direct solution is designing lightweight models, e.g., MobileNet [10, 31], ShuffleNet [42, 27] and CondenseNet [13]. Other lines of research focus on pruning redundant network connections [20, 22, 26] or quantizing network weights [15, 29, 17]. Moreover, knowledge distillation [9] has been proposed to train a small (student) network that mimics the outputs of a deeper and/or wider (teacher) network.

    The aforementioned approaches can be seen as static model acceleration techniques, which consistently infer all input samples with the whole network. In contrast, adaptive networks can strategically allocate appropriate computational resources to classifying input images based on their complexity. This research direction has been gaining increasing attention in recent years due to these advantages. The most intuitive implementation is ensembling multiple models and selectively executing a subset of them in a cascading [2] or mixing [32, 30] manner. Recent works also propose to adaptively skip layers or blocks [7, 37, 39, 40], or to dynamically select channels [24, 3, 1] at inference time. Auxiliary predictors can also be attached at different locations of a deep network to allow "easy" examples to exit early [35, 12, 11, 23]. Furthermore, dynamically activating parts of a multi-branch network [36] also provides an alternative way to perform adaptive inference.
Funding
  • This work is supported by grants from the Institute for Guo Qiang of Tsinghua University, the National Natural Science Foundation of China (No. 61906106) and the Beijing Academy of Artificial Intelligence (BAAI).
References
  • [1] Babak Ehteshami Bejnordi, Tijmen Blankevoort, and Max Welling. Batch-shaped channel gated networks. CoRR, abs/1907.06627, 2019.
  • [2] Tolga Bolukbasi, Joseph Wang, Ofer Dekel, and Venkatesh Saligrama. Adaptive neural networks for efficient inference. In ICML, 2017.
  • [3] Shaofeng Cai, Gang Chen, Beng Chin Ooi, and Jinyang Gao. Model slicing for supporting complex analytics with elastic inference cost and resource constraints. In PVLDB, 2019.
  • [4] Yunpeng Chen, Haoqi Fan, Bing Xu, Zhicheng Yan, Yannis Kalantidis, Marcus Rohrbach, Shuicheng Yan, and Jiashi Feng. Drop an octave: Reducing spatial redundancy in convolutional neural networks with octave convolution. In ICCV, 2019.
  • [5] Ting-Wu Chin, Ruizhou Ding, and Diana Marculescu. AdaScale: Towards real-time video object detection using adaptive scaling. In Systems and Machine Learning Conference, 2019.
  • [6] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: A large-scale hierarchical image database. In CVPR, 2009.
  • [7] Michael Figurnov, Maxwell D. Collins, Yukun Zhu, Li Zhang, Jonathan Huang, Dmitry Vetrov, and Ruslan Salakhutdinov. Spatially adaptive computation time for residual networks. In CVPR, 2017.
  • [8] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In CVPR, 2016.
  • [9] Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. Distilling the knowledge in a neural network. In NeurIPS (Workshop), 2015.
  • [10] Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. MobileNets: Efficient convolutional neural networks for mobile vision applications. CoRR, abs/1704.04861, 2017.
  • [11] Hanzhang Hu, Debadeepta Dey, Martial Hebert, and J. Andrew Bagnell. Learning anytime predictions in neural networks via adaptive loss balancing. In AAAI, 2019.
  • [12] Gao Huang, Danlu Chen, Tianhong Li, Felix Wu, Laurens van der Maaten, and Kilian Q. Weinberger. Multi-scale dense networks for resource efficient image classification. In ICLR, 2018.
  • [13] Gao Huang, Shichen Liu, Laurens van der Maaten, and Kilian Q. Weinberger. CondenseNet: An efficient DenseNet using learned group convolutions. In CVPR, 2018.
  • [14] Gao Huang, Zhuang Liu, Geoff Pleiss, Laurens van der Maaten, and Kilian Weinberger. Convolutional networks with dense connectivity. IEEE Trans. on PAMI, 2019.
  • [15] Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio. Binarized neural networks. In NeurIPS, 2016.
  • [16] Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML, 2015.
  • [17] Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew Howard, Hartwig Adam, and Dmitry Kalenichenko. Quantization and training of neural networks for efficient integer-arithmetic-only inference. In CVPR, 2018.
  • [18] Tsung-Wei Ke, Michael Maire, and Stella X. Yu. Multigrid neural architectures. In CVPR, 2017.
  • [19] Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. Technical report, 2009.
  • [20] Yann LeCun, John S. Denker, and Sara A. Solla. Optimal brain damage. In NeurIPS, 1990.
  • [21] Chen-Yu Lee, Saining Xie, Patrick Gallagher, Zhengyou Zhang, and Zhuowen Tu. Deeply-supervised nets. In AISTATS, 2015.
  • [22] Hao Li, Asim Kadav, Igor Durdanovic, Hanan Samet, and Hans Peter Graf. Pruning filters for efficient convnets. In ICLR, 2017.
  • [23] Hao Li, Hong Zhang, Xiaojuan Qi, Ruigang Yang, and Gao Huang. Improved techniques for training adaptive deep networks. In ICCV, 2019.
  • [24] Ji Lin, Yongming Rao, Jiwen Lu, and Jie Zhou. Runtime neural pruning. In NeurIPS, 2017.
  • [25] Tsung-Yi Lin, Piotr Dollar, Ross Girshick, Kaiming He, Bharath Hariharan, and Serge Belongie. Feature pyramid networks for object detection. In CVPR, 2017.
  • [26] Zhuang Liu, Jianguo Li, Zhiqiang Shen, Gao Huang, Shoumeng Yan, and Changshui Zhang. Learning efficient convolutional networks through network slimming. In ICCV, 2017.
  • [27] Ningning Ma, Xiangyu Zhang, Hai-Tao Zheng, and Jian Sun. ShuffleNet V2: Practical guidelines for efficient CNN architecture design. In ECCV, 2018.
  • [28] Vinod Nair and Geoffrey E. Hinton. Rectified linear units improve restricted Boltzmann machines. In ICML, pages 807-814, 2010.
  • [29] Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, and Ali Farhadi. XNOR-Net: ImageNet classification using binary convolutional neural networks. In ECCV, 2016.
  • [30] Adria Ruiz and Jakob Verbeek. Adaptative inference cost with convolutional neural mixture models. In ICCV, 2019.
  • [31] Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen. MobileNetV2: Inverted residuals and linear bottlenecks. In CVPR, 2018.
  • [32] Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc V. Le, Geoffrey E. Hinton, and Jeff Dean. Outrageously large neural networks: The sparsely-gated mixture-of-experts layer. In ICLR, 2017.
  • [33] Ke Sun, Bin Xiao, Dong Liu, and Jingdong Wang. Deep high-resolution representation learning for human pose estimation. In CVPR, 2019.
  • [34] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. In CVPR, 2015.
  • [35] Surat Teerapittayanon, Bradley McDanel, and Hsiang-Tsung Kung. BranchyNet: Fast inference via early exiting from deep neural networks. In ICPR, 2016.
  • [36] Ravi Teja Mullapudi, William R. Mark, Noam Shazeer, and Kayvon Fatahalian. HydraNets: Specialized dynamic architectures for efficient inference. In CVPR, 2018.
  • [37] Andreas Veit and Serge Belongie. Convolutional networks with adaptive inference graphs. In ECCV, 2018.
  • [38] Tom Veniat and Ludovic Denoyer. Learning time/memory-efficient deep architectures with budgeted super networks. In CVPR, 2018.
  • [39] Xin Wang, Fisher Yu, Zi-Yi Dou, Trevor Darrell, and Joseph E. Gonzalez. SkipNet: Learning dynamic routing in convolutional networks. In ECCV, 2018.
  • [40] Zuxuan Wu, Tushar Nagarajan, Abhishek Kumar, Steven Rennie, Larry S. Davis, Kristen Grauman, and Rogerio Feris. BlockDrop: Dynamic inference paths in residual networks. In CVPR, 2018.
  • [41] Sergey Zagoruyko and Nikos Komodakis. Wide residual networks. In BMVC, 2016.
  • [42] Xiangyu Zhang, Xinyu Zhou, Mengxiao Lin, and Jian Sun. ShuffleNet: An extremely efficient convolutional neural network for mobile devices. In CVPR, 2018.
  • [43] Hengshuang Zhao, Xiaojuan Qi, Xiaoyong Shen, Jianping Shi, and Jiaya Jia. ICNet for real-time semantic segmentation on high-resolution images. In ECCV, 2018.