AutoBSS: An Efficient Algorithm for Block Stacking Style Search

NIPS 2020

Abstract

Neural network architecture design mostly focuses on the new convolutional operator or special topological structure of network block, little attention is drawn to the configuration of stacking each block, called Block Stacking Style (BSS). Recent studies show that BSS may also have an unneglectable impact on networks, thus we design an...

Introduction
  • Recent progress in computer vision is mostly driven by the advance of Convolutional Neural Networks (CNNs).
  • Works [1, 2, 5] designed layer-based architectures, while most of the modern architectures [3, 4, 6, 7, 8, 9] are block-based
  • For those block-based networks, the design procedure consists of two steps: (1) designing the block structure, and (2) stacking blocks to form the complete network, i.e. the Block Stacking Style (BSS).
  • Compared with the block structure, BSS has drawn little attention from the community.
Highlights
  • Recent progress in computer vision is mostly driven by the advance of Convolutional Neural Networks (CNNs)
  • Compared with the original ResNet18, ResNet50 and MobileNetV2, we improve the accuracy by 2.01%, 1.08% and 0.83% respectively
  • EfficientNet-B0 was developed with a reinforcement learning-based neural architecture search (NAS) approach [20, 9]; Block Stacking Style (BSS) is included in its search space as well
  • We focus on the search of Block Stacking Style (BSS) of a network, which has drawn little attention from researchers
  • We propose a Bayesian optimization based search method named AutoBSS, which can efficiently find a better BSS configuration for a given network within tens of trials
  • The results show that AutoBSS improves the performance of well-known networks by a large margin
Methods
  • Given a building block of a neural network, BSS defines the number of blocks in each stage and channels for each block, which can be represented by a fixed-length code, namely Block Stacking Style Code (BSSC).
  • The authors have a prior that similar BSSCs may have similar accuracy.
  • To benefit from this hypothesis, the authors propose an efficient algorithm to search BSS by Bayesian Optimization.
  • As the search space is usually huge, the authors propose a Candidate Set Construction method to select a subset effectively, so that BSS clustering can be performed efficiently.
  • The paper introduces these methods in detail; a toy sketch of the overall search loop is given below.
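To make the pipeline concrete, here is a minimal, self-contained sketch in the spirit of the steps above, not the authors' implementation: a BSSC is encoded as a fixed-length vector of per-stage depths and widths, a candidate set is constructed under a rough compute budget, the set is clustered so the first trials cover distinct stacking styles, and a Gaussian-process surrogate with an upper-confidence-bound rule proposes the remaining trials. The objective function, the budget and all constants are illustrative assumptions (a synthetic score stands in for "train briefly and measure accuracy").

```python
# Toy sketch of a BSSC search loop: encode depth/width per stage, build a
# candidate set under a budget, cluster it, then search with a GP surrogate.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)
NUM_STAGES = 4
DEPTH_CHOICES = [1, 2, 3, 4, 6]           # candidate numbers of blocks per stage
WIDTH_CHOICES = [16, 24, 32, 48, 64, 96]  # candidate output channels per stage

def sample_bssc():
    """A toy BSSC: [d1..d4, c1..c4], depth and width of each stage."""
    depths = rng.choice(DEPTH_CHOICES, NUM_STAGES)
    widths = np.sort(rng.choice(WIDTH_CHOICES, NUM_STAGES))  # non-decreasing widths
    return np.concatenate([depths, widths]).astype(float)

def rough_cost(code):
    """Very rough compute proxy: blocks * channels^2, summed over stages."""
    d, c = code[:NUM_STAGES], code[NUM_STAGES:]
    return float(np.sum(d * c * c))

def proxy_accuracy(code):
    """Synthetic stand-in for 'train briefly and measure accuracy'."""
    d, c = code[:NUM_STAGES], code[NUM_STAGES:]
    return float(np.sum(np.log1p(d)) + 0.02 * np.sum(np.sqrt(c)) + rng.normal(0, 0.05))

# 1) Candidate set construction: keep only codes inside the compute budget.
BUDGET = 40_000
candidates = np.array([c for c in (sample_bssc() for _ in range(5000))
                       if rough_cost(c) <= BUDGET])

# 2) BSS clustering: take the candidate closest to each k-means center as the
#    initial trials, so they cover visibly different stacking styles.
kmeans = KMeans(n_clusters=8, n_init=10, random_state=0).fit(candidates)
init_idx = [int(np.argmin(np.linalg.norm(candidates - ctr, axis=1)))
            for ctr in kmeans.cluster_centers_]
tried = list(dict.fromkeys(init_idx))
scores = [proxy_accuracy(candidates[i]) for i in tried]

# 3) Bayesian-optimization-style loop: fit a GP on evaluated codes and pick the
#    untried candidate with the highest upper confidence bound, for a few dozen trials.
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
for _ in range(30):
    gp.fit(candidates[tried], np.array(scores))
    mean, std = gp.predict(candidates, return_std=True)
    ucb = mean + std
    ucb[tried] = -np.inf               # never re-evaluate a trial
    nxt = int(np.argmax(ucb))
    tried.append(nxt)
    scores.append(proxy_accuracy(candidates[nxt]))

best = tried[int(np.argmax(scores))]
print("best BSSC found:", candidates[best], "proxy score:", round(max(scores), 3))
```

Seeding the trials with one code per cluster is the design choice that keeps the surrogate from only seeing near-identical stacking styles early on; in the actual method each trial is evaluated by training the corresponding network.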
Results
  • The results on ImageNet are shown in Table 1.
  • Compared with the original ResNet18, ResNet50 and MobileNetV2, the authors improve the accuracy by 2.01%, 1.08% and 0.83% respectively.
  • This indicates that BSS has a non-negligible impact on performance, and that there is large room for improvement over the manually designed BSS.
  • The 0.38% improvement on EfficientNet-B1 demonstrates the superiority of the method over grid search, which indicates that AutoBSS is a more elegant and efficient tool for scaling neural networks
Conclusion
  • The authors focus on the search of Block Stacking Style (BSS) of a network, which has drawn little attention from researchers.
  • The results show that AutoBSS improves the performance of well-known networks by a large margin.
  • The authors plan to further analyze the underlying impact of BSS on network performance.
  • Another important future research topic is searching the BSS and block topology of a neural network jointly, which could further improve the performance of neural networks.
Objectives
  • The authors aim to break the BSS design principles defined by humans, and propose an efficient AutoML-based method called AutoBSS.
  • The authors' goal is to search for an optimal BSSC with the best accuracy under target constraints (e.g. FLOPs or latency); a toy FLOPs-budget check is sketched after this list.
  • The main goal of this work is to investigate the impact of Block Stacking Style (BSS) and design an efficient algorithm to search it automatically.
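As an illustration of the kind of constraint check involved, the sketch below estimates the multiply-add count of a stage-wise layout using the standard formula for a K×K convolution (H_out · W_out · K² · C_in · C_out) and compares it with a budget. The stage layout, input resolution and budget are invented for illustration and are not taken from the paper.

```python
# Toy FLOPs (multiply-add) estimate for a stage-wise BSSC-like layout, used to
# check a candidate against a compute budget before spending any training time.
# The layout, input resolution and budget below are illustrative assumptions.
def conv_flops(h, w, k, c_in, c_out):
    """Multiply-adds of one k x k convolution producing an h x w x c_out map."""
    return h * w * k * k * c_in * c_out

def bssc_flops(stages, in_channels=3, resolution=224):
    """stages: list of (num_blocks, out_channels, stride); one 3x3 conv per block."""
    flops, c_in, res = 0, in_channels, resolution
    for num_blocks, c_out, stride in stages:
        res //= stride                  # downsample at the first block of the stage
        for b in range(num_blocks):
            flops += conv_flops(res, res, 3, c_in if b == 0 else c_out, c_out)
        c_in = c_out
    return flops

# Example layout: 4 stages, each halving the resolution and widening the channels.
stages = [(2, 32, 2), (3, 64, 2), (4, 128, 2), (2, 256, 2)]
budget = 1.0e9                          # 1G multiply-adds
total = bssc_flops(stages)
print(f"estimated multiply-adds: {total / 1e6:.0f}M, within budget: {total <= budget}")
```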
Tables
  • Table1: Single crop Top-1 accuracy (%) of different BSS configurations on ImageNet dataset
  • Table2: The best 5 BSS searched with/without BSS Clustering
  • Table3: Compared with other methods on MobileNetV2
  • Table4: Comparison between the original BSS and the one searched by our method
  • Table5: The single scale testing results on PASCAL VOC 2012
  • Table6: Settings for the training networks on ImageNet
  • Table7: MobileNetV2 network. Each row describes a stage i with Li layers, input resolution Hi × Wi, expansion factor [7] Ti and output channels Ci
  • Table8: Comparison between the original BSSC and the one searched by random search or our method
  • Table9: Comparison between the BSSC obtained by uniformly rescaling or AutoBSS for MobileNetV2
  • Table10: Comparison between the original BSS and the one searched by AutoBSS
Related work
  • 2.1 Convolutional Neural Network Design

    Convolutional neural networks (CNNs) have been applied to many computer vision tasks [18, 1]. Most modern network architectures [3, 4, 6, 7, 8] are block-based, and their design process usually consists of two phases: (1) designing a block structure, and (2) stacking blocks to form the complete structure; in this paper we call the second phase BSS design. Many works have been devoted to effective and efficient block structure design, such as the bottleneck [4], the inverted bottleneck [7] and the ShuffleNet block [8]. However, little effort has been made on BSS design, which has a non-negligible impact on network performance according to recent studies [12, 10]. There are two commonly used rules for designing BSS: (1) doubling the channels when downsampling the feature maps, and (2) allocating more blocks in the middle stages. These rough rules may not realize the full potential of a carefully designed block structure. In this paper, we propose an automatic BSS search method named AutoBSS, which aims to break the human-designed BSS paradigm and find the optimal BSS configuration for a given block structure within a few trials.
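As a small illustration (not code from the paper), the snippet below builds a stacking configuration that follows exactly these two hand-designed rules; the base width and the depth pattern are arbitrary assumptions, and it is precisely these per-stage depths and widths that AutoBSS searches instead of fixing them by rule.

```python
# Build a block stacking configuration with the two common hand-designed rules:
# (1) double the channels whenever the feature map is downsampled (one
#     downsampling per stage here), and (2) put more blocks in the middle stages.
# The base width and the depth pattern are arbitrary illustrative choices.
def manual_bss(num_stages=4, base_channels=64):
    depths = [2, 4, 4, 2] if num_stages == 4 else [2] * num_stages  # rule (2)
    bss, channels = [], base_channels
    for stage, depth in enumerate(depths):
        bss.append({"stage": stage, "blocks": depth, "channels": channels})
        channels *= 2                                               # rule (1)
    return bss

for stage_cfg in manual_bss():
    print(stage_cfg)
```

Running it prints a layout with 2/4/4/2 blocks and 64/128/256/512 channels, the kind of manually designed BSS that the paper argues leaves room for improvement.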
Funding
  • ResNet18/50Rand, MobileNetV2Rand and EfficientNet-B0/B1Rand in Table 1 are networks with randomly searched BSS; their accuracy is 0.88%/0.69%, 0.83% and 1.06%/0.92% lower, respectively, than that of our proposed AutoBSS.
References
  • Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pages 1097–1105, 2012.
  • K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In International Conference on Learning Representations, pages 1–14, 2015.
  • Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. In The IEEE conference on computer vision and pattern recognition, pages 1–9, 2015.
  • Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In The IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
  • Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
  • Andrew G Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861, 2017.
  • Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen. Mobilenetv2: Inverted residuals and linear bottlenecks. In The IEEE Conference on Computer Vision and Pattern Recognition, pages 4510–4520, 2018.
  • Ningning Ma, Xiangyu Zhang, Hai-Tao Zheng, and Jian Sun. Shufflenet v2: Practical guidelines for efficient cnn architecture design. In The European Conference on Computer Vision (ECCV), pages 116–131, 2018.
  • Mingxing Tan and Quoc V. Le. Efficientnet: Rethinking model scaling for convolutional neural networks. In International Conference on Learning Representations, pages 6105–6114, 2019.
  • Zhao Zhong, Zichen Yang, Boyang Deng, Junjie Yan, Wei Wu, Jing Shao, and Cheng-Lin Liu. Blockqnn: Efficient block-wise neural network architecture generation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020.
  • Xiangyu Zhang, Xinyu Zhou, Mengxiao Lin, and Jian Sun. Shufflenet: An extremely efficient convolutional neural network for mobile devices. In The IEEE Conference on Computer Vision and Pattern Recognition, pages 6848–6856, 2018.
  • Dongyoon Han, Jiwhan Kim, and Junmo Kim. Deep pyramidal residual networks. In The IEEE Conference on Computer Vision and Pattern Recognition, pages 5927–5935, 2017.
  • Barret Zoph, Vijay Vasudevan, Jonathon Shlens, and Quoc V Le. Learning transferable architectures for scalable image recognition. In The IEEE conference on computer vision and pattern recognition, pages 8697–8710, 2018.
  • Yukang Chen, Gaofeng Meng, Qian Zhang, Shiming Xiang, Chang Huang, Lisen Mu, and Xinggang Wang. Renas: Reinforced evolutionary neural architecture search. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
  • Jiemin Fang, Yuzhu Sun, Qian Zhang, Yuan Li, Wenyu Liu, and Xinggang Wang. Densely connected search space for more flexible neural architecture search. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020.
  • Wuyang Chen, Xinyu Gong, Xianming Liu, Qian Zhang, Yuan Li, and Zhangyang Wang. Fasterseg: Searching for faster real-time semantic segmentation. In International Conference on Learning Representations, 2020.
  • Muyuan Fang, Qiang Wang, and Zhao Zhong. Betanas: Balanced training and selective drop for neural architecture search, 2019.
  • Christian Szegedy, Alexander Toshev, and Dumitru Erhan. Deep neural networks for object detection. In Advances in neural information processing systems, pages 2553–2561, 2013.
  • Barret Zoph and Quoc V Le. Neural architecture search with reinforcement learning. In International Conference on Learning Representations, 2017.
  • Mingxing Tan, Bo Chen, Ruoming Pang, Vijay Vasudevan, Mark Sandler, Andrew Howard, and Quoc V. Le. Mnasnet: Platform-aware neural architecture search for mobile. In The IEEE Conference on Computer Vision and Pattern Recognition, pages 2820–2828, 2019.
  • Zhao Zhong, Junjie Yan, Wei Wu, Jing Shao, and Cheng-Lin Liu. Practical block-wise neural network architecture generation. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
  • Minghao Guo, Zhao Zhong, Wei Wu, Dahua Lin, and Junjie Yan. Irlas: Inverse reinforcement learning for architecture search. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
  • Esteban Real, Sherry Moore, Andrew Selle, Saurabh Saxena, Yutaka Leon Suematsu, Jie Tan, Quoc V Le, and Alexey Kurakin. Large-scale evolution of image classifiers. In International Conference on Machine Learning-Volume 70, pages 2902–2911. JMLR. org, 2017.
  • Esteban Real, Alok Aggarwal, Yanping Huang, and Quoc V Le. Regularized evolution for image classifier architecture search. In Proceedings of the aaai conference on artificial intelligence, volume 33, pages 4780–4789, 2019.
  • Hanxiao Liu, Karen Simonyan, and Yiming Yang. Darts: Differentiable architecture search. In International Conference on Learning Representations, 2019.
  • Han Cai, Ligeng Zhu, and Song Han. Proxylessnas: Direct neural architecture search on target task and hardware. In International Conference on Learning Representations, 2019.
  • Kirthevasan Kandasamy, Willie Neiswanger, Jeff Schneider, Barnabas Poczos, and Eric P Xing. Neural architecture search with bayesian optimisation and optimal transport. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, editors, Advances in Neural Information Processing Systems 31, pages 2016–2025. Curran Associates, Inc., 2018.
  • Xin Li, Yiming Zhou, Zheng Pan, and Jiashi Feng. Partial order pruning: for best speed/accuracy trade-off in neural architecture search. Proceedings of the IEEE conference on computer vision and pattern recognition, 2019.
  • Han Cai, Chuang Gan, and Song Han. Once for all: Train one network and specialize it for efficient deployment. arXiv preprint arXiv:1908.09791, 2019.
  • Antoine Yang, Pedro M Esperança, and Fabio M Carlucci. Nas evaluation is frustratingly hard. arXiv preprint arXiv:1912.12522, 2019.
  • Christian Sciuto, Kaicheng Yu, Martin Jaggi, Claudiu Musat, and Mathieu Salzmann. Evaluating the search phase of neural architecture search. arXiv preprint arXiv:1902.08142, 2019.
  • Eric Brochu, Vlad M. Cora, and Nando de Freitas. A tutorial on bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. CoRR, abs/1012.2599, 2010.
  • James MacQueen et al. Some methods for classification and analysis of multivariate observations. In Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, volume 1, pages 281–297. Oakland, CA, USA, 1967.
  • David Ginsbourger, Janis Janusevskis, and Rodolphe Le Riche. Dealing with asynchronicity in parallel Gaussian Process based global optimization. Research report, Mines Saint-Etienne, 2011.
  • Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition, pages 248–255.
  • Stefan Elfwing, Eiji Uchibe, and Kenji Doya. Sigmoid-weighted linear units for neural network function approximation in reinforcement learning. Neural Networks, 107:3–11, 2018.
  • Jie Hu, Li Shen, and Gang Sun. Squeeze-and-excitation networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 7132–7141, 2018.
  • Liam Li and Ameet Talwalkar. Random search and reproducibility for neural architecture search. CoRR, abs/1902.07638, 2019.
  • Zechun Liu, Haoyuan Mu, Xiangyu Zhang, Zichao Guo, Xin Yang, Tim Kwang-Ting Cheng, and Jian Sun. Metapruning: Meta learning for automatic neural network channel pruning. arXiv preprint arXiv:1903.10258, 2019.
  • Jian-Hao Luo, Jianxin Wu, and Weiyao Lin. Thinet: A filter level pruning method for deep neural network compression. In Proceedings of the IEEE international conference on computer vision, pages 5058–5066, 2017.
  • Mao Ye, Chengyue Gong, Lizhen Nie, Denny Zhou, Adam Klivans, and Qiang Liu. Good subnetworks provably exist: Pruning via greedy forward selection. arXiv preprint arXiv:2003.01794, 2020.
  • Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár. Focal loss for dense object detection. In Proceedings of the IEEE international conference on computer vision, pages 2980–2988, 2017.
  • Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. Mask r-cnn. In Proceedings of the IEEE international conference on computer vision, pages 2961–2969, 2017.
  • Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick. Microsoft coco: Common objects in context. In European conference on computer vision, pages 740–755. Springer, 2014.
  • Kaiming He, Ross Girshick, and Piotr Dollar. Rethinking imagenet pre-training. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2019.
  • Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster r-cnn: Towards real-time object detection with region proposal networks. In Advances in neural information processing systems, pages 91–99, 2015.
  • Mark Everingham, Luc Van Gool, Christopher KI Williams, John Winn, and Andrew Zisserman. The pascal visual object classes (voc) challenge. International journal of computer vision, 88(2):303–338, 2010.
  • Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, and Jiaya Jia. Pyramid scene parsing network. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2881–2890, 2017.
  • Hengshuang Zhao, Yi Zhang, Shu Liu, Jianping Shi, Chen Change Loy, Dahua Lin, and Jiaya Jia. Psanet: Point-wise spatial attention network for scene parsing. In Proceedings of the European Conference on Computer Vision (ECCV), pages 267–283, 2018.
  • Hengshuang Zhao. semseg. https://github.com/hszhao/semseg, 2019.
  • Carl Edward Rasmussen. Gaussian processes in machine learning. In Summer School on Machine Learning, pages 63–71. Springer, 2003.
  • James S Bergstra, Rémi Bardenet, Yoshua Bengio, and Balázs Kégl. Algorithms for hyper-parameter optimization. In Advances in neural information processing systems, pages 2546–2554, 2011.
  • Tong He, Zhi Zhang, Hang Zhang, Zhongyue Zhang, Junyuan Xie, and Mu Li. Bag of tricks for image classification with convolutional neural networks. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
  • Ekin D Cubuk, Barret Zoph, Dandelion Mane, Vijay Vasudevan, and Quoc V Le. Autoaugment: Learning augmentation strategies from data. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 113–123, 2019.
  • Chao Peng, Tete Xiao, Zeming Li, Yuning Jiang, Xiangyu Zhang, Kai Jia, Gang Yu, and Jian Sun. Megdet: A large mini-batch object detector. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 6181–6189, 2018.
  • Priya Goyal, Piotr Dollár, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, and Kaiming He. Accurate, large minibatch sgd: Training imagenet in 1 hour. arXiv preprint arXiv:1706.02677, 2017.
Authors
Yikang Zhang
Jian Zhang