Log-DenseNet: How to Sparsify a DenseNet

arXiv: Computer Vision and Pattern Recognition, arXiv:1711.00002, 2018.


Abstract:

Skip connections are increasingly utilized by deep neural networks to improve accuracy and cost-efficiency. In particular, the recent DenseNet is efficient in computation and parameters, and achieves state-of-the-art predictions by directly connecting each feature layer to all previous ones. However, DenseNet's extreme connectivity pattern...

Introduction
  • Deep neural networks have been improving performance for many machine learning tasks, scaling from networks like AlexNet (Krizhevsky et al., 2012) to increasingly complex and expensive networks such as VGG (Simonyan & Zisserman, 2014), ResNet (He et al., 2016), and Inception (Szegedy et al., 2017).
  • FractalNet (Larsson et al, 2017) explicitly constructs shortcut networks recursively and averages the outputs from the shortcuts.
  • Such structures prevent deep networks from degrading, as the shallow shortcuts guide the deeper paths through “teacher-student” effects.
  • Stochastic depth (Huang et al., 2016) implicitly constructs skip connections by allowing entire layers to be dropped out during training.
  • DualPathNet (Chen et al., 2017) combines the insights of DenseNet (Huang et al., 2017) and ResNet (He et al., 2016), utilizing both concatenation and summation of previous features (the sketch below illustrates the two shortcut styles).
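
To make the concatenation-versus-summation distinction above concrete, here is a minimal NumPy sketch; the shapes and variable names are illustrative, not taken from any of the cited implementations:

```python
import numpy as np

# Toy feature maps from an earlier layer and the current layer
# (channels-first layout: C x H x W).
prev = np.random.rand(16, 8, 8)
new = np.random.rand(16, 8, 8)

# ResNet/Highway-style shortcut: sum the new features with the skipped
# ones (channel counts must match), so the feature width stays constant.
residual = prev + new                         # shape (16, 8, 8)

# DenseNet-style shortcut: concatenate along channels, so later layers
# see all earlier features and the channel count grows with depth.
dense = np.concatenate([prev, new], axis=0)   # shape (32, 8, 8)

print(residual.shape, dense.shape)
```

DualPathNet mixes both mechanisms; the cost that Log-DenseNet targets comes from the concatenation side, where the number of inputs per layer grows with depth.
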
Highlights
  • Deep neural networks have been improving performance for many machine learning tasks, scaling from networks like AlexNet (Krizhevsky et al., 2012) to increasingly complex and expensive networks such as VGG (Simonyan & Zisserman, 2014), ResNet (He et al., 2016), and Inception (Szegedy et al., 2017)
  • Residual and Highway Networks (He et al, 2016; Srivastava et al, 2015) propose to sum the new feature map at each depth with the ones from skip connections, so that new features can be understood as fitting residual features of the earlier ones
  • To reduce the O(L^2) computation and memory footprint of DenseNet, we propose Log-DenseNet, which increases the maximum backpropagation distance (MBD) only slightly, to 1 + log2 L, while using only O(L log L) connections and run-time complexity (see the connectivity sketch after this list)
  • We show that short backpropagation distances are important for networks that have shortcut connections: if each layer has a fixed number of shortcut inputs, they should be placed to minimize maximum backpropagation distance
  • We show that Log-DenseNets improve the performance and scalability of tabula rasa fully convolutional DenseNets on CamVid
  • Our work provides insights for future network designs, especially those that cannot afford full dense shortcut connections and need high depths, like fully convolutional networks
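
As a rough illustration of how O(L log L) connections can still keep the MBD near 1 + log2 L, the following sketch enumerates a power-of-two connection pattern in the spirit of Log-DenseNet; the function name and exact indexing convention are assumptions for illustration, not the authors' released code:

```python
import math

def log_dense_inputs(i):
    """Earlier layers feeding layer i when shortcuts are placed at
    power-of-two offsets (layer i reads layer i - 2**k), which yields
    about log2(i) inputs per layer and O(L log L) connections overall."""
    inputs, k = [], 0
    while i - 2 ** k >= 0:
        inputs.append(i - 2 ** k)
        k += 1
    return inputs

L = 64
total = sum(len(log_dense_inputs(i)) for i in range(1, L + 1))
print(total, L * math.log2(L))   # roughly L*log2(L), far below the ~L*L/2 of DenseNet
print(log_dense_inputs(13))      # [12, 11, 9, 5]: about log2(13) inputs
```

Because the offsets cover every power of two, any earlier layer can be reached in a logarithmic number of hops, which is the intuition behind the 1 + log2 L bound quoted above.
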
Methods
  • In this comparison, each layer x_i connects to about i/2 of its previous layers.
  • With this many connections, NEAREST has an MBD of about log L, because i can be repeatedly halved until j > i/2, at which point layers i and j are directly connected.
  • EVENLY-SPACED has an MBD of 2, because each x_i takes input from every other previous layer, so any earlier layer is at most two hops away.
  • Table 2 shows that EVENLY-SPACED significantly outperforms NEAREST on CIFAR10 and CIFAR100 (the sketch below estimates the MBD of each pattern numerically)
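
The MBD claims above can be checked numerically with simplified input rules for NEAREST (all layers j with j >= i/2) and EVENLY-SPACED (every other previous layer); these rules and names are paraphrased from the comparison here, so the exact constants may differ from the paper:

```python
from collections import deque

def nearest_inputs(i):
    # NEAREST: connect layer i to its closest ~i/2 predecessors (j >= i/2).
    if i <= 0:
        return []
    return [j for j in range(i) if 2 * j >= i] or [i - 1]

def evenly_spaced_inputs(i):
    # EVENLY-SPACED: connect layer i to every other previous layer.
    return list(range(i - 1, -1, -2))

def max_backprop_distance(num_layers, inputs_fn):
    """Max over pairs (i, j), i > j, of the fewest shortcut hops from i
    back to j; a simplified reading of MBD that ignores the loss layer."""
    worst = 0
    for start in range(1, num_layers):
        dist = {start: 0}
        queue = deque([start])
        while queue:
            cur = queue.popleft()
            for nxt in inputs_fn(cur):
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
        worst = max(worst, max(dist.values()))
    return worst

L = 64
print(max_backprop_distance(L, nearest_inputs))        # about log2(L) hops
print(max_backprop_distance(L, evenly_spaced_inputs))  # 2
```

Both patterns use roughly the same number of connections per layer, so the gap in Table 2 isolates the effect of placing shortcuts to minimize MBD.
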
Results
  • The average relative increase in top-1 error rate from using NEAREST and EVENLY-SPACED instead of Log-DenseNet is 12.2% and 8.5%, respectively, which is significant: for instance, (52,32) with EVENLY-SPACED reaches a 23.10% error rate, about 10% relatively worse than the 20.58% of (52,32) with Log-DenseNet, while (52,16) with Log-DenseNet already achieves 23.45% using a quarter of the computation of (52,32).
  • FC-Log-DenseNet achieves 67.3% mean IoU, slightly higher than the 66.9% of FC-DenseNet
Conclusion
  • The authors show that short backpropagation distances are important for networks that have shortcut connections: if each layer has a fixed number of shortcut inputs, they should be placed to minimize MBD.
  • The authors show that Log-DenseNets improve the performance and scalability of tabula rasa fully convolutional DenseNets on CamVid. Log-DenseNets achieve competitive results on visual recognition datasets, offering a trade-off between accuracy and network depth.
  • The authors' work provides insights for future network designs, especially those that cannot afford full dense shortcut connections and need high depths, like FCNs
Tables
  • Table 1: Error rates of Log-DenseNet V1 (L), NEAREST (N) and EVENLY-SPACED (E), in each of which layer x_i has log i previous layers as input. (L) has an MBD of 1 + log L, and the other two have...
  • Table 2: (a) NEAREST (N), EVENLY-SPACED (E), and NearestHalfAndLog (N+L) each connect to about i/2 previous layers at x_i, and have MBDs of log L, 2, and 2, respectively. N+L and E clearly outperform N
  • Table 3: Performance on the CamVid semantic segmentation dataset. The column GFLOPS reports the computation on a 224x224 image in 1e9 FLOPS. We compare against 1 (Badrinarayanan et al., 2015), 2 (Long et al., 2015), 3 (Chen et al., 2016), 4 (Kundu et al., 2016), and 5 (Jegou et al., 2017)
Reference
  • L. J. Ba and R. Caruana. Do deep nets really need to be deep? In Proceedings of NIPS, 2014.
  • Vijay Badrinarayanan, Alex Kendall, and Roberto Cipolla. SegNet: A deep convolutional encoder-decoder architecture for image segmentation. arXiv preprint arXiv:1511.00561, 2015.
  • Gabriel J. Brostow, Julien Fauqueur, and Roberto Cipolla. Semantic object classes in video: A high-definition ground truth database. Pattern Recognition Letters, 2008.
  • Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, and Alan L. Yuille. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. arXiv preprint arXiv:1606.00915, 2016.
  • Yunpeng Chen, Jianan Li, Huaxin Xiao, Xiaojie Jin, Shuicheng Yan, and Jiashi Feng. Dual path networks. arXiv preprint arXiv:1707.01629, 2017.
  • Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, and Alex Alemi. Inception-v4, Inception-ResNet and the impact of residual connections on learning. In AAAI, 2017.
  • Emily L. Denton, Wojciech Zaremba, Joan Bruna, Yann LeCun, and Rob Fergus. Exploiting linear structure within convolutional networks for efficient evaluation. In NIPS, 2014.
  • Bharath Hariharan, Pablo Arbelaez, Ross Girshick, and Jitendra Malik. Hypercolumns for object segmentation and fine-grained localization. In CVPR, 2015.
  • K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Computer Vision and Pattern Recognition (CVPR), 2016.
  • Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. Distilling the knowledge in a neural network. In Deep Learning and Representation Learning Workshop, NIPS, 2014.
  • Hanzhang Hu, Debadeepta Dey, J. Andrew Bagnell, and Martial Hebert. Anytime neural networks via joint optimization of auxiliary losses. arXiv preprint arXiv:1708.06832, 2017.
  • Gao Huang, Zhuang Liu, Kilian Q. Weinberger, and Laurens van der Maaten. Densely connected convolutional networks. In Computer Vision and Pattern Recognition (CVPR), 2017.
  • Yani Ioannou, Duncan Robertson, Roberto Cipolla, and Antonio Criminisi. Deep roots: Improving CNN efficiency with hierarchical filter groups. arXiv preprint arXiv:1605.06489, 2016.
  • Simon Jegou, Michal Drozdzal, David Vazquez, Adriana Romero, and Yoshua Bengio. The one hundred layers tiramisu: Fully convolutional DenseNets for semantic segmentation. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2017.
  • Y. D. Kim, E. Park, S. Yoo, T. Choi, L. Yang, and D. Shin. Compression of deep convolutional neural networks for fast and low power mobile applications. In ICLR, 2016.
  • Alex Krizhevsky and Geoffrey Hinton. Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2009.
  • Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks. In NIPS, 2012.
  • Abhijit Kundu, Vibhav Vineet, and Vladlen Koltun. Feature space optimization for semantic video segmentation. In CVPR, 2016.
  • G. Larsson, M. Maire, and G. Shakhnarovich. FractalNet: Ultra-deep neural networks without residuals. In International Conference on Learning Representations (ICLR), 2017.
  • Chen-Yu Lee, Saining Xie, Patrick W. Gallagher, Zhengyou Zhang, and Zhuowen Tu. Deeply-supervised nets. In AISTATS, 2015.
  • Tongchen Li. https://github.com/Tongcheng/caffe/blob/master/src/caffe/layers/DenseBlock_layer.cu, 2016.
  • https://github.com/liuzhuang13/DenseNet/tree/master/
  • Zhuang Liu, Jianguo Li, Zhiqiang Shen, Gao Huang, Shoumeng Yan, and Changshui Zhang. Learning efficient convolutional networks through network slimming. arXiv preprint arXiv:1708.06519, 2017.
  • J. Long, E. Shelhamer, and T. Darrell. Fully convolutional networks for semantic segmentation. In CVPR, 2015.
  • Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y. Ng. Reading digits in natural images with unsupervised feature learning. In NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2011.
  • S. Hussain Raza, Matthias Grundmann, and Irfan Essa. Geometric context from video. In CVPR, 2013.
  • Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, and Li Fei-Fei. ImageNet large scale visual recognition challenge. IJCV, 2015.
  • Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
  • Rupesh Kumar Srivastava, Klaus Greff, and Jürgen Schmidhuber. Highway networks. arXiv preprint arXiv:1505.00387, 2015.
  • Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, and Jiaya Jia. Pyramid scene parsing network. In Computer Vision and Pattern Recognition (CVPR), 2017.