Improved Regularization of Convolutional Neural Networks with Cutout.

arXiv: Computer Vision and Pattern Recognition (2017)

Cited by: 623 | Views: 233

Abstract

Convolutional neural networks are capable of learning powerful representational spaces, which are necessary for tackling complex learning tasks. However, due to the model capacity required to capture such representations, they are often susceptible to overfitting and therefore require proper regularization in order to generalize well. In this paper, we show that the simple regularization technique of randomly masking out square regions of input during training, which we call cutout, can be used to improve the robustness and overall performance of convolutional neural networks. Not only is this method extremely easy to implement, but we also demonstrate that it can be used in conjunction with existing forms of data augmentation and other regularizers to further improve model performance. We evaluate this method by applying it to current state-of-the-art architectures on the CIFAR-10, CIFAR-100, and SVHN datasets, yielding new state-of-the-art results of 2.56%, 15.20%, and 1.30% test error respectively.

Introduction
  • In recent years, deep learning has contributed to considerable advances in the field of computer vision, resulting in state-of-the-art performance in many challenging vision tasks such as object recognition [8], semantic segmentation [11], image captioning [19], and human pose estimation [17].
  • Many of these improvements can be attributed to the use of convolutional neural networks (CNNs) [9], which are capable of learning complex hierarchical feature representations of images.
  • One of the most common uses of noise for improving model accuracy is dropout [6], which stochastically drops neuron activations during training and, as a result, discourages the co-adaptation of feature detectors (see the sketch below).
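As a point of reference for the dropout mechanism described in the last item above, here is a minimal PyTorch-style sketch; the architecture, dropout rate, and input size are illustrative assumptions, not taken from the paper:

```python
import torch.nn as nn

# Tiny CNN classifier with standard dropout on the fully connected layer.
# During training, nn.Dropout zeroes each activation with probability p,
# discouraging co-adaptation of feature detectors; at eval time it is a no-op.
model = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                       # 32x32 inputs -> 16x16 feature maps
    nn.Flatten(),
    nn.Dropout(p=0.5),                     # illustrative rate, not from the paper
    nn.Linear(32 * 16 * 16, 10),           # assumes 32x32 RGB inputs, 10 classes
)
```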
Highlights
  • In the remainder of this paper, we introduce cutout and demonstrate that masking out contiguous sections of the input to convolutional neural networks can improve model robustness and yield better model performance (a minimal implementation sketch follows this list).
  • We show that this simple method works in conjunction with other current state-of-the-art techniques such as residual networks and batch normalization, and can be combined with most regularization techniques, including standard dropout and data augmentation
  • Cutout was originally conceived as a targeted method for removing visual features with high activations in later layers of a convolutional neural network.
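Below is a minimal sketch of the input-masking idea, written as a PyTorch-style transform. The class name, the `n_holes` and `length` parameters, and the zero-fill convention are assumptions for illustration and are not necessarily identical to the authors' released code:

```python
import numpy as np
import torch

class Cutout:
    """Randomly zero out one or more square patches of an image tensor (C, H, W)."""

    def __init__(self, n_holes=1, length=16):
        self.n_holes = n_holes   # number of square patches to mask out
        self.length = length     # side length of each square patch, in pixels

    def __call__(self, img):
        h, w = img.size(1), img.size(2)
        mask = np.ones((h, w), dtype=np.float32)
        for _ in range(self.n_holes):
            # Sample the patch centre uniformly over the image; the patch may be
            # clipped at the border, so the effective masked area can be smaller.
            y, x = np.random.randint(h), np.random.randint(w)
            y1, y2 = np.clip([y - self.length // 2, y + self.length // 2], 0, h)
            x1, x2 = np.clip([x - self.length // 2, x + self.length // 2], 0, w)
            mask[y1:y2, x1:x2] = 0.0
        # Broadcast the (H, W) mask across channels and apply it multiplicatively.
        return img * torch.from_numpy(mask).expand_as(img)
```

Applied after `transforms.ToTensor()`, this simply zeroes the masked pixels of each training image; evaluation data is left untouched.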
Methods
  • A cutout size of 8 × 8 pixels works best for CIFAR-100 when training on the full dataset (see the usage sketch after this list).
  • It appears that as the number of classes increases, the optimal cutout size decreases.
  • Adding cutout to the current state-of-the-art shake-shake regularization models improves performance by 0.3 and 0.6 percentage points on CIFAR-10 and CIFAR-100 respectively, yielding new state-of-the-art results of 2.56% and 15.20% test error.
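For context, this is how the 8 × 8 setting reported above might be wired into a CIFAR-100 training pipeline, reusing the `Cutout` sketch from the Highlights section; apart from the patch size, the pipeline is an illustrative assumption:

```python
from torchvision import transforms

# "+" augmentation (mirror + crop) followed by cutout with an 8x8 patch, the
# size reported for CIFAR-100 above.  Assumes the Cutout class defined in the
# earlier sketch is in scope.
cifar100_train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),   # random crop from a zero-padded image
    transforms.RandomHorizontalFlip(),      # mirror
    transforms.ToTensor(),
    Cutout(n_holes=1, length=8),            # smaller patch for the 100-class dataset
])
```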
Conclusion
  • The authors discovered that the conceptually and computationally simpler approach of randomly masking square sections of the image performed equivalently in their experiments.
  • This simple regularizer proved to be complementary to existing forms of data augmentation and regularization.
  • Future work will return to the original investigation of visual feature removal informed by activations.
Tables
  • Table 1: Test error rates (%) on CIFAR (C10, C100) and SVHN datasets. “+” indicates standard data augmentation (mirror + crop). Results averaged over five runs, with the exception of shake-shake regularization, which only had three runs each. Baseline shake-shake regularization results taken from [4].
  • Table 2: Test error rates on STL-10 dataset. “+” indicates standard data augmentation (mirror + crop). Results averaged over five runs on full training set.
Related Work
  • Our work is most closely related to two common regularization techniques: data augmentation and dropout. Here we examine the use of both methods in the setting of training convolutional neural networks. We also discuss denoising auto-encoders and context encoders, which share some similarities with our work.

    2.1. Data Augmentation for Images

    Data augmentation has long been used in practice when training convolutional neural networks. When training LeNet-5 [9] for optical character recognition, LeCun et al. apply various affine transforms, including horizontal and vertical translation, scaling, squeezing, and horizontal shearing to improve their model’s accuracy and robustness.

    In [1], Bengio et al. demonstrate that deep architectures benefit much more from data augmentation than shallow architectures. They apply a large variety of transformations to their handwritten character dataset, including local elastic deformation, motion blur, Gaussian smoothing, Gaussian noise, salt and pepper noise, pixel permutation, and adding fake scratches and other occlusions to the images, in addition to affine transformations.
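As a rough modern analogue of the affine-style augmentations discussed above (translation, scaling, shearing, mirroring), one could compose something like the following with torchvision; the parameter values are illustrative assumptions, not the settings used by LeCun et al. or Bengio et al.:

```python
from torchvision import transforms

# Affine-style data augmentation: random translation, scaling, and shearing,
# plus horizontal mirroring.  Values are illustrative, not from the cited work.
affine_augmentation = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomAffine(
        degrees=0,               # no rotation, matching the transforms listed above
        translate=(0.1, 0.1),    # up to 10% horizontal/vertical translation
        scale=(0.9, 1.1),        # mild rescaling
        shear=10,                # horizontal shear of up to 10 degrees
    ),
    transforms.ToTensor(),
])
```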
Contributions
  • Shows that the simple regularization technique of randomly masking out square regions of input during training, which the authors call cutout, can be used to improve the robustness and overall performance of convolutional neural networks.
  • Demonstrates that it can be used in conjunction with existing forms of data augmentation and other regularizers to further improve model performance.
  • Evaluates this method by applying it to current state-of-the-art architectures on the CIFAR-10, CIFAR-100, and SVHN datasets, yielding new state-of-the-art results of 2.56%, 15.20%, and 1.30% test error respectively.
  • Introduces cutout and demonstrates that masking out contiguous sections of the input to convolutional neural networks can improve model robustness and yield better model performance.
  • Shows that this simple method works in conjunction with other current state-of-the-art techniques such as residual networks and batch normalization, and can be combined with most regularization techniques, including standard dropout and data augmentation.
Study Subjects and Analysis
The latter observation illustrates that cutout is indeed encouraging the network to take into account a wider variety of features when making predictions, rather than relying on the presence of a smaller number of features. Figure 5 demonstrates similar observations for individual samples, where the effects of cutout are more pronounced. Cutout was originally conceived as a targeted method for removing visual features with high activations in later layers of a CNN.

References
  • Y. Bengio, A. Bergeron, N. Boulanger-Lewandowski, T. Breuel, Y. Chherawala, et al. Deep learners benefit more from out-of-distribution examples. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, pages 164–172, 2011.
  • A. Canziani, A. Paszke, and E. Culurciello. An analysis of deep neural network models for practical applications. In IEEE International Symposium on Circuits & Systems, 2016.
  • A. Coates, A. Ng, and H. Lee. An analysis of single-layer networks in unsupervised feature learning. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, pages 215–223, 2011.
  • X. Gastaldi. Shake-shake regularization. arXiv preprint arXiv:1705.07485, 2017.
  • K. He, X. Zhang, S. Ren, and J. Sun. Identity mappings in deep residual networks. In European Conference on Computer Vision, pages 630–645, 2016.
  • G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov. Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580, 2012.
  • A. Krizhevsky and G. Hinton. Learning multiple layers of features from tiny images. 2009.
  • A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105, 2012.
  • Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
  • J. Lemley, S. Bazrafkan, and P. Corcoran. Smart augmentation: learning an optimal data augmentation strategy. IEEE Access, 2017.
  • J. Long, E. Shelhamer, and T. Darrell. Fully convolutional networks for semantic segmentation. In CVPR, pages 3431– 3440, 2015.
  • Y. Netzer, T. Wang, A. Coates, A. Bissacco, B. Wu, and A. Y. Ng. Reading digits in natural images with unsupervised feature learning. In NIPS Workshop on Deep Learning and Unsupervised Feature Learning, volume 2011, page 5, 2011.
  • Y. Netzer, T. Wang, A. Coates, A. Bissacco, B. Wu, and A. Y. Ng. Reading digits in natural images with unsupervised feature learning. In NIPS Workshop on Deep Learning and Unsupervised Feature Learning, volume 2011, page 5, 2011.
  • S. Park and N. Kwak. Analysis on the dropout effect in convolutional neural networks. In Asian Conference on Computer Vision, pages 189–204, 2016.
  • D. Pathak, P. Krahenbuhl, J. Donahue, T. Darrell, and A. A. Efros. Context encoders: Feature learning by inpainting. In CVPR, pages 2536–2544, 2016.
  • N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1):1929–1958, 2014.
  • J. Tompson, R. Goroshin, A. Jain, Y. LeCun, and C. Bregler. Efficient object localization using convolutional networks. In CVPR, pages 648–656, 2015.
  • A. Toshev and C. Szegedy. Deeppose: Human pose estimation via deep neural networks. In CVPR, pages 1653–1660, 2014.
  • P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P.-A. Manzagol. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. Journal of Machine Learning Research, 11(Dec):3371–3408, 2010.
  • O. Vinyals, A. Toshev, S. Bengio, and D. Erhan. Show and tell: A neural image caption generator. In CVPR, pages 3156–3164, 2015.
  • H. Wu and X. Gu. Towards dropout training for convolutional neural networks. Neural Networks, 71:1–10, 2015.
  • R. Wu, S. Yan, Y. Shan, Q. Dang, and G. Sun. Deep image: Scaling up image recognition. arXiv preprint arXiv:1501.02876, 7(8), 2015.
  • S. Zagoruyko and N. Komodakis. Wide residual networks. British Machine Vision Conference (BMVC), 2016.