The Origins and Prevalence of Texture Bias in Convolutional Neural Networks

NeurIPS, 2020.


Abstract:

Recent work has indicated that, unlike humans, ImageNet-trained CNNs tend to classify images by texture rather than by shape. How pervasive is this bias, and where does it come from? We find that, when trained on datasets of images with conflicting shape and texture, CNNs learn to classify by shape at least as easily as by texture. What factors, then, produce the texture bias in CNNs trained on ImageNet? […]

Highlights
  • Convolutional neural networks (CNNs) define state-of-the-art performance in many computer vision tasks, such as image classification [57], object detection [83, 40], and segmentation [40]
  • This paper focuses on one such result, namely that CNNs appear to make classifications based on superficial textural features [36, 4] rather than on the shape information preferentially used by humans [61, 60]
  • Building on a long tradition of work in psychology and neuroscience documenting humans’ shape-based object classification, Geirhos et al [36] compared humans to ImageNet-trained CNNs on a dataset of images with conflicting shape and texture information and found that models tended to classify according to texture (e.g. “elephant”), and humans according to shape (e.g. “knife”)
  • This paper explores the origins of texture bias in ImageNet-trained CNNs, looking at the effects of data augmentation, training procedure, model architecture, and task
  • We find that it is possible to extract more shape information from a CNN’s later layers than is reflected in the model’s classifications, and study how this information loss occurs as data flows through a network
  • Augmentation that reduced texture bias according to the Geirhos Style-Transfer (GST) dataset improved accuracy on the ImageNet-Sketch (IN-Sketch) [96] dataset, which consists of human-drawn sketches of each ImageNet class, and Stylized ImageNet (SIN) [36], which was generated by processing the ImageNet test set with a style transfer algorithm
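
For concreteness, the "texture bias according to the GST dataset" in the last bullet reduces to a simple cue-conflict count: among images where the model predicts either the shape class or the texture class, shape bias is the fraction of decisions that follow shape. Below is a minimal Python sketch; the function name and toy labels are illustrative, not from the paper's code.

    def shape_bias(predictions, shape_labels, texture_labels):
        """Shape bias on cue-conflict stimuli, following Geirhos et al. [36].

        Only trials where the model predicts either the image's shape class
        or its texture class count; shape bias is the fraction of those
        trials decided by shape. A value above 0.5 means shape-biased.
        """
        shape_matches = texture_matches = 0
        for pred, shape, texture in zip(predictions, shape_labels, texture_labels):
            if pred == shape:
                shape_matches += 1
            elif pred == texture:
                texture_matches += 1
        decided = shape_matches + texture_matches
        return shape_matches / decided if decided else float("nan")

    # Toy example: 3 of the 4 decided trials follow shape -> 0.75.
    print(shape_bias(
        predictions=["cat", "cat", "elephant", "knife"],
        shape_labels=["cat", "cat", "cat", "knife"],
        texture_labels=["elephant", "dog", "elephant", "clock"],
    ))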
Results
  • This paper explores the origins of texture bias in ImageNet-trained CNNs, looking at the effects of data augmentation, training procedure, model architecture, and task.
  • The authors show that architectures that perform better on ImageNet generally exhibit lower texture bias, but neither architectures designed to match the human visual system nor models that replace convolution with self-attention have texture biases substantially different from ordinary CNNs.
  • In their experiments showing a texture bias in ImageNet-trained CNNs, Geirhos et al [36] followed the standard practice of random-crop augmentation: crop shapes are sampled as random proportions of the original image size from [0.08, 1.0] with aspect ratio sampled from [0.75, 1.33], and resized to 224 × 224 px [90] (a torchvision sketch of this preprocessing follows this list).
  • The authors hypothesized that the development of human-like shape representations might require more diverse training data than what is present in ImageNet. Geirhos et al [91] previously showed that neural style transfer data augmentation induces a model to classify images according to shape.
  • Augmentation that reduced texture bias according to the GST dataset improved accuracy on the ImageNet-Sketch (IN-Sketch) [96] dataset, which consists of human-drawn sketches of each ImageNet class, and Stylized ImageNet (SIN) [36], which was generated by processing the ImageNet test set with a style transfer algorithm.
  • Figure 4 shows the shape bias, shape match, and texture match of 16 high-performing ImageNet models trained with the same hyperparameters (see Appendix E.5.1 for details).
  • Standard ImageNet-trained CNNs are biased towards texture in their classification decisions, but this does not rule out the possibility that shape information is still represented in layers of the model prior to the output.
  • The authors trained linear classifiers that, taking as input activations from a layer of a frozen, ImageNet-trained model, predicted either (i) the shape of a GST image or (ii) its texture; a minimal readout sketch also follows this list.
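
The random-crop preprocessing described above corresponds to torchvision's transforms.RandomResizedCrop defaults [90]. A minimal sketch follows; the Resize(256) step in the center-crop alternative is a common convention assumed here, not a detail taken from this summary.

    import torchvision.transforms as T

    # Standard ImageNet random-crop augmentation: crop area sampled from
    # [0.08, 1.0] of the image, aspect ratio from [3/4, 4/3], then
    # resized to 224 x 224 (the torchvision defaults [90]).
    random_crop = T.Compose([
        T.RandomResizedCrop(224, scale=(0.08, 1.0), ratio=(3 / 4, 4 / 3)),
        T.RandomHorizontalFlip(),
        T.ToTensor(),
    ])

    # A less aggressive center-crop alternative, which preserves more of
    # the object's global shape in every training view.
    center_crop = T.Compose([
        T.Resize(256),        # assumed resize convention, see lead-in
        T.CenterCrop(224),
        T.ToTensor(),
    ])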
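
The decoding experiment in the last bullet can likewise be sketched as linear readouts on frozen features. This is a simplified illustration, not the authors' code: it reads out only the penultimate ResNet-50 layer, and the 16-way output size assumes the GST stimuli's 16 categories.

    import torch
    import torch.nn as nn
    import torchvision.models as models

    # Frozen ImageNet-trained backbone; only the readouts are trainable.
    backbone = models.resnet50(pretrained=True)
    backbone.fc = nn.Identity()      # expose the 2048-d penultimate features
    backbone.eval()
    for p in backbone.parameters():
        p.requires_grad = False

    # Two linear readouts over the same frozen features: one predicts the
    # shape class of a GST image, the other its texture class.
    shape_readout = nn.Linear(2048, 16)
    texture_readout = nn.Linear(2048, 16)

    def readout_logits(images):
        with torch.no_grad():        # features come from the frozen model
            feats = backbone(images)
        return shape_readout(feats), texture_readout(feats)

    # Each readout is trained with cross-entropy against shape or texture
    # labels; high shape-readout accuracy means shape information survives
    # to this layer even when the model's own classifications are
    # texture-biased.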
Conclusion
  • From the perspective of the computer vision practitioner, the results indicate that models that prefer to classify images by shape rather than texture outperform baselines on some out-of-distribution test sets.
  • The authors suggest practical ways to reduce texture bias, for example using less aggressive random-crop augmentation together with naturalistic augmentations such as color distortion, Gaussian noise, and Gaussian blur. They also train classifiers to predict the shape or texture of the GST stimuli given layer activations from frozen ImageNet-trained AlexNet and ResNet-50.
Tables
  • Table 1: Random-crop augmentation biases models towards texture. Characteristics of ImageNet-trained models with random-crop (Random) versus with center-crop (Center) preprocessing. For each model and metric, the preprocessing that achieved a higher value is boldfaced
  • Table 2: Color distortion, Gaussian blur, Gaussian noise, and Sobel filtering reduce texture bias. ResNet-50 models were trained on ImageNet with random crops for 90 epochs. Augmentations were applied with 50% probability (a sketch of such a pipeline follows this list)
  • Table 3: The effect of augmentations that reduce texture bias is additive. ResNet-50 models were trained on ImageNet for 90 epochs with random-crop augmentation and augmentation p = 0.5 unless noted. Augmentations are cumulative across the rows (e.g. the “+ Gaussian blur” model used color distortion and Gaussian blur augmentation). “Stronger” augmentation used p = 0.75; “Longer” training is 270 epochs. The final two models are shape-biased (>50% shape-based classifications)
  • Table 4: Both objective and base architecture affect shape bias. We trained AlexNet and ResNet-50 base architectures on five objectives (rows). For SimCLR, we additionally include a baseline with the same augmentation. We froze the convolutional layers, reinitialized and retrained the fully connected layers, and evaluated the models using the GST stimuli. Shape match is generally higher for models with an AlexNet than ResNet-50 base architecture, while the reverse is true for texture. Rank order of shape bias across tasks is largely preserved across base architectures
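
As a rough illustration of how the augmentations in Tables 2 and 3 compose, here is a torchvision-style training pipeline in which each augmentation fires independently with probability p = 0.5. The jitter, blur, and noise strengths are illustrative assumptions, not the paper's settings.

    import torch
    import torchvision.transforms as T

    # Gaussian noise is applied after ToTensor, on values in [0, 1];
    # std = 0.1 is an illustrative strength.
    def add_gaussian_noise(x, std=0.1):
        return (x + std * torch.randn_like(x)).clamp(0.0, 1.0)

    # Each augmentation fires independently with probability p = 0.5,
    # mirroring the additive design of Tables 2 and 3.
    train_transform = T.Compose([
        T.RandomResizedCrop(224),
        T.RandomHorizontalFlip(),
        T.RandomApply([T.ColorJitter(0.4, 0.4, 0.4, 0.1)], p=0.5),  # color distortion
        T.RandomApply([T.GaussianBlur(kernel_size=23, sigma=(0.1, 2.0))], p=0.5),
        T.ToTensor(),
        T.RandomApply([T.Lambda(add_gaussian_noise)], p=0.5),
    ])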
Related work
  • Adversarial examples. Adversarial examples are small perturbations that cause inputs to be misclassified [87, 10, 70]. Adversarial perturbations are not entirely misaligned with human perception [29, 106], and reflect true features of the training distribution [48]. State-of-the-art defenses on ImageNet use adversarial training [87, 41, 66, 100] or randomized smoothing [17], and images generated by optimizing class confidence under these models are more perceptually recognizable to humans [80, 30, 93, 51]. Adversarial training can improve accuracy on out-of-distribution data [99]. However, models that are robust to adversarial examples generated with respect to an ℓp norm are not necessarily robust to other forms of imperceptible perturbations [98, 92].
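
To make "small perturbations" concrete, the fast gradient sign method of Goodfellow et al. [41] produces a one-step, ℓ∞-bounded adversarial example. A minimal PyTorch sketch; the eps value is an illustrative choice.

    import torch
    import torch.nn.functional as F

    def fgsm(model, x, y, eps=8 / 255):
        """One-step l_inf adversarial example (Goodfellow et al. [41]).

        Moves each pixel of x by eps in the direction that increases the
        classification loss, then clamps back to the valid image range.
        """
        x = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        return (x + eps * x.grad.sign()).clamp(0.0, 1.0).detach()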
Funding
  • KLH was supported by NSF GRFP grant DGE-1656518
References
  • [1] ALCORN, M. A., LI, Q., GONG, Z., WANG, C., MAI, L., KU, W.-S., AND NGUYEN, A. Strike (with) a pose: Neural networks are easily fooled by strange poses of familiar objects. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2019), pp. 4845–4854.
  • [2] THE TENSORFLOW AUTHORS. TensorFlow TPU implementation of ResNet-50. https://github.com/tensorflow/tpu/tree/master/models/official/resnet.
  • [3] AZULAY, A., AND WEISS, Y. Why do deep convolutional networks generalize so poorly to small image transformations? arXiv preprint arXiv:1805.12177 (2018).
  • [4] BAKER, N., LU, H., ERLIKHMAN, G., AND KELLMAN, P. J. Deep convolutional networks do not classify based on global object shape. PLoS Computational Biology 14, 12 (2018), e1006613.
  • [5] BALLESTER, P., AND ARAUJO, R. M. On the performance of GoogLeNet and AlexNet applied to sketches. In Thirtieth AAAI Conference on Artificial Intelligence (2016).
  • [6] BARBU, A., MAYO, D., ALVERIO, J., LUO, W., WANG, C., GUTFREUND, D., TENENBAUM, J., AND KATZ, B. ObjectNet: A large-scale bias-controlled dataset for pushing the limits of object recognition models. In Advances in Neural Information Processing Systems (2019), pp. 9448–9458.
  • [7] BASHIVAN, P., KAR, K., AND DICARLO, J. J. Neural population control via deep image synthesis. Science 364, 6439 (2019), eaav9436.
  • [8] BENGIO, Y., BASTIEN, F., BERGERON, A., BOULANGER-LEWANDOWSKI, N., BREUEL, T., CHHERAWALA, Y., CISSE, M., CÔTÉ, M., ERHAN, D., EUSTACHE, J., ET AL. Deep learners benefit more from out-of-distribution examples. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (2011), pp. 164–172.
  • [9] BERARDINO, A., LAPARRA, V., BALLÉ, J., AND SIMONCELLI, E. Eigen-distortions of hierarchical representations. In Advances in Neural Information Processing Systems (2017), pp. 3530–3539.
  • [10] BIGGIO, B., CORONA, I., MAIORCA, D., NELSON, B., ŠRNDIC, N., LASKOV, P., GIACINTO, G., AND ROLI, F. Evasion attacks against machine learning at test time. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases (2013), Springer, pp. 387–402.
  • [11] BRENDEL, W., AND BETHGE, M. Approximating CNNs with bag-of-local-features models works surprisingly well on ImageNet. arXiv preprint arXiv:1904.00760 (2019).
  • [12] CADIEU, C. F., HONG, H., YAMINS, D. L., PINTO, N., ARDILA, D., SOLOMON, E. A., MAJAJ, N. J., AND DICARLO, J. J. Deep neural networks rival the representation of primate IT cortex for core visual object recognition. PLoS Computational Biology 10, 12 (2014), e1003963.
  • [13] CHEN, T., KORNBLITH, S., NOROUZI, M., AND HINTON, G. A simple framework for contrastive learning of visual representations. arXiv preprint arXiv:2002.05709 (2020).
  • [14] CHILEY, V., SHARAPOV, I., KOSSON, A., KOSTER, U., REECE, R., DE LA FUENTE, S. S., SUBBIAH, V., AND JAMES, M. Online normalization for training neural networks. arXiv preprint arXiv:1905.05894 (2019).
  • [15] CICHY, R. M., KHOSLA, A., PANTAZIS, D., TORRALBA, A., AND OLIVA, A. Comparison of deep neural networks to spatio-temporal cortical dynamics of human visual object recognition reveals hierarchical correspondence. Scientific Reports 6 (2016), 27755.
  • [16] CIREŞAN, D. C., MEIER, U., GAMBARDELLA, L. M., AND SCHMIDHUBER, J. Deep, big, simple neural nets for handwritten digit recognition. Neural Computation 22, 12 (2010), 3207–3220.
  • [17] COHEN, J., ROSENFELD, E., AND KOLTER, Z. Certified adversarial robustness via randomized smoothing. In Proceedings of the 36th International Conference on Machine Learning (Long Beach, California, USA, 09–15 Jun 2019), K. Chaudhuri and R. Salakhutdinov, Eds., vol. 97 of Proceedings of Machine Learning Research, PMLR, pp. 1310–1320.
  • [18] https://github.com/dicarlolab/CORnet.
  • [19] CUBUK, E. D., ZOPH, B., MANE, D., VASUDEVAN, V., AND LE, Q. V. AutoAugment: Learning augmentation strategies from data. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2019), pp. 113–123.
  • [20] CUBUK, E. D., ZOPH, B., SHLENS, J., AND LE, Q. V. RandAugment: Practical data augmentation with no separate search. arXiv preprint arXiv:1909.13719 (2019).
  • [21] DEEPMIND. TensorFlow Hub: bigbigan-resnet50. https://tfhub.dev/deepmind/bigbigan-resnet50/1.
  • [22] DEVRIES, T., AND TAYLOR, G. W. Improved regularization of convolutional neural networks with Cutout. arXiv preprint arXiv:1708.04552 (2017).
  • [23] DODGE, S., AND KARAM, L. A study and comparison of human and deep learning recognition performance under visual distortions. In 2017 26th International Conference on Computer Communication and Networks (ICCCN) (2017), IEEE, pp. 1–7.
  • [24] DONAHUE, J., KRÄHENBÜHL, P., AND DARRELL, T. Adversarial feature learning. arXiv preprint arXiv:1605.09782 (2016).
  • [25] DONAHUE, J., AND SIMONYAN, K. Large scale adversarial representation learning. arXiv preprint arXiv:1907.02544 (2019).
  • [26] DOSOVITSKIY, A., SPRINGENBERG, J. T., RIEDMILLER, M., AND BROX, T. Discriminative unsupervised feature learning with convolutional neural networks. In Advances in Neural Information Processing Systems (2014), pp. 766–774.
  • [27] DUMOULIN, V., BELGHAZI, I., POOLE, B., MASTROPIETRO, O., LAMB, A., ARJOVSKY, M., AND COURVILLE, A. Adversarially learned inference. arXiv preprint arXiv:1606.00704 (2016).
  • [28] EFROS, A. A., AND FREEMAN, W. T. Image quilting for texture synthesis and transfer. In Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques (2001), ACM, pp. 341–346.
  • [29] ELSAYED, G., SHANKAR, S., CHEUNG, B., PAPERNOT, N., KURAKIN, A., GOODFELLOW, I., AND SOHL-DICKSTEIN, J. Adversarial examples that fool both computer vision and time-limited humans. In Advances in Neural Information Processing Systems (2018), pp. 3910–3920.
  • [30] ENGSTROM, L., ILYAS, A., SANTURKAR, S., TSIPRAS, D., TRAN, B., AND MADRY, A. Adversarial robustness as a prior for learned representations. arXiv preprint arXiv:1906.00945 (2019).
  • [31] FAHLMAN, S. E. An empirical study of learning speed in back-propagation networks. Tech. rep., Carnegie Mellon University, Computer Science Department, 1988.
  • [32] FAWZI, A., AND FROSSARD, P. Manitest: Are classifiers really invariant? arXiv preprint arXiv:1507.06535 (2015).
  • [33] FORD, N., GILMER, J., CARLINI, N., AND CUBUK, D. Adversarial examples are a natural consequence of test error in noise. arXiv preprint arXiv:1901.10513 (2019).
  • [34] GATYS, L. A., ECKER, A. S., AND BETHGE, M. Image style transfer using convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016), pp. 2414–2423.
  • [35] GEIRHOS, R., JACOBSEN, J.-H., MICHAELIS, C., ZEMEL, R., BRENDEL, W., BETHGE, M., AND WICHMANN, F. A. Shortcut learning in deep neural networks. arXiv preprint arXiv:2004.07780 (2020).
  • [36] GEIRHOS, R., RUBISCH, P., MICHAELIS, C., BETHGE, M., WICHMANN, F. A., AND BRENDEL, W. ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. In International Conference on Learning Representations (2019).
  • [37] GEIRHOS, R., TEMME, C. R., RAUBER, J., SCHÜTT, H. H., BETHGE, M., AND WICHMANN, F. A. Generalisation in humans and deep neural networks. In Advances in Neural Information Processing Systems (2018), pp. 7538–7550.
  • [38] GERSHKOFF-STOWE, L., AND SMITH, L. B. Shape and the first hundred nouns. Child Development 75, 4 (2004), 1098–1114.
  • [39] GIDARIS, S., SINGH, P., AND KOMODAKIS, N. Unsupervised representation learning by predicting image rotations. arXiv preprint arXiv:1803.07728 (2018).
  • [40] GIRSHICK, R., DONAHUE, J., DARRELL, T., AND MALIK, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2014), pp. 580–587.
  • [41] GOODFELLOW, I. J., SHLENS, J., AND SZEGEDY, C. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572 (2014).
  • [42] HE, K., ZHANG, X., REN, S., AND SUN, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016), pp. 770–778.
  • [43] HEEGER, D. J., AND BERGEN, J. R. Pyramid-based texture analysis/synthesis. In Proceedings of the 22nd Annual Conference on Computer Graphics and Interactive Techniques (1995), Citeseer, pp. 229–238.
  • [44] HENDRYCKS, D., AND DIETTERICH, T. G. Benchmarking neural network robustness to common corruptions and surface variations. arXiv preprint arXiv:1807.01697 (2018).
  • [45] HENDRYCKS, D., MU, N., CUBUK, E. D., ZOPH, B., GILMER, J., AND LAKSHMINARAYANAN, B. AugMix: A simple data processing method to improve robustness and uncertainty. arXiv preprint arXiv:1912.02781 (2019).
  • [46] HOSSEINI, H., AND POOVENDRAN, R. Semantic adversarial examples. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (2018), pp. 1614–1619.
  • [47] HOSSEINI, H., XIAO, B., JAISWAL, M., AND POOVENDRAN, R. Assessing shape bias property of convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (2018), pp. 1923–1931.
  • [48] ILYAS, A., SANTURKAR, S., TSIPRAS, D., ENGSTROM, L., TRAN, B., AND MADRY, A. Adversarial examples are not bugs, they are features. arXiv preprint arXiv:1905.02175 (2019).
  • [49] ImageNet training script for torchvision model implementations (used with modification). https://github.com/pytorch/examples/blob/master/imagenet/main.py.
  • [50] JOHNSON, J., ALAHI, A., AND FEI-FEI, L. Perceptual losses for real-time style transfer and super-resolution. In European Conference on Computer Vision (2016), Springer, pp. 694–711.
  • [51] KAUR, S., COHEN, J., AND LIPTON, Z. C. Are perceptually-aligned gradients a general property of robust classifiers? arXiv preprint arXiv:1910.08640 (2019).
  • [52] KHALIGH-RAZAVI, S.-M., AND KRIEGESKORTE, N. Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Computational Biology 10, 11 (2014), e1003915.
  • [53] KINGMA, D. P., AND BA, J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
  • [54] KOLESNIKOV, A., ZHAI, X., AND BEYER, L. Revisiting self-supervised visual representation learning. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2019).
  • [55] KORNBLITH, S., SHLENS, J., AND LE, Q. V. Do better ImageNet models transfer better? In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2019), pp. 2661–2671.
  • [56] KRIZHEVSKY, A. One weird trick for parallelizing convolutional neural networks. arXiv preprint arXiv:1404.5997 (2014).
  • [57] KRIZHEVSKY, A., SUTSKEVER, I., AND HINTON, G. E. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (2012), pp. 1097–1105.
  • [58] KUBILIUS, J., BRACCI, S., AND DE BEECK, H. P. O. Deep neural networks as a computational model for human shape sensitivity. PLoS Computational Biology 12, 4 (2016), e1004896.
  • [59] KUBILIUS, J., SCHRIMPF, M., NAYEBI, A., BEAR, D., YAMINS, D. L., AND DICARLO, J. J. CORnet: Modeling the neural mechanisms of core object recognition. bioRxiv (2018), 408385.
  • [60] KUCKER, S. C., SAMUELSON, L. K., PERRY, L. K., YOSHIDA, H., COLUNGA, E., LORENZ, M. G., AND SMITH, L. B. Reproducibility and a unifying explanation: Lessons from the shape bias. Infant Behavior and Development (2018).
  • [61] LANDAU, B., SMITH, L. B., AND JONES, S. S. The importance of shape in early lexical learning. Cognitive Development 3, 3 (1988), 299–321.
  • [62] LECUN, Y., BOTTOU, L., BENGIO, Y., AND HAFFNER, P. Gradient-based learning applied to document recognition. Proceedings of the IEEE 86, 11 (1998), 2278–2324.
  • [63] LI, Y., WEI, C., AND MA, T. Towards explaining the regularization effect of initial large learning rate in training neural networks. arXiv preprint arXiv:1907.04595 (2019).
  • [64] LIM, S., KIM, I., KIM, T., KIM, C., AND KIM, S. Fast AutoAugment. In Advances in Neural Information Processing Systems (2019), pp. 6662–6672.
  • [65] LOPES, R. G., YIN, D., POOLE, B., GILMER, J., AND CUBUK, E. D. Improving robustness without sacrificing accuracy with Patch Gaussian augmentation. arXiv preprint arXiv:1906.02611 (2019).
  • [66] MADRY, A., MAKELOV, A., SCHMIDT, L., TSIPRAS, D., AND VLADU, A. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations (2018).
  • [67] MEHTA, D., KIM, K. I., AND THEOBALT, C. On implicit filter level sparsity in convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2019), pp. 520–528.
  • [68] MÜLLER, R., KORNBLITH, S., AND HINTON, G. When does label smoothing help? In Advances in Neural Information Processing Systems (2019).
  • [69] NAVON, D. Forest before trees: The precedence of global features in visual perception. Cognitive Psychology 9, 3 (1977), 353–383.
  • [70] NGUYEN, A., YOSINSKI, J., AND CLUNE, J. Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (June 2015).
  • [71] PONCE, C. R., XIAO, W., SCHADE, P. F., HARTMANN, T. S., KREIMAN, G., AND LIVINGSTONE, M. S. Evolving images for visual neurons using a deep generative network reveals coding principles and neuronal preferences. Cell 177, 4 (2019), 999–1009.
  • [72] PORTILLA, J., AND SIMONCELLI, E. P. A parametric texture model based on joint statistics of complex wavelet coefficients. International Journal of Computer Vision 40, 1 (2000), 49–70.
  • [73] RAGHUNATHAN, A., XIE, S. M., YANG, F., DUCHI, J. C., AND LIANG, P. Adversarial training can hurt generalization. arXiv preprint arXiv:1906.06032 (2019).
  • [74] RAJALINGHAM, R., ISSA, E. B., BASHIVAN, P., KAR, K., SCHMIDT, K., AND DICARLO, J. J. Large-scale, high-resolution comparison of the core visual object recognition behavior of humans, monkeys, and state-of-the-art deep artificial neural networks. Journal of Neuroscience 38, 33 (2018), 7255–7269.
  • [75] RAMACHANDRAN, P., PARMAR, N., VASWANI, A., BELLO, I., LEVSKAYA, A., AND SHLENS, J. Stand-alone self-attention in vision models. arXiv preprint arXiv:1906.05909 (2019).
  • [76] RICHARDWEBSTER, B., ANTHONY, S., AND SCHEIRER, W. PsyPhy: A psychophysics driven evaluation framework for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence (2018).
  • [77] RITTER, S., BARRETT, D. G., SANTORO, A., AND BOTVINICK, M. M. Cognitive psychology for deep neural networks: A shape bias case study. In Proceedings of the 34th International Conference on Machine Learning - Volume 70 (2017), ICML’17, JMLR.org, pp. 2940–2949.
  • [78] RUSAK, E., SCHOTT, L., ZIMMERMANN, R., BITTERWOLF, J., BRINGMANN, O., BETHGE, M., AND BRENDEL, W. Increasing the robustness of DNNs against image corruptions by playing the game of noise. arXiv preprint arXiv:2001.06057 (2020).
  • [79] RUSSAKOVSKY, O., DENG, J., SU, H., KRAUSE, J., SATHEESH, S., MA, S., HUANG, Z., KARPATHY, A., KHOSLA, A., BERNSTEIN, M., ET AL. ImageNet large scale visual recognition challenge. International Journal of Computer Vision 115, 3 (2015), 211–252.
  • [80] SANTURKAR, S., TSIPRAS, D., TRAN, B., ILYAS, A., ENGSTROM, L., AND MADRY, A. Computer vision with a single (robust) classifier. CoRR abs/1906.09453 (2019).
  • [81] SCHRIMPF, M., KUBILIUS, J., HONG, H., MAJAJ, N. J., RAJALINGHAM, R., ISSA, E. B., KAR, K., BASHIVAN, P., PRESCOTT-ROY, J., SCHMIDT, K., ET AL. Brain-Score: Which artificial neural network for object recognition is most brain-like? bioRxiv (2018), 407007.
  • [82] SEABOLD, S., AND PERKTOLD, J. Statsmodels: Econometric and statistical modeling with Python. In 9th Python in Science Conference (2010).
  • [83] SERMANET, P., EIGEN, D., ZHANG, X., MATHIEU, M., FERGUS, R., AND LECUN, Y. OverFeat: Integrated recognition, localization and detection using convolutional networks. arXiv preprint arXiv:1312.6229 (2013).
  • [84] SILBERMAN, N., AND GUADARRAMA, S. TensorFlow-Slim implementation of Inception-ResNet v2. https://github.com/tensorflow/models/tree/master/research/slim.
  • [85] https://github.com/google-research/simclr.
  • [86] SIMONYAN, K., AND ZISSERMAN, A. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014).
  • [87] SZEGEDY, C., ZAREMBA, W., SUTSKEVER, I., BRUNA, J., ERHAN, D., GOODFELLOW, I., AND FERGUS, R. Intriguing properties of neural networks. In International Conference on Learning Representations (2014).
  • [88] TAORI, R., DAVE, A., SHANKAR, V., CARLINI, N., RECHT, B., AND SCHMIDT, L. When robustness doesn’t promote robustness: Synthetic vs. natural distribution shifts on ImageNet, 2020.
  • [89] TORCHVISION CONTRIBUTORS. PyTorch implementations of AlexNet, VGG16, ResNet-50. https://pytorch.org/docs/stable/torchvision/models.html.
  • [90] TORCHVISION CONTRIBUTORS. transforms.RandomResizedCrop. https://github.com/pytorch/vision/blob/7c9bbf5bdf68564a511578177dc8054e3c66fdf3/torchvision/transforms/transforms.py#L601-L604.
  • [91] https://github.com/rgeirhos/texture-vs-shape.
  • [92] TRAMÈR, F., AND BONEH, D. Adversarial training and robustness for multiple perturbations. In Advances in Neural Information Processing Systems (2019).
  • [93] TSIPRAS, D., SANTURKAR, S., ENGSTROM, L., TURNER, A., AND MADRY, A. Robustness may be at odds with accuracy. In International Conference on Learning Representations (2019).
  • [94] TSOTSOS, J. K., KOTSERUBA, I., ANDREOPOULOS, A., AND WU, Y. A possible reason for why data-driven beats theory-driven computer vision. arXiv preprint arXiv:1908.10933 (2019).
  • [95] VISUAL TASK ADAPTATION BENCHMARK. sup-100, rotation, and exemplar TF-Hub modules. https://tfhub.dev/s?publisher=vtab.
  • [96] WANG, H., GE, S., LIPTON, Z., AND XING, E. P. Learning robust global representations by penalizing local predictive power. In Advances in Neural Information Processing Systems (2019), pp. 10506–10518.
  • [97] WILSON, A. C., ROELOFS, R., STERN, M., SREBRO, N., AND RECHT, B. The marginal value of adaptive gradient methods in machine learning. In Advances in Neural Information Processing Systems (2017), pp. 4148–4158.
  • [98] XIAO, C., ZHU, J.-Y., LI, B., HE, W., LIU, M., AND SONG, D. Spatially transformed adversarial examples. In International Conference on Learning Representations (2018).
  • [99] XIE, C., TAN, M., GONG, B., WANG, J., YUILLE, A., AND LE, Q. V. Adversarial examples improve image recognition. arXiv preprint arXiv:1911.09665 (2019).
  • [100] XIE, C., WU, Y., MAATEN, L. V. D., YUILLE, A. L., AND HE, K. Feature denoising for improving adversarial robustness. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2019), pp. 501–509.
  • [101] YAMINS, D. L., HONG, H., CADIEU, C. F., SOLOMON, E. A., SEIBERT, D., AND DICARLO, J. J. Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proceedings of the National Academy of Sciences 111, 23 (2014), 8619–8624.
  • [102] YIN, D., LOPES, R. G., SHLENS, J., CUBUK, E. D., AND GILMER, J. A Fourier perspective on model robustness in computer vision. In Neural Information Processing Systems (2019).
  • [103] YUN, S., HAN, D., OH, S. J., CHUN, S., CHOE, J., AND YOO, Y. CutMix: Regularization strategy to train strong classifiers with localizable features. In Proceedings of the IEEE International Conference on Computer Vision (2019), pp. 6023–6032.
  • [104] ZHAI, X., PUIGCERVER, J., KOLESNIKOV, A., RUYSSEN, P., RIQUELME, C., LUCIC, M., DJOLONGA, J., PINTO, A. S., NEUMANN, M., DOSOVITSKIY, A., ET AL. The visual task adaptation benchmark. arXiv preprint arXiv:1910.04867 (2019).
  • [105] ZHANG, R., ISOLA, P., EFROS, A. A., SHECHTMAN, E., AND WANG, O. The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2018), pp. 586–595.
  • [106] ZHOU, Z., AND FIRESTONE, C. Humans can decipher adversarial images. Nature Communications 10, 1 (2019), 1334.
  • [107] ZHU, Z., XIE, L., AND YUILLE, A. L. Object recognition with and without objects. arXiv preprint arXiv:1611.06596 (2016).