Geometric robustness of deep networks: analysis and improvement

CVPR, 2018.

Abstract:

Deep convolutional neural networks have been shown to be vulnerable to arbitrary geometric transformations. However, there is no systematic method to measure the invariance properties of deep networks to such transformations. We propose ManiFool as a simple yet scalable algorithm to measure the invariance of deep networks. In particular, ...

Introduction
  • Although convolutional neural networks (CNNs) have been largely successful in various applications, they have been shown to be quite vulnerable to additive adversarial perturbations [25, 10, 18], which can negatively affect their applicability in sensitive applications such as autonomous driving [6].
  • The authors focus on studying the robustness of deep networks to geometric transformations in the worst-case regime, as such transformations can be particularly problematic for sensitive applications.
  • The authors approach this problem by searching for minimal 'fooling' transformations, i.e., transformations that change the decision of image classifiers, and use these transformed examples to measure the invariance of a deep network (a plausible formalization is sketched below).
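One plausible formalization of this invariance score, consistent with the description above (the notation is ours, not necessarily the paper's; \mathcal{T} denotes the transformation group with identity e and geodesic distance d_{\mathcal{T}} on its manifold):

    % Minimal fooling transformation for classifier f and image x:
    \[
      \tau^*(x) \;=\; \operatorname*{arg\,min}_{\tau \in \mathcal{T}} \; d_{\mathcal{T}}(e, \tau)
      \quad \text{subject to} \quad f(\tau(x)) \neq f(x).
    \]
    % Empirical invariance score: the average geodesic "size" of the
    % minimal fooling transformations over a set of m test images:
    \[
      \hat{\rho}_{\mathcal{T}}(f) \;=\; \frac{1}{m} \sum_{j=1}^{m} d_{\mathcal{T}}\bigl(e, \tau^*(x_j)\bigr).
    \]

Under this reading, a larger score means a larger transformation is needed to change the network's decision, i.e., a more invariant network.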
Highlights
  • Although convolutional neural networks (CNNs) have been largely successful in various applications, they have been shown to be quite vulnerable to additive adversarial perturbations [25, 10, 18], which can negatively affect their applicability in sensitive applications such as autonomous driving [6]
  • Deep networks have been shown to be vulnerable to rigid geometric transformations [7, 9], which are more natural than additive perturbations because they can represent a change of the viewpoint of an image
  • We focus on studying the robustness of deep networks to geometric transformations in the worst-case regime, as such transformations can be particularly problematic for sensitive applications
  • We have presented a new constructive framework for computing the invariance score of deep image classifiers against geometric transformations
  • We showed that adversarial training using ManiFool improves the robustness of deep networks against both worst-case and random transformations and leads to more invariant networks (a sketch of such a fine-tuning loop follows this list)
  • We believe this process can be used for empirical analysis of neural networks under geometric transformations and can provide a better understanding of invariance to non-additive perturbations and of the properties of different network architectures
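As a rough illustration of what such adversarial fine-tuning could look like, here is a minimal PyTorch-style sketch. The loop structure is our assumption, and find_fooling_transform is a hypothetical stand-in for a ManiFool-style search, not the authors' released code:

    import torch

    def finetune_on_fooling_examples(model, loader, optimizer, loss_fn,
                                     find_fooling_transform):
        """One epoch of fine-tuning on worst-case transformed examples.

        find_fooling_transform(model, img) is a hypothetical helper that
        returns the image warped by the minimal fooling transformation
        found for the current model (or the image unchanged if none is
        found within the search budget).
        """
        model.train()
        for images, labels in loader:
            # Replace each clean image by its worst-case transformed
            # version, so the network learns to classify those correctly.
            adv = torch.stack([find_fooling_transform(model, img) for img in images])
            optimizer.zero_grad()
            loss = loss_fn(model(adv), labels)
            loss.backward()
            optimizer.step()

This mirrors the 'Minimal' setting of Table 2 below, where one extra epoch on ManiFool-transformed examples is compared against random transformations and plain training.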
Results
  • The authors test the algorithm on convolutional neural network architectures.
  • In these experiments, the invariance score for minimal transformations, defined in (5), is computed by finding fooling transformation examples with ManiFool for a set of images and averaging the geodesic distances of these examples.
  • The discrete images after transformation are obtained using bilinear interpolation; they keep the same size as the original image, with zero-padding boundary conditions applied when necessary (a sketch of this resampling step follows this list)
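For concreteness, here is a minimal NumPy sketch of this resampling step: inverse warping under an affine map with bilinear interpolation and zero padding. It is an illustrative implementation written for clarity, not the authors' code, and it assumes a single-channel image.

    import numpy as np

    def affine_warp_bilinear(img, A, t):
        """Warp a grayscale image (H, W) by the affine map p -> A p + t,
        via inverse warping with bilinear interpolation; samples falling
        outside the input are zero-padded, and the output keeps the
        input size."""
        H, W = img.shape
        out = np.zeros((H, W), dtype=float)
        A_inv = np.linalg.inv(np.asarray(A, dtype=float))
        for i in range(H):
            for j in range(W):
                # Source coordinates of output pixel (i, j) under the inverse map.
                y, x = A_inv @ (np.array([i, j], dtype=float) - t)
                i0, j0 = int(np.floor(y)), int(np.floor(x))
                dy, dx = y - i0, x - j0
                val = 0.0
                for di in (0, 1):
                    for dj in (0, 1):
                        ii, jj = i0 + di, j0 + dj
                        if 0 <= ii < H and 0 <= jj < W:  # zero padding outside
                            w_i = dy if di else 1.0 - dy
                            w_j = dx if dj else 1.0 - dx
                            val += w_i * w_j * img[ii, jj]
                out[i, j] = val
        return out

For example, a rotation by angle θ about the origin corresponds to A = [[cos θ, −sin θ], [sin θ, cos θ]] with t = (0, 0); in practice one would rotate about the image center by choosing t accordingly.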
Conclusion
  • The authors have presented a new constructive framework for computing the invariance score of deep image classifiers against geometric transformations.
  • The simple idea behind it is to perform gradient descent on the manifold of geometric transformations: the algorithm iteratively moves towards the class decision boundary while staying on the manifold, generating adversarial examples along the way (see the sketch after this list).
  • Using this method, the authors have studied the robustness of networks trained on ImageNet against worst-case and random transformations.
  • The ManiFool algorithm can also be useful for generating new and practically relevant types of adversarial examples from wider classes of natural transformations.
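To make this idea concrete, here is a heavily simplified PyTorch sketch of such a search. It works in a parameter chart of the transformation manifold and re-applies the transformation to the original image at every step, so iterates stay on the manifold; the real algorithm additionally projects gradients onto the tangent space of the group and measures geodesic distance, which we omit here. apply_transform is a hypothetical differentiable warp (e.g., one built on torch.nn.functional.grid_sample), and the step rule is our assumption:

    import torch

    def fooling_transform_search(model, img, label, apply_transform,
                                 n_params, step=0.05, max_iter=100):
        """Gradient-based search for a fooling geometric transformation.

        theta parameterizes the transformation (theta = 0 is the identity);
        apply_transform(img, theta) must be differentiable in theta.
        """
        theta = torch.zeros(n_params, requires_grad=True)
        for _ in range(max_iter):
            logits = model(apply_transform(img, theta).unsqueeze(0))
            if logits.argmax(dim=1).item() != label:
                return theta.detach()  # decision changed: a fooling transform
            # Move towards the decision boundary by increasing the score of
            # the best competing class relative to the true class.
            top2 = logits[0].topk(2).indices
            other = top2[1] if top2[0].item() == label else top2[0]
            objective = logits[0, other] - logits[0, label]
            (grad,) = torch.autograd.grad(objective, theta)
            with torch.no_grad():
                theta += step * grad / (grad.norm() + 1e-12)
        return None  # no fooling transformation found within the budget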
Tables
  • Table1: Comparison of Manitest and ManiFool for different transformation sets on the MNIST dataset. T, R and S stand for translation, rotation and scaling, respectively, and d is the number of dimensions of the transformation group. The time column lists the average time required to compute one sample. The experiment used a baseline CNN with two convolutional layers; times were measured on a server with two Intel Xeon E5-2680 v3 CPUs, without GPU support
  • Table2: The invariance to affine transformations of ResNet-18 on CIFAR-10 before and after the first epoch of fine-tuning. The invariance score is calculated using 5000 images from the CIFAR-10 test set. 'Minimal', 'Random' and 'Baseline' denote the extra epoch run on, respectively, the transformed dataset created with ManiFool, a dataset created with random transformations, and the original training set
Funding
  • We also gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan X GPU used for this research
  • This work has been partly supported by the Hasler Foundation, Switzerland
Reference
  • [1] P.-A. Absil, R. Mahony, and R. Sepulchre. Optimization Algorithms on Matrix Manifolds. Princeton University Press, Princeton, NJ, 2008.
  • [2] A. Bakry, M. Elhoseiny, T. El-Gaaly, and A. Elgammal. Digging deep into the layers of CNNs: In search of how CNNs achieve view invariance. In International Conference on Learning Representations (ICLR), 2016.
  • [3] S. Baluja and I. Fischer. Adversarial transformation networks: Learning to generate adversarial examples. CoRR, abs/1703.09387, 2017.
  • [4] N. Carlini and D. Wagner. Towards evaluating the robustness of neural networks. In IEEE Symposium on Security and Privacy (SP), pages 39–57, 2017.
  • [5] J. Dai, H. Qi, Y. Xiong, Y. Li, G. Zhang, H. Hu, and Y. Wei. Deformable convolutional networks. In International Conference on Computer Vision (ICCV), 2017.
  • [6] I. Evtimov, K. Eykholt, E. Fernandes, T. Kohno, B. Li, A. Prakash, A. Rahmati, and D. Song. Robust physical-world attacks on deep learning models. arXiv:1707.08945 [cs], 2017.
  • [7] A. Fawzi and P. Frossard. Manitest: Are classifiers really invariant? In British Machine Vision Conference (BMVC), 2015.
  • [8] A. Fawzi and P. Frossard. Measuring the effect of nuisance variables on classifiers. In British Machine Vision Conference (BMVC), pages 106.1–106.13, 2016.
  • [9] I. Goodfellow, H. Lee, Q. V. Le, A. Saxe, and A. Y. Ng. Measuring invariances in deep networks. In Advances in Neural Information Processing Systems, pages 646–654, 2009.
  • [10] I. J. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. In International Conference on Learning Representations (ICLR), 2015.
  • [11] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778, 2016.
  • [12] M. Jaderberg, K. Simonyan, A. Zisserman, et al. Spatial transformer networks. In Advances in Neural Information Processing Systems, pages 2017–2025, 2015.
  • [13] E. Kokiopoulou and P. Frossard. Minimum distance between pattern transformation manifolds: Algorithm and applications. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(7):1225–1238, 2009.
  • [14] A. Krizhevsky and G. Hinton. Learning multiple layers of features from tiny images. Master's thesis, University of Toronto, Department of Computer Science, 2009.
  • [15] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105, 2012.
  • [16] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
  • [17] K. Lenc and A. Vedaldi. Understanding image representations by measuring their equivariance and equivalence. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
  • [18] S.-M. Moosavi-Dezfooli, A. Fawzi, and P. Frossard. DeepFool: A simple and accurate method to fool deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2574–2582, 2016.
  • [19] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3):211–252, 2015.
  • [20] S. Sabour, Y. Cao, F. Faghri, and D. J. Fleet. Adversarial manipulation of deep representations. In International Conference on Learning Representations (ICLR), 2016.
  • [21] J. A. Sethian. A fast marching level set method for monotonically advancing fronts. Proceedings of the National Academy of Sciences, 93(4):1591–1595, 1996.
  • [22] X. Shen, X. Tian, A. He, S. Sun, and D. Tao. Transform-invariant convolutional neural networks for image classification and search. In Proceedings of the 2016 ACM Multimedia Conference (MM '16), pages 1345–1354, 2016.
  • [23] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In International Conference on Learning Representations (ICLR), 2015.
  • [24] S. Soatto and A. Chiuso. Visual representations: Defining properties and deep approximations. In International Conference on Learning Representations (ICLR), 2016.
  • [25] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus. Intriguing properties of neural networks. In International Conference on Learning Representations (ICLR), 2014.
  • [26] L. W. Tu. Differential Geometry, volume 275 of Graduate Texts in Mathematics. Springer International Publishing, Cham, 2017.
  • [27] M. B. Wakin, D. L. Donoho, H. Choi, and R. G. Baraniuk. The multiscale structure of non-differentiable image manifolds. In Optics & Photonics 2005, pages 59141B–59141B. International Society for Optics and Photonics, 2005.