# Geometric robustness of deep networks: analysis and improvement

CVPR, 2018.

Abstract:

Deep convolutional neural networks have been shown to be vulnerable to arbitrary geometric transformations. However, there is no systematic method to measure the invariance properties of deep networks to such transformations. We propose ManiFool as a simple yet scalable algorithm to measure the invariance of deep networks. In particular, ...


Introduction

- Although convolutional neural networks (CNNs) have been largely successful in various applications, they have been shown to be quite vulnerable to additive adversarial perturbations [25, 10, 18], which can negatively affect their applicability in sensitive applications such as autonomous driving [6].
- The authors focus on studying the robustness of deep networks to geometric transformations in the worst-case regime, as such transformations can be quite problematic for sensitive applications.
- The authors approach this problem by searching for minimal 'fooling' transformations, i.e., transformations that change the decision of image classifiers, and use these transformed examples to measure the invariance of a deep network.
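The search for a minimal fooling transformation can be illustrated on a one-parameter group such as rotations: sweep the angle outward from the identity and stop at the first angle that flips the decision. The sketch below uses a toy numpy half-plane classifier and nearest-neighbour rotation; the paper itself uses bilinear interpolation and a manifold-based search, so `rotate`, `min_fooling_rotation` and the toy classifier are illustrative assumptions, not the authors' code:

```python
import numpy as np

def rotate(img, deg):
    """Rotate a 2-D image about its centre with nearest-neighbour
    sampling and zero padding outside the original support."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    t = np.deg2rad(deg)
    ys, xs = np.mgrid[0:h, 0:w]
    # inverse map: for each output pixel, rotate backwards to the source
    sx = np.cos(t) * (xs - cx) + np.sin(t) * (ys - cy) + cx
    sy = -np.sin(t) * (xs - cx) + np.cos(t) * (ys - cy) + cy
    sx = np.rint(sx).astype(int)
    sy = np.rint(sy).astype(int)
    out = np.zeros_like(img)
    ok = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out[ok] = img[sy[ok], sx[ok]]
    return out

def min_fooling_rotation(img, predict, step=1.0, max_deg=180.0):
    """Smallest tested rotation (degrees) that changes the classifier's
    decision, or None if no tested angle fools it."""
    base = predict(img)
    for deg in np.arange(step, max_deg + step, step):
        if predict(rotate(img, deg)) != base:
            return deg
    return None

# toy classifier: label 1 if the top half is brighter than the bottom half
predict = lambda im: int(im[: im.shape[0] // 2].mean() > im[im.shape[0] // 2 :].mean())
img = np.zeros((32, 32)); img[:16] = 1.0   # bright top half, label 1
theta = min_fooling_rotation(img, predict)
```

The angle `theta` found this way plays the role of a (one-dimensional) geodesic distance: the further a network's decision survives along the rotation group, the more invariant it is.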

Highlights

- Although convolutional neural networks (CNNs) have been largely successful in various applications, they have been shown to be quite vulnerable to additive adversarial perturbations [25, 10, 18], which can negatively affect their applicability in sensitive applications such as autonomous driving [6].
- Deep networks have also been shown to be vulnerable to rigid geometric transformations [7, 9], which are more natural than additive perturbations: they can represent a change in the viewpoint of an image.
- We focus on studying the robustness of deep networks to geometric transformations in the worst-case regime, as such transformations can be quite problematic for sensitive applications.
- We have presented a new constructive framework for computing the invariance score of deep image classifiers against geometric transformations.
- We showed that adversarial training using ManiFool improves the robustness of deep networks against both worst-case and random transformations and leads to more invariant networks.
- We believe this process can be used for the empirical analysis of neural networks under geometric transformations and can provide a better understanding of invariance to non-additive perturbations and of the properties of different network architectures.
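The adversarial fine-tuning idea above can be sketched as a dataset-augmentation pass: for each training image, find a fooling transformation with the current model, then train an extra epoch on the transformed copies with their original labels. A minimal sketch, assuming hypothetical helpers `find_fooling_tf` (a ManiFool-style search) and `apply_tf`, neither of which is from the paper:

```python
import numpy as np

def manifool_augment(images, labels, find_fooling_tf, apply_tf):
    """Build a transformed copy of the training set for one extra epoch
    of fine-tuning. `find_fooling_tf` returns a fooling transformation
    for one image (or None if the search fails); `apply_tf` applies it."""
    aug_x, aug_y = [], []
    for x, y in zip(images, labels):
        tf = find_fooling_tf(x)
        # labels are preserved: a geometric transformation does not
        # change the true class of the image
        aug_x.append(x if tf is None else apply_tf(x, tf))
        aug_y.append(y)
    return np.stack(aug_x), np.asarray(aug_y)
```

Training on this augmented set corresponds to the 'Minimal' row of Table2, while replacing `find_fooling_tf` with a random sampler corresponds to the 'Random' row.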

Results

- The authors test the algorithm on convolutional neural network architectures.
- In these experiments, the invariance score for minimal transformations, defined in (5), is calculated by finding fooling transformation examples with ManiFool for a set of images and computing the average geodesic distance of these examples.
- The discrete images after transformation are obtained using bilinear interpolation; they have the same size as the original image, with zero-padding boundary conditions when necessary.
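A minimal sketch of how such an invariance score could be assembled from per-image ManiFool runs; the helper `geodesic_distance` is hypothetical, and the exact normalisation used in Eq. (5) of the paper is not reproduced here:

```python
import numpy as np

def invariance_score(images, geodesic_distance):
    """Average geodesic distance of the minimal fooling transformations
    found over a set of images. `geodesic_distance` is assumed to run a
    ManiFool-style search on one image and return the geodesic distance
    of the fooling transformation it finds, or np.nan on failure."""
    d = np.array([geodesic_distance(x) for x in images], dtype=float)
    d = d[~np.isnan(d)]            # skip images that were never fooled
    return float(d.mean()) if d.size else float("inf")
```

A larger score means the nearest fooling transformation lies further from the identity on average, i.e. the network is more invariant.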

Conclusion

- The authors have presented a new constructive framework for computing the invariance score of deep image classifiers against geometric transformations.
- The simple idea behind it is to perform gradient descent on the manifold of geometric transformations: the algorithm iteratively moves toward the class decision boundary while staying on the manifold, in order to generate adversarial examples.
- Using this method, the authors have studied the robustness of networks trained on ImageNet against worst-case and random transformations.
- The ManiFool algorithm can be useful for generating new and practically relevant types of adversarial examples by using wider types of natural transformations.
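The iteration described above can be caricatured in a few lines: estimate the gradient of the correct-class score with respect to the transformation parameters and step in the direction that lowers it. This flat finite-difference sketch deliberately omits the tangent-space projection and retraction that keep ManiFool on the transformation manifold, and `score` is an assumed black-box classifier score, not the paper's implementation:

```python
import numpy as np

def descent_step(params, score, lr=0.1, eps=1e-4):
    """One simplified ManiFool-style iteration: central finite
    differences estimate the gradient of the correct-class score with
    respect to the transformation parameters, then a normalised step
    decreases that score, moving the transformed image toward the
    decision boundary."""
    g = np.zeros_like(params, dtype=float)
    for i in range(params.size):
        e = np.zeros_like(params, dtype=float)
        e[i] = eps
        g[i] = (score(params + e) - score(params - e)) / (2 * eps)
    return params - lr * g / (np.linalg.norm(g) + 1e-12)
```

Iterating `descent_step` until the predicted label changes yields a fooling transformation; its geodesic distance from the identity then serves as the per-image invariance measurement.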

Tables

- Table1: Comparison of Manitest and ManiFool for different transformation sets on the MNIST dataset. In the table, T, R and S stand for translation, rotation and scaling respectively, while d represents the number of dimensions of the transformation group. The time column lists the average time required to compute one sample. The experiment was done using a baseline CNN with 2 convolutional layers; times were computed on a server with 2 Intel Xeon E5-2680 v3 CPUs without GPU support.
- Table2: The invariance to affine transformations of ResNet-18 on CIFAR-10 before and after the first epoch of fine-tuning. The invariance score is calculated using 5000 images from the CIFAR-10 test set. 'Minimal', 'Random' and 'Baseline' stand for the extra epoch done using the transformed dataset created with ManiFool, the dataset created with random transformations, and the original training set, respectively.

Funding

- We also gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan X GPU used for this research.
- This work has been partly supported by the Hasler Foundation, Switzerland.

References

- P.-A. Absil, R. Mahony, and R. Sepulchre. Optimization Algorithms on Matrix Manifolds. Princeton University Press, Princeton, NJ, 2008.
- A. Bakry, M. Elhoseiny, T. El-Gaaly, and A. Elgammal. Digging Deep into the Layers of CNNs: In Search of How CNNs Achieve View Invariance. In International Conference on Learning Representations (ICLR), 2016.
- S. Baluja and I. Fischer. Adversarial Transformation Networks: Learning to Generate Adversarial Examples. CoRR, abs/1703.09387, Mar. 2017.
- N. Carlini and D. Wagner. Towards evaluating the robustness of neural networks. In IEEE Symposium on Security and Privacy (SP), pages 39–57, 2017.
- J. Dai, H. Qi, Y. Xiong, Y. Li, G. Zhang, H. Hu, and Y. Wei. Deformable Convolutional Networks. In International Conference on Computer Vision (ICCV), 2017.
- I. Evtimov, K. Eykholt, E. Fernandes, T. Kohno, B. Li, A. Prakash, A. Rahmati, and D. Song. Robust Physical-World Attacks on Deep Learning Models. arXiv:1707.08945 [cs], July 2017.
- A. Fawzi and P. Frossard. Manitest: Are classifiers really invariant? In British Machine Vision Conference (BMVC), 2015.
- A. Fawzi and P. Frossard. Measuring the effect of nuisance variables on classifiers. In British Machine Vision Conference (BMVC), pages 106.1–106.13, 2016.
- I. Goodfellow, H. Lee, Q. V. Le, A. Saxe, and A. Y. Ng. Measuring invariances in deep networks. In Advances in Neural Information Processing Systems, pages 646–654, 2009.
- I. J. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. In International Conference on Learning Representations (ICLR), 2015.
- K. He, X. Zhang, S. Ren, and J. Sun. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778, 2016.
- M. Jaderberg, K. Simonyan, A. Zisserman, et al. Spatial transformer networks. In Advances in Neural Information Processing Systems, pages 2017–2025, 2015.
- E. Kokiopoulou and P. Frossard. Minimum Distance between Pattern Transformation Manifolds: Algorithm and Applications. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(7):1225–1238, July 2009.
- A. Krizhevsky and G. Hinton. Learning Multiple Layers of Features from Tiny Images. Master's thesis, University of Toronto, Department of Computer Science, 2009.
- A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105, 2012.
- Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, Nov. 1998.
- K. Lenc and A. Vedaldi. Understanding image representations by measuring their equivariance and equivalence. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
- S.-M. Moosavi-Dezfooli, A. Fawzi, and P. Frossard. DeepFool: A simple and accurate method to fool deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2574–2582, 2016.
- O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision, 115(3):211–252, Dec. 2015.
- S. Sabour, Y. Cao, F. Faghri, and D. J. Fleet. Adversarial Manipulation of Deep Representations. In International Conference on Learning Representations (ICLR), 2016.
- J. A. Sethian. A fast marching level set method for monotonically advancing fronts. Proceedings of the National Academy of Sciences, 93(4):1591–1595, 1996.
- X. Shen, X. Tian, A. He, S. Sun, and D. Tao. Transform-Invariant Convolutional Neural Networks for Image Classification and Search. In Proceedings of the 2016 ACM Multimedia Conference (MM '16), pages 1345–1354, New York, NY, USA, 2016. ACM.
- K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In International Conference on Learning Representations (ICLR), 2015.
- S. Soatto and A. Chiuso. Visual Representations: Defining Properties and Deep Approximations. In International Conference on Learning Representations (ICLR), 2016.
- C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus. Intriguing properties of neural networks. In International Conference on Learning Representations (ICLR), 2014.
- L. W. Tu. Differential Geometry, volume 275 of Graduate Texts in Mathematics. Springer International Publishing, Cham, 2017.
- M. B. Wakin, D. L. Donoho, H. Choi, and R. G. Baraniuk. The multiscale structure of non-differentiable image manifolds. In Optics & Photonics 2005, pages 59141B–59141B. International Society for Optics and Photonics, 2005.
