# Universal adversarial perturbations

CVPR, 2017.

Abstract:

Given a state-of-the-art deep neural network classifier, we show the existence of a universal (image-agnostic) and very small perturbation vector that causes natural images to be misclassified with high probability. We propose a systematic algorithm for computing universal perturbations, and show that state-of-the-art deep neural networks…

Introduction

- The authors show in this paper the existence of quasi-imperceptible universal perturbation vectors that cause natural images to be misclassified with high probability.
- By adding such a quasi-imperceptible perturbation to natural images, the label estimated by the deep neural network is changed with high probability.
- Such perturbations are dubbed universal, as they are image-agnostic.
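The paper's algorithm builds the universal perturbation iteratively: for each image not yet fooled, it computes the minimal extra perturbation that changes the label (using DeepFool in the paper), adds it to the running perturbation, and projects back onto a small ℓp ball. The following is a minimal sketch of that loop, using a toy linear binary classifier in place of the network, for which the minimal ℓ2 flipping step has a closed form; `classify`, `min_perturb`, `xi`, and `delta` are illustrative stand-ins, not the paper's code.

```python
import numpy as np

def project_l2(v, xi):
    """Project v onto the l2 ball of radius xi (the paper also uses l-inf)."""
    norm = np.linalg.norm(v)
    return v if norm <= xi else v * (xi / norm)

def universal_perturbation(X, classify, min_perturb, xi=0.5, delta=0.2, max_iter=10):
    """Sketch of the iterative scheme: accumulate minimal per-image
    perturbations and re-project, until the fooling rate on X exceeds
    1 - delta. classify/min_perturb stand in for the network and the
    DeepFool inner step used in the paper."""
    v = np.zeros(X.shape[1])
    for _ in range(max_iter):
        for x in X:
            if classify(x + v) == classify(x):          # not yet fooled
                v = project_l2(v + min_perturb(x + v), xi)
        fooled = np.mean([classify(x + v) != classify(x) for x in X])
        if fooled >= 1 - delta:
            break
    return v

# Toy demo: linear binary classifier f(x) = sign(w @ x).
w = np.array([1.0, -1.0])
classify = lambda x: int(np.sign(w @ x))
# Minimal l2 step that flips a linear classifier: move just past the boundary.
min_perturb = lambda x: -1.05 * (w @ x) * w / (w @ w)
X = np.array([[1.0, 0.5], [0.8, 0.3], [0.6, 0.1]])
v = universal_perturbation(X, classify, min_perturb)   # fools all three points
```

For a deep network the inner `min_perturb` step has no closed form, which is why the paper solves it with DeepFool at each data point.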

Highlights

- Can we find a single small image perturbation that fools a state-of-the-art deep neural network classifier on all natural images? We show in this paper the existence of such quasi-imperceptible universal perturbation vectors that cause natural images to be misclassified with high probability
- By adding such a quasi-imperceptible perturbation to natural images, the label estimated by the deep neural network is changed with high probability
- We show that universal perturbations have a remarkable generalization property: perturbations computed for a rather small set of training points fool new images with high probability
- We examine the existence of universal perturbations that are common to most data points belonging to the data distribution
- We showed the existence of small universal perturbations that can fool state-of-the-art classifiers on natural images

Results

- The universal perturbations computed for CaffeNet and VGG-F fool more than 90% of the images on the validation set.
- Note, for example, that with a set X containing only 500 images, the authors can fool more than 30% of the images on the validation set.
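The fooling rate reported above is simply the fraction of images whose predicted label changes once the fixed perturbation is added. A minimal sketch of that metric, with a hypothetical threshold classifier standing in for the network:

```python
import numpy as np

def fooling_rate(X, v, classify):
    """Fraction of images whose predicted label changes when the universal
    perturbation v is added -- the metric reported in the paper's tables."""
    preds = np.array([classify(x) for x in X])
    preds_pert = np.array([classify(x + v) for x in X])
    return float(np.mean(preds != preds_pert))

# Toy check with a hypothetical threshold classifier:
classify = lambda x: int(x.sum() > 0)
X = np.array([[0.2, 0.1], [0.4, 0.3], [-1.0, -1.0]])
v = np.array([-0.5, -0.5])
rate = fooling_rate(X, v, classify)  # first two points flip -> 2/3
```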

Conclusion

- The authors showed the existence of small universal perturbations that can fool state-of-the-art classifiers on natural images.

Summary

## Introduction:

- The authors show in this paper the existence of quasi-imperceptible universal perturbation vectors that cause natural images to be misclassified with high probability.
- By adding such a quasi-imperceptible perturbation to natural images, the label estimated by the deep neural network is changed with high probability.
- Such perturbations are dubbed universal, as they are image-agnostic.
## Results:

- The universal perturbations computed for CaffeNet and VGG-F fool more than 90% of the images on the validation set.
- Note, for example, that with a set X containing only 500 images, the authors can fool more than 30% of the images on the validation set.
## Conclusion:

The authors showed the existence of small universal perturbations that can fool state-of-the-art classifiers on natural images.

- Table1: Fooling ratios on the set X, and the validation set
- Table2: Generalizability of the universal perturbations across different networks. The percentages indicate the fooling rates. The rows indicate the architecture for which the universal perturbation is computed, and the columns indicate the architecture on which the fooling rate is reported
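Table 2's cross-network structure can be sketched as a small matrix of fooling rates: evaluate each network's perturbation against every other network. The networks and perturbations below are toy stand-ins, not the paper's models:

```python
import numpy as np

def cross_network_fooling(perturbations, networks, X):
    """Sketch of Table 2: rows index the network a perturbation was computed
    for, columns index the network it is evaluated on; entries are fooling
    rates on X."""
    table = np.zeros((len(perturbations), len(networks)))
    for i, v in enumerate(perturbations):
        for j, f in enumerate(networks):
            preds = np.array([f(x) for x in X])
            preds_pert = np.array([f(x + v) for x in X])
            table[i, j] = np.mean(preds != preds_pert)
    return table

# Two toy "networks" sensitive to different coordinates, and a perturbation
# computed against each one:
f1 = lambda x: int(x[0] > 0)
f2 = lambda x: int(x[1] > 0)
v1 = np.array([-1.0, 0.0])   # fools f1, leaves f2 untouched
v2 = np.array([0.0, -1.0])   # fools f2, leaves f1 untouched
X = np.array([[0.5, 0.5], [0.3, 0.8]])
table = cross_network_fooling([v1, v2], [f1, f2], X)
```

In the paper the off-diagonal entries are surprisingly high, which is what "generalizability across networks" refers to; in this deliberately simple toy they are zero.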

Study subjects and analysis

pairs: 2

Observe that such universal perturbations are different, although they exhibit a similar pattern. This is moreover confirmed by computing the normalized inner products between two pairs of perturbation images: the normalized inner products do not exceed 0.1, which shows that one can find diverse universal perturbations. While the above universal perturbations are computed for a set X of 10,000 images from the training set (i.e., on average 10 images per class), we now examine the influence of the size of X on the quality of the universal perturbation.
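The normalized inner product used above is the cosine similarity between the two perturbations, flattened into vectors; a value near 0 means the perturbations are close to orthogonal, i.e. genuinely diverse. A minimal sketch with toy 2x2 "perturbation images":

```python
import numpy as np

def normalized_inner_product(v1, v2):
    """Cosine of the angle between two flattened perturbation images; values
    near 0 indicate near-orthogonal (diverse) perturbations."""
    v1, v2 = v1.ravel(), v2.ravel()
    return float(v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2)))

# Two near-orthogonal toy "perturbations":
a = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.array([[0.0, 1.0], [-1.0, 0.1]])
score = normalized_inner_product(a, b)  # small, below the paper's 0.1 bound
```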

