Universal adversarial perturbations

CVPR, 2017.

Cited by: 976
Other Links: dblp.uni-trier.de|arxiv.org

Abstract:

Given a state-of-the-art deep neural network classifier, we show the existence of a universal (image-agnostic) and very small perturbation vector that causes natural images to be misclassified with high probability. We propose a systematic algorithm for computing universal perturbations, and show that state-of-the-art deep neural networks are highly vulnerable to such perturbations, albeit being quasi-imperceptible to the human eye.
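The abstract's "systematic algorithm" iterates over the data points, accumulating minimal per-image adversarial steps into a single vector and projecting it back onto a small l_p ball after each update. Below is a minimal sketch of that idea, assuming caller-supplied `classify(x)` (label predictor) and `minimal_perturbation(x)` (a DeepFool-style step returning the smallest perturbation that flips the label); the function names and default parameters are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def project_lp(v, xi, p=np.inf):
    """Project the perturbation v onto the l_p ball of radius xi (p = 2 or inf only)."""
    if p == np.inf:
        return np.clip(v, -xi, xi)
    if p == 2:
        norm = np.linalg.norm(v.ravel())
        return v * min(1.0, xi / (norm + 1e-12))
    raise ValueError("only p = 2 or p = inf are handled in this sketch")

def universal_perturbation(images, classify, minimal_perturbation,
                           xi=10.0, p=np.inf, delta=0.2, max_iters=10):
    """Aggregate per-image adversarial steps into one image-agnostic perturbation v.

    classify(x) -> predicted label; minimal_perturbation(x) -> smallest extra
    perturbation sending x across the decision boundary (e.g. a DeepFool step).
    Iterates until the fooling rate on `images` exceeds 1 - delta or max_iters is hit.
    """
    v = np.zeros_like(images[0])
    for _ in range(max_iters):
        fooled = 0
        for x in images:
            if classify(x + v) == classify(x):    # v does not yet fool this image
                dv = minimal_perturbation(x + v)  # extra push for this image
                v = project_lp(v + dv, xi, p)     # keep v quasi-imperceptible
            else:
                fooled += 1
        if fooled / len(images) > 1 - delta:      # stop once enough images are fooled
            return v
    return v
```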


Introduction
  • The authors show the existence of quasi-imperceptible universal perturbation vectors that cause natural images to be misclassified with high probability.
  • By adding such a quasi-imperceptible perturbation to natural images, the label estimated by the deep neural network is changed with high probability.
  • Such perturbations are dubbed universal, as they are image-agnostic.
Highlights
  • Can we find a single small image perturbation that fools a state-of-the-art deep neural network classifier on all natural images? We show in this paper the existence of such quasi-imperceptible universal perturbation vectors, which cause natural images to be misclassified with high probability.
  • By adding such a quasi-imperceptible perturbation to natural images, the label estimated by the deep neural network is changed with high probability.
  • We show that universal perturbations have a remarkable generalization property, as perturbations computed for a rather small set of training points fool new images with high probability.
  • We examine the existence of universal perturbations that are common to most data points belonging to the data distribution.
  • We showed the existence of small universal perturbations that can fool state-of-the-art classifiers on natural images.
Results
  • The universal perturbations computed for CaffeNet and VGG-F fool more than 90% of the images in the validation set.
  • Note, for example, that with a set X containing only 500 images, the authors can fool more than 30% of the images in the validation set (the fooling-rate metric is sketched below).
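For reference, the fooling rates quoted above are simply the fraction of images whose predicted label changes once the universal perturbation is added. A minimal sketch, assuming a caller-supplied `classify(x)` that returns a label:

```python
def fooling_rate(images, classify, v):
    """Fraction of images whose predicted label changes when perturbation v is added."""
    changed = sum(classify(x + v) != classify(x) for x in images)
    return changed / len(images)
```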
Conclusion
  • The authors showed the existence of small universal perturbations that can fool state-of-the-art classifiers on natural images.
Tables
  • Table 1: Fooling ratios on the set X and on the validation set.
  • Table 2: Generalizability of the universal perturbations across different networks. The percentages indicate the fooling rates. The rows indicate the architecture for which the universal perturbation is computed, and the columns indicate the architecture on which the fooling rate is reported.
Study subjects and analysis
Observe that such universal perturbations are different, although they exhibit a similar pattern. This is confirmed by computing the normalized inner products between two pairs of perturbation images: the normalized inner products do not exceed 0.1, which shows that one can find diverse universal perturbations. While the above universal perturbations are computed for a set X of 10,000 images from the training set (i.e., on average 10 images per class), we now examine the influence of the size of X on the quality of the universal perturbation.
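The normalized inner product used above is the cosine similarity between two perturbation images flattened into vectors; values below 0.1 indicate nearly orthogonal, and hence diverse, perturbations. A minimal sketch (the helper name is ours, not from the paper):

```python
import numpy as np

def normalized_inner_product(v1, v2):
    """Cosine similarity between two perturbation images, flattened to vectors."""
    a, b = np.ravel(v1), np.ravel(v2)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))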
