GeoDA: a geometric framework for black-box adversarial attacks

CVPR, pp. 8443-8452, 2020.

DOI: https://doi.org/10.1109/CVPR42600.2020.00847

Abstract:

Adversarial examples are known as carefully perturbed images fooling image classifiers. We propose a geometric framework to generate adversarial examples in one of the most challenging black-box settings where the adversary can only generate a small number of queries, each of them returning the top-1 label of the classifier. Our frame…

Introduction
  • It has become well known that deep neural networks are vulnerable to small adversarial perturbations, which are carefully designed to cause misclassification in state-of-the-art image classifiers [29].
  • The authors exploit the low mean curvature of the decision boundary in the vicinity of the data samples to effectively estimate the normal vector to the decision boundary.
  • This key prior considerably reduces the number of queries needed to fool the black-box classifier (a minimal sketch of this normal estimation from top-1 label queries follows this list).
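The sketch below illustrates how such a boundary normal can be estimated from label-only feedback. The classifier interface (a hypothetical top1_label function), the Gaussian probe distribution, and the parameter values are illustrative assumptions; the paper's actual estimator additionally exploits a low-frequency subspace prior on the probes.

```python
import numpy as np

def estimate_normal(top1_label, x_boundary, orig_label, n_queries=100, sigma=0.02):
    """Estimate the unit normal to the decision boundary at a point x_boundary
    that lies close to the boundary, using only top-1 label queries.

    If the boundary is locally flat, a probe flips the label roughly when its
    projection onto the true normal is large, so a sign-weighted average of
    the probes points along that normal.
    """
    grad_sum = np.zeros(x_boundary.shape)
    for _ in range(n_queries):
        eta = sigma * np.random.randn(*x_boundary.shape)      # random probe
        flipped = top1_label(x_boundary + eta) != orig_label   # one query
        grad_sum += eta if flipped else -eta                   # signed accumulation
    return grad_sum / np.linalg.norm(grad_sum)                 # unit normal estimate
```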
Highlights
  • It has become well known that deep neural networks are vulnerable to small adversarial perturbations, which are carefully designed to cause misclassification in state-of-the-art image classifiers [29]
  • We propose a new geometric framework for designing query-efficient decision-based black-box attacks, in which the attacker only has access to the top-1 label of the classifier
  • Our method relies on the key observation that the curvature of the decision boundary of deep networks is small in the vicinity of data samples
  • In the particular case of l2-norm attacks, we show theoretically that our algorithm converges to the minimal adversarial perturbation, and that the number of queries at each step of the iterative search can be optimized mathematically
  • We study Geometric Decision-based Attack through extensive experiments that confirm its superior performance compared to state-of-the-art black-box attacks
Methods
  • The authors evaluate the algorithms on a pre-trained ResNet50 [18] with a set X of 350 correctly classified and randomly selected images from the ILSVRC2012’s validation set [10].
  • All the images are resized to 224 × 224 × 3.
  • For sparse perturbations, the attack is applied to every image x ∈ X using Algorithm 2 (sparse GeoDA).
  • Algorithm 2 binary-searches the number of perturbed coordinates: initialize m_l = 0, m_u = d, and J = round(log2(d)) + 1; at each of the J iterations, set k = round((m_u + m_l)/2), keep only the top-k coordinates of the dense perturbation as the sparse perturbation w_sp, and query the classifier to check whether it still changes the label (see the sketch after this list).
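Below is a hedged sketch of the binary search described in Algorithm 2, written against the same hypothetical top1_label interface as above; the helper name, the single-query feasibility test, and the return convention are illustrative assumptions rather than the paper's exact implementation (which also handles clipping to the valid pixel range).

```python
import numpy as np

def sparsify_perturbation(top1_label, x, orig_label, v_dense):
    """Binary search over the number of kept coordinates k (Algorithm 2 sketch).

    Keeps only the k largest-magnitude entries of a dense adversarial
    perturbation v_dense and checks, with one top-1 query per step, whether
    the sparsified perturbation still fools the classifier.
    """
    d = v_dense.size
    order = np.argsort(-np.abs(v_dense).ravel())   # coordinates sorted by magnitude
    m_l, m_u = 0, d
    J = int(round(np.log2(d))) + 1                 # ~log2(d) + 1 steps, as in Algorithm 2
    for _ in range(J):
        k = int(round((m_u + m_l) / 2))
        w_sp = np.zeros(d)
        w_sp[order[:k]] = v_dense.ravel()[order[:k]]
        if top1_label(x + w_sp.reshape(x.shape)) != orig_label:
            m_u = k                                # still adversarial: try fewer coordinates
        else:
            m_l = k                                # no longer adversarial: keep more
    return m_u                                     # smallest k found to remain adversarial
```

For a 224 × 224 × 3 image (d ≈ 150k), this amounts to roughly 18 extra queries spent on sparsification.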
Conclusion
  • The authors propose a new geometric framework for designing query-efficient decision-based black-box attacks, in which the attacker only has access to the top-1 label of the classifier.
  • The authors' method relies on the key observation that the curvature of the decision boundary of deep networks is small in the vicinity of data samples.
  • This makes it possible to estimate the normals to the decision boundary with a small number of queries to the classifier, and eventually to design query-efficient lp-norm attacks.
  • The authors study GeoDA through extensive experiments that confirm its superior performance compared to state-of-the-art black-box attacks (a sketch of the resulting iterative l2 attack loop follows this list).
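To make the overall procedure concrete, here is a hedged sketch of an iterative l2 attack loop in the spirit of GeoDA, reusing the estimate_normal sketch given earlier. The step-size search, iteration counts, and function names are illustrative assumptions, not the paper's exact algorithm (which, for instance, optimizes how the query budget is split across iterations).

```python
import numpy as np

def geoda_l2(top1_label, x, orig_label, x_adv_init, n_iters=10, queries_per_iter=100):
    """Iterative l2 attack loop in the spirit of GeoDA (illustrative sketch).

    Each iteration (1) estimates the boundary normal at the current boundary
    point from label-only queries and (2) moves from the clean image x along
    that normal, bisecting the smallest step that still flips the label.
    """
    x_b = x_adv_init                                  # any adversarial starting point
    for _ in range(n_iters):
        w = estimate_normal(top1_label, x_b, orig_label, queries_per_iter)
        hi = np.linalg.norm(x_b - x)                  # candidate step size along w
        while top1_label(x + hi * w) == orig_label:   # ensure the step crosses the boundary
            hi *= 2.0
        lo = 0.0
        for _ in range(15):                           # bisect the crossing step size
            mid = 0.5 * (lo + hi)
            if top1_label(x + mid * w) != orig_label:
                hi = mid
            else:
                lo = mid
        x_b = x + hi * w                              # closer boundary point for next round
    return x_b                                        # adversarial example with reduced l2 norm
```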
Tables
  • Table 1: Performance comparison of GeoDA with BA and HSJA in terms of median l2 and l∞ perturbation norms on the ImageNet dataset
  • Table 2: Performance of the black-box sparse GeoDA in terms of median sparsity, compared to the white-box attack SparseFool [2], on the ImageNet dataset
  • Table 3: Performance comparison of GeoDA on different ResNet image classifiers
Related work
  • Adversarial examples can be crafted in the white-box setting [15, 27, 3], the score-based black-box setting [28, 6, 20], or the decision-based black-box setting [4, 2, 22]. The decision-based setting is the most challenging, since little is known about the target classifier. Several recent works address black-box attacks on image classifiers [20, 21, 32], but they assume that the loss function, the prediction probabilities, or the top few sorted labels are available, which may be unrealistic in many real-world scenarios. In the most challenging setting, a few attacks exploit only the top-1 label returned by the classifier, including the Boundary Attack (BA) [2], the HopSkipJump Attack (HSJA) [5], the OPT attack [8], and qFool [22]. In [2], BA starts from a large adversarial perturbation and iteratively reduces its norm. In [5], the authors build on [2] and improve BA by taking advantage of an estimated gradient; this attack is quite query efficient and can be regarded as the state-of-the-art baseline in the black-box setting. In [8], an optimization-based hard-label black-box attack with a guaranteed convergence rate is introduced, outperforming BA in terms of the number of queries. Closest to our work, [22] proposes a heuristic algorithm based on estimating the normal vector to the decision boundary for the case of l2-norm perturbations.
Funding
  • This work was supported in part by the US National Science Foundation under grants ECCS-1444009 and CNS-1824518
  • M. is supported by a Google Postdoctoral Fellowship
References
  • Stephen Boyd and Lieven Vandenberghe. Convex optimization. Cambridge University Press, 2004.
  • Wieland Brendel, Jonas Rauber, and Matthias Bethge. Decision-based adversarial attacks: Reliable attacks against black-box machine learning models. arXiv preprint arXiv:1712.04248, 2017.
  • Nicholas Carlini and David Wagner. Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP), pages 39–57, 2017.
  • Jianbo Chen and Michael I. Jordan. Boundary attack++: Query-efficient decision-based adversarial attack. arXiv preprint arXiv:1904.02144, 2019.
  • Jianbo Chen, Michael I. Jordan, and Martin J. Wainwright. HopSkipJumpAttack: A query-efficient decision-based attack. arXiv preprint arXiv:1904.02144, 2019.
  • Pin-Yu Chen, Huan Zhang, Yash Sharma, Jinfeng Yi, and Cho-Jui Hsieh. ZOO: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, pages 15–26. ACM, 2017.
  • Steven Chen, Nicholas Carlini, and David Wagner. Stateful detection of black-box adversarial attacks. arXiv preprint arXiv:1907.05587, 2019.
  • Minhao Cheng, Thong Le, Pin-Yu Chen, Jinfeng Yi, Huan Zhang, and Cho-Jui Hsieh. Query-efficient hard-label black-box attack: An optimization-based approach. arXiv preprint arXiv:1807.04457, 2018.
  • Shuyu Cheng, Yinpeng Dong, Tianyu Pang, Hang Su, and Jun Zhu. Improving black-box adversarial attacks with a transfer-based prior. arXiv preprint arXiv:1906.06919, 2019.
  • Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255, 2009.
  • Aditya Devarakonda, Maxim Naumov, and Michael Garland. AdaBatch: Adaptive batch sizes for training deep neural networks. arXiv preprint arXiv:1712.02029, 2017.
  • Alhussein Fawzi, Seyed-Mohsen Moosavi-Dezfooli, and Pascal Frossard. Robustness of classifiers: from adversarial to random noise. In Advances in Neural Information Processing Systems, pages 1632–1640, 2016.
  • Alhussein Fawzi, Seyed-Mohsen Moosavi-Dezfooli, and Pascal Frossard. The robustness of deep networks: A geometrical perspective. IEEE Signal Processing Magazine, 34(6):50–62, 2017.
  • Alhussein Fawzi, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard, and Stefano Soatto. Empirical study of the topology and geometry of deep networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3762–3770, 2018.
  • Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572, 2014.
  • Chuan Guo, Jacob R. Gardner, Yurong You, Andrew Gordon Wilson, and Kilian Q. Weinberger. Simple black-box adversarial attacks. arXiv preprint arXiv:1905.07121, 2019.
  • David Lee Hanson and Farroll Tim Wright. A bound on tail probabilities for quadratic forms in independent random variables. The Annals of Mathematical Statistics, 42(3):1079–1083, 1971.
  • Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
  • Nicholas J. Higham. Analysis of the Cholesky decomposition of a semi-definite matrix. Oxford University Press, 1990.
  • Andrew Ilyas, Logan Engstrom, Anish Athalye, and Jessy Lin. Black-box adversarial attacks with limited queries and information. arXiv preprint arXiv:1804.08598, 2018.
  • Andrew Ilyas, Logan Engstrom, and Aleksander Madry. Prior convictions: Black-box adversarial attacks with bandits and priors. arXiv preprint arXiv:1807.07978, 2018.
  • Yujia Liu, Seyed-Mohsen Moosavi-Dezfooli, and Pascal Frossard. A geometry-inspired decision-based attack. arXiv preprint arXiv:1903.10826, 2019.
  • Gábor Lugosi and Shahar Mendelson. Sub-Gaussian estimators of the mean of a random vector. The Annals of Statistics, 47(2):783–794, 2019.
  • Rachid Marsli. Bounds for the smallest and largest eigenvalues of Hermitian matrices. International Journal of Algebra, 9(8):379–394, 2015.
  • Seyed-Mohsen Moosavi-Dezfooli. Geometry of adversarial robustness of deep networks: methods and applications. Technical report, EPFL, 2019.
  • Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, Omar Fawzi, and Pascal Frossard. Universal adversarial perturbations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1765–1773, 2017.
  • Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, and Pascal Frossard. DeepFool: A simple and accurate method to fool deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2574–2582, 2016.
  • Nina Narodytska and Shiva Prasad Kasiviswanathan. Simple black-box adversarial perturbations for deep networks. arXiv preprint arXiv:1612.06299, 2016.
  • Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199, 2013.
  • G. M. Tallis. Plane truncation in normal populations. Journal of the Royal Statistical Society: Series B (Methodological), 27(2):301–307, 1965.
  • Florian Tramèr, Nicolas Papernot, Ian Goodfellow, Dan Boneh, and Patrick McDaniel. The space of transferable adversarial examples. arXiv preprint arXiv:1704.03453, 2017.
  • Chun-Chen Tu, Paishun Ting, Pin-Yu Chen, Sijia Liu, Huan Zhang, Jinfeng Yi, Cho-Jui Hsieh, and Shin-Ming Cheng. AutoZOOM: Autoencoder-based zeroth order optimization method for attacking black-box neural networks. arXiv preprint arXiv:1805.11770, 2018.
  • Chun-Chen Tu, Paishun Ting, Pin-Yu Chen, Sijia Liu, Huan Zhang, Jinfeng Yi, Cho-Jui Hsieh, and Shin-Ming Cheng. AutoZOOM: Autoencoder-based zeroth order optimization method for attacking black-box neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 742–749, 2019.
  • Pu Zhao, Sijia Liu, Pin-Yu Chen, Nghia Hoang, Kaidi Xu, Bhavya Kailkhura, and Xue Lin. On the design of black-box adversarial examples by leveraging gradient-free optimization and operator splitting method. In Proceedings of the IEEE International Conference on Computer Vision, pages 121–130, 2019.