## AI Digest

An AI-extracted summary of this paper.

# Safety Verification Of Deep Neural Networks

Computer Aided Verification (CAV 2017), Part I, pp. 3-29 (2017)

Abstract

Deep neural networks have achieved impressive experimental results in image classification, but can surprisingly be unstable with respect to adversarial perturbations, that is, minimal changes to the input image that cause the network to misclassify it. With potential applications including perception modules and end-to-end controllers for self-driving cars, this raises concerns about their safety. …

Introduction

- Deep neural networks have achieved impressive experimental results in image classification, matching the cognitive ability of humans [23] in complex tasks with thousands of classes.
- Let Rn be a vector space of images that the authors wish to classify, and assume that f : Rn → C, where C is a set of class labels, models the human perception capability; a neural network classifier is then a function f̂(x) that approximates f(x) from M training examples {(x_i, c_i)}, i = 1, …, M.
- Formally, a network is modelled as a tuple (L, T, Φ), where L = {Lk | k ∈ {0, …, n}} is a set of layers, T ⊆ L × L is a set of sequential connections between layers, and Φ = {φk | k ∈ {1, …, n}} is a set of activation functions φk : DLk−1 → DLk, one for each non-input layer (see the sketch after this list).
- Layers other than input and output layers are called the hidden layers
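
To make the layered view above concrete, the sketch below models a feed-forward network as a family of activation functions φk, one per non-input layer, and a classifier f̂ that propagates an input through them and returns the arg-max class. This is not the authors' code; the layer sizes, weights, and the choice of ReLU activations are assumptions made purely for illustration.

```python
import numpy as np

# A feed-forward network N = (L, T, Phi): each non-input layer k has an
# activation function phi_k mapping D_{L_{k-1}} to D_{L_k}. Here each phi_k
# is an affine map followed by a ReLU (hypothetical weights for illustration).
rng = np.random.default_rng(0)
layer_sizes = [2, 8, 8, 3]  # L_0 (input) ... L_3 (output), chosen arbitrarily
weights = [rng.normal(size=(m, n)) for n, m in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(m) for m in layer_sizes[1:]]

def phi(k, v):
    """Activation function of layer k (1-indexed), mapping layer k-1 to layer k."""
    z = weights[k - 1] @ v + biases[k - 1]
    return np.maximum(z, 0.0) if k < len(layer_sizes) - 1 else z  # ReLU hidden, linear output

def classify(x):
    """f_hat(x): propagate x through all layers and return the arg-max class label."""
    v = np.asarray(x, dtype=float)
    for k in range(1, len(layer_sizes)):
        v = phi(k, v)
    return int(np.argmax(v))

print(classify([0.3, -1.2]))  # a class label in {0, 1, 2}
```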

Highlights

- Deep neural networks have achieved impressive experimental results in image classification, matching the cognitive ability of humans [23] in complex tasks with thousands of classes
- Since we reduce verification to a search for adversarial examples, we can achieve safety verification or falsification
- The neural networks are built with the widely used neural network library Keras [3], with the deep learning package Theano [6] as its backend (a minimal Keras sketch follows this list)
- We validate DLV on a set of experiments performed for neural networks trained for classification based on a predefined multi-dimensional surface, as well as image classification
- This paper presents an automated verification framework for checking safety of deep neural networks that is based on a systematic exploration of a region around a data point to search for adversarial manipulations of a given type, and propagating the analysis into deeper layers
- The network is trained with 5,000 points sampled from the provided two-dimensional space, and has an accuracy of more than 99%
- The results are encouraging, with adversarial examples found in some cases in a matter of seconds when working with few dimensions, but the verification process itself is exponential in the number of features and has prohibitive complexity for larger images
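
The highlights mention that the networks are built with Keras on a Theano backend, and that the two-dimensional benchmark network is trained on 5,000 sampled points to over 99% accuracy. The summary does not give the exact architecture, so the following Keras sketch is only indicative: the labelling function, layer sizes, optimiser, and training schedule are assumptions for illustration, not the configuration used in the paper.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Hypothetical stand-in for a predefined two-dimensional surface: label a point
# by which side of a curve it falls on (assumption made for illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(5000, 2))
y = (X[:, 1] > np.sin(3 * X[:, 0])).astype("float32")

# Small fully connected classifier; the layer sizes used in the paper may differ.
model = Sequential([
    Dense(32, activation="relu", input_shape=(2,)),
    Dense(32, activation="relu"),
    Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=20, batch_size=64, verbose=0)
print(model.evaluate(X, y, verbose=0))  # [loss, accuracy]
```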

Results

- The proposed framework has been implemented as a software tool called DLV (Deep Learning Verification) [2] written in Python; see the Appendix of [20] for details of the input parameters and how to use the tool.
- The authors validate DLV on a set of experiments performed for neural networks trained for classification based on a predefined multi-dimensional surface, as well as image classification.
- These networks respectively use two representative types of layers: fully connected layers and convolutional layers.
- The first three demonstrate the single-path search functionality on the Euclidean (L2) norm, whereas the fourth (GTSRB) demonstrates multi-path search for the L1 and L2 norms (the distance computation is illustrated below)
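
The single-path and multi-path searches above are driven by the L1 and L2 distances between an image and its manipulated version. The small worked example below uses hypothetical pixel values, not data from the paper's experiments.

```python
import numpy as np

def l1_distance(x, x_adv):
    """L1 (Manhattan) distance between an image and its manipulated version."""
    return float(np.sum(np.abs(x - x_adv)))

def l2_distance(x, x_adv):
    """L2 (Euclidean) distance between an image and its manipulated version."""
    return float(np.sqrt(np.sum((x - x_adv) ** 2)))

# Toy example: changing two pixels of a 3x3 "image" by 0.5 each.
x = np.zeros((3, 3))
x_adv = x.copy()
x_adv[0, 0] += 0.5
x_adv[2, 1] -= 0.5
print(l1_distance(x, x_adv))  # 1.0
print(l2_distance(x, x_adv))  # ~0.707
```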

Conclusion

- This paper presents an automated verification framework for checking safety of deep neural networks that is based on a systematic exploration of a region around a data point to search for adversarial manipulations of a given type, and propagating the analysis into deeper layers (a simplified sketch of this region search follows this list).
- It would be interesting to see if the notions of regularity suggested in [24] permit a symbolic approach, and whether an abstraction refinement framework can be formulated to improve the scalability and computational performance
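
As a rough illustration of the idea summarised above, the sketch below exhaustively enumerates a discretised grid of perturbations around a data point and reports any manipulation that changes the classification. It is a deliberately simplified, input-layer-only caricature of the framework, not the DLV implementation: the radius and step size are hypothetical, and the real tool also propagates the analysis into deeper layers with layer-specific manipulations. Its cost grows exponentially with the number of perturbed dimensions, which matches the complexity remark in the highlights.

```python
import itertools
import numpy as np

def search_region(classify, x, radius=0.1, step=0.05):
    """Enumerate a discretised grid of perturbations around x and return the
    first candidate that changes the classification, or None if the whole
    discretised region is classified consistently."""
    x = np.asarray(x, dtype=float)
    original_class = classify(x)
    offsets = np.arange(-radius, radius + 1e-9, step)
    for delta in itertools.product(offsets, repeat=x.size):
        candidate = x + np.asarray(delta)
        if classify(candidate) != original_class:
            return candidate  # adversarial example found: safety falsified
    return None               # no misclassification found in the discretised region

# Usage with any classifier function, e.g. the `classify` sketch shown earlier:
# adv = search_region(classify, [0.3, -1.2], radius=0.2, step=0.1)
```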

- Table 1: FGSM vs. DLV (on a single path) vs. JSMA

Related Work

- AI safety is recognised as an important problem, see e.g., [10,33]. An early verification approach for neural networks was proposed in [30], where, using the notation of this paper, safety is defined as the existence, for all inputs in a region η0 ∈ DL0, of a corresponding output in another region ηn ⊆ DLn. They encode the entire network as a set of constraints, approximating the sigmoid using constraints, which can then be solved by a SAT solver, but their approach only works with 6 neurons (3 hidden neurons). A similar idea is presented in [32]. In contrast, we work layer by layer and obtain much greater scalability. Since the first version of this paper appeared [20], another constraint-based method has been proposed in [21] which improves on [30]. While they consider more general correctness properties than this paper, they can only handle the ReLU activation functions, by extending the Simplex method to work with the piecewise linear ReLU functions that cannot be expressed using linear programming. This necessitates a search tree (instead of a search path as in Simplex), for which a heuristic search is proposed and shown to be complete. The approach is demonstrated on networks with 300 ReLU nodes, but as it encodes the full network it is unclear whether it can be scaled to work with practical deep neural networks: for example, the MNIST network has 630,016 ReLU nodes. They also handle continuous spaces directly without discretisation, the benefits of which are not yet clear, since it is argued in [19] that linear behaviour in high-dimensional spaces is sufficient to cause adversarial examples.
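
For a flavour of what the constraint-based encodings discussed above look like, the toy example below expresses a single ReLU node as a piecewise-linear constraint in Z3's Python API and checks a simple output property on an input region. The weights and the property are made up for illustration; this is not the encoding used by [30], [21], or DLV.

```python
from z3 import Real, If, Solver, And, sat

# Encode one hidden ReLU neuron h = max(0, 2*x - 1) and an output y = h - 0.5,
# then ask whether some input in [0, 1] makes the output negative.
# (Toy constraint encoding in the spirit of the cited approaches; weights are made up.)
x, h, y = Real("x"), Real("h"), Real("y")

s = Solver()
s.add(And(x >= 0, x <= 1))                    # input region
s.add(h == If(2 * x - 1 >= 0, 2 * x - 1, 0))  # piecewise-linear ReLU
s.add(y == h - 0.5)                           # output layer
s.add(y < 0)                                  # negated property: output is negative

if s.check() == sat:
    print("counterexample input:", s.model()[x])
else:
    print("property holds on the region")
```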

Funding

- This work is supported by the EPSRC Programme Grant on Mobile Autonomy (EP/M019918/1)

Study Subjects and Analysis

pairs: 4

a perception module of a self-driving car may input an image from a camera and must correctly classify the type of object in its view, irrespective of aspects such as the angle of its vision and image imperfections. Therefore, though they clearly include imperfections, all four pairs of images in Fig. 1 should arguably be classified as automobiles, since they appear so to a human eye. Classifiers employed in vision tasks are typically multi-layer networks, which propagate the input image through a series of linear and non-linear operators

pairs: 16

We also work with 500 dimensions and otherwise the same experimental parameters as for MNIST. Figure 13 in Appendix of [20] gives 16 pairs of original images (classified correctly) and perturbed images (classified wrongly). We found that, while the manipulations lead to human-recognisable modifications to the images, the perturbed images can be classified wrongly by the network

References

- 1. CIFAR10 model for Keras. https://github.com/fchollet/keras/blob/master/examples/cifar10_cnn.py
- 2. DLV. https://github.com/verideep/dlv
- 3. Keras. https://keras.io
- 4. Large scale visual recognition challenge. http://www.image-net.org/challenges/
- 5. MNIST, CNN network. https://github.com/fchollet/keras/blob/master/examples/mnist_cnn.py
- 6. Theano. http://deeplearning.net/software/theano/
- 7. VGG16 model for Keras. https://gist.github.com/baraldilorenzo/07d7802847aaad0a35d3
- 8. Z3. http://rise4fun.com/z3
- 9. Ambrosio, L., Fusco, N., Pallara, D.: Functions of Bounded Variation and Free Discontinuity Problems. Oxford Mathematical Monographs. Oxford University Press, Oxford (2000)
- 10. Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., Mane, D.: Concrete problems in AI safety. CoRR, abs/1606.06565 (2016)
- 11. Anselmi, F., Leibo, J.Z., Rosasco, L., Mutch, J., Tacchetti, A., Poggio, T.: Unsupervised learning of invariant representations. Theoret. Comput. Sci. 633, 112–121 (2016)
- 12. Bastani, O., Ioannou, Y., Lampropoulos, L., Vytiniotis, D., Nori, A., Criminisi, A.: Measuring neural net robustness with constraints. CoRR, abs/1605.07262 (2016). (To appear in NIPS)
- 13. Biggio, B., Corona, I., Maiorca, D., Nelson, B., Srndic, N., Laskov, P., Giacinto, G., Roli, F.: Evasion attacks against machine learning at test time. In: Blockeel, H., Kersting, K., Nijssen, S., Zelezny, F. (eds.) ECML PKDD 2013. LNCS, vol. 8190, pp. 387–402. Springer, Heidelberg (2013). doi:10.1007/978-3-642-40994-3_25
- 14. Bishop, C.M.: Neural Networks for Pattern Recognition. Oxford University Press, Oxford (1995)
- 15. Bojarski, M., Del Testa, D., Dworakowski, D., Firner, B., Flepp, B., Goyal, P., Jackel, L.D., Monfort, M., Muller, U., Zhang, J., Zhang, X., Zhao, J., Zieba, K.: End to end learning for self-driving cars. arXiv:1604.07316 (2016)
- 16. Carlsson, G.E., Ishkhanov, T., de Silva, V., Zomorodian, A.: On the local behavior of spaces of natural images. Int. J. Comput. Vis. 76(1), 1–12 (2008)
- 17. Hendricks, L.A., Park, D.H., Akata, Z., Schiele, B., Darrell, T., Rohrbach, M.: Attentive explanations: justifying decisions and pointing to the evidence. arXiv.org/abs/1612.04757 (2016)
- 18. Fawzi, A., Fawzi, O., Frossard, P.: Analysis of classifiers’ robustness to adversarial perturbations. CoRR, abs/1502.02590 (2015)
- 19. Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. CoRR, abs/1412.6572 (2014)
- 20. Huang, X., Kwiatkowska, M., Wang, S., Wu, M.: Safety verification of deep neural networks (2016). https://arxiv.org/abs/1610.06940
- 21. Katz, G., Barrett, C., Dill, D., Julian, K., Kochenderfer, M.: Reluplex: an efficient SMT solver for verifying deep neural networks. In: CAV 2017 (2017, to appear)
- 22. Kurakin, A., Goodfellow, I., Bengio, S.: Adversarial examples in the physical world. arXiv:1607.02533 (2016)
- 23. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521, 436–444 (2015)
- 24. Mallat, S.: Understanding deep convolutional networks. Philos. Trans. R. Soc. Lond. A: Math. Phys. Eng. Sci. 374(2065) (2016). ISSN 1364-503X. doi:10.1098/rsta.2015.0203
- 25. Moosavi-Dezfooli, S.-M., Fawzi, A., Frossard, P.: DeepFool: a simple and accurate method to fool deep neural networks. CoRR, abs/1511.04599 (2015)
- 26. Nguyen, A., Yosinski, J., Clune, J.: Deep neural networks are easily fooled: high confidence predictions for unrecognizable images. In: Computer Vision and Pattern Recognition (CVPR 2015) (2015)
- 27. Papernot, N., Goodfellow, I., Sheatsley, R., Feinman, R., McDaniel, P.: Cleverhans v1.0.0: an adversarial machine learning library. arXiv preprint arXiv:1610.00768 (2016)
- 28. Papernot, N., McDaniel, P., Jha, S., Fredrikson, M., Celik, Z.B., Swami, A.: The limitations of deep learning in adversarial settings. In: Proceedings of the 1st IEEE European Symposium on Security and Privacy (2015)
- 29. Papernot, N., McDaniel, P.D., Goodfellow, I.J., Jha, S., Celik, Z.B., Swami, A.: Practical black-box attacks against deep learning systems using adversarial examples. CoRR, abs/1602.02697 (2016)
- 30. Pulina, L., Tacchella, A.: An abstraction-refinement approach to verification of artificial neural networks. In: Touili, T., Cook, B., Jackson, P. (eds.) CAV 2010. LNCS, vol. 6174, pp. 243–257. Springer, Heidelberg (2010). doi:10.1007/978-3-642-14295-6_24
- 31. Ribeiro, M.T., Singh, S., Guestrin, C.: "Why should I trust you?": explaining the predictions of any classifier. In: ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2016) (2016)
- 32. Scheibler, K., Winterer, L., Wimmer, R., Becker, B.: Towards verification of artificial neural networks. In: 18th Workshop on Methoden und Beschreibungssprachen zur Modellierung und Verifikation von Schaltungen und Systemen (MBMV), pp. 30–40 (2015)
- 33. Seshia, S.A., Sadigh, D.: Towards verified artificial intelligence. CoRR, abs/1606.08514 (2016)
- 34. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556 (2014)
- 35. Stallkamp, J., Schlipsing, M., Salmen, J., Igel, C.: Man vs. computer: benchmarking machine learning algorithms for traffic sign recognition. Neural Netw. 32, 323–332 (2012)
- 36. Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., Fergus, R.: Intriguing properties of neural networks. In: International Conference on Learning Representations (ICLR-2014) (2014)
- 37. Vapnik, V.: Principles of risk minimization for learning theory. In: Advances in Neural Information Processing Systems 4, NIPS Conference, Denver, Colorado, USA, 2–5 December 1991, pp. 831–838 (1991)
- 38. Zheng, S., Song, Y., Leung, T., Goodfellow, I.: Improving the robustness of deep neural networks via stability training. In: CVPR 2016 (2016)
