
Black-Box Ripper: Copying black-box models using generative evolutionary algorithms

NeurIPS 2020


Abstract

We study the task of replicating the functionality of black-box neural models, for which we only know the output class probabilities provided for a set of input images. We assume back-propagation through the black-box model is not possible and its training images are not available, e.g. the model could be exposed only through an API. In...

Introduction
  • In the last couple of years, AI has gained a lot of attention in industry, due to the latest research developments in the field, e.g. deep learning [21].
  • Studying ways of stealing or copying the functionality of black-box models is of great interest to AI companies, giving them the opportunity to better protect their models through various mechanisms [12, 40].
  • The proposed framework is somewhat related to knowledge distillation with teacher-student networks [2, 10, 22, 39], the main difference being that access to the training data of the teacher is not permitted to preserve the black-box nature of the teacher.
  • The teacher is trained independently of the framework.
Highlights
  • In the last couple of years, AI has gained a lot of attention in industry, due to the latest research developments in the field, e.g. deep learning [21]
  • Our work is related to zero-shot knowledge distillation methods [1, 3, 4, 8, 24, 28, 38], with the difference that we regard the teacher model as a black box, and to model stealing methods [17, 30, 31, 32, 35, 36], with the difference that we focus on accuracy and not on minimizing the number of API calls to the black box
  • Even though our work focuses on knowledge distillation in a realistic scenario, in which the training data, structure and parameters of the teacher are completely obscured, we aim to compare our method to top white-box approaches in order to perform a more solid evaluation of Black-Box Ripper
  • Observing that GANs seem more useful for the student, we report results with a Variational Auto-Encoder (VAE) in a single experiment, applying the same rule to Black-Box Ripper
  • The teacher’s accuracy is 82.5%, this being the upper bound for Black-Box Ripper and other methods [1, 28, 31]
  • We proposed a novel black-box functionality stealing framework able to achieve state-of-the-art results in zero-shot knowledge distillation scenarios
Methods
  • The authors approach the task of stealing the functionality of a classification model, while assuming no access to the model’s weights and hyperparameters and no information about the training data.
  • With these assumptions, the classification model is a black box.
  • The authors' problem formulation is related to zero-shot knowledge distillation [1] and model functionality stealing [31].
  • The authors consider a relaxed setting of model functionality stealing, in which the number of model calls is not an issue, the primary focus being the accuracy level
Results
  • Even though the authors consider the teacher as a black box, the model manages to outperform the white-box DeGAN [1] in three out of four cases, while achieving a close result in the unfavorable case
  • These results indicate that the method does not need full access to the teacher in order to achieve state-of-the-art results in zero-shot knowledge distillation.
Conclusion
  • The authors proposed a novel black-box functionality stealing framework able to achieve state-of-the-art results in zero-shot knowledge distillation scenarios.
  • The authors compared the framework with state-of-the-art data-free knowledge distillation [1, 28] and model stealing [31] methods.
  • The authors showed ablation results indicating that the evolutionary algorithm is helpful in reducing the distribution gap between the proxy and the true data set.
  • The authors would like to turn attention towards (i) reducing the number of black-box model calls instead of increasing accuracy and (ii) designing preventive solutions, as one of the most important goals is to raise awareness around model stealing, contributing to AI security
Summary
  • Introduction:

    In the last couple of years, AI has gained a lot of attention in industry, due to the latest research developments in the field, e.g. deep learning [21].
  • Studying ways of stealing or copying the functionality of black-box models is of great interest to AI companies, giving them the opportunity to better protect their models through various mechanisms [12, 40].
  • The proposed framework is somewhat related to knowledge distillation with teacher-student networks [2, 10, 22, 39], the main difference being that access to the training data of the teacher is not permitted to preserve the black-box nature of the teacher.
  • The teacher is trained independently of the framework.
  • Objectives:

    The objective V of the evolutionary algorithm is the mean squared error between the class probabilities ŷ returned by the black box and the desired one-hot encoding y; the authors aim to minimize this objective through evolutionary search (a hedged LaTeX reconstruction is given at the end of this summary).
  • Even though the work focuses on knowledge distillation in a realistic scenario, in which the training data, structure and parameters of the teacher are completely obscured, the authors aim to compare the method to top white-box approaches in order to perform a more solid evaluation of Black-Box Ripper.
  • By exposing techniques such as Black-Box Ripper, the authors aim to get a head start in designing preventive solutions.
  • The authors' aim is to stimulate future research in detecting functionality stealing attacks
  • Methods:

    The authors approach the task of stealing the functionality of a classification model, while assuming no access to the model’s weights and hyperparameters and no information about the training data.
  • With these assumptions, the classification model is a black box.
  • The authors' problem formulation is related to zero-shot knowledge distillation [1] and model functionality stealing [31].
  • The authors consider a relaxed setting of model functionality stealing, in which the number of model calls is not an issue, the primary focus being the accuracy level
  • Results:

    Even though the authors consider the teacher as a black box, the model manages to outperform the white-box DeGAN [1] in three out of four cases, while achieving a close result in the unfavorable case
  • These results indicate that the method does not need full access to the teacher in order to achieve state-of-the-art results in zero-shot knowledge distillation.
  • Conclusion:

    The authors proposed a novel black-box functionality stealing framework able to achieve state-of-the-art results in zero-shot knowledge distillation scenarios.
  • The authors compared the framework with state-of-the-art data-free knowledge distillation [1, 28] and model stealing [31] methods.
  • The authors showed ablation results indicating that the evolutionary algorithm is helpful in reducing the distribution gap between the proxy and the true data set.
  • The authors would like to turn attention towards (i) reducing the number of black-box model calls instead of increasing accuracy and (ii) designing preventive solutions, as one of the most important goals is to raise awareness around model stealing, contributing to AI security
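
To make the objective stated above concrete, the following is a hedged LaTeX reconstruction of the evolutionary fitness; the notation (proxy-trained generator G, black-box teacher T, latent code z, one-hot target y) is our own assumption, chosen to match the summary's wording rather than quoted from the paper.

    % Hedged reconstruction of the evolutionary objective described above;
    % notation assumed. z is a latent code of the proxy-trained generator G,
    % T is the black-box teacher, y is the desired one-hot class encoding
    % and n is the number of classes.
    \begin{equation}
      \min_{z} \; V(z) = \frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2,
      \qquad \hat{y} = T(G(z)).
    \end{equation}
    % The minimization is carried out by an evolutionary algorithm, since no
    % gradients can be back-propagated through the black box T.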
Tables
  • Table 1: Accuracy rates (in %) on CIFAR-10 of various zero-shot knowledge distillation [1, 28] and model stealing [31] methods versus Black-Box Ripper. For our model, we report the average accuracy as well as the standard deviation computed over 5 runs. Best results are highlighted in bold
  • Table 2: Accuracy rates (in %) of various zero-shot knowledge distillation [1, 28] and model stealing [31] methods versus Black-Box Ripper on Fashion-MNIST as true data set and CIFAR-10 as proxy data set. Ablation results with two generators, a VAE and a SNGAN [26], are also included. Best results are highlighted in bold
  • Table 3: Accuracy rates (in %) of a state-of-the-art model stealing method [31] versus Black-Box Ripper on 10 Monkey Species as true data set and CelebA-HQ and ImageNet Cats and Dogs as proxy data sets. Ablation results with GANs are also included. Best results are highlighted in bold
Related Work
  • Our work is related to zero-shot knowledge distillation methods [1, 3, 4, 8, 24, 28, 38], with the difference that we regard the teacher model as a black box, and to model stealing methods [17, 30, 31, 32, 35, 36], with the difference that we focus on accuracy and not on minimizing the number of API calls to the black box.

    Zero-shot knowledge distillation. After researchers introduced methods of distilling information [2, 10] from large neural networks (teachers) to smaller and faster models (students) with minimal accuracy loss, a diverse set of methods was developed to improve the preliminary approaches, addressing some of their practical limitations for specific tasks. A limitation of interest to us is the requirement to access the original training data (of the teacher model). Many formulations have been developed to alleviate this requirement [1, 4, 8, 24, 28], with methods either requiring a small subset of the original data [3, 4], or none at all [1]. Nayak et al. [28] proposed a method for knowledge distillation without real training data, using the teacher model to synthesize data impressions via back-propagation instead. While the generated samples do not resemble natural images, the student is able to learn from the high response patterns of the teacher, showing reasonable generalization accuracy.

    Methods to synthesize samples through back-propagation, e.g. feature visualization methods, have gained a lot of interest in the area of knowledge distillation. Nguyen et al. [29] showed that, through network inversion, the resulting feature visualizations exhibit a high degree of realism. Further, Yin et al. [38] used the same method to generate samples for training a student network, while employing a discrepancy loss in the form of Jensen-Shannon entropy between the teacher and the student. While showing good results, these methods do not regard the teacher model as a black box, since back-propagation implies knowledge of and access to the model's weights. Micaelli et al. [24] developed a method for zero-shot knowledge transfer by jointly training a generative model and the student, such that the generated samples are easily classified by the teacher, but hard for the student. In a similar manner to Yin et al. [38], a discrepancy loss is applied in the training process between the teacher and the student. We take a different approach, as our generative model is trained beforehand, and we optimize the synthesized samples through evolutionary search to elicit a high response from the teacher.

    More closely related to our work, Addepalli et al. [1] proposed a data-enriching GAN (DeGAN), which is trained jointly with the student, but on a proxy data set, different from the model's inaccessible, true data set. The generative model synthesizes samples such that the teacher model outputs a confident response, through a loss function promoting diversity of samples and low-entropy confidence scores. In the context of their framework, by means of back-propagation through the teacher network, the GAN is able to synthesize samples that help the student approach the teacher's accuracy level. Different from their approach, we do not propagate information through the teacher, as we consider it a black-box model. Moreover, our generative model is fixed, being trained a priori on a proxy data set which is not related to the true set. Unlike Addepalli et al. [1] and all previous works on zero-shot knowledge distillation, we generate artificial samples through evolutionary search, using these samples to train the student.
Funding
  • This work was supported by a grant of the Romanian Ministry of Education and Research, CNCS–UEFISCDI, project number PN-III-P1-1.1-TE-2019-0235, within PNCDI III.
  • Our work has shown that, in the current state of machine learning, it is possible to obtain state-of-the-art results in model functionality replication without knowledge of the internal structure or parameters of the targeted model
Study Subjects and Analysis
benchmark data sets: 3
To generate useful data samples for training the student, our framework (i) learns to generate images on a proxy data set (with images and classes different from those used to train the black box) and (ii) applies an evolutionary strategy to make sure that each generated data sample exhibits a high response for a specific class when given as input to the black box. Our framework is compared with several baseline and state-of-the-art methods on three benchmark data sets. The empirical evidence indicates that our model is superior to the considered baselines.
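
As a concrete illustration of step (ii), below is a minimal Python sketch of such a generative evolutionary search; the function and parameter names (evolve_sample, generator, black_box, pop_size, sigma) and the truncation-selection scheme are our own illustrative assumptions, not the paper's exact algorithm or hyper-parameters.

    import numpy as np

    def evolve_sample(generator, black_box, target_class, num_classes,
                      latent_dim=128, pop_size=30, elites=10,
                      iterations=10, sigma=0.5, rng=None):
        """Evolve a generator latent code until the decoded image elicits a
        high black-box response for target_class (hedged sketch).

        generator: callable mapping a latent batch (pop, latent_dim) to images.
        black_box: callable mapping an image batch to class probabilities.
        """
        rng = rng or np.random.default_rng()
        y = np.eye(num_classes)[target_class]              # desired one-hot encoding
        pop = rng.standard_normal((pop_size, latent_dim))  # initial latent population
        for _ in range(iterations):
            probs = black_box(generator(pop))              # forward passes only, no gradients
            fitness = np.mean((probs - y) ** 2, axis=1)    # MSE between y_hat and y
            parents = pop[np.argsort(fitness)[:elites]]    # keep the fittest latent codes
            # Offspring: Gaussian perturbations of randomly chosen parents.
            picks = rng.integers(0, elites, pop_size - elites)
            children = parents[picks] + sigma * rng.standard_normal(
                (pop_size - elites, latent_dim))
            pop = np.concatenate([parents, children], axis=0)
        probs = black_box(generator(pop))
        best = pop[np.argmin(np.mean((probs - y) ** 2, axis=1))]
        return generator(best[None])[0]                    # one synthetic training image

In the full framework, images returned by such a search would be collected for every class and passed, together with the black box's output probabilities, to the student during training.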

benchmark data sets: 3
In the second training phase, we apply an evolutionary strategy, modifying the generated data samples such that they exhibit a high response for a certain class when given as input to the teacher. To demonstrate the effectiveness of our generative evolutionary framework, we conduct experiments on three benchmark data sets: CIFAR-10 [18] with CIFAR-100 [18] as proxy, Fashion-MNIST [37] with CIFAR-10 [18] as proxy and 10 Monkey Species [27] with CelebA-HQ [13] and ImageNet Cats and Dogs [33] as proxies. We compare our framework with a series of state-of-the-art methods [1, 28, 31], demonstrating generally superior accuracy rates, while preserving the black-box nature of the teacher

species of monkeys: 10
Since CIFAR-10, CIFAR-100 and Fashion-MNIST have low resolution images, we also test our approach in a more realistic scenario with high-resolution images. For this set of experiments, we use the 10 Monkey Species [27] data set, containing images of 10 species of monkeys in their natural habitat, as true data set. In this scenario, we independently consider two proxy data sets, namely CelebA-HQ [13] and ImageNet Cats and Dogs [33]

species: 143
CelebA-HQ contains high-resolution images of 1024 × 1024 pixels. ImageNet Cats and Dogs is composed of 143 species of cats and dogs. For the latter proxy, we additionally provide qualitative results to showcase our optimization process

training epochs: 30 (teacher) / 200 (students)
The teacher and the students are ResNet-18 [9] models. The teacher is trained for 30 epochs and the students are trained for 200 epochs using the same mini-batch size as the teacher.
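
For illustration, here is a hedged PyTorch sketch of how such a student could be fit to the black box's soft labels on the evolved images; only the ResNet-18 architecture is taken from the text, while the optimizer, learning rate and loss form are our assumptions.

    import torch
    import torch.nn.functional as F
    from torchvision.models import resnet18

    # The student is a ResNet-18, as stated above; the training details below
    # are illustrative, not the paper's exact configuration.
    student = resnet18(num_classes=10)
    optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

    def distillation_step(images, teacher_probs):
        """One student update on a batch of evolved images and the class
        probabilities the black box returned for them."""
        optimizer.zero_grad()
        log_probs = F.log_softmax(student(images), dim=1)
        # Soft cross-entropy against the teacher's output distribution;
        # gradients flow only through the student, never through the black box.
        loss = -(teacher_probs * log_probs).sum(dim=1).mean()
        loss.backward()
        optimizer.step()
        return loss.item()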

References
  • [1] S. Addepalli, G. K. Nayak, A. Chakraborty, and R. V. Babu. DeGAN: Data-Enriching GAN for Retrieving Representative Samples from a Trained Classifier. In Proceedings of AAAI, 2020.
  • [2] J. Ba and R. Caruana. Do deep nets really need to be deep? In Proceedings of NIPS, pages 2654–2662, 2014.
  • [3] H. Bai, J. Wu, I. King, and M. Lyu. Few shot network compression via cross distillation. arXiv preprint arXiv:1911.09450, 2019.
  • [4] K. Bhardwaj, N. Suda, and R. Marculescu. Dream distillation: A data-independent model compression framework. arXiv preprint arXiv:1905.07072, 2019.
  • [5] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of NAACL, pages 4171–4186, 2019.
  • [6] R. Geirhos, P. Rubisch, C. Michaelis, M. Bethge, F. A. Wichmann, and W. Brendel. ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. In Proceedings of ICLR, 2019.
  • [7] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. In Proceedings of NIPS, pages 2672–2680, 2014.
  • [8] M. Haroush, I. Hubara, E. Hoffer, and D. Soudry. The Knowledge Within: Methods for Data-Free Model Compression. arXiv preprint arXiv:1912.01274, 2019.
  • [9] K. He, X. Zhang, S. Ren, and J. Sun. Deep Residual Learning for Image Recognition. In Proceedings of CVPR, pages 770–778, 2016.
  • [10] G. Hinton, O. Vinyals, and J. Dean. Distilling the Knowledge in a Neural Network. In Proceedings of NIPS Deep Learning and Representation Learning Workshop, 2014.
  • [11] J. Jackovich and R. Richards. Machine Learning with AWS: Explore the Power of Cloud Services for Your Machine Learning and Artificial Intelligence Projects. Packt Publishing, 2018. ISBN 978-1789806199.
  • [12] M. Juuti, S. Szyller, S. Marchal, and N. Asokan. PRADA: Protecting against DNN model stealing attacks. In Proceedings of EuroS&P, pages 512–527, 2019.
  • [13] T. Karras, T. Aila, S. Laine, and J. Lehtinen. Progressive Growing of GANs for Improved Quality, Stability, and Variation. In Proceedings of ICLR, 2018.
  • [14] M. Khayatkhoei, M. K. Singh, and A. Elgammal. Disconnected Manifold Learning for Generative Adversarial Networks. In Proceedings of NIPS, pages 7343–7353, 2018.
  • [15] D. P. Kingma and J. Ba. Adam: A Method for Stochastic Optimization. In Proceedings of ICLR, 2015.
  • [16] D. P. Kingma and M. Welling. Auto-encoding variational bayes. In Proceedings of ICLR, 2014.
  • [17] K. Krishna, G. S. Tomar, A. P. Parikh, N. Papernot, and M. Iyyer. Thieves on Sesame Street! Model Extraction of BERT-based APIs. In Proceedings of ICLR, 2020.
  • [18] A. Krizhevsky. Learning Multiple Layers of Features from Tiny Images. Technical report, University of Toronto, 2009.
  • [19] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. In Proceedings of NIPS, pages 1097–1105, 2012.
  • [20] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
  • [21] Y. LeCun, Y. Bengio, and G. Hinton. Deep learning. Nature, 521(7553):436–444, 2015.
  • [22] D. Lopez-Paz, L. Bottou, B. Schölkopf, and V. Vapnik. Unifying distillation and privileged information. In Proceedings of ICLR, 2016.
  • [23] B. Lorica and P. Nathan. The State of Machine Learning Adoption in the Enterprise. O'Reilly Media, 2018.
  • [24] P. Micaelli and A. Storkey. Zero-shot Knowledge Transfer via Adversarial Belief Matching. In Proceedings of NeurIPS, pages 9547–9557, 2019.
  • [25] T. Miyato and M. Koyama. cGANs with Projection Discriminator. In Proceedings of ICLR, 2018.
  • [26] T. Miyato, T. Kataoka, M. Koyama, and Y. Yoshida. Spectral Normalization for Generative Adversarial Networks. In Proceedings of ICLR, 2018.
  • [27] G. Montoya, J. Zhang, and S. Loaiciga. 10 Monkey Species. Online, 2018. URL https://www.kaggle.
  • [28] G. K. Nayak, K. R. Mopuri, V. Shaj, V. B. Radhakrishnan, and A. Chakraborty. Zero-Shot Knowledge Distillation in Deep Networks. In Proceedings of ICML, pages 4743–4751, 2019.
  • [29] A. Nguyen, A. Dosovitskiy, J. Yosinski, T. Brox, and J. Clune. Synthesizing the preferred inputs for neurons in neural networks via deep generator networks. In Proceedings of NIPS, pages 3387–3395, 2016.
  • [30] S. J. Oh, M. Augustin, B. Schiele, and M. Fritz. Towards Reverse-Engineering Black-Box Neural Networks. In Proceedings of ICLR, 2018.
  • [31] T. Orekondy, B. Schiele, and M. Fritz. Knockoff Nets: Stealing Functionality of Black-Box Models. In Proceedings of CVPR, pages 4954–4963, 2019.
  • [32] N. Papernot, P. McDaniel, I. Goodfellow, S. Jha, Z. B. Celik, and A. Swami. Practical Black-Box Attacks against Machine Learning. In Proceedings of AsiaCCS, pages 506–519, 2017.
  • [33] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision, 115(3):211–252, 2015.
  • [34] K. Simonyan and A. Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. In Proceedings of ICLR, 2014.
  • [35] F. Tramèr, F. Zhang, A. Juels, M. K. Reiter, and T. Ristenpart. Stealing Machine Learning Models via Prediction APIs. In Proceedings of USENIX Security, pages 601–618, 2016.
  • [36] B. Wang and N. Z. Gong. Stealing Hyperparameters in Machine Learning. In Proceedings of S&P, pages 36–52, 2018.
  • [37] H. Xiao, K. Rasul, and R. Vollgraf. Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms. arXiv preprint arXiv:1708.07747, 2017.
  • [38] H. Yin, P. Molchanov, Z. Li, J. M. Alvarez, A. Mallya, D. Hoiem, N. K. Jha, and J. Kautz. Dreaming to Distill: Data-free Knowledge Transfer via DeepInversion. arXiv preprint arXiv:1912.08795, 2019.
  • [39] S. You, C. Xu, C. Xu, and D. Tao. Learning from multiple teacher networks. In Proceedings of KDD, pages 1285–1294, 2017.
  • [40] J. Zhang, Z. Gu, J. Jang, H. Wu, M. P. Stoecklin, H. Huang, and I. Molloy. Protecting intellectual property of deep neural networks with watermarking. In Proceedings of AsiaCCS, pages 159–172, 2018.
Authors
Antonio Barbalau
Adrian Cosma