
Anti-Distillation Backdoor Attacks: Backdoors Can Really Survive in Knowledge Distillation

Proceedings of the 29th ACM International Conference on Multimedia (MM 2021)

Abstract
Motivated by resource-limited scenarios, knowledge distillation (KD) has received growing attention, as it quickly and effectively produces lightweight yet high-performance student models by transferring the dark knowledge from large teacher models. However, many pre-trained teacher models are downloaded from public platforms that lack necessary vetting, posing a possible threat to knowledge distillation tasks. Unfortunately, little research has so far considered backdoor attacks that propagate from the teacher model into student models in KD, which may pose a severe threat to its wide use. In this paper, we, for the first time, propose a novel Anti-Distillation Backdoor Attack (ADBA), in which the backdoor embedded in the public teacher model can survive the knowledge distillation process and thus be transferred to secret distilled student models. We first introduce a shadow model to imitate the distillation process and adopt an optimizable trigger to transfer information, which together help craft the desired teacher model. Our attack is powerful and effective, achieving 95.92%, 94.79%, and 90.19% average success rates of attacks (SRoAs) against student models with several different structures on MNIST, CIFAR-10, and GTSRB, respectively. Our ADBA also performs robustly under different user distillation environments, with 91.72% and 92.37% average SRoAs on MNIST and CIFAR-10, respectively. Finally, we show that ADBA has a low overhead in the injection process, converging within 50 and 70 epochs on CIFAR-10 and GTSRB, respectively, while normal training on these datasets takes almost 200 epochs.
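The abstract names two mechanisms: temperature-softened knowledge distillation, and a teacher-crafting loop that maintains a shadow student together with an optimizable trigger so the backdoor survives distillation. The following is a minimal sketch of how such a loop could look, assuming PyTorch; the loss weights, trigger parameterization, and the function and variable names are illustrative assumptions for exposition, not the authors' implementation.

```python
# Sketch of (1) KD with a temperature-softened KL loss and (2) an
# anti-distillation training step with a shadow student and an optimizable
# trigger. All names and hyperparameters here are hypothetical.

import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, T=4.0):
    """Standard distillation loss: match temperature-softened distributions."""
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)

def apply_trigger(x, trigger, mask):
    """Stamp the (learnable) trigger patch onto a batch of images."""
    return x * (1 - mask) + trigger.clamp(0, 1) * mask

def adba_step(teacher, shadow, x, y, trigger, mask, target_class,
              opt_teacher, opt_shadow, opt_trigger, T=4.0):
    """One joint update: poison the teacher, distill a shadow student on clean
    data, then optimize the trigger against the shadow so the backdoor
    transfers through distillation."""
    x_trig = apply_trigger(x, trigger, mask)
    y_target = torch.full_like(y, target_class)

    # 1) Teacher: keep clean accuracy while responding to the trigger.
    loss_teacher = (F.cross_entropy(teacher(x), y)
                    + F.cross_entropy(teacher(x_trig), y_target))
    opt_teacher.zero_grad()
    loss_teacher.backward()
    opt_teacher.step()

    # 2) Shadow student: imitate the user's distillation process (clean data only).
    with torch.no_grad():
        t_clean = teacher(x)
    loss_shadow = kd_loss(shadow(x), t_clean, T)
    opt_shadow.zero_grad()
    loss_shadow.backward()
    opt_shadow.step()

    # 3) Trigger: push the *distilled* shadow toward the target class on
    #    triggered inputs, so the backdoor survives distillation.
    loss_trigger = F.cross_entropy(shadow(apply_trigger(x, trigger, mask)), y_target)
    opt_trigger.zero_grad()
    loss_trigger.backward()
    opt_trigger.step()

    return loss_teacher.item(), loss_shadow.item(), loss_trigger.item()
```

The key design point the abstract implies is that the trigger gradient in step 3 flows through the shadow student, which is itself continually re-distilled from the poisoned teacher on clean data, so the trigger is shaped by what a downstream student would actually learn rather than by the teacher alone.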
Keywords
Deep neural network, neural backdoor, knowledge distillation