AdvDiff: Generating Unrestricted Adversarial Examples using Diffusion Models
CoRR(2023)
摘要
Unrestricted adversarial attacks present a serious threat to deep learning
models and adversarial defense techniques. They pose severe security problems
for deep learning applications because they can effectively bypass defense
mechanisms. However, previous attack methods often utilize Generative
Adversarial Networks (GANs), which are not theoretically provable and thus
generate unrealistic examples by incorporating adversarial objectives,
especially for large-scale datasets like ImageNet. In this paper, we propose a
new method, called AdvDiff, to generate unrestricted adversarial examples with
diffusion models. We design two novel adversarial guidance techniques to
conduct adversarial sampling in the reverse generation process of diffusion
models. These two techniques are effective and stable to generate high-quality,
realistic adversarial examples by integrating gradients of the target
classifier interpretably. Experimental results on MNIST and ImageNet datasets
demonstrate that AdvDiff is effective to generate unrestricted adversarial
examples, which outperforms GAN-based methods in terms of attack performance
and generation quality.
更多查看译文
关键词
generating unrestricted adversarial examples,diffusion
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要