Explore the Transformation Space for Adversarial Images

CODASPY '20: Tenth ACM Conference on Data and Application Security and Privacy, New Orleans, LA, USA, March 2020

Abstract
Deep learning models are vulnerable to adversarial examples. Most current adversarial attacks add pixel-wise perturbations restricted to some \(L^p\)-norm, and defense models are likewise evaluated on adversarial examples restricted to \(L^p\)-norm balls. However, we wish to explore whether adversarial examples exist beyond \(L^p\)-norm balls and what their implications are for attacks and defenses. In this paper, we focus on adversarial images generated by transformations. We start with color transformation and propose two gradient-based attacks. Since the \(L^p\)-norm is inappropriate for measuring image quality in the transformation space, we instead use the similarity between transformations and the Structural Similarity Index (SSIM). Next, we explore a larger transformation space consisting of combinations of color and affine transformations. We evaluate our transformation attacks on three data sets --- CIFAR10, SVHN, and ImageNet --- and their corresponding models. Finally, we perform retraining defenses to evaluate the strength of our attacks. The results show that transformation attacks are powerful. They find high-quality adversarial images that have higher transferability and misclassification rates than C&W's \(L^p\) attacks, especially at high confidence levels. They are also significantly harder to defend against by retraining than C&W's \(L^p\) attacks. More importantly, exploring different attack spaces makes it more challenging to train a universally robust model.
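To make the idea of searching a transformation space (rather than a pixel-wise \(L^p\) ball) concrete, below is a minimal sketch of a gradient-based color-transformation attack in PyTorch. Everything in it is illustrative: the per-channel affine color map `a * x + b`, the identity-closeness penalty standing in for the paper's transformation-similarity and SSIM constraints, and the function name `color_transform_attack` are assumptions, not the authors' exact formulation.

```python
# Illustrative sketch only: a gradient-based attack that searches the
# color-transformation space instead of adding pixel-wise perturbations.
import torch
import torch.nn.functional as F

def color_transform_attack(model, image, label, steps=100, lr=0.01):
    """Search a simple color-transformation space for an adversarial image.

    image: (1, C, H, W) tensor in [0, 1]; label: (1,) true-class index;
    model: a differentiable classifier returning logits.
    The transform is a per-channel affine map t(x) = a * x + b, optimized
    by gradient ascent on the classification loss (untargeted attack).
    """
    c = image.shape[1]
    a = torch.ones(1, c, 1, 1, requires_grad=True)   # per-channel scale
    b = torch.zeros(1, c, 1, 1, requires_grad=True)  # per-channel shift
    opt = torch.optim.Adam([a, b], lr=lr)

    for _ in range(steps):
        adv = torch.clamp(a * image + b, 0.0, 1.0)
        logits = model(adv)
        # Minimizing -CE maximizes the loss on the true label; the second
        # term keeps the transform near the identity, a crude stand-in for
        # the paper's transformation-similarity / SSIM quality measures.
        loss = -F.cross_entropy(logits, label) \
               + 0.1 * ((a - 1).abs().mean() + b.abs().mean())
        opt.zero_grad()
        loss.backward()
        opt.step()

    return torch.clamp(a * image + b, 0.0, 1.0).detach()
```

In this sketch only two scalars per color channel are optimized, which is what distinguishes a transformation-space search from a pixel-wise \(L^p\) attack: the perturbation is global and structured, so image quality is judged by transformation similarity and SSIM rather than by a norm on per-pixel differences.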