Towards a Unified Game-Theoretic View of Adversarial Perturbations and Robustness.

Jie Ren, Die Zhang, Yisen Wang, Lu Chen, Zhanpeng Zhou, Yiting Chen, Xu Cheng, Xin Wang, Meng Zhou, Jie Shi, Quanshi Zhang

Annual Conference on Neural Information Processing Systems (2021)

Abstract

This paper provides a unified view for explaining different adversarial attacks and defense methods: the view of multi-order interactions between the input variables of DNNs. Based on multi-order interactions, we discover that adversarial attacks mainly affect high-order interactions to fool the DNN. Furthermore, we find that the robustness of adversarially trained DNNs comes from category-specific low-order interactions. Our findings provide a potential way to unify adversarial perturbations and robustness, which can explain existing robustness-boosting methods in a principled way. Our findings also revise the previous inaccurate understanding of the shape bias of adversarially learned features. Our code is available online at https://github.com/Jie-Ren/A-Unified-Game-Theoretic-Interpretation-of-Adversarial-Robustness.
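The abstract does not spell out how a multi-order interaction is computed. As a minimal sketch, assume the common game-theoretic formulation in which the order-m interaction between variables i and j is the average of Δv(i, j, S) = v(S ∪ {i, j}) − v(S ∪ {i}) − v(S ∪ {j}) + v(S) over all contexts S of m other variables. The function name `multi_order_interaction` and the toy value function `toy_v` are hypothetical illustrations, not taken from the authors' repository; in practice v(S) would be a DNN output with the variables outside S masked.

```python
from itertools import combinations
from statistics import mean


def multi_order_interaction(v, n, i, j, m):
    """Average of delta_v(i, j, S) over every context S of size m.

    v: value function mapping a frozenset of variable indices to a float
       (assumed here; for a DNN this would be the output on a masked input).
    n: total number of input variables.
    i, j: the pair of variables whose interaction is measured.
    m: the order, i.e. the number of other variables in the context S.
    """
    others = [k for k in range(n) if k not in (i, j)]
    deltas = []
    for context in combinations(others, m):
        S = frozenset(context)
        deltas.append(v(S | {i, j}) - v(S | {i}) - v(S | {j}) + v(S))
    return mean(deltas)


def toy_v(S):
    """Hypothetical toy game: the output is 1 only when variables 0, 1 and 2 are all present."""
    return 1.0 if {0, 1, 2} <= S else 0.0


if __name__ == "__main__":
    # Interaction between variables 0 and 1 at each possible order (n = 4).
    for order in range(3):
        print(order, multi_order_interaction(toy_v, 4, 0, 1, order))
```

In this toy game, variables 0 and 1 only cooperate once variable 2 is in the context, so their interaction is zero at order 0 and grows with the order. Exact enumeration is only feasible for small n; for real inputs the expectation over contexts would have to be approximated by sampling.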
