Adversarial Robustness through the Lens of Convolutional Filters

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Cited 14 | Views 43
Abstract
Deep learning models are intrinsically sensitive to distribution shifts in the input data. In particular, small, barely perceivable perturbations to the input data can force models to make wrong predictions with high confidence. A common defense mechanism is regularization through adversarial training, which injects worst-case perturbations back into training to strengthen the decision boundaries and to reduce overfitting. In this context, we perform an investigation of 3 × 3 convolution filters that form in adversarially-trained models. Filters are extracted from 71 public models of the ℓ∞-RobustBench CIFAR-10/100 and ImageNet1k leaderboard and compared to filters extracted from models built on the same architectures but trained without robust regularization. We observe that adversarially-robust models appear to form more diverse, less sparse, and more orthogonal convolution filters than their normal counterparts. The largest differences between robust and normal models are found in the deepest layers, and the very first convolution layer, which consistently and predominantly forms filters that can partially eliminate perturbations, irrespective of the architecture.
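The filter analysis described above can be illustrated with a minimal PyTorch sketch (not the authors' code): it extracts every 3 × 3 convolution kernel from a torchvision ResNet-18 used as a stand-in (a RobustBench checkpoint would be loaded analogously) and computes two simple statistics, a sparsity ratio and a mean pairwise cosine similarity as a rough proxy for orthogonality. The model choice, the near-zero threshold, the subsampling, and the exact metric definitions are assumptions for illustration and may differ from the paper's protocol.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18

# Stand-in model; a RobustBench checkpoint would be loaded the same way.
# (torchvision >= 0.13 API; older versions use pretrained=False instead.)
model = resnet18(weights=None)

# Collect every 3x3 kernel: weight shape (out_ch, in_ch, 3, 3) -> one row of 9 values per kernel.
kernels = []
for module in model.modules():
    if isinstance(module, nn.Conv2d) and module.kernel_size == (3, 3):
        kernels.append(module.weight.detach().reshape(-1, 9))
kernels = torch.cat(kernels, dim=0)

# Sparsity: fraction of coefficients near zero (threshold chosen arbitrarily for this sketch).
sparsity = (kernels.abs() < 1e-2).float().mean().item()

# Orthogonality proxy: mean |cosine similarity| over a random subsample of kernels
# (the full Gram matrix over all ~1M kernels of a ResNet-18 would not fit in memory).
idx = torch.randperm(kernels.shape[0])[:1000]
normed = F.normalize(kernels[idx], dim=1)
gram = normed @ normed.T
n = gram.shape[0]
mean_abs_cos = (gram - torch.eye(n)).abs().sum().item() / (n * (n - 1))

print(f"kernels: {kernels.shape[0]}, sparsity: {sparsity:.3f}, mean |cos|: {mean_abs_cos:.3f}")
```

Running the same statistics on a robust and a normally trained checkpoint of the same architecture would, per the abstract, be expected to show lower sparsity and lower mean |cos| (i.e., more orthogonal filters) for the robust model.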
Keywords
small perturbations,barely perceivable perturbations,common defense mechanism,adversarial training,worst-case perturbations,3 × 3 convolution filters,public models,ImageNet1k leaderboard,robust regularization,adversarially-robust models,orthogonal convolution filters,convolution layer,predominantly forms filters,adversarial robustness,convolutional filters,deep learning models,distribution shifts,particular perturbations,ℓ∞-RobustBench CIFAR-10-100