D-square-B: Deep Distribution Bound for Natural-looking Adversarial Attack

arxiv(2020)

引用 0|浏览24
暂无评分
摘要
We propose a novel technique that can generate natural-looking adversarial examples by bounding the variations induced for internal activation values in some deep layer(s), through a distribution quantile bound and a polynomial barrier loss function. By bounding model internals instead of individual pixels, our attack admits perturbations closely coupled with the existing features of the original input, allowing the generated examples to be natural-looking while having diverse and often substantial pixel distances from the original input. Enforcing per-neuron distribution quantile bounds allows addressing the non-uniformity of internal activation values. Our evaluation on ImageNet and five different model architecture demonstrates that our attack is quite effective. Compared to the state-of-the-art pixel space attack, semantic attack, and feature space attack, our attack can achieve the same attack success/confidence level while having much more natural-looking adversarial perturbations. These perturbations piggy-back on existing local features and do not have any fixed pixel bounds.
更多
查看译文
关键词
deep distribution bound,d-square-b,natural-looking
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要