Explainability of Image Classifiers for Targeted Adversarial Attack

2022 IEEE 19th India Council International Conference (INDICON)(2022)

Abstract
Deep neural networks outperform contemporary models on computer vision tasks, but they are complex and difficult for practitioners to interpret. As more learning algorithms are deployed in real-world applications, model interpretability has become essential. Interpretability is particularly relevant for deep neural networks because they can fall prey to adversarial attacks crafted by adversaries. In this paper, we launch an iterative targeted attack over a set of image classes on base architectures and interpret the results by applying an explanation algorithm before and after the attack. This process yields valuable conclusions about how the attack affects the explanation methods and how explanation methods can be made more adversarially robust.
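The paper does not specify its attack here, but an iterative targeted attack of the kind described typically takes repeated signed-gradient steps that push the input toward a chosen target class while staying inside a small perturbation budget. The following is a minimal NumPy sketch under assumed simplifications: a toy linear softmax classifier stands in for the deep image classifiers used in the paper, and the function name, step size, and budget are illustrative choices, not the authors' settings.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def targeted_iterative_attack(W, x, target, eps=0.5, alpha=0.05, steps=100):
    """Iterative targeted attack (FGSM-style sign steps) against a toy
    linear softmax classifier with logits W @ x.  Hypothetical setup:
    the paper attacks deep image classifiers, not a linear model."""
    x0 = x.copy()
    best_x = x0.copy()
    best_loss = -np.log(softmax(W @ x0)[target])  # cross-entropy toward target
    x_adv = x0.copy()
    onehot = np.eye(W.shape[0])[target]
    for _ in range(steps):
        p = softmax(W @ x_adv)
        grad = W.T @ (p - onehot)              # dL/dx for CE toward target class
        x_adv = x_adv - alpha * np.sign(grad)  # descend: raise target-class score
        x_adv = np.clip(x_adv, x0 - eps, x0 + eps)  # project into eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)            # keep valid pixel range
        loss = -np.log(softmax(W @ x_adv)[target])
        if loss < best_loss:                   # keep the best iterate seen
            best_loss, best_x = loss, x_adv.copy()
    return best_x
```

Explanations (e.g., saliency maps) would then be computed for both `x` and the returned adversarial example and compared, which is the before/after analysis the abstract describes.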
Keywords
targeted adversarial attack,image classifiers,explainability