Learn from the Past: A Proxy Guided Adversarial Defense Framework with Self Distillation Regularization
arXiv (2023)
Abstract
Adversarial Training (AT), pivotal in fortifying the robustness of deep
learning models, is extensively adopted in practical applications. However,
prevailing AT methods, which rely on direct iterative updates to defend the
target model, frequently encounter obstacles such as unstable training and
catastrophic overfitting. In this context, our work highlights the potential
of leveraging the target model's historical states as a proxy to provide
effective initialization and a defense prior, which results in a general
proxy-guided defense framework, "LAST" (Learn from the Past).
Specifically, LAST derives the response of the proxy model as dynamically learned
fast weights, which continuously correct the update direction of the target
model. In addition, we introduce a self-distillation regularized defense objective,
designed to steer the proxy model's update trajectory without
resorting to external teacher models, thereby mitigating the impact of
catastrophic overfitting on performance. Extensive experiments and ablation
studies showcase the framework's efficacy in markedly improving model
robustness (e.g., up to 9.2% and 20.3% enhancement in robust accuracy on
CIFAR10 and CIFAR100 datasets, respectively) and training stability. These
improvements hold across various model architectures, larger datasets,
perturbation sizes, and attack modalities, affirming LAST's
ability to refine both single-step and multi-step AT strategies.
The code will be available at .
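
The abstract does not spell out the update rule or the loss, so the following is only a minimal sketch of one possible reading of the two ideas above, not the authors' implementation. All names here (`self_distill_loss`, `proxy_guided_step`, `weight`, `temperature`, `proxy_lr`, `target_lr`) are hypothetical, and the choice of a temperature-softened KL term between clean and adversarial outputs of the same model is an assumption made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Standard cross-entropy of one example against a hard label."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def self_distill_loss(adv_logits, clean_logits, label,
                      weight=0.5, temperature=2.0):
    """Assumed objective: adversarial cross-entropy plus a KL term that pulls
    the adversarial prediction toward the same model's softened clean
    prediction, so no external teacher model is involved."""
    soft_target = softmax(clean_logits, temperature)
    soft_pred = softmax(adv_logits, temperature)
    return (cross_entropy(adv_logits, label)
            + weight * kl_divergence(soft_target, soft_pred))


def proxy_guided_step(target, proxy, grad_fn, proxy_lr=0.1, target_lr=0.5):
    """Assumed update: the proxy (a past state of the target) takes a gradient
    step, its displacement acts as 'fast weights' that set the target's update
    direction, and the target's current state becomes the next proxy."""
    updated_proxy = [w - proxy_lr * g for w, g in zip(proxy, grad_fn(proxy))]
    fast_weights = [w - u for w, u in zip(proxy, updated_proxy)]
    new_target = [t - target_lr * f for t, f in zip(target, fast_weights)]
    new_proxy = list(target)  # historical state of the target
    return new_target, new_proxy
```

In this sketch the parameters are plain Python lists and `grad_fn` is any callable returning a gradient of the same length; in a real training loop these would be model tensors and the gradient of the defense objective on adversarial examples.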