Reactive Model Correction: Mitigating Harm to Task-Relevant Features via Conditional Bias Suppression
arXiv (2024)
Abstract
Deep Neural Networks are prone to learning and relying on spurious
correlations in the training data, which, for high-risk applications, can have
fatal consequences. Various approaches to suppress model reliance on harmful
features have been proposed that can be applied post-hoc without additional
training. Although these methods are efficient to apply, they tend
to harm model performance by globally shifting the distribution of latent
features. To mitigate unintended overcorrection of model behavior, we propose a
reactive approach conditioned on model-derived knowledge and eXplainable
Artificial Intelligence (XAI) insights. While the reactive approach can be
applied to many post-hoc methods, we demonstrate the incorporation of
reactivity in particular for P-ClArC (Projective Class Artifact Compensation),
introducing a new method called R-ClArC (Reactive Class Artifact Compensation).
Through rigorous experiments in controlled settings (FunnyBirds) and with a
real-world dataset (ISIC2019), we show that introducing reactivity can minimize
the detrimental effect of the applied correction while simultaneously ensuring
low reliance on spurious features.
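
The contrast between a global and a reactive correction can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the correction is a linear projection that removes a latent vector's component along a single bias direction, and that reactivity means applying this projection only to samples that satisfy some condition. The abstract does not specify the condition, so the names `project_out`, `reactive_correction`, `v`, `z`, and `condition`, and all numbers, are hypothetical.

```python
import numpy as np

def project_out(h, v, z):
    """Shift each row of h along direction v so that its component
    along v matches that of the reference activation z."""
    v = v / np.linalg.norm(v)  # work with a unit-length direction
    # (h - z) @ v is each sample's signed offset from z along v
    return h - np.outer((h - z) @ v, v)

def reactive_correction(h, v, z, condition):
    """Apply the projection only to rows where `condition` is True;
    all other rows are returned unchanged."""
    corrected = project_out(h, v, z)
    return np.where(condition[:, None], corrected, h)

# Toy data: 4 latent vectors in 3 dimensions (made-up numbers).
h = np.array([[2.0, 1.0, 0.0],
              [0.5, 3.0, 1.0],
              [4.0, 0.0, 2.0],
              [1.0, 1.0, 1.0]])
v = np.array([1.0, 0.0, 0.0])   # hypothetical bias direction
z = np.zeros(3)                 # hypothetical reference activation
condition = np.array([True, False, True, False])  # hypothetical trigger

h_global = project_out(h, v, z)                       # every sample is shifted
h_reactive = reactive_correction(h, v, z, condition)  # only flagged samples are shifted
```

In this toy setting the global variant alters all four latent vectors, while the reactive variant leaves the second and fourth untouched. This is the sense in which a conditional correction can limit unintended shifts of the latent distribution.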