Batch Normalization Is Blind to the First and Second Derivatives of the Loss

AAAI 2024

Abstract
We prove that, when the loss function is expanded as a Taylor series, the batch normalization (BN) operation blocks the influence of the first-order term and most of the influence of the second-order term of the loss. We also find that this problem is caused by the standardization phase of the BN operation. We believe that proving the blocking of certain loss terms provides an analytic perspective on potential defects of a deep model with BN operations, although the blocking problem is not equivalent to significant damage in every task on benchmark datasets. Experiments show that the BN operation significantly affects feature representations in specific tasks.
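The blocking of the first-order term can be illustrated numerically. The sketch below is a simplified illustration and not the paper's proof: it assumes a single feature dimension, a first derivative `g` that is shared by all samples in the batch, and an expansion point `y_ref`; these names and the value of `g` are hypothetical. Because standardization forces the batch of outputs to sum to zero, the summed first-order term is constant in the pre-BN features, so its gradient with respect to them vanishes.

```python
import math
import random


def standardize(column, eps=1e-5):
    """Standardization phase of BN for one feature dimension over a batch."""
    n = len(column)
    mean = sum(column) / n
    var = sum((v - mean) ** 2 for v in column) / n
    return [(v - mean) / math.sqrt(var + eps) for v in column]


def first_order_term(x, g, y_ref=0.0):
    """Batch sum of g * (y_i - y_ref), where y = standardize(x).

    Assumption: g is the same for every sample in the batch.
    """
    y = standardize(x)
    return sum(g * (y_i - y_ref) for y_i in y)


def numerical_grad(f, x, h=1e-6):
    """Central-difference gradient of f with respect to each entry of x."""
    grad = []
    for i in range(len(x)):
        up = x[:i] + [x[i] + h] + x[i + 1:]
        down = x[:i] + [x[i] - h] + x[i + 1:]
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


random.seed(0)
x = [random.gauss(0.0, 2.0) for _ in range(8)]  # pre-BN features, one dimension
g = 3.7  # hypothetical shared first derivative of the loss

# Gradient of the first-order term when features pass through standardization.
grad_with_bn = numerical_grad(lambda v: first_order_term(v, g), x)
# Gradient of the same linear term applied directly to the raw features.
grad_without_bn = numerical_grad(lambda v: sum(g * v_i for v_i in v), x)

max_grad_with_bn = max(abs(d) for d in grad_with_bn)
max_grad_without_bn = max(abs(d) for d in grad_without_bn)
print(max_grad_with_bn, max_grad_without_bn)
```

With standardization the gradient is zero up to floating-point noise, whereas without it every feature receives the gradient `g`. The sketch covers only the first-order term; the abstract's statement about the second-order term is not reproduced here.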
Keywords
PEAI: Accountability, Interpretability & Explainability; ML: Deep Learning Theory