Theoretical Understanding of Batch-normalization: A Markov Chain Perspective
Abstract:
Batch-normalization (BN) is a key component for effectively training deep neural networks. Empirical evidence has shown that without BN, the training process is prone to instabilities. This is, however, not well understood from a theoretical point of view. Leveraging tools from Markov chain theory, we show that BN has a direct effect on the r…
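For reference, the batch-normalization operation the abstract refers to can be sketched as follows: each feature is standardized using statistics computed over the batch, then rescaled with learnable parameters. This is a minimal NumPy sketch of the standard BN forward pass (inference of the running statistics and the learnable `gamma`/`beta` training logic are omitted); it is an illustration, not the paper's code.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Standard batch-normalization forward pass (training mode).

    x: array of shape (batch, features).
    gamma, beta: learnable scale and shift (scalars here for simplicity).
    """
    # Per-feature statistics over the batch dimension (axis 0).
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    # Normalize, then apply the learnable affine transform.
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

x = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
y = batch_norm(x)
# Each feature column of y now has approximately zero mean and unit variance.
```

After normalization, the distribution of each layer's inputs is stabilized across training iterations, which is the effect the paper analyzes through a Markov chain lens.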