Understanding the Disharmony between Dropout and Batch Normalization by Variance Shift

CVPR 2019, pages 2682-2690 (arXiv:1801.05134).

Summary:
We investigate the "variance shift" phenomenon that arises when Dropout layers are applied together with Batch Normalization in modern convolutional networks.

Abstract:

This paper first answers the question "why do the two most powerful techniques, Dropout and Batch Normalization (BN), often lead to a worse performance when they are combined?" in both theoretical and statistical aspects. Theoretically, we find that Dropout shifts the variance of a specific neural unit when the network transfers from the training state to the test state, whereas BN maintains, in the test phase, the statistical variance accumulated over the entire learning procedure. The inconsistency of that variance (which we name "variance shift") causes unstable numerical behavior at inference and ultimately more erroneous predictions when Dropout is applied before BN. Thorough experiments on DenseNet, ResNet, ResNeXt, and Wide ResNet confirm our findings. Guided by the uncovered mechanism, we then explore several strategies that modify Dropout to avoid the variance-shift risk and overcome the limitations of the combination.
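
As a concrete illustration of the variance shift described above, here is a minimal sketch, not taken from the paper; it assumes PyTorch, an arbitrary feature width of 64, and a dropout rate of p = 0.5. A Dropout layer placed before BatchNorm1d runs in training mode so that BN accumulates running statistics of dropout-perturbed inputs; at test time Dropout becomes the identity, and BN's stored variance no longer matches the input it actually receives.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

p = 0.5                                # dropout rate (arbitrary choice for the demo)
drop = nn.Dropout(p)                   # inverted dropout: kept units are scaled by 1/(1-p)
bn = nn.BatchNorm1d(64)                # tracks running mean/variance during training

# "Training": BN accumulates statistics of the dropout-perturbed activations.
for _ in range(500):
    x = torch.randn(256, 64)           # zero-mean, unit-variance features
    _ = bn(drop(x))                    # BN's input variance is roughly 1 / (1 - p)

# "Inference": Dropout turns off, but BN keeps its train-time running variance.
drop.eval()
bn.eval()
x = torch.randn(4096, 64)
print("BN running variance (from training):", bn.running_var.mean().item())  # ~2.0
print("True test-time input variance:     ", x.var(dim=0).mean().item())     # ~1.0
```

With p = 0.5, BN's running variance settles near 1/(1-p) = 2, roughly twice the variance of the dropout-free test input. This train/test inconsistency is the variance shift the paper identifies as the cause of degraded accuracy when Dropout is placed before BN.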
