Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization

Stanislaw Jastrzebski
Devansh Arpit
Oliver Astrand
Giancarlo Kerg
Huan Wang
Krzysztof Geras
Other Links: arxiv.org

Abstract:

The early phase of training has been shown to be important in two ways for deep neural networks. First, the degree of regularization in this phase significantly impacts the final generalization. Second, it is accompanied by a rapid change in the local loss curvature influenced by regularization choices. Connecting these two findings, we...
