When Does Preconditioning Help or Hurt Generalization?

Shun-ichi Amari
Xuechen Li
Atsushi Nitanda
Denny Wu
Ji Xu

International Conference on Learning Representations, 2020.

Cited by: 3 | Views: 71
Weibo:
The paper characterizes the population risk of preconditioned least squares regression in the overparameterized regime and determines the optimal preconditioner for generalization.
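
To make the setting concrete, here is a minimal sketch of preconditioned gradient descent on an overparameterized least squares problem. It is an illustration only and does not reproduce the paper's analysis or experiments. The function names `preconditioned_gd` and `limit_interpolant`, the Gaussian data, the step-size rule, and the diagonal preconditioner `P_diag` are all assumptions chosen for the example. It relies on a standard linear algebra fact: starting from zero, the update theta <- theta - lr * P X^T (X theta - y) / n stays in the range of P X^T, so when it converges it reaches the interpolant P X^T (X P X^T)^{-1} y, which for P = I is the minimum Euclidean norm solution.

```python
import numpy as np


def preconditioned_gd(X, y, P, steps=5000):
    """Preconditioned GD on 0.5/n * ||X theta - y||^2, started at zero.

    P is assumed symmetric positive definite. The step size is set to
    1 / lambda_max(X P X^T / n), an illustrative choice that keeps the
    residual iteration stable.
    """
    n, d = X.shape
    K = X @ P @ X.T / n                      # n x n, symmetric
    lr = 1.0 / np.linalg.eigvalsh(K).max()
    theta = np.zeros(d)
    for _ in range(steps):
        grad = X.T @ (X @ theta - y) / n     # plain least squares gradient
        theta = theta - lr * (P @ grad)      # preconditioned update
    return theta


def limit_interpolant(X, y, P):
    """Closed-form limit of the iteration above: P X^T (X P X^T)^{-1} y."""
    return P @ X.T @ np.linalg.solve(X @ P @ X.T, y)


# Synthetic overparameterized problem (d > n); purely illustrative data.
rng = np.random.default_rng(0)
n, d = 20, 100
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

P_gd = np.eye(d)                             # P = I recovers ordinary GD
P_diag = np.diag(np.linspace(0.5, 2.0, d))   # hypothetical SPD preconditioner

theta_gd = preconditioned_gd(X, y, P_gd)
theta_pre = preconditioned_gd(X, y, P_diag)
theta_min_norm = np.linalg.pinv(X) @ y       # minimum Euclidean norm interpolant

print("GD matches min-norm solution:", np.allclose(theta_gd, theta_min_norm))
print("norm of GD solution:            ", np.linalg.norm(theta_gd))
print("norm of preconditioned solution:", np.linalg.norm(theta_pre))
```

Both runs fit the training data exactly, but they end at different interpolants. Which one generalizes better depends on the preconditioner and on the target, which is the question the paper studies.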

Abstract:

While second-order optimizers such as natural gradient descent (NGD) often speed up optimization, their effect on generalization remains controversial. For instance, it has been pointed out that gradient descent (GD), in contrast to second-order optimizers, converges to solutions with small Euclidean norm in many overparameterized model...
