Ellipsoidal Trust Region Methods for Neural Network Training
arXiv preprint arXiv:1905.09201 (2019)
Abstract
We investigate the use of ellipsoidal trust region constraints for second-order optimization of neural networks. This approach can be seen as a higher-order counterpart of adaptive gradient methods, which we here show to be interpretable as first-order trust region methods with ellipsoidal constraints. In particular, we show that the preconditioning matrix used in RMSProp and Adam satisfies the necessary conditions for convergence of first- and second-order trust region methods, and we report that this ellipsoidal constraint consistently outperforms its spherical counterpart in practice.
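The abstract's claim that adaptive gradient methods are first-order trust region methods with ellipsoidal constraints can be illustrated numerically. The sketch below (not the paper's code; all variable names and hyperparameters are illustrative) checks that an RMSProp-style step points in the same direction as the minimizer of the linear model g^T s subject to an ellipsoidal constraint s^T A s <= delta^2, where A is built from the running average of squared gradients:

```python
import numpy as np

# Hedged sketch: interpret one RMSProp-style step as a first-order
# trust-region step with an ellipsoidal constraint defined by the
# diagonal preconditioner A = diag(sqrt(v) + eps).
rng = np.random.default_rng(0)
g = rng.normal(size=5)           # current gradient
v = rng.uniform(0.1, 1.0, 5)     # running average of squared gradients
eps = 1e-8
A = np.diag(np.sqrt(v) + eps)    # preconditioner defining the ellipsoid

# RMSProp direction: proportional to -g / (sqrt(v) + eps) = -A^{-1} g
s_rms = -g / (np.sqrt(v) + eps)

# Trust-region view: minimize g^T s subject to s^T A s <= delta^2.
# The minimizer is s* = -delta * A^{-1} g / sqrt(g^T A^{-1} g),
# i.e. the RMSProp direction with a different step length.
delta = 0.1
Ainv_g = np.linalg.solve(A, g)
s_tr = -delta * Ainv_g / np.sqrt(g @ Ainv_g)

# The two updates are parallel: cosine similarity is 1 up to rounding.
cos = s_rms @ s_tr / (np.linalg.norm(s_rms) * np.linalg.norm(s_tr))
print(round(cos, 6))  # → 1.0
```

The second-order method studied in the paper replaces the linear model with a quadratic one while keeping the same ellipsoidal constraint; this sketch only demonstrates the first-order correspondence stated in the abstract.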