First-order Stochastic Algorithms for Escaping From Saddle Points in Almost Linear Time

Neural Information Processing Systems (2018)

Cited 129 | Viewed 115
Abstract
In this paper, we consider first-order methods for solving stochastic non-convex optimization problems. The key building block of the proposed algorithms is a first-order procedure for extracting negative curvature from the Hessian matrix through a principled sequence starting from noise, which we refer to as NEgative-curvature-Originated-from-Noise (NEON) and which is of independent interest. Based on this building block, we design purely first-order stochastic algorithms for escaping from non-degenerate saddle points with a much better time complexity (almost linear in the problem's dimensionality) than previous first-order stochastic algorithms, under a bounded-variance condition on the stochastic gradients. In particular, we develop a general framework of first-order stochastic algorithms with a second-order convergence guarantee, based on our new technique and on existing algorithms that may only converge to a first-order stationary point. For finding a nearly second-order stationary point $x$ such that $\|\nabla F(x)\| \le \epsilon$ and $\nabla^2 F(x) \succeq -\sqrt{\epsilon}\, I$ (with high probability), the best time complexity of the presented algorithms is $\widetilde{O}(d/\epsilon^{3.5})$, where $F(\cdot)$ denotes the objective function and $d$ is the dimensionality of the problem. To the best of our knowledge, this is the first theoretical result for first-order stochastic algorithms that is almost linear in the problem's dimensionality for finding second-order stationary points, and it is even competitive with existing stochastic algorithms that hinge on second-order information.
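To make the NEON idea concrete, below is a minimal sketch of noise-initiated negative-curvature extraction, not the paper's exact procedure: the Hessian-vector product is replaced by a gradient difference, so only first-order information is used, and the resulting update behaves like a power iteration on $(I - \eta \nabla^2 F(x))$, amplifying components of a small random vector along negative-eigenvalue directions. The step size eta, noise radius r, iteration budget T, and escape threshold are illustrative assumptions, as is the toy quadratic objective.

```python
import numpy as np

def neon_sketch(grad, x, eta=0.01, r=1e-3, T=1500, threshold=1.0):
    """Noise-initiated negative-curvature extraction (illustrative sketch).

    Approximates the Hessian-vector product H u by the gradient
    difference grad(x + u) - grad(x), so only first-order information
    is used. The update u <- u - eta * (grad(x + u) - grad(x)) acts
    like a power iteration on (I - eta * H): components of u along
    negative-eigenvalue directions grow geometrically from tiny noise,
    while components along positive-eigenvalue directions shrink.
    """
    d = x.shape[0]
    u = r * np.random.randn(d) / np.sqrt(d)   # small random noise initialization
    g0 = grad(x)
    for _ in range(T):
        hvp = grad(x + u) - g0                # first-order proxy for H u
        u = u - eta * hvp
        if np.linalg.norm(u) > threshold:     # u escaped: negative curvature found
            return u / np.linalg.norm(u)
    return None                               # no significant negative curvature detected

# Toy usage: f(x) = 0.5 * x^T diag(1, -1) x has a saddle point at the origin.
H = np.diag([1.0, -1.0])
grad = lambda x: H @ x
v = neon_sketch(grad, np.zeros(2))
print(v)  # approximately [0, +/-1], the direction of most negative curvature
```

In this sketch the returned direction can be handed to an outer loop that takes a descent step along it, which is the role NEON plays inside the paper's framework for turning first-order-stationary-point solvers into algorithms with second-order convergence guarantees.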
Keywords
saddle points, objective function, time complexity, building block, linear time, Hessian matrix, stationary point, negative curvature