Efficient Adaptive Online Learning via Frequent Directions.
IJCAI (2018)
Abstract
By employing time-varying proximal functions, adaptive subgradient methods (ADAGRAD) have improved the regret bound and been widely used in online learning and optimization. However, ADAGRAD with full-matrix proximal functions (ADA-FULL) cannot handle large-scale problems, due to its impractical $O(d^3)$ time and $O(d^2)$ space complexities, even though it performs better when gradients are correlated. In this paper, we propose two efficient variants of ADA-FULL via a matrix sketching technique called frequent directions (FD). The first variant, named ADA-FD, directly utilizes FD to maintain and manipulate low-rank matrices, which reduces the space and time complexities to $O(\tau d)$ and $O(\tau^2 d)$ respectively, where $d$ is the dimensionality and $\tau \ll d$ is the sketching size. The second variant, named ADA-FFD, further adopts a doubling trick to accelerate the FD step used in ADA-FD, which reduces the average time complexity to $O(\tau d)$ while only doubling the space complexity of ADA-FD. Theoretical analysis reveals that the regret of ADA-FD and ADA-FFD is close to that of ADA-FULL as long as the outer-product matrix of the gradients is approximately low-rank. Experimental results demonstrate the efficiency and effectiveness of our algorithms.
Keywords
Algorithms; Education, Distance