Robust Subspace Approximation In A Stream

Advances in Neural Information Processing Systems 31 (NIPS 2018)

Abstract
We study robust subspace estimation in the streaming and distributed settings. Given a set of $n$ data points $\{a_i\}_{i=1}^{n}$ in $\mathbb{R}^d$ and an integer $k$, we wish to find a linear subspace $S$ of dimension $k$ for which $\sum_i M(\mathrm{dist}(S, a_i))$ is minimized, where $\mathrm{dist}(S, x) := \min_{y \in S} \|x - y\|_2$ and $M(\cdot)$ is some loss function. When $M$ is the identity function, $S$ gives a subspace that is more robust to outliers than that provided by the truncated SVD. Though the problem is NP-hard, it is approximable within a $(1 + \epsilon)$ factor in polynomial time when $k$ and $\epsilon$ are constant. We give the first sublinear approximation algorithm for this problem in the turnstile streaming and arbitrary partition distributed models, achieving the same time guarantees as in the offline case. Our algorithm is the first based entirely on oblivious dimensionality reduction, and significantly simplifies prior methods for this problem, which held in neither the streaming nor distributed models.
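
To make the objective concrete, the sketch below evaluates $\sum_i M(\mathrm{dist}(S, a_i))$ for a fixed candidate subspace. It is only an illustration of the cost function defined in the abstract, not the paper's streaming algorithm. The function names (`dist_to_subspace`, `subspace_cost`), the toy data, and the requirement that the subspace be given as an orthonormal basis are all assumptions made for this example.

```python
import math

def dist_to_subspace(x, basis):
    """Euclidean distance from x to span(basis).

    Assumption of this sketch: the basis vectors are orthonormal, so the
    projection of x is the sum of its components along each basis vector.
    """
    residual = list(x)
    for v in basis:
        coeff = sum(xi * vi for xi, vi in zip(x, v))
        residual = [ri - coeff * vi for ri, vi in zip(residual, v)]
    return math.sqrt(sum(ri * ri for ri in residual))

def subspace_cost(points, basis, M=lambda r: r):
    """Sum of M(dist(S, a_i)) over all points; M defaults to the identity."""
    return sum(M(dist_to_subspace(a, basis)) for a in points)

# Hypothetical toy data: five inliers on the x-axis and one outlier on the y-axis.
points = [(1.0, 0.0), (-1.0, 0.0), (1.0, 0.0), (-1.0, 0.0), (1.0, 0.0), (0.0, 3.0)]
x_axis = [(1.0, 0.0)]
y_axis = [(0.0, 1.0)]

# Identity loss (sum of distances): the x-axis costs 3.0, the y-axis costs 5.0.
cost_identity_x = subspace_cost(points, x_axis)
cost_identity_y = subspace_cost(points, y_axis)

# Squared loss (the quantity the truncated SVD minimizes): 9.0 versus 5.0.
cost_squared_x = subspace_cost(points, x_axis, M=lambda r: r * r)
cost_squared_y = subspace_cost(points, y_axis, M=lambda r: r * r)
```

Between these two candidate lines, the identity loss prefers the x-axis, where the inliers lie, while the squared loss prefers the y-axis because the single outlier dominates the sum. The comparison covers only these two candidates and says nothing about the optimal subspace for either loss.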
Keywords
polynomial time,loss function,a set,dimensionality reduction,approximation algorithm,novelty detection,identically distributed,linear subspace,identity function