Detection Thresholds in Very Sparse Matrix Completion

arxiv(2022)

Abstract
We study the matrix completion problem: an underlying m × n matrix P is low rank, with incoherent singular vectors, and a random m × n matrix A is equal to P on a (uniformly) random subset of entries of size dn. All other entries of A are equal to zero. The goal is to retrieve information on P from the observation of A. Let A_1 be the random matrix where each entry of A is multiplied by an independent {0,1}-Bernoulli random variable with parameter 1/2. This paper is about when, how and why the non-Hermitian eigen-spectra of the matrices A_1 (A - A_1)^* and (A - A_1)^* A_1 capture more of the relevant information about the principal component structure of A than the eigen-spectra of A A^* and A^* A. We show that the eigenvalues of the asymmetric matrices A_1 (A - A_1)^* and (A - A_1)^* A_1 with modulus greater than a detection threshold are asymptotically equal to the eigenvalues of PP^* and P^*P, and that the associated eigenvectors are aligned as well. The central surprise is that by intentionally inducing asymmetry and additional randomness via the A_1 matrix, we can extract more information than if we had worked with the singular value decomposition (SVD) of A. The associated detection threshold is asymptotically exact and is non-universal, since it explicitly depends on the element-wise distribution of the underlying matrix P. We show that reliable, statistically optimal but not perfect matrix recovery, via a universal data-driven algorithm, is possible above this detection threshold using the information extracted from the asymmetric eigen-decompositions. Averaging the left and right eigenvectors provably improves estimation accuracy but not the detection threshold. Our results encompass the very sparse regime where d is of order 1, in which matrix completion via the SVD of A fails or produces unreliable recovery.
We define another variant of this asymmetric principal component analysis procedure that bypasses the randomization step and has a detection threshold that is smaller by a constant factor, but with a computational cost that is larger by a polynomial factor of the number of observed entries. Both detection thresholds allow one to go beyond the barrier due to the well-known information-theoretic limit d ≍ log n for exact matrix completion found in the literature.
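The randomized splitting step described in the abstract can be sketched numerically. The following is a minimal illustration, not the authors' algorithm: the function name `split_and_eig`, the rank-one ±1 test matrix, and all parameter values are assumptions made here. The abstract does not state the normalization under which the eigenvalues match those of PP^*, nor the value of the detection threshold, so the sketch only builds A, A_1 and A - A_1 and returns the raw eigenvalues of A_1 (A - A_1)^*, sorted by modulus. The matrices are real, so the conjugate transpose reduces to a plain transpose.

```python
import numpy as np


def split_and_eig(P, d, rng):
    """Observe d*n uniformly chosen entries of P, split them at random,
    and return (A, A1, A2, eigenvalues of A1 @ A2.T sorted by modulus).

    Hypothetical helper for illustration; no rescaling or thresholding
    is applied because the abstract does not specify either.
    """
    m, n = P.shape
    k = int(round(d * n))  # number of observed entries

    # Uniformly random subset of k entries out of the m*n positions.
    observed = np.zeros(m * n, dtype=bool)
    observed[rng.choice(m * n, size=k, replace=False)] = True
    observed = observed.reshape(m, n)

    # A equals P on the observed entries and zero elsewhere.
    A = np.where(observed, P, 0.0)

    # Each entry of A is kept in A1 with probability 1/2, independently.
    keep = rng.random((m, n)) < 0.5
    A1 = A * keep
    A2 = A - A1  # the complementary half of the observations

    # Non-Hermitian m x m matrix A1 (A - A1)^*; its eigenvalues may be complex.
    eig = np.linalg.eigvals(A1 @ A2.T)
    eig = eig[np.argsort(-np.abs(eig))]  # largest modulus first
    return A, A1, A2, eig


# Assumed toy instance: rank-one P with +/-1 (delocalized) factors.
rng = np.random.default_rng(0)
u = rng.choice([-1.0, 1.0], size=200)
v = rng.choice([-1.0, 1.0], size=200)
P = np.outer(u, v)

A, A1, A2, eig = split_and_eig(P, d=6.0, rng=rng)
print("largest eigenvalue moduli:", np.abs(eig[:5]))
```

Because A_1 and A - A_1 have disjoint supports, the product A_1 (A - A_1)^* is not symmetric in general, which is why a general eigenvalue routine is used here rather than a symmetric one.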
Keywords
Matrix completion,Sparse random graphs,Eigenvalues,Spectral algorithms,Non-Hermitian matrices