Small Covers for Near-Zero Sets of Polynomials and Learning Latent Variable Models
2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS)
Abstract
Let $V$ be any vector space of multivariate degree-$d$ homogeneous polynomials with co-dimension at most $k$, and let $S$ be the set of points where all polynomials in $V$ nearly vanish. We establish a qualitatively optimal upper bound on the size of $\epsilon$-covers for $S$ in the $\ell_{2}$-norm. Roughly speaking, we show that there exists an $\epsilon$-cover for $S$ of cardinality $M=(k/\epsilon)^{O_{d}(k^{1/d})}$. Our result is constructive, yielding an algorithm to compute such an $\epsilon$-cover that runs in time $\text{poly}(M)$.
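For concreteness, one standard way to formalize the objects involved is the following; the near-vanishing threshold $\eta$ and the restriction to the unit sphere are illustrative assumptions on our part, not the paper's exact notation:

$$S = \left\{ x \in \mathbb{S}^{n-1} : |p(x)| \le \eta \ \text{ for all } p \in V \right\},$$

and a set $C \subseteq \mathbb{R}^{n}$ is an $\epsilon$-cover for $S$ in the $\ell_{2}$-norm if for every $x \in S$ there is some $y \in C$ with $\|x-y\|_{2} \le \epsilon$.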
Building on our structural result, we obtain significantly improved learning algorithms for several fundamental high-dimensional probabilistic models with hidden variables. These include density and parameter estimation for $k$-mixtures of spherical Gaussians (with known common covariance), PAC learning one-hidden-layer ReLU networks with $k$ hidden units (under the Gaussian distribution), density and parameter estimation for $k$-mixtures of linear regressions (with Gaussian covariates), and parameter estimation for $k$-mixtures of hyperplanes. Our algorithms run in time quasi-polynomial in the parameter $k$; previous algorithms for these problems had running times exponential in $k^{\Omega(1)}$. At a high level, our algorithms for all of these learning problems work as follows: by computing the low-degree moments of the hidden parameters, we are able to find a vector space of polynomials that nearly vanish on the unknown parameters. Our structural result then allows us to compute a quasi-polynomial-size cover for the set of hidden parameters, which we exploit in our learning algorithms.
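As a minimal sketch of the first step of this pipeline (not the paper's actual algorithm), the following assumes the hidden parameter vectors are given directly; in the real learning setting, their low-degree moments would instead be estimated from samples. It recovers polynomials that nearly vanish on the parameters as the approximate null space of a moment matrix. All function and variable names here are hypothetical.

```python
import itertools
import numpy as np

def monomials(x, d):
    """Evaluate all degree-d monomials of the vector x, in a fixed order."""
    idx = itertools.combinations_with_replacement(range(len(x)), d)
    return np.array([np.prod([x[i] for i in combo]) for combo in idx])

def near_vanishing_polynomials(params, d, tol=1e-8):
    """Return coefficient vectors (columns) of degree-d homogeneous
    polynomials that nearly vanish on every row of `params`."""
    # Row v of A holds the degree-d monomial evaluations at parameter v.
    A = np.stack([monomials(v, d) for v in params])
    # M is (up to scaling) the degree-2d moment matrix of the uniform
    # distribution over the parameter vectors.
    M = A.T @ A / len(params)
    # Polynomials p with p(v) ~ 0 for every parameter v correspond to
    # the approximate null space of M: eigenvectors with tiny eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    return eigvecs[:, eigvals <= tol]

# Example: k = 3 hidden parameter vectors in R^4. The quadratics (d = 2)
# vanishing on all of them form a space of co-dimension at most 3 inside
# the 10-dimensional space of degree-2 coefficient vectors.
rng = np.random.default_rng(0)
params = rng.standard_normal((3, 4))
params /= np.linalg.norm(params, axis=1, keepdims=True)
V = near_vanishing_polynomials(params, d=2)
print(V.shape)  # generically (10, 7)
```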