Algorithms For Learning Sparse Additive Models With Interactions In High Dimensions
Information and Inference: A Journal of the IMA (2018)
Abstract
A function $f : \mathbb{R}^d \to \mathbb{R}$ is a sparse additive model (SPAM) if it is of the form $f(x) = \sum_{l \in S} \phi_l(x_l)$, where $S \subset [d]$ and $|S| \ll d$. Assuming the $\phi_l$'s and $S$ to be unknown, there exists extensive work for estimating $f$ from its samples. In this work, we consider a generalized version of SPAMs that also allows for the presence of a sparse number of second-order interaction terms. For some $S_1 \subset [d]$ and $S_2 \subset \binom{[d]}{2}$, with $|S_1| \ll d$ and $|S_2| \ll d^2$, the function $f$ is now assumed to be of the form $f(x) = \sum_{p \in S_1} \phi_p(x_p) + \sum_{(l, l') \in S_2} \phi_{(l, l')}(x_l, x_{l'})$. Assuming we have the freedom to query $f$ anywhere in its domain, we derive efficient algorithms that provably recover $S_1$ and $S_2$ with finite sample bounds. Our analysis covers the noiseless setting, where exact samples of $f$ are obtained, and also extends to the noisy setting, where the queries are corrupted with noise. For the noisy setting in particular, we consider two noise models: i.i.d. Gaussian noise, and arbitrary but bounded noise. Our main methods for identification of $S_2$ essentially rely on estimation of sparse Hessian matrices, for which we provide two novel compressed sensing-based schemes. Once $S_1$ and $S_2$ are known, we show how the individual components $\phi_p$, $\phi_{(l, l')}$ can be estimated via additional queries of $f$, with uniform error bounds. Lastly, we provide simulation results on synthetic data that validate our theoretical findings.
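To make the link between the interaction set and the Hessian concrete: a mixed second partial derivative of $f$ with respect to $x_l$ and $x_{l'}$ can be nonzero only if $(l, l') \in S_2$, so the off-diagonal support of the Hessian reveals the interacting pairs. The following is a minimal sketch of that idea in the noiseless setting, not the compressed sensing-based schemes of the paper. It checks every pair by brute-force central finite differences, which costs on the order of $d^2$ query batches. The toy function, the names `mixed_partial` and `identify_supports`, the step size, the tolerance, and the assumption that variables in $S_1$ do not also appear in a pair of $S_2$ are all illustrative choices made here.

```python
import itertools
import numpy as np

D = 6  # ambient dimension (toy value, chosen for illustration)

def f(x):
    # Hypothetical test function with S_1 = {0, 3} and S_2 = {(1, 4)}.
    return np.sin(x[0]) + x[3] ** 2 + x[1] * x[4]

def mixed_partial(func, x, i, j, h=1e-2):
    # Central finite-difference estimate of d^2 f / (dx_i dx_j) at x.
    e_i = np.zeros_like(x)
    e_i[i] = h
    e_j = np.zeros_like(x)
    e_j[j] = h
    return (func(x + e_i + e_j) - func(x + e_i - e_j)
            - func(x - e_i + e_j) + func(x - e_i - e_j)) / (4 * h * h)

def identify_supports(func, d, n_points=5, h=1e-2, tol=1e-3, seed=0):
    # Query at several random base points so that a derivative which
    # happens to vanish at one point does not hide an active term.
    rng = np.random.default_rng(seed)
    points = rng.uniform(-1.0, 1.0, size=(n_points, d))

    # S_2: pairs whose mixed partial is nonzero at some base point.
    s2 = set()
    for i, j in itertools.combinations(range(d), 2):
        if max(abs(mixed_partial(func, x, i, j, h)) for x in points) > tol:
            s2.add((i, j))

    # S_1: remaining variables with a nonzero first-order difference.
    in_pairs = {v for pair in s2 for v in pair}
    s1 = set()
    for i in range(d):
        if i in in_pairs:
            continue
        e = np.zeros(d)
        e[i] = h
        if max(abs(func(x + e) - func(x - e)) / (2 * h) for x in points) > tol:
            s1.add(i)
    return s1, s2

s1, s2 = identify_supports(f, D)
print("estimated S_1:", sorted(s1))
print("estimated S_2:", sorted(s2))
```

Because this sketch tests all $\binom{d}{2}$ pairs individually, its query count grows quadratically in $d$; exploiting the sparsity of the Hessian, as the abstract describes, is what allows far fewer queries when $|S_2| \ll d^2$.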
Keywords
sparse additive models, non-parametric function estimation, compressed sensing, sparse Hessian estimation, high-dimensional functions