
# Robust Density Estimation under Besov IPM Losses

NeurIPS 2020


Abstract

We study minimax convergence rates of nonparametric density estimation in the Huber contamination model, in which a proportion of the data comes from an unknown outlier distribution. We provide the first results for this problem under a large family of losses, called Besov integral probability metrics (IPMs), that includes $\mathcal{L}^…$


Introduction

- In many settings, observed data contains samples from a population distribution of interest, and a small proportion of outlier samples.
- The authors extend this result to show that when $p_d' \ge p_g$, wavelet thresholding estimators remain minimax optimal under both structured and unstructured contamination settings.
- For $p_d' \le p_g$, linear wavelet estimators are minimax optimal under the structured contamination setting, and under the unstructured contamination setting when the IPM is generated by a sufficiently smooth class of discriminator functions.
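The data-generating process of the Huber contamination model described above can be sketched in a few lines. This is an illustrative simulation, not code from the paper; the function names and the choice of Gaussian population and outlier distributions are assumptions for the example.

```python
import numpy as np

def sample_huber(n, eps, sample_p, sample_g, rng):
    """Draw n points from the Huber contamination model (1 - eps) * P + eps * G.

    sample_p and sample_g draw from the population distribution P of interest
    and the unknown outlier distribution G, respectively.
    """
    from_outlier = rng.random(n) < eps  # each point is an outlier w.p. eps
    x = np.where(from_outlier, sample_g(n, rng), sample_p(n, rng))
    return x, from_outlier

rng = np.random.default_rng(0)
x, mask = sample_huber(
    n=10_000, eps=0.05,
    sample_p=lambda n, rng: rng.normal(0.0, 1.0, n),   # population: N(0, 1)
    sample_g=lambda n, rng: rng.normal(10.0, 1.0, n),  # outliers: N(10, 1)
    rng=rng,
)
```

The estimator only sees `x`; the indicator `mask` is returned here purely so a simulation can check how contamination affects an estimate.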

Highlights

- We show that faster convergence rates are possible under the assumption that G has a bounded density g, but that these rates are not further improved by additional smoothness assumptions on g
- This implies that when $\sigma_d \ge D/p_d$ and the contamination is below the breakdown point of the wavelet thresholding estimator, we can construct a generative adversarial network (GAN) estimate, with large enough networks, that converges at the minimax optimal rate of the uncontaminated setting
- We studied a variant of nonparametric density estimation in which a proportion of the data are contaminated by random outliers
- The classical wavelet thresholding estimator originally proposed by Donoho et al. (1996), which is widely known to be optimal for uncontaminated nonparametric density estimation, continues to be minimax optimal in the presence of contamination, in many settings
- Additional smoothness assumptions on the contamination density have no effect on the rate. This contrasts with the case of estimating a density at a point, as studied by Liu & Gao (2017); the minimax rates they derived depend precisely on the smoothness assumed of the contamination density
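As a concrete illustration of the wavelet thresholding estimator highlighted above, here is a minimal sketch using the Haar basis on [0, 1]. The level choices, threshold constant, and function names are assumptions for the example, not the paper's exact construction or tuning.

```python
import numpy as np

def haar_thresh_density(x, j0=2, j1=6, c=1.0):
    """Haar wavelet thresholding density estimate on [0, 1].

    The father-wavelet (histogram) part at coarse level j0 is kept as-is;
    detail coefficients at levels j0, ..., j1-1 are hard-thresholded at
    t = c * sqrt(log n / n), the usual universal threshold.
    Returns (grid, values): the estimate evaluated at 2**j1 midpoints.
    """
    n = len(x)
    t = c * np.sqrt(np.log(n) / n)
    grid = (np.arange(2 ** j1) + 0.5) / 2 ** j1
    est = np.zeros_like(grid)

    def phi(u, j, k):  # father wavelet: 2^{j/2} on [k/2^j, (k+1)/2^j)
        on = (u >= k / 2 ** j) & (u < (k + 1) / 2 ** j)
        return 2.0 ** (j / 2) * on.astype(float)

    def psi(u, j, k):  # mother wavelet: +2^{j/2} left half, -2^{j/2} right half
        left = (u >= k / 2 ** j) & (u < (k + 0.5) / 2 ** j)
        right = (u >= (k + 0.5) / 2 ** j) & (u < (k + 1) / 2 ** j)
        return 2.0 ** (j / 2) * (left.astype(float) - right.astype(float))

    # Linear (unthresholded) coarse part: empirical father coefficients.
    for k in range(2 ** j0):
        est += phi(x, j0, k).mean() * phi(grid, j0, k)

    # Hard-thresholded detail part: small (noise-level) coefficients are dropped.
    for j in range(j0, j1):
        for k in range(2 ** j):
            beta = psi(x, j, k).mean()  # empirical detail coefficient
            if abs(beta) > t:
                est += beta * psi(grid, j, k)
    return grid, est

rng = np.random.default_rng(1)
grid, est = haar_thresh_density(rng.random(2000))  # 2000 Uniform[0,1) samples
```

For uniform data the detail coefficients sit at the noise level, so thresholding removes nearly all of them and the estimate stays close to the flat density 1.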

Results

- When the contamination is small, the thresholding wavelet estimator converges at the rate of the uncontaminated setting, which is faster than any linear estimator, as shown in Uppal et al. (2019).
- Under Besov IPM losses generated by the Besov space $B^{1}_{p_d, q_d}$, the wavelet thresholding estimator is robustly minimax optimal in both the arbitrary and structured contamination settings.
- In the case $p_g = q_g = \infty$, the data distribution is itself $\sigma_g$-Hölder continuous, and the linear wavelet estimator is robustly minimax optimal under any Besov IPM loss.
- Building on an oracle inequality of Liang (2018), several recent works (Liu et al., 2017; Liang, 2018; Singh et al., 2018; Uppal et al., 2019) have studied a statistical formulation of GANs as a distribution estimator based on empirical risk minimization (ERM) under an IPM loss.
- The authors extend these results to the contamination setting and show that the wavelet thresholding estimator can be used to construct a GAN estimate that is robustly minimax optimal.
- By Theorem 3, if the approximation error of the generator and discriminator networks is at most the convergence rate of the wavelet thresholding estimator, the resulting GAN estimate attains the upper bound of Theorem 3 when $\sigma_d \ge D/p_d$, or the convergence rate of the wavelet thresholding estimator when the contamination is arbitrary and the discriminator is smooth enough.
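An IPM measures the largest discrepancy in expectations over a discriminator class $\mathcal{F}$, $d_{\mathcal{F}}(P, Q) = \sup_{f \in \mathcal{F}} |\mathbb{E}_P f - \mathbb{E}_Q f|$; Besov IPMs take $\mathcal{F}$ to be a Besov ball. A toy plug-in version with a small, hand-picked finite discriminator class (an assumption for illustration only; nothing like the paper's wavelet analysis) looks like this:

```python
import numpy as np

def empirical_ipm(x, y, discriminators):
    """Plug-in IPM estimate: sup over a finite class F of |mean f(x) - mean f(y)|.

    Besov IPMs use an infinite discriminator class (a Besov ball); here F is
    a tiny hand-picked set of bounded test functions, purely for illustration.
    """
    return max(abs(f(x).mean() - f(y).mean()) for f in discriminators)

F = [np.sin, np.cos, lambda u: np.clip(u, -1.0, 1.0)]

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 5000)
same = empirical_ipm(x, rng.normal(0.0, 1.0, 5000), F)  # same law: near zero
diff = empirical_ipm(x, rng.normal(2.0, 1.0, 5000), F)  # shifted law: large
```

A richer class $\mathcal{F}$ makes the metric stronger; the paper's results characterize the estimation rates attainable when $\mathcal{F}$ is a Besov ball.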

Conclusion

- This implies that when $\sigma_d \ge D/p_d$ and the contamination is below the breakdown point of the wavelet thresholding estimator, the authors can construct a GAN estimate, with large enough networks, that converges at the minimax optimal rate of the uncontaminated setting, i.e., $n^{-\frac{\sigma_g + \sigma_d}{2\sigma_g + D}} + n^{-1/2}$

Related work

- This paper extends recent results in both robust estimation and nonparametric density estimation. We now summarize the results of the most relevant papers, namely those of Chen et al. (2016), Liu & Gao (2017), and Uppal et al. (2019).

Chen et al. (2016) give a unified study of a large class of robust nonparametric estimation problems under the total variation loss. In the particular case of estimating a $\sigma_g$-Hölder continuous density, their results imply a minimax convergence rate of $n^{-\frac{\sigma_g}{2\sigma_g + 1} + \epsilon}$, matching our results (Theorems 4 and 6) for total variation loss. Whereas the results of Chen et al. (2016) are quite specific to total variation loss, we simultaneously provide results for a range of other loss functions that have been considered in the literature, as well as a range of densities under varying smoothness assumptions. Moreover, the estimator studied by Chen et al.
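To make the rate comparison concrete, the minimax rate $n^{-\sigma_g/(2\sigma_g + 1)}$ for total variation loss can be evaluated numerically. This is a small worked example added for illustration, not code from the paper; it ignores the $+\epsilon$ slack in the exponent.

```python
def tv_rate(n, sigma):
    """Minimax TV-loss rate n**(-sigma / (2*sigma + 1)) for a sigma-Holder
    density in one dimension (ignoring the +epsilon slack in the exponent)."""
    return n ** (-sigma / (2 * sigma + 1))

# Smoother densities are estimable at faster rates, approaching (but never
# reaching) the parametric rate n**(-1/2) as sigma grows.
rates = {sigma: tv_rate(10_000, sigma) for sigma in (0.5, 1, 2)}
```

For example, at $n = 10^4$ and $\sigma_g = 1$ the rate is $n^{-1/3} \approx 0.046$, while a Lipschitz-smoother density ($\sigma_g = 2$) improves this to $n^{-2/5}$.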

References

- Abbasnejad, E., Shi, J., and van den Hengel, A. Deep Lipschitz networks and Dudley GANs, 2018. URL https://openreview.net/forum?id=rkw-jlb0W.
- Chen, M., Gao, C., Ren, Z., et al. A general decision theory for Huber's ε-contamination model. Electronic Journal of Statistics, 10(2):3752–3774, 2016.
- Chen, M., Gao, C., Ren, Z., et al. Robust covariance and scatter matrix estimation under Huber's contamination model. The Annals of Statistics, 46(5):1932–1960, 2018.
- Diakonikolas, I., Kamath, G., Kane, D., Li, J., Moitra, A., and Stewart, A. Robust estimators in high-dimensions without the computational intractability. SIAM Journal on Computing, 48(2):742–864, 2019.
- Donoho, D. L., Johnstone, I. M., Kerkyacharian, G., and Picard, D. Density estimation by wavelet thresholding. The Annals of Statistics, pp. 508–539, 1996.
- Dudley, R. M. Speeds of metric probability convergence. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 22(4):323–332, 1972.
- Huber, P. J. A robust version of the probability ratio test. The Annals of Mathematical Statistics, pp. 1753–1758, 1965.
- Kantorovich, L. V. and Rubinstein, G. S. On a space of completely additive functions. Vestnik Leningrad. Univ., 13(7):52–59, 1958.
- Kim, J. and Scott, C. D. Robust kernel density estimation. Journal of Machine Learning Research, 13(Sep):2529–2565, 2012.
- Kolmogorov, A. Sulla determinazione empirica di una legge di distribuzione. Inst. Ital. Attuari, Giorn., 4:83–91, 1933.
- Leoni, G. A first course in Sobolev spaces. American Mathematical Soc., 2017.
- Lerasle, M., Szabo, Z., Mathieu, T., and Lecue, G. MONK – outlier-robust mean embedding estimation by median-of-means. arXiv preprint arXiv:1802.04784, 2018.
- Liu, S., Bousquet, O., and Chaudhuri, K. Approximation and convergence properties of generative adversarial learning. In Advances in Neural Information Processing Systems, pp. 5551–5559, 2017.
- Lugosi, G. and Mendelson, S. Risk minimization by median-of-means tournaments. arXiv preprint arXiv:1608.00757, 2016.
- Minsker, S., et al. Distributed statistical estimation and rates of convergence in normal approximation. Electronic Journal of Statistics, 13(2):5213–5252, 2019.
- Mohamed, S. and Lakshminarayanan, B. Learning in implicit generative models. arXiv preprint arXiv:1610.03483, 2016.
- Mroueh, Y., Li, C.-L., Sercu, T., Raj, A., and Cheng, Y. Sobolev GAN. arXiv preprint arXiv:1711.04894, 2017.
- Ramdas, A., García Trillos, N., and Cuturi, M. On Wasserstein two-sample testing and related families of nonparametric tests. Entropy, 19(2):47, 2017.
- Singh, S., Uppal, A., Li, B., Li, C.-L., Zaheer, M., and Poczos, B. Nonparametric density estimation under adversarial losses. In Advances in Neural Information Processing Systems 31, pp. 10246–10257, 2018.
- Smirnov, N. Table for estimating the goodness of fit of empirical distributions. The Annals of Mathematical Statistics, 19(2):279–281, 1948.
- Sønderby, C. K., Caballero, J., Theis, L., Shi, W., and Huszár, F. Amortised MAP inference for image super-resolution. arXiv preprint arXiv:1610.04490, 2016.
- Szekely, G. J., Rizzo, M. L., Bakirov, N. K., et al. Measuring and testing dependence by correlation of distances. The Annals of Statistics, 35(6):2769–2794, 2007.
- Tolstikhin, I., Sriperumbudur, B. K., and Muandet, K. Minimax estimation of kernel mean embeddings. The Journal of Machine Learning Research, 18(1):3002–3048, 2017.
- Triebel, H. Theory of function spaces II. Bull. Amer. Math. Soc., 31:119–125, 1994.
- Tsybakov, A. B. Introduction to nonparametric estimation. Springer Series in Statistics. Springer, New York, 2009.
- Uppal, A., Singh, S., and Poczos, B. Nonparametric density estimation under Besov IPM losses. arXiv preprint arXiv:1902.03511, 2019.
- Vandermeulen, R. and Scott, C. Consistency of robust kernel density estimators. In Conference on Learning Theory, pp. 568–591, 2013.
- Villani, C. Optimal transport: old and new, volume 338. Springer Science & Business Media, 2008.
- Wasserman, L. All of Nonparametric Statistics. Springer Science & Business Media, 2006.
- …by sequences of independent random variables. Israel Journal of Mathematics, 8(3):273–303, 1970.
- …mal rate and curse of dimensionality. arXiv preprint arXiv:1810.08033, 2018.
