# Outlier Robust Mean Estimation with Subgaussian Rates via Stability

NeurIPS 2020, 2020.

Keywords:

outlier robust mean, high dimensional robust, dimensional robust mean, strong contamination model, mean estimation

Abstract:

We study the problem of outlier robust high-dimensional mean estimation under a finite covariance assumption, and more broadly under finite low-degree moment assumptions. We consider a standard stability condition from the recent robust statistics literature and prove that, except with exponentially small failure probability, there exis...

Introduction

- Consider the following problem: For a given family F of distributions on ℝ^d, estimate the mean of an unknown D ∈ F, given access to i.i.d. samples from D.
- This is the problem of mean estimation and is arguably the most fundamental statistical task.
- The authors relax the “i.i.d. assumption” and aim to obtain estimators that are robust to a constant fraction of adversarial outliers.

Highlights

- In the most basic setting where F is the family of high-dimensional Gaussians, the empirical mean is well known to be an optimal estimator, in the sense that it achieves the best possible accuracy-confidence tradeoff and is easy to compute.
- We study high-dimensional mean estimation in the high confidence regime when the underlying family F is only assumed to satisfy bounded moment conditions.
- We showed that a standard stability condition from the recent high-dimensional robust statistics literature suffices to obtain near-subgaussian rates for robust mean estimation in the strong contamination model.
- An interesting technical question is whether the extra log d factor in Theorem 1.4 is needed. (Our results imply that it is not needed when ε = Ω(1).) If not, this would imply that stability-based algorithms achieve subgaussian rates without the pre-processing.

Results

- The authors' first main result establishes the stability of a subset of i.i.d. points drawn from a distribution with bounded covariance.

Theorem 1.4.
- Let S be a multiset of n i.i.d. samples from a distribution on ℝ^d with mean μ and covariance Σ.
- Let ε′ = O(log(1/τ)/n + ε) ≤ c, for a sufficiently small constant c > 0.
- With probability at least 1 − τ, there exists a subset S′ ⊆ S such that S′ is (2ε′, δ)-stable with respect to μ and Σ, where δ = O(√(r(Σ) log r(Σ)/n) + …)
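For reference, the stability condition that Theorem 1.4 refers to is usually stated along the following lines in the robust statistics literature. The block below is a sketch under an identity-covariance normalization, chosen so that it lines up with conditions (i) and (ii) in the Objectives; the symbols $\mu_{S''}$ and $\overline{\Sigma}_{S''}$ are notation introduced here for illustration, not taken from this page.

```latex
% A finite set S in R^d is (\epsilon, \delta)-stable with respect to \mu
% (normalized to identity covariance) if every large subset S'' of S has
% an empirical mean close to \mu and a second-moment matrix close to I.
\[
  \forall\, S'' \subseteq S \ \text{with}\ |S''| \ge (1-\epsilon)\,|S| :
  \qquad
  \bigl\| \mu_{S''} - \mu \bigr\|_2 \le \delta ,
  \qquad
  \bigl\| \overline{\Sigma}_{S''} - I \bigr\|_{\mathrm{op}} \le \frac{\delta^{2}}{\epsilon} ,
\]
\[
  \text{where}\quad
  \mu_{S''} = \frac{1}{|S''|} \sum_{x \in S''} x ,
  \qquad
  \overline{\Sigma}_{S''} = \frac{1}{|S''|} \sum_{x \in S''} (x-\mu)(x-\mu)^{\top} .
\]
```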

Conclusion

- The authors showed that a standard stability condition from the recent high-dimensional robust statistics literature suffices to obtain near-subgaussian rates for robust mean estimation in the strong contamination model.
- With a simple pre-processing, this leads to efficient outlier-robust estimators with subgaussian rates under only a bounded covariance assumption.
- An interesting technical question is whether the extra log d factor in Theorem 1.4 is needed. (The authors' results imply that it is not needed when ε = Ω(1).) If not, this would imply that stability-based algorithms achieve subgaussian rates without the pre-processing.

Summary

## Objectives:

The authors aim to achieve the best of both worlds: subgaussian rates in the high confidence regime together with robustness to a constant fraction of adversarial outliers. Recall that the aim is to find a w ∈ Δ_{n,ε} that satisfies the conditions: (i) ‖μ_w − μ‖ ≤ δ, and (ii) ‖Σ_w − I‖ ≤ δ²/ε.
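To make conditions (i) and (ii) from the Objectives concrete, the following is a minimal sketch that evaluates them for a given weight vector. It rests on several assumptions not stated on this page: the weights sum to one, the weighted second moment is centered at the reference mean μ, and the operator norm is estimated by power iteration. The names `matvec`, `spectral_norm`, and `check_conditions` are hypothetical helpers.

```python
import math
import random


def matvec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def spectral_norm(m, iters=200, seed=0):
    """Estimate the spectral norm of a symmetric matrix by power iteration on m*m."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in m]
    estimate = 0.0
    for _ in range(iters):
        length = math.sqrt(sum(c * c for c in v))
        if length == 0.0:  # m mapped v to zero; the last estimate stands
            break
        v = [c / length for c in v]
        u = matvec(m, v)
        estimate = math.sqrt(sum(c * c for c in u))  # ||m v|| for a unit vector v
        v = matvec(m, u)  # v <- m*m*v, so the sign of eigenvalues does not matter
    return estimate


def check_conditions(points, w, mu, eps, delta):
    """Return (mean_err, cov_err, ok) for conditions (i) and (ii).

    Assumption: the weighted second moment is centered at the reference mean mu.
    """
    d = len(mu)
    mu_w = [sum(wi * x[j] for wi, x in zip(w, points)) for j in range(d)]
    mean_err = math.sqrt(sum((a - b) ** 2 for a, b in zip(mu_w, mu)))
    # Start from -I and accumulate the weighted second moment around mu.
    dev = [[-1.0 if a == b else 0.0 for b in range(d)] for a in range(d)]
    for wi, x in zip(w, points):
        diff = [x[j] - mu[j] for j in range(d)]
        for a in range(d):
            for b in range(d):
                dev[a][b] += wi * diff[a] * diff[b]
    cov_err = spectral_norm(dev)
    ok = mean_err <= delta and cov_err <= delta ** 2 / eps
    return mean_err, cov_err, ok
```

On a toy input, weights that put zero mass on an outlier can pass both checks while uniform weights fail them, which is the behaviour the two conditions are meant to capture.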

Related work

- Since the initial works [DKK+16, LRV16], there has been an explosion of research activity on algorithmic aspects of outlier-robust high-dimensional estimation by several communities. See, e.g., [DK19] for a recent survey on the topic. In the context of outlier-robust mean estimation, a number of works [DKK+17, SCV18, CDG18, DHL19] have obtained efficient algorithms under various assumptions on the distribution of the inliers. Notably, efficient high-dimensional outlier-robust mean estimators have been used as primitives for robustly solving machine learning tasks that can be expressed as stochastic optimization problems [PSBR18, DKK+18]. The above works typically focus on the constant probability error regime and do not establish subgaussian rates for their estimators.

Two recent works [DL19, LLVZ19] studied the problem of outlier-robust mean estimation in the additive contamination model (when the adversary is only allowed to add outliers) and gave computationally efficient algorithms with subgaussian rates. Specifically, [DL19] gave an SDP-based algorithm, which is very similar to the algorithm of [CDG18]. The algorithm of [LLVZ19] is a fairly sophisticated iterative spectral algorithm, building on [CFB19]. In the strong contamination model, non-constructive outlier-robust estimators with subgaussian rates were established very recently. Specifically, [LM19b] gave an exponential-time estimator achieving the optimal rate. Our Proposition 1.6 implies that a very simple and practical algorithm (pre-processing followed by iterative filtering [DKK+17, DK19]) achieves this guarantee.
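The iterative filtering idea mentioned above can be illustrated with a small sketch. This is a simplified, deterministic variant written for illustration, not the authors' algorithm: it omits the pre-processing step, removes one point per round instead of using a randomized or weighted filter, and the parameters `threshold` and `max_removed` as well as all function names are hypothetical.

```python
import math
import random


def mean_of(points):
    """Coordinate-wise empirical mean."""
    n, d = len(points), len(points[0])
    return [sum(x[j] for x in points) / n for j in range(d)]


def covariance_of(points, center):
    """Empirical covariance around the given center."""
    n, d = len(points), len(center)
    cov = [[0.0] * d for _ in range(d)]
    for x in points:
        diff = [x[j] - center[j] for j in range(d)]
        for a in range(d):
            for b in range(d):
                cov[a][b] += diff[a] * diff[b] / n
    return cov


def top_eigenpair(cov, iters=300, seed=0):
    """Power iteration for the largest eigenvalue/eigenvector of a PSD matrix."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in cov]
    length = math.sqrt(sum(c * c for c in v))
    v = [c / length for c in v]
    for _ in range(iters):
        u = [sum(a * b for a, b in zip(row, v)) for row in cov]
        length = math.sqrt(sum(c * c for c in u))
        if length == 0.0:  # zero matrix, or v landed in the null space
            return 0.0, v
        v = [c / length for c in u]
    u = [sum(a * b for a, b in zip(row, v)) for row in cov]
    return sum(a * b for a, b in zip(u, v)), v  # Rayleigh quotient, direction


def filtered_mean(points, threshold, max_removed):
    """Drop the most extreme point along the top-variance direction until
    the largest covariance eigenvalue falls below the threshold."""
    kept = [tuple(x) for x in points]
    removed = 0
    while True:
        mu = mean_of(kept)
        lam, v = top_eigenpair(covariance_of(kept, mu))
        if lam <= threshold or removed >= max_removed or len(kept) <= 1:
            return mu, kept
        scores = [abs(sum(vj * (xj - mj) for vj, xj, mj in zip(v, x, mu)))
                  for x in kept]
        kept.pop(scores.index(max(scores)))
        removed += 1
```

The stopping rule reflects the intuition behind stability: once no direction has unusually large variance, the remaining outliers cannot shift the empirical mean by much. In practice the threshold would be tied to the assumed covariance bound, which this sketch leaves to the caller.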

Reference

- [AMS99] N. Alon, Y. Matias, and M. Szegedy. The space complexity of approximating the frequency moments. J. Comput. Syst. Sci., 58(1):137–147, 1999.
- [BLM13] S. Boucheron, G. Lugosi, and P. Massart. Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press, Oxford New York, NY, paperback edition, 2013.
- [Cat12] O. Catoni. Challenging the empirical mean and empirical variance: A deviation study. Ann. Inst. H. Poincare Probab. Statist., 48(4):1148–1185, 11 2012.
- [CDG18] Y. Cheng, I. Diakonikolas, and R. Ge. High-dimensional robust mean estimation in nearly-linear time. CoRR, abs/1811.09380, 2018. Conference version in SODA 2019, p. 2755-2771. URL: http://arxiv.org/abs/1811.09380, arXiv:1811.09380.
- [CDGS20] Y. Cheng, I. Diakonikolas, R. Ge, and M. Soltanolkotabi. High-dimensional robust mean estimation via gradient descent. CoRR, abs/2005.01378, 2020. URL: https://arxiv.org/abs/2005.01378, arXiv:2005.01378.
- [CFB19] Y. Cherapanamjeri, N. Flammarion, and P. L. Bartlett. Fast mean estimation with subgaussian rates. In Conference on Learning Theory, COLT 2019, volume 99 of Proceedings of Machine Learning Research, pages 786–806. PMLR, 2019.
- [DHL19] Y. Dong, S. B. Hopkins, and J. Li. Quantum entropy scoring for fast robust mean estimation and improved outlier detection. CoRR, abs/1906.11366, 2019. Conference version in NeurIPS 2019. URL: http://arxiv.org/abs/1906.11366, arXiv:1906.11366.
- [DK19] I. Diakonikolas and D. M. Kane. Recent advances in algorithmic high-dimensional robust statistics. CoRR, abs/1911.05911, 2019. URL: http://arxiv.org/abs/1911.05911, arXiv:1911.05911.
- [DKK+16] I. Diakonikolas, G. Kamath, D. M. Kane, J. Li, A. Moitra, and A. Stewart. Robust estimators in high dimensions without the computational intractability. In Proc. 57th IEEE Symposium on Foundations of Computer Science (FOCS), pages 655–664, 2016.
- [DKK+17] I. Diakonikolas, G. Kamath, D. M. Kane, J. Li, A. Moitra, and A. Stewart. Being robust (in high dimensions) can be practical. In Proc. 34th International Conference on Machine Learning (ICML), pages 999–1008, 2017.
- [DKK+18] I. Diakonikolas, G. Kamath, D. M. Kane, J. Li, J. Steinhardt, and A. Stewart. Sever: A robust meta-algorithm for stochastic optimization. CoRR, abs/1803.02815, 2018. Conference version in ICML 2019. URL: http://arxiv.org/abs/1803.02815, arXiv:1803.02815.
- [DL19] J. Depersin and G. Lecue. Robust subgaussian estimation of a mean vector in nearly linear time. CoRR, abs/1906.03058, 2019.
- [DLLO16] L. Devroye, M. Lerasle, G. Lugosi, and R. I. Oliveira. Sub-gaussian mean estimators. Ann. Statist., 44(6):2695–2725, 12 2016.
- [HL19] S. B. Hopkins and J. Li. How hard is robust mean estimation? In Conference on Learning Theory, COLT 2019, pages 1649–1682, 2019.
- [Hop18] S. B. Hopkins. Sub-gaussian mean estimation in polynomial time. CoRR, abs/1809.07425, 2018. URL: http://arxiv.org/abs/1809.07425, arXiv:1809.07425.
- [Hub64] P. J. Huber. Robust estimation of a location parameter. Ann. Math. Statist., 35(1):73–101, 03 1964.
- [JVV86] M. Jerrum, L. G. Valiant, and V. V. Vazirani. Random generation of combinatorial structures from a uniform distribution. Theor. Comput. Sci., 43:169–188, 1986.
- [KM15] V. Koltchinskii and S. Mendelson. Bounding the Smallest Singular Value of a Random Matrix Without Concentration. International Mathematics Research Notices, 2015(23):12991–13008, 03 2015.
- [LLVZ19] Z. Lei, K. Luh, P. Venkat, and F. Zhang. A fast spectral algorithm for mean estimation with sub-gaussian rates. CoRR, abs/1908.04468, 2019. URL: http://arxiv.org/abs/1908.04468, arXiv:1908.04468.
- [LM19a] G. Lugosi and S. Mendelson. Mean estimation and regression under heavy-tailed distributions: A survey. Foundations of Computational Mathematics, 19(5):1145–1190, 2019.
- [LM19b] G. Lugosi and S. Mendelson. Robust multivariate mean estimation: the optimality of trimmed mean. CoRR, abs/1907.11391, 2019. URL: http://arxiv.org/abs/1907.11391, arXiv:1907.11391.
- [LM19c] G. Lugosi and S. Mendelson. Sub-gaussian estimators of the mean of a random vector. Ann. Statist., 47(2):783–794, 04 2019.
- [LRV16] K. A. Lai, A. B. Rao, and S. Vempala. Agnostic estimation of mean and covariance. In Proc. 57th IEEE Symposium on Foundations of Computer Science (FOCS), pages 665–674, 2016.
- [LT91] M. Ledoux and M. Talagrand. Probability in Banach Spaces. Springer, 1991.
- [Min15] S. Minsker. Geometric median and robust estimation in Banach spaces. Bernoulli, 21(4):2308–2335, 2015.
- Statistics & Probability Letters, 127:111–119, August 2017. doi:10.1016/j.spl.2017.03.020.
- [NU83] A. S. Nemirovsky and D. B. Udin. Problem complexity and method efficiency in optimization. Wiley, 1983.
- [PBR19] A. Prasad, S. Balakrishnan, and P. Ravikumar. A unified approach to robust mean estimation. CoRR, abs/1907.00927, 2019. URL: http://arxiv.org/abs/1907.00927, arXiv:1907.00927.
- [PSBR18] A. Prasad, A. S. Suggala, S. Balakrishnan, and P. Ravikumar. Robust estimation via robust gradient estimation. arXiv preprint arXiv:1802.06485, 2018.
- [SCV18] J. Steinhardt, M. Charikar, and G. Valiant. Resilience: A criterion for learning in the presence of arbitrary outliers. In Proc. 9th Innovations in Theoretical Computer Science Conference (ITCS), pages 45:1–45:21, 2018.
- [Sio58] M. Sion. On general minimax theorems. Pacific Journal of Mathematics, 8(1):171–176, 1958.
- [Tal96] M. Talagrand. New concentration inequalities in product spaces. Inventiones Mathematicae, 126(3):505–563, November 1996. doi:10.1007/s002220050108.
- [Tro15] J. A. Tropp. An introduction to matrix concentration inequalities. Foundations and Trends in Machine Learning, 8(1-2):1–230, 2015. URL: http://dx.doi.org/10.1561/2200000048, doi:10.1561/2200000048.
- [Tsy08] A. B. Tsybakov. Introduction to Nonparametric Estimation. Springer Publishing Company, Incorporated, 2008.
- [Tuk60] J. W. Tukey. A survey of sampling from contaminated distributions. Contributions to probability and statistics, 2:448–485, 1960.
- [ZJS19] B. Zhu, J. Jiao, and J. Steinhardt. Generalized resilience and robust statistics. CoRR, abs/1909.08755, 2019. URL: http://arxiv.org/abs/1909.08755, arXiv:1909.08755.
