On Learning Ising Models under Huber's Contamination Model

NeurIPS 2020


Abstract

We study the problem of learning Ising models in a setting where some of the samples from the underlying distribution can be arbitrarily corrupted. In such a setup, we aim to design statistically optimal estimators in a high-dimensional scaling in which the number of nodes p, the number of edges k, and the maximal node degree d are allowed to increase to infinity as a function of the sample size n.
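For context, in Huber's contamination model each observation is drawn from the true distribution P with probability 1 − ε and from an arbitrary (possibly adversarial) distribution Q otherwise, so the observed samples follow the mixture (1 − ε)P + εQ. Below is a minimal sketch of drawing such contaminated samples; the specific choices of P, Q, and ε are illustrative assumptions, not the paper's experimental setup:

```python
import numpy as np

def huber_contaminated_samples(sample_p, sample_q, n, eps, rng):
    """Draw n samples from the Huber mixture (1 - eps) * P + eps * Q.

    sample_p and sample_q are callables returning one sample each;
    they stand in for the true and contaminating distributions.
    """
    return np.array([
        sample_q(rng) if rng.random() < eps else sample_p(rng)
        for _ in range(n)
    ])

rng = np.random.default_rng(0)
p = 5  # number of nodes (illustrative)
# Illustrative P: independent Rademacher coordinates (an Ising model with
# no interactions). Illustrative Q: an adversarial all-ones vector.
sample_p = lambda rng: rng.choice([-1, 1], size=p)
sample_q = lambda rng: np.ones(p, dtype=int)

X = huber_contaminated_samples(sample_p, sample_q, n=1000, eps=0.1, rng=rng)
print(X.shape, X.mean(axis=0))  # coordinate means biased toward +1 by Q
```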

Introduction
  • Undirected graphical models (also known as Markov random fields (MRFs)) have gained significant attention as a tool for discovering and visualizing dependencies among covariates in multivariate data.
  • Graphical models provide compact and structured representations of the joint distribution of multiple random variables using graphs that represent conditional independences between the individual random variables.
  • They are used in domains as varied as natural language processing [37], image processing [9, 24, 26], spatial statistics [43] and computational biology [23], among others.
  • An Ising model is a special instantiation of an MRF where each random variable X_s takes values in {−1, +1}, and the joint probability mass function is given by P_θ(x_1, . . . , x_p) ∝ exp(∑_{(s,t)∈E} θ_{st} x_s x_t), where E denotes the edge set of the underlying graph (see the sketch below).
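To make this definition concrete, the brute-force sketch below evaluates the PMF exactly for a toy 3-node chain by summing over all 2^p sign vectors; the interaction matrix θ is an illustrative assumption, and the enumeration only scales to small p:

```python
import itertools
import numpy as np

def ising_pmf(theta):
    """Exact PMF of a zero-field Ising model with interaction matrix theta.

    theta is a symmetric (p, p) array with zero diagonal, and
    P(x) is proportional to exp(sum over s < t of theta[s, t] * x[s] * x[t]).
    """
    p = theta.shape[0]
    states = np.array(list(itertools.product([-1, 1], repeat=p)))
    # x^T theta x counts every edge twice, hence the factor 1/2.
    energies = 0.5 * np.einsum('is,st,it->i', states, theta, states)
    weights = np.exp(energies)
    return states, weights / weights.sum()

# Illustrative 3-node chain with edges (0,1) and (1,2) of weight 0.5.
theta = np.zeros((3, 3))
theta[0, 1] = theta[1, 0] = 0.5
theta[1, 2] = theta[2, 1] = 0.5
states, probs = ising_pmf(theta)
print(probs.sum())  # 1.0: a valid probability distribution
```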
Highlights
  • Undirected graphical models (also known as Markov random fields (MRFs)) have gained significant attention as a tool for discovering and visualizing dependencies among covariates in multivariate data
  • We focus on the specific undirected graphical model sub-class of Ising models [29]
  • An Ising model is a special instantiation of an MRF where each random variable X_s takes values in {−1, +1}, and the joint probability mass function is given by P_θ(x_1, . . . , x_p) ∝ exp(∑_{(s,t)∈E} θ_{st} x_s x_t)
  • We propose the first statistically optimal estimator for sparse logistic regression, and use that to provide estimators for learning Ising models (the classical node-wise baseline is sketched after this list)
  • In this work we provided the first statistically optimal robust estimators for learning Ising models in the high-temperature regime
  • Our estimators achieved optimal asymptotic error in the ε-contamination model, and high-probability deviation bounds in the uncontaminated setting
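For reference, node-wise estimators exploit the fact that under an Ising model the conditional distribution of each node given the rest is logistic in the remaining coordinates, so each neighborhood can be recovered by a sparse logistic regression. The sketch below is the classical uncontaminated baseline of Ravikumar et al. [42], not the authors' robust estimator; the regularization weight lam and the support threshold are illustrative choices:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def nodewise_ising_neighbors(X, lam=0.1):
    """Classical (non-robust) neighborhood selection for an Ising model via
    l1-regularized node-wise logistic regression, in the spirit of [42].

    X: (n, p) array of +/-1 samples. Returns a boolean (p, p) adjacency
    estimate, symmetrized with the OR rule.
    """
    n, p = X.shape
    adj = np.zeros((p, p), dtype=bool)
    for s in range(p):
        y = (X[:, s] == 1).astype(int)   # predict node s ...
        Z = np.delete(X, s, axis=1)      # ... from all the other nodes
        # sklearn's C is the inverse penalty; C = 1/(lam * n) matches a
        # per-sample l1 weight of lam.
        clf = LogisticRegression(penalty='l1', solver='liblinear',
                                 C=1.0 / (lam * n), fit_intercept=False)
        clf.fit(Z, y)
        support = np.flatnonzero(np.abs(clf.coef_[0]) > 1e-6)
        neighbors = [t if t < s else t + 1 for t in support]  # undo column drop
        adj[s, neighbors] = True
    return adj | adj.T

On ε-contaminated data this baseline can be arbitrarily misled, which is precisely the failure mode the paper's robust estimator is designed to avoid.
```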
Results
  • The authors notice that the slope is not drastically affected by ω, which suggests that the constant C(α) appearing in the results is O(1).
  • In Figures 1(c) and 1(f), the authors notice the variation in the slope with increasing model width ω.
  • While the current results study the case when ω < 1, it is interesting to note an increasing trend when ω ≥ 1, suggesting an explicit dependence on ω in the low-temperature regime (a sketch of this slope extraction follows the list)
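The slopes discussed above are rate exponents read off log-log plots of estimation error against the sample size n: if error ≈ C(α)·n^slope, then log(error) is linear in log(n). A minimal sketch of extracting the slope and the constant with np.polyfit, using made-up numbers in place of the paper's measurements:

```python
import numpy as np

# Hypothetical error-vs-sample-size measurements; in the paper these
# would come from the experiments summarized in Figure 1.
n_vals = np.array([500, 1000, 2000, 4000, 8000])
errors = np.array([0.42, 0.30, 0.21, 0.15, 0.11])

# Fit log(error) = slope * log(n) + log(C); polyfit returns the
# coefficients from highest degree down.
slope, log_c = np.polyfit(np.log(n_vals), np.log(errors), deg=1)
print(f"rate exponent: {slope:.2f}, constant: {np.exp(log_c):.2f}")
```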
Conclusion
  • Discussion and Future Work: In this work the authors provided the first statistically optimal robust estimators for learning Ising models in the high-temperature regime.
  • The authors' focus was on designing estimators for the contamination model, i.e., where a fraction of the data is arbitrarily corrupted.
  • Another model of corruption - motivated by sensor networks and distributed computation, where node failures are common - is when only a few features (nodes) get corrupted, and the authors still want to learn the appropriate graph structure for the uncontaminated nodes
Summary
  • Objectives: The authors aim to design statistically optimal estimators in a high-dimensional scaling in which the number of nodes p, the number of edges k, and the maximal node degree d are allowed to increase to infinity as a function of the sample size n.
Related work
  • In this work, we focus on the specific undirected graphical model sub-class of Ising models [29]. There has been a lot of work on learning Ising models in the uncontaminated setting, dating back to the classical work of Chow and Liu [8]. Csiszár and Talata [10] discuss pseudo-likelihood based approaches for estimating the neighborhood at a given node in MRFs. Subsequently, a simple search-based method is described in [6] with provable guarantees. Later, Ravikumar et al. [42] showed that under an incoherence assumption, node-wise (regularized) estimators provably recover the correct dependency graph with a small number of samples. Recently, there has been a flurry of work [5, 30, 36, 47, 49] to get computationally efficient estimators which recover the true graph structure without the incoherence assumption, including extensions to identity and independence testing [12]. However, all the aforementioned results are in the uncontaminated setting. Recently, Lindgren et al. [35] derived preliminary results for learning Ising models robustly. However, their upper and lower bounds do not match. Moreover, their analysis primarily focuses on the robustness of the Sparsitron algorithm in [30], and they do not explore the effect of the underlying graph and correlation structures comprehensively.
Funding
  • AP, VS and PR acknowledge the support of NSF via IIS-1955532, OAC-1934584, DARPA via HR00112020006, and ONR via N000141812861
  • SB and AP acknowledge the support of NSF via DMS-17130003 and CCF-1763734
References
  • Mehmet Eren Ahsen and Mathukumalli Vidyasagar. An approach to one-bit compressed sensing based on probably approximately correct learning theory. The Journal of Machine Learning Research, 20(1):408–430, 2019.
  • DF Andrews, PJ Bickel, FR Hampel, PJ Huber, WH Rogers, and JW Tukey. Robust estimates of location: Survey and advances, 1972.
  • Sivaraman Balakrishnan, Simon S Du, Jerry Li, and Aarti Singh. Computationally efficient robust sparse estimation in high dimensions. In Conference on Learning Theory, pages 169–212, 2017.
  • Julian Besag. Spatial interaction and the statistical analysis of lattice systems. Journal of the Royal Statistical Society: Series B (Methodological), 36(2):192–225, 1974.
  • Guy Bresler. Efficiently learning Ising models on arbitrary graphs. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, pages 771–782, 2015.
  • Guy Bresler, Elchanan Mossel, and Allan Sly. Reconstruction of Markov random fields from samples: Some observations and algorithms. In Approximation, Randomization and Combinatorial Optimization. Algorithms and Techniques, pages 343–356.
  • Moses Charikar, Jacob Steinhardt, and Gregory Valiant. Learning from untrusted data. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pages 47–60. ACM, 2017.
  • C Chow and Cong Liu. Approximating discrete probability distributions with dependence trees. IEEE Transactions on Information Theory, 14(3):462–467, 1968.
  • George R Cross and Anil K Jain. Markov random field texture models. IEEE Transactions on Pattern Analysis and Machine Intelligence, (1):25–39, 1983.
  • Imre Csiszár and Zsolt Talata. Consistent estimation of the basic neighborhood of Markov random fields. The Annals of Statistics, pages 123–145, 2006.
  • Yuval Dagan, Constantinos Daskalakis, Nishanth Dikkala, and Anthimos Vardis Kandiros. Estimating Ising models from one sample. arXiv preprint arXiv:2004.09370, 2020.
  • Constantinos Daskalakis, Nishanth Dikkala, and Gautam Kamath. Testing Ising models. IEEE Transactions on Information Theory, 65(11):6829–6852, 2019.
  • Christopher De Sa, Kunle Olukotun, and Christopher Ré. Ensuring rapid mixing and low bias for asynchronous Gibbs sampling. In JMLR Workshop and Conference Proceedings, volume 48, page 1567. NIH Public Access, 2016.
  • Luc Devroye, Abbas Mehrabian, and Tommy Reddad. The minimax learning rates of normal and Ising undirected graphical models. Electronic Journal of Statistics, 14(1):2338–2361, 2020.
  • Ilias Diakonikolas, Gautam Kamath, Daniel M Kane, Jerry Li, Ankur Moitra, and Alistair Stewart. Robust estimators in high dimensions without the computational intractability. In Foundations of Computer Science (FOCS), 2016 IEEE 57th Annual Symposium on, pages 655–664. IEEE, 2016.
  • Ilias Diakonikolas, Gautam Kamath, Daniel M Kane, Jerry Li, Ankur Moitra, and Alistair Stewart. Being robust (in high dimensions) can be practical. In Proceedings of the 34th International Conference on Machine Learning, pages 999–1008, 2017.
  • Ilias Diakonikolas, Gautam Kamath, Daniel Kane, Jerry Li, Ankur Moitra, and Alistair Stewart. Robust estimators in high dimensions without the computational intractability. SIAM Journal on Computing, 48(2):742–864, 2019.
  • Ilias Diakonikolas, Gautam Kamath, Daniel Kane, Jerry Li, Jacob Steinhardt, and Alistair Stewart. Sever: A robust meta-algorithm for stochastic optimization. In Proceedings of the 36th International Conference on Machine Learning, Proceedings of Machine Learning Research, pages 1596–1606, 2019.
  • PL Dobruschin. The description of a random field by means of conditional probabilities and conditions of its regularity. Theory of Probability & Its Applications, 13(2):197–224, 1968.
  • Roland L Dobrushin and Senya B Shlosman. Completely analytical interactions: constructive description. Journal of Statistical Physics, 46(5-6):983–1014, 1987.
  • David L Donoho and Richard C Liu. The "automatic" robustness of minimum distance functionals. The Annals of Statistics, pages 552–586, 1988.
  • David L Donoho and Richard C Liu. Geometrizing rates of convergence, III. The Annals of Statistics, pages 668–701, 1991.
  • Nir Friedman. Inferring cellular networks using probabilistic graphical models. Science, 303(5659):799–805, 2004.
  • Stuart Geman and Donald Geman. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, (6):721–741, 1984.
  • Friedrich Götze, Holger Sambale, and Arthur Sinulis. Higher order concentration for functions of weakly dependent random variables. Electronic Journal of Probability, 24:19 pp., 2019.
  • Martin Hassner and Jack Sklansky. The use of Markov random fields as models of texture. In Image Modeling, pages 185–198.
  • Peter J Huber. Robust estimation of a location parameter. The Annals of Mathematical Statistics, 35(1):73–101, 1964.
  • Peter J Huber. Robust statistics. In International Encyclopedia of Statistical Science, pages 1248–1251.
  • Ernst Ising. Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik, 31(1):253–258, 1925.
  • Adam Klivans and Raghu Meka. Learning graphical models using multiplicative weights. In 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), pages 343–354. IEEE, 2017.
  • Pravesh K Kothari, Jacob Steinhardt, and David Steurer. Robust moment estimation and improved clustering via sum of squares. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, pages 1035–1046. ACM, 2018.
  • Christof Külske. Concentration inequalities for functions of Gibbs fields with application to diffraction and random Gibbs measures. Communications in Mathematical Physics, 239(1-2):29–51, 2003.
  • H Künsch. Decay of correlations under Dobrushin's uniqueness condition and its applications. Communications in Mathematical Physics, 84(2):207–222, 1982.
  • Kevin A Lai, Anup B Rao, and Santosh Vempala. Agnostic estimation of mean and covariance. In Foundations of Computer Science (FOCS), 2016 IEEE 57th Annual Symposium on, pages 665–674. IEEE, 2016.
  • Erik M Lindgren, Vatsal Shah, Yanyao Shen, Alexandros G Dimakis, and Adam Klivans. On robust learning of Ising models. NeurIPS Workshop on Relational Representation Learning, 2019.
  • Andrey Y Lokhov, Marc Vuffray, Sidhant Misra, and Michael Chertkov. Optimal structure and parameter learning of Ising models. Science Advances, 4(3):e1700791, 2018.
  • Christopher D Manning and Hinrich Schütze. Foundations of statistical natural language processing. MIT Press, 1999.
  • Pascal Massart. Concentration inequalities and model selection, volume 6.
  • Nicolai Meinshausen and Peter Bühlmann. High-dimensional graphs and variable selection with the lasso. The Annals of Statistics, 34(3):1436–1462, 2006.
  • Adarsh Prasad, Sivaraman Balakrishnan, and Pradeep Ravikumar. A unified approach to robust mean estimation. arXiv preprint arXiv:1907.00927, 2019.
  • Adarsh Prasad, Arun Sai Suggala, Sivaraman Balakrishnan, and Pradeep Ravikumar. Robust estimation via robust gradient estimation. Journal of the Royal Statistical Society: Series B, 82(3):601–627, 2020.
  • Pradeep Ravikumar, Martin J Wainwright, and John D Lafferty. High-dimensional Ising model selection using ℓ1-regularized logistic regression. The Annals of Statistics, 38(3):1287–1319, 2010.
  • Brian D Ripley. Spatial statistics, volume 575. John Wiley & Sons, 2005.
  • Adam J Rothman, Peter J Bickel, Elizaveta Levina, and Ji Zhu. Sparse permutation invariant covariance estimation. Electronic Journal of Statistics, 2:494–515, 2008.
  • Narayana P Santhanam and Martin J Wainwright. Information-theoretic limits of selecting binary graphical models in high dimensions. IEEE Transactions on Information Theory, 58(7):4117–4134, 2012.
  • Daniel W Stroock and Boguslaw Zegarlinski. The logarithmic Sobolev inequality for discrete spin systems on a lattice. Communications in Mathematical Physics, 149(1):175–193, 1992.
  • Marc Vuffray, Sidhant Misra, Andrey Lokhov, and Michael Chertkov. Interaction screening: Efficient and sample-optimal learning of Ising models. In Advances in Neural Information Processing Systems, pages 2595–2603, 2016.
  • Martin J Wainwright. High-dimensional statistics: A non-asymptotic viewpoint. Cambridge University Press, 2019.
  • Shanshan Wu, Sujay Sanghavi, and Alexandros G Dimakis. Sparse logistic regression learns all discrete pairwise graphical models. In Advances in Neural Information Processing Systems, pages 8069–8079, 2019.
  • Yannis G Yatracos. Rates of convergence of minimum distance estimators and Kolmogorov's entropy. The Annals of Statistics, pages 768–774, 1985.