Causal Discovery from Soft Interventions with Unknown Targets: Characterization and Learning

NIPS 2020, 2020.


Abstract:

One fundamental problem in the empirical sciences is that of reconstructing the causal structure that underlies a phenomenon of interest through observation and experimentation. While there exists a plethora of methods capable of learning the equivalence class of causal structures that are compatible with observations, it is less well-understo...

Introduction
  • Learning cause-and-effect relationships is one of the fundamental problems for various fields, including biology [28, 6], epidemiology [26], and economics [12].
  • The authors formulate a graphical characterization to test whether two pairs of causal graphs and their corresponding interventional target sets, ⟨D1, I1⟩ and ⟨D2, I2⟩, are in the same Ψ-Markov equivalence class, i.e., they are indistinguishable with respect to the available datasets.
Highlights
  • Learning cause-and-effect relationships is one of the fundamental problems for various fields, including biology [28, 6], epidemiology [26], and economics [12]
  • Assuming a tuple of distributions P = ⟨P_i⟩, i = 1, ..., m, is generated by the same system, i.e., a causal graph with latents, we define a property called Ψ-Markov that connects P to a pair consisting of (1) a causal graph D and (2) a set of interventional targets I. Building on this property in Section 3, we provide a graphical characterization that allows one to test whether two causal graphs with possibly different sets of intervention targets belong to the same Ψ-Markov equivalence class
  • We investigated the problem of learning causal graphs with latent variables from a mixture of observational and interventional distributions with unknown interventional targets
  • We started by defining the Ψ-Markov property that connects a tuple of distributions with unknown targets to a pair consisting of a causal graph D and a corresponding possible interventional target set I
  • Two pairs ⟨D1, I1⟩ and ⟨D2, I2⟩ are said to be Ψ-Markov equivalent if they license the same tuples of distributions. Based on this refined equivalence relation, we derived a graphical characterization to evaluate whether two pairs are in the same Ψ-Markov equivalence class
  • We developed a sound and complete algorithm that recovers a Ψ-Markov equivalence class given a tuple of distributions
Results
  • Given the causal graphs D1 = (V ∪ L1, E1) and D2 = (V ∪ L2, E2), and the corresponding interventional targets I1, I2, the pairs ⟨D1, I1⟩ and ⟨D2, I2⟩ are said to be Ψ-Markov equivalent if Ψ_{I1}(D1) = Ψ_{I2}(D2).
  • In other words, two pairs of causal graphs and their corresponding sets of interventional targets, ⟨D1, I1⟩ and ⟨D2, I2⟩, are Ψ-Markov equivalent if they can induce the same set of distribution tuples.
  • The authors derive a graphical characterization for two causal graphs with corresponding sets of interventional targets to be Ψ-Markov equivalent.
  • Given causal graphs D1 = (V ∪ L1, E1), D2 = (V ∪ L2, E2) and corresponding sets of interventional targets I1, I2, the pairs ⟨D1, I1⟩ and ⟨D2, I2⟩ are Ψ-Markov equivalent if and only if the conditions listed in Thm. 1 hold for M1 = MAG(Aug_{I1}(D1)) and M2 = MAG(Aug_{I2}(D2)).
  • Given causal graphs without latents, D1 = (V, E1), D2 = (V, E2), and the corresponding interventional targets I1, I2, the pairs ⟨D1, I1⟩ and ⟨D2, I2⟩ are Ψ-Markov equivalent if and only if Aug_{I1}(D1) and Aug_{I2}(D2) have (1) the same skeleton and (2) the same unshielded colliders.
  • The authors investigate the problem of how to learn the Ψ-Markov equivalence class (Def. 3) from a tuple of interventional distributions generated by some unknown pair ⟨D, I⟩. The characterization provided in Thm. 1, together with PAGs, motivates the definition of a Ψ-PAG.
  • Consider a tuple of distributions ⟨P1, P2⟩ and let the pair ⟨D, I⟩ in Fig. 1a be the true and unknown causal graph and set of corresponding interventional targets.
  • Assuming the tuple P is generated by an unknown pair ⟨D, I⟩, Ψ-FCI is sound in the sample limit, i.e., MAG(Aug_I(D)) has the same skeleton as P_{Ψ-FCI}, the Ψ-PAG learned by Ψ-FCI, and shares all its tail and arrowhead orientations.
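The no-latents characterization above (same skeleton and same unshielded colliders in the two augmented graphs) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. DAGs are assumed to be given as sets of (parent, child) pairs, and all function names are hypothetical. The `augment` helper in particular is an assumption: this summary does not define Aug_I(D), so the sketch adds one auxiliary F node per pair of distributions, pointing into the symmetric difference of their target sets.

```python
from itertools import combinations


def augment(edges, targets):
    """Hypothetical helper: build an augmented graph from a DAG and a tuple of targets.

    ASSUMPTION (not stated in the summary): one auxiliary node ("F", i, j) is added
    per pair of distributions i < j, with an edge into every variable that lies in
    the symmetric difference of their target sets.
    """
    aug = set(edges)
    for (i, t_i), (j, t_j) in combinations(enumerate(targets), 2):
        for v in set(t_i) ^ set(t_j):
            aug.add((("F", i, j), v))
    return aug


def skeleton(edges):
    """Undirected adjacencies of a directed edge set."""
    return {frozenset(e) for e in edges}


def unshielded_colliders(edges):
    """All triples a -> c <- b where a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for a, b in edges:
        parents.setdefault(b, set()).add(a)
    found = set()
    for child, pa in parents.items():
        for a, b in combinations(pa, 2):
            if frozenset((a, b)) not in skel:
                found.add((frozenset((a, b)), child))
    return found


def equivalent_without_latents(edges1, targets1, edges2, targets2):
    """Compare two (DAG, targets) pairs via their augmented graphs."""
    a1 = augment(edges1, targets1)
    a2 = augment(edges2, targets2)
    return (skeleton(a1) == skeleton(a2)
            and unshielded_colliders(a1) == unshielded_colliders(a2))


# Observational data only: X -> Y and Y -> X cannot be told apart.
print(equivalent_without_latents({("X", "Y")}, [set()], {("Y", "X")}, [set()]))

# Adding a distribution whose target is X: F -> X <- Y is an unshielded
# collider in the second augmented graph but not in the first.
print(equivalent_without_latents({("X", "Y")}, [set(), {"X"}],
                                 {("Y", "X")}, [set(), {"X"}]))
```

Under the assumed construction, the first call reports the two pairs as equivalent and the second reports them as distinguishable, which matches the intuition that an intervention touching X reveals the orientation of the X, Y edge.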
Conclusion
  • Assuming the tuple P is generated by an unknown pair ⟨D, I⟩, Ψ-FCI is complete, i.e., the Ψ-PAG P learned by Ψ-FCI contains all the edge marks common to the Ψ-Markov equivalence class.
  • The authors started by defining the Ψ-Markov property that connects a tuple of distributions with unknown targets to a pair consisting of a causal graph D and a corresponding possible interventional target set I.
Funding
  • Bareinboim and Jaber are supported in part by grants from NSF IIS-1704352 and IIS-1750807 (CAREER)
  • Kocaoglu and Shanmugam are supported by the MIT-IBM Watson AI Lab
References
  • Elias Bareinboim, Carlos Brito, and Judea Pearl. Local characterizations of causal bayesian networks. In Graph Structures for Knowledge Representation and Reasoning (IJCAI), pages 1–17. Springer Berlin Heidelberg, 2012.
  • N. Cartwright. Hunting Causes and Using Them: Approaches in Philosophy and Economics. Cambridge University Press, 2007.
  • A Philip Dawid. Influence diagrams for causal modelling and inference. International Statistical Review, 70:161–189, 2002.
  • Daniel Eaton and Kevin Murphy. Exact bayesian structure learning from uncertain interventions. In Artificial intelligence and statistics, pages 107–114, 2007.
  • Frederick Eberhardt, Clark Glymour, and Richard Scheines. On the number of experiments sufficient and in the worst case necessary to identify all causal relations among n variables. In Proceedings of the 21st Conference on Uncertainty in Artificial Intelligence (UAI), pages 178–184, 2005.
  • Kyle Kai-How Farh, Alexander Marson, Jiang Zhu, Markus Kleinewietfeld, William J Housley, Samantha Beik, Noam Shoresh, Holly Whitton, Russell JH Ryan, Alexander A Shishkin, et al. Genetic and epigenetic fine mapping of causal autoimmune disease variants. Nature, 518(7539):337–343, 2015.
  • Dan Geiger, Thomas Verma, and Judea Pearl. d-separation: From theorems to algorithms. In Machine Intelligence and Pattern Recognition, volume 10, pages 139–148.
  • AmirEmad Ghassami, Saber Salehkaleybar, Negar Kiyavash, and Elias Bareinboim. Budgeted experiment design for causal structure learning. In Jennifer Dy and Andreas Krause, editors, Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 1724–1733. PMLR, 2018.
  • Alain Hauser and Peter Bühlmann. Characterization and greedy learning of interventional Markov equivalence classes of directed acyclic graphs. Journal of Machine Learning Research, 13(1):2409–2464, 2012.
  • Alain Hauser and Peter Bühlmann. Two optimal strategies for active learning of causal networks from interventional data. In Proceedings of Sixth European Workshop on Probabilistic Graphical Models, 2012.
  • Patrik O Hoyer, Dominik Janzing, Joris Mooij, Jonas Peters, and Bernhard Schölkopf. Nonlinear causal discovery with additive noise models. In Proceedings of NIPS 2008, 2008.
  • Paul Hünermund and Elias Bareinboim. Causal inference and data-fusion in econometrics. arXiv preprint arXiv:1912.09104, 2019.
  • Amin Jaber, Murat Kocaoglu, Karthikeyan Shanmugam, and Elias Bareinboim. Causal discovery from soft interventions with unknown targets: Characterization and learning. Technical report, R-67, Columbia CausalAI Lab, Department of Computer Science, Columbia University, 2020.
  • Dominik Janzing, Joris Mooij, Kun Zhang, Jan Lemeire, Jakob Zscheischler, Povilas Daniušis, Bastian Steudel, and Bernhard Schölkopf. Information-geometric approach to inferring causal directions. Artificial Intelligence, 182-183:1–31, 2012.
  • Nan Rosemary Ke, Olexa Bilaniuk, Anirudh Goyal, Stefan Bauer, Hugo Larochelle, Chris Pal, and Yoshua Bengio. Learning neural causal models from unknown interventions. arXiv preprint arXiv:1910.01075, 2019.
  • Murat Kocaoglu, Alexandros G. Dimakis, Sriram Vishwanath, and Babak Hassibi. Entropic causal inference. In AAAI’17, 2017.
  • Murat Kocaoglu, Amin Jaber, Karthikeyan Shanmugam, and Elias Bareinboim. Characterization and learning of causal graphs with latent variables from soft interventions. In Advances in Neural Information Processing Systems, pages 14346–14356, 2019.
  • Murat Kocaoglu, Karthikeyan Shanmugam, and Elias Bareinboim. Experimental design for learning causal graphs with latent variables. In Advances in Neural Information Processing Systems, pages 7018–7028, 2017.
  • Christopher Meek. Causal inference and causal explanation with background knowledge. In Proceedings of the eleventh conference on uncertainty in artificial intelligence, 1995.
  • Joris M Mooij, Sara Magliacane, and Tom Claassen. Joint causal inference from multiple contexts. arXiv preprint arXiv:1611.10351, 2016.
  • J. Pearl. Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann, San Mateo, CA, 1988.
  • J. Pearl. Causality: Models, Reasoning, and Inference. Cambridge University Press, New York, 2000. 2nd edition, 2009.
  • Jonas Peters, Peter Bühlmann, and Nicolai Meinshausen. Causal inference by using invariant prediction: identification and confidence intervals. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 78(5):947–1012, 2016.
  • Jonas Peters, Dominik Janzing, and Bernhard Schölkopf. Elements of causal inference: foundations and learning algorithms. MIT press, 2017.
  • Thomas Richardson and Peter Spirtes. Ancestral graph Markov models. The Annals of Statistics, 30(4):962–1030, 2002.
  • James M Robins, Miguel Angel Hernan, and Babette Brumback. Marginal structural models and causal inference in epidemiology, 2000.
  • Dominik Rothenhäusler, Christina Heinze, Jonas Peters, and Nicolai Meinshausen. Backshift: Learning causal cyclic graphs from unknown shift interventions. In Advances in Neural Information Processing Systems, pages 1513–1521, 2015.
  • Karen Sachs, Omar Perez, Dana Pe’er, Douglas A Lauffenburger, and Garry P Nolan. Causal protein-signaling networks derived from multiparameter single-cell data. Science, 308(5721):523–529, 2005.
  • Peter Spirtes, Clark Glymour, and Richard Scheines. Causation, Prediction, and Search. A Bradford Book, 2001.
  • Chandler Squires, Yuhao Wang, and Caroline Uhler. Permutation-based causal structure learning with unknown intervention targets. arXiv preprint arXiv:1910.09007, 2019.
  • Jin Tian and Judea Pearl. Causal discovery from changes. In Proceedings of UAI2013, 2013.
  • Thomas Verma and Judea Pearl. An algorithm for deciding if a set of observed independencies has a causal explanation. In Proceedings of the Eighth international conference on uncertainty in artificial intelligence, 1992.
  • J. Woodward, J.F. Woodward, and Oxford University Press. Making Things Happen: A Theory of Causal Explanation. Oxford scholarship online. Oxford University Press, 2003.
  • Karren Yang, Abigail Katcoff, and Caroline Uhler. Characterizing and learning equivalence classes of causal DAGs under interventions. In Proceedings of the 35th International Conference on Machine Learning, volume 80, pages 5541–5550. PMLR, 2018.
  • Jiji Zhang. Causal inference and reasoning in causally insufficient systems. PhD thesis, Department of Philosophy, Carnegie Mellon University, 2006.
  • Jiji Zhang. Causal reasoning with ancestral graphs. Journal of Machine Learning Research, 9(Jul):1437–1474, 2008.
  • Jiji Zhang. On the completeness of orientation rules for causal discovery in the presence of latent confounders and selection bias. Artificial Intelligence, 172(16):1873–1896, 2008.
  • Kun Zhang, Biwei Huang, Jiji Zhang, Clark Glymour, and Bernhard Schölkopf. Causal discovery from nonstationary/heterogeneous data: Skeleton estimation and orientation determination. In IJCAI: Proceedings of the Conference, volume 2017, page 1347, 2017.