# Causal Discovery from Soft Interventions with Unknown Targets: Characterization and Learning

NeurIPS 2020.

Abstract:

One fundamental problem in the empirical sciences is that of reconstructing the causal structure that underlies a phenomenon of interest through observation and experimentation. While there exists a plethora of methods capable of learning the equivalence class of causal structures that are compatible with observations, it is less well-understo...

Introduction

- Learning cause-and-effect relationships is one of the fundamental problems for various fields, including biology [28, 6], epidemiology [26], and economics [12].
- The authors formulate a graphical characterization to test whether two pairs of causal graphs and their corresponding interventional target sets, ⟨D1, I1⟩ and ⟨D2, I2⟩, are in the same Ψ-Markov equivalence class, i.e., they are indistinguishable with respect to the available datasets.

Highlights

- Assuming a tuple of distributions P = ⟨P_i⟩_{i=1}^{m} is generated by the same system, i.e., a causal graph with latents, we define a property called Ψ-Markov that connects P to a pair consisting of (1) a causal graph D and (2) a set of interventional targets I. Building on this property in Section 3, we provide a graphical characterization that allows one to test whether two causal graphs with possibly different sets of intervention targets belong to the same Ψ-Markov equivalence class
- We investigated the problem of learning causal graphs with latent variables from a mixture of observational and interventional distributions with unknown interventional targets
- We started by defining the Ψ-Markov property that connects a tuple of distributions with unknown targets to a pair consisting of a causal graph D and a corresponding possible interventional target set I
- Two pairs ⟨D1, I1⟩ and ⟨D2, I2⟩ are said to be Ψ-Markov equivalent if they license the same tuples of distributions. Based on this refined equivalence relation, we derived a graphical characterization to evaluate whether two pairs are in the same Ψ-Markov equivalence class
- We developed a sound and complete algorithm that recovers a Ψ-Markov equivalence class given a tuple of distributions

Results

- Given the causal graphs D1 = (V ∪ L1, E1) and D2 = (V ∪ L2, E2), and the corresponding interventional targets I1, I2, the pairs ⟨D1, I1⟩ and ⟨D2, I2⟩ are said to be Ψ-Markov equivalent if Ψ_{I1}(D1) = Ψ_{I2}(D2).
- Two pairs of causal graphs and their corresponding sets of interventional targets ⟨D1, I1⟩ and ⟨D2, I2⟩ are Ψ-Markov equivalent if they can induce the same set of distribution tuples.
- The authors derive a graphical characterization for two causal graphs with corresponding sets of interventional targets to be Ψ-Markov equivalent.
- Given causal graphs D1 = (V ∪ L1, E1), D2 = (V ∪ L2, E2) and corresponding sets of interventional targets I1, I2, the pairs ⟨D1, I1⟩ and ⟨D2, I2⟩ are Ψ-Markov equivalent if and only if M1 = MAG(Aug_{I1}(D1)) and M2 = MAG(Aug_{I2}(D2)) satisfy a set of graphical conditions (Thm. 1).
- Given causal graphs without latents, D1 = (V, E1), D2 = (V, E2), and the corresponding interventional targets I1, I2, the pairs ⟨D1, I1⟩ and ⟨D2, I2⟩ are Ψ-Markov equivalent if and only if Aug_{I1}(D1) and Aug_{I2}(D2) have (1) the same skeleton and (2) the same unshielded colliders.
- The authors investigate the problem of how to learn the Ψ-Markov equivalence class (Def. 3) from a tuple of interventional distributions generated by some unknown pair ⟨D, I⟩. The characterization provided in Thm. 1, together with PAGs, motivates the definition of the Ψ-PAG.
- Consider a tuple of distributions ⟨P1, P2⟩ and let the pair ⟨D, I⟩ in Fig. 1a be the true and unknown causal graph and set of corresponding interventional targets.
- Assuming tuple P is generated by unknown pair ⟨D, I⟩, Ψ-FCI is sound in the sample limit, i.e., MAG(Aug_I(D)) has the same skeleton as P_{Ψ-FCI}, the Ψ-PAG learned by Ψ-FCI, and shares all its tail and arrowhead orientations.
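
The no-latent characterization above (same skeleton and same unshielded colliders of the augmented graphs) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the summary does not define the augmented graph, so the sketch assumes it adds one auxiliary node F per pair of target sets, with edges from F to every variable in the symmetric difference of that pair. The function names and the edge-set representation are hypothetical.

```python
from itertools import combinations


def augment(edges, targets):
    """Edge set of an assumed augmented DAG.

    edges: set of (parent, child) pairs over observed variables.
    targets: list of sets, one interventional target set per distribution.
    Assumption: one auxiliary node ("F", i, j) per pair of target sets,
    pointing to each variable in their symmetric difference.
    """
    aug = set(edges)
    for i, j in combinations(range(len(targets)), 2):
        f_node = ("F", i, j)
        for v in targets[i] ^ targets[j]:
            aug.add((f_node, v))
    return aug


def skeleton(edges):
    """Undirected adjacencies of a directed edge set."""
    return {frozenset(e) for e in edges}


def unshielded_colliders(edges):
    """Triples a -> c <- b in which a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for p, c in edges:
        parents.setdefault(c, set()).add(p)
    found = set()
    for c, ps in parents.items():
        for a, b in combinations(ps, 2):
            if frozenset((a, b)) not in skel:
                found.add((frozenset((a, b)), c))
    return found


def equivalent_no_latents(edges1, targets1, edges2, targets2):
    """Compare the two augmented DAGs on skeleton and unshielded colliders."""
    a1, a2 = augment(edges1, targets1), augment(edges2, targets2)
    return (skeleton(a1) == skeleton(a2)
            and unshielded_colliders(a1) == unshielded_colliders(a2))


# Observational data plus one intervention, two candidate explanations.
x_to_y = {("X", "Y")}
y_to_x = {("Y", "X")}
both = equivalent_no_latents(x_to_y, [set(), {"X", "Y"}],
                             y_to_x, [set(), {"X", "Y"}])
single = equivalent_no_latents(x_to_y, [set(), {"Y"}],
                               y_to_x, [set(), {"Y"}])
```

In this toy case, intervening on both X and Y leaves the two directions indistinguishable under the check, whereas intervening on Y alone creates an unshielded collider at Y in the first augmented graph only, so the check separates the two pairs.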

Conclusion

- Assuming tuple P is generated by unknown pair ⟨D, I⟩, Ψ-FCI is complete, i.e., the Ψ-PAG it learns contains all the common edge marks in the Ψ-Markov equivalence class.
- The authors started by defining the Ψ-Markov property that connects a tuple of distributions with unknown targets to a pair consisting of a causal graph D and a corresponding possible interventional target set I.

Funding

- Bareinboim and Jaber are supported in part by grants from NSF IIS-1704352 and IIS-1750807 (CAREER).
- Kocaoglu and Shanmugam are supported by the MIT-IBM Watson AI Lab.

Reference

- Elias Bareinboim, Carlos Brito, and Judea Pearl. Local characterizations of causal bayesian networks. In Graph Structures for Knowledge Representation and Reasoning (IJCAI), pages 1–17. Springer Berlin Heidelberg, 2012.
- N. Cartwright. Hunting Causes and Using Them: Approaches in Philosophy and Economics. Cambridge University Press, 2007.
- A Philip Dawid. Influence diagrams for causal modelling and inference. International Statistical Review, 70:161–189, 2002.
- Daniel Eaton and Kevin Murphy. Exact bayesian structure learning from uncertain interventions. In Artificial intelligence and statistics, pages 107–114, 2007.
- Frederick Eberhardt, Clark Glymour, and Richard Scheines. On the number of experiments sufficient and in the worst case necessary to identify all causal relations among n variables. In Proceedings of the 21st Conference on Uncertainty in Artificial Intelligence (UAI), pages 178–184, 2005.
- Kyle Kai-How Farh, Alexander Marson, Jiang Zhu, Markus Kleinewietfeld, William J Housley, Samantha Beik, Noam Shoresh, Holly Whitton, Russell JH Ryan, Alexander A Shishkin, et al. Genetic and epigenetic fine mapping of causal autoimmune disease variants. Nature, 518(7539):337–343, 2015.
- Dan Geiger, Thomas Verma, and Judea Pearl. d-separation: From theorems to algorithms. In Machine Intelligence and Pattern Recognition, volume 10, pages 139–148.
- AmirEmad Ghassami, Saber Salehkaleybar, Negar Kiyavash, and Elias Bareinboim. Budgeted experiment design for causal structure learning. In Jennifer Dy and Andreas Krause, editors, Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 1724–1733. PMLR, 2018.
- Alain Hauser and Peter Bühlmann. Characterization and greedy learning of interventional Markov equivalence classes of directed acyclic graphs. Journal of Machine Learning Research, 13(1):2409–2464, 2012.
- Alain Hauser and Peter Bühlmann. Two optimal strategies for active learning of causal networks from interventional data. In Proceedings of Sixth European Workshop on Probabilistic Graphical Models, 2012.
- Patrik O Hoyer, Dominik Janzing, Joris Mooij, Jonas Peters, and Bernhard Schölkopf. Nonlinear causal discovery with additive noise models. In Proceedings of NIPS 2008, 2008.
- Paul Hünermund and Elias Bareinboim. Causal inference and data-fusion in econometrics. arXiv preprint arXiv:1912.09104, 2019.
- Amin Jaber, Murat Kocaoglu, Karthikeyan Shanmugam, and Elias Bareinboim. Causal discovery from soft interventions with unknown targets: Characterization and learning. Technical report, R-67, Columbia CausalAI Lab, Department of Computer Science, Columbia University, 2020.
- Dominik Janzing, Joris Mooij, Kun Zhang, Jan Lemeire, Jakob Zscheischler, Povilas Daniušis, Bastian Steudel, and Bernhard Schölkopf. Information-geometric approach to inferring causal directions. Artificial Intelligence, 182-183:1–31, 2012.
- Nan Rosemary Ke, Olexa Bilaniuk, Anirudh Goyal, Stefan Bauer, Hugo Larochelle, Chris Pal, and Yoshua Bengio. Learning neural causal models from unknown interventions. arXiv preprint arXiv:1910.01075, 2019.
- Murat Kocaoglu, Alexandros G. Dimakis, Sriram Vishwanath, and Babak Hassibi. Entropic causal inference. In AAAI’17, 2017.
- Murat Kocaoglu, Amin Jaber, Karthikeyan Shanmugam, and Elias Bareinboim. Characterization and learning of causal graphs with latent variables from soft interventions. In Advances in Neural Information Processing Systems, pages 14346–14356, 2019.
- Murat Kocaoglu, Karthikeyan Shanmugam, and Elias Bareinboim. Experimental design for learning causal graphs with latent variables. In Advances in Neural Information Processing Systems, pages 7018–7028, 2017.
- Christopher Meek. Causal inference and causal explanation with background knowledge. In Proceedings of the eleventh conference on uncertainty in artificial intelligence, 1995.
- Joris M Mooij, Sara Magliacane, and Tom Claassen. Joint causal inference from multiple contexts. arXiv preprint arXiv:1611.10351, 2016.
- J. Pearl. Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann, San Mateo, CA, 1988.
- J. Pearl. Causality: Models, Reasoning, and Inference. Cambridge University Press, New York, 2000. 2nd edition, 2009.
- Jonas Peters, Peter Bühlmann, and Nicolai Meinshausen. Causal inference by using invariant prediction: identification and confidence intervals. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 78(5):947–1012, 2016.
- Jonas Peters, Dominik Janzing, and Bernhard Schölkopf. Elements of causal inference: foundations and learning algorithms. MIT press, 2017.
- Thomas Richardson and Peter Spirtes. Ancestral graph Markov models. The Annals of Statistics, 30(4):962–1030, 2002.
- James M Robins, Miguel Angel Hernan, and Babette Brumback. Marginal structural models and causal inference in epidemiology, 2000.
- Dominik Rothenhäusler, Christina Heinze, Jonas Peters, and Nicolai Meinshausen. Backshift: Learning causal cyclic graphs from unknown shift interventions. In Advances in Neural Information Processing Systems, pages 1513–1521, 2015.
- Karen Sachs, Omar Perez, Dana Pe’er, Douglas A Lauffenburger, and Garry P Nolan. Causal protein-signaling networks derived from multiparameter single-cell data. Science, 308(5721):523–529, 2005.
- Peter Spirtes, Clark Glymour, and Richard Scheines. Causation, Prediction, and Search. A Bradford Book, 2001.
- Chandler Squires, Yuhao Wang, and Caroline Uhler. Permutation-based causal structure learning with unknown intervention targets. arXiv preprint arXiv:1910.09007, 2019.
- Jin Tian and Judea Pearl. Causal discovery from changes. In Proceedings of UAI2013, 2013.
- Thomas Verma and Judea Pearl. An algorithm for deciding if a set of observed independencies has a causal explanation. In Proceedings of the Eighth international conference on uncertainty in artificial intelligence, 1992.
- J. Woodward. Making Things Happen: A Theory of Causal Explanation. Oxford University Press, 2003.
- Karren Yang, Abigail Katcoff, and Caroline Uhler. Characterizing and learning equivalence classes of causal DAGs under interventions. In Proceedings of the 35th International Conference on Machine Learning, volume 80, pages 5541–5550. PMLR, 2018.
- Jiji Zhang. Causal inference and reasoning in causally insufficient systems. PhD thesis, Department of Philosophy, Carnegie Mellon University, 2006.
- Jiji Zhang. Causal reasoning with ancestral graphs. Journal of Machine Learning Research, 9(Jul):1437–1474, 2008.
- Jiji Zhang. On the completeness of orientation rules for causal discovery in the presence of latent confounders and selection bias. Artificial Intelligence, 172(16):1873–1896, 2008.
- Kun Zhang, Biwei Huang, Jiji Zhang, Clark Glymour, and Bernhard Schölkopf. Causal discovery from nonstationary/heterogeneous data: Skeleton estimation and orientation determination. In IJCAI: Proceedings of the Conference, volume 2017, page 1347, 2017.
