Recovering from Selection Bias in Causal and Statistical Inference

AAAI, pp. 2410–2416, 2014.


Abstract:

Selection bias is caused by preferential exclusion of units from the samples and represents a major obstacle to valid causal and statistical inferences; it cannot be removed by randomized experiments and can rarely be detected in either experimental or observational studies. In this paper, we provide complete graphical and algorithmic conditions for recoverability from selection bias in statistical and causal inferences, applicable to arbitrary structures in non-parametric settings.

Introduction
  • Selection bias is induced by preferential selection of units for data analysis, usually governed by unknown factors including treatment, outcome, and their consequences, and represents a major obstacle to valid causal and statistical inferences
  • It cannot be removed by randomized experiments and can rarely be detected in either experimental or observational studies.
  • When both action and outcome affect entry into the data pool, the setting will be shown not to be recoverable, i.e., no method is capable of unbiasedly estimating the population-level distribution from data gathered under this selection process.
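The non-recoverability claim can be illustrated with a small simulation (an illustrative sketch, not code from the paper; the model and selection probabilities below are made up): when the selection indicator depends on both treatment X and outcome Y, the conditional P(y | x) estimated from the selected sample stays biased no matter how many units are drawn.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical population model: P(Y=1 | X=1) = 0.7 and P(Y=1 | X=0) = 0.3.
x = rng.binomial(1, 0.5, n)
y = rng.binomial(1, np.where(x == 1, 0.7, 0.3))

# Selection depends on BOTH action and outcome: units with X=1 and Y=1
# enter the data pool with probability 1.0, all others with 0.1.
s = rng.binomial(1, np.where((x == 1) & (y == 1), 1.0, 0.1)).astype(bool)

pop = y[x == 1].mean()        # population P(Y=1 | X=1), close to 0.7
sel = y[s & (x == 1)].mean()  # same quantity estimated under selection, near 0.96
print(f"population: {pop:.2f}  under selection: {sel:.2f}")
```

The gap between the two estimates does not shrink with sample size; it is a property of the selection mechanism, not of sampling noise.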
Highlights
  • We provide conditions for recoverability from selection bias in statistical and causal inferences applicable for arbitrary structures in non-parametric settings
  • Theorem 1 provides a complete characterization of recoverability when no external information is available
  • Theorem 5 further gives a graphical condition for recovering causal effects, which generalizes the backdoor adjustment
  • Since selection bias is a common problem across many disciplines, the methods developed in this paper should help to understand, formalize, and alleviate this problem in a broad range of data-intensive applications
  • This paper complements another aspect of the generalization problem in which causal effects are transported among differing environments (Bareinboim and Pearl 2013a; 2013b)
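The generalization of the backdoor adjustment mentioned in Theorem 5 can be sketched as follows (paraphrased; it assumes a set Z of variables satisfying the paper's selection-backdoor criterion and unbiased external measurements of P(z)):

```latex
% Selection-backdoor adjustment (Theorem 5, paraphrased): if Z satisfies
% the selection-backdoor criterion relative to (X, Y), then
P(y \mid \mathrm{do}(x)) \;=\; \sum_{z} P(y \mid x, z, S = 1)\, P(z)
% The first factor is estimable from the selection-biased data (S = 1);
% the prior P(z) must come from unbiased external data over Z.
```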
Related work
  • There are three sets of assumptions that are enlightening to acknowledge if we want to understand the procedures available in the literature for treating selection bias: qualitative assumptions about the selection mechanism, parametric assumptions regarding the data-generating model, and quantitative assumptions about the selection process.

    In the data-generating model in Fig. 1(c), the selection of units into the sample is treatment-dependent: it is caused by X but not by Y. This case has been studied in the literature, and Q = P(y|x) is known to be non-parametrically recoverable from selection (Greenland and Pearl 2011). Alternatively, in the data-generating model in Fig. 1(d), the selection is caused by Y (outcome-dependent); here Q is not recoverable from selection (as formally shown later on), but the odds ratio is (Cornfield 1951; Whittemore 1978; Geng 1992; Didelez, Kreiner, and Keiding 2010). As mentioned earlier, Q is also not recoverable in the graph in Fig. 1(a).

    By and large, the literature is concerned with treatment-dependent or outcome-dependent selection, but selection might be caused by multiple factors and embedded in more intricate realities. For instance, a driver Z of the treatment (e.g., age, sex, socio-economic status) may also cause selection; see Fig. 1(e,f). As it turns out, Q is recoverable in Fig. 1(e) but not in Fig. 1(f), so the different qualitative assumptions need to be modelled explicitly, since each topology entails a different answer for recoverability.
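The contrast between the treatment-dependent case of Fig. 1(c) and the outcome-dependent case of Fig. 1(d) can be checked numerically (an illustrative sketch with made-up probabilities, not code from the paper): when selection depends only on X, conditioning on X separates Y from the selection indicator S, so the selected-sample estimate of P(y | x) recovers the population value; when selection depends on Y, the same estimator is biased.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000

x = rng.binomial(1, 0.5, n)
y = rng.binomial(1, np.where(x == 1, 0.7, 0.3))  # population P(Y=1 | X=1) = 0.7

def p_y1_given_x1(selected):
    """P(Y=1 | X=1) estimated from the units that entered the sample."""
    keep = selected & (x == 1)
    return y[keep].mean()

# Fig. 1(c): treatment-dependent selection (S caused by X). Since Y is
# independent of S given X, Q = P(y | x) is recoverable: the estimate
# from the selected sample stays close to 0.70.
s_treatment = rng.binomial(1, np.where(x == 1, 0.8, 0.2)).astype(bool)

# Fig. 1(d): outcome-dependent selection (S caused by Y). Conditioning on X
# does not separate Y from S, and the estimate is biased upward, near 0.90.
s_outcome = rng.binomial(1, np.where(y == 1, 0.8, 0.2)).astype(bool)

print(f"treatment-dependent: {p_y1_given_x1(s_treatment):.2f}")
print(f"outcome-dependent:   {p_y1_given_x1(s_outcome):.2f}")
```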
Funding
  • This research was supported in part by grants from NSF #IIS-1249822 and #IIS-1302448, and ONR #N00014-13-1-0153 and #N00014-10-1-0933.
Reference
  • Acid, S., and de Campos, L. 1996. An algorithm for finding minimum d-separating sets in belief networks. In Proceedings of the 12th Annual Conference on Uncertainty in Artificial Intelligence, 3–10. San Francisco, CA: Morgan Kaufmann.
  • Angrist, J. D. 1997. Conditional independence in sample selection models. Economics Letters 54(2):103–112.
  • Bareinboim, E., and Pearl, J. 2012. Controlling selection bias in causal inference. In Girolami, M., and Lawrence, N., eds., Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2012), 100–108. JMLR (22).
  • Bareinboim, E., and Pearl, J. 2013a. Meta-transportability of causal effects: A formal approach. In Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2013), 135–143. JMLR (31).
  • Bareinboim, E., and Pearl, J. 2013b. Causal transportability with limited experiments. In desJardins, M., and Littman, M. L., eds., Proceedings of the Twenty-Seventh Conference on Artificial Intelligence (AAAI 2013), 95–101.
  • Bareinboim, E.; Tian, J.; and Pearl, J. 2014. Recovering from selection bias in causal and statistical inference. Technical Report R-425, Cognitive Systems Laboratory, Department of Computer Science, UCLA.
  • Cooper, G. 1995. Causal discovery from data in the presence of selection bias. Artificial Intelligence and Statistics 140–150.
  • Cornfield, J. 1951. A method of estimating comparative rates from clinical data; applications to cancer of the lung, breast, and cervix. Journal of the National Cancer Institute 11:1269–1275.
  • Cortes, C.; Mohri, M.; Riley, M.; and Rostamizadeh, A. 2008. Sample selection bias correction theory. In Proceedings of the 19th International Conference on Algorithmic Learning Theory, ALT '08, 38–53.
  • Didelez, V.; Kreiner, S.; and Keiding, N. 2010. Graphical models for inference under outcome-dependent sampling. Statistical Science 25(3):368–387.
  • Elkan, C. 2001. The foundations of cost-sensitive learning. In Proceedings of the 17th International Joint Conference on Artificial Intelligence - Volume 2, IJCAI'01, 973–978. San Francisco, CA: Morgan Kaufmann.
  • Geng, Z. 1992. Collapsibility of relative risk in contingency tables with a response variable. Journal of the Royal Statistical Society 54(2):585–593.
  • Glymour, M., and Greenland, S. 2008. Causal diagrams. In Rothman, K.; Greenland, S.; and Lash, T., eds., Modern Epidemiology. Philadelphia, PA: Lippincott Williams & Wilkins, 3rd edition. 183–209.
  • Greenland, S., and Pearl, J. 2011. Adjustments and their consequences – collapsibility analysis using graphical models. International Statistical Review 79(3):401–426.
  • Heckman, J. 1979. Sample selection bias as a specification error. Econometrica 47:153–161.
  • Hein, M. 2009. Binary classification under sample selection bias. In Candela, J.; Sugiyama, M.; Schwaighofer, A.; and Lawrence, N., eds., Dataset Shift in Machine Learning. Cambridge, MA: MIT Press. 41–64.
  • Jewell, N. P. 1991. Some surprising results about covariate adjustment in logistic regression models. International Statistical Review 59(2):227–240.
  • Koller, D., and Friedman, N. 2009. Probabilistic Graphical Models: Principles and Techniques. Cambridge, MA: MIT Press.
  • Kuroki, M., and Cai, Z. 2006. On recovering a population covariance matrix in the presence of selection bias. Biometrika 93(3):601–611.
  • Little, R. J. A., and Rubin, D. B. 1986. Statistical Analysis with Missing Data. New York, NY: John Wiley & Sons.
  • Mefford, J., and Witte, J. S. 2012. The covariate's dilemma. PLoS Genetics 8(11):e1003096.
  • Pearl, J., and Paz, A. 2010. Confounding equivalence in causal inference. In Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI 2010), 433–441.
  • Pearl, J. 1988. Probabilistic Reasoning in Intelligent Systems. San Mateo, CA: Morgan Kaufmann.
  • Pearl, J. 1993. Aspects of graphical models connected with causality. In Proceedings of the 49th Session of the International Statistical Institute, 391–401.
  • Pearl, J. 1995. Causal diagrams for empirical research. Biometrika 82(4):669–710.
  • Pearl, J. 2000. Causality: Models, Reasoning, and Inference. New York: Cambridge University Press. Second ed., 2009.
  • Pearl, J. 2013. Linear models: A useful "microscope" for causal analysis. Journal of Causal Inference 1:155–170.
  • Pirinen, M.; Donnelly, P.; and Spencer, C. 2012. Including known covariates can reduce power to detect genetic effects in case-control studies. Nature Genetics 44:848–851.
  • Robins, J. 2001.
  • Smith, A. T., and Elkan, C. 2007. Making generative classifiers robust to selection bias. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '07, 657–666. New York, NY: ACM.
  • Spirtes, P.; Glymour, C.; and Scheines, R. 2000.
  • Storkey, A. 2009. When training and test sets are different: characterising learning transfer. In Candela, J.; Sugiyama, M.; Schwaighofer, A.; and Lawrence, N., eds., Dataset Shift in Machine Learning. Cambridge, MA: MIT Press. 3–28.
  • Textor, J., and Liskiewicz, M. 2011. Adjustment criteria in causal diagrams: An algorithmic perspective. In Pfeffer, A., and Cozman, F., eds., Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence (UAI 2011), 681–688. AUAI Press.
  • Tian, J.; Paz, A.; and Pearl, J. 1998. Finding minimal separating sets. Technical Report R-254, University of California, Los Angeles, CA.
  • Whittemore, A. 1978. Collapsibility of multidimensional contingency tables. Journal of the Royal Statistical Society, Series B 40(3):328–340.
  • Zadrozny, B. 2004. Learning and evaluating classifiers under sample selection bias. In Proceedings of the Twenty-First International Conference on Machine Learning, ICML '04, 114–. New York, NY: ACM.
  • Zhang, J. 2008. On the completeness of orientation rules for causal discovery in the presence of latent confounders and selection bias. Artificial Intelligence 172:1873–1896.
Best Paper of AAAI, 2014