Causal Modeling for Fairness in Dynamical Systems

ICML, pp. 2185-2195, 2019.


Abstract:

In this work, we present causal directed acyclic graphs (DAGs) as a unifying framework for the recent literature on fairness in dynamical systems. We advocate for the use of causal DAGs as a tool in both designing equitable policies and estimating their impacts. By visualizing models of dynamic unfairness graphically, we expose implicit…

Introduction
  • How do the authors design equitable policies for complex, evolving societies? A wide range of work in the social sciences aims to understand the long-term consequences of decisions and events [2, 15, 17, 18, 33, 58].
  • The key insight from this literature is that the repeated application of algorithmic tools in a changing environment can have long-term fairness impacts distinct from their short-term impacts
  • Each paper in this literature proposes a dynamics model for a particular domain, exposing unfairness that arises from long-term reapplication of a baseline policy, and proposes a “fair” policy to mitigate some of these biases.
  • Several general algorithms for improved fairness in sequential decision-making have been characterized, with work discussing bandits [29], reinforcement learning [27], and importance sampling estimators [8]
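A toy simulation makes the key insight concrete: even a fixed decision rule, reapplied in an environment it alters, can produce group disparities over time that differ from its one-step effect. The lending-style dynamics, thresholds, and all numeric parameters below are invented for illustration and are not taken from any of the cited models.

```python
import numpy as np

def step(scores, rng, threshold=0.5, boost=0.05, penalty=0.1, repay_prob=0.7):
    """One application of a fixed threshold policy: individuals at or above
    the threshold receive loans; repayment raises their score, default
    lowers it. Scores are kept in [0, 1]."""
    approved = scores >= threshold
    repaid = rng.random(scores.shape) < repay_prob
    scores = scores + approved * np.where(repaid, boost, -penalty)
    return np.clip(scores, 0.0, 1.0)

rng = np.random.default_rng(0)
# Two groups with different initial score distributions (illustrative).
group_a = np.clip(rng.normal(0.60, 0.10, 1000), 0.0, 1.0)
group_b = np.clip(rng.normal(0.45, 0.10, 1000), 0.0, 1.0)

one_step_gap = step(group_a, rng).mean() - step(group_b, rng).mean()
for _ in range(20):  # repeated application of the same policy
    group_a = step(group_a, rng)
    group_b = step(group_b, rng)
long_run_gap = group_a.mean() - group_b.mean()
print(one_step_gap, long_run_gap)
```

Comparing `one_step_gap` against `long_run_gap` is the point: the short-term impact of the rule is a poor predictor of where the feedback loop settles.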
Highlights
  • We show that causal directed acyclic graphs are a unifying framework for the literature on fairness in dynamical systems
  • To demonstrate the flexibility of the structural causal model (SCM) framework, we extend the model of Liu et al. [39] to compute a variety of new policy evaluations
  • We discuss three basic approaches to off-policy evaluation of a target policy π, given a dataset of trajectories {τ} that were generated by the historical policy π_obs interacting with the environment, and a model M of the environment
  • Each trajectory is a tuple comprising a sequence of all observations over all the time steps specified by the structural causal model (SCM)
  • We believe the field of fairness in dynamical systems is concerned with a set of problems that are well-represented by SCMs; in Section 5 we show there is frequently an equivalence between the equations governing a model of dynamic unfairness and the structural equations of an SCM
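One way to make "a trajectory is a tuple of all observables" concrete is a forward sampler for an invented one-step lending-style SCM (the variable names follow the paper's lending example, but the structural equations and constants below are illustrative, not the paper's exact model):

```python
from typing import NamedTuple

import torch

class Trajectory(NamedTuple):
    """One sample through the SCM: every observable, in topological order."""
    A: torch.Tensor       # group membership
    X: torch.Tensor       # credit score
    T: torch.Tensor       # loan decision under the policy
    Y: torch.Tensor       # repayment outcome
    Xtilde: torch.Tensor  # next-step score

def sample_trajectory(policy) -> Trajectory:
    """Forward-sample one trajectory: draw exogenous noise from its prior,
    then push it through (assumed) structural equations."""
    A = torch.bernoulli(torch.full((1,), 0.5))
    X = 0.5 + 0.1 * A + 0.1 * torch.randn(1)
    T = policy(X, A)
    Y = torch.bernoulli(X.clamp(0.0, 1.0)) * T           # repay only if granted a loan
    Xtilde = X + 0.05 * Y - 0.05 * T * (1 - Y)           # score rises on repayment
    return Trajectory(A=A, X=X, T=T, Y=Y, Xtilde=Xtilde)

tau = sample_trajectory(policy=lambda X, A: (X > 0.55).float())
```

A dataset of such tuples, drawn under π_obs, is exactly the input the off-policy evaluation methods below consume.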
Results
  • Figure 9 shows the effect on the average utility E[U] and the average per-group score change E[Δj] of a simple policy intervention. ("Mismatch" refers here to structural equations with misspecified functional forms, not incorrect causal assumptions.)
  • [Figure residue: panels plot institutional profit error and average score change (ΔBlack, ΔWhite) against the number of steps; (a) score change for the minority group, (b) score change for the majority group, and per-group improvement.]
Conclusion
  • 7.1 Off-Policy Evaluation Methods

    In Section 6, the authors discuss model-based policy evaluation.
  • (1) Importance sampling-based policy evaluation (IS-PE): use the historical data and ignore the model. Compute a reweighted expected reward, where the reward from each trajectory τ in the sample is reweighted by the likelihood ratio p_do(π)(τ) / p_do(π_obs)(τ).
  • (2) Model-based policy evaluation (MB-PE, discussed in Section 4): ignore the historical data and use the model.
  • Given a model M, sample exogenous noise from the prior p(U), produce trajectories {τ} by running the model M along with the target policy π, and compute the expected reward.
  • (3) Counterfactual-based policy evaluation (CF-PE) [4]: use the historical data and the model together.
  • When there is no model mismatch, counterfactual policy evaluation is equivalent to model-based policy evaluation [4].
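The first two estimators can be sketched on a toy one-step problem. Everything below (the environment, both policies, and the sample sizes) is invented for illustration; CF-PE is omitted because it would additionally require inferring the posterior over exogenous noise from each observed trajectory (abduction) before replaying it under π.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy one-step environment: context x ~ N(0,1), binary action a,
# reward r = a * x + noise. Policies give P(a=1 | x).
pi_obs = lambda x: np.full_like(x, 0.5)       # historical (uniform) policy
pi_tgt = lambda x: (x > 0).astype(float)      # target: act only when x > 0

# Historical dataset of trajectories generated under pi_obs.
x = rng.standard_normal(10_000)
a = (rng.random(x.shape) < pi_obs(x)).astype(float)
r = a * x + 0.1 * rng.standard_normal(x.shape)

# (1) IS-PE: reweight observed rewards by the policy likelihood ratio.
pt, po = pi_tgt(x), pi_obs(x)
w = (a * pt + (1 - a) * (1 - pt)) / (a * po + (1 - a) * (1 - po))
v_is = np.mean(w * r)

# (2) MB-PE: ignore the data; redraw exogenous noise and roll the model
# forward under the target policy.
x_m = rng.standard_normal(10_000)
a_m = (rng.random(x_m.shape) < pi_tgt(x_m)).astype(float)
v_mb = np.mean(a_m * x_m + 0.1 * rng.standard_normal(x_m.shape))

print(v_is, v_mb)
```

With a correctly specified model the two estimates agree up to sampling noise; IS-PE needs no model but its variance grows as π diverges from π_obs, while MB-PE inherits any model mismatch.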
Tables
  • Table1: Symbol legend for Figure 1
  • Table2: Symbol legend for Figure 5
  • Table3: Symbol legend for Figure 6
  • Table4: Symbol legend for Figure 12

    from typing import Dict

    import torch

    # StructuralEqn is defined elsewhere in the accompanying code.

    class OneStepSimulation:
        """Runs simulation for one step of dynamics under Liu et al 2018 SCM."""

        def __init__(
            self,
            f_A: StructuralEqn,         # stochastic SE for group membership
            f_X: StructuralEqn,         # stochastic SE for indiv scores
            f_Y: StructuralEqn,         # stochastic SE for potential repayment
            f_T: StructuralEqn,         # SE for threshold loan policy
            f_Xtilde: StructuralEqn,    # SE for indiv score change
            f_u: StructuralEqn,         # SE for individual utility
            f_Umathcal: StructuralEqn,  # SE for avg instit. utility
            f_Deltaj: StructuralEqn,    # SE for per-group avg score change
        ) -> None:
            self.f_A = f_A
            self.f_X = f_X
            self.f_Y = f_Y
            self.f_T = f_T
            self.f_Xtilde = f_Xtilde
            self.f_u = f_u
            self.f_Deltaj = f_Deltaj
            self.f_Umathcal = f_Umathcal

        def run(self, num_steps: int, num_samps: int) -> Dict:
            """Run simulation forward for num_steps and return all observables."""
            if num_steps != 1:
                raise ValueError('Only one-step dynamics are currently supported.')
            blank_tensor = torch.zeros(num_samps)
            A = self.f_A(blank_tensor)
            X = self.f_X(A)
            Y = self.f_Y(X, A)
            T = self.f_T(X, A)
            Xtilde = self.f_Xtilde(X, Y, T)
            u = self.f_u(Y, T)
            Deltaj = self.f_Deltaj(X, Xtilde, A)
            Umathcal = self.f_Umathcal(u)
            return dict(
                A=A, X=X, Y=Y, T=T, u=u,
                Xtilde=Xtilde, Deltaj=Deltaj, Umathcal=Umathcal,
            )

        def intervene(self, **kwargs):
            """Update attributes via intervention."""
            for k, v in kwargs.items():
                setattr(self, k, v)
  • Table5: Symbol legend for Figure 13

    import torch
    from simulation import OneStepSimulation

    class MultiStepSimulation(OneStepSimulation):
        """Runs simulation for multiple steps of dynamics."""
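The simulation classes in the table legends accept any callables with the `StructuralEqn` call signatures. The sketch below wires up a single step of the same graph with stand-in structural equations, mirroring the order of operations in `run`; every numeric constant is invented for illustration and is not taken from the paper.

```python
import torch

torch.manual_seed(0)
n = 10_000

# Plain callables standing in for StructuralEqn objects (all constants invented).
f_A = lambda blank: torch.bernoulli(torch.full_like(blank, 0.5))   # group membership
f_X = lambda A: 600 + 50 * A + 30 * torch.randn_like(A)            # individual score
f_Y = lambda X, A: torch.bernoulli(((X - 450) / 300).clamp(0, 1))  # potential repayment
f_T = lambda X, A: (X > 620).float()                               # threshold loan policy
f_Xtilde = lambda X, Y, T: X + T * (75 * Y - 150 * (1 - Y))        # score change

# Same wiring as OneStepSimulation.run for a single step.
blank = torch.zeros(n)
A = f_A(blank)
X = f_X(A)
Y = f_Y(X, A)
T = f_T(X, A)
Xtilde = f_Xtilde(X, Y, T)
Delta_0 = (Xtilde - X)[A == 0].mean()  # avg score change, group 0
Delta_1 = (Xtilde - X)[A == 1].mean()  # avg score change, group 1
print(float(Delta_0), float(Delta_1))
```

Swapping in a different `f_T` (or calling `intervene(f_T=...)` on the class version) is how a policy intervention is expressed in this framework: only one structural equation changes and the rest of the graph is reused.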
References
  • [1] Susan Athey. 2015. Machine learning and causal inference for policy evaluation. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 5–6.
  • [2] Lisa F Berkman, Ichiro Kawachi, and M Maria Glymour. [n. d.]. Social epidemiology.
  • [3] Dimitrios Bountouridis, Jaron Harambam, Mykola Makhortykh, Mónica Marrero, Nava Tintarev, and Claudia Hauff. 2019. SIREN: A Simulation Framework for Understanding the Effects of Recommender Systems in Online News Environments. In Proceedings of the Conference on Fairness, Accountability, and Transparency. ACM, 150–159.
  • [4] Lars Buesing, Theophane Weber, Yori Zwols, Sebastien Racaniere, Arthur Guez, Jean-Baptiste Lespiau, and Nicolas Heess. 2019.
  • [5] Silvia Chiappa. 2019. Path-specific counterfactual fairness. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33. 7801–7808.
  • [6] Stephen Coate and Glenn C Loury. 1993. Will affirmative-action policies eliminate negative stereotypes? The American Economic Review (1993), 1220–1240.
  • [7] A Philip Dawid. 2002. Influence diagrams for causal modelling and inference. International Statistical Review 70, 2 (2002), 161–189.
  • [8] Shayan Doroudi, Philip S Thomas, and Emma Brunskill. 2017. Importance Sampling for Fair Policy Selection. In Uncertainty in Artificial Intelligence (UAI).
  • [9] Danielle Ensign, Sorelle A Friedler, Scott Neville, Carlos Scheidegger, and Suresh Venkatasubramanian. 2018. Runaway Feedback Loops in Predictive Policing. In Conference on Fairness, Accountability and Transparency. 160–171.
  • [10] Tom Everitt, Pedro A Ortega, Elizabeth Barnes, and Shane Legg. 2019. Understanding agent incentives using causal influence diagrams, Part I: single action settings. arXiv preprint arXiv:1902.09980 (2019).
  • [11] Claes Fornell and David F Larcker. 1981. Evaluating structural equation models with unobservable variables and measurement error. Journal of Marketing Research 18, 1 (1981), 39–50.
  • [12] Dean P Foster and Rakesh V Vohra. 1992. An economic argument for affirmative action. Rationality and Society 4, 2 (1992), 176–188.
  • [13] Wayne A Fuller. 2009. Measurement error models. Vol. 305. John Wiley & Sons.
  • [14] Andreas Fuster, Paul Goldsmith-Pinkham, Tarun Ramadorai, and Ansgar Walther. 2018. Predictably unequal? The effects of machine learning on credit markets. (November 6, 2018).
  • [15] Dora Gicheva and Jeffrey Thompson. 2015. The effects of student loans on long-term household financial stability. Student Loans and the Dynamics of Debt (2015), 287–316.
  • [16] Herman Heine Goldstine and John Von Neumann. 1947. Planning and coding of problems for an electronic computing instrument. (1947).
  • [17] Bridget J Goosby and Chelsea Heidbrink. 2013. The Transgenerational Consequences of Discrimination on African-American Health Outcomes. Sociology Compass 7, 8 (2013), 630–643.
  • [18] John Grin, Jan Rotmans, and Johan Schot. 2010. Transitions to sustainable development: new directions in the study of long term transformative change. Routledge.
  • [19] Emil J Gumbel and Julius Lieblein. 1954. Some applications of extreme-value methods. The American Statistician 8, 5 (1954), 14–17.
  • [20] Lois M Haibt. 1959. A program to draw multilevel flow charts. In Papers Presented at the March 3–5, 1959, Western Joint Computer Conference. ACM, 131–137.
  • [21] Tatsunori B Hashimoto, Megha Srivastava, Hongseok Namkoong, and Percy Liang. 2018. Fairness without demographics in repeated loss minimization. In International Conference on Machine Learning.
  • [22] James J Heckman. 2000. Causal parameters and policy analysis in economics: A twentieth century retrospective. The Quarterly Journal of Economics 115, 1 (2000), 45–97.
  • [23] James J Heckman and Edward J Vytlacil. 2007. Econometric evaluation of social programs, part I: Causal models, structural models and econometric policy evaluation. Handbook of Econometrics 6 (2007), 4779–4874.
  • [24] Lily Hu and Yiling Chen. 2018. A short-term intervention for long-term fairness in the labor market. In Proceedings of the 2018 World Wide Web Conference. International World Wide Web Conferences Steering Committee, 1389–1398.
  • [25] Lily Hu, Nicole Immorlica, and Jennifer Wortman Vaughan. 2019. The disparate effects of strategic manipulation. In Proceedings of the Conference on Fairness, Accountability, and Transparency. ACM, 259–268.
  • [26] Guido Imbens. 2019. Potential Outcome and Directed Acyclic Graph Approaches to Causality: Relevance for Empirical Practice in Economics. Technical Report. National Bureau of Economic Research.
  • [27] Shahin Jabbari, Matthew Joseph, Michael Kearns, Jamie Morgenstern, and Aaron Roth. 2017. Fairness in reinforcement learning. In Proceedings of the 34th International Conference on Machine Learning, Vol. 70. JMLR.org, 1617–1626.
  • [28] Nan Jiang and Lihong Li. 2016. Doubly Robust Off-policy Value Evaluation for Reinforcement Learning. In International Conference on Machine Learning. 652–661.
  • [29] Matthew Joseph, Michael Kearns, Jamie H Morgenstern, and Aaron Roth. 2016. Fairness in learning: Classic and contextual bandits. In Advances in Neural Information Processing Systems. 325–333.
  • [30] Richard Kammann. 1975. The comprehensibility of printed instructions and the flowchart alternative. Human Factors 17, 2 (1975), 183–191.
  • [31] Sampath Kannan, Aaron Roth, and Juba Ziani. 2019. Downstream effects of affirmative action. In Proceedings of the Conference on Fairness, Accountability, and Transparency. ACM, 240–248.
  • [32] Niki Kilbertus, Mateo Rojas Carulla, Giambattista Parascandolo, Moritz Hardt, Dominik Janzing, and Bernhard Schölkopf. 2017. Avoiding discrimination through causal reasoning. In Advances in Neural Information Processing Systems. 656–666.
  • [33] Eric I Knudsen, James J Heckman, Judy L Cameron, and Jack P Shonkoff. 2006. Economic, neurobiological, and behavioral perspectives on building America's future workforce. Proceedings of the National Academy of Sciences 103, 27 (2006), 10155–10162.
  • [34] Donald E Knuth. 1963. Computer-drawn flowcharts. Commun. ACM 6, 9 (1963), 555–563.
  • [35] Daphne Koller and Nir Friedman. 2009. Probabilistic graphical models: principles and techniques. MIT Press.
  • [36] Matt Kusner, Chris Russell, Joshua Loftus, and Ricardo Silva. 2019. Making Decisions that Reduce Discriminatory Impacts. In International Conference on Machine Learning. 3591–3600.
  • [37] Matt J Kusner, Joshua Loftus, Chris Russell, and Ricardo Silva. 2017. Counterfactual fairness. In Advances in Neural Information Processing Systems. 4066–4076.
  • [38] Robert J Lempert, David G Groves, Steven W Popper, and Steve C Bankes. 2006. A general, analytic method for generating robust strategies and narrative scenarios. Management Science 52, 4 (2006), 514–528.
  • [39] Lydia Liu, Sarah Dean, Esther Rolf, Max Simchowitz, and Moritz Hardt. 2018. Delayed Impact of Fair Machine Learning. In International Conference on Machine Learning. 3156–3164.
  • [40] Kristian Lum and William Isaac. 2016. To predict and serve? Significance 13, 5 (2016), 14–19.
  • [41] Chris J Maddison, Daniel Tarlow, and Tom Minka. 2014. A* sampling. In Advances in Neural Information Processing Systems. 3086–3094.
  • [42] David Madras, Elliot Creager, Toniann Pitassi, and Richard Zemel. 2019. Fairness through causal awareness: Learning causal latent-variable models for biased data. In Proceedings of the Conference on Fairness, Accountability, and Transparency. ACM, 349–358.
  • [43] Richard E Mayer. 1975. Different problem-solving competencies established in learning computer programming with and without meaningful models. Journal of Educational Psychology 67, 6 (1975), 725.
  • [44] Smitha Milli, John Miller, Anca D Dragan, and Moritz Hardt. 2019. The Social Cost of Strategic Classification. In Proceedings of the Conference on Fairness, Accountability, and Transparency. ACM, 230–239.
  • [45] M Granger Morgan, Milind Kandlikar, James Risbey, and Hadi Dowlatabadi. 1999. Why conventional tools for policy analysis are often inadequate for problems of global change. Climatic Change 41, 3 (1999), 271–281.
  • [46] Hussein Mouzannar, Mesrob I Ohannessian, and Nathan Srebro. 2019. From fair decision making to social equality. In Proceedings of the Conference on Fairness, Accountability, and Transparency. ACM, 359–368.
  • [47] Brad A Myers. 1990. Taxonomies of visual programming and program visualization. Journal of Visual Languages & Computing 1, 1 (1990), 97–123.
  • [48] Razieh Nabi and Ilya Shpitser. 2018. Fair inference on outcomes. In Thirty-Second AAAI Conference on Artificial Intelligence.
  • [49] Michael Oberst and David Sontag. 2019. Counterfactual Off-Policy Evaluation with Gumbel-Max Structural Causal Models. In International Conference on Machine Learning. 4881–4890.
  • [50] Judea Pearl et al. 2009. Causal inference in statistics: An overview. Statistics Surveys 3 (2009), 96–146.
  • [51] Doina Precup. 2000. Eligibility traces for off-policy policy evaluation. Computer Science Department Faculty Publication Series (2000), 80.
  • [52] US Federal Reserve. 2007. Report to the Congress on Credit Scoring and its Effects on the Availability and Affordability of Credit. Washington, DC: Board of Governors of the Federal Reserve System (2007).
  • [53] Thomas S Richardson and James M Robins. 2013. Single world intervention graphs (SWIGs): A unification of the counterfactual and graphical approaches to causality. Center for Statistics and the Social Sciences, University of Washington. Working Paper 128 (2013).
  • [54] David A Scanlan. 1989. Structured flowcharts outperform pseudocode: An experimental comparison. IEEE Software 6, 5 (1989), 28–36.
  • [55] Peter Schwartz. 1996. The art of the long view: paths to strategic insight for yourself and your company. Crown Business.
  • [56] Adith Swaminathan, Akshay Krishnamurthy, Alekh Agarwal, Miro Dudik, John Langford, Damien Jose, and Imed Zitouni. 2017. Off-policy evaluation for slate recommendation. In Advances in Neural Information Processing Systems. 3632–3642.
  • [57] Philip Thomas and Emma Brunskill. 2016. Data-efficient off-policy policy evaluation for reinforcement learning. In International Conference on Machine Learning.
  • [58] Debra Umberson, Camille B Wortman, and Ronald C Kessler. 1992. Widowhood and depression: Explaining long-term gender differences in vulnerability. Journal of Health and Social Behavior (1992), 10–24.
  • [59] Patricia Wright and Fraser Reid. 1973. Written information: some alternatives to prose for expressing the outcomes of complex contingencies. Journal of Applied Psychology 57, 2 (1973).