Towards A Rigorous Science of Interpretable Machine Learning
arXiv: Machine Learning (2017)
Abstract
As machine learning systems become ubiquitous, there has been a surge of interest in interpretable machine learning: systems that provide explanation for their outputs. These explanations are often used to qualitatively assess other criteria such as safety or non-discrimination. However, despite the interest in interpretability, there is ...
Introduction
- From autonomous cars and adaptive email-filters to predictive policing systems, machine learning (ML) systems are increasingly ubiquitous; they outperform humans on specific tasks [Mnih et al., 2013, Silver et al., 2016, Hamill, 2017] and often guide processes of human understanding and decisions [Carton et al., 2016, Doshi-Velez et al., 2014].
- The authors might not be able to enumerate all unit tests required for the safe operation of a semi-autonomous car, or all confounds that might cause a credit scoring system to be discriminatory.
- In such cases, a popular fallback is the criterion of interpretability: if the system can explain its reasoning, the authors can verify whether that reasoning is sound with respect to these auxiliary criteria.
- The second common evaluation approach assesses interpretability via a quantifiable proxy: a researcher might first claim that some model class—e.g. sparse linear models, rule lists, gradient boosted trees—is interpretable and then present algorithms to optimize within that class (e.g. Bucilu et al. [2006], Wang et al. [2017], Wang and Rudin [2015], Lou et al. [2012]).
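One way to read "optimize within that class" concretely: for sparse linear models, the regularization strength controls how many features enter the model, and coefficient sparsity can serve as a quantifiable proxy for interpretability. The sketch below is a minimal illustration of that idea; it assumes scikit-learn, NumPy, and synthetic data, and neither the library nor this particular proxy comes from the paper.

```python
# Minimal sketch (an assumption for illustration, not the paper's method):
# treat coefficient sparsity of an L1-regularized linear model as a
# quantifiable proxy for interpretability.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Synthetic regression data: 20 features, only 5 of which are informative.
X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=0.1, random_state=0)

# Stronger L1 regularization -> fewer non-zero coefficients, i.e. a sparser
# (arguably more interpretable) model, usually at some cost in fit quality.
for alpha in (0.01, 0.1, 1.0):
    model = Lasso(alpha=alpha).fit(X, y)
    n_nonzero = int(np.sum(model.coef_ != 0))
    print(f"alpha={alpha}: non-zero coefficients={n_nonzero}, "
          f"R^2={model.score(X, y):.3f}")
```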
Highlights
- The claim of the research should match the type of the evaluation
- Just as one would be critical of a reliability-oriented paper that only cites accuracy statistics, the choice of evaluation should match the specificity of the claim being made
- A contribution that is focused on a particular application should be expected to be evaluated in the context of that application, or on a human experiment with a closely-related task
- A contribution that is focused on better optimizing a model class for some definition of interpretability should be expected to be evaluated with functionally-grounded metrics
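As a hypothetical illustration of what a functionally-grounded metric can look like in practice, the sketch below scores decision trees of increasing depth with quantifiable proxies (tree depth and leaf count) alongside held-out accuracy, with no human subjects involved. The dataset, model class, and choice of proxies are assumptions made for illustration, not prescriptions from the paper.

```python
# Minimal sketch (assumptions for illustration only): a functionally-grounded
# evaluation scores models with quantifiable proxies for interpretability
# (here: tree depth and number of leaves) alongside predictive accuracy,
# without any human-subject experiment.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for max_depth in (2, 4, None):  # None = grow the tree without a depth limit
    clf = DecisionTreeClassifier(max_depth=max_depth, random_state=0)
    clf.fit(X_train, y_train)
    print(f"max_depth={max_depth}: accuracy={clf.score(X_test, y_test):.3f}, "
          f"depth={clf.get_depth()}, leaves={clf.get_n_leaves()}")
```

A method claiming to better optimize a model class for interpretability would then be judged on how it trades such proxies against accuracy, once the proxy itself has been validated elsewhere (for example, in human-grounded experiments).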
Conclusion
- Recommendations for Researchers
- In this work, the authors have laid the groundwork for a process to rigorously define and evaluate interpretability.
- A contribution that is focused on a particular application should be expected to be evaluated in the context of that application, or on a human experiment with a closely-related task.
- A contribution that is focused on better optimizing a model class for some definition of interpretability should be expected to be evaluated with functionally-grounded metrics.
- The authors note that work on interpretability must be done carefully, recognizing both the need for and the costs of human-subject experiments.
Contributions
- Proposes a taxonomy for the evaluation of interpretability—application-grounded, human-grounded and functionally-grounded
- Proposes data-driven ways to derive operational definitions and evaluations of explanations and interpretability
- Argues that interpretability can assist in qualitatively ascertaining whether other desiderata—such as fairness, privacy, reliability, robustness, causality, usability and trust—are met
- Argues that the need for interpretability stems from an incompleteness in the problem formalization, creating a fundamental barrier to optimization and evaluation
References
- Dario Amodei, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, and Dan Mane. Concrete problems in AI safety. arXiv preprint arXiv:1606.06565, 2016.
- Pedro Antunes, Valeria Herskovic, Sergio F Ochoa, and Jose A Pino. Structuring dimensions for collaborative systems evaluation. ACM Computing Surveys, 2012.
- William Bechtel and Adele Abrahamsen. Explanation: A mechanist alternative. Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences, 2005.
- Catherine Blake and Christopher J Merz. UCI repository of machine learning databases. 1998.
- Nick Bostrom and Eliezer Yudkowsky. The ethics of artificial intelligence. The Cambridge Handbook of Artificial Intelligence, 2014.
- Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. Openai gym. arXiv preprint arXiv:1606.01540, 2016.
- Cristian Bucilu, Rich Caruana, and Alexandru Niculescu-Mizil. Model compression. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2006.
- Samuel Carton, Jennifer Helsby, Kenneth Joseph, Ayesha Mahmud, Youngsoo Park, Joe Walsh, Crystal Cody, CPT Estella Patterson, Lauren Haynes, and Rayid Ghani. Identifying police officers at risk of adverse events. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2016.
- Jonathan Chang, Jordan L Boyd-Graber, Sean Gerrish, Chong Wang, and David M Blei. Reading tea leaves: How humans interpret topic models. In NIPS, 2009.
- Nick Chater and Mike Oaksford. Speculations on human causal learning and reasoning. Information sampling and adaptive cognition, 2006.
- Finale Doshi-Velez, Yaorong Ge, and Isaac Kohane. Comorbidity clusters in autism spectrum disorders: an electronic health record time-series analysis. Pediatrics, 133(1):e54–e63, 2014.
- Finale Doshi-Velez, Byron Wallace, and Ryan Adams. Graph-sparse lda: a topic model with structured sparsity. Association for the Advancement of Artificial Intelligence, 2015.
- Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard Zemel. Fairness through awareness. In Innovations in Theoretical Computer Science Conference. ACM, 2012.
- Alex Freitas. Comprehensible classification models: a position paper. ACM SIGKDD Explorations, 2014.
- Vikas K Garg and Adam Tauman Kalai. Meta-unsupervised-learning: A supervised approach to unsupervised learning. arXiv preprint arXiv:1612.09030, 2016.
- Stuart Glennan. Rethinking mechanistic explanation. Philosophy of science, 2002.
- Bryce Goodman and Seth Flaxman. European Union regulations on algorithmic decision-making and a "right to explanation". arXiv preprint arXiv:1606.08813, 2016.
- Maya Gupta, Andrew Cotter, Jan Pfeifer, Konstantin Voevodski, Kevin Canini, Alexander Mangylov, Wojciech Moczydlowski, and Alexander Van Esbroeck. Monotonic calibrated interpolated look-up tables. Journal of Machine Learning Research, 2016.
- Sean D Hamill. CMU computer won poker battle over humans by statistically significant margin. http://www.post-gazette.com/business/tech-news/2017/01/31/CMU-computerwon-poker-battle-over-humans-by-statistically-significant-margin/stories/201701310250, 2017. Accessed: 2017-02-07.
- Moritz Hardt and Kunal Talwar. On the geometry of differential privacy. In ACM Symposium on Theory of Computing. ACM, 2010.
- Moritz Hardt, Eric Price, and Nati Srebro. Equality of opportunity in supervised learning. In Advances in Neural Information Processing Systems, 2016.
- Carl Hempel and Paul Oppenheim. Studies in the logic of explanation. Philosophy of science, 1948.
- Tin Kam Ho and Mitra Basu. Complexity measures of supervised classification problems. IEEE transactions on pattern analysis and machine intelligence, 2002.
- Frank Keil. Explanation and understanding. Annu. Rev. Psychol., 2006.
- Frank Keil, Leonid Rozenblit, and Candice Mills. What lies beneath? understanding the limits of understanding. Thinking and seeing: Visual metacognition in adults and children, 2004.
- Been Kim, Caleb Chacha, and Julie Shah. Inferring robot task plans from human team meetings: A generative modeling approach with logic-based prior. Association for the Advancement of Artificial Intelligence, 2013.
- Been Kim, Elena Glassman, Brittney Johnson, and Julie Shah. iBCM: Interactive Bayesian case model empowering humans via intuitive interaction. 2015a.
- Been Kim, Julie Shah, and Finale Doshi-Velez. Mind the gap: A generative approach to interpretable feature selection and extraction. In Advances in Neural Information Processing Systems, 2015b.
- Himabindu Lakkaraju, Stephen H Bach, and Jure Leskovec. Interpretable decision sets: A joint framework for description and prediction. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1675–1684. ACM, 2016.
- Jonathan Lazar, Jinjuan Heidi Feng, and Harry Hochheiser. Research methods in human-computer interaction. John Wiley & Sons, 2010.
- Tao Lei, Regina Barzilay, and Tommi Jaakkola. Rationalizing neural predictions. arXiv preprint arXiv:1606.04155, 2016.
- Tania Lombrozo. The structure and function of explanations. Trends in cognitive sciences, 10(10): 464–470, 2006.
- Yin Lou, Rich Caruana, and Johannes Gehrke. Intelligible models for classification and regression. In ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2012.
- Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. Playing atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602, 2013.
- Ian Neath and Aimee Surprenant. Human Memory. 2003.
- Clemens Otte. Safe and interpretable machine learning: A methodological review. In Computational Intelligence in Intelligent Data Analysis. Springer, 2013.
- Parliament and Council of the European Union. General data protection regulation. 2016.
- Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. "Why should I trust you?": Explaining the predictions of any classifier. arXiv preprint arXiv:1602.04938, 2016.
- Salvatore Ruggieri, Dino Pedreschi, and Franco Turini. Data mining for discrimination discovery. ACM Transactions on Knowledge Discovery from Data, 2010.
- Eric Schulz, Joshua Tenenbaum, David Duvenaud, Maarten Speekenbrink, and Samuel Gershman. Compositional inductive biases in function learning. bioRxiv, 2016.
- D Sculley, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-Francois Crespo, and Dan Dennison. Hidden technical debt in machine learning systems. In Advances in Neural Information Processing Systems, 2015.
- David Silver, Aja Huang, Chris J Maddison, Arthur Guez, Laurent Sifre, George Van Den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, et al. Mastering the game of go with deep neural networks and tree search. Nature, 2016.
- Lior Jacob Strahilevitz. Privacy versus antidiscrimination. University of Chicago Law School Working Paper, 2008.
- Adi Suissa-Peleg, Daniel Haehn, Seymour Knowles-Barley, Verena Kaynig, Thouis R Jones, Alyssa Wilson, Richard Schalek, Jeffery W Lichtman, and Hanspeter Pfister. Automatic neural reconstruction from petavoxel of electron microscopy data. Microscopy and Microanalysis, 2016.
- Vincent Toubiana, Arvind Narayanan, Dan Boneh, Helen Nissenbaum, and Solon Barocas. Adnostic: Privacy preserving targeted advertising. 2010.
- Joaquin Vanschoren, Jan N Van Rijn, Bernd Bischl, and Luis Torgo. Openml: networked science in machine learning. ACM SIGKDD Explorations Newsletter, 15(2):49–60, 2014.
- Kush Varshney and Homa Alemzadeh. On the safety of machine learning: Cyber-physical systems, decision sciences, and data products. CoRR, 2016.
- Fulton Wang and Cynthia Rudin. Falling rule lists. In AISTATS, 2015.
- Tong Wang, Cynthia Rudin, Finale Doshi-Velez, Yimin Liu, Erica Klampfl, and Perry MacNeille. Bayesian rule sets for interpretable classification. In International Conference on Data Mining, 2017.
- Joseph Jay Williams, Juho Kim, Anna Rafferty, Samuel Maldonado, Krzysztof Z Gajos, Walter S Lasecki, and Neil Heffernan. Axis: Generating explanations at scale with learnersourcing and machine learning. In ACM Conference on Learning@Scale. ACM, 2016.
- Andrew Wilson, Christoph Dann, Chris Lucas, and Eric Xing. The human kernel. In Advances in Neural Information Processing Systems, 2015.