"Why Should I Trust You?": Explaining the Predictions of Any Classifier.

KDD, (2016): 1135-1144

Abstract

Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust, which is fundamental if one plans to take action based on a prediction, or when choosing whether to deploy a new model. Such understanding also provides insights into...

Introduction
  • Machine learning is at the core of many recent advances in science and technology. However, the important role of humans is an oft-overlooked aspect of the field.
  • It is important to differentiate between two different definitions of trust: (1) trusting a prediction, i.e. whether a user trusts an individual prediction sufficiently to take some action based on it, and (2) trusting a model, i.e. whether the user trusts a model to behave in reasonable ways if deployed.
  • Both are directly impacted by how much the human understands the model's behaviour, as opposed to seeing it as a black box.
Highlights
  • Machine learning is at the core of many recent advances in science and technology
  • We argued that trust is crucial for effective human interaction with machine learning systems, and that explaining individual predictions is important in assessing trust
  • We proposed Local Interpretable Model-agnostic Explanations (LIME), a modular and extensible approach to faithfully explain the predictions of any model in an interpretable manner; a minimal sketch of the local-surrogate idea follows this list
  • We introduced submodular pick LIME (SP-LIME), a method to select representative and non-redundant predictions, providing a global view of the model to users; a sketch of the greedy pick step also appears after the list
  • Our experiments demonstrated that explanations are useful for a variety of models in trust-related tasks in the text and image domains, with both expert and non-expert users: deciding between models, assessing trust, improving untrustworthy models, and getting insights into predictions
  • Although we describe only sparse linear models as explanations, our framework supports the exploration of a variety of explanation families, such as decision trees; it would be interesting to see a comparative study of these with real users
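The local-surrogate idea behind LIME can be sketched in a few lines. The code below is a minimal from-scratch illustration for text classification, not the authors' released lime package: predict_fn (a black box returning the probability of the class being explained), the Ridge surrogate standing in for the paper's K-LASSO, and all parameter values are simplifying assumptions of ours.

```python
import numpy as np
from sklearn.linear_model import Ridge

def explain_instance(text, predict_fn, num_samples=5000, num_features=6,
                     kernel_width=25.0, seed=0):
    """Fit a locally weighted linear surrogate around one text prediction."""
    rng = np.random.default_rng(seed)
    words = text.split()
    d = len(words)
    # Interpretable representation: binary vector, 1 = word kept, 0 = word removed.
    masks = rng.integers(0, 2, size=(num_samples, d))
    masks[0, :] = 1  # keep the original instance in the sample
    perturbed = [" ".join(w for w, keep in zip(words, m) if keep) for m in masks]
    preds = np.array([predict_fn(t) for t in perturbed])  # black-box queries only
    # Proximity weights: exponential kernel on the fraction of words removed.
    distances = 1.0 - masks.sum(axis=1) / d
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)
    # Weighted linear surrogate (the paper uses K-LASSO; Ridge keeps the sketch short).
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(masks, preds, sample_weight=weights)
    top = np.argsort(np.abs(surrogate.coef_))[::-1][:num_features]
    return [(words[i], float(surrogate.coef_[i])) for i in top]
```

Because the surrogate only ever queries predict_fn, the same sketch applies to any classifier, which is what "model-agnostic" means here.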
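The pick step (SP-LIME) is a greedy maximization of a weighted coverage function: features receive a global importance score, and instances are added one at a time so their explanations jointly cover as much importance as possible. Below is a hedged sketch under the assumption that explanations have already been collected into a matrix W of absolute weights (one row per instance); the names and the budget parameter are ours.

```python
import numpy as np

def submodular_pick(W, budget):
    """Greedily choose `budget` rows of W whose nonzero features cover the most importance."""
    W = np.abs(np.asarray(W, dtype=float))
    importance = np.sqrt(W.sum(axis=0))      # global feature importance: sqrt of summed weights
    chosen = []
    covered = np.zeros(W.shape[1], dtype=bool)
    for _ in range(budget):
        best_gain, best_i = 0.0, None
        for i in range(W.shape[0]):
            if i in chosen:
                continue
            gain = importance[(W[i] > 0) & ~covered].sum()  # marginal coverage gain
            if gain > best_gain:
                best_gain, best_i = gain, i
        if best_i is None:
            break
        chosen.append(best_i)
        covered |= W[best_i] > 0
    return chosen
```

Coverage here is monotone and submodular, so this greedy procedure carries the usual (1 - 1/e) approximation guarantee the paper relies on.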
Results
  • In the evaluation with human subjects, the authors recreate three scenarios in machine learning that require trust and understanding of predictions and models.
  • For the experiments in §6.2 and §6.3, the authors use the “Christianity” and “Atheism” documents from the 20 newsgroups dataset mentioned beforehand.
  • This dataset is problematic since it contains features that do not generalize, and validation accuracy considerably overestimates real-world performance.
  • The authors download Atheism and Christianity websites from the DMOZ directory and human-curated lists, yielding 819 webpages in each class.
  • High accuracy on this dataset by a classifier trained on 20 newsgroups indicates that the classifier is generalizing using semantic content, instead of placing importance on the data-specific issues outlined above.
  • The authors use an SVM with RBF kernel, trained on the 20 newsgroups data with hyper-parameters tuned via cross-validation; a hedged sketch of this setup appears after this list.
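The classifier setup in the last bullet can be reproduced with standard tooling. The sketch below is an illustration under our own assumptions (the vectorizer and the parameter grid are not taken from the paper); it uses scikit-learn's 20 newsgroups loader restricted to the two classes in question.

```python
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Christianity vs. Atheism subset of 20 newsgroups.
train = fetch_20newsgroups(subset="train",
                           categories=["alt.atheism", "soc.religion.christian"])

# RBF-kernel SVM; probability=True so the model can later be queried for class probabilities.
pipeline = make_pipeline(TfidfVectorizer(), SVC(kernel="rbf", probability=True))

# Tune C and gamma via cross-validation, as described above (the grid values are ours).
search = GridSearchCV(pipeline,
                      {"svc__C": [1, 10, 100], "svc__gamma": ["scale", 0.1, 0.01]},
                      cv=5)
search.fit(train.data, train.target)
print(search.best_params_, search.best_score_)
```

As the bullets above note, cross-validated accuracy on this dataset overstates real-world performance, which is exactly the gap the separately collected DMOZ webpages are meant to expose.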
Conclusion
  • The authors argued that trust is crucial for effective human interaction with machine learning systems, and that explaining individual predictions is important in assessing trust.
  • The authors' experiments demonstrated that explanations are useful for a variety of models in trust-related tasks in the text and image domains, with both expert and non-expert users: deciding between models, assessing trust, improving untrustworthy models, and getting insights into predictions.
  • The authors would like to explore theoretical properties and computational optimizations, in order to provide the accurate, real-time explanations that are critical for any human-in-the-loop machine learning system
Tables
  • Table 1: Average F1 of trustworthiness for different explainers on a collection of classifiers and datasets
  • Table 2: “Husky vs. Wolf” experiment results
Related work
  • The problems with relying on validation set accuracy as the primary measure of trust have been well studied. Practitioners consistently overestimate their model’s accuracy [21], propagate feedback loops [23], or fail to notice data leaks [14]. In order to address these issues, researchers have proposed tools like Gestalt [20] and Modeltracker [1], which help users navigate individual instances. These tools are complementary to LIME in terms of explaining models, since they do not address the problem of explaining individual predictions. Further, our submodular pick procedure can be incorporated in such tools to aid users in navigating larger datasets.

    Some recent work aims to anticipate failures in machine learning, specifically for vision tasks [3, 29]. Letting users know when the systems are likely to fail can lead to an increase in trust, by avoiding “silly mistakes” [8]. These solutions either require additional annotations and feature engineering that is specific to vision tasks or do not provide insight into why a decision should not be trusted. Furthermore, they assume that the current evaluation metrics are reliable, which may not be the case if problems such as data leakage are present. Other recent work [11] focuses on exposing users to different kinds of mistakes (our pick step). Interestingly, the subjects in their study did not notice the serious problems in the 20 newsgroups data even after looking at many mistakes, suggesting that examining raw data is not sufficient. Note that the authors of [11] are not alone in this regard; many researchers in the field have unwittingly published classifiers that would not generalize for this task. Using LIME, we show that even non-experts are able to identify these irregularities when explanations are present. Further, LIME can complement these existing systems, and allow users to assess trust even when a prediction seems “correct” but is made for the wrong reasons.
Funding
  • This work was supported in part by ONR awards #W911NF-13-1-0246 and #N00014-13-1-0023, and in part by TerraSwarm, one of six centers of STARnet, a Semiconductor Research Corporation program sponsored by MARCO and DARPA.
References
  • [1] S. Amershi, M. Chickering, S. M. Drucker, B. Lee, P. Simard, and J. Suh. Modeltracker: Redesigning performance analysis tools for machine learning. In Human Factors in Computing Systems (CHI), 2015.
  • [2] D. Baehrens, T. Schroeter, S. Harmeling, M. Kawanabe, K. Hansen, and K.-R. Muller. How to explain individual classification decisions. Journal of Machine Learning Research, 11, 2010.
  • [3] A. Bansal, A. Farhadi, and D. Parikh. Towards transparent systems: Semantic characterization of failure modes. In European Conference on Computer Vision (ECCV), 2014.
  • [4] J. Blitzer, M. Dredze, and F. Pereira. Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In Association for Computational Linguistics (ACL), 2007.
  • [5] J. Q. Candela, M. Sugiyama, A. Schwaighofer, and N. D. Lawrence. Dataset Shift in Machine Learning. MIT, 2009.
  • [6] R. Caruana, Y. Lou, J. Gehrke, P. Koch, M. Sturm, and N. Elhadad. Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission. In Knowledge Discovery and Data Mining (KDD), 2015.
  • [7] M. W. Craven and J. W. Shavlik. Extracting tree-structured representations of trained networks. Neural Information Processing Systems (NIPS), pages 24–30, 1996.
  • [8] M. T. Dzindolet, S. A. Peterson, R. A. Pomranky, L. G. Pierce, and H. P. Beck. The role of trust in automation reliance. Int. J. Hum.-Comput. Stud., 58(6), 2003.
  • [9] B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani. Least angle regression. Annals of Statistics, 32:407–499, 2004.
  • [10] U. Feige. A threshold of ln n for approximating set cover. J. ACM, 45(4), July 1998.
  • [11] A. Groce, T. Kulesza, C. Zhang, S. Shamasunder, M. Burnett, W.-K. Wong, S. Stumpf, S. Das, A. Shinsel, F. Bice, and K. McIntosh. You are the only possible oracle: Effective test selection for end users of interactive machine learning systems. IEEE Trans. Softw. Eng., 40(3), 2014.
  • [12] J. L. Herlocker, J. A. Konstan, and J. Riedl. Explaining collaborative filtering recommendations. In Conference on Computer Supported Cooperative Work (CSCW), 2000.
  • [13] A. Karpathy and F. Li. Deep visual-semantic alignments for generating image descriptions. In Computer Vision and Pattern Recognition (CVPR), 2015.
  • [14] S. Kaufman, S. Rosset, and C. Perlich. Leakage in data mining: Formulation, detection, and avoidance. In Knowledge Discovery and Data Mining (KDD), 2011.
  • [15] A. Krause and D. Golovin. Submodular function maximization. In Tractability: Practical Approaches to Hard Problems. Cambridge University Press, February 2014.
  • [16] T. Kulesza, M. Burnett, W.-K. Wong, and S. Stumpf. Principles of explanatory debugging to personalize interactive machine learning. In Intelligent User Interfaces (IUI), 2015.
  • [17] B. Letham, C. Rudin, T. H. McCormick, and D. Madigan. Interpretable classifiers using rules and bayesian analysis: Building a better stroke prediction model. Annals of Applied Statistics, 2015.
  • [18] D. Martens and F. Provost. Explaining data-driven document classifications. MIS Q., 38(1), 2014.
  • [19] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In Neural Information Processing Systems (NIPS), 2013.
  • [20] K. Patel, N. Bancroft, S. M. Drucker, J. Fogarty, A. J. Ko, and J. Landay. Gestalt: Integrated support for implementation and analysis in machine learning. In User Interface Software and Technology (UIST), 2010.
  • [21] K. Patel, J. Fogarty, J. A. Landay, and B. Harrison. Investigating statistical machine learning as a tool for software development. In Human Factors in Computing Systems (CHI), 2008.
  • [22] I. Sanchez, T. Rocktaschel, S. Riedel, and S. Singh. Towards extracting faithful and descriptive representations of latent variable models. In AAAI Spring Symposium on Knowledge Representation and Reasoning (KRR): Integrating Symbolic and Neural Approaches, 2015.
  • [23] D. Sculley, G. Holt, D. Golovin, E. Davydov, T. Phillips, D. Ebner, V. Chaudhary, M. Young, and J.-F. Crespo. Hidden technical debt in machine learning systems. In Neural Information Processing Systems (NIPS), 2015.
  • [24] E. Strumbelj and I. Kononenko. An efficient explanation of individual classifications using game theory. Journal of Machine Learning Research, 11, 2010.
  • [25] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In Computer Vision and Pattern Recognition (CVPR), 2015.
  • [26] B. Ustun and C. Rudin. Supersparse linear integer models for optimized medical scoring systems. Machine Learning, 2015.
  • [27] F. Wang and C. Rudin. Falling rule lists. In Artificial Intelligence and Statistics (AISTATS), 2015.
  • [28] K. Xu, J. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhutdinov, R. Zemel, and Y. Bengio. Show, attend and tell: Neural image caption generation with visual attention. In International Conference on Machine Learning (ICML), 2015.
  • [29] P. Zhang, J. Wang, A. Farhadi, M. Hebert, and D. Parikh. Predicting failures of vision systems. In Computer Vision and Pattern Recognition (CVPR), 2014.