GRACE: Generating Concise and Informative Contrastive Sample to Explain Neural Network Model

KDD 2020.

We introduce GRACE, a novel instance-based algorithm that provides end-users with simple natural-language text explaining neural network models’ predictions in a contrastive “Why X rather than Y” fashion.

Abstract

Despite the recent development in the topic of explainable AI/ML for image and text data, the majority of current solutions are not suitable for explaining the predictions of neural network models when the datasets are tabular and their features are in high-dimensional vectorized formats. To mitigate this limitation, therefore, we borrow two notions, “contrastive explanation” and “explanation by intervention”, from previous literature and develop a generative approach, GRACE, that explains a neural network model’s prediction by producing a concise and informative contrastive sample together with a simple natural-language explanation.

Introduction
  • Tabular data is one of the most commonly used data formats. Even though tabular data receives far less attention than computer vision and NLP data in the neural networks literature, recent efforts (e.g., [1, 2, 21, 28]) have shown that neural networks, and deep learning in particular, can achieve superior performance on this type of data.

    (Excerpt of Table 1: example spam-dataset samples with features such as freq_now, freq_credit, freq_!!!, freq_!, freq_you, freq_direct, and avg_longest_capital, labeled Ham or Spam.)
  • There is still a lack of interpretability, which results in distrust of neural networks trained on general tabular data domains.
  • Most previous explanation approaches are geared toward professional users such as ML researchers and developers rather than lay users and ML consumers.
  • This situation calls for a novel approach that provides end-users with intuitive explanations of neural networks trained on tabular data.
Highlights
  • Tabular data is one of the most commonly used data formats
  • We introduce an explanation concept for ML by marrying “contrastive explanation” and “explanation by intervention”, and extend it to the novel problem of generating a contrastive sample that explains why a neural network model predicts X rather than Y for data instances in tabular format;
  • We develop a novel framework, GRACE, which finds key features of a sample, generates a contrastive sample based on these features, and provides an explanation text on why the given model predicts X rather than Y using the generated sample; and
  • We propose that as long as the constraints on the maximum number of perturbed features (Eq. 3), their entropy (Eq. 5), and their domain (Eq. 6) are satisfied, we can generate more concise and informative explainable contrastive samples (see the sketch after this list)
  • The user studies show that our generated explanations are more intuitive and easier to understand, and help end-users make up to 60% more accurate post-explanation decisions than with Lime
  • We introduce GRACE, a novel instance-based algorithm that provides end-users with simple natural-language text explaining neural network models’ predictions in a contrastive “Why X rather than Y” fashion
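
A minimal sketch of the constrained generation idea in the highlights above, not the authors' implementation: it assumes a differentiable PyTorch classifier named model over tabular feature vectors, approximates the informativeness criterion by ranking features with gradient magnitude, enforces the cap on perturbed features (cf. Eq. 3) with a binary mask and the feature-domain constraint (cf. Eq. 6) by clamping, and omits the entropy constraint (Eq. 5). The function and parameter names (contrastive_sample, k_max, feat_min, feat_max) are hypothetical.

    import torch

    def contrastive_sample(model, x, target_class, k_max=2, lr=0.05, steps=200,
                           feat_min=None, feat_max=None):
        """Perturb at most k_max features of x until model predicts target_class."""
        x = x.clone().detach()

        # Rank features by the gradient magnitude of the contrastive-class logit
        # (a stand-in for GRACE's informativeness-based feature selection).
        probe = x.clone().requires_grad_(True)
        model(probe.unsqueeze(0))[0, target_class].backward()
        top_feats = torch.topk(probe.grad.abs(), k_max).indices
        mask = torch.zeros_like(x)
        mask[top_feats] = 1.0              # only these features may change (cf. Eq. 3)

        x_adv = x.clone()
        for _ in range(steps):
            x_adv = x_adv.detach().requires_grad_(True)
            logits = model(x_adv.unsqueeze(0))
            if logits.argmax(dim=1).item() == target_class:
                break                      # prediction flipped to the contrastive class
            (-logits[0, target_class]).backward()   # push toward the contrastive class
            with torch.no_grad():
                x_adv = x_adv - lr * x_adv.grad.sign() * mask
                if feat_min is not None and feat_max is not None:
                    x_adv = torch.clamp(x_adv, feat_min, feat_max)  # respect the feature domain (cf. Eq. 6)
        return x_adv.detach(), top_feats
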
Methods
  • Since the proposed framework combines the best of both worlds, adversarial generation and neural network model explanation, the authors select relevant baselines from both aspects. NearestCT: instead of generating a synthetic contrastive sample to explain a data point x, this baseline selects the nearest existing sample that the model assigns to a contrastive class (a sketch follows this list).

    (Excerpt of Table 4: metrics R_avg#Feats, R*_info-gain, and R_influence comparing NearestCT, DeepFool, GRACE-Local, and GRACE-Gradient on datasets including eegeye, diabetes, cancer95, phoneme, segment, and magic (# Features < 30) and biodeg (30 ≤ # Features < 100).)
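
A hedged sketch of the NearestCT baseline described in this list: instead of synthesizing a contrastive sample for x, it returns the closest existing training sample that the model labels differently from x. The distance metric (Euclidean), the numpy setup, and the helper names (nearest_contrastive, num_changed_features) are assumptions, not the paper's code; the second helper illustrates the kind of per-sample sparsity statistic that a metric such as R_avg#Feats plausibly averages.

    import numpy as np

    def nearest_contrastive(x, X_train, train_preds, x_pred):
        """Pick the training sample nearest to x whose predicted label differs from x's."""
        candidates = X_train[train_preds != x_pred]      # samples with a contrastive label
        dists = np.linalg.norm(candidates - x, axis=1)   # Euclidean distance to x
        return candidates[np.argmin(dists)]

    def num_changed_features(x, x_ct, tol=1e-6):
        """Count features on which the contrastive sample differs from the original."""
        return int(np.sum(np.abs(x_ct - x) > tol))
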
Results
  • Evaluation of generated samples: the authors examine the quality of the generated contrastive samples.
  • The authors compare GRACE with Lime [26] from end-users’ perspectives on the generated explanations.
  • Following the same experimental setting as in Section 5, the authors apply Lime and GRACE to the trained neural network model to explain its predictions on the test set.
  • GRACE generates an explanation text such as: “Had bare_nuclei been 7.0 point lower and clump_thickness been 9.0 point lower, the patient would have been diagnosed as benign rather than malignant” (a template sketch follows)
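
The explanation text above can be obtained by template filling over the features that differ between the original and the generated contrastive sample. The sketch below shows one plausible template, not the paper's exact generator; explain, feature_names, and the label arguments are illustrative.

    def explain(x, x_ct, feature_names, pred_label, contrastive_label, tol=1e-6):
        """Verbalize the feature changes that flip the prediction to the contrastive label."""
        clauses = []
        for name, old, new in zip(feature_names, x, x_ct):
            delta = new - old
            if abs(delta) > tol:                       # only changed features are mentioned
                direction = "higher" if delta > 0 else "lower"
                clauses.append(f"{name} been {abs(delta):.1f} point {direction}")
        return (f"Had {' and '.join(clauses)}, the prediction would have been "
                f"{contrastive_label} rather than {pred_label}.")

    # e.g. explain(x, x_ct, feature_names, "malignant", "benign") yields text of the
    # same shape as the breast-cancer example above.
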
Conclusion
  • Conclusion and future work:

    In this paper, the authors borrow the “contrastive explanation” and “explanation by intervention” concepts from previous literature and develop a generative approach to explain neural network models’ predictions.
  • The authors introduce GRACE, a novel instance-based algorithm that provides end-users with simple natural-language text explaining neural network models’ predictions in a contrastive “Why X rather than Y” fashion.
  • To facilitate such an explanation, GRACE extends the adversarial-perturbation literature with various conditions and constraints, and generates contrastive samples that are concise, informative, and faithful to the neural network model’s specific prediction.
  • Since the method currently works only for the multinomial classification task, the authors plan to apply it to other ML tasks such as regression and clustering
Tables
  • Table 1: Examples of original samples x_i and contrastive samples x̃_i on the spam dataset. x̃_i differs from x_i only in a few features (unchanged features are randomly selected)
  • Table 2: Examples of generated contrastive samples and their explanation texts
  • Table 3: Dataset statistics and prediction performance
  • Table 4: All results are averaged across 10 different runs. The best and second-best results are highlighted in bold and underline
  • Table 5: User study with hypothesis testing to compare explanations generated by GRACE against Lime
  • Table 6: Effects of the entropy threshold γ on R_info-gain
Related Work
  • Regarding explanation by intervention, our Def. 1 relates to Quantitative Input Influence [5], a general framework to quantify the influence of a set of inputs on the prediction outcomes. The framework follows a two-step approach: (i) it first changes each individual feature by replacing it with a random value, and then (ii) observes how the outcome, i.e., the prediction, changes accordingly. However, we propose a more systematic way by generating a new sample at once, directly conditioning it on a contrastive outcome (X rather than Y). A few prior works (e.g., [20, 32, 37]) also propose to generate contrastive samples with (i) minimal corrections from the original input, by minimizing the distance δ = ∥x − x̃∥_p, and with (ii) a minimal number of features changed to achieve such corrections. While Wachter et al. [32] use δ with the l1 norm to induce sparsity, hoping to achieve (ii), Zhang et al. [37] approach the problem in a reverse fashion, searching for a minimal δ w.r.t. a pre-defined number of features to be changed. Regardless, without considering the mutual information among pairs of features, there is no guarantee that the generated samples are informative to end-users. The work [31] also proposes using decision trees to search for a decisive threshold of a feature's values at which the prediction changes, and utilizes such thresholds to generate explanations for a neural network model's predictions. While this sounds similar to our approach, it shares a drawback with Lime [26]: the generated explanation is only an approximation and is not faithful to the model. In this paper, we take a novel approach and generate samples that are not only contrastive but also faithful to the neural network model and informative to end-users.
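
For concreteness, the counterfactual search of Wachter et al. [32] discussed above can be sketched as gradient-based minimization of a prediction loss toward the contrastive class plus an l1 distance term whose sparsity is hoped to keep the number of changed features small. This is a rough sketch under assumptions: a differentiable PyTorch classifier named model, cross-entropy in place of the original squared-error term, and no MAD-based feature scaling; lam, lr, and steps are illustrative.

    import torch
    import torch.nn.functional as F

    def l1_counterfactual(model, x, target_class, lam=1.0, lr=0.01, steps=500):
        """Search for x_cf close to x (in l1) that the model classifies as target_class."""
        x_cf = x.clone().detach().requires_grad_(True)
        target = torch.tensor([target_class])
        opt = torch.optim.Adam([x_cf], lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            pred_loss = F.cross_entropy(model(x_cf.unsqueeze(0)), target)  # move toward the contrastive class
            dist = torch.norm(x_cf - x, p=1)      # l1 distance favors few changed features
            (lam * pred_loss + dist).backward()
            opt.step()
        return x_cf.detach()

GRACE, by contrast, constrains the number and informativeness of the perturbed features explicitly rather than relying on the l1 penalty alone.
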
Funding
  • This work was in part supported by NSF awards #1742702, #1820609, #1909702, #1915801 and #1934782
References
  • [1] Sercan O Arik and Tomas Pfister. 2019. TabNet: Attentive Interpretable Tabular Learning. arXiv preprint arXiv:1908.07442 (2019).
  • [2] Björn Barz and Joachim Denzler. 2019. Deep Learning on Small Datasets without Pre-Training using Cosine Loss. arXiv preprint arXiv:1901.09054 (2019).
  • [3] Babak Ehteshami Bejnordi, Mitko Veta, Van Diest, et al. 2017. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA 318, 22 (2017), 2199–2210.
  • [4] Lingyang Chu, Xia Hu, Juhua Hu, Lanjun Wang, and Jian Pei. 2018. Exact and consistent interpretation for piecewise linear neural networks: A closed form solution. In ACM SIGKDD/KDD. ACM, 1244–1253.
  • [5] Anupam Datta, Shayak Sen, and Yair Zick. 2016. Algorithmic transparency via quantitative input influence: Theory and experiments with learning systems. In 2016 IEEE SP. IEEE, 598–617.
  • [6] Matthew F Dixon, Nicholas G Polson, and Vadim O Sokolov. [n. d.]. Deep learning for spatio-temporal modeling: Dynamic traffic flows and high frequency trading. Applied Stochastic Models in Business and Industry ([n. d.]).
  • [7] Dheeru Dua and Casey Graff. 2019. UCI Machine Learning Repository.
  • [8] Usama Fayyad and Keki Irani. 1993. Multi-interval discretization of continuous-valued attributes for classification learning. (1993).
  • [9] Thomas Fischer and Christopher Krauss. [n. d.]. Deep learning with long short-term memory networks for financial market predictions. EJOR 270 ([n. d.]).
  • [10] Brian P Flannery, Saul A Teukolsky, William H Press, and William T Vetterling. 1988. Numerical recipes in C: The art of scientific computing. Vol. 2.
  • [11] Xavier Glorot and Yoshua Bengio. 2010. Understanding the difficulty of training deep feedforward neural networks. In AISTATS. 249–256.
  • [12] Andrej Karpathy, Justin Johnson, and Li Fei-Fei. 2015. Visualizing and understanding recurrent networks. arXiv preprint arXiv:1506.02078 (2015).
  • [13] Rajiv Khanna, Ethan Elenberg, Alexandros G Dimakis, Sahand Negahban, and Joydeep Ghosh. 2017. Scalable greedy feature selection via weak submodularity. arXiv preprint arXiv:1703.02723 (2017).
  • [14] Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
  • [15] Michal Kosinski, David Stillwell, and Thore Graepel. 2013. Private traits and attributes are predictable from digital records of human behavior. Proceedings of the National Academy of Sciences 110, 15 (2013), 5802–5805.
  • [16] David Lewis. 2013. Counterfactuals. John Wiley & Sons.
  • [17] Jiwei Li, Xinlei Chen, Eduard Hovy, and Dan Jurafsky. 2015. Visualizing and understanding neural models in NLP. arXiv preprint arXiv:1506.01066 (2015).
  • [18] Zachary C Lipton. [n. d.]. The mythos of model interpretability. Queue ([n. d.]).
  • [19] Samaneh Mahdavifar and Ali A Ghorbani. 2019. Application of deep learning to cybersecurity: A survey. (2019).
  • [20] Xudong Mao, Qing Li, Haoran Xie, Raymond YK Lau, Zhen Wang, and Stephen Paul Smolley. [n. d.]. Least squares generative adversarial networks. In CVPR.
  • [21] Jan André Marais. 2019. Deep learning for tabular data: an exploratory study. Ph.D. Dissertation. Stellenbosch: Stellenbosch University.
  • [22] Alexandra Meliou, Sudeepa Roy, and Dan Suciu. 2014. Causality and explanations in databases. VLDB Endowment 7, 13 (2014), 1715–1716.
  • [23] Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, and Pascal Frossard. 2016. DeepFool: a simple and accurate method to fool deep neural networks. In IEEE CVPR. 2574–2582.
  • [24] Vitali Petsiuk, Abir Das, and Kate Saenko. 2018. RISE: Randomized input sampling for explanation of black-box models. In BMVC.
  • [25] Daniele Ravì, Charence Wong, Fani Deligianni, Melissa Berthelot, Javier Andreu-Perez, Benny Lo, and Guang-Zhong Yang. 2016. Deep learning for health informatics. IEEE Journal of Biomedical and Health Informatics 21, 1 (2016), 4–21.
  • [26] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. [n. d.]. "Why Should I Trust You?": Explaining the Predictions of Any Classifier. In KDD.
  • [27] Sudeepa Roy and Dan Suciu. 2014. A formal approach to finding explanations for database queries. In ACM SIGMOD. ACM, 1579–1590.
  • [28] Ira Shavitt and Eran Segal. 2018. Regularization learning networks: deep learning for tabular datasets. In NIPS. 1379–1389.
  • [29] Craig Silverstein, Sergey Brin, Rajeev Motwani, and Jeff Ullman. 2000. Scalable techniques for mining causal structures. (2000).
  • [30] Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. 2013. Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034 (2013).
  • [31] Jasper van der Waa, Marcel Robeer, Jurriaan van Diggelen, Matthieu Brinkhuis, and Mark Neerincx. 2018. Contrastive explanations with local foil trees. arXiv preprint arXiv:1806.07470 (2018).
  • [32] Sandra Wachter, Brent Mittelstadt, and Chris Russell. 2017. Counterfactual explanations without opening the black box: Automated decisions and the GDPR. Harv. JL & Tech. 31 (2017), 841.
  • [33] Eugene Wu and Samuel Madden. 2013. Scorpion: Explaining away outliers in aggregate queries. Proceedings of the VLDB Endowment 6, 8 (2013), 553–564.
  • [34] Xiaojun Xu, Chang Liu, Qian Feng, Heng Yin, Le Song, and Dawn Song. 2017. Neural network-based graph embedding for cross-platform binary code similarity detection. In ACM SIGSAC CCS. 363–376.
  • [35] Lei Yu and Huan Liu. 2003. Feature selection for high-dimensional data: A fast correlation-based filter solution. In ICML. 856–863.
  • [36] Matthew D Zeiler and Rob Fergus. 2014. Visualizing and understanding convolutional networks. In ECCV. Springer, 818–833.
  • [37] Xin Zhang, Armando Solar-Lezama, and Rishabh Singh. 2018. Interpreting neural network judgments via minimal, stable, and symbolic corrections. In NIPS.
  • [38] Daniel John Zizzo, Daniel Sgroi, et al. 2000. Bounded-rational behavior by neural networks in normal form games. Nuffield College.