PGM-Explainer: Probabilistic Graphical Model Explanations for Graph Neural Networks

NeurIPS 2020

Abstract

In Graph Neural Networks (GNNs), the graph structure is incorporated into the learning of node representations. This complex structure makes explaining GNNs' predictions much more challenging. In this paper, we propose PGM-Explainer, a Probabilistic Graphical Model (PGM) model-agnostic explainer for GNNs. Given a prediction to be...
Introduction
  • Graph Neural Networks (GNNs) have been emerging as powerful solutions to many real-world applications in various domains where the datasets are in the form of graphs, such as social networks, citation networks, knowledge graphs, and biological networks [1, 2, 3].
  • Knowledge of a model's behavior helps users identify scenarios in which the system may fail.
  • As the field grows, understanding why GNNs make their decisions becomes more vital.
  • This is essential for safety reasons in complex real-world tasks in which not all possible scenarios are testable.
  • Understanding the model's decisions also helps users discover biases before deployment.
Highlights
  • Graph Neural Networks (GNNs) have been emerging as powerful solutions to many real-world applications in various domains where the datasets are in the form of graphs, such as social networks, citation networks, knowledge graphs, and biological networks [1, 2, 3]
  • Our results show that Probabilistic Graphical Model (PGM)-Explainer achieves significantly higher precision than other methods in these experiments
  • We propose PGM-Explainer, an explanation method faithfully explaining the predictions of any GNN in an interpretable manner
  • By approximating the target prediction with a graphical model, PGM-Explainer is able to demonstrate the non-linear contributions of explained features toward the prediction (a perturbation-and-selection sketch follows this list)
  • Our experiments show the high accuracy and precision of PGM-Explainer and imply that PGM explanations are favored by end-users
  • Although we only adopt Bayesian networks as interpretable models, our formulation of PGM-Explainer supports the exploration of other graphical models such as Markov networks and dependency networks
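To make the approximation idea above concrete, here is a minimal sketch, not the authors' implementation: `gnn_predict`, `features`, and the mean-replacement perturbation scheme are assumptions for illustration. It shows the two data-driven steps such an explainer relies on: sampling random node perturbations while recording whether the target prediction changes, and keeping only the nodes whose perturbation is statistically dependent on that change.

```python
import numpy as np
from scipy.stats import chi2_contingency

def build_perturbation_data(gnn_predict, features, target_node, neighborhood,
                            num_samples=1000, p_perturb=0.5, seed=0):
    """Record, for each random perturbation, which neighborhood nodes were
    perturbed and whether the prediction of `target_node` changed.
    `gnn_predict(features)` is a hypothetical wrapper returning per-node labels."""
    rng = np.random.default_rng(seed)
    original_label = gnn_predict(features)[target_node]
    records = np.zeros((num_samples, len(neighborhood) + 1), dtype=int)
    for s in range(num_samples):
        perturbed = rng.random(len(neighborhood)) < p_perturb
        new_features = features.copy()
        for i, node in enumerate(neighborhood):
            if perturbed[i]:
                # One possible perturbation scheme: replace by the dataset mean.
                new_features[node] = features.mean(axis=0)
        records[s, :-1] = perturbed
        records[s, -1] = int(gnn_predict(new_features)[target_node] != original_label)
    return records

def select_dependent_nodes(records, alpha=0.05):
    """Keep the nodes whose perturbation indicator is dependent on the
    prediction change, judged by a chi-square test on a 2x2 table."""
    target = records[:, -1]
    kept = []
    for i in range(records.shape[1] - 1):
        table = np.array([[np.sum((records[:, i] == a) & (target == b))
                           for b in (0, 1)] for a in (0, 1)])
        # chi2_contingency requires every row and column sum to be non-zero.
        if table.sum(axis=0).all() and table.sum(axis=1).all():
            _, p_value, _, _ = chi2_contingency(table)
            if p_value < alpha:
                kept.append(i)
    return kept
```

A Bayesian-network structure learner (for example, hill climbing with a BIC score) run on the retained variables would then produce the graphical explanation.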
Methods
  • The authors provide the experiments, comparing the performance of PGM-Explainer to that of existing explanation methods for GNNs, including GNNExplainer [24] and the implementation of the extension of SHapley Additive exPlanations (SHAP) [17] to GNNs.
  • Source codes of gradient-based methods for GNNs are either unavailable or limited to specific models/applications.
  • SHAP is an additive feature attribution method, unifying explanation methods for conventional neural networks [17].
  • By comparing PGM-Explainer with SHAP, the authors aim to demonstrate drawbacks of the linear-independence assumption of explained features in explaining GNN’s predictions.
  • The authors show that the vanilla gradient-based explanation method and GNNExplainer can be considered as additive feature attribution methods in Appendix A (the additive form is recalled after this list).
  • The authors' source code can be found at [34]
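For reference, an additive feature attribution method in the sense of [17] explains a prediction with a linear surrogate over binary presence variables; the contrast with PGM-Explainer is that the latter is not restricted to this form:

```latex
g(z') = \phi_0 + \sum_{i=1}^{M} \phi_i z'_i, \qquad z' \in \{0,1\}^M,
```

where \(z'_i\) indicates whether explained feature \(i\) is present and \(\phi_i\) is the contribution attributed to it.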
Results
  • Results on Synthetic Datasets

    Table 2 shows the accuracy of the explanations generated by different explainers.
  • The explanations are generated for all nodes in the motifs of the input graph.
  • The precision of nodes in the explanations of each explainer on the trust weighted signed network datasets is reported in Table 3.
  • The authors compare PGM-Explainer with the SHAP extension for GNN and GRAD, a simple gradient approach.
  • In this experiment, the authors do not restrict the number of nodes returned by PGM-Explainer.
  • In GRAD, the top-3 nodes are chosen based on the summed gradients of the GNN's loss function with respect to the associated node features (a minimal sketch follows this list).
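As an illustration of the GRAD baseline in the last item, the sketch below uses an assumed PyTorch-style interface, not the authors' code: `model(features, edge_index)` returning per-node logits is a hypothetical convention. It ranks nodes by the summed gradient magnitude of the loss with respect to their input features and returns the top-3.

```python
import torch
import torch.nn.functional as F

def grad_top_nodes(model, node_features, edge_index, target_node, label, k=3):
    """GRAD-style baseline: score each node by the gradient of the GNN loss
    at the target prediction w.r.t. that node's input features."""
    features = node_features.clone().requires_grad_(True)
    logits = model(features, edge_index)          # assumed per-node logits
    loss = F.cross_entropy(logits[target_node].unsqueeze(0),
                           torch.tensor([label]))
    loss.backward()
    # Aggregate the gradient over the feature dimension to one score per node.
    node_scores = features.grad.abs().sum(dim=1)
    return torch.topk(node_scores, k=k).indices.tolist()
```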
Conclusion
  • The authors propose PGM-Explainer, an explanation method faithfully explaining the predictions of any GNN in an interpretable manner.
  • By approximating the target prediction with a graphical model, PGM-Explainer is able to demonstrate the non-linear contributions of explained features toward the prediction.
  • Although the authors only adopt Bayesian networks as interpretable models, the formulation of PGM-Explainer supports the exploration of other graphical models such as Markov networks and dependency networks.
Objectives
  • The authors aim to explain the prediction of the role of node E, with the PGM explanation taking the form of a Bayesian network.
  • The authors aim to bring a clearer view on explanation models and show that the class of additive feature attribution methods introduced in [17] fully captures current explanation methods for GNNs.
  • The authors aim to find a graph, called a perfect map, that precisely captures the target distribution P (see the definition after this list).
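The perfect-map notion in the last objective has a compact formal statement [28]: writing \(\mathcal{I}(G)\) for the conditional independencies implied by a graph \(G\) and \(\mathcal{I}(P)\) for those holding in the distribution \(P\),

```latex
G \text{ is a perfect map of } P \iff \mathcal{I}(G) = \mathcal{I}(P).
```

That is, the learned Bayesian network should encode exactly the dependencies of the distribution induced by the GNN's prediction, neither missing nor inventing any.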
Tables
  • Table 1: Models' accuracy and number of sampled data used by PGM-Explainer
  • Table 2: Accuracy of Explainers on Synthetic Datasets
  • Table 3: Precision of Explainers on Trust Signed Network Datasets
  • Table 4: Parameters of synthetic datasets
Funding
  • This work was supported in part by the National Science Foundation Program on Fairness in AI in collaboration with Amazon under award No. 1939725.
Study subjects and analysis
Synthetic datasets: 6
Synthetic node classification task: six synthetic datasets, detailed in Appendix I, were considered. We reuse the source code of [24] to evaluate explainers under the same settings.
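The accuracy/precision numbers in Tables 2 and 3 compare the nodes returned by an explainer against the nodes of the ground-truth motif; a minimal sketch of such a score is given below (the exact formula is an assumption, shown only to make the metric concrete).

```python
def explanation_precision(explained_nodes, motif_nodes):
    """Fraction of the explainer's returned nodes that lie in the
    ground-truth motif of the explained node (higher is better)."""
    explained, motif = set(explained_nodes), set(motif_nodes)
    return len(explained & motif) / len(explained) if explained else 0.0
```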

Reference
  • [1] J. You, B. Liu, Z. Ying, V. Pande, and J. Leskovec, “Graph convolutional policy network for goal-directed molecular graph generation,” in Advances in Neural Information Processing Systems 31, 2018, pp. 6410–6421.
  • [2] M. Zhang and Y. Chen, “Link prediction based on graph neural networks,” in Advances in Neural Information Processing Systems 31, 2018, pp. 5165–5175.
  • [3] M. Zitnik, M. Agrawal, and J. Leskovec, “Modeling polypharmacy side effects with graph convolutional networks,” Bioinformatics, vol. 34, no. 13, pp. 457–466, 2018.
  • [4] M. Defferrard, X. Bresson, and P. Vandergheynst, “Convolutional neural networks on graphs with fast localized spectral filtering,” in Proceedings of the 30th International Conference on Neural Information Processing Systems, 2016, pp. 3844–3852.
  • [5] T. N. Kipf and M. Welling, “Semi-supervised classification with graph convolutional networks,” in Proceedings of the 5th International Conference on Learning Representations, 2017.
  • [6] W. Hamilton, Z. Ying, and J. Leskovec, “Inductive representation learning on large graphs,” in Advances in Neural Information Processing Systems 30, 2017, pp. 1024–1034.
  • [7] P. Velickovic, G. Cucurull, A. Casanova, A. Romero, P. Liò, and Y. Bengio, “Graph attention networks,” in International Conference on Learning Representations, 2018.
  • [8] R. Levie, F. Monti, X. Bresson, and M. M. Bronstein, “CayleyNets: Graph convolutional neural networks with complex rational spectral filters,” IEEE Transactions on Signal Processing, vol. 67, no. 1, pp. 97–109, 2019.
  • [9] F. Monti, K. Otness, and M. M. Bronstein, “MotifNet: A motif-based graph convolutional network for directed graphs,” in 2018 IEEE Data Science Workshop (DSW), 2018, pp. 225–228.
  • [10] K. Xu, W. Hu, J. Leskovec, and S. Jegelka, “How powerful are graph neural networks?” in International Conference on Learning Representations, 2019.
  • [11] Z. Ying, J. You, C. Morris, X. Ren, W. Hamilton, and J. Leskovec, “Hierarchical graph representation learning with differentiable pooling,” in Advances in Neural Information Processing Systems 31, 2018, pp. 4800–4810.
  • [12] Z. Xinyi and L. Chen, “Capsule graph neural network,” in International Conference on Learning Representations, 2019.
  • [13] J. Lee, I. Lee, and J. Kang, “Self-attention graph pooling,” in Proceedings of the 36th International Conference on Machine Learning (ICML), 2019, pp. 6661–6670.
  • [14] F. Doshi-Velez and B. Kim, “Towards a rigorous science of interpretable machine learning,” arXiv preprint, 2017. [Online]. Available: https://arxiv.org/abs/1702.08608
  • [15] K. Simonyan, A. Vedaldi, and A. Zisserman, “Deep inside convolutional networks: Visualising image classification models and saliency maps,” in Workshop at International Conference on Learning Representations, 2014.
  • [16] M. T. Ribeiro, S. Singh, and C. Guestrin, ““Why should I trust you?”: Explaining the predictions of any classifier,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 1135–1144.
  • [17] S. M. Lundberg and S.-I. Lee, “A unified approach to interpreting model predictions,” in Advances in Neural Information Processing Systems 30, 2017, pp. 4765–4774.
  • [18] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra, “Grad-CAM: Visual explanations from deep networks via gradient-based localization,” in 2017 IEEE International Conference on Computer Vision (ICCV), 2017, pp. 618–626.
  • [19] A. Shrikumar, P. Greenside, and A. Kundaje, “Learning important features through propagating activation differences,” in Proceedings of the 34th International Conference on Machine Learning, vol. 70, 2017, pp. 3145–3153.
  • [20] S. Bach, A. Binder, G. Montavon, F. Klauschen, K.-R. Müller, and W. Samek, “On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation,” PLOS ONE, vol. 10, no. 7, pp. 1–46, 2015.
  • [21] J. Springenberg, A. Dosovitskiy, T. Brox, and M. Riedmiller, “Striving for simplicity: The all convolutional net,” in ICLR (workshop track), 2015.
  • [22] Z. Jianming, L. Zhe, B. Jonathan, S. Xiaohui, and S. Stan, “Top-down neural attention by excitation backprop,” in European Conference on Computer Vision (ECCV), 2016.
  • [23] M. N. Vu, T. D. T. Nguyen, N. Phan, R. Gera, and M. T. Thai, “Evaluating explainers via perturbation,” CoRR, vol. abs/1906.02032, 2019.
  • [24] Z. Ying, D. Bourgeois, J. You, M. Zitnik, and J. Leskovec, “GNNExplainer: Generating explanations for graph neural networks,” in Advances in Neural Information Processing Systems 32, 2019, pp. 9244–9255.
  • [25] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kopf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, and S. Chintala, “PyTorch: An imperative style, high-performance deep learning library,” in Advances in Neural Information Processing Systems 32, 2019, pp. 8026–8037.
  • [26] M. Wang, L. Yu, D. Zheng, Q. Gan, Y. Gai, Z. Ye, M. Li, J. Zhou, Q. Huang, C. Ma, Z. Huang, Q. Guo, H. Zhang, H. Lin, J. Zhao, J. Li, A. J. Smola, and Z. Zhang, “Deep Graph Library: Towards efficient and scalable deep learning on graphs,” in ICLR Workshop on Representation Learning on Graphs and Manifolds, 2019.
  • [27] P. E. Pope, S. Kolouri, M. Rostami, C. E. Martin, and H. Hoffmann, “Explainability methods for graph convolutional neural networks,” in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 10764–10773.
  • [28] D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques, Adaptive Computation and Machine Learning. The MIT Press, 2009.
  • [29] J. Pearl, “Chapter 3 - Markov and Bayesian networks: Two graphical representations of probabilistic knowledge,” in Probabilistic Reasoning in Intelligent Systems, 1988, pp. 77–141.
  • [30] M. Scanagatta, A. Salmerón, and F. Stella, “A survey on Bayesian network structure learning from data,” Progress in Artificial Intelligence, vol. 8, 2019, pp. 425–439.
  • [31] A. Barabási and M. Pósfai, Network Science. Cambridge University Press, 2016. [Online]. Available: https://books.google.com/books?id=iLtGDQAAQBAJ
  • [32] D. Margaritis and S. Thrun, “Bayesian network induction via local neighborhoods,” in Advances in Neural Information Processing Systems 12, 2000, pp. 505–511.
  • [33] J. Gámez, J. Mateo, and J. Puerta, “Learning Bayesian networks by hill climbing: Efficient methods based on progressive restriction of the neighborhood,” Data Mining and Knowledge Discovery, vol. 22, pp. 106–148, 2011.
  • [34] M. N. Vu, Source code for PGM-Explainer, https://github.com/vunhatminh/PGMExplainer.
  • [35] S. Kumar, F. Spezzano, V. Subrahmanian, and C. Faloutsos, “Edge weight prediction in weighted signed networks,” in Data Mining (ICDM), 2016 IEEE 16th International Conference on, 2016, pp. 221–230.
  • [36] V. P. Dwivedi, C. K. Joshi, T. Laurent, Y. Bengio, and X. Bresson, “Benchmarking graph neural networks,” arXiv preprint arXiv:2003.00982, 2020.
  • [37] Y. LeCun and C. Cortes, “MNIST handwritten digit database,” 2010. [Online]. Available: http://yann.lecun.com/exdb/mnist/
  • [38] D. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in International Conference on Learning Representations, 2014.
  • [39] R. Ying, D. Bourgeois, J. You, M. Zitnik, and J. Leskovec, Source code for GNNExplainer, https://github.com/RexYing/gnn-model-explainer.
  • [40] M. Sundararajan, A. Taly, and Q. Yan, “Gradients of counterfactuals,” CoRR, vol. abs/1611.02639, 2016. [Online]. Available: http://arxiv.org/abs/1611.02639
Author
Minh N Vu