Graph Information Bottleneck

NeurIPS 2020


Abstract

Representation learning of graph-structured data is challenging because both graph structure and node features carry important information. Graph Neural Networks (GNNs) provide an expressive way to fuse information from network structure and node features. However, GNNs are prone to adversarial attacks. Here we introduce Graph Information Bottleneck (GIB), an information-theoretic principle inherited from IB and adapted for representation learning on graph-structured data.
Introduction
  • Representation learning on graphs aims to learn representations of graph-structured data for downstream tasks such as node classification and link prediction [1, 2].
  • Graph representation learning is a challenging task since both node features and graph structure carry important information [3, 4].
  • Graph Neural Networks (GNNs) [1, 3, 5, 6, 7] have demonstrated impressive performance by learning to fuse information from both the node features and the graph structure [8]; a minimal sketch of such a model follows this list.
  • GNNs' reliance on message passing over the edges of the graph makes them prone to noise and to adversarial attacks that target the graph structure [15, 16].
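The GAT backbone referenced above is the model the GIB instantiations build on. Below is a minimal sketch of such a two-layer attention-based GNN, written with PyTorch Geometric (cited in the references); the layer sizes, head count, and dropout rate are illustrative defaults, not the paper's exact settings.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GATConv

class GAT(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim, num_classes, heads=8):
        super().__init__()
        # First layer: multi-head attention with concatenated head outputs.
        self.conv1 = GATConv(in_dim, hidden_dim, heads=heads, dropout=0.6)
        # Second layer: a single averaged head producing class scores.
        self.conv2 = GATConv(hidden_dim * heads, num_classes, heads=1,
                             concat=False, dropout=0.6)

    def forward(self, x, edge_index):
        # Message passing over edges: each node attends to its neighbors,
        # fusing node features with graph structure.
        x = F.elu(self.conv1(x, edge_index))
        x = self.conv2(x, edge_index)
        return F.log_softmax(x, dim=-1)
```

Because every prediction depends on messages received over the (possibly manipulated) edges, an attacker who rewires a node's neighborhood can directly corrupt its representation, which is exactly the vulnerability described above.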
Highlights
  • Representation learning on graphs aims to learn representations of graph-structured data for downstream tasks such as node classification and link prediction [1, 2]
  • We introduce Graph Information Bottleneck (GIB), an information-theoretic principle inherited from IB, adapted for representation learning on graph-structured data (its objective is sketched after this list)
  • We consider the following two questions: (1) Boosted by GIB, do GIB-Cat and GIB-Bern learn more robust representations than Graph Attention Networks (GAT) to defend against attacks? (2) How does each component of GIB contribute to such robustness, especially in controlling the information from each of the two sides, namely the structure and the node features?
  • We have introduced Graph Information Bottleneck (GIB), an information-theoretic principle for learning representations that capture minimal sufficient information from graph-structured data
  • We have demonstrated the efficacy of GIB by evaluating the robustness of the GAT model trained under the GIB principle on adversarial attacks
  • Are there any other better instantiations of GIB, especially in capturing discrete structural information? If incorporated with a node for global aggregation, can GIB break the limitation of the local-dependence assumption? May GIB be applied to other graph-related tasks including link prediction and graph classification?
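For concreteness, the objective behind these highlights can be written in the standard IB form adapted to graph data, as in the paper (dropping the layer superscript on Z_X): the node representations Z_X should stay maximally informative about the targets Y while compressing the input data D = (A, X), with β > 0 trading off the two terms and Ω the search space of feasible representation distributions.

```latex
% GIB objective: keep Z_X predictive of Y while compressing D = (A, X).
\min_{\mathbb{P}(Z_X \mid \mathcal{D}) \,\in\, \Omega}
    \; -\, I(Y; Z_X) \;+\; \beta \, I(\mathcal{D}; Z_X)
```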
Methods
  • The goal of the experiments is to test whether GNNs trained with the GIB objective are more robust and reliable.
  • For GCNJaccard and RGCN, the authors perform extensive hyperparameter search as detailed in Appendix G.3.
  • For GIB-Cat and GIB-Bern, the authors keep the same architectural components as GAT; for the additional hyperparameters k and T (Algorithms 1, 2, and 3), they search k ∈ {2, 3} and T ∈ {1, 2} for each experimental setting and report the better performance (a sketch of this selection loop follows below).
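The k/T selection above is a small grid search over four configurations. A minimal sketch follows; `train_and_evaluate` is a hypothetical placeholder for the actual training loop and is assumed to return validation accuracy, so this illustrates the selection logic only.

```python
from itertools import product

def search_k_T(train_and_evaluate, ks=(2, 3), Ts=(1, 2)):
    """Train once per (k, T) pair and keep the best configuration."""
    best_config, best_val_acc = None, float("-inf")
    for k, T in product(ks, Ts):
        val_acc = train_and_evaluate(k=k, T=T)  # validation accuracy
        if val_acc > best_val_acc:
            best_config, best_val_acc = (k, T), val_acc
    return best_config, best_val_acc
```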
Results
  • GIB-based models empirically achieve up to 31% improvement with adversarial perturbation of the graph structure as well as node features.
  • GIB-Cat and GIB-Bern improve the classification accuracy by up to 31.3% and 34.0%, respectively, under adversarial perturbation (a sketch of a feature-noise robustness check in this spirit follows below).
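To make the feature-perturbation evaluation concrete, here is a sketch of a robustness check in the spirit of Table 3: Gaussian noise scaled to the feature magnitude is added to the node features, and test accuracy is re-measured. The noise ratio `r` and this exact protocol are assumptions for illustration; `data` follows the PyTorch Geometric conventions used in the earlier sketch.

```python
import torch

@torch.no_grad()
def accuracy_under_feature_noise(model, data, r=0.5):
    """Test accuracy after adding scaled Gaussian noise to node features."""
    model.eval()
    noise = r * data.x.std() * torch.randn_like(data.x)
    pred = model(data.x + noise, data.edge_index).argmax(dim=-1)
    mask = data.test_mask
    return (pred[mask] == data.y[mask]).float().mean().item()
```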
Conclusion
  • The authors have introduced Graph Information Bottleneck (GIB), an information-theoretic principle for learning representations that capture minimal sufficient information from graph-structured data.
  • GNNs share a common issue with other neural-network-based techniques: they are very sensitive to data noise and fragile under adversarial attacks.
  • The Graph Information Bottleneck (GIB) principle proposed in this work provides a principled way to alleviate this problem by increasing the robustness of GNN models.
  • The authors' work further alleviates concerns about using GNN techniques in practical systems, such as recommender systems and social media, or for analyzing data in other disciplines, including physics, biology, and social science.
  • The authors' work strengthens the interaction between AI and machine learning techniques and other parts of society, and could have far-reaching impact.
Tables
  • Table1: Average classification accuracy (%) for the targeted nodes under direct attack. Each number is the average accuracy over the 40 targeted nodes across 5 random initializations of the experiments. Bold font denotes the top two models
  • Table2: Average classification accuracy (%) for the ablations of GIB-Cat and GIB-Bern on Cora dataset
  • Table3: Classification F1-micro (%) for the trained models with increasing additive feature noise. Bold font denotes the top two models
  • Table4: Summary of the datasets and splits in our experiments
  • Table5: Hyperparameter scope for Sections 5.1 and 5.2 for GIB-Cat and GIB-Bern
  • Table6: Hyperparameter for adversarial attack experiment for GIB-Cat and GIB-Bern
  • Table7: Hyperparameter for adversarial attack experiment for the ablations of GIB-Cat and GIB-Bern
  • Table8: Hyperparameter for feature attack experiment (Section 5.2) for GIB-Cat and GIB-Bern
  • Table9: Hyperparameter of baselines used on Citeseer dataset
  • Table10: Hyperparameter of baselines used on Cora dataset
  • Table11: Hyperparameter of baselines used on Pubmed dataset
  • Table12: Average classification accuracy (%) for the targeted nodes under direct attack for Cora
  • Table13: Statistics of the target nodes and adversarial perturbations by Nettack in Section 5.1
Related work
  • GNNs learn node-level representations through message passing and aggregation from neighbors [1, 3, 29, 30, 31]. Several previous works further incorporate the attention mechanism to adaptively learn the correlation between a node and its neighbors [5, 32]. Recent literature shows that representations learned by GNNs are far from robust and can be easily attacked by malicious manipulation of either features or structure [15, 16]. Accordingly, several defense models have been proposed to increase robustness by injecting random noise into the representations [33], removing suspicious and uninformative edges [34], applying low-rank approximation to the adjacency matrix [35], or adding a hinge loss for certified robustness [36]. In contrast, even though it is not specifically designed against adversarial attacks, our model learns robust representations via the GIB principle that naturally defend against attacks. Moreover, none of these defense models has theoretical foundations except [36], which uses tools of robust optimization instead of information theory.
Funding
  • Hongyu Ren is supported by the Masason Foundation Fellowship
  • We also gratefully acknowledge the support of DARPA under Nos
Study subjects and analysis
Citation benchmark datasets: 3
Please see Appendix G for more details. We use three citation benchmark datasets, Cora, Pubmed, and Citeseer [43], in our evaluation. In all experiments, we follow the standard transductive node classification setting and the standard train-validation-test split as in GAT [5].
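These standard public splits ship with common graph libraries. Below is a sketch that loads all three benchmarks with their transductive train/validation/test masks via PyTorch Geometric's Planetoid loader, which follows the same split convention as GAT; the root path is arbitrary.

```python
from torch_geometric.datasets import Planetoid

for name in ("Cora", "CiteSeer", "PubMed"):
    dataset = Planetoid(root="data", name=name)  # standard public split
    data = dataset[0]
    print(name, data.num_nodes, int(data.train_mask.sum()),
          int(data.val_mask.sum()), int(data.test_mask.sum()))
```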

References
  • W. Hamilton, Z. Ying, and J. Leskovec, “Inductive representation learning on large graphs,” in Advances in neural information processing systems, 2017.
  • T. N. Kipf and M. Welling, “Variational graph auto-encoders,” arXiv preprint arXiv:1611.07308, 2016.
  • T. N. Kipf and M. Welling, “Semi-supervised classification with graph convolutional networks,” in International Conference on Learning Representations, 2017.
  • P. Li, I. Chien, and O. Milenkovic, “Optimizing generalized pagerank methods for seed-expansion community detection,” in Advances in Neural Information Processing Systems, 2019.
  • P. Velickovic, G. Cucurull, A. Casanova, A. Romero, P. Liò, and Y. Bengio, “Graph attention networks,” in International Conference on Learning Representations, 2018.
  • J. Chen, T. Ma, and C. Xiao, “FastGCN: Fast learning with graph convolutional networks via importance sampling,” in International Conference on Learning Representations, 2018.
  • J. Klicpera, A. Bojchevski, and S. Günnemann, “Predict then propagate: Graph neural networks meet personalized pagerank,” in International Conference on Learning Representations, 2019.
  • K. Xu, W. Hu, J. Leskovec, and S. Jegelka, “How powerful are graph neural networks?” in International Conference on Learning Representations, 2019.
  • J. You, R. Ying, and J. Leskovec, “Position-aware graph neural networks,” in International Conference on Machine Learning, 2019.
  • H. Pei, B. Wei, K. C.-C. Chang, Y. Lei, and B. Yang, “Geom-gcn: Geometric graph convolutional networks,” in International Conference on Learning Representations, 2020.
  • H. Maron, H. Ben-Hamu, H. Serviansky, and Y. Lipman, “Provably powerful graph networks,” in Advances in Neural Information Processing Systems, 2019.
  • R. Murphy, B. Srinivasan, V. Rao, and B. Riberio, “Relational pooling for graph representations,” in International Conference on Machine Learning, 2019.
  • Z. Chen, S. Villar, L. Chen, and J. Bruna, “On the equivalence between graph isomorphism testing and function approximation with gnns,” in Advances in Neural Information Processing Systems, 2019.
  • Y. Hou, J. Zhang, J. Cheng, K. Ma, R. T. B. Ma, H. Chen, and M.-C. Yang, “Measuring and improving the use of graph information in graph neural networks,” in International Conference on Learning Representations, 2020.
  • D. Zügner, A. Akbarnejad, and S. Günnemann, “Adversarial attacks on neural networks for graph data,” in Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018.
  • H. Dai, H. Li, T. Tian, X. Huang, L. Wang, J. Zhu, and L. Song, “Adversarial attack on graph structured data,” arXiv preprint arXiv:1806.02371, 2018.
  • T. M. Cover and J. A. Thomas, Elements of information theory. John Wiley & Sons, 2012.
  • N. Tishby, F. C. Pereira, and W. Bialek, “The information bottleneck method,” arXiv preprint physics/0004057, 2000.
  • N. Tishby and N. Zaslavsky, “Deep learning and the information bottleneck principle,” in 2015 IEEE Information Theory Workshop (ITW). IEEE, 2015.
  • P. A. M. Dirac, The principles of quantum mechanics. Oxford university press, 1981, no. 27.
  • A. A. Alemi, I. Fischer, J. V. Dillon, and K. Murphy, “Deep variational information bottleneck,” arXiv preprint arXiv:1612.00410, 2016.
  • B. Poole, S. Ozair, A. Van Den Oord, A. Alemi, and G. Tucker, “On variational bounds of mutual information,” in International Conference on Machine Learning, 2019.
  • X. Nguyen, M. J. Wainwright, and M. I. Jordan, “Estimating divergence functionals and the likelihood ratio by convex risk minimization,” IEEE Transactions on Information Theory, 2010.
  • E. Jang, S. Gu, and B. Poole, “Categorical reparameterization with gumbel-softmax,” in International Conference on Learning Representations, 2017.
  • C. J. Maddison, A. Mnih, and Y. W. Teh, “The concrete distribution: A continuous relaxation of discrete random variables,” in International Conference on Learning Representations, 2017.
  • I. Fischer and A. A. Alemi, “CEB improves model robustness,” arXiv preprint arXiv:2002.05380, 2020.
  • N. Dilokthanakul, P. A. Mediano, M. Garnelo, M. C. Lee, H. Salimbeni, K. Arulkumaran, and M. Shanahan, “Deep unsupervised clustering with gaussian mixture variational autoencoders,” arXiv preprint arXiv:1611.02648, 2016.
  • A. v. d. Oord, Y. Li, and O. Vinyals, “Representation learning with contrastive predictive coding,” arXiv preprint arXiv:1807.03748, 2018.
  • J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, and G. E. Dahl, “Neural message passing for quantum chemistry,” in Proceedings of the 34th International Conference on Machine Learning-Volume 70. JMLR. org, 2017.
  • R. Li, S. Wang, F. Zhu, and J. Huang, “Adaptive graph convolutional neural networks,” in Thirty-second AAAI conference on artificial intelligence, 2018.
  • K. Xu, C. Li, Y. Tian, T. Sonobe, K.-i. Kawarabayashi, and S. Jegelka, “Representation learning on graphs with jumping knowledge networks,” arXiv preprint arXiv:1806.03536, 2018.
  • J. Zhang, X. Shi, J. Xie, H. Ma, I. King, and D.-Y. Yeung, “GaAN: Gated attention networks for learning on large and spatiotemporal graphs,” arXiv preprint arXiv:1803.07294, 2018.
  • D. Zhu, Z. Zhang, P. Cui, and W. Zhu, “Robust graph convolutional networks against adversarial attacks,” in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019.
  • H. Wu, C. Wang, Y. Tyshetskiy, A. Docherty, K. Lu, and L. Zhu, “Adversarial examples for graph data: Deep insights into attack and defense,” in International Joint Conference on Artificial Intelligence, IJCAI, 2019.
  • N. Entezari, S. A. Al-Sayouri, A. Darvishzadeh, and E. E. Papalexakis, “All you need is low (rank): Defending against adversarial attacks on graphs,” in Proceedings of the 13th International Conference on Web Search and Data Mining, 2020.
  • D. Zügner and S. Günnemann, “Certifiable robustness and robust training for graph convolutional networks,” in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019.
  • P. Velickovic, W. Fedus, W. L. Hamilton, P. Liò, Y. Bengio, and R. D. Hjelm, “Deep graph infomax,” arXiv preprint arXiv:1809.10341, 2018.
  • Z. Peng, W. Huang, M. Luo, Q. Zheng, Y. Rong, T. Xu, and J. Huang, “Graph representation learning via graphical mutual information maximization,” in Proceedings of The Web Conference 2020, 2020.
  • F.-Y. Sun, J. Hoffmann, and J. Tang, “Infograph: Unsupervised and semi-supervised graph-level representation learning via mutual information maximization,” arXiv preprint arXiv:1908.01000, 2019.
  • X. B. Peng, A. Kanazawa, S. Toyer, P. Abbeel, and S. Levine, “Variational discriminator bottleneck: Improving imitation learning, inverse rl, and gans by constraining information flow,” arXiv preprint arXiv:1810.00821, 2018.
  • I. Higgins, L. Matthey, A. Pal, C. Burgess, X. Glorot, M. Botvinick, S. Mohamed, and A. Lerchner, “beta-vae: Learning basic visual concepts with a constrained variational framework.” in International Conference on Learning Representations, 2017.
  • R. D. Hjelm, A. Fedorov, S. Lavoie-Marchildon, K. Grewal, P. Bachman, A. Trischler, and Y. Bengio, “Learning deep representations by mutual information estimation and maximization,” in International Conference on Learning Representations, 2019.
  • P. Sen, G. Namata, M. Bilgic, L. Getoor, B. Galligher, and T. Eliassi-Rad, “Collective classification in network data,” AI magazine, 2008.
  • E. Cho, S. A. Myers, and J. Leskovec, “Friendship and mobility: user movement in location-based social networks,” in Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, 2011.
  • O. Mason and M. Verwoerd, “Graph theory and networks in biology,” IET systems biology, 2007.
  • M. Barthélemy, “Spatial networks,” Physics Reports, 2011.
  • I. Kaastra and M. Boyd, “Designing a neural network for forecasting financial and economic time series,” Neurocomputing, 1996.
  • R. Ying, R. He, K. Chen, P. Eksombatchai, W. L. Hamilton, and J. Leskovec, “Graph convolutional neural networks for web-scale recommender systems,” in Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018.
  • I. Fischer, “The conditional entropy bottleneck,” arXiv preprint arXiv:2002.05379, 2020.
  • A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kopf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, and S. Chintala, “Pytorch: An imperative style, high-performance deep learning library,” in Advances in Neural Information Processing Systems 32, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, Eds. Curran Associates, Inc., 2019.
  • M. Fey and J. E. Lenssen, “Fast graph representation learning with PyTorch Geometric,” in ICLR Workshop on Representation Learning on Graphs and Manifolds, 2019.