
# A graph similarity for deep learning

NeurIPS 2020

Abstract

Graph neural networks (GNNs) have been successful in learning representations from graphs. Many popular GNNs follow the pattern of aggregate-transform: they aggregate the neighbors’ attributes and then transform the results of aggregation with a learnable function. Analyses of these GNNs explain which pairs of non-identical graphs have di…

Introduction

- Graphs are the most popular mathematical abstractions for relational data structures.
- The Weisfeiler–Leman (WL) algorithm (Weisfeiler & Leman, 1968) has been extensively studied as a test of isomorphism between graphs.
- Although it is easy to find a pair of non-isomorphic graphs that the WL algorithm cannot distinguish, many graph similarity measures and graph neural networks (GNNs) have adopted the WL algorithm at their core, due to its algorithmic simplicity.
- One of the most famous GNNs, GCN (Kipf & Welling, 2017), uses degree-normalized averaging as its aggregation.
- Other GNN models such as GAT (Velickovic et al., 2018), GatedGCN (Bresson & Laurent, 2017), and MoNet (Monti et al., 2017) assign different weights to the neighbors depending on their attributes before aggregation.
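The 1-dimensional WL test referenced above iteratively recolors each node from its own color plus the multiset of its neighbors' colors; if the color histograms of two graphs ever diverge, the graphs are certainly non-isomorphic. A minimal sketch (the adjacency-dict encoding and the integer recoloring in place of a hash are illustrative choices, not from the paper):

```python
from collections import Counter

def wl_colors(adj, rounds=3):
    """1-dimensional Weisfeiler-Leman color refinement.

    adj: dict mapping each node to a list of its neighbors.
    Returns the histogram of node colors after `rounds` refinements.
    """
    colors = {v: 0 for v in adj}  # uniform initial coloring
    for _ in range(rounds):
        # New signature = own color plus sorted multiset of neighbor colors.
        signatures = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
                      for v in adj}
        # Relabel signatures with small integers (a canonical, injective "hash").
        palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        colors = {v: palette[signatures[v]] for v in adj}
    return Counter(colors.values())

# A triangle vs. a 3-node path: histograms differ, so they are non-isomorphic.
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
path = {0: [1], 1: [0, 2], 2: [1]}
print(wl_colors(triangle) != wl_colors(path))  # True
```

Note that the refinement only tracks discrete colors, which is exactly why WL-style kernels struggle with continuous node attributes, as the paper discusses.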

Highlights

- Graphs are the most popular mathematical abstractions for relational data structures
- Although it is easy to find a pair of non-isomorphic graphs that the WL algorithm cannot distinguish, many graph similarity measures and graph neural networks (GNNs) have adopted the WL algorithm at their core, due to its algorithmic simplicity
- We propose a graph neural network based on Weisfeiler–Leman similarity
- In Section 5.2, we show that a simple GNN based on transform-sum-cat can outperform popular GNN models in node classification and graph regression
- Deep learning on graphs naturally calls for the study of graphs with continuous attributes
- Previous analyses of GNNs identified cases in which non-identical graphs had the same learned representations; it has remained unclear how similarity between input graphs is reflected in the distance between GNN representations

Methods

- In Section 5.1, the authors test the transform-sum-cat against several aggregation operations from GNN literature, comparing their performances in graph classification.
- In Section 5.2, the authors show that a simple GNN based on transform-sum-cat can outperform popular GNN models in node classification and graph regression.
- In Section 5.3, the authors present a successful application of WLS in adversarial learning of graph generation with enhanced stability.
- Except for graph generation, the authors use the experimental protocols from the benchmarking framework of Dwivedi et al. (2020).
- The benchmark includes the datasets with fixed splits as well as reference implementations of popular GNN models, including GAT (Velickovic et al., 2018), GatedGCN (Bresson & Laurent, 2017), GCN (Kipf & Welling, 2017), GIN (Xu et al., 2019), GraphSAGE (Hamilton et al., 2017), and MoNet (Monti et al., 2017)
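As a rough sketch of what transform-sum-cat aggregation could look like, following the name (transform each node's features, sum over the neighbors, concatenate with the node's own features); the linear transform, shapes, and helper names below are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def transform_sum_cat(x, adj, W):
    """One round of transform-sum-cat aggregation (illustrative shapes).

    x:   (n, d) node features
    adj: dict mapping each node to a list of its neighbors
    W:   (d, d) weight of the transform (a linear map here; the
         learnable transform is a general function in the paper)
    Returns (n, 2d): each node's own features concatenated with the
    sum of its transformed neighbor features.
    """
    transformed = x @ W  # transform every node's features once
    aggregated = np.stack([transformed[adj[v]].sum(axis=0)
                           for v in range(len(x))])
    return np.concatenate([x, aggregated], axis=1)  # cat(self, sum)

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
adj = {0: [1, 2], 1: [0], 2: [0]}
out = transform_sum_cat(x, adj, np.eye(4))
print(out.shape)  # (3, 8)
```

Unlike degree-normalized averaging, summation preserves the size of the neighborhood, and concatenation keeps a node's own representation separate from its aggregated neighbors.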

Conclusion

- Deep learning on graphs naturally calls for the study of graphs with continuous attributes.
- Previous analyses of GNNs identified cases in which non-identical graphs had the same learned representations.
- It has been unclear how similarities between input graphs could be reflected in the distance between GNN representations.
- On one hand, there are fast and efficient kernels, which cannot reflect a smooth change in the node attributes.
- On the other, there are smooth matching-based kernels, which are slow and costly


- Table 1: Graph classification results on the TU datasets via WLS kernels with different aggregations. The numbers are mean test accuracies over ten splits. Bold-faced numbers are the top scores for the corresponding datasets. The proposed aggregation (WLS) shows strong performance compared with other aggregations from the literature. See Section 5.1
- Table 2: Node classification results for Stochastic Block Model (SBM) datasets. The test accuracy and training time are averaged across four runs with random seeds 1, 10, 100, and 1000. WLS obtains the highest accuracy and is close to the best speed. See Section 5.2
- Table 3: Graph classification on TU datasets via graph neural networks. ENZ. for ENZYMES, PRO. for PROTEINS_full, Synth. for Synthie. The numbers in the second set of columns are mean test accuracies over ten splits, averaged over four runs with random seeds 1, 10, 100, and 1000. MRR stands for Mean Reciprocal Rank, and Time indicates the accumulated time of a single run across all six datasets. Bold-faced numbers indicate the best score for each column. See Section 5.2

Related work

- Graph kernels. Most graph kernels inspired by the Weisfeiler–Leman test act only on graphs with discrete (categorical) attributes. Morris et al. (2016) extend discrete WL kernels to continuous attributes; however, their use of hashing functions cannot reflect continuous changes in attributes smoothly. The propagation kernel (Neumann et al., 2016) is another instance of hashing continuous attributes and shares the same problem. WWL (Togninalli et al., 2019) is a smooth kernel; however, the Wasserstein distance at its core makes it difficult to scale.
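The non-smoothness of hashing continuous attributes can be seen with a toy discretization (the bucketing below is a deliberately simple stand-in for the hash functions those kernels use): attributes that are nearly equal can land in different buckets, so a kernel that counts matching hashes scores them as fully dissimilar.

```python
def bucket(value, width=0.5):
    """Discretize a continuous attribute, as hashing-based kernels do."""
    return int(value // width)

# Two attributes that differ by only 0.02 fall into different buckets,
# so a hash-matching kernel treats them as completely different...
print(bucket(0.49), bucket(0.51))  # 0 1
# ...while two attributes 0.48 apart can share a bucket.
print(bucket(0.01), bucket(0.49))  # 0 0
```

A smooth similarity should instead vary continuously with the attribute difference, which is the gap the paper's WLS similarity aims to fill.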

The kernels based on matching or random walks (Feragen et al., 2013; Orsini et al., 2015; Kashima et al., 2003) are better suited for continuous attributes. Their speed can be drastically increased with explicit feature maps (Kriege et al., 2019). However, their construction often requires large auxiliary graphs, resulting again in scalability issues.

Reference

- Abbe, Emmanuel. 2018. Community detection and stochastic block models: Recent developments. Journal of Machine Learning Research, 18(177), 1–86.
- Arvind, V., Köbler, Johannes, Rattan, Gaurav, & Verbitsky, Oleg. 2017. Graph isomorphism, color refinement, and compactness. Computational Complexity, 26, 627–685.
- Babai, László, Erdös, Paul, & Selkow, Stanley M. 1980. Random graph isomorphism. SIAM Journal on Computing, 9(3), 628–635.
- Bresson, Xavier, & Laurent, Thomas. 2017. Residual gated graph ConvNets. arXiv preprint arXiv:1711.07553.
- De Cao, Nicola, & Kipf, Thomas. 2018. MolGAN: An implicit generative model for small molecular graphs. ICML 2018 workshop on Theoretical Foundations and Applications of Deep Generative Models.
- Dehmamy, Nima, Barabási, Albert-László, & Yu, Rose. 2019. Understanding the representation power of graph neural networks in learning graph topology. Advances in Neural Information Processing Systems 32 (NIPS 2019).
- Dwivedi, Vijay Prakash, Joshi, Chaitanya K, Laurent, Thomas, Bengio, Yoshua, & Bresson, Xavier. 2020. Benchmarking graph neural networks. arXiv preprint arXiv:2003.00982v1.
- Elton, Daniel C., Boukouvalas, Zois, Fuge, Mark D., & Chung, Peter W. 2019. Deep learning for molecular design - a review of the state of the art. arXiv preprint arXiv:1903.04388.
- Feragen, Aasa, Kasenburg, Niklas, Petersen, Jens, de Bruijne, Marleen, & Borgwardt, Karsten. 2013. Scalable kernels for graphs with continuous attributes. Advances in Neural Information Processing Systems 26 (NIPS 2013).
- Gilmer, Justin, Schoenholz, Samuel S., Riley, Patrick F., Vinyals, Oriol, & Dahl, George E. 2017. Neural message passing for quantum chemistry. Proceedings of the 34th International Conference on Machine Learning (ICML 2017).
- Gori, Marco, Monfardini, Gabriele, & Scarselli, Franco. 2005. A new model for learning in graph domains. Proceedings of 2005 IEEE International Joint Conference on Neural Networks (IJCNN 2005).
- Gulrajani, Ishaan, Ahmed, Faruk, Arjovsky, Martin, Dumoulin, Vincent, & Courville, Aaron C. 2017. Improved training of Wasserstein GANs. Advances in Neural Information Processing Systems 30 (NIPS 2017).
- Gutmann, Michael, & Hyvärinen, Aapo. 2010. Noise-contrastive estimation: A new estimation principle for unnormalized statistical models. Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS 2010).
- Hamilton, William L., Ying, Rex, & Leskovec, Jure. 2017. Inductive representation learning on large graphs. Advances in Neural Information Processing Systems 30 (NIPS 2017).
- Hein, Matthias, & Bousquet, Olivier. 2004.
- Jin, Wengong, Barzilay, Regina, & Jaakkola, Tommi. 2018. Junction tree variational autoencoder for molecular graph generation. Proceedings of the 35th International Conference on Machine Learning (ICML 2018).
- Johnson, William B., & Lindenstrauss, Joram. 1984. Extensions of Lipschitz mappings into a Hilbert space. Contemporary Mathematics, 26, 189–206.
- Kashima, Hisashi, Tsuda, Koji, & Inokuchi, Akihiro. 2003. Marginalized kernels between labeled graphs. Proceedings of the 20th International Conference on Machine Learning (ICML 2003).
- Kersting, Kristian, Kriege, Nils M., Morris, Christopher, Mutzel, Petra, & Neumann, Marion. 2016. Benchmark data sets for graph kernels. http://graphkernels.cs.tu-dortmund.de.
- Kipf, Thomas N., & Welling, Max. 2017. Semi-supervised classification with graph convolutional networks. Fifth International Conference on Learning Representations (ICLR 2017).
- Kriege, Nils M., Neumann, Marion, Morris, Christopher, Kersting, Kristian, & Mutzel, Petra. 2019. A unifying view of explicit and implicit feature maps of graph kernels. Data Mining and Knowledge Discovery, 33, 1505–1547.
- Kriege, Nils M., Johansson, Fredrik D., & Morris, Christopher. 2020. A survey on graph kernels. Applied Network Science, 5(6).
- Landrum, Greg. 2019. RDKit: Open-Source Cheminformatics Software, https://www.rdkit.org/.
- Magner, Abram, Baranwal, Mayank, & Hero, Alfred O., III. 2020. The power of graph convolutional networks to distinguish random graph models: Short version. arXiv preprint arXiv:2002.05678.
- Mikolov, Tomas, Sutskever, Ilya, Chen, Kai, Corrado, Greg, & Dean, Jeffrey. 2013. Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems 26 (NIPS 2013).
- Monti, Federico, Boscaini, Davide, Masci, Jonathan, Rodolà, Emanuele, Svoboda, Jan, & Bronstein, Michael M. 2017. Geometric deep learning on graphs and manifolds using mixture model cnns. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- Morris, Christopher, Kriege, Nils M., Kersting, Kristian, & Mutzel, Petra. 2016. Faster kernel for graphs with continuous attributes via hashing. Pages 1095–1100 of: IEEE International Conference on Data Mining (ICDM), 2016.
- Morris, Christopher, Ritzert, Martin, Fey, Matthias, Hamilton, William L., Lenssen, Jan Eric, Rattan, Gaurav, & Grohe, Martin. 2019. Weisfeiler and Leman go neural: Higher-order graph neural networks. The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI 2019).
- Neumann, Marion, Garnett, Roman, Bauckhage, Christian, & Kersting, Kristian. 2016. Propagation kernels: efficient graph kernels from propagated information. Machine Learning, 102, 209–245.
- Nikolentzos, Giannis, Siglidis, Giannis, & Vazirgiannis, Michalis. 2019. Graph kernels: A survey. arXiv preprint arXiv:1904.12218.
- Orsini, Francesco, Frasconi, Paolo, & Raedt, Luc De. 2015. Graph invariant kernels. Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI 2015).
- Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, E. 2011. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830.
- Phillips, Jeff M., & Venkatasubramanian, Suresh. 2011. A gentle introduction to the kernel distance. arXiv preprint arXiv:1103.1625.
- Scarselli, Franco, Gori, Marco, Tsoi, Ah Chung, Hagenbuchner, Markus, & Monfardini, Gabriele. 2009. The graph neural network model. IEEE Transactions on Neural Networks, 20(1), 61–80.
- Schlichtkrull, Michael, Kipf, Thomas N., Bloem, Peter, van den Berg, Rianne, Titov, Ivan, & Welling, Max. 2018. Modeling relational data with graph convolutional networks. European Semantic Web Conference.
- Simonovsky, Martin, & Komodakis, Nikos. 2018. GraphVAE: Towards generation of small graphs using variational autoencoders. arXiv preprint arXiv:1802.03480.
- Sterling, Teague, & Irwin, John J. 2015. ZINC 15 - ligand discovery for everyone. Journal of Chemical Information and Modeling, 55(11), 2324–2337.
- Togninalli, Matteo, Ghisu, Elisabetta, Llinares-Lopez, Felipe, Rieck, Bastian, & Borgwardt, Karsten. 2019. Wasserstein Weisfeiler-Lehman graph kernels. Advances in Neural Information Processing Systems 32 (NIPS 2019).
- Velickovic, Petar, Cucurull, Guillem, Casanova, Arantxa, Romero, Adriana, Liò, Pietro, & Bengio, Yoshua. 2018. Graph attention networks. Sixth International Conference on Learning Representations (ICLR 2018).
- Weisfeiler, Boris Y., & Leman, Andrei A. 1968. The reduction of a graph to canonical form and the algebra which appears therein. Nauchno-Technicheskaya Informatsia, Series 2, 9. English translation by G. Ryabov available at https://www.iti.zcu.cz/wl2018/pdf/wl_paper_translation.pdf.
- Xu, Keyulu, Hu, Weihua, Leskovec, Jure, & Jegelka, Stefanie. 2019. How powerful are graph neural networks? Seventh International Conference on Learning Representations (ICLR 2019).
- Yanardag, Pinar, & Vishwanathan, S.V.N. 2015. Deep graph kernels. Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2015).
