# Dirichlet Graph Variational Autoencoder

NeurIPS 2020


Abstract

Graph Neural Networks (GNNs) and Variational Autoencoders (VAEs) have been widely used in modeling and generating graphs with latent factors. However, there is no clear explanation of what these latent factors are and why they perform well. In this work, we present Dirichlet Graph Variational Autoencoder (DGVAE) with graph cluster memberships […]


Introduction

- Since the introduction of Graph Neural Networks (GNNs) [19, 4, 6] and Variational Autoencoders (VAEs) [16], many studies [18, 21, 8] have used GNNs and VAEs (GVAEs) to generate realistic graphs with latent factors.
- Inspired by recent developments in variational autoencoder topic models [27, 3] for text generation, the authors propose to formulate the latent factors in GVAEs as graph cluster memberships, analogous to topics in text generation.
- JT-VAE [14] proposes to generate molecular graphs in two phases: it first generates a tree-structured scaffold over chemical substructures and then assembles the substructures into a molecular graph.
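The core idea above, treating latent factors as graph cluster memberships, can be sketched in a few lines: a softmax over each node's latent vector yields a membership distribution over clusters, and an inner-product decoder then scores edges, so nodes in the same cluster get high edge probabilities. This is a minimal plain-Python illustration, not the paper's implementation; the 2-cluster toy logits are invented for the example.

```python
import math

def softmax(x):
    """Numerically stable softmax over a list of logits."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    s = sum(exps)
    return [e / s for e in exps]

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def reconstruct_adjacency(Z):
    """Inner-product decoder: edge probability sigmoid(z_i . z_j)."""
    n = len(Z)
    return [[sigmoid(sum(a * b for a, b in zip(Z[i], Z[j])))
             for j in range(n)] for i in range(n)]

# Toy memberships: nodes 0 and 1 lean toward cluster 0, node 2 toward cluster 1.
Z = [softmax([5.0, 0.0]), softmax([4.0, 0.0]), softmax([0.0, 5.0])]
A_hat = reconstruct_adjacency(Z)
```

Nodes 0 and 1 share a cluster, so `A_hat[0][1]` comes out well above `A_hat[0][2]`, which is the behavior that lets the reconstruction term act like a clustering objective.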

Highlights

- Since the introduction of Graph Neural Networks (GNNs) [19, 4, 6] and Variational Autoencoders (VAEs) [16], many studies [18, 21, 8] have used GNNs and VAEs (GVAEs) to generate realistic graphs with latent factors
- As an ablation study, when Heatts is replaced with Graph Convolutional Networks (GCN) [19], the performance is only comparable to the baselines and worse than Dirichlet Graph Variational Autoencoder (DGVAE), which shows the superiority of Heatts
- Because DGVAE/DGAE does not rely on K-means to derive cluster memberships, this clustering performance indicates the effectiveness of the framework on graph clustering tasks
- We show that the latent factors of DGVAE can be understood as cluster memberships and that the reconstruction term connects with the spectrally relaxed balanced graph cut
- Motivated by the low-pass characteristics required by balanced graph cut, we propose Heatts, a new GNN variant that uses a truncated Taylor series for fast computation of the heat kernel and admits better low-pass characteristics than GCN
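The Heatts idea above can be illustrated by the truncated Taylor expansion of the heat kernel, e^{-sL} ≈ Σ_{k=0}^{K} (-s)^k L^k / k!. The sketch below (plain Python, small dense matrices, parameter choices made up for the example) shows only the kernel approximation, not the full GNN layer, which would also multiply by node features and learnable weights.

```python
from math import factorial

def matmul(A, B):
    """Dense matrix product, adequate for small examples."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def heat_kernel_taylor(L, s=1.0, order=3):
    """Approximate e^{-sL} by its truncated Taylor series
    sum_{k=0}^{order} (-s)^k L^k / k!."""
    n = len(L)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # L^0 = I
    H = [row[:] for row in power]
    for k in range(1, order + 1):
        power = matmul(power, L)              # now holds L^k
        coef = (-s) ** k / factorial(k)
        for i in range(n):
            for j in range(n):
                H[i][j] += coef * power[i][j]
    return H

# Laplacian of a single edge between two nodes; the exact e^{-0.5 L}
# has diagonal (1 + e^{-1}) / 2 ~ 0.684, and order 3 already lands close.
H = heat_kernel_taylor([[1.0, -1.0], [-1.0, 1.0]], s=0.5, order=3)
```

Because the filter response e^{-sλ} decays with eigenvalue λ, this kernel attenuates high-frequency graph signals, which is the low-pass behavior the highlight refers to.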

Methods

- **Data and baselines.** The authors follow Graphite [8] and create data sets from six graph families with fixed, known generative processes to evaluate the performance of DGVAE on graph generation.
- The authors compare with GAE/VGAE [18] and Graphite-AE/VAE [8].
- **Setup.** For DGVAE/DGAE, the authors use the same network architecture in all experiments.
- The authors train for 200 iterations with a learning rate of 0.01.
- The Dirichlet prior is set to 0.01 for all dimensions unless specified otherwise.
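To compute the KL term against a Dir(0.01) prior, Dirichlet VAEs in the line of work the paper cites often use the softmax Laplace approximation of Srivastava and Sutton [27], which replaces Dir(α) with a logistic normal whose mean and variance have closed forms. The sketch below assumes that approximation; the paper's exact variant may differ.

```python
import math

def dirichlet_laplace_prior(alpha):
    """Logistic-normal moments approximating Dir(alpha) in the softmax
    basis (Laplace approximation, as in Srivastava & Sutton):
      mu_k    = log a_k - (1/K) sum_i log a_i
      var_k   = (1/a_k)(1 - 2/K) + (1/K^2) sum_i 1/a_i
    """
    K = len(alpha)
    log_a = [math.log(a) for a in alpha]
    mean_log = sum(log_a) / K
    mu = [la - mean_log for la in log_a]
    sum_inv = sum(1.0 / a for a in alpha)
    var = [(1.0 / a) * (1.0 - 2.0 / K) + sum_inv / (K * K) for a in alpha]
    return mu, var

# The symmetric Dir(0.01) prior from the setup, with K = 8 clusters assumed.
mu, var = dirichlet_laplace_prior([0.01] * 8)
```

With a symmetric α the mean is exactly zero and the variance is large (here 87.5 per dimension), reflecting how a sparse Dirichlet prior pushes mass toward the simplex corners.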

Results

- The negative log-likelihood (NLL) and root mean square error (RMSE) on a test set of instances are shown in Table 1.
- Both DGVAE and DGAE outperform their competitors significantly on all data sets, indicating the effectiveness of DGVAE and DGAE.
- The clustering accuracy (ACC), normalized mutual information (NMI) and macro F1 score (F1) are shown in Table 2.
- Both DGVAE and DGAE outperform their competitors on most data sets.
- As an ablation study, replacing Heatts with GCN [19] yields performance only comparable to the baselines and worse than DGVAE, which again demonstrates the superiority of Heatts over GCN.
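The ACC metric in Table 2 is conventionally computed by matching predicted cluster ids to ground-truth classes; for a small number of clusters this can be brute-forced over label permutations (larger K would call for the Hungarian algorithm). A minimal stand-alone sketch, not the authors' evaluation code:

```python
from itertools import permutations

def clustering_accuracy(y_true, y_pred, k):
    """Best accuracy over all relabelings of the k predicted clusters.
    Brute force over k! permutations, so only suitable for small k."""
    best = 0
    for perm in permutations(range(k)):
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == perm[p])
        best = max(best, hits)
    return best / len(y_true)

# Predicted labels are a pure relabeling of the truth -> perfect ACC.
acc = clustering_accuracy([0, 0, 1, 1], [1, 1, 0, 0], k=2)
```

The permutation search is what makes ACC invariant to how a clustering method happens to number its clusters.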

Conclusion

- The authors present DGVAE, a graph variational generative model with Dirichlet latent variables.
- The authors show that the latent factors of DGVAE can be understood as cluster memberships and that the reconstruction term connects with the spectrally relaxed balanced graph cut.
- The effectiveness of DGVAE is validated on graph generation and graph clustering tasks.
- This work connects VAE-based graph generation with a traditional graph research topic, balanced graph cut.
- Researchers in drug design or molecule generation may benefit from this research, since the interpretability of deep-learning-based graph generation is worth further exploration.
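The balanced-cut connection mentioned above can be made concrete by evaluating the ratio cut objective, Σ_c cut(S_c, S̄_c) / |S_c|, which spectral clustering relaxes. A small sketch (plain Python, toy adjacency invented for the example) showing that a balanced partition of two loosely joined triangles scores lower, i.e. better, than an unbalanced one:

```python
def ratio_cut(A, labels):
    """Ratio cut: sum over clusters c of cut(S_c, complement) / |S_c|."""
    n = len(A)
    total = 0.0
    for c in set(labels):
        cut = sum(A[i][j] for i in range(n) for j in range(n)
                  if labels[i] == c and labels[j] != c)
        total += cut / labels.count(c)
    return total

# Two triangles {0,1,2} and {3,4,5} joined by the single bridge edge (2,3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1

balanced = ratio_cut(A, [0, 0, 0, 1, 1, 1])    # cuts only the bridge
unbalanced = ratio_cut(A, [0, 0, 0, 0, 1, 1])  # cuts inside a triangle
```

The size terms in the denominator are what penalize degenerate tiny clusters; the paper's point is that its reconstruction term plays an analogous balancing role for the learned memberships.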

- Table1: Test graph generation comparison of different methods
- Table2: Cluster performance comparison of different methods
- Table3: Statistics of data sets used in graph clustering

Related work

- **Dirichlet VAEs.** Previous studies [3, 27, 32, 15] on VAEs have enabled the use of Dirichlet distributions as priors, mostly in the text domain. Two difficulties are commonly observed in these works: (1) the reparameterization trick is problematic when Dirichlet distributions are used [27], and (2) component collapsing, in which the model stays close to the prior belief [17]. Srivastava and Sutton [27] address the former with a softmax Laplace approximation [12] and the latter with a stack of training strategies, i.e., a higher learning rate, batch normalization and dropout. Joo et al. [15] use an inverse-Gamma approximation for the former and argue that there is no component collapsing in their model. Burkhardt and Kramer [3] apply rejection sampling variational inference for the former and propose sparse Dirichlet VAEs for the latter. Other approaches include Weibull distribution approximation [35] and Dirichlet stick-breaking priors [22].
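Difficulty (1) above stems from how Dirichlet samples are drawn: the standard route normalizes Gamma variates, and Gamma sampling (rejection-based) is not directly differentiable with respect to α, which is why the approximations surveyed here exist. A small sketch of the non-reparameterizable sampler, illustrative only:

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one sample from Dir(alpha) by normalizing independent
    Gamma(a_i, 1) variates. The Gamma draws use rejection sampling,
    so gradients cannot flow back to alpha through this path."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    s = sum(g)
    return [x / s for x in g]

rng = random.Random(0)
theta = sample_dirichlet([0.01] * 4, rng)  # sparse prior, as in topic models
```

With a small symmetric α most of the mass lands on one component, which is why sparse Dirichlet priors encourage near-one-hot cluster memberships.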

Funding

- Acknowledgments and Disclosure of Funding The work was supported by grants from the Research Grants Council of the Hong Kong Special Administrative Region, China [Project No.: CUHK 14205618], Tencent AI Lab RhinoBird Focused Research Program GF202005, and NSFC Grant No. U1936205

Study subjects and analysis


As shown in Figure 3, this training dynamic significantly relieves the posterior collapse problem. For visualization, the authors plot four graphs and their samples generated by DGVAE in Figure 2, with latent cluster dimension K = 3 and the number of edges in each sampled graph set equal to that of the input graph.

References

- David M Blei, Andrew Y Ng, and Michael I Jordan. “Latent dirichlet allocation”. In: Journal of Machine Learning Research 3.Jan (2003), pp. 993–1022.
- Thomas Bühler and Matthias Hein. “Spectral clustering based on the graph p-Laplacian”. In: Proceedings of the 26th Annual International Conference on Machine Learning (ICML). 2009, pp. 81–88.
- Sophie Burkhardt and Stefan Kramer. “Decoupling sparsity and smoothness in the dirichlet variational autoencoder topic model”. In: Journal of Machine Learning Research 20.131 (2019), pp. 1–27.
- Michaël Defferrard, Xavier Bresson, and Pierre Vandergheynst. “Convolutional neural networks on graphs with fast localized spectral filtering”. In: Advances in neural information processing systems. 2016, pp. 3844–3852.
- Claire Donnat, Marinka Zitnik, David Hallac, and Jure Leskovec. “Learning structural node embeddings via diffusion wavelets”. In: ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD). 2018, pp. 1320–1329.
- Justin Gilmer, Samuel S Schoenholz, Patrick F Riley, Oriol Vinyals, and George E Dahl. “Neural message passing for quantum chemistry”. In: Proceedings of the 34th International Conference on Machine Learning (ICML). JMLR. org. 2017, pp. 1263–1272.
- Aditya Grover and Jure Leskovec. “node2vec: Scalable feature learning for networks”. In: ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD). 2016, pp. 855– 864.
- Aditya Grover, Aaron Zweig, and Stefano Ermon. “Graphite: Iterative generative modeling of graphs”. In: The International Conference on Machine Learning (ICML). 2019, pp. 2434–2444.
- Lars Hagen and Andrew B Kahng. “New spectral methods for ratio cut partitioning and clustering”. In: IEEE transactions on computer-aided design of integrated circuits and systems 11.9 (1992), pp. 1074–1085.
- David K Hammond, Pierre Vandergheynst, and Rémi Gribonval. “Wavelets on graphs via spectral graph theory”. In: Applied and Computational Harmonic Analysis 30.2 (2011), pp. 129– 150.
- Junxian He, Daniel Spokoyny, Graham Neubig, and Taylor Berg-Kirkpatrick. “Lagging inference networks and posterior collapse in variational autoencoders”. In: The International Conference on Learning Representations (ICLR) (2019).
- Philipp Hennig, David Stern, Ralf Herbrich, and Thore Graepel. “Kernel topic models”. In: Artificial Intelligence and Statistics. 2012, pp. 511–519.
- Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. “beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework.” In: The International Conference on Learning Representations (ICLR) 2.5 (2017), p. 6.
- Wengong Jin, Regina Barzilay, and Tommi Jaakkola. “Junction tree variational autoencoder for molecular graph generation”. In: The International Conference on Machine Learning (ICML) (2018).
- Weonyoung Joo, Wonsung Lee, Sungrae Park, and Il-Chul Moon. “Dirichlet variational autoencoder”. In: arXiv preprint arXiv:1901.02739 (2019).
- Diederik P Kingma and Max Welling. “Auto-encoding variational bayes”. In: The International Conference on Learning Representations (ICLR). 2014.
- Durk P Kingma, Tim Salimans, Rafal Jozefowicz, Xi Chen, Ilya Sutskever, and Max Welling. “Improved variational inference with inverse autoregressive flow”. In: Advances in neural information processing systems. 2016, pp. 4743–4751.
- Thomas N Kipf and Max Welling. “Variational Graph Auto-Encoders”. In: Conference on Neural Information Processing Systems (NeurIPS) Workshop on Bayesian Deep Learning. 2016.
- Thomas N. Kipf and Max Welling. “Semi-Supervised Classification with Graph Convolutional Networks”. In: The International Conference on Learning Representations (ICLR). 2017.
- Qimai Li, Zhichao Han, and Xiao-Ming Wu. “Deeper insights into graph convolutional networks for semi-supervised learning”. In: The AAAI Conference on Artificial Intelligence (AAAI). 2018.
- Qi Liu, Miltiadis Allamanis, Marc Brockschmidt, and Alexander Gaunt. “Constrained graph variational autoencoders for molecule design”. In: Conference on Neural Information Processing Systems (NeurIPS). 2018, pp. 7795–7804.
- Eric Nalisnick and Padhraic Smyth. “Stick-breaking variational autoencoders”. In: arXiv preprint arXiv:1605.06197 (2016).
- Hoang NT and Takanori Maehara. “Revisiting Graph Neural Networks: All We Have is Low-Pass Filters”. In: arXiv preprint arXiv:1905.09550 (2019).
- P. Sen, G. Namata, M. Bilgic, L. Getoor, B. Galligher, and T. Eliassi-Rad. “Collective classification in network data”. In: AI magazine 29.3 (2008), pp. 93–106.
- Jianbo Shi and Jitendra Malik. “Normalized cuts and image segmentation”. In: IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) (2000), p. 107.
- David I Shuman, Sunil K Narang, Pascal Frossard, Antonio Ortega, and Pierre Vandergheynst. “The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains”. In: IEEE signal processing magazine 30.3 (2013), pp. 83–98.
- Akash Srivastava and Charles Sutton. “Autoencoding variational inference for topic models”. In: The International Conference on Learning Representations (ICLR) (2017).
- Ulrike Von Luxburg. “A tutorial on spectral clustering”. In: Statistics and computing 17.4 (2007), pp. 395–416.
- Dorothea Wagner and Frank Wagner. “Between min cut and graph bisection”. In: International Symposium on Mathematical Foundations of Computer Science. Springer. 1993, pp. 744–750.
- Song Wang and Jeffrey Mark Siskind. “Image segmentation with ratio cut”. In: IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) 25.6 (2003), pp. 675–690.
- Felix Wu, Amauri Souza, Tianyi Zhang, Christopher Fifty, Tao Yu, and Kilian Weinberger. “Simplifying Graph Convolutional Networks”. In: Proceedings of the 36th International Conference on Machine Learning (ICML). PMLR, 2019, pp. 6861–6871.
- Yijun Xiao, Tiancheng Zhao, and William Yang Wang. “Dirichlet variational autoencoder for text modeling”. In: arXiv preprint arXiv:1811.00135 (2018).
- Bingbing Xu, Huawei Shen, Qi Cao, Keting Cen, and Xueqi Cheng. “Graph Convolutional Networks using Heat Kernel for Semi-supervised Learning”. In: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19). 2019, pp. 1928–1934.
- Cheng Yang, Zhiyuan Liu, Deli Zhao, Maosong Sun, and Edward Chang. “Network representation learning with rich text information”. In: The International Joint Conference on Artificial Intelligence (IJCAI). 2015.
- Hao Zhang, Bo Chen, Dandan Guo, and Mingyuan Zhou. “WHAI: Weibull hybrid autoencoding inference for deep topic modeling”. In: arXiv preprint arXiv:1803.01328 (2018).
- Shengjia Zhao, Jiaming Song, and Stefano Ermon. “Infovae: Information maximizing variational autoencoders”. In: arXiv preprint arXiv:1706.02262 (2017).
