
# Gated Graph Sequence Neural Networks

International Conference on Learning Representations (ICLR), 2016


Abstract

Graph-structured data appears frequently in domains including chemistry, natural language semantics, social networks, and knowledge bases. In this work, we study feature learning techniques for graph-structured inputs. Our starting point is previous work on Graph Neural Networks (Scarselli et al., 2009), which we modify to use gated recurrent units and modern optimization techniques and then extend to output sequences.


Introduction

- Many practical applications build on graph-structured data, and the authors often want to perform machine learning tasks that take graphs as inputs.
- More closely related to the goal in this work are methods that learn features on graphs, including Graph Neural Networks (Gori et al, 2005; Scarselli et al, 2009), spectral networks (Bruna et al, 2013) and recent work on learning graph fingerprints for classification tasks on graph representations of chemical molecules (Duvenaud et al, 2015).
- Previous work on feature learning for graph-structured inputs has focused on models that produce single outputs such as graph-level classifications, but many problems with graph inputs require outputting sequences.
- A secondary contribution is highlighting that Graph Neural Networks are a broadly useful class of neural network model that is applicable to many problems currently facing the field.
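
As a concrete picture of what "taking a graph as input" means in this setting, the sketch below encodes a tiny directed graph the way GG-NN-style models typically consume one: one adjacency matrix per edge type plus a per-node annotation vector that seeds each node's hidden state. The edge-type names and annotation semantics here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# A toy directed graph with 4 nodes and one forward edge type.
# Each edge type gets its own adjacency matrix; nodes carry an
# annotation vector that initializes their hidden state.
num_nodes, annotation_dim = 4, 2

# adjacency[t][i, j] = 1 means an edge of type t from node i to node j.
adjacency = {
    "forward": np.array([[0, 1, 0, 0],
                         [0, 0, 1, 0],
                         [0, 0, 0, 1],
                         [0, 0, 0, 0]], dtype=float),
}
# Reverse edges are typically added so information can flow both ways.
adjacency["backward"] = adjacency["forward"].T

# Node annotations, e.g. marking a source node (0) and a goal node (3).
annotations = np.zeros((num_nodes, annotation_dim))
annotations[0, 0] = 1.0  # "is source"
annotations[3, 1] = 1.0  # "is goal"
```

For a task like path finding, a model would propagate these annotations along the typed edges and read the answer off the resulting node states.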

Highlights

- We describe Gated Graph Neural Networks (GG-NNs), our adaptation of Graph Neural Networks that is suitable for non-sequential outputs.
- We describe Gated Graph Sequence Neural Networks (GGS-NNs), in which several Gated Graph Neural Networks operate in sequence to produce an output sequence o^(1), ..., o^(K).
- We provide baselines to show that the symbolic representation does not help RNNs or LSTMs significantly, and show that Gated Graph Sequence Neural Networks solve the problem with a small number of training instances.
- The Gated Graph Neural Networks model can be seen as learning this, with results stored in the neural network weights.
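
The gated update at the heart of GG-NNs (replacing the original GNN recurrence with GRU-style gates) can be sketched roughly as follows. This is a minimal numpy approximation: the weight names (W, Wz, Uz, ...) and the single shared message matrix are simplifications for illustration, not the paper's exact parameterization, which uses per-edge-type matrices and bias terms.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ggnn_propagate(h, adj, params, steps=5):
    """GG-NN-style propagation: message passing followed by a GRU update.

    h: (num_nodes, dim) node states; adj: (num_nodes, num_nodes) adjacency.
    params: dict of (dim, dim) weight matrices (illustrative names).
    """
    W, Wz, Uz, Wr, Ur, Wh, Uh = (params[k] for k in "W Wz Uz Wr Ur Wh Uh".split())
    for _ in range(steps):                    # fixed number of unrolled steps
        a = adj.T @ (h @ W)                   # aggregate messages from in-neighbors
        z = sigmoid(a @ Wz + h @ Uz)          # update gate
        r = sigmoid(a @ Wr + h @ Ur)          # reset gate
        h_tilde = np.tanh(a @ Wh + (r * h) @ Uh)
        h = (1 - z) * h + z * h_tilde         # gated state update
    return h

rng = np.random.default_rng(0)
dim, n = 4, 3
params = {k: 0.1 * rng.standard_normal((dim, dim))
          for k in "W Wz Uz Wr Ur Wh Uh".split()}
adj = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]], dtype=float)
h0 = rng.standard_normal((n, dim))
h = ggnn_propagate(h0, adj, params)
print(h.shape)  # (3, 4)
```

The update gate z lets a node keep its old state when incoming messages are uninformative, which is exactly the mechanism credited with improving long-range information flow over the original GNN recurrence.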

Methods

- The authors produced a dataset of 327 formulas involving three program variables, with 498 graphs per formula, yielding around 160,000 formula/heap-graph combinations.
- The authors compared the GGS-NN-based model with a method the authors developed earlier (Brockschmidt et al, 2015).
- The earlier approach treats each prediction step as a standard classification problem and requires complex, manual, problem-specific feature engineering to achieve an accuracy of 89.11%.
- The authors' new model was trained with no feature engineering and very little domain knowledge, and achieved an accuracy of 89.96%.

Results

- The authors demonstrate the capabilities on some simple AI and graph algorithm learning tasks.
- The bAbI Task 19 (Path Finding) is arguably the hardest of all bAbI tasks; see, e.g., (Sukhbaatar et al, 2015), which reports an accuracy of less than 20% for all methods that do not use strong supervision.

Conclusion

- What is being learned? It is instructive to consider what the GG-NNs learn.
- The current GGS-NNs formulation specifies a question only after all the facts have been consumed.
- This implies that the network must try to derive all consequences of the seen facts and store all pertinent information within its node representations.
- This is likely not ideal; it would be preferable to develop methods that take the question as an initial input and dynamically derive the facts needed to answer it.
- The authors consider these graph neural networks to represent a step towards a model that can combine structured representations with the powerful algorithms of deep learning, with the aim of exploiting known structure while learning and inferring how to reason with and extend these representations.

- Table 1: Accuracy (in %) of different models on different tasks. The number in parentheses is the number of training examples required to reach the shown accuracy.
- Table 2: Performance breakdown of RNN and LSTM on bAbI task 4 as the amount of training data changes.
- Table 3: Accuracy (in %) of different models on different tasks. The number in parentheses is the number of training examples required to reach that level of accuracy.
- Table 4: Example list manipulation programs and the separation logic formula invariants the GGS-NN model found from a set of input graphs. The "=" parts are produced by a deterministic procedure that goes through all named program variables in all graphs and checks for inequality.

Related work

The most closely related work is GNNs, which we have discussed at length above. Micheli (2009) proposed another closely related model that differs from GNNs mainly in the output model. GNNs have been applied in several domains (Gori et al, 2005; Di Massa et al, 2006; Scarselli et al, 2009; Uwents et al, 2011), but they do not appear to be in widespread use in the ICLR community. Part of our aim here is to publicize GNNs as a useful and interesting neural network variant. An analogy can be drawn between our adaptation from GNNs to GG-NNs and the work of Domke (2011) and Stoyanov et al (2011) in the structured prediction setting. There, belief propagation (which must be run to near convergence to get good gradients) is replaced with truncated belief propagation updates, and the model is then trained so that the truncated iteration produces good results after a fixed number of iterations. Similarly, the extension of Recursive Neural Networks (Goller & Kuchler, 1996; Socher et al, 2011) to Tree LSTMs (Tai et al, 2015) is analogous to our use of GRU updates in GG-NNs instead of the standard GNN recurrence, with the aim of improving the long-term propagation of information across a graph structure.
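
The truncated-iteration analogy can be made concrete with a toy propagation operator: instead of iterating to (near) convergence as in the original GNN and belief-propagation settings, the model runs a small fixed number of updates and is trained through the unrolled steps. The linear-plus-tanh update below is a stand-in, not either model's actual recurrence.

```python
import numpy as np

def propagate(h, A, T):
    """Run T synchronous message-passing updates (a toy contraction map)."""
    for _ in range(T):
        h = np.tanh(A @ h)      # one update; real GG-NNs use a GRU here
    return h

A = np.array([[0.0, 0.5],
              [0.5, 0.0]])      # contractive, so iteration has a fixed point
h0 = np.array([[1.0], [0.0]])

h_truncated = propagate(h0, A, T=4)    # fixed, differentiable unrolling
h_converged = propagate(h0, A, T=200)  # essentially at the fixed point
```

The truncated result differs from the converged one, but because T is fixed the whole computation is an ordinary feed-forward graph, so it can be trained end-to-end with backpropagation rather than the fixed-point gradient schemes required by convergence-based models.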

References

- Almeida, Luis B. A learning rule for asynchronous perceptrons with feedback in a combinatorial environment. In Artificial Neural Networks, pp. 102–111. IEEE Press, 1990.
- Bahdanau, Dzmitry, Cho, Kyunghyun, and Bengio, Yoshua. Neural machine translation by jointly learning to align and translate. CoRR, abs/1409.0473, 2014.
- Bottou, Leon. From machine learning to machine reasoning. Machine learning, 94(2):133–149, 2014.
- Brockschmidt, Marc, Chen, Yuxin, Cook, Byron, Kohli, Pushmeet, and Tarlow, Daniel. Learning to decipher the heap for program verification. In Workshop on Constructive Machine Learning at the International Conference on Machine Learning (CML@ICML), 2015.
- Bruna, Joan, Zaremba, Wojciech, Szlam, Arthur, and LeCun, Yann. Spectral networks and locally connected networks on graphs. arXiv preprint arXiv:1312.6203, 2013.
- Cho, Kyunghyun, Van Merrienboer, Bart, Gulcehre, Caglar, Bahdanau, Dzmitry, Bougares, Fethi, Schwenk, Holger, and Bengio, Yoshua. Learning phrase representations using rnn encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078, 2014.
- Di Massa, Vincenzo, Monfardini, Gabriele, Sarti, Lorenzo, Scarselli, Franco, Maggini, Marco, and Gori, Marco. A comparison between recursive neural networks and graph neural networks. In International Joint Conference on Neural Networks (IJCNN), pp. 778–785. IEEE, 2006.
- Domke, Justin. Parameter learning with truncated message-passing. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2937–2943. IEEE, 2011.
- Duvenaud, David, Maclaurin, Dougal, Aguilera-Iparraguirre, Jorge, Gomez-Bombarelli, Rafael, Hirzel, Timothy, Aspuru-Guzik, Alan, and Adams, Ryan P. Convolutional networks on graphs for learning molecular fingerprints. arXiv preprint arXiv:1509.09292, 2015.
- Goller, Christoph and Kuchler, Andreas. Learning task-dependent distributed representations by backpropagation through structure. In IEEE International Conference on Neural Networks, volume 1, pp. 347–352. IEEE, 1996.
- Gori, Marco, Monfardini, Gabriele, and Scarselli, Franco. A new model for learning in graph domains. In International Joint Conference on Neural Networks (IJCNN), volume 2, pp. 729–734. IEEE, 2005.
- Hammer, Barbara and Jain, Brijnesh J. Neural methods for non-standard data. In European Symposium on Artificial Neural Networks (ESANN), 2004.
- Hinton, Geoffrey E. Representing part-whole hierarchies in connectionist networks. In Proceedings of the Tenth Annual Conference of the Cognitive Science Society, pp. 48–54. Erlbaum., 1988.
- Hoare, Charles Antony Richard. An axiomatic basis for computer programming. Communications of the ACM, 12(10):576–580, 1969.
- Kashima, Hisashi, Tsuda, Koji, and Inokuchi, Akihiro. Marginalized kernels between labeled graphs. In Proceedings of the International Conference on Machine Learning, volume 3, pp. 321–328, 2003.
- Kingma, Diederik and Ba, Jimmy. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
- Kumar, Ankit, Irsoy, Ozan, Su, Jonathan, Bradbury, James, English, Robert, Pierce, Brian, Ondruska, Peter, Gulrajani, Ishaan, and Socher, Richard. Ask me anything: Dynamic memory networks for natural language processing. arXiv preprint arXiv:1506.07285, 2015.
- Lusci, Alessandro, Pollastri, Gianluca, and Baldi, Pierre. Deep architectures and deep learning in chemoinformatics: the prediction of aqueous solubility for drug-like molecules. J Chem Inf Model, 2013.
- Micheli, Alessio. Neural network for graphs: A contextual constructive approach. IEEE Transactions on Neural Networks, 20(3):498–511, 2009.
- O’Hearn, Peter, Reynolds, John C., and Yang, Hongseok. Local reasoning about programs that alter data structures. In 15th International Workshop on Computer Science Logic (CSL’01), pp. 1–19, 2001.
- Perozzi, Bryan, Al-Rfou, Rami, and Skiena, Steven. Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 701–710. ACM, 2014.
- Pineda, Fernando J. Generalization of back-propagation to recurrent neural networks. Physical review letters, 59(19):2229, 1987.
- Piskac, Ruzica, Wies, Thomas, and Zufferey, Damien. GRASShopper - complete heap verification with mixed specifications. In 20th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS'14), pp. 124–139, 2014.
- Reynolds, John C. Separation logic: A logic for shared mutable data structures. In 7th IEEE Symposium on Logic in Computer Science (LICS’02), pp. 55–74, 2002.
- Scarselli, Franco, Gori, Marco, Tsoi, Ah Chung, Hagenbuchner, Markus, and Monfardini, Gabriele. The graph neural network model. IEEE Transactions on Neural Networks, 20(1):61–80, 2009.
- Shervashidze, Nino, Schweitzer, Pascal, Van Leeuwen, Erik Jan, Mehlhorn, Kurt, and Borgwardt, Karsten M. Weisfeiler-lehman graph kernels. The Journal of Machine Learning Research, 12: 2539–2561, 2011.
- Socher, Richard, Lin, Cliff C, Manning, Chris, and Ng, Andrew Y. Parsing natural scenes and natural language with recursive neural networks. In Proceedings of the 28th international conference on machine learning (ICML-11), pp. 129–136, 2011.
- Sperduti, Alessandro and Starita, Antonina. Supervised neural networks for the classification of structures. IEEE Transactions on Neural Networks, 8(3):714–735, 1997.
- Stoyanov, Veselin, Ropson, Alexander, and Eisner, Jason. Empirical risk minimization of graphical model parameters given approximate inference, decoding, and model structure. In International Conference on Artificial Intelligence and Statistics, pp. 725–733, 2011.
- Sukhbaatar, Sainbayar, Szlam, Arthur, Weston, Jason, and Fergus, Rob. End-to-end memory networks. arXiv preprint arXiv:1503.08895, 2015.
- Tai, Kai Sheng, Socher, Richard, and Manning, Christopher D. Improved semantic representations from tree-structured long short-term memory networks. arXiv preprint arXiv:1503.00075, 2015.
- Uwents, Werner, Monfardini, Gabriele, Blockeel, Hendrik, Gori, Marco, and Scarselli, Franco. Neural networks for relational learning: an experimental comparison. Machine Learning, 82(3):315–349, 2011.
- Vinyals, Oriol, Fortunato, Meire, and Jaitly, Navdeep. Pointer networks. arXiv preprint arXiv:1506.03134, 2015.
- Weston, Jason, Bordes, Antoine, Chopra, Sumit, and Mikolov, Tomas. Towards ai-complete question answering: a set of prerequisite toy tasks. arXiv preprint arXiv:1502.05698, 2015.
