An Explicitly Relational Neural Network Architecture

Kyriacos Nikiforou
Antonia Creswell
Christos Kaplanis
David Barrett
Marta Garnelo

ICML 2020.

Keywords:
principal component analysis, reusable representation, general purpose, reinforcement learning, novel task

Abstract:

With a view to bridging the gap between deep learning and symbolic AI, we present a novel end-to-end neural network architecture that learns to form propositional representations with an explicitly relational structure from raw pixel data. In order to evaluate and analyse the architecture, we introduce a family of simple visual relational reasoning tasks…

Introduction
  • When humans face novel problems, they are able to draw effectively on past experience with other problems that are superficially very different, but that have similarities on a more abstract, structural level
  • This ability is essential for lifelong, continual learning, and confers on humans a degree of data efficiency, powers of transfer learning, and a capacity for out-of-distribution generalisation that contemporary machine learning has yet to match [10, 19, 20, 29].
  • In a system that learned such general-purpose, reusable representations, all learning (with the exception of the very first representations the system acquires) would in effect be transfer learning, and the process of learning would be inherently cumulative, continual, and lifelong
Highlights
  • When humans face novel problems, they are able to draw effectively on past experience with other problems that are superficially very different, but that have similarities on a more abstract, structural level
  • We report the results of a number of experiments using these datasets that demonstrate the potential of an explicitly relational network architecture to improve data efficiency and generalisation, to facilitate transfer, and to learn reusable representations
  • A pair of xy co-ordinates is appended to each convolutional neural network feature vector, denoting its position in convolved image space and, where applicable, a one-hot task identifier is appended to the output of the central module
  • We identified the heads in the PrediNet that attended to both objects in the image and found that they overlapped almost entirely with those that meaningfully clustered the labels (Fig. S10); a sketch of this per-head PCA analysis follows this list
  • We have presented a neural network architecture capable, in principle, of supporting predicate logic’s powers of abstraction without compromising the ideal of end-to-end learning, where the network itself discovers objects and relations in the raw data and avoids the symbol grounding problem entailed by symbolic AI’s practice of hand-crafting representations [13]
  • On the ‘xoccurs’ task, the PrediNet out-performs the baselines by more than 10%, and on the ‘colour / shape’ task, it out-performs all the baselines except MHA by 25% or more
  • One important respect in which the PrediNet differs from other network architectures is the extent to which it canalises information flow; at the core of the network, information is organised into small chunks which are processed in parallel channels that limit the ways the chunks can interact
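The label-clustering analysis referred to above (described under Summary below as principal component analysis on each head's output vectors) can be sketched in a few lines. The following is a minimal NumPy illustration under assumed array shapes, not the authors' analysis code; the function name pca_2d and the variable per_head_outputs are hypothetical.

    import numpy as np

    def pca_2d(head_outputs):
        """Project one head's output vectors onto their first two principal
        components (computed via SVD of the centred data), so the resulting
        2-D points can be scatter-plotted and inspected for clustering by
        relation label.

        head_outputs: array of shape (N, D), one D-dimensional output vector
        per input image for a single head of the central module."""
        centred = head_outputs - head_outputs.mean(axis=0, keepdims=True)
        _, _, vt = np.linalg.svd(centred, full_matrices=False)  # rows of vt are principal axes
        return centred @ vt[:2].T                                # shape (N, 2)

    # Hypothetical usage: run the projection separately for every head and
    # check whether the 2-D clouds separate according to the relation label.
    # for h, outputs in enumerate(per_head_outputs):   # each of shape (N, D)
    #     coords_h = pca_2d(outputs)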
Methods
  • Each architecture the authors consider in this paper comprises 1) a single convolutional input layer (CNN), 2) a central module, and 3) a small output multi-layer perceptron (MLP) (Fig. 3).
  • A pair of xy co-ordinates is appended to each CNN feature vector, denoting its position in convolved image space, and, where applicable, a one-hot task identifier is appended to the output of the central module (a minimal sketch of the coordinate-tagging step follows this list).
  • All use the same input CNN architecture and the same output MLP architecture, and differ only in the central module.
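The coordinate-tagging step mentioned above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation; the helper name append_xy_coords and the array shapes are assumptions for illustration.

    import numpy as np

    def append_xy_coords(feature_map):
        """Append (x, y) position channels to each CNN feature vector.

        feature_map: array of shape (H, W, C) -- the convolved image, where
        each of the H*W spatial locations holds a C-dimensional feature
        vector. Returns an array of shape (H*W, C + 2) in which every feature
        vector is tagged with its location in convolved image space
        (normalised to [0, 1])."""
        H, W, C = feature_map.shape
        ys, xs = np.meshgrid(
            np.linspace(0.0, 1.0, H),
            np.linspace(0.0, 1.0, W),
            indexing="ij",
        )
        coords = np.stack([xs, ys], axis=-1)                      # (H, W, 2)
        tagged = np.concatenate([feature_map, coords], axis=-1)   # (H, W, C + 2)
        return tagged.reshape(H * W, C + 2)

    # Example: a 5x5 feature map with 32 channels yields 25 vectors of length 34.
    L = append_xy_coords(np.random.rand(5, 5, 32))
    assert L.shape == (25, 34)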
Results
  • As a prelude to investigating the issues of generality and reusability, the authors studied the data efficiency of the PrediNet architecture in a single-task Relations Game setting.
  • Results obtained on a selection of five tasks – ‘same’, ‘between’, ‘occurs’, ‘xoccurs’, and ‘colour / shape’ – are summarised in Table 1.
  • Table 1 shows the accuracy obtained by each of the five architectures after 100,000 batches when tested on the two held-out object sets.
  • The PrediNet is the only architecture that achieves over 90% accuracy on all tasks with both held-out object sets after 100,000 batches.
  • On the ‘xoccurs’ task, the PrediNet out-performs the baselines by more than 10%, and on the ‘colour / shape’ task (where chance is 25%), it out-performs all the baselines except MHA by 25% or more
Conclusion
  • Conclusion and Further Work: The authors have presented a neural network architecture capable, in principle, of supporting predicate logic’s powers of abstraction without compromising the ideal of end-to-end learning, where the network itself discovers objects and relations in the raw data and avoids the symbol grounding problem entailed by symbolic AI’s practice of hand-crafting representations [13].
  • One important respect in which the PrediNet differs from other network architectures is the extent to which it canalises information flow; at the core of the network, information is organised into small chunks which are processed in parallel channels that limit the ways the chunks can interact
  • The authors believe this pressures the network to learn representations where each separate chunk of information has independent meaning and utility.
  • The findings reported here are just the first foray into unexplored architectural territory, and much work needs to be done to gauge the architecture’s full potential
Summary
  • When humans face novel problems, they are able to draw effectively on past experience with other problems that are superficially very different, but that have similarities on a more abstract, structural level.
  • We present an architecture, which we call a PrediNet, that learns representations whose parts map directly onto propositions, relations, and objects.
  • We report the results of a number of experiments using these datasets that demonstrate the potential of an explicitly relational network architecture to improve data efficiency and generalisation, to facilitate transfer, and to learn reusable representations.
  • For a given input L, each head h computes the same set of relations but selects a different pair of objects, using dot-product attention based on key-query matching [31].
  • MHA comprises multiple heads, each of which generates mappings from the input feature vectors to sets of keys K, queries Q, and values V, and computes softmax(QKᵀ)V (a minimal attention sketch follows this list).
  • To assess the generality and reusability of the representations produced by the PrediNet, we adopted a three-stage experimental protocol wherein 1) the network is pre-trained on a curriculum of one or more tasks, 2) the weights in the input CNN and PrediNet are frozen while the weights in the output MLP are re-initialised with random values, and 3) the network is retrained on a new target task or set of tasks (Fig. 3).
  • To gain some insight into how the PrediNet encodes relations, we carried out principal component analysis (PCA) on the output vectors of each head of the central module for a number of trained networks, again in the single-task setting (Fig. 6b).
  • For the other baselines, which lack the multi-head organisation of the PrediNet and the MHA network, the only option is to carry out PCA on the whole output vector of the central module.
  • We have presented a neural network architecture capable, in principle, of supporting predicate logic’s powers of abstraction without compromising the ideal of end-to-end learning, where the network itself discovers objects and relations in the raw data and avoids the symbol grounding problem entailed by symbolic AI’s practice of hand-crafting representations [13].
  • Our empirical results support the view that a network architecturally constrained to learn explicitly propositional, relational representations will have beneficial data efficiency, generalisation, and transfer properties.
  • Thanks to the structural priors of its architecture, representations generated by a PrediNet module have a natural semantics compatible with predicate calculus (Equation 1), which makes them an ideal medium for logic-like downstream processes such as rule-based deduction, causal or counterfactual reasoning, and inference to the best explanation.
  • The findings reported here are just the first foray into unexplored architectural territory, and much work needs to be done to gauge the architecture’s full potential
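To make the key-query matching mentioned in the summary above concrete, here is a single-head dot-product attention sketch over the list L of location-tagged feature vectors, following the generic softmax(QKᵀ)V formulation of [31]. The projection widths, function names, and random inputs are illustrative assumptions rather than the authors' PrediNet head, in which an attention mask of this kind is what selects a pair of objects whose features the head then compares.

    import numpy as np

    def softmax(x, axis=-1):
        # Numerically stable softmax along the given axis.
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def attention_head(L, W_q, W_k, W_v):
        """One head of dot-product attention over the input feature vectors.

        L:   (n, d)   -- n location-tagged CNN feature vectors
        W_q: (d, d_k), W_k: (d, d_k), W_v: (d, d_v) -- learned projections
        Returns softmax(Q Kᵀ) V, of shape (n, d_v)."""
        Q = L @ W_q
        K = L @ W_k
        V = L @ W_v
        return softmax(Q @ K.T) @ V

    # Illustrative shapes only: 25 feature vectors of length 34, keys/queries
    # of width 16, values of width 8.
    rng = np.random.default_rng(0)
    n, d, d_k, d_v = 25, 34, 16, 8
    L = rng.normal(size=(n, d))
    out = attention_head(L, rng.normal(size=(d, d_k)),
                            rng.normal(size=(d, d_k)),
                            rng.normal(size=(d, d_v)))
    assert out.shape == (n, d_v)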
Tables
  • Table 1: Data efficiency in a single-task Relations Game setting
  • Table 2: Default hyperparameters
Related work
  • The need for good representations has long been recognised in AI [21, 25], and is fundamental to deep learning [4]. The importance of reusability and abstraction, especially in the context of transfer, is emphasised by Bengio et al. [4], who argue for feature sets that are “invariant to the irrelevant features and disentangle the relevant features”. Our work here shares this motivation. Other work has looked at learning representations that are disentangled at the feature level [14, 16]. The novelty of the PrediNet is to incorporate architectural priors that favour representations that are disentangled at the relational and propositional levels. Previous work with relation nets and multi-head attention nets has shown how non-local information can be extracted from raw pixel data and used to solve tasks that require relational reasoning [26, 22, 27, 33], but unlike the PrediNet, these networks do not produce representations with an explicitly relational, propositional structure. By addressing the problem of acquiring structured representations, the PrediNet complements another thread of related work, which is concerned with learning how to carry out inference with structured representations, but which assumes the job of acquiring those representations is done elsewhere [12, 2, 24, 9, 8].
References
  • [1] Masataro Asai. “Unsupervised Grounding of Plannable First-Order Logic Representation from Images”. In: International Conference on Automated Planning and Scheduling. 2019.
  • [2] Peter W. Battaglia et al. “Interaction Networks for Learning about Objects, Relations and Physics”. In: Advances in Neural Information Processing Systems. 2016, pp. 4502–4510.
  • [3] Yoshua Bengio. “Deep learning of representations for unsupervised and transfer learning”. In: Proc. ICML Workshop on Unsupervised and Transfer Learning. Vol. 27. 2012, pp. 17–37.
  • [4] Yoshua Bengio, Aaron Courville, and Pascal Vincent. “Representation learning: A review and new perspectives”. In: IEEE Transactions on Pattern Analysis and Machine Intelligence 35.8 (2013), pp. 1798–1828.
  • [5] Yoshua Bengio et al. “Curriculum Learning”. In: Proc. 26th International Conference on Machine Learning. 2009, pp. 41–48.
  • [6] Antoine Bordes et al. “Learning structured embeddings of knowledge bases”. In: Proc. AAAI. 2011, pp. 301–306.
  • [7] Mostafa Dehghani et al. “Universal Transformers”. In: Proc. International Conference on Learning Representations. 2019.
  • [8] Honghua Dong et al. “Neural Logic Machines”. In: International Conference on Learning Representations. 2019.
  • [9] Richard Evans and Edward Grefenstette. “Learning explanatory rules from noisy data”. In: Journal of Artificial Intelligence Research 61 (2018), pp. 1–64.
  • [10] Marta Garnelo, Kai Arulkumaran, and Murray Shanahan. “Towards deep symbolic reinforcement learning”. In: arXiv preprint arXiv:1609.05518 (2016).
  • [11] Marta Garnelo and Murray Shanahan. “Reconciling deep learning with symbolic artificial intelligence: representing objects and relations”. In: Current Opinion in Behavioral Sciences 29 (2019), pp. 17–23.
  • [12] Lise Getoor and Ben Taskar, eds. Introduction to Statistical Relational Learning. MIT Press, 2007.
  • [13] Stevan Harnad. “The symbol grounding problem”. In: Physica D: Nonlinear Phenomena 42.1-3 (1990), pp. 335–346.
  • [14] Irina Higgins et al. “beta-VAE: Learning basic visual concepts with a constrained variational framework”. In: International Conference on Learning Representations. 2017.
  • [15] Irina Higgins et al. “DARLA: Improving Zero-Shot Transfer in Reinforcement Learning”. In: Proc. 34th International Conference on Machine Learning. 2017, pp. 1480–1490.
  • [16] Irina Higgins et al. “SCAN: Learning hierarchical compositional visual concepts”. In: International Conference on Learning Representations. 2018.
  • [17] Justin Johnson et al. “CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning”. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017, pp. 1988–1997.
  • [18] Ken Kansky et al. “Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics”. In: Proc. 34th International Conference on Machine Learning. Vol. 70. 2017, pp. 1809–1818.
  • [19] Brenden M. Lake et al. “Building machines that learn and think like people”. In: Behavioral and Brain Sciences 40 (2017), e253.
  • [20] Gary Marcus. “Deep learning: a critical appraisal”. In: arXiv preprint arXiv:1801.00631 (2018).
  • [21] John McCarthy. “Generality in artificial intelligence”. In: Communications of the ACM 30.12 (1987), pp. 1030–1035.
  • [22] Rasmus Berg Palm, Ulrich Paquet, and Ole Winther. “Recurrent relational networks”. In: Advances in Neural Information Processing Systems. 2018, pp. 3368–3378.
  • [23] Sébastien Racanière et al. “Imagination-Augmented Agents for Deep Reinforcement Learning”. In: Advances in Neural Information Processing Systems. 2017, pp. 5694–5705.
  • [24] Tim Rocktäschel and Sebastian Riedel. “End-to-end differentiable proving”. In: Advances in Neural Information Processing Systems. 2017, pp. 3788–3800.
  • [25] Stuart Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. 3rd ed. Prentice Hall Press, 2009.
  • [26] Adam Santoro et al. “A simple neural network module for relational reasoning”. In: Advances in Neural Information Processing Systems. 2017, pp. 4974–4983.
  • [27] Adam Santoro et al. “Relational recurrent neural networks”. In: Advances in Neural Information Processing Systems. 2018, pp. 7299–7310.
  • [28] Jonathan Schwarz et al. “Progress & Compress: A scalable framework for continual learning”. In: Proc. 35th International Conference on Machine Learning. Vol. 80. 2018, pp. 4528–4537.
  • [29] Brian Cantwell Smith. The Promise of Artificial Intelligence: Reckoning and Judgment. MIT Press, 2019.
  • [30] Richard Socher et al. “Reasoning with neural tensor networks for knowledge base completion”. In: Advances in Neural Information Processing Systems. 2013, pp. 926–934.
  • [31] Ashish Vaswani et al. “Attention is All you Need”. In: Advances in Neural Information Processing Systems. 2017, pp. 5998–6008.
  • [32] Xiaolong Wang et al. “Non-local Neural Networks”. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition. 2018.
  • [33] Vinicius Zambaldi et al. “Deep reinforcement learning with relational inductive biases”. In: International Conference on Learning Representations. 2019.