Hamiltonian Generative Networks

Aleksandar Botev
Andrew Jaegle
Sebastian Racaniere
Peter Toth

International Conference on Learning Representations, 2020.

Keywords:
Hamiltonian dynamics, normalising flows, generative model, physics

Abstract:

The Hamiltonian formalism plays a central role in classical and quantum physics. Hamiltonians are the main tool for modelling the continuous time evolution of systems with conserved quantities, and they come equipped with many useful properties, like time reversibility and smooth interpolation in time. These properties are important for many machine learning problems…

Introduction
  • Any system capable of a wide range of intelligent behaviours within a dynamic environment requires a good predictive model of the environment’s dynamics.
  • This is true for intelligence in both biological (Friston, 2009; 2010; Clark, 2013) and artificial (Hafner et al, 2019; Battaglia et al, 2013; Watter et al, 2015; Watters et al, 2019) systems.
  • After well over a century of development, the Hamiltonian formalism has proven to be essential for parsimonious descriptions of nearly all of physics.
Highlights
  • Any system capable of a wide range of intelligent behaviours within a dynamic environment requires a good predictive model of the environment’s dynamics
  • In order to directly compare the performance of Hamiltonian Generative Network to that of its closest baseline, Hamiltonian Neural Network, we generated four datasets analogous to the data used in Greydanus et al (2019)
  • We have presented Hamiltonian Generative Network, the first deep learning approach capable of reliably learning Hamiltonian dynamics from pixel observations
  • Hamiltonian dynamics have a number of useful properties that can be exploited more widely by the machine learning community
  • We have demonstrated the first step towards applying the learnt Hamiltonian dynamics as normalising flows for expressive yet computationally efficient density modelling
  • We hope that this work serves as the first step towards a wider adoption of the rich body of physics literature around the Hamiltonian principles in the machine learning community
Methods
  • The Hamiltonian formalism describes the continuous time evolution of a system in an abstract phase space s = (q, p) ∈ R2n, where q ∈ Rn is a vector of position coordinates, and p ∈ Rn is the corresponding vector of momenta.
  • The Hamiltonian for an undamped mass-spring system is H(q, p) = kq²/2 + p²/(2m); a minimal numerical sketch of these dynamics follows after this list.
  • The Hamiltonian can often be expressed as the sum of the kinetic T and potential V energies H = T (p)+V (q), as is the case for the mass-spring example.
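To make the formalism concrete, here is a minimal, illustrative sketch (not the paper's implementation) that integrates the mass-spring Hamiltonian with Hamilton's equations dq/dt = ∂H/∂p and dp/dt = −∂H/∂q, using the same kind of symplectic leapfrog updates the paper relies on for rollouts; the constants k, m and the step size dt are arbitrary choices:

```python
import numpy as np

# Hamilton's equations: dq/dt = dH/dp, dp/dt = -dH/dq.
# For the undamped mass-spring system H(q, p) = k*q**2/2 + p**2/(2*m),
# dH/dp = p/m and dH/dq = k*q.

k, m = 2.0, 0.5                    # illustrative spring constant and mass

def hamiltonian(q, p):
    return 0.5 * k * q**2 + p**2 / (2.0 * m)

def leapfrog(q, p, dt, n_steps):
    """Symplectic leapfrog (kick-drift-kick) rollout of the phase-space state."""
    traj = [(q, p)]
    for _ in range(n_steps):
        p = p - 0.5 * dt * k * q   # half step in momentum: -dH/dq
        q = q + dt * p / m         # full step in position:  dH/dp
        p = p - 0.5 * dt * k * q   # second half step in momentum
        traj.append((q, p))
    return np.array(traj)

traj = leapfrog(q=1.0, p=0.0, dt=0.1, n_steps=30)
energies = np.array([hamiltonian(q, p) for q, p in traj])
print("energy drift over 30 steps:", energies.max() - energies.min())
```

Because leapfrog is symplectic, the energy drift stays bounded over the rollout, which is exactly the property the Hamiltonian-variance metric in Table 2 probes for the learned models.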
Results
  • In order to directly compare the performance of HGN to that of its closest baseline, HNN, the authors generated four datasets analogous to the data used in Greydanus et al (2019).
  • In order to generate each trajectory, the authors first randomly sampled an initial state, produced a 30-step rollout following the ground-truth Hamiltonian dynamics, then added Gaussian noise with σ² = 0.1 to each phase-space coordinate and rendered a corresponding 64×64 pixel observation (a minimal sketch of this noise-and-render step follows after this list).
  • Note that the pendulum dataset is more challenging than the one described in Greydanus et al (2019), where the pendulum had a fixed radius and was initialized at a maximum angle of 30° from the central axis.
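As a rough illustration of that data pipeline (not the paper's renderer), the sketch below perturbs a phase-space state with Gaussian noise and rasterises the object position as a soft circular blob in a 64×64 image; the blob radius, coordinate range and noise level are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def render_frame(q, size=64, radius=3.0):
    """Render a 2D position q = (x, y) in [-1, 1]^2 as a circular blob."""
    ys, xs = np.mgrid[0:size, 0:size]
    # map q from [-1, 1] to pixel coordinates
    cx = (q[0] + 1.0) * 0.5 * (size - 1)
    cy = (q[1] + 1.0) * 0.5 * (size - 1)
    dist2 = (xs - cx) ** 2 + (ys - cy) ** 2
    return np.exp(-dist2 / (2.0 * radius**2))      # soft circle, values in [0, 1]

def noisy_observation(q, p, sigma2=0.1):
    """Add Gaussian noise to the phase-space coordinates, then render the position."""
    noise_std = np.sqrt(sigma2)
    q_noisy = q + rng.normal(scale=noise_std, size=q.shape)
    p_noisy = p + rng.normal(scale=noise_std, size=p.shape)   # kept for completeness
    return render_frame(q_noisy), (q_noisy, p_noisy)

frame, _ = noisy_observation(q=np.array([0.3, -0.2]), p=np.zeros(2))
print(frame.shape)   # (64, 64)
```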
Conclusion
  • The authors have presented HGN, the first deep learning approach capable of reliably learning Hamiltonian dynamics from pixel observations.
  • Hamiltonian dynamics have a number of useful properties that can be exploited more widely by the machine learning community.
  • The time evolution along these phase-space paths is completely reversible.
  • These properties can have wide implications in such areas of machine learning as reinforcement learning, representation learning and generative modelling.
  • The authors hope that this work serves as the first step towards a wider adoption of the rich body of physics literature around the Hamiltonian principles in the machine learning community.
Tables
  • Table 1: Average pixel MSE over a 30-step unroll on the train and test data on four physical systems. All values are multiplied by 1e+4. We evaluate two versions of the Hamiltonian Neural Network (HNN) (Greydanus et al, 2019): the original architecture and a convolutional version closely matched to the architecture of HGN. We also compare four versions of our proposed Hamiltonian Generative Network (HGN): the full version, a version trained and tested with an Euler rather than a leapfrog integrator, a deterministic rather than a generative version, and a version of HGN with no extra network between the posterior and the initial state.
  • Table 2: Variance of the Hamiltonian on four physical systems over single train and test rollouts shown in Fig. 6. The numbers reported are scaled by a factor of 1e+4. High variance indicates that the energy is not conserved by the learned Hamiltonian throughout the rollout. Many HNN Hamiltonians have collapsed to 0, as indicated by N/A. HGN Hamiltonians are meaningful, and different versions of HGN conserve energy to varying degrees (a sketch of how both table metrics are computed follows after these captions).
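Both table metrics are straightforward to compute from a rollout; a minimal sketch with assumed array shapes (not the paper's evaluation code) is:

```python
import numpy as np

def avg_pixel_mse(pred_frames, true_frames):
    """Average pixel MSE over a rollout; frames have shape (T, H, W) or (T, H, W, C)."""
    return float(np.mean((pred_frames - true_frames) ** 2)) * 1e4   # scaled as in Table 1

def hamiltonian_variance(energies):
    """Variance of the inferred Hamiltonian over a single rollout (shape (T,))."""
    return float(np.var(energies)) * 1e4                            # scaled as in Table 2

# illustrative dummy data: a 30-step rollout of 64x64 frames
T = 30
pred = np.random.rand(T, 64, 64)
true = np.random.rand(T, 64, 64)
print(avg_pixel_mse(pred, true), hamiltonian_variance(np.random.rand(T)))
```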
Related work
  • Most machine learning approaches to modelling dynamics use discrete time steps, which often results in an accumulation of approximation errors when producing rollouts and, therefore, in a fast drop in accuracy. Our approach, on the other hand, does not discretise continuous dynamics but models them directly using the Hamiltonian differential equations, which leads to slower divergence for longer rollouts. The density model version of HGN (NHF) uses the Hamiltonian dynamics as normalising flows along with a numerical integrator, making our approach somewhat related to the recently published neural ODE work (Chen et al, 2018; Grathwohl et al, 2018). What makes our approach different is that Hamiltonian dynamics are both invertible and volume-preserving (as discussed in Sec. 3.3), which makes it computationally cheaper than the alternatives and more suitable as a model of physical systems and other processes that have these properties; the worked change-of-variables equation below spells out why volume preservation matters. Also related is recent work attempting to learn a model of physical system dynamics end-to-end from image sequences using an autoencoder (de Avila Belbute-Peres et al, 2018). Unlike our work, this model does not exploit Hamiltonian dynamics and is trained in a supervised or semi-supervised regime.
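To make the computational advantage explicit: for a normalising flow x = f(z), the change-of-variables formula requires a log-determinant of the Jacobian, but a Hamiltonian (symplectic) map preserves phase-space volume, so that term is exactly zero. This is a standard identity, written out here for illustration rather than quoted from the paper:

```latex
% Change of variables for a flow x = f(z):
\log p_x(x) \;=\; \log p_z\!\left(f^{-1}(x)\right)
            \;+\; \log\left|\det \frac{\partial f^{-1}(x)}{\partial x}\right| .
% A Hamiltonian (symplectic) map preserves phase-space volume, so the
% determinant equals 1 and the log-det term vanishes:
\left|\det \frac{\partial f^{-1}(x)}{\partial x}\right| = 1
\quad\Longrightarrow\quad
\log p_x(x) = \log p_z\!\left(f^{-1}(x)\right).
```

In practice this means the density can be evaluated by running the dynamics backwards and evaluating the base density, with no Jacobian computation.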
Contributions
  • Introduces the Hamiltonian Generative Network (HGN), the first approach capable of consistently learning Hamiltonian dynamics from high-dimensional observations without restrictive domain assumptions.
  • Can use HGN to sample new trajectories, perform rollouts both forward and backward in time, and even speed up or slow down the learned dynamics. Demonstrates how a simple modification of the network architecture turns HGN into a powerful normalising flow model, called Neural Hamiltonian Flow (NHF), that uses Hamiltonian dynamics to model expressive densities.
  • Introduces the first model that answers both of these questions without relying on restrictive domain assumptions
  • Demonstrates that HGN is able to reliably learn the Hamiltonian dynamics from noisy pixel observations on four simulated physical systems: a pendulum, a mass-spring and two- and three-body systems.
  • Demonstrates that HGN produces meaningful samples with reversible dynamics and that the speed of rollouts can be controlled by changing the time step of the integrator at test time.
References
  • Peter W. Battaglia, Jessica B. Hamrick, and Joshua B. Tenenbaum. Simulation as an engine of physical scene understanding. PNAS, 110, 2013.
  • Ricky T. Q. Chen, Jens Behrmann, David Duvenaud, and Jörn-Henrik Jacobsen. Residual flows for invertible generative modeling. arXiv, 2019.
  • Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, and David K. Duvenaud. Neural ordinary differential equations. NeurIPS, 2018.
  • Andy Clark. Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences, 36(3), 2013.
  • Filipe de Avila Belbute-Peres, Kevin Smith, Kelsey Allen, Josh Tenenbaum, and J. Zico Kolter. End-to-end differentiable physics for learning and control. NeurIPS, 2018.
  • Laurent Dinh, Jascha Sohl-Dickstein, and Samy Bengio. Density estimation using Real NVP. ICLR, 2017.
  • Karl Friston. The free-energy principle: a rough guide to the brain? Trends in Cognitive Sciences, 13, 2009.
  • Karl Friston. The free-energy principle: a unified brain theory? Nature Reviews Neuroscience, 2010.
  • Herbert Goldstein. Classical Mechanics. Addison-Wesley Pub. Co., Reading, Mass., 1980.
  • Will Grathwohl, Ricky T. Q. Chen, Jesse Bettencourt, Ilya Sutskever, and David Duvenaud. FFJORD: Free-form continuous dynamics for scalable reversible generative models. ICLR, 2018.
  • Sam Greydanus, Misko Dzamba, and Jason Yosinski. Hamiltonian neural networks. NeurIPS, 2019.
  • Danijar Hafner, Timothy Lillicrap, Ian Fischer, Ruben Villegas, David Ha, Honglak Lee, and James Davidson. Learning latent dynamics for planning from pixels. PMLR, 2019.
  • William R. Hamilton. On a general method in dynamics. Philosophical Transactions of the Royal Society, II, pp. 247–308, 1834.
  • Irina Higgins, David Amos, David Pfau, Sebastien Racaniere, Loic Matthey, Danilo Rezende, and Alexander Lerchner. Towards a definition of disentangled representations. arXiv, 2018.
  • G. E. Hinton and R. R. Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, 313:504–507, 2006.
  • Emiel Hoogeboom, Rianne van den Berg, and Max Welling. Emerging convolutions for generative normalizing flows. ICML, 2019.
  • Chin-Wei Huang, David Krueger, Alexandre Lacoste, and Aaron Courville. Neural autoregressive flows. ICML, 2018.
  • Eric Jones, Travis Oliphant, Pearu Peterson, et al. SciPy: Open source scientific tools for Python, 2001. URL http://www.scipy.org/.
  • Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen. Progressive growing of GANs for improved quality, stability, and variation. ICLR, 2018.
  • Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
  • Diederik P. Kingma and Prafulla Dhariwal. Glow: Generative flow with invertible 1x1 convolutions. NeurIPS, pp. 10215–10224, 2018.
  • Diederik P. Kingma and Max Welling. Auto-encoding variational Bayes. ICLR, 2014.
  • Diederik P. Kingma, Tim Salimans, Rafal Jozefowicz, Xi Chen, Ilya Sutskever, and Max Welling. Improving variational inference with inverse autoregressive flow. NeurIPS, 2016.
  • Daniel Levy, Matthew D. Hoffman, and Jascha Sohl-Dickstein. Generalizing Hamiltonian Monte Carlo with neural networks. arXiv preprint arXiv:1711.09268, 2017.
  • Nicholas Watters, Loic Matthey, Matko Bosnjak, Christopher P. Burgess, and Alexander Lerchner. COBRA: Data-efficient model-based RL through unsupervised object discovery and curiosity-driven exploration. arXiv, 2019.
Implementation details
  • We use the Adam optimiser (Kingma & Ba, 2014) with learning rate 1.5e-4. When optimising the loss, in practice we do not learn the variance of the decoder pθ(x|s) and fix it to 1, which makes the reconstruction objective equivalent to a scaled L2 loss (see the short derivation below). Furthermore, we introduce a Lagrange multiplier in front of the KL term and optimise it using the same method as in Rezende & Viola (2018).
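To spell out why fixing the decoder variance to 1 turns the reconstruction objective into a scaled L2 loss (a standard Gaussian log-likelihood identity, shown here for clarity; x̂θ(s) denotes the decoder mean and is our notation, not the paper's):

```latex
% With the decoder variance fixed to 1 (identity covariance), the
% reconstruction term of the objective reduces to a scaled L2 loss:
-\log p_\theta(x \mid s)
  = -\log \mathcal{N}\!\left(x;\ \hat{x}_\theta(s),\ I\right)
  = \tfrac{1}{2}\,\lVert x - \hat{x}_\theta(s)\rVert_2^2 + \mathrm{const}.
```

The KL term is then weighted by the Lagrange multiplier mentioned above, which is adapted during training rather than fixed by hand.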
  • The Hamiltonian Neural Network (HNN) (Greydanus et al., 2019) learns a differentiable function H(q, p) that maps a system's state in phase space (its position q and momentum p) to a scalar quantity interpreted as the system's Hamiltonian. This model is trained so that H(q, p) satisfies Hamilton's equations by minimizing the discrepancy between the symplectic gradients of H and the observed time derivatives of the state, ||∂H/∂p − dq/dt||² + ||∂H/∂q + dp/dt||².
  • In the experiments presented here, we reimplemented the PixelHNN architecture as described in Greydanus et al. (2019) and trained it using the full loss (15). As in the original paper, we used a PixelHNN with HNN, encoder, and decoder subnetworks each parameterized by a multi-layer perceptron (MLP). The encoder and decoder MLPs use ReLU nonlinearities. Each consists of 4 layers, with 200 units in each hidden layer and an embedding of the same size as the true position and momentum of the system depicted (2 for mass-spring and pendulum, 8 for two-body, and 12 for three-body). The HNN MLP uses tanh nonlinearities and consists of two hidden layers with 200 units and a one-dimensional output (a minimal sketch of this HNN MLP follows below).
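As a rough, illustrative sketch of the HNN subnetwork's shape described above (a NumPy forward pass with placeholder random weights; the input dimension and initialisation are assumptions, not the trained baseline):

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random weights for an MLP with the given layer sizes (placeholder init)."""
    return [(rng.normal(scale=0.1, size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def hnn_mlp(params, state):
    """tanh MLP mapping a phase-space state (q, p) to a scalar Hamiltonian value."""
    h = state
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)
    W, b = params[-1]
    return (h @ W + b).squeeze()      # one-dimensional output

# e.g. mass-spring: state = (q, p) with n = 1, so input dimension 2
params = init_mlp([2, 200, 200, 1])
print(hnn_mlp(params, np.array([0.5, -0.3])))
```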
  • In the original paper, the PixelHNN model is trained using full-batch gradient descent. To make it more comparable to our approach, we train it here using stochastic gradient descent with minibatches of size 64 and around 15000 training steps. As in the original paper, we train the model using the Adam optimizer (Kingma & Ba, 2014) and a learning rate of 1e-3. As in the original paper, we produce rollouts of the model using a Runge-Kutta integrator (RK4). See Section A.6 for a description of RK4. Note that, as in the original paper, we use the more sophisticated algorithm implemented in scipy (scipy.integrate.solve_ivp) (Jones et al., 2001).
  • The datasets for the experiments described in Section 4 were generated in a similar manner to Greydanus et al. (2019) for comparative purposes. All of the datasets simulate the exact Hamiltonian dynamics of the underlying differential equation using the default scipy initial-value-problem solver (Jones et al., 2001). After creating a dataset of trajectories for each system, we render those trajectories into sequences of images in which the objects of each system are depicted as circles (a minimal sketch of the trajectory-generation step follows below).
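As a rough sketch of that trajectory-generation step (illustrative pendulum constants and time span, not the paper's exact data-generation code), the ground-truth dynamics can be integrated with scipy's default initial-value-problem solver:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Ideal pendulum Hamiltonian H(q, p) = p**2 / (2*m*l**2) + m*g*l*(1 - cos(q)),
# with illustrative constants; Hamilton's equations give the vector field below.
m, g, l = 1.0, 3.0, 1.0

def pendulum_field(t, state):
    q, p = state
    dq_dt = p / (m * l**2)           #  dH/dp
    dp_dt = -m * g * l * np.sin(q)   # -dH/dq
    return [dq_dt, dp_dt]

# one trajectory: random initial state, 30 evaluation points
rng = np.random.default_rng(0)
q0, p0 = rng.uniform(-1.0, 1.0, size=2)
t_eval = np.linspace(0.0, 3.0, 30)
sol = solve_ivp(pendulum_field, t_span=(0.0, 3.0), y0=[q0, p0], t_eval=t_eval)
trajectory = sol.y.T                 # shape (30, 2): 30 phase-space states
print(trajectory.shape)
```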