# Hamiltonian Generative Networks

International Conference on Learning Representations, 2020.

Keywords:

Hamiltonian dynamics, normalising flows, generative model, physics

Abstract:

The Hamiltonian formalism plays a central role in classical and quantum physics. Hamiltonians are the main tool for modelling the continuous time evolution of systems with conserved quantities, and they come equipped with many useful properties, like time reversibility and smooth interpolation in time. These properties are important for m…

Introduction

- Any system capable of a wide range of intelligent behaviours within a dynamic environment requires a good predictive model of the environment’s dynamics.
- This is true for intelligence in both biological (Friston, 2009; 2010; Clark, 2013) and artificial (Hafner et al, 2019; Battaglia et al, 2013; Watter et al, 2015; Watters et al, 2019) systems.
- After well over a century of development, the Hamiltonian formalism has proven essential for parsimonious descriptions of nearly all of physics.

Highlights

- Any system capable of a wide range of intelligent behaviours within a dynamic environment requires a good predictive model of the environment’s dynamics
- In order to directly compare the performance of Hamiltonian Generative Network to that of its closest baseline, Hamiltonian Neural Network, we generated four datasets analogous to the data used in Greydanus et al (2019)
- We have presented Hamiltonian Generative Network, the first deep learning approach capable of reliably learning Hamiltonian dynamics from pixel observations
- Hamiltonian dynamics have a number of useful properties that can be exploited more widely by the machine learning community
- We have demonstrated the first step towards applying the learnt Hamiltonian dynamics as normalising flows for expressive yet computationally efficient density modelling
- We hope that this work serves as the first step towards a wider adoption of the rich body of physics literature around the Hamiltonian principles in the machine learning community

Methods

- The Hamiltonian formalism describes the continuous time evolution of a system in an abstract phase space s = (q, p) ∈ R^{2n}, where q ∈ R^n is a vector of position coordinates, and p ∈ R^n is the corresponding vector of momenta.
- The Hamiltonian for an undamped mass-spring system is H(q, p) = kq²/2 + p²/(2m).
- The Hamiltonian can often be expressed as the sum of the kinetic T and potential V energies, H = T(p) + V(q), as is the case for the mass-spring example.
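For concreteness, Hamilton's equations for the mass-spring Hamiltonian above can be simulated directly. The constants k and m, the initial state, and the simple Euler integration below are illustrative choices, not values from the paper:

```python
import numpy as np

# Mass-spring Hamiltonian H(q, p) = k*q^2/2 + p^2/(2m);
# k and m are illustrative constants.
k, m = 1.0, 1.0

def hamiltonian(q, p):
    return k * q**2 / 2 + p**2 / (2 * m)

# Hamilton's equations: dq/dt = dH/dp, dp/dt = -dH/dq.
def dynamics(q, p):
    return p / m, -k * q

# Euler-integrate one period and check that the energy drifts only slightly.
q, p = 1.0, 0.0
dt = 1e-3
e0 = hamiltonian(q, p)
for _ in range(int(2 * np.pi / dt)):
    dq, dp = dynamics(q, p)
    q, p = q + dq * dt, p + dp * dt
print(abs(hamiltonian(q, p) - e0) < 1e-2)  # energy approximately conserved
```

After one full period the state also returns close to its starting point, which is the conserved, periodic behaviour the Hamiltonian structure guarantees.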

Results

- In order to directly compare the performance of HGN to that of its closest baseline, HNN, the authors generated four datasets analogous to the data used in Greydanus et al. (2019).
- To generate each trajectory, the authors first randomly sampled an initial state and produced a 30-step rollout following the ground-truth Hamiltonian dynamics, then added Gaussian noise with variance σ² = 0.1 to each phase-space coordinate and rendered a corresponding 64x64 pixel observation.
- Note that the pendulum dataset is more challenging than the one described in Greydanus et al. (2019), where the pendulum had a fixed radius and was initialized at a maximum angle of 30° from the central axis.
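The generation procedure above can be sketched as follows. The pendulum constants, the initial-state sampler, the time step, and the Euler integrator are stand-ins (the paper uses scipy's default initial-value solver and renders each state as a 64x64 image, both omitted here):

```python
import numpy as np

rng = np.random.default_rng(0)

def pendulum_dynamics(q, p, m=1.0, l=1.0, g=3.0):
    # Illustrative pendulum: dq/dt = dH/dp, dp/dt = -dH/dq.
    return p / (m * l**2), -m * g * l * np.sin(q)

def make_trajectory(steps=30, dt=0.1, noise_var=0.1):
    q, p = rng.uniform(-1.0, 1.0, size=2)  # stand-in initial-state sampler
    states = []
    for _ in range(steps):
        dq, dp = pendulum_dynamics(q, p)
        q, p = q + dq * dt, p + dp * dt    # Euler stand-in for the true solver
        # Gaussian noise with variance sigma^2 = 0.1 on each coordinate.
        states.append([q + rng.normal(0, np.sqrt(noise_var)),
                       p + rng.normal(0, np.sqrt(noise_var))])
    return np.array(states)                # rendering to pixels omitted

traj = make_trajectory()
print(traj.shape)  # (30, 2)
```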

Conclusion

- The authors have presented HGN, the first deep learning approach capable of reliably learning Hamiltonian dynamics from pixel observations.
- Hamiltonian dynamics have a number of useful properties that can be exploited more widely by the machine learning community.
- The time evolution along the learned phase-space paths is completely reversible.
- These properties can have wide implications in such areas of machine learning as reinforcement learning, representation learning and generative modelling.
- The authors hope that this work serves as the first step towards a wider adoption of the rich body of physics literature around the Hamiltonian principles in the machine learning community.


- Table 1: Average pixel MSE over a 30-step unroll on the train and test data on four physical systems. All values are multiplied by 1e+4. We evaluate two versions of the Hamiltonian Neural Network (HNN) (Greydanus et al., 2019): the original architecture and a convolutional version closely matched to the architecture of HGN. We also compare four versions of our proposed Hamiltonian Generative Network (HGN): the full version, a version trained and tested with an Euler rather than a leapfrog integrator, a deterministic rather than a generative version, and a version of HGN with no extra network between the posterior and the initial state.
- Table 2: Variance of the Hamiltonian on four physical systems over single train and test rollouts shown in Fig. 6. The numbers reported are scaled by a factor of 1e+4. High variance indicates that the energy is not conserved by the learned Hamiltonian throughout the rollout. Many HNN Hamiltonians have collapsed to 0, as indicated by N/A. HGN Hamiltonians are meaningful, and different versions of HGN conserve energy to varying degrees.

Related work

- Most machine learning approaches to modeling dynamics use discrete time steps, which often results in an accumulation of approximation errors when producing rollouts and, therefore, in a fast drop in accuracy. Our approach, on the other hand, does not discretise continuous dynamics and models them directly using the Hamiltonian differential equations, which leads to slower divergence for longer rollouts. The density model version of HGN (NHF) uses the Hamiltonian dynamics as normalising flows along with a numerical integrator, making our approach somewhat related to the recently published neural ODE work (Chen et al., 2018; Grathwohl et al., 2018). What makes our approach different is that Hamiltonian dynamics are both invertible and volume-preserving (as discussed in Sec. 3.3), which makes our approach computationally cheaper than the alternatives and more suitable as a model of physical systems and other processes that have these properties. Also related is recent work attempting to learn a model of physical system dynamics end-to-end from image sequences using an autoencoder (de Avila Belbute-Peres et al., 2018). Unlike our work, this model does not exploit Hamiltonian dynamics and is trained in a supervised or semi-supervised regime.
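The two properties that make Hamiltonian dynamics attractive as a normalising flow, invertibility and volume preservation, can be checked numerically on a toy system with a leapfrog (symplectic) integrator. The quadratic Hamiltonian, the step size, and the starting point below are illustrative choices:

```python
import numpy as np

def leapfrog(q, p, dt, dHdq, dHdp):
    # One leapfrog step: half-kick on p, full drift on q, half-kick on p.
    p = p - 0.5 * dt * dHdq(q)
    q = q + dt * dHdp(p)
    p = p - 0.5 * dt * dHdq(q)
    return q, p

# Separable toy Hamiltonian H = q^2/2 + p^2/2 (illustrative choice).
dHdq = lambda q: q
dHdp = lambda p: p

# Volume preservation: the Jacobian determinant of one step is 1, so a
# flow built from leapfrog steps needs no log-det correction term.
eps, dt = 1e-6, 0.1
q0, p0 = 0.3, -0.7
J = np.empty((2, 2))
for i, (dq, dp) in enumerate([(eps, 0.0), (0.0, eps)]):
    qa, pa = leapfrog(q0 + dq, p0 + dp, dt, dHdq, dHdp)
    qb, pb = leapfrog(q0 - dq, p0 - dp, dt, dHdq, dHdp)
    J[:, i] = [(qa - qb) / (2 * eps), (pa - pb) / (2 * eps)]
print(abs(np.linalg.det(J) - 1.0) < 1e-6)

# Invertibility: stepping with -dt exactly undoes the step.
q1, p1 = leapfrog(q0, p0, dt, dHdq, dHdp)
q2, p2 = leapfrog(q1, p1, -dt, dHdq, dHdp)
print(np.allclose([q2, p2], [q0, p0]))
```

The inverse needs no extra computation beyond negating the step size, which is what makes this cheaper than flows that must invert a learned map numerically.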

Contributions

- Introduces the Hamiltonian Generative Network (HGN), the first approach capable of consistently learning Hamiltonian dynamics from high-dimensional observations without restrictive domain assumptions.
- Can use HGN to sample new trajectories, perform rollouts both forward and backward in time, and even speed up or slow down the learned dynamics.
- Demonstrates how a simple modification of the network architecture turns HGN into a powerful normalising flow model, called Neural Hamiltonian Flow (NHF), that uses Hamiltonian dynamics to model expressive densities.
- Introduces the first model that answers both of these questions without relying on restrictive domain assumptions.
- Demonstrates that HGN is able to reliably learn the Hamiltonian dynamics from noisy pixel observations on four simulated physical systems: a pendulum, a mass-spring, and two- and three-body systems.
- Demonstrates that HGN produces meaningful samples with reversible dynamics and that the speed of rollouts can be controlled by changing the time derivative of the integrator at test time.
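The rollout-speed claim can be illustrated on a stand-in system: halving the integrator's time step and doubling the number of steps traverses (approximately) the same trajectory in finer temporal resolution, i.e. in slow motion. The harmonic-oscillator dynamics and Euler scheme here are illustrative, not HGN's learned dynamics:

```python
import numpy as np

def rollout(q, p, dt, steps):
    # Harmonic oscillator as a stand-in for learned dynamics:
    # dq/dt = p, dp/dt = -q, integrated with explicit Euler.
    for _ in range(steps):
        q, p = q + p * dt, p - q * dt
    return q, p

# Same total time T = 1.0 traversed at two "speeds": the slow-motion
# rollout (half the step, twice the steps) lands near the same state.
qa, pa = rollout(1.0, 0.0, 0.1, 10)
qb, pb = rollout(1.0, 0.0, 0.05, 20)
print(abs(qa - qb) < 0.05 and abs(pa - pb) < 0.05)
```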

Reference

- Peter W. Battaglia, Jessica B. Hamrick, and Joshua B. Tenenbaum. Simulation as an engine of physical scene understanding. PNAS, 110, 2013.
- Ricky T. Q. Chen, Jens Behrmann, David Duvenaud, and Jörn-Henrik Jacobsen. Residual flows for invertible generative modeling. arXiv, 2019.
- Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, and David K Duvenaud. Neural ordinary differential equations. In NeurIPS, 2018.
- Andy Clark. Whatever next? predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences, 36(3), 2013.
- Filipe de Avila Belbute-Peres, Kevin Smith, Kelsey Allen, Josh Tenenbaum, and J. Zico Kolter. End-to-end differentiable physics for learning and control. NeurIPS, 2018.
- Laurent Dinh, Jascha Sohl-Dickstein, and Samy Bengio. Density estimation using real nvp. ICLR, 2017.
- K. Friston. The free-energy principle: a rough guide to the brain? Trends in cognitive sciences, 13, 2009.
- Karl Friston. The free-energy principle: a unified brain theory? Nature Reviews Neuroscience, 2010.
- Herbert Goldstein. Classical Mechanics. Addison-Wesley Pub. Co, Reading, Mass., 1980.
- Will Grathwohl, Ricky T. Q. Chen, Jesse Bettencourt, Ilya Sutskever, and David Duvenaud. Ffjord: Free-form continuous dynamics for scalable reversible generative models. ICLR, 2018.
- Sam Greydanus, Misko Dzamba, and Jason Yosinski. Hamiltonian neural networks. NeurIPS, 2019.
- Danijar Hafner, Timothy Lillicrap, Ian Fischer, Ruben Villegas, David Ha, Honglak Lee, and James Davidson. Learning latent dynamics for planning from pixels. PMLR, 2019.
- William R. Hamilton. On a general method in dynamics. Philosophical Transactions of the Royal Society, II, pp. 247–308, 1834.
- Irina Higgins, David Amos, David Pfau, Sebastien Racaniere, Loic Matthey, Danilo Rezende, and Alexander Lerchner. Towards a definition of disentangled representations. arXiv, 2018.
- G. E. Hinton and R. R. Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, 313:504–507, 2006.
- Emiel Hoogeboom, Rianne van den Berg, and Max Welling. Emerging convolutions for generative normalizing flows. ICML, 2019.
- Chin-Wei Huang, David Krueger, Alexandre Lacoste, and Aaron Courville. Neural autoregressive flows. ICML, 2018.
- Eric Jones, Travis Oliphant, Pearu Peterson, et al. SciPy: Open source scientific tools for Python, 2001. URL http://www.scipy.org/.
- Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen. Progressive growing of gans for improved quality, stability, and variation. ICLR, 2018.
- Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
- Diederik P. Kingma and Prafulla Dhariwal. Glow: Generative flow with invertible 1x1 convolutions. In Advances in Neural Information Processing Systems, pp. 10215–10224, 2018.
- Diederik P. Kingma and Max Welling. Auto-encoding variational bayes. ICLR, 2014.
- Diederik P. Kingma, Tim Salimans, Rafal Jozefowicz, Xi Chen, Ilya Sutskever, and Max Welling. Improving variational inference with inverse autoregressive flow. NeurIPS, 2016.
- Daniel Levy, Matthew D Hoffman, and Jascha Sohl-Dickstein. Generalizing hamiltonian monte carlo with neural networks. arXiv preprint arXiv:1711.09268, 2017.
- Danilo Jimenez Rezende and Fabio Viola. Taming VAEs. arXiv preprint arXiv:1810.00597, 2018.
- Nicholas Watters, Loic Matthey, Matko Bosnjak, Christopher P. Burgess, and Alexander Lerchner. COBRA: Data-efficient model-based RL through unsupervised object discovery and curiosity-driven exploration. arXiv, 2019.
Implementation details

- We use the Adam optimiser (Kingma & Ba, 2014) with learning rate 1.5e-4. When optimising the loss, in practice we do not learn the variance of the decoder pθ(x|s) and fix it to 1, which makes the reconstruction objective equivalent to a scaled L2 loss. Furthermore, we introduce a Lagrange multiplier in front of the KL term and optimise it using the same method as in Rezende & Viola (2018).
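A schematic of this objective, assuming a diagonal-Gaussian posterior and a unit-variance decoder; the KL target and multiplier update rate are made-up illustrative values, not those of Rezende & Viola (2018):

```python
import numpy as np

def elbo_terms(x, x_recon, mu, logvar):
    # Decoder variance fixed to 1: -log p(x|s) reduces to a scaled L2 loss.
    recon = 0.5 * np.sum((x - x_recon) ** 2)
    # KL between diagonal Gaussian posterior N(mu, exp(logvar)) and N(0, I).
    kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)
    return recon, kl

# Schematic Lagrange-multiplier update on the KL term; target and rate
# are illustrative, not the paper's values.
def update_multiplier(lam, kl, kl_target=10.0, rate=1e-2):
    return max(0.0, lam + rate * (kl - kl_target))

x, x_recon = np.zeros(4), np.full(4, 0.1)
mu, logvar = np.zeros(4), np.zeros(4)
recon, kl = elbo_terms(x, x_recon, mu, logvar)
lam = update_multiplier(1.0, kl)
loss = recon + lam * kl
```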
- The Hamiltonian Neural Network (HNN) (Greydanus et al., 2019) learns a differentiable function H(q, p) that maps a system's state in phase space (its position q and momentum p) to a scalar quantity interpreted as the system's Hamiltonian. This model is trained so that H(q, p) satisfies the Hamiltonian equation by minimizing L_HNN = ‖∂H/∂p − dq/dt‖² + ‖∂H/∂q + dp/dt‖², i.e. the discrepancy between the symplectic gradients of H and the observed time derivatives of the state.
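The HNN objective can be sketched as follows. In practice ∂H/∂q and ∂H/∂p are obtained by automatic differentiation of the learned network; here they are written analytically for a toy quadratic Hamiltonian:

```python
import numpy as np

def hnn_loss(dHdq, dHdp, dq_dt, dp_dt):
    # Penalize deviation from Hamilton's equations:
    # dq/dt should equal dH/dp, and dp/dt should equal -dH/dq.
    return np.sum((dHdp - dq_dt) ** 2) + np.sum((dHdq + dp_dt) ** 2)

# Toy check with H(q, p) = (q^2 + p^2)/2, so dH/dq = q and dH/dp = p.
q, p = 0.5, -0.3
dq_dt, dp_dt = p, -q  # ground-truth derivatives from Hamilton's equations
print(hnn_loss(q, p, dq_dt, dp_dt))  # 0.0: exact dynamics give zero loss
```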
- In the experiments presented here, we reimplemented the PixelHNN architecture as described in Greydanus et al. (2019) and trained it using the full loss (15). As in the original paper, we used a PixelHNN with HNN, encoder, and decoder subnetworks each parameterized by a multi-layer perceptron (MLP). The encoder and decoder MLPs use ReLU nonlinearities. Each consists of 4 layers, with 200 units in each hidden layer and an embedding of the same size as the true position and momentum of the system depicted (2 for mass-spring and pendulum, 8 for two-body, and 12 for three-body). The HNN MLP uses tanh nonlinearities and consists of two hidden layers with 200 units and a one-dimensional output.
- In the original paper, the PixelHNN model is trained using full-batch gradient descent. To make it more comparable to our approach, we train it here using stochastic gradient descent using minibatches of size 64 and around 15000 training steps. As in the original paper, we train the model using the Adam optimizer (Kingma & Ba, 2014) and a learning rate of 1e-3. As in the original paper, we produce rollouts of the model using a Runge-Kutta integrator (RK4). See Section A.6 for a description of RK4. Note that, as in the original paper, we use the more sophisticated algorithm implemented in scipy (scipy.integrate.solve_ivp) (Jones et al., 2001).
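A minimal RK4 step, for reference (a schematic implementation, not the scipy routine the paper actually uses):

```python
import numpy as np

def rk4_step(f, y, t, dt):
    # Classic fourth-order Runge-Kutta step for dy/dt = f(t, y).
    k1 = f(t, y)
    k2 = f(t + dt / 2, y + dt / 2 * k1)
    k3 = f(t + dt / 2, y + dt / 2 * k2)
    k4 = f(t + dt, y + dt * k3)
    return y + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

# Sanity check on dy/dt = y, whose exact solution at t = 1 is e.
y = np.array([1.0])
for i in range(100):
    y = rk4_step(lambda t, y: y, y, i * 0.01, 0.01)
print(abs(y[0] - np.e))  # small: RK4's global error is O(dt^4)
```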
- The datasets for the experiments described in Section 4 were generated in a similar manner to Greydanus et al. (2019) for comparative purposes. All of the datasets simulate the exact Hamiltonian dynamics of the underlying differential equation using the default scipy initial value problem solver (Jones et al., 2001). After creating a dataset of trajectories for each system, we render those into a sequence of images. The system depicted in each dataset can be visualized by rendering circular objects: the pendulum, mass-spring, and two- and three-body systems.
