Physics-enhanced neural networks predict order and chaos

Conventional artificial neural networks are powerful tools in science and industry, but they can fail when applied to nonlinear systems where order and chaos coexist. We use neural networks that incorporate the structures and symmetries of Hamiltonian dynamics to predict phase space trajectories even as nonlinear systems transition from order to chaos. We demonstrate Hamiltonian neural networks on the canonical Hénon-Heiles system, which models diverse dynamics from astrophysics to chemistry. The power of the technique and the ubiquity of chaos suggest widespread utility.

Introduction.-Newton wrote, "My brain never hurt more than in my studies of the Moon (and Earth and Sun)" [1], thus anticipating that the seemingly simple three-body problem was intrinsically intractable. Nonetheless, Hamilton remarkably re-imagined Newton's laws as an incompressible energy conserving flow in phase space [2], and this formalism further highlighted the fundamental difference between integrable and nonintegrable systems [3], heralding the revolutionary concept of classical chaos [4,5].
Today, artificial neural networks are popular tools in industry and academia [6], especially for classification and regression problems, and are beginning to elucidate nonlinear dynamics [7] and fundamental physics [8][9][10]. Recent neural networks outperform traditional techniques in symbolic integration [11] and numerical integration [12] and outperform humans in strategy games like chess and Go [13]. But neural networks have a blind spot; they don't understand that "Clouds are not spheres, mountains are not cones, coastlines are not circles, . . . " [14]. They are unaware of the chaos and strange attractors of nonlinear dynamics, where exponentially separating trajectories bounded by finite energy repeatedly stretch and fold into complicated self-similar fractals. Their attempts to learn and predict nonlinear dynamics can be frustrated by ordered and chaotic orbits coexisting at the same energy for different initial positions and momenta.
Recent research [15,16] features artificial neural networks that incorporate Hamiltonian structure to learn simple dynamical systems, especially those with outputs proportional to inputs. But from stormy weather to swirling galaxies, most dynamical systems are nonlinear, exhibit far richer behavior, and pose additional challenges. In this Letter, we exploit the Hamiltonian structure of natural systems to provide neural networks with the physics intelligence needed to learn the mix of order and chaos that often characterizes natural phenomena. After reviewing Hamiltonian chaos and neural networks, we apply Hamiltonian neural networks to the Hénon-Heiles model, which describes both stellar [17,18] and molecular [19][20][21] dynamics. Even as these systems transition from order to chaos, Hamiltonian neural networks correctly predict their dynamics, overcoming deep learning's chaos blindness. Physics thereby enhances neural networks, and physics-savvy neural networks in turn will help scientists solve hard problems.

Hamiltonian chaos.-The Hamiltonian formalism describes phenomena from astronomical scales to nanoscales; even dissipative systems involving friction or viscosity are microscopically Hamiltonian. It reveals underlying structures in position-momentum phase space and reflects essential symmetries in physical systems. Its elegance stems from its geometric structure, where positions q and conjugate momenta p form a set of 2N canonical coordinates describing a physical system with N degrees of freedom. A single Hamiltonian function H[q, p, t] uniquely determines the time evolution of the system via the 2N coupled differential equations

{q̇, ṗ} = {dq/dt, dp/dt} = {+∂H/∂p, −∂H/∂q}, (1)

where the overdots are Newton's notation for time derivatives. For a time-independent system, the total energy is E = H[q, p], which for simple systems is the sum of the kinetic and the potential energies.
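As a concrete illustration of Eq. 1, the sketch below integrates the one-dimensional harmonic oscillator H = (q² + p²)/2 (an illustrative choice, not a system from this work) with a symplectic Euler update, whose energy error stays bounded rather than drifting:

```python
# Hamilton's equations for H = (q^2 + p^2)/2 read dq/dt = +dH/dp = p
# and dp/dt = -dH/dq = -q.  Symplectic Euler updates p first, then q,
# which preserves phase-space volume and keeps the energy error bounded.

def hamiltonian(q, p):
    return 0.5 * (q * q + p * p)

def symplectic_euler(q, p, dt):
    p = p - q * dt      # dp/dt = -dH/dq
    q = q + p * dt      # dq/dt = +dH/dp, using the updated p
    return q, p

q, p = 1.0, 0.0
E0 = hamiltonian(q, p)
for _ in range(100_000):
    q, p = symplectic_euler(q, p, dt=0.01)
energy_error = abs(hamiltonian(q, p) - E0)
print(energy_error)  # small and bounded even after 100,000 steps
```

A non-symplectic explicit Euler step would instead spiral outward, with the energy growing without bound.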
This classical formalism exhibits two contrasting motions. One is simple, predictable, near-integrable motion that suggests a "clockwork universe". Additional motion constants constrain the orbits to lie on low-dimensional Kolmogorov-Arnold-Moser (KAM) tori [22] of dimension N in the 2N-dimensional phase space, as in Fig. 1. Too much nonlinearity can cause adjacent smooth KAM tori to break up into infinitely intersecting fractal cantori [23], allowing the orbits to wander over the entire available phase space, constrained only by energy. Such systems can thereby also exhibit chaos, where the dynamics, though deterministic, is practically unpredictable due to the extreme sensitivity to initial conditions.

From the motion of stars around galactic centers to the vibrations of triatomic molecules, the Hénon-Heiles Hamiltonian [17] is a celebrated paradigm of nonlinear dynamics. It exhibits a transition between order and chaos via a mixed phase space where islands of order are embedded in a sea of chaos, one of the most challenging dynamical scenarios to identify and decipher. In a four-dimensional phase space {q, p} = {q_x, q_y, p_x, p_y}, its nondimensionalized Hamiltonian is the sum of the kinetic and potential energies, including quadratic harmonic terms perturbed by cubic nonlinearities that convert a circularly symmetric potential into a triangularly symmetric potential: explicitly, H = (p_x^2 + p_y^2)/2 + (q_x^2 + q_y^2)/2 + q_x^2 q_y − q_y^3/3. Bounded motion is possible in a triangular region of the {q_x, q_y} plane for energies 0 < E < 1/6. Figure 1 shows a low-energy orbit on a KAM torus and the corresponding Poincaré surface of section.

Neural networks.-While traditional analyses focus on forecasting orbits or understanding fractal structure, understanding the entire landscape of dynamical order and chaos requires new tools. Artificial neural networks are today widely used and studied partly because they can approximate any continuous function [24,25]. Recent efforts to apply them to chaotic dynamics involve the recurrent neural networks of reservoir computing [26][27][28]. We instead exploit the dominant feed-forward neural networks of deep learning [6].

FIG. 2. Neural network schematics. Weights (red lines) and biases (yellow spheres) connect inputs (green cubes) through neuron hidden layers (gray planes) to outputs (blue cubes). NN (top) has 2N inputs and 2N outputs. HNN (bottom) has 2N inputs and 1 output but internalizes the output's gradient in its weights and biases.
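To make the Hénon-Heiles dynamics concrete, here is a minimal sketch of its equations of motion integrated with a fourth-order Runge-Kutta step; the step size and the low-energy initial condition are illustrative choices:

```python
import numpy as np

def henon_heiles_H(s):
    # Nondimensionalized Henon-Heiles Hamiltonian.
    x, y, px, py = s
    return 0.5 * (px**2 + py**2) + 0.5 * (x**2 + y**2) + x**2 * y - y**3 / 3.0

def flow(s):
    # Hamilton's equations: (dx, dy, dpx, dpy)/dt.
    x, y, px, py = s
    return np.array([px, py, -x - 2.0 * x * y, -y - x**2 + y**2])

def rk4_step(s, dt):
    k1 = flow(s)
    k2 = flow(s + 0.5 * dt * k1)
    k3 = flow(s + 0.5 * dt * k2)
    k4 = flow(s + dt * k3)
    return s + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

# Illustrative low-energy initial condition inside the bounded basin.
s = np.array([0.0, 0.1, 0.35, 0.0])
E0 = henon_heiles_H(s)
for _ in range(10_000):
    s = rk4_step(s, dt=0.01)
drift = abs(henon_heiles_H(s) - E0)
print(E0, drift)  # E0 lies in (0, 1/6); the energy drift is tiny
```

Trajectories like this, sampled at a range of bounded energies, form the training and test data for the networks below.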
Inspired by natural neural networks, the activity a_l = σ[W_l a_{l−1} + b_l] of each layer l of a conventional feed-forward neural network is the nonlinear step or ramp of the linearly transformed activities of the previous layer, where σ is a vectorized nonlinear function that mimics the on-off activity of a natural neuron, a_l are activation vectors, and W_l and b_l are adjustable weight matrices and bias vectors that mimic the dendrite and axon connectivity of natural neurons. Concatenating multiple layers eliminates the hidden neuron activities, so the output y = f[x; W, b] is a nonlinear function of just the input x and the weights and biases. A training session inputs multiple x and adjusts the weights and biases to minimize the difference or "loss" L = (y_t − y)^2 between the target y_t and the output y, so the neural network learns the correspondence.
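The layer rule can be sketched in a few lines; the ReLU ramp nonlinearity and the layer sizes here are illustrative choices, and the final layer is left linear, a common convention for regression outputs:

```python
import numpy as np

def sigma(z):
    # Vectorized ramp (ReLU) nonlinearity, mimicking on-off neuron activity.
    return np.maximum(z, 0.0)

def forward(x, weights, biases):
    # a_l = sigma(W_l a_{l-1} + b_l), applied layer by layer;
    # the last layer is linear so the output can take any real value.
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = sigma(W @ a + b)
    return weights[-1] @ a + biases[-1]

rng = np.random.default_rng(0)
sizes = [4, 16, 16, 4]      # 2N inputs and 2N outputs for N = 2
weights = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]
y = forward(rng.normal(size=4), weights, biases)
print(y.shape)
```

Training adjusts `weights` and `biases` by gradient descent on the loss L; frameworks automate the gradients, but the forward pass is exactly this composition.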
Recently, neural networks have been proposed [15,16] that not only learn the dynamics of the system but also capture invariants and symmetries of the system, including its Hamiltonian phase space structure. The Fig. 2 conventional neural network (NN) intakes the positions and momenta {q, p}, outputs their rates of change, and adjusts its weights and biases to minimize the loss until it learns the correct mapping. In contrast, the Fig. 2 Hamiltonian neural network (HNN) intakes positions and momenta {q, p}, outputs the scalar function H, takes its gradient to find the position and momentum rates of change, and minimizes the loss that enforces Hamilton's equations of motion. For a given time step dt, each trained network can extrapolate a given initial condition with an Euler update {q, p} ← {q, p} + {q̇, ṗ} dt or some better integration scheme [29].

FIG. 3. Forecast orbits and Hénon-Heiles differential equations (right), for small, medium, and large bounded energies 0 < E < 1/6. Hues code momentum magnitudes, from red to violet; orbit tangents code momentum directions. Orbits fade into the past. HNN has learned which types of orbits are appropriate to which energies. NN is especially poor at high energies.
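A minimal sketch of this extrapolation step follows, with central finite differences standing in for the automatic differentiation a trained HNN would use, and a known harmonic-oscillator Hamiltonian standing in for the network's learned scalar output:

```python
def H(q, p):
    # Stand-in for a trained HNN's scalar output; the harmonic
    # oscillator Hamiltonian is used purely for illustration.
    return 0.5 * (p**2 + q**2)

def hamiltonian_field(q, p, eps=1e-6):
    # Hamilton's equations from the gradient of H; central differences
    # stand in for the automatic differentiation of a real HNN.
    dHdq = (H(q + eps, p) - H(q - eps, p)) / (2 * eps)
    dHdp = (H(q, p + eps) - H(q, p - eps)) / (2 * eps)
    return dHdp, -dHdq          # (dq/dt, dp/dt)

def euler_step(q, p, dt):
    qdot, pdot = hamiltonian_field(q, p)
    return q + qdot * dt, p + pdot * dt

q1, p1 = euler_step(1.0, 0.0, dt=0.001)
# Recovers the analytic field (qdot, pdot) = (p, -q) for this H.
print(q1, p1)
```

The same two functions, with H replaced by the trained network and dt fed to a higher-order integrator, generate the forecast orbits of Fig. 3.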
HNN incorporates the physics bias that the output phase-space velocities {q̇, ṗ} must come from the gradient of an unknown but conserved quantity, the Hamiltonian. This enforces an additional constraint on the network's weights and biases. Instead of taking the difference between the true output and the network-generated output, HNN constructs the Eq. 1 conservative vector field from the given coordinates by differentiating the output with respect to the inputs. It then uses this gradient to construct the loss function that in turn optimizes the vector field.
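This loss construction can be sketched as follows; the quadratic trial Hamiltonians and the finite-difference gradients are illustrative stand-ins for a trained network and automatic differentiation:

```python
import numpy as np

def field_from_H(H, q, p, eps=1e-6):
    # (dq/dt, dp/dt) = (+dH/dp, -dH/dq) via central differences.
    dHdq = (H(q + eps, p) - H(q - eps, p)) / (2 * eps)
    dHdp = (H(q, p + eps) - H(q, p - eps)) / (2 * eps)
    return dHdp, -dHdq

def hnn_loss(H, data):
    # Mean squared mismatch between the field implied by H and the
    # observed phase-space velocities, enforcing Eq. 1.
    loss = 0.0
    for q, p, qdot, pdot in data:
        fq, fp = field_from_H(H, q, p)
        loss += (fq - qdot)**2 + (fp - pdot)**2
    return loss / len(data)

# Training data from the true oscillator field (qdot, pdot) = (p, -q).
rng = np.random.default_rng(1)
data = [(q, p, p, -q) for q, p in rng.normal(size=(50, 2))]
true_H = lambda q, p: 0.5 * (p**2 + q**2)
wrong_H = lambda q, p: p**2 + q**2      # wrong scale gives a wrong field
print(hnn_loss(true_H, data), hnn_loss(wrong_H, data))
```

Gradient descent on this loss with respect to the network's weights and biases drives the learned H toward one whose Hamiltonian vector field matches the data, up to an irrelevant additive constant in H.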
Results.-For a sample of bounded energies and with the same learning parameters [30], we train NN and HNN on multiple Hénon-Heiles trajectories starting in the triangular basin. We use the neural networks to forecast new trajectories, and then compare them to the "true" trajectories obtained by numerically integrating Hamilton's Eq. 1. Figure 3 shows these results. HNN captures the nature of the global phase space structures well and effectively distinguishes qualitatively different dynamical regimes. NN fails dramatically, especially at high energies.
To quantify the ability of NN and HNN to paint a full portrait of the global, mixed phase space dynamics, we use their knowledge of the system to estimate the Hénon-Heiles Lyapunov spectrum [31], which characterizes the separation rate of infinitesimally close trajectories, one exponent for each dimension. Since perturbations along the flow do not cause divergence away from it, at least one exponent will be zero. For a Hamiltonian system, the exponents must exist in diverging-converging pairs to conserve phase space volume. Hence we expect a spectrum like {−λ, 0, 0, +λ}, with the maximum exponent increasing at large energies like λ ∝ E^3.5 [32]. HNN satisfies both these expectations, which are stringent, non-trivial consistency checks that it has authentically learned the true flow. NN satisfies neither [30].
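A standard (Benettin-style) estimate of the maximal exponent evolves a deviation vector under the variational (tangent) dynamics, renormalizing it periodically and averaging the logarithmic growth rates. The sketch below applies it to an integrable oscillator, where the exponent must vanish; the Hénon-Heiles variational equations would slot into `tangent_flow` in exactly the same way:

```python
import numpy as np

def tangent_flow(state):
    # state = (q, p, dq, dp): harmonic oscillator plus its tangent
    # (variational) dynamics.  For this integrable example the tangent
    # map is a pure rotation, so the maximal Lyapunov exponent is zero.
    q, p, dq, dp = state
    return np.array([p, -q, dp, -dq])

def rk4(s, f, dt):
    k1 = f(s); k2 = f(s + 0.5*dt*k1); k3 = f(s + 0.5*dt*k2); k4 = f(s + dt*k3)
    return s + dt / 6.0 * (k1 + 2*k2 + 2*k3 + k4)

def max_lyapunov(s0, d0, dt=0.01, renorm_every=100, renorms=200):
    # Evolve the deviation vector, renormalize it periodically, and
    # average the logarithmic growth rates over the whole run.
    s = np.concatenate([s0, d0])
    log_sum = 0.0
    for _ in range(renorms):
        for _ in range(renorm_every):
            s = rk4(s, tangent_flow, dt)
        norm = np.linalg.norm(s[2:])
        log_sum += np.log(norm)
        s[2:] /= norm
    return log_sum / (renorms * renorm_every * dt)

lam = max_lyapunov(np.array([1.0, 0.0]), np.array([1.0, 0.0]))
print(lam)  # ~0: no exponential divergence for integrable motion
```

Running the same routine with N deviation vectors and Gram-Schmidt reorthonormalization yields the full spectrum, whose {−λ, 0, 0, +λ} pairing supplies the consistency check described above.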
Using NN and HNN, we also compute the smaller alignment index α, a metric of chaos that allows us to quickly find the fraction of orbits that are chaotic at any energy [33]. We compute α for a specific orbit by following the time evolution of two different normalized deviation vectors along the orbit and computing the minimum of the norms of their difference and sum. Extensive testing shows that an orbit is chaotic if α < 10^−8, indicating that its deviation vectors have been aligned or anti-aligned by a large positive Lyapunov exponent. Figure 4 shows the fraction of chaotic trajectories for each energy, including a dramatic transition between islands of order at low energy and a sea of chaos at high energy. The chaos fractions computed by numerically integrating Hamilton's Eq. 1 are similar to the HNN estimates. NN again fails dramatically.
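A minimal sketch of the smaller-alignment-index computation for Hénon-Heiles follows. The low-energy initial condition, integration time, and renormalization schedule are illustrative choices, far shorter than the long integrations behind the α < 10^−8 threshold, so the check below only confirms that a regular orbit's deviation vectors stay unaligned:

```python
import numpy as np

def sali_flow(s):
    # s = orbit (x, y, px, py) followed by two 4-component deviation
    # vectors evolved under the linearized (variational) equations.
    x, y, px, py = s[:4]
    out = np.empty(12)
    out[:4] = [px, py, -x - 2*x*y, -y - x**2 + y**2]
    for i in (4, 8):
        dx, dy, dpx, dpy = s[i:i+4]
        out[i:i+4] = [dpx, dpy,
                      (-1 - 2*y)*dx - 2*x*dy,
                      -2*x*dx + (-1 + 2*y)*dy]
    return out

def rk4(s, f, dt):
    k1 = f(s); k2 = f(s + 0.5*dt*k1); k3 = f(s + 0.5*dt*k2); k4 = f(s + dt*k3)
    return s + dt / 6.0 * (k1 + 2*k2 + 2*k3 + k4)

def sali(s0, v1, v2, dt=0.05, steps=4000):
    # Smaller alignment index: evolve two normalized deviation vectors
    # and take the minimum of the norms of their sum and difference.
    s = np.concatenate([s0, v1, v2])
    for _ in range(steps):
        s = rk4(s, sali_flow, dt)
        s[4:8] /= np.linalg.norm(s[4:8])
        s[8:12] /= np.linalg.norm(s[8:12])
    return min(np.linalg.norm(s[4:8] + s[8:12]),
               np.linalg.norm(s[4:8] - s[8:12]))

# Illustrative low-energy (regular, KAM-torus) initial condition.
s0 = np.array([0.0, 0.1, 0.35, 0.0])
alpha = sali(s0, np.array([1., 0., 0., 0.]), np.array([0., 1., 0., 0.]))
print(alpha)  # stays well above the chaos threshold for a regular orbit
```

For a chaotic orbit the same routine drives α to zero exponentially fast as both vectors collapse onto the most expanding direction.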
Finally, to understand what NN and HNN have learned when they forecast orbits, we use an autoencoder - a neural network with a sparse "bottleneck" layer - to examine their hidden neurons [30]. The autoencoder's mean square error (MSE) loss function forces the input to match the output, so it must adjust its weights and biases to create a compressed, low-dimensional representation of the neural networks' activity, a process called introspection [34] or intelligible artificial intelligence (iAI). For HNN, the loss function drops precipitously for 4 (or more) bottleneck neurons, which appear to encode a linear combination of the 4 phase space coordinates, thereby capturing the dimensionality of the system, as in Fig. 5. NN shows no similar drop, and the uncertainty in its loss function is orders of magnitude larger than HNN's.
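Since the optimal linear autoencoder is equivalent to principal component analysis, a truncated SVD gives a quick sketch of the bottleneck experiment; the synthetic data here, 8 recorded channels mixed linearly from 4 latent coordinates, are an illustrative stand-in for the networks' actual hidden activity:

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic stand-in for hidden-layer activity: 8 channels that are
# linear mixtures of only 4 underlying phase-space coordinates.
latent = rng.normal(size=(1000, 4))
mixing = rng.normal(size=(4, 8))
data = latent @ mixing

def bottleneck_mse(X, k):
    # Optimal linear autoencoder with a k-neuron bottleneck = rank-k PCA:
    # project onto the top-k principal directions and reconstruct.
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    recon = (Xc @ Vt[:k].T) @ Vt[:k]
    return np.mean((Xc - recon)**2)

mse2, mse4 = bottleneck_mse(data, 2), bottleneck_mse(data, 4)
print(mse2, mse4)  # the loss collapses once k matches the intrinsic dimension
```

The precipitous drop at k = 4 mirrors the drop seen in HNN's bottleneck scan, while activity with no clean low-dimensional structure, like NN's, shows no such collapse.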
Conclusion.-Time series can be analyzed using multiple techniques, including genetic algorithms [35] or neural networks [9] to find algebraic equations of motion or simple compressed representations [8]. But such approaches merely discover underlying equations or relations; Newton and Poincaré had the equations hundreds of years ago and still just glimpsed their complexity without fully understanding it. Conventional neural networks extrapolating time series do not conserve energy, and their orbits can drift off the energy surface, jump into the sea of chaos from islands of order, or fly out to infinity. By incorporating the energy-conserving and volume-preserving flows arising from an underlying Hamiltonian function - without invoking any details of its form - Hamiltonian neural networks can recognize the presence of order and chaos, as well as the challenging regime where both these very distinct dynamics coexist.
A neural network that respects Hamiltonian time-translational symmetry can learn order and chaos, including mixed phase space flows and sections, as quantified by metrics like Lyapunov spectra and smaller alignment indices. Incorporating other symmetries [36] in deep learning may produce comparable qualitative performance improvements. We are excited about the potential for physics to improve artificial neural networks, which can return the favor by helping us solve hard problems. Future researchers will work alongside such artificial intelligences to extend the frontiers of science.