Abstract
One challenge of physics is to explain how collective properties arise from microscopic interactions. Indeed, interactions form the building blocks of almost all physical theories and are described by polynomial terms in the action. The traditional approach is to derive these terms from elementary processes and then use the resulting model to make predictions for the entire system. But what if the underlying processes are unknown? Can we reverse the approach and learn the microscopic action by observing the entire system? We use invertible neural networks to first learn the observed data distribution. By the choice of a suitable nonlinearity for the neuronal activation function, we are then able to compute the action from the weights of the trained model; a diagrammatic language expresses the change of the action from layer to layer. This process uncovers how the network hierarchically constructs interactions via nonlinear transformations of pairwise relations. We test this approach on simulated datasets of interacting theories and on an established image dataset (MNIST). The network consistently reproduces a broad class of unimodal distributions; outside this class, it finds effective theories that approximate the data statistics up to the third cumulant. We explicitly show how network depth and data quantity jointly improve the agreement between the learned and the true model. This work shows how to leverage the power of machine learning to transparently extract microscopic models from data.
- Received 11 April 2023
- Revised 5 September 2023
- Accepted 12 October 2023
DOI: https://doi.org/10.1103/PhysRevX.13.041033
Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.
Popular Summary
We use models to make sense of data: The more complicated the data, the more complicated the model needs to be. In the past, models could be designed only from the bottom up, explaining the behavior of the whole from the already known interactions between its parts. In this work, we flip the approach to work top down. Starting from a black-box model explaining the whole, we deconstruct it into simpler interactions between a few parts. For example, a set of images of the number “3” can be described by a small set of rules for how doublets, triplets, and quadruplets of pixels interact with each other.
Key to this approach is the use of a generative neural network, which maps a complicated data distribution to a simpler one. By decomposing this mapping into interactions between simpler features, we can better understand how and why models make predictions. We hence unravel the complex, hierarchical structure that has been learned by a neural network and explain it in a form that is central to physics: interactions between degrees of freedom.
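The mapping described above can be illustrated with a minimal sketch. The class below is a hypothetical, simplified invertible layer (in the spirit of an affine coupling transform, not the architecture used in the paper): half of the variables pass through unchanged, while the other half are scaled and shifted by a small network acting on the first half. Such a map is invertible by construction, and its log-determinant Jacobian is cheap to evaluate, so the density of the data can be computed exactly via the change-of-variables formula.

```python
import numpy as np

rng = np.random.default_rng(0)

class AffineCoupling:
    """Illustrative invertible layer (affine-coupling style sketch)."""

    def __init__(self, dim, hidden=16):
        h = dim // 2
        # small random MLP that produces a log-scale s and shift t from x1
        self.W1 = rng.normal(0, 0.1, (hidden, h))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0, 0.1, (2 * h, hidden))
        self.b2 = np.zeros(2 * h)

    def _scale_shift(self, x1):
        hidden = np.tanh(self.W1 @ x1 + self.b1)
        s, t = np.split(self.W2 @ hidden + self.b2, 2)
        return s, t

    def forward(self, x):
        # x1 passes through; x2 is scaled and shifted as a function of x1
        x1, x2 = np.split(x, 2)
        s, t = self._scale_shift(x1)
        z2 = x2 * np.exp(s) + t
        log_det = np.sum(s)  # log |det dz/dx| of this triangular map
        return np.concatenate([x1, z2]), log_det

    def inverse(self, z):
        # exact inverse: undo the scale and shift using the unchanged half
        z1, z2 = np.split(z, 2)
        s, t = self._scale_shift(z1)
        return np.concatenate([z1, (z2 - t) * np.exp(-s)])

layer = AffineCoupling(dim=4)
x = rng.normal(size=4)
z, log_det = layer.forward(x)
x_rec = layer.inverse(z)
print(np.allclose(x, x_rec))  # invertible up to floating-point error
```

Because the map is invertible with a tractable Jacobian, the learned density obeys log p_x(x) = log p_z(f(x)) + log|det J|; stacking many such layers gives the hierarchical construction of interactions that the paper deconstructs layer by layer.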
Being able to provide an understanding of how neural networks behave is becoming increasingly important today, as applications of artificial intelligence—from medical diagnosis to face recognition—progressively impact human lives.