Abstract
Gradient-descent-based algorithms and their stochastic versions have widespread applications in machine learning and statistical inference. In this work, we carry out an analytic study of the performance of the algorithm most commonly considered in physics, the Langevin algorithm, in the context of noisy high-dimensional inference. We employ the Langevin algorithm to sample the posterior probability measure for the spiked mixed matrix-tensor model. The typical behavior of this algorithm is described by a system of integrodifferential equations that we call the Langevin state evolution, whose solution we compare with that of the state evolution of approximate message passing (AMP). Our results show that, remarkably, the algorithmic threshold of the Langevin algorithm is suboptimal with respect to the one given by AMP. This phenomenon is due to the residual glassiness present in that region of parameters. We also present a simple heuristic expression for the transition line, which is in agreement with our numerical results.
Received 27 June 2019
Revised 22 November 2019
Accepted 6 January 2020
DOI: https://doi.org/10.1103/PhysRevX.10.011057
Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.
Popular Summary
Many tasks of current interest in science and technology are solved using machine-learning methods. While machine-learning methods work impressively well in many cases of practical interest, theoretical understanding of the properties of these algorithms is so far limited. Our work is driven by the hope that better theoretical understanding will lead to an improvement of existing methods. We introduce a model for which the performance of one of the commonly considered algorithms, the Langevin algorithm, can be fully analyzed, and we locate the region of parameters for which the algorithm reaches optimal performance.
Using methods designed to study the dynamics of glassy materials, we write and analyze a closed set of equations describing the quality of the result found by the Langevin algorithm. We deduce a very simple formula for the signal-to-noise threshold below which the Langevin algorithm fails. Comparing it to the performance of an alternative algorithm known as approximate message passing, we show that, remarkably, the algorithmic threshold of the Langevin algorithm is largely suboptimal.
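To make the setting concrete, the following is a minimal, illustrative sketch of discretized Langevin dynamics sampling a posterior in a toy spiked Wigner (pure matrix) model. It is not the paper's implementation: the paper studies the spiked mixed matrix-tensor model and analyzes the dynamics via the Langevin state evolution equations, while all parameter names and values below are placeholder assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 200        # dimension (illustrative)
snr = 4.0      # signal-to-noise ratio (illustrative)
dt = 0.01      # integration time step
steps = 2000

# Planted signal on the sphere |x|^2 = n, plus a noisy rank-one observation
x_star = rng.standard_normal(n)
x_star *= np.sqrt(n) / np.linalg.norm(x_star)
noise = rng.standard_normal((n, n))
Y = np.sqrt(snr / n) * np.outer(x_star, x_star) + (noise + noise.T) / np.sqrt(2)

# Langevin dynamics at unit temperature: x <- x - dt * grad_H + sqrt(2 dt) * xi,
# with energy H(x) = -(1/2) * sqrt(snr/n) * x^T Y x, constrained to the sphere
x = rng.standard_normal(n)
x *= np.sqrt(n) / np.linalg.norm(x)
for _ in range(steps):
    grad_H = -np.sqrt(snr / n) * Y @ x
    x = x - dt * grad_H + np.sqrt(2 * dt) * rng.standard_normal(n)
    x *= np.sqrt(n) / np.linalg.norm(x)  # project back onto the sphere

# Overlap with the planted signal: near 0 for a random guess, 1 for recovery
overlap = abs(x @ x_star) / n
print(f"overlap with signal: {overlap:.3f}")
```

In the toy model above there is no hard glassy phase, so plain Langevin dynamics can equilibrate; the suboptimality discussed in this work arises from the residual glassiness specific to the mixed matrix-tensor landscape.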
We hope that the insight gained in our work can be extended to a better understanding of similar “gradient-based” algorithms in models that are closer to currently used neural networks.