Abstract
We consider a three-layer Sejnowski machine and show that features learnt via contrastive divergence have a dual representation as patterns in a dense associative memory of order $P=4$. The latter is known to be able to Hebbian store an amount of patterns scaling as $N^{P-1}$, where $N$ denotes the number of constituting binary neurons interacting $P$-wisely. We also prove that, by keeping the dense associative network far from the saturation regime (namely, allowing for a number of patterns scaling only linearly with $N$, while $P>2$), such a system is able to perform pattern recognition far below the standard signal-to-noise threshold. In particular, a network with $P=4$ is able to retrieve information whose intensity is $O(1)$ even in the presence of noise $O(\sqrt{N})$ in the large-$N$ limit. This striking skill stems from a redundant representation of patterns, which is afforded by the (relatively) low-load information storage, and it helps to explain the impressive pattern-recognition abilities exhibited by new-generation neural networks. The whole theory is developed rigorously, at the replica-symmetric level of approximation, and corroborated by signal-to-noise analysis and Monte Carlo simulations.
- Received 25 June 2019
- Revised 11 October 2019
DOI: https://doi.org/10.1103/PhysRevLett.124.028301
© 2020 American Physical Society
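For readers who want to experiment with the setting the abstract describes, below is a minimal numerical sketch (not the authors' code) of a dense associative memory with $P=4$-wise interactions in the low-load regime: $\pm 1$ patterns are Hebbian stored, and a corrupted pattern is cleaned up by single-spin-flip Metropolis dynamics. The sizes `N` and `K`, the inverse temperature `beta`, the corruption level, and the specific energy normalization are illustrative assumptions, not values taken from the Letter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): N binary neurons, K patterns with K << N^(P-1)
# (low load), and P-wise interactions with P = 4 as in the abstract.
N, K, P = 200, 10, 4
xi = rng.choice([-1, 1], size=(K, N))  # Hebbian-stored +/-1 patterns

def energy(sigma):
    # Dense associative memory of order P (one common normalization):
    #   E(sigma) = -N * sum_mu m_mu^P,  with overlaps m_mu = (xi_mu . sigma) / N.
    m = xi @ sigma / N
    return -N * np.sum(m ** P)

def metropolis(sigma, beta=5.0, sweeps=30):
    """Single-spin-flip Metropolis dynamics at inverse temperature beta."""
    sigma = sigma.copy()
    E = energy(sigma)
    for _ in range(sweeps):
        for i in rng.permutation(N):
            sigma[i] *= -1                 # propose flipping spin i
            E_new = energy(sigma)
            if E_new <= E or rng.random() < np.exp(-beta * (E_new - E)):
                E = E_new                  # accept the flip
            else:
                sigma[i] *= -1             # reject: undo the flip
    return sigma

# Start from a heavily corrupted copy of pattern 0 (25% of spins flipped)
noisy = xi[0] * np.where(rng.random(N) < 0.25, -1, 1)
recovered = metropolis(noisy)
print("initial overlap:", xi[0] @ noisy / N)
print("final overlap:  ", xi[0] @ recovered / N)
```

Under these assumptions the final overlap approaches 1, i.e., the network retrieves the stored pattern; raising `K` toward the saturation regime or increasing the corruption level shows where this simple dynamics starts to fail.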