Abstract
In filtering, each output is produced by a certain number of different inputs. We explore the statistics of this degeneracy in an explicitly treatable filtering problem in which filtering performs the maximal compression of relevant information contained in inputs (arrays of zeros and ones). This problem serves as a reference model for the statistics of filtering and related sampling problems. The filter patterns in this problem conveniently allow a microscopic, combinatorial consideration. This allows us to find the statistics of outputs, namely the exact distribution of output degeneracies, for arbitrary input sizes. We observe that the resulting degeneracy distribution of outputs decays as with degeneracy , where is a constant and exponent , i.e., faster than a power law. Importantly, its form essentially depends on the size of the input dataset, appearing to be closer to a power-law dependence for small dataset sizes than for large ones. We demonstrate that for sufficiently small input dataset sizes typical for empirical studies, this distribution could be easily perceived as a power law. We extend our results to filter patterns of various sizes and demonstrate that the shortest filter pattern provides the maximum informative representations of the inputs.
1 More- Received 28 June 2019
- Revised 10 February 2020
- Accepted 25 February 2020
DOI:https://doi.org/10.1103/PhysRevX.10.011074
Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.
Published by the American Physical Society
Physics Subject Headings (PhySH)
Popular Summary
Cooperative systems such as spin glasses or deep-learning neural networks—where interactions among the components make the system more than the sum of its parts—characteristically have a complex landscape of basins of attraction for each final state. Because of the complexity of such systems, a deep understanding of the statistics of these phenomena has been elusive. Here, we introduce a simple filtering model of a cooperative system that displays the same complex statistics, but which can be calculated exactly up to relatively large system sizes.
Given an input sequence of ones and zeros, our filter produces an output marking the location of a certain short pattern. Each output may be degenerate, corresponding to multiple inputs, and the distribution of these degeneracies is broad and complex. We calculate the exact degeneracy distribution up to large input sizes and develop a theory to explain its various features. Similar distributions have been observed in deep-learning neural networks: They resemble power-law distributions for small input sizes. However, we show that the shape of the distribution depends crucially on the length of the input string and larger sizes diverge from the power-law shape.
The statistics of this problem are analogous to those in cooperative systems such as spin glasses or deep-learning neural networks. We believe our results can give insight into the statistics of problems in cooperative systems and stimulate new research aimed at the exploration of similar size effects in those systems.