• Open Access

Complex Distributions Emerging in Filtering and Compression

G. J. Baxter, R. A. da Costa, S. N. Dorogovtsev, and J. F. F. Mendes
Phys. Rev. X 10, 011074 – Published 30 March 2020
PDFHTMLExport Citation

Abstract

In filtering, each output is produced by a certain number of different inputs. We explore the statistics of this degeneracy in an explicitly treatable filtering problem in which filtering performs the maximal compression of relevant information contained in inputs (arrays of zeros and ones). This problem serves as a reference model for the statistics of filtering and related sampling problems. The filter patterns in this problem conveniently allow a microscopic, combinatorial consideration. This allows us to find the statistics of outputs, namely the exact distribution of output degeneracies, for arbitrary input sizes. We observe that the resulting degeneracy distribution of outputs decays as eclogαd with degeneracy d, where c is a constant and exponent α>1, i.e., faster than a power law. Importantly, its form essentially depends on the size of the input dataset, appearing to be closer to a power-law dependence for small dataset sizes than for large ones. We demonstrate that for sufficiently small input dataset sizes typical for empirical studies, this distribution could be easily perceived as a power law. We extend our results to filter patterns of various sizes and demonstrate that the shortest filter pattern provides the maximum informative representations of the inputs.

  • Figure
  • Figure
  • Figure
  • Figure
  • Figure
  • Figure
  • Figure
1 More
  • Received 28 June 2019
  • Revised 10 February 2020
  • Accepted 25 February 2020

DOI:https://doi.org/10.1103/PhysRevX.10.011074

Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.

Published by the American Physical Society

Physics Subject Headings (PhySH)

Interdisciplinary PhysicsStatistical Physics & Thermodynamics

Authors & Affiliations

G. J. Baxter1, R. A. da Costa1,*, S. N. Dorogovtsev1, and J. F. F. Mendes1,2

  • 1Departamento de Física da Universidade de Aveiro and I3N, Campus Universitário de Santiago, 3810-193 Aveiro, Portugal
  • 2School of Computer and Communication Sciences and School of Life Sciences, École Polytechnique Fédéral de Lausanne, 1015 Lausanne EPFL, Switzerland

  • *Corresponding author. americo.costa@ua.pt

Popular Summary

Cooperative systems such as spin glasses or deep-learning neural networks—where interactions among the components make the system more than the sum of its parts—characteristically have a complex landscape of basins of attraction for each final state. Because of the complexity of such systems, a deep understanding of the statistics of these phenomena has been elusive. Here, we introduce a simple filtering model of a cooperative system that displays the same complex statistics, but which can be calculated exactly up to relatively large system sizes.

Given an input sequence of ones and zeros, our filter produces an output marking the location of a certain short pattern. Each output may be degenerate, corresponding to multiple inputs, and the distribution of these degeneracies is broad and complex. We calculate the exact degeneracy distribution up to large input sizes and develop a theory to explain its various features. Similar distributions have been observed in deep-learning neural networks: They resemble power-law distributions for small input sizes. However, we show that the shape of the distribution depends crucially on the length of the input string and larger sizes diverge from the power-law shape.

The statistics of this problem are analogous to those in cooperative systems such as spin glasses or deep-learning neural networks. We believe our results can give insight into the statistics of problems in cooperative systems and stimulate new research aimed at the exploration of similar size effects in those systems.

Key Image

Article Text

Click to Expand

Supplemental Material

Click to Expand

References

Click to Expand
Issue

Vol. 10, Iss. 1 — January - March 2020

Subject Areas
Reuse & Permissions
Author publication services for translation and copyediting assistance advertisement

Authorization Required


×
×

Images

×

Sign up to receive regular email alerts from Physical Review X

Reuse & Permissions

It is not necessary to obtain permission to reuse this article or its components as it is available under the terms of the Creative Commons Attribution 4.0 International license. This license permits unrestricted use, distribution, and reproduction in any medium, provided attribution to the author(s) and the published article's title, journal citation, and DOI are maintained. Please note that some figures may have been included with permission from other third parties. It is your responsibility to obtain the proper permission from the rights holder directly for these figures.

×

Log In

Cancel
×

Search


Article Lookup

Paste a citation or DOI

Enter a citation
×