Complex Distributions Emerging in Filtering and Compression

Open Access

Complex Distributions Emerging in Filtering and Compression

G. J. Baxter, R. A. da Costa, S. N. Dorogovtsev, and J. F. F. Mendes

Phys. Rev. X 10, 011074 – Published 30 March 2020

Abstract

In filtering, each output is produced by a certain number of different inputs. We explore the statistics of this degeneracy in an explicitly treatable filtering problem in which filtering performs the maximal compression of relevant information contained in inputs (arrays of zeros and ones). This problem serves as a reference model for the statistics of filtering and related sampling problems. The filter patterns in this problem conveniently allow a microscopic, combinatorial consideration. This allows us to find the statistics of outputs, namely the exact distribution of output degeneracies, for arbitrary input sizes. We observe that the resulting degeneracy distribution of outputs decays as $e^{- c \log^{α} d}$ with degeneracy $d$ , where $c$ is a constant and exponent $α > 1$ , i.e., faster than a power law. Importantly, its form essentially depends on the size of the input dataset, appearing to be closer to a power-law dependence for small dataset sizes than for large ones. We demonstrate that for sufficiently small input dataset sizes typical for empirical studies, this distribution could be easily perceived as a power law. We extend our results to filter patterns of various sizes and demonstrate that the shortest filter pattern provides the maximum informative representations of the inputs.

1 More

Received 28 June 2019
Revised 10 February 2020
Accepted 25 February 2020

DOI:https://doi.org/10.1103/PhysRevX.10.011074

Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.

Published by the American Physical Society

Physics Subject Headings (PhySH)

Complex systems Computational complexity Information & communication theory Patterns in complex systems Scaling laws of complex systems

Combinatorics

Interdisciplinary PhysicsStatistical Physics & Thermodynamics

Authors & Affiliations

G. J. Baxter ¹, R. A. da Costa ^1,*, S. N. Dorogovtsev ¹, and J. F. F. Mendes ^1,2

¹Departamento de Física da Universidade de Aveiro and I3N, Campus Universitário de Santiago, 3810-193 Aveiro, Portugal
²School of Computer and Communication Sciences and School of Life Sciences, École Polytechnique Fédéral de Lausanne, 1015 Lausanne EPFL, Switzerland

^*Corresponding author. americo.costa@ua.pt

Popular Summary

Cooperative systems such as spin glasses or deep-learning neural networks—where interactions among the components make the system more than the sum of its parts—characteristically have a complex landscape of basins of attraction for each final state. Because of the complexity of such systems, a deep understanding of the statistics of these phenomena has been elusive. Here, we introduce a simple filtering model of a cooperative system that displays the same complex statistics, but which can be calculated exactly up to relatively large system sizes.

Given an input sequence of ones and zeros, our filter produces an output marking the location of a certain short pattern. Each output may be degenerate, corresponding to multiple inputs, and the distribution of these degeneracies is broad and complex. We calculate the exact degeneracy distribution up to large input sizes and develop a theory to explain its various features. Similar distributions have been observed in deep-learning neural networks: They resemble power-law distributions for small input sizes. However, we show that the shape of the distribution depends crucially on the length of the input string and larger sizes diverge from the power-law shape.

The statistics of this problem are analogous to those in cooperative systems such as spin glasses or deep-learning neural networks. We believe our results can give insight into the statistics of problems in cooperative systems and stimulate new research aimed at the exploration of similar size effects in those systems.

Key Image

Article Text

Click to Expand

Supplemental Material

Click to Expand

References

Click to Expand

Issue

Vol. 10, Iss. 1 — January - March 2020

Subject Areas

Reuse & Permissions

Author publication services for translation and copyediting assistance advertisement

Physical Review X