Zipf and Heaps laws from dependency structures in component systems

Andrea Mazzolini, Jacopo Grilli, Eleonora De Lazzari, Matteo Osella, Marco Cosentino Lagomarsino, and Marco Gherardi
Phys. Rev. E 98, 012315 – Published 25 July 2018

Abstract

Complex natural and technological systems can be considered, on a coarse-grained level, as assemblies of elementary components: for example, genomes as sets of genes or texts as sets of words. On the one hand, the joint occurrence of components emerges from architectural and specific constraints in such systems. On the other hand, general regularities may unify different systems, such as the broadly studied Zipf and Heaps laws, which concern, respectively, the distribution of component frequencies and the number of distinct components as a function of system size. Dependency structures (i.e., directed networks encoding the dependency relations between the components in a system) were proposed recently as a possible organizing principle underlying some of the regularities observed. However, the consequences of this assumption were explored only in binary component systems, where solely the presence or absence of components is considered, and multiple copies of the same component are not allowed. Here we consider a simple model that generates, from a given ensemble of dependency structures, a statistical ensemble of sets of components, allowing for components to appear with any multiplicity. Our model is a minimal extension that is memoryless and therefore accessible to analytical calculations. A mean-field analytical approach (analogous to the “Zipfian ensemble” in the linguistics literature) captures the relevant laws describing the component statistics, as we show by comparison with numerical computations. In particular, we recover a power-law Zipf rank plot, with a set of core components, and a Heaps law displaying three consecutive regimes (linear, sublinear, and saturating) that we characterize quantitatively.
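As a reading aid, the two observables named in the abstract can be measured on any sequence of components. The sketch below is not the authors' dependency-structure model; it only illustrates, on a toy list of words, how a Zipf rank-frequency list and a Heaps growth curve might be computed. The function names `zipf_rank_frequencies` and `heaps_curve` and the sample sequence are hypothetical, chosen for illustration.

```python
from collections import Counter


def zipf_rank_frequencies(components):
    """Return relative frequencies sorted from rank 1 (most abundant) downward."""
    counts = Counter(components)
    total = sum(counts.values())
    # most_common() yields (component, count) pairs in descending count order
    return [count / total for _, count in counts.most_common()]


def heaps_curve(components):
    """Return the number of distinct components seen after each successive item."""
    seen = set()
    curve = []
    for component in components:
        seen.add(component)
        curve.append(len(seen))
    return curve


# Toy system (an assumption for illustration): a "text" as a sequence of words
sample = "the gene the word the gene a set".split()
ranks = zipf_rank_frequencies(sample)   # frequency versus rank (Zipf)
growth = heaps_curve(sample)            # distinct components versus size (Heaps)
```

Plotting `ranks` against rank and `growth` against sequence length on logarithmic axes is the usual way to inspect a power-law rank plot and the linear, sublinear, and saturating regimes mentioned above.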

  • Received 2 March 2018

DOI: https://doi.org/10.1103/PhysRevE.98.012315

©2018 American Physical Society

Physics Subject Headings (PhySH)

  • Interdisciplinary Physics
  • Statistical Physics & Thermodynamics
  • Networks

Authors & Affiliations

Andrea Mazzolini (1), Jacopo Grilli (2), Eleonora De Lazzari (3), Matteo Osella (1), Marco Cosentino Lagomarsino (3,4,5), and Marco Gherardi (3,6,*)

  • (1) Dipartimento di Fisica and INFN, Università degli Studi di Torino, Via Pietro Giuria 1, 10125 Torino, Italy
  • (2) Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, New Mexico 87501, USA
  • (3) Sorbonne Universités, UPMC Univ Paris 06, UMR 7238, Computational and Quantitative Biology, 4 Place Jussieu, Paris, France
  • (4) CNRS, UMR 7238, Paris, France
  • (5) IFOM, Milan, Italy
  • (6) Dipartimento di Fisica, Università degli Studi di Milano, via Celoria 16, 20133 Milano, Italy

  • (*) Corresponding author: marco.gherardi@mi.infn.it

Issue

Vol. 98, Iss. 1 — July 2018
