Statistical properties of large data sets with linear latent features

Philipp Fleig and Ilya Nemenman
Phys. Rev. E 106, 014102 – Published 5 July 2022

Abstract

Analytical understanding of how low-dimensional latent features reveal themselves in large-dimensional data is still lacking. We study this by defining a probabilistic linear latent features model with additive noise and by analytically and numerically computing the statistical distributions of pairwise correlations and eigenvalues of the data correlation matrix. This allows us to resolve the latent feature structure across a wide range of data regimes set by the number of recorded variables, observations, latent features, and the signal-to-noise ratio. We find a characteristic imprint of latent features in the distribution of correlations and eigenvalues and provide an analytic estimate for the boundary between signal and noise, even in the absence of a spectral gap.
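As a rough illustration of the setup the abstract describes, the sketch below draws data from a linear latent features model with additive noise, then computes the pairwise correlations and the eigenvalues of the data correlation matrix. The abstract does not give the model's distributions or normalization, so the Gaussian latent features and loadings, the scaling by the square root of the number of features, and the names `T`, `N`, `m`, and `snr` are all assumptions made here. The Marchenko-Pastur upper edge is the standard pure-noise reference from random matrix theory and is used only as a simple comparison point; it is not the paper's analytic estimate of the signal-noise boundary.

```python
import numpy as np


def simulate_latent_data(T, N, m, snr, rng):
    """Draw T observations of N variables driven by m linear latent features.

    Assumed form (not taken from the paper): X = sqrt(snr) * U V / sqrt(m) + noise,
    with all entries of U, V and the noise i.i.d. standard normal.
    """
    U = rng.standard_normal((T, m))      # latent feature values per observation
    V = rng.standard_normal((m, N))      # loading of each variable on each feature
    signal = U @ V / np.sqrt(m)          # scaled so each signal entry has unit variance
    noise = rng.standard_normal((T, N))  # additive unit-variance noise
    return np.sqrt(snr) * signal + noise


def correlation_statistics(X):
    """Return the off-diagonal pairwise correlations and the eigenvalues of the
    N x N correlation matrix of X (columns are variables)."""
    C = np.corrcoef(X, rowvar=False)
    pairwise = C[np.triu_indices_from(C, k=1)]  # each variable pair counted once
    eigvals = np.linalg.eigvalsh(C)             # ascending order, C is symmetric
    return pairwise, eigvals


rng = np.random.default_rng(0)
T, N, m, snr = 2000, 100, 3, 1.0  # observations, variables, latent features, SNR

X = simulate_latent_data(T, N, m, snr, rng)
corrs, eigvals = correlation_statistics(X)

# Upper edge of the Marchenko-Pastur bulk for a pure-noise correlation matrix.
mp_edge = (1.0 + np.sqrt(N / T)) ** 2
n_above = int(np.sum(eigvals > mp_edge))

print("std of pairwise correlations:", corrs.std())
print("largest eigenvalues:", eigvals[-5:])
print("eigenvalues above the pure-noise edge:", n_above)
```

With many more observations than variables and a signal-to-noise ratio of order one, as chosen here, the eigenvalues carrying the latent features separate clearly from the noise bulk. Lowering `snr` or `T`, or raising `m`, moves the example toward the regime without a spectral gap that the abstract refers to, where a simple threshold of this kind is no longer informative.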

  • Received 16 January 2022
  • Revised 4 May 2022
  • Accepted 23 May 2022

DOI: https://doi.org/10.1103/PhysRevE.106.014102

©2022 American Physical Society

Physics Subject Headings (PhySH)

Statistical Physics & Thermodynamics

Authors & Affiliations

Philipp Fleig*

  • Department of Physics & Astronomy, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA

Ilya Nemenman

  • Department of Physics, Emory University, Atlanta, Georgia 30322, USA; Department of Biology, Emory University, Atlanta, Georgia 30322, USA; and Initiative in Theory and Modeling of Living Systems, Atlanta, Georgia 30322, USA

  • *Corresponding author: fleig@sas.upenn.edu

Issue

Vol. 106, Iss. 1 — July 2022
