Significance Analysis and Statistical Mechanics: An Application to Clustering

Marta Łuksza, Michael Lässig, and Johannes Berg
Phys. Rev. Lett. 105, 220601 – Published 23 November 2010

Abstract

This Letter addresses the statistical significance of structures in random data: Given a set of vectors and a measure of mutual similarity, how likely is it that a subset of these vectors forms a cluster with enhanced similarity among its elements? The computation of this cluster p value for randomly distributed vectors is mapped onto a well-defined problem of statistical mechanics. We solve this problem analytically, establishing a connection between the physics of quenched disorder and multiple-testing statistics in clustering and related problems. In an application to gene expression data, we find a remarkable link between the statistical significance of a cluster and the functional relationships between its genes.
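The cluster p value described above can be illustrated with a brute-force numerical sketch. This is not the analytical statistical-mechanics solution of the Letter. It is a Monte Carlo estimate under assumptions made here for illustration only: independent Gaussian vectors as the null model, the dot product as the similarity measure, and a cluster score equal to the sum of pairwise similarities. All function names are hypothetical. Taking the maximum over all subsets of size k is what makes this a multiple-testing problem: the best cluster in random data scores far higher than a typical subset does.

```python
import itertools
import random


def cluster_score(vectors, subset):
    """Sum of pairwise dot products among the vectors indexed by `subset`.

    Assumption: the dot product stands in for the similarity measure.
    """
    return sum(
        sum(a * b for a, b in zip(vectors[i], vectors[j]))
        for i, j in itertools.combinations(subset, 2)
    )


def best_cluster_score(vectors, k):
    """Maximum score over all subsets of size k (exhaustive, small n only)."""
    return max(
        cluster_score(vectors, subset)
        for subset in itertools.combinations(range(len(vectors)), k)
    )


def random_vectors(n, dim, rng):
    """Draw n vectors with independent standard Gaussian components.

    Assumption: this is the null model of 'randomly distributed vectors'.
    """
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]


def empirical_cluster_p_value(observed_score, n, dim, k, trials=200, seed=0):
    """Estimate how often random data contains a size-k cluster scoring
    at least `observed_score`.

    Uses the add-one correction so the estimate is never exactly zero.
    """
    rng = random.Random(seed)
    hits = sum(
        best_cluster_score(random_vectors(n, dim, rng), k) >= observed_score
        for _ in range(trials)
    )
    return (hits + 1) / (trials + 1)
```

For example, `empirical_cluster_p_value(observed_score, n=10, dim=5, k=3)` compares a score found in real data against the best size-3 cluster in 200 random data sets of the same shape. The exhaustive search grows combinatorially with n, so a sketch like this is only feasible for small data sets; an analytical treatment avoids that cost.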

  • Received 27 November 2009

DOI: https://doi.org/10.1103/PhysRevLett.105.220601

© 2010 The American Physical Society

Authors & Affiliations

Marta Łuksza (1), Michael Lässig (2), and Johannes Berg (2)

  • (1) Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, 14195 Berlin, Germany
  • (2) Institut für Theoretische Physik, Universität zu Köln, Zülpicher Straße 77, 50937 Köln, Germany

Issue

Vol. 105, Iss. 22 — 26 November 2010
