• Open Access

Emergent community agglomeration from data set geometry

Chenchao Zhao and Jun S. Song
Phys. Rev. E 95, 042307 – Published 13 April 2017

Abstract

In the statistical learning language, samples are snapshots of random vectors drawn from some unknown distribution. Such vectors usually reside in a high-dimensional Euclidean space, and thus the “curse of dimensionality” often undermines the power of learning methods, including community detection and clustering algorithms, that rely on Euclidean geometry. This paper presents the idea of effective dissimilarity transformation (EDT) on empirical dissimilarity hyperspheres and studies its effects using synthetic and gene expression data sets. Iterating the EDT turns a static data distribution into a dynamical process purely driven by the empirical data set geometry and adaptively ameliorates the curse of dimensionality, partly through changing the topology of a Euclidean feature space $\mathbb{R}^n$ into a compact hypersphere $S^n$. The EDT often improves the performance of hierarchical clustering via the automatic grouping information emerging from global interactions of data points. The EDT is not restricted to hierarchical clustering, and other learning methods based on pairwise dissimilarity should also benefit from the many desirable properties of EDT.
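
The "curse of dimensionality" mentioned in the abstract can be illustrated with a small numerical experiment. The sketch below is not the EDT, whose definition appears in the article text rather than in this abstract. It only shows, under the assumption of i.i.d. standard normal samples, how Euclidean distances from one point to all others become nearly indistinguishable as the dimension grows. The helper name `relative_contrast` and the sample sizes are hypothetical choices made for this illustration.

```python
import numpy as np


def relative_contrast(n_points: int, dim: int, seed: int = 0) -> float:
    """Return (d_max - d_min) / d_min for distances from one query point.

    Assumption for illustration only: points are i.i.d. standard normal
    vectors in `dim` dimensions; the first point serves as the query.
    """
    rng = np.random.default_rng(seed)
    points = rng.standard_normal((n_points, dim))
    query = points[0]
    # Euclidean distances from the query to every other point.
    dists = np.linalg.norm(points[1:] - query, axis=1)
    return float((dists.max() - dists.min()) / dists.min())


if __name__ == "__main__":
    # The contrast between nearest and farthest neighbors shrinks
    # as the dimension increases.
    for dim in (2, 10, 100, 1000):
        print(dim, relative_contrast(500, dim))
```

A small relative contrast means that the nearest and farthest neighbors sit at almost the same Euclidean distance, which is one reason methods built directly on Euclidean dissimilarity can lose discriminating power in high dimensions.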

  • Received 1 January 2017
  • Revised 20 March 2017

DOI:https://doi.org/10.1103/PhysRevE.95.042307

Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI.


Physics Subject Headings (PhySH)

Networks, Physics of Living Systems, Interdisciplinary Physics

Authors & Affiliations

Chenchao Zhao

  • Department of Physics and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA

Jun S. Song*

  • Department of Physics, Carl R. Woese Institute for Genomic Biology, and Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA

  • *songj@illinois.edu

Issue

Vol. 95, Iss. 4 — April 2017

Reuse & Permissions

It is not necessary to obtain permission to reuse this article or its components as it is available under the terms of the Creative Commons Attribution 4.0 International license. This license permits unrestricted use, distribution, and reproduction in any medium, provided attribution to the author(s) and the published article's title, journal citation, and DOI are maintained. Please note that some figures may have been included with permission from other third parties. It is your responsibility to obtain the proper permission from the rights holder directly for these figures.
