Detectability Thresholds and Optimal Algorithms for Community Structure in Dynamic Networks

Open Access

Detectability Thresholds and Optimal Algorithms for Community Structure in Dynamic Networks

Amir Ghasemian, Pan Zhang, Aaron Clauset, Cristopher Moore, and Leto Peel

Phys. Rev. X 6, 031005 – Published 13 July 2016

Abstract

The detection of communities within a dynamic network is a common means for obtaining a coarse-grained view of a complex system and for investigating its underlying processes. While a number of methods have been proposed in the machine learning and physics literature, we lack a theoretical analysis of their strengths and weaknesses, or of the ultimate limits on when communities can be detected. Here, we study the fundamental limits of detecting community structure in dynamic networks. Specifically, we analyze the limits of detectability for a dynamic stochastic block model where nodes change their community memberships over time, but where edges are generated independently at each time step. Using the cavity method, we derive a precise detectability threshold as a function of the rate of change and the strength of the communities. Below this sharp threshold, we claim that no efficient algorithm can identify the communities better than chance. We then give two algorithms that are optimal in the sense that they succeed all the way down to this threshold. The first uses belief propagation, which gives asymptotically optimal accuracy, and the second is a fast spectral clustering algorithm, based on linearizing the belief propagation equations. These results extend our understanding of the limits of community detection in an important direction, and introduce new mathematical tools for similar extensions to networks with other types of auxiliary information.

Received 2 November 2015

DOI:https://doi.org/10.1103/PhysRevX.6.031005

This article is available under the terms of the Creative Commons Attribution 3.0 License. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.

Published by the American Physical Society

Physics Subject Headings (PhySH)

Community structure

Evolving networks

Cavity methods

Networks

Authors & Affiliations

Amir Ghasemian¹, Pan Zhang², Aaron Clauset^1,3,4, Cristopher Moore⁴, and Leto Peel^5,6

¹Department of Computer Science, University of Colorado, Boulder, Colorado 80309, USA
²CAS Key Laboratory of Theoretical Physics, Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing 100190, China
³BioFrontiers Institute, University of Colorado, Boulder, Colorado 80305, USA
⁴Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, New Mexico 87501, USA
⁵ICTEAM, Université catholique de Louvain, Avenue George Lemaître 4, B-1348 Louvain-la-Neuve, Belgium
⁶naXys, University of Namur, Rempart de la Vierge 8, 5000 Namur, Belgium

Popular Summary

Detecting communities within the networks that characterize complex social, biological, and technological systems is a crucial step in understanding how their structure shapes their function. However, many such networks are dynamic, with nodes changing their connections and affiliations over time in complicated ways. This situation makes community detection more challenging, but correlations across time provide a means to circumvent this issue. Here, we derive a precise mathematical limit on our ability to recover the underlying community structure in a dynamic network, which depends only on the strength of the hidden communities and the rate at which nodes change their community membership.

While a number of models and algorithms have been proposed for dynamic community detection, up to now there has been no theory about the limits of these algorithms. In this study, we use powerful tools from both statistical physics and Bayesian probability to show that a phase transition exists in the detectability of dynamic communities, and we describe two new algorithms that are optimal in the sense that they succeed all the way down to this threshold. The first algorithm is based on belief propagation, known in physics as the cavity method, and the second is a fast spectral clustering algorithm.

We expect that the techniques used to obtain these results can likely be used to derive similar limits and optimal algorithms for other kinds of generalized networks, such as multiplex networks, networks with edge weights, or networks with node attributes. Additionally, our optimal algorithms can be applied to recover dynamic communities in a wide variety of practical settings.

Key Image

Article Text

Click to Expand

References

Click to Expand

Issue

Vol. 6, Iss. 3 — July - September 2016

Subject Areas

Reuse & Permissions

Author publication services for translation and copyediting assistance advertisement

Physical Review X