Super-Resolution Community Detection for Layer-Aggregated Multilayer Networks

Open Access

Dane Taylor, Rajmonda S. Caceres, and Peter J. Mucha

Phys. Rev. X 7, 031056 – Published 26 September 2017

Abstract

Applied network science often involves preprocessing network data before applying a network-analysis method, and there is typically a theoretical disconnect between these steps. For example, it is common to aggregate time-varying network data into windows prior to analysis, and the trade-offs of this preprocessing are not well understood. Focusing on the problem of detecting small communities in multilayer networks, we study the effects of layer aggregation by developing random-matrix theory for modularity matrices associated with layer-aggregated networks with $N$ nodes and $L$ layers, which are drawn from an ensemble of Erdős–Rényi networks with communities planted in subsets of layers. We study phase transitions in which eigenvectors localize onto communities (allowing their detection) and which occur for a given community provided its size surpasses a detectability limit $K^{*}$. When layers are aggregated via a summation, we obtain $K^{*} \propto O(\sqrt{NL}/T)$, where $T$ is the number of layers across which the community persists. Interestingly, if $T$ is allowed to vary with $L$, then summation-based layer aggregation enhances small-community detection even if the community persists across a vanishing fraction of layers, provided that $T/L$ decays more slowly than $O(L^{-1/2})$. Moreover, we find that thresholding the summation can, in some cases, cause $K^{*}$ to decay exponentially, decreasing by orders of magnitude in a phenomenon we call super-resolution community detection. In other words, layer aggregation with thresholding is a nonlinear data filter enabling detection of communities that are otherwise too small to detect. Importantly, different thresholds generally enhance the detectability of communities having different properties, illustrating that community detection can be obscured if one analyzes network data using a single threshold.
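
As a rough reading of the summation scaling above, one can compare the aggregated detectability limit with the single-layer case $L = T = 1$. The sketch below assumes that the same proportionality constant $c$ applies in both cases; the abstract does not state this, and the labels $K^{*}_{\mathrm{single}}$ and $K^{*}_{\mathrm{sum}}$ are introduced here only for illustration.

$$
K^{*}_{\mathrm{single}} \approx c\,\sqrt{N}, \qquad
K^{*}_{\mathrm{sum}} \approx c\,\frac{\sqrt{NL}}{T}, \qquad
\frac{K^{*}_{\mathrm{sum}}}{K^{*}_{\mathrm{single}}} \approx \frac{\sqrt{L}}{T} < 1
\;\Longleftrightarrow\; T > \sqrt{L}
\;\Longleftrightarrow\; \frac{T}{L} > L^{-1/2}.
$$

Under that assumption, summation helps whenever the community persists across more than about $\sqrt{L}$ layers, which matches the stated condition that $T/L$ decays more slowly than $O(L^{-1/2})$. With the illustrative values $L = 100$ and $T = 50$, the ratio is $\sqrt{100}/50 = 0.2$, so the limit would shrink by roughly a factor of five.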

Received 14 September 2016

DOI:https://doi.org/10.1103/PhysRevX.7.031056

Published by the American Physical Society under the terms of the Creative Commons Attribution 3.0 License. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.

Physics Subject Headings (PhySH)

Community structure; Information & communication theory; Network phase transitions; Patterns in complex systems; Scaling laws of complex systems

Interdisciplinary Physics; Networks

Authors & Affiliations

Dane Taylor¹,²,*, Rajmonda S. Caceres³, and Peter J. Mucha¹

¹Carolina Center for Interdisciplinary Applied Mathematics, Department of Mathematics, University of North Carolina, Chapel Hill, North Carolina 27599, USA
²Department of Mathematics, University at Buffalo, State University of New York, Buffalo, New York 14260, USA
³Lincoln Laboratory, Massachusetts Institute of Technology, Lexington, Massachusetts 02420, USA

*dane.r.taylor@gmail.com

Popular Summary

Networks can represent relationships between the online behaviors of individuals such as emailing, file sharing, and purchasing history. Small-community detection, which attempts to identify anomalous clusters within such networks, has important applications in cybersecurity for detecting attacks, intrusions, and fraud. Before analyzing the structural patterns of these networks, data are often preprocessed using a variety of techniques. While there is a wealth of theory for network analysis methodology, most preprocessing is done on an ad hoc basis, and there is significant need for an encompassing theory bridging these two steps.

Here, we focus on the task of detecting small communities in large multilayer networks, wherein “layers” encode different types of connections, such as different instances in time. Because it can be beneficial to aggregate layers that are similar, we analyze the effects of layer aggregation on the detectability of communities. We analyze a novel, important metric: the number of layers a community must persist across in order for aggregation to benefit detection. In addition, we introduce layer aggregation with thresholding as a nonlinear data filter, showing that this preprocessing step can allow super-resolution detection of communities that are otherwise too small to detect.
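
The two aggregation schemes described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names, the parameters `p` and `q` (background and within-community edge probabilities), the choice to plant the community on the first `K` nodes in the first `T` layers, and the uniform null model `B = A - rho * 11^T` are all assumptions made for this example, and the paper's exact modularity-matrix definition may differ.

```python
import numpy as np


def planted_layers(N, L, T, K, p, q, rng):
    """Return L undirected Erdos-Renyi layers (edge probability p).

    Assumption for this sketch: nodes 0..K-1 form a community with
    internal edge probability q in the first T layers only.
    """
    layers = np.zeros((L, N, N), dtype=np.int8)
    iu = np.triu_indices(N, k=1)  # indices of the strict upper triangle
    for layer in range(L):
        P = np.full((N, N), p)
        if layer < T:
            P[:K, :K] = q
        A = np.zeros((N, N), dtype=np.int8)
        A[iu] = rng.random(len(iu[0])) < P[iu]
        layers[layer] = A + A.T  # symmetrize; diagonal stays zero
    return layers


def aggregate(layers, threshold=None):
    """Sum the layers; if a threshold is given, keep edges whose sum reaches it."""
    S = layers.sum(axis=0)
    if threshold is None:
        return S.astype(float)
    return (S >= threshold).astype(float)


def dominant_modularity_eigvec(A):
    """Largest eigenvalue and eigenvector of B = A - rho * 11^T.

    rho is the mean off-diagonal entry of A (a uniform null model,
    assumed here for simplicity).
    """
    N = A.shape[0]
    rho = A.sum() / (N * (N - 1))
    B = A - rho * np.ones((N, N))
    eigvals, eigvecs = np.linalg.eigh(B)  # eigenvalues in ascending order
    return eigvals[-1], eigvecs[:, -1]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, L, T, K = 200, 10, 10, 20
    layers = planted_layers(N, L, T, K, p=0.05, q=0.9, rng=rng)
    for name, A in [("summed", aggregate(layers)),
                    ("thresholded", aggregate(layers, threshold=5))]:
        _, v = dominant_modularity_eigvec(A)
        # Fraction of the eigenvector's squared weight on the planted nodes.
        print(name, float((v[:K] ** 2).sum()))
```

The printed quantity is one simple way to gauge whether the dominant eigenvector localizes onto the planted nodes. Sweeping `threshold` from 1 to `L` is a natural next experiment, since different thresholds are expected to favor communities with different properties.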

This work paves the way for a new class of holistic methods that simultaneously address both the data preprocessing and network analysis steps. Such an approach will lead to improved pattern analytics in cybersecurity and broadly benefit the analysis of diverse networks arising in the social, biological, and engineering sciences.

Issue

Vol. 7, Iss. 3 — July - September 2017
