• Open Access

Compression of Flow Can Reveal Overlapping-Module Organization in Networks

Alcides Viamontes Esquivel and Martin Rosvall
Phys. Rev. X 1, 021025 – Published 29 December 2011

Abstract

To better understand the organization of overlapping modules in large networks with respect to flow, we introduce the map equation for overlapping modules. In this information-theoretic framework, we use the correspondence between compression and regularity detection. The generalized map equation measures how well we can compress a description of flow in the network when we partition it into modules with possible overlaps. When we minimize the generalized map equation over overlapping network partitions, we detect modules that capture flow and determine which nodes at the boundaries between modules should be classified in multiple modules and to what degree. With a novel greedy-search algorithm, we find that some networks, for example, the neural network of the nematode Caenorhabditis elegans, are best described by modules dominated by hard boundaries, but that others, for example, the sparse European-roads network, have an organization of highly overlapping modules.

  • Figure
  • Figure
  • Figure
  • Figure
  • Figure
  • Received 6 May 2011

DOI:https://doi.org/10.1103/PhysRevX.1.021025

This article is available under the terms of the Creative Commons Attribution 3.0 License. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.

Published by the American Physical Society

Authors & Affiliations

Alcides Viamontes Esquivel* and Martin Rosvall

  • Integrated Science Lab, Umeå University, Sweden

  • *a.viamontes.esquivel@physics.umu.se

Popular Summary

Passengers traveling between airports, data packets transferred on the Internet, energy transmitted on a power grid, and ideas exchanged among colleagues or friends are all examples of flow that connect the individual components of complex systems and generate intercomponent dependence. For describing these integrated systems, the theoretical constructs of networks composed of nodes and links are used, with nodes representing the components and links representing the connections between the components. Identifying and understanding how the nodes are connected and the nature of their connections should therefore serve in a fundamental way our need to control or optimize the functions of these systems. While this kind of analyses has been at the center of the complex-networks research, the fact that flows underpin the structural organizations and functions of many network systems has often been overlooked. In this paper, we take a flow-centric perspective and provide an information-theoretic method for sorting the nodes of a network into possibly overlapping functional modules based on the nature and strength of the flows connecting the nodes.

We characterize the structural organization of a network by analyzing the flow on it and identifying the structural pattern of the flow: A group of nodes interconnected by a separate flow system whose integrity persists over a long time is identified as a module; modules are said to be overlapping when their corresponding flows run through a set of shared nodes; and the shared nodes define “fuzzy boundaries” between the modules. The questions for structural sorting of the nodes then become: What is the number of modules in a network? Which nodes belong to which modules? And which nodes should belong to multiple modules and to what degree?

We answer these questions in an information-theoretic framework that takes advantage of the principle that all regularities in data can be used to compress the data. Specifically, we describe the flow through a system as two distinct types of movements: within modules and between modules, very much like the way we use phone numbers for calls within cities and area codes for calls between cities. The length of such a module-aware description depends on how modules are identified and how nodes are assigned to the modules. With too many or too few modules, the description will be longer than it is if we use the optimal number of modules. The same will happen if we assign nodes to too many, too few, or inapt modules. Therefore, our task becomes compressing the description by finding the optimal assignments of nodes into modules. This approach allows us to resolve the fuzzy boundaries between modules by maximally compressing our module-aware description of the flow through the entire network system.

At a speed on par with the fastest methods that operate on the topological backbone of a system instead of on the flow through a system, our novel greedy-search algorithm can capture the overlapping flow systems of networks that are good candidates for functional components in social and biological systems.

Key Image

Article Text

Click to Expand

References

Click to Expand
Issue

Vol. 1, Iss. 2 — October - December 2011

Subject Areas
Reuse & Permissions
Author publication services for translation and copyediting assistance advertisement

Authorization Required


×
×

Images

×

Sign up to receive regular email alerts from Physical Review X

Reuse & Permissions

It is not necessary to obtain permission to reuse this article or its components as it is available under the terms of the Creative Commons Attribution 3.0 License. This license permits unrestricted use, distribution, and reproduction in any medium, provided attribution to the author(s) and the published article's title, journal citation, and DOI are maintained. Please note that some figures may have been included with permission from other third parties. It is your responsibility to obtain the proper permission from the rights holder directly for these figures.

×

Log In

Cancel
×

Search


Article Lookup

Paste a citation or DOI

Enter a citation
×