Abstract
Network growth processes can be understood as generative models of the structure and history of complex networks. This point of view naturally leads to the problem of network archaeology: reconstructing all the past states of a network from its structure—a difficult permutation inference problem. In this paper, we introduce a Bayesian formulation of network archaeology, with a generalization of preferential attachment as our generative mechanism. We develop a sequential Monte Carlo algorithm to evaluate the posterior averages of this model, as well as an efficient heuristic that uncovers a history well correlated with the true one, in polynomial time. We use these methods to identify and characterize a phase transition in the quality of the reconstructed history, when they are applied to artificial networks generated by the model itself. Despite the existence of a no-recovery phase, we find that nontrivial inference is possible in a large portion of the parameter space as well as on empirical data.
3 More- Received 20 June 2019
- Revised 1 October 2019
DOI:https://doi.org/10.1103/PhysRevX.9.041056
Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.
Published by the American Physical Society
Physics Subject Headings (PhySH)
Popular Summary
Over the past few decades, physics has extended its reach to complex objects of inquiry that are best represented as networks, such as power grids or webs of social interactions. These networks are fundamentally dynamical objects: They have a past, present, and future. While it is known that such descriptions can lead to accurate predictions, we often only have access to snapshots taken at specific points in time. An important question to answer, then, is whether we can use the present state of networks to determine their history. We argue that we often can, and we show how to do it. Most importantly, we find that when a rich-get-richer dynamic governs the evolution of networks, temporal reconstruction can become impossible if the driving mechanism is too strong.
We adopt a statistical approach and use a generative model to ascribe probabilities to the possible past states of network snapshots. Explicitly computing these probabilities is challenging, so we develop state-of-the-art sampling techniques tailored to the model. When we use this model to construct artificial networks with known histories, we find that the number of symmetries determines its recoverability: Beyond a certain number, the network undergoes a phase transition where history cannot be retrieved. Luckily, we also find that many real networks fall in the recoverable phase, something we can exploit to infer the history of statically observed networks.
While we provide compelling qualitative and numerical evidence for the existence of a phase transition, we have yet to pinpoint the exact location of this transition, something we leave for future work. Another essential open challenge is making our inference methods more scalable: The state-of-the-art method we use does not scale to networks of more than a few thousand nodes.