Exact rank-reduction of network models

With the advent of the big data era, models of complex networks are becoming elusive from direct computational simulation. We present an exact, linear-algebraic reduction scheme of undirected network models. We group them in universality classes, and work out, in a computationally affordable way, their relevant properties (e.g., spectrum). By exploiting the bilinearity structure of the expected adjacency matrix of the network, we separate its null eigenspace, and reduce the exact description of the model to a smaller vector space. We show that the rank and signature of such matrix entail a natural and comprehensive classification of network models. The reduction also provides the environment for a simplified computation of their properties. The proposed scheme will be very useful in the study of dynamical processes on networks, as well as in the understanding of models to come, according to the provided universal classification.

Network science is experiencing a burst of activity in the modeling and understanding of very large complex systems, including, for example, those formed by social interactions in microblogging platforms as Twitter [1], or by high-throughput molecular biology data related to genomes [2], proteomes [3], metabolomes [4], etc. However, this endeavor is limited by the computational effort required to solve dynamical processes running on top of very large real networks. Moreover, the full knowledge of the connectivity structure (links) and dynamic state of their units (nodes) is often unaffordable. In this case the use of synthetic models of networks is the only alternative.
Synthetic models are a powerful tool for studying real-world networks and the dynamical processes unfolding on them. Their scope covers social systems [5], the spread of epidemics in human and animal populations [6][7][8], neuroscience [9], human mobility [10][11][12], finance [13], ecology [14,15] and more. Through networked models, researchers have improved the understanding of complex systems in terms of easily interpretable analytical relations. A paramount example is the large literature on critical phenomena on complex networks: synchronization [16], spreading processes [17,18], or percolation [19], to cite a few. These models usually encode network structure into a small set of parameters and generative rules. In the last decades, along with the increased availability of highly-resolved data, network models have flourished to explain newly observed properties, like time-evolving contacts or multilayer topologies. We can say that network models are still the tool of choice for uncovering mechanistic properties of complex systems that can be generalized to a wide set of contexts.
The main problem of this deluge of models is that, as they become richer and more intricate, they are harder to solve. Furthermore, the lack of an inclusive theoretical framework makes it difficult to derive theoretical relationships among models, which could tell us about how models are similar in their structure and functionality.
In this Letter we propose an exact rank-reduction scheme for the expected adjacency matrix of model networks. We reduce the effective dimensionality of the network, facilitating its static and dynamical analysis. Moreover, this scheme allows us to define universality classes of network models in terms of the reduced features. We build a general, algorithmic derivation of some of the most relevant properties of these models, as their spectrum, and the behavior of some linear and nonlinear dynamical systems coupled to them. Finally, we describe the relationship between models in terms of linear algebraic relations and the action of symmetry groups. Our methodology, by providing a general framework for both classification and computation, can be straightforwardly applied to models to come, with no need to develop new ad-hoc approaches.
The focus of our work are network models completely defined in terms of their expected adjacency matrix. As we will show, many popular models respond to this definition, both for static and for time-evolving networks. First, let us note that models in physics are usually designed to depend on few -fundamental-parameters, compared to the size and complexity of the system under study. We claim that, in our case, this translates on the expected adjacency matrix of network models having a low rank. To illustrate the claim, we use a very simple network model, the configuration model. Consider n nodes and fix the expected degree of each node: k i , i = 1, · · · , n. When n is large enough, self-loops are a vanishingly small O(1/n) fraction of all edges when k 2 is finite, and the expected adjacency matrix is A ij = k i k j / (n k ), where k is the average degree [20]. If we define the vector K as K i = k i / n k , we can write the A in matrix form as A = KK T . From this last expression it is obvious that rank A = 1, no matter the size of the system (n). Now, we generalize this decomposition for any symmetric matrix A with rank A = r < n, we can write where V ∈ R n,r (an n × r matrix), ∆ ∈ R r,r symmetric, and both have maximal rank, i.e., exactly r. This decomposition entails a powerful interpretation of the adjacency matrix as a bilinear form R n . Given two vectors X, Y ∈ R n , the adjacency matrix returns their scalar product X T AY . This connection between the adjacency matrix and the structure of a scalar product on R n , already unveiled in [21], has deep implications in our study. The main feature of our scalar product, however, is being highly degenerate. The rank-nullity theorem tells us that the eigenspace of A associated to the eigenvalue 0 has dimension n − r. This eigenspace, which we call L, is completely determined in terms of the kernel of the linear map ξ : R n → R r , defined in terms of V T : ξ(x) = V T x. In other words L is the set of vectors x ∈ R n for which V T x = 0. L is the trivial subspace of R n . We focus on the restriction of ξ to the subspace on which it is invertible. There, it induces an isomorphism between a small subspace R of R n , with dimension r, and R r .
Packing all this together, we induce a decomposition of the full space: R n R ⊕ L, and this is the core step in our rank reduction. Given that an isomorphism is also a change of basis, ∆ is the representation in R r of the same scalar product that A is in R. Moreover, given that by definition the restriction of A to L is identically zero, ∆ encodes exactly the same information as A, once the trivial degeneracy is pruned.
By Sylvester's theorem, we can always convert ∆ to normal form, i.e. a diagonal matrix with entries 1s and −1s (the number of positive entries of ∆ is known as metric signature). Consequently, Eq. (1) is simply the scalar product A written into normal form. A schematic representation of the exact reduction is presented in Fig. 1 This decomposition has a straight consequence. It provides a universal classification of any possible expected adjacency matrix. All the possible network models (representable in terms of the expected adjacency matrix) fall into classes defined by the rank r, and the metric signature p of ∆. We will denote these classes as (r.p). Within each uni-versality class, the specific values of the column vectors of matrix identify a specific model. We call these vectors v µ metadegrees, as they are the rank-r generalization of the vector K of the configuration model. When r = 1, the only existing class is (1.1) and it contains the configuration model. At r = 2, we find two classes, which we can identify as Euclidean models (2.2) and Lorentzian models (2.1), according to the terminology used in general relativity.
Class 2.1 is particularly interesting, as it includes the activity-driven model [22], a model for time-evolving networks. This model assigns to each node a probability of activation a i . When active, it establishes links with m other random nodes (active or inactive). All links are reset before the next time step. The activity-driven represents an extremely simple model of temporal networks yet, just as the configuration model, it has been successfully applied to many different contexts, and exhibits a rich and interesting macroscopic behavior. Given the absence of temporal correlations, it is fully described by the expected adjacency matrix [23] and can be written as follows: A = (m/n)(ΩF T + F Ω T ), with Ω i = a i and F i = 1. From the previous expression, the rank-2 structure becomes apparent, as A is the outer product of two linearly independent vectors. To show its signature, we have to put the metadegrees so that ∆ is diagonal. We find ∆ = diag(1, −1) and In a recent extension of the activity-driven model intended to mimic preferentially on attachment [24], nodes, in addition to the activity potential a i , are assigned specific values of attractiveness. When a node activates it will then be more likely to choose nodes with high attractiveness. This model falls again in class (2.1), and its reduced rank representation is the same as in Eq. 2, with a vector proportional to the attractiveness instead of the constant vector F of the original activity model. Also another extension, the simplicial activity-driven model [25], where active nodes create cliques, instead of single links, to account for multi-agent interactions, can be accommodated in our classification. Depending on the relation between the average activity and the clique size, it can be easily proved that the model is Euclidean (2.2) or Lorentzian (2.1). Specifically, one can prove that it is Euclidean if the average node activity ( a ) higher than a threshold value: where q is the clique size. When clique size is not fixed but follows a given distribution the threshold value is more involute but can still be computed. The configuration model with degree-degree correlations also falls in the rank r = 2 universality class. By setting the first metadegree vector of V to be the degree vector K, one can make the Euclidean model 2.2 exhibit arbitrarily disassortative or assortative behavior [26] by tuning the second metadegree.
Finally, we can rank-reduce another popular model the celebrated stochastic-block model [27]. It has a wide range of applications, as, for instance, community detection [28]. In its simpler form, nodes are divided into c subsets. Links within and between subsets occur with different probabilities. We define k to be the average number of connections a node establishes with nodes from the same subset, and h from subsets other than its own. One can show that the rank of the resulting adjacency matrix is equal to the number of subsets: r = c. The signature exhibits instead two regimes. If subsets reflect a community structure, and nodes are more likely to connect within the same subset (k > h/(c − 1)), then the model is Euclidean, and belongs to the class (c.c). If, instead, nodes tend to form link across subsets (k < h/(c − 1)), the metric has a Lorentzian signature, and the model belongs to the family class (c.1). Even in the presence of a fine partition (large c), the only possible signatures are either Euclidean or Lorentzian. This will prove useful in what follows, when we study the internal symmetries of the different models.
In addition to the classification, our rank-reduction scheme allows us to derive key properties of the models. In what follows, we investigate the spectrum, as well as the solution of some linear and non-linear processes. We start from the spectrum of the adjacency matrix, which plays a key role in many centrality measures and determines the behavior of several dynamical processes. The largest eigenvalue [29], for instance, determines the critical behavior of synchronization [16] and diffusion [7,[30][31][32] phenomena. We wonder if our rank-reduction preserves the spectrum. We focus on the eigendecomposition of A in the subspace R, which is itself the direct sum of the eigenspaces of A relative to its nonzero eigenvectors. As mentioned, Eq. (1) is a change of basis for a bilinear form. Given that now we wish to preserve the spectrum, we need to treat A as a linear map R n → R n . The new representation in R r is B = ξAξ −1 = J∆, with J = V T V . The explicit expression of the inverse isomorphism is ξ −1 = V J −1 . Note that matrix J encodes all the possible scalar products among the metadegrees: J µν = v µ · v ν . We can give J a useful statistical interpretation. Let us assume the values of metadegrees of each node (v µ,i = V iµ ) to come from a given probability distribution: the metadegree distribution. The metadegree distribution is the generalization of the degree distribution beyond rank r = 1. Then, node metadegrees are samples from the metadegree distribution. As a result, the scalar product between v µ and v ν is proportional to the sample estimate of the expectation value of the product of these two metadegrees, intended as stochastic variables. In the limit of large network (n → ∞) , the sample estimate converges to the true expected value: (V µ ·V ν )/n → v µ v ν . This implies that the eigenvalues of the adjacency matrix are linear combinations of the second moments of the metadegrees. This feature is extremely relevant, for instance, in the context of epidemic spreading. Many seminal works have shown the epidemic threshold of the configuration model to depend on the second moment of the degree [17,33], with important implications for disease containment. More recently, the same property was found for the activity distribution and other models [34]. We now discover that this is a general property of any model, not a peculiarity of those two.
Through rank-reduction we can derive a simple formula for the eigenvectors, too. Let Λ µ = 0 an eigenvalue of A (and B) and let f (µ) be an associate eigenvector of B in the reduced space (Bf (µ) = Λ µ f (µ) ). Then g (µ) = V ∆f (µ) will be an eigenvector of A for the same eigenvalue. The eigenvalues end eigenvectors of B thus completely determine the spectral decomposition of A. Furthermore, by choosing {f (µ) } to be an orthogonal basis of R r with respect to the scalar product ∆: f (µ)T ∆f (ν) = δ µν Λ −1 µ , {g (µ) } will automatically be an orthonormal basis of R, thus completing an algorithmic construction of the spectral decomposition of the adjacency matrix from its rank-reduced transform.
In addition, the spectrum of A completely determines the behavior of any linear diffusion process of the formẋ = (a + bA)x, which can then be solved in the reduced space and then projected back. We can, however, use rank-reduction to solve a large class of nonlinear dynamical processes, too. Consider the following equation for the operator where f is an arbitrary holomorphic function. This equations contains all nonlinear terms in the form X(AX) j , for any j ∈ N. The operator P = ξ −1 ξ is an orthogonal projector on the subspace R; using P , we decompose X in terms of its action on the two subspaces L, R X = X RR + X RL + X LR + X LL : X RR = P XP , X RL = P A(1 − P ) and so on. By definition of R and L, A is nonzero only inside R: A = P AP . Using this, and the McLaurin decomposition of f (f (z) = j f j z j ) we can prove that f (AX) = f (AX RR ) + g(AX RR )AX RL . Note that g is another holomorphic function defined as g(z) = j f j+1 z j . We can now decompose Eq. (3) in terms of the four parts of X: Equation (4) is completely restricted to R, and we can use our mapping ξ to send it to R r , by defining U = ξXξ −1 . The resulting equation isU = U f (BU ): identical to Eq. (3), but living in a reduced space. Once we solve for U (either analytically or numerically, depending on the specific equation), we can go back to X RR by using the inverse transformation. Once X RR is known, Eq. (5) and (6) are just linear, and can be solved with standard techniques [35][36][37][38]. Once they are known, X LL is derived by mere integration of the rhs of Eq. 7. Remarkably, if we assume a simple (and often realistic) initial condition of X(0) = I, then X LR , X RL are identically zero. This allows us to write a simple, explicit solution of Eq. (3): X(t) = ξ −1 U (t)ξ + (1 − ξ −1 ξ). We have transformed a system of n 2 coupled nonlinear differential equations in n 2 unknowns, into one of just r 2 . The gain is dramatic considering that n is the number of nodes (large), while r (the reduced rank) is for most known models very small. To conclude our study, we will explore the geometric structure and implications of the rank reduction. We will analyze two symmetries that complete the classifications of the models. Let us consider Internal symmetry transformations, understood as operators on the small space R r that leave the adjacency matrix A unchanged. We call them internal because, in each node, they mix its metadegree values, but do not mix values from different nodes. These transformations build up the isometry group of the metrics ∆ (Iso(∆)), and their action on the metadegree matrix is V → V Q T , with Q ∈ Iso(∆). Internal symmetries do modify B, though clearly not its spectrum: B → QBQ −1 . We can then choose Q wisely, so that B has the simplest possible form, provide we know the structure of Iso(∆). Luckily, we have shown that the known models have either a Euclidean or a Lorentzian signature, whose isometry groups are the most known and studied [39]. In the Euclidean case (B = J) we can go further, as there always exists an orthogonal matrix Q so that QBQ −1 = QBQ T = QJQ T is diagonal. This means that the rotated metadegrees V Q T are orthogonal, and since, ∆ = I, their norm directly gives the spectrum of A. As a result, whenever the metadegrees are orthogonal (or we can make them so with a rotation in R r ), i) their norms are the nonzero eigenvalues of A, ii) the metadegrees are also eigenvectors of A. These isometries are of special interest for the understanding of geometrical embeddings of complex networks [40]. We now turn to the latter symmetry: external symmetry transformations. They act on the large space R n by mixing the values each metadegree has on the nodes. Opposite to internal transformations, they do mix nodes but do not mix different metadegrees. These transformations comprise the orthogonal group O(n), and act on the metadegrees as follows: V → SV , for S ∈ O(n). It is easy to show that they change A through a similarity transformation, while keeping B unchanged. We thus have found that internal symmetries span different low-rank representations (B) of the same model (A). External transformations span all the models (A) that have the same low-rank representation (B). Both symmetries, however, preserve the spectrum. Internal and external symmetries are schematically represented in Fig. 2. We now study the combined action of internal and external symmetry transformations: V → SV Q T . In the Euclidean case, this coincides with the singular value decomposition of V . In general, it still has far-fetching implications on the nature of models themselves. Within the same class, it allows mapping different models onto each other. Models that are completely different in nature and purposes may have the same properties if they are linked by this symmetry transformation. As a practical example, we now show that within class 2.1, the activity-driven model (adm henceforth) can be mapped onto the stochastic-block model with two subsets and high inter-subset connectivity (h > k) (sbm henceforth). This is quite remarkable if we consider that the former is commonly used to model time-evolving networks with fixed microscopic activity patterns, while the latter applies to static networks featuring mesoscale structures. We start from an adm with fixed activity vector Ω and number of stubs m (see Eq. (2)). We will land on a sbm featuring two equally-sized subsets, whose degrees k, h will be computed. This means finding the transformation who give V sbm = SV adm Q T , where V sbm , V adm are the metadegrees of the two models when the metric is in normal form (∆ = diag(1, −1)). We can parametrize the internal transformation using the hyperbolic angle θ: Q = e θσ 1 , where σ 1 is the first Pauli matrix. The transformation relation, made explicit for each of the two metadegrees, defining where the entries of vectors F 1 , F 2 are 1 on the first (second) subset, zero otherwise, so that F 1 + F 2 = F . An explicit form of S would then solve Eq. (8). We however only wish to uncover under which conditions such mapping is possible. Thus, we just require that S, being orthogonal, preserve standard scalar products in R n . That fixes the degrees of the : k = m a , h = m a 2 . It also fixes the hyperbolic angle of the internal transformation to θ = − 1 4 log a 2 . This demonstrates that we can indeed map the adm onto a sbm, and that the mapping fixes the parameters of such sbm, and also fixes the gauge induced by internal transformations.
In this research we have proposed a linear algebraic methodology for classifying a large set of network models, including some of the most popular. By using reduced rank representations of the expected adjacency matrix, we have derived many of the properties of the network, and of the dynamical processes on top of them, in an algorithmic way, with a dramatic decrease in the complexity of the analytic and numerical calculations involved. Finally, we have shown how the geometrical properties of our scheme can be used to devise new models, extend and make connections between existing ones. Our rank-reduction scheme is not, however, complete. It does not include, for instance, models with link-link correlations in space or time [41][42][43], because they are not maximally entropic once A is fixed. Neither it considers finite-size effects (n far from the thermodynamic limit). Nevertheless, we are however confident that more general formulations will be found in the future, to overcome both. For example, link-link correlations could be accounted for by considering adjacency tensors [21,44], and finite-size effects could be addressed by perturbing the spectrum of A with diagonal matrices [45], these are just ideas we are considering for a future work.