Sensitive Dependence of Optimal Network Dynamics on Network Structure

The relation between network structure and dynamics is a determining factor in the behavior of complex systems in numerous domains. An important long-standing problem concerns the properties of the networks that optimize the dynamics with respect to a given performance measure. Here we show that such optimization can lead to sensitive dependence of the dynamics on the structure of the network. Specifically, using diffusively coupled systems as examples, we demonstrate that the stability of a dynamical state can exhibit sensitivity to unweighted structural perturbations (i.e., link removals and node additions) for undirected optimal networks and to weighted perturbations (i.e., small changes in link weights) for directed optimal networks. As mechanisms underlying this sensitivity, we identify discontinuous transitions occurring in the complement of undirected optimal networks and the prevalence of eigenvector degeneracy in directed optimal networks. These findings establish a unified characterization of networks optimized for dynamical stability, which we illustrate using Turing instability in activator-inhibitor systems, synchronization in power-grid networks, network diffusion, and several other network processes. Our results suggest that the network structure of a complex system operating near an optimum can potentially be fine-tuned for significantly enhanced stability compared to what one might expect from simple extrapolation. On the other hand, they also suggest constraints on how close to the optimum the system can be in practice. Finally, the results have potential implications for biophysical networks, which have evolved under the competing pressures of optimizing fitness while remaining robust against perturbations.


I. INTRODUCTION
Building on the classical fields of graph theory, statistical physics, and nonlinear dynamics, as well as on the increasing availability of large-scale network data, the field of network dynamics has flourished over the past 15 years [1,2]. Much of the current effort in this area is driven by the premise that understanding the structure, the function, and the relation between the two will help explain the workings of natural systems and facilitate the design of engineered systems with expanded capability, optimized performance, and enhanced robustness. There have been extensive studies on this structure-dynamics relation [3][4][5] in a wide range of contexts, such as synchronization [6][7][8][9][10][11][12][13][14][15][16]; reaction, diffusion, and/or advection dynamics [17][18][19][20]; dynamical stability [21,22]; controllability [23,24]; and information flow [25,26]. Many of these studies have led to systematic methods for enhancing the dynamics through network-structural modifications, with examples including network control [27][28][29][30] and synchronization enhancement [31][32][33][34], where the latter has been demonstrated in applications [35][36][37].
A fundamental question at the core of the structure-dynamics relation is that of optimization: which network structures optimize the dynamics of the system for a given function and what are the properties of such networks? The significance of addressing this question is twofold. First, knowledge of the properties of optimized structures can inform system architecture design. For example, in power-grid networks, whose operation requires frequency synchronization among power generators, the structures that maximize synchronization stability could potentially be used to devise effective strategies for upgrading the system [38]. Second, the identification of the network structures that guarantee the best fitness of natural complex systems can provide insights into the mechanisms underlying their evolution. Examples of such systems include neuronal networks, whose (synaptic) connectivity structure is believed to have been optimized through evolution or learning for categorization tasks [39], synchronization efficiency [40], dynamical complexity [41,42], information transfer efficiency [42,43], and/or wiring cost [40]. The question of optimizing the network structure can be conceptualized as the problem of maximizing or minimizing a measure of dynamical stability, robustness, or performance over all possible configurations of links connecting the dynamical units.
Here we demonstrate that optimized dynamics are often highly sensitive to perturbations applied to the structure of the network. For concreteness, we focus on optimizing the linear stability of desired dynamical states over all networks with a given number of nodes and links. We consider network states in which the (possibly time-dependent) states of the individual nodes are identical across the network, such as in consensus dynamics, synchronized periodic or chaotic oscillations, and states of equilibrium in diffusion processes. We establish conditions under which the stability is sensitive or non-sensitive to structural perturbations, depending on the class of networks and the nature of the perturbations considered, as summarized in Fig. 1. In particular, we show that optimized stability can exhibit sensitivity under different types of perturbations for directed and undirected networks:

Sensitivity of optimal networks
FIG. 1. Directed and undirected networks optimized for the stability of the network dynamics can be sensitive to weighted and unweighted structural perturbations, respectively. The graphs schematically illustrate typical behavior for systems with sensitivity (in red boxes) and systems with no sensitivity (in blue boxes). Under weighted perturbations, we actually show that all undirected networks (including nonoptimal ones) are nonsensitive (lower blue box).
1. Sensitivity to link removals and node additions (unweighted perturbations) for undirected optimal networks in the limit of large network size (top left red box in Fig. 1).
We show that such sensitivity is observed for a class of optimal networks, which we refer to as Uniform Complete Multipartite (UCM) networks. The UCM networks are composed of node groups of equal sizes that are fully connected to each other but have no internal links. We prove that these networks are the only networks that achieve the maximum stability possible for a given number of nodes and links. The UCM networks are part of a larger class of networks, characterized as having the Minimum possible size of the largest Components in their Complement (MCC) among all networks with a given number of nodes and links. We provide a full analytical characterization of the MCC networks of arbitrary finite size and study their behavior as the network size approaches infinity.
2. Sensitivity to changes in link weights (weighted perturbations) for finite-size directed optimal networks (bottom right red box in Fig. 1).
While specific examples can be found in the literature [44][45][46][47], no systematic study exists on general mechanisms and conditions for such sensitivity. Here we provide such conditions in terms of the spectral degeneracy of the network by establishing the scaling relation between the stability and the perturbation size. These conditions imply that spectral degeneracy underlies such sensitivity to link-weight perturbations. We expect this sensitivity to be observed in many applications since spectral degeneracy appears to be common in real networks [48]. Moreover, here we show that optimization tends to increase the incidence of spectral degeneracy, and we also show that the network exhibits approximately the same sensitivity even when the degeneracy (or the optimality) is only approximate.
In addition to these two cases of sensitivity, we have results on the absence of sensitivity in the other two cases (blue boxes in Fig. 1). We illustrate the implications of our results using a general class of diffusively coupled systems for which the network spectrum is shown to determine the stability and other aspects of the dynamics to be optimized. The specific cases we analyze include the rate of diffusion over networks, the critical threshold for Turing instability in networks of activator-inhibitor systems, and synchronization stability in power grids and in networks of chaotic oscillators. The remainder of the article is organized as follows. We first define the class of network dynamics under consideration (Sec. II). We then present our results on the two types of sensitivity anticipated above (Secs. III and IV), followed by examples of physical systems exhibiting these types of sensitivity (Sec. V). We conclude with a discussion on further implications of our results (Sec. VI).

II. NETWORK DYNAMICS CONSIDERED
We aim to address a wide range of network dynamics in a unified way. For this purpose we consider the dynamics of networks of coupled dynamical units governed by the following general equation with pairwise interactions:

ẋ_i = F(x_i, H_i1(x_i, x_1), . . . , H_in(x_i, x_n)),   (1)

for i = 1, . . . , n, where n is the number of dynamical units (nodes), x_i = x_i(t) is the column vector of state variables for the ith node at time t, and ẋ_i denotes the time derivative of x_i. The function F(x, y_1, . . . , y_n) is generally nonlinear and describes how the dynamics of node i are influenced by the other nodes through intermediate variables y_j = H_ij(x_i, x_j), where y_j = 0 indicates no interaction. This means that the dynamics of an isolated node are described by ẋ = F(x, 0, . . . , 0). We assume that the dependence of F on y_j is the same for all j (or more precisely, that F is invariant under any permutation of y_1, . . . , y_n). Thus, the topology of the interaction network and the strength of individual pairwise coupling are not encoded in F, but rather in the (i, j)-dependence of the coupling function H_ij. This extends the framework introduced in Ref. [49] and can describe a wide range of dynamical processes on networks, including consensus protocol [50,51], diffusion over networks [1], emergence of Turing patterns in networked activator-inhibitor systems [52], relaxation in certain fluid networks [53], and synchronization of power generators [54] as well as other coupled identical and non-identical oscillators [49,[55][56][57]. Details on these examples can be found in Supplemental Material [58], Sec. S1. For the class of systems described by Eq. (1), we consider network-homogeneous states given by

x_1(t) = · · · = x_n(t) = x*(t),   (2)

where x*(t) satisfies the equation for an isolated node, ẋ* = F(x*, 0, . . . , 0).
Each of the example systems mentioned above exhibits such a state: uniform agreement in consensus protocols, synchronous dynamics in oscillator networks, uniform occupancy in network diffusion, uniform concentration in coupled activator-inhibitor systems, and the equilibrium state in the fluid networks. Note that certain nonhomogeneous states can also be represented using such a solution by changing the frame of reference (demonstrated for specific examples of non-uniform phase-locked states in power grids and phase oscillator networks in Supplemental Material [58], Sec. S1A).
To facilitate the stability analysis, we make two general assumptions on the nature of node-to-node interactions when the system is close to a network-homogeneous state. Assumption (A-1): The interactions are "diffusive," in the sense that the coupling strength between two nodes, H_ij(u, v), is to first order proportional to the difference between their states, v − u.
In particular, we assume that the coupling strength vanishes as the node states become equal. Assumption (A-2): There is a constant coupling matrix A = (A_ij) encoding the structure of the network of interactions, in the sense that the proportionality coefficient (the "diffusion constant") in assumption (A-1) can be written as A_ij · G(t), where the scalar A_ij represents the strength of coupling from node j to node i, and the matrix-valued function G(t) is independent of i and j.
Under these assumptions, we define a stability function Λ(α) for each complex-valued parameter α (derivation presented in Appendix A), which captures the factors determining the stability of the network-homogeneous state but is independent of the network structure. This function, referred to as a master stability function in the literature, was originally derived for a general class of systems that is different from the one we consider here [49,56]. The influence of the network structure on the stability is only through the (possibly complex) eigenvalues of the Laplacian matrix L, defined by

L_ij := δ_ij Σ_k A_ik − A_ij.   (3)

Note that L always has a null eigenvalue λ_1 = 0 associated with the eigenvector (1, . . . , 1)^T, which corresponds to the mode of perturbation that does not affect the condition in Eq. (2). The maximum Lyapunov exponent measuring the stability of the network-homogeneous state is then given by

Λ_max := max_{2 ≤ j ≤ n} Λ(λ_j),   (4)

i.e., the state is stable if Λ_max < 0, and unstable if Λ_max > 0. In addition, |Λ_max| gives the asymptotic rate of exponential convergence or divergence.
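The role of Eqs. (3) and (4) can be made concrete with the simplest member of this class, pure network diffusion ẋ = −Lx, for which the stability function reduces to Λ(α) = −Re(α). The following minimal Python sketch (an illustration we add here, using numpy; the choice of network and of Λ is ours) computes the Laplacian spectrum of a small path network and evaluates Λ_max:

```python
import numpy as np

def laplacian(A):
    # L_ij = delta_ij * sum_k A_ik - A_ij, as in Eq. (3)
    return np.diag(A.sum(axis=1)) - A

def Lambda(alpha):
    # stability function for pure diffusion, x_dot = -L x (illustrative choice)
    return -np.real(alpha)

# undirected path network on 4 nodes
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
lam = np.sort(np.linalg.eigvalsh(laplacian(A)))   # 0 = lam_1 <= lam_2 <= ...
Lambda_max = max(Lambda(a) for a in lam[1:])      # exclude the null eigenvalue, Eq. (4)
print(lam[1], Lambda_max)  # Lambda_max = -lam_2 < 0: the homogeneous state is stable
```

Since Λ_max = −λ_2 < 0 here, perturbations away from the homogeneous state decay at the asymptotic rate λ_2, which is why the optimization problems below target λ_2.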
As an example of stability optimization, we consider the following fundamental question: For a given number of nodes representing dynamical units, and a given number of links with identical weights, what is the assignment of links that maximizes the rate of convergence to a network-homogeneous state?
In the context of this problem, we may assume A_ij to be binary (A_ij = 0 or 1) without loss of generality, since any common link weight ε ≠ 1 can be factored out of A_ij (making A_ij binary) and absorbed into G(t), which is then accounted for by the stability function Λ(α).

III. SENSITIVITY TO UNWEIGHTED PERTURBATIONS
In this section, we demonstrate the sensitivity of the convergence rate to link removal and node addition in optimal undirected networks (Subsection A). We then show that such sensitivity is not possible for optimal directed networks (Subsection B).
A. Undirected networks

The optimization problem
For the class of networks with a fixed number of undirected links m = Σ_i Σ_{j>i} A_ij, we have the additional constraint that the matrix A is symmetric. This constraint can arise from the symmetry of the physical processes underlying the interaction represented by a link, such as the diffusion of chemicals through a channel connecting reactor cells in a chemical reaction network. In this case, the maximization of the convergence rate can be succinctly formulated as the minimization of Λ_max:

minimize Λ_max over all symmetric matrices A with A_ij ∈ {0, 1} and Σ_i Σ_{j>i} A_ij = m.   (5)

If the stability function Λ(α) is strictly decreasing on the real line {α ∈ C | Im(α) = 0} for Re(α) ≤ λ̄ := 2m/(n − 1) (which is satisfied in most cases, as detailed in Supplemental Material [58], Sec. S1), maximizing the convergence rate to the network-homogeneous state for undirected networks is equivalent to maximizing λ_2, the smallest eigenvalue excluding the null eigenvalue that exists for any network. We note that the problem is also equivalent to minimizing a bound on the deviations from a network-homogeneous state in a class of networks of non-identical oscillators [59]. There have been a number of previous studies [57,[60][61][62][63][64] on the related (but different) problem of maximizing the eigenratio λ_2/λ_n, which measures the synchronizability of the network structure for networks of coupled chaotic oscillators.
The maximization of λ_2 is generally a challenging task, except for the following particular cases. For m = n(n − 1)/2, the only network with n nodes and m links is the complete graph, resulting in the (maximum) value λ_2 = n. For m = n − 1 (implying that the network is a tree), the maximum possible value of λ_2 = 1 is achieved if and only if the network has the star configuration [53]. For other values of m (assuming m ≥ n − 1 to ensure that the network is connected), it is challenging even numerically, mainly because each A_ij is constrained to be either 0 or 1, which makes it a difficult non-convex combinatorial optimization. The problem of maximizing λ_2 has been a subject of substantial interest in graph theory, with several notable results in the limit n → ∞, assuming that each node in the network has the same degree and that this common degree is constant [65][66][67] or assuming a fixed maximum degree [68]. In contrast to these bounded-degree results, below we address the maximization of λ_2 in a different limit: n → ∞ with the link density φ := 2m/[n(n − 1)] held constant.
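The two exactly solvable cases above are easy to verify numerically. The sketch below (our illustration; the helper lambda2 and the choice n = 6 are assumptions for the example) compares λ_2 for the star, for another tree with the same number of links (the path), and for the complete graph:

```python
import numpy as np

def lambda2(A):
    # second-smallest Laplacian eigenvalue (algebraic connectivity)
    L = np.diag(A.sum(axis=1)) - A
    return np.sort(np.linalg.eigvalsh(L))[1]

n = 6
star = np.zeros((n, n))                  # star tree: hub node 0, m = n - 1 links
star[0, 1:] = star[1:, 0] = 1
path = np.diag(np.ones(n - 1), 1)        # path: another tree with m = n - 1 links
path = path + path.T
complete = np.ones((n, n)) - np.eye(n)   # complete graph: m = n(n - 1)/2 links

print(lambda2(star), lambda2(path), lambda2(complete))
# the star attains lambda_2 = 1, the maximum over all trees; any other tree,
# such as the path, gives lambda_2 < 1; the complete graph gives lambda_2 = n
```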

Optimal networks: UCM and MCC
Here we define UCM and MCC networks, and then show that they provide analytical solutions of the optimization problem formulated in the previous section. To define these networks, we first introduce two general quantities that characterize connected component sizes. For a given k, let M(n, k) denote the maximum number of links allowed for any n-node network whose connected components have size ≤ k. Given m, we define k_{n,m} to be the smallest (necessarily positive) integer k for which m ≤ M(n, k); i.e., k_{n,m} is the minimum size of the largest connected components of any network with n nodes and m links. We also use the notion of graph complements [32,69,70]. For a given network with adjacency matrix A, its complement is defined as the network with the adjacency matrix A^c given by

A^c_ij := 1 − A_ij for i ≠ j, and A^c_ii := 0.   (6)

With these definitions and notations, we now define an MCC network to be one whose largest connected component of the complement is of size k_{n,m_c}, where m_c := n(n − 1)/2 − m is the number of links in the complement. To see how the definition of MCC networks relates to the maximization of λ_2, we note that the maximum Laplacian eigenvalue of any network is upper-bounded by its largest component size (stated and proved as Proposition 4 in Supplemental Material [58], Sec. S2B). We also note that the nonzero Laplacian eigenvalues of a network and its complement are related through

λ_i = n − λ^c_{n+2−i}, i = 2, . . . , n,   (7)

where we denote the Laplacian eigenvalues of the network as 0 = λ_1 ≤ λ_2 ≤ · · · ≤ λ_n (noting that the symmetry of A constrains them to be real), and those of the complement as 0 = λ^c_1 ≤ λ^c_2 ≤ · · · ≤ λ^c_n. Thus, the smaller the largest component size in the complement, the smaller we expect the eigenvalue λ^c_n to be, which would imply larger λ_2 according to Eq. (7).
For special combinations of n and m, namely, n = kℓ and m = k²ℓ(ℓ − 1)/2 with arbitrary positive integers k and ℓ, the complement of an MCC network necessarily consists of ℓ components, each fully connected and of size k (stated and proved as Proposition 2 in Supplemental Material [58], Sec. S2A). We refer to this unique MCC network as the UCM network for the given n and m. Translating the structure of its complement to that of the network itself, the UCM network can be characterized as the one in which (i) the nodes are divided into ℓ groups of equal size k (uniform), (ii) all pairs of nodes from different groups are connected (complete), and (iii) no pair of nodes within the same group are connected (multipartite). Figure 2 shows examples of UCM and MCC networks.
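The UCM construction is straightforward to realize numerically. The sketch below (our illustration; the function name ucm and the parameter choice k = 3, ℓ = 4 are ours) builds the complete multipartite adjacency matrix from group labels and confirms λ_2 = n − k = 2m/n:

```python
import numpy as np

def ucm(k, l):
    # UCM network: l groups of k nodes each; links connect all pairs of nodes
    # in different groups, and no pair within the same group
    groups = np.repeat(np.arange(l), k)
    return (groups[:, None] != groups[None, :]).astype(float)

k, l = 3, 4
A = ucm(k, l)
n, m = k * l, int(A.sum()) // 2          # n = 12 nodes, m = k^2 l (l - 1)/2 = 54 links
L = np.diag(A.sum(axis=1)) - A
lam2 = np.sort(np.linalg.eigvalsh(L))[1]
print(n, m, lam2)                        # lambda_2 = n - k = 2m/n = 9
```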
To establish the optimality of UCM and MCC networks, we first prove the following general upper bound:

λ_2 ≤ ⌊2m/n⌋   (8)

for any n and m for which the link density φ = 2m/[n(n − 1)] < 1, where ⌊x⌋ denotes the largest integer not exceeding x. We prove this bound using Proposition 3.9.3 of Ref. [71], which states that

λ_n ≥ d_max + 1   (9)

holds true for any network (with at least one link), where d_max denotes the maximum degree of the network. Applying this proposition to the complement of the network (rather than the network itself) gives

λ^c_n ≥ d^c_max + 1 ≥ ⌈d̄^c⌉ + 1 = ⌈2m_c/n⌉ + 1,   (10)

where d^c_max and d̄^c denote the maximum and mean degree of the complement, respectively, and ⌈x⌉ denotes the smallest integer larger than or equal to x. Thus, we have λ_2 = n − λ^c_n ≤ n − (⌈2m_c/n⌉ + 1) = n − (⌈n − 1 − 2m/n⌉ + 1) = ⌊2m/n⌋, establishing Eq. (8).
The optimality of UCM networks can now be established for any combination of n and m for which the UCM network can be defined [i.e., n = kℓ and m = k²ℓ(ℓ − 1)/2]. Indeed, since each connected component in the complement of such a UCM network is fully connected and of size k, it follows that the maximum Laplacian eigenvalue of the complement is λ^c_n = k. (This is because the Laplacian spectrum of a network is the union of the Laplacian spectra of its connected components, which is a known fact presented as Proposition 3 in Supplemental Material [58], Sec. S2B.) We thus conclude that λ_2 = n − k = ⌊2m/n⌋, implying that the UCM network attains the upper bound in Eq. (8) and has the maximum possible λ_2. Moreover, the UCM network is actually the only optimizer among all networks with the same n and m (proved in Appendix B).
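The bound of Eq. (8) can also be spot-checked numerically. The following sketch (our illustration; random graphs are generated with numpy only, and the sample size and density p = 0.5 are arbitrary choices) verifies λ_2 ≤ ⌊2m/n⌋ over random undirected graphs:

```python
import numpy as np

def lambda2(A):
    # second-smallest Laplacian eigenvalue
    L = np.diag(A.sum(axis=1)) - A
    return np.sort(np.linalg.eigvalsh(L))[1]

rng = np.random.default_rng(0)
n = 12
for _ in range(200):
    # Erdos-Renyi-style random undirected graph with p = 0.5
    U = np.triu(rng.random((n, n)) < 0.5, 1).astype(float)
    A = U + U.T
    m = int(A.sum()) // 2
    if m == n * (n - 1) // 2:
        continue  # Eq. (8) requires link density phi < 1
    assert lambda2(A) <= np.floor(2 * m / n) + 1e-8
print("Eq. (8) holds on all sampled graphs")
```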
For other MCC networks, we establish the formula

λ_2 = n − k_{n,m_c}   (11)

for any link density φ < 1 and use it to show that MCC networks attain the upper bound in Eq. (8), and thus are optimal, in several cases of lowest and highest link densities, as well as for a range of link density around each value corresponding to a UCM network. We also show that each MCC network is locally optimal in the space of all networks with the same n and m, in the sense that λ_2 ≤ n − k_{n,m_c} holds true for any network obtained by rewiring a single link. Proofs of these results can be found in Supplemental Material [58], Secs. S2B and S2C. The optimality of these networks, which have fully connected clusters in the complement, suggests potential significance of other, more general network motifs [72], whose statistics have been studied in the context of network optimization [73,74]. These λ_2-maximizing networks can be explicitly constructed. In fact, given any n and m, an MCC network with n nodes and m links can be constructed by forming as many isolated, fully connected clusters of size k_{n,m_c} as possible in the complement of the network. Details on this procedure are described in Appendix C, and a MATLAB implementation is available for download [75]. This procedure yields the (unique) UCM network if n = kℓ and m = k²ℓ(ℓ − 1)/2. Similar strategies that suppress the size of the largest connected components, when incorporated into a network growth process, have been observed to cause discontinuous, or continuous but "explosive," percolation transitions [76][77][78][79][80]. The deterministic growth process defined in Ref. [81] is particularly close to the definition of MCC networks because the process explicitly minimizes the product of the sizes of the components connected by the new link in each step.

Sensitivity of optimal networks
To demonstrate the sensitivity of UCM networks to link removals and node additions, we first study the dependence of λ_2 for MCC networks on the link density φ < 1. By deriving an explicit formula for k_{n,m_c}, we rewrite Eq. (11) as an explicit function, Eq. (12), of n and φ, in which an integer ℓ depending on φ appears (derivation presented in Supplemental Material [58], Sec. S2D). Equation (12) indicates that λ_2 experiences a series of sudden jumps as the link density increases from φ = 2/n (the minimum possible value for a connected network, corresponding to the star configuration) to φ = 1 (corresponding to the fully connected network). This behavior is better understood by considering the complement of the network as the number of links m_c in the complement increases (corresponding to decreasing link density φ), as illustrated for n = 20 in Fig. 3. When the complement has exactly M(n, k_{n,m_c}) links, any additional link forces the maximum component size k_{n,m_c} to increase by one, causing a jump in λ_2 = n − k_{n,m_c}. In Fig. 3, for example, when the network that has m_c = M(20, 4) = 30, k_{20,30} = 4, and λ_2 = 16 gains one more link in its complement (m_c = 31), the component size jumps to k_{20,31} = 5 and λ_2 jumps down to 15. The 18-node UCM and MCC networks in Fig. 2 also illustrate such a jump. In the context of percolation problems, similar cascades of jumps in the maximum component size, called microtransition cascades, have been identified as precursors to global phase transitions [82]. Figure 4 demonstrates that for a wide range of φ, the MCC networks improve λ_2 significantly over both the Erdős-Rényi (ER) random networks and the networks identified by direct numerical optimization of λ_2 using simulated annealing (SA). The difference is particularly large for φ near certain special values, such as 1/2.
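The quantities M(n, k) and k_{n,m_c} can be computed directly by packing complete clusters, and doing so reproduces the jump discussed above. The sketch below is our illustration; the packing expression for M(n, k) is an assumption that follows from forming as many complete clusters of size k as possible, with the remaining nodes forming one smaller complete cluster:

```python
def M(n, k):
    # maximum number of links when all components have size <= k:
    # floor(n/k) complete clusters of size k, plus one complete cluster
    # on the remaining r = n mod k nodes
    q, r = divmod(n, k)
    return q * k * (k - 1) // 2 + r * (r - 1) // 2

def k_min(n, m):
    # k_{n,m}: minimum possible size of the largest connected component
    # of any network with n nodes and m links
    k = 1
    while m > M(n, k):
        k += 1
    return k

n = 20
print(M(n, 4), k_min(n, 30), k_min(n, 31))   # 30 4 5
# one extra complement link forces k_{n,m_c} to jump from 4 to 5,
# so lambda_2 = n - k_{n,m_c} jumps down from 16 to 15
```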
Note that the optimal value of λ_2 given by the upper bound (black curves) is achieved not only by the UCM network (for example, the one indicated by the green dot for k = 25, ℓ = 2) at φ = 25/49 ≈ 0.51, but also by MCC networks (orange curves) for a finite range of φ around this value. The optimal λ_2 value, however, is sensitive to changes in the link density φ, and it departs quickly from its value at φ = 25/49 as φ moves away from 25/49, particularly for φ < 25/49. In fact, λ_2 has many points exhibiting such sensitivity, which becomes more prominent for larger networks and turns into a singularity as n → ∞ with fixed φ. To see this, we take the limit in Eq. (12) to obtain

lim_{n→∞} λ_2/n = 1 − [1 + √(1 − φ/φ_{ℓ+1})]/(ℓ + 1),

where ℓ is the unique integer determined by φ_ℓ ≤ φ < φ_{ℓ+1}, and we define φ_ℓ := 1 − 1/ℓ = (ℓ − 1)/ℓ for any positive integer ℓ. This function of φ, shown in Fig. 4 (red curve), has a cusplike dependence on φ around φ = φ_ℓ, at which it achieves the asymptotic upper bound lim_{n→∞} λ_2/n ≤ φ [which follows directly from Eq. (8)] and has a square-root singularity on the left, i.e., the derivative on the left diverges (while the derivative on the right equals 1/2). This singularity is inherently different from the discrete jumps observed above for finite n. Indeed, as the network size increases, the size of the jumps and the distance between consecutive jumps both tend to zero (as in the microtransition cascades [82] in percolation problems). The function thus becomes increasingly closer to a (piecewise) smooth function, while the square-root singularity becomes progressively more visible [verified numerically in Fig. S2(a) of Supplemental Material [58]]. For each singularity point φ = φ_ℓ, there is a sequence of UCM networks with increasing k (and thus increasing network size n = kℓ) for which the link density φ = k(ℓ − 1)/(kℓ − 1) converges to φ_ℓ as k → ∞. The UCM networks associated with these singularities also exhibit sensitivity to the removal of an arbitrary link.
As shown in the previous section, the UCM networks are the only networks that attain the upper bound in Eq. (8) and satisfy λ_2 = ⌊2m/n⌋ = ⌊k(ℓ − 1)⌋ = k(ℓ − 1) [where the last equality holds because k(ℓ − 1) is an integer]. The removal of any single link reduces the bound to ⌊2(m − 1)/n⌋ = ⌊k(ℓ − 1) − 2/n⌋ = k(ℓ − 1) − 1 and thus reduces the normalized eigenvalue λ_2/n by at least 1/n. Since the link removal reduces φ by 2/[n(n − 1)], the derivative of the normalized eigenvalue with respect to φ is greater than or equal to (1/n)/{2/[n(n − 1)]} = (n − 1)/2, which diverges in the limit of large n. In terms of the complement, this can be understood as coming from the unavoidable increase of the component size, since the link removal in the network corresponds to a link addition in the complement. We note that the argument above is valid only for UCM networks, since the UCM network is the only one that attains the bound for any φ value at which the upper bound is discontinuous, i.e., when 2m/n is an integer (proof given in Appendix B). In summary, we have the following result: The UCM networks, which maximize λ_2 and correspond to singularities in the λ_2 vs. φ curve for MCC networks, are sensitive to link removals.
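This link-removal sensitivity is visible even in a small UCM network. The sketch below (our illustration; the choice k = ℓ = 3 is ours) removes a single link from the 9-node UCM network and shows λ_2 dropping from the optimal value 6 to at most ⌊2(m − 1)/n⌋ = 5:

```python
import numpy as np

def lambda2(A):
    L = np.diag(A.sum(axis=1)) - A
    return np.sort(np.linalg.eigvalsh(L))[1]

k, l = 3, 3                                  # UCM network with n = 9, m = 27
groups = np.repeat(np.arange(l), k)
A = (groups[:, None] != groups[None, :]).astype(float)
lam2_ucm = lambda2(A)                        # optimal value n - k = 6

A[0, k] = A[k, 0] = 0                        # remove a single (inter-group) link
lam2_pert = lambda2(A)                       # now bounded by floor(2(m - 1)/n) = 5
print(lam2_ucm, lam2_pert)
```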
The UCM networks show similar sensitivity to node additions as well. When m is fixed, the expression for λ_2 given in Eq. (12), considered now as a function of n, has a square-root dependence on the right of the points n = √(2m/φ_ℓ), ℓ = 2, 3, . . . (corresponding to the UCM networks), as illustrated in Fig. S3 of Supplemental Material [58]. Similarly to the case of link removals, it can be shown that the bound in Eq. (8) suddenly drops from k(ℓ − 1) to k(ℓ − 1) − 1 when a new node is connected to the network as long as the number of new links is less than m/n, and that this drop leads to an infinite derivative of λ_2/n with respect to φ in the limit of large n.

B. Directed networks

The optimization problem
For the class of networks with a fixed number of directed links m_d = Σ_i Σ_{j≠i} A_ij, the matrix A can be asymmetric in general. In this case, the problem of maximizing the rate of convergence to the network-homogeneous state can be expressed as the minimization of Λ_max over all matrices A with A_ij ∈ {0, 1} and Σ_i Σ_{j≠i} A_ij = m_d. The solution of this problem generally depends on the specific shape of the stability function. However, the problem is equivalent to maximizing Re(λ_2), the smallest real part among the eigenvalues of L excluding the identically null eigenvalue λ_1, if the stability function Λ(α) is strictly decreasing in Re(α) and independent of Im(α) for Re(α) ≤ λ̄ := m_d/(n − 1). This condition is satisfied, e.g., for consensus and diffusion processes (details presented in Supplemental Material [58], Secs. S1D and S1E, respectively). This equivalence is a consequence of the upper bound

Re(λ_2) ≤ m_d/(n − 1) = λ̄,   (18)

which follows from the fact that the sum of the eigenvalues equals the trace of L, which in turn equals m_d. [We note that the tighter bound in Eq. (8) is not applicable to directed networks in general.]

Optimal networks
The optimization problem just formulated can be solved if m_d is "quantized," i.e., equals an integer multiple of n − 1, in which case there are networks that satisfy λ_2 = · · · = λ_n = λ̄ [32]. Such networks attain the upper bound in Eq. (18) and thus are optimal. The class of directed networks satisfying λ_2 = · · · = λ_n = λ̄ has previously been studied within the context of network synchronization using objective functions that are not defined by Re(λ_2) and are different from the convergence rate considered here [32,36]. If m_d is not an integer multiple of n − 1, the maximization of Re(λ_2), like the maximization of λ_2 for undirected networks, is a hard combinatorial optimization problem. Here we compute the Laplacian eigenvalues symbolically (and thus exactly) for all directed networks of size n = 3, 4, and 5. For the quantized values of m_d, we verify that the upper bound λ̄ is indeed attained [Figs. 5(a)-5(c)], in which case λ_2 = · · · = λ_n = λ̄ is not only real but also an integer. For intermediate values of m_d, the maximum Re(λ_2) does not appear to follow a simple rule; it can be strictly less than λ̄, have nonzero imaginary part, and/or be non-integer.

Non-sensitivity of optimal networks
In the limit of large networks, however, there is a simple rule: we show below that the maximum value of Re(λ_2), normalized by n, converges to the link density φ := m_d/[n(n − 1)] as n → ∞ with φ fixed. This in particular implies that the normalized maximum Re(λ_2) has no sensitive dependence on φ, in sharp contrast to the sensitivity observed in the same limit for undirected networks [83].
To establish this non-sensitivity result, we first note that φ = λ̄/n is an upper bound for the maximum value of Re(λ_2)/n, which follows immediately from Eq. (18). We show that the maximum value approaches the upper bound by showing that there is a lower bound that approaches the upper bound. The lower bound is established by constructing a specific network with n nodes and m_d directed links. To construct this network, we start with a variant of directed star networks, in which a core of s fully connected nodes are all connected to all the other nodes, where we define s := ⌊λ̄⌋. Since such a network involves exactly s(n − 1) links, the remaining r links, where r := m_d − s(n − 1), are added to the network. The network can thus be constructed as follows: 1) For each i = 1, . . . , s, add n − 1 links from node i to all the other nodes. 2) Add r links from node s + 1 to nodes 1, . . . , r if r ≤ s and to nodes 1, . . . , s, s + 2, . . . , r + 1 if r > s. This network satisfies λ_2 = s = ⌊λ̄⌋ (proof given in Appendix D), which provides a lower bound for the maximum value of Re(λ_2). This lower bound, as well as the upper bound λ̄, is indicated by black curves in Figs. 5(a)-5(c) for 3 ≤ n ≤ 5. Thus, the maximum value of Re(λ_2)/n is at least s/n, and this lower bound approaches the upper bound for large networks: s/n = ⌊λ̄⌋/n = ⌊φn⌋/n → φ as n → ∞. This proves our claim that Re(λ_2)/n for optimal networks is a smooth function of φ in the limit of large networks, thus establishing the absence of sensitivity.
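The two-step construction above can be checked numerically. The sketch below is our illustration (nodes are 0-indexed, so the text's node s + 1 is index s, and the choice n = 5, m_d = 9 is ours); it builds the network and confirms Re(λ_2) = s = ⌊λ̄⌋:

```python
import numpy as np

n, m_d = 5, 9                     # example with non-quantized m_d
lam_bar = m_d / (n - 1)           # upper bound on Re(lambda_2), Eq. (18)
s = int(np.floor(lam_bar))        # size of the fully connected core, s = 2
r = m_d - s * (n - 1)             # leftover links, r = 1

A = np.zeros((n, n))              # A[i, j]: strength of coupling from j to i
for j in range(s):                # step 1): each core node links to all others
    A[:, j] = 1
    A[j, j] = 0
for i in [j for j in range(n) if j != s][:r]:
    A[i, s] = 1                   # step 2): r links from node s + 1 (index s)

L = np.diag(A.sum(axis=1)) - A
lam2_re = np.sort(np.linalg.eigvals(L).real)[1]
print(s, lam2_re)                 # Re(lambda_2) = s, as claimed via Appendix D
```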

IV. SENSITIVITY TO WEIGHTED PERTURBATIONS
To demonstrate the second type of sensitivity, we now study how the convergence rate behaves when a small weighted perturbation is applied to the network structure, particularly when the initial network is optimal or close to being optimal. Since the convergence rate is determined by the Laplacian eigenvalues through the stability function Λ(α) and Eq. (4), it suffices to analyze how the Laplacian eigenvalues respond to such perturbations, which we formulate as perturbations of the adjacency matrix in the form A + δ∆A, where the small parameter δ is positive (unless noted otherwise) and ∆A is a fixed matrix. This type of structural perturbation can represent imperfections in the strengths of couplings in real networks, such as power grids and networks of chemical [84], electrochemical [85], or optoelectronic [36] oscillators.

A. Eigenvalue scaling for arbitrary networks
Here we show that for a given Laplacian eigenvalue λ of a directed network and a generic choice of ∆A, the change ∆λ of the eigenvalue due to the perturbation generally follows a scaling relation, |∆λ| ∼ δ γ . We also provide a rigorous bound for the scaling exponent γ. This scaling exponent determines the nature of the dependence of the perturbed eigenvalue on δ: if 0 < γ < 1, the dependence is sensitive and characterized by an infinite derivative at δ = 0, and if γ ≥ 1, it is non-sensitive and characterized by a finite derivative.

Bound on scaling exponent
We provide an informative bound on γ by proving the following general result on matrix perturbations. Suppose λ is an eigenvalue of an arbitrary matrix M with geometric degeneracy β [36], defined as the largest number of repetitions of λ associated with the same eigenvector (i.e., the size of the largest diagonal block associated with λ in the Jordan canonical form of M ). For perturbations of the form M + δ∆M with an arbitrary matrix ∆M , there exists a constant C such that the corresponding change ∆λ = ∆λ(δ) in the eigenvalue, as a function of δ, satisfies |∆λ(δ)| ≤ Cδ 1/β for all sufficiently small δ [Eq. (19); proof given in Appendix E]. Applying this result to an eigenvalue λ of the Laplacian matrix L, we see that γ ≥ 1/β, implying that the perturbed eigenvalues that converge to λ as δ → 0 do so at a rate no slower than δ 1/β .

Typical scaling behavior
The bound established above suggests that the scaling |∆λ| ∼ δ 1/β would be observed for all perturbed eigenvalues that converge to λ as δ → 0. In fact, our numerics support a more refined statement for networks under generic weighted structural perturbations: for each eigenvector (say, the jth one) associated with λ, there is a set of β j perturbed eigenvalues that converge to λ as δ → 0 and follow the scaling |∆λ| ∼ δ 1/βj [Eq. (20)], where β j is the number of repetitions of λ associated with the jth eigenvector (i.e., the size of the jth Jordan block associated with λ). We numerically verify this individual scaling for Laplacian eigenvalues using random perturbations applied to all off-diagonal elements of A. We consider two examples of directed networks of size n = 6, shown at the top of Fig. 6, which are both optimal because λ 2 = · · · = λ 6 . For each of these networks, the left column plots in the corresponding panel of Fig. 6 show the distributions of the scaling exponent γ in the relation |∆λ| ∼ δ γ for random choices of ∆A, where γ is estimated by fitting the computed values of the perturbed λ i over different ranges of δ. We see that the distributions are sharply peaked around 1/β j (indicated by the gray inverted triangles), with smaller spread for narrower ranges of δ, supporting the asymptotic scaling in Eq. (20) in the limit δ → 0. We note that, for non-generic weighted perturbations (e.g., if the perturbation is constrained to a subset of the off-diagonal elements of A), the exponent may be different from 1/β j in Eq. (20). For example, when perturbing only the existing links of a directed tree (which is optimal with λ 2 = · · · = λ n = 1), the exponent is one, and thus the network is not sensitive to this type of perturbation even if the degeneracy β > 1, as illustrated in Fig. 6(a) (right column plots).
This follows from the fact that the Laplacian matrix of a directed tree is triangular under appropriate indexing of its nodes, which remains true after perturbing the existing links. This non-sensitivity result can be extended to certain other cases, e.g., when P −1 ∆LP is a triangular matrix, where P is the nonsingular matrix in the Jordan decomposition of L and ∆L is the perturbation of the Laplacian matrix (proof presented in Appendix F). In other cases, the scaling with exponent 1/β j , as in Eq. (20), can be observed even when perturbing only the existing links, as illustrated in Fig. 6(b) (right column plots).

FIG. 7 (caption). The networks G1, G2, G7, and G19 (each with 20 nodes and 57 links) all satisfy λ2 = · · · = λ20 = 3 (and thus are optimal) but have different geometric degeneracy. Perturbing the adjacency matrix of each network as A + δ∆A, we plot in double logarithmic scale the resulting change |∆λ(δ)| = |λ(δ) − 3| for all 19 Laplacian eigenvalues (blue dots, many of which are overlapping). The same randomly chosen ∆A is used for all four networks, where each ∆Aij is drawn uniformly from the interval [−1, 1] if the link exists from node j to node i, and from [0, 1] otherwise. The perturbations thus allow small increases and decreases of the weights of existing links, as well as the addition of new links with small weights. In each network, nodes of the same color indicate the same in-degree, and a bidirectional arrow represents two directed links in opposite directions. The three nodes in the center of G1 form a fully connected triangle, and each of the other nodes has three in-links from these center nodes. In each plot, we observe the expected scaling behavior in Eq. (20), indicated by black lines, each labeled with the corresponding scaling exponent 1/βj (G1: β = 1 with 19 eigenvectors and β1 = · · · = β19 = 1; G2: β = 2 with 15 eigenvectors and β1 = · · · = β11 = 1, β12 = · · · = β15 = 2; G7: β = 7 with 7 eigenvectors and β1 = β2 = β3 = 1, β4 = 2, β5 = 3, β6 = 4, β7 = 7; G19: β = 19 with 1 eigenvector and β1 = 19).
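The scaling in Eq. (20) is easy to reproduce on a toy matrix. The sketch below is our illustration (not the networks of Fig. 6): it perturbs a single 3 × 3 Jordan block with eigenvalue 2, for which β = 3, and estimates γ from two values of δ; for this choice of ∆M the characteristic polynomial gives |∆λ| = δ 1/3 exactly.

```python
import numpy as np

# A single 3x3 Jordan block with eigenvalue 2 has geometric degeneracy beta = 3.
# Perturbing its bottom-left corner gives the characteristic polynomial
# (2 - lam)**3 + delta = 0, so the three perturbed eigenvalues satisfy
# |dlam| = delta**(1/3): the scaling of Eq. (20) with beta_j = 3.
beta = 3
M = 2.0 * np.eye(beta) + np.diag(np.ones(beta - 1), 1)  # Jordan block J_3(2)
dM = np.zeros((beta, beta))
dM[beta - 1, 0] = 1.0                                   # fixed perturbation direction

def max_shift(delta):
    """Largest |eigenvalue - 2| of M + delta*dM."""
    return np.max(np.abs(np.linalg.eigvals(M + delta * dM) - 2.0))

# estimate the scaling exponent gamma from two perturbation sizes
d1, d2 = 1e-6, 1e-9
gamma = np.log(max_shift(d2) / max_shift(d1)) / np.log(d2 / d1)  # ~ 1/3
```

The fitted exponent approaches 1/β = 1/3, with an infinite derivative of the eigenvalue shift at δ = 0, i.e., sensitive dependence.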

B. Classification of networks by their sensitivity
The general scaling results in the previous section indicate that the overall sensitivity of a Laplacian eigenvalue is determined by its geometric degeneracy β. This is because larger β j means more sensitive dependence on δ in Eq. (20) and because β is by definition the largest among all the associated β j 's. Thus, we summarize as follows: A Laplacian eigenvalue is sensitive to generic weighted perturbations if and only if the geometric degeneracy β > 1, i.e., the associated eigenvector is degenerate.

Sensitivity in directed networks
We now show that optimal directed networks are often sensitive to generic weighted perturbations. Figure 7 shows examples from the class of optimal networks satisfying λ 2 = · · · = λ n . The geometric degeneracy β can be different for different optimal networks in this class and provides a measure of how sensitive an eigenvalue is when β > 1. Some of these networks are non-sensitive, including simple cases such as the fully connected networks and directed star networks, as well as other networks with more complicated structure, such as the network G 1 in Fig. 7. Other optimal networks in this class are sensitive, and there is a hierarchy of networks having different levels of sensitivity, from β = 2 (e.g., network G 2 in Fig. 7) all the way up to the maximum possible value β = n − 1 (e.g., network G 19 in Fig. 7), including all intermediate cases (e.g., network G 7 in Fig. 7). Such scaling behavior and the resulting sensitivity for β > 1 are robust in the sense that they would be observed even if the associated eigenvector is only approximately degenerate (proved in Appendix G).
How often does an optimal network (including those not satisfying λ 2 = · · · = λ n ) have β > 1 and thus exhibit sensitivity? To study this systematically, we compute β j symbolically and thus exactly for each Laplacian eigenvalue of all possible directed networks with n ≤ 5. We find that a large fraction of the Re(λ 2 )-maximizing networks are indeed sensitive due to geometric degeneracy: 44.4%, 64.3%, and 71.5% of them have β > 1 for n = 3, 4, and 5, respectively [red bars in Figs. 5(d)-5(f)]. These fractions are significantly higher than the corresponding fractions among all directed networks (including non-optimal ones): 21.1%, 19.7%, and 13.7%, respectively [blue bars in Figs. 5(d)-5(f)]. Since β is bounded by the algebraic degeneracy (multiplicity) of λ 2 , it is interesting to ask how often β attains this bound, giving the network the maximum possible level of sensitivity. Among those networks that are both optimal and sensitive, 74.5% and 60.0% achieve the maximal sensitivity for n = 4 and 5, respectively. (The fraction is trivially 100% for n = 3.) These results thus suggest that optimal directed networks are much more likely to exhibit higher sensitivity than non-optimal ones.

Non-sensitivity in undirected networks
The situation is drastically different when the network is undirected. For an arbitrary undirected network, for which we have the constraint that the matrix A is symmetric, all of its Laplacian eigenvalues are non-sensitive to any (generic or non-generic) perturbation of the form A + δ∆A, since symmetric matrices are diagonalizable [87] and thus γ ≥ 1/β = 1. This in particular implies that there is no sensitivity even for optimal undirected networks, including the UCM and MCC networks. However, this is not in contradiction with the results in Sec. III A, as they concern finite-size perturbations (i.e., addition or removal of whole links) in the limit of large networks, while here we consider infinitesimal perturbations on link weights for finite-size networks.
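This non-sensitivity can be seen in a minimal numerical sketch (our illustration, with an arbitrarily chosen perturbation direction): the Laplacian of the complete graph K 4 has the degenerate eigenvalue 4, yet, being symmetric and hence diagonalizable, it responds only linearly (γ = 1) to a weighted perturbation.

```python
import numpy as np

# Laplacian of the undirected complete graph K4: eigenvalues 0, 4, 4, 4.
# The eigenvalue 4 is (algebraically) degenerate, but the matrix is symmetric
# and hence diagonalizable, so beta = 1 and the response to a perturbation
# is linear in delta (gamma = 1): no sensitive dependence.
n = 4
L = n * np.eye(n) - np.ones((n, n))
dL = np.zeros((n, n))
dL[0, 0] = 1.0                      # an arbitrary fixed perturbation direction

def max_shift(delta):
    """Largest |eigenvalue - 4| among the three eigenvalues near 4."""
    vals = np.linalg.eigvalsh(L + delta * dL)   # sorted ascending
    return np.max(np.abs(vals[1:] - 4.0))

d1, d2 = 1e-3, 1e-5
gamma = np.log(max_shift(d2) / max_shift(d1)) / np.log(d2 / d1)  # ~ 1
```

The fitted exponent is γ ≈ 1: the perturbed eigenvalues have a finite derivative at δ = 0, in contrast to the δ 1/β behavior of degenerate directed networks.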

C. Generality of the scaling
The scaling bound in Eq. (19) is applicable to both directed and undirected networks, regardless of whether the links are weighted or unweighted. We also expect the scaling in Eq. (20) to generically hold true across these classes of networks. Moreover, while the results for unweighted perturbations in Sec. III are specific to the Laplacian eigenvalue λ 2 , Eq. (19) applies to any eigenvalue of an arbitrary matrix, including the adjacency matrix and any other matrix that may characterize a particular system. For example, the largest eigenvalue (in absolute value) of the adjacency matrix for a strongly connected (directed) network is non-degenerate (by, e.g., the Perron-Frobenius Theorem [88]) and therefore non-sensitive. In general, the degree to which the scaling holds is likely to be related to the normality of the matrix, which can range from completely normal matrices with orthogonal eigenvectors (as in undirected networks) to highly non-normal matrices with parallel, degenerate eigenvectors (as in many optimal networks) [89,90]. The result in Appendix G implies that the network does not need to be perfectly degenerate, which opens the door for observing the sensitivity we identified in real-world applications where exact degeneracy is unlikely [36]. Combining all these with the tendency of optimization to cause geometric degeneracy and with the wide range of systems that can be described by Eq. (1), we expect to observe sensitivity to weighted perturbations in many applications.

V. SENSITIVITY IN EXAMPLE PHYSICAL SYSTEMS
As summarized in Fig. 1, we have established two cases in which sensitive dependence on network structure arises: undirected networks under unweighted perturbations (Sec. III A 3) and directed networks under weighted perturbations (Sec. IV B 1). Here we discuss implications of these cases for concrete examples of physical networked systems.

A. Undirected networks under unweighted perturbations
For undirected networks, the sensitivity of λ 2 observed for UCM networks is relevant for a wide range of networked systems, since the stability function formalism establishes that, in many systems, λ 2 determines the stability properties of relevant network-homogeneous states. Typically the asymptotic rate of convergence |Λ max | is a smooth, monotonically increasing function of λ 2 (concrete examples given in Supplemental Material [58], Secs. S1C-S1F), and thus the maximized convergence rate exhibits sensitivity. Below we list specific cases in which sensitivity is observed in |Λ max | or a related quantity:
1. Convergence rate. For networks of phase oscillators, including models of power-grid networks, the convergence rate to a frequency-synchronized, phase-locked state is a function of the Laplacian eigenvalue λ̃ 2 associated with an effective interaction matrix Ã for the system (details presented in Supplemental Material [58], Sec. S1A). While λ̃ 2 is generally different from λ 2 , the two are strongly correlated, and hence so are Λ max and λ 2 . We thus expect to observe sensitive dependence of Λ max , which is indeed confirmed in Fig. 8(a) for power-grid networks with a prescribed network topology and realistic parameters for the generators and other electrical components in the system.
2. Transient dynamics. In addition to the asymptotic convergence rate Λ max , sensitive dependence can be observed for the convergence rate in the transient dynamics of the network, which depends not only on λ 2 but on all Laplacian eigenvalues. This is illustrated in Fig. 8(b) using the example of coupled optoelectronic oscillator networks (system details described in Supplemental Material [58], Sec. S1B).
3. Critical coupling threshold. Another physical quantity that can exhibit sensitive dependence is the critical coupling threshold for the stability of the network-homogeneous state in systems with a global coupling strength ε. In such systems, the functions H ij are proportional to ε. For identical oscillators capable of chaotic synchronization, the minimum coupling strength for stable synchronization is inversely proportional to λ 2 . For the activator-inhibitor systems [52], the parameter ε is interpreted as the common diffusivity constant associated with the process of diffusion over individual links. As ε is decreased from a value sufficiently large for the uniform concentration state to be stable, there is a critical diffusivity, ε = ε c , corresponding to the onset of Turing instability. This ε c is inversely proportional to λ 2 (derivation given in Supplemental Material [58], Sec. S1C). Such a critical threshold thus depends sensitively on the link density of the network [as illustrated in Fig. 8(c)] as well as on the number of nodes.
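The sensitivity of λ 2 to link removals underlying the cases above can be illustrated with a minimal example of our own (not from the original figures): a UCM network is the complement of isolated fully connected clusters (Appendix B), i.e., a complete multipartite graph. For K 3,3,3 , λ 2 attains the bound 2m/n = 6, yet removing a single one of the 27 links drops λ 2 discontinuously to (13 − √17)/2 ≈ 4.44.

```python
import numpy as np

def lambda2(A):
    """Second-smallest eigenvalue of the Laplacian L = D - A (undirected A)."""
    L = np.diag(A.sum(axis=1)) - A
    return np.sort(np.linalg.eigvalsh(L))[1]

# UCM network for n = 9, mean degree 6: the complement of three isolated
# triangles, i.e., the complete tripartite graph K_{3,3,3}.
n, k = 9, 3
part = np.repeat(np.arange(n // k), k)        # part labels 0,0,0,1,1,1,2,2,2
A = (part[:, None] != part[None, :]).astype(float)

l2_full = lambda2(A)                          # attains the upper bound 2m/n = 6

A_cut = A.copy()
A_cut[0, 3] = A_cut[3, 0] = 0.0               # remove a single (undirected) link
l2_cut = lambda2(A_cut)                       # drops to (13 - sqrt(17))/2 ~ 4.44
```

The finite jump produced by removing less than 4% of the links is consistent with the sensitive dependence of the optimized λ 2 , and with it the quantities listed above, on unweighted structural perturbations.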

B. Directed networks under weighted perturbations
For directed networks, the sensitivity of Laplacian eigenvalues under generic perturbations is typically inherited by the convergence rate Λ max for many systems and processes governed by Eq. (1), including most of the examples described in Supplemental Material [58], Sec. S1. In fact, Λ max would have the same sensitivity as the Laplacian eigenvalue λ j whenever Λ max has a smooth (non-constant) dependence on λ j near the unperturbed values of λ j . Figure 9 illustrates the sharp contrast between sensitive and non-sensitive cases using the example of synchronization in networks of chaotic optoelectronic oscillators [36] (system details described in Supplemental Material [58], Sec. S1B).

VI. DISCUSSION
The sensitive dependence of collective dynamics on the network structure, characterized here by a derivative that diverges at an optimal point, has several implications. On the one hand, it implies that the dynamics can be manipulated substantially by small structural adjustments, which we suggest has the potential to lead to new control approaches based on modifying the effective structure of the network in real time; indeed, the closer the system is to being optimal, the larger the range of manipulation possible with the same amount of structural adjustment. On the other hand, the observed cusp-like behavior imposes constraints on how close one can get to the ultimate optimum in practice, given unavoidable parameter mismatches, resolution limits, and numerical uncertainty.
It is insightful to interpret our results in the context of living systems. The apparent conundrum that follows from this study is that biological networks (such as genetic, neuronal, and ecological ones) are believed to have evolved under the pressure to both optimize fitness and be robust to structural perturbations [91]. The latter means that the networks would not undergo significant loss of function (hence, of optimality) when perturbed. For example, a mutation in a bacterium (i.e., a structural change to a genetic network) causes the resulting strain to be nonviable in only a minority of cases [92]. A plausible explanation is that much of the robustness of living systems comes from the plasticity they acquire from optimizing their fitness under varying conditions [93,94]. In the case of bacterial organisms, for example, it is believed that most of their genes are not essential for a given environmental condition because they are required under different conditions. Bacteria kept under stable conditions, such as those that live inside other living cells (i.e., intracellular bacteria), have evolved to retain virtually only those genes essential under that condition [95] and are thus sensitive to gene removals; they are a close analog of the optimization of a fixed objective function considered here [96]. While there is therefore no conflict between our results and the optimization-robustness trade-off expected for biological networks, investigating the equivalent of the sensitive dependence on network structure in the case of varying conditions or a varying objective function would likely provide further insights.
In general, the optimization-robustness relation may depend on the type of robustness considered. In this study we focused on how stable a state is, and hence on how resistant the network is to small changes in its dynamical state, which can be regarded as a form of robustness (terminology used, for example, in Ref. [98]). It is quite remarkable that, in seeking to optimize the network for this "dynamical" robustness, the network would lose "structural" robustness, where the latter is a measure of how resistant the stability of the network state is to changes in the network structure. But is the observed sensitive dependence on network structure really a sign of non-robustness? The answer is both yes and no. It is "yes" in the sense that, because of the non-differentiability of this dependence, small parameter changes cause stability to change significantly. It is "no" in the sense that, because the cusps appear at valleys rather than at peaks, the stability in the vicinity of the local best parameter choices is still generally better than at locations farther away (that is, specific parameters lead to significant improvement but not to significant deterioration). By considering both the dynamical and the structural robustness in the sense above, we can interpret our results as a manifestation of the "robust-yet-fragile" property that has been suggested as a general feature of complex systems [99].

FIG. 9 (caption, beginning truncated). (a) … Fig. 7) and non-sensitive networks (G1 from Fig. 7). The red cross symbol ("×") indicates the value of µ at δ = 0 corresponding to the case of no perturbation, which is the same for all four networks. The inset shows a zoom-in plot of the marked rectangular region surrounding the red cross. The perturbation matrix ∆A was chosen randomly following the same procedure used in Fig. 7. To facilitate visualization, in this figure we allow negative δ, which corresponds to considering a perturbation term of the form |δ|(−∆A). (b) Log-log plot of the change in convergence rate |∆µ| versus δ, which confirms the scaling |∆µ| ∼ δ 1/β for small δ.
Finally, it is instructive to compare sensitive dependence on network structure with the phenomenon of chaos, which can exhibit multiple forms of sensitive dependence [100]. Sensitive dependence on initial conditions, where small changes in the initial state lead to large changes in the subsequent evolution of the state, is a phenomenon that concerns trajectories in the phase space of a fixed system. Sensitive dependence on parameters may concern a similar change in trajectories across different systems even when the initial conditions are the same, as in the case of the map θ n+1 = 2θ n + c (mod 2π) when c rather than θ 0 is changed. But sensitive dependence on parameters may also concern a change in the nature of the dynamics, which has a qualitative rather than merely quantitative impact on the trajectories; this is the case for the logistic map x n+1 = rx n (1 − x n ), whose behavior can change from chaotic to periodic under arbitrarily small changes in r and, moreover, whose Lyapunov exponent exhibits a cusp-like dependence on r within each periodic window. The latter concerns sensitive dependence of the stability (or the level of stability) of the states under consideration, and is therefore a low-dimensional analog of the sensitive dependence of network dynamics on network-structural parameters investigated here. In the case of networks, however, the cusps emerge not from bifurcations but from optimization. Much in the same way the discovery of sensitive dependence on initial conditions in the context of (what is now known as) chaos sets constraints on long-term predictability and on the reliability of simple models for weather forecasting [101], the sensitive dependence on network structure calls for a careful evaluation of the constraints it sets on predictability and model reliability [102] in the presence of noise and uncertainties in real network systems.
We thus believe that the interplay between network structure, optimization, sensitivity, and robustness is a promising topic of future research that can offer fundamental insights into the properties of complex systems.

Here, D u H ij and D v H ij denote the derivatives with respect to the first and second arguments, respectively, of the function H ij (u, v). We also assume H ij (u, u) = 0 for all u, which ensures that the network-homogeneous state is a valid solution of Eq. (1). Together, these assumptions are equivalent to assuming that, to linear order around the network-homogeneous state, H ij is proportional to A ij G(t), where the scalar A ij is independent of t and the function G(t) is independent of i and j.
Under these assumptions, the variational equation of the system (1) around a given network-homogeneous state x 1 = · · · = x n = x * (t) becomes a linear equation for the perturbations ξ i , where ξ i is the perturbation to the state of node i, D x F and D y F are the derivatives of the function F with respect to the first and any of the other arguments, respectively, evaluated at (x, y 1 , . . . , y n ) = (x * , 0, . . . , 0), and L = (L ij ) is the Laplacian matrix of the network given by Eq. (3). An argument based on the Jordan canonical form of L similar to the one used in Ref. [61] then leads to a stability function Λ(α), defined for a given (complex-valued) auxiliary parameter α as the maximum Lyapunov exponent of the solution η = 0 of the associated variational equation in η. The exponential rate of convergence or divergence is then given by Λ(λ j ) for the perturbation mode corresponding to the jth (possibly complex) eigenvalue λ j of the Laplacian matrix L. Thus, the perturbation mode with the slowest convergence (or fastest divergence) determines the stability of the network-homogeneous state through Λ max defined in Eq. (4).
A key aspect of this approach is that the functional form of Λ(α) does not depend on the network structure, implying that the network structure influences the stability only through the Laplacian eigenvalues [49].
For a system with a global coupling strength parameter ε, such as the networks of identical oscillators and networks of activator-inhibitor systems described in Secs. S1B and S1C of Supplemental Material [58], respectively, the derivative D v H ij (x * , x * ) in the condition (A-2) above is proportional to ε, and G(t) can be chosen to include the factor ε [thus making the stability function Λ(α) = Λ ε (α) dependent on ε]. We note that the class of systems treated in Ref. [49] is an important special case of our formulation in which F(x, y 1 , . . . , y n ) depends linearly on the y variables and the coupling function H ij is proportional to the difference in (some function of) the state of the nodes (details presented in Sec. S1B of Supplemental Material [58]). We also note that the same stability condition Λ max ≤ 0 is derived in Ref. [56] for a general class of systems that is different from the class of systems treated here. An advantage of our formulation is that the assumptions on the nature of pairwise interactions encoded in the coupling functions H ij are intuitive and have clear relation to the network structure encoded in the adjacency matrix A.

Appendix B: Uniqueness of networks attaining the bound
Here we show that, if the mean degree d̄ := 2m/n of the network is a (non-negative) integer, the UCM network is the only one that attains the bound in Eq. (8) among all networks with the same n and m. For n = ℓk and m = ℓk 2 (ℓ − 1)/2, this claim implies that the UCM network is the only λ 2 optimizer. For other combinations of n and m, no UCM network exists, and the claim implies that there is no network that can achieve the upper bound.
To prove the claim, we assume that the network attains the bound, i.e., λ 2 = 2m/n = d̄, and aim to show that it must be a UCM network. We first observe that λ c n = n − λ 2 = n − d̄. Also, since d̄ is an integer, so is the mean degree of the complement, d̄ c = (n − 1) − d̄, and thus Eq. (10) becomes d c max + 1 ≤ λ c n = d̄ c + 1. (B1) Since this implies that the maximum and the mean degree of the complement match, i.e., d c max = d̄ c =: d c , all nodes must have the same degree d c in the complement. Equation (B1) also implies λ c n = d c + 1. Next we consider an arbitrary connected component of the complement and show that its maximum Laplacian eigenvalue equals d c + 1. On the one hand, since the Laplacian spectrum of any network is the union of the Laplacian spectra of its connected components (stated and proved as Proposition 3 in Supplemental Material [58], Sec. S2B), we see that the maximum Laplacian eigenvalue of this component is at most λ c n (= d c + 1). On the other hand, by applying Eq. (9) to the component and noting that its maximum degree is d c , we see that its maximum Laplacian eigenvalue is at least d c + 1. Combining these, we conclude that the maximum Laplacian eigenvalue of this component equals d c + 1. We now use the part of Proposition 3.9.3 in Ref. [71] stating that the equality in Eq. (9) holds true only if d max + 1 = n. Applying this to the component and combining with the result above, we see that the component size must be k := d c + 1. Since each node has degree d c , the component must be fully connected. Since the choice of the component was arbitrary, the same holds true for all components in the complement, implying that they form ℓ isolated, fully connected clusters of size k (for some positive integer ℓ). Therefore, the network must be a UCM network.

Appendix C: Explicit construction of MCC networks
To construct an MCC network for given n and m, we first compute the function M (n, k), which we recall is the maximum number of links possible for a network of size n when the largest size of connected components is ≤ k. For a given k, the maximum number of fully connected clusters of size k that one can form with n nodes is ℓ := ⌊n/k⌋. Forming such clusters requires ℓk(k − 1)/2 links, and completely connecting the remaining n r := n − ℓk nodes requires n r (n r − 1)/2 links. Since any additional link would necessarily make the size of some component greater than k, this network has the maximum possible number of links, and we thus have M (n, k) = ℓk(k − 1)/2 + n r (n r − 1)/2 (proof given in Supplemental Material [58], Sec. S2A). This formula allows us to compute M (n, k) for each k = 1, . . . , n.
The computed M (n, k) can then be used to determine k n,mc for the given m directly from the definition: k n,mc is the smallest integer k for which m c ≤ M (n, k), where m c = n(n − 1)/2 − m. The complement of an MCC network is then constructed so as to have as many fully connected clusters of size k n,mc as possible using all the m c available links. If one or more links remain, we recursively apply the procedure to these links and the set of remaining isolated nodes. If no cluster of size k n,mc can be formed (which occurs only when k n,mc ≥ 3), we first construct a fully connected cluster of size k n,mc − 1, which is always possible since m c > M (n, k n,mc − 1) ≥ (k n,mc − 1)(k n,mc − 2)/2 by the definition of k n,mc . We then connect the remaining links arbitrarily while ensuring that the size of the largest connected component is k n,mc . The resulting Laplacian eigenvalues are independent of the configuration of these links, since all possible configurations are equivalent up to permutation of node indices. The procedure thus generates an MCC network with the given numbers of nodes and links, n and m, respectively. Note that, in the special case of n = ℓk and m = ℓk 2 (ℓ − 1)/2 with given positive integers ℓ and k, the procedure described here results in the UCM network with ℓ groups of size k, as it is the only MCC network in that case. A MATLAB implementation of the procedure [including the relevant functions such as M (n, k) and k n,mc ] is available for download [75].
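The two quantities used above can be implemented directly. The following sketch is our minimal version, with hypothetical function names; the authors' MATLAB code [75] is the reference implementation:

```python
def M(n, k):
    """Maximum number of links for an n-node network whose largest connected
    component has at most k nodes: l = floor(n/k) fully connected clusters of
    size k, plus one fully connected cluster of the n - l*k leftover nodes."""
    l = n // k
    nr = n - l * k
    return l * k * (k - 1) // 2 + nr * (nr - 1) // 2

def k_bound(n, mc):
    """Smallest k with mc <= M(n, k), i.e., the component-size bound k_{n,mc}
    for the complement of an MCC network (mc = number of complement links)."""
    return next(k for k in range(1, n + 1) if mc <= M(n, k))
```

For example, M(6, 3) = 6 (two triangles) and M(6, 4) = 7 (one K4 plus one K2), so a complement with m c = 7 links forces k_bound(6, 7) = 4.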

Appendix D: Lower bound for maximum Re(λ2)
Here we show that the network constructed in Sec. III B 3 to establish the lower bound satisfies λ 1 = 0, λ 2 = · · · = λ n−r = s, λ n−r+1 = · · · = λ n = s + 1, (D1) which in particular implies that λ 2 = s. We first note that λ̄ − 1 < s ≤ λ̄, since we have s = ⌊λ̄⌋ by definition. From the definition of r, we can write r = m d − s(n − 1) = φn(n − 1) − s(n − 1) = (λ̄ − s)(n − 1). Combining these, we see that 0 ≤ r ≤ n − 2. We thus divide the proof into two cases: 0 ≤ r ≤ s − 1 and s ≤ r ≤ n − 2. In the following, we use the notation O n1×n2 for the zero matrix of size n 1 × n 2 and I n1×n1 for the identity matrix of size n 1 .

Case 1:
If 0 ≤ r ≤ s − 1, the matrix L has the lower block triangular form

L = ( L n1×n1    O n1×n2
      B n2×n1    sI n2×n2 ),    (D2)

where we use the notations n 1 = s + 1 and n 2 = n − s − 1. Here L n1×n1 and B n2×n1 are matrices of size n 1 × n 1 and n 2 × n 1 , respectively. The set of eigenvalues of L is thus the union of the set of eigenvalues of L n1×n1 and the set {s, . . . , s} (repeated n 2 times, owing to the diagonal block sI n2×n2 ).
To obtain the eigenvalues of L n1×n1 , we apply a sequence of row operations to the matrix L n1×n1 − λI n1×n1 . Denoting the ith row of this matrix by R i , we first replace R i with R i − R s+1 for each i = 1, . . . , s, and then eliminate the below-diagonal entries of R s+1 by adding suitable multiples of the new rows R 1 , . . . , R s . Because of the specific form of L n1×n1 − λI n1×n1 , this results in an upper triangular matrix whose diagonal elements are s + 1 − λ (first r), s − λ (next s − r), and −λ. Since none of these row operations involve switching two rows or multiplying a row by a nonzero constant, the determinant is invariant, and hence the eigenvalues of L n1×n1 are s + 1 (repeated r times), s (repeated s − r times), and 0 (simple). Combining with the n 2 repetitions of s from the block sI n2×n2 in Eq. (D2), the eigenvalues of L are 0 (simple), s (repeated s − r + n 2 = n − r − 1 times), and s + 1 (repeated r times), satisfying Eq. (D1).

Case 2:
If s ≤ r ≤ n − 2, the matrix L has a lower block triangular form with diagonal blocks K n1×n1 , (s + 1)I n2×n2 , and sI n3×n3 , where we use the notations n 1 = s + 1, n 2 = r − s, and n 3 = n − r − 1. Here, K n1×n1 is the n 1 × n 1 Laplacian matrix of a complete graph of n 1 nodes, with eigenvalues 0 (simple) and n 1 = s + 1 (repeated n 1 − 1 times). Therefore, the eigenvalues of the matrix L are 0 (simple), s (repeated n 3 = n − r − 1 times), and s + 1 (repeated n 1 − 1 + n 2 = r times), satisfying Eq. (D1).

Appendix E: Scaling for eigenvalues with geometric degeneracy
We can establish Eq. (19) for an arbitrary eigenvalue of an arbitrary matrix. Given an n × n matrix M , its Jordan decomposition can be written as M = P JP −1 , (E1) where P is an invertible matrix and J is the block-diagonal Jordan matrix with p Jordan blocks [103]. The jth Jordan block is of size β j × β j , with the eigenvalue λ j repeated along its diagonal and ones on its superdiagonal. Since Eq. (E1) is a similarity transformation, the eigenvalues of M are the same as those of J, which are the diagonal elements λ 1 , λ 2 , . . . , λ p of J with corresponding multiplicities β 1 , β 2 , . . . , β p , respectively. Note that β j can be smaller than the algebraic multiplicity of λ j , since we may have λ j = λ j′ for some j ≠ j′. As in the main text, we consider the matrix perturbation of the form M̃(δ) = M + δ∆M , where δ > 0 and ∆M is an n × n matrix. For a given eigenvalue λ of M , let α and β denote its algebraic and geometric degeneracy, respectively. The geometric degeneracy is defined as the size of the largest Jordan block associated with λ or, equivalently, as the largest number of repetitions of λ associated with the same eigenvector. Since the roots of a polynomial depend continuously on its coefficients, each eigenvalue of a matrix changes continuously as the elements of that matrix change [104]. Therefore, there are exactly α eigenvalues of the matrix M̃(δ) that approach λ as δ → 0. Below we prove that there exists a constant C ≥ 0 such that Eq. (19) holds true for each eigenvalue λ̃(δ) of M̃(δ) that converges to λ, where we denote ∆λ(δ) := λ̃(δ) − λ.
We first use the same P that transforms M into J in Eq. (E1) to transform M̃(δ) for each δ, obtaining P^{-1}M̃(δ)P = J + δQ, where Q is the matrix given by Q := P^{-1}∆M P. Thus, the eigenvalues of M̃(δ) are the same as those of J + δQ. To further transform the matrix, consider the block-diagonal matrix T whose jth block T^{(j)} is a β_j × β_j diagonal matrix with elements T^{(j)}_{kk} = δ^{k/β_j}, k = 1, …, β_j. The matrix T is invertible for all δ ≠ 0. Therefore, the eigenvalues of M̃(δ) are the same as those of the matrix T^{-1}(J + δQ)T. (E6) From the definition of T, it follows that the matrix T^{-1}JT has the same block-diagonal structure as J and T, and that its jth diagonal block has λ_j on the diagonal and δ^{1/β_j} on the superdiagonal. (E7) It also follows that the (i, k)-element of the matrix δT^{-1}QT is upper-bounded in magnitude by |Q_{ik}|δ^{1/β_j}, where j is the index of the Jordan block that intersects the kth column of the matrix J. Applying the Gershgorin theorem [104] to the matrix in Eq. (E6), we see that each eigenvalue of M̃(δ) must be contained in a disk centered at λ_j with radius Cδ^{1/β_j} for some j = 1, …, p, where C := 1 + max_k Σ_i |Q_{ik}|. [The first term in the expression for C comes from the superdiagonal elements in Eq. (E7).] Now the algebraic and geometric multiplicity of the given eigenvalue λ of M can be expressed as α = Σ_j β_j and β = max_j β_j, respectively, where the sum and the maximum are both taken over all j for which λ_j = λ. Choose λ̃(δ) to be any of the α eigenvalues of M̃(δ) that converge to λ as δ → 0. Also choose a fixed δ value sufficiently small to ensure that any two disks with different centers among those mentioned above in connection with the Gershgorin theorem are disjoint (which can be achieved if max_j Cδ^{1/β_j} is less than half the minimum distance between distinct eigenvalues of M).
With this choice, the disk centered at λ with radius Cδ^{1/β} is disjoint from all the others and must contain λ̃(δ); otherwise λ̃(δ) would have to jump discontinuously from another disk as δ → 0, since it must remain in at least one of these disks, and this would violate the continuity of λ̃(δ) with respect to δ. Having λ̃(δ) in the disk centered at λ with radius Cδ^{1/β} immediately gives the inequality in Eq. (19).
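The δ^{1/β} scaling of Eq. (19) can be illustrated numerically. The following sketch uses a hypothetical minimal example, a single Jordan block of size β = 3 perturbed in its bottom-left corner, for which the eigenvalue shift is exactly δ^{1/β}:

```python
import numpy as np

# A 3x3 Jordan block with eigenvalue 0 (geometric degeneracy beta = 3).
beta = 3
J = np.diag(np.ones(beta - 1), k=1)

# Generic perturbation: a single entry in the bottom-left corner.
dM = np.zeros((beta, beta))
dM[-1, 0] = 1.0

for delta in [1e-3, 1e-6]:
    lams = np.linalg.eigvals(J + delta * dM)
    # Here the characteristic polynomial is lambda^3 = delta, so the
    # perturbed eigenvalues all have magnitude delta**(1/beta) exactly.
    shift = np.max(np.abs(lams))
    assert np.isclose(shift, delta ** (1 / beta), rtol=1e-2)
```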

Appendix F: Non-sensitivity under weighted constrained perturbations
We can show that all eigenvalues are non-sensitive under a certain class of weighted perturbations even when β > 1. If the matrix P in the Jordan decomposition of M in Eq. (E1) transforms the perturbation matrix ∆M into an upper triangular matrix, then the matrix T^{-1}QT in Eq. (E6) is also upper triangular. In this case, we have the stronger result that the perturbed eigenvalues are given precisely by λ̃(δ) = λ + δQ_{ii}, where i is the index of any column of J that intersects a Jordan block associated with the eigenvalue λ. The change of each eigenvalue is thus proportional to δ, i.e., the scaling exponent is one, independently of β [which is consistent with the general result in Eq. (19), since β ≥ 1]. The result for non-generic perturbations in Sec. IV A 2 follows from this if M is replaced by the Laplacian matrix L and ∆M by ∆L. In particular, the result applies to the case of a directed tree in which each link has equal weight and ∆L represents a perturbation of the weights of the existing links.
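A minimal numerical illustration of this statement, using a hypothetical 3 × 3 example in which P = I (so that ∆M is already upper triangular in the Jordan basis): the eigenvalue shifts are exactly δQ_{ii}, i.e., linear in δ even though β = 3.

```python
import numpy as np

# Jordan block with eigenvalue 2 and an upper triangular perturbation.
J = 2.0 * np.eye(3) + np.diag(np.ones(2), k=1)
dM = np.array([[1.0, 0.5, 0.2],
               [0.0, 3.0, 0.1],
               [0.0, 0.0, 5.0]])

for delta in [1e-2, 1e-3]:
    # J + delta*dM is upper triangular, so its eigenvalues are its diagonal
    # elements 2 + delta*Q_ii: the shifts are linear in delta.
    lams = np.sort(np.linalg.eigvals(J + delta * dM).real)
    expected = np.sort(2.0 + delta * np.diag(dM))
    assert np.allclose(lams, expected, rtol=0, atol=1e-7)
```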

Appendix G: Scaling for approximately degenerate networks
Here we show that the scaling in Eqs. (19) and (20) is observed even when the eigenvector is only approximately degenerate. More precisely, we show that, when the matrix is close to one with exact degeneracy, the scaling remains valid over a range of δ much larger than the distance between the two matrices.
Suppose that a matrix M_0 has an eigenvalue λ(M_0) with exact geometric degeneracy β. We consider a perturbation of M_0 of the form M_1 = M_0 + ε∆M_1, where ∆M_1 is a fixed matrix satisfying ‖∆M_1‖ = 1. Thus, the distance between M_0 and M_1 is ε, and for small ε (and a generic choice of ∆M_1) the matrix M_1 is approximately degenerate. We now apply a perturbation of size δ to M_1 of the form M_2 = M_1 + δ∆M_2, where ∆M_2 is another fixed matrix satisfying ‖∆M_2‖ = 1. Denoting η := ε/δ, we can write M_2 as a perturbation of M_0 rather than M_1, namely, M_2 = M_0 + δ(η∆M_1 + ∆M_2).
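The crossover described here can be illustrated with a hypothetical 2 × 2 example (not from the main text): for ε much smaller than δ, the measured scaling exponent of |∆λ| versus δ is close to 1/β = 1/2, as if the degeneracy were exact.

```python
import numpy as np

# M0: 2x2 Jordan block (exact geometric degeneracy beta = 2).
M0 = np.array([[0.0, 1.0], [0.0, 0.0]])
dM1 = np.array([[0.0, 0.0], [1.0, 0.0]])  # slightly breaks the degeneracy
dM2 = np.array([[0.0, 0.0], [1.0, 0.0]])  # perturbation whose effect we measure

eps = 1e-12                    # distance of M1 = M0 + eps*dM1 from exact degeneracy
M1 = M0 + eps * dM1

deltas = [1e-4, 1e-6]          # both much larger than eps
shifts = []
for delta in deltas:
    # Eigenvalues of M1 + delta*dM2 are +/- sqrt(eps + delta) ~ sqrt(delta).
    lams = np.linalg.eigvals(M1 + delta * dM2)
    shifts.append(np.max(np.abs(lams)))

# Fitted exponent over this range should be close to 1/beta = 1/2.
slope = np.log(shifts[0] / shifts[1]) / np.log(deltas[0] / deltas[1])
assert abs(slope - 0.5) < 0.05
```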

Sensitive Dependence of Optimal Network Dynamics on Network Structure
Takashi Nishikawa, Jie Sun, and Adilson E. Motter

In the following sections we provide descriptions of several examples of systems and processes to which our analysis applies (Sec. S1), proofs of key properties of MCC networks (Sec. S2), as well as supplemental figures (Figs. S1–S3).

S1. EXAMPLES OF SYSTEMS AND PROCESSES

A. Power grids and networks of non-identical phase oscillators
Given the initial state of a power grid (defined by the voltage phase and magnitude as well as the active and reactive power generation/consumption at each node) obtained from the standard power flow calculation, the short-term dynamics of the generators are governed by the so-called swing equation [S1], which can be expressed as

(2H_i/ω_R) θ̈_i + (D_i/ω_R) θ̇_i = P_{m,i} − E_i²G_{ii} − Σ_{j≠i} E_iE_j[B_{ij} sin(θ_i − θ_j) + G_{ij} cos(θ_i − θ_j)], (S1)

where, for each generator i, the angle θ_i is the voltage phase relative to a reference phase rotating at frequency ω_R, H_i is the inertia constant, D_i is the effective damping coefficient (representing various damping and damping-like effects, including the actions of power system stabilizers), and P_{m,i} is the constant mechanical power provided to the generator. Each generator i is modeled as a voltage source of constant magnitude E_i connected to the rest of the network through a transient reactance x_{d,i} and a terminal node, where E_i is determined from the initial state of the system obtained through the power flow calculation. The interactions between generators are represented by an effective admittance matrix, whose real and imaginary components are denoted by G_{ij} and B_{ij}, respectively. This matrix is obtained by reducing the matrix representing the physical network of transmission lines and transformers, which accounts for the network topology encoded by the adjacency matrix A, the node-to-node impedances, and the nodes' power consumption modeled as equivalent constant impedances to the ground [S2]. While Eq. (S1) does not directly fit into the framework of Eq. (1) in the main text, it can be transformed into the same form by rewriting it in terms of the deviations from a frequency-synchronized solution, as we show below. Equation (S1), which describes the dynamics of power generators, belongs to the class of coupled second-order phase oscillators, as it can be expressed as

θ̈_i + β_i θ̇_i = ω_i + Σ_{j≠i} Ĥ_{ij}(θ_j − θ_i), (S2)

where β_i := D_i/(2H_i), ω_i := ω_R(P_{m,i} − E_i²G_{ii})/(2H_i), and Ĥ_{ij}(θ) := ω_R E_iE_j[B_{ij} sin θ − G_{ij} cos θ]/(2H_i). Besides modeling power-grid dynamics, the general class of systems described by Eq.
(S2) can be used to model disordered Josephson junction arrays [S3]. Note that this class of systems does not directly fit into the framework of Eq. (1), because the oscillators are not identical and the coupling function Ĥ_{ij} generally does not satisfy the requirement H_{ij}(u, u) = 0. This is indeed the case for the power-grid equation (S1) because of the cosine term. The difficulty, however, can be overcome in any system of the form of Eq. (S2) by changing the frame of reference. Specifically, we rewrite the equation in terms of the deviations from a frequency-synchronized solution θ_i = θ*_i + Ω_s t, satisfying β_iΩ_s = ω_i + Σ_{j≠i} Ĥ_{ij}(θ*_j − θ*_i) for each i. Writing θ̃_i := θ_i − (θ*_i + Ω_s t), we obtain

d²θ̃_i/dt² + β_i dθ̃_i/dt = Σ_{j≠i} [Ĥ_{ij}(θ*_j − θ*_i + θ̃_j − θ̃_i) − Ĥ_{ij}(θ*_j − θ*_i)].

If we assume that β_i = β is independent of i (which can hold even if the individual generator parameters vary widely across the network), then this equation fits into the framework of Eq. (1), with F describing the damped dynamics of each phase deviation and H_{ij} given by the bracketed difference above; this H_{ij} satisfies the assumptions (A-1) and (A-2) in the main text. In particular, we have D_vH_{ij}(x*, x*) = Ã_{ij}G with Ã_{ij} := Ĥ'_{ij}(θ*_j − θ*_i). The stability function can then be explicitly calculated as

Λ(α) = Re(−β/2 + √(β²/4 − α)),

which shows that the stability function derived in Ref. [S2] remains the same for the general class of second-order phase-oscillator networks in Eq. (S2) and that it is independent of the choice of the coupling function Ĥ_{ij}. The stability function is strictly decreasing as a function of Re(α) for Re(α) ≤ β²/4 and strictly increasing as a function of |Im(α)|. The convergence rate to the frequency-synchronized solution θ_i = θ*_i + Ω_s t can then be determined to be Λ_max = max_{j≥2} Λ(λ̃_j), where the λ̃_j are the eigenvalues of the (weighted) Laplacian matrix associated with the network of effective interactions represented by the (weighted) effective adjacency matrix Ã = (Ã_{ij}).
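The monotonicity properties just stated can be checked numerically. The sketch below assumes that Λ(α) is the largest real part among the roots of the characteristic equation μ² + βμ + α = 0 of the variational dynamics, which is consistent with the explicit formula above:

```python
import numpy as np

def stability_function(alpha, beta=1.0):
    """Largest Re(mu) over the roots of mu^2 + beta*mu + alpha = 0."""
    return np.roots([1.0, beta, alpha]).real.max()

beta = 1.0

# Strictly decreasing in Re(alpha) for real alpha <= beta^2/4:
alphas = np.linspace(0.01, beta**2 / 4, 50)
vals = [stability_function(a, beta) for a in alphas]
assert all(v1 > v2 for v1, v2 in zip(vals, vals[1:]))

# Strictly increasing in |Im(alpha)| at fixed Re(alpha):
ims = np.linspace(0.0, 2.0, 50)
vals_im = [stability_function(0.1 + 1j * b, beta) for b in ims]
assert all(v1 < v2 for v1, v2 in zip(vals_im, vals_im[1:]))
```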
In the case of the power-grid dynamics (S1), we have

Ã_{ij} = Ĥ'_{ij}(θ*_j − θ*_i) = ω_R E_iE_j[B_{ij} cos(θ*_j − θ*_i) + G_{ij} sin(θ*_j − θ*_i)]/(2H_i). (S13)

For the results shown in Fig. 8(a) of the main text on the dependence of Λ_max on the network topology, we assume for simplicity that all generators, transformers, and transmission lines have identical parameters. The parameters are taken from the components of the 9-bus example system discussed in Ref. [S1], and we stressed the system (power generation and consumption uniformly increased by a factor of 8) while using a smaller value of the generators' transient reactances (x_{d,i} = 0.02 per unit) to ensure stability. To construct a power-grid network for a given adjacency matrix A, we first connect the terminal node of each generator to a unique load node through a transformer. These n load buses are then connected to each other by transmission lines according to the topology given by A. Note that the eigenvalues to be inserted into Λ(α) to determine Λ_max are the Laplacian eigenvalues λ̃_j corresponding to Ã, which are not generally equal to the λ_j. However, the variations of E_i and θ*_i are both small for a typical power grid, making Ã approximately equal to a constant multiple of B = (B_{ij}) [see Eq. (S13)]. Since B is tightly related to A through the network reduction process mentioned after Eq. (S1), we expect the λ̃_j to be strongly correlated with the λ_j, and therefore with the convergence rate Λ_max. Note that if we assume that β_i is independent of i and the transmission lines are lossless (i.e., G_{ij} = 0 for i ≠ j), then Eq. (S2) takes the form of a network of coupled second-order Kuramoto-type phase oscillators [S4], which is the form used to model power grids in Ref. [S5]. If we instead take the limit of strong damping (large D_i) rather than assuming β_i = β, the node dynamics reduce to those of a network of first-order Kuramoto-type oscillators [S6].
We also note that the technique used above to put Eq. (S2) into the form of Eq. (1) can also be used to treat the phase-reduced model for networks of weakly coupled limit-cycle oscillators [S7], θ̇_i = ω_i + Σ_{j≠i} Ĥ_{ij}(θ_j − θ_i), which can be regarded as a first-order version of Eq. (S2). In this case, Eq. (A2) becomes η̇ = −αη, and the stability function is Λ(α) = −Re(α) [which is strictly decreasing in Re(α)], again independently of the choice of the coupling function Ĥ_{ij}. The convergence rate to the frequency-synchronized solution can be computed as Λ_max = max_{j≥2} Λ(λ̃_j), where the λ̃_j are the eigenvalues of the Laplacian matrix associated with the effective adjacency matrix defined by Ã_{ij} = Ĥ'_{ij}(θ*_j − θ*_i). In the special case of the Kuramoto model with arbitrary network structure A_{ij} and pairwise frustration parameters δ_{ij}, the coupling function is Ĥ_{ij}(θ) = A_{ij} sin(θ − δ_{ij}), and the effective interaction matrix is given by Ã_{ij} = A_{ij} cos(θ*_i − θ*_j + δ_{ij}).

B. Networks of identical oscillators
The class of coupled oscillator networks considered in Ref. [S8] can be described by Eq. (1) with F(x, y_1, …, y_n) = F(x) + Σ_{j=1}^n y_j and H_{ij}(x_i, x_j) = εA_{ij}[H(x_j) − H(x_i)], (S16) where the function F(x) describes the dynamics of a single oscillator in isolation, the function H represents the signal that a node sends to other nodes, and ε is the global coupling strength. We note that the H_{ij} defined in Eq. (S16) satisfies the assumptions (A-1) and (A-2) in the main text for any (differentiable) H. In particular, we have D_vH_{ij}(x*, x*) = εA_{ij}DH(x*). In this case, the network-homogeneous state represents completely synchronized periodic or chaotic motion of the oscillators. The stability function Λ(α) = Λ_ε(α) derived in Appendix A is determined from Eq. (A2), which in this case can be written as η̇ = [DF − αεDH]η [S8]. For many choices of F and H, the stability function has a range of Re(α) over which it is a strictly decreasing function of Re(α).
The experimental system of coupled optoelectronic oscillators studied in Ref. [S9] does not directly fit into the class of systems described by Eq. (1), because non-negligible time delay is present in the dynamics of the individual oscillators as well as in the coupling between them. However, the convergence rate toward synchronization in this system is determined by the Laplacian eigenvalues through a stability function that is slightly different from the Λ(α) defined in Appendix A but can be derived in a similar fashion. Indeed, the experimentally observed (finite-time) convergence rate toward the synchronous state can be estimated in terms of the rates μ_j = Λ(ελ_j/d), where μ_j is the convergence rate for the jth eigenmode of perturbations, Λ is the stability function, λ_j is the jth Laplacian eigenvalue of the network with adjacency matrix A_{ij}, the constant T = 2.0 ms is the time scale for the experimental system to converge to the synchronous state, ε is the global coupling strength, and d is the normalization factor defined as the average coupling per node, Σ_i Σ_{j≠i} A_{ij}/n. We use the stability function Λ measured experimentally in Ref. [S9] and choose the value of ε for which an optimal network with λ_2 = ⋯ = λ_n = λ̄ achieves the maximum possible convergence rate. In Fig. 9, which shows our results for directed networks, we take only the real part of λ_j, since Λ is available from the experiments only for real values of its argument. This approximation is justified by the fact that the dependence of Λ on the imaginary part is quadratic and that the deviation of λ_j from λ̄, and thus the imaginary part of λ_j, is small for small perturbation size δ. In this system, the elements of the perturbation matrix ∆A represent imperfections in prescribing the coupling strengths between dynamical units in the experiments.

C. Networks of activator-inhibitor systems
We consider a network of coupled activator-inhibitor systems studied in Ref. [S10], which can be described by Eq. (1) with x_i = (z_i, w_i)^T, where z_i and w_i denote the activator and inhibitor concentrations at node i, respectively, and

F(x, y_1, …, y_n) = (f(z, w), g(z, w))^T + Σ_{j=1}^n y_j, (S18)
H_{ij}(x_i, x_j) = A_{ij} diag(ε, σε)(x_j − x_i). (S20)

Here ε and σε are the diffusivity constants for the activator and inhibitor, respectively. The symmetry of the diffusion of these species implies that the network given by A_{ij} must be undirected (and unweighted). For the result shown in Fig. 8(b), we used the node parameter values from Ref.
[S10] (a = 35, b = 16, c = 9, and e = 2/5) and set σ = 16. For any value of ε (and σ), this network has a state of uniform concentration given by x_i = x* = (5, 10)^T for all i. The stability function is determined from Eq. (A2) in Appendix A, which in this case becomes η̇ = (J − αG)η, where J is the 2 × 2 Jacobian matrix with elements f_z, f_w, g_z, and g_w (the partial derivatives of f and g evaluated at x*) and G := diag(ε, σε). We note that the H_{ij} defined in Eq. (S20) has the same form as Eq. (S16). It thus satisfies the assumptions (A-1) and (A-2) in the main text, and we have D_vH_{ij}(x*, x*) = A_{ij}G. Since J − αG is a 2 × 2 matrix that is independent of time, computing its eigenvalues leads to an explicit formula for the stability function: Λ(α) is the larger of the real parts of the two eigenvalues of J − αG. For the parameters used, we find this function to be strictly decreasing over the relevant range of α. The function has two values of α for which Λ(α) = 0, the larger of which can be written as α = α_c/ε, where α_c is a constant (which equals 11/6 for the parameter values we used). Thus, as ε decreases from a sufficiently large value, the critical value of the diffusivity constant ε = ε_c at the onset of the Turing instability is determined by the condition λ_2 = α_c/ε_c, thus giving ε_c = α_c/λ_2, which is a strictly decreasing function of λ_2.
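The constant α_c = 11/6 can be recovered numerically. The sketch below assumes the Mimura-Murray reaction terms f(z, w) = z[(a + bz − z²)/c − w] and g(z, w) = w(z − 1 − ew); these are consistent with the quoted parameter values and the fixed point (5, 10)^T, but are not written out in this section, so treat them as an assumption:

```python
import numpy as np

# Assumed Mimura-Murray reaction terms (not given explicitly in this section).
a, b, c, e, sigma = 35.0, 16.0, 9.0, 2.0 / 5.0, 16.0
z, w = 5.0, 10.0  # uniform fixed point x* = (5, 10)^T

# Jacobian of (f, g) at x*.
J = np.array([[(a + 2 * b * z - 3 * z**2) / c - w, -z],
              [w, z - 1.0 - 2.0 * e * w]])

def Lam(alpha, eps=1.0):
    """Stability function: largest Re eigenvalue of J - alpha*diag(eps, sigma*eps)."""
    G = np.diag([eps, sigma * eps])
    return np.linalg.eigvals(J - alpha * G).real.max()

# Locate the larger zero crossing of Lam (with eps = 1) by bisection.
lo, hi = 1.5, 2.0   # Lam(1.5) > 0 and Lam(2.0) < 0 for these parameters
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if Lam(mid) > 0:
        lo = mid
    else:
        hi = mid
alpha_c = 0.5 * (lo + hi)
assert abs(alpha_c - 11.0 / 6.0) < 1e-6
```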

D. Consensus protocol
The continuous-time linear consensus protocol is described by Eq. (1) with F(x, y_1, …, y_n) = Σ_{j=1}^n y_j and H_{ij}(x_i, x_j) = A_{ij}(x_j − x_i). This H_{ij} satisfies the assumptions (A-1) and (A-2) in the main text, and we have D_vH_{ij}(x*, x*) = A_{ij}G, where G equals the identity matrix in this case. Equation (A2) in Appendix A reads η̇ = −αη, and the stability function is given by Λ(α) = −Re(α) [which is strictly decreasing with respect to Re(α)]. If there is a directed spanning tree embedded in the network, we have Re(λ_j) > 0 for all j ≥ 2, and the system converges from an arbitrary initial condition to a network-homogeneous state x_i = x*, ∀i [S11], at a rate given by Λ_max = −min_{j≥2} Re(λ_j) = −Re(λ_2). If the network is strongly connected and balanced in the sense that the in- and out-degrees are equal for all nodes (i.e., Σ_j A_{ij} = Σ_i A_{ij} for every node), then this state corresponds to the solution of the average consensus problem, x* = Σ_i x_i(0)/n [S11].
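A minimal numerical sketch of the protocol, on a hypothetical directed cycle (which is balanced and strongly connected), confirms both the consensus value and the rate −Re(λ_2):

```python
import numpy as np

n = 6
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = 1.0        # node i receives from node i+1 (directed cycle)
L = np.diag(A.sum(axis=1)) - A      # directed Laplacian (zero row sums)

rng = np.random.default_rng(0)
x0 = rng.standard_normal(n)

# Exact solution x(t) = exp(-L t) x0 via the eigendecomposition of -L.
vals, vecs = np.linalg.eig(-L)
t = 50.0
x_t = (vecs @ np.diag(np.exp(vals * t)) @ np.linalg.inv(vecs) @ x0).real

# Balanced and strongly connected: consensus on the average of x(0).
assert np.allclose(x_t, x0.mean(), atol=1e-6)

# Convergence rate Lambda_max = -min_{j>=2} Re(lambda_j); here it is -1/2.
lam = np.linalg.eigvals(L)
Lambda_max = -min(l.real for l in lam if abs(l) > 1e-9)
assert np.isclose(Lambda_max, -0.5)
```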

E. Diffusion over networks
Assuming Fick's law for the diffusive process over each link of the network, the flux from node j to node i (≠ j) is given by ε(x_j − x_i), where ε is the diffusivity constant and x_i is the density of the diffusing species at node i. The dynamics of the system are then governed by Eq. (1) with F(x, y_1, …, y_n) = Σ_{j=1}^n y_j and H_{ij}(x_i, x_j) = εA_{ij}(x_j − x_i). This H_{ij} satisfies the assumptions (A-1) and (A-2) in the main text, and we have D_vH_{ij}(x*, x*) = A_{ij}G with G = ε. The symmetry of the diffusive process requires that the network be undirected, and hence the λ_j are all real. For a connected network (which has λ_2 > 0), the system has a network-homogeneous state x_i(t) = x* = Σ_i x_i(0)/n, ∀i, t, for which Eq. (A2) in Appendix A becomes η̇ = −αεη, leading to the stability function Λ(α) = −εRe(α) [which is strictly decreasing in Re(α)]. The rate of exponential convergence to this state is thus given by Λ_max = Λ(λ_2) = −ελ_2.
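The convergence rate Λ_max = −ελ_2 can be checked directly; the sketch below uses a hypothetical path graph and measures the decay rate of the deviation from the average:

```python
import numpy as np

# Undirected path graph on n = 5 nodes.
n = 5
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
L = np.diag(A.sum(axis=1)) - A
eps = 0.3

w, V = np.linalg.eigh(L)            # w is in ascending order
lam2 = w[1]                         # smallest nonzero Laplacian eigenvalue

x0 = np.arange(n, dtype=float)      # arbitrary initial densities

def deviation(t):
    """||x(t) - x*||, with x(t) = exp(-eps*L*t) x0 computed in the eigenbasis."""
    x_t = V @ (np.exp(-eps * w * t) * (V.T @ x0))
    return np.linalg.norm(x_t - x0.mean())

# For large t, the deviation decays as exp(-eps*lam2*t).
t1, t2 = 20.0, 25.0
rate = -np.log(deviation(t2) / deviation(t1)) / (t2 - t1)
assert abs(rate - eps * lam2) < 1e-3
```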

F. Fluid networks
The equation of motion for a system of vertical pipes partially filled with liquid and connected by horizontal pipes is given by Eq. (1), where x_i = (z_i, ż_i)^T, the variable z_i represents the liquid level in the ith vertical pipe, and

F(x, y_1, …, y_n) = (ż, −2νż + Σ_{j=1}^n y_j)^T, (S25)
H_{ij}(x_i, x_j) = A_{ij}(0, z_j − z_i)^T, (S26)

where 2ν is the coefficient of the damping due to friction in the pipes [S12]. The difference in the pressure force between two connected vertical pipes with different liquid heights drives the system to an equilibrium state of equal z_i. Since the same pressure force acts on the liquid levels in the two pipes with opposite signs, the interaction matrix A_{ij} is symmetric, implying that we have an undirected network. Equation (A2) in Appendix A becomes η̈ = −2νη̇ − αη. We note that the H_{ij} defined in Eq. (S26) satisfies the assumptions (A-1) and (A-2) in the main text, and we have Λ(α) = Re(−ν + √(ν² − α)), which is a strictly decreasing function of α for α ≤ ν².

S2. LAPLACIAN EIGENVALUE λ_2 OF MCC NETWORKS

Here we first prove some basic properties of the function M(n, k) in subsection A. We then derive the relation λ_2 = n − k_{n,m_c} for all MCC networks for which φ < 1 (subsection B) and establish the local λ_2-optimality of all MCC networks (subsection C). We also establish the global optimality for certain specific cases in subsection C. Finally, we derive the formula in Eq. (12) of the main text in subsection D.

A. Basic properties of M(n, k)

Recall the definition of the class of MCC networks for given n and m as the set of all n-node connected networks with m links for which the largest connected component of the complement is of size k_{n,m_c}, where m_c = n(n − 1)/2 − m is the number of links in the complement. Here, as in the main text, k_{n,m} is the smallest integer k for which m ≤ M(n, k) for given n and m, where M(n, k) is the maximum number of links allowed for any n-node network whose connected components have size ≤ k. The following proposition establishes basic properties of the function M(n, k).

Proposition 1. Given n and k with 1 ≤ k ≤ n, the only network with n nodes and m = M(n, k) links whose connected components have size ≤ k is a network consisting of ℓ isolated, fully connected clusters of size k and an additional isolated, fully connected cluster of size n_r := n − ℓk, where ℓ := ⌊n/k⌋. It follows that

M(n, k) = ℓk(k − 1)/2 + n_r(n_r − 1)/2. (S28)

Furthermore,

0 = M(n, 1) < M(n, 2) < ⋯ < M(n, n) = n(n − 1)/2. (S29)

Proof. Given a network of n nodes and m links, suppose that its largest component size is smaller than or equal to k. By definition, we have m ≤ M(n, k). We shall show that if m = M(n, k), the number of k-node, fully connected clusters in the network must equal ⌊n/k⌋, which is the maximum number of isolated k-node clusters allowed in such a network, and the remaining nodes must form a single, fully connected cluster. Suppose that m = M(n, k). We first show that the individual components of the network must all be fully connected. If they are not, then we can add links to a component that is not fully connected and increase the total number of links beyond M(n, k) while keeping the component sizes fixed (and hence less than or equal to k). Since this contradicts the definition of M(n, k) as the maximum number of links allowed given the number of nodes n and the largest component size k, all components in the network must be fully connected.
We now show that ℓ = ⌊n/k⌋, where ℓ denotes the number of fully connected components of size k. If ℓ < ⌊n/k⌋, then the number of remaining nodes is n_r = n − ℓk ≥ k. Hence, there must be at least two (fully connected) components of sizes k_1 and k_2 among these n_r nodes which satisfy k_1 ≤ k_2 < k. Thus, we can "relocate" a node from the component of size k_1 to the component of size k_2, while rewiring and adding links to ensure that the two components remain fully connected. This "relocation" increases the total number of links by k_2 − (k_1 − 1) ≥ 1 while k remains the largest component size of the network, which contradicts the definition of M(n, k). Therefore, we must have ℓ = ⌊n/k⌋. The n_r remaining nodes must form a single, fully connected cluster, since we could otherwise increase the number of links by adding links to that part of the network, again contradicting the definition of M(n, k). Equation (S28) then follows immediately from counting the total number of links in the ℓ components of size k and the one component of size n_r, all of which are fully connected.
To prove Eq. (S29), suppose that k < n and apply the same "relocation" argument to a network with m = M(n, k) links. This time a cluster of size k increases its size to k + 1 by incorporating a node from another cluster (noting that k < n guarantees at least two clusters) and adding links to keep the cluster fully connected. We thus see that the total number of links strictly increases. Since M(n, k + 1) is larger than or equal to this number of links by definition, we have M(n, k) < M(n, k + 1) for k = 1, …, n − 1. The equalities in Eq. (S29) follow directly from Eq. (S28).
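Equations (S28) and (S29) are easy to check computationally; the following sketch implements M(n, k) directly from Eq. (S28):

```python
from math import comb

def M(n, k):
    """Eq. (S28): floor(n/k) cliques of size k plus one clique on the n_r leftover nodes."""
    ell, n_r = divmod(n, k)
    return ell * comb(k, 2) + comb(n_r, 2)

n = 17
values = [M(n, k) for k in range(1, n + 1)]
assert values[0] == 0                     # M(n, 1) = 0
assert values[-1] == n * (n - 1) // 2     # M(n, n) = n(n-1)/2
assert all(a < b for a, b in zip(values, values[1:]))   # Eq. (S29): strictly increasing
```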
The class of MCC networks for given n and m contains the MCC network generated by the procedure described in Appendix C, but it generally contains more. However, for certain combinations of n and m, we can show that the class contains only one MCC network. Indeed, if m_c = M(n, k) for some 1 ≤ k ≤ n, then Proposition 1 applied to the complement implies that there is only one MCC network in the class. Note that in this case we have k_{n,m_c} = k. If m_c increases by one, m_c ≤ M(n, k) is no longer satisfied, which forces k_{n,m_c} to jump from k to k + 1 and thus causes a jump in λ_2 = n − k_{n,m_c}, as observed, e.g., for the orange curve in Fig. 3. In the case where n_r = 0 is satisfied in addition to m_c = M(n, k), we can show that the unique MCC network is actually a UCM network, as stated in the following proposition.
Proposition 2. Suppose that n = ℓk and m = k²ℓ(ℓ − 1)/2 for positive integers ℓ and k. Then the only MCC network with n nodes and m links is the network whose complement consists of ℓ isolated, fully connected clusters of size k.
Proof. The assumption n = ℓk is equivalent to n_r = 0 and implies M(n, k) = ℓk(k − 1)/2. Since m_c = n(n − 1)/2 − m = ℓk(k − 1)/2, we have m_c = M(n, k). We can thus apply Proposition 1 to the complement of an MCC network with n nodes and m links to conclude that the complement of any such network must consist of ℓ isolated, fully connected clusters of size k.
B. Proof of λ_2 = n − k_{n,m_c}

For a given MCC network with n nodes and m links, assume m < n(n − 1)/2 (i.e., φ < 1), which guarantees that k_{n,m_c} ≥ 2. (We note that if φ = 1, the MCC network must be the complete graph, for which λ_2 = n, even though k_{n,m_c} = 1 in that case.) Below we show that λ_2 = n − k_{n,m_c}, or equivalently, λ^c_n = k_{n,m_c}, using the following propositions.
Proposition 3. The Laplacian spectrum of a network is the union of the Laplacian spectra of its connected components.
Proof. This well-known result follows immediately from reordering the indices to make the Laplacian matrix block-diagonal (with blocks corresponding to connected components) and using the fact that the eigenvalue spectrum of a block diagonal matrix is the union of the spectra of its diagonal blocks.
Proposition 4. The largest Laplacian eigenvalue of a network is at most the size of its largest connected component.
Proof. First note that the largest Laplacian eigenvalue of any network is bounded by the size of that network (a proof can be found, e.g., in Sec. 3.9 of Ref. [S13]). Applying this to each connected component of a given network, we see that the largest Laplacian eigenvalue of the component is at most the size of that component. The conclusion now follows directly from Proposition 3 by considering the union of the Laplacian spectra of the components.
Proposition 5. For any network with n ≥ 2 nodes and more than (n − 1)(n − 2)/2 links, the largest Laplacian eigenvalue is n.
Proof. Suppose that a given network has n nodes and m links, where m > (n − 1)(n − 2)/2. From the identity (n − 1)(n − 2)/2 = n(n − 1)/2 − (n − 1), the number of links in its complement satisfies m_c < n − 1. Since the minimum number of links required for a connected n-node network is n − 1, the complement must be disconnected. This implies that the second smallest Laplacian eigenvalue λ^c_2 of the complement is zero. Therefore, the largest Laplacian eigenvalue of the network is λ_n = n − λ^c_2 = n.
To prove λ^c_n = k_{n,m_c}, we separately show λ^c_n ≤ k_{n,m_c} and λ^c_n ≥ k_{n,m_c}. First, by the definition of MCC networks, the size of each connected component of the complement is at most k_{n,m_c}. By Proposition 4, we immediately obtain λ^c_n ≤ k_{n,m_c}. Next, to show λ^c_n ≥ k_{n,m_c}, suppose hypothetically that all connected components of the complement have at most (k_{n,m_c} − 1)(k_{n,m_c} − 2)/2 links. Then the links in each component of size k_{n,m_c} can be rewired to form a connected component of size k_{n,m_c} − 1 together with an additional isolated node. Doing this, the maximum component size becomes k_{n,m_c} − 1, and the total number of links forming these components is at most M(n, k_{n,m_c} − 1). This implies that the original network must have m_c ≤ M(n, k_{n,m_c} − 1), which contradicts the definition of k_{n,m_c}. Therefore, the initial hypothesis is false, and there must be at least one connected component with more than (k_{n,m_c} − 1)(k_{n,m_c} − 2)/2 links. Since k_{n,m_c} ≥ 2, it follows from Proposition 5 that the largest Laplacian eigenvalue of this component is k_{n,m_c}. By Proposition 3, we conclude that λ^c_n ≥ k_{n,m_c}. Altogether, we have λ^c_n = k_{n,m_c}, and hence λ_2 = n − k_{n,m_c} for all MCC networks with φ < 1.
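The relation λ_2 = n − k_{n,m_c} can be verified numerically for the unique MCC network with m_c = M(n, k), whose complement consists of disjoint cliques (Proposition 1). The example below uses hypothetical values n = 10 and k = 3:

```python
import numpy as np
from itertools import combinations

def laplacian(A):
    return np.diag(A.sum(axis=1)) - A

# Complement: floor(n/k) cliques of size k = 3 plus one clique on the leftover node.
n, k = 10, 3
Ac = np.zeros((n, n))
nodes = list(range(n))
clusters = [nodes[i:i + k] for i in range(0, n, k)]
for cluster in clusters:
    for i, j in combinations(cluster, 2):
        Ac[i, j] = Ac[j, i] = 1.0

A = 1.0 - np.eye(n) - Ac       # the MCC network itself

lam = np.sort(np.linalg.eigvalsh(laplacian(A)))
lam_c = np.sort(np.linalg.eigvalsh(laplacian(Ac)))

# lambda_2 = n - lambda_n^c = n - k_{n, m_c}, with k_{n, m_c} = k here.
assert np.isclose(lam_c[-1], k)
assert np.isclose(lam[1], n - k)
```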

C. Local and global optimality
We now prove that the class of MCC networks optimizes λ_2. We divide the proof into cases based on the link density, establishing the global optimality in certain cases (of lowest and highest density, as well as of density close to that of a UCM network) and the local optimality in all other cases. For convenience we define Φ(n, k) := 2M(n, k)/[n(n − 1)], the link density of any network having n nodes and M(n, k) links.
When a given network has m = n − 1 links, we have φ = 1 − Φ(n, n − 1) = 2/n, the lowest possible link density for a connected network. Any such network is necessarily a tree. We have k_{n,m_c} = n − 1, which implies that in this case there is only one MCC network, namely, the star network. This network has λ_2 = 1 and is the unique maximizer of λ_2 among all tree networks [S12]. This establishes the global optimality of the MCC network in this case.
For a network with the highest possible link density, φ = 1 − Φ(n, 1) = 1, we have m_c = M(n, 1) = 0 and k_{n,m_c} = 1. The only such network is the complete graph, which has λ_2 = n. Thus, the MCC network is trivially the global optimum in this case.
For the next highest range of link densities, 1 − Φ(n, 2) ≤ φ < 1, we have 0 < m_c ≤ M(n, 2) and k_{n,m_c} = 2. In this case, the complement of the network has at least one link, which can be regarded as a subnetwork whose largest Laplacian eigenvalue is 2. Since the largest Laplacian eigenvalue of a network is larger than or equal to the largest Laplacian eigenvalue of any of its subnetworks (i.e., induced subgraphs) [S13], we have λ^c_n ≥ 2. This implies λ_2 ≤ n − 2 = n − k_{n,m_c}, which is the value of λ_2 for any MCC network. This establishes the global optimality in this case as well.
For φ in the next highest range, 1 − Φ(n, 3) ≤ φ < 1 − Φ(n, 2), we have M(n, 2) < m_c ≤ M(n, 3) and k_{n,m_c} = 3. In this case, the complement of the network contains three nodes connected by two links as a subnetwork; otherwise all links in the complement would be isolated from one another, which would imply m_c ≤ M(n, 2). Since the largest Laplacian eigenvalue of this subnetwork is 3, we have λ^c_n ≥ 3, and hence λ_2 ≤ n − 3. This shows that the eigenvalue λ_2 = n − k_{n,m_c} = n − 3 of the MCC networks is globally optimal.
In the neighborhood of each φ = 1 − Φ(n, k), which corresponds to a UCM network, we can also prove global optimality, which is summarized in the following proposition.

Proposition 6. Suppose that n = ℓk for some positive integers k and ℓ, and that the link density φ satisfies

1 − Φ(n, k + 1) ≤ φ < 1 − Φ(n, k) + 1/(n − 1). (S30)

Then the MCC networks achieve the global maximum value of λ_2.
Proof. Consider an MCC network having n = ℓk nodes and m links, with φ satisfying the inequalities in Eq. (S30). First note that the inequalities (S30) can be expressed in terms of the number of links m_c in the complement as M(n, k) − n/2 < m_c ≤ M(n, k + 1). Since n = ℓk, we have M(n, k) = M(ℓk, k) = ℓk(k − 1)/2. Also recall that ⌊2m/n⌋ is an upper bound on λ_2 for all networks having n nodes and m links [see Eq. (8) of the main text]. We divide the proof into three distinct cases.

Case 1: m_c = M(n, k). In this case, the upper bound is ⌊2m/n⌋ = k(ℓ − 1). By Proposition 1, the complement must consist of ℓ isolated, fully connected clusters of k nodes each. Therefore, the network is UCM and hence is globally optimal by the result proved in Sec. III A 2 [and we have λ_2 = k(ℓ − 1)].

Case 2: M(n, k) − n/2 < m_c < M(n, k). In this case, n(n − 1)/2 − M(n, k) < m < n(n − 1)/2 − M(n, k) + n/2, which implies that k(ℓ − 1) < 2m/n < k(ℓ − 1) + 1 and hence ⌊2m/n⌋ = k(ℓ − 1). On the other hand, m_c < M(n, k) and the definition of k_{n,m_c} imply that k_{n,m_c} ≤ k, so the MCC networks achieve λ_2 = n − k_{n,m_c} ≥ n − k = k(ℓ − 1). Since no network can exceed the upper bound ⌊2m/n⌋ = k(ℓ − 1), the MCC networks are globally optimal.

Case 3: M(n, k) < m_c ≤ M(n, k + 1). In this case, k_{n,m_c} = k + 1, so the MCC networks have λ_2 = n − k − 1. Since m_c > M(n, k), we have m < n(n − 1)/2 − M(n, k), hence 2m/n < k(ℓ − 1) and ⌊2m/n⌋ ≤ k(ℓ − 1) − 1 = n − k − 1. The MCC networks thus attain the upper bound and are globally optimal.

Supplemental figure captions (Figs. S1–S3). For the ER random networks, each data point is an average over 1,000 realizations. For SA with n = 10 and 20, we took the networks that maximize λ_2 over 1,000 independent SA runs starting from randomly chosen initial networks, while we used 20 runs for n = 50 and a single run for n = 100. The raggedness of the n = 10 curve for the networks obtained by SA is likely to reflect the true nature of the maximum λ_2 as a function of φ, since a majority of these networks do achieve the upper bound λ_2 = ⌊φ(n − 1)⌋ and hence have the true maximum λ_2. The function C_{ℓ,n}(φ) in Eq. (13) is considered as a function of n; note that, in addition to the explicit dependence on n, it depends on n also through φ = 2m/[n(n − 1)]. The non-smoothness of the curve between the singularity points is not a numerical artifact but instead reflects the fact that λ_2 always takes an integer value according to Eq. (12).
Each point on the magenta curve is the average of λ_2 over 100 realizations of the ER random networks.