Dimension reduction of dynamical systems on networks with leading and non-leading eigenvectors of adjacency matrices

Dimension reduction techniques for dynamical systems on networks are considered to promote our understanding of the original high-dimensional dynamics. One strategy of dimension reduction is to derive a low-dimensional dynamical system whose behavior approximates the observables of the original dynamical system that are weighted linear summations of the state variables at the different nodes. Recently proposed methods use the leading eigenvector of the adjacency matrix of the network as the mixture weights to obtain such observables. In the present study, we explore performances of this type of one-dimensional reductions of dynamical systems on networks when we use non-leading eigenvectors of the adjacency matrix as the mixture weights. Our theory predicts that non-leading eigenvectors can be more efficient than the leading eigenvector and enables us to select the eigenvector minimizing the error. We numerically verify that the optimal non-leading eigenvector outperforms the leading eigenvector for some dynamical systems and networks. We also argue that, despite our theory, it is practically better to use the leading eigenvector as the mixture weights to avoid misplacing the bifurcation point too distantly and to be resistant against dynamical noise.

of the adjacency matrix of the network as the mixture weights to obtain such observables.
In the present study, we explore performances of this type of one-dimensional reductions of dynamical systems on networks when we use non-leading eigenvectors of the adjacency matrix as the mixture weights. Our theory predicts that non-leading eigenvectors can be more efficient than the leading eigenvector and enables us to select the eigenvector minimizing the error. We numerically verify that the optimal non-leading eigenvector outperforms the leading eigenvector for some dynamical systems and networks. We also argue that, despite our theory, it is practically better to use the leading eigenvector as the mixture weights to avoid misplacing the bifurcation point too distantly and to be resistant against dynamical noise.

I. INTRODUCTION
A variety of complex systems in the real world can be described by dynamical systems on networks [1][2][3][4]. This seems to be the case in particular when systems of question are composed of dynamical elements that are similar to each other except for the connectivity and some easily parameterizable heterogeneity across the individual elements. Examples include coupled oscillators on networks [5] including models of functioning of power grids [6], predator-prey, mutualistic, and other dynamics impacting community stability in ecological networks [7,8], gene regulatory networks [9], and epidemic processes and ecological population dynamics considered on networks * naokimas@buffalo.edu of habitat patches (called metapopulation models) [10,11]. Analyses of dynamical systems on networks have clarified various collective phenomena on networks such as phase transitions and synchronous oscillations.
Dynamical systems on networks are necessarily high-dimensional because each node is assigned with one or more dynamical variables and those variables interact via edges of the given network.
Even if the dynamics at each node is one-dimensional, the entire dynamical system on the network is N -dimensional, where N is the number of nodes in the network. Therefore, similar to various dimension reduction techniques for high-dimensional data [12][13][14], one may be tempted to map a dynamical system on networks into one in a low-dimensional space without losing much information. Then, by deploying analytical and numerical techniques suited to low-dimensional dynamical systems, we may be better able to understand the original high-dimensional network dynamics.
Gao, Barzel, and Barabási developed a heterogeneous mean-field theory, assuming uncorrelated networks with general degree (i.e., number of edges that a node in the given network has) distributions, to reduce a class of dynamical systems on networks to one-dimensional dynamical systems [15]. Their method approximates a one-dimensional projection of the original high-dimensional dynamics on networks with a reasonably good accuracy in various cases. See Refs. [16,17] for validation studies of this approach and Ref. [18] for an extension. Furthermore, to deal with general network structure, Laurence et al. developed a systematic method, which we call the spectral method, to use the eigenvalues and eigenvectors of the adjacency matrix of the network to reduce the same class of dynamics on networks into dynamics of small dimensions, such as one or two [19]. See Ref. [20] for a further advancement of the spectral method. In the case of one-dimensional reduction, which we focus on in the present study, the spectral method uses an observable that is a particular linear combination of the state variables on all nodes, denoted by R = N i=1 a i x i , where x i is the dynamical state of the ith node, and a i is the mixing weight. Then, one writes down a closed dynamical equation in terms of R. On a theoretical basis, they proposed to use the ith element of the leading eigenvector (i.e., the eigenvector associated with the largest eigenvalue) of the adjacency matrix as a i [19].
The leading eigenvector of the adjacency matrix is a key descriptor of contagious processes on networks because the adjacency matrix tells us who can directly infect whom. In fact, the ith element of the leading eigenvector gives the likelihood that the ith node is infectious when the infection rate of an epidemic process model such as the susceptible-infectious-susceptible (SIS) model is poised near the epidemic threshold [21]. For example, scale-free networks (i.e., networks with power-law degree distributions) show eigenvector localization such that the leading eigenvector has only a small fraction of considerably positive elements, which are at the largest-degree nodes and correspond to the presence of infection at these nodes [21][22][23][24]. In contrast, the other elements of the eigenvector are close to 0, corresponding to the scarcity of infection at small-degree nodes.
These and other results lend support for the spectral method [19] and its extension called the dynamics approximate reduction technique (DART) [20], which use the leading eigenvectors of the adjacency matrix as mixing weights.
In the present study, we develop a theory to argue that it is often better to use a non-leading eigenvector of the adjacency matrix for the spectral method to realize a better accuracy at reducing the original N -dimensional dynamics into a one-dimensional dynamics. We derive optimization criteria under the assumption that {x 1 , . . . , x N } is not too heterogeneous, which numerically holds true for various dynamical systems on networks [17]. Our theoretical derivation suggests that the leading eigenvector does not necessarily yield the smallest error of the dimension reduction by the spectral method. For various networks, a non-leading eigenvector, which implies that we linearly combine x i into one observable with some negative weights a i , realizes a smaller error than the spectral method with the leading eigenvector. We verify our theory by numerical simulations of three dynamical systems on different networks. Finally, we argue that, despite our theory, the spectral method using the leading eigenvector as the mixing weights is better than with the non-leading eigenvector because of two factors that our theory does not address: precision in locating the bifurcation point and the robustness against dynamical noise. Our code for computing the optimal eigenvectors and reproducing the results in this article is available at https://github.com/naokimas/nonleading-spectral.

II. SPECTRAL METHOD
Throughout the present study, we consider the following class of dynamical systems on networks [15,19,25]: where t is the time, x i is the one-dimensional dynamical state of the ith node (with i ∈ {1, . . . , N }), F (x) represents the intrinsic dynamics of the node, G(x i , x j ) represents the influence of x j on x i , and w ij is the strength of the influence of node j on node i, corresponding to the weighted adjacency matrix of the given network. We assume that the network is connected. We also assume that the network does not have self-loops, i.e., w ii = 0 for i ∈ {1, . . . , N }. However, if all nodes have a self-loop of the same edge weight (i.e., w 11 = · · · = w N N ), one can include the effect of such self-loops We describe the spectral method [19] in this section. With this method, one reduces the Ndimensional dynamical system given by Eq. (1) to an n-dimensional system, where n ≪ N , by deriving an approximate n-dimensional dynamical system in terms of observables each of which is a linear combination of {x 1 , . . . , x N }. We focus on the case of n = 1 in this paper. We consider an observable, which we denote by R, given by where {a 1 , . . . , a N } is normalized such that N i=1 a i = 1. By combining Eqs. (1) and (2), one obtains By Taylor expanding F (x i ) around x i = R to the first order, we obtain Similarly, we expand G(x i , x j ) around x i = βR and x j = γR, where β and γ are constants to be determined, to obtain where G 1 and G 2 are the partial derivatives of G with respect to the first and second argument, respectively. By substituting Eqs. (4) and (5) into Eq. (3), one obtains where and O (x − R) 2 is a short-hand notation for N i=1 O (x i − R) 2 and similar for O (x − βR) 2 and O (x − γR) 2 . By requiring that the first-order terms in Eq. (6), i.e., those containing G 1 and G 2 , disappear for any {x 1 , . . . , x N }, one obtains where x = (x 1 , . . . , x N ) ⊤ , a = (a 1 , . . . , a N ) ⊤ , K is the N × N diagonal matrix whose ith diagonal entry is equal to the weighted in-degree of the ith node, i.e., k in i ≡ N j=1 w ij , W = (w ij ) is the N × N adjacency matrix, and ⊤ represents the transposition. By substituting Eq. (2) in Eqs. (8) and (9), one obtains Because Eqs. (10) and (11) ideally hold true for any x, one obtains Equations (12) and (13) indicate that a is a right eigenvector of both K and W ⊤ . However, K and W ⊤ do not share the eigenspace in general. In particular, the eigenvectors of K are the standard unit vectors because K is a diagonal matrix. The standard unit vector a = (1, 0, . . . , 0) ⊤ , for example, satisfies Eq. (12), but it implies that R = x 1 so that we only observe the first node.
Note that a = (1, 0, . . . , 0) ⊤ satisfies Eq. (13) if and only if k in 1 = 0 such that the first node is not influenced by any other node.

A. Eigenvector minimizing the approximation error
With the optimal β value given by Eq. (14), we obtain We have two remarks on Eq. (16). First, although the use of the leading eigenvector as a has been recommended for the spectral method [19] and the DART [20], e 1 may be smaller for other right eigenvectors of W ⊤ . Second, Eq. (16) implies that e 1 = 0 if all the nodes have the same in-degree, regardless of which right eigenvector of W ⊤ we use as a. However, numerical simulations indicate that the spectral method is not exact even for random regular graphs [17]. The approximation error may be due to the higher-order terms in the Taylor expansion in Eq. (6) or the fact that we have expanded Eq. (3) around both x i = R and x i = β * R. (Although we also expanded x i around γR, we found that γ = 1.) In fact, our numerical test suggests that β * is often far from 1 both for the leading eigenvector and the eigenvector minimizing ǫ 1 , as we show for scale-free networks in Fig. 1(a) and Fig. 1(b), respectively.
For these reasons, we propose to expand Eq. (3) only around x i = R as follows. Let us set By imposing that the first-order terms on the right-hand side of Eq. (17) disappear, we obtain Section IV A for details of the degree distribution used. Note that one can rewrite Eq. (14) as Therefore, if the minimizer of ǫ 1 is associated with a small eigenvalue, α, then β * tends to be large. This explains why the β * value tends to be much larger in (b) than (a).
By substituting R = x ⊤ a in Eqs. (18) and (19) and imposing that Eqs. (18) and (19) hold true for any x, we obtain For the same reason as that for the original spectral method, Eqs. (20) and ( node as a representative of the entire system (see Appendix A for the derivation). Equation (21) implies that, as in the case of the original spectral method, a must be a right eigenvector of W ⊤ and that α is the associated eigenvalue. Therefore, we select the eigenvalue α and the associated eigenvector a that minimize the the approximation error for Eq. (20) defined by We remind that a is normalized such that N i=1 a i = 1. The modified one-dimension reduction reads A few remarks are in order. First, as in the case of the original spectral method, the leading eigenvector of W ⊤ may not minimize e 2 . Second, ǫ 2 ≥ ǫ 1 holds true because our one-dimensional reduction scheme corresponds to β = 1, whereas the original spectral method also optimizes the β value. Nevertheless, which reduction method is more accurate than the other is a nontrivial question because the original spectral method uses the Taylor expansion of each x i around two reference values (i.e., R and β * R). Third, the Perron-Frobenius theorem guarantees that a i > 0, ∀i ∈ {1, . . . , N } when a is the right leading eigenvector of W ⊤ . Therefore, the observable, R, when the spectral method uses the leading eigenvector, is a weighted average of all the dynamical variables, x i , with positive weights. The Perron-Frobenius theorem also guarantees that all the entries of the left leading eigenvector of W ⊤ , which we denote by v ∈ R 1×N , are also positive. If W ⊤ is diagonalizable and the leading eigenvalue is not repeated, we obtain va = 0 for any right eigenvector a of W ⊤ other than the leading one. Therefore, the sign of at least one a i must be opposite to the sign of some other a i . This fact implies that R = a ⊤ x is a weighted average of {x 1 , . . . , x N } including negative weights.

B. Case of regular graphs
In the case of the complete graph, K is the diagonal matrix whose all diagonal entries are  17) disappears. Therefore, the spectral method is exact. Because the complete graph is a regular graph (i.e., a network in which all the nodes have the same degree), we also obtain ǫ 1 = 0 for any eigenvector.
Similar results hold true for regular graphs in general. Specifically, for any undirected regular graphs with degree k, the leading eigenvalue of W is equal to k, and the associated eigenvector is a = (1, . . . , 1) ⊤ . We find that this pair of eigenvalue and eigenvector, which is known to be of multiplicity 1 [26], satisfies Eq. (20). No other pair of eigenvalue and the associated eigenvector of W does not satisfy Eq. (20) because Eq. (20) implies that the eigenvalue needs to be equal to k. With the leading eigenvalue and eigenvector of W ⊤ , if the network is vertex-transitive [26] (e.g., square lattice with periodic boundary conditions), O (∆x) 2 in Eq. (17) disappears due to the symmetry (i.e., x i = x j for all i, j ∈ {1, . . . , N }) such that the spectral method is exact.
Otherwise, the O (∆x) 2 term may cause the discrepancy between the spectral method and the numerical results, as is the case for random regular graphs [17].

C. Case of degree-heterogeneous random graphs
Consider undirected configuration models, i.e., uniformly random undirected networks with a given degree sequence. In the limit of N → ∞, the normalized leading eigenvector of the adjacency matrix is approximated by a i = k i / N ℓ=1 k ℓ [23], and the leading eigenvalue is approximated by α = k 2 / k , where · represents the average over the N nodes [27]. By substituting these relationships in Eq. (22), we obtain where ≈ represents "approximately equal to". If we instead use a non-leading eigenvector whose associated eigenvalue is of O(1) and assume that a i and k i are uncorrelated and that a i = O(N −1 ), we obtain In degree-heterogeneous networks, the leading term in Eq. (24) is k 4 k −2 N −1 , which is expected to be much larger than O( k 2 N −1 ) in Eq. (25). Therefore, if a i = O(N −1 ) for various nonleading eigenvectors, we expect that the spectral method is likely to be optimized by a non-leading eigenvector in degree-heterogeneous networks.
In this section, we numerically compare the spectral method with the leading eigenvector and that with the error-minimizing eigenvectors.
A. Estimated performance of non-leading eigenvectors for various networks We first examine the performance of the optimal non-leading eigenvectors in reducing ǫ 1 and ǫ 2 in comparison with the leading eigenvector for several networks.
First, we consider networks with N nodes by the Erdős-Rényi (ER) random graph in which each node pair is adjacent with probability k /(N − 1); we remind that k is the mean degree.
We generate 200 networks from the ER random graph and use the largest connected component of each network for each pair of N ∈ {100, 1000} and k ∈ {4, 10}. For the other network models that we consider in the following text, we also generate 200 networks and use the largest connected component. The largest connected component of the networks generated by the ER random graph model contains at least 90% of the nodes. We then compute the minimizer of ǫ 1 and that of ǫ 2 for each network. We find that the leading eigenvector is the minimizer of ǫ 1 and ǫ 2 in all networks.
Therefore, we should use the spectral method with the leading eigenvector [19,20] for the ER random graph.
The results are similar for the Watts-Strogatz model of small-world networks [28], whereas the leading eigenvector does not minimize ǫ 1 or ǫ 2 for a small fraction of network instances. Specifically, we set the rewiring probability to 0.1 and generate networks for each pair of N ∈ {100, 1000} and k ∈ {4, 10}. We have confirmed that the largest connected component of the network always contains all nodes. With k = 10, the leading eigenvector minimizes both ǫ 1 and ǫ 2 for all networks.
With k = 4, the minimizer of each type of error (i.e., ǫ 1 or ǫ 2 ) is not the leading eigenvector only for 1.5% and 6% of the networks with N = 100 and N = 1000 nodes, respectively.
Next, we generate networks with power-law degree distributions, which we refer to as scalefree networks, using the configuration model. To this end, we draw the degree of each node, denoted by k, by independently sampling k from a power-law distribution given by p(k) = κ(γ − we round the sampled k to the nearest integer. The mean of the probability density p(k) is . We calculate the ratio of the ǫ 1 value for its minimizer to the ǫ 1 value for the leading eigenvector. By definition, this ratio ranges between 0 and 1. If the ratio is small, the minimizer of ǫ 1 is expected to be better at approximating a one-dimensional projection of the original N -dimensional dynamics. In contrast, the ratio value equal to 1 implies that the leading eigenvector minimizes ǫ 1 . We similarly measured the ratio of ǫ 2 for its minimizer to ǫ 2 for the leading eigenvector. Lastly, we investigate the scale-free network model proposed by Holme and Kim, which produces a high clustering coefficient (i.e., many triangles) [29]. We set the number of edges that each new node has, denoted by m, to m = 2 and m = 5 to produce networks whose average degree is approximately equal to k = 4 and k = 10, respectively. We initialize the network by a star graph having m + 1 nodes. We set the probability of making a triangle for each added edge to 0.5.
We show the ratio of the minimized error to the error for the leading eigenvector in Figs. 2(e) and 2(f) for ǫ 1 and ǫ 2 , respectively. The results are qualitatively the same as those for the scale-free networks generated by the configuration model. We also find that the leading eigenvector tends to minimize the error for more network instances (i.e., the cumulative distribution is equal to 1 for a wider range of the ratio value on the horizontal axis) in the case of the Holme-Kim model than the configuration model, except for (N, k ) = (1000, 4). the eigenvalue is at least 10 −6 . We have found that 123 networks meet the same criterion when k = 10. We only use these networks to calculate the statistics in the following analyses because, in those cases, we expect that the spectral method with a non-leading eigenvector may notably be better than that with the leading eigenvector.
In addition to the scale-free networks withγ = 3.5, we also use a coauthorship network among researchers that published articles on network science by 2006 [30]. The original data set contains 1589 nodes. We only use its largest connected component, which contains N = 379 nodes and 914 edges. We regard this network as an unweighted network.

C. SIS model
First, we consider the deterministic version of the SIS model, which is also called its individualbased approximation [31,32], given by where x i represents the probability that the ith node is infectious at time t, λ is the infection rate, and µ is the recovery rate. Note that w ij ∈ {0, 1}. By definition, each infectious node infects its susceptible neighbor independently at rate λ. An infectious node independently recovers at rate µ. Because multiplying a common constant to λ and µ only changes the timescale of the dynamics, we set µ = 1 without loss of generality. For each value of λ, we run a simulation with the initial condition x i = 0.01, ∀i ∈ {1, . . . , N } until the equilibrium, denoted by x * = (x * 1 , . . . , x * N ), is sufficiently closely reached. the adjacency matrix, as a function of the infection rate, λ, for a scale-free network with N = 1000 nodes, k = 10, andγ = 3.5 in Fig. 3(a). Note that R is a weighted fraction of infectious nodes in the equilibrium. The solid line represents the results obtained from direct numerical simulations of the model. The dashed line represents the bifurcation diagram for the original spectral method, i.e., with the leading eigenvector as the weight vector a and β = β * (see Eq. (15)). We observe that the spectral method is accurate at locating the bifurcation point. However, the spectral method considerably underestimates R in the endemic phase (i.e., for λ values larger than the epidemic threshold). The dotted line in Fig. 3(a) shows the approximation of the same observable, R, by the spectral method in which we force β = 1, corresponding to Eq. (23), and continue to use the leading eigenvector as a. The use of β = 1 is discussed in Ref. [19]. We find that this one-dimensional reduction also locates the bifurcation point of the original dynamical system accurately and that it is better at approximating the R value in the endemic phase than with β = β * . We compare the numerical results and the spectral method in which a is the minimizer of ǫ 2 in Fig. 3(b). Note that the observable R is now different from that used in Fig. 3(a) because we have changed a. Figure 3(b) indicates that the spectral method with the minimizer of ǫ 2 is worse than that with the leading eigenvector at accurately locating the bifurcation point. The minimizer of ǫ 2 is better at approximating R than the leading eigenvector combined with β = β * but worse than the leading eigenvector combined with β = 1. The reason why the spectral method with the combination of the leading eigenvector as a and β = 1 does not minimize ǫ 2 but works better than the minimizer of ǫ 2 is unclear. It may be because higher-order terms in terms of ∆x i in our theory are nonnegligible or because the range of D values we have explored is not sufficiently far from the epidemic threshold estimated by the spectral method using the minimizer of ǫ 2 .
We compare the error among the three methods in Fig. 3(c). At each λ value, we defined the error as the absolute value of the difference between the R obtained by the direct numerical simulation and that obtained from the one-dimensional reduction, which we divided by the R value at the largest λ value, i.e., λ = 4. We normalized the error in this manner because the true value of the observable R depends on a. We use the absolute error instead of the relative error because R is close to or equal to 0 when λ is small. The error bars represent the mean and standard deviation. The figure confirms that the approximation error is the smallest in the case of the spectral method with the combination of the leading eigenvector and β = 1, the second smallest in the case of the minimizer of ǫ 2 , and the largest in the case of the combination of the leading eigenvector and β = β * , when λ is sufficiently larger than the epidemic threshold for the one-dimensional reduction with the minimizer of ǫ 2 (i.e., λ ≈ 0.80). The results are qualitatively the same for scale-free networks with k = 4, as we show in Fig. 3(d). We show the approximation error for the coauthorship network in Fig. 3(e). For this network, the minimizer of ǫ 2 realizes a smaller error than the leading eigenvector combined with β = β * or β = 1 for all values of λ.
It should also be noted that the spectral method with the minimizer of ǫ 2 accurately locates the epidemic threshold for the coauthorship network.
In this section, we consider the coupled double-well system given by where x i is the state of the node i; D is the coupling strength, which we assume to be common for all edges; r 1 , r 2 , and r 3 are constants satisfying r 1 < r 2 < r 3 [33][34][35][36][37][38]. We set r 1 = 1, r 2 = 2, and r 3 = 5. When the coupling is absent, the dynamics is bistable, with x i = r 1 and x i = r 3 being stable equilibria and x i = r 2 being the unique unstable equilibrium. We use the initial condition x i = 0.01, ∀i ∈ {1, . . . , N } for each value of D. With this initial condition, x i will converge to the lower equilibrium, i.e., r 1 , if there is no coupling.
We plot in Fig. 4(a) R = a ⊤ x * , where a is the leading eigenvector, against D for the coupled double-well system on the same scale-free network with N = 1000, k = 10, andγ = 3.5 as the one used in Figs. 3(a) and 3(b). The theoretical estimate of R shown in Fig. 4(a) does not depend on the β value because function G(x i , x j ) does not depend on x i for the double-well system such that Eq. (15) does not depend on β. We show in Fig. 4(b) the same relationships when a is the minimizer of ǫ 2 . Similar to the case of the SIS model, the spectral method with the leading eigenvector is substantially better at locating the bifurcation point than the minimizer of ǫ 2 is. The accuracy at approximating R is slightly better for the minimizer of ǫ 2 than the leading eigenvector except near the bifurcation point, which we confirm by the statistical analysis of the absolute error shown in Fig. 4(c). The results are similar for k = 4 except that the minimizer of ǫ 2 yields significantly smaller errors than the leading eigenvector when D is large (see Fig. 4(d)). The results are similar for the coauthorship network (see Fig. 4(e)). We will discuss the relatively low accuracy near the bifurcation point in Section V A.

E. Generalized Lotka-Volterra dynamics
Third, we consider the generalized Lotka-Volterra (GLV) dynamics is given by where x i represents the abundance of the ith species, λ is the intrinsic growth rate of the species, and D is the coupling strength as in the case of the double-well system [16,18]. The nontrivial equilibrium of this dynamical system is given by is the adjacency matrix. This equilibrium is globally asymptotically stable if and only if W is negative definite [39]. Therefore, only for the GLV dynamics, we change the definition of the adjacency matrix to set w ii = −(α max +1)/D, ∀i ∈ {1, . . . , N }, where α max is the largest eigenvalue of W , which makes W negative definite [17]. With this w ii , we rewrite Eq. (28) as where c = α max + 1. Therefore, we set F (x i ) = λx i − cx 2 i and G(x i , x j ) = Dx i x j [17].
The equilibrium of Eq. (29) is given by x * = −λ(DW − cI) −1 1, where W is the weighted adjacency matrix of the original network with the diagonal entries being equal to 0, and I is the N × N identity matrix. Therefore, we obtain On the other hand, the R value in the equilibrium for the one-dimensional reduction, Eq. (15), including the case of our modified spectral method (i.e., Eq. (23)) with the replacement of β * by 1, is given by Therefore, the spectral method is exact if and only if β * = 1 and regardless of which eigenvector of W ⊤ we use as a.
With λ = 0.5, we show in Fig

V. ADVANTAGES OF THE LEADING EIGENVECTOR TO THE MINIMIZER OF ǫ 2
The theoretical and numerical results shown in the previous sections suggest that the spectral method using the minimizer of ǫ 2 as a outperforms that using the leading eigenvector as a in many cases. However, in this section, we discuss two reasons why we still prefer the leading eigenvector as a, but combined with β = 1, in the spectral method.
A. The spectral method with the leading eigenvector better predicts the bifurcation point For the SIS model and double-well system, we have found that the spectral method using the minimizer of ǫ 2 as a is much less accurate at locating the bifurcation point than the spectral method using the leading eigenvector as a, as shown in Figs. 3(a), 3(b), 4(a), and 4(b). The reason for this phenomenon is as follows.
The modified spectral method (i.e., Eq. (23)) as well as the original spectral method (i.e., Eq. (15)) with β * being replaced by 1 (see Figs. 3, 4, and 5) uses the eigenvalue of the adjacency matrix, α, as the effective coupling strength. We can interpret the earlier one-dimensional reduction developed by Gao et al. [15] as a mean-field theory to approximate the leading eigenvalue α by a function of the nodes' degrees. Suppose that G(x i , x j ) = DG(x i , x j ), where D is the coupling strength, which we regard as the bifurcation parameter. Note that, in the SIS model, the infection rate, λ, plays the role of D. Assume that the one-dimensional dynamical system given by Eq. (23) and G(x i , x j ) = DG(x i , x j ) with α being the leading eigenvalue (i.e., α max ) undergoes a bifurcation at D = D c,max . We denote by D c,org the value of D at which the bifurcation occurs in the original N -dimensional dynamical system. Empirically, the spectral method using the leading eigenvector as a is not necessarily accurate at anticipating the bifurcation point, and one tends to obtain such that the spectral method tends to overestimate the bifurcation point for some models of population dynamics [17].
If we use a non-leading eigenvector as a, we only replace α in Eq. (23) by the eigenvalue associated with a. Therefore, the bifurcation in the one-dimensional reduced dynamical system and α NL is the non-leading eigenvalue associated with the eigenvector a under consideration. Using Eqs. (32) and (33), we evaluate the error in locating the bifurcation point by the spectral method with a non-leading eigenvector as follows: We find that ∆D c,NL > ∆D c,max > 0 and that the difference between ∆D c,NL and ∆D c,max is large if the eigenvalue ratio, α max /α NL , is large. Therefore, if the minimizer of ǫ 2 is associated with a small eigenvalue of the adjacency matrix, the bifurcation point for the one-dimensional dynamical system, D = D c,NL , tends to be much larger than D c,org . Then, for D ∈ (D c,org , D c,NL ), the onedimensional reduction is qualitatively wrong at describing the original N -dimensional dynamical system such that our method is not expected to be accurate at approximating R for the original dynamical system. This is why we excluded in the beginning of Section IV C the eigenvectors a that are associated with positive eigenvalues with tiny magnitudes and negative eigenvalues from the candidates of the minimizer of the error.
To further demonstrate the relevance of this reasoning, we show in Fig. 6 the largest eigenvalue and the eigenvalue whose associated eigenvector minimizes ǫ 2 . We use scale-free networks with N = 1000 andγ = 3.5, and vary k . As we have done in the numerical simulations whose results are shown in Figs. 3, 4, and 5, we exclude the eigenvalues smaller than 10 −6 . Figure 6 indicates that the leading eigenvalue increases as k increases, as theory predicts [27], and that the eigenvalue associated with the minimizer of ǫ 2 decreases as k increases. Therefore, Eq. (34) implies that D c,NL tends to be much larger than D c,org as k increases.
B. The spectral method with the leading eigenvector is more robust against noise We expect that the spectral method using the leading eigenvector as a is more robust against noise than that using the minimizer of ǫ 2 for the following reason. Consider dynamical noise added to our dynamical system on networks, i.e., Eq. (1). Assume that each x i fluctuates around the equilibrium in the case without noise, x * i , with mean 0 and standard deviation σ i . For simplicity, we also assume that x i and x j , where i = j, are uncorrelated in the equilibrium; note that this assumption is just for the sake of discussion here and does not hold in general due to the interaction between different nodes via edges [40,41]. Then, the expectation of R in the equilibrium is given The standard deviation of R, denoted by σ R , is given by σ R = N i=1 |a i | σ 2 i . We remind that a is normalized such that N i=1 a i = 1. We combine the following two observations to argue that R using the leading eigenvector as a is more robust against noise than R using the minimizer of ǫ 2 . First, the expectation of R depends on a unless x 1 = · · · = x N . In our numerical results shown in Figs. 3(a), 3(b), 4(a), 4(b), 5(a), and 5(b), R tends to be smaller when a is the minimizer of ǫ 2 than when a is the leading eigenvector. Second, and more importantly, σ R would be larger with the minimizer of ǫ 2 than with the leading eigenvector. When a is the leading eigenvector, the Perron-Frobenius theorem guarantees that a i > 0, ∀i ∈ {1, . . . , N }. In contrast, when a is a nonleading eigenvector, including the case of the minimizer of ǫ 2 , the orthogonality of the eigenvectors associated with the different eigenvalues implies that some of a i are positive and others are negative. Then, because N i=1 a i = 1, the |a i | value tends to be larger for a nonleading eigenvector than the leading eigenvector. For example, suppose that N = 3, that the normalized leading eigenvector is a = (1/2, 1/4, 1/4) ⊤ , and a normalized nonleading eigenvector is a = (−1, 1, 1) ⊤ . Then, if σ 1 = σ 2 = σ 3 = σ, one obtains σ R = N i=1 |a i |σ = σ for the leading eigenvector and σ R = √ 3σ for the nonleading eigenvector. Therefore, the signal-to-noise ratio for R in the case of the leading eigenvector, quantified by the expectation divided by the standard deviation of R, should be smaller than the same ratio in the case of a non-leading eigenvector.
To examine the relevance of this argument, we run numerical simulations of the three dynamical systems with noise. We add an independent white noise with the intensity √ D to the right-hand side of each of the N differential equations constituting the dynamical system on the network. In other words, we add a value sampled from the Gaussian distribution with mean 0 and standard deviation √ Ddt to the right-hand side of the differential equation in each integration time step of size dt. We set √ D = 0.2 for the SIS and GLV models and √ D = 10 for the double-well system.
For the SIS and GLV models, once x i becomes negative due to the added noise, it tends to diverge to −∞. To prevent this phenomenon, once any x i becomes negative in any simulation time step, we reset x i to 10 −6 .
We show in Figs. 7(a) and 7(b) the numerical results for the SIS model run on the same scalefree network as the one used in Figs. 3(a) and 3(b). We verify that the fluctuation in R using the leading eigenvector as a, shown by the solid line in Fig. 7(a), carries less noise than R using the minimizer of ǫ 2 as a, shown by the solid line in Fig. 7(b). We stress that we have obtained , reinforce our claim that R is more robust against noise when one uses the leading eigenvector rather than the minimizer of ǫ 2 as a.
The theoretical estimate based on the spectral method does not depend on the noise (see the dashed and dotted lines in Fig. 7). However, the target observable to be approximated, R, is noisier when a is a non-leading eigenvector than the leading eigenvector. Therefore, we conclude that the one-dimensional reduction using the leading eigenvector as a better describes the one-dimensional projection of the original network dynamics in the presence of noise unless the magnitude of the noise is small.

VI. DISCUSSION
In the present study, we have explored two ideas to try to improve the spectral method to reduce dynamical systems on networks into a one-dimensional dynamics. The first idea is to constrain the use of the Taylor expansion of the dynamical variables, {x 1 , . . . , x N }, around one reference point, R. The original spectral method uses the Taylor expansion around R and β * R, where β * = 1 in general. Our Taylor expansion has led to a one-dimensional reduction that does not contain β * (i.e., Eq. (23)). Our second idea is to use the non-leading eigenvector of the adjacency matrix that minimizes the error. We have obtained explicit expressions of the errors, with which one can systematically search the minimizer of the error, especially that of ǫ 2 (see Eq. (22)). Note that our method requires the calculations of all the eigenvalues and eigenvectors, which costs O(N 3 ) time.
This is an important limitation of the present method when we apply the method to large networks.
In contrast, the previous studies used the leading eigenvector [19,20]. We have found that, for networks that are relatively homogeneous in the degree, such as the Erdős-Rényi random graph and the Watts-Strogatz small-world network model, the leading eigenvector almost always minimizes the errors. In contrast, when the degree is heterogeneously distributed, the optimal eigenvector tended to be a non-leading one, which is particularly the case for larger and sparser networks. For these networks, our modified spectral method is expected to enjoy reduced approximation errors.
We have assessed the performance of approximating the one-dimensional observable, R, for three dynamical systems on scale-free networks for which the leading eigenvector and the minimizer of ǫ 2 do not tend to coincide. We have shown that the spectral method using the minimizer of ǫ 2 as a tends to surpass the original spectral method across the different dynamical systems and networks (see Figs. 3, 4, and 5). However, we have also found that our spectral method using the minimizer  of ǫ 2 as a has two essential limitations. First, it is not good at estimating the bifurcation point.
In particular, our method locates the bifurcation point extremely far from the correct value if the minimizer of ǫ 2 is associated with an eigenvalue that is much smaller than the largest eigenvalue (see Section V A). In this case, our one-dimensional reduction is qualitatively wrong over a wide range of the bifurcation parameter. Therefore, one cannot accurately approximate R of the original high-dimensional dynamical system in a range of the bifurcation parameter of interest. Second, the spectral method using the minimizer of ǫ 2 as a is not robust against noise. This is because, in the presence of dynamical noise, if one uses a non-leading eigenvector as a, the variance of R is larger than in the case of the leading eigenvector used as a. We have also examined the modified spectral method (i.e., β = 1) with a being replaced by the leading eigenvector. Up to our numerical efforts, this combination is the best performer in that (i) it performs better than the original spectral method (i.e., β = β * and the leading eigenvector as a) and comparably with our method using the minimizer of ǫ 2 as a in the absence of noise and that (ii) the target R fluctuates less than when a is the minimizer of ǫ 2 . The spectral method using β = 1 and the leading eigenvector is also better at locating the bifurcation point than that using the minimizer of ǫ 2 . Therefore, we recommend the combination of β = 1 and the leading eigenvector, i.e., Eq. (22) provided with the leading eigenvector and eigenvalue. We may be able to find pairs of eigenvalue and eigenvector, (α, a), that suppress ǫ 2 much better than the leading eigenvector and the associate eigenvalue is not much smaller than the leading eigenvalue. Using such a pair (α, a) may improve overall performances of the spectral method. Systematically investigating this issue warrants future work.
Both the original and modified spectral methods are based on the Taylor expansion of the differential equations in terms of x 1 , . . ., x N around one or multiple common values (e.g., R). Therefore, the methods are expected to work better when {x 1 , . . . , x N } is relatively homogeneous. We in fact reached the same conclusion for the GBB reduction in our previous work [17]. Our previous numerical simulations suggested that {x 1 , . . . , x N } was not wildly heterogeneous for various dynamical systems including those used in this paper, except near the bifurcation points [17]. However, we do not have a systematic understanding when {x 1 , . . . , x N } is more homogeneous than in other cases.
Such an understanding is expected to help us to assess the applicability of the GBB reduction and spectral methods.
The DART is a systematic method to map high-dimensional dynamics on networks into a lowdimensional dynamical system, extending the spectral method [20]. The DART is applicable to a diversity of dynamics including synchronization dynamics on networks. Like the original spectral method [19], the DART is based on the Taylor expansion at multiple reference points and the leading eigenvectors of the adjacency matrix. It is interesting to apply the present approach to the DART, in particular to the cases where the reduced dynamics have the dimension larger than one. When the dimension of the reduced dynamics is larger than one, at least one eigenvector with which to take the weighted average of {x 1 , . . . , x N } (e.g., the eigenvector associated with the second largest eigenvalue of the adjacency matrix) will necessarily contain both positive and negative elements due to the Perron-Frobenius theorem and the orthogonality of different eigenvectors. Then, the signal-to-noise ratio for the corresponding observable may be compromised in the presence of dynamical noise (see Section V B). With this possibility being included, it is worth further examining dimension reduction and resilience of noisy dynamical systems on networks. implies that we use a single variable x i to represent the entire system. Then, it is straightforward to derive β = 1 and minimize the error for Eq. (13) in terms of γ as follows: which is realized by γ = 0. We find that ǫ 3 is minimized with respect to i when the ith is node has the smallest in-degree in the network. The corresponding one-dimensional reduction is given by where i is the index of the node with the smallest in-degree.
In the case of our modified spectral method, we require that Eq. (20) exactly holds instead of Eq. (21). Then, we again obtain a = e i and α = k in i , where i ∈ {1, . . . , N }. Then, the error for Eq. (21) is given by If the network is unweighted, we use (w iℓ ) 2 = w iℓ ∈ {0, 1} to obtain ǫ 4 = k in i + (k in i ) 2 , which is minimized when we select i with the smallest in-degree. In this case, the one-dimensional reduction is given by