Threefold way to the dimension reduction of dynamics on networks: an application to synchronization

Many complex systems can be modeled as large networks in which the states of the nodes continuously evolve through interactions among neighboring nodes, forming a high-dimensional nonlinear dynamical system. One of the main challenges of network science consists in predicting the impact of network topology and dynamics on the evolution of the states and, especially, on the emergence of collective phenomena, such as synchronization. We address this problem by proposing a Dynamics Approximate Reduction Technique (DART) that maps high-dimensional (complete) dynamics onto low-dimensional (reduced) dynamics while preserving the most salient features, both topological and dynamical, of the original system. DART generalizes recent approaches to dimension reduction by allowing the treatment of complex-valued dynamical variables, heterogeneities in the intrinsic properties of the nodes, and modular networks with strongly interacting communities. Most importantly, we identify three major reduction procedures whose relative accuracy depends on whether the evolution of the states is mainly determined by the intrinsic dynamics, the degree sequence, or the adjacency matrix. We use phase synchronization of oscillator networks as a benchmark for our threefold method. We successfully predict the synchronization curves for three phase dynamics (Winfree, Kuramoto, theta) on the stochastic block model. Moreover, we obtain the bifurcations of the Kuramoto-Sakaguchi model on the mean stochastic block model with asymmetric blocks and we numerically show the existence of a periphery chimera state on the two-star graph. This allows us to highlight the critical role played by the asymmetry of community sizes in the existence of chimera states. Finally, we systematically recover well-known analytical results on explosive synchronization by using DART for the Kuramoto-Sakaguchi model on the star graph.


I. INTRODUCTION
Complex systems are characterized by the emergence of macroscopic phenomena that cannot be explained by the properties of their constituents taken independently [1,2]. Synchronization is an archetypal example, where collective movements emerge from the interactions between the oscillators of a system [3,4]. The relationship between the interactions in a complex system and its capacity to synchronize has been found to be rich and subtle in many fields of application, including physics [5,6], neuroscience [7-11], and ecology [12-16]. Phase dynamics on networks give insight into this complex relationship: they model the oscillations in a simple way, and networks encode the underlying structure of the systems [17].
However, the large number of dimensions N ≫ 1 of the system, the nonlinearity of phase dynamics, and the lack of symmetries in complex networks often prevent any thorough mathematical exploration of the coupled differential equations. Moreover, it is often more desirable to describe the emergent behavior of the system with a certain observable rather than with the microscopic states. For these reasons, it is often more practical and intuitive to find a dynamical system with a reduced number of dimensions n ≪ N that describes the temporal evolution of the synchronization observable of the system. By doing so, the mathematical analysis, the computational cost, and the interpretation of the behavior of the complex system are simplified.
Multiple attempts have been made to obtain such lower-dimensional descriptions. In 1994, Watanabe and Strogatz successfully transformed an N-dimensional Kuramoto dynamics with identical oscillators and all-to-all coupling into an exact 3-dimensional reduced dynamics with N − 3 constants of motion [18]. In 2008, Ott and Antonsen introduced an Ansatz which allowed them to write a reduced dynamics in the limit N → ∞ for the original Kuramoto model and some of its extensions (e.g., with external driving, communities of oscillators, and time-delayed coupling) [19]. It was later shown that the Ott-Antonsen Ansatz can be applied to other phase dynamics, such as the theta model [20,21], and extended, with a circular cumulant method, to noisy phase dynamics [22].
Despite these advances, reducing the number of dimensions of a dynamical system while preserving most of the structural properties of its underlying network remains a very challenging task. Real complex networks are finite, and it is known that finite-size effects shape synchronization transitions [23]. Moreover, their properties, such as the degrees of their nodes, are often heterogeneously distributed. Therefore, all-to-all couplings are not suited to describe the rich behavior of synchronization dynamics on complex networks.
Attempts have been made to reduce the dimension of dynamical systems by considering only structural features such as degrees [24,25], symmetries [26], and spectra [25,27]. However, these approaches do not account for variations in the intrinsic dynamics of the nodes. There are also multiple approaches for coarse-graining synchronization dynamics based on the Laplacian matrix [28-30]. However, these (equation-free) methods are generally not mathematically systematic, i.e., they rely on computer algorithms to obtain reduced graphs instead of deriving a reduced dynamical system.
The purpose of this paper is to develop a Dynamics Approximate Reduction Technique (DART) and to use it for predicting phase synchronization regimes. We first present the structural and dynamical setup for the paper (Sec. II). We then describe DART along with target procedures and we apply it to phase dynamics on networks to clarify the effects of reducing the number of dimensions (Sec. III). Finally, we use the reduced dynamics to analyze the effects of the graph structure on the synchronization regimes of the Kuramoto-Sakaguchi model, such as chimera states and explosive synchronization (Sec. IV). We gather our conclusions in Sec. V and add a number of appendices to provide details on the analysis and the numerical implementations.

II. STRUCTURAL AND DYNAMICAL SETUP
In this section, we introduce preliminary definitions and notations. In particular, we define the modular networks and the phase dynamics that will be used throughout the paper. Note that from now on, the words "network" and "graph" will be considered as exact synonyms.

A. Modular graphs
Let us consider the graph G = (V, E), where V = {1, 2, . . . N } is the set of vertices (nodes) and E is the set of edges (links or connections). The corresponding adjacency matrix, A = (A jk ) N j,k=1 , is such that the element A jk equals one if node k connects to node j and zero otherwise. All graphs considered in the paper are undirected, so all adjacency matrices are symmetric.
In this paper, a graph is said to be modular when its vertices are partitioned into disjoint blocks (groups, communities, or modules). Each block is a set that contains vertices with similar properties and each vertex belongs to one and only one block. Let q be the number of blocks and, for µ ∈ {1, . . . , q}, let B_µ denote the µ-th block. Then, the ordered set B = (B_1, . . . , B_q) provides a partition of V that unequivocally describes the modular structure of the graph. Moreover, B induces the surjection s : V → {1, . . . , q} that assigns each vertex to its corresponding block. Note that if N_µ is the size of the µ-th block, then N = N_1 + · · · + N_q. An example of a modular graph with q = 4 blocks is displayed in Fig. 1 (a). The vertices inside each block are densely connected, while pairs of nodes belonging to different blocks are sparsely connected. The opposite situation also occurs, as in star and two-star graphs. Indeed, in a star graph with N nodes, one node (the core) is connected to all the other N − 1 nodes (the periphery) while the latter do not connect among themselves. Yet, this corresponds to a modular structure with q = 2 blocks: the core B_1 = {1} and the periphery B_2 = {2, . . . , N}. Now, let G_1 and G_2 be two separate star graphs with N_{p1} + 1 and N_{p2} + 1 nodes, respectively. Connecting the cores of G_1 and G_2 produces a two-star graph that is partitionable into q = 4 blocks, with two blocks for the cores, B_1 = {1} and B_3 = {N_{p1} + 2}, and two blocks for the peripheries, B_2 = {2, . . . , N_{p1} + 1} and B_4 = {N_{p1} + 3, . . . , N}.
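The star and two-star constructions above are straightforward to encode. The following sketch (NumPy, 0-based indices; the helper names are our own, not part of the paper) builds their adjacency matrices with the cores at the positions stated in the text:

```python
import numpy as np

def star_adjacency(n_periphery):
    """Adjacency matrix of a star graph: node 0 (the core) connects to all
    n_periphery peripheral nodes, which are mutually unconnected."""
    N = n_periphery + 1
    A = np.zeros((N, N), dtype=int)
    A[0, 1:] = 1
    A[1:, 0] = 1
    return A

def two_star_adjacency(n_p1, n_p2):
    """Two stars joined by a single edge between their cores (nodes 0 and n_p1 + 1)."""
    N = n_p1 + n_p2 + 2
    A = np.zeros((N, N), dtype=int)
    A[:n_p1 + 1, :n_p1 + 1] = star_adjacency(n_p1)
    A[n_p1 + 1:, n_p1 + 1:] = star_adjacency(n_p2)
    A[0, n_p1 + 1] = A[n_p1 + 1, 0] = 1  # core-core edge
    return A
```

The q = 2 (star) and q = 4 (two-star) block partitions then correspond to the core and periphery index ranges used above.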
A random modular graph is a set of modular graphs equipped with a probability distribution. A stochastic block model (SBM) is a type of random modular graph that is defined on a set of graphs sharing the same block structure and whose probability distribution depends on a set of probabilities that preserves the block structure and guarantees the independence of all possible edges. To be more specific, let B be as above and let P = (p_µν)_{1≤µ≤ν≤q}, which is an ordered set of probabilities.
Then, SBM(B, P) is such that the probability of drawing a graph with adjacency matrix A is
P(A) = ∏_{j<k} (p_{s(j)s(k)})^{A_jk} (1 − p_{s(j)s(k)})^{1−A_jk}.
The parameter p_µν can thus be interpreted as the probability for a node in B_µ to connect with a node in B_ν. Note that it is often more suggestive to combine the probabilities p_µν into the symmetric matrix U = (p_µν)_{µ,ν=1}^q, as in Fig. 1 (b).
In this paper, we focus on random modular graphs with q = 2 blocks, so B = (B_1, B_2). We additionally impose the equality p_12 = p_21 = p_out. Besides, if p_11 = p_22 = p_in, the SBM is called the planted partition model [31,32]. Two extreme cases are also of special interest: p_in = p_out = p is equivalent to the Erdős-Rényi model (also known as the Gilbert model) G(N, p), while p_in = 0 yields a random bipartite graph.
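As a concrete illustration (a minimal sketch; the function name and NumPy-based sampling scheme are ours), an adjacency matrix can be drawn from an SBM by flipping one independent coin per node pair, with the probability dictated by the blocks of the two nodes:

```python
import numpy as np

def sample_sbm(block_sizes, U, rng):
    """Sample an undirected SBM adjacency matrix without self-loops.

    block_sizes: sizes (N_1, ..., N_q) of the blocks B_1, ..., B_q
    U: q x q symmetric matrix of connection probabilities p_munu
    """
    s = np.repeat(np.arange(len(block_sizes)), block_sizes)  # block label of each node
    P = U[np.ix_(s, s)]                                      # pairwise edge probabilities
    N = len(s)
    upper = np.triu(rng.random((N, N)) < P, k=1)             # independent edges, j < k
    return (upper | upper.T).astype(int)

# Planted partition model with q = 2: p_in on the diagonal, p_out elsewhere
U = np.array([[0.8, 0.1],
              [0.1, 0.8]])
A = sample_sbm([50, 50], U, np.random.default_rng(0))
```

Setting both entries of U equal recovers the Erdős-Rényi case G(N, p), and a zero diagonal gives the random bipartite graph.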

B. Phase dynamics
The dynamics of an oscillator is described by a dynamical system which possesses at least one periodic orbit (cycle) for a certain set of parameters. When multiple oscillators interact, their trajectories are perturbed from their periodic state. For small perturbations, the oscillators stay in the basin of attraction of their limit cycles and the variations in the oscillation amplitudes are small. The whole dynamics then becomes mainly controlled by a system of ordinary differential equations that only involves the phases of the oscillations [4,7,33]. For a network with adjacency matrix A = (A_jk)_{j,k=1}^N, this system typically looks like
θ̇_j = f(θ_j) + ω_j g(θ_j) + Σ_{k=1}^N A_jk h(θ_j, θ_k),
where j ∈ {1, . . . , N}. Moreover, θ_j is the real-valued function such that θ_j(t) gives the phase of oscillator j at time t ∈ R, θ̇_j = dθ_j/dt is the instantaneous phase velocity of oscillator j, ω_j is a dynamical parameter related to oscillator j, f and g are real analytic functions representing the intrinsic dynamics of each oscillator, and h is a real analytic function describing the coupling among the oscillators. The functions f, g, and h are assumed to be periodic with period 2π. We also assume that their Fourier series are finite, which will be useful in Sec. III B.
In 1967, Winfree proposed one of the first models of coupled phase oscillators in which synchronization is possible [34]. The coupling function of his model takes the form h(θ_j, θ_k) = h_1(θ_j) h_2(θ_k), where h_1(θ_j) encodes the phase response of oscillator j to the perturbations Σ_{k=1}^N A_jk h_2(θ_k) caused by its oscillating neighbors k [35]. Following Ref. [36], we choose h_1(θ_j) = − sin θ_j and h_2(θ_k) = σ(1 + cos θ_k)/N, where σ ∈ R_+ is a coupling constant. Thus, in this version of the Winfree model, the rate of change of the j-th phase is
θ̇_j = ω_j − (σ/N) sin θ_j Σ_{k=1}^N A_jk (1 + cos θ_k),
where ω_j is the natural frequency of oscillator j, f(θ_j) = 0, and g(θ_j) = 1. Inspired by the work of Winfree, Kuramoto introduced another model of nonlinearly coupled phase oscillators in 1975 that has now become a classic for the study of synchronization in populations of oscillators [37]. In the network version of the Kuramoto model, the j-th phase evolves according to
θ̇_j = ω_j + (σ/N) Σ_{k=1}^N A_jk sin(θ_k − θ_j).
Despite the simple form of its dynamics, the Kuramoto model is quite useful [28]. Indeed, for weak coupling and a phase coupling function h well approximated by its first harmonic, a large class of phase oscillator dynamics can be described by the Kuramoto model [33]. The model is relevant for the study of Josephson junctions [38] and nanoelectromechanical oscillators [39], and it exhibits a rich variety of dynamical behaviors when transposed on complex networks [23]. Adding a global phase lag α between the oscillators of the Kuramoto model leads to the Kuramoto-Sakaguchi model [40], whose dynamical equations are
θ̇_j = ω_j + (σ/N) Σ_{k=1}^N A_jk sin(θ_k − θ_j − α),
for j ∈ {1, . . . , N}. This model possesses partially synchronized solutions, called chimera states [41], which will be investigated in Sec. IV.
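To make these models concrete, here is a minimal numerical integration sketch for the Kuramoto-Sakaguchi equations (fixed-step RK4; the function names, step size, and parameter values are our own choices, not part of the original method):

```python
import numpy as np

def kuramoto_sakaguchi_rhs(theta, omega, A, sigma, alpha):
    """theta_dot_j = omega_j + (sigma/N) * sum_k A_jk sin(theta_k - theta_j - alpha)."""
    N = len(theta)
    phase_diff = theta[None, :] - theta[:, None] - alpha  # entry (j, k): theta_k - theta_j - alpha
    return omega + (sigma / N) * np.sum(A * np.sin(phase_diff), axis=1)

def integrate(theta0, omega, A, sigma, alpha, dt=0.01, steps=5000):
    """Fixed-step fourth-order Runge-Kutta; returns the phase trajectory."""
    theta = theta0.copy()
    traj = np.empty((steps, len(theta0)))
    for t in range(steps):
        k1 = kuramoto_sakaguchi_rhs(theta, omega, A, sigma, alpha)
        k2 = kuramoto_sakaguchi_rhs(theta + 0.5 * dt * k1, omega, A, sigma, alpha)
        k3 = kuramoto_sakaguchi_rhs(theta + 0.5 * dt * k2, omega, A, sigma, alpha)
        k4 = kuramoto_sakaguchi_rhs(theta + dt * k3, omega, A, sigma, alpha)
        theta = theta + (dt / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
        traj[t] = theta
    return traj
```

Setting α = 0 recovers the Kuramoto model; swapping the right-hand side for the Winfree coupling above adapts the same integrator to that model.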
Partly due to the inherent complexity of neural systems and their nonlinear oscillatory behavior, much research has been conducted to obtain simplified models of neurons that capture their main dynamical properties [42,43]. For example, in 1986, the theta model (also known as the Ermentrout-Kopell model) was introduced to provide a low-dimensional dynamical description of parabolic bursting while preserving some basic characteristics of higher-dimensional models belonging to the Hodgkin-Huxley family [44]. In the theta model, the rate of change of the j-th oscillator is
θ̇_j = 1 − cos θ_j + (1 + cos θ_j) η_j,
where η_j stands for the interaction term defined as
η_j = ω_j + (σ/N) Σ_{k=1}^N A_jk (1 − cos θ_k).
In the last equation, ω_j is interpreted as the current injected into neuron j. The above model essentially differs from the quadratic integrate-and-fire model by the change of variables V_j = tan(θ_j/2), where V_j is the membrane potential of neuron j [45]. Equation (6) is also the normal form of the saddle-node on invariant circle (SNIC) bifurcation illustrated in Fig. 2.
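The SNIC picture can be checked directly on a single theta neuron: for a constant input η, the right-hand side 1 − cos θ + (1 + cos θ)η vanishes where tan²(θ/2) = −η, so for η < 0 there is a stable/unstable pair of fixed points that collides at η = 0, while for η > 0 the neuron spikes periodically. A minimal sketch (the helper names are ours):

```python
import numpy as np

def theta_model_rhs(theta, eta):
    """Single theta neuron (SNIC normal form): theta_dot = 1 - cos(theta) + (1 + cos(theta)) * eta."""
    return 1 - np.cos(theta) + (1 + np.cos(theta)) * eta

def fixed_points(eta):
    """For eta < 0, the stable and unstable fixed points, derived from tan^2(theta/2) = -eta.
    The stable point sits at -2*arctan(sqrt(-eta)), the unstable one at +2*arctan(sqrt(-eta))."""
    root = 2 * np.arctan(np.sqrt(-eta))
    return -root, root
```

For η > 0 the right-hand side is strictly positive, so the phase keeps advancing: the fixed points have disappeared through the saddle-node bifurcation and the neuron oscillates.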

III. DIMENSION-REDUCTION METHOD
We now introduce a method that approximately reduces N-dimensional dynamical systems, akin to Eq. (2), to new n-dimensional ones with n ≪ N. For this, we first define n new linearly independent dynamical observables Z_µ for µ ∈ {1, . . . , n}. We force their time evolution to obey a closed system of n ordinary differential equations. In turn, this constraint leads us to impose three compatibility conditions, taking the form of three systems of linear algebraic equations, which, in general, cannot be satisfied simultaneously. We nevertheless clarify the circumstances in which a particular condition should be favored and explain how to get satisfactory solutions, allowing us to successfully apply the method to phase dynamics and to predict their synchronization behavior on modular graphs.

A. Definition of the observables
Inspired by Koopman's seminal work on ergodic theory (e.g., see [46]), we define an observable of a dynamical system as a complex-valued function on its phase space. We restrict the observables to be smooth functions. Center of mass, total kinetic energy, and linear momentum are classical examples of smooth observables. All such observables form a countably infinite-dimensional complex algebra O.
In that context, the goal of dimension reduction consists in selecting n ≪ N observables in O whose dynamics is determined, at least approximately, by a closed system of n ordinary differential equations. Selecting the right set of observables is a difficult task. Yet, as argued in [27], this task is greatly simplified if, rather than looking for observables in the whole algebra O, we limit our search to linear observables, which form an N-dimensional complex vector space.
However, when studying the synchronization of oscillators, linear observables such as Σ_{j=1}^N c_j θ_j, where c_j ∈ C for each j, have limited value. To get observables that quantify synchronization, we must first make the change of variables
z_j = e^{iθ_j},
which maps the phase of oscillator j onto the unit circle T in the complex plane. This means that we should look for n linear observables of the form
Z_µ = Σ_{j=1}^N M_µj z_j,
where M is an n × N matrix with real elements. In matrix form, Z = Mz. Choosing n linear observables Z_1, . . . , Z_n is thus equivalent to choosing a matrix M. We call this matrix a "reduction matrix" since it reduces the N variables z_j into the n variables Z_µ.
To further restrict the set of linear observables, we impose two additional conditions:
(A) the rank of matrix M is n;
(B) each row of M is a probability vector, i.e., (B1) Σ_{j=1}^N M_µj = 1 for all µ, and (B2) M_µj ≥ 0 for all µ and j.
These conditions can be reformulated as follows: (A) the observables Z_1, . . . , Z_n are linearly independent; (B) each observable Z_µ is a weighted average of the activity variables z_1, . . . , z_N. Condition A ensures that the dimension reduction is optimal in the sense that there is no redundancy among the observables.
Condition B makes each observable easily interpretable. Namely, it is possible to decompose each linear observable as
Z_µ = R_µ e^{iΦ_µ},
where R_µ = |Z_µ| and Φ_µ = arg Z_µ. Condition B then directly implies that the inequality R_µ ≤ 1 holds, and R_µ reaches its maximal value when, and only when, all phases θ_j are equal modulo 2π. In other words, thanks to Condition B, each R_µ can be interpreted as a synchronization measure. Although recent works [35,47] suggest that other observables may provide better quantitative measures of phase coherence, we use R_µ because of its simple properties and because it is easily obtained from the linear observable Z_µ.
Let us illustrate Conditions A-B with an example. First, suppose that for each µ ∈ {1, . . . , n}, n_µ nodes are selected to form the subset O_µ ⊂ V. Second, assume that (O_µ)_{µ=1}^n is a partition of V. Third, set
M_µj = 1/n_µ if j ∈ O_µ and M_µj = 0 otherwise.
Then, the matrix M satisfies both Conditions A-B with the corresponding observables Z_µ = (1/n_µ) Σ_{j∈O_µ} z_j, for µ ∈ {1, . . . , n}.
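The partition-based example translates directly into code. A minimal sketch (the helper name is ours) that builds M from a partition and lets Conditions A-B be verified numerically:

```python
import numpy as np

def partition_reduction_matrix(partition, N):
    """n x N reduction matrix with M[mu, j] = 1/n_mu if node j is in block O_mu, else 0,
    so each observable Z_mu is the uniform average of the z_j over its block."""
    M = np.zeros((len(partition), N))
    for mu, block in enumerate(partition):
        M[mu, list(block)] = 1.0 / len(block)
    return M

# Two blocks: O_1 = {0, 1, 2}, O_2 = {3, 4} (0-based indices)
M = partition_reduction_matrix([[0, 1, 2], [3, 4]], N=5)
```

Each row sums to one with nonnegative entries (Condition B), and the disjointness of the blocks makes the rows orthogonal, hence of full rank (Condition A).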
In the last example, the reduction matrix M of Eq. (10) is such that if node j contributes to the observable Z_µ, meaning M_µj ≠ 0, then the same node does not contribute to any other observable Z_ν with ν ≠ µ. This property, together with Condition B, implies the following:
(A′) the rows of M are mutually orthogonal, i.e., Σ_{j=1}^N M_µj M_νj = 0 whenever µ ≠ ν.
This is a stronger version of Condition A. Indeed, the orthogonality between the rows of M implies that the rows are linearly independent, so the rank of M is equal to n. Condition A′ therefore implies Condition A.
Although Condition A′ will not be strictly imposed in the paper, it will always be considered desirable. The reason is quite simple: together, Conditions A′ and B induce a partition of the set of nodes into n disjoint blocks, as in Sec. II A. In other words, when the reduction matrix M satisfies both Conditions A′ and B, then M also provides a modular structure and each Z_µ is interpretable as a weighted average inside the µ-th block. This modular structure, however, is not necessarily related to that of the network, since no information has yet been extracted from the adjacency matrix A.
In the following sections, we will only use linear observables satisfying at least Conditions A-B. As a consequence, any row probability vector ℓ = (ℓ_µ)_{µ=1}^n ensures that m = (m_j)_{j=1}^N = ℓM is a probability vector. Throughout the paper, we choose each ℓ_µ as the number of nonzero elements in row µ of M divided by the total number of nonzero elements in M. The observables Z_1, . . . , Z_n can therefore be linearly combined to produce a single weighted average of the activity variables z_1, . . . , z_N:
Z = Σ_{µ=1}^n ℓ_µ Z_µ = Σ_{j=1}^N m_j z_j.
The observable R = |Z| is then bounded as 0 ≤ R ≤ 1 and serves as a global measure of synchronization (i.e., an order parameter). Additionally, we will often use the time-averaged measure of synchronization
⟨R⟩ = 1/(t_1 − t_0) ∫_{t_0}^{t_1} R(t) dt,
where we choose sufficiently large time values, t_0 and t_1 > t_0, such that R(t) oscillates around a certain mean or reaches a stable solution. We will sometimes denote by ⟨R⟩ the average synchronization measure over multiple quantities (e.g., over random initial conditions or various network realizations).
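The block observables and the global order parameter defined above can be computed as follows (a sketch; the weighting of each row by its number of nonzero entries follows the choice stated in the text, and the function names are ours):

```python
import numpy as np

def block_observables(M, theta):
    """Z_mu = sum_j M[mu, j] * exp(i * theta_j) and the moduli R_mu = |Z_mu|."""
    Z = M @ np.exp(1j * theta)
    return Z, np.abs(Z)

def global_order_parameter(M, theta):
    """Weight each Z_mu by ell_mu = (# nonzero entries in row mu) / (# nonzero in M),
    then return R = |sum_mu ell_mu * Z_mu|."""
    ell = (M != 0).sum(axis=1) / (M != 0).sum()
    Z, _ = block_observables(M, theta)
    return np.abs(ell @ Z)

# Two equal blocks of two nodes each
M = np.array([[0.5, 0.5, 0.0, 0.0],
              [0.0, 0.0, 0.5, 0.5]])
```

With identical phases R = 1; with the two blocks in antiphase the contributions cancel and R = 0.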

B. Derivation of the reduced dynamical system
We now obtain a reduced system of n differential equations that describes the dynamics of the linear observables Z 1 , . . . , Z n . We summarize the main result in Table I but the reader is invited to look at the derivation to better understand the quantities in play.
First, the N-dimensional system that we are going to reduce is Eq. (2) under the change of variables of Eq. (7):
ż_j = F(z_j, z̄_j) + ω_j G(z_j, z̄_j) + Σ_{k=1}^N A_jk H(z_j, z̄_j, z_k, z̄_k),
for j ∈ {1, . . . , N}. This system defines a dynamics on the complex torus T^N. The functions F, G, and H are directly related to the functions f, g, and h. They therefore inherit some of their properties.
In particular, F and G are holomorphic functions with domain C² and codomain C, while H is a holomorphic function from C⁴ to C. For example, with the Kuramoto dynamics,
F(z_j, z̄_j) = 0, G(z_j, z̄_j) = i z_j, H(z_j, z̄_j, z_k, z̄_k) = (σ/2N)(z_k − z_j² z̄_k).
Second, we use Eq. (13) to compute the time derivative of the observable Z_µ defined in Eq. (8):
Ż_µ = Σ_{j=1}^N M_µj ż_j = Σ_{j=1}^N M_µj [F(z_j, z̄_j) + ω_j G(z_j, z̄_j) + Σ_{k=1}^N A_jk H(z_j, z̄_j, z_k, z̄_k)],
for all µ ∈ {1, . . . , n}.
Third, we take advantage of the fact that F, G, and H are holomorphic functions to apply Taylor's theorem:
F(z_j, z̄_j) = F(β_µ, β̄_µ) + F_1 (z_j − β_µ) + F_2 (z̄_j − β̄_µ) + r^F_µj,
G(z_j, z̄_j) = G(γ_µ, γ̄_µ) + G_1 (z_j − γ_µ) + G_2 (z̄_j − γ̄_µ) + r^G_µj,
H(z_j, z̄_j, z_k, z̄_k) = H(δ_µ, δ̄_µ, ε_µ, ε̄_µ) + H_1 (z_j − δ_µ) + H_2 (z̄_j − δ̄_µ) + H_3 (z_k − ε_µ) + H_4 (z̄_k − ε̄_µ) + r^H_µjk,
where r^F_µj, r^G_µj, r^H_µjk are second-order Lagrange remainders and β_µ, γ_µ, δ_µ, ε_µ, β̄_µ, γ̄_µ, δ̄_µ, ε̄_µ are arbitrary complex numbers around which we apply the expansions. Also, F_1 and F_2 are the derivatives of F with respect to the first and second arguments (z_j and z̄_j), respectively, and they are evaluated at (β_µ, β̄_µ). The same applies to G_1, G_2, H_1, H_2, H_3, and H_4. The substitution of these Taylor expansions into Eq. (14) then leads to the equation
Ż_µ = F(β_µ, β̄_µ) + Ω_µ G(γ_µ, γ̄_µ) + κ_µ H(δ_µ, δ̄_µ, ε_µ, ε̄_µ) + Υ_µ + Ξ_µ,
where Υ_µ is a homogeneous polynomial of degree one in the variables z_j − β_µ, z̄_j − β̄_µ, z_j − γ_µ, z̄_j − γ̄_µ, and so forth, Ξ_µ = Σ_j (r^F_µj + r^G_µj) + Σ_{j,k} r^H_µjk, and we have defined the parameters
Ω_µ = Σ_{j=1}^N M_µj ω_j and κ_µ = Σ_{j=1}^N M_µj k_j,
with the in-degree of node j being
k_j = Σ_{k=1}^N A_jk.
By introducing the "dynamical parameter matrix" W = diag(ω_1, . . . , ω_N) and the "degree matrix" K = diag(k_1, . . . , k_N), the parameters Ω_µ and κ_µ can be written in matrix form as in Eqs. (20)-(21). These parameters represent weighted averages of the dynamical parameters ω_j and the in-degrees k_j, respectively.
Finally, we close the system of differential equations for the Ż_µ, with µ ∈ {1, . . . , n}. We thus need to convert each equation of the form of Eq. (15) into an equation that involves only the linear observables Z_1, . . . , Z_n and their complex conjugates, without explicit reference to the variables z_1, . . . , z_N and their complex conjugates. To do so, we impose three new conditions:
(I) exact cancellation of the first-order terms: each polynomial Υ_µ is identically zero;
(II) linearity in the dynamical variables: all the variables β_µ, γ_µ, δ_µ, ε_µ, β̄_µ, γ̄_µ, δ̄_µ, ε̄_µ are linear combinations, with real coefficients, of the observables Z_1, . . . , Z_n or their complex conjugates;
(III) approximation at second order: each term Ξ_µ is neglected.
Condition I readily provides formulas expressing β_µ, γ_µ, δ_µ, and ε_µ as linear combinations of the z_j's:
β_µ = Σ_j M_µj z_j = Z_µ, γ_µ = (1/Ω_µ) Σ_j M_µj ω_j z_j, δ_µ = (1/κ_µ) Σ_{j,k} M_µj A_jk z_j, ε_µ = (1/κ_µ) Σ_{j,k} M_µj A_jk z_k.
The equations for β̄_µ, γ̄_µ, δ̄_µ, and ε̄_µ have the same form as the previous equations, but with the complex conjugates of z_j and z_k. This implies that β̄_µ, γ̄_µ, δ̄_µ, and ε̄_µ are the complex conjugates of β_µ, γ_µ, δ_µ, and ε_µ, respectively. Condition II requires rewriting β_µ, γ_µ, δ_µ, and ε_µ as linear combinations of the observables Z_1, . . . , Z_n. This condition is directly satisfied for β_µ, but not for the others. Satisfying Condition II for γ_µ, δ_µ, and ε_µ is the challenging step in our calculation. Yet, it can be done if
𝒲M = MW, 𝒦M = MK, 𝒜M = MA,
where we have introduced three unknown n × n matrices 𝒲, 𝒦, and 𝒜. We observe that if the compatibility equations (24)-(26) are all satisfied, then the previous equations are satisfied along with Condition II. The consistency of these compatibility equations is a subtle issue that will be addressed in the next subsection.
Condition III guarantees that no further restriction is imposed on γ_µ, δ_µ, and ε_µ.
Let us now state the main result of this paper, presented in Table I. If the three compatibility equations are satisfied, then the linear observables Z_1, . . . , Z_n obey the differential equations (16), where the symbol ≈ means "equality up to second-order corrections" and where the variables γ_µ, δ_µ, ε_µ are the components of the vectors in Eqs. (17)-(19).
In general, the reduced system cannot be totally consistent with the complete set of differential equations (2). The reduced system should thus be interpreted as an approximation whose quality strongly depends upon the choice of the n × N reduction matrix M . This will be discussed in depth in the next subsection.
FIG. 3. (Color online) Schematic of DART for a modular graph. The graph with adjacency matrix A and N nodes represents the structure of the complete dynamics, while the small graph with reduced adjacency matrix 𝒜 and n nodes illustrates the structure of the reduced dynamics. The N × N matrix of dynamical parameters W and the N × N degree matrix K are also reduced with the reduction matrix M to the n × n matrices 𝒲 and 𝒦, respectively.
Although the reduced matrices 𝒲, 𝒦, and 𝒜 seem to have been introduced for the sole purpose of mathematical convenience, they actually have a simple interpretation. As illustrated at the bottom of Fig. 3, 𝒜 is the reduced adjacency matrix, that is, the matrix that regulates the interactions in the reduced system. Indeed, 𝒜 is related to the graph with four nodes in the figure. Similarly, 𝒦 and 𝒲 respectively describe the in-degrees and the dynamical parameters of the reduced system. Figure 3 therefore gives a basic intuition of DART.
To get a better idea of what a reduced system looks like, we apply DART to specific cases, namely, the Winfree, Kuramoto, and theta dynamics. The reduced dynamics are presented in Table II.

C. Construction of the reduction matrix
The problem is to determine whether we can construct a reduction matrix M that ensures the consistency of the compatibility equations. In this subsection, we prove that there is an infinite number of reduction matrices that lead to exact solutions for at least one compatibility equation while providing approximate solutions to the other equations. We then propose two procedures that aim at cleverly selecting a reduction matrix that minimizes the errors associated with the approximate solutions.

Existence of the reduction matrix M and its factorization
In Appendix B, we establish that for any N × N matrix T, if M is an n × N matrix of rank n, with n < N, then there exists at most one solution to the matrix equation
M T = 𝒯 M,
with unknown reduced matrix 𝒯 of size n × n. If the solution exists, then it is equal to
𝒯 = M T M⁺,
where M⁺ denotes the Moore-Penrose pseudo-inverse of M.
Moreover, if the solution does not exist, Eq. (28) provides the best approximate solution in the sense that it minimizes the mean squared error (MSE) between M T and 𝒯 M.
To tackle the problem of the existence of a solution to Eq. (27), we focus on the case where T is real and symmetric, as are the matrices W, K, and A in the compatibility equations. In Appendix B, we prove that for (1) a factorization M = CV, where (2) C is a real non-singular n × n matrix and (3) V is an n × N real matrix composed of n real orthonormal row eigenvectors of T, Eq. (27) has the unique solution
𝒯 = C Λ C⁻¹,
where Λ is the diagonal matrix whose µ-th diagonal element is equal to the eigenvalue λ_µ corresponding to the µ-th eigenvector in V.
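This existence result is easy to verify numerically. The sketch below (illustrative names; random data) builds M = CV from eigenvectors of a symmetric T and checks that the pseudo-inverse formula M T M⁺ equals C Λ C⁻¹ and satisfies the compatibility equation exactly:

```python
import numpy as np

rng = np.random.default_rng(3)
N, n = 8, 3

# A real symmetric "target" matrix and n of its orthonormal eigenvectors
T = rng.standard_normal((N, N))
T = T + T.T
eigvals, eigvecs = np.linalg.eigh(T)
V = eigvecs[:, :n].T             # n orthonormal row eigenvectors of T
Lam = np.diag(eigvals[:n])       # corresponding eigenvalues

C = rng.standard_normal((n, n))  # a (generically non-singular) coefficient matrix
M = C @ V                        # reduction matrix M = C V

T_red = M @ T @ np.linalg.pinv(M)  # pseudo-inverse formula for the reduced matrix
```

Here `T_red` coincides with C Λ C⁻¹ and obeys M T = T_red M to machine precision, which is the content of the uniqueness statement above.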
Conditions (1)-(3) are not particularly restrictive. They simply ensure that the rows of M form a (not necessarily orthogonal) basis of an n-dimensional subspace of R^N. This subspace is spanned by the row eigenvectors of T used to build the eigenvector matrix V.
We call C the coefficient matrix, as its main role is to combine the eigenvectors of T in a suitable manner such that Conditions A and B are respected. We note that, due to the non-singularity of C and the orthonormality of the rows of V , the matrix M = CV automatically complies with Condition A.
These results lead to sufficient criteria for the consistency of the compatibility equations. Indeed, if
(i) the matrices W, K, and A share the row eigenvectors v_µ, with µ ∈ {1, . . . , n},
(ii) M = CV, where C is an n × n non-singular matrix and V is the n × N matrix whose µ-th row is v_µ,
then the compatibility equations are all consistent and DART is exact to first order.
In general, however, the matrices W, K, and A do not share eigenvectors. This makes it difficult or even impossible to determine whether or not there is a reduction matrix M whose corresponding n-dimensional reduced system is exact to first order. Nevertheless, the sufficient criteria (i)-(ii) suggest how to use eigenvectors to get a reduced system that is almost exact to first order.

Selection of one target matrix among {W, K, A}
Even if criterion (i) is generally not satisfied, we can still define V with the eigenvectors of one matrix T among {W, K, A}, called the target matrix, and solve its compatibility equation exactly while solving the others approximately. The intermediate steps to achieve this are detailed in Procedure 1.
The procedure uncovers a trichotomy in DART: we must choose either W, K, or A as the target. To link the chosen target T to its reduction matrix, we denote M by M_T, C by C_T, and V by V_T.

Procedure 1 Reduction with a single target matrix
Input: N × N matrices W, K, and A; positive integer n < N
Output: n × n matrices 𝒲, 𝒦, and 𝒜
1: Select a target matrix T from the set {W, K, A}.
2: Use n orthonormal row eigenvectors of T to form the n × N matrix V_T.
3: Find a non-singular n × n matrix C_T such that M_T = C_T V_T satisfies Condition B [see Appendix C].
4: Solve the compatibility equations, exactly for the target T and approximately for the two other matrices, using Eq. (28).
Note that finding the coefficient matrix C T in Step 3 is generally not straightforward. In simple cases, where all elements of V T are nonnegative, C T can take the form of a diagonal matrix. When V T has negative values, the coefficient matrix is more difficult to obtain and we must turn to more sophisticated factorization methods, including semi-nonnegative matrix factorization (SNMF) and orthogonal nonnegative matrix factorization (ONMF). The technical details of the calculation of C T and V T are given in Appendix C.
Let us discuss Procedure 1 when the adjacency matrix A is chosen as the target matrix in Step 1. There are a number of reasons to look at this choice in more detail. Recall that A encodes the interactions among the oscillators, so the first-order errors induced by an inconsistency in the equation 𝒜M = MA may lead to poor estimates of the evolution of the linear observables Z_µ, unless the interactions are negligible. Moreover, a recent work [27] has revealed that the dominant eigenvector of A provides an observable that is better estimated through the reduced dynamics than the corresponding observable based on the degrees, and thus on K.
The unique solution to the matrix equation 𝒜M_A = M_A A is then
𝒜 = C_A Λ_A C_A⁻¹,
where Λ_A is the diagonal matrix that contains the eigenvalues associated with the eigenvectors in V_A. Any eigenvalue of the reduced adjacency matrix 𝒜 is thus also an eigenvalue of A, and this result does not depend on the matrix C_A. Hence, in addition to its structural importance, this choice reveals something quite useful about an extension of the method to more than one target matrix.
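This spectral property is quick to check numerically (a sketch with an arbitrary non-singular coefficient matrix; recall that the star graph K_{1,5} has eigenvalues ±√5 and 0):

```python
import numpy as np

# Star graph: core node 0 connected to 5 peripheral nodes
Np = 5
A = np.zeros((Np + 1, Np + 1))
A[0, 1:] = A[1:, 0] = 1

# Rows of V_A: eigenvectors of the largest and smallest eigenvalues (+/- sqrt(5))
eigvals, eigvecs = np.linalg.eigh(A)
V_A = eigvecs[:, [-1, 0]].T

C_A = np.array([[1.0, 1.0],
                [1.0, -1.0]])          # any non-singular coefficient matrix
M_A = C_A @ V_A
A_red = M_A @ A @ np.linalg.pinv(M_A)  # reduced adjacency matrix
```

The spectrum of `A_red` is {+√5, −√5}, a subset of the spectrum of A, independently of the chosen coefficient matrix.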

Selection of two or three target matrices
In Procedure 1, there is a lot of freedom in choosing the non-singular coefficient matrices C_W, C_K, and C_A, but we have not yet exploited this freedom. Even if the procedure allows us to solve one of the compatibility equations exactly, the resulting reduction matrix M_T could generate considerable errors in the other ones. We should therefore seek a procedure that leverages the freedom in the coefficient matrix to minimize these errors.
For instance, let us consider M_A = C_A V_A. We now want to find a coefficient matrix C_A that minimizes the MSE between M_A and M_W = C_W V_W, the reduction matrix that would ensure the consistency of the equation 𝒲M_W = M_W W. In such a situation, we say that W is the second target matrix. The solution to the minimization problem is
C_A = C_W V_W V_A⁺,
which implies that the best reduction matrix is
M_A = C_W V_W V_A⁺ V_A.
We note that V_A⁺ = V_Aᵀ since the rows of V_A are orthonormal. The matrix C_W is not yet determined, but this can be done by imposing Condition B.
In another context, one could want to minimize the error related to the degree matrix K instead of W. The second target matrix would then become K, and the MSE would be minimized between M_A = C_A V_A and the matrix M_K = C_K V_K, where C_K is a non-singular matrix chosen so that Condition B is satisfied.
We have thus succeeded in targeting two matrices, first A, then W or K. We will therefore use the notation A → W, A → K, or more generally T_1 → T_2, to denote the choice of target T_1 followed by T_2. The first target is reached exactly in the sense that the equation 𝒜M_A = M_A A is consistent, while the second cannot be reached exactly in general, but nevertheless allows the approximate resolution of its compatibility equation. Procedure 2 is a formalized version of this method with multiple targets in which the first and the second target matrices are arbitrary. The same procedure is also applicable to the case of three target matrices [48].
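The two-target construction A → W can be sketched numerically as follows (a toy example with random stand-in matrices; C_W is left as the identity, a placeholder that Condition B would fix). It shows that M = C_W V_W V_A^+ V_A still satisfies the compatibility equation of the first target A exactly:

```python
import numpy as np

rng = np.random.default_rng(1)
N, n = 6, 2
S = rng.normal(size=(N, N))
A = (S + S.T) / 2                                 # random symmetric stand-in for an adjacency matrix
W = np.diag(rng.normal(size=N))                   # diagonal matrix of dynamical parameters

lamA, UA = np.linalg.eigh(A)
V_A = UA[:, np.argsort(-np.abs(lamA))[:n]].T      # orthonormal rows, so V_A^+ = V_A^T
V_W = np.eye(N)[:n]                               # two (trivial) eigenvectors of the diagonal W

C_W = np.eye(n)                                   # placeholder; Condition B would determine it
C_A = C_W @ V_W @ V_A.T                           # minimizes the MSE between M_A and M_W
M = C_A @ V_A                                     # reduction matrix for A -> W

A_red = M @ A @ np.linalg.pinv(M)
assert np.allclose(A_red @ M, M @ A)              # first target A is reached exactly
```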
Procedure 2 includes more constraints on the coefficient matrix than Procedure 1, in the hope of satisfying the compatibility equations more accurately. Indeed, Steps 3 and 4 can be seen as a successive imposition of constraints on the coefficient matrix: first constraints to satisfy the different compatibility equations, then constraints to fulfill Condition B.
Step 4 is not trivially satisfied when V_A is involved. Indeed, only the first dominant eigenvector possesses solely real positive entries, which is guaranteed by the Perron-Frobenius theorem [49, Theorem 38]. To satisfy Condition B2, we once again use SNMF. We also use ONMF to ensure compliance with Condition A (exactly) and Condition A' (approximately). The details are given in Appendix D.

Procedure 2 Reduction with u target matrices
Input: N × N matrices W, K, and A; positive integer n < N; u ∈ {1, 2, 3}
Output: n × n matrices 𝒲, 𝒦, and 𝒜
1: For k ∈ {1, . . . , u}, select a target T_k from the set {W, K, A}.
2: Choose n eigenvectors of each target and form the eigenvector matrices V_{T_k}.
3: For arbitrary C_{T_u}, M_u must satisfy Condition A. Therefore, if u = 2 and V_{T_2} V_{T_1}^+ is singular, return to Step 2 and choose different eigenvectors. If u = 3 and the corresponding product of eigenvector matrices is singular, return to Step 2 and choose different eigenvectors.
4: Find a non-singular n × n matrix C_{T_u} by imposing Condition B [see Appendix D].
5: Solve exactly or approximately the compatibility equations using the formulae 𝒲 = M W M^+, 𝒦 = M K M^+, and 𝒜 = M A M^+.

Choice of eigenvectors for the adjacency matrix
There is yet another aspect that needs to be clarified when the adjacency matrix is selected as a target matrix: the choice of eigenvectors. Although any eigenvector could be chosen in principle, many reasons speak in favor of prioritizing the dominant eigenvectors, i.e., the eigenvectors of A whose corresponding eigenvalues lie away from the bulk of the spectrum. It is known, for instance, that the largest eigenvalue of A plays a crucial role in predicting the stability [50], the dynamic range [51], and the critical values of dynamical parameters [52][53][54] in dynamical systems on networks. Moreover, from the graph structure perspective, the eigenvector with the largest eigenvalue has proven to be a useful centrality measure [17,49], while the dominant eigenvectors are key objects in spectral methods for detecting communities [55]. Finally, Eq. (19) and Eq. (34) reveal that the dominant eigenvalues have the strongest impact on the reduced dynamics. We should therefore use only the dominant eigenvectors of A when constructing the matrix V_A.

But how many of them should we consider? A rule of thumb is to choose n as the number of eigenvalues that lie away from the bulk of the spectrum, the bulk being the set of eigenvalues λ_i of A such that λ_i/λ_D ≈ 0, where λ_D is the largest eigenvalue (see Fig. 4). In random graphs with one community (q = 1), such as the Erdős-Rényi model, only one eigenvalue is separated from the bulk. We then simply choose C_A = 1 and M_A = V_A = v_D, the dominant eigenvector of A, to get a reduced dynamics of dimension n = 1 as in Refs. [27,56]. For a graph with two communities (q = 2), there are usually two dominant eigenvectors, suggesting a reduction of dimension two (n = q = 2). When the communities are densely connected among themselves, it is not possible to apply the one-dimensional reduction method of Ref. [27] twice.
We then need to combine the eigenvectors using a second target matrix, W for example.
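The rule of thumb above can be sketched numerically (the planted-partition parameters and the 0.4 threshold below are arbitrary stand-ins for "away from the bulk"):

```python
import numpy as np

rng = np.random.default_rng(2)
N1 = N2 = 100
p_in, p_out = 0.9, 0.1                            # assumed planted-partition densities
N = N1 + N2

P = np.full((N, N), p_out)
P[:N1, :N1] = P[N1:, N1:] = p_in
U = np.triu(rng.random((N, N)) < P, 1)
A = (U + U.T).astype(float)                       # one SBM realization

lam = np.linalg.eigvalsh(A)
lam_D = lam[-1]                                   # largest eigenvalue
n = int(np.sum(np.abs(lam) / lam_D > 0.4))        # count eigenvalues away from the bulk
print(n)                                          # two communities -> n = 2
```

Here the two dominant eigenvalues sit near N(p_in + p_out)/2 and N(p_in − p_out)/2, well outside the bulk of half-width roughly 2√(N p̄(1 − p̄)).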

D. Errors in DART
In the previous subsections, we made approximations to obtain the reduced dynamics (16). Until now, we have not considered the impact of these approximations on the accuracy of the synchronization curves of the reduced dynamics compared to those of the complete dynamics (2). In this subsection, we first specify the sources of errors and then describe their effects on the synchronization curves of the reduced dynamics.
The errors in DART come from the Taylor expansion made in Eq. (15). They are of two types: first-order terms [Condition I] and higher-order (> 1) terms [Condition III]. In this paper, we focus on the first-order errors, which are directly related to satisfying the compatibility equations. The sources of first-order errors depend on the choice of: 1. target matrices; 2. the dimension n of the reduced dynamics; 3. eigenvectors; 4. the method used to calculate C_{T_u}. The third and fourth sources of errors are discussed in Appendices C and D. To illustrate the impact of the first and second sources of errors, let us consider the simple setup presented in Fig. 5. We reduce the dimension of the Kuramoto dynamics on a two-triangle graph (N = 6) to the dynamics (E2) with n = 2. We choose this small graph and natural frequencies that are uncorrelated with the structure (e.g., the degrees) to accentuate the errors between the synchronization curves of the complete and reduced dynamics. By doing so, the matrices W, K, and A have very different eigenvectors, which makes it more difficult to satisfy the compatibility equations.
We choose the orthogonal eigenvector matrices of Eqs. (41), where c = [2(√3 − 1)^2 + 4]^{1/2} and the two rows of V_A are the first two dominant eigenvectors of A. These eigenvector matrices ensure that the products V_{T_2} V_{T_1}^+ (in Step 3 of Procedure 2) are non-singular regardless of the combination of target matrices [57].

[Figure caption] The RMSE is the root-mean-squared error between the complete dynamics curve and the reduced dynamics curve. The gray area between the synchronization curves illustrates qualitatively the error made by the reduced dynamics. The initial conditions are the same for each plot and are drawn from a uniform distribution U(0, 2π).

Target matrices
To evaluate the impact of the choice of target matrices on first-order errors, we proceed as follows. With Procedure 2 and the eigenvector matrices in Eqs. (41), there are 15 possible choices of target matrices: three single targets (u = 1), six combinations of two targets (u = 2), and six combinations of three targets (u = 3). We thus obtain 15 reduction matrices and 15 reduced dynamics, each related to a choice of target matrices. From these results, we find the synchronization curves ⟨R⟩_t vs. the coupling constant σ for the complete and reduced Kuramoto dynamics, which are presented in Fig. 6. In each plot, the choice of target matrices and the RMSE between the synchronization curves of the complete and reduced dynamics are specified just above the σ-axis.
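The RMSE reported in each plot is simply the root-mean-squared error between the two curves, for example (toy curves; the actual synchronization data are not reproduced here):

```python
import numpy as np

def rmse(complete_curve, reduced_curve):
    """Root-mean-squared error between two synchronization curves sampled at the same couplings."""
    c = np.asarray(complete_curve, dtype=float)
    r = np.asarray(reduced_curve, dtype=float)
    return np.sqrt(np.mean((c - r) ** 2))

# Toy synchronization curves sampled at three couplings
print(rmse([0.2, 0.5, 0.9], [0.2, 0.4, 1.0]))
```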
As expected, we first observe that choosing T_1 = W allows an accurate prediction of the synchronization observable at σ = 0 compared to T_1 = K or T_1 = A. This is explained by the fact that the compatibility equation 𝒲M_u = M_u W is perfectly satisfied for all u when T_1 = W, but not when T_1 = K or T_1 = A.
Interestingly, the reduced dynamics with the one target procedure T 1 = A accurately predicts the form of the complete dynamics synchronization curve for high couplings. We can explain this result by the fact that, for higher couplings, the structure becomes increasingly important. Therefore, observables defined from structural quantities (like the eigenvectors of the adjacency matrix) should be favored in this coupling regime.
However, if we choose T_1 = K, the predictions of the reduced dynamics are inaccurate even when the dynamics is strongly influenced by the structure (σ > 1, for instance). From our numerical experiments, choosing K as a first target is often a bad choice. One key reason for this is that the degrees only give local information about the nodes, whereas the eigenvectors of the adjacency matrix contain global information about the network [49, Chapter 3]. Each eigenvector of A depends on all the nodes, in contrast to the eigenvectors of K in general.
Another aspect to consider is the number of target matrices u. Adding targets can improve or worsen the predictions of the reduced dynamics. Indeed, looking at the first line of Fig. 6, where the first target is W, we get a better RMSE between the synchronization curves for the target combination W → A → K than for the one-target procedure with T = W. On the other hand, in the third line of Fig. 6, where the first target is A, the result is the opposite: the one-target procedure with T = A leads to more accurate results than the multiple-target versions of Procedure 2. In other numerical experiments and other setups, however, one can also see better results for multiple-target procedures than for a one-target procedure. Hence, the repercussions of choosing more or fewer target matrices are very subtle.
One should nonetheless be careful with the interpretation of the RMSE: a smaller RMSE does not mean that we have better captured the underlying features of the complete dynamics. In our method, there are multiple possible sources of errors and they can compensate each other. For instance, this could be the case for the target combination W → A → K.

[Figure caption] (a-d) RMSE denotes the root-mean-squared error between the complete dynamics curve and the reduced dynamics curve. The gray area between the complete and reduced dynamics synchronization curves is shown to stress the error made by the reduced dynamics. The initial conditions are the same as in Fig. 6.

Reduced dynamics dimension
It is also important to consider the effect of the reduced dynamics dimension n. In principle, adding observables should improve the predictions of the reduced dynamics. To verify this hypothesis, we use eigenvector matrices with n = 3, where c = [2(√3 − 1)^2 + 4]^{1/2}. The first two rows of V_A are the first two dominant eigenvectors of A and the third row is the last eigenvector of A, i.e., the eigenvector related to the smallest eigenvalue. Again, the reason for this choice is to guarantee that the products V_{T_2} V_{T_1}^+ are non-singular for every combination of targets. We only use two targets, W and A, because targeting K did not improve the results.
As shown in Fig. 7 (c), it is possible to capture all the synchronization transitions by simply using the one-target procedure with T = A. However, adding a dimension to the reduced dynamics does not necessarily improve the accuracy of the prediction. In Fig. 7 (a), (b), and (d), the accuracy of the predictions is similar to the related n = 2 cases in Fig. 6. The result in Fig. 7 (d) is coherent with the result in Fig. 6: adding a second target W worsens the result.
From this example, it is clear that the choice of target matrices and the dimension n have a crucial impact on the predictions of the reduced dynamics. It is non-trivial to decide which targets to choose and in which order. Depending on the situation, one should find the appropriate targets to minimize the first-order errors. While the choice of the second and third target matrices (if any) depends on the situation, Figs. 6-7 and our many numerical experiments suggest selecting A as the first target.

E. Application to phase dynamics on modular graphs
So far, we have used DART only for a small graph and the Kuramoto dynamics. In this subsection, we show that our formalism remains effective when the dimension N of the system gets larger and can be successfully applied to other phase dynamics.
Let us first consider modular networks with two communities of oscillators, B_1 and B_2, of sizes N_1 and N_2 respectively. We assume that these modular networks are drawn from the SBM and that the dynamical parameters of each community are drawn from normal distributions. We denote these normal distributions N(ω_µ, v_µ), where ω_µ and v_µ are respectively the mean and the variance of the dynamical parameters in the community B_µ, and we consider that v_µ is small compared to ω_µ. With this setup and DART, we investigate the possibility of predicting the macroscopic and mesoscopic phase synchronization curves for the Winfree, Kuramoto, and theta models [Sec. II B].
For the reasons mentioned in Sec. III C, we select T 1 = A and define V A with the two corresponding dominant eigenvectors. We also set T 2 = W since the dynamical parameters are known to be important in the prediction of the synchronization critical value [58].
Selecting T 2 = W when there are N different frequencies raises a problem for choosing the eigenvector matrix V W . The reason is that there are N eigenvectors that cannot be combined to form two eigenvectors with as many non-zero elements as possible (see Appendix C).
We get around this difficulty by choosing approximate eigenvectors of W to build an approximate eigenvector matrix V_W. By approximate (left) eigenvectors of W, we mean row vectors v such that vW ≈ ωv, where ω is an approximate eigenvalue (frequency). Let ω_µ be the row vector of length N_µ that contains the dynamical parameters of the community B_µ. Because the parameter variance in each community is low, the eigenvector matrix V_W whose rows are (ω_1/‖ω_1‖, 0_{N_2}) and (0_{N_1}, ω_2/‖ω_2‖) yields good approximate eigenvalues of W, where 0_{N_µ} is a null row vector of length N_µ.
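A minimal numerical check of this approximation (with assumed community frequencies and a small variance) confirms that such a padded frequency vector is close to a left eigenvector of W:

```python
import numpy as np

rng = np.random.default_rng(3)
N1, N2 = 40, 60
w1 = rng.normal(1.0, 0.01, N1)                    # community B1: mean 1.0, small variance
w2 = rng.normal(-0.5, 0.01, N2)                   # community B2: mean -0.5, small variance
W = np.diag(np.concatenate([w1, w2]))

# Approximate left eigenvector built from the frequencies of B1, padded with zeros
v = np.concatenate([w1 / np.linalg.norm(w1), np.zeros(N2)])
residual = np.linalg.norm(v @ W - np.mean(w1) * v)
print(residual)                                   # small because the intra-community variance is small
```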
The reduction matrix is obtained using Eq. (36), where C_W is determined as explained in Appendix D. Note that since the adjacency matrix A and the matrix of dynamical parameters W are random matrices, the eigenvector matrices V_A and V_W are random as well. As a consequence, the reduction matrix M becomes a random matrix and the synchronization observables become random variables.
By computing L reduction matrices from L graph and dynamical parameter realizations, we solve approximately the compatibility equations and get L reduced dynamics for the Winfree, Kuramoto, and theta model. Figure 8 shows successful predictions of the average synchronization observable R for the three phase dynamics on L = 50 realizations of random modular graphs, dynamical parameters, and initial conditions. Double phase transitions occur both in the complete and reduced dynamics in Fig. 8 (a), (d) and (e). The first synchronization transition corresponds to the emergence of synchronization at the level of the communities, as seen in the insets, while the second transition is explained by the increase of synchronization between the communities.
The reduced dynamics also predicts abrupt transitions at the macroscopic and mesoscopic levels for the Kuramoto model on bipartite networks [Fig. 8 (b) and (c)]. In Fig. 8 (c), the phase trajectories first reach the closest equilibrium points on the unit circle and then oscillate as σ increases. However, this behavior is not observed in the reduced dynamics, which explains the result in the bottom inset (second community) of Fig. 8 (c). In Fig. 8 (f), the transition from oscillation death to an excited state (SNIC bifurcation, see Sec. II B and Fig. 2) is well predicted by the reduced dynamics at the macroscopic and mesoscopic levels.
Let us now investigate the impact of specific graph realizations on the performance of DART in Fig. 8. Specifically, we want to quantify how the RMSE between the complete and reduced synchronization curves varies with the graph realizations used in Fig. 8. To distinguish the different random graph realizations, we use the distance d between their adjacency matrix A and the mean adjacency matrix ⟨A⟩, defined as a normalized Frobenius norm of A − ⟨A⟩. Moreover, by computing the probability distribution P(d) in the random graph ensemble [59], we can identify which adjacency matrices among the chosen realizations are the most typical in the random graph ensemble. Figure 9 shows that the realizations, drawn from (a) the bipartite ensemble and (b) the SBM, cover a wide range of possible distances in the probability distribution P(d). The figure reveals that the RMSE between the complete and reduced dynamics synchronization curves does not vary significantly with the distance d.
This last observation is crucial. It suggests that phase dynamics on real modular networks, which may differ considerably from the mean of a random graph ensemble, can be reduced using the dominant eigenvectors of their adjacency matrix.
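The distance computation can be sketched as follows (assumed SBM parameters; the normalization by N is our assumption, the paper's exact normalization may differ):

```python
import numpy as np

rng = np.random.default_rng(4)
N1 = N2 = 50
p_in, p_out = 0.7, 0.2                            # assumed densities
N = N1 + N2

P = np.full((N, N), p_out)
P[:N1, :N1] = P[N1:, N1:] = p_in
np.fill_diagonal(P, 0)                            # no self-loops in the mean adjacency matrix

def sample_adjacency():
    U = np.triu(rng.random((N, N)) < P, 1)
    return (U + U.T).astype(float)

# Normalized Frobenius distance between each realization and the mean adjacency matrix
dists = [np.linalg.norm(sample_adjacency() - P, 'fro') / N for _ in range(20)]
print(min(dists), max(dists))                     # realizations concentrate around a typical d
```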

IV. CHIMERAS AND EXPLOSIVE SYNCHRONIZATION
Dimension reductions are useful in predicting synchronization regimes, even the more exotic ones [41,[60][61][62]. In this section, we use the formalism proposed in Sec. III to get analytical and numerical insights about the emergence of chimeras and explosive synchronization. Interestingly, we find that the reduced dynamics obtained with DART are similar to those deduced from the Ott-Antonsen Ansatz [41,61]. Yet, we provide a new perspective on the existence of chimera states for homogeneous and heterogeneous modular graphs. In addition to the observation of a new kind of chimera state on the two-star graph, we find that the asymmetry of the community sizes can make the difference between the presence and the absence of a chimera state. We finally recover results from various papers on explosive synchronization, with the potential to push the analysis further in certain directions.

A. Preliminary remarks on chimeras
By using the Ott-Antonsen Ansatz, it was shown in Ref. [41] that a graph with two communities of identical phase-lagged oscillators can reach a dynamical state where one community is perfectly synchronized (R µ = 1 for any µ in {1, 2}) while the other is partially synchronized (R ν < 1 for ν in {1, 2}\µ). This spectacular state of coherence and incoherence, first observed in Ref. [63], was called a chimera state or, more succinctly, a chimera [64].
The term "chimera state" has been used in diverse ways in the literature. Some efforts have been made to clarify the notion [3,76,82], but there is still no universally accepted definition. Our definition is based on Ref. [64] in which a chimera state is defined as "an array [(a graph)] of identical oscillators [that] splits into two domains: one coherent and phase locked, the other incoherent and desynchronized". Careful attention should be paid to the word "identical". By identical, we mean that all oscillators have the same natural frequency ω j = ω for all j ∈ {1, ..., N }. However, we allow the oscillators to have different neighborhoods (degrees). The oscillators are thus dynamically, but not structurally, identical.

B. Chimeras in modular graphs
We focus on the existence of chimera states in the Kuramoto-Sakaguchi dynamics on modular graphs with asymmetric modules and with identical natural frequencies.
We first apply DART to get a simple set of differential equations. We then use the latter to find the equations governing the equilibrium points of the dynamics. Finally, we analyze those points to identify the regions in the space defining the graph structure where different types of chimeras can be found.

Reduced dynamics
We study a specific case of the SBM, namely, the planted partition model with two blocks (see Sec. II A). The mean adjacency matrix [83] has diagonal blocks p_in 1_{N_µ}^⊤ 1_{N_µ} and off-diagonal blocks p_out 1_{N_µ}^⊤ 1_{N_ν}, where 1_{N_µ} is a row vector of 1's of length N_µ. The degree matrix is K = diag(κ_1 I_{N_1×N_1}, κ_2 I_{N_2×N_2}), where I_{N_µ×N_µ} is an identity matrix of dimension N_µ × N_µ, 0_{N_µ×N_ν} is a null matrix of dimension N_µ × N_ν, and κ_µ is the mean degree in block B_µ. Because the oscillators are identical, W = ωI_{N×N}. To find the reduced dynamics of the Kuramoto-Sakaguchi model on the mean SBM, let us follow Procedure 2 with n = 2 and u = 2.
1: The compatibility equation (24) is already satisfied since W is proportional to the identity matrix, so there is no need to consider choosing W as a target.
We therefore select the targets T_1 = A and T_2 = K.
2: We compute the eigenvectors of A_SBM and form the matrix V_A from the eigenvectors associated with the two dominant eigenvalues of A_SBM. The pseudo-inverse of V_A is found by using the formula V_A^+ = V_A^⊤(V_A V_A^⊤)^{-1} (not shown here).
3-4: Moreover, we choose two orthogonal eigenvectors of K by inspection to get V_K, which ensures that the resulting reduction matrix M satisfies Condition B.
5: We find that (M^+)_{jµ} = δ_{µ s(j)}, where s(j) is the block of node j, which allows us to solve exactly the compatibility equations with 𝒲 = diag(ω, ω), 𝒦 = diag(K_11, K_22), and the reduced adjacency matrix 𝒜. The reduced dynamics is therefore exact to first order; without loss of generality, we have set the system in the center of mass by taking ω = 0 [84].
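The exactness of this reduction can be checked numerically with the block-averaging reduction matrix whose pseudo-inverse is the community indicator (a sketch; the block sizes and densities below are assumed):

```python
import numpy as np

N1, N2, p_in, p_out = 30, 20, 0.8, 0.2            # assumed mean-SBM parameters
A = np.block([[p_in  * np.ones((N1, N1)), p_out * np.ones((N1, N2))],
              [p_out * np.ones((N2, N1)), p_in  * np.ones((N2, N2))]])

# Reduction matrix averaging over each block; its pseudo-inverse is the block indicator,
# i.e., (M^+)_{j,mu} = delta_{mu, s(j)}
M = np.block([[np.ones(N1) / N1, np.zeros(N2)],
              [np.zeros(N1), np.ones(N2) / N2]])
M_plus = np.block([[np.ones((N1, 1)), np.zeros((N1, 1))],
                   [np.zeros((N2, 1)), np.ones((N2, 1))]])

assert np.allclose(M @ M_plus, np.eye(2))         # Condition A
A_red = M @ A @ M_plus                            # reduced adjacency matrix
assert np.allclose(A_red, [[p_in * N1, p_out * N2],
                           [p_out * N1, p_in * N2]])
assert np.allclose(A_red @ M, M @ A)              # compatibility equation satisfied exactly
```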

Equilibrium points related to chimeras
We want to identify structural and dynamical conditions that are necessary for the existence of chimeras. For this, we first set R_1 = 1 and R_2 = R_I < 1 as in Ref. [41]. Then, we further simplify the reduced dynamics by introducing the phase difference Φ = Φ_1 − Φ_2, by merging the phase equations together, and by introducing a characteristic time τ [Eq. (44)]. The resulting differential equations are Eqs. (45) and (46), where the prime indicates the derivative with respect to τ. These equations are similar to those obtained in Ref. [41] by following the Ott-Antonsen method, except that we now have an explicit dependence on the network parameters N, N_1, N_2, p_in, and p_out. We have proved that if the reduced system is in a chimera state, then Eqs. (45) and (46) are satisfied. We now require the chimeras to be equilibrium points of the dynamics, which is equivalent to imposing R_I′ = 0 and Φ′ = 0. A few basic manipulations allow us to conclude that the chimeras are equilibrium points only if Eq. (47) is satisfied, where f = N_1/N_2 is the block asymmetry parameter. Note that we have redefined R_I as a function of Φ.

Chimeras in the density space
We aim to clarify the impact of the modular structure on the existence of chimeras. It is already known that the difference between the intra-community and inter-community coupling strengths plays a critical role in the emergence of chimeras [41,60]. We therefore introduce a new parameter ∆ that captures this difference in coupling strength. Following Ref. [32], we make the change of variables ρ = βp_in + (1 − β)p_out and ∆ = p_in − p_out, where β = (N_1² + N_2²)/N² is the sum of the maximum possible number of links within each community divided by the maximum number of possible links in the network. Note that ρ is the average density in the SBM. The coordinates (ρ, ∆) form the density space that characterizes all the possible graphs in the planted partition model.
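Assuming the change of variables ρ = βp_in + (1 − β)p_out and ∆ = p_in − p_out with β = (N_1² + N_2²)/N² (a reconstruction following the text; the exact expressions are those of Ref. [32]), the map between (p_in, p_out) and (ρ, ∆) is invertible, which can be checked directly:

```python
N1, N2 = 60, 40
N = N1 + N2
beta = (N1**2 + N2**2) / N**2                     # fraction of possible intra-community links

def to_density_space(p_in, p_out):
    return beta * p_in + (1 - beta) * p_out, p_in - p_out

def from_density_space(rho, delta):
    # Inverse map: p_in = rho + (1 - beta) * delta, p_out = rho - beta * delta
    return rho + (1 - beta) * delta, rho - beta * delta

rho, delta = to_density_space(0.7, 0.2)
p_in, p_out = from_density_space(rho, delta)
assert abs(p_in - 0.7) < 1e-12 and abs(p_out - 0.2) < 1e-12
```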
Let us go back to Eq. (47). Using the new coordinates (ρ, ∆), the equation becomes Eq. (49). We now identify the bifurcations of the reduced dynamics in the density space for fixed α and f [85]. Equation (49) already reveals the form of these bifurcations: they are straight lines in the density space whose slopes depend on the value of the equilibrium point Φ*.
To obtain these equilibrium points Φ*, we analyze the Jacobian matrix of the differential equations (45)-(46). We find a Hopf bifurcation (a stable chimera loses stability and trajectories converge to a stable breathing chimera) by setting the trace of the Jacobian matrix equal to zero [Eq. (50)]. We also find a saddle-node bifurcation (a stable chimera is created or destroyed) by setting the determinant equal to zero [41] [Eq. (51)]. Equations (50)-(51) allow us to compute the appropriate zeros for Φ by using standard root-finding algorithms. Substituting these zeros into Eq. (48) and then into Eq. (49) gives the slopes of the straight lines corresponding to the Hopf and saddle-node bifurcations.
The results are displayed in Fig. 10 (a), where we have integrated the N-dimensional Kuramoto-Sakaguchi dynamics for different values in the density space. We only show the assortative region (p_in > p_out ⇒ ∆ > 0) because the disassortative region (p_in < p_out ⇒ ∆ < 0) contains no chimeras. The white regions are "Not allowed" because there p_in or p_out ∉ [0, 1]. We also illustrate the trajectories of two different kinds of chimeras: a breathing chimera (stable limit cycle in black) [Fig. 10 (b)] and a stable chimera (stable equilibrium point in blue) [Fig. 10 (c)].
More importantly, Fig. 10 (a) shows the agreement of the predicted Hopf (solid brown line) and saddle-node (solid purple line) bifurcations with the bifurcations in the complete dynamics (heatmap). The homoclinic bifurcation (solid green line) was obtained numerically and is shown to better see the complete chimera region, which is located between the homoclinic bifurcation and the saddle-node bifurcation.
C. Periphery chimeras in two-star graphs

Two-star graphs are modular (see Sec. II A). Yet, their structural properties differ considerably from those of the planted partition model studied in the previous subsection. We use DART to show that these structural differences also cause significant dynamical differences. In particular, we find that for two-star graphs, the Kuramoto-Sakaguchi dynamics with identical frequencies can lead to the emergence of a new kind of chimera.

Reduced dynamics
We consider a two-star graph divided into the modules B_1, ..., B_4 defined in Sec. II A. The adjacency matrix is written in block form, where we separate the modules with lines and fill the empty blocks with zeros. The degree matrix is therefore diagonal, with degree N_{pµ} + 1 for the core of star µ and degree 1 for each peripheral node, and the frequency matrix is W = ωI_{N×N}. Let us now go through all the steps of Procedure 2 with n = q = 4 and u = 2.
1: We select the targets T_1 = A and T_2 = K.
2: We analytically find the eigenvector matrix V_A and the eigenvector matrix of the degree matrix, which gives the reduction matrix M. The latter satisfies Conditions A, B, and A'.
5: We solve exactly the compatibility equations with 𝒲 = diag(ω, ω, ω, ω) and 𝒦 = diag(N_{p1} + 1, 1, N_{p2} + 1, 1). We find that the reduced equations for the radial and phase variables have the same form as the reduced equations of the Kuramoto-Sakaguchi dynamics on the mean SBM. This observation can be generalized to other types of networks and dynamics. Indeed, as shown in Table IV of Appendix A, the reduced dynamics always takes a simpler form when 𝒲_µν = Ω_µ δ_µν and 𝒦_µν = κ_µ δ_µν.
It is worth pointing out that the cores have their own observables and are already maximally synchronized (with themselves), which is not true for the peripheries. Thus, extra care should be taken for two-star graphs to avoid any confusion between the cores and the peripheries. With this in mind, we slightly change the notation by labeling the observables with subscripts c_µ and p_µ, where c_µ and p_µ stand for "core µ" and "periphery µ", respectively. The trivial conditions now become R_{c1} = 1 and R_{c2} = 1, which imply Ṙ_{c1} = 0 and Ṙ_{c2} = 0. Thus, among the eight dynamical equations, only six are non-trivial.
We reduce the number of equations again by introducing variables that describe the phase differences between the communities. We end up with five equations describing the Kuramoto-Sakaguchi dynamics on the two-star graph. Two equations, in which we use the characteristic time defined in Eq. (44), are related to the synchronization observables in the peripheries; the other three describe the evolution of the phase observables.

Searching for chimeras
According to our previous analysis, there is a chimera state in the two-star graph if either R p1 = 1 and R p2 < 1 or R p1 < 1 and R p2 = 1. Yet, imposing these conditions yields four coupled differential equations for which the dynamical analysis is much more complicated. We will therefore rely on numerical analysis.
We fix the parameters σ, N_{p1}, and N_{p2}. This allows us to study how the phase lag α affects the synchronization in each periphery. As shown in Fig. 11 (a) and (c), for each periphery, there is a critical value [dashed vertical line] above which desynchronization begins. For the complete (reduced) dynamics, we denote these critical values α_1 and α_2 (α̃_1 and α̃_2).
Notably, we observe a clear separation between the critical values α_1 and α_2 (α̃_1 and α̃_2). Let us stress that this separation happens despite the complete similarity of the peripheries: same number of nodes and same natural frequencies. Therefore, there is a region [dark region between the two vertical dashed lines in Fig. 11 (a) and (c)] where one periphery is coherent (the one with the higher critical value) while the other is incoherent. We call any state that belongs to that region a periphery chimera. Examples of periphery chimeras are given in Fig. 11 (b) and (d).
In these examples, the periphery chimeras are breathing [41].
Because star graphs are building blocks of complex networks, our results suggest that periphery chimeras could exist within very small regions of the parameter space describing the phase dynamics of a complex network.

D. Chimera region size
We investigate the impact of the asymmetry between the communities on the size of the chimera region for the reduced dynamics of the Kuramoto-Sakaguchi model on the mean SBM [Eqs. (45)-(46)] and the two-star graph [Eqs. (52)-(53)].
First, in the case of the mean SBM, the size of the chimera region in the density space is given by Eq. (54), where the coordinates (ρ_2, ∆_2) and (ρ_3, ∆_3) are the points of intersection between the straight lines related to the bifurcations (homoclinic, saddle-node) and the limit points of the allowed region in the density space. Using Eq. (51), Eq. (49), and the equation ∆ = (1 − ρ)/(1 − β), which corresponds to the right limit of the density space, we find that the size of the stable chimera region is given by equations indexed by X ∈ {H, S}, where X indicates whether the bifurcation is homoclinic (H) or saddle-node (S). The symbol a_X denotes the slope of the bifurcation X, while Φ_X is the zero found numerically from Eqs. (50)-(51).
The results are illustrated in Fig. 12 (a), where we observe that the size of the stable and breathing chimera regions evolves non-linearly as a function of f. As a consequence, the total size of the chimera region quickly goes from its minimal value at f = 1.0 to its peak at f ≈ 1.1, before decreasing rapidly as f increases. A small structural change in the reduced equations can therefore have a considerable impact on the resulting chimera region.
Second, in the case of the Kuramoto-Sakaguchi model on the two-star graph, we investigate the impact of the asymmetry on the size of the region with periphery chimeras, which is defined along the α-axis. For most parameter values, like the ones in Fig. 11 (c), the size of the periphery chimera region is approximated by Eq. (55), where we recall that α̃_µ is the critical value above which the periphery p_µ desynchronizes in the reduced dynamics. In Fig. 12 (b), we observe that the relationship between the size S**_c and N_{p1}/N_{p2} is approximately linear. Figure 12 thus shows that the size of the chimera regions, which can be expressed in terms of specific structural parameters [e.g., ρ and ∆ in Eq. (54)] or dynamical parameters [e.g., α in Eq. (55)], is closely related to another structural parameter [i.e., f]. When some structural or dynamical parameters are fixed, the asymmetry between the community sizes can therefore dictate whether or not a chimera emerges.
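Reading the critical values off the periphery synchronization curves can be sketched as follows (hypothetical piecewise-linear curves stand in for the measured R_{p1}(α) and R_{p2}(α); the thresholds 0.8 and 1.1 are assumptions):

```python
import numpy as np

alphas = np.linspace(0.0, 1.5, 301)
# Hypothetical periphery curves: fully synchronized (R = 1) below a critical phase lag
R_p1 = np.where(alphas < 0.8, 1.0, 1.0 - (alphas - 0.8))
R_p2 = np.where(alphas < 1.1, 1.0, 1.0 - (alphas - 1.1))

eps = 1e-9
alpha_1 = alphas[np.argmax(R_p1 < 1 - eps)]       # desynchronization threshold of periphery 1
alpha_2 = alphas[np.argmax(R_p2 < 1 - eps)]       # desynchronization threshold of periphery 2
size = abs(alpha_2 - alpha_1)                     # width of the periphery chimera region
print(round(size, 3))
```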

E. Explosive synchronization in star graphs
Explosive synchronization is characterized by discontinuous (first-order) phase transitions. It is used to describe the occurrence of sudden catastrophes, such as epileptic seizures or electric breakdown [86][87][88]. A simple but instructive example of a system that can evolve towards explosive synchronization is the Kuramoto model on the star graph with degree-frequency correlation [23,61,[89][90][91]. Interestingly, the model's bifurcation diagram for the global synchronization observable exhibits hysteresis [89].
The goal of this subsection is to use DART to gain analytical insights into the Kuramoto-Sakaguchi model on the star graph. In particular, we want to reproduce some known results on explosive synchronization.

Reduced dynamics
The Kuramoto-Sakaguchi model on the two-star graph has already been analyzed in Sec. IV C. Since the star graph is contained in the two-star graph, it is rather straightforward to adapt our previous calculations to derive the appropriate reduced dynamics.
First, let us introduce the matrices that define the frequencies and the structure of the complete dynamics of dimension N = N_p + 1. Setting n = q = 2 and using Procedure 2 with A → W → K yields W = diag(ω_1, ω_2) and K = diag(N_p, 1). The reduced dynamics is therefore given by Eq. (E5) with Ω_1 = ω_1, Ω_2 = ω_2, σ/N → σ, and the corresponding reduced adjacency matrix A.
Let R_2 = R_p be the phase synchronization observable of the periphery and let Φ = Φ_1 − Φ_2 be the difference between the phase of the core and the phase observable of the periphery. The reduced dynamics of the Kuramoto-Sakaguchi model on the star graph is then given by Eqs. (56)-(57), which are also reported in Ref. [61] apart from a difference in sign [92].
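To make the observables concrete, the sketch below integrates the complete Kuramoto-Sakaguchi dynamics, θ̇_j = ω_j + σ Σ_k A_jk sin(θ_k − θ_j − α), on a star graph and extracts R_p and Φ numerically. The Euler integrator, the parameter values, and the unnormalized coupling convention are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def kuramoto_sakaguchi_star(n_p, omega_core, omega_per, sigma, alpha,
                            theta0, dt=0.01, steps=5000):
    """Integrate the Kuramoto-Sakaguchi model on a star graph (node 0 = core)
    with a simple Euler scheme and return the phase trajectory."""
    N = n_p + 1
    A = np.zeros((N, N))
    A[0, 1:] = A[1:, 0] = 1.0          # core connected to every periphery node
    omega = np.array([omega_core] + [omega_per] * n_p)
    theta = np.array(theta0, dtype=float)
    traj = np.empty((steps, N))
    for t in range(steps):
        # coupling[j] = sum_k A_jk sin(theta_k - theta_j - alpha)
        coupling = (A * np.sin(theta[None, :] - theta[:, None] - alpha)).sum(axis=1)
        theta = theta + dt * (omega + sigma * coupling)
        traj[t] = theta
    return traj

# Reduced observables: core phase Phi_1 and periphery order parameter R_p e^{i Phi_2}
rng = np.random.default_rng(0)
n_p = 10
traj = kuramoto_sakaguchi_star(n_p, omega_core=n_p, omega_per=1.0,
                               sigma=1.0, alpha=0.0,
                               theta0=rng.uniform(0, 2 * np.pi, n_p + 1))
z_p = np.exp(1j * traj[-1, 1:]).mean()             # periphery observable
R_p, Phi = abs(z_p), traj[-1, 0] - np.angle(z_p)   # R_2 = R_p, Phi = Phi_1 - Phi_2
```

The choice ω_core = N_p, ω_per = 1 mimics the degree-frequency correlation discussed above; any other frequency assignment can be substituted.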

Global synchronization observable at equilibrium
The observable that measures the global synchronization of the network is the modulus of the complex observable Z = Re^{iΦ} defined in Eq. (11) with weights (1, N_p). To find the equilibrium points of the global synchronization, we first notice that d(R^2)/dt = 2RṘ = 0 ⟺ R = 0 or Ṙ = 0.
Hence, the global synchronization is at equilibrium if the derivative of 1 + (N − 1)^2 R_p^2 + 2(N − 1)R_p cos Φ with respect to time is zero.
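The expression above can be checked numerically: for the uniform average Z = N^{-1} Σ_j e^{iθ_j} on a star graph, N^2 R^2 = 1 + (N − 1)^2 R_p^2 + 2(N − 1)R_p cos Φ holds for arbitrary phases. A minimal sketch with random phases:

```python
import numpy as np

rng = np.random.default_rng(1)
n_p = 7
N = n_p + 1
theta_core = rng.uniform(0, 2 * np.pi)
theta_per = rng.uniform(0, 2 * np.pi, n_p)

# Direct global observable: uniform average over all N phases
R_direct = abs((np.exp(1j * theta_core) + np.exp(1j * theta_per).sum()) / N)

# Same quantity expressed through the reduced variables (R_p, Phi)
z_p = np.exp(1j * theta_per).mean()
R_p, Phi = abs(z_p), theta_core - np.angle(z_p)
R_reduced = np.sqrt(1 + (N - 1)**2 * R_p**2
                    + 2 * (N - 1) * R_p * np.cos(Phi)) / N

assert np.isclose(R_direct, R_reduced)
```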
From the above analysis, we deduce that if both R_p and Φ are at equilibrium, then so is R. Now, the equilibrium solutions for R_p and Φ are the zeros of Eqs. (56)-(57). Using trigonometric identities (see [93]), we readily solve these equations and obtain closed-form expressions for R_p^* and Φ^*, where the asterisk indicates equilibrium. After substituting these solutions into Eq. (58), we obtain the corresponding equilibrium point R^* for the global synchronization observable, Eq. (59). The equilibrium point exists for all values of σ greater than or equal to a critical coupling σ_c; below σ_c, a real positive solution for R^* does not exist. At the critical coupling, R^* takes a value expressed in terms of s(N, α) = 2(N − 1)(N − 2) sin^2 α.
The equilibrium points for the global synchronization observable R are illustrated in Fig. 13 (a). We observe that the equilibrium points form a hysteresis with two branches: a backward branch (blue) and a forward branch (gray). The numerical solutions obtained from the complete (reduced) dynamics are shown with darker (lighter) markers. We observe that the solutions of the reduced dynamics agree with those of the complete dynamics, except for the backward branch in the domain 0 < σ < σ_b ≈ 1.66. An example of a synchronized trajectory observed in this domain is illustrated in Fig. 13 (b), which contrasts with the synchronized solution shown in Fig. 13 (c).
The analytical solution (59) is indicated by the orange line in Fig. 13 (a). This solution clearly belongs to the backward branch of the hysteresis. For α = 0, Eq. (59) yields the same result as the backward branch reported in Ref. [90]. For α = 0, ω_1 = N_p, and ω_2 = 1, the critical coupling σ_c reduces to the value of the top branch of the hysteresis found in Ref. [89]. The substitution of this result into Eq. (59) leads to the critical value reported in Ref. [90]. The forward-branch critical synchronization value [61] accurately predicts the numerical results obtained from the complete and the reduced dynamics. We also observe that the backward-branch critical desynchronization value σ_b occurs before the critical value σ_c of the equilibrium point R^* in Eq. (59). This is explained by the fact that the equilibrium point is not generally stable for all σ > σ_c when α ≠ 0 [61]. However, the coupling value σ_b we obtain is not equal to the value reported in Refs. [61] and [94].
To summarize, using DART with A → W → K, we successfully obtain multiple analytical results on explosive synchronization. At first glance, using the three-target procedure may seem like overkill to derive the reduced dynamics (56)-(57), since any one-target procedure would give the same reduced dynamics in this case. However, any heterogeneity in the frequencies or the structure (e.g., connecting nodes in the periphery) would preclude satisfying the three compatibility equations with a one-target procedure. We conjecture that using DART with three target matrices will come in handy for these kinds of perturbed systems and may even pave the way for new analytical results on explosive synchronization.

V. CONCLUSION
We have introduced a Dynamics Approximate Reduction Technique (DART) to obtain a low-dimensional dynamical system from a high-dimensional dynamical system on finite networks. DART generalizes previous approaches [24, 25, 27] to modular graphs with strong interactions between the modules, dynamically non-identical nodes, and complex-valued observables.
Our approach has uncovered a threefold problem: to close the reduced dynamics to first order, one needs to solve three compatibility equations involving a dynamical parameter matrix W , the degree matrix K, the adjacency matrix A, and the reduction matrix M .
The form of these compatibility equations has revealed a spectral aspect of the dimension-reduction problem. If one finds a common eigenvector basis for the matrices W, K, and A, then the compatibility equations are satisfied and the reduced dynamics is exact to first order. However, these matrices do not share eigenvectors in general.
To tackle this problem, we have introduced two procedures designed to solve one compatibility equation exactly and the other two approximately. The key to achieving this is to perform linear transformations that bring eigenvectors of one of the three matrices as close as possible to eigenvectors of the other matrices.
Yet, it is not trivial to choose the eigenvectors defining the observables of the reduced dynamics. The threefold character of DART has demonstrated that the choice of observables is subtle and not universal for dynamics with non-identical nodes: the choice depends on the dynamical and structural setup. Despite this fact, the various numerical experiments performed with the Kuramoto model on a small graph suggest that choosing the eigenvectors of the adjacency matrix, i.e., satisfying the corresponding compatibility equation exactly, is advantageous in most cases for predicting the synchronization transition curves and should thus be preferred.
Using the eigenvectors of A and W , we have derived the reduced dynamics for the Winfree, Kuramoto, and theta models on random modular networks. The global phase synchronization curves obtained with the reduced dynamics are in agreement with those of the complete dynamics and exhibit, in particular, double phase transitions and SNIC bifurcations.
We have also analyzed the Kuramoto-Sakaguchi dynamics on different modular graphs with DART to gain insight into exotic synchronization states. For the mean SBM, we have first investigated the impact of the network density and community size asymmetry on the existence of chimeras. We have detected new chimera regions in the SBM density space whose size varies in a surprising nonlinear way according to the community size asymmetry. In particular, we have found that the size of the chimera regions peaks when asymmetry is small. For the two-star graph, we have shown the existence of periphery chimera states, i.e., a dynamical regime in which the periphery of one star is perfectly synchronized while the periphery of the other star is partially synchronized. This type of chimera lives within a very narrow range of phase-lag values and can breathe. Finally, for the star graph, we have used DART to recover multiple analytical results from the literature on explosive synchronization.
Interestingly, despite the apparent differences between the methods, the reduced dynamics obtained from our formalism share similarities with those obtained from the Ott-Antonsen approach. DART, however, is not restricted to phase dynamics, is specifically designed for finite networks, and potentially leads to new, revealing perturbative analyses. It thus seems worth exploring in more depth the mathematical relationships between the two methods. Moreover, a thorough investigation of the errors introduced by DART (e.g., as a function of the system size N) is still missing.
We believe that DART will contribute to solving harder problems in which the networks have more heterogeneous properties and the dynamics are driven by more complex functions. Using our approach for dynamics on real networks should provide new analytical insights into critical phenomena in a wide range of fields, from epidemiology [95] to neuroscience [27] and ecology [25].
This condition is sufficient for satisfying Eq. (B2). However, Eq. (B3) is satisfied only if the rank of M is N, which contradicts our first assumption that n < N.
We have proved so far that, in the context of dimension reduction, there exists at most one solution to Eq. (B1), namely M T M⁺. The existence of this solution remains non-trivial, however. To go further in our analysis, we need to impose an additional restriction on the matrix T.
Thus, let us suppose that T is real and symmetric, as are the matrices W, K, and A in Eqs. (24)-(26). Then, T possesses N real orthonormal eigenvectors of size 1 × N (i.e., row vectors). Let us choose ℓ ≥ n eigenvectors and form the ℓ × N matrix V whose µ-th row is the µ-th eigenvector of T. Then, V Vᵀ = I and V⁺ = Vᵀ.
Additionally, let us suppose that M can be factorized as M = CV, where C is a real n × ℓ matrix of rank n, so n ≤ ℓ. Simple manipulations then lead to the following conclusion: Eq. (B1) is satisfied if and only if Eq. (B4) holds, where Λ is the diagonal matrix whose µ-th diagonal element is the eigenvalue λ_µ associated with the µ-th eigenvector in V. In turn, Eq. (B4) is consistent when C⁺C = I, which is thus a sufficient criterion for Eq. (B1). But C⁺C = I is satisfied if and only if the rank of C is ℓ. We therefore need to impose that ℓ = n and that C is a non-singular matrix. All in all, we have proved that if the three conditions below are satisfied, then Eq. (B1) has a solution: (1) M = CV, where (2) C is a non-singular n × n real matrix and (3) V is an n × N real matrix composed of n real orthonormal row eigenvectors of T. Moreover, the solution is unique.

Any row eigenvector of the target matrix T can, in principle, be used to build the matrix V_T. However, to get observables that describe the large-scale dynamics of the system, the eigenvectors in V_T should have as many non-zero elements as possible. Moreover, to simplify the calculation of C_T, the matrix V_T should already be close to a reduction matrix, which is required to satisfy Condition B. Hence, the sign of the eigenvectors should be chosen to minimize the number of negative elements. This is easily done when the target matrix is either W or K, since all eigenvectors can be chosen to contain only non-negative elements. When the target matrix is A, there is only one eigenvector with non-negative elements, the Perron vector of A, which is associated with the largest eigenvalue of A. Thus, the Perron vector, once normalized, should always be included in V_A.
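These conditions are easy to probe numerically: for any random symmetric T, a matrix M = CV built from orthonormal row eigenvectors of T satisfies the compatibility equation M T = (M T M⁺) M exactly. A minimal sketch (the sizes N = 6 and n = 2 are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
N, n = 6, 2

# Random symmetric target matrix T and its orthonormal eigenvectors
T = rng.normal(size=(N, N))
T = (T + T.T) / 2
eigvals, eigvecs = np.linalg.eigh(T)
V = eigvecs[:, -n:].T            # n orthonormal row eigenvectors of T

C = rng.normal(size=(n, n))      # almost surely non-singular n x n matrix
M = C @ V                        # reduction matrix of the form M = C V

# Reduced target and compatibility equation M T = (M T M^+) M
T_red = M @ T @ np.linalg.pinv(M)
assert np.allclose(M @ T, T_red @ M)
```

The identity follows because V T = Λ V and pinv(CV) = Vᵀ C⁻¹ when the rows of V are orthonormal, so T_red = C Λ C⁻¹.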
b. Exact methods to compute C_T for a given V_T. There are two typical cases: (1) all elements of V_T are non-negative and (2) many elements of V_T are negative, but one row of V_T is non-negative.
If all the elements of V_T are non-negative, as for T = W or T = K, then the solution to Step 3 of Procedure 1 directly makes M_T satisfy Condition B. If V_T contains at least one row with non-negative elements, as for the case T = A when the Perron vector is included, then the solution to Step 3 takes the form given in Eqs. (C2)-(C4), where we have placed the non-negative eigenvector in the first row of V_T without loss of generality. The parameters α_µ are chosen to ensure the non-negativity of all elements (EV_T)_{µj}; a simple solution is given in Eq. (C4). When T = A, this method ensures the existence of at least one coefficient matrix C_A, which in turn implies that the reduction matrix M_A is positive and solves the compatibility equation for A exactly. Thanks to Eqs. (C2)-(C4), the method also provides an explicit formula for the calculation of C_A. However, the method does not confer any additional desirable property to M_A. For instance, M_A could be far from row-orthogonal [Condition A'] and could thus limit our capacity to interpret the linear observables Z_1, ..., Z_n. As a consequence, we need to compute C_A in another way.
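The sign-correction idea can be sketched as follows: since the Perron vector of a connected graph is strictly positive, adding a large enough multiple α_µ of it to any other eigenvector row makes that row non-negative. The choice of α_µ below (the smallest value that works) is our hypothetical stand-in for the paper's Eq. (C4):

```python
import numpy as np

# Adjacency matrix of a small connected graph (path graph on 5 nodes)
A = np.zeros((5, 5))
for i in range(4):
    A[i, i + 1] = A[i + 1, i] = 1.0

eigvals, eigvecs = np.linalg.eigh(A)
V = eigvecs[:, ::-1][:, :3].T      # rows = top-3 eigenvectors, dominant first
if V[0].sum() < 0:                 # orient the Perron vector to be positive
    V[0] = -V[0]

# E adds alpha_mu times the Perron row to each of the other rows
n = V.shape[0]
E = np.eye(n)
for mu in range(1, n):
    E[mu, 0] = max(0.0, np.max(-V[mu] / V[0]))   # hypothetical choice of alpha_mu

EV = E @ V
assert np.all(EV >= -1e-12)        # all rows are now (numerically) non-negative
```

Because V[0] is strictly positive for a connected graph, the division is well defined and the bound α_µ ≥ −V_{µj}/V_{1j} for all j guarantees non-negativity of each corrected row.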
c. Approximate method to compute C_T for a given V_T. In this paper, we rely on an approximate method that uses two matrix factorization algorithms. First, semi-nonnegative matrix factorization (SNMF) [97, 98] decomposes V_T as X_1 X_2, where X_1 is non-singular and X_2 is non-negative [Condition B2]. More precisely, the algorithm minimizes the mean squared error between V_T and X_1 X_2. We use the Python package PyMF to perform SNMF. Second, we use orthogonal nonnegative matrix factorization (ONMF) [99] to factorize the non-negative matrix X_2 as X_3 M_T, where X_3 is non-singular and M_T is as close as possible to a row-orthogonal non-negative matrix, which is a desirable property for the reduction matrix [Condition A']. To perform the calculations, we have modified and corrected the Python package iONMF. Starting from the eigenvector matrix V_T, we thus end up with a matrix M_T that is non-negative and exactly (or almost) row-orthogonal. The corresponding coefficient matrix is C_T = X_3^{-1} X_1^{-1}. Elementary manipulations finally yield a row-normalized M_T [Condition B1].
Appendix D: Calculation of C_{T_u} in Procedure 2

Step 4 of Procedure 2 does not specify how to find the coefficient matrix C_{T_u}. The reason for this lack of specificity is that there is no unique method for its computation. We have chosen an approximate method, similar to the one of Appendix C, because it leads to reduction matrices M_u that are almost row-orthogonal. The details are given below.
Let us define the matrix V_u from the chosen eigenvectors; the reduction matrix M_u in Eq. (39) is then given by Eq. (D1). In general, it is not possible to find C_{T_u} such that the reduction matrix M_u in Eq. (D1) is non-negative, normalized, and as close as possible to a row-orthogonal matrix [Conditions B and A']. However, we can consider M_u as a factor of V_u and find an approximate solution to Eq. (D1), i.e., Eq. (D3), which defines a matrix factorization problem where C_{T_u} and M_u are to be found. First, to satisfy Condition B2, we factor out a non-negative n × N matrix X_1 from V_u using semi-nonnegative matrix factorization (SNMF) [97, 98], where C_1 is a non-singular n × n matrix found with the SNMF algorithm [Eq. (D4)]. Second, to satisfy Condition A', we factor out a non-negative and almost orthogonal n × N matrix X_2 from X_1 using orthogonal nonnegative matrix factorization (ONMF) [99], where C_2 is a non-singular n × n matrix found with the ONMF algorithm [Eq. (D5)]. Third, to satisfy Condition B1, we factorize X_2 exactly to extract the reduction matrix M_u (now non-negative, almost orthogonal, and normalized) with a non-singular diagonal matrix C_3 whose µ-th diagonal value is Σ_{j=1}^{N} (X_2)_{µj}. Finally, substituting Eqs. (D4)-(D5) into Eq. (D3) and inverting the relation yields M_u ≈ C_3^{-1} C_2^{-1} C_1^{-1} V_u, where we have used the non-singularity of C_1, C_2, and C_3. The coefficient matrix is therefore C_{T_u} ≈ C_3^{-1} C_2^{-1} C_1^{-1}.

Other NMF approaches could be useful for the computation of the coefficient matrix. For instance, convex nonnegative matrix factorization (CNMF) seems particularly well adapted to our problem. Indeed, as reported in [97], CNMF can approximate a matrix with both positive and negative elements as the product of two matrices: a first matrix that convexly combines the elements of the original matrix and a second non-negative matrix whose rows are almost orthogonal. The impact of other NMF algorithms on the quality of DART will be investigated in another paper.
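The third, exact step is simple enough to sketch: with C_3 the diagonal matrix of row sums of X_2, the factorization X_2 = C_3 M_u yields a non-negative, row-normalized M_u. A toy example with made-up X_2 values:

```python
import numpy as np

# A non-negative, nearly row-orthogonal matrix standing in for X_2 (toy values)
X2 = np.array([[0.9, 0.8, 0.05, 0.0],
               [0.0, 0.1, 0.7, 0.6]])

# C_3 is diagonal with the row sums of X_2, so that M_u = C_3^{-1} X_2
C3 = np.diag(X2.sum(axis=1))
M_u = np.linalg.inv(C3) @ X2

assert np.allclose(M_u.sum(axis=1), 1.0)   # Condition B1: rows sum to one
assert np.all(M_u >= 0)                    # non-negativity is preserved
assert np.allclose(C3 @ M_u, X2)           # exact factorization X_2 = C_3 M_u
```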
Appendix E: Initial conditions to get chimeras

One important fact about the simulations of the Kuramoto-Sakaguchi dynamics is that the initial conditions leading to a chimera state are not known a priori. Apart from Ref. [100], there is little information in the literature about the initial conditions that lead to chimera states. In this paper, we drew multiple initial conditions from a uniform distribution for each point in the density space of the mean SBM. After some numerical exploration, we found that the number of observed chimeras per initial condition is not equal everywhere in the density space, as reported in [100]. These experiments raised the question: what are the initial conditions that allow the emergence of chimeras?
For given ρ, ∆, f, and α, the initial conditions that lead to a chimera state vary considerably, as shown in Fig. 14. The heat maps of initial conditions display diverse non-trivial patterns (for instance, spiral patterns), and the regions of initial conditions giving chimeras can be as large as the complete map or only a very small part of it. It is therefore a hard problem to make an informed choice of initial conditions to find chimeras.
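A brute-force scan of initial conditions can be sketched as follows. The toy graph (two fully connected communities with a hypothetical inter-community weight), the parameter values, and the crude chimera criterion are all illustrative assumptions rather than the paper's mean-SBM setup, so whether a chimera actually appears depends on them; the sketch only shows the classification loop:

```python
import numpy as np

def community_R(theta, idx):
    """Phase synchronization observable of one community."""
    return abs(np.exp(1j * theta[idx]).mean())

def simulate(theta0, A, omega, sigma, alpha, dt=0.05, steps=4000):
    """Euler integration of the Kuramoto-Sakaguchi model with N-normalized coupling."""
    theta = theta0.copy()
    for _ in range(steps):
        coupling = (A * np.sin(theta[None, :] - theta[:, None] - alpha)).sum(axis=1)
        theta += dt * (omega + sigma * coupling / len(theta))
    return theta

# Two fully connected communities of 10 nodes each, weakly interconnected
n = 10
A = np.ones((2 * n, 2 * n)) - np.eye(2 * n)
A[:n, n:] *= 0.2                 # hypothetical inter-community weight
A[n:, :n] *= 0.2
omega = np.zeros(2 * n)          # identical natural frequencies

rng = np.random.default_rng(3)
chimeras = 0
for _ in range(5):               # scan a handful of uniform initial conditions
    theta = simulate(rng.uniform(0, 2 * np.pi, 2 * n), A, omega,
                     sigma=1.0, alpha=1.45)
    R1, R2 = community_R(theta, slice(0, n)), community_R(theta, slice(n, None))
    if max(R1, R2) > 0.95 and min(R1, R2) < 0.9:   # crude chimera criterion
        chimeras += 1
```

Replacing the uniform sampling with a structured grid over initial phases would reproduce the kind of heat maps shown in Fig. 14.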