Revealing dynamics, communities and criticality from data

Complex systems such as ecological communities and neuron networks are essential parts of our everyday lives. These systems are composed of units which interact through intricate networks. The ability to predict sudden changes in the dynamics of these networks, known as critical transitions, from data is important to avert disastrous consequences of major disruptions. Predicting such changes is a major challenge as it requires forecasting the behaviour for parameter ranges for which no data on the system is available. We address this issue for networks with weak individual interactions and chaotic local dynamics. We do this by building a model network, termed an {\em effective network}, consisting of the underlying local dynamics and a statistical description of their interactions. We show that behaviour of such networks can be decomposed in terms of an emergent deterministic component and a {\em fluctuation} term. Traditionally, such fluctuations are filtered out. However, as we show, they are key to accessing the interaction structure.


I. INTRODUCTION
We are surrounded by a range of complex networks composed of many units forming an intricate network of interactions. Neuron networks form an important class of examples where the interaction structure is heterogeneous [1]. Because changes in the interaction can have massive ramifications on the system as a whole, it is desirable to predict such disturbances and thus enact precautionary measures to avert potential disasters. For instance, neurological disorders such as Parkinson's disease, schizophrenia, and epilepsy, are thought to be associated with an anomalous interaction structure among neurons [2]. As in the case of neuron networks, it is impossible to directly determine the interaction structure. Therefore, a major scientific challenge is to develop techniques using measurements of the time evolution of the nodes to indirectly recover the network structure and predict the network behaviour when the interactions change.
The literature on data-based network reconstruction is vast. Reconstruction methods can be classified into model-free methods and model-based methods. The former identify the presence and strength of a connection between two nodes by measuring the dependence between their time-series in terms of: correlations [3,4], mutual information [5], maximum entropy distributions [6,7], Granger causality, and causation entropy [8,9]. Such methods alone do not provide information on the dynamics, which is necessary to predict critical transitions. Model-based methods provide estimates (or assume a priori knowledge) of the dynamics and interactions, and use this knowledge to reconstruct the network structure. When the interactions are strong, the network structure can be recovered [10][11][12]. For a more extensive account of reconstruction (model-free and -based) methods see the reviews [10,13,14]. * deniz.eroglu@khas.edu.tr In many applications, the behaviour of isolated nodes is chaotic and the interaction is weak [1,15,16]. The network structure typically has communities and hierarchical organisations such as the rich-clubs [17]. As the interaction strength per connection is weak and the statistical behaviour of the nodes is persistent, the influence of each node on the network corresponds essentially to a random signal. Existing techniques fail to reconstruct a model from the data, as they require the interaction to be of the same magnitude as the isolated dynamics. In our setting, only the cumulative contribution of many links matter and the network signals decompose into a deterministic and a fluctuation term. The latter, which is usually filtered out, turns out to give crucial information on the network structure and is fundamental to our approach.
In this paper, we introduce the notion of an effective network which aims to model a complex system from observations of the nodes evolution when the network has a heterogeneous structure, the strength of interaction is small and local dynamics are highly erratic. This approach starts by reconstructing the local dynamics from observations of nodes with relatively few connections, and then recover the interaction function from observations of the highly connected nodes whose dynamics are the most affected by the interactions as a result of the multitude of connections they receive from the rest of the network [18,19]. A key achievement is that this reconstruction enables us to identify community structures also when the coupling is only weak. Moreover, it recovers enough information to forecast and anticipate the network behaviour, even in situations where the parameters of the system change into ranges that have not been previously encountered.

A. Complex networks of nonlinear systems
We consider networks with N nodes with chaotic isolated dynamics and pairwise interactions. The network is described by its adjacency matrix A, whose entry A ij equals 1 if node i arXiv:2107.02241v1 [nlin.AO] 5 Jul 2021 receives a connection from j and equals 0 otherwise. The time evolution of the state x i (t) of node i at time t is expressed as When performing reconstruction, the isolated local dynamics F i : M → M , the coupling function H, the coupling parameter α (that is small), the adjacency matrix A, and even the dimension k is the degree of the space M , are assumed to be unknown. These equations model important complex systems such as neuron networks [20], smart grids [21,22], superconductors [23], and cardiac pacemaker cells [24].

B. Main assumptions
Our three assumptions are: (a) the local dynamics are close to some unknown ergodic and chaotic map F (that is, F − F i ≤ δ, which is often the case in applications [25,26]). (b) The network connectivity is heterogeneous, which means that the number of incoming connections at a node i (given by its degree k i = j A ij ) varies widely across the network. k i is large for a few nodes called hubs. (c) α is such that, denoting by ∆ = max i k i the maximum number of connections α∆ is of the same magnitude of DF . Assumptions (a) and (c) imply that only the cumulative effect of the coupling is important. A prime example is the cat cerebral cortex which possesses inter-connected regions split into communities with a hierarchical organization as well as modular and disassortative rich-clubs. This network has heterogeneous connectivity, chaotic motion and weak coupling [27][28][29]. Other examples include the drosophila optic lobe network [30,31]. For a given dataset, our effective network first tests whether the underlying system satisfies assumptions (a) − (c), and, if so, reconstructs the model.
We assume the availability of a time series of observations where φ is a projection to a variable on which unit interactions depend. This situation occurs frequently in applications; as with measurements of membrane potentials in neurons.

II. EFFECTIVE NETWORKS RECOVER STRUCTURE AND DYNAMICS
To obtain an effective (reconstruction of the) network from observations, we combine statistical analysis, machinelearning techniques, and dynamical systems theory for networks. An effective network provides: local evolution laws and averaged interactions for each unit that, in combination, closely approximate the unit dynamics; a network with the same degree distribution and community structures as the original system. We use the term "effective" because it gathers sufficient data to reproduce the behaviour of the original network and predict its critical transitions.
Using our assumptions for the network and local dynamics, we can show that the evolution at each node will have low-dimensional excursions over finite time scales. More precisely, the evolution rule at node i is given by where F i ≈ F is the isolated dynamics, is the rescaled degree, and where µ is physical measure of the isolated dynamics. V takes into account the cumulative effect of interactions on node i. The true dynamics is influenced by a fluctuation term ξ i (t) that is small for an interval of time which is exponentially large and depends on the state of neighbours of the ith node. This low-dimensional reduction has been rigorously established in test cases (see [19]). See Appendix for further information.
The approximation described above applies to the measured state variable y i (t). First of all we pre-process the data according to the system under study (see Appendix E). The processed variable is still referred to as y i (t). Takens reconstruction tells us that y i (t + 1) is a nonlinear function of k + 1 past points y i (t), . . . y i (t − k), for a given number k provided by the approach. Here, we focus on the case when k = 1, which occurs in many real-world examples, and discuss cases with k ≥ 2 in Appendix E. This means that where g i = f i (y i (t)) + β i v(y i (t)), and v is the corresponding projection of effective coupling V .

A. Reconstruction procedure
An effective network is obtained in three main steps: Step 1: Reduced dynamics. We employ Takens reconstruction. If the time series is high dimensional, we discard it. Otherwise, once we are in the appropriate dimension, we estimate and learn the rule g i . We decompose g i as a linear combination of basis functions, tailored to the application. The parameters of the basis functions are obtained by performing a 10-fold cross-validation with 90% training and 10% test [32,33] (see Appendix A). As the dynamics is low-dimensional other techniques such as compressive sensing [34,35] or embedding [36] can be also employed.
Step 2: Isolated dynamics and effective coupling. We run a model-free estimation that coarsely classify nodes according to their degree by assigning to every pair of y i and y j a Pearson distance s ij ≥ 0 such that s ij ≈ 0 if the attractors of i and j are similar and s ij ≈ 1 if they are distinguishable. The higher the number of nodes with behaviour different from i, the larger the intensity S i = j s ij . Low degree nodes have typically small S i while for hubs this quantity is large. Notice that for the low-degree nodes, αk i v is negligible and the dynamics at the low-degree nodes are close to f . Therefore we use g i at the identified low degree nodes to obtain an approximation for f ≈ g i , while g i at hub nodes allows to estimate β i v ≈ g i − f . We estimate β i by Bayesian inference.
Step 3: Network structure and communities. Since β i = αk i , we can recover the network's degree distribution from β i . Then, having the local rules g i , we can decompose the time series in terms of a low-dimensional deterministic part and the fluctuation term ξ i , and use this last term to recover community structures. If nodes i and j interact with the same nodes, they are subject to the same inputs and the correlation Cor(ξ j , ξ i ) is high. If not, Cor(ξ i , ξ j ) is nearly zero due to the decay of correlations in the deterministic part. Thus, Cor(ξ j , ξ i ) is high when nodes i and j have high matching index (high fraction of common connections), and are likely to belong to the same cluster. Given the matrix ρ ij = Cor(ξ i , ξ j ), we estimate the adjacency matrix A by thresholding the correlation matrix as A ij = Θ(ρ ij > τ ), where Θ is a Heaviside step function and the value of the threshold τ between 0.3 and 0.6. We then apply the modularity-based Louvain method [37] on A to detect communities.
That Cor(ξ j , ξ i ) is high when nodes i and j have high matching index, is true for generic coupling as shown by the following argument. In general, the coupling function is a sum of terms h(x, y) = u(x)v(y). This leads to noise terms where µ is the physical measure of the local dynamics. Given i and j the sum can be split into common connections to i and j and to the independent connections: where w is the noise due to the common connections (notice that w has zero mean), and ζ i , ζ j depend on different coordinates and can be assumed to be uncorrelated. Omitting the time index t, the covariance of ξ i and ξ j is After some manipulations, we obtain so, if u(x)dµ(x) = 0, the correlation between the noise will vanish even though they have a common term. Thus, the above scheme is able to recover communities if v = 0. If this condition is not met, the network reconstruction via the g i 's is not possible. We remark that v = 0 is a special condition on the coupling that is destroyed by small perturbations. It is crucial that the correlation analysis is restricted to fluctuations ξ i . Since the variance of the deterministic part of y i is larger than that of the small fluctuations ξ i , performing a direct correlation analysis between y i and y j hides all the contributions coming from the covariance between ξ i and ξ j . Consequently, the correlation of the deterministic part is close to zero due to the chaotic dynamics, as shown in Appendix A.

B. Benchmark model for the isolated dynamics
We present the effective network methodology applied to networks of neurons. We use synthetic time-series where each neuron is simulated using the Rulkov model, which has two variables, u and w, evolving at different time scales as de- The fast variable u describes the membrane potential and is the state variable measured by the observed time series y i (t), while w describes the slow currents. Different combinations of parameters σ and β give rise to different dynamical states of the neuron, such as resting, tonic spiking, and chaotic bursts. To test our procedure we considered two cases: σ = ν = 0.001 and β = 5.9, which correspond to tonic spiking, and β = 4.4 which correspond to bursting. As for the coupling, we consider chemical synaptic coupling, that is, In the chemical coupling, V s is a parameter called reverse potential. Choosing V s > u i (t), the synaptic connection is excitatory. We take V s = 20, Θ s = −0.25, and λ = 10. In addition to Rulkov maps, we show in the Appendix E that the approach performs well on a wide range of nonlinear local dynamics such as: doubling maps, logistic maps, Spiking Neurons, Henon maps. We also provide performance analysis for Rössler oscillators in Section III of the Supplementary Material.

III. REVEALING COMMUNITY STRUCTURE: THE RICH-CLUB MOTIF
We focus on the network structure of the cat cerebral cortex [29]. The network contains 53 meso-regions arranged in four communities that follow functional subdivisions; visual (16 nodes), auditory (7 nodes), somatomotor (16 nodes) and frontolimbic (14 nodes), as shown in Fig 1 (a). Some cortical areas (hubs) form a hidden layer called a rich-club and are densely connected to each other and the communities. A set of nodes form a rich-club if their level of connectivity exceeds what would be expected by chance alone. The maximum number of connections in this network is ∆ = 37.
The regions and their connections were discovered by using datasets from tract-tracing experiments [27,28]. The network obtained is weighted. For simplicity and to improve the performance in detecting communities, we turn the network into an undirected simple graph [29]. We simulate each mesoregion as a neuron interacting via electrical synapses and obtain a multivariate data {y 1 (t), y 2 (t), . . . , y N (t)} for a time T = 5000. For simplicity, we will denote y i = {y i (t)} T t=0 .

A. Comparison with previous approaches
For comparison, we recover the network using two widely employed approaches: functional networks [38][39][40], and sparse recovery techniques [10,34]. The intuition behind the functional network approach is that nodes with similar time series have similar characteristics. The functional network can be constructed by the matrix of similarities between nodes via statistical analysis [41,42]. As a measure of similarity, we employ a covariance analysis between the time series. The functional network cannot detect communities in this case since the time-series at different nodes are essentially uncorrelated ( Fig. 1 (b)). Other similarity measures give no significative improvement. See Appendix B for the details.
The key idea in sparse recovery techniques is to write the dynamics as a linear combination of basis functions with unknown coefficients, and the presence of a link is determined when any coefficient of the corresponding interaction is nonzero. Thus a link is present if the estimated coefficient corresponding to the link is above a given threshold σ.
We implemented the sparse recovery method to our bench-mark model when the strength of each connection is of order α ≈ 0.015. Hence we have chosen values of σ close to this value. The reconstructed network does not identity the clusters correctly as can be seen by comparing the blue and red markers in Figure 2. In the cases that we are studying here each individual link provides a negligible contribution and only the cumulative effect of many links is relevant. The coefficients to be recovered are close to zero, and cannot be distinguished from zero terms. A discussion on sparse recovery can be found in Section I of Supplementary Materials.

B. Community structure via effective networks
Remarkably, the effective network is able to recover the community structures ( Fig. 1 (c)). Using Step 1, 2, and 3 we obtain a model for the isolated dynamics, coupling function, distribution of degrees, and correlations Cor(ξ i , ξ j ). To apply the method of community detection in [37], we threshold the matrix of correlations, Fig. 1 (c), considering nodes i and j linked only when the correlations were greater than 0.5. We test threshold values ranging from 0.3 to 0.6 and obtained the similar results as the distribution of the entries of the matrix of correlations is unimodal and has a peak near 0.5. We use the algorithm in [43] to compute the rich-club coefficients for each node. The coefficient depends on the degree and is a number between 0 and 1. We assigned to the rich-club the nodes with coefficient at least 0.8. As shown by Figure 1 (d), the effective network methodology is able to classify the nodes in the network according to their function. Notice that our model predicts the presence of a link between two nodes i and j when Cor(ξ i , ξ j ) is high. Since every node makes most of its interactions within a cluster, two nodes with highly correlated fluctuations ξ(t) are likely to belong to the same community, and this can be enforced in the effective network by adding a connection between them.

C. Performance of the communities reconstruction
To quantify the effectiveness of community reconstruction, we compute P E = m N , where N is the total number of nodes and m is the number of nodes assigned to the wrong community. We compute P E for ∆α between 0.05 and 0.4. For each value of α, we considered 50 different simulations by choosing different initial conditions. The figure shows the plot of the mean of P E and a shaded region corresponding to the standard deviation. For ∆α values larger than 0.4, the reconstruction procedure cannot identify the communities correctly as synchronization rich club which appears.
In the Appendix E, we analyzed synthetic networks with 100 nodes which are undirected and have a rich-club structure. We used them as benchmark to evaluate the success of the reconstruction. The ability of the reconstruction procedure to recover the community structure was tested for various coupling functions and isolated dynamics.

IV. PREDICTING CRITICAL TRANSITIONS IN RICH-CLUBS
The ability to reconstruct the network and dynamics from data can be exploited to predict critical transitions that may occur when the coupling strength varies. This is crucial for applications. For example in the cat brain, a transition to col- Prediction error for misidentification of communities in the reconstructed cat cerebral cortex from synthetic data. For each realisation, the chosen parameters are the same as in Figure 1 and only the overall coupling is changed. Mean and standard deviation of prediction error (PE) computed for the network over 50 realizations for each value of α. If ∆α > 0.42, the system synchronizes and the procedure cannot reconstruct the community structures.
lective dynamics in the rich-club has drastic repercussions for the functionality of the network [29,44].
The goal is to obtain and predict the onset of collective motion in the rich-club from data recorded when the network is far from a collective dynamics. The effective network can predict the onset of such collective dynamics based on a single multivariate time series for fixed coupling strength in a regime far from the synchronized state. We analyze timeseries obtained simulating the dynamics for ∆α = 0.3, and reconstruct the network structure and the isolated dynamics.
Transitions to synchronization between the scale variable is possible while the fast spikes remain out of synchrony [45]. Notice that the slow variable w changes on a scale 1/µ. In the present setting we have 1/µ = 10 3 which is about number of points we need to apply the approach. Thus, for such short time series we can neglect the slow scale. This is also an advantage of this present approach. To estimate the transition to burst synchronization, we obtain the slow variable as a filter over the membrane potential (fast variable). Since we measure the membrane potential y i (t) = u i (t), the slow variable is given as z i (t) = µ t k=1 (y i (k) − σ) and for a choice µ and σ this can be identified with the slow variable of the model w.
In Appendix C, we derive the following equation for the slow variable of a node in the rich club: where λ = 1.42 is estimated from the data. The equation can be used to analyze the effect of the network connectivity on the dynamics. We can use the data on the network and the dynamics recovered from the time-series recorded at ∆α = 0.3 to predict that at the value ∆α ≈ 0.42 the rich-club will develop a burst synchronization (details in Appendix C).
To capture a transition to a synchronized state, we introduce a phase θ j (t) for the slow variable. To define θ j (t), we first smooth the time series [46]. Then, we find the time t n of local maxima as the nth maximum point of the slow variable. We  Fig. 1. For values in the grey shaded region, r is increasing towards close to one and the rich-club exhibits collective behavior. We can predict the critical coupling αc (standard deviation in shaded region) by studying the effective network obtained from a time series measured at ∆α = 0.3.
introduce the phase variable θ as as shown in Ref. [47]. We then compute the order parameter A small value of the order parameter, r ≈ 0, means that no collective state is present, whereas r(t) ≈ 1 means that the bursts are synchronized. Figure 4 shows that behaviour of r as a function of the coupling. The rich-club undergoes a transition to burst synchronization at ∆α ≈ 0.4 that corresponds to an increase of roughly 40% of the coupling strength and is close to the predicted value ∆α ≈ 0.42. In Appendix E, we show other examples where the local dynamics is chaotic.

V. OBTAINING A STATISTICAL DESCRIPTION OF THE NETWORK
The effective network can provide statistical description of the network structure. To illustrate this, we reconstruct the statistical properties of scale-free networks.

A. Scale-free networks of coupled bursting neurons
We consider coupled bursting neurons with excitatory synapses [45] in scale-free networks. A scale-free network has degree distribution P (k) = Ck −γ , where γ > 0 is the characteristic exponent and C is a normalising constant. We generate a scale-free network with N = 10 4 nodes such that the probability of having a node of degree k is proportional to k −γ , where γ = 2.53. We use a random network model which is an extension of the Erdös-Rényi model for random graphs with a general degree distribution. More details are provided in Appendix D.
For this reconstruction we only need 2000 data points for each node. Again, to every pair of time series y i and y j we assign a Pearson distance s ij ≥ 0 and the node intensity S i = j s ij . The empirical distribution of the intensities S i approximates the degree distribution of the network, see the second inset of Fig 5(a). In the example here, the estimated structural exponent from the distribution of S i is γ est = 3.1, which yields a relative error of nearly 25% with respect to the true value of γ (see the plots in Figure 5 a)). The functional network therefore overestimates γ, which has drastic consequences for the predicted character of the network. For example, the number of connections of a hub for a scale-free network is concentrated at k max ∼ N 1/(γ−1) , so the relative inaccuracy for the estimate k est of the maximal degree is k max /k est = N 1/γ−1/γest , which is about 500%. Such inaccuracy has important repercussions for the ability to predict the emergence of collective behaviour [19,48].
The statistical measures used for the construction of a functional network typically depend in a nonlinear way on the degrees, thus causing a distortion in the statistics. We will discuss the case of Pearson distance. Suppose that the signals {(y i (t), y i (t + 1))} are purely deterministic, y i (t + 1) = g i (y i (t)). The Pearson distance s ij between the signal at i and j is a number between 0 and 1, depending on how close these graphs are. This distance depends nonlinearly on the degrees k i and k j . Devising another distance s ij without knowledge of the interaction, in general, still carries the nonlinear dependence on the degrees. Once fluctuations from the network are included the differences between time-series can be due to fluctuations rather than differences in the degrees. The decomposition of the rules in terms of interactions and fluctuations is essential to recover degree distribution accurately.
The effective network provides a better statistical description of the network structure. To compare with the functional network approach, we constructed an effective network of the same system tested for the functional network. The estimate for γ from the effective network is γ est = 2.55, which has an error of only 1% (inset one of Fig. 5 (a)). We repeat the analysis on a different network with different parameters γ in the degree distribution. The estimated γ est values are shown in Fig. 5 (c) as a function of the true parameter γ. The relative error on the estimated exponent is within 2%.

B. Performance of the degree distribution reconstruction
In Appendix E, we present additional simulations showing how accurate the degree distribution is reconstructed for various isolated dynamics. In particular in Figure 3 we show the results for a) doubling maps with diffusive coupling, b) logistic maps with Kuramoto interactions, c) spiking neurons with electrical coupling, and d) Hénon maps with the y-component diffusive coupled with the x-component. Moreover, in Section III F we show the performance of the reconstruction for a system of differential equations coupled on scale-free networks. Reconstruction of structural power-law exponents γ of scale-free networks from data. We estimated γ from the multivariate time series obtained from the dynamics random scale-free networks with degree distribution P (k) ∝ k −γ . The plots in panel (a) compare the functional and effective network approach. We obtain better estimates using the effective network. Panel (b) shows the degree distribution of the original system (in blue) and that estimated from an effective model (in red) for the neural network in the optical lobe of Drosophila melanogaster. We obtained an accuracy of 3% in the structural exponent γ. Panel (c) shows the true exponent γ versus γest obtained with an effective network from data for spiking neuron coupled with chemical synapses. We generated 1000 networks with distinct γ, from which the γest estimate is within 2% accuracy.
We provide a study on the effects of noise in the reconstruction. We established that for stochastically stable [49] systems such that the doubling map if the noise amplitude η 0 satisfies η 0 < αk min , where k min is the minimal degree the reconstruction procedure works. When the noise amplitude is of order αk i nodes with degree less than k i cannot be estimated. We applied our method to data simulated from the neuronal network in the Drosophila Melanogaster optic lobe, which constitutes >50% of the total brain volume and contains 1781 nodes [30]. The degree distribution has a power-law tail [31]. We used spiking neurons with chemical coupling to simulate the multivariate time series, from which we constructed an effective model and estimate the degree distribution ( Fig. 5 (b)).

D. Experimental data of optoelectronic oscillators
We now apply our effective network to experimental data of networks of optoelectronic oscillators whose nonlinear component is a Mach-Zehnder intensity modulator. This data was generated in Ref. [50] where the authors studied enhancement of synchronization by structural changes in the network. The the experimental setup can also be found in Refs. [50,51]. Each element consists of a clocked optoelectronic feedback loop. Light from a 780 nm continuous-wave laser is nonlinearly transformed as it passes through the Mach-Zehnder intensity modulator. Light intensity is converted into an electrical signal by a photoreceiver and measured by a field-programmable gate array (FPGA) via an analog-todigital converter. The FPGA is clocked at 10 kHz, resulting in the discrete-time map dynamics of the oscillators. The FPGA controls a digital-to-analog converter that drives the modulator with a voltage x i (t + 1) = βI(xi(t)), closing the feedback loop. The elements are coupled electronically on the FPGA according to the desired coupling matrix as described in detail in Ref. [51]. The system can be modeled as where t is discrete time, β is the feedback strength, I(x) = sin 2 (x + δ) is the normalized intensity output of the Mach-Zehnder modulator, x represents the normalized voltage applied to the modulator, and δ is the operating point set to π/4. The data is acquired for β = 4.5 and 17 elements coupled through the network presented in Figure 6 left panel. The coupling strength σ varies from 0 to 1 in steps of 0.0325 starting from 0.015625. For each fixed value of σ, we obtain the experimental multivariate time series {x 1 (t), · · · , x 17 (t)} 15385 t=1 . We discard the first 5000 data points for each i = 1, · · · , 17 as a transient. We will provide an analysis for the coupling σ = 0.03125. First, we perform a functional network analysis by considering a correlation matrix Σ x of the multivariate time series. To obtain a model of the adjacency matrix we threshold Σ x . The value of the threshold 0.02 is chosen such that the functional network has a mean degree close to the actual network. The result is shown in Figure 6 in the middle panel and as observed the functional network does not capture the actual network structure.
Next we employ the effective network. We start by applying Step (1) to learn the function g i and Step (2) from where we obtain the degrees and coupling strength. Once we obtain g i , we filter the determinist part from x i to obtain the fluctuations ξ i . Next, we compute correlation matrix Σ ξ for the fluctuations ξ i . To turn this matrix into a network, we threshold it. Again the value of the threshold is fixed such that the mean degree is closed to the actual network. Here, any threshold value from 0.07 to 0.1 works. The result is shown in Figure  6, in the right panel, and shows excellent agreement with the actual network. In fact, only two links are misidentified.
We also performed the analysis for further coupling strengths σ. For large coupling strengths both functional network and effective network will capture the network misiden-tifying on average 4 links. In these cases, the effective network has the advantage that it provides in addition to a model of the adjacency matrix also a model for the local dynamics.

VI. CONCLUSIONS
We have introduced an effective network obtained from time-series of a complex network observing the dynamics at each node. Our method complements the existing ones in two ways: First of all, it encompasses the case of chaotic local dynamics at each node. Secondly it deals with weak coupling among the nodes. Both cases are commonly found in applications [1,15,16]. Key to the success of the reconstruction is the heterogeneity of the network which allows us to perform a multi-level reduction. To recover the community structures, we use that certain noise terms associated with the time series at two nodes in the same community are correlated. By collecting data when the network is far from critical transitions, an effective network enables us to predict a critical transition.
We have compared our procedure with methodologies most relevant for the systems considered. We have excluded results tailored to specific setups or dynamics (binary dynamics [52], and see [10] for a review). We did not consider methods that rely on measurements obtained by intervening on the system with controlled inputs [13] and restrict our attention to time-series recorded under constant conditions. When the coupling is strong, sparse recovery can be applied [34]. When the coupling is weak sparse recovery cannot distinguish small parameters from those that are identically zero thus misidentifying connections between nodes. Also model-free methods are ill-suited as the influence of a single pairwise interaction on the time-series is weak and can hardly be detected.
The effective network methodology performs well when the network is heterogenous and has a few nodes making a large number of connections while most of the nodes are less connected, and the local dynamics are chaotic and their typical orbits visit most of the phase space. The effective network approach did not perform well in two cases. The first is when most of the observed time-series take values on a very restricted part of the phase space, for example if the local dynamics has a singular attractor, as an attracting fixed point, or if it spends long periods of time in a small region, like around the fixed points of the classical Lorenz attractor. This means that we don't have access to a big portion of phase space, and no prediction is possible in those regimes of coupling strength that make these portions accessible. The passage near a fixed point also suppresses the fluctuations hindering the reconstruction of communities. This is what seems to happen for example in the bursting dynamics of Rulkov maps, when the quiescent state is too long. These situations are excluded if the local dynamics is sufficiently chaotic. The second case is when the coupling is strong enough to synchronize big parts of the network. For example, a synchronous rich-club can send similar forcing to nodes in different communities resulting in high correlations between the fluctuations. Therefore our method would identify these nodes as belonging to the same community even if they are not.
Appendix A: Effective network representation from data A summary of the effective network approach is given in Figure 7. Here we include some details that were omitted for the sake of presentation in the main text.
In Step 2 of the reconstructing procedure, we identify low degree nodes by analysing the distribution of S i . More precisely, we use the top N top nodes of the highest intensity to obtain a proxy for the isolated dynamics. We then average these rules to get g ≈ f . The choice of N top is not fixed and depends on the number of nodes and the fluctuation For scale-free (Barabasi-Albert) networks the degree of the hubs scales as N 1/2 , a good heuristic is to choose N top satisfying σ 2 g /N 1/2 top 1. The effective coupling function αk i v can be obtained analysing the family which can yield the shape of v up to a multiplicative constant via a nonlinear regression by imposing that g i − g and g j − g are linearly dependent. The choice of the base function for the fitting is supervised (see Appendix E).
In Step 3, after selecting a v that satisfactorily approximates g i − g up to a multiplicative constant over all indices i, the parameter β i is estimated using a dynamic Bayesian inference. Because the fluctuations ξ i (t) are close to Gaussian, we use a Gaussian likelihood function and a Gaussian prior for the distribution of the values of β i , and hence obtain equations for the mean and variance. We split the data into epochs of 200 points and update the mean and variance iteratively.

Community structures
Once we obtain the rules g i , we filter the deterministic part of the time series y i and access the fluctuations ξ i (recall Eq. (2)) and decompose it as ξ i = ξ c i + ξ o i where ξ c i is the fluctuation of the local mean field from nodes in the cluster containing i, and ξ o i is the contribution from outside the cluster. Since a node makes most of its connections within its cluster, ξ c i ξ o i with high probability, and thus if i and j belong to the same cluster Corr(ξ i , ξ j ) = Corr(ξ c i , ξ c j ). The common noise is generated by the common connections between nodes i and j. For fixed isolated dynamics and coupling function Corr(ξ i , ξ j ) is related to the matching index [29] of the nodes i and j. This is a parameter used to quantify the number of common neighbours of two nodes. Recall that the degree of node i is k i = N j A ij and counts the number of neighbours it has. Consider the neighbourhood of node i, Γ(i) = {j ∈ {1, . . . , N } | A ij = 1}. This is the set of nodes that shares an edge with the node i. The matching index of nodes i and is the cardinality of the overlap of their neighbourhoods µ i = |Γ(i) ∩ Γ( )|. We consider the normalised matching index: FIG. 6. Effective network from experimental data of networks of optoelectronic oscillators. We consider multivariate time series of voltages of a network of 17 weakly coupled optoelectronic oscillators (interaction corresponding to 1% of the oscillator amplitude). In the left panel, we show the actual network used to coupled the optoelectronic oscillators in Ref. [50] as adjacency matrix in a) and graph representation in d). In the mid panel, we show the reconstruction of the network by a functional network analysis in terms of its adjacency matrix in b) and graph representation in e). In the right panel, we show the reconstruction of the network from an analysis of the dynamical fluctuations by applying the effective network approach as adjacency matrix in c) and graph representation in f). The effective network provide a striking reconstruction and only two links are misidentified and are indicated in f as red links. In the graph representation, the nodes of the network are coloured according to the community obtained by a community detection algorithm [37]. Step 1: machine learning techniques & dynamical systems theory allows to obtain rules Step 2: heterogeneity in the rules allows us to obtain the interaction Step 3: reconstruction of highly chaotic oscillator network from data Under the assumption that such rules change from node to node depending on their connectivity, we estimate the coupling function. Using the fluctuations of the time series with respect to the low-dimensional rules, we recover the community structures. Gathering all this information, we obtain an effective network that can be used to predict critical transitions .

Local Interactions
or equivalently in terms of the adjacency matrix Clearly µ i = 1 if and only if i and l are connected to exactly the same nodes,and µ i = 0 if they have no common neighbours. It is well known that in the cat cerebral cortex nodes in the same community have a high matching index while nodes are distinct communities has a low matching index. This tends to be typically in modular networks [29]. For nodes in distinct clusters the component ξ c i ≈ 0, so Corr(ξ i , ξ j ) ≈ 0. We recover the network structure from a noise covariance analysis.
Filtering out the deterministic part plays a major role in recovering community structures. Suppose we have two signals of the form y i (t) = Y i (t) + ζ(t), i = 1, 2, where Y i is independent of i and ζ(t) is a common noise term. Y i represents the superposition of the deterministic chaos and the independent fluctuations. For the correlation, we have Hence, the large values of the variance of the time series (σ yi ≈ σ Yi σ ζ ) suppress the contribution of the common noise, and an analysis solely based on the the original time series y i will overlook the common noise contribution.

Appendix B: Functional networks
For networks of chaotic oscillators, building the functional network from the standard Pearson correlation between time series gives no meaningful results because of the decay of correlation intrinsic to dynamics. Functional networks are built using a Pearson distance s ij ≥ 0 describing the proximity of the dynamics at two nodes i and j. To do this, we consider the time series z i (t) := (y i (t), y i (t + 1)), t = 0, . . . , T − 1 reordered in z lex i (t) according to the lexicon order; that is, according to the magnitude of the first component of z i (t). Then, let r ij be the Pearson correlation, r ij = Cor(z lex i , z lex j ), so that r ij = 1 indicates that the attractors at nodes i and j agree. Define the Pearson distance s ij = 1 − |r ij | so that s ij = 0 indicates agreement of the dynamics and s ij > 0 measures the difference between the attractors.
The intensity S i = j s ij approximates how many nodes have a dynamical rule different from i and helps to distinguish between poorly connected nodes and hubs. Since most of the network is composed of poorly connected nodes, they exhibit a smaller S i than high-degree nodes, which are scarcer and have different dynamics from the low-degree nodes.

Appendix C: Predicting critical transitions
Here we explain how to gather the information for a theoretical prediction of the critical transition. Reduction in the rich-club. Nodes in the rich-club have degrees of approximately ∆ and make κ∆ connections inside the rich-club and (1 − κ)∆ connections to the rest of the network. Following our reduction scheme, the interactions within and outside the rich-club can be described by the expected value of the interactions with respect to the invariant measure associated with each of them. Let C denote the set of nodes in the rich-club, then the coupling term is where µ is the invariant measure for the nodes outside the richclub. Hence, for the rich-club we obtain where q i (x i (t)) = F i (x i (t)) + (1 − κ)∆α H(x i , y)dµ(y).
Predicting the transition to collective behaviour. Let us recall that when isolated u i (t + 1) = F 1,i (u i (t)) + w i (t) where F 1,i ≈ F 1 , w i (t + 1) = w i (t) + µ(w i (t) − 1), and w(t + 1) = w 0 + µ t n=0 (u(n) − 1)) (C1) Using the reduction Eq. (C1), in the network we obtain where i denotes the ith nodes in the rich-club, u is the mean in the rich-club and ξ i are fluctuations. We fix two nodes y i = u i and y j = u j in the rich-club and consider Using that F 1,i ≈ F 1 by the mean value theorem we obtain ζ(t + 1) = DF 1 (x i (t))ζ(t) + µ and considering t n=0 DF 1 (x i (n))ζ(n) ≈ λ t n=0 ζ(n) where we used that t n=0 ζ(n) is a slow variable. We obtain η(t + 1) = (λ − ∆α)η(t) + µ t n=0 η(n) For the cat cerebral cortex ∆ = 37. Given the time series {y i } for ∆α = 0.3, we estimate F 1 using our method i as the slow variables are constants over short time scales, and the obtain slow variables as a filter over the fast variables. From the data, we estimate λ = 1.42 and thus we obtain ∆α = 0.42. At this critical value the slow variables tend the stay together due to the contraction in the dynamics. This is related to the onset of synchronization in the bursts, which is captured via a phase variable through the order parameter.
For estimation of the power-law distribution parameters, we use the maximum likelihood estimator [53,54]. After that, we test the reliability between the data and the power law by using the goodness-of-fit method. If the resulting p-value is larger than 0.1, the power-law estimation is an appropriate hypothesis for the data. A complete procedure for the analysis of power-law data can be found in Ref. [55].