Machine learning prediction of critical transition and system collapse

To predict a critical transition due to parameter drift without relying on model is an outstanding problem in nonlinear dynamics and applied fields. A closely related problem is to predict whether the system is already in or if the system will be in a transient state preceding its collapse. We develop a model free, machine learning based solution to both problems by exploiting reservoir computing to incorporate a parameter input channel. We demonstrate that, when the machine is trained in the normal functioning regime with a chaotic attractor (i.e., before the critical transition), the transition point can be predicted accurately. Remarkably, for a parameter drift through the critical point, the machine with the input parameter channel is able to predict not only that the system will be in a transient state, but also the average transient time before the final collapse.

In nonlinear and complex dynamical systems, a catastrophic collapse is often preceded by transient chaos.For example, in electrical power systems, voltage collapse can occur after the system enters into the state of transient chaos [1].In ecology, slow parameter drift caused by environmental deterioration can induce a transition into transient chaos, after which species extinction follows [2,3].A common route to transient chaos is a global bifurcation termed crisis [4], at which a chaotic attractor collides with its own basin boundary, is destroyed and becomes a chaotic transient.In the real world, the accurate system equations are often unknown but only measured time series are available.Model-free and data driven prediction of the critical transition and system collapse in advance of their occurrences, while the system is currently operating in a normal regime with a chaotic attractor, has been an outstanding problem.If the underlying equations of the system have a simple mathematical structure, e.g., with the velocity field consisting of power series or Fourier series terms only, then sparse optimization methods such as compressive sensing can be exploited to identify the system equations [5,6] and consequently to predict transitions.Here, we assume that our system does not meet this condition and set out to develop a machine learning framework to predict the critical transition.
A closely related problem is to determine if the system is already in a transient state -the question of "how do you know you are in a transient?".In nonlinear dynamics, this is one of the most difficult questions because the underlying system can be in a long transient in which all measurable physical quantities exhibit essentially the same behaviors as if the system were still in a sustained state with a chaotic attractor.Applying the traditional method of delay coordinate embedding [7] to such a case would yield estimates of dynamical invariants such as the Lyapunov exponents and fractal dimensions, but it would give no indication that the system is already in a transient and so an eventual collapse is inevitable.Developing a model-free, purely data-driven predictive paradigm to address this problem is of considerable value to solving some of the most pressing problems in the modern time.Due to global warming and climate change, some natural systems may have already been in a transient state awaiting a catastrophic collapse to occur.A reliable determination at the present that the system has already passed the critical transition or a "tipping" point to a transient state would send a clear message to policy makers and the general public that actions must be taken immediately to avoid the otherwise inevitable catastrophic collapse.
In this Letter, we develop a machine learning framework to predict critical transition and transient chaos in nonlinear dynamical systems.We exploit reservoir computing, a class of recurrent neural networks [8][9][10][11], a research area that has gained considerable momentum as a powerful paradigm for model-free prediction of nonlinear and chaotic dynamical systems [12][13][14][15][16][17][18][19][20][21][22][23][24][25][26][27].Our articulated reservoir computing structure differs from the conventional one in that we designate an additional input channel for the bifurcation parameter, as shown in Fig. 1(a).The basic idea of our framework is explained in Fig. 1(b), a schematic bifurcation diagram of a typical nonlinear system.In the green region, there is a chaotic attractor together with periodic windows, in which the system functioning is normal.A catastrophic bifurcation occurs at the critical parameter value defining the end of the green region, where the chaotic attractor is destroyed through a crisis transition.For a parameter value slightly beyond the critical point, there is transient chaos leading to collapse.Suppose that, currently, the system operates in the normal regime.Given a certain amount of parameter drift, the two goals are: (i) to predict the transition point so as to determine whether the system will be in a transient chaotic regime heading to collapse and (ii) if yes, on average how long the system could survive, i.e., to predict the average lifetime of the chaotic transient.Because time series data from multiple parameter values are needed, it is necessary to specify the parameter value at which the data are taken -a task that can be accomplished by inputting the parameter value to all nodes in the underlying neural network.We demonstrate that our proposed machine learning framework can accomplish the two goals with three examples: an electrical power system susceptible to voltage collapse through transient chaos [1], a three-species food chain model [2] in ecology in which a catastrophic transition leads to transient chaos and then species extinction, and the Kuramoto-Sivashinsky system [28,29] in the regime of transient spatiotemporal chaos [30].We show that, training a reservoir network of reasonable size, e.g., 1000 nodes, with time series data taken from three parameter values in the normal chaotic regime, the machine is able to predict not only the collapse point but also transient chaos for parameter values beyond the critical point.A remarkable feature is that, after the critical transition, the probability distribution of the transient lifetime of the machine generated trajectories from an ensemble of initial conditions agrees with the true distribution, indicating that, for a given parameter drift into the transient chaos regime, the machine is able to predict, on average, how long the system can "survive" before collapse.
For simplicity, we consider a catastrophic transition caused by variations in a single parameter.(Situations of multiple parameters can be treated with multiple parameter channels.)As shown in Fig. 1(a), the input layer maps the low-dimensional time series data u(t) into the high-dimensional hidden state r(t) through a matrix W in , and the output layer maps r(t) back into low-dimensional time series v(t) through another matrix W out .The input parameter channel is connected to every node of the reservoir network via the matrix W b .The reservoir network adjacency matrix A transfers information from the hidden states r(t) at t to the next time step r(t + ∆t).The dynamical evolution of the reservoir computing machine is described by r(t , where α is the leakage parameter, k b and b 0 are two hyperparameters determining the input of the bifurcation parameter into the reservoir network.Training is administered to adjust the output matrix W out to minimize the difference between v(t) and u(t + ∆t), so that the reservoir can predict the evolution of the target system into the future with input of the dynamical variables from the past.The matrices W in , W b and A are chosen a priori and fixed during the training After training, prediction can be made either for parameter values before the critical point (still in the green regime) or after, where in the former, the machine will predict that the system is safe (b1) but, importantly, for the latter, the machine will predict that the system will eventually collapse after transient chaos (b2).
and prediction phases for efficiency.(A more detailed description of the training and predicting processes is provided in Sec.I of Supplementary Information (SI) [31].)We train the reservoir machine using time series from a few distinct parameter values -all in the normal or safe regime where the system still possesses a chaotic attractor, as shown in Fig. 1(b).Because of the additional parameter input channel, the machine trained with data at different parameter values will gain the ability to "sense" the variation in the parameter and the associated changes in the time series data from different parameter values.Such a well trained machine is a high-dimensional representation of the original dynamical system.Finally, to predict with what parameter drift the system will exhibit transient chaos and then collapse, we simply input the parameter value of interest into the parameter input channel.Then, at each time step, we collect the one-step prediction from the output layer v(t) and feed it back to the input layer u(t + ∆t) = v(t).Now the reservoir machine in the predicting phase is a closed-loop dynamical system with one constant external drive -the bifurcation parameter of interest.The machine in the predicting phase will be able to predict the system collapse preceded by transient chaos for the input value of bifurcation parameter.
To achieve better performance, for each target system, we choose the seven key hyperparameters of the reservoir computing machine based on Bayesian optimization [24], which are the average degree and the spectral radius of the reservoir network, the scales of the input matrix W in , parameters k b , b 0 , and α, as well as the regularization parameter used during the training of the output matrix W out .(A more detailed description of these hyperparameters and their optimization can be found in Secs.I and V of SI [31].)To demonstrate that our machine learning approach represents a true advance in nonlinear systems prediction, we present two examples for which the sparse optimization methods fail.In particular, the basic requirement of any sparse optimization technique for finding the system equations is sparsity: when the system equations are expanded into a power series or a Fourier series, it must be that only a few terms are present so that the coefficient vectors to be determined from data are sparse [5,6].
The first example is the Ikeda map [32][33][34] that describes the dynamics of a laser pulse propagating in a nonlinear cavity: , where z is a complex dynamical variable and the bifurcation parameter µ is the dimensionless laser input amplitude (more details in Sec.IIA in SI [31]).When the map functions are expanded into a power series, all combinations of the powers of the dynamical variables are necessary.The coefficient vectors for the dynamical variables are absolutely dense, representing an extreme type of violation of the sparsity condition.The system exhibits a boundary crisis [35] at µ c = 1.0027 and the dynamical behaviors for µ < µ c and µ > µ c are shown in Figs.2(a1) and 2((a2), respectively.There is a chaotic attractor for µ < µ c , and transient chaos leading to an escape of the system out of the previous operation region for µ > µ c .We train the reservoir machine at µ = 0.91, 0.94, 0.97 -all in the chaotic attractor regime.For each µ value, training is done such that the reservoir machine is able to predict the exact state evolution of the original system for several Lyapunov time.More importantly, we ensure that the predicted trajectories land on the chaotic attractor for an arbitrarily long stretch of time.After training, we apply some parameter change ∆µ and test, for each resulting parameter value, whether the predicted system state is a chaotic attractor or a chaotic transient.An exemplary pair of the predicted state for µ < µ c and µ > µ c are shown in Figs.2(b1) and 2(b2), respectively.It is remarkable that the reservoir machine is able to predict the collapse after a chaotic transient, as shown in Fig. 2(b2).Examining the prediction results for a set of systematically varied ∆µ values enables determination of the critical bifurcation point, denoted as µ * c .Averaging over 1,000 independent random realizations of the reservoir configurations, we obtain µ * c = 1.00 ± 0.01, which agrees well with the actual value µ c .Our machine learning framework can also predict a basic statistical characteristic of transient chaos: the lifetime distribution.To demonstrate this, we set the control parameter of the reservoir to be µ = µ * c + 0.02 so that the system is in the transient chaos regime and the distribution of the transient lifetime is exponential.The reservoir system predicts correctly the exponential distribution, as shown in Fig. 2(c), where 50 stochastic realizations of the reservoir system and 400 random initial conditions for each machine realization are used.It can be seen that pre-dicted distribution agrees well with the true one, demonstrating the power of our reservoir computing scheme for predicting transient chaos and system escape (collapse).
The second example is three-species food chain model [36] where, as the environmental capacity K of the resource species is varied, a catastrophic bifurcation and subsequent transient chaos leading to sudden species extinction occurs.The system equations, when being expanded into a power series, have an infinite number of nontrivial terms, violating the sparsity condition and rendering inapplicable any existing sparse optimization technique for data based discovery of the system equations (detailed in Sec.IIB in SI [31]).Figures 3(a1) and (a2) show typical behaviors of the predator density P for K < K c and K > K c (K c = 0.99976), where there is sustained chaos in the former and transient chaos leading to species extinction in the latter.We train the reservoir machine at K = 0.97, 0.98, and 0.99 -all in the sustained chaos regime.As shown in Figs.3(b1) and (b2), the reservoir predicts correctly that, as the value of K is increased, a catastrophic crisis will occur.Averaging over an ensemble of 500 different reservoir realizations, we obtain the predicted critical transition point as K * c = 0.9997 ± 4 × 10 −4 , which agrees well with the actual value K c = 0.99976.Figure 3(e) shows the predicted distribution of the transient lifetime obtained from 100 realizations of the reservoir machines, each with 400 random initial conditions.The predicted average transient lifetime is about 1.35 × 10 3 , which agrees well with the true value (1.33 × 10 3 ).All the transient lifetime are measured at values of K that are 2 × 10 −4 beyond the systems' critical points.
If predicting system collapse is viewed as a binary classification problem (i.e., with or without collapse), it would be useful to train the neural machine with data from both below and above the critical point.A difficulty is that, beyond the critical point, the system will collapse after a transient chaotic phase of random duration.Practically, it is infeasible to obtain sufficient training data from the system in the post-critical regime.Moreover, our machine learning method can be used to assess the likelihood of the occurrence of a crisis in the near future where no post-critical data are available.
Our results suggest that the reservoir machine trained with data from a few distinct values of the bifurcation parameter represents a "regression" between the dynamical behavior of the target system and the bifurcation parameter.The machine is able to make statistically accurate predictions outside the training region.Since training is done on as few as three different values of the bifurcation parameter, the prediction of, e.g., the critical transition point, from any individual reservoir realization will involve large errors.However, the collective prediction from an ensemble of statistically independent reservoir machines can be quite accurate.(A further analysis of the reservoir performance is provided in Secs.VI-VII in  SI [31].)In general, the prediction error would increase if the training parameter values are further from the crit-ical transition point, suggesting that reservoir machines represent a low order approximation of the real dynamical systems about the training points of the bifurcation parameter.This is a natural and inevitable trade-off.
In summary, we have articulated a parameter-aware scheme of reservoir computing to predict collapse as a result of parameter drift driving the system into transient chaos by designating an additional input channel to accommodate the bifurcation parameter, which is equivalent to introducing adjustable biases between the input and the hidden layers.With parameter dependent training, all in the regime of sustained chaos, the reservoir machine acquires the ability to capture the variations in the "climate" of the target system, thereby gaining the power to predict the system state for different parameter values.When a parameter drift pushes the system through a critical point into a regime of transient chaos where collapse is inevitable, our design of machine learning is capable of accurate prediction of the critical value of the parameter, and of the statistical characteristics of transient chaos and the eventual collapse.These features are demonstrated using an electrical power system and a food-chain model in ecology.(An example of predicting transient spatiotemporal chaos in the Kuramoto-Sivashinsky system is given in Sec.III of SI [31].) This work was supported by ONR under Grant No. N00014-16-1-2828.

FIG. 1 .
FIG. 1.Modified reservoir computing scheme and illustration of the idea to predict critical transition and collapse.(a) Structure of the modified reservoir computing machine with an input parameter channel.The system consists of three layers: the input layer, the hidden layer and the output layer.Vectors u(t), r(t), and v(t) denote the set of input signals (measured time series from the dynamical variables of the target system), the dynamical state of the reservoir network (hidden layer), and the set of output signals representing the machine's prediction.The bifurcation parameter is designated as an additional input.(b) The reservoir computing machine is trained with time series taken from a few parameter values in the green regime in which the system operates normally -indicated by the three vertical blue dashed lines.After training, prediction can be made either for parameter values before the critical point (still in the green regime) or after, where in the former, the machine will predict that the system is safe (b1) but, importantly, for the latter, the machine will predict that the system will eventually collapse after transient chaos (b2).

FIG. 2 .
FIG. 2. Predicting transient chaos and collapse of dynamicsin a nonlinear optical cavity as described by the Ikeda map.(a1,a2) Typical behaviors of the system in the sustained and transient chaos regimes, respectively, for µ = 0.99 < µc and µ = 1.01 > µc.(b1,b2) Predicted behaviors by the reservoir machine for the same values of µ as in (a1,a2), respectively.The machine predicts correctly the system collapse for µ > µc.(c) Actual (black) and predicted (red) exponential transient lifetime distributions for µ = µc + 0.02.

FIG. 3 .
FIG.3.Predicting transient chaos and collapse of a food chain system.(a1,a2) Typical time series of the predator density P in the normal and transient regimes, respectively, for K = 0.997 < Kc and K = 1.01 > Kc. (b1,b2) Predicted time series of P for the same values of K as in (a1,a2), respectively.The machine predicts correctly the sudden collapse for K > Kc. (c1,c2) Return maps constructed from the local minima of P (t) from the true time series, where the red dashed squares define an interval in which an invariant set exists: a chaotic attractor at K = 0.997 < Kc (c1) or a nonattracting chaotic set at K = 1.01 > Kc due to the escaping region about the critical point leading to transient chaos (c2).(d1,d2) Predicted return maps for the same values of K as in (c1,c2), respectively.(e) Actual (black) and predicted (red) transient lifetime distributions at K = Kc + 2 × 10 −4 .