Predicting bursting in a complete graph of mixed population through reservoir computing

We report our investigation, and a measure of success, in predicting spiking and bursting dynamics in globally coupled networks using an echo state network (ESN), a reservoir-computing-based learning procedure. Two exemplary dynamical models, Josephson junctions and Hindmarsh-Rose neurons, are used to construct two separate networks and thereby illustrate the efficacy of our strategy. In the absence of coupling, the networks consist of mixed populations in which a few nodes are oscillatory (self-sustained spiking) and the rest maintain a quiescent state. When single-input data from one oscillatory node of a network (under stronger interactions between the nodes) are used for learning, the ESN is able to predict the key dynamical features (spiking and bursting) of the other nodes. The machine performs with improved accuracy if it is fed with two inputs: one from the oscillatory population and another from the excitable population. The machine's leaking parameter plays a crucial role and can be tuned appropriately to enhance prediction. Furthermore, cluster synchronization in the mixed population is confirmed from the machine-generated output signals. Our work is expected to be useful as a burst predictor.


I. INTRODUCTION
A recurrent neural network (RNN), a bio-inspired machine learning tool, offers a general solution for complex signal-prediction tasks. Notably, the working principle of an RNN does not depend on a priori knowledge of the intrinsic dynamical relations in the considered or targeted data [1][2][3]. Recently, another variant of the RNN, known as reservoir computing (RC) or the echo state network (ESN), has been developed for efficient prediction of complex signals over considerably longer times [3][4][5][6][7][8]. In contrast to a conventional RNN, the ESN is easier to implement and cost-effective, since it does not require fine tuning of its inner components except for the readout/output layer, which matches the target behavior within a close approximation. Given its computational simplicity and powerful predictive ability, the ESN has been used for calculating Lyapunov exponents and reconstructing attractors [9][10][11] of dynamical systems, and it has become a test bed for detecting generalized synchronization [12][13][14] in coupled chaotic systems. Apart from dynamical issues, the ESN can identify nonstationarity in steady-state visual evoked potentials [15], predict stock prices on short time scales [16], and help in understanding language processing [17] as well as differentiating speech signals [18]. A reservoir (say, an ESN) has three distinct components or layers: an input layer collecting the inputs, a reservoir network with a large number of random elements (analogous to neurons) that expands the input in a high-dimensional nonlinear fashion, and an output layer that produces the expected target. The readout or output layer is the only part where the weights are trained to produce a desired output close to the target data. Researchers have also devoted effort to finding the optimal parameters of an ESN for accurate detection of target data [13,[19][20][21][22][23][24][25].
Here we first employ the ESN to predict different kinds of bursting dynamics originating from a mixed population of Josephson junctions [26], where excitable and oscillatory junction nodes (subpopulations) connect to each other on top of a complete graph. We test whether the ESN can (i) predict the spiking and bursting signals with considerably high accuracy, (ii) determine how many inputs are required for efficient prediction of spiking and bursting dynamics, and (iii) identify cluster synchronization within the subpopulations. From this perspective, we further investigate the role of the leakage parameter of the ESN in the success of prediction. We have also checked how the machine performs in the presence of heterogeneity of the system parameters and in the desynchronized regime of the dynamical network under learning and prediction (details are given in Ref. [34]). In a broader sense, the paper aims at an optimal and suitable strategy in the ESN-based machine learning process for the prediction of spiking and bursting dynamics as well as the collective behavior of coupled networks. Note that spiking is a repeated firing state that occurs when neurons send signals to other neurons through a rise of the action potential [27][28][29][30], whereas bursting is a bunch of spikes, i.e., repeated firing within a small duration of time interrupted (periodically or irregularly) by a silent or steady state. Finally, we test our scheme with a network of the biologically plausible spiking and bursting Hindmarsh-Rose (HR) neuron model [31,32]. For both of our example networks, the ESN efficiently predicts the critical point of emergence of the clustering states.

II. BASICS OF ESN BASED LEARNING
The ESN described in this article is a standard discrete-time leaky tanh (hyperbolic tangent) network. The internal state of each node of the reservoir updates itself following the recurrent relation

r(n) = (1 − α) r(n − 1) + α tanh[W_res r(n − 1) + W_in s(n)],  (1)

where W_res is an N × N matrix of internal connectivity weights of the reservoir shown in Fig. 1(a). The matrix W_in (size: N × K) represents the input weights, and K is determined by the number of external inputs. r is an N-dimensional vector representing the instantaneous internal state dynamics of the reservoir and s is the K-dimensional input vector. The tanh function is applied element-wise and α is a leakage constant. In the ESN, the elements of W_res and W_in are kept unchanged before and during post-processing. To train the readout/output weights, the reservoir states r(n) and external inputs s(n) are collected in X(n) = [r(n); s(n)] at each time instant n. The output at time n is captured by

y(n) = tanh[W_out X(n)].  (2)

For a long time series n = 1, . . ., Q, we can write the relation in vector form [5]

Y = tanh(W_out X).  (3)
Here, Y ∈ R^(K_o×Q), where Q is the length of the time series of each node and K_o is the number of target nodes/outputs. The size of the collecting matrix of the internal dynamics is given by X ∈ R^((N+K)×Q). The simplest way to determine W_out is to take the pseudoinverse of the operator X,

W_out = Y X^+.  (4)

However, a more robust way of obtaining W_out is the least-squares (ridge regression) method [33],

W_out = Y X^T (X X^T + λ I)^(−1),  (5)

where λ is a regularization factor, Y is the true/target data, and I is the identity matrix. In the absence of regularization (λ = 0), this reduces to the pseudoinverse solution of Eq. (4) (see Ref. [34], Sec. 1 for details). Note that the output nonlinearity tanh is optional. To make a random reservoir qualify as an echo state network, it must exhibit certain damping properties. These can be ensured by giving the weight matrix W_res a spectral radius smaller than unity [33], which has been maintained throughout this article for every simulation.

A. Network setup
To construct the ESN, a 1000 × 1000 reservoir weight matrix W_res is generated with dense connectivity; the random weights are drawn from a uniform distribution over (−1, 1) and rescaled to a spectral radius of 0.8. The input weights (W_in) are also chosen randomly from a uniform distribution over (−1, 1). One input from the mixed population (or two qualitatively different inputs) is fed into the reservoir, so the size of W_in is either N × 1 or N × 2 (see Ref. [34], Sec. 1). The leakage constant α is fixed at 0.3; its effect on prediction is, however, tested over a range of values and described below.
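The construction just described can be sketched as follows (an illustrative helper, with our own function name and seeding; the eigenvalue rescaling enforces the chosen spectral radius):

```python
import numpy as np

def build_reservoir(N=1000, K=1, rho=0.8, seed=0):
    """Dense random reservoir rescaled to spectral radius rho, plus
    uniform random input weights of size N x K (K = number of inputs)."""
    rng = np.random.default_rng(seed)
    W_res = rng.uniform(-1.0, 1.0, size=(N, N))
    W_res *= rho / np.max(np.abs(np.linalg.eigvals(W_res)))  # enforce spectral radius
    W_in = rng.uniform(-1.0, 1.0, size=(N, K))
    return W_res, W_in
```

With K = 1 this matches the single-input setup; K = 2 gives the two-input configuration used later.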

B. Training and testing
The length of the training sequence is taken as 40 000; the first 10 000 steps are discarded to wash out the initial transient, and the echo states X(n) are sampled from the remaining Q = 30 000 steps. After the training process, 20 000 data points are tested on the trained machine. The machine-extracted and target data are denoted by d_i(n) and y_i(n) (n = 1, . . ., 20 000; i = 1, . . ., K_o), respectively, and the mean squared error (MSE) is calculated as

MSE_i = (1/20 000) Σ_n [d_i(n) − y_i(n)]².

To illustrate the success of the ESN in predicting bursting, we employ globally coupled networks of spiking and excitable (quiescent) oscillators, shown in Fig. 1(b), where spiking oscillators are represented by red circles and quiescent nodes by blue circles.
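The pipeline above — drive the reservoir with the input sequence, collect X(n) = [r(n); s(n)], and score the predictions node-by-node with the MSE — can be sketched as below (function names and sizes are illustrative, not the released code [48]):

```python
import numpy as np

def run_reservoir(W_res, W_in, s_seq, alpha=0.3):
    """Drive the reservoir with an input sequence s_seq (shape K x T) and
    collect X(n) = [r(n); s(n)] for every time step n."""
    N, T = W_res.shape[0], s_seq.shape[1]
    r = np.zeros(N)
    X = np.empty((N + s_seq.shape[0], T))
    for n in range(T):
        r = (1 - alpha) * r + alpha * np.tanh(W_res @ r + W_in @ s_seq[:, n])
        X[:, n] = np.concatenate([r, s_seq[:, n]])
    return X

def mse(d, y):
    """Per-node mean squared error between machine output d_i(n)
    and target y_i(n), averaged over the test window."""
    return np.mean((d - y) ** 2, axis=1)
```

In the paper's setup one would discard the first 10 000 columns of X as transient before fitting the readout.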
Our motivation is to extract/predict the information (time signals) of a large number of nodes by feeding the dynamics (time series data) of a small number of nodes as input to the reservoir. Assume that we use the input from one oscillatory node (x_oscillatory); then s(n) is one-dimensional. In each time step, this value is multiplied by the elements of the input matrix W_in and then added to the intrinsic dynamics of the reservoir described by Eq. (1). In the same manner, we can add multiple inputs to the reservoir (see Ref. [34], Sec. 2 for a detailed description). Before describing the main results for our selected networks in detail, we present a glimpse of our strategy and of the prediction of the dynamics of the Josephson junction network [26]. For a larger coupling strength, all the nodes show bursting behavior [Figs. 1(c) and 1(d)]. Data of the bursting dynamics from only one oscillatory node, shown in Fig. 1(c), are fed into the ESN for training, and the dynamics of the rest of the nodes are predicted by the trained machine. Interestingly, the bursting of the oscillatory (red) nodes and the bursting response of the quiescent (blue) nodes are predicted with good approximation, as shown in Figs. 1(e) and 1(f), respectively. Further details are presented in the next section.

III. PREDICTION OF BURSTING AND CLUSTERING: NETWORK OF JOSEPHSON JUNCTIONS
We consider here a globally coupled network; the local dynamics of each node is represented by a resistive-capacitive-shunted junction (RCSJ) [35][36][37][38][39]. This superconducting Josephson junction model is an analog of the classical pendulum model. Above a critical DC-bias current and over a wide range of the damping parameter, an isolated RCSJ shows self-oscillation mimicking periodic spiking behavior akin to neuronal spiking [37]. In the presence of a periodic forcing or a shunted inductor, the RCSJ may reveal typical spiking and bursting behaviors [27,36,37,40]. An array of forced RCSJs may also reveal complex collective behavior (chimera, clustering, and extreme events) [41,42] under a global mean-field interaction. As suggested above, we consider here a mixed population of RCSJs (the total number of junctions is N_d), where p oscillators are in a self-oscillatory mode and the remaining (N_d − p) oscillators are in a quiescent state [26]. The dynamics of the two subpopulations of oscillatory and excitable nodes in the network is captured by two sets of equations of identical form,

θ̇_i = x_i,
ẋ_i = I_i − sin θ_i − a x_i + (ε/N_d) Σ_{k=1}^{N_d} (x_k − x_i),  (6)

where i = 1, 2, . . ., p and j = p + 1, p + 2, . . ., N_d index the self-oscillatory and excitable nodes, respectively, and ε is the coupling strength. The variable θ_i of the ith uncoupled node is the phase difference across the junction, θ̇_i = x_i (the dot denotes a time derivative) is the voltage across the junction, a is the damping parameter, h is Planck's constant, e is the electronic charge, and I_i (I_j) is the DC bias current. It is to be mentioned that the isolated RCSJ model has similarities with the Sakaguchi-Kuramoto phase model with inertia [43][44][45] and also with the working model of a power grid [46]. In the absence of coupling (ε = 0) and with a = 1.5, each node can reveal two types of dynamics: oscillatory spiking behavior (I_i > 1.0) or a quiescent state (I_i < 1.0). Below I_i = 1.0, the system has two equilibrium points: a saddle and a stable node. At I_i = 1.0, the two fixed points coalesce, and for I_i > 1.0 the junction generates a stable limit cycle through a saddle-node on invariant circle (SNIC) bifurcation [38,39]. Under an external periodic forcing, an isolated junction shows a parabolic bursting [36,37,40] typically seen in neuron models [47]. Here we consider a network of junctions with one subpopulation of oscillatory nodes, obtained by setting the bias current at 25, and another subpopulation of excitable (quiescent) nodes, obtained by setting I_{p+1} = . . . = I_{N_d} = 0.5, with a_i = 1.5 fixed. It has already been established [26] that such a mixed population on a globally coupled graph originates bursting dynamics whose pattern is determined by the fraction p/N_d and the coupling strength ε. Now we present the details of how a trained ESN can predict the bursting behavior of the network nodes. We consider the network size N_d = 10, where half (p = 5) of the population is excitable and the rest is oscillatory. To understand the spiking-bursting behavior, we have plotted two bifurcation diagrams in Fig. 2, where the maxima of the spikes (in a burst) of one oscillatory node (red dots) are shown in Fig. 2(a) and those of one excitable node (blue dots) in Fig. 2(b), for varying coupling ε.
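For readers who want to reproduce qualitatively similar data, the globally coupled RCSJ network of Eq. (6) can be integrated as below. This is a hedged sketch: the mean-field coupling through the voltage variable x and the illustrative bias currents (I = 1.5 for oscillatory, I = 0.5 for excitable nodes) are our assumptions, and the paper's own MATLAB/ODE45 code is available in Ref. [48]:

```python
import numpy as np
from scipy.integrate import solve_ivp

def rcsj_network(t, u, I, a, eps):
    """Right-hand side of the coupled RCSJ equations;
    u = [theta_1..theta_N, x_1..x_N]."""
    N = len(I)
    theta, x = u[:N], u[N:]
    dtheta = x
    # Global mean-field coupling: (eps/N) * sum_k (x_k - x_i) = eps * (mean(x) - x_i)
    dx = I - a * x - np.sin(theta) + eps * (x.mean() - x)
    return np.concatenate([dtheta, dx])

# Illustrative mixed population: 5 oscillatory (I > 1) and 5 excitable (I < 1) nodes.
N_d = 10
I = np.concatenate([np.full(5, 1.5), np.full(5, 0.5)])
u0 = np.random.default_rng(0).uniform(0.1, 2.0, 2 * N_d)  # random ICs in (0.1, 2)
sol = solve_ivp(rcsj_network, (0, 50), u0, args=(I, 1.5, 9.0), max_step=0.05)
```

The voltage traces `sol.y[N_d:]` then play the role of the x_i time series fed to the reservoir.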
For weak coupling (left of the dashed vertical lines), the oscillatory nodes exhibit recurrent spiking oscillation and the excitable nodes show subthreshold oscillation. The number of spikes in a single burst grows one by one above a critical coupling strength (ε_c ≈ 4.8). For instance, both nodes have four spikes (blue and red dotted lines) in a single burst at a coupling ε = 9 (black vertical line). First, we feed time series information from one oscillatory node into the reservoir for training. The rest of the nodes are predicted after the training procedure is finished: 30 000 data points from one node are used to train the readout weights. After that, one oscillatory input of 20 000 data points is used to predict 20 000 data points of the rest of the nodes. We find that the ESN predicts the signals with low accuracy when the network is under weak interactions (ε ∼ 1). The original signals extracted from the complete graph are shown in Fig. 3(a). The time series of four oscillatory nodes (solid, dashed, dotted, and dotted-dashed red lines) are shown (other than the input node), and they are asynchronous. However, the subthreshold oscillations of low amplitude (blue time series) of the five excitable nodes are synchronized, which is further confirmed by the spatiotemporal plot in Fig. 3(b). The machine-predicted signals in this weak-coupling regime [Figs. 3(c) and 3(d)] show no correlation with the original signals. For higher coupling (ε = 15), bursting appears in the entire network [Figs. 3(e) and 3(f)], and the machine-based predictions [Figs. 3(g) and 3(h)] closely follow the original signals, similar to Figs. 1(e) and 1(f). The quiescent nodes show some fluctuations at the subthreshold level. However, the bursting pattern (5 or 6 peaks in a single burst and the timing of spiking) of each node in the entire graph is closely mimicked by the predicted output. To quantify the accuracy of the data generated by the machine/reservoir, we calculate the MSE of all the predicted signals with respect to their original counterparts. The MSE is plotted in Fig. 4(a) as a function of ε. All the red lines (with solid circles) converge around ε ∼ 5, the oscillatory population maintains synchrony, and the MSE confirms that the RC-based signals of the oscillatory nodes are highly correlated with their original counterparts, as it saturates at ∼10⁻⁵ beyond ε > 5. At lower coupling, the excitable (blue) nodes are in a steady state or show small oscillations; the ESN-detected signals also show subthreshold oscillation, and hence the prediction for the excitable/quiescent nodes at weaker coupling is highly accurate; accordingly, the MSE of all the excitable nodes is significantly small. However, as we increase the coupling, the amplitude of the subthreshold oscillation of each excitable node grows with ε; hence the MSE of all the excitable nodes fluctuates over a range of ε and finally converges to a relatively larger value, ∼10⁻², at ε ∼ 5. Compared to the oscillatory nodes (red lines with filled circles), all the predicted signals of the excitable nodes (blue lines with filled circles) are less accurate at higher coupling (MSE ∼10⁻², ε > 5). The reason is as follows: feeding data from one oscillatory node into the reservoir, we expect the ESN to predict the rest of the oscillatory and excitable nodes (nine nodes in total). Since the input is from one oscillatory node, the machine shares more information with the rest of the oscillatory nodes at larger coupling, and hence predicts the oscillatory nodes' behavior better. However, with no direct information from the excitable nodes being shared with the ESN, it cannot accurately identify the signature of the subthreshold oscillations of the excitable nodes, and hence the prediction of their behavior is poor. To improve the performance of the ESN, we increase the number of inputs to two, with one input taken from the oscillatory nodes and the other from the excitable nodes, as shown in Fig. 4(b), to balance the input information. Now it is seen that above a certain coupling, ε > 5, the MSEs of all the nodes (blue and red) coalesce nicely and saturate at ∼10⁻⁴, reflecting the cluster states of the entire network. For further improvement, we tune the leakage parameter, as discussed in Sec. III A. Note that the integration has been performed in MATLAB using the ODE45 routine. At the start of the simulations, the initial conditions are chosen randomly between 0.1 and 2 and adiabatically updated for the next coupling value. This strategy is kept unchanged for all data generated for the Josephson junction network, in the main text as well as in Ref. [34]. For weak coupling, the performance remains poor due to the desynchronization of the network, which is discussed in detail in Ref. [34]. The vertical (white) bar marks the parameters selected to show the effect of desynchronization in Ref. [34]. It is shown there that the prediction by the ESN can be improved for desynchronized signals using the two-node feeding strategy, but one has to make a suitable choice of the leakage parameter. We discuss these issues in Ref. [34] (Secs. 3 and 4). In the next section, we consider another network of a mixed population of neurons and test the performance of the ESN with the one-node as well as the two-node training approach.

IV. PREDICTION OF BURSTING AND CLUSTERING: COUPLED HR NEURONS
In a single RCSJ, used above to build the network in our first example, spiking oscillation starts due to the tunneling of Cooper pairs of electrons through the superconducting junction for an applied DC bias current above a critical value [35]. It can generate bursting similar to a neuron [37] when a slow dynamics controls the spiking, as induced here by the excitable nodes [26].
To check the performance of the ESN on the prediction of bursting dynamics, we present here a biologically plausible and phenomenological neuron model, namely the HR model [31,32], which responds to a constant external trigger above a threshold by generating recurrent action potentials or spiking oscillation, representing the fast movement of Na+ and K+ ions across the membrane of a neuronal cell. Additionally, the model incorporates [31] a slow Ca++ ion movement across the membrane that controls the spiking oscillation, resulting in a type of bursting dynamics. We next employ our ESN technique on a globally coupled network of HR neurons. The dynamics of the nodes in the network is captured by

ẋ_i = y_i + b x_i² − x_i³ − z_i + I_i + (ε/N_d) Σ_{k=1}^{N_d} (x_k − x_i),
ẏ_i = 1 − 5 x_i² − y_i,
ż_i = r [s (x_i − x_R) − z_i],  (7)

where x_i is the membrane potential, y_i and z_i are the fast and slow gating variables, and the bias current I_i sets the oscillatory (i = 1, . . ., p) and excitable (i = p + 1, . . ., N_d) subpopulations. For weak coupling (ε = 0.4), the signals of one arbitrarily chosen oscillatory node and one excitable node are compared with the signals generated by the trained machine, as shown in Figs. 6(a) and 6(b). In particular, for the oscillatory node in Fig. 6(a), the prediction is very poor (cf. the mismatch of the red and black bursting traces), which is consistent with our results for the RCSJ network, as expected in the weakly coupled, desynchronized regime. However, for a larger ε = 1.2, the machine performance improves enormously. This is confirmed in Fig. 6(c) by a comparison of one oscillatory node with its predicted signal, which shows a close match (the red line dominates, with no trace of the black line). The excitable node's signal (blue line) overlaps its machine-predicted signal (black line) in Fig. 6(d) very closely; no separate identity of the predicted signal is seen. Once again, the results of the RCSJ example are confirmed. For higher coupling, the ESN performs well since clustering states emerge within the subpopulations. The machine is able to predict both the bursting and the clustering behavior. For confirmation, we first feed one node's information into the machine. The MSEs between the original and machine-generated signals of all the nodes are shown in Fig. 7(a). The predictions for the excitable/quiescent nodes (blue lines with circles) are not as good as those for the oscillatory nodes (red lines with filled circles); as before, the ESN performance is poor when trained with data from one oscillatory node. The performance improves, for the reason explained for the RCSJ network, when we feed two nodes' information into the reservoir (one oscillatory node and one quiescent node) for training. The MSEs of all the nodes as a function of ε are shown in Fig. 7(b) for the two-node training strategy. Note that, in both cases, there is a sharp drop of the MSE around a coupling strength ε ∼ 1, which is a signature of clustering in the network.
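A corresponding integration sketch for the globally coupled HR network is given below. The parameter values (b = 3, r = 0.006, s = 4, x_R = −1.6) and the mean-field coupling through the membrane potential are standard textbook choices for the HR model [31,32], assumed here for illustration rather than taken from the paper's exact setup:

```python
import numpy as np
from scipy.integrate import solve_ivp

def hr_network(t, u, I, eps, b=3.0, r=0.006, s=4.0, xR=-1.6):
    """Globally coupled HR neurons; u = [x_1..x_N, y_1..y_N, z_1..z_N]."""
    N = len(I)
    x, y, z = u[:N], u[N:2 * N], u[2 * N:]
    # Assumed mean-field coupling in the membrane potential x.
    dx = y + b * x**2 - x**3 - z + I + eps * (x.mean() - x)
    dy = 1.0 - 5.0 * x**2 - y
    dz = r * (s * (x - xR) - z)
    return np.concatenate([dx, dy, dz])

# Illustrative mixed population: bursting-capable nodes vs. quiescent nodes.
N_d = 10
I = np.concatenate([np.full(5, 3.2), np.full(5, 1.0)])
u0 = np.random.default_rng(1).uniform(-1.0, 1.0, 3 * N_d)
sol = solve_ivp(hr_network, (0, 100), u0, args=(I, 1.2), max_step=0.1)
```

As with the RCSJ example, the membrane-potential traces `sol.y[:N_d]` would serve as the input/target time series for the reservoir.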

V. CONCLUSION
A reservoir computing based prediction of spiking and bursting is proposed for a group of globally coupled oscillators against the backdrop of a mixed parameter setup. We tested our strategy using two bursting models representing the local dynamics of the networks: a superconducting Josephson junction and a biological neuron model. A mixed population of quiescent and oscillatory nodes can generate bursting dynamics in the entire graph of superconducting junctions. We have shown that the ESN can predict regular and irregular bursting of the entire graph over a range of coupling strengths. The ESN performs better if we feed the reservoir with information from two nodes belonging to the two different subpopulations. We explored a wide range of a machine parameter (the leakage constant) to check the performance of the reservoir-computing-based prediction; this reveals the possibility of tuning machine parameters for improved performance in general. By using two inputs, the reservoir computer (ESN) can even predict the shape of desynchronized signals in the weak coupling regime. The success of the ESN-based prediction was tested for homogeneous and heterogeneous networks in terms of the distribution of system parameters [34]. The work can be useful for efficient prediction of bursting and for detection of cluster synchronization within a population, as well as of chimeralike states in the weaker coupling range of a network. These feature extractions are intriguing since, otherwise, one cannot predict the onset of cluster and chimera states without using detailed information about the dynamical models. In the near future, questions remain whether more complex features, such as the arrival time of future extreme events, can be predicted from available time series data, and whether network structure information can be extracted from a suitable setup of the ESN.
Codes to reproduce the results presented here are freely accessible in Ref. [48]. Additional information is available from the corresponding author upon reasonable request.

FIG. 1. Bursting prediction by the ESN. (a) Components of the machine. (b) A complete graph of spiking (filled red circles) and quiescent oscillators (filled blue circles). [(c) and (d)] For strong mean-field coupling (ε = 15), both subpopulations (oscillatory nodes in red, excitable nodes in blue) emerge into bursting oscillation: six spikes with different amplitudes in each burst. [(e) and (f)] Predicted signals (black lines) by the machine after feeding only one oscillatory input into the machine. Bursting of one oscillatory node (red line) closely matches the machine data (black line). Bursting in the excitable unit (blue line) is also matched by the machine-predicted signal (black line). However, there is a small fluctuation in the subthreshold oscillation of the predicted signal near the slow manifold, which indicates a mismatch or poor prediction there.

FIG. 2. Bifurcation diagram of oscillatory and quiescent nodes. The x_i variable of the RCSJ is used for the study. Peak values are plotted as a function of the coupling strength (ε) for (a) one arbitrarily chosen oscillatory node and (b) one node from the excitable subpopulation. For weak coupling (ε = 1), indicated by a dashed vertical line, the oscillatory node shows single recurrent spikes (a). The quiescent node shows (b) subthreshold oscillation in this regime (dashed vertical line). For larger coupling (ε > 4), both nodes start bursting oscillations with multiple spikes (the solid vertical line at ε = 9). The number of spikes in a burst increases with ε.
FIG. 3. Comparison of original signals with the machine-predicted signals for the RCSJ network. One input from one oscillatory node is used. Original time signals x_i (a) and spatiotemporal plot (b) of all nodes for weak coupling ε = 1. Time series (c) and spatiotemporal plot (d) of the predicted signals from the machine. No correlation is found between the machine-predicted signals and the original signals extracted from the dynamical equations. For higher coupling ε = 15, bursting appears in the entire network: time series x_i (e), with all nodes synchronized as shown in the spatiotemporal plot (f). The machine-based prediction of the time evolution, in blue (g), is more accurate than in the weak coupling setting, except for small fluctuations near the slow manifold. Cluster synchronization is also visible from the predicted signals in the spatiotemporal plot (h).

FIG. 4. MSE between the original and predicted signals in the RCSJ network. (a) One oscillatory node is fed into the machine. The red circled lines represent the oscillatory nodes; they merge near the coupling ε ∼ 5, and the MSE of each oscillatory node sharply drops to a value ∼10⁻⁵. The blue circled lines represent the excitable nodes; their MSEs almost saturate at ∼10⁻² at higher coupling. (b) For better prediction, two nodes' information is fed into the machine: one oscillatory node and one node from the excitable subpopulation. The MSE for all nodes sharply drops near a coupling ε ∼ 5, confirming the efficiency of prediction of the ESN.

FIG. 5. Impact of machine parameters in the RCSJ network. MSEs are plotted in the ε-α plane. [(a) and (b)] One oscillatory node's input is fed to the reservoir. (a) MSE between real data and machine-generated data of an excitable node. (b) MSE between an oscillatory node's original data and machine-generated data. [(c) and (d)] Two input data: one oscillatory and one excitable node's data are fed into the machine. (c) MSE of one excitable node and (d) MSE of one oscillatory node. All the plots [(a)-(d)] show the MSE for varying α and ε. The desynchronized time signals (white vertical bar) and their machine output at a coupling strength ε = 4 are discussed in detail in Sec. 2 of Ref. [34].
FIG. 6. Comparison of the time signals of two arbitrarily chosen nodes from the two different subpopulations of the HR network. Results of feeding one node's data into the reservoir are presented. (a) Signal of one oscillatory node (red) and the signal predicted by the machine (black). (b) Signal of one excitable node and the machine-predicted signal, shown with blue and black lines. For (a) and (b), the coupling strength is fixed at ε = 0.4. [(c) and (d)] When the coupling is increased (ε = 1.2), the ESN-generated signals closely match the original signals of the oscillatory (c) and excitable (d) nodes.

FIG. 7. MSE between the original and predicted signals in the coupled HR neuronal network. (a) One oscillatory node is fed into the machine. (b) Two nodes are fed into the machine: one oscillatory and the other chosen from the excitable population. The MSEs for all nodes drop sharply to a value ∼10⁻⁵ near a coupling ε ∼ 1.