Critical links and nonlocal rerouting in complex supply networks

Link failures repeatedly induce large-scale outages in power grids and other supply networks. Yet, it is still not well understood, which links are particularly prone to inducing such outages. Here we analyze how the nature and location of each link impact the network's capability to maintain stable supply. We propose two criteria to identify critical links on the basis of the topology and the load distribution of the network prior to link failure. They are determined via a link's redundant capacity and a renormalized linear response theory we derive. These criteria outperform critical link prediction based on local measures such as loads. The results not only further our understanding of the physics of supply networks in general. As both criteria are available before any outage from the state of normal operation, they may also help real-time monitoring of grid operation, employing counter-measures and support network planning and design.


I. INTRODUCTION
The robust operation of physical distribution and supply networks is fundamental for economy, industry and our daily life.For instance, a reliable supply of electric power fundamentally underlies the function of most of our technical infrastructure [1][2][3][4][5][6].In periods of high loads, the breakdown of single infrastructures such as transmission lines can cause a global cascade of failures implying large-scale outages with potentially catastrophic consequences [3,[7][8][9][10][11][12][13][14][15].Periods of extreme loads are expected to become more likely in future grids as power from renewable sources, such as wind turbines, is often generated far away from the consumers (e.g.offshore) and moreover strongly fluctuating [2,[16][17][18].It is thus crucial to understand which factors limit robustness of supply networks and in particular to identify those links that are indispensable for network operation, compare also [5,[19][20][21][22][23][24].Are there simple characteristics in the physics of complex supply networks to identify which of their links are critical?Many approaches to identify critical transmission lines in power grids or critical links in other supply networks fundamentally rely, for instance, on large-scale numerical simulations of detailed models that, after a local breakdown, emulate the future dynamics of large parts of the network [19,[25][26][27].Here, we propose a complementary approach to predict a priori which links are critical in the sense that their failure yields larger-scale malfunction (outage) in the network.We identify two concepts that reveal, prior to any outage, how the network topology jointly with the load distribution influence which links are critical: The first relies on the link's redundant capacity which we quantify, the second originates from a renormalized linear response theory we derive.Based on these theoretical insights, we propose two networkbased criteria to identify critical links.A statistical evaluation suggests that the new measures predict critical links much more reliably than standard loads or flows.

II. LOADS, FLOWS, AND CRITICAL LINKS
To obtain insights into mechanisms underlying largescale outages and to pin down how the structure of a network determines its vulnerability, we base our analysis on a dynamic model of AC power grids [23,[28][29][30].Simpler generic models of supply networks and more complex load flow models of engineering yield qualitatively the same results, see appendices A 1 and B 5. The AC model characterizes the dynamics of the power angle φ j (t) by for N synchronous machines j ∈ {1, . . ., N }.Here, φ j (t) = θ j (t) − Ωt is the phase θ j (t) of unit j relative to the grid reference phase oscillating with the grid's cycle frequency Ω, e.g.Ω = 2π × 50 Hz, ω j = dφ j /dt is the deviation from the grid reference frequency, P j is the effective demand or supply (of a consumer or sink where P j < 0 or a producer or source where P j > 0, respectively) of unit j, α is a damping constant and the link capacities K ji = K ij ≥ 0 are proportional to the susceptance of the transmission line (i, j).
The power flow In the same network, even a more heavily loaded link '3' does not yield desynchronization and leaves the network fully functional.Accordingly, link '3' is highly loaded but non-critical (stable).We analyze critical links for test networks based on the topology of the British high-voltage transmission grid [10,28]: Ten nodes are randomly chosen as generators ( , Pj = +11 P0) and the others as consumers (•, Pj = −P0), with P0 = 1 s −2 .Lines connecting to generator nodes have a larger transmission capacity Kij = 2 K0 than the remaining links with Kij = K0 = 15 s −2 .See appendices A 2 and B 5) for additional test networks.
How can we identify which links are critical?The load distribution of a given supply network in normal stationary operation may serve as a first hint.Due to the distributed nature of sources and sinks and the topology of the network, some links are much more loaded than others (Fig. 1).Intuitively, rerouting the load of a highly loaded link should be harder than that of a less loaded one [41,42].It has been shown that attacks on highly loaded links on average have more severe consequences than random failures (see, e.g.[9,12]).Interestingly, whether or not the failure of a particular link induces network desynchronization is often not predictable by local measures such as load.In fact, single link failures may have completely distinct consequences for global network operation, largely independent of load: For instance, whereas the failure of one highly loaded link (link 1 in Fig. 1 a) may cause a desynchonization of phases and thus a large-scale network outage (see Fig. 1 b1), the failure of a less and only moderately loaded link (link 2) may still induce large-scale outage (Fig. 1 b2).This notwithstanding, the failure of a third link (link 3) that is more highly loaded than link 2, is uncritical to network operation (Fig. 1 b3).So predicting outages based on a link's load alone may have substantial limitations.
In what follows, we classify all links of a network into 'critical' ones, those whose failure induces longterm desynchronization and thus a non-functional network state, and 'stable' ones, those whose failure leaves the network functional.In particular, we integrate the the equations of motion (1) numerically starting from a stationary state of normal operation (phase locking).As it turns out, which links are critical depends jointly on the global topological structure of a network and its collective dynamics, in particular on the link's location within the grid topology and the entire grid's load distribution.

III. QUANTIFYING NETWORK REDUNDANCY
For a supply network to remain stable, it needs sufficient options for rerouting the (directed) flow F ab originally assigned to a failing link (a, b).Therefore, a sufficient degree of redundancy of the remaining network also matters to reliably detect critical links.How much redundancy is sufficient?To identify critical links, we quantify the redundancy as follows: If the link (a, b) fails, the (directed) flow F ab has to reroute over alternative paths in the network.However, the links along these paths have only a limited residual capacity K ij − F ij to take over this flow.We define the redundant capacity K red ab of a link (a, b) as the maximum flow that can be transmitted from unit a to b over the residual network excluding link (a, b) (see appendix A 3 for an algorithm to determine it).The redundant capacity is approximately given by that is, the minimum residual capacity across the links (i, j) on a path a → b, summed over different alternative paths a → b in the residual network which do not share a 'bottleneck', i.e. a link (i, j) where the minimum in (3) is attained.If there is only one such path then the expression becomes exact.We thus propose to consider the ratio of the actual flow and the redundant capacity as a measure to predict critical links as where h is a threshold value that can be optimized for any given specific prediction.Figure 2 compared to Fig. 1 illustrate that predictions based on redundant capacity (4) may work even for links where load-based predictions fail.

IV. FLOW REROUTING IN LINEAR RESPONSE
Alternatively, we analyze how general alterations of the capacity of a single link modify the global operation of a network.Consider a small perturbation κ ij of the network capacities at a single link (a, b) such that K ij = K ij + κ ij with κ ab = κ ba = κ and κ ij = 0 for all other links.This perturbation induces a change φ j → φ j of the steady state phases of the network.
Expanding the perturbed steady state ( φ j = φ j = 0 in Eq. ( 1)) to first order in the response ξ j := φ j − φ j and subtracting the unperturbed steady state condition yields During normal operation, we can assume that |φ i − φ j | ≤  (10).Networks and parameters as in Fig. 2.
π/2 for all links in the network [40] and thus use (2) to ob- This quantity characterizes the available capacity that the network may use to respond to perturbations in terms of the network topology and the original flows.We define the Laplacian matrix [43] Λ ij := − K ij + δ i,j n K in , of this available capacity using the Kronecker-delta symbol (δ x,y = 1 if x = y and δ x,y = 0 otherwise).With the vectorial components q i = (δ i,a − δ i,b ), equation ( 5) is recast into the form Λξ = κL ab q .We remark that this equation is linear but depends nonlinearly on the system's unperturbed state variables φ i .Using the Moore-Penrose pseudo-inverse T := Λ + , yields the phase responses Thus the power flow (2) in the perturbed network is to first order in κ given by where we have defined the link susceptibility . So if capacities are perturbed slightly, eqn.(7) tells us, to first order in the perturbations κ, which flows increase, which decrease and how much.Even for perturbations κ that are not small, this linear response argument reliably predicts where the flow is rerouted (see appendix A 4 for a detailed analysis).To Improved quality of critical link prediction.A Receiver operating characteristic curves indicate a multi-fold increase in prediction quality using the classifiers ( 4) and ( 9) in comparison to load or flow.(Statistics for 66000 links from 400 random grid realizations; model, topologies and parameters as in Figure 1) accurately also predict the magnitude of the flow change for a large perturbation of total link failure, we selfconsistently renormalize the linear response by replacing κ := −F ab /η ab←ab , thereby ensuring that the flow of the defective link vanishes, F ab = 0. From this renormalized linear response theory, the modified estimates for the rerouted network flows read The F i,j do of course depend on the considered link (a, b), that for clarity does not explicitly appear as an index.Related measures are used in engineering where they are referred to as line outage distribution factors [44].
As the rerouted total flows on each link must not be larger than the respective link capacities, we propose the predicted maximum load max (i,j) |F ij /K ij | as a discriminating measure between critical and stable links: In addition, every bridge is predicted to be critical.
This renormalized linear response theory does not only well identify which links are critical and which are stable, based on criterion (9); the fundamental equation ( 8) also predicts the value of the expected maximum load across the network (Fig. 3a).

V. LOAD AND TOPOLOGY CO-ACT
The response ξ as explicated by (6) and thus the susceptibility η are directly proportional to the load L ab of a link.The load of a link thus does determine its relevance, but other collective characteristics of the network are equally important.
For heavy loads, some of the responsive capacities K ij tend to zero and the associated network becomes weakly connected.The second eigenvalue λ 2 of the Laplacian Λ, which measures the algebraic conenctivity becomes small (while λ 1 ≡ 0, independent of the network and operating state) [43].Then the pseudo-inverse T = Λ + is dominated by 1/λ 2 and the response is given by up to terms of order 1/λ 3 .Here, v 2 denotes the eigenvector associated with λ 2 .Thus the response is specifically also determined by two intrinsically collective properties of the load distribution: the inverse algebraic connectivity 1/λ 2 and the overlap (v 2 • q), which is large for links connecting different components of the network [31,45].Direct numerical simulations confirm these theoretical predictions, Fig. 3 b,c.

VI. QUALITY OF CRITICAL LINK PREDICTIONS
The two proposed classifiers (4) and ( 9) enable the prediction of critical links with high accuracy.To test the performance we generate 400 random network realizations varying the generator positions, thereby analyzing 66000 links in total.A quantitative assessment of the different classifiers is provided by a receiver operating characteristic (ROC) curve, Fig. 4, displaying the fraction of correct predictions of critical links (the sensitivity) as a function of the fraction of false alarms (the false positive rate) when the discrimination threshold h is varied.We see that the two classifiers introduced in this paper closely approach the perfect operating point (0, 1) with almost no false alarms and alsmost perfect sensitivity (Fig. 4).The total number of incorrect predictions, including false alarms and missed critical links, is as small as 395 (0.6 percent) for the the linear response measure and 920 (1.4 percent) for the linear response measure.Thus it is possible to reduce the incorrect predictions based on loads (3149, 4.7 percent) and flows (2937, 4.5 percent) by a factor of more than seven, thereby drastically improving performance.Additional macroscopic quantifiers such as those based on the area under the ROC-curve confirm this view for a variety of different network topologies (see appendix B 5)

VII. DISCUSSION
In summary, we link the overall network topology with the load distribution resulting from the collective network dynamics and present non-local relations to identify a network's response to link failures.On this basis, we propose two network-based strategies to identify critical links by (i) quantifying the redundant capacity of the network and by (ii) estimating the flow rerouting through developing a renormalized linear response theory.The analysis suggests that the two proposed predictors are well suited to identify critical links.In particular they provide a substantial improvement of the quality of predicting critical links compared to predictions based on local measures such as a link's load.We emphasize that the quality of prediction is evaluated by statistical measures whereas the predicting measures are derived from insights about the topological connectivity and the nonlinear dynamics of the networks.
By construction, the proposed criteria readily generalize to other network topologies, to the physics of generic linear supply networks and to complex load flow models as standard in engineering (see appendices A 1 and B 5).For network operation, the proposed predictors may provide key hints for initial analysis, because they suggest which links require special attention, e.g. in detailed large scale simulations of power grid engineering.In particular, our insights about collective influences indicate that the weak spots of a network are not necessarily given by the most heavily loaded elements (cf.[9,12]) and that network-based rather than local measures provide suitable guidelines for the security assessment of real-world power grids (cf.[19]).
In this section we review different models for the simulation and analysis of power grids.

a. Power flow in AC grids
In an AC power grid the voltages and currents oscillate sinusoidally with period T and angular frequency ω = 2π/T , where we have defined the complex -valued amplitudes The electric power is oscillating, too.The net transferred energy per period is called the real or active power, The imaginary part Q = (U I * ) describes the power which is temporarily stored in capacities or inductances and is referred to as the reactive power.Furthermore, one defines the apparent power S = P + iQ = U I * .These relations show that the relative phases of voltages and currents are essential for the power flow in a grid.Calculations are performed most easily using the complex notation defined here.In the following we drop the underline to keep the notation simple.Consider a power grid with k, = 1, . . ., N nodes.The basic relation between voltage at the nodes and the currents between them is Ohm's law, where Z k is the impedance and y k = 1/Z k the admittance of the transmission line between the nodes k and .. The admittance is divided into its real part, the conductance, and its imaginary part, the susceptance, In case there are multi-circuit transmission lines we assume that y k is the sum of the admittances of the single circuits and y k equals zero if no transmission line exists.The total current injected to the node k is then given by For actual calculations it is convenient to introduce the nodal admittance matrix Y, Ohm's law can then be rewritten in a vectorial form, which yields the network equation with I = (I 1 , I 2 , ..., I N ) representing the complex currents injected into N nodes, U = (U 1 , U 2 , ..., U N ) denoting the complex voltages.The apparent power injected to a node k is then given by where θ k = θ k − θ denotes the difference in phase of the complex voltages U k and U .The effective and reactive power are then again given by the real and imaginary part of this expression [44,46], b.An oscillator model for power grid operation We model the power grid as a network of N rotating machines representing, for instance, wind turbines, or electric motors.Each machine k = 1, . . ., N is characterized by the mechanical power P mech k acting on it, which is positive for a generator and negative for a consumer.The state of a machine is determined by its angular frequency and the rotor or power angle θ k (t) relative to the reference axis rotating at the nominal grid angular frequency ω 0 = 2π × 50 Hz or ω 0 = 2π × 60 Hz, respectively.Correspondingly, ω k = dθ k /dt gives the angular frequency deviation from the reference ω 0 .The dynamic of the rotor is governed by the swing equation [28-30, 47, 48] where P el k is the electrical power that is transmitted to or from other rotating machines.The parameter M k is the moment of inertia of the rotor times ω 0 and D k measures the damping, which is mainly provided by damper windings.If a generator is decoupled from the grid in an uncontrolled way such that P el k = 0, it accelerates until finally an emergency shut-down is performed.
For simplicity we here neglect ohmic losses in the grid such that the line admittance is purely imaginary, Y k = iB k .Furthermore, we assume that the magnitude of the voltage is constant throughout the grid, |U k | = U 0 for all nodes k = 1, . . ., N .For a common two-pole synchronous machine the phase of the voltage equals the mechanical phase of the rotor.The expression (A10) for the active electric power then simplifies to Substituting this result into the swing equation yields the equations of motion (A14) Using the abbreviations U (q)   grid I (d)   FIG. 5. Equivalent circuit diagram explaining the internal voltage drop in a synchronous machine.
the oscillator model reads These equations of motion are equivalent to the structure-preserving model [29] in power engineering.
The model is mathematically equivalent, but derived from slightly different assumptions.Notably, the model is also very similar to the so-called Kuramoto model, which is studied extensively in statistical physics [49][50][51].In the Kuramoto model the inertia term is absent such that it can be viewed as the over-damped limit case.

c. Including voltage dynamics
The oscillator model can be extended to include the dynamics of the local voltages, which yields the so-called third-order model [47,48,52].The equations of motion for the mechanical motion of the synchronous machines is still given by equation (A15) but the coupling strength becomes time-dependent via the local voltages The equations of motion for the voltages can be derived from Faraday's and Ohm's law applied to the voltages, currents and fluxes of the stator and rotor windings of a synchronous generator.Assuming again lossless transmission lines one obtains where is the voltage of the field windings and ∆X is determined by the transient and static reactances of the stator windings.The actual derivation of this formula can be found in the literature [47,48,52] such that we do not go into detail here.Instead we briefly discuss the mathematical structure of the equation.
If the phase differences θ k would be constant in time, then the equation of motion (A17) would just describe an exponential relaxation to the equilibrium value which can be understood in terms of the equivalent circuit diagram shown in Fig. 5.The equilibrium voltage U (q) k seen by the grid is given by the source voltage U k .The relaxation time T k is given by the ratio of the inductance and the ohmic resistance of the field windings as in a resonant circuit.
However, the phase differences θ k are not constant in time but again depends on the voltages.This gives rise to a complex interplay of phase and voltage dynamics and new potential instabilities as discussed in [52].

d. Load flow calculations
Load flow calculations are routinely carried out in power engineering.They include ohmic losses and reactive power but are generally static.Hence they describe the steady state operation of a power grid but not the dynamics of a major outage.
In load flow calculations a system of algebraic equations describing the complex power flows through the transmission lines in the network is established.Such an equation system can be formulated when the network topology, the amount of power supply and consumption at each node, as well as the admittance of each transmission line are known.Solving the equation system numerically, one obtains information about the steady state operation of a power grid.Critical states of a power grid occur when either the equation system does not have a solution or the load of a transmission line exceeds a limiting value which depends on the line parameters.Such an overload generally does not lead to an immediate desynchronization of the grid but to an overheating or bending on longer time scales and eventually to an emergency shut down.
In the following we present a short derivation of system of equations that represents the load flow model [44,46].Within the load flow model we distinguish three types of nodes: generator buses (PV buses), load buses (PQ buses) and slack buses.PV buses are connected to generators, therefore their active power P and their voltage magnitudes |U | are fixed.PQ buses are connected to loads, i.e. their net injected active power P and reactive power Q are given, while the voltages at these nodes are unknown.A special kind of bus is the slack bus.It does not exist in reality, but is introduced to facilitate numerical computation.Active power P and reactive power Q of the slack bus remain unspecified in order to balance the active and reactive power in the system during the iteration towards the steady state solution.The physical reason for this is that the power losses on the transmission lines are not known a priori, i.e. before the solution is obtained.The slack bus acts as an ideal voltage source and ensures the power balance of the grid.work of m generators and n loads.The unknown state variables are the phases of all n + m nodes of the network and the voltages of the n loads summarized in the state vector Using all the known information (see Table I), we obtain a system of 2n + m equations describing the active power P and the reactive power Q injected to the buses (see Eq. A10 and Eq.A11) The superscript "sp" to the variables on the right-hand side indicates that the values of this variables are specified beforehand.We thus have a system of 2n + m nonlinear algebraic equations for 2n + m unknown variables x.
A common and effective way to solve this equation system is to use the Newton-Raphson method [47], which iteratively updates the state vector x from an initial guess towards the solution of Eqs.(A20).The final solution for the state vector describes the state of the power transmission network in normal, steady-state operation.

e. The DC approximation and linear flow models
When the loads and losses in a power grid are small, then the load flow calculations can be simplified considerably [44,46,53,54].Within the so-called DC approximation one first neglects all ohmic losses such that the admittance of a transmission line is purely imaginary, G k = 0.The real power injected at node k then reads (cf.equations (A10) ) In a second step one assumes that the magnitude of the voltages in the grid are fixed, typically at the reference value of the respective voltage level.Technically, all nodes must then be considered as PV buses, such that no equations for the reactive power must be taken into account when solving the load flow equations.Finally, small loads imply that the phase differences across a transmission line are small such that the sine function can be approximated to first order as The load flow calculations then reduce to a set of linear equations where the coupling coefficients are defined as The simplified equation (A22) is referred to as the DC approximation, as it is mathematically equivalent to Kirchhoff's circuit equation for a DC electric circuit.Still, it describes the flow of real power in AC power grids.Obviously, linear equations can be solved much faster than the nonlinear load flow equations (A20).The DC approximation is particularly advantageous when the flow must be calculated for many different scenarios of generation and load, because the matrix K has to be inverted only once.The limits of its applicability are discussed in detail in [53,54].
Notably, a mathematically equivalent model describes the flow in hydraulic networks [55] or vascular networks of plants [27].In this case the model is derived as follows.We denote the flow from node k to node by F k .The continuity equation then reads where P k is the source or sink strength at node k.The unique steady state is then determined by the condition that the total dissipated power should be minimal.In this expression, K k denotes the capacity of the link (k, ), which is determined by the cross section in hydraulic of vascular networks.The primed sum runs only over existing transmission lines, i.e.only over links with K k = 0. Minimizing this expression using the method of Lagrangian multipliers yields where the values of the θ are determined by a linear set of equations as in equation (A22) with the Laplacian matrix (A27)

Network Topologies
We analyze the effects of transmission line modifications and breakdowns for partly synthetic model networks based on the topology of different real-world power grids.We consider the topology of the current highvoltage power transmission grid in Great Britain with 120 nodes and 165 links [10,28] (cf.Fig. 6) and Scandinavia with 236 nodes and 320 links [23] (cf.Fig. 7).
Furthermore we consider the topology of the IEEE 118bus test grid with 118 nodes and 179 links (cf.Fig. 8), representing a part of the electric power system in the Midwestern United States as of 1962 [56].We then generate a large number of synthetic networks as a basis for statistical analysis by randomizing generation and load as described below.Model power grid based on the topology of the British high-voltage power transmission grid [10,28].Half of the nodes are randomly chosen as generators ( ) and consumers (•), respectively, with Pj = ±1 s −2 (homogenous case).In the current example 13 bridges and 4 further links are critical (marked by arrows).All of them are faithfully identified using the novel predictors proposed in this manuscript.
load F/K 0 0.25 0.5 0.75 Model power grid based on the topology of the Scandinavian high-voltage power transmission grid [23].21 nodes are randomly chosen as generators ( ), the remaining ones are consumers (•) (heterogeneous case).In the current example 45 bridges and 4 further links are critical (marked by arrows).All of them are faithfully identified using the novel predictors proposed in this manuscript.
We note that parameters are chosen to represent a heavily loaded power grid.Periods of such high loads are still rare nowadays, but are expected to become much more likely in future power systems with a large number of strongly fluctuating renewable power sources (see, e.g., [18]).In such situations the grid becomes vulnerable to the breakdown of single transmission lines as discussed in the present paper.A prime example is the European power blackout on 4th November 2006 [57].A large amount of electric power from wind turbines was transmitted from northern to south-western Europe resulting in heavy loads of several lines.The planned shut-down of one double-circuit line over the river Ems in Germany was then sufficient to bring the entire grid to the tipping point of a breakdown.In other words, this grid did not possess enough redundant capacity K red to securely take over the load of the shut-down line.A coupling of the busbars at the transformer station Landesbergen then finally triggered the blackout, in which the European grid fragmented into three asynchronous areas.
We have performed extensive simulations of the oscillator models for the three topologies described above for two different scenarios of power generation.First, we consider a heterogeneous case, where several large power plants are connected to the grid.We randomly select N gen nodes to be the connection points of the generators, the remaining N con nodes are consumers.The effective power of the consumer nodes is assumed to be P con = −P 0 , while the generator nodes have P gen = load F/K 0 0.17 0.34 0.51 0.68 FIG. 8. Model power grid based on the topology of the IEEE 118 bus test grid [56].Half of the nodes are randomly chosen as generators ( ) and consumers (•), respectively, with Pj = ±1 s −2 (homogeneous case).In the current example 9 bridges and 2 further links are critical (marked by arrows).All of them are faithfully identified using the novel predictors proposed in this manuscript.
+N con /N gen × P 0 .The links adjacent to the generator nodes are assumed to have a higher transmission capacity (2 × K 0 ), while the remaining links have a transmission capacity K 0 .In all numerical examples we set P 0 = 1 s −1 and K = 15 s −1 unless stated otherwise.In particular, we set N gen = 10 for the topology of the British grid (Fig. 6) and the IEEE 118 bus test grid (Fig. 8), whereas N gen = 21 for the Scandinavian grid (Fig. 7).This choice ensures that P gen is approximately equal in all cases such that the results are comparable.
Second, we consider a homogeneous case assuming a more distributed power generation [23,28].In this case we assume that half of the nods are generators P gen = +P 0 and half are consumers with P con = −P 0 .Again the positions of generators and consumers are chosen randomly.All links have the save transmission capacity, which we set to K 0 = 4 s −1 (British grid and IEEE grid) and K 0 = 5 s −1 (Scandinavian grid), respectively.We note that the flow is directly proportional to the load for such a homogeneous network such that both quantities have the same predictive power.
In section B 6 we collect the numerical results for the three model networks with both heterogeneous and homogeneous generation.In all cases the new methods for the predictions of critical infrastructures introduced in the main manuscript significantly outperform predictors based on load, flow, topological edge-connectivity or edge betweenness centrality.Notably, the linear response prediction works even better in the case for homogeneous networks while the redundant capacity approach performs better for heterogeneous networks.The difference becomes most apparent for the model grids based on the topology of the Scandinavian power grid.The numerical results thus fully confirm the findings discussed in the main manuscript.

Definition and calculation of the redundant capacity
The redundant capacity K red ab of a link a → b is defined as the maximum additional flow that the network can transmit from node a to b when the link (a, b) fails.Here, the maximum is defined in a purely graph-theoretical way, i.e. we search for a flow which satisfies |F ij | ≤ K ij for all i, j and flow conservation at every node.This definition is purely structural and can thus be applied to every model of a supply network.It does not take into account whether such a flow is dynamically possible or stable for the particular network type under consideration.
Furthermore, we note that the redundant capacity is directed, i.e.K red ab = K red ba .This is immediately clear from the fact that the flow is directed.If an link (i, j) is fully loaded in the sense that F ij = −F ji = K ij , then it cannot transmit any additional flow from node i to node j.However, it can transmit additional flow in the other direction from j to i.In fact, this would lower the net flow over the link (i, j) and thereby the load of the link.
The redundant capacity of the link (a, b) is calculated using a modification of the Edmonds-Karp algorithm [58].As discussed above, K red ab = K red ba such that the algorithm works with directed graphs.Step 6: Calculate the maximum available capacity along the path p: Step 7: Add the maximum available capacity to the redundant capacity: Step 8: Increase the flow along the path p: Step 9: GOTO step 3.This definition has been formulated for the oscillator model, which is mainly studied in the present paper.It can be directly applied to the third-order model introduced in section A 1 c.In this case, the capacity K ij is given by equation (A16) using the equilibrium voltages.
The concept of the redundant capacity for the prediction of overloads can be easily adapted to load flow calculations (see section A 1 d).One then has to specify whether power or current flows or both can lead to dangerous overloads and use these quantities in the al-gorithm introduced above.The results are discussed in detail in section B 5 b.

Linear response theory
Here we discuss the linear response theory for the prediction of phases and flows in the oscillator model after a local perturbation in more detail.We consider a small perturbation of the network, This perturbation induces a small change of the steady state phases of the network φ j → φ j .We note that the steady state is defined only up to a global phase shift which ha no physical significance.We thus fix the global phase as j φ j = 0 = j φ j and define the response ξ j = φ j − φ j .
The steady state of the original network is defined via for all j = 1, . . ., N .For the perturbed network we have Expanding the flows to first order in κ ij and the response ξ j yields Subtracting the equations (A31) and (A30) then yields the steady state condition to leading order in ξ j with the result for all j = 1, . . ., N .In the last step we have used the definition of the load L ab = F ab /K ab = sin(φ a − φ b ) and the perturbation matrix (A28).This result can be further simplified and analyzed by introducing the matrix and the vector q with the components In a short-hand vectorial notation, equation (A32) then reads Λ ξ = κL ab q. (A35) Before we go into detail, we first have a closer look at the matrix Λ.A steady state of the power grid model (A14) is dynamically stable if the relation |φ i −φ j | ≤ π/2 holds for all links (i, j) of the network [40,48].Stable steady states which do not satisfy this relation exist only at the boundary of the stable parameter region.We can thus assume that during normal operation we always have |φ i − φ j | ≤ π/2 for all links such that we can use the following relations: The expression will be referred to as the responsive capacity of each link, i.e. the free capacity which can be used to react to the perturbation.In this case the non-diagonal entries of the matrix Λ are all non-positive such that Λ is a Laplacian matrix for which many properties are known [43].In particular, the eigenvalues of a Laplacian matrix satisfy 0 , where λ 2 is an algebraic measure for the connectivity of the underlying network [45,59].We will make use of these relations to further analyze the properties of the matrix Λ in section A 5.
The Laplacian matrix Λ is singular as λ 1 = 0. Still, Eq. (A35) can be solved for ξ as the vector q is orthogonal to the kernel of Λ which is spanned by the vector (1, 1, . . ., 1) T .In order to formally solve equation (A35) we can thus use the Moore-Penrose pseudoinverse of Λ, which will be called T := Λ + in the following.Thus we find ξ = κL ab T q. (A38) The power flow in the perturbed network is to leading order given by Inserting the results (A38) for the phase response yields where we have defined the link susceptibility Related methods are used in power engineering, where they are commonly referred to as 'line outage distribution factors' [44].They are used in numerical studies of multiple line outages (N-x errors), where a full solution of the all possible combinations of contingencies becomes intractable [60].In this contribution we demonstrate the ability of this approach to predict the future dynamics of complex supply networks and we provide a detailed network theoretical analysis.
The equations (A38) and (A40) give the linear response of the phases and the flows to a perturbation of a single link of the network.Strictly speaking, the are valid only for infinitesimally small perturbations κ.It is a priori not clear whether these results can be extrapolated to larger perturbation strength in order to predict the effects of a complete breakdown of a single link.
An example of how the damage of a single transmission line affects the flows in a power grid is shown in Fig. 9.We compare the change of the magnitude of the flow |F ij | − |F ij | predicted by linear response theory (A40) to the actual value obtained from a numerical solution of the steady state condition for the original and the perturbed network.In case of a small damage where only 10% of the transmission capacity is lost (κ ab = −0.1 × K ab ) we find an excellent agreement between predicted and actual values as expected.The situation is more complex in case of a complete breakdown, i.e. κ ab = −K ab .We find that formula (A40) reproduces the relative changes of the flow surprisingly well.That is, linear response theory accurately predicts where the flow is rerouted even for a complete breakdown of a transmission line.However, the magnitude of the flow changes is strongly underestimated by formula (A40).Consequently, it also fails to reproduce that the load of the defective link must vanish, F ab != 0.But one can easily fix this deficit by extrapolating the results of linear response theory if we replace equation κ by −F ab /η ab←ab .This yields the modified formula In this article, we propose to use this modified linear response formula to predict impeding overloads which lead to a destabilization of the grid.An example of a successful application of this method is demonstrated in Figure 1 in the main manuscript.In the first example, formula (A42) predicts that no overload occurs in agreement with the direct solution of the steady state condition (A29).Thus we expect that the grid relaxes to a new steady state after the failure of the respective link.This prediction is confirmed by a direct numerical simulation of the equations of motion.In the second example formula (A42) predicts that secondary overloads occur after the respective transmission line failed.Indeed, numerical simulations show that no steady state solution of equation exists and that the grid becomes unstable and looses synchrony.
Furthermore, we test the quality of load prediction for stable links in figure 10, comparing the predicted maximum load after the failure of a single link with the numerically exact values.The maximum load max ab = max ij |F ij /K ij | is predicted with great accuracy in the majority of all cases.It is underestimated in approximately 20% of all cases.

Properties of the Laplacian and the susceptibility
A deeper understanding of the results of the linear response approach can be gained using results from graph theory and network science.The response ξ is determined by the load of the perturbed link and the matrix Λ through equation (A35).During normal operation Λ is a Laplacian matrix for which many important properties are known [43].These properties are particularly useful in order to understand how the response is determined by the structure of the power grid.
So assume that |φ i − φ j | ≤ π/2 hold for all links of the network.Let 0 = λ 1 < λ 2 < • • • λ N denote the eigenvalues of Λ and v n the corresponding eigenvectors.Then the response (A38) is given by The term n = 1 does not contribute since we have fixed the global phase such that j ξ j = 0.This expression shows four important properties: (1) The response generally scales with the load of the defective link L ab = F ab /K ab times the absolute strength of the perturbation κ.This is confirmed by the numerical results for a test grid shown in figure 11 (a,b).If we fix the relative strength of the perturbation instead, as for a complete breakdown where κ/K ab = −1, then the response scales with the flow of the defective link.
(2) The prefactors 1/λ n decrease with n.In particular for a heavily loaded network the algebraic connectivity λ 2 is small such that the term n = 2 dominates the sum.Then the susceptibility of all links in the network scale inversely with the algebraic connectivity of the network given by the eigenvalue λ 2 [43,59].Hence, the response and the susceptibility are large if the network defined by the responsive capacities K ij is weakly connected.The response is proportional to load of the perturbed link and the inverse algebraic connectivity.(a,b) Response as a function of the load L ab of the failing link.The response of a given network is characterized by the norm of the phase difference ξ and the columns of the susceptibility η •←ab .In addition, the phase difference is significantly enhanced if the overlap to the Fiedler vector | q • v2| is large, as indicated by the color code.(c,d) The global response, averaged over all links is proportional to the inverse algebraic connectivity 1/λ2.Shown is the slope of the phase response (solid line in (a)) and the matrix norm of the susceptibility η for networks with different values K0 = 10, 12, . . ., 40 s −2 and thus different connectivity.The network topology is the same as in Fig. 9.
(3) For a heavily loaded network, the link susceptibility scales with the overlap | v 2 • q|, where v 2 is the so-called Fiedler vector.This is confirmed by the numerical results for a test grid shown in figure 11 (a).The response is enhanced if the overlap (shown in a color code) is large.
The Fiedler overlap can be interpreted as a measure of the local algebraic connectivity of the nodes a and b.
To see this note that the Fiedler vector can be used to partition a graph into two weakly connected components [43,45].The overlap with the vector q is largest if the two nodes a and b are in different components and thus weakly connected.
To illustrate this, let us consider the limiting case λ 2 → 0, i.e. the case where the network defined by the responsive capacities K ij becomes disconnected into two fragments with N 1 and N 2 nodes, respectively.The Fiedler vector then reads (4) In the limit of a disconnected network the response ξ to a perturbation at the link (a, b) diverges if it links the weakly-connected components.If the perturbation occurs within one component, then the response remains finite.
(5) On a global scale the response ξ and the susceptibility η are essentially given by the inverse algebraic connectivity 1/λ 2 of the network.We have already discussed the importance of the term n = 2 for the response (A43).Averaging over all possible trigger links, this term dominates even for a strongly coupled network.To test this proposition, we have calculated ξ and η for a test grid with different values of the global coupling strength, i.e. we write K ij = K 0 A ij and vary the global prefactor K 0 .The numerical results shown in figure 11 (c,d) confirm that the norm of the response ξ averaged over all trigger links as well as the Frobenius norm of the susceptibility matrix η are almost perfectly proportional to 1/λ 2 .
Appendix B: Alternative identification statistics and additional results

Introducing novel indicator variables
In this contribution we introduce three novel indicator variables for distinguishing critical and stable links on the basis of the topology of the network and its loads prior to a line failure.All three approaches are based on the fact that network flow must be rerouted when a transmission line fails.We expect that this is impossible when the ratio of flow and redundant capacity as defined in A 3 is too large.Alternatively, we expect a global breakdown, when linear response theory as introduced in section A 4 predicts secondary overloads, i.e. when the predicted maximum load is too large.Additionally we test, whether a combination of both variables can be an even more successful indicator.We compare the performance of these three novel indicator variables with other indicators for the importance of links in powergrids, namely the load L ab = |F ab /K ab |, the flow |F ab |, the topological edge-connectivity τ ab and the edge betweenness centrality ab [43].Prediction or classification success does not only depend on the information comprised in an indicator variable, but also on the way, this information is used.In this contribution we test all indicator variables using two different methods of determining decision margins, i.e. values which announce critical links: thresholding and a naive Bayesian classifier [61,62].Thresholding is a very robust approach, since it does not depend on the amount of training data or training method.Additionally, all considerations concerning the rerouting of network flow support the idea that large values of the indicators r ab , max ab and c ab are linked to the criticality of links.Therefore predictions of critical links can be made by simply imposing a threshold h which distinguishes between indicator values related to critical links or stable links.On the other hand, a naive Bayesian classifier does not require any assumption on the system that generated the data set or the range of suitable indicator values.Additionally it is able to detect relations between indicator variable and target event that are more complex than a linear correlations.

Quantifying prediction success
The success of indicators and classifiers is typically evaluated using contingency tables.
A contingency tables (see Tab. II for an example) summarizes the numbers of occurrences n TP , n FP , n FN and n TN of four possible outcomes of a prediction task: True positive (TP): Link is predicted critical and is critical.False positive (FP): Link is predicted critical but is stable.False negative (FN): Link is predicted stable but is critical.True negative (TN): Link is predicted stable and is stable.

Example for a contingency table
A good indicator aims on maximizing n TP while simultaneously minimizing n FP .Setting these quantities in relation to the total numbers of critical and stable links, n critical and n stable , one can generate receiver operating characteristics (ROC).In more detail, the fraction of correct predictions, also called the true positive rate or the sensitivity is compared to the fraction of false alarms, also called the false positive rate If the classification depends on a threshold value h one can plot SEN vs. FPR for different values of h to obtain a ROC-curve (see Fig. 12 for an example).Making no predictions and therefore also no false predictions corresponds to the point (FPR, SEN) = (0, 0).Classifying every link to be critical generates the point (FPR, SEN) = (1, 1).An ROC-curve made by random predictions consists in a straight line with slope 1 through the origin.A perfect indicator creates an ROC-curve that contains the point (FPR, SEN) = (0, 1).Consequently, a classifier is judged to be the better, the nearer the ROCcurve approaches the the point (0, 1), i.e. the upper left corner of the plot.Typically, summary indices such as the area under curve (AUC) (see Figs. 12 and 19) or the distance to the upper left corner of the ROC plot are used to compare ROC curves of several different indicators.

Thresholding approach for determining decision margins
All considerations concerning the rerouting of network flow support the idea that large values of the indicators r ab , max ab and c ab occur, if the corresponding links are critical.Assuming the relation between an indicator variable x ab and the tendency of an link to be critical is linear, we can use a simple discrimination threshold h to separate values that predict critical links from others: x ab > h ⇒ predicted to be critical, x ab ≤ h ⇒ predicted to be stable.Purely topological quantites such as the the topological edge-connectivity τ ab (•) and the edge betweenness centrality ab (-) perform even worse.Note that the ROC curve for λ ab consists of discrere points only as the variable λ ab itself is discrete.The inset shows the entire ROC curves, while the main panel shows a magnification around the point (FPR, SEN) = (0, 1).The curves are based on the simulation of 400 random realizations of the British power grid with heterogeneous generation (cf.section A 2). system: where h is a threshold value that can be optimized for the specific task.(b) Based on the results of linear response theory introduced in section A 4 we propose the following classification system: Bridges are always predicted to be critical.The variable h denotes a threshold value that can be optimized for a specific task.
In the current setting, the number of false positive predictions can be minimized by choosing a high value of h, while the number of false negative predictions can be minimized by choosing a small value of h.
The resulting ROC curves for the identification of critical and stable links are shown in Fig. 13.The proposed classifier based on the redundant capacity (B7) and based  III.Performance of a classification system of critical and stable links according to the redundant capacity as defined in Eq. (B7).Contingency table for the three different values of the threshold h optimized for different tasks: (a) Threshold value h = 0.614 for which the ROC curve approaches the perfect operation point (0, 1) most closely.(b) No false positive results occur for high threshold value h = 0.84.(c) No false negative results occur for a low threshold value h = 0.0493.Results are summarized for 400 random realizations of the british power grid with heterogeneous generation (cf.section A 2).
on linear response theory (B8) closely approach the perfect value (FPR, SEN) = (0, 1).They clearly outperform classification schemes based on the local measures load L ab and flow |F ab |.Purely topological quantities such as the the topological edge-connectivity τ ab and the edge betweenness centrality ab perform even worse (cf.also [12]).
Finally, the threshold value h can be chosen according to the specific application of the classification system.A common choice is the point which has the smallest distance to the perfect operational point (FPR, SEN) = (0, 1).This choice represents a compromise between the conflicting goals of large sensitivity an small false positive rate.For the present network data set it yields the value h = 0.614 for the classification system (B7) and h = 0.977 for the classification system (B8).In these cases, the absolute fraction of false predictions lies below 0.75% or 1.39%, respectively (cf.Tables III (a) and IV (a)).A different strategy would be to minimize the number of false positive results.For the given network data this number can be reduced to zero by choosing a high value of h (cf.Tables III (b) and IV (b)).Notably, the sensitivity still remains reasonably large in this case.Similarly, the number of false negative results can be reduced to zero by chosing a small threshold value (cf.Tables III (c) and IV (c)).This strategy detects all critical links but also yields a comparably large number of false positive predictions.
The results are summarized in the contingency tables III and IV .In addition to the absolute number of true and false predictions, we also give some statistical quantities to characterize the classifier: the sensitivity (B4), the specificity SPE := 1 − FPR as well as the positive   Combining the redundant capacity and the linear response theory according to Eq. (B3) yields a combined indicator c ab > h ⇒ predicted to be critical, c ab ≤ h ⇒ predicted to be stable. (B9) As shown by the ROC curve in Fig. 13 the combined indicator further improves the performance.Most importantly, we can achieve a sensitivity of 100 % for the current dataset by choosing a threshold h ≤ 0.99021.For this choice the false positive rate is still below 2.1%.Using a threshold h = 1.28 reduces the false positive rate to zero at a sensitivity of 96.65%.

A Bayesian approach for classifying critical links
In the previous Sec.B 3, values of indicator variables are selected by imposing a threshold.This simple and robust approach is sufficient, if the relationship between the values of the indicator variable and the prediction  success is monotonous.A more general approach that does not require any knowledge about the system under study consists in classifying suitable values of indicator variables through conditional probability density functions (CPDFs), i.e., by using naive empirical Bayesian classifiers [61,62].The resulting ROC curves are proper ROC curves, i.e. , they are concave and consequently have a monotonously decreasing slope [63].This effect is obtained by choosing relevant indicator values according to the probability that the link (a, b) is critical given a indicator value x ab .To express this in formulas, we introduce the binary event variable Decision making, i.e. here predicting criticality of the link is then done using a probability threshold h p(χ = 1|x ab ) ≥ h ⇒ predicted to be critical, p(χ = 1|x ab ) < h ⇒ predicted to be stable, (B11) with x ab denoting the value of a indicator variable computed for the link (a, b).Changing h from the global maximum of the CPDF p(χ = 1|x ab ) to 0, generates an ROC curve starting from the lower left corner to the upper right.Although this approach is mathematically well defined, there are several (technical) issues that have an influence on the resulting ROC curves.In general one observes an increase in the area under curve (AUC), with increasing Cumulative probability distribution function (CPDF) for all indicator variables calculated for the British grid and heterogeneous power generation.In order to cover the wide range of indicator variables, the binning was done using irregular bin sizes on logarithmic scales, if appropriate.Number and size of bins were adapted such that each bin is based on at least two counts.number of bins used to estimated the CPDF from a given data set.However, allowing an arbitrarily large number of bins leads to ROCs which are specific for the data set under study (over-fitting) and are irreproducible when using other data sets generated by the same system under study.Therefore we determine the number of bins in an adaptive way, i.e. by choosing it such that each bin of the CPDF has at least two entries.Depending on the range of indicator variables it can be more appropriate to use the logarithm of a indicator variable and not the variable itself to determine suitable binnings.
Comparing the ROC curves in Figs. 13 and 15 shows that the Bayesian approach yields qualitatively similar results as the linear classifier.Additionally, curves generated through the Bayesian approach are smoother since they are proper ROC curves [63], i.e., their slope is monotonously decreasing.Classifying critical links based on the redundancy of the network works extremely well also for more complex power grid model such as the third order model introduced in section A 1 c.We simulate the dynamics after the breakdown of a single link for random model networks based on the topology of the British high-voltage grid (cf. Figure 6).Half of the nodes are randomly chosen to be generators or consumers with P j = ±1 s −2 .All nodes have the nominal voltage U = 1 for all nodes (cf.[52] for the choice of parameters).The ROC curve obtained from the simulation of 200 random networks (see figure 16) shows that the indicator based on the redundant capacity significantly outperforms the indicators based on the flow, topological edge-connectivity or edge betweenness centrality.

b. Critical links in the load flow calculations
Load flow calculations are routinely carried out by the grid operators to analyze the current state of the grid as well as the security after the breakdown of a single transmission line (the so-called n − 1 case).We test the novel classification schemes for telling apart critical from stable links for load flow calculations, including ohmic losses and reactive power flow.
Here, we do load flow calculation for synthetic model networks based on the topology of the British power grid.As before we consider two scenarios for generation and consumption.In the homogeneous scenario, we assume that half of the nodes generate the power P gen = P 0 and half are consumers with power demand P con = −P 0 and

FIG. 17.
Performance of indicators for critical links in the load flow calculations.Simulations are carried out for model power grids based on the topology of the British highvoltage grid (cf. Figure 6) for (a) the heterogeneous case (ten nodes randomly selected as generators) and (b) the homogeneous case (half of the nodes randomly chosen as generators).Shown is the ROC curve for the different indicators: the ratio r ab ( ), the apparent power flow |S ab | (− • −), the norm of the complex current |I ab | (− − −), and the topological edgeconnectivity τ ab (•).ROC curves are calculated using the Bayesian prediction scheme described in Sec.B 4 using data from 100 random realizations of the network.Bridges are considered trivially critical.
Q con = −0.33P0 .All transmission lines are assumed to have the same impedance Z 0 = 0.1 + i 0.5 p.u..For the heterogeneous network, 10 of the 120 nodes are randomly chosen as generators (P gen = 11P 0 ) and the rest are set to be consumers (P con = −P 0 , Q con = −0.33P0 as above).The transmission lines connecting the generators to the grid are assumed to be stronger, i.e. to have half the impedance as the remaining ones (Z = 0.5Z 0 ).For both networks the terminal voltage of the generators |U gen | is set to be 1 p.u. and we set P 0 = 1MW.For these networks we perform load flow calculations to determine the normal operational state of the grid, commonly called the n − 0 basis case.If no solution can be found, we discard the random realization.
We then simulate the failure of a single transmission line (the n − 1 case) by removing one link (a, b) from the network.The post-fault condition of the network is then determined by applying load flow calculations again for the modified topology using the the pre-fault stable state as initial condition for the solver.The grid is considered to be stable if a solution can be found for the post-fault topology, and the apparent power flow |S ij | carried by each line does not exceed the security limit K ij of this line.For better comparability with the previous sections we refer to K ij as the capacity of the line (i, j).In other words, the grid remains stable if none of the other lines is overloaded when the transmission line (a, b) fails.If not, the grid is assumed to be unstable and the transmission line (a, b) is said to be critical.
In load flow calculations the definition of the redundant capacity of a transmission line is slightly different from the one for the oscillator model, since the power flow between nodes is complex.For the homogeneous network, the transmission capacity is identical for all links, which is defined as 1.2 times the maximal apparent power flow in the stable operation state in the n − 0 case, For the heterogeneous network, the transmission capacity of the links connected to the generators K LF gen is set as 2 times the capacity of the links connecting only consumers K LF con : K LF gen = 2K LF con , where K con is defined regarding the maximal apparent power flow on the consumer connecting lines maximal apparent power flow in the stable operation state in the n − 0 case, By this definition, each transmission line in the stable operation state of the initial pre-fault networks is working safely without any overload.The redundant capacity of a transmission line K red ab is then defined by the algorithm introduced in section A 3. The only difference is that the flow F ij is replaced by the apparent power flow |S ij | times the sign of the active power flow P ij , which indicates the direction of the net power flow between nodes, i.e.The linear flow model can be considered as the DC approximation of the load flow calculations.Therefore, the simulation procedures and the parameter settings are basically the same, except for the following differences: (1) The magnitude of the voltage is fixed for all nodes, while the reactive power Q k is arbitrary.(2) Ohmic losses are neglected such that the admittance of the transmission lines is purely imaginary, Y 0 = iB = i2 p.u.. (3) The flows are defined as Similar as in the load flow calculations, a transmission line is considered as critical when its failure causes a secondary overload of at least one other transmission lines.The overload of a line is defined as following: the maximal flow in the network exceeds the line capacity, which is defined for the linear flow model as

Critical link identification on alternative topologies
We test the performance of the novel indicators by extensive numerical simulations of the power grid dynamics for various network topologies, as described in Sec.A 2, using the oscillator model.In Fig. 19 we compare AUCs for different indicators and different ways of prediction: making random predictions, using a indicator as a linear classifier, using a Bayesian classifier for in-sample prediction and using a Bayesian classifier with separation of test and training data set.In the later case, CPDFs were evaluated using 50% of the available data and ROC curves were generated using the remaining 50%.The composition of test and training data set in terms of numbers of bridges, stable and critical links was chosen randomly.Mostly, linear and Bayesian approach yield comparable results, with the linear classifier being slightly better.This is due to the fact that the relation between the value of the indicator variable and its prediction success is in A closer inspection reveals that the predicted maximum load max performs better than the ratio r ab for the datasets with homogeneous supply, i.e. equal number of generators and consumers.One reason for this finding could be the fact that the total number of critical links is higher for homogeneous than for heterogeneous supply.Furthermore we find that the performance of all indicators is lowest for the Scandinavian grid with homogeneous supply.This model network is the one with the highest fraction of critical links in total.

8 FIG. 1 .
FIG. 1. Limits to predicting critical links by local measures.(a) Stationary loads in a coarse-grained model of the British power grid.For three links (a, b) the loads L ab = F ab /K ab are indicated.(b) The load does not necessarily indicate whether a local failure causes a desynchronization and thus a major malfunction.(b1) Highly loaded link (labeled 'link 1' in panel a) induces desynchronization and is thus critical.(b2) Moderately loaded link '2' still induces desynchronization and thus is also critical.(b3) In the same network, even a more heavily loaded link '3' does not yield desynchronization and leaves the network fully functional.Accordingly, link '3' is highly loaded but non-critical (stable).We analyze critical links for test networks based on the topology of the British high-voltage transmission grid [10, 28]: Ten nodes are randomly chosen as generators ( , Pj = +11 P0) and the others as consumers (•, Pj = −P0), with P0 = 1 s −2 .Lines connecting to generator nodes have a larger transmission capacity Kij = 2 K0 than the remaining links with Kij = K0 = 15 s −2 .See appendices A 2 and B 5) for additional test networks.

FIG. 2 .
FIG. 2. Redundant capacity indicates flow rerouting options.(a) Ratio between flows and redundant capacities (color coded), with the three links from Fig. 1 indicated by arrows.(b) Dominant rerouting paths.(b1,b2) For the two links (1, heavily loaded) and (2, moderately loaded), the flows to be rerouted are larger than the redundant capacity.The bottlenecks on the rerouting path are indicated by dashed red arrows.Thus |F ab /K red ab | > 1 and these links are critical.(b3) For link 3, whereas more heavily loaded than link 2, the flow to be rerouted is smaller than the total available redundant capacity.Thus |F ab /K red ab | < 1 and the link is stable.

FIG. 3 .
FIG.3.Renormalized linear response theory predicts critical links.(a) Renormalized linear response not only predicts which links are critical (red squares) and which are stable (green disks), it also accurately tells the maximum loads for stable links (solid line, no fit parameter).(b) Response as a function of the load L ab of the failing link does grow proportionally to the load, but also depends on other factors.(c) Response grows proportionally to the overlap |q • v2| with the Fiedler vector.The solid straight line with slope slope 1/λ2 gives a strict lower bound see Eq.(10).Networks and parameters as in Fig.2.
FIG. 4.Improved quality of critical link prediction.A Receiver operating characteristic curves indicate a multi-fold increase in prediction quality using the classifiers (4) and (9) in comparison to load or flow.(Statistics for 66000 links from 400 random grid realizations; model, topologies and parameters as in Figure1) FIG. 6.Model power grid based on the topology of the British high-voltage power transmission grid[10,28].Half of the nodes are randomly chosen as generators ( ) and consumers (•), respectively, with Pj = ±1 s −2 (homogenous case).In the current example 13 bridges and 4 further links are critical (marked by arrows).All of them are faithfully identified using the novel predictors proposed in this manuscript.
Input: Capacity matrix K, Initial flow matrix F , link a → b Output: Redundant capacity K red ab of the link a → b.Step 1: Delete the link (a, b) from the effective network: K ab , K ba , F ab , F ba ← 0. Step 2: Initialize K red ab ← 0. Step 3: Calculate the residual capacity matrix K f ← K − F .Step 4: Construct a shortest path p from a to b in the directed graph defined by the matrix K f .Step 5: If there is no path from a to b: STOP.

8 FIG. 9 .
FIG. 9. Prediction of flows by original and renormalized linear response theory.(a) Transmission line loads in the steady state in a coarse-grained model of the British high-voltage transmission grid.(b,c) Change of flows |F ij | − |Fij| in the south-eastern part of the grid after a perturbation.We analyze (b) the loss of 10 % of the transmission capacity and (c) the complete breakdown of the marked link.The prediction by linear response theory (A40) on the left-hand side is compared to the actual value obtained from solving the steady state condition for the original and the perturbed network on the right-hand side.Note the different color scales.

FIG. 10 .
FIG. 10.Quality of load prediction.(a) Histogram of the actual maximum load max ab = maxij |F ij /Kij| after a the failure of a single link and the prediction based on the linear response result (A42).The prediction is very accurate in the majority of all cases.Data has been collected for all stable links in 400 test networks with the topology of the British grid and heterogeneous generation (see section A 2 for details).(b) Histogram of the relative error of the prediction ( max predicted − max actual )/( max actual ).
FIG. 11.The response is proportional to load of the perturbed link and the inverse algebraic connectivity.(a,b) Response as a function of the load L ab of the failing link.The response of a given network is characterized by the norm of the phase difference ξ and the columns of the susceptibility η •←ab .In addition, the phase difference is significantly enhanced if the overlap to the Fiedler vector | q • v2| is large, as indicated by the color code.(c,d) The global response, averaged over all links is proportional to the inverse algebraic connectivity 1/λ2.Shown is the slope of the phase response (solid line in (a)) and the matrix norm of the susceptibility η for networks with different values K0 = 10, 12, . . ., 40 s −2 and thus different connectivity.The network topology is the same as in Fig.9.
A44) it is positive for the nodes in the first fragment and negative for the nodes in the other fragment.The overlap is then given by | v 2 • q| = 0 if a and b are in the same fragment N1+N2 N1N2 if a and b are in different fragments.

FIG. 12 .
FIG.12.An example for the calculation of an ROC-curve and the area under curve (AUC).

1 FIG. 13 .
FIG. 13.Receiver operating characteristic curves of different indicators for critical and stable links.The ROC curve shows the fraction of correct predictions of the classifier vs. the fraction of false alarms for various values of the threshold value h.The proposed classifiers r ab ( ), max ab (-) and c ab ( ) closely approach the perfect value (FPR, SEN) = (0, 1).They clearly outperform classification schemes based on the local measures load L ab (−−−) and flow |F ab | (−•−).Purely topological quantites such as the the topological edge-connectivity τ ab (•) and the edge betweenness centrality ab (-) perform even worse.Note that the ROC curve for λ ab consists of discrere points only as the variable λ ab itself is discrete.The inset shows the entire ROC curves, while the main panel shows a magnification around the point (FPR, SEN) = (0, 1).The curves are based on the simulation of 400 random realizations of the British power grid with heterogeneous generation (cf.section A 2).
(a) Threshold value h = 0.977 for which the ROC curve approaches the perfect operation point (0, 1) most closely.(b) No false positive results occur for a high threshold value h = 1.066.(c) No false negative results occur for a low threshold value h = 0.733.Results are summarized for 400 random realizations of the British power grid with heterogeneous generation (cf.section A 2).
n TP n TP + n FP and the negative predictive value NPV := n NP n TP + n FN .
(a) Threshold value h = 1.044 for which the ROC curve approaches the perfect operation point (0, 1) most closely.(b) No false positive results occur for a high threshold value h = 1.289.(c) No false negative results occur for a low threshold value h = 0.99.Results are summarized for 400 random realizations of the British power grid with heterogeneous generation (cf.section A 2).
FIG. 14.Cumulative probability distribution function (CPDF) for all indicator variables calculated for the British grid and heterogeneous power generation.In order to cover the wide range of indicator variables, the binning was done using irregular bin sizes on logarithmic scales, if appropriate.Number and size of bins were adapted such that each bin is based on at least two counts.

FIG. 15 .
FIG. 15.ROC curves of different classifiers for critical links using (a) the linear classifier and (b) the Bayesian prediction scheme: the ratio r ab ( ), the predicted max.load max ab (-), the combined indicator c ab ( ) the flow |F ab | (− • −), the load L ab (− − −), the edge betweenness centrality ab (-) and the topological edge-connectivity τ ab (•).The curves are based on the simulation of 400 random realization of the British power grid with heterogeneous power generation.

FIG. 16 .
FIG. 16.Performance of indicators for critical links in the third order model.Simulations are carried out for model power grids based on the topology of the British high-voltage grid (cf. Figure 6) with half of the nodes being generators and consumers with Pj = ±1 s −2 .Shown is the ROC curve for the different indicators: the ratio r ab ( ), the flow |F ab | (− • −) , the edge betweenness centrality ab (-) and the topological edge-connectivity τ ab (•).Results are collected for 200 random realizations of the network.(b) ROC curves using the Bayesian prediction scheme.
we assume B k = 20 for all links and ∆X (d) k

FIG. 18 .
FIG.18.Performance of indicators for critical links in the linear flow model.Simulations are carried out for model power grids based on the topology of the British high-voltage grid (cf.Figure6) for (a) the heterogeneous case (ten nodes randomly selected as generators) and (b) the homogeneous case (half of the nodes randomly chosen as generators).Shown is the ROC curve for the different indicators: the ratio r ab ( ), the initial flow |F ab | (− − −), and the topological edgeconnectivity τ ab (•).ROC curves are calculated using the Bayesian prediction scheme described in Sec.B 4 using data from 100 random realizations of the network.

F
LF ij := sgn(P ij ) • |S ij |. (B14)As indicators for the criticality of a transmission line (a, b), we test the apparent power flow |S ab |, the norm of the complex current |I ab |, the ratio r ab = |F LF ab /K red ab |, the topological edge-connectivity τ ab and the edge betweenness centrality ab .The novel indicator based on the redundant capacity r ab shows a much better predictive performance than other indicators, see Figure17 .
c. Critical links in the linear flow model networks.The indicators we test include the magnitude of the initial flow of the lines |F ab |, the ratio of the initial flow and the redundant capacity r ab = |F ab /K red ab |, the topological edge-connectivity τ ab and the edge betweenness centrality ab .The indicator r ab based on redundant capacity significantly outperforms the other indicators (Figure 18).
FIG.19.Summary of the performance of different indicators for various network topologies.Shown is the AUC for critical links using the linear classification scheme (red solid lines), the Bayesian prediction scheme in sample (dark blue, dashed) and the Bayesian prediction scheme out of sample with a separation of test and training data set (light blue dashed lines).Confidence bounds (yellow) were estimated making random predictions.Data averaged over 400 random realizations for the British and the IEEE grid and 200 for the Scandinavian grid, respectively.

TABLE I .
Table I summarizes given and unknown variables that characterize the different types of buses.Suppose we are interested in a power transmission net-Characteristics of the three types of buses in load flow calculation.

TABLE IV .
Performance of a classification system of critical and stable links based on linear response theory as defined in Eq. (B8).Contingency table for the three different values of the threshold h optimized for different tasks:

TABLE V .
Performance of a classification system of critical and stable links based on the combined indicator c ab as defined in Eq. (B9).Contingency table for the three different values of the threshold h optimized for different tasks: