Optimal percolation in correlated multilayer networks with overlap

We show that the size of the optimal percolation set of a multilayer network is substantially affected by the presence of interlayer degree correlations and edge overlap. We provide extensive numerical evidence to confirm that the state-of-the-art optimal percolation strategies normally fail to identify minimal percolation sets in synthetic and real-world correlated multilayer networks, thus underestimating their fragility. We propose a family of Pareto-efficient strategies for optimal multilayer percolation, that take into account both intralayer and interlayer heuristics, and can be easily extended to multiplex networks with an arbitrary number of layers. We show that these strategies consistently outperform existing attacking algorithms, providing interesting insights about the interplay of correlations and overlap in determining the hyperfragility of real-world multilayer networks.

Network percolation theory has been recently shaken by the discovery that interdependencies and feedback loops between interacting networks change the character of the percolation transition and make it explosive [1][2][3].These results have acquired even more relevance in the last few years, due to the increasing experimental evidence about real-world systems whose structures are naturally represented as multiplex [4,5] or multilayer networks [6].
Random percolation in multiplex networks is nowadays quite well-understood [7][8][9][10][11], and the wide spectrum of possible percolation transitions, from abrupt to continuous ones [12][13][14], has been successfully related to some of the structural properties of these networks, such as the presence of interlayer degree correlations and edge overlap [10,[15][16][17][18].However, quite often targeted attacks can potentially drive a system to collapse by knocking down a much smaller fraction of nodes than required by random attacks [19][20][21][22].Hence, optimal percolation, that is the problem of finding the minimal fraction of nodes whose removal would irreversibly fragment the system, has been extensively studied both in single-layer networks [23] and, more recently, in multilayer networks as well [24].Interestingly, the single-layer optimal attack strategies [25][26][27][28] cannot be easily extended to the multilayer case, resulting in an interesting and quite active line of research [24,29].Although correlations and overlap are indeed a salient aspect of all real-world multiplex networks [6,15,16,30], the few strategies for optimal multiplex percolation proposed so far have been mainly tested on synthetic uncorrelated multilayer networks, thus neglecting inter-layer degree correlations and edge overlap.
In this work we fill this gap by studying the problem of optimal percolation in multilayer networks with nontrivial interlayer degree correlations and non-negligible edge overlap.We find that current algorithms for optimal percolation systematically underestimate the fragility of correlated multilayer networks, by overestimating the size q of the minimal set of nodes to knock down in order to destroy the mutually connected giant component.We introduce a family of Pareto-efficient algorithms [31,32], which simultaneously take into account layer-based and genuinely multi-layer node properties, and we show that these algorithms consistently provide smaller critical sets in correlated multilayer networks.

I. SIMPLE OPTIMAL PERCOLATION STRATEGIES
Let us consider a multiplex network M with N nodes and two layers.The undirected and unweighted edges on each layer are encoded in the adjacency matrices and only if nodes i and j are the endpoints of an edge at layer α.Two nodes of M belong to the same Mutually Connected Component (MCC) if there exists at least one path on each layer that connects them and traverses only nodes belonging to the same MCC.We focus on the relative size of the mutually connected giant component (MCGC), that is the maximal subgraph consisting of mutually connected nodes [1].In particular, we are interested in finding the smallest set of nodes which, if removed, would reduce the size of the MCGC to O(N 1/2 ).We call this set critical set or attack set, and we denote its relative size with q.Optimal percolation consists into assigning a score to each node, based on some structural indicator [23,33,34], and then iteratively removing nodes in decreasing order of their score.As confirmed by recent studies [24,29], single-layer attack strategies cannot be easily generalised to the case of multiplex networks, mainly because it is not immediate to combine node scores on different layers to obtain a meaningful ranking.The authors of Ref. [24,29] proposed several ways of integrating scores based on popular single-layer strategies, namely: (i) rankings based on the sum or product of the degrees in the two layers (HDA); (ii) a generalisation of the so-called CoreHD algorithm (CoreHD) [26]; and (iii) a generalisation of the Collective Influence Propagation algorithm (CI p ) [35].However, the recently proposed Effective Multiplex Degree (EMD) strategy [29] consistently improves on all of them by making use of a node score based on the multilayer structure of the system.This is evident in Fig. 1(a), where we show the percolation diagrams obtained by HDA and by two variants of EMD on a duplex with uncorrelated Erdös-Renyi layers.This case corresponds to the results reported in Ref. [29], where the authors showed that the critical set found by EMD is consistently much smaller than that found using HDA.The result is not surprising, since HDA ranks nodes using only their degrees at each layer, while EMD takes into account the multilayer structure of the system, including interlayer adjacency.

II. PARETO-EFFICIENCY FOR MULTI-OBJECTIVE OPTIMISATION
Our main hypothesis is that it should be possible to obtain even smaller attack sets by combining layer-specific and genuinely multilayer information.We use here the concept of Pareto efficiency [31,32], which was originally devised to concurrently optimise multiple cost functions.The idea is illustrated in Fig. 2. We consider a set of m node descriptors which we deem relevant for multilayer percolation, so that each node i is associated to the vector of ranks induced by each of the m scores and is mapped into a point of an m-dimensional space C. Assuming that optimal attack sets consist of nodes who are maximising all the structural descriptors at the same time, we can employ the concept of dominance strict partial order [32] to identify Pareto-efficient nodes in the space C. A point is considered Pareto-efficient if no single score associated to node i can be improved without hindering the other scores associated to node i.In general, for a given set of points there are more than one Pareto-efficient points, which constitute the so-called Pareto front for that set (see Fig. 2).
We considered the ranking of nodes induced on each layer by the three best-performing single-layer optimal percolation strategies, namely, Collective Influence (CI) [25], CoreHD [26], and Believe Propagation Decimation (BPD) [27], and we combined each of them with the ranking induced by EMD [29], which is the bestperforming optimal multiplex percolation strategy.We constructed the critical sets by iteratively removing the Pareto-efficient point having minimal Euclidean distance from the ideal point (see caption of Fig 2).Then we recompute the the set of Pareto-efficient points and iterate.By looking at Fig. 1(a), where the two layers have no interlayer degree correlations and no edge overlap, we see that the combination of EMD and Core-HD provides a sensibly smaller critical set than EMD or HDA alone.Instead, if a multiplex network has maximally disassortative interlayer degree correlations (Fig. 1(b)) or high edge overlap (Fig. 1(c)) then the relative performance of each algorithm actually depends a lot on the structure of the multiplex.For instance, EMD still outperforms HDA by a large margin when the multiplex has no edge overlap and markedly disassortative degree-degree correlations (Fig. 1(b)), while EMD is the worst algoritm when the edge overlap is not negligible.In all the cases, the algorithms based on Pareto-efficiency perform much better than both EMD and HDA.Notice that the CI-EMD algorithm performs better when we iteratively remove the entire Pareto-front at each iteration instead of selecting the closest point to the ideal point.

III. THE ROLE OF EDGE OVERLAP AND INTERLAYER DEGREE CORRELATIONS
The edge overlap of a two-layer multiplex measures the fraction of edges that are present on both layers [6,16,36].It can be measured as o ij and Θ(•) is the Heaviside step function.In Fig. 3(a) we plot the value of q obtained by each of the seven algorithms considered, as the edge overlap is increased from o = 0 (each edge present in only one of the two layers) to o = 1 (the two layers are identical).We notice that in general q is an increasing function of o, and for some algorithms the increase is quite substantial.For instance, for o = 1 EMD finds a critical sets that is 70% larger than that obtained for o = 0, while HDA finds a minimal set that is more than 50% larger than that obtained at o = 0.The relative performance of the seven algorithms is a function of o as well, meaning that an algorithm that performs best for a specific value of edge overlap might become the worst one for a different value of overlap.This is evident by looking at the inset of Fig. 3(a), which shows that the Kendall's τ correlation coefficient between the ranking of algorithms at o = 0 and the rankings obtained for o > 0 is a decreasing function of o.Moreover, the ranking of algorithms at o 0.4 is practically uncorrelated to the ranking observed for o = 0. Interestingly, the best (smallest) critical set is always obtained by one of the methods which combine layer-based and genuinely multi-layer node properties through Pareto-efficiency, with BPD-EMD consistently outperforming all the other strategies for o > 0.4.Similar results are obtained when the two layers have scale-free degree distributions, as shown in Fig. 3(b), although the typical values of q are overall smaller.This indicates that the heterogeneity of the degree distribution of each layer has some impact on the efficiency of each attack strategy, but the presence of edge overlap effectively determines the relative performance of different strategies.
Inter-layer degree correlations are known to have a substantial effects on the vulnerability of many real-world systems [30], and are responsible for consistent shifts in the position of the random percolation threshold [12,37].Several studies have found that maximally disassortative interlayer degree correlations improve the robustness of multiplex systems to both random [15,30], and targeted attacks based on the selection of nodes with the largest degrees [15].However, most of those studies have only considered the case of duplex systems with identical degree distributions on the layers, and either maximally positive or maximally negative correlation.Here we used the procedure explained in Ref. [16] to tune interlayer degree correlations between the maximally disassortative case (also called Maximally Negative (MN)) and the maximally assortative one (Maximally Positive (MP)).In order to isolate the effect of inter-layer degree correlations, we studied the performance of the seven targeted attack strategies as a function of the interlayer degree correlation coefficient ρ, imposing that each realisation of the multiplex had o ≈ 0.
In Fig. 4(a) we plot q as a function of ρ for the seven algorithms considered.Also in this case, targeted attacks based on Pareto-efficiency yield the smallest critical sets, and the discrepancy between the overall best strategy (CoreHD-EMD) and the worst one (HDA) is maximal when ρ −1.In particular, the critical set found by CoreHD-EMD when ρ −1 is around 19% (much smaller than the one found for ρ 1, i.e., around 23%), while HDA finds a much larger critical set (27%), which is even larger than the one it finds for ρ 1 (25%).This result suggests that the presumed increased robustness of multiplex networks with disassortatively correlated degrees is probably just an artefact of the algorithm used to determine the critical set.Indeed, negatively correlated multiplex systems are hyperfragile to targeted attacks based on Pareto-efficiency.
Fig. 4(b)-(c) shed light on the interplay between edge overlap and inter-layer correlations.There we consider the sequences of multiplex networks obtained by adiabatically increasing ρ while keeping the edge overlap fixed at a certain value (respectively, o = 0.2 in panel (b) and o = 0.4 in panel (c)).To obtain those sequences, we first increase the value of interlayer degree correlation [16]) and then we set the desired value of edge overlap through biased edge rewiring [38,39], thus preserving the degree sequences of the two layers.Interestingly, for o = 0.2 the size of the critical set q is a non-monotonic function of ρ, while for o = 0.4 it is a decreasing function of ρ irrespective of the specific strategy used.This indicates that if the multiplex has non-negligible edge overlap then the presence of disassortative interlayer degree correlations actually improves its robustness.In any case, the relative performance of the algorithms considered clearly depends on both interlayer correlations and edge overlap (notice in the insets of Fig. 4 that the ranking of the strategies remains the same for o = 0.2, while it varies a lot for o = 0.4).The smallest critical set for each value of ρ is consistently obtained by the CoreHD-EMD strategy.and by the second-best strategy (q sub ) in real-world multiplex networks [40], together with the corresponding size of the initial MCC, interlayer degree correlation (ρ), and normalised edge overlap (o norm = o/omax where omax is the maximum overlap in the corresponding configuration model ensemble [38,39]).The performance of each strategy heavily depends on the presence of edge overlap and inter-layer degree correlations.Pareto-efficient strategies normally perform better in multiplex networks with intermediate levels of edge overlap and correlations.
Finally, in Table I, we report the sizes of the critical sets found in several real-world multiplex networks [40] by each of the attack strategies analysed in this work.The systems considered in the table range in size from few dozens to thousands of nodes, with different values of edge overlap and interlayer degree correlations.Unsurprisingly, there is no strategy that works better than the others in all the cases, and in general strategies based on the combination of single-layer and multiplex descriptors through Pareto-efficiency tend to outperform simpler ones.

IV. CONCLUSIONS
Optimal multiplex percolation is characterised by a variety of subtleties.The comparison of several attack strategies has revealed that the results obtained on uncorrelated multiplex networks do not hold if the networks have non-negligible edge overlap and non-trivial interlayer degree correlations.Interestingly, Pareto-efficient strategies consistently outperform currently state-of-theart algorithms and, at the same time, pave the way to the systematic generalisation of single-layer percolation strategies to multiplex networks with an arbitrary number of layers.

FIG. 1 .
FIG. 1. (Colour online) (a) Percolation diagrams for different attack strategies on two-layer Erdös-Renyi networks with N = 10 5 nodes, average degree k = 5, uncorrelated layers (ρ ≈ 0) and no edge overlap (o ≈ 0).In this case, EMD provides a much smaller attack set than HDA, but the concurrent optimisation of CoreHD and EMD through Pareto-efficiency produces a smaller critical set.(b) If the two layers have maximally disassortative interlayer degree correlations (ρ = −1) but still no edge overlap (o ≈ 0), HDA performs sensibly worse than in the uncorrelated case, while CoreHD-EMD finds instead a much smaller attack set and again outperforms EMD.(c) If the duplex has a substantial edge overlap (o = 0.4) and no correlations (ρ ≈ 0), the critical set is much larger than in the other two cases.The presence of edge overlap favours HDA, but the smallest attack set is still found by the CoreHD-EMD method based on Pareto-efficiency.

FIG. 2 .
FIG. 2. (Colour online) Graphical representation of Pareto efficiency for two structural node descriptors r1 and r2.Each node of the multiplex is mapped onto a point in the (r1, r2) plane.The points for which no improvement can be achieved in one objective function without hindering the others are called Pareto-efficient (blue circle and red diamond) and constitute a Pareto front.The Pareto-efficient points are iteratively ranked according to their Euclidean distance from the ideal point (green star), i.e., the point that maximises all the objective functions.In this case, the node associated to the red diamond is ranked first.

FIG. 3 .
FIG. 3. (Color online) Size of the critical attack set q as a function of edge overlap for different attack strategies in duplex networks with N = 10 5 nodes, k = 5, and whose layers have (a) Erdos-Renyi or (b) scale-free degree distributions (γ = 2.6).The plots are obtained by starting from two identical layers (o = 1 and ρ = 1) and then iteratively rewiring the edges of one of the two layers to reduce the edge overlap until we get to o = 0 [38].Again, strategies based on Pareto-efficiency yield the best results.Results averaged over 20 realisations.

FIG. 4 .
FIG. 4. (Colour online) Critical set found by each attack strategy as a function of the interlayer degree correlations coefficient ρ [16] (Erdös-Renyi duplex with N = 10 5 nodes, k = 5), respectively for (a) no edge overlap (o = 0); (b) low edge overlap (o = 0.2); (c) intermediate edge overlap (o = 0.4).The concurrent presence of interlayer degree correlations and edge overlap strongly affects the robustness of a system against targeted attacks.Strategies that simultaneously combine single-and multilayer attacks through Pareto-efficiency consistently detect smaller attacking sets.Results averaged over 20 realisations.

TABLE I
. (Colour online) Size of the minimal attack set (qmin) found by the best-performing attack strategy (in parentheses)