Analytical computation of the epidemic threshold on temporal networks

The time variation of contacts in a networked system may fundamentally alter the properties of spreading processes and affect the condition for large-scale propagation, as encoded in the epidemic threshold. Despite the great interest in the problem for the physics, applied mathematics, computer science and epidemiology communities, a full theoretical understanding is still missing and currently limited to the cases where the time-scale separation holds between spreading and network dynamics or to specific temporal network models. We consider a Markov chain description of the Susceptible-Infectious-Susceptible process on an arbitrary temporal network. By adopting a multilayer perspective, we develop a general analytical derivation of the epidemic threshold in terms of the spectral radius of a matrix that encodes both network structure and disease dynamics. The accuracy of the approach is confirmed on a set of temporal models and empirical networks and against numerical results. In addition, we explore how the threshold changes when varying the overall time of observation of the temporal network, so as to provide insights on the optimal time window for data collection of empirical temporal networked systems. Our framework is both of fundamental and practical interest, as it offers novel understanding of the interplay between temporal networks and spreading dynamics.

A wide range of physical, social and biological phenomena can be expressed in terms of spreading processes on interconnected substrates.Notable examples are the spread of directly-transmitted infectious diseases through host-to-host contacts [1], the spatial propagation of epidemics driven by the hosts' mobility network [2,3], or the diffusion of ideas mediated by social interactions [4,5].These phenomena are the result of a complex interplay between the properties of the spreading dynamics and the network's structural and temporal features, hindering their full understanding.A central fundamental issue characterizing spreading processes is the identification of the critical condition for the wide spreading regime, encoded in the epidemic threshold parameter.This is of critical importance for epidemic containment [1], as well as for the control of the information diffusion [6].Extensive studies have characterized this parameter in the time-scale separation approximation, i.e. when the time scales of the spreading process and of the network evolution are strongly different.This includes the two limiting regimes, quenched and annealed [7,8,9,10,11,12,13,14,15]. In the first case, the network is regarded as static, as it evolves on much slower time scales than the ones characterizing the spreading process.The epidemic threshold is then computed from the adjacency matrix describing the network connectivity pattern [8,7].In the second case, the underlying network evolves so fast with respect to the dynamical process that only its time-averaged properties are relevant for spreading.Approaches like the heterogenous mean field [10], the generating function [11] and percolation theory [9] provide, in this regime, estimates of the threshold in the infinite size limit.
Recently, the extensive empirical characterization of social interactions at different scales and settings [16,17,18,19,20,21,22,23] have shown that networks often display non-Poissonian and non-Markovian temporal evolutions unfolding at time scales similar to the ones of many spreading processes of interest, stressing the need for novel theoretical tools able to overcome current limitations.Much research has focused on spreading processes occurring on time-varying networks [17,21,24,25,26,27,28,29,30,31,32], however only few studies provided so far analytical calculation of the epidemic threshold in specific cases [28,29,30,31,32].They are all based on models for time-varying networks integrating the microscopical laws governing the network evolution, under contextspecific assumptions.An analytical framework for the computation of the epidemic threshold for a generic time-varying network is still missing.To fill such gap, here we present a novel approach that, by reinterpreting the tensor formalism of multi-layer networks [33,34], extends the Markov chain approach adopted for static networks [8,7] to their temporal counterpart.The approach is applied to time-varying network models and empirical networks to provide insights on the role of the network dynamical features (such as activation rate, temporal correlations, temporal resolution and observation time window) on the computed spreading potential.

Derivation of the epidemic threshold
We consider the Susceptible-Infected-Susceptible (SIS) model [1], where individuals (i.e. the nodes of the network) can be in one of the two mutually exclusive states -susceptible or infectious.Susceptible individuals become infectious through contacts with infectious individuals at rate λ times the number of infectious contacts.Infectious individuals become spontaneously susceptible at rate µ.We consider the temporal network forming the substrate of the spreading process to be undirected and unweighted, for simplicity (for the directed and weighted cases, see Supplementary Information).In the following, we will represent it as a sequence of temporal snapshots; as a consequence, time in our framework is discrete.
In order to describe the spreading dynamics on such a substrate we extend the Markov chain equation approach for static networks [7,8] to the case of temporal networks.The SIS propagation on a generic network with N nodes and adjacency matrix A is given by: i is the probability for the node i to be in the infectious state at time t.The Markovian model of Equation ( 1), widely adopted in different fields [35,12], is based on the mean-field assumption of the absence of dynamical correlations among the states of neighboring nodes [36].For both directed and undirected networks [37,38] the study of the asymptotic state yields the derivation of the epidemic threshold (λ/µ) = 1/ρ(A † ), where ρ(A † ) is the spectral radius of the transposed adjacency matrix A † [7,8].This is known to be a lower bound estimate of the real epidemic threshold, approaching the real value with surprising high accuracy given the simplicity of the expression and its derivation [36,39].
We extend this paradigm to a temporal network by letting the adjacency matrix in Equation (1) depend on time: Here A (t) is the adjacency matrix associated to the t-th snapshot of the evolving network.
In order to ensure the asymptotic solution of the SIS process in a generic temporal network we assume periodic boundary conditions for the network dynamics.Being T the total number of network time snapshots, we impose A (T +1) ≡ A (1) .This does not imply any loss of generality given that T may be completely arbitrary.We notice that, as a consequence of the assumed periodic temporal dynamics of A (t) , the asymptotic solution of Equation ( 2) is in principle periodic with period T .We now define a more convenient representation of the coupled dynamics adopting the multi-layer approach introduced in [33].We map the temporal network to the tensor space R N ⊗ R T , where each node is identified by the pair of indices (i, t), corresponding to the node label, i, and the time frame, t, respectively.A multi-layer representation of the temporal network can be introduced through the following rules: • each node, at time t, is connected to its future self at t + 1; • if i is connected to j at time t, then we connect i at time t to j at time t + 1, and, j at time t to i at time t + 1.
The second rule is termed non-diagonal coupling in the multilayer-network framework [34].
The first rule is consistent with the ordinal coupling in such framework [40,34], however, differently from that representation, no links are found connecting nodes on the same layer, as layers do not represent anymore the adjacency matrices of the temporal snapshots.The so-defined network is therefore multipartite, since only pairs of nodes belonging to different layers are linked together (see Figure 1 for a schematic illustration of this transformation).While formally falling into a specific sub-case of the classification introduced in [41], the proposed mapping from the networks' temporal sequence to a multilayer object provides a novel representation of the temporal network that preserves the information relevant for the spreading process.The tensor representation of the obtained multilayer network is the following: Analogously to the definition of A, we can also write in this representation the tensor associated to the SIS dynamics of Equation ( 2), coupling together contagion and network dynamics: The multi-layer representation and the definition of the tensor M introduces a simplified expression for Equation (2).The tensor space can be represented in a single index notation through the isomorphism R N ⊗ R T R N T .In other words, similarly to the definition of the supra-adjacency matrix in [33,42,43,44], we can mask the tensorial origin of the space through the map (i, t) → α = N t + i, with α running in {1, ..., N T }, allowing us to write the network tensor M in the matrix form M provides a network representation of the topological and temporal dimensions underlying the dynamics of Equation ( 2), that here are interrelated and flattened.Its directed nature preserves the causality of the process, while its weights account for the SIS transitions rates.The Markov process is now described by a trajectory in R N T where the state vector pα (τ ) represents the probability of each node to be infected at each time step t included in the interval [τ T, (τ + 1)T ].Consistently, Equation (2) becomes Given that vector p encodes a 1-period configuration, the T -periodic asymptotic state of the SIS process is now mapped into the steady state pα (τ ) = pα (τ − 1).The latter can be recovered as solution of the equilibrium equation: that is formally the same as the stationary condition imposed on the Equation (1) for the static network case, and is similar to the Markov chain approaches used to solve contagion processes in multiplex and interconnected networks [43,44,45].We can then follow [7,8] and linearize the equation to obtain the threshold condition defining the epidemic threshold for the temporal network case.The spectral approach [7,8,43,44,45] yields the critical values of λ and µ above which the epidemic takes off.The spectral radius of M can be simplified with the following relation (see Methods section): where P = T t=1 1 − µ + λA (T −t) represents a weighted version of the accessibility matrix [46], having connectivity weighted by λ and waiting time weighted by 1 − µ.
The quenched and annealed regimes can be recovered within this general framework as particular limiting solutions.In the first case, it is to be noted that the sequence of temporal snapshots naturally defines the minimum time scale of the process.In order to consider a contagion dynamics much faster than the time-varying process of the network, we thus rely on the commonly adopted assumption regarding the temporal network as static, so that A (t) ≡ A. In this particular case, P = (1 − µ + λA) T .Therefore ρ(M ) = ρ(1−µ+λA) = 1−µ+λρ(A).The requirement ρ(M † ) = 1 thus recovers the expression known for the quenched case.
The study of the annealed regime is less trivial.In the assumption that λ and µ are very small, corresponding to a very slow disease dynamics with respect to the time scale of the network evolution, it is possible to replace P with its linear expansion in λ/(1 − µ) yielding where A = t A (t) is a weighted static representation of the network, formed by the sum of all the snapshots.Temporal correlations are lost, and edges count for the number of times they are active during the whole period T .Equation (7) for the epidemic threshold thus simplifies to (λ/µ) c,slow = T /ρ(A) [47], and the aggregated matrix contains all the relevant information for spreading dynamics.

Validation and comparison with stochastic simulations
In the following we validate the analytical method and compare its predictions with the behavior of a simulated SIS spreading process.For this purpose we consider six networks, three of them built from models for time-varying networks and the others obtained from empirical measures.The first network, ER, is formed by a sequence of random Erdős-Rényi graphs [48] with a given number of nodes and edges.It represents a simple and completely uncorrelated example of temporal network.The second network, ACTIVITY, is a realization of the activity driven model [29] where each node is assigned an activity potential, representing the probability of being active in a certain snapshot.Once activated, the node establishes a fixed number of connections that are renewed at each snapshot.We consider a heterogeneous activity distribution, so that the obtained networks are characterized by a temporally uncorrelated sequence of snapshots with a heterogeneous aggregated degree distribution.BURSTY is built from the model introduced in [24] and accounts for a heterogenous activation pattern describing a sequence of homogeneous networks where the inter-contact time is power-law distributed.For all these networks, size and period are chosen arbitrarily, since their choice does not impact the method validation, as discussed more in detail in the following section.As real time-varying networks we chose datasets describing human contacts of different kind: HT09 is the network of face-to-face proximity during a 2.5 days scientific conference [49]; SEX is a 1-year network of sexual contacts between prostitutes and their clients [18]; SCHOOL is a contact network describing one day in a primary school [50].Further information about these networks can be found in Table 1.Size, period and topological properties are constrained by the measurements and are very diverse.
In order to check the validity of the proposed analytical expression, we numerically solve the Markov Equation (2).For given λ and µ, we iterate the equation until the periodic state is reached and compute the average prevalence over a period i MC = i,t p (t) i /(T N ).Predictions are then also compared with the threshold behavior obtained from numerical simulations of the stochastic and microscopic SIS dynamics on the evolving networks.We use the quasi-stationary state method [51] (see Methods section) to measure the average prevalence i sim over the time-series for different values of λ, after an initial transient time is discarded.
Figure 2 shows i MC and i sim as functions of λ for two different values of µ (µ = 0.2 and µ = 0.5) for all networks under study.The average prevalence displays the expected transition behavior.The solution of the Markov chain equation i MC is equal to zero for small values of λ until the critical value of λ is reached, after which a rapid growth is observed signaling an epidemic affecting a finite fraction of the network.The transition is well predicted by the analytical expression of Equation (7).The threshold behavior obtained from numerical simulations is also very similar to the mean-field prediction.The two curves of i MC and i sim are nearly superimposed, showing that the meanfield approximation in Equation ( 2) is valid in all conditions of network size and average connectivity here considered.The presence of correlations shows its effects in proximity of the transition that is smoother for i sim with respect to i MC .This is particularly evident for the network HT09 and is a consequence of its small size (N = 113).In the Supplementary Information we report the analysis of the dynamical correlations.
The good agreement of the computed epidemic threshold with the solution of the Markov chain equation and the numerical simulation results is thus obtained under a range of different temporal network properties (presence or absence of temporal correlations, heterogeneous vs. homogeneous distributions characterizing temporal and structural observables) and sizes (from approximately 10 2 nodes to 10 4 ).In particular, small sizes are generally responsible for the failure of approximations aimed at solving the phase diagram of the system that rely on the statistical properties of the network (see Figure 2 for a comparison with the predicted threshold obtained from the activity-block approximation in the case of the activity driven model [29]).

Optimal data collection time
Available datasets characterizing empirical networks account only for a portion of the real contact process, recorded during a given time window, the extent of which may affect the prediction of the epidemic threshold.One may expect that, when the data-collection period is long enough, the data would report an approximately complete reconstruction of the temporal network properties, thus leading to an accurate estimate of the epidemic threshold.Given the resources needed for the setup of data collection deployments, we explore here the role of the period T aimed at identifying a minimum length of observation of the contact process that is optimal in providing a reliable characterization of the spreading potential.
We thus compute the epidemic threshold from Equation ( 8) for increasingly larger values of the period T up to the entire data-collection time window, for the three empirical networks under study.Figure 3 shows a saturation behavior for λ c , indicating that the data-collection period is long enough to appreciate the collapse and characterise the dynamics.Such behavior and its associated relaxation time strongly depend on the network typical time scale and on the temporal variability of its structure.More in detail, a simple structural measure -the variation of the average degree along the period -is shown to strongly impact the predicted λ c (Figure 3 and Supplementary Information).This is particularly evident in the SCHOOL network, where the daily activity of students determines considerable variations in the average degree then inducing marked oscillations in the resulting threshold.Next to the empirical networks, we also consider the BURSTY network model as it includes non-trivial temporal correlations.In this case, the critical transmission rate λ c rapidly saturates to a constant value (Figure 3), and an even more rapid saturation is observed for the other two network models (Supplementary Information).The average degree is indeed relatively stable so that small temporal windows are enough to fully characterize λ c .
Different values of the recovery rate µ lead in general to similar behaviors of λ c towards saturation, differing essentially by a scaling factor.The effect of µ on the saturation time is instead visible for the BURSTY network.In this case, when the period length is smaller than the average duration the infection, the truncation in the inter-contact time distribution clearly alters the estimation of the threshold.

Conclusion
Being able to provide a reliable and accurate estimation of the epidemic threshold for a spreading process taking place on a given networked system is of utmost importance as it allows to make predictions on the likelihood of an epidemic and also to identify the resources needed for setting up strategies for its containment.With analytical approaches having so far targeted only specific contexts, our framework allows the analytical computation of the epidemic threshold on a generic temporal network, requiring no assumption on the network topology or time variation.The proposed approach is based on the spectral decomposition of the flattened matrix representation of the topological and temporal structure of the network, extending the Markov-chain model introduced for the static network case to its temporal counterpart.Validated against the numerical solution of the model, the predicted epidemic threshold also reproduces with high accuracy the behavior observed in stochastic microscopic numerical simulations of the spreading process and outperforms methods based on network models based on the statistical properties of the network.
For simplicity, we have focused on undirected unweighted networks, however the approach is extensible to the more general case (see Supplementary Information).The technical requirement of periodic conditions does not limit the general applicability of our approach, as the method is valid for an arbitrary period length.Moreover, this feature allows the identification of a minimum length of the observation window of a real sys-tem for contact data collection, highlighting the presence of well-defined properties and patterns characterizing the system that can be captured in a finite time.
Our framework thus introduces a multi-layer formulation of spreading phenomena on time-varying networks that opens the path to new theoretical understandings of the complex interplay of the two temporal processes.

Materials
Proof of Equation ( 8) Computing the eigenvalues of M † means solving the equation det x − M † = 0, where the determinant is computed on the R N T space (det N T ).Given that x − M † is composed of T 2 blocks of size N × N , we can use the findings in [52] to reduce the dimensionality of the problem, i.e. det N T → det N .Moreover, given that several blocks of x − M † are zero, the general result in [52] simplifies to det N T x − M † = (−1) N T det N x T − P .Equation ( 8) immediately follows.

Estimation of the epidemic threshold from numerical simulations
The computation of i sim in proximity of the transition is made difficult because surviving configurations are rare, and a very large number of realizations of the process is needed in order to get substantial statistics.We use the quasistationary state (QS) method [53,51] in order to overcome this difficulty and increase our computational efficiency.QS method is based on the idea of constraining the system in an active state.Every time the absorbing state of zero infected is reached by the system, it is substituted with an active configuration randomly taken from the history of the simulation.In particular, 50 active configurations for each network snapshot are kept in memory.Whenever an active configuration is reached, it replaces one of the 50 with probability 0.2.When the absorbing state is reached, an active configuration is chosen among the 50 of that particular snapshot.For each simulation, after a relaxation time of 3 • 10 3 time steps, statistics is performed during 10 5 time steps.The method produces a time series long enough to accurately compute the observables i sim .
Tables network # nodes period T aggregating window ER [48] 500 13 -ACTIVITY [29] 1000 20 -BURSTY [24] 500 50 -HT09 [49] 113 30 1 hour SCHOOL [50] 787 42 10 mins SEX [18] 6866 13 28 days Table 1: Temporal networks considered for the validation.The first three are synthetic models, while the others are real networks.The ER model is a sequence of random graphs with 500 nodes and 750 edges, so that k = 3.The ACTIVITY model is a sequence of snapshots built with parameters values: ∆t = 1, m = 2, η = 10, γ = 2.8, = 3 • 10 −2 , in notation of [29].The BURSTY network is built with a power law distributed interactivation time, with exponent −2, and cutoff equal to the period of the network.For the real networks, the collection time is the total time spanned by the dataset.(one single realization of the network models is considered in each panel).In the case of the ACTIVITY network model, we also performed 200 realizations of the network model to compare our predictions with the mean-field ones obtained with the activity-block approximation valid for very large networks [29].Results show that the activity-block model predictions are smaller than our analytical predictions (λ ACTIVITY c = 0.065 for µ = 0.2 and λ ACTIVITY c = 0.162 for µ = 0.5 vs. λ c = 0.093 and λ c = 0.233, respectively), such discrepancy being likely due to the small size of the network considered here (10 3 nodes), and highlighting the general nature of our framework.

A.2 More on the correlation between threshold and degree fluctuations
In main paper we show how the oscillations of λ c are associated with the instantaneous fluctuations of the average degree of the network, as the period T varies.As T increases, λ c fluctuations are damped, because they are cumulatively calculated on longer periods.Adding a snapshot to a short period T has indeed a greater relative influence on the threshold than adding it to an already long period.Similarly, damped oscillations are observed when we monitor the cumulative average degree of the network, calculated on all snapshots up to period T (Figure S2).

B Higher order correlations
The quenched mean field approach disregards spatial correlations among the probabilities of nodes being infected.Let X i (t) = 0, 1 be the infectious status of node i at time t.The quenched mean field then assumes that X i X j = X i X j .In order to assess the impact of such approximation, at least in the case of two point correlations, we compute in the simulations the following quantity for each pair of nodes: where the moments are computed on a time interval well after the initial transient: t=t 0 Y (t) / (t 1 − t 0 ).σ ij ≈ 1 indicates that X i , X j are highly correlated, while low values of that quantity indicate no correlations.Figure S3 shows the distribution for σ ij on all possible pairs of nodes, for ER and HT09.For ER almost no correlations are visible among nodes, as expected, except for small fluctuations around σ ij = 0.For HT09, on the other hand, we see that σ ij is peaked around a small but nonzero value, indicating the presence of two point correlations, albeit weak.This is in accordance with the finding that the threshold computation is more accurate for ER than for HT09 (see Figure 2a,d of main paper).

C Weighted and directed networks
For sake of simplicity, in the main paper we deal with undirected unweighted temporal networks, i.e.A (t) † = A (t) and A (t) ij ∈ {0, 1}.Here we briefly show that our methodology can be extended to the more general case, if needed.The proposed approach does not require A (t) to be symmetric (moreover M αβ is not symmetric even when the A (t) matrices are).Therefore a generalisation to the directed case is straightforward, and it is simply based on the replacement of A with A † in the computation of the threshold in the annealed approximation.
The weighted case is tractable too, provided that the probability of transmitting the infection is defined in terms of the weight, according to the given context under study (the weight could for example represent the movements of hosts from one node to another).
It is customary to compute the probability of transmission along a link ij as 1 − (1 − λ) A (t) ij (see, for instance, [1]).When the weights are integer numbers, this means considering the binomially distributed probability of at least one infection given A (t) ij trials.By plugging this contribution into M tt ij , Equation (4) of the main paper thus becomes The computation of the threshold can then be carried out in the same way as described in main paper.

Figure 2 :
Figure 2: Validation of the analytical method and comparison with microscopic numerical simulations.Top panel: network models; bottom panel: empirical networks.Cross symbols represent i MC as a function of λ for two different values of µ (µ = 0.2 in blue and µ = 0.5 in red), i.e. the average prevalence obtained from the numerical solution of the Markov chain Equation (2).Circles represent i sim , i.e. the average prevalence obtained from stochastic numerical simulations of the SIS process, for the same values of µ.The arrows indicate the analytical predictions of the threshold from Equation (8) (one single realization of the network models is considered in each panel).In the case of the ACTIVITY network model, we also performed 200 realizations of the network model to compare our predictions with the mean-field ones obtained with the activity-block approximation valid for very large networks[29].Results show that the activity-block model predictions are smaller than our analytical predictions (λ ACTIVITY

Figure 3 :Figure S1 :
Figure 3: Epidemic threshold estimated from different period lengths.λ c (T ) is the epidemic threshold computed by considering only the first T snapshots of the network.For each panel, blue (red) curve corresponds to µ = 0.2 (µ = 0.5).The scale of the former is on the left side of the plot, while the scale of the latter is on the right.The gray bar chart shows the mean network degree associated to the snapshot at time T .Bar charts are in linear scale and min and max values are placed on the corresponding bars.For the empirical networks, the measure of the real time is also reported.

Figure S2 :
Figure S2: Epidemic threshold estimated from different period lengths.λ c (T ) is the epidemic threshold computed by considering only the first T snapshots of the network.For each panel, blue (red) curve corresponds to µ = 0.2 (µ = 0.5).The scale of the former is on the left side of the plot, while the scale of the latter is on the right.The gray bar chart shows the average of the mean network degree of snapshots up to time T .Bar charts are in linear scale and min and max values are placed on the corresponding bars.The measure of the real time is also reported.

Figure S3 :
Figure S3: Two-point correlations for ER and HT09.Distribution of σ = σ ij for the model ER and the real network HT09.Distributions are computed using λ = 0.3, µ = 0.5 for both networks.Moments are computed over 2 • 10 5 time steps, after a relaxation time of 3 • 10 3 .