Contact-based model for epidemic spreading on temporal networks

We present a contact-based model to study the spreading of epidemics by means of extending the dynamic message passing approach to temporal networks. The shift in perspective from node- to edge-centric quantities enables accurate modelling of Markovian susceptible-infected-recovered outbreaks on time-varying trees, i.e., temporal networks with a loop-free underlying topology. On arbitrary graphs, the proposed contact-based model incorporates potential structural and temporal heterogeneities of the underlying contact network and improves analytic estimations with respect to the individual-based (node-centric) approach at a low computational and conceptual cost. Within this new framework, we derive an analytical expression for the epidemic threshold on temporal networks and demonstrate the feasibility of this method on empirical data.


I. INTRODUCTION
Accurate models of disease progression are valuable tools for public health institutions as they enable detection of outbreak origins [1][2][3][4], assessment of epidemic risk and vulnerability [5][6][7] and, potentially, containment of the spreading at an early stage [5,8]. Mitigation strategies can thus be evaluated and employed without the need to run a large number of Monte-Carlo (MC) realizations.
In recent years, the availability of mobility and contact data with high temporal resolution, so called temporal networks, offers another opportunity to improve analytical predictions [30][31][32][33][34][35]. The timing of links between nodes matters, in particular when the network evolves on * andreas.koher@tu-berlin.de a similar time scale as the spreading dynamics, which led to an increasing interest in the interplay between disease and network dynamics [36][37][38][39][40][41].
One approach to model the states of a individual nodes in a network takes the corresponding probabilities directly as variables in a set of coupled dynamic equations [7,12,19,[42][43][44][45][46]. We will refer to this approach as the individual-based (IB) model, though it is sometimes also called N -intertwined model [19] or quenched mean field [47,48]. However intuitive, the analytic predictability suffers from the simplifying assumption that epidemic states of adjacent nodes are independent.
Recently, a change from a node-centric to an edgecentric perspective has been discussed within different frameworks in order to overcome the inherent limitation of the IB model. These approaches include branching processes [49], message passing [22,50], belief propagation [4] and, the edge-based compartmental model [38]. So far, however, edge-centric models are mostly limited to static topologies. It thus remains an open challenge to account simultaneously for topological and temporal properties of the underlying contact data and hence improve current predictions of the epidemic threshold [7,[51][52][53][54][55].
In this paper, we generalize the dynamic message passing approach from [50] for discrete-time Markovian susceptible-infected-recovered (SIR) spreading to timeevolving networks and derive the epidemic threshold within this new framework. The proposed model takes an edge-centric perspective, because the relevant dynamic equations are based on the set of edges. Furthermore, the framework integrates the complete temporal and topological information of the underlying network into the epidemic model. We will refer to our approach as contactbased (CB) and compare numerical predictions with the widely used IB model that takes a node-centric perspective. Within the CB framework we then derive a new analytic expression of the epidemic threshold for temporal networks and show that the edge-centric approach improves existing results [7,19,42,43,51,52] at a low conceptual and numerical cost. Although both modelling frameworks can in principle account for contact weights that indicate the strength of a connection, we will focus on unweighted networks for simplicity and refer to the Appendix A for an extension of the model. The CB and IB models have been implemented in Python with the source code available on Github [56].
The remainder of this paper is structured as follows: First, we summarize the conceptual framework in Sec. II and formulate in Sec. III the dynamic equations of the IB and CB models. Then, we derive the epidemic threshold for temporal networks within the CB framework in Sec. IV. We compare the edge-and node-centric approaches against Monte-Carlo (MC) simulations in Sec. V and close with a discussion in Sec. VI. The appendix includes an extension to weighted contacts and heterogeneous epidemiological parameters in Sec. A as well as a network analysis of the German cattle trade data Sec. B. Further results and applications of the CB model are summarized in Sec. C.

II. CONCEPTUAL FRAMEWORK
We consider a temporal network G = [G(0), G(1), ..., G(T − 1)] with N nodes and T snapshots sampled at a constant rate. A node l ∈ N represents an individual that is either susceptible, infected or, recovered at a given time t with a corresponding probability S l (t), I l (t) and, R l (t), respectively. Emphasizing the important difference between temporal and static elements, we refer to contacts as time-stamped links (t, k, l) ∈ C ⊂ T × N × N . Here, we denote with N , T and, C the set of nodes, time stamps and, contacts, respectively. We further assume that every contact is of a constant duration and equal to the sampling time of the temporal network. With edges, we refer to the corresponding static elements in the time-aggregated network. In other words, an edge (k, l) ∈ E ⊂ N × N exists if and only if at least one (temporal) contact is recorded between k and l. Here, we denote with E the set of edges. Moreover, we will assume directed edges throughout the paper and represent an undirected contact as two reciprocal contacts.
Following the convention in [22], we denote with k → l a directed edge from k to l, and we indicate edge-based quantities in a similar fashion.
As the stochastic process, we assume a discrete-time SIR model. Here, a susceptible node that is in contact with an infected neighbor contracts the disease with a constant and uniform (per time-step) probability β. Furthermore, we treat the transmission events from multiple infected neighbors as independent and similarly, we interpret potential (integer) edge-weights as independent infection attempts (see Appendix A). Also, we do not account for secondary infections within one time step, i.e., only direct neighbors can be affected. Once infected, the individual recovers with a uniform and constant probability µ independently of the infection process and acquires henceforth a permanent immunity.
Concerning the contact data, we will focus our numerical analysis first on a face-to-face interaction network between 100 conference participants [57]. This so-called proximity graph has a resolution of 20s and the observation time is limited to the first 24h. If necessary, we extend the data set with a periodic boundary condition in time. The time-resolved contacts enable the study of spreading of airborne diseases as well as the propagation of ideas and rumors. The data is available on sociopatterns.org and using the source code on [56], results of this paper can be easily reproduced.
As an illustrative example, we present in Fig. 1 the time-dependent probability that a selected node in the proximity graph is either susceptible (yellow), infected (red) or, recovered (gray). The results are derived from 10 4 Monte-Carlo simulations with the same initially infected node. The trajectories reflect the bursty activity of the underlying temporal network [57] within the first 12h and the subsequent inactive night time.  1. Illustrative examples of a simulated epidemic outbreak from a single initially infected node. Colors give probability that another arbitrarily selected node is in the susceptible (yellow), infected (red) or recovered (grey) state, respectively. Simulation parameters: µ = 2.85 · 10 −4 , β = 100µ, 10 4

MC realizations
As a second source of data with direct relevance to public health, we consider an excerpt of the national German livestock database HI-Tier (www.hi-tier.de). This temporal network comprises the movement of cattle between farms in Germany for the year 2010 with daily resolution. Within the observation window of 365 days more than 3 million transactions have been recorded between over 180,000 farms and traders, respectively. For more details on the graph see Appendix B. Cattle trade is considered an important transmission route for livestock-related diseases such as foot-and-mouth disease (FMD), which broke out in Great Britain in 2001 with estimated costs of 8 billion British Pounds [58]. Therefore, the analysis of the corresponding spatio-temporal graphs is highly relevant to public health institutions.

III. DYNAMIC EQUATIONS
In this section, we will present the mathematical framework to model the stochastic SIR process as outlined in the introduction and Sec. II. Our main focus is the CB model but, in order to facilitate a direct comparison between the node and the edge-based approach, we will begin with a short overview of the IB model.

A. Individual-based model
In the IB model the marginal probabilities S l (t), I l (t) and, R l (t) for all l ∈ N enter directly a set of 3 × N coupled dynamic equations. The probability to transmit pathogens from node k to l upon a temporal contact is given by βI k (t). For convenience, we introduce an indicator function with a k→l (t) = 1 if a (directed) contact from k to l exists at time t and a k→l (t) = 0 otherwise. Then, the probability for node l to receive no infection at time t from any of its neighbors factorizes to k [1 − βa k→l (t)I k (t)] and k ∈ N . With this, the marginal probability S l (t + 1) can be expressed by the probability S l (t) to be susceptible in the previous time step t and not contract the infection within the interval [t, t + 1). In the IB model, the joint probability factorizes by assumption and we obtain Here, the crucial simplification is to treat the epidemic states of l and its neighbors as mutually independent, which is sometimes referred as neglecting dynamic correlations [59]. The marginal probability I l (t + 1) follows from two independent contributions: (i) The out-flux µI l (t) indicates the transition from the infected to the recovered state. (ii) The in-flux ∆S l (t) = S l (t) − S l (t + 1) reflects the probability that node l is newly infected at time t+1. Combining both contributions leads to (2) The set of 2×N coupled dynamic equations in Eqs. (1) and (2) thus constitutes the IB model for temporal networks. The remaining marginal probability R l (t) to find node l in the recovered state follows from the conservation condition S l (t) + I l (t) + R l (t) = 1 for all l ∈ N . Finally, we will assign a probability z l = S l (0) that node l is initially susceptible as well as I l (0) = 1 − z l and R l (0) = 0 throughout the paper.
Though intuitive and in many cases sufficient from a modelling perspective, the limits of the IB model are difficult to estimate due to the ad-hoc factorization of the joint probability in Eq. (1). Even for the simplest network with two nodes connected by an undirected static edge, the IB approach can deviate significantly from the expected outcome as illustrated in [60]. In their example recovery is neglected for simplicity and only the first node is infected initially with some probability 0 < z 1 ≤ 1. Counter-intuitively, the probabilities to find each node in the infected state converge to I 1 (∞) = I 2 (∞) = 1 according to the IB model, independent of the initial condition z 1 . This is because integrating Eqs. (1) and (2) admits a probability flux from the outbreak location to the adjacent node and back to its origin again. This mutual re-infection, coined echo chamber effect in [60], appears because we neglect the fact that the probability I 2 , to find the second node in the infected state, is conditioned on the state of the first node and thus the factorization in Eq. (1) is not justifiable.
In an arbitrary network an initially infected node leads to a cascade of secondary infections within which all marginal probabilities are highly correlated. An accurate model excludes these previously infected nodes from those that can potentially contract the infection in the future. We will discuss in the next section how a shift from a node-centric to an edge-centric view can take into account some such dependencies.

B. Contact-based model
We begin with a slightly different approach to the marginal probability S l (t). First, we note that l is susceptible at time t, if it was susceptible initially (with probability S l (0) = z l ) and has not contracted the infection from any of its neighbors up to time t. We assign the probability Φ l (t) to the latter statement. Thus without introducing any approximation at this stage, we can write In order to determine Φ l (t), we make the assumption that the underlying time-aggregated graph is a tree (ignoring directionality). Then, different branches originating in node l are independent as long as l remains susceptible and thus Φ l (t) factorizes. However, if node l contracts a disease from a neighbor k with some probability and passes it on to another node k then the corresponding probabilities I k and I k are clearly correlated. A simple solution that allows different branches to nonetheless be treated as independent is to prevent a probability flow through the root node in the first place. From a graph-theoretic perspective, this corresponds to the (virtual) removal of all out-directed contacts from the root node. This approach does not modify the dynamics of the node under consideration, because it can still contract the disease and once infected, the recovery process is independent of the topology. The idea reduces, however, considerably the amount of bookkeeping that would otherwise be necessary, if we accounted for the correlations directly. The singular node l is said to be a cavity node or in the cavity state [22,50], a concept closely related to the test-node assumption [38] and the idea of cut-vertices [61]. With this, we can factorize Ψ l (t) and thus obtain Here, we introduced the probability θ k→l (t) that no disease has been transmitted from node k to the cavity node l up to time t. The change in perspective towards an edge-centric analysis introduces new auxiliary dynamic quantities such as θ k→l (t). These are defined on the set of edges E of the time-aggregated network and thus the number of dynamic variables scales with L, the number of edges.
In order to obtain a system of dynamic equations, we focus on our first edge-centric variable θ k→l . Initially, no disease was transmitted such that θ k→l (0) = 1 for all edges (k, l) ∈ E. Henceforth, the dynamic quantity reduces only (i) upon a temporal contact, indicated by a k→l (t) and (ii) if the adjacent node k is infected without having transmitted the disease earlier to the cavity node l -we denote the corresponding probability with I k→l (t). Hence, the out-flow of probability is given by βa k→l (t)I k→l (t), leading to our first dynamic equation Next, the probability I k→l (t) that node k is infective at time t and has not yet passed the disease to the cavity node l evolves according to three contributions: (i) It decreases with the recovery probability µ and (ii) with the probability β to infect its target node upon a temporal contact. These processes are independent and may contribute simultaneously with the joint probability βµ. (iii) I k→l (t) increases with the probability ∆S k→l (t) = S k→l (t) − S k→l (t + 1) that k is newly infected by at least one of its incident neighbors excluding the cavity node l. In sum and with the initial condition I k→l (0) = 1 − z k these contributions lead to: (6) Finally, we consider the probability S k→l (t) that node k, adjacent to the cavity node l, is susceptible. Since k is not affected by the state of l, it stays susceptible if it does not contract the disease from any of its remaining, incident neighbors j ∈ N k \ l. It has been shown in [62] that the corresponding probability Φ k→l (t) = j∈N k \l θ j→k (t) factorizes and thus similar to Eq. (3), we find S k→l (t) = z l Φ k→l (t) or equivalently The disease progression in the CB framework is fully characterized by Eqs. (5) and (6), a set of 2L coupled equations. Equation (7) is introduced here for convenience only and can be substituted into Eq. (6). Next, we return to the node-centric quantities. To this end, we note that S l (t) has been already determined in Eq. (4). The remaining marginals I l and R l are equivalent to the IB model and given by the conservation condition Eq. (9) and Eq. (10), respectively. Hence, we arrive at the following node-centric equations: The CB model is exact for temporal networks, whose undirected time-aggregated version is a tree-graph and therefore loop-free. Thus, the change of perspective allows us to take full account of dynamic correlations on tree topologies. Most realistic networks, however, contain a large number of loops such as triangles in social graphs, where two friends are likely to have many more friends in common. Here the CB model nevertheless appears to be "unreasonably effective" (cf. [63]) and improves predictions significantly with respect to the IB approach as we will see in Sec. V. For further extensions to the model that include heterogeneous infection and recovery probabilities as well as weighted contacts see Appendix A.

IV. EPIDEMIC THRESHOLD
The condition defining the epidemic thresholds can be derived by examining small perturbations around the disease-free state. If such perturbations die out then any outbreak remains local, but if the perturbation grows then a global epidemic may occur. For that, we consider a linearization of the dynamic Eqs. (5) -(7), which will give rise to a criticality condition, determining the epidemic threshold. We begin with the ansatz θ k→l (t) = 1−δ k→l (t) and z l = 1 − l , where δ k→l (t), l 1 are small perturbations around the disease-free state for all nodes l and edges (k, l). Thus, Eq. (5) becomes: In Eq. (7) we keep the linear terms of the Taylor expansion, which transforms the product into a corresponding sum: In Eq. (12b) we substituted the dynamic Eq. (11) and identified S k→l (t) in the next step. From the resulting Eq. (12c) we can read the linearized form of ∆S k→l , which allows us to decouple the dynamic equations for I k→l : Next, we rewrite the remaining set of L coupled dynamic equations in a compact, matrix-based formulation and therefore introduce the vectors I(t) and a(t) with elements I k→l (t) and a k→l (t), respectively. To this end, we also express the linear operation j∈N k \l a j→k (t) in Eq. (13), which acts on the elements I k→l (t) of the state vector, through the temporal unweighted nonbacktracking matrix B(t): In other words, B kl,jk (t) = 1 if the contact (t, j, k ) at time t is incident on the edge (k, l) (implying k = k), and additionally j = l. Otherwise we have B kl,jk (t) = 0. It is only the non-backtracking property j = l that sets B apart from the adjacency matrix of the ordinary linegraph. For temporal networks a subtle distinction has to be made between the first and the second index of the L × L dimensional matrix B: The first corresponds to an out-directed (static) edge (k, l) ∈ E of the underlying aggregated network and can be interpreted as a potential contact in the future. The second, however, is an incident (temporal ) contact (t, j, k ) ∈ C from node j to k at time t. We also introduce the diagonal matrix diag(1−βa(t)), with diagonal elements given by the vector 1 − βa(t).
Here, we denote with 1 the vector of all ones. With these definitions, we rewrite Eq. (13) to The explicit solution to the state vector I(T ) at final observation time T is formally given by I(T ) = P (β, µ)I(0), where the so-called infection propagator P [53] is introduced for notational convenience: In order to evaluate the asymptotic behavior, we assume a periodic boundary condition in time, i.e., B(t) = B(t + T ). This allows us to assess the vulnerability of the temporal network through the spectral radius of the propagator P . In particular, we find that a SIR-type outbreak is asymptotically stable under small perturbations, i.e., remains confined to a small set of nodes, as long as the spectral radius satisfies ρ[P (β, µ)] < 1. Thus, the phase transition is given by the criticality condition . (17) Note that for irreducible and non-negative matrices the largest eigenvalue is simple and positive according to the Perron-Frobenius theorem [64]. Assuming 0 ≤ β, µ < 1, a sufficient condition for temporal networks is to restrict contacts to the giant strongly connected component (GSCC) of the underlying time-aggregated graph. In Sec. V B we will fix the recovery probability µ and determine the critical infection probability β crit as the root of f (β) = 1 − ρ[P (β, µ)] for different empirical networks.
We conclude this section with a discussion on the static network limit. In the so-called quenched regime, the disease evolves on a much faster time scale than the dynamic topology and thus operates on an effectively static network with B(t) ≡ B(0) ≡ B and a(t) ≡ 1 for all times t. As in the temporal analysis, we restrict the network to the GSCC so that the Perron-Frobenius theorem [64] applies. In this limit the dynamic equations (5) -(7) reduce to the dynamic message passing formulation in [50]. Moreover, Eq. (16) becomes now a product where 1 = diag(1) denotes the identity matrix. The spectral radius in Eq. (17) factorizes to ρ[P fast (β, µ) T ] = ρ[P fast (β, µ)] T , and it follows that the criticality condition Eq. (17) reduces to ρ[P fast (β, µ)] = 1. Furthermore, we find from basic linear algebra that ρ[P fast (β, µ)] = (1 − µ)(1 − β) + βρ(B) and hence we obtain the corresponding static threshold condition The criticality condition in Eq. (19) deviates from the continuous-time result in [15,22]. In the derivation presented here, the term βµ in Eq. (19) accounts for the simultaneous events when a node infects a neighbor and recovers within the same time step. In contrast to the quenched regime, one can also consider the so-called annealed limit. Then, parameters β and µ are sufficiently small such that no more than one infection or recovery event can take place within the observation time. Therefore, we expand the infection propagator to the first order in β and µ and obtain: Here,ā = 1/T t a(t) andB = 1/T t B(t) denote the corresponding time averaged quantities. It is insightful to evaluate simple bounds for the set of parameters (β, µ) crit, slow that satisfies the threshold condition ρ(P slow ) = 1 in the annealed limit. With 1/T ≤ā ≤ 1 for all elements inā we thus find: Assuming the upper bound in Eq. (21) overestimates the outbreak risk and can be considered a conservative choice from an epidemiological perspective. This limit is realized for a temporal network where every edge appears exactly once within the observation time, henceā = 1/T . The lower bound in Eq. (21) is exact in case of a static network, thusā = 1, and corresponds to the continuoustime result in [15]. However, this limit underestimates the outbreak risk and therefore we conclude with a note of caution when applying results from static network theory directly to time-varying topologies.

V. APPLICATION
A big advantage common to both the node-centric IB and edge-centric CB modeling framework is a significant reduction in computational complexity compared to MC simulation. The CB model requires iteration through all edges at every time step and thus the time complexity scales with O(LT ). The IB formulation and a single MC realization require O(CT ), whereC denotes the average number of active contacts, which can be significantly smaller than L. Stochastic MC simulations on the other hand require a large number of realizations in order to provide reliable statistics. The computational disadvantage of MC simulations becomes even more apparent when we consider a complex quantity such as the epidemic threshold, which requires multiple ensemble averages for different sets of epidemic parameters in order to fit the critical infection probability (see Sec. V B). Equally important however, is the accuracy of our analytic approach. Therefore, we will compare in this section estimations from the IB and CB mean-field model with MC simulations using empirical data as introduced in Sec. II.

A. Numerical analysis of the mean-field dynamics
We begin with an analysis on the level of individual nodes. In Fig. 2, we show the cumulative infection probability for a small number of example nodes from the conference data set given the same outbreak location. The selection is intended to present qualitatively different trajectories, also demonstrating that deviations between the two models vary considerably. The MC result (blue curve) in Fig. 2A corresponds to the introductory example in Fig. 1. Here, a comparison with the analytic estimation shows that the CB approach leads to a substantial improvement to the IB model. Also in Fig. 2B-D, the trajectories are erratic, as they reflect the sudden changes in the underlying topology, highly individual and yet well approximated by the CB model. For all nodes in the network, we found that the CB model gives a closer upper bound to MC simulations because, unlike the IB framework, it accounts for dynamic correlations between nearest neighbor states. Dynamic mean-field models such as the IB and CB framework provide realistic expectation values only if stochastic fluctuations are negligible. In order to illustrate the limitations, we study epidemic outbreaks for three different initially infected nodes in Fig. 3 A, B, and C, respectively. The left column gives the time-resolved distribution of the outbreak size, and the right column presents the final distribution at the end of the three day observation period. For the ensemble average (blue line), we consider only realizations with more than 20 infected nodes overall. This threshold separates outbreaks that die out early due to stochastic fluctuations and thus permits a direct comparison with estimations from to the IB and CB framework in green and red, respectively.
We chose the outbreak locations such that the degree of stochasticity increases from top to bottom. In Fig. 3A.1, we find a narrow distribution around the ensemble average, which is well approximated by the mean-field models. Minor outbreaks due to early extinctions are well separated in Fig. 3A.2 from large epidemics. In Fig. 3B, the initially infected node leads to realizations with considerably stronger fluctuations and in Fig. 3C it is barely possible to separate early extinctions at all. Additionally, we observe a second source of stochastic variation, namely the time at which a disease escalates and hence evolves into a global epidemic. As a consequence, early outbreak sizes may be overestimated significantly before the analytic trajectory approaches the expectation value again (see Fig. 3C.1).
Remarkably, the performance of both mean-field models varies significantly with the outbreak location, even for R 0 well above the epidemic threshold. At the late phase of an outbreak, however, the mean-field models provide good approximations and consistently with Fig. 2, we find that the CB model outperforms the IB approach. In Appendix C, we demonstrate how a sufficiently large number of initially infected individuals improves significantly the predictability.
Another source of stochasticity is the choice of disease parameters β and µ, respectively. For that, we focus on the final outbreak size, averaged over all outbreak locations. The distribution as a function of the infection probability β (see Fig. 4) shows a percolation-like transition from localized spreading to epidemics that affect a considerable fraction of the network. We apply the same threshold as in Fig. 3 for a direct comparison between the averaged outbreak size and the mean-field models for β > 0.02. Here, we find that the difference between the expected size and the CB estimation is close to negligible, whereas the IB model consistently overestimates the expected value. A comparison at low values of the infection probability β becomes unreliable as stochasticity impedes a reasonable distinction between minor and global outbreaks. In order to illustrate the effect, we present in Fig. 5 the outbreak size distribution for different values of β as marked by the arrows in Fig. 4. This representation highlights the transition from the sub-to the super-critical parameter domain: The unimodal distribution in Fig. 5A characterizes localized outbreaks, whereas the bimodal distribution in Fig. 5D clearly separates early extinctions and global epidemics. Next, we focus on the critical infection probability that marks the transition.

B. Epidemic threshold
In Fig. 6A, we present the region of small β from Fig. 4, in order to focus on the transition from localized outbreaks to the sudden emergence of global epidemics. We determine the critical infection probability β crit (vertical blue line) from the maximum of the relative standard deviation [53], also known as coefficient of variation (see blue line in Fig. 6B): Here, we denote with σ and σ 2 the first and second moment of the outbreak size distribution. The coefficient of variation captures the intuition that fluctuations dominate the outbreak size distribution close to the transition. Indeed, c v diverges at the critical point for infinitely large networks, indicating a second-order phase transition [65]. Analytically, we determine β crit from the spectral criterion in Eq. (17) for the CB model and similarly within the IB framework [53,66]. The comparison in Fig. 6 shows that the IB and CB model, marked by a red dashed and green dotted line, respectively, underestimate the critical infection probability from MC simulations (blue line) and thus overestimate the outbreak risk. Consistent with our previous results, we can state that a shift from a node-to an edge-centric framework improves the analytic prediction. In Appendix C, we present similar results for different values of the recovery probability µ. Next, we continue with a realistic application of the epidemic threshold to the German cattle trade network.

Application to German cattle trade
We now consider a completely different data set, where the system size is large and contacts are sparse over time. Our example is a cattle trade network, where the movements of animals between farms in Germany are recorded on a daily basis. Next, we isolate the trade within each federal state of Germany as visualized in Fig. 7 and restrict trade to the GSCC of the underlying aggregated graph. Disregarding the smallest networks (those with less than 27 nodes), we thus obtain 12 time-varying graphs with sizes varying from 254 to 27,863 nodes and highly heterogeneous topological and temporal features (see Appendix ??).
As in the previous section, we assume that premises can be either susceptible, infected or recovered and trade events facilitates the transmission of a disease. Unlike before however, we take into account the number of traded animals during each transaction, i.e., the weight w k→l of a (temporal) contact from node k to l. To this end, we modify the infection propagator in Eq. (16) and re- place β by 1 − (1 − β) w k→l (see Appendix A for more information). In a potential outbreak, we assume that an infected node is detected with a constant probability µ each day after which it would be isolated and thus removed from the network. As a consequence highly infectious diseases such as FMD can be modelled as SIR-type epidemics [58].
In Fig. 8, we compare the critical infection probability similar to Fig. 6 for six selected federal states with different transition characteristics. The critical value derived from MC simulations varies between β crit. = 0.018 (BY) and β crit. = 1.0 (SN). The latter indicates that outbreaks remain localized for every choice of β due to sparse intrastate trade. As a potential application to public health institutions, we present in Fig. 9A the spatial variation of the epidemic risk in terms of β crit . The quantitive comparison in Fig. 9B demonstrates that spectral methods provide a lower bound with a varying degree of accuracy depending on the network details. Despite their heterogeneity in size and activity, we find for all networks that the CB model outperforms the IB approach. The detailed results for all states as well as a similar analysis for µ −1 = 120 are available in Appendix C.

VI. CONCLUSION
In this paper we have presented the contact-based (CB) model for epidemic SIR spreading on temporal networks as a conceptually similar framework to the widely used individual-based (IB) approach. Derived from the message-passing framework [22,50] it inherits its accuracy on loop-free topologies and improves analytic estimations with respect to the IB approach for arbitrary time-evolving graphs. Moreover, the focus on edge-based quantities that are updated in discrete time steps allows a seamless integration of temporal interactions. Structually similar to the node-centric IB model, the proposed CB approach poses a low conceptual barrier and admits application on large graphs.
Importantly, the accuracy of the CB model improves existing approximations of the epidemic threshold, which is a crucial risk measure for public health institutions. To this end, we have studied the largest eigenvalue of the infection propagator matrix, which determines the disease propagation in the low prevalence limit and takes into account the full temporal and topological information up to the observation time. The largest eigenvalue can be easily found through repeated matrix multiplications, i.e., the so-called power method. Without relying on extensive Monte-Carlo (MC) simulations and a subsequent parameter fit, the critical value can thus be estimated with efficient, vectorizable tools from linear algebra that are available for most high-level programming languages.
In the application section, we have focused first on a social contact-graph that can be used to analyze the propagation of airborne diseases as well as the spread of information. Our comparison between MC simulations and analytic estimations from the CB and IB model followed a bottom-up approach: We looked at (i) epidemic trajectories of individual nodes, (ii) the outbreak size given the same initially infected node, and (iii) the outbreak size for a range of infection probabilities, averaged over all outbreak locations. In all cases, the CB model provides a closer upper bound to MC simulations than the widely used IB model. All results based on the conference data set can be reproduced using the Python code provided in [56] As a particularly important application, we then compared analytic estimations of the critical infection probability with extensive MC simulations. To this end, we included a case study of livestock trade within 12 federal states in Germany with highly heterogeneous characteristics in terms size, density and temporal activity. Consistently, we found that the CB model improves the previous lower bound of the IB framework at a low conceptual and computational cost.
Many excellent results have already been derived within the IB framework for empirical networks and in the context of random graphs (see [68] for a recent review) that can further improve the CB model. We therefore expect that the conceptual simplicity of the CB framework allows to integrate features such as non-Markovianity [21], stochastic effects [69], and estimations of uncertainty [70] that are important to realistic disease models on temporal networks. Also, first steps towards higher order models that go beyond the tree-graph assumption have been proposed in the context of percolation theory [71] and diffusive transport [72,73] and we expect these improvement to also be applicable to the CB model.  The geographic distribution of nodes in Fig. 11A shows dense regions in the north-west and south-east including the above mentioned federal states North Rhine-Westphalia (NW), Lower Saxony (NI), Baden-Württemberg (BW) and Bavaria (BY). Here, we also find the largest premises in terms of total traded animals: In Fig. 11C color and size indicate the node strength, i.e., the aggregated trade volume. The heterogeneous distribution of strength becomes also apparent in Fig. 11D where in-, out-and total strength is analyzed separately. Finally, we observe in Fig. 11B the net flux, i.e., the difference between in-and out-directed trade volume. We display only nodes with at least 500 traded animals in Fig. 11B and Fig. 11C.
From a temporal perspective, we find that trade fluctuates between 10 2 and 10 4 active nodes, i.e., farms with at least one trade event on a given day, whereas minima appear regularly on the weekends (see Fig. 12A). The weekly pattern is also apparent with a view to the inter-activation time distribution, i.e., the time interval between two successive trade events for a given node (see Fig. 12). Here, we find a broad distribution of activity with peaks around 7, 14, and 21 days.
The geographic risk analysis in Sec. V B 1 requires us to separate the network into sub-graphs that correspond to the intra-state trade (see Fig. 7). The largest eigenvalue of the corresponding infection propagator allows us to evaluate the outbreak risk within a federal state due to the local movement of infected animals. In Tab Hamburg and Bremen as well as Saarland, which is particularly small state in terms of nodes, are marked with an Asterisk and are not considered for risk analysis in Fig. 9.
Separating the trade network into subgraphs as visualized in Fig. 7B inevitably reduces the outbreak risk as TABLE I. Federal states of Germany and the corresponding abbreviation together with basic network statistics: number of nodes, number of (static) edges, number of (temporal) contacts. In all cases we confined the network to the GSCC. We mark the smallest networks, i.e., the city states Berlin, Hamburg, and Bremen as well as Saarland with an asterisk.
the neglected cross-border edges would otherwise facilitate the disease transmission. In Fig. 13A we find that a considerable fraction of trade is directed across federal states and has thus been removed. This applies in particular for the federal states NI, NW and BW. Similarly, we find that the ratio between intra-state and in-directed trade lies between 0.6 (NW) and 0.9 (BY). Thus, we conclude that a considerable fraction of trade across borders is being neglected in the geographic risk analysis in Fig. 9.
It is also important to stress that we use the same parameter µ across all federal states and thus assume a uniform detection probability. In reality, federal states with a large number of premises tend to enforce stricter hygiene and intervention standards so that the actual epidemic risk for states such as Bavaria (BY) and North Rhine-Westphalia (NW) is much lower. A realistic evaluation for public health must therefore include heterogeneous recovery (detection) probabilities on the level of states or individual nodes as discussed in [75] and Appendix A as well as a complex disease response that includes trade restrictions, increased awareness and higher bio-security.

Appendix C: Further applications
From the detailed, node-level infection trajectory we can estimate the infection arrival time from a given outbreak location to all remaining nodes. For that purpose, we extend the contact sequence periodically in time until the infection probabilities are negligible. Then, we derive the infection arrival time to a single node from the corresponding cumulative infection probability (see Fig. 2) as follows: (i) The discrete derivative of the cumulative infection probability gives the probability distribution to contract the infection at a given time step. (ii) The expectation value of probability distribution gives the mean infection arrival time at the a single node, corresponding to a scatter point in Fig. 14A. Here, we compare the expected values from MC simulations with the estimated infection arrival times given by the CB and IB model, respectively. In a perfect prediction the scattered values would lie on the diagonal but, as the contact network is far from a tree-like structure, the models estimate infection arrival times smaller than the observed values. The comparison between the IB and CB framework in Fig. 14B shows a considerably smaller relative deviation of CB estimations from the corresponding MC simulations for the given set of disease parameters and outbreak location.
Another application focuses on the vulnerability of nodes with respect to a given outbreak location. Again, we assume an infinite time horizon and compare the cumulative probability that a node has been infected in the limit t → ∞. As before, we find a good correlation between simulations and the estimated vulnerability in Fig. 15, whereas the CB model consistently outperforms the IB approach and overestimates the expected values surprisingly little given that the underlying aggregated network is fairly dense (the average degree is k ≈ 19) and far from being tree-like.
FIG. 14. A: Comparison between simulated and estimated mean infection arrival times. We extend the data set periodically in time until the outbreak dies out. The discrete derivative of the cumulative infection probabilities (see Fig. 2) yields the infection arrival probabilities of which we take the average value for every node. Results according to the CB and IB model are visualized as red circles and green crosses, respectively. The epidemic starts from the same outbreak origin and disease parameters as in Fig. 1. B: Histogram over the relative deviation from the simulated infection arrival times.
The numerical values are averaged over 10 5 realizations.
FIG. 15. A: Comparison between simulated and estimated vulnerability. We compute the cumulative infection probability in the limit t → ∞, also denoted as vulnerability. The comparison with CB and IB estimations visualized by red circles and green crosses, respectively. Each value corresponds to the vulnerability of a node given the same outbreak location and disease parameters as in Fig. 1. B: Relative deviation of the estimated values with respect to MC simulations. The numerical values are averaged over 10 5 realizations

Trajectories averaged over outbreak locations
For some applications, we may be interested in the trajectory of a global epidemic, averaged over outbreak locations. A sufficiently large number of initially infected nodes would then avoid complications with the early outbreak phase [38,76]. In this case, we adjust the MC simulations such that every node is infected independently with a given probability 1 − z l = 1 − z, ∀ l ∈ N at t = 0. As for the analytic approach we only need to set a corresponding homogeneous initial condition and thus the computational complexity remains the same as in the previous case of one initially infected node. In Fig. 16 (left column) we observe a narrow, timedependent distribution of cumulatively infected nodes around the mean value for three different infection prob-abilities. Without applying any additional threshold, we find a close agreement between the averaged trajectory, the CB and IB model in all cases. In contrast to Fig. 3 of the main text, we observe in Fig. 16 (right column) only one peak in the distribution, due to the large number of initially infected nodes. One potential application is to calculate the vulnerability of a node as discussed in [6]. Here, the vulnerability is defined as the probability that a given node is eventually infected by a disease that started somewhere in the network. The value can be used to rank nodes in order to prioritize surveillance or vaccination measures to the nodes that are most likely to contract the disease when resources are limited. In Fig. 17 every curve represents the vulnerability of one node as a function of the infection probability β. These results are derived from the CB model, given an initial infection probability of 1 − z = 0.2. The individual colors correspond to the degree of the node in the underlying time-aggregated graph and serve as a guide to the eye. Interestingly, we find that the ranking, as estimated by the CB model, may change with increasing infection probabilities β as can be seen from the highlighted curve in Fig. 17. This effect has been observed earlier in the context of static networks [6] and indicates that network properties alone are often not sufficient to rank nodes as they do not take into account details of the dynamic system.
FIG. 17. Vulnerability as a function of the infection probability β estimated from the CB model. Each curve represents the vulnerability of a node, i.e., the probability to contract the infection from a set of randomly chosen outbreak locations. Here, we estimate the vulnerability according to the CB model. Starting from an initial infection probability of 1 − z = 0.2 we propagate the infection over time until convergence. We stop when the largest increase in vulnerability after 24h falls below 10 −3 . The colors indicate the degree of each node in the underlying time-aggregated graph. Moreover, the behavior of one selected node is highlighted.

Additional numerical results
The analysis of the conference data set in the main text was limited to a single value of the recovery probability with µ = 2.85 · 10 −4 . This choice corresponds to an expected infectious period of about 19.5 hours. In addition to the analysis of the main text, we present in Fig. 18 similar results for different values of µ. The left, middle and right column correspond to Fig. 4, Fig. 6A, and Fig. 6B, respectively. In all cases, the CB model gives a closer bound to MC simulations as compared to the IB approach.
Next, we complete the analysis in Fig. 8 of the main text on the epidemic threshold for the example of the animal trade network. Again, we assume µ = 1/28 and provide in Fig. 19 a detailed analysis of all federal states, excluding the city states Berlin, Hamburg, and Bremen. The left and middle column in Fig. 19, provide a similar analysis to Fig. 6A, and Fig. 6B of the main text, respectively. In other words, we present the distribution of outbreak sizes for different values of the infection probability β (left column) from which we derive the coefficient of variation c v (blue line, middle column). The right column presents values of c v that are close to the peak and a quadratic fit (green line, right column) that determines the numerical estimation of the critical infection probability (blue vertical line). This value can be compared to spectral estimations from the mean-field models. In agreement with previous results, we find that the criticality condition in Eq. 17 of the CB model (see main text) improves previous results of the IB approach.
Finally, we provide an additional analysis of the epidemic threshold for the cattle trade data with µ = 1/120. Results in Fig. 20 are akin to our previous analysis in Fig. 19, except for Saarland (SL). Here, the spectral condition in Eq. 17 of the CB model predicts that every outbreak remains localized, i.e. β CB crit. = 1, whereas MC simulations suggest a transition to global epidemics, hence β MC crit. < 1. We attribute the inconsistency to the small size of the network (26 nodes). The spectral approach assumes implicitly an infinitely large network, which is clearly violated in this case.
We summarize the results for µ = 1/120 in Fig. 21, akin to Fig. 9 of the main text. The risk map in Fig. 21A visualizes the spatial variability of the outbreak risk and each group of blue, red and green bars in Fig. 21B provide a quantitative comparison between MC results, the CB model and the IB approach, respectively.  Fig. 20 for details). The visualization is akin to Fig. 9 of the main text.