A message-passing approach to epidemic tracing and mitigation with apps

As new pandemic threats emerge, scientific frameworks are needed to understand how an epidemic unfolds. At the mitigation stage of an epidemic, in which several countries now find themselves, mobile apps able to trace contacts are of utmost importance for controlling new infections and containing further propagation. Here we present a theoretical approach, based on both percolation and message-passing techniques, to the role of contact tracing in mitigating an epidemic wave. We show how increasing the app adoption level raises the epidemic threshold, which is maximized when high-degree nodes are preferentially targeted. Analytical results are compared with extensive Monte Carlo simulations, showing good agreement for both homogeneous and heterogeneous networks. These results are important for quantifying the level of adoption needed for contact-tracing apps to be effective in mitigating an epidemic.

Percolation theory [1-5] is a subject of major relevance in the field of complex networks. It provides a simple mathematical framework that naturally applies both to the structural properties of networks (such as resilience under random damage) [6][7][8] and to critical diffusion (such as epidemic spreading in heterogeneous structures) [9,10]. Although there exist several epidemiological models of varying complexity, arguably the most popular one, the SIR model, was found [9,10] to be mappable to a static link-percolation problem, which made it possible to find analytical expressions for the epidemic threshold as a function of the underlying network topology. These results, even if they might only approximate the features observed in real epidemics, still constitute a fundamental theoretical cornerstone in the field of epidemic processes. Recently there has been increasing interest in studying the effectiveness of track-and-trace policies as a measure to contain epidemic spreading [11][12][13][14]: for instance, in [14] the authors show how an effective contact-tracing strategy in scale-free networks can reduce the probability of superspreading events, while in [11] it is claimed that a widely used contact-tracing app, combined with additional measures such as social distancing, might be sufficient to stop an epidemic.
Several mathematical arguments have been proposed in the contemporary literature to justify the above-mentioned effects; however, a solid percolation approach has not been proposed so far. In this work, we take a step towards filling this gap by proposing a stylized model, based on link percolation, for epidemic spreading with contact-tracing and testing policies.
In particular, we first assign to each individual $i$ of a given contact network a binary variable $T_i$ representing whether or not the individual has the tracing app. Then, we propose a modified version of the popular message-passing (MP) equations [15][16][17][18][19] which takes into account the following rationale. Every infected individual transmits the disease to a susceptible neighbour with probability $p$, called the transmissibility of the epidemic. An individual who has the app will know almost instantaneously (a hypothesis that is far from reality, but simplifies the analysis) if they have been in contact with an infected individual who also has the app, and will immediately self-isolate, stopping propagation. However, if infected by an individual who does not have the app, they will not know until symptoms appear. This can be formulated as follows: individuals with the app ($T_i = 1$) can infect only if previously infected by individuals without the app ($T_i = 0$), while individuals without the app can infect regardless of the $T_i$ value of their infector. By doing so we are able to derive a modified non-backtracking matrix [16,20,21] whose largest eigenvalue determines the epidemic threshold $p_c$. Furthermore, for the case of uncorrelated networks, we are also able to derive an analytical expression for $p_c$ as a function of the average distribution of the tracing app, namely $T(k)$. Our results show that, in general, the more widely the app is adopted in the population, the higher the value of $p_c$, meaning that the endemic state is less likely to be reached. Moreover, we show that, given a fixed app coverage, the optimal $T(k)$ which maximizes $p_c$ corresponds to a hub-targeting strategy.

Given a contact network $G = (V, E)$ formed by $|V| = N$ individuals $i = 1, 2, \ldots, N$, each individual $i \in V$ is assigned a variable $T_i$ indicating whether the individual has the app ($T_i = 1$) or not ($T_i = 0$).
Assuming the contact-tracing app has an immediate effect on quarantining suspected cases, a person with the app can infect only if infected by a person without the app, while a person without the app can infect regardless of whether the infection came from a person with or without the app (see Figure 1). We now propose a stochastic infection model as follows: for every link $(i, j)$ we draw a random variable $x_{ij} \in \{0, 1\}$ indicating whether the eventual contact between one infected and one susceptible node, found at the two ends of the link, leads to infection. We parametrize this dynamics by taking $\langle x_{ij} \rangle = p$, where $p$ indicates the transmissibility of the epidemic.
We can simulate the stationary state of this spreading process on networks of arbitrary topology, including spatial networks with a high clustering coefficient, by implementing the following Monte Carlo algorithm, which takes advantage of the mapping between epidemic spreading and percolation. We call $T-T$ the links connecting two individuals who have both adopted the app. These links do not contribute to the propagation of the infection to nodes other than the two connected nodes. In other words, the causal chains of infection stop when they involve a $T-T$ link. Therefore we first consider the giant component of the network from which all $T-T$ links have been removed (see SM [22] for the full algorithm).

Message-passing approach - To analytically predict the propagation of the epidemic on a network we use the powerful MP approach [16][17][18]. Although this approach is proven to give exact results only on locally tree-like networks, it is also well known to be very robust in the case of networks with loops, when the underlying MP algorithm converges [23]. In this work we adopt the MP approach and use it to predict the phase diagram of the spreading process on network ensembles as a function of the level of adoption of the app in the population. The spreading model considered is stochastic and has different sources of randomness, which can be taken into account by different MP algorithms that average over different levels of information [17]. The simplest MP algorithm can be derived by assuming that everything about the spreading dynamics is known. This entails knowing first the contact network, second which individuals have the app, i.e. the configuration $\{T_i\}_{i\in V}$, and finally which links have led to an actual infection, i.e. $\{x_{ij}\}_{(i,j)\in E}$ (see SM [22] for details). One can then relax the hypothesis of perfect knowledge about the epidemic process and consider the message-passing process in which we average over the distribution of $\{x_{ij}\}_{(i,j)\in E}$. In this situation the outcome of the epidemic spreading is dictated by the following MP equations.
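A node $i$ spreads the virus to node $j$ with probability $\sigma_{i\to j} \in [0, 1]$, where this message is found from the MP equation
$$\sigma_{i\to j} = p\left[1 - \prod_{\ell\in N(i)\setminus j}\Big(1 - (1 - T_i T_\ell)\,\sigma_{\ell\to i}\Big)\right], \tag{1}$$
where $N(i)$ indicates the neighbours of node $i$. These equations directly implement the model as described in Fig. 1. Moreover a node $i$ is infected with probability $\sigma_i \in [0, 1]$ with
$$\sigma_i = 1 - \prod_{\ell\in N(i)}\left(1 - \sigma_{\ell\to i}\right). \tag{2}$$
Therefore the expected fraction $S$ of infected individuals is given by
$$S = \frac{1}{N}\sum_{i=1}^{N}\sigma_i. \tag{3}$$
This process has an epidemic threshold achieved when the maximum eigenvalue $\Lambda(B)$ of the modified non-backtracking matrix $B$ is equal to one, i.e.
$$\Lambda(B) = 1. \tag{4}$$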
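The fixed-point iteration behind Eqs. (1)-(3) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes Eq. (1) takes the form $\sigma_{i\to j} = p\,[1 - \prod_{\ell\in N(i)\setminus j}(1 - (1 - T_i T_\ell)\sigma_{\ell\to i})]$, and the function name `mp_infected_fraction`, the initial message value 0.5, and the toy graph are all hypothetical choices.

```python
def mp_infected_fraction(edges, T, p, n_iter=500, tol=1e-12):
    """Iterate the assumed form of Eq. (1); return S from Eqs. (2)-(3).

    edges: list of undirected links (i, j)
    T:     dict node -> 0/1 (1 means the node has the app)
    p:     transmissibility
    """
    nbrs = {i: [] for i in T}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    # One message per directed link i -> j, started away from zero
    # (0.5 is an arbitrary illustrative choice).
    sigma = {(i, j): 0.5 for i in nbrs for j in nbrs[i]}
    for _ in range(n_iter):
        new = {}
        for (i, j) in sigma:
            prod = 1.0
            for l in nbrs[i]:
                if l != j:
                    # If T_i = T_l = 1 the factor is 1: an infection received
                    # over a T-T link is not passed on.
                    prod *= 1.0 - (1 - T[i] * T[l]) * sigma[(l, i)]
            new[(i, j)] = p * (1.0 - prod)
        change = max((abs(new[e] - sigma[e]) for e in sigma), default=0.0)
        sigma = new
        if change < tol:
            break
    # Eq. (2): node i is infected if at least one incoming message succeeds.
    total = 0.0
    for i in nbrs:
        prod = 1.0
        for l in nbrs[i]:
            prod *= 1.0 - sigma[(l, i)]
        total += 1.0 - prod
    return total / len(nbrs)  # Eq. (3)


# Hypothetical toy example: complete graph on 4 nodes, p = 0.8.
k4 = [(i, j) for i in range(4) for j in range(i + 1, 4)]
S_none = mp_infected_fraction(k4, {i: 0 for i in range(4)}, p=0.8)
S_half = mp_infected_fraction(k4, {0: 1, 1: 1, 2: 0, 3: 0}, p=0.8)
S_all = mp_infected_fraction(k4, {i: 1 for i in range(4)}, p=0.8)
print(S_none, S_half, S_all)
```

On such a small, loopy graph the MP result is only an approximation, since the approach is exact on locally tree-like networks; the example is meant to show the update rule, not to reproduce the figures of the paper.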
The modified non-backtracking matrix $B$ for this algorithm is defined in terms of the non-backtracking matrix $A$ of the network as
$$B_{\ell\to i,\, i\to j} = p\,(1 - T_\ell T_i)\,A_{\ell\to i,\, i\to j}. \tag{5}$$
Here $A$ [16] has elements
$$A_{\ell\to i,\, i\to j} = a_{\ell i}\, a_{ij}\,(1 - \delta_{\ell j}), \tag{6}$$
where $a$ is the adjacency matrix of the network and $\delta_{rs}$ is the Kronecker delta. Equation (5) clearly shows that the epidemic threshold is dictated essentially by the non-backtracking matrix of the network from which all the $T-T$ links have been removed.

We can also average over the probability distribution of $\{T_i\}_{i\in V}$. Specifically we can assume that $\langle T_i \rangle$ (where $\langle \ldots \rangle$ indicates the average over the probability distribution of $\{T_i\}_{i\in V}$) is only a function of the node degree, i.e. $\langle T_i \rangle = T(k_i)$. This is a reasonable assumption; however, we note that the adoption of the app might depend on an additional social contagion process of awareness behaviour, in a scenario close to the one proposed in Ref. [24]. To formulate the MP algorithm in the case in which we know only the function $T(k)$, the transmissibility $p$, and the actual contact network, we consider for every ordered pair of linked nodes $(i, j)$ the two messages indicating the probability that node $i$ infects node $j$ given that node $i$ has adopted ($\hat\sigma^{T}_{i\to j}$) or not adopted ($\hat\sigma^{N}_{i\to j}$) the app. These two messages are given by
$$\hat\sigma^{T}_{i\to j} = \langle \sigma_{i\to j}\, T_i \rangle, \qquad \hat\sigma^{N}_{i\to j} = \langle \sigma_{i\to j}\,(1 - T_i) \rangle. \tag{7}$$
The MP equations for these messages can be obtained by averaging the MP Eqs. (1) over all the configurations $\{T_i\}_{i\in V}$ and read
$$\hat\sigma^{N}_{i\to j} = p\,[1 - T(k_i)]\left[1 - \prod_{\ell\in N(i)\setminus j}\left(1 - \hat\sigma^{N}_{\ell\to i} - \hat\sigma^{T}_{\ell\to i}\right)\right], \qquad \hat\sigma^{T}_{i\to j} = p\,T(k_i)\left[1 - \prod_{\ell\in N(i)\setminus j}\left(1 - \hat\sigma^{N}_{\ell\to i}\right)\right]. \tag{8}$$
The probability $\tilde\sigma_i$ that node $i$ is infected is given by
$$\tilde\sigma_i = 1 - \prod_{\ell\in N(i)}\left(1 - \hat\sigma^{N}_{\ell\to i} - \hat\sigma^{T}_{\ell\to i}\right), \tag{9}$$
while the expected fraction $S$ of infected nodes is given by Eq. (3). In this case the relevant matrix $B$ determining the epidemic threshold given by Eq. (4) is (see SM [22] for details)

[Eq. (10): not recoverable from the source]

Finally, we assume that we do not have perfect knowledge of the network itself and perform the average over an uncorrelated network ensemble. In this case we have two equations, one for $S_N$ and one for $S_T$, indicating the probability that by following a link we reach an infected individual without the app or with the app, respectively. These equations (see SM [22] for details of the derivation) read
$$S_N = p\sum_k \frac{k P(k)}{\langle k \rangle}\,[1 - T(k)]\left[1 - (1 - S_N - S_T)^{k-1}\right], \qquad S_T = p\sum_k \frac{k P(k)}{\langle k \rangle}\,T(k)\left[1 - (1 - S_N)^{k-1}\right]. \tag{11}$$
Here $T(k)$ indicates the probability that a node of degree $k$ gets the app. The probability that a random node gets the infection is given by
$$S = 1 - \sum_k P(k)\,(1 - S_N - S_T)^{k}. \tag{12}$$
The transition is achieved for
$$p_c = \min\left(1,\ \frac{1}{2\kappa_T}\left[-1 + \sqrt{1 + 4\frac{\kappa_T}{\kappa_N}}\,\right]\right), \tag{13}$$
where
$$\kappa_T = \frac{\sum_k k(k-1)P(k)\,T(k)}{\langle k \rangle}, \qquad \kappa_N = \frac{\sum_k k(k-1)P(k)\,[1 - T(k)]}{\langle k \rangle}. \tag{14}$$

Optimization - Since $\kappa_N + \kappa_T = \langle k(k-1) \rangle / \langle k \rangle$ is fixed by the degree distribution, the formula for $p_c$ provided by Eq. (13) is an increasing function of $\kappa_T$, so in order to maximize $p_c$ we need to maximize $\kappa_T$. Under the $L_1$-norm constraint
$$\sum_k P(k)\,T(k) = \bar T, \tag{15}$$
this optimization problem gives the discrete Heaviside step function
$$\hat T(k) = \theta(k - k_c, \hat\alpha) = \begin{cases} 1 & k > k_c, \\ \hat\alpha & k = k_c, \\ 0 & k < k_c, \end{cases} \tag{16}$$
taking at $k = k_c$ the value $0 \le \hat\alpha = \left[\bar T - \sum_{k > k_c} P(k)\right] / P(k_c) < 1$ fixed by the constraint in Eq. (15). Therefore the optimal solution is to have all nodes of degree $k > k_c$ with 100% app adoption and the nodes with exactly $k = k_c$ with the maximal adoption allowed by the constraint in Eq. (15). For this choice of $T(k)$ we have checked the validity of the proposed message-passing theory by comparing the fraction of nodes affected by the epidemic obtained from a direct implementation of the Monte Carlo algorithm with the results of the MP algorithm defined in Eqs. (8), (9), finding excellent agreement in the case of a Poisson network (see Figure 2). We have checked that the agreement remains excellent also for scale-free networks (see SM [22]).

Improvement on $p_c$ - Equation (16) tells us that, given a fixed app coverage $\bar T$, the best strategy for maximally delaying the percolation transition is to target the hubs. In order to verify the optimality of Eq. (16) when compared to different strategies, we considered the more general form of $T(k)$ given by
$$T(k) = \rho + (1 - \rho)\,\theta(k - k_c, \hat\alpha), \tag{17}$$
where $\theta(k - k_c, \hat\alpha)$ is the discrete Heaviside step function taking the value $\hat\alpha$ at $k = k_c$, and $\rho \in [0, 1]$ denotes a uniform fraction of individuals adopting the app. Thanks to Eq. (17) we are able to interpolate between a purely random strategy, obtained by taking the limit $k_c \to \infty$, and the optimal strategy, given in the limit $\rho \to 0$. It is straightforward to check that under the constraint defined in Eq. (15) we have respectively $\lim_{k_c\to\infty} T(k) = \bar T$ and $\lim_{\rho\to 0} T(k) = \hat T(k)$. We have used Eq. (13) to investigate the phase diagram (characterized by the epidemic threshold $p_c$) of different network ensembles (a Poisson network and an uncorrelated scale-free network) as a function of $\rho$ and $k_c$ (see Figure 3). We observe that a significant adoption of the app can significantly increase $p_c$.
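To make the comparison between random and hub-targeted adoption concrete, the sketch below evaluates the closed-form threshold for a truncated Poisson degree distribution. It assumes the threshold reads $p_c = \min\{1, [-1 + \sqrt{1 + 4\kappa_T/\kappa_N}]/(2\kappa_T)\}$ with $\kappa_T = \sum_k k(k-1)P(k)T(k)/\langle k\rangle$ and $\kappa_N = \sum_k k(k-1)P(k)[1-T(k)]/\langle k\rangle$, and that coverage is constrained by $\sum_k P(k)T(k) = \bar T$. The helper names (`poisson_pk`, `p_c`, `hub_targeting`), the cutoff `kmax = 40`, and the coverage value 0.4 are illustrative assumptions, not taken from the paper.

```python
import math


def poisson_pk(lam, kmax):
    """Poisson degree distribution truncated at kmax and renormalised."""
    pk = [math.exp(-lam) * lam ** k / math.factorial(k) for k in range(kmax + 1)]
    norm = sum(pk)
    return [q / norm for q in pk]


def kappas(pk, Tk):
    """Return (kappa_N, kappa_T) for degree distribution pk and adoption Tk."""
    mean_k = sum(k * q for k, q in enumerate(pk))
    kT = sum(k * (k - 1) * q * t for k, (q, t) in enumerate(zip(pk, Tk))) / mean_k
    kN = sum(k * (k - 1) * q * (1 - t) for k, (q, t) in enumerate(zip(pk, Tk))) / mean_k
    return kN, kT


def p_c(pk, Tk):
    """Epidemic threshold from the assumed closed-form expression."""
    kN, kT = kappas(pk, Tk)
    if kN <= 0:            # nobody can relay the infection
        return 1.0
    if kT == 0:            # no app: standard percolation threshold 1 / kappa_N
        return min(1.0, 1.0 / kN)
    return min(1.0, (-1.0 + math.sqrt(1.0 + 4.0 * kT / kN)) / (2.0 * kT))


def hub_targeting(pk, coverage):
    """Give the app to the highest degrees first until the coverage is used up."""
    Tk = [0.0] * len(pk)
    budget = coverage
    for k in range(len(pk) - 1, -1, -1):
        if pk[k] <= 0:
            continue
        Tk[k] = min(1.0, budget / pk[k])   # partial adoption at the cutoff degree
        budget -= Tk[k] * pk[k]
        if budget <= 1e-15:
            break
    return Tk


pk = poisson_pk(4.0, 40)
coverage = 0.4
p_none = p_c(pk, [0.0] * len(pk))
p_random = p_c(pk, [coverage] * len(pk))
p_hubs = p_c(pk, hub_targeting(pk, coverage))
print(p_none, p_random, p_hubs)
```

With the same total coverage, the hub-targeted profile should give the larger threshold, in line with the optimization argument; the numbers for this toy Poisson case are not expected to match those reported for the real dataset.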
To show, in a particular example, the increase of $p_c$ due to the adoption of the app, we consider a real dataset, the Livemocha social network [25]. As we can see from Fig. 4, the random adoption strategy, achieved when $k_c = k_{\max}$, yields a very small increase in the value of $p_c$ compared to the optimal distribution, corresponding to $\rho = 0$. Therefore in a scenario of limited resources, represented by the constraint defined in Eq. (15), the optimal strategy corresponds to distributing the app from higher-degree nodes to lower-degree ones until the resources are exhausted. The resulting increase in $p_c$, computed according to Eq. (13), is quite dramatic and non-trivial: for instance, from Fig. 4 we read that if the app is optimally distributed among ∼40% of the population the increase of $p_c$ is ∼17-fold, while if the same percentage is covered at random the increase is ∼1.2-fold.
Conclusions - In this work we provide a message-passing theory able to predict the epidemic threshold of disease spreading in a population which has the option of adopting a tracing app. For simplicity we assumed that the tracing app is perfect; however, the modelling framework can be relaxed to allow also for imperfect tracing and isolation. The proposed stylized mathematical framework can be useful to assess the expected impact of contact-tracing apps in the course of an epidemic. The compartmental epidemic model used is the classical SIR, and it does not claim to be a model fitted to the current COVID-19 pandemic; however, the physical intuition we gain from the presented analysis may prove fundamental for prescribing the best targeting strategy for app adoption, and it captures the highly non-linear effect that a given fraction of adoption has on the reduction of the incidence. Our preliminary results show both numerically and theoretically that the adoption of the app by a large fraction of the population increases the value of the epidemic threshold. In the case of uncorrelated networks we are able to derive a closed analytic expression for $p_c$ which depends on both the network degree distribution $P(k)$ and the average app distribution $T(k)$. Thanks to this expression we finally prove that in a constrained-resources scenario the value of $p_c$ is maximized when high-degree nodes are preferentially targeted.
Our results show that optimal targeting gives rise to a dramatic increase in the value of $p_c$ compared to a strategy in which a fraction of the resources is randomly distributed. The more randomly the app is spread among the population, the smaller the increase in the percolation threshold or, equivalently, the weaker the app's power to mitigate the epidemic. Overall our results show that, even if the adoption of a tracing app has the effect of mitigating an epidemic, the same level of adoption can be optimally distributed to obtain a significantly higher mitigation effect.

FIG. 4 (beginning of caption missing in the source) ... Eq. (15), and $p_c^0 = \langle k \rangle / \langle k(k-1) \rangle$ represents the value of the percolation threshold in the absence of app coverage (which can be obtained from Eq. (13) in the limit $\kappa_T \to 0$). Here $p_c^0 = 0.00306$, while the app coverage is fixed at $\bar T = 0.39175$, corresponding to an optimal $\hat T(k)$ with $k_c = 20$ and $\hat\alpha = 1$. The plot shows that for this particular value of $\bar T$, corresponding to ∼40% of the nodes having the app, the optimal distribution is reached at $\rho = 0$ and corresponds to a ∼17-fold increase of $p_c$, whereas in the case of a purely random strategy, obtained at $\rho = \bar T$, the increase of $p_c$ is ∼1.2-fold.

J. P. Gleeson, Physical Review E 83, 036112 (2011).

MAPPING OF EPIDEMIC SPREADING TO PERCOLATION PROBLEM
We assume that the network $G = (V, E)$ of contacts is formed by $N = |V|$ individuals $i = 1, 2, \ldots, N$. Each individual is assigned a variable $T_i$ indicating whether the individual has adopted the app ($T_i = 1$) or not ($T_i = 0$). Assuming that track and tracing has an immediate effect, a person with the app can infect only if infected by a person without the app, whereas a person without the app can infect regardless of whether the infection came from a person with or without the app. For every link $(i, j) \in E$ we draw a random binary variable $x_{ij} \in \{0, 1\}$ indicating whether ($x_{ij} = 1$) or not ($x_{ij} = 0$) the eventual contact between one infected and one susceptible node, found at the two ends of the link, leads to infection. Here we assume that the average of $x_{ij}$ is given by the transmissibility $p$, i.e.
$$\langle x_{ij} \rangle = p.$$
In order to find which nodes are infected in this epidemic outbreak, we adopt the following algorithm, which uses the mapping of the stationary state of the epidemic to percolation [9].
• Pre-processing of the connections - We call $T-T$ the links connecting two individuals who have both adopted the app. These links do not contribute to the propagation of the infection to nodes other than the two connected nodes. Therefore we initially remove from the network all $T-T$ links. Specifically, we associate to each link $(i, j)$ the variable $y_{ij} \in \{0, 1\}$ defined as
$$y_{ij} = x_{ij}\,(1 - T_i T_j)$$
and indicating whether or not the link contributes to the spread of the disease in the network (excluding the two nodes $(i, j)$ of the link).
• Percolation process - We find the nodes in the giant component of the resulting percolation problem. We assign to each node the indicator variable $m_i \in \{0, 1\}$ indicating whether or not node $i$ belongs to the giant component of the network formed by the links with $y_{ij} = 1$. The nodes with $m_i = 1$ are nodes that are infected through a chain of contacts in which we can never find two consecutive infected nodes with the app.
• Calculation of the fraction of infected individuals - In order to calculate the total fraction of infected individuals we need to include, in addition to the nodes with $m_i = 1$, also the nodes with the app infected by nodes with the app. Therefore we define an indicator function $\sigma_i$ which indicates for each individual whether it is infected ($\sigma_i = 1$) or not ($\sigma_i = 0$). The value of $\sigma_i$ can be evaluated according to the boolean rule
$$\sigma_i = 1 - (1 - m_i)\prod_{j\in N(i)}\left(1 - x_{ij}\, T_i T_j\, m_j\right).$$
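The three steps above can be sketched as a single Monte Carlo realisation. This is an illustrative reading of the algorithm, not the authors' code: it assumes $y_{ij} = x_{ij}(1 - T_i T_j)$, approximates the giant component by the largest connected component of a finite network, and assumes that a node with the app is additionally infected when a neighbour with the app in that component is joined to it by a link with $x_{ij} = 1$. The function name `simulate_outbreak` and the toy path graph are hypothetical.

```python
import random
from collections import deque


def simulate_outbreak(n, edges, T, p, rng):
    """One Monte Carlo realisation; returns the fraction of infected nodes.

    n: number of nodes (labelled 0..n-1); edges: list of links (i, j), no repeats
    T: list of 0/1 app indicators; p: transmissibility; rng: random.Random
    """
    # Draw x_ij = 1 with probability p for every link.
    x = {e: rng.random() < p for e in edges}
    # Pre-processing: keep only links with y_ij = x_ij (1 - T_i T_j) = 1.
    nbrs = [[] for _ in range(n)]
    for (i, j) in edges:
        if x[(i, j)] and not (T[i] and T[j]):
            nbrs[i].append(j)
            nbrs[j].append(i)
    # Percolation: largest connected component as a finite-size stand-in
    # for the giant component (an assumption of this sketch).
    seen = [False] * n
    largest = []
    for s in range(n):
        if seen[s]:
            continue
        seen[s] = True
        comp, queue = [s], deque([s])
        while queue:
            u = queue.popleft()
            for v in nbrs[u]:
                if not seen[v]:
                    seen[v] = True
                    comp.append(v)
                    queue.append(v)
        if len(comp) > len(largest):
            largest = comp
    m = [0] * n
    for i in largest:
        m[i] = 1
    # Boolean rule: add app users infected over a T-T link by an app user
    # that belongs to the component.
    sigma = m[:]
    for (i, j) in edges:
        if x[(i, j)] and T[i] and T[j]:
            if m[j]:
                sigma[i] = 1
            if m[i]:
                sigma[j] = 1
    return sum(sigma) / n


# Hypothetical toy example: a path 0-1-2-3 with p = 1 (every link transmits).
path = [(0, 1), (1, 2), (2, 3)]
rng = random.Random(0)
frac_a = simulate_outbreak(4, path, [0, 0, 1, 1], 1.0, rng)
frac_b = simulate_outbreak(4, path, [0, 1, 1, 1], 1.0, rng)
print(frac_a, frac_b)
```

In the second configuration node 2 is infected by app user 1 over a $T-T$ link and therefore self-isolates, so node 3 is not reached. In practice one would average the returned fraction over many realisations of $\{x_{ij}\}$ and $\{T_i\}$ on a large network.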

MESSAGE PASSING ALGORITHMS FOR EPIDEMIC SPREADING IN A POPULATION PARTIALLY ADOPTING THE APP
In this section we discuss the message passing algorithms [16,17] that can be used to predict the outcome of the epidemic spreading studied in this work. We will first assume full knowledge of the configurations $\{T_i\}_{i\in V}$ and $\{x_{ij}\}_{(i,j)\in E}$. Subsequently we will relax this strong assumption by assuming that only the value of the transmissibility $p$ is known, fixing the expectation value $\langle x_{ij} \rangle = p$. Finally we will relax our assumptions further and consider the case in which the configuration $\{T_i\}_{i\in V}$ is also not known exactly and only the expectations $\langle T_i \rangle = T(k_i)$ (where $k_i$ is the degree of the generic node $i$) are known.
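In the first case, in which the exact configurations $\{T_i\}$ and $\{x_{ij}\}$ are known, the message passing algorithm on a locally tree-like network predicts that a node $i$ spreads the virus to node $j$ only if $\hat\sigma_{i\to j} = 1$. If node $i$ has the app, i.e. $T_i = 1$, we have $\hat\sigma_{i\to j} = 1$ if node $i$ has been infected by at least one neighbour without the app and $x_{ij} = 1$; otherwise $\hat\sigma_{i\to j} = 0$. On the other hand, if node $i$ does not have the app, i.e. $T_i = 0$, we have $\hat\sigma_{i\to j} = 1$ if node $i$ has been infected by at least one neighbour and $x_{ij} = 1$; otherwise $\hat\sigma_{i\to j} = 0$. Therefore the message passing algorithm reads
$$\hat\sigma_{i\to j} = x_{ij}\left[1 - \prod_{\ell\in N(i)\setminus j}\Big(1 - (1 - T_i T_\ell)\,\hat\sigma_{\ell\to i}\Big)\right],$$
where $N(i)$ indicates the neighbours of node $i$. Moreover the function $\sigma_i$ indicating whether a node is infected ($\sigma_i = 1$) or not ($\sigma_i = 0$) is given by
$$\sigma_i = 1 - \prod_{\ell\in N(i)}\left(1 - \hat\sigma_{\ell\to i}\right).$$
It follows that the epidemic threshold is determined by the equation
$$\Lambda(B) = 1. \tag{S-4}$$
Here $\Lambda(B)$ is the maximum eigenvalue of the corrected non-backtracking matrix $B$ of elements
$$B_{\ell\to i,\, i\to j} = x_{ij}\,(1 - T_\ell T_i)\,A_{\ell\to i,\, i\to j}, \tag{S-5}$$
where $A$ is defined in terms of the adjacency matrix $a$ of the network as
$$A_{\ell\to i,\, i\to j} = a_{\ell i}\, a_{ij}\,(1 - \delta_{\ell j}). \tag{S-6}$$

This algorithm should be modified if we do not have access to the full configuration $\{x_{ij}\}_{(i,j)\in E}$. In this case we assume that only the transmissibility of the disease $p = \langle x_{ij} \rangle$ is known; therefore the messages are real values $\sigma_{i\to j} \in [0, 1]$ and indicate the probability that node $i$ infects node $j$. By averaging the message passing equations over all possible configurations $\{x_{ij}\}$ at a fixed value of the transmissibility $p$ we obtain the message passing algorithm
$$\sigma_{i\to j} = p\left[1 - \prod_{\ell\in N(i)\setminus j}\Big(1 - (1 - T_i T_\ell)\,\sigma_{\ell\to i}\Big)\right], \tag{S-7}$$
where $N(i)$ indicates the neighbours of node $i$. Moreover a node $i$ is infected with probability $\sigma_i$ given by
$$\sigma_i = 1 - \prod_{\ell\in N(i)}\left(1 - \sigma_{\ell\to i}\right). \tag{S-8}$$
The epidemic threshold is again determined by Eq. (S-4), with $B$ taking the expression
$$B_{\ell\to i,\, i\to j} = p\,(1 - T_\ell T_i)\,A_{\ell\to i,\, i\to j}. \tag{S-9}$$

In order to model different scenarios corresponding to different adoption patterns of the app, we might also assume that the configuration $\{T_i\}_{i\in V}$ is not known exactly and that we only have access to the probability that a node adopts the app.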
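The threshold condition $\Lambda(B) = 1$ can be checked numerically without building the matrix explicitly. The sketch below assumes $B_{\ell\to i,\, i\to j} = p\,(1 - T_\ell T_i)\,A_{\ell\to i,\, i\to j}$ and estimates its largest eigenvalue by power iteration on $B + I$ (the identity shift is a standard device to avoid oscillations on bipartite structures and is a choice of this sketch, not something stated in the text). The name `largest_eigenvalue_B` and the toy graph are hypothetical.

```python
def largest_eigenvalue_B(edges, T, p, n_iter=5000, tol=1e-13):
    """Estimate the largest eigenvalue of the modified non-backtracking matrix.

    The matrix acts on directed links: the entry for (l -> i, i -> j), l != j,
    is assumed to be p * (1 - T_l * T_i).
    """
    nbrs = {}
    for i, j in edges:
        nbrs.setdefault(i, []).append(j)
        nbrs.setdefault(j, []).append(i)
    directed = [(i, j) for i in nbrs for j in nbrs[i]]
    v = {e: 1.0 / len(directed) for e in directed}   # positive start, sums to 1
    lam = 0.0
    for _ in range(n_iter):
        w = {}
        for (i, j) in directed:
            acc = v[(i, j)]                           # identity shift
            for l in nbrs[i]:
                if l != j:                            # non-backtracking
                    acc += p * (1 - T[l] * T[i]) * v[(l, i)]
            w[(i, j)] = acc
        norm = sum(w.values())                        # growth factor of (B + I)
        new_lam = norm - 1.0                          # undo the shift
        v = {e: val / norm for e, val in w.items()}
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    return lam


# Hypothetical toy example: complete graph on 4 nodes.
k4 = [(i, j) for i in range(4) for j in range(i + 1, 4)]
lam_none = largest_eigenvalue_B(k4, {i: 0 for i in range(4)}, p=0.5)
lam_half = largest_eigenvalue_B(k4, {0: 1, 1: 1, 2: 0, 3: 0}, p=0.5)
lam_all = largest_eigenvalue_B(k4, {i: 1 for i in range(4)}, p=0.5)
print(lam_none, lam_half, lam_all)
```

Since the matrix is linear in $p$, running the routine with $p = 1$ gives a value whose inverse estimates the threshold for a given network and app configuration. Convergence can be slow when the dominant eigenvalue is zero or degenerate, for example on trees, so the iteration cap matters there.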
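Assuming that this probability is a function of the degree of the nodes, we have $\langle T_i \rangle = T(k_i)$, with $T(k)$ describing the probability that a node of degree $k$ adopts the app. In order to formulate the message passing algorithm in this case we consider for every ordered pair of linked nodes $(i, j)$ the two messages
$$\hat\sigma^{T}_{i\to j} = \langle \sigma_{i\to j}\, T_i \rangle, \qquad \hat\sigma^{N}_{i\to j} = \langle \sigma_{i\to j}\,(1 - T_i) \rangle, \tag{S-10}$$
indicating the probability that node $i$ infects node $j$ given that node $i$ has adopted ($\hat\sigma^{T}_{i\to j}$) or not adopted ($\hat\sigma^{N}_{i\to j}$) the app. Here $\langle \ldots \rangle$ indicates the average over the probability distribution of $\{T_i\}_{i\in V}$. The message passing equations for these messages can be obtained by averaging the message passing Eqs. (S-7) over all the configurations $\{T_i\}_{i\in V}$ and read
$$\hat\sigma^{N}_{i\to j} = p\,[1 - T(k_i)]\left[1 - \prod_{\ell\in N(i)\setminus j}\left(1 - \hat\sigma^{N}_{\ell\to i} - \hat\sigma^{T}_{\ell\to i}\right)\right], \qquad \hat\sigma^{T}_{i\to j} = p\,T(k_i)\left[1 - \prod_{\ell\in N(i)\setminus j}\left(1 - \hat\sigma^{N}_{\ell\to i}\right)\right]. \tag{S-11}$$
The probability $\tilde\sigma_i$ that node $i$ is infected is given by
$$\tilde\sigma_i = 1 - \prod_{\ell\in N(i)}\left(1 - \hat\sigma^{N}_{\ell\to i} - \hat\sigma^{T}_{\ell\to i}\right). \tag{S-12}$$
The critical threshold is obtained by linearising the message passing Eqs. (S-11), which yields
$$\hat\sigma^{N}_{i\to j} = p\,[1 - T(k_i)]\sum_{\ell\in N(i)\setminus j}\left(\hat\sigma^{N}_{\ell\to i} + \hat\sigma^{T}_{\ell\to i}\right), \qquad \hat\sigma^{T}_{i\to j} = p\,T(k_i)\sum_{\ell\in N(i)\setminus j}\hat\sigma^{N}_{\ell\to i}. \tag{S-13}$$
In this way, by solving this linear system of equations, we get

[Eq. (S-14): not recoverable from the source]

Therefore we obtain that the critical point is characterized by Eq. (S-4), where $B$ is given by

[Eq. (S-15): not recoverable from the source]

In this section we show the derivation of the epidemic threshold $p_c$ in the case in which we do not know exactly the structure of the contact network, i.e. we only know that the network is a random uncorrelated network with a given degree distribution $P(k)$, and we know only the statistical properties of the configurations $\{T_i\}_{i\in V}$ and $\{x_{ij}\}_{(i,j)\in E}$. We consider the variables $S_T$ and $S_N$ indicating the probability that by following a link we reach an infected individual with the app or without the app, respectively. By averaging the message passing Eqs. (S-11) over the network ensemble we get
$$S_N = p\sum_k \frac{k P(k)}{\langle k \rangle}\,[1 - T(k)]\left[1 - (1 - S_N - S_T)^{k-1}\right], \qquad S_T = p\sum_k \frac{k P(k)}{\langle k \rangle}\,T(k)\left[1 - (1 - S_N)^{k-1}\right], \tag{S-16}$$
where $T(k)$ indicates the probability that a node of degree $k$ adopts the app. The probability that a random node gets the infection is given by
$$S = 1 - \sum_k P(k)\,(1 - S_N - S_T)^{k}. \tag{S-17}$$
The system of Eqs. (S-16) can be written as
$$F_N(S_N, S_T) = S_N - p\sum_k \frac{k P(k)}{\langle k \rangle}\,[1 - T(k)]\left[1 - (1 - S_N - S_T)^{k-1}\right] = 0, \qquad F_T(S_N, S_T) = S_T - p\sum_k \frac{k P(k)}{\langle k \rangle}\,T(k)\left[1 - (1 - S_N)^{k-1}\right] = 0. \tag{S-18}$$
The Jacobian of this system of equations, evaluated at the trivial solution $S_N = S_T = 0$, is given by
$$J = \begin{pmatrix} 1 - p\,\kappa_N & -p\,\kappa_N \\ -p\,\kappa_T & 1 \end{pmatrix}, \tag{S-19}$$
where
$$\kappa_T = \frac{\sum_k k(k-1)P(k)\,T(k)}{\langle k \rangle}, \qquad \kappa_N = \frac{\sum_k k(k-1)P(k)\,[1 - T(k)]}{\langle k \rangle}. \tag{S-20}$$
Imposing that the determinant of the Jacobian is zero, i.e. $1 - p\,\kappa_N - p^2 \kappa_N \kappa_T = 0$, and taking the positive root, we obtain that the transition is achieved for
$$p_c = \min\left(1,\ \frac{1}{2\kappa_T}\left[-1 + \sqrt{1 + 4\frac{\kappa_T}{\kappa_N}}\,\right]\right). \tag{S-21}$$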
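As a numerical companion to this derivation, the sketch below solves the ensemble equations by fixed-point iteration and evaluates the infected fraction on either side of the predicted threshold. It assumes the equations take the form $S_N = p\sum_k \frac{kP(k)}{\langle k\rangle}[1-T(k)][1-(1-S_N-S_T)^{k-1}]$, $S_T = p\sum_k \frac{kP(k)}{\langle k\rangle}T(k)[1-(1-S_N)^{k-1}]$ and $S = 1 - \sum_k P(k)(1-S_N-S_T)^k$. The function names, the truncated Poisson distribution with mean degree 4, the uniform adoption level 0.4 and the starting values are illustrative assumptions.

```python
import math


def poisson_pk(lam, kmax):
    """Poisson degree distribution truncated at kmax and renormalised."""
    pk = [math.exp(-lam) * lam ** k / math.factorial(k) for k in range(kmax + 1)]
    norm = sum(pk)
    return [q / norm for q in pk]


def ensemble_infected_fraction(pk, Tk, p, n_iter=5000):
    """Iterate the assumed ensemble equations and return the infected fraction S."""
    mean_k = sum(k * q for k, q in enumerate(pk))
    sN, sT = 0.3, 0.3                      # arbitrary nonzero starting point
    for _ in range(n_iter):
        new_sN = 0.0
        new_sT = 0.0
        for k, (q, t) in enumerate(zip(pk, Tk)):
            if k == 0:
                continue                   # degree-0 nodes are never reached by a link
            w = k * q / mean_k             # probability that a link leads to degree k
            new_sN += p * w * (1 - t) * (1 - (1 - sN - sT) ** (k - 1))
            new_sT += p * w * t * (1 - (1 - sN) ** (k - 1))
        sN, sT = new_sN, new_sT            # simultaneous update
    return 1.0 - sum(q * (1 - sN - sT) ** k for k, q in enumerate(pk))


pk = poisson_pk(4.0, 40)
no_app = [0.0] * len(pk)
uniform_app = [0.4] * len(pk)

# p = 0.27 lies above the no-app threshold (about 0.25 for this distribution)
# but below the threshold predicted for 40% uniform adoption (about 0.286).
S_no_app = ensemble_infected_fraction(pk, no_app, 0.27)
S_with_app = ensemble_infected_fraction(pk, uniform_app, 0.27)
print(S_no_app, S_with_app)
```

For this choice of $p$ the iteration should settle on a finite outbreak without the app and on the trivial solution with it, which is the behaviour the closed-form threshold predicts. Close to the threshold the iteration converges slowly, so the fixed number of iterations is a practical limitation of the sketch.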

NUMERICAL VALIDATION OF THE THEORETICAL PREDICTIONS
We have validated the proposed message passing framework by conducting extensive numerical simulations using the three message passing algorithms and the Monte Carlo simulations. We considered the choice
$$T(k) = \rho + (1 - \rho)\,\theta(k - k_c, \hat\alpha), \tag{S-22}$$
where $\theta(k - k_c, \hat\alpha)$ is the discrete Heaviside step function taking the value $\hat\alpha$ at $k = k_c$, and $\rho \in [0, 1]$ denotes a uniform fraction of individuals adopting the app. The phase diagrams obtained using the three different message passing algorithms are consistent. In particular, when these algorithms are applied to a network drawn from a network ensemble they give results whose differences vanish in the large-network limit. As evidence of this result, in Figure S-1 we compare the phase diagrams obtained using the three message passing algorithms for a Poisson network with average degree $\lambda = 4$ and $N = 10^4$ nodes.
In the main text of this Letter we have shown the perfect agreement between the message passing algorithm defined in Eq. (S-11) and Eq. (S-12) and the Monte Carlo simulations averaged over the distribution of $\{x_{ij}\}_{(i,j)\in E}$ and the distribution of $\{T_i\}_{i\in V}$ in the case of a Poisson network. In Figure S-2 we show that this excellent agreement also extends to heterogeneous networks.
We have also studied the results obtained by averaging over several Monte Carlo simulations for Poisson networks, BA networks and uncorrelated scale-free networks (see Figure S