Spread of infectious disease and social awareness as parasitic contagions on clustered networks

There is a rich history of models for the interaction of a biological contagion like influenza with the spread of related information such as an influenza vaccination campaign. Recent work on the spread of interacting contagions on networks has highlighted that these interacting contagions can have counterintuitive interplay with network structure. Here, we generalize one of these frameworks to tackle three important features of the spread of awareness and disease: one, we model the dynamics on highly clustered, cliquish networks to mimic the role of workplaces and households; two, the awareness contagion affects the spread of the biological contagion by reducing its transmission rate, as an aware or vaccinated individual is less likely to be infected; and three, the biological contagion also affects the spread of the awareness contagion, but by increasing its transmission rate, as an infected individual is more receptive and more likely to share information related to the disease. Under these conditions, we find that increasing network clustering, which is known to hinder disease spread, can actually allow networks to sustain larger epidemics of the disease in models with awareness. This counterintuitive result goes against the conventional wisdom suggesting that random networks are justifiable as they provide worst-case scenario forecasts. To further investigate this result, we provide a closed-form criterion based on a two-step branching process (i.e., the expected number of tertiary infections) to identify different regions in parameter space where the net effect of clustering and coinfection varies. Altogether, our results highlight once again the need to go beyond random networks in disease modeling and illustrate the type of analysis that is possible even in complex models of interacting contagions.


Introduction
Models of contagion are used to study the transmission dynamics of a pathogen or information being transmitted through a structured population. Most of these are defined as compartmental models [1], which mathematically distinguish individuals based on their state; i.e., whether they are susceptible to a contagion or infectious with that contagion. Using this approach, coupling different contagions to model their interactions is straightforward as we can then simply distinguish individuals based on all possible combinations of states for the different contagions. Of particular interest is the coupling of an infectious disease with the spread of [...]
Awareness and disease as parasitic infections

Network structures
To study parasitic contagions on clustered contact networks, we use a general definition of community structure where every network is decomposed in terms of groups [19]. The contact network between individuals can thus be interpreted as the projection of a bipartite network where nodes are connected to social groups of different sizes. In this context, even random links are interpreted as groups of size two. The network topology of our model is illustrated in Fig. 1. In order to highlight the effects of community structure (CS) versus random structure, the CS network will be compared with its equivalent random network (ERN): a network with exactly the same degree distribution, but with randomly connected nodes. Both topologies will be studied analytically and numerically.
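The group-based construction described above can be sketched in a few lines. This is only an illustration, not the generator used in the paper: the function name `clique_network`, the stub-shuffling procedure, and the parameter values are assumptions. Each node receives a fixed number of membership stubs, stubs are shuffled and cut into groups, and each group is fully connected.

```python
import random
from collections import defaultdict

def clique_network(num_nodes, memberships, group_size, seed=0):
    """Return (groups, neighbors) for a random clique-based network.

    Hypothetical helper: every node gets `memberships` stubs, the stubs are
    shuffled and cut into groups of `group_size`, and each group is then
    projected onto a fully connected clique.
    """
    if (num_nodes * memberships) % group_size != 0:
        raise ValueError("stubs must divide evenly into groups")
    rng = random.Random(seed)
    stubs = [node for node in range(num_nodes) for _ in range(memberships)]
    rng.shuffle(stubs)
    groups = [stubs[i:i + group_size] for i in range(0, len(stubs), group_size)]
    neighbors = defaultdict(set)
    for group in groups:
        for u in group:
            for v in group:
                if u != v:  # no self-loops in the projection
                    neighbors[u].add(v)
    return groups, neighbors

groups, neighbors = clique_network(num_nodes=1000, memberships=2, group_size=10)
max_degree = max(len(nbrs) for nbrs in neighbors.values())
```

With random stub matching, a node can occasionally land twice in the same group or share two groups with the same neighbor, so some degrees fall slightly below memberships × (group_size − 1); these overlaps become rare as the network grows.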
Typical network datasets are often only available as a collection of pairwise edges rather than higher-order structure like groups. One then has to rely on numerical methods such as community detection to infer group structure [20]. Likewise, for theoretical models, one can simply rely on known distributions of groups per node (membership) and of nodes per group (group size) from previous studies on overlapping communities [21,22]. We here use the simplest possible distributions in order to avoid confounding the impact of group structure with that of degree heterogeneity or degree correlations [23].
The dynamics of a single contagion on this community structure model was studied in [17]. Using a mean-field description, it was shown that the clustering of links in groups slowed down propagation as links are wasted on redundant connections instead of reaching new individuals. Expanding on this study, we more recently introduced a similar mean-field description for two synergistic diseases [18], which is the model that we here generalize to interactions of a different nature.

Dynamical process
To model the concurrent spread of an infectious disease and awareness of it on clustered networks, we introduce a generalization of the model of interacting contagions used in Ref. [18]. We study the coevolution of two Susceptible-Infectious-Susceptible (SIS) processes such that, at any given time, the state of each individual is determined by their status regarding the two contagion processes. Without interaction with the other contagion, an individual with contagion i would infect its susceptible neighbors at a rate β_i and recover at a rate α_i, but we here introduce a parametrization scheme to modify these rates and model possible interactions as generally as possible. Note that our model is general and could be applied to any type of pairwise interaction between two SIS processes. However, as the notation will become quite involved, we ground our derivation by referring to the first contagion as the disease (with natural parameters β_D and α_D) and to the second contagion as awareness (with natural parameters β_A and α_A).
To keep track of both contagions simultaneously, we distinguish nodes by their state [XY]_m where m is their membership number, X ∈ {S_1, I_1} corresponds to their state regarding the first contagion and Y ∈ {S_2, I_2} their state regarding the second. Similarly, we distinguish groups by their size n and the states of the nodes they contain, i.e., [ijk]_n, where i is the number of nodes infected with the first contagion only, j the number infected with the second contagion only, and k the number infected with both. Keeping track of the number of nodes with both contagions is critical considering that we are interested in the effect of co-infection.
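To give a sense of the size of the state space implied by the group states [ijk]_n, the short sketch below enumerates every admissible triple for a group of size n. The helper name `group_states` is an assumption made for illustration; the only constraint used is that i, j and k are non-negative and sum to at most n.

```python
def group_states(n):
    """List every (i, j, k) with i + j + k <= n for a group of size n."""
    return [(i, j, k)
            for i in range(n + 1)
            for j in range(n + 1 - i)
            for k in range(n + 1 - i - j)]

states = group_states(10)
print(len(states))  # 286 distinct states for groups of size 10
```

The count equals the binomial coefficient C(n + 3, 3), so the number of group-state equations grows cubically with group size.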
In the original model of Ref. [18], co-infection had a symmetric effect on both contagions, embodied in a single interaction parameter. For parasitic contagions, we first want one contagion, the awareness, to benefit from being in the neighborhood of the other, the disease. Individuals might be more likely to listen to an awareness campaign if they are themselves sick or if the message comes from a sick individual. Likewise, they might be less likely to forget important information related to a disease if they are currently infected. We thus expect an increase in awareness transmission rate around infected individuals, and a decrease in loss of awareness for infected individuals. Second, we also want the disease to be hindered whenever nodes in a given neighborhood are aware of transmission risks or related treatment options. We might thus expect a decrease in disease transmission rate around aware individuals and/or an increase in disease recovery rate for aware individuals who might avoid contacts and seek treatment.
To track all these possible interactions, we therefore need to distinguish each possible infection by the state [XY] of the infector and the state [UV] of the infectee. The interaction of these states is embodied in a set of parameters, ρ^{XY}_{UV}, γ^{XY}_{UV}, τ_D and τ_A. The first two give the factors affecting the transmission rates of the first and second contagion, respectively, when dealing with a [XY] to [UV] contact. For example, a [I_1 I_2] individual will transmit the disease to a [S_1 I_2] individual at a rate ρ^{II}_{SI} β_D. Of course, ρ^{XY}_{UV} = 0 whenever U ≡ I and γ^{XY}_{UV} = 0 whenever V ≡ I, as these individuals are already infected with the corresponding contagion; similarly, ρ^{XY}_{UV} = 0 whenever X ≡ S and γ^{XY}_{UV} = 0 whenever Y ≡ S, as only infected individuals can transmit the contagion. We also consider that ρ^{IS}_{SS} = γ^{SI}_{SS} = 1 to preserve the natural transmission rate of each contagion, although we still use these terms in the general equations. Finally, τ_D and τ_A give the factor by which the recovery rate of the disease or of awareness is modified if the individual is also aware or sick, respectively.
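The constraints on the transmission factors can be made concrete with a small sketch. The function names `rho` and `gamma`, the two-letter string encoding of states, and the use of a single `factor` argument for every non-trivial contact are simplifying assumptions for illustration; in the model, each infector and infectee pair may carry its own factor.

```python
def rho(infector, infectee, factor):
    """Factor applied to the disease transmission rate for a contact.

    States are two-letter strings such as "IS": the first letter is the
    disease state, the second letter the awareness state.
    """
    if infector[0] == "S" or infectee[0] == "I":
        return 0.0  # infector has no disease, or infectee is already sick
    if infector == "IS" and infectee == "SS":
        return 1.0  # natural disease transmission rate is preserved
    return factor

def gamma(infector, infectee, factor):
    """Factor applied to the awareness transmission rate for a contact."""
    if infector[1] == "S" or infectee[1] == "I":
        return 0.0  # infector is not aware, or infectee is already aware
    if infector == "SI" and infectee == "SS":
        return 1.0  # natural awareness transmission rate is preserved
    return factor

beta_D = 0.02  # illustrative value only
# Rate at which a co-infected node transmits the disease to an aware,
# disease-free neighbor, with a hypothetical factor of 0.5.
rate = rho("II", "SI", 0.5) * beta_D
```

Encoding the zero and unit cases in one place keeps the general equations free of special cases, which is the same reason the text keeps ρ^{IS}_{SS} and γ^{SI}_{SS} explicit even though they equal one.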

Mean-field description
A mean-field description of the time evolution of our general model can be written in the spirit of previous formalisms. Leaving out all explicit mention of time dependencies, as all variables and mean-fields vary in time, the population density within each node state evolves according to Eqs. (1) to (4), which involve, for each node state [UV], a mean-field value of interactions representing the expected number of interactions with contagion i, per membership, for a node in that state. Notice that in the equations, the first row of terms corresponds to the recovery events, and the second to the infection events. The challenge in writing the equations is thus solely to correctly identify to which state each event transfers some population density. Conservation of total population density (i.e., the sum over all state densities remains equal to one) is easily verified since the sum of Eqs. (1) to (4) is zero.
March 25, 2020 5/15
Let us assume that we know the density [ijk]_n of cliques that contain n individuals with i nodes contagious with the first contagion only, j contagious with the second contagion only, and k with both contagions. We could use this information to write the interaction mean-fields for the average level of interaction with contagious individuals within a given group. These expressions can be understood with the following logic. For instance, in the case of a node in state [SS], a susceptible individual is twice as likely to be part of a clique with twice as many susceptible nodes, which is what the (n − i − j − k) factor takes into account. We then simply average the infection terms of each possible clique, i.e., iρ^{IS}_{SS} + kρ^{II}_{SS}, over this biased distribution of cliques.
With a variant of these mean-fields, we can now follow the evolution of group states by a general, but complicated, equation, Eq. (9), which is defined over all non-negative integers with n ≥ 2 and i + j + k ≤ n. Eq. (9) is coupled to the previous system of ODEs through the mean-field values of excess interactions B^{(x)}_{UV}, representing interactions with outside groups. The first four terms of Eq. (9) are those corresponding to recoveries: positive for those corresponding to cliques relaxing into [ijk]_n and negative for those where [ijk]_n relaxes into a less infected state. The other terms represent each possible infection event. Notice that creating a k individual implies removing either an i or a j individual, through their infection with contagion 2 or 1 respectively; just as recoveries can create i or j individuals when a k individual recovers from contagion 2 or 1.
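The size-biased averaging described above for a susceptible node can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact mean-field: the function name `biased_infection_average`, the dictionary layout for clique densities, and the omission of the transmission-rate prefactor are all choices made here for clarity.

```python
def biased_infection_average(clique_density, rho_is_ss=1.0, rho_ii_ss=1.0):
    """Average i*rho_is_ss + k*rho_ii_ss over cliques, weighting each clique
    by its number of fully susceptible members (n - i - j - k).

    `clique_density` maps (n, i, j, k) to the density of cliques in that state.
    """
    numerator = 0.0
    denominator = 0.0
    for (n, i, j, k), density in clique_density.items():
        susceptible = n - i - j - k
        weight = susceptible * density  # bias toward cliques with more susceptibles
        numerator += weight * (i * rho_is_ss + k * rho_ii_ss)
        denominator += weight
    return numerator / denominator if denominator > 0 else 0.0

# Illustrative densities: mostly healthy cliques, a few with infections.
cliques = {(10, 0, 0, 0): 0.9, (10, 2, 0, 1): 0.1}
value = biased_infection_average(cliques, rho_is_ss=1.0, rho_ii_ss=0.5)
```

In this toy example the healthy cliques dominate the weighting because they contain more susceptible nodes, which pulls the average exposure of a susceptible node well below the exposure inside an infected clique.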

Validation
To validate the accuracy of our mean-field description, we run simulations on highly clustered networks where every node belongs to 2 cliques of size 10. We use this network for two reasons. First, to avoid degree-degree correlations, such that we know that the effect of clustering will be the main structural effect. Second, to feature a realistic local clustering coefficient, C, i.e., the ratio of triangles to pairs of links around a given node, which is here C = 0.47. In Fig. 2, we show prevalence for the disease and awareness over time on a clustered network (CS) and on its equivalent random network (ERN), using both Monte Carlo simulations and our ODE system. The accuracy of the mean-field approximation was expected given Refs. [17,18], and for the rest of the paper we therefore rely on the ODE system rather than slower Monte Carlo simulations.
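The quoted clustering coefficient can be checked by direct counting. The sketch below assumes that a node's cliques overlap only at that node, so that triangles around it come exclusively from pairs of neighbors within the same clique; the function name `local_clustering` is introduced here for illustration.

```python
from math import comb

def local_clustering(memberships, group_size):
    """Ratio of triangles to pairs of links around a node that belongs to
    `memberships` cliques of size `group_size` sharing no other node."""
    degree = memberships * (group_size - 1)
    triangles = memberships * comb(group_size - 1, 2)
    return triangles / comb(degree, 2)

print(round(local_clustering(2, 10), 2))  # 0.47
```

For 2 cliques of size 10, the node has 18 neighbors, 72 triangles and 153 pairs of links, giving 72/153 ≈ 0.47.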
Most importantly, while we know that an awareness campaign or network clustering can both hinder the spread of a disease, it appears that network clustering can actually help a disease spread further when it is competing against a second contagion such as an awareness campaign.This result shows once more that the impacts of different dynamical or structural features can combine in non-trivial ways in models of contagion on networks.

Two-step branching process
In Ref. [18], we also introduced a simple criterion to determine whether a clustered structure would spread two synergistic contagions faster. This criterion can conceptually be thought of as a generalization of the basic reproductive number (R_0, the number of secondary infections from an average infectious individual in a completely susceptible population), which is often used to characterize the initial speed of epidemics. In our case, the criterion considers two infection steps in order to include clustering. Physically, it can be interpreted as a two-step branching process, as we count the number of tertiary infections caused by a co-infected individual (i.e., how many second neighbors will be infected). That being said, the analogy is imperfect: the "branching process" does not repeat itself since we do not distinguish which contagion(s) caused those tertiary infections. Yet, it proved to be a useful tool in Ref. [18] to identify the net effect of clustering across parameter space; i.e., to determine whether clustering speeds up or slows down propagation.
We start with a single node infected with both contagions, and denote the average excess degree of recently infected nodes as z_1 (i.e., we assume this node received the contagions from a single neighbor and z_1 describes the average number of other neighbors this node is expected to have). For the first step of our criterion, we need to distinguish the probability of transmitting only the disease, only the awareness, or both. Since we ignore re-infection events, the latter scenario can occur in two ways: either by transmitting both while co-infected; or by transmitting the first (or second) while co-infected before recovering from it and then transmitting the second (or first). Summing the two events yields the probability T_II of a co-infected node transmitting both contagions to a given first neighbor. Similarly, a co-infected node can transmit only the disease in two ways, either by infecting while co-infected then recovering before transmitting awareness, or by recovering from awareness before transmitting the disease. Again, summing these events gives the probability T_IS of a co-infected node transmitting only the disease. The same logic applies to the probability T_SI of transmitting only the second contagion.
In its first neighborhood, we now know that a single co-infected individual will on average cause z_1 T_II co-infections, z_1 T_IS transmissions of the disease only, and z_1 T_SI transmissions of the awareness only. We call those secondary infections. Our two-step branching process then looks at the number of tertiary infections, i.e., the number of transmission events of either contagion in the second neighborhood.
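The three transmission probabilities can be estimated numerically by simulating the race between transmission and recovery events along a single edge. The sketch below is a simplified stand-in rather than the closed-form expressions of the analysis: the function name `transmission_outcomes` is hypothetical, a single factor `rho` (respectively `gamma`) rescales disease (respectively awareness) transmission while the infector is co-infected, `tau_d` and `tau_a` rescale recovery rates while co-infected, and the changing state of the neighbor is ignored.

```python
import random

def transmission_outcomes(beta_d, beta_a, alpha_d, alpha_a,
                          rho=1.0, gamma=1.0, tau_d=1.0, tau_a=1.0,
                          runs=50_000, seed=1):
    """Monte Carlo estimate of (T_II, T_IS, T_SI) for one co-infected node
    and one initially susceptible neighbor, ignoring re-infection."""
    rng = random.Random(seed)
    tally = {"II": 0, "IS": 0, "SI": 0, "none": 0}
    for _ in range(runs):
        sick = aware = True        # current state of the infector
        gave_d = gave_a = False    # what the neighbor has received so far
        # Continue while at least one transmission is still possible.
        while (sick and not gave_d) or (aware and not gave_a):
            both = sick and aware
            rates = {
                "give_d": beta_d * (rho if both else 1.0) if sick and not gave_d else 0.0,
                "give_a": beta_a * (gamma if both else 1.0) if aware and not gave_a else 0.0,
                "rec_d": alpha_d * (tau_d if both else 1.0) if sick else 0.0,
                "rec_a": alpha_a * (tau_a if both else 1.0) if aware else 0.0,
            }
            # The next event is chosen with probability proportional to its rate.
            event = rng.choices(list(rates), weights=list(rates.values()))[0]
            if event == "give_d":
                gave_d = True
            elif event == "give_a":
                gave_a = True
            elif event == "rec_d":
                sick = False
            else:
                aware = False
        key = "II" if gave_d and gave_a else "IS" if gave_d else "SI" if gave_a else "none"
        tally[key] += 1
    return tuple(tally[k] / runs for k in ("II", "IS", "SI"))

t_ii, t_is, t_si = transmission_outcomes(0.5, 2.0, 1.0, 1.0)
```

Without interaction (all factors equal to one), the two contagions decouple and the estimates should approach the familiar single-contagion result: the disease is transmitted with probability β_D/(β_D + α_D) and awareness with probability β_A/(β_A + α_A).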
In a clustered network, there is an overlap between the second neighborhood and the first, such that neighbors of the original co-infection can be infected during the second step of the process if they were not already. Let us consider one of the z_1 T_IS first neighbors infected only with the disease and now trying to infect a susceptible node. We know that in its own first neighborhood, a number (z_1 − 1)C(T_II + T_IS) of them are already infected with the same contagion (z_1 − 1 is an approximation, equal to its excess degree minus the targeted susceptible node). The fact that a fraction of its neighborhood is already infected by the root node is the negative impact of clustering on the dynamics. However, a number (z_1 − 1)C(T_II + T_SI) are now also aware, such that they could transmit awareness to the node of interest and change its transmissibility. This is a potentially positive impact of clustering depending on the nature of the coupling between contagions (e.g., positive for the spread of awareness, negative for the disease itself).
Still considering the same first neighbor infected with the disease only, we need to know the rate at which it is co-infected by one of its (z_1 − 1)C(T_II + T_SI) aware neighbors. Assume that we know the value of that rate, denoted x_D for a co-infection to a diseased node; then the probability of co-infection before recovery would simply be x_D/(x_D + α_D). Since we can also write that probability in terms of every node involved recovering before co-infection, we can require an equality between the two expressions. The same logic applies for co-infection involving a node that is aware but not sick. We can then solve for the effective rates of co-infection through clustering, i.e., x_D and x_A. With these effective rates, we can write the probabilities of a tertiary transmission of either disease or awareness, respectively [...]
[...] disease eradication by depleting the pool of susceptible individuals. At very low β_A, the awareness contagion fails to spread and the disease is left unhindered. At intermediate values of β_A, awareness is able to spread mostly due to its interaction with the disease, meaning it will reach a fraction of those already reached by the disease and fail to invade the susceptible population. In this regime, we find a non-monotonic relationship between the prevalence and transmission rate of awareness because increasing β_A increases the probability of awareness reaching sick neighbors, while also decreasing the global fraction of sick individuals. After a certain threshold in β_A, the prevalence of the disease falls to zero and awareness then spreads as a regular contagion.
As shown in Fig. 3, the epidemic thresholds predicted by R_1^{(D)} = 1 are typically within a factor of 2 of the true epidemic threshold. While this is a good approximation, we find that in all cases the branching factor analysis systematically underestimates the robustness of the outbreak. This is most likely due to the fact that the analysis is seeded with a co-infected individual, while awareness and disease are likely to drift apart, benefiting the disease.

Peak values
In Ref. [18], R_1 was used to determine whether two synergistic diseases would spread faster on a clustered or random network. The idea is that while clustering typically slows down dynamics, there can be an acceleration associated with the synergistic interactions and the benefit of being kept together by clustering. Here, both clustering in network structure and the interaction with awareness slow down the spread of the disease. We therefore do not expect to find a regime of accelerated disease spread. However, it is possible that clustering slows down awareness more than it slows down the disease, in which case slower dynamics might still lead to a higher epidemic peak.
In Fig. 4, we vary the transmission rate of the disease and its interaction with awareness while tracking whether R [...] structure) or with C = 0 (larger peak on the equivalent random network). We find that there can indeed be two separate regimes and the branching factor analysis provides a good approximation of where this crossover can occur.

Outlook
With disease transmission comes the possibility for awareness of the disease and of the risk factors associated with its transmission. Awareness of the disease may cause individuals to respond by reducing their own transmissibility or adopting preventative behaviors. Here, we explored the effects of awareness in a model that looks at both disease and awareness as co-contagions in a parasitic relationship: spread of the disease leads to transmission of awareness which in turn leads to decreased disease prevalence as a result of reduced disease transmission around aware individuals. Our results show that interacting co-contagion models lead to different dynamics depending on the network structure on which they unfold. Characteristic measures such as the final outbreak size and the peak incidence exhibit regimes where they can be higher in networks exhibiting clustering than on equivalent but random network structures. Altogether, our study highlights once again the need for disease models to go beyond random networks as social clustering can lead to either smaller or larger forecasts depending on the dynamics at play. We showed how interactions between contagions can combine with network structure in non-trivial ways and are therefore especially important to include in disease models.
To this end, we have generalized the tools of Ref. [18] to account for more complicated interaction mechanisms. In doing so, we end up with useful analytical tools, but their development becomes so involved and complicated as to be almost intractable. This raises the important problem of developing effective models for interacting contagions whose complexity does not grow exponentially with the number of contagions or with the number of interaction mechanisms. Indeed, not only do infectious diseases interact with social contagions such as vaccination and other preventative behaviors, but they also interact, often synergistically, with other biological infections [24-26]. New tools are therefore needed to account for all of these interactions in a tractable and insightful analytical framework [27].
Finally, with these theoretical advances also comes the need for improved data collection on the dynamics of awareness spreading and methods to measure how this materializes into effective preventative behaviors. With most of it now shared on online social media, information and messages regarding public health crises are increasingly important in shaping human behavior during epidemics. Unfortunately, data surrounding that messaging are not readily available to researchers and public health officials, even if we know it interacts in critical ways with our models and forecasts. The parallel development of theoretical frameworks and of data sharing protocols for social messaging related to public health crises will be invaluable going forward. Public awareness is an integral part of public health and advances to address its social media dimension should be integrated into existing public health surveillance systems.

Fig 1. Schematization of the particular topology and dynamics studied in this paper. An open circle represents a susceptible individual; a shaded one, a contagious individual (infected with the disease, awareness, or both); and a black circle represents a group (or clique). The topology is constructed by allowing individuals to belong to a given number of cliques where they can be linked to other participants (solid lines). Note that in the formalism, the cliques are distinguished by their exact population and state, while the precise connections between them remain unspecified. Modified from Ref. [17].

Fig 2. Parasitic interaction in a population where nodes all belong to two groups of 10 nodes. Markers represent average results of Monte Carlo simulations with error bars representing the standard deviation of over 100 runs on networks of 50 000 nodes. Solid curves are obtained by integrating the mean-field formalism. Results on the CS are shown in shade and those on its ERN are shown in black. The dynamics follow β_D = 0.02, β_A = 0.25, α_D = α_A = τ_D = τ_A = 1.0, ρ^{IS}_{UV} = 1, ρ^{II}_{UV} = 10 and γ^{XY}_{UV} = 0.05, except ρ^{IS}_{SS} = γ^{SI}_{SS} = 1.0.

Fig 3. Using the same clustered network structure and parametrization as in Fig. 2, we now plot the final prevalence (i.e., final steady-state size) of disease and awareness across a range of parameters. a) We vary the transmission rate of awareness. b) We vary the transmission rate of the disease. c) We vary the increase in awareness transmission around sick individuals. d) We vary the decrease in disease transmission around aware individuals. The dotted vertical line marks the analytical epidemic threshold (if any) as approximated by R_1^{(D)} = 1. All parameters are fixed to the following values unless we explicitly vary them: β_D = 0.02, β_A = 0.25, α_i = τ_i = 1, ρ^{IS}_{UV} = 1, ρ^{II}_{UV} = ρ = 100 and γ^{XY}_{UV} = γ = 0.005, except ρ^{IS}_{SS} = γ^{SI}_{SS} = 1.0.

Table 1. Description of all parameters in the model.
Symbol    Definition
{g_m}     Distribution of groups per node (memberships)
{p_n}     Distribution of nodes per group (sizes)
β_D       Transmission rate of the disease (S_1 → I_1)
β_A       Transmission rate of awareness (S_2 → I_2)
α_D       Recovery rate of the disease (I_1 → S_1)
α_A       Recovery rate of awareness (I_2 → S_2)
τ_D       Factor of α_D when the infected is also aware
τ_A       Factor of α_A when the infected is also sick