Modeling the Spatiotemporal Epidemic Spreading of COVID-19 and the Impact of Mobility and Social Distancing Interventions

On 31 December, 2019, an outbreak of a novel coronavirus, SARS-CoV-2, that causes the COVID-19 disease, was first reported in Hubei, mainland China. This epidemics ’ health threat is probably one of the biggest challenges faced by our interconnected modern societies. According to the epidemiological reports, the large basic reproduction number R 0 ∼ 3 . 0 , together with a huge fraction of asymptomatic infections, paved the way for a major crisis of the national health capacity systems. Here, we develop an age-stratified mobility-based metapopulation model that encapsulates the main particularities of the spreading of COVID-19 regarding (i) its transmission among individuals, (ii) the specificities of certain demographic groups with respect to the impact of COVID-19, and (iii) the human mobility patterns inside and among regions. The full dynamics of the epidemic is formalized in terms of a microscopic Markov chain approach that incorporates the former elements and the possibility of implementing containment measures based on social distancing and confinement. With this model, we study the evolution of the effective reproduction number R ð t Þ , the key epidemiological parameter to track the evolution of the transmissibility and the effects of containment measures, as it quantifies the number of secondary infections generated by an infected individual. The suppression of the epidemic is directly related to this value and is attained when R < 1 . We find an analytical expression connecting R with nonpharmacological interventions, and its phase diagram is presented. We apply this model at the municipality level in Spain, successfully forecasting the observed incidence and the number of fatalities in the country at each of its regions. The expression for R should assist policymakers to evaluate the epidemics ’ response to actions, such as enforcing or relaxing confinement and social distancing.


I. INTRODUCTION
As of August 2020, the novel coronavirus, SARS-CoV-2, has infected more than 20 000 000 persons worldwide with COVID-19, causing more than 700 000 deaths. From its onset, the scientific community has embarked on an unprecedented search for efficient ways to stop the advance of the epidemic wave. This extraordinary effort has involved different disciplines: epidemiological studies to characterize the transmission patterns of the new coronavirus, large-scale clinical trials to test treatments that improve the course of the most severe cases, and the design of vaccines that safely provide immunity to the world's population.
Once the epidemiological parameters were found [1][2][3][4][5][6][7][8][9][10], the mathematics and physics community made use of the most advanced epidemic models [11][12][13][14][15][16][17][18] to track and anticipate the spread of the epidemics (see Ref. [19] for an exhaustive review on these efforts). Nevertheless, the particularities of both SARS-CoV-2 transmission and the heterogeneous clinical evolution of patients with COVID-19 call for a rethinking of conventional models toward adapted models that explicitly take into account these details and the characteristics of the populations affected by the pandemic. Among these characteristics, capturing the specific demographic distribution, the complex patterns of social contacts, and the geographic mobility networks among populations is essential to understand and forecast the impact that nonpharmacological containment measures, such as social distancing or confinement, have on the spread of the pathogen.
Here, we propose a mathematical (mechanistic) model particularly tailored to capture the main features of the propagation of SARS-CoV-2. This model relies on previous metapopulation models by the authors [20][21][22][23][24] to incorporate the specific characteristics of SARS-CoV-2 transmission and the clinical features of COVID-19 patients. Concerning transmission, we consider the important effect that the large fraction of asymptomatic infections (around 40% [25]) have on the covert spread of the disease in the early stages, causing a delayed response of containment measures based on the observed incidence of symptomatic patients. For the clinical aspect, we consider the long hospitalization times required by critical cases admitted to the intensive care unit (ICU), since the saturation of these units constitutes one of the main political and health threats of COVID-19 epidemics.
To incorporate the former knowledge into a single framework, we build an epidemiological model with ten compartments characterizing the epidemiological and clinical status of individuals in each patch that make up the metapopulation. In addition, we split the former partition into three age strata (young, adults, and older adults), thus allowing us to capture in a stylized way the main epidemiological, clinical, and behavioral differences between these groups. On the one hand, SARS-CoV-2 importation and exportation events between patches are mostly due to the mobility of the active population. On the other hand, the medical evolution of COVID-19 displays significant differences across age strata [13,26,27]. In this regard, infections in the young group lead to null or mild symptoms [28], whereas, for older individuals, the infection evolves toward more severe symptoms and often requires hospitalization.
The detailed structure of the metapopulation framework offers the possibility of designing and evaluating the impact of contention policies to stop the propagation of SARS-CoV-2. In particular, we focus on those policies relying on global or targeted quarantine measures due to the extreme relevance of these interventions for the course of the pandemic. This way, our formalism allows us to incorporate the temporal evolution of the fraction of the confined population and to assess the performance of lockdown policies to mitigate the pandemic. Taking advantage of this possibility, we explore several epidemic scenarios characterized by different contention measures and evaluate their impact on the decrease of the epidemic prevalence and the saturation of the Spanish health system during March, April, and May 2020. As a very relevant application, our model enables us to derive the selection of the minimum degree of confinement needed to avoid the health system crisis while not disrupting the economic fabric of the country.
In a nutshell, the proposed model takes into account, in a stylized way, three main peculiarities of the SARS-CoV-2 transmission: (i) the silent transmission of the pathogen through the young portion of the population, (ii) the large potential for the spatial dissemination of the pathogen provided by the mobility patterns of adults, and (iii) the severe pathology caused by COVID-19 in the elderly that yields to a dramatic increase of medical and hospital demands. Thus, the model can be viewed as three coevolving spreading processes with different spatiotemporal scales.
The analysis of the model and the main results are presented as follows. First, we show the main features of the dynamical model in Sec. II, describing first how epidemiological, clinical, and social data are incorporated into the metapopulation framework and, finally, showing its validation comparing with data corresponding to the COVID-19 outbreak in Spain. Once the model is defined and validated, we tackle in Sec. III the analytical derivation of the effective reproduction number. In Sec. IV, this expression is analyzed in detail to unravel the nonlinear dependency between the mobility restrictions and the epidemic incidence. Finally, in Sec. V, we round off the article by summarizing the main findings of the work and discussing the implications in public health policies aimed at facing COVID-19.

II. METAPOPULATION MODEL
We present below the details of the metapopulation framework tailored for capturing the spread of SARS-CoV-2 and the evolution of COVID-19 patients. This presentation is divided into four parts: the explanation of the compartmental dynamics for each age stratum in a single patch, the description of contagion processes governed by social contacts and the mobility among patches, the incorporation of contention measures aimed at reducing human mobility and increasing social distance, and, finally, the validation of the model with data corresponding to the epidemic wave in Spain.

A. Compartmental dynamics
The epidemic model is built upon previous work on the method known as the microscopic Markov chain approach (MMCA) [29,30]. It is designed for the modelization of epidemic spreading dynamics but is also applied to other types of processes, e.g., information spreading and traffic congestion [31][32][33][34][35][36].
In the original MMCA formalism, which analyzes a susceptible-infected-susceptible dynamics in a single population (i.e., each node representing an individual), the description of the dynamics is made using as variables the probabilities p i ðtÞ of each node i being infected at time t. The MMCA equations express how these probabilities evolve in time [29]: A node is infected at a time step if it was previously infected and did not recover or was susceptible and became infected by one of its neighbors: p i ðt þ 1Þ ¼ p i ðtÞð1 − μÞ þ ½1 − p i ðtÞΠ i ðtÞ. These equations just rely on probabilities, i.e., sums of probabilities of disjoints events, and products of probabilities of (approximately) independent events. Moreover, they are deterministic; thus, they can be iterated from any desired initial condition to know the evolution of the p i ðtÞ of each node.
The capacity of describing the epidemics at the level of nodes explains the qualification of MMCA as a microscopic methodology. This characteristic distinguishes MMCA from the more common mean-field approaches, which are based on differential equations that express the rate of change of the fraction of infected individuals: The variables, the equations, and even time are analyzed differently.
The adaptation of MMCA to the study of epidemics in structured metapopulations, with heterogeneous agents, and subject to recurrent mobility patterns, is performed by some of the authors in previous works [20][21][22][23][24]. The main difference with respect to standard MMCA is that variable ρ m i ðtÞ represents the probability that individuals, with residence at node i, are in state m. An alternative interpretation for them is as the fraction of individuals living at the node that are in the given state. However, the equations of the dynamics are better understood with the probabilistic semantics.
The evolution of these probabilities fρ m i ðtÞg determines the dynamical state of a single patch i and is dictated by the underlying compartmental model. Here, this compartmental model is composed of the following epidemiological and clinical compartments: susceptible (S), exposed (E), asymptomatic infectious (A), symptomatic infectious (I), to be admitted in ICU (prehospitalized in ICU, P H ), fatal prognosis (predeceased, P D ), admitted in ICU that will recover (H R ) or decease (H D ), recovered (R), and deceased (D). Compartments P H and P D are introduced to account for the delays observed for individuals before they are admitted in ICU or before they die outside ICU, respectively. Similarly, the compartments H R and H D represent the individuals in ICU but separated according to their final state (recovered or deceased, respectively), which is reached at different rates.
As anticipated above, the population of a patch is further divided into N G age strata. Although we present the model in a general form, its application to COVID-19 makes use of only the three age strata mentioned above (N G ¼ 3): young people (Y), with age up to 25; adults (M), with age between 26 and 65; and elderly people (O), with age larger than 65. This differentiation is very relevant in light of the observation of a large fraction of infected asymptomatic individuals at ages (<25), in contrast with the abundance of severe and critical cases reported for people at older ages (>65).
Let us suppose we have a population of N individuals distributed in N P regions, with n i individuals residing in region (patch) i. We also consider that individuals belong to one of N G different age strata, in such a way that n g i individuals of age strata g live in region i. Thus, where n g is the total population of age strata g. Our system is completely characterized by variables ρ m;g i ðtÞ, which account for the probabilities that individuals of age stratum g assigned to patch i are in state m at time t, where m ∈ fS; E; A; I; P H ; P D ; H R ; H D ; R; Dg and g ∈ fY; M; Og. The temporal evolution of these quantities is given by These MMCA equations correspond to a discrete-time dynamics, in which each time step represents a day. Note that the sum over all ρ m;g i ðtÞ for a given patch i and age group g equals 1 for each time step t.
The rationale of the former compartmental dynamics is the following. Susceptible individuals get infected by contacts with asymptomatic and symptomatic infected agents, with a probability Π g i ðtÞ (described in Sec. II B) becoming exposed. Exposed individuals turn into asymptomatic at a certain rate η g , which, in turn, become infectious at a rate α g . Once infected, three paths emerge, which are reached at an infectious rate μ g . The first one is death without being hospitalized in ICU, with probability θ g , after a latency period governed by rate ζ g . Otherwise, with probability γ g the individuals are hospitalized in ICU, which are reached at rate λ g , while with probability 1 − γ g they recover. Individuals in ICU have a fatality probability ω g , which is reached at a rate ψ g , whereas the recovery is reached at a rate χ g . See Fig. 1 for a sketch of the compartmental model with all the transitions and Tables I-III for the list of the parameters and their values to simulate the spreading of COVID-19 in Spain, which are discussed in Appendix A.

B. Social contacts and mobility among patches
To understand the geographical diffusion of the disease, as a result of human-human interactions in small geographical patches, one has to combine the contagion process with the long-range disease propagation due to human mobility across different spatial scales. For the case of epidemic modeling, the usual metapopulation scenario assumes that the individuals within each patch are well mixed; i.e., pathogens can be transmitted from an infected host to any of the healthy agents placed in the same patch with the same probability. The second aspect of the metapopulation approach concerns the mobility of agents. Each host is allowed to change its current location and occupy another patch, thus fostering the spread of pathogens at the system level. Mobility of agents between different patches is usually represented in terms of a network where nodes are locations while a link between two patches represents the possibility of moving between them. Here, we include a set of modifications to the standard metapopulation frameworks, using the average contact network within each age strata and among them and taking into account also differences in mobility patterns between working-age individuals and the rest. Now we detail how contagions of susceptible individuals of a patch i and age stratum g are captured in the Markovian equations through the probability Π g i ðtÞ. The value of Π g i ðtÞ encodes the probability that a susceptible agent belonging to age stratum g and patch i contracts the disease. Under the model assumptions, this probability is given by where p g denotes the degree of mobility of individuals within age stratum g, R g ij is the mobility matrix (fraction of individuals of group age g that choose destination j while living in region i), and P g i ðtÞ denotes the probability that those agents get infected by the pathogen inside patch i. This way, the first term in the rhs of Eq. (12) denotes the probability of contracting the disease inside the residence patch, whereas the second term contains those contagions taking place in any of the neighboring areas. Furthermore, we assume that the number of contacts increases with the density of each area according to a monotonically increasing function f. Finally, we introduce an age-specific contact matrix C, whose elements C gh define the fraction of contacts that individuals of age stratum g perform with individuals belonging to age stratum h. With the above definitions, P g i reads

Pre-Hospitalized
Pre-Deceased R g FIG. 1. Compartments of the epidemic model. The acronyms correspond to susceptible (S g ), exposed (E g ), asymptomatic infectious (A g ), symptomatic infectious (I g ), prehospitalized in ICU (P g H ), predeceased (P g D ), in ICU before recovery (H g R ), in ICU before death (H g D ), deceased (D g ), and recovered (R g ), where g denotes the age stratum of all compartments. The arrows indicate the transition probabilities.
Parameters β A and β I correspond to the infection probabilities for contacts of a susceptible individual with asymptomatic and symptomatic infected individuals, respectively. The exponents represent the number of contacts made by an agent of age stratum g in patch i with infectious individuals-compartments A and I, respectively-of age stratum h residing at patch j. Accordingly, the double product expresses the probability for an individual belonging to age stratum g not being infected while staying in patch i.
The term z g hk g ifðñ i =s i Þ represents the overall number of contacts (infectious or noninfectious) that an agent from age group g makes inside patch i. These contacts, as mentioned above, increase monotonically with the density of patch i following function fðñ i =s i Þ, where s i is the area of patch i measured in square kilometers andñ i represents the effective population inside patch i. Likewise, the term z g is a normalization factor introduced to ensure that the average degree of population belonging to age group g is hk g i. Therefore (see details in Appendix B), where the effective population at patch i, i.e., the number of people present at patch i when commuting takes place, is given byñ which is distributed in age strata of sizẽ For convenience, we define the effective mobility matrices M g ji : that take into account both the degree of mobility of the population and the transition probabilities to the neighboring patches.
From now on, we use the tilde notation to indicate variables measured while the commuting is active, to distinguish them from variables when all the population is in its residence patch.
The function fðxÞ governing the influence of population density is selected, following Ref. [41], as The last term of the exponents in Eq. (13) contains the probability that these contacts are contagious, which is proportional to n m;h j→i , the expected number of individuals of age stratum h in the given infectious state m (either A or I) which transit from region j to region i: The discrete-time nature of this model allows for an easy computation of the time evolution of all the relevant variables, providing information at the regional level.

C. Modeling confinement measures
Finally, this formalism allows for quantifying the performance of different containment measures aimed at reducing the impact of COVID-19. In this section, we put our focus on the effect of extended lockdown policies promoted across the entire system, but other interventions such as modifying the morphology of the mobility network or isolating specific age groups can also be assessed with our framework.
To incorporate confinement policies in our formalism, we consider that a fraction of a given age stratum κ g 0 ðtÞ is under lockdown at time t. In this sense, let us remark that the time series κ g 0 ðtÞ allows us to tune the strength of those confinement measures proposed to contain COVID-19. In this sense, κ g 0 ¼ 1 corresponds to a total lockdown scenario in which all activity is paralyzed, whereas κ g 0 ¼ 0 reflects a less strict policy not affecting the population. Moreover, we introduce the effects of social distancing measures taken by the population as a reduction δ of the number of contacts made by the nonconfined population. We assume the social distancing δ to be constant in time, while the confinement level κ g 0 ðtÞ is time dependent. Let us denote as t c the initial time from which lockdown interventions are deployed. The incorporation of lockdown policies in the formalism demands some modifications of the equations. First, population confinement implies a reduction of the mobility within the population. Therefore, the degree of mobility p g becomes time dependent and now reads where ΘðxÞ is the Heaviside function, that is 1 if x ≥ 0 and 0 otherwise. As a by-product,ñ g i and z g also become dependent on time; see Eqs. (14)- (16). Second, the confined population reduces their interactions to the members of their households. Equally, independently of whether they are confined or not, the contacts of individuals belonging to age strata O are reduced to their household contacts, since they do not form part of the working population. Therefore, in the lockdown scenario, the average degree of elderly being the contacts taking place inside their households [42]. In contrast, given the nature of the interventions deployed by the authorities, the contacts outside the household of the nonconfined population belonging to age strata Y and M are restricted to the workplace and shaped by social distancing. To incorporate both aspects, in the lockdown scenario, we define the average degree of individuals belonging to age group g ∈ fY; Mg at time t, k g c ðtÞ, as where k g h and k g hþw , which encode the contacts made by confined (at home) and nonconfined (at home and at work) population, respectively, are obtained from Ref. [42]. This way, the average number of daily contacts hk g i of agents belonging to each group g ∈ fY; M; Og should be updated to account for this dependence: The time dependence of the number of interactions affects the probability that an individual from age group g and patch i becomes infected, P g i ðtÞ, which is now given by Finally, a perfect confinement would entail the complete shielding of fully susceptible households against the disease. Therefore, the confinement of the population modifies the pool of susceptible individuals reachable by their infectious counterparts [43]. Unfortunately, in practical terms, complete isolation of households is impossible, since their members are required to go out for essential activities such as buying groceries, drugs, etc. Therefore, we introduce a household permeability ϕ accounting for the social mixing among members from different households in these situations. We fix the household permeability ϕ as constant in time. While the interactions among members of different confined households clearly shape household isolation, we assume that they do not substantially modify the average number of contacts of the population. In this scenario, a relevant indicator to quantify the efficiency of the policy is the probability of one individual living in a household, inside a given municipality i, without any potentially infectious individual. Assuming that confinement is implemented at time t c , this quantity, denoted in the following as CH i ðt c Þ, is given by This confinement strategy is introduced in the dynamical Eqs. (2)-(11) by modifying Eqs. (2) and (3) for the time after t c : where we add a new compartment CH to hold the individuals under household isolation after applying confinement and returning them to the susceptible pool as confinement is relaxed. Confinement also affects the average number of contacts; thus, we must also update Eq. (13). In summary, we introduce a confinement strategy driven by social distancing and the isolation of a certain fraction of the population, which, in turn, reduces the average number of contacts, the mobility, and, as a result, the probability of becoming infected, thus reducing the overall prevalence of the disease. Therefore, confinement policies are captured in the formalism by the time series describing confinement κ g 0 ðtÞ, the social distancing factor δ, and household permeability ϕ. See Table IV for the list of the confinement parameters and their values to simulate the spreading of COVID-19 in Spain, which are discussed in Appendix A.

D. Validation of the model
The metapopulation model is calibrated with real data of fatalities caused by COVID-19 in Spain until April 8, using approximate Bayesian computation [45]. Because of the absence of more specific data, we take the reduction in the work mobility as a proxy for the temporal evolution of the confined population in Spain. Furthermore, since displacements not related to work activities were banned as the lockdown was enforced, the evolution of the work mobility reflects well the variations in the population under confinement. The confinement measures are introduced into the model from March 14 onward (before this date, no intervention was made). We obtain the time series κ g 0 ðtÞ from official reports [44] by estimating the work mobility reduction with respect to a baseline day, November 19, 2019. We assume that these time series are equal for all age groups and refer in the following to κ Y;A;O 0 ðtÞ as κ 0 ðtÞ. Further details on the data used and model calibration are shown in Appendixes A and C.
Once calibrated, we validate our model using two datasets: the evolution of daily fatalities in Spain [ Fig. 2(a)] and the evolution of the daily number of newly detected symptomatic individuals [ Fig. 2(b)]. The results in these figures show a qualitative and quantitative agreement in all the phases corresponding to the different containment interventions implemented by the government of Spain. Note that the time series of new symptomatic individuals is rescaled to account for case underreporting. For the sake of visualization, Figs. 2(a) and 2(b) incorporate the lockdown stages depicted as horizontal bars that correspond to the different interventions enforced in Spain. In LM, Spain was under lockdown but laboral mobility was allowed. In a stricter scenario (EM), only essential mobility (that corresponding to the commuting of essential workers) was allowed. In ECS, in addition to permitting laboral mobility, children and elderly people were allowed to go out during certain times of the day and adults could practice sports outdoors. In R1, the first phase of the reopening after the lockdown, some commercial activity was allowed. The shaded area covers 95% prediction intervals for the model trajectories.
Finally, taking advantage of the spatial resolution of the metapopulation framework, we show in Figs. 2(c) and 2(d) the accumulated number of fatalities and reported cases at the level of Comunidades Autónomas with those predicted by the model. In both cases, the plots show a remarkable agreement (correlations of r ¼ 0.98 and r ¼ 0.90, respectively), capturing the large heterogeneity of the impact of COVID-19 across the country.
In Supplemental Material [46], we show the use of the model to evaluate different containment scenarios, exploiting factors such as the age compartmentalization and the spatial nature of the framework. However, although being illustrative of its use to inform policy makers, we move on, analyzing the model to delve into the consequences of nonpharmacological interventions.

III. EFFECTIVE REPRODUCTION NUMBER
The stage of an ongoing outbreak is typically quantified by studying the effective reproduction number RðtÞ, which captures the number of secondary cases that an individual infected at time t will produce through the duration of its infectious period. To compute the evolution of RðtÞ using our model and understand its dependencies, we present an incremental rationale from the most rough approximation to the most accurate formula.
Let us consider, first, a scenario (known as mean-field in physics) in which an infected subject i makes hki contacts each time step. Assuming an infection probability β per contact, the expected number of individuals infected by i at each time step is βhkiρ S , where ρ S is the fraction of susceptible individuals on the population. We can estimate how many individuals are infected by subject i over an infectious period of duration τ as This very simple mean-field approach is already very informative, as it reveals the main dependencies of R on the variables of the epidemic propagation, indicating the actionable variables we can resort on to reduce R. Acting on the infection probability β can be attained with the use of face masks and physical distance or through pharmacological interventions via prophylactic treatments. Another actionable variable is hki, which can be modified by human behavioral changes such as confinement. Also, the infectious period τ can be modified by implementing a fast testing policy to confirm and isolate suspected cases. Finally, the value of ρ S , that can be directly estimated from the model, is highly dependent on the extent of lockdown policies, on the level of immunity reached in the population, or on the degree of vaccinated individuals when this pharmacological intervention is available.
The previous approach, however, neglects many important social, demographic, and mobility details key to shaping the way in which secondary infections spread through a geographically distributed heterogeneous population. Likewise, it obviates the temporal evolution of the group of individuals susceptible to contracting the disease. This naive approach can be largely improved by leveraging the information of the proposed age-structured mobilitydriven metapopulation epidemic model.
A. Determination of the effective reproduction number from the dynamics Let us define R g i , the effective reproduction number of geographical patch i and age stratum g, as the number of secondary cases produced by an infected individual belonging to patch i and age stratum g [47]. Mathematically, R g i is expressed as In the above expression, the following quantities interplay: the duration of the infectious period of each age stratum g, τ g ; the average infectivity during the infectious period, hβ g i, per age stratum g; the average number of daily contacts an individual belonging to patch i and age stratum g makes with individuals in patch j belonging to age stratum h, k gh ij ; and the fraction of susceptible individuals present in patch j belonging to age stratum h,ρ S;h j . In summary, the structure of R g i resembles the mean-field expression presented in Eq. (28), but it takes explicitly into account the mixing of individuals according to their age strata and mobility patterns (k gh ij andρ S;h j ). Nevertheless, this expression is still a static snapshot at time t.
We can extend this approach to include both the temporal dependence and the heterogeneity of contacts induced by the daily commuting patterns, making use of all the probabilities provided by our model equations. Moreover, here, we aim at studying the temporal evolution of the effective reproduction number beyond the early stage of the disease. In this sense, we define the reproduction number of patch i and age stratum g as the number of infections observed if we seed an infected individual in the aforementioned patch and age stratum [47]. This definition involves considering that this infected individual, with residence in patch i, may commute to patch j, where it will be able to contact and infect susceptible individuals coming from any other patch k.
First, we calculate the expected number of susceptible individuals of age stratum h which move from region k to region j as The fraction of susceptible individuals of each age stratum g present at time t in patch j is expressed bỹ Next, we compute the number of susceptible contacts made by an individual of age stratum g and patch i, which can be expressed as Using this expression and assuming that we have an individual from patch i and age stratum g that has become infectious, we can compute the effective reproduction number for each patch and age stratum as In Eq. (33), we assume that the fraction of susceptible individuals as well as the average number of contacts is constant during the infectious period. However, the fraction of susceptible individuals obviously changes during that time. More importantly, as containment measures come into play, the number of contacts varies as well. To account for the temporal variability of these quantities, we calculate the contributions to R g i ðtÞ of the infections produced since the individual became exposed at time t. The contributions can be determined through the probabilities to be exposed, ζ E;g , asymptomatic, ζ A;g , or symptomatic, ζ I;g , t time steps after an infection. The evolution of these probabilities is given by ζ A;g ðt þ 1Þ ¼ η g ζ E;g ðtÞ þ ð1 − α g Þζ A;g ðtÞ; ð35Þ Since the seed is initially placed as exposed, the initial conditions are given by ζ E;g ð0Þ ¼ 1 and ζ A;g ð0Þ ¼ ζ I;g ð0Þ ¼ 0. Accordingly, solving explicitly the recursive Eqs. (34)-(36), we find Incorporating these probabilities into Eq. (33), we obtain the following expression for the effective reproduction number of individuals of age stratum g in patch i: The variables R g i ðtÞ tell us the effective reproduction number of one newly infected individual from age stratum g with residence in i.
To compute the temporal evolution of the global effective number RðtÞ or the effective number R g ðtÞ associated to a given age stratum g at time t, we must make a weighted average taking into account the distribution of newly infected individuals across patches and age strata. Therefore, these indicators read respectively. To show the validity of the former analytical expression for the effective reproduction number RðtÞ, we perform in Appendix E a comparison of its time evolution with the results obtained by using EpiEstim [48,49] applied to both the incidence in our model and the daily incidence data observed in Spain.

B. Monitoring the effect of contention measures in Spain
To check the effect of the policies aimed at reducing the impact of COVID-19 in Spain, we compute the expression of RðtÞ according to Eq. (42) using the reported variations of κ 0 in this country from March 14. In Fig. 3, we represent RðtÞ along with the time series of the degree of confinement κ 0 ðtÞ [obtained from the Instituto Nacional de Estadistica (INE) of Spain [44]]. Note that the confinement during the progressive exit of the lockdown state lies between 0.7 and 0.6 approximately, stabilizing RðtÞ < 1. However, there is a clear tendency of RðtÞ to grow as the confinement level was relaxed in May (see also Fig. 9 in Appendix E). This growing tendency may be one of the reasons leading to the second epidemic wave Spain is facing during the months of July and August 2020.
Furthermore, in the plot, we obtain from Eq. (43) the effective reproduction number by age strata, that is, the R g ðtÞ of the young, adult, and elderly population (being g ∈ fY; M; Og). There, we notice that relevant heterogeneities exist among the reproduction number of the different age strata. In particular, at the end of the curve, we observe that a supercritical scenario for the adult population [R M ðtÞ > 1] is compatible with a global subcritical regime. This valuable information opens up the possibility of designing suppression strategies that allow a certain age stratum to have RðtÞ > 1, provided the average value of the whole population is below the threshold RðtÞ < 1. The same conclusion applies in terms of geographical areas instead of ages. This result offers a perspective on epidemic control that could be very useful to design strategic interventions targeting individual age stratum or particular geographical areas in heterogeneously distributed populations.
The analytical expression for R g i in relation to confinement κ 0 , social distancing δ, and household permeability ϕ can be exploited to ascertain the implications of nonpharmacological interventions. Let us assume that the aim of the intervention is that of keeping the epidemic under control, in the absence of any other tool besides confinement and social distance. In this case, the strong goal must be that of keeping of R g i ðtÞ ≤ 1 ∀ i; g, or the weaker goal RðtÞ ≤ 1. We can use the analytical dependence of RðtÞ to estimate the minimal, constant intervention in κ 0 and δ such that the goal is achieved.
In Fig. 4, we present, for the case of Spain, the lower bounds of confinement needed, assuming fixed social distancing, to achieve RðtÞ < 1. This analysis makes use of RðtÞ to obtain the minimum possible confinement that, while damaging the socioeconomic structure as little as possible, allows the epidemic to be controlled and gradually mitigated. Note that the previous goal can be relaxed, or changed, according to any other demands of the authorities, as, for example, maintaining the number of ICUs in the health system below a certain threshold, minimizing the number of fatalities, or achieving suppression in the minimum time. Figure 4 highlights the relevance of timely and strict lockdown policies, to avoid the breakdown of the health system, in the absence of other nonpharmacological interventions. We observe that small variations in the fraction of confined population κ 0 entail

IV. QUANTIFYING THE EFFECT OF CONTAINMENT MEASURES OVER RðtÞ
The expression for the global effective number RðtÞ [Eq. (42)], while useful for computing the impact that contention measures have on the spreading dynamics, does not allow a direct interpretation of how these policies, in particular, confinement and social distancing, influence the epidemic detriment. Here, starting from the complete expression derived for RðtÞ, we consider some simplifying assumptions that allow us to unravel the contribution of these measures and, based on these approximations, derive an expression for the critical confinement needed to bend the epidemic curve.

A. Phase diagram of RðtÞ as a function of confinement and social distancing
To reveal the explicit dependence of RðtÞ on confinement and social distancing interventions, we now simplify Eq. (40) by, first, neglecting the heterogeneities among different subpopulations and, second, considering a static view in which the pool of susceptible individuals from each age stratum hρ S;g i remains constant during the intervention (constant κ 0 ). The latter assumption enables us to recover Eq. (28) which, after introducing the aforementioned confinement parameters, reads where hCHi denotes the average fraction of households without any infected member. Equation (44) shows an explicit quadratic dependence of R g ðt c Þ on κ 0 that helps to understand the effectiveness of lockdown interventions. In Fig. 5(a), we plot a phase diagram illustrating how the global effective reproduction number is shaped by the fraction of confined population κ 0 and the household permeability ϕ. We highlight the phase transition occurring at Rðt c Þ ¼ 1 (white solid line) separating two different regimes, Rðt c Þ > 1 (flattening) and Rðt c Þ < 1 (bending). Note also that, as the permeability of confinement ϕ increases, the transition point is reached for larger values of confinement κ 0 , this transition being lost for very high values of ϕ ≳ 0.83.
To shed light onto the qualitative differences between the regimes separated by the transition described above, in Fig. 5(b), we plot the number of daily new cases for different levels of confinement, κ 0 , while keeping the social distancing as calibrated, δ ¼ 0.207 (95% credible interval CrI: 0.053-0.359). The confinement is applied at time t c corresponding to March 14. It is observed that, as confinement is applied to a small fraction of the population (κ 0 > 0), the curve for the number of new cases per day starts to broaden and the maximum shifts forward in time. Such a mitigation strategy is known as flattening the curve, in which containment delays and lowers the incidence peak. The consequence of the flattening scenario is that the impact over health systems is ameliorated at the expense of a larger epidemic period. However, as confinement is increased (higher κ 0 ), there is a sharp change in the behavior of the epidemic curve for κ 0 between 0.4 and 0.6, signaling that the transition point Rðt c Þ ¼ 1 lies between these two values. In particular, for values of κ 0 large enough, the epidemic curve reaches its maximum soon after containment is put in place, achieving a completely new scenario, referred to as bending the curve, in which incidence decreases steadily and the epidemic wave is shortened.
The bending scenarios shown in Fig. 5(b) correspond qualitatively to the trend observed in Fig. 2(b). In fact, from the sequence of values κ 0 ðtÞ in the inset in Fig. 3, we observe that the degree of confinement achieved clearly places the epidemic in the bending regime. Also, from the evolution κ 0 ðtÞ, we notice that after April 14 (the day when the gradual lifting of lockdown starts) a clear relaxation of confinement is observed. FIG. 4. Relation between the confinement level and time needed for epidemic extinction. Top: Time to epidemic extinction from May 15 as the reproduction number is fixated from this day onward. The reproduction number is kept constant by adjusting the confinement level through time. We define epidemic extinction as a daily incidence of fewer than ten cases. Bottom: Average confinement level necessary to keep the reproduction number constant. At each time step, we correct the confinement level by steps of 0.1% until the effective reproduction number RðtÞ differs less than 1% from the envisaged value.

B. Critical confinement for the single patch and age stratum approximation
As previously shown, the relaxation of confinement progressively weakens the bending regime, raising the question of how far containment measures can be relaxed without entering the flattening region.
We address the analysis of this effect, by deriving the expression of the critical confinement κ c 0 at which the transition from flattening to bending regimes takes place. The expression for the effective reproduction number in Eq. (40) includes a variety of terms and factors. The complexity of the expression hinders an intuitive understanding of the roots behind the transition from flattening to bending the epidemic curve induced by an efficient confinement. In order to have an analytical estimation of κ c 0 , let us consider a single well-mixed population with only one age stratum and constant confinement in time. Furthermore, to express R, we assume that circumstances do not change during the infectious period of an individual. In other words, the number of contacts stays constant, and we neglect the depletion of susceptible individuals. Accordingly, after confinement, the reproduction number is expressed as If we impose the condition for mitigation and suppression of the epidemics R ≤ 1, the critical confinement value κ c 0 must fulfill the equation Defining the following coefficients: we can write Eq. (46) as whose solution is given by Note that only the physically meaningful solution κ c 0 ∈ ½0; 1 must be retained.
To further analyze the relevance of the mechanisms characterizing the confinement of individuals in our model, we illustrate in Fig. 6 the effects of social distance encoded in δ and the household permeability ϕ on the critical confinement value κ c 0 . We observe a nontrivial functional dependence revealing that, by increasing the social distance of individuals or promoting better household isolation, the fraction of confined population needed to keep an outbreak under control is notably reduced.

V. CONCLUSIONS
As the COVID-19 pandemic affects more and more countries and threatens to overload the critical capacity of health systems, the absence of a SARS-CoV-2 vaccine requires forceful nonpharmacological containment interventions by governments. Among these interventions, the confinement of populations and social distancing have been implemented by many countries in an attempt to reduce the impact on health systems and save the time necessary to try more efficient treatments against this emerging pathogen.
To mount the strategy against SARS-CoV-2 and anticipate its trajectory, it is necessary, in addition to correctly modeling its epidemiological characteristics, to take into account as closely as possible the influence of the variability of social contacts that the pathogen uses to spread. Here, we have presented a metapopulation model to track the spatiotemporal evolution of COVID-19. This framework incorporates the essential factors of the SARS-CoV-2 transmission and clinical outcome of COVID-19 patients by considering three group of ages: young individuals (being mainly asymptomatic carriers of the pathogen), adult people (characterized by a working activity that involve recurrent mobility patterns), and the elderly (displaying the most severe and critical COVID-19 cases). Although the age partition can be further split into more groups, the former partition appears to be the minimal one to capture the main characteristics of the current epidemics. Apart from the recurrent mobility between patches (in the current work municipalities), contacts at the local level are modeled using a mean-field approach that incorporates a density-dependent function for the contact patterns between age groups. This assumption allows the straightforward consideration of household confinement and also enables one to explore contention strategies aimed at reducing the sociability within and between age-specific groups. The model constitutes an stylized framework to explore early nonpharmacological interventions when massive testing and contact tracing are not available, as was the case of the first epidemic wave in Spain and many other countries around the globe.
Analyzing in detail this model, we have shown that it is possible to construct the expression of the effective reproduction number RðtÞ capturing both the epidemiological characteristics of COVID-19 and those social patterns that facilitate its expansion. This expression enables an accurate evaluation of the spreading potential of SARS-CoV-2 on a given population and, more importantly, the assessment of the effects of implementing or lifting nonpharmacological interventions in advance. It can be used to assess the extent of nonpharmacological interventions on a given territory seeking for the less harming intervention in terms of confinement level and time needed to epidemic suppression.
Focusing on the outbreak in Spain, we have analyzed the effects of different degrees of confinement, κ 0 , using the expression for RðtÞ evaluated at the time t c when containment is applied. This way, we can accurately determine the effects on the slowdown and suppression of the epidemic caused by different degrees of confinement. Calculating the value Rðt c Þ, we observe a transition as the set of confined inhabitants decreases and the consequent social mixing increases. This transition defines a critical confinement κ c 0 that separates a supercritical scenario, in which Rðt c Þ > 1 flattens the curve, to another subcritical one, Rðt c Þ < 1, that bends the epidemic curve due to the profound change in the social contacts structure, making it almost impossible for the virus to spread.
The exact critical confinement separating the supercritical and subcritical regimes is highly dependent on the underlying social structure and the intrinsic mobility patterns of each population. Furthermore, it depends on the time it is applied, since it depends on the available pool of susceptible agents that can be culled out from the system. The generality of the expression for RðtÞ provided here makes it possible to apply it to any population and to use with any metapopulation epidemic model, paving the way for implementing timely and well-founded epidemiological and socially based nonpharmacological responses.

ACKNOWLEDGMENTS
We thank Gourab Ghoshal, Silvio Ferreira, Carlos Bouthelier, Sergio Faci, and Miguel Hernan for useful FIG. 6. Relationship between critical confinement, permeability, and social distance. Critical confinement κ c 0 computed from Eq. (51) needed to ensure that the system reaches the bending regime, i.e., R ¼ 1, as a function of the household permeability ϕ and the social distance δ. discussions. We also thank Matteo Rini for his suggestions. We now describe the choice of our parameters to study the epidemic outbreak in Spain, whose values are enumerated in Tables I-IV. Throughout the manuscript, the patches of the metapopulation correspond to municipalities which constitute the lowest administrative divisions in Spain. Regarding the population structure, we obtain the population distribution, population pyramid, and average household size at the municipality level from Instituto Nacional de Estadística [50], whereas the age-specific contact matrices are extracted from an international social study [42]. At the beginning of the pandemic, the entire population is assumed to follow their usual mobility patterns, so p g ¼ 1 for every To construct the daily mobility network, we first assume that the elderly and young population do not leave their residential municipality, so their corresponding mobility matrices read R Y;O ij ¼ δ ij . Finally, the mobility matrix governing adult population movements is estimated from extensive surveys conducted by Instituto Nacional de Estadística in 2011 [50], which capture the commuting patterns of the population living inside each municipality.
Regarding epidemiological parameters, the incubation period is reported to be η −1 þ α −1 ¼ 5.2 days [2] in average which, in our formalism, must be distributed into the exposed and asymptomatic compartments. In principle, if one does not expect asymptomatic transmissions, most of this time should be spent inside the exposed compartment, the asymptomatic infectious compartment thus being totally irrelevant for disease spreading. However, along the line of some works [51][52][53], we find that the unfolding of COVID-19 cannot be explained without accounting for infections from individuals not developing any symptoms previously. In particular, after calibrating the model, we estimate α −1 ¼ 2.756 (95% CrI: 2.135-3.377) days as the asymptomatic infectious period; see Appendix C for the details of the calibration methodology. In turn, the infectious period while being symptomatic is calibrated to μ −1 ¼ 3.915 days (95% CrI: 3.470-4.360), except for the young stratum, for which we reduce it to 1 day, assigning the remaining days as asymptomatic; this reduction is due to the reported mild symptoms in young individuals, which may become unnoticed [28]. Furthermore, we assume asymptomatic individuals are half as infectious as symptomatic ones, β A ¼ 0.5β I , as in Refs. [54,55].
Regarding the clinical parameters, we fix the fatality rate ω ¼ 30% for ICU patients according to official reports [37]. We incorporate from previously published studies the typical time from ICU admission to death as ψ −1 ¼ 7 days [39] and the stay in ICU for those overcoming the disease as χ −1 ¼ 21 days [40]. In turn, the parameters γ g and θ g , controlling the infected population requiring ICU beds and dying without occupying any of them, are estimated by correcting the real observed values [37] with the estimated underreporting from Ref. [38] (see next paragraph). Likewise, the parameters ζ g and λ g are computed by assuming that times between onset of symptoms and ICU occupation or fatal outcome are 8 and 11 days [37] on average, respectively.
To account for underreporting, we compare the seroprevalence estimates of the extensive seroepidemiological extensive survey [38] made by the Spanish authorities with that derived from the official reported data. In particular, 0.53% of the Spanish population was diagnosed as COVID-19 cases by May 4,   Fig. 2(b).
Finally, regarding the parameters controlling the efficiency of the measures taken by the authorities, we calibrate those depending on social awareness and individual behavior, such as social distancing δ and household permeability ϕ, due to the lack of data. In contrast, the fraction of the population under confinement κ 0 ðtÞ is  straightforwardly incorporated from data on the mobility reduction daily updated by INE which are shown in Fig. 3. Figure 7 contains the posterior distribution for the calibrated epidemiological parameters along with the covariance matrix showing the relations between those parameters involved in the calibration.

APPENDIX B: NORMALIZATION FACTOR FOR EACH AGE GROUP
Here we want to explain the rationale behind the normalization factor z g , first introduced in Eq. (14), which shapes the number of contacts made by individuals belonging to age group g. In the main text, we state that z g hk g ifðñ i =s i Þ represents the total number of contacts that an agent from age group g makes inside patch i. To construct this term, let us start by assuming that the number of contacts within a patch increases monotonically with the density according to a given function f. Therefore, in general terms, we can write the number of interactions of one individual belonging to age-group g inside patch i, k g i , as being x g an (in principle) arbitrary constant which may differ for the different age-groups g. In fact, x g allows for rescaling the contacts such that the average number of interactions of individuals belonging to strata g across the entire population is hk g i. This constraint imposes that This way, x g can be expressed as where z g , which is defined as the normalization factor throughout the manuscript, is given by APPENDIX C: MODEL CALIBRATION Several procedures are proposed in the literature to calibrate the mathematical models aimed at reproducing the evolution of COVID-19 across a given country. Calibration using indicators such as ICU occupation, daily incidence, or the number of fatalities rely on the quality of the data obtained. Unfortunately, in Spain, the data acquisition process is very messy, given the different protocols that every Comunidad Autónoma follows. For example, the number of officially reported cases does not constitute a reliable indicator in Spain because of delays in reporting to the National Health system as well as temporal variations in the number of tests. This unreliability makes reported cases a very noisy variable to fit. In front of this important problem, we finally decide to calibrate the model with the number of daily fatalities, data that are curated by the Spanish government.
The calibration procedure consists of the following steps: (1) Most of the parameters governing the clinical outcome of the disease are incorporated from either previous studies or official reports and do not take any part in the calibration of the model. Therefore, the calibration is mostly focused on the epidemiological parameters governing the unfolding of the disease, whose values remain still uncertain. In particular, the parameters subjected to calibration are the number of initial infectious seeds A 0 , the infectivity β I , the asymptomatic infectious period α −1 , and the infectious symptomatic period μ −1 . Finally, regarding the confinement, household permeability ϕ and social distancing δ need also to be calibrated due to the lack of data, whereas the values for κ g 0 ðtÞ are directly incorporated in the model from mobility reduction estimations published by INE.
(2) The initial date from which our model runs is February 9. To spatially distribute the initial seeds, the reported cases by March 3 are backtracked until February 9 and let the model run. Once the initial seeds are set, we calibrate the epidemiological part of the model by performing an approximate Bayesian computation (ABC) [45] including real data about the new daily fatalities at a national scale until April 8. We use the logarithmic least squares error as an objective function for the ABC. The daily fatalities from April 8 onward [ Fig. 2(a)] as well as the daily new cases for the whole time span [ Fig. 2(b)] serve as validation. As stated in Sec. A, the time series for the daily new cases provided by the health authorities is rescaled according to the findings of the nationwide serological survey to correct for the underreporting.
(3) Our model is able to predict the spatiotemporal propagation of COVID-19 triggered by communitary contagions and human mobility. To leverage the full potential of the formalism, we also check the accuracy of the formalism in predicting the fatalities at a regional level. We observe that the introduced seeds corresponding to the reported cases by March 3 are enough to predict the fatalities observed in some areas but fail in capturing the evolution of COVID-19 in other regions, possibly due to initial underreporting of cases there. To solve this discord-ance, we introduce the minimal set of infectious individuals in those largest populated cities of each Comunidad Autónoma to ensure a fair correspondence between the data and model, which is optimized by using the Nelder-Mead method.

APPENDIX D: SENSITIVITY ANALYSIS
Here, we perform a sensitivity analysis to check the robustness of our results concerning the time evolution of the effective reproduction number. As the value for RðtÞ is independent from the clinical parameters, we restrict the sensitivity analysis to those epidemiological parameters assumed to be constant when calibrating the model. In particular, we focus on studying how the values of RðtÞ in both supercritical and subcritical regimes are affected by changes in the incubation period η −1 þ α −1 , the infectivity of individuals without symptoms β A , and the infectious symptomatic period of young population ðμ Y Þ −1 . Figure 8 compares the original distribution for the effective reproduction number before and after confinement is enforced with those obtained by introducing a 10% relative modification in each of the aforementioned parameters. The mean and the shape of the distribution remain almost unaltered in both supercritical and subcritical regimes, thus proving the robustness of the results reported throughout the manuscript.

APPENDIX E: VALIDATION OF EFFECTIVE REPRODUCTION NUMBER
The validation of our analytical expression Eq. (42) for the effective reproduction number is performed by comparing it with the case reproduction number obtained using EpiEstim [48,49]. We apply EpiEstim to both the incidence in our model (EpiEstim model) as well as the real incidence data (EpiEstim data); see Fig. 9. The incidence in our model corresponds to the one of the compartment exposed (E). The real incidence data, collected until May 15, are shifted by 6 days due to the average lag between symptom onset and report [37]. The data are then subsequently shifted another 5 days accounting for the time in our model between becoming exposed and developing symptoms. Estimations of the reproduction number are then performed on both datasets by averaging over 50 randomly chosen parameter sets of our posteriors. For each parameter set, the generation time distribution is fixed correspondingly. Additionally, we perform a rolling average of 10 days on all curves. Please note that, for the EpiEstim data curve, the confidence interval is narrower than the line's thickness. The curve is plotted until the last day EpiEstim is able to perform an estimation.
We observe a good agreement between the analytical expression and the estimation by EpiEstim on the incidence of our model. Qualitatively, also the estimation using real incidence data matches the analytical expression. In particular, the difference on which day the two curves drop below one is only around 2 days. The fact that the FIG. 9. Validation of effective reproduction number. Reproduction number from the analytical expression compared to an estimation of the case reproduction number obtained using EpiEstim [48,49] on the incidence in our model (EpiEstim model) as well as the real incidence data (EpiEstim data).
FIG. 8. Sensitivity analysis. Distribution of effective reproduction number R values while exploring the parameter space by tuning parameters β A =β I , η −1 þ α −1 , and ðμ Y Þ −1 as described in Appendix D on (a) a supercritical regime (2020-03-01) and (b) a subcritical regime (2020-04-01). As a reference, we provide a baseline distribution obtained fixing the aforementioned parameters to the values reported in Table III. estimation on real incidence data results in a higher reproduction number at the beginning of the epidemics is probably due to the quickly increasing test capacity as the first cases were detected in Spain. Instead of observing the actual growth of the epidemic, tests performed in the beginning discovered to what extent the epidemic had already spread in the population. Unfortunately, there are no data available on how the number of performed tests evolved until April 23.

APPENDIX F: CODE AVAILABILITY
The model described above is implemented in the open source Julia programming language [56]. The source code is freely available online in Ref. [57], distributed under the open source licence GNU AGPL-3. The documentation can be read in Ref. [58].
For the ABC calibration described in Appendix C, we use the AproxBayes Julia package [59], that implements the approximate Bayesian computation method defined in Ref. [45].