Outbreak diversity in epidemic waves propagating through distinct geographical scales

A central feature of an emerging infectious disease in a pandemic scenario is the spread through geographical scales and the impacts on different locations according to the adopted mitigation protocols. We investigated a stochastic epidemic model with the metapopulation approach in which patches represent municipalities. Contagion follows a stochastic compartmental model for municipalities; the latter, in turn, interact with each other through recurrent mobility. As a case of study, we consider the epidemic of COVID-19 in Brazil performing data-driven simulations. Properties of the simulated epidemic curves have very broad distributions across different geographical locations and scales, from states, passing through intermediate and immediate regions down to municipality levels. Correlations between delay of the epidemic outbreak and distance from the respective capital cities were predicted to be strong in several states and weak in others, signaling influences of multiple epidemic foci propagating towards the inland cities. Responses of different regions to a same mitigation protocol can vary enormously implying that the policies of combating the epidemics must be engineered according to the region' specificity but integrated with the overall situation. Real series of reported cases confirm the qualitative scenarios predicted in simulations. Even though we restricted our study to Brazil, the prospects and model can be extended to other geographical organizations with heterogeneous demographic distributions.

A central feature of an emerging infectious disease in a pandemic scenario is the spread through geographical scales and the impacts on different locations according to the adopted mitigation protocols. We investigated a stochastic epidemic model with the metapopulation approach in which patches represent municipalities. Contagion follows a stochastic compartmental model for municipalities; the latter, in turn, interact with each other through recurrent mobility. As a case of study, we consider the epidemic of COVID-19 in Brazil performing data-driven simulations. Properties of the simulated epidemic curves have very broad distributions across different geographical locations and scales, from states, passing through intermediate and immediate regions down to municipality levels. Correlations between delay of the epidemic outbreak and distance from the respective capital cities were predicted to be strong in several states and weak in others, signaling influences of multiple epidemic foci propagating towards the inland cities. Responses of different regions to a same mitigation protocol can vary enormously implying that the policies of combating the epidemics must be engineered according to the region' specificity but integrated with the overall situation. Real series of reported cases confirm the qualitative scenarios predicted in simulations. Even though we restricted our study to Brazil, the prospects and model can be extended to other geographical organizations with heterogeneous demographic distributions.

I. INTRODUCTION
Theoretical and computational toolboxes used and developed by physicists over decades and, in particular, those of equilibrium and nonequilibrium statistical physics have found an inexhaustible source of applications in complexity [1][2][3][4] attaining an apex with the raising of network science at late 1990s [5]. Epidemic models, which are intimately related with nonequilibrium phase transitions [6], running on the top of networks soon became one of the most imminent fields of interdisciplinary physics [7][8][9] with many applications on real epidemic scenarios lead by physicists [10][11][12]. The pandemic of coronavirus disease (COVID -19) has been marked by a wide discussion heavily grounded on data-driven epidemic modeling, again with outstanding contributions lead by physicists [13][14][15][16].
Since COVID-19 outbreak was reported in Wuhan, Hubei province, China, at the end of December 2019, the scientific community has turned its attention to understanding and combating this novel threat [13,14,[17][18][19]. A characteristic of this infectious disease, caused by the pathogen coronavirus SARS-CoV-2, is the high transmission rate of individuals without significant symptoms [19,20] who traveled unrestrictedly and spread COVID-19 through all continents except Antarctica. Efforts to track the transmission from Wuhan to the rest of mainland China and world relied on mathematical modeling [13,14,18,21,22], which were soon extended to other countries [23][24][25], using the metapopulation approach [26][27][28]. In this modeling, the population is grouped in patches representing geographic regions and the epidemic contagion obeys standard compartmental models [29] within the patches, while mobility among them promotes the epidemic to spread through the whole population. Metapopulation approaches, which were ini-tially conceived in ecology contexts [30], have been used for many epidemic processes [26][27][28][31][32][33][34] and allow a geographical resolution absent in standard compartmental models [29]. One remarkable feature of the metapopulation approach is its capability to integrate the epidemic modeling in different regions into a single framework, in which correlation between epidemic waves in different regions can be investigated.
We are interested on the geographical variability and impacts of an emerging and highly contagious infections disease such as the COVID-19 in continental scales and we chose the recent pandemics in Brazil as a case of study. Brazil is a country of continental territory with an area of 8.5 × 10 6 km 2 demographically, economically, and developmentally very heterogeneous. It is a federation divided into 26 states and one Federal District, its capital city. States are divided into municipalities; each state has its own capital city. The total number of municipalities is 5570 and, according to estimates of 2019, the country population is approximately 210 million [35]. Striking characteristics of the Brazilian demography are the broad distributions of the rural and urban population fractions and urban population densities of municipalities, which are important traits in epidemic modeling [36,37]. Therefore, investigation of a pandemic and its consequences have necessarily to integrate the particularities of each region, which can be reckoned with the metapopulation approach.
In this work, we investigate a metapopulation model and applied it to the COVID-19 epidemic spread in the Brazilian municipalities. At the time of submission, early June 2020, the benchmark of 800 000 cases and 40 000 deaths was surpassed in Brazil, which became in this date the country with highest epidemic incidence and possibly a new worldwide epicenter of COVID-19 [38]. At time of resubmission, mid October 2020, the number of cases had surpassed 5.5 millions with more than 150 000 deaths. The model is supplied by demographic and mobility data mined from public sources, which are used in the stochastic simulations. We quantify how diversified the epidemic can be across the country rather than try to accurately foresee outbreaks in specific locations. We determine the level of dispersion in the curves of the epidemic prevalence in different places within a hypothetical scenario of uniform mitigation measures. We take the situation of 31 March 2020 as the initial condition, with foci in 410 municipalities, allowing to track the epidemics in all municipalities in a single integrated simulation. We predicted that the prevalence curves can vary enormously over geographical scales ranging from states down to municipalities. We also predicted that some states are characterized by outbreaks starting mainly in their capital cities, followed by epidemic waves propagating toward the countryside, while in other states there are multiple epidemic foci. Our results explain why policies of combating a countrywide pandemic should be simultaneously decentralized, in the sense that each locality has to consider its specificities, and integrated, since the fate of one region depends on the attitudes adopted by others. These qualitative behaviors have been confirmed a posteriori.
The remaining of the paper is organized as follows. The model is described in Sec. II while its computer implementation and parameters are in Appendixes A and B, respectively. Model calibration using nowcasting is presented in Sec. III while the detailed results are in Sec. IV. Some discussions and remarks are presented in Sec. V. In Sec. VI, we present a brief afterwords commenting the achievements of the model in forecasting the qualitative countrywide behavior of the pandemics in Brazil five months after the predictions were made [39].

A. Metapopulation
We consider a stochastic metapopulation model for the countrywide spread of the epidemic, in which a key ingredient is recurrent mobility [28] whereby individuals travel to other patches but return to their residence recurrently. The population is divided into patches which represent the places of residence as schematically shown in Fig. 1a). Similar models have been used for modeling of COVID-19 in Spain [16,24,25]. In particular, our model is inspired on a recent work [24] to investigate the COVID-19. We use a different, continuous-time approach where geographical, demographical, and testing capacity specificities are included. As in general metapopulation models, a patch can represent a municipality, a neighborhood or any other demographic distribution of interest. Each patch is labeled by i = 1, 2, . . . , Ω, where Ω is the number of patches, and has a resident population fixed in time given by N i . The inter-patch mobility rate is given by W ij defined as the number of individuals that travel from i to j and return to i after a typical time T r . A patch j for which W ij > 0 is a neighbor of i. Different forms of mobility, with their corresponding characteristic times, can be implemented. For the sake of simplicity, recurrent flux of individuals per day (T r = 1 d) is used in the present work. Mobility within patches is not explicitly considered. Instead, we assume the well-mixed population hypothesis [29] where an individual has equal chance to be in contact with any other in the same patch.

B. Demography and contacts
The average number of contacts per individual of patch i per unit of time is k i . Variations in the social distancing can be implemented by altering k i . Contacts are related to local mobility and are strongly correlated with the population density ξ i . Monotonic increase has been proposed [36]. We follow Ref. [24] and use a contact number modulated by k i ∝ f (ξ i ) with f (ξ) = 2 − exp(−aξ). To take into account the sparse demographic distribution of Brazil, we used the urban population density of each municipality to compute ξ i . Interactions between urban and rural populations are also explicitly considered through a symmetric 2 × 2 contact matrix C uu , C rr , and C ur = C ru , 0 ≤ C xy ≤ 1, determining the contacts between urban (u) and rural (r) populations whose fractions are represented by ω i and 1 − ω i , respectively. The number of contacts in patch i is also modulated by where the terms represent urban-urban, urban-rural, and rural-rural interactions from left to right. The contacts are parameterized by their average number per individual k , such that k i = zg i f (ξ i ) k , where z is a normalization factor, required by the relation i k i N i = N tot k and given by where N tot = i N i is the total population.

C. Epidemic model
The epidemic dynamics within the patches considers three noncontagious compartments which are: susceptible (S), that can be infected; removed (R), that represents immunized or deceased individuals; and exposed (E) that carries the virus but does not transmit it. Compartments of contagious individuals are: asymptomatic (A), that have not presented symptoms; unconfirmed (U), which are symptomatic but not yet identified by tests; and confirmed (C), that have tested positive for COVID-19. The compartment A is actually a simplification of the real situation. At least two relevant types of asymptomatic persons do exist: presymptomatic, that will develop symptoms at the end of the incubation period, and the truly asymptomatic, that becomes recovered without any significant symptoms. There exist large uncertainties with respect to the fraction of individuals who are truly asymptomatic [40,41]. According to the parameters used in this work (see Appendix B), we have approximated 30% of truly asymptomatic in compartment A which lies within the interval reported elsewhere [40]. Due to the uncertainty of the available data, we also assume a same contagion rate for both types of asymptomatic individuals which are hereafter treated as a single compartment. The epidemic transitions obey a SEAUCR model, in which susceptible individuals become exposed by contact with contagious persons (A, U, or C) with rates λ A , λ U , and λ C , respectively. Exposed individuals spontaneously become asymptomatic with rate µ A , while the latter transitions to unconfirmed (presymptomatic), confirmed, or removed (truly asymptomatic) states happen with rates β C , β U , and β R , respectively. Finally, unconfirmed or confirmed individuals are removed from the epidemic process (via recovery or death) with rate α R . The transition U→C has rate α C that may depend on the number of unconfirmed cases n U i . For a small n U i the confirmation rate is constant α (0) C , assuming that any municipality has the minimal resources for testing. The number of confirmed cases is limited by the finite testing capacity per inhabitant, ζ. We assume a simple monotonic function The transition E→U was not included since most studies agree that the asymptomatic or presymptomatic condi-tion is a central characteristic of COVID-19 [19,[40][41][42]. The contact of confirmed individuals is reduced to bk i with b < 1. Figure 1b) shows a schematic representation of the epidemic events. Individuals residing in i can move to neighbors j following a stochastic process with rates R i /T r where R i is the population fraction of i that moves recurrently; the destination j is chosen proportionally to W ij . Symptomatic (U or C) individuals do not leave their residence patches but if the transitions to U or C happen out of his/her residence they return with rate 1/T r , as do the individuals in other epidemic states. Infection, healing and other transitions follow Poisson processes with the corresponding rates. Details of the computer implementation, using an optimized Gillespie algorithm [43], are given in the Appendix A.

D. Initial condition
The initial condition is a major hurdle for the accurate forecasting of a pandemic due to high underscore and delays in reporting. First studies of the COVID-19 Wuhan outbreak estimated that undocumented infections were the source of at least 79% of documented cases [19]. This number can be larger depending on the testing policies [44,45]. The investigated scenario was the beginning of the pandemics in Brazil when testing policies and data availability at municipality level were still very premature. We assume a simple linear correlation between the increase in the number of confirmed cases of COVID-19 reported by the official authorities and infected cases (E, A, and U) in each municipality. We used 31 March 2020 as the initial condition. Let n conf i (t) andñ rem i (t) be the number of confirmed and removed (recovered plus deceased) cases of COVID-19 in each municipality i [38] of Brazil at day t. Let t −1 correspond to 24 March, t 0 to 31 March, and t 1 to 4 April 2020. These data are provided in supplementary material (SM) [46]. The initial condition at t 0 is whereñ X i is the number of individuals in patch i and state X. The last equation means that essentially all individuals who were infected one week before are currently removed. Since we are far from herd immunity at t 0 , this choice is irrelevant. This approach, based on the hypothesis that the increase of reported cases is correlated with local transmission, is a way to reduce the effects of underreporting. The prefactors were chosen together with the parameter α scenario. The last one is consistent with the estimates of the Google Community Mobility Reports for Brazil within this time window [47]; see Figs. 2 and 3.

E. Mitigation strategies
Different mitigation strategies are studied by uniformly reducing the inter-patch mobility and contact by fac- We investigate three different scenarios: none, weak, and moderate mitigations represented by (M, K) = (0, 0), (0.4, 0.3), and (0.8, 0.5), respectively. We do not tackle the regime of suppression or strong mitigation that could drop the epidemic drastically. Averages were evaluated over 10 independent simulations.

F. Remarks on parameter choices
The extensive set of parameters is aimed to reproduce the essence of the epidemic spreading at a countrywide level but not to track accurately the epidemic outbreak in every state or municipality. We specially assume uniform and constant control parameters for reduction of inter-municipality movement and contacts as well as the testing rates. This is surely not the real case. Each federative state or municipality has its own mitigation and testing policies that are very heterogeneous. So, assuming uniformity and fitting the overall data at a country level, it is possible to infer in which regions the epidemic is evolving faster or slower than the average. Finally, our initial condition is a guess driven by data but still limited. The level of underreporting is known to be high, and broad ranges of values have been reported [19,44,45]. The description of the complete set of epidemiological parameters, the initial conditions, mobility and demographic data are given in Appendix B.

III. NOWCASTING AND MODEL CALIBRATION
Irrespective of the heterogeneous testing approaches among states [38], we assume that essentially all municipalities had capacity to detect COVID-19 in the first hospitalized patients and, therefore, the number of municipalities with positive cases is expected to be an observable much less prone to the effects of underreporting than the number of confirmed cases at the beginning of the pandemics in Brazil. Figure 3 compares the evolution of municipalities with confirmed cases between 1 and 22 April 2020 obtained in simulations with different mitigation measures and real data for each Brazilian state and also for the whole country. Weak mitigation provides the best agreement with the reported data whereas deviations start to appear after three weeks. The weak mitigation parameters (M, K) = (0.4, 0.3), which fit better the overall data for Brazil, also perform well for most of the states in this time window and especially those with higher incidence of COVID-19 in the period, which lead the averages over the country. However, some states (AC, AM, PA, PE, PI, and RR -abbreviations for Brazilian federative states are given in Fig. 3) were more consistent with the simulations without mitigation while others (PB, PR, and RN) were nearer to a moderate mitigation. No tuning with respect to specific places was used and, consequently, it does not reproduce accurately the number of reported cases for all federative states due to the diversified testing and social distancing policies across different states and municipalities, in contrast with the uniformity hypothesis of the present study.

IV. GEOGRAPHICAL ANALYSIS OF EPIDEMIC PREVALENCE
The role of inter-municipality traffic is investigated reducing the contacts by 30% and varying the mobility. The total number of cases changes very little since the leading contribution comes from local transmission in municipalities with larger incidence while the number of municipalities with confirmed cases, shown in Fig. 2b), depends significantly on the inter-municipality traffic restrictions. These results confirm the epidemiological concept that reducing long-distance travelling contributes to delaying the outbreak onset in different places, but alters little the countrywide epidemic prevalence [48], as discussed in the context of COVID-19 spreading in China [13,49] and Spain [25], for example.
From now on, generic predictions under the aforementioned hypothetical scenarios are discussed. We stress that these predictions, whose time window spans after the resubmission time, were simulated in April 2020 [39]. High variability of the epidemic outbreaks (intensity, duration and delays) can be observed through all geographical scales. Diversity was found in other metapopulation approaches for epidemic processes [23,27] but the multi- scale dispersion in our simulations is remarkable. At the state level, the time when the outbreak reaches its maximum epidemic prevalence, the epidemic peak, varies considerably as shown in Figs. 4a) and 4b), where the epidemic prevalences for all states are shown as function of time for a weak mitigation.
Let T and ρ be the time and prevalence of the epidemic peak, considering both unconfirmed and confirmed individuals. According to the simulations with weak mitigation, the first state to reach the peak would be CE (T CE ≈ 34 d), followed closely by SP and AM (T SP ≈ T AM ≈ 36 d). The respective prevalences would be ρ CE ≈ 3.6%, ρ SP ≈ 4.9%, and ρ AM ≈ 5.0%. The state with highest epidemic prevalence would be RJ with ρ RJ ≈ 7.0% at T RJ ≈ 40 d. The epidemic peak would happen later (T MA = 71 d), with maximum less intense (ρ MA ≈ 2.3%) and the curve broader for the MA state. The MA case is particularly interesting because whereas the peak for the state would happen last, for its capital city São Luis it would occurs at day 41 slightly earlier than the average peak for Brazil, which would be at day 44 in this scenario. These numbers should not be quantitatively compared with the real reported cases. Firstly, the confirmed cases correspond to a small fraction of the total of cases that depends on testing rate, an unknown variable in our study, and curves for ρ C are much broader and have the peak postponed to later times. Secondly, the uniform and constant mitigation and testing scenarios do not correspond to reality.
A spatially more refined analysis is done comparing im-mediate and intermediate geographical regions of a same federative state, which are determined by the Instituto Brasileiro de Geografia e Estatística (IBGE) [50]. An immediate region is a structure of nearby urban areas with intense interchange for immediate needs such as purchasing of goods, search for work, health care, and education.  Variability of time, prevalence, and width τ (outbreak duration) of the epidemic peak throughout immediate regions are shown in the box plots of Fig. 5 for simulations with a weak mitigation. Box plots for none and moderate mitigations are qualitatively similar and presented in Figs. S8 and S9 of the SM [46]. One can see that federative states with smaller territorial areas, such as SE, RN, and RJ, tend to have smaller dispersions of the peak time while those with larger and more sparsely connected territories, as AM, MT, and MS, have larger dispersions and outliers. We also have states with larger but better connected territories such as SP, PR, and RS, for which the dispersion is significant but not characterized by outliers. Irrespective of the more cohesive outbreak patterns, the epidemic peaks in immediate regions of the SP state, for example, could take more than twice as long as the peak at the capital city. Conversely, the outbreaks in the RJ state would be less desynchronized than in its neighbors, MG and SP, with the latest immediate region reaching the epidemic peak only 50% later than the capital city Rio de Janeiro, in this scenario.
The space-time progression of the epidemic presents complex patterns as shown in Figure 6a) to g), where color maps with the prevalences of symptomatic cases in each municipality of Brazil are shown with intervals of three weeks for simulations using weak mitigation; see Figs. S10, and S11 of the SM [46] for other scenarios. Full evolutions for all investigated mitigation parameters are provided in Movies S1, S2, and S3 of the SM [46]. Epidemic waves propagating on average from the coast at east, where the most populated regions are, to inland can be seen at a country level. Moreover, when the epidemic is losing strength in the coast, the inland regions are still facing high levels of epidemic prevalence. Maps and movies suggest that the epidemic starts in the more populated places, usually the largest metropolitan regions, and spreads out towards the countryside. This is quite clear in the SP state, where the focus begins in its capital city and moves westward into the countryside. However, other states as PR and RS follow a different pattern with multiple foci evolving simultaneously.
We analyzed the correlation between time T of the epidemic peak in the prevalence curves of immediate regions and their distances D from the capital city of the respective state. Figures 6i)-k) show scatter plots of D vs T for three typical correlation patterns found in the simulations. The remaining states are given in Fig. S12 of the SM [46]. The SP state presents a strong correlation with Pearson coefficient r = 0.69, statistically significant with p-value less than 10 −7 . For PR, no statistical correlation is observed. Finally, MG has a moderate correlation with Pearson coefficient r = 0.47, but statistically significant with p-value of 10 −5 . We classified the states according to the correlation between T and D in three groups: not significant for which the p-value is larger than 0.02; strong correlation that has statistical significance with p < 0.02 and r > 0.6; and moderate correlation with p < 0.02 and r < 0.6. The results are summarized in Fig. 6h) where the map shows states colored according to their classification. The structure does not change for other mitigation approaches. The complete set of Pearson coefficients and p-values are given in Fig. S12 of the SM [46]. States in the north (RR, RO, AP, AC, AM, PA, and TO) have few immediate regions distributed in large territories and are less connected. Thus, lack of correlations is not surprising. Moderate correlations in states as MG, BA, and GO, with large but better connected territories are due to the existence of multiple important regions that play the role of regional capitals. However, PR is an extreme case with many and well connected immediate regions but total lack of correlations with the capital city, Curitiba.

V. DISCUSSION
The wide territorial, demographic, infrastructural diversities of large countries as Brazil demand modeling of a pandemic with geographical resolution higher than the usual compartmental epidemic models, which can be achieved using the metapopulation framework. We developed a stochastic metapopulation model and applied it to the COVID-19 spreading in Brazilian municipalities. Performing simulations with different hypothetical mitigation scenarios and considering the integration of regions through inter-municipality recurrent mobility, we have identified a high degree of heterogeneity and desynchronization of the epidemic curves between the main metropolitan areas and other inland regions. The diversity of outcomes is observed in several geographical scales, ranging from states to immediate regions, that enclose groups of nearby municipalities with strong economic and social ties. The intenser the mitigation attitudes, the more remarkable are the differences. Moderate or strong correlations between the delay of the epidemic peak in the interior regions and distance to the capital cities of the respective state were observed for most states with well connected immediate regions. However, one exception was the PR state where there are diverse epidemic foci evolving apparently independently of its capital city.
Our simulations suggest that uniform mitigation measures are not the optimal strategy. In a municipality where the peak without interventions would happen quite later than the capital city of its state, it would delay even more if strong mitigation measures are adopted synchronously with the capital. Nevertheless, such a place probably would have to extend the mitigation for much longer periods since once the epicenters of epidemics start to relax their restrictions, the viruses will circulate fast, reaching these vulnerable municipalities with most of population still susceptible. This is the actual current situation in Brazil at moment of resubmission. The social and economic impacts could be higher. On the other hand, an eventual collapse of the local heath system, a central question of the COVID-19 pandemic combating [24], in the countryside regions could also occur after the larger metropolitan areas were under control, raising the possibility of unburdening the local healthy care system. In fact, the desynchronization of the epidemic curves and consequently of the health-system collapse could be an asset in designing optimal resource allocations. One could prevent rapid expansion of the epidemic through smaller municipalities flattening the curves as much as possible in the main metropolitan regions. Beyond the positive consequences for their own health systems, this attitude would have the positive side effect of delaying outbreaks in the countryside. Our study shows that countryside regions are not safe in the medium-term but could have precious additional time to prepare themselves.

VI. AFTERWORDS
There is an interval of five months between conceiving the first version of the present work [39] and this final submission. As emphasized throughout the paper, our aim was to simulate hypothetical scenarios rather to do an accurate forecasting of real epidemic curves. Therefore, qualitative behaviors predicted in the simulations can now be confronted with the real series observed a posteriori. Several qualitative aspects predicted in simulations were indeed observed in real data. Maybe the most remarkable one is the confirmation of the multiscale nature of the epidemic curves presented in Fig. 4 Figure 7 presents the epidemic incidences (new cases per capita over a week) for intermediate regions of three states presenting typical behaviors observed in simulations. The scenario of propagation starting from east coast towards inland, with a considerable delay for the epidemic peak in the latter, shown in Fig. 6 was also observed. Other interesting phenomena were also verified in real data as, for example, the unusual curve for the MA state and the multifocal spreading in some states as PR, MG and GO.
Lets consider in more details the SP state shown in Fig 7a), the one with highest number of COVID-19 cases in Brazil. One can see a moderate dispersion for peak occurrence time while in the capital city São Paulo it happened considerably earlier. This picture is the same drawn in Fig. S7a) of the SM [46]. For the state of RJ, shown in Fig. 7b), the real data, with an exception for the intermediate region of Petropólis, are remarkably similar to simulations (see Fig. S5a) of the SM [46]), where intermediate regions peaks approximately synchronously with the intermediate region of Rio de Janeiro. We are not able to provide a specific reason for the different behav-ior of Petropólis. The incidence curves for the CE state, one of the first and more intensely impacted places by COVID-19 in Brazil and shown in Fig. 7c), are remarkably heterogeneous with Fortaleza reaching the maximum much earlier and Cretéus much later than the other intermediate regions. The incidence curve for the whole CE state is very broad meaning that different regions rule the state' curve at different times, a scenario in almost perfect consonance with the simulation predictions shown in Fig. S4 of SM [46]. We can also see significant dispersion for immediate regions' curves within a same intermediate region, as illustrated for Sobral in Fig. 7d). Intermediate region of Cretéus is composed only by two intermediate regions and does not illustrate the point as well as Sobral does. The maximum incidence also varied considerably as, once more, predicted in simulations.
Quantitatively, a hypothetical scenario fitting better more regions lies between weak and moderate mitigations. However, as one could guess, different hypothetical scenarios works better at distinct moments and places confirming that quantitative forecasting and even nowcasting is a very dedicated issue.

Appendix A: Computer implementation
The computer implementation of the epidemic model follows the Gillespie methods based on Ref. [43].
Definitions. We define n X ji as the number of individuals in state X (= S,E,A,U,C, or R) within the patch i that comes from patch j such thatñ X i = j n X ji is the number of individuals in patch i that are in state X. Thus, the number of residents of i in state X at patch i is n X ii and of nonresidents is given byñ X i − n X ii . Also,ñ i = Xñ number of residents in the patch i. Finally, the total number of individuals in the compartment X is N X = Ω j=1ñ X i . We have that W ii = N i − j =i W ij is the population of patch i that does not leave it. The population fraction moving recurrently from i to j is Thus, R i = j =i R ij and R ii = 1 − R i are, respectively, the population fractions that move or not from patch i. General structure of the simulation. We divide the simulation in two big groups: mobility and epidemic events. The total rate of inter-patch mobility is in which the first term represents the individuals leaving from and the second returning to their patches of residence.
The epidemic transitions T and their respective total rates Γ T are defined in Table I. The infection rate in a patch i is given by while the total rate of epidemic events is E tot = T Γ T .
In each time step, we select one mobility or epidemic Θi event with probabilities P mob = D tot /(D tot + E tot ) and P epi = 1 − P mob , respectively, and the time is increased by δt = − δt ln η where δt = 1/(D tot + E tot ) and η is a pseudo random number uniformly distributed in the interval (0,1].
Inter-patch mobility events. If a mobility event was chosen we proceed as follows.
i) Choose the patch i from where an individual will leave proportionally to the total mobility rate of the patch, which occurs with probability ii) Choose the state X of the individual of patch i that will move with probability iii) Select if the individual to move is resident or nonresident with probabilities n X ii /ñ X i and 1 − n X ii /ñ X i , respectively. iv) If the individual is resident and not symptomatic (S,E,A or R), the destination patch j is chosen with probability R ij /R i . If he/she is symptomatic, stays in the resident patch (j = i). The movement is implemented as n X ij → n X ij + 1 and n X ii → n X ii − 1.
v) If the individual to move is a nonresident, one patch j is chosen with probability n X ji /(ñ X i −ñ X ii ) and the movement is implemented as n X ji → n X ji − 1 and n X jj → n X jj + 1.
Epidemic events. If an epidemic event was chosen we proceed as follows.
i) One of the transitions T is chosen with probability ii) If T = S→E, then we choose a patch i with probability Θ i / Ω j=1 Θ j . Then, one patch j is chosen with probability n S ji /ñ S i and the populations are updated as n S ji → n S ji − 1 and n E ji → n E ji + 1. Notice that both resident (i = j) or nonresident (i = j) are considered in this rule.
iii) If T is a spontaneous transition of the form X→Y then a patch i is chosen with probabilityñ X i /N X and the patch j of residence is chosen with probability n X ji /ñ X i . Then n X ji → n X ji − 1 and n Y ji → n Y ji + 1.
This algorithm spends too much time with mobility of susceptible and removed individuals since they can exist in very large number for long times. We implemented an optimization performing the mobility of susceptible and removed in discrete times of 0.5 d. So, instead of moving one susceptible individual at a time step, we move n S ii R ij from i to j at once and those who were not infected while visiting patch j, return to i after 0.5 d. The same is done for removed individuals. The simulations yield the same results as the exact procedure within the statistical uncertainties.
Appendix B: Parameters for a data-driven approach Mobility. Mobility parameters were extracted from public sources. Local commuting C (1) ij was estimated from the national census data of 2010 provided by IBGE [52] as the number of people that daily commute among municipalities to work or study. We excluded from the dataset those travels corresponding to more than 250 km, which are reckoned by airline travel data.
Commuting data lacking destinations were included, assuming proportionality to the complete-information portion of the dataset; see [53] for details. For the sake of generality, the recurrent flux is a function W ij ). We use the simplest case F ij (x) = x. Air transportation data were obtained from the Agência Nacional de Aviação Civil (ANAC) [54] with the statistics of direct flights of 2014 considering the main nearby metropolitan region as the origin and destination of passengers. Using boarding and landing statistics of passengers, one-connection flights were inferred and the number of passengers C (2) ij flying daily from i to j was estimated [55]. In order to construct a recurrent mobility with flights, we assume W ji )/2, which represents an effective flux. We do not have estimates for duration of long-range travels. However, this does not affect the current model since the homogeneous mixing hypothesis within the patches means that only the number of individuals that stay part of the time in two different patches actually matters and not who traveled. The total recurrent mobility is W ij = W (1) ij + W (2) ij ; we assume that it occurs daily, i.e, T r = 1 d since both commuting and air travel data were obtained with this time resolution.
Epidemiological parameters. We used epidemiological parameters based on Ref. [24] which were mined from analyses of early COVID-19 epidemic spreading on Hubei province China [17,20,23]. The total time in the exposed and asymptomatic compartments was taken as µ −1 A + β −1 U = 5.2 d [17]. We used µ −1 A = β −1 U = 2.6 d. The time for a symptomatic individual be removed was estimated as α −1 R = 3.2 d [20,23]. For the truly asymptomatic individuals, we used the same recovering time of the presymptomatic ones, such that β −1 R = β −1 U + α −1 R . The infection rates λ C = λ U = λ A = 0.06 d −1 [13,24,56] and average number of contacts k = 13 were estimated to reproduce the scaling of Brazilian very early exponential growth, ∼ exp (0.3t), in reported cases of COVID-19; see Fig. S13 in the SM [46]. These parameters provide an effective infection rate λ k = 0.78 that represents an optimistic estimate nearer to the lower bounds estimated in other studies of COVID-19 [20,23,24,57]. The contact of the individuals who were confirmed for COVID-19 was depleted to a fraction b = 0.3 of their regular contacts. The confirmation rate 1/α (0) C = 10 d was calibrated together with the initial conditions to fit the amplitude of confirmed case curves after 31 March 2020 and also the total number of municipalities with confirmed cases in a weak mitigation scenario. This is approximately twice the incubation time implying that symptomatic individuals may not be tested even if tests are available. Up to 23 April 2020, Brazil had performed less than 1400 tests per million inhabitants [38]. Assuming a time window of 35 d for testing, we have approximately ζ = 1/40 000 d −1 per inhabitant. Due to the testing targeted only to severe cases of respiratory syndromes adopted in Brazil to the date of investigation, the transition from asymp-tomatic to confirmed was neglected setting β C = 0; note that the model can easily include such a transition if the testing scenario changes.
Demographic parameters. Each patch represents a municipality with resident population given by IBGE using estimates of 2019 [35], while the respective fractions of rural and urban populations were obtained from the IBGE census of 2010 [52]. The population density was determined using urban areas, obtained from high resolution satellite images, provided by Empresa Brasileira de Pesquisa Agropecuária (EMBRAPA) [58]. Both fraction and density of urban population, and, consequently, the number of contacts have broad distributions as can be seen in Fig. S13 of the SM [46]. These are key features to enhance the accuracy of the predictions beyond large metropolitan centers. Parameter a that controls the function f (x) was taken as a = 1/ ξ where ξ = 2791 inhabitants/km 2 is the average urban population density of Brazil. The rural-urban contact matrix was chosen as C uu = C rr = 1 and C ru = 0.5. This last choice only assumes that urban-rural contacts are significantly smaller than urban-urban or rural-rural interactions.

ACKNOWLEDGMENTS
We thank Marcelo F. C. Gomes for providing the filtered mobility data used in the present work; Ronald Dickman for useful discussion and manuscript reading; Jesus Góméz-Gardeñes and Alex Arenas for discussions that motivated the beginning of this work. WC thanks David Soriano-Paños for extensive discussions about metapopulation modeling and Mario Balan for discussions about IBGE data. WC thanks the kind hospitality of University of Zaragoza and GOTHAM Lab during the realization of the work. This work was partially supported by the Brazilian agencies Coordenação de Aperfeiçoamento de Pessoal de Nível Superior -CAPES (Grant no. 88887.507046/2020-00), Conselho Nacional de Desenvolvimento Científico e Tecnológico-CNPq (Grants no. 430768/2018-4 and 311183/2019-0), and Fundação de Amparoà Pesquisa do Estado de Minas Gerais -FAPEMIG (Grant no. APQ-02393-18). This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) -Brasil -Finance Code 001.