Analytical and Cellular Automaton approach to a generalized SEIR model for infection spread in an open crowded space

We formulate a generalized SEIR model on a graph, describing the population dynamics of an open crowded place with an arbitrary topology. As a sample calculation, we discuss three simple cases, both analytically, and numerically, by means of a cellular automata simulation of the individual dynamics in the system. As a result, we provide the infection ratio in the system as a function of controllable parameters, which allows for quantifying how acting on the human behavior may effectively lower the disease spread throughout the system.


I. INTRODUCTION
Compartmental models provide a conceptually simple and widely used mean to mathematically modeling the dynamics of infection transmission in isolated populations 1-4 . In such models, the population is divided into compartments, each one corresponding to a specific health status of the individuals that belong to it. For instance, the basic SIR model consists of three compartments, the Susceptible compartment (S), to which not infected, healthy individuals belong, the Infectious compartment (I), that contains infectious individuals, and the Recovered compartment (R), that contains individuals who either recovered from the infection, or died. In the SIR model, the spread of the infection is described in terms of a set of differential equations that describe the population transfer from one compartment to another one. The key parameters of the model are the rates of transfer between the compartments which, in general, are quantities to be experimentally fitted.
In fact, while the SIR model is a good tool for long time simulations, it does not take into account the latent phase, in which people are infected but not yet infectious. The simplest way to fix this flaw of the model is by adding the Exposed compartment (E). The E compartment contains individuals who have been infected, but are not yet infectious. This improvement eventually leads to the SEIR model, more appropriate for short-time simulations. For instance, the SEIR model has recently been widely employed to describe the Covid-19 infection, though with some limitations [5][6][7][8] .
Despite their effectiveness in describing a number of real-life infection dynamics, the SIR and the SEIR models, together with their generalization to a higher number of different compartments 9 , are in general affected by a limitation. Indeed, they describe, in general, the average global behavior of the population, with no attention to local dynamics that can make the level of infection strongly depend on the region in real space that one is considering. Instead, this last aspect has become of fundamental importance in, e.g., countries recently affected by Covid-19 pandemic, such as Italy, which show a strongly uneven spread of the infection across the country territory [see, for instance, https://en.wikipedia.org/wiki/COVID-19 pandemic in Italy for a detailed description of the Covid-19 infection distribution across Italy from the start of March, 2020]. In addition, the SIR-and the SEIR-type models basically describe closed systems, without the possibility for people to enter or exit from the system. So, they are also expected not to be reliable when describing relatively small populations that can exchange individuals with the surrounding "environment", such as shopping centers or closed commercial areas. In systems as such, the infection dynamics is again expected to be strongly space-dependent, due to, e.g., the presence of very popular shops in a shopping mall, where people spend on the average much more time than in different areas.
In general, it is well known that the spread of a disease has a strong dependence upon the topology of the system and on how individuals and/or subgroups are connected to each other. Specifically, the topology strongly affects the rates that eventually enter the differential equations describing the population dynamics and, therefore, it determines the specific stationary solution, describing the system over long time scales 10 . For this reason, in the last years, a remarkable amount of work has been devoted to discuss the dynamics of epidemic processes on graphs and hypergraphs [11][12][13][14] . In these systems, time-and spaceinhomogeneity leads to nontrivial consequences on the population dynamics [15][16][17][18][19][20] . These are of the utmost importance over short-time, small-space scales, as in the case of the disease dynamics on a sidewalk, in a shopping center or running tracks, where the topology of the system strongly affects the effective contact rate, that is the number of contacts, per unit of time, between individuals 21 .
Based on the above observations, in this paper we describe the population dynamics of an open crowded place in terms of a "local" SEIR model on a graph, with position-dependent parameters. In particular, after writing the set of differential equations describing our sys-tem, we first provide an approximate analytical solution within a mean field approach to the full mathematical problem and, therefore, we resort to a cellular automata (CA) simulation of the people traffic in the system [22][23][24] . In general, an analytical treatment of the problem in terms of a collection of local compartmental models in a nonuniform nonequilibrium configuration is quite difficult to deal with, as the number of differential equations scales quickly with the size of the system. At variance, the CA is able to describe the pretty complicated behavior in terms of simple rules, that apply simultaneous to all the nodes of the graph. Eventually, we employ the CA numerical results to pertinently complement and check the reliability of the (approximate) analytical ones.
To quantitatively describe the infection spread over the graph, in all the cases we compute the (average) infection ratio in the steady state of the system, C r , that is, the ratio between the variation in the average population in the compartment E over a time T long enough for the system to reach the steady state and the initial average population in the compartment S. C r measures the hazard for a healthy individual to get infected when going across the system in the presence of infected individuals. Therefore, to quantify the level of infection risk for an individual in the system, we compute C r as a function of various system parameters, some of which are particularly important, as they can be readily acted on by, for instance, controlling the entrance rate of people in a shopping center, letting people move in one direction only at each side of the aisles in the center, increasing or decreasing the number of points at which it is possible for individuals to change their direction of motion, and so on.
Considering all together in our model the effects of both effective parameters that depend on the ("intrinsic") disease dynamics and of parameters that can be tuned by acting on the social behavior of individuals, we are able to quantify how an accurate control of humandepending effects, such as social distancing, use of personal protective equipments and so on, can be effective in mitigating the effects of high contagion rates of diseases, such as the one due to Covid-19 [25][26][27][28] .
To present the main features of our model and to discuss its implementation, both analytical and numerical, in the paper we focus on three simple, prototypical models of open systems. Yet, as we eventually discuss in the paper, within the CA approach, generalizations of our models to more realistic situations are straightforward, and we plan to pursue them in forthcoming publications.
The paper is organized as follows: • In Sec. II, we define our generalized SEIR model on graphs. In particular, we present the (local) set of differential equations describing the (local) population dynamics on the graph and provide an explicit analytical mean-field solution of the equations. Eventually, after presenting the corresponding results, we highlight the main limitations of the analytical approach, which motivate our switch to the numerical, CA approach; • In Sec. III, we present and discuss the CA rules describing the spread of an infection within an open, finite connected graph. Therefore, we implement them to obtain numerical results in the same systems analytically studied in Sec. II. Eventually, we employ the numerical results to check the reliability of the analytical approach we use in Sec. II; • In Sec. IV, we provide our main conclusions and discuss about possible further perspectives of our work; • In Appendix A we review the basic formulations of the compartmental SEIR model.

II. THE LATTICE LOCAL SEIR MODEL
In this section, we define and analytically study our lattice model generalization of the SEIR model, suitable to describe the spread of an infection throughout a number of open, finite connected graphs. Within our model, we describe the dynamics of small populations, each one residing at the sites of a pertinently designed (quasi)one-dimensional lattice, and connected to each other by means of a finite rate for individuals to "hop" from one site to the others. As we discuss in the following, our model is able to encompass several typical features of infection spreading in real world, particularly evidencing, on quantitative grounds, how the spreading depends on in principle tunable parameters of the system.
Throughout this paper, we focus onto linear graphs, that is, finite one-dimensional lattices (chains), with open boundary conditions. Each chain corresponds to a simplified model of a straight way for pedestrians. More complex (and, in many cases, more realistic) models can be readily constructed by, e.g., putting together finite, one-dimensional lattices to be used (with the appropriate boundary conditions) as "elementary building blocks". Each lattice site j hosts a "local" population of individuals that is characterized by the density of healthy people n S,λ;j , that is, the number of healthy people per plaquette, by the density of exposed (infected, not contagious) people n E,λ;j , that is, the number of exposed individuals per plaquette, and by the density of infectious people n I,λ;j , that is, the number of infectious individuals per plaquette. In this respect, each cell realizes a population whose dynamics is described by the SEIR model reviewed in appendix A, though with a major simplification which we discuss next. In fact, the basic assumption of the SEIR model that only an infectious individual can turn into a recovered one (either healed, or deceased), as well as with the observation that the duration of the time that each individual spends within the graph is of a few hours, which is pretty short compared with the time scales for having a nonzero density of recovered people, enables us to set to zero the density of recovered people throughout the whole system. Furthermore, as the median incubation period of a virus such as 2019-nCoV ARD has been estimated to be about 3.0 days 29 , we set to zero the probability for exposed people to become infectious. Accordingly, we assume that the total number of infectious individuals in the system can only change by exchanges with the outside (infectious individuals either entering, or exiting, the system).
Compared to the "standard" SEIR model, our framework allows for the net number of individuals in the population at each lattice site to change, as a consequence of individuals hopping between neighboring sites on the chain. To formalize this aspect of our system, when defining the various local densities, we add a label λ , encoding various possible ways for individuals to move between different sites. In general, in our sample models, at any time t, each individual at site j of a chain can either move towards the right (corresponding to λ > 0), or towards the left (corresponding to λ < 0), on one of the possible parallel lanes labeled by |λ| = 1, ..., Λ, with rules and constraints that depend on the specific lattice topology.
Regardless of the specific lattice topology, the mathematical description of the population dynamics on the lattice can be formalized by a set of differential equations, whose parameters are determined as follows. First of all, to ease the mathematical formulation, it is useful to label each cell by both the lattice site index, j, as well as with an additional index λ encoding the information about the direction of motion. So, each "physically distinct" lattice site j corresponds to several cells, which we label with the pair of indices j, λ. Within each cell, n S,λ;j , n E,λ;j , and n I,λ;j can either be zero, or different from zero.
At any cell j, λ not residing at the endpoints of the chain (that is, with j = 1, L), on top of the "standard" SEIR dynamics for an isolated population, the number of individuals in each compartment can either change because individuals hop from-, and into-, neighboring cells moving in the direction defined by λ, or by changing the sign of λ (that is, the direction of motion) at a given j. To formally describe individual motion between different cells, we set ω h;λ to be the rate (probability per unit time) of an individual to hop from cell-j, λ to cell-j ± 1, λ (where sign plus is for λ > 0 and minus for λ < 0) and ω ℓ;(λ,λ) to be the rate for an individual to hop from cellj, λ to cell-j,λ, withλ = λ. If λλ > 0, the individual is changing its walking lane but not its direction of motion, while, if λλ < 0 the individual is changing both lane and direction of motion. To simplify our further derivation, we assume that the system is homogeneous in real space, which implies that the rates are all independent of the index j. Moreover, as there is apparently no reason for different types of individuals to move at different rates, we assume that ω h;λ and ω ℓ;(λ,λ) are independent of whether the moving individual is S, E, or I.
As we aim at eventually describing steady states of the system, without big local fluctuations in the various densities, consistently with the detailed balance principle, we assume ω ℓ;(λ,λ) = ω ℓ;(λ,λ) . Finally, to account for the "local" SEIR dynamics, we introduce the parameter ω c , which corresponds to the infection rate that determines the change in time of n S,λ;j , n E,λ;j , and n I,λ;j at given j, λ. At the endpoints of the chains, that is, for j = 1, or j = L, the rate for individuals (of any type) to enter the cell at fixed λ > 0 (λ < 0) is simply the entrance rate into the system, ω in (which is one of the tunable parameters of our system). Similarly, for j = L (j = 1), the rate for individuals (of any type) to exit the cell at fixed λ > 0 (λ < 0) is given by the exit rate from the system, ω out . The rules listed above allow us to fully determine the set of equations describing our model. In addition, they also completely define the CA rules, once the rates are traded for the corresponding probabilities at each elementary time step, by multiplying all of them by the elementary time step of the CA, ∆t (see Sec. III for an extensive discussion of this point). Therefore, for the sake of our presentation, we pictorially present all the above rules in Fig. 5 of Sec. III which, as stated above, applies to both the mathematical model and to the CA.
Putting the various ingredients listed above all together, we obtain the following set of differential equations for the local densities on the lattice, for individuals moving in both directions and for 1 < j < L: Consistently with the rules we discuss above, for j = 1 (j = L) and λ > 0 (λ < 0), the terms A general comment about the set of Eqs.
(1)-(3) is that, while they certainly apply for low values of the individual densities at each cell, it is reasonable to assume that the maximum total density of individual at each cell does not go beyond a maximum value n max (which, by symmetry, we assume to be cell-independent), that is, n S,λ;j + n E,λ;j + n I,λ;j ≤ n max . Formally, such a constraint can be easily implemented in the set of differential equations above by, e.g., substituting n {S,I,E},λ;j at the right-hand site of the equations with n {S,I,E},λ;j (n max − B n B,λ;j±1 )/n max . In fact, while we definitely take into account the constraint when solving the equations, as well as when defining the cellular automata rules below, to ease the notation we prefer not to explicitly write it down in Eqs. (1)-(3), by limiting that set of equations to the low-density regime. The function f λ;j in Eqs. (1), (3) is the joint probability to truly have infectious and healthy people at the same time in the same cell. In general, this function is not simple to derive, especially because it changes from scenario to scenario, that is, since it is strongly affected by the specific details of the lattice. However, at least in the sample cases we discuss below, we show that f λ;j can be effectively estimated by means of a simple, mean-field approximation. To be specific, we now present and discuss the form of the above rate equations in the sample cases we deal with in our work. In particular, in the following we consider three different scenarios, that is • Scenario A: this describes a small sidewalk with individuals that can move both ways within one chain only; • Scenario B: this describes, for instance, a wide shopping mall, or an aisle in a shopping center. In this case, individuals are more or less evenly distributed between the two sides of the street, in front of the shop windows, so that people on different sides cannot infect each other; • Scenario C: this describes a couple of reverse oneway streets. Basically, in this case people with different walking directions are forced to stay on different sides of the street at a distance greater than the safety distance, so that people moving toward opposite directions cannot infect each other.
Depending on the specific scenario we are focusing on, we resort to different mean-field decouplings for f λ;j ({n ν,λ;j }). In general, the mean-field approximation for the joint probability function is grounded on the ansatz with the µ (λ,λ) 's, that are either equal to zero, or to one, encoding all the specificities of each case. In particular, case-by-case we choose the µ (λ,λ) 's as follows: • In scenario A we have just one chain on which individuals can move in two opposite directions. Only one pathway is available in either direction, so, λ can only take the values ±1 and µ (λ,λ) = 1.
• In scenario C again we have just one chain on which individuals can move in two opposite directions, such as in scenario A, but now spatial separation makes it impossible for individuals moving toward opposite directions to infect each other. Accordingly, here again λ = ±1, but now µ (λ,λ) = 0.
In principle, given the appropriate initial conditions and once the various rates have been pertinently estimated, the system of differential equations reported above is enough to discuss in detail the evolution in time of the individual flows across the system. In practice, knowing, for instance, the response in time to a sudden change in the system parameters (the rates) can help to predict the increase/reduction in the infection diffusion once local boundaries between regions in the same country, or between different countries, are relaxed/enforced (that has recently become an ubiquitous procedure to keep the infection under control all around the world). While we plan to discuss these features in a forthcoming publication, here we are mostly focused onto infection propagation in an environment where people flow is expected to shortly become stationary in time. For this reason, here we just focus onto stationary solutions of Eqs.
(1)-(3). In fact, while possible nonuniformities in the stationary density distribution due to boundary effects (which are in any case negligible in the large system limit), we numerically checked that at the system boundaries, j = 1, L, the asymptotic values of the population densities are the same as in the bulk of the system, that is, for 1 < j < L.
The main parameter characterizing a stationary solution (after a relatively short-time transient) is the average time T that people spend from when they enter the system till they exit. To evaluate T , we make a number of simplifying assumptions. Specifically, we assume that the (stationary) flow in either direction does not depend on the direction itself, that the densities at any cell are independent of time and uniform, that is, that n {S,I,E},λ;j is independent of both λ and j. As a result, dropping the indices j and λ and making the other simplifying assumptions listed above, we see that Eqs.
with µ = 1 for scenario A and B and µ = 0 for scenario C.
In solving Eqs. (5), we assume n λ;j = n S,λ;j + n I,λ;j + n E,λ;j =n to be constant and independent of both j and λ. Accordingly,n only depends on the rate of people entering the system. The total number N of individuals in the system at time t is proportional to the number of entering points, that is, to the number of different values of λ, times the rate at which people enter the system, ω in = ω in;S,λ + ω in;I,λ + ω in;E,λ , minus the exit rate, ω out (which we assume to be equal to ω h , times the total number of people in the last cells that, in the uniform, stationary regime, is just equal to N/L). Therefore, one obtains with Λ being the number of different values of λ. Eq. (6) is solved by setting Extrapolating from Eq. (7) the asymptotic value of N (t) for t → ∞, we can readily get the average density of people in each cell for each value of λ,n, which is given byn Oncen is fixed by the (asymptotic) system dynamics, it is still possible for individuals in the "local" population to switch among compartments. This is, in fact, encoded in the "local" SEIR-like Eqs. (5). To discuss the SEIRdynamics, we therefore solve Eqs. (5) by setting n S (t = 0) = n 0,S , n I (t = 0) = n 0,I , and n E (t = 0) = n 0,E , with and 0 ≤ δ I , δ S ≤ 1, δ I + δ S ≤ 1. Solving Eqs. (9), one eventually obtains n S (t) = n 0,S e −ωcn0,I (1+µ)t , n I (t) = n 0,I , In our approach, the key observable to quantify the level of infection due to individual motion through our system is the infection ratio C r,λ:j (t), that is, the number of people that in cell-j, λ get infected in a time t normalized to the number of healthy people that entered the system at t = 0. Clearly, within our stationary solution we expect C r,λ:j (t) to be independent of both λ and j. Accordingly, for the sake of simplicity we henceforth denote it simply as C r (t). From Eqs. (10), we obtain that the infection ratio at time T , C r , is given by Apparently, C r measures the hazard for a healthy individual to go across the system in the possible presence of infected people. Among the parameters at the righthand side of Eq. (11), n 0,I depends on the environmental conditions about the infection spillover, µ depends in a known way on the system topology (see the previous discussion) and, therefore, T is the only parameter that has to be estimated. While, in general, T can be extracted from the results of the numerical simulation, for ω ℓ = 0 it be analytically computed through a weighted average, as we discuss next. "Mimicking", in a sense, the cellular automata approach, to compute T , we assume that the evolution in time of the system takes place via a discrete sequence of elementary time steps, each one of duration ∆t. Accordingly, the fastest path taking an individual from the entrance to the exit of the system has duration T min = L∆t. In general, however, the topology of the system allows for backturns which make the actual time spent in the system larger than T min . As a result, one obtains that is independent of ∆t, as expected.
As long as ω ℓ = 0, inserting Eq. (13) into Eq. (11) for C r , one gets and then one can draw plots of the infection ratio as a function of either ω h , at a given ω in , or of ω in , at a given ω h . In Fig. 1, we draw C r as a function of ω h for three sample values of ω in . Remarkably, we see that, at large enough hopping rate between neighboring lattice cells, C r barely depends on ω in and keeps as low as 1% − 10%. At variance, in Fig. 2 we we draw C r as a function of ω in for three sample values of ω h . Here, we see that the dependence of C r on ω in is strongly affected by the value of ω h . In particular, while keeping ω h as large as 0.4 s −1 allows for maintaining C r of the order of 10%, even at pretty large values of ω in , at variance, as soon as ω h becomes of the order of 0.25 s −1 , C r rises up to 40% when ω in ∼ 0.5 s −1 . Finally, for ω h = 0.1 s −1 , we see that C r is ∼ 50% already for ω in ∼ 0.15 s −1 . Thus, from the synoptic comparison of the plots in Fig. 1 and Fig. 2, one may on one hand infer how C r is poorly sensitive to the value of ω in , provided ω h is large enough, which basically implies that, if the individual flow through the aisles is made fast enough, the entry rate of people into the system from the outside is not a crucial parameter to keep C r low. On the other hand, one finds that, as a function of ω in , C r is strongly sensitive to the value of ω h . Indeed, Fig. 2 basically shows how easy is for C r to become already as large as 50%, if ω h is not kept large enough to avoid crowding within the system.
We discuss now the effects on C r of allowing individuals to change their direction of motion, once in the system. On intuitive grounds, one expects that letting ω ℓ = 0 should lower C r , at fixed values of the other system pa- rameters. This is reasonable because now individuals are allowed to exit the system from the same side they entered. This means that an individual who enters the system from say cell-1, 1 and who wants to reach, for example, cell-1, 2 (on the other side of the street), does no more need to run across all the street till the cell with j = L and back, thus consistently lowering the risk of infecting other people, or of being infected by other people meanwhile. In fact, allowing people to change their walking direction allows them to reach faster the shop they are interested in by lowering the path accordingly. On the mathematical side, having ω ℓ = 0 does no more allow for exactly computing T by means of a procedure similar to the one leading to Eq. (12). Therefore, to plot C r as a function of ω ℓ with the other system parameters being fixed, we numerically derived T at given ω ℓ and therefore substituted the corresponding value in Eq. (11) for C r . In Fig. 3, we report the corresponding curves for C r as functions of ω ℓ , with the other system parameters set as discussed in the caption. Apparently, we see that increasing ω ℓ always acts to lower C r . Also, as we already inferred from the plots in Fig. 1 and Fig. 2, we see that C r is further lowered by simultaneously increasing ω h and lowering ω in , as we extensively discussed above. Finally, to investigate how, and to what extent, C r depends on a tunable parameter, in Fig. 4 we plot C r as a function of ω h for different values of the infection rate ω c . Indeed, ω c is a parameter one can effectively act on from the outside by, for instance, letting people entering the system to wear a mask and/or to keep social distancing, et cetera. From the plots of Fig. 4 we see that while, as expected, at a given ω h , the lower is ω c , the lower is C r , at the lowest value of ω c , one sees that C r keeps lower than 10% as soon as ω h ≥ 0.1 s −1 . Apparently, this is a quantitative evidence of the effectiveness of using as many measures to prevent infection as possible, such as masks and social distancing. A good combina- tion of prevention measures with a pertinent engineering of the individual pathways inside a given system, as well as with an appropriate regulation of the entrance rate in the system, can apparently work as an effective mean to keep the level of infection pretty low. To comment about our result, we note that our theoretical model is definitely able to catch some interesting qualitative behaviors. However, it overestimates the infection ratio, due the mean field approximation we employ to provide an explicit form for the joint probability function, f λ;j . Indeed, for example, if we consider scenario C with a hopping probability equal to one, the contagion should be exactly zero if less than an individual enters the system at each turn. However, the theoretical model is not able to reproduce this result because it totally neglects the actual people distribution in real space within the system. Furthermore, the mean field approximation is not able to properly distinguish between scenarios B and C. Indeed, as stated above, implementing the mean field approximation fixes the parameter µ at µ = 1 in scenario B and at µ = 0 in scenario C. Yet, from the formula for C r within mean field approximation, Eq. (11), we see that the plots in the two cases just collapse onto each other, provided one sets ω in in scenario C to be twice as large as ω in in scenario B. This is a consequence of the fact that, in both cases, people are divided into two separate groups: in scenario B they are divided into two lanes; in scenario C they are divided according to their direction of motion. In both cases they interact with half of the people they would interact with in scenario A. However, in scenario C, we have to consider the relative speed between people that, in the mean field approximation is simply thrown away (two people with opposite velocity meet for sure while people moving in the same direction do not). For this reason, as well as defining a playground to extend our approach to a systematic analysis of the transient regimes and of a generic case of time-space dependent rates, in the following we resort to the cellular automata approach, by means of which we will be able to implement the joint probability function in terms of simple rules.

III. THE CELLULAR AUTOMATA RULES AND THEIR IMPLEMENTATION
In this section we discuss in detail the rules of the cellular automata describing the spread of an infection within an open, finite connected graph and how we implement them. In particular, to keep consistent with the analytical derivation of Sec. II, in the following we focus on the three different scenarios we proposed there.
Let us begin our discussion with scenario A. Referring to Fig. 5, we model this system as a 2 × L square grid with von Neumann neighborhood. On indexing each plaquette of the lattice with a "row" and a "column" index, we use the row index to store the information on the direction of the motion of each individual, while the column index keeps track of the spatial position. Accordingly, we regard each cell as a portion of a "road" of physical length L. The first row of the matrix represents people moving to the left (that corresponds to having λ = −1 in the notation of Sec. II), from cell (1, L) to cell (1, 1). At variance, the second row represents people moving to the right (corresponding to λ = +1 in the notation of Sec. II), from cell (2, 1) to cell (2, L) (note that, in scenario A, cells (1, j) and (2, j) overlap each other in real space). To each cell (λ, j), we associate three positive integers, n S,λ;j , n E,λ;j and n I,λ;j , respectively corresponding to the number of healthy, exposed and infectious individuals moving in the direction λ, as defined in Sec. II. While we allow more than a single individual to occupy the same cell, to take into account that each cell corresponds to a finite region in space, we put a constraint on maximum limit of individuals in each cell. Letting d to be the side of each (square) cell, we are therefore limiting the maximum number of people that can physically enter a d × d square region of space. Accordingly, we require that, ∀j = 1, . . . , L, we obtain λ=±1 (n S,λ;j + n E,λ;j + n I,λ;j ) ≤ n max,j , with n max,j = n max = 15 for all cells, as it appears to be reasonable for the cell size we consider (see below for the detailed discussion about our choice of the system parameters).
The populations inside each cell are updated every time step ∆t. Each time step is composed by two phases: the movement turn and the contagion turn. At variance with respect to our theoretical framework in Sec. II, in this case we are considering integrated rates at each single turn, that is, probabilities. Therefore, referring to the definition of the various rates in Sec. II, we denote with p h = ω h ∆t the probability that, in a single turn, an individual either moves backward from cell (1, j) to cell (1, j − 1), or it moves forward from cell (2, i) to cell (2, j + 1) (focusing on the "inner" cells, that is, 1 < j < L). Going along the rate formalism of Sec. II, we scenario A scenario B scenario C FIG. 5: Scenario A: people moving over a narrow street or a sidewalk; they can walk in both directions and are close together at a distance less than the safety distance d. The arrows pictorially encode the cellular automata rules that act every time step ∆t. In particular, the top row corresponds to people moving to the left, the bottom row to people moving to the right. In the bulk of the system, individuals can make a step to the next cell with probability p h (red arrow), or change their direction, with probability p ℓ (green arrow). At the boundaries, an individual can enter into the system with a probability pin (blue arrow). During the contagion phase, an healthy individual can be infected by infectious individuals moving in either direction (here, as well as in the scenario B and in scenario C, the yellow background represents the range of the infection). Scenario B is like scenario A, but with people moving in both directions distributed along the two sides of the street (the "sublattices"). People can change their direction of motion or change street side or both (the probabilities, p ℓ , associated to the three processes are, in principle, different) but, if they lie on different sublattices, they cannot infect each other. Scenario C is like scenario B, but each sublattice hosts individuals moving in one direction only. The cellular automata probabilities are related to the rates of the theoretical model of Sec. II by pA = ωA∆t. define p ℓ = ω ℓ ∆t to be the probability for an individual to change in a turn its direction of motion, that is, to move from cell (1, j) to cell (2, j), or vice versa. Individuals, whether healthy, exposes or contagious, enter the system at each turn from the boundary cells (1, L) and (2, 1). In particular, letting p in = ω in ∆t the probability that one individual enters the system in a time step ∆t and also letting δ S , δ E , δ I the average fraction of the total population corresponding to healthy, exposed and infectious individuals respectively, we have that the probability for an healthy, exposed and infectious individual to enter the system in a single time step is respectively given by p in,S = δ S p in , p in,I = δ I p in and p in,E = δ E p in , clearly with δ S + δ E + δ I = 1 (note that, while p in is the entrance probability for a single pedestrian, in general, more than a pedestrian can enter the system at each turn). Finally, individuals can exit the system from the cells (1, 1) or (2, L). When a pedestrian exits the system, it is removed from the total count of people within the lattice and, at the same time, it triggers a counter that keeps trace of its health status. Likewise, another counter keeps trace of the people that effectively entered the system. Each hopping, entering or change of direction is allowed only if the total population inside the target cell has not saturated to the allowed value n max .
The infection turn takes place right after the movement turn. In each cell (λ, j), the health status of each pedestrian has a probability p c = ω c ∆t to switch from S to E for each contagious individual that is present in cells (1, j) and (2, j). Consistently with the assumptions discussed in Sec. II and in Appendix A, we do not allow the status of E-and I-individuals to change.
Switching to scenario B, one readily sees that it is a simple "double copy" of scenario A. In this case, the matrix has four rows, rather than two, with two rows per each "side of the street". Therefore, one has two different directions of motion per each side of the street, with the possibility, for a single individual, to switch, at the same time walking side and/or direction. In order to compare the results between scenarios A and B we keep fixed the incoming individual flow halving the incoming probability p in .
Scenario C is the same as scenario A, except for the fact that people moving in opposite directions are physically separated in real space. Therefore, during the contagion turn, an S-individual can only be infected by an E-individual moving in the same direction. Furthermore, the constraint on the maximum allowed number of individual in each cell must be satisfied separately for each value of λ, that is, the constraint takes the form (n S,λ;j + n E,λ;j + n I,λ;j ) ≤ n max,λ .
At the end of the simulation, we compute the infectious ratio C r as the number of E-individuals leaving the system, minus the number of individuals already exposed before entering into the system, over the number of healthy individuals entered into the system.
The cellular automata parameters depend on the model and on the physical system we are interested to describe. In our simulation, we employ the following parameters: • The cell dimension d: this should be pertinently chosen to be of the order of the safety distance between individuals, to keep consistent with our assumption that infection spread only happens inside a cell. Accordingly, we set d = 2m; • The number of cells L: this is not a crucial parameters. It measures the length of the pathway we are considering in units of d. Throughout all out calculations, we set L = 50 but, as stated above, it can be readily changed to simulate longer, or shorter, paths; • The time step ∆t: this measure the time required by a pedestrian to walk for d meters without slowing down. In the following, we set ∆t = 2s; • The infection probability p c : in general, this depends on the effective virus transmission probability, on the pedestrian health status (which, in turns, depends on age, gender, et cetera), and by the protective equipment wore by individuals, such as masks and gloves; • Percentage of infectious individuals effectively entering the system, δ I : while, in principle, δ I is determined by the average percentage of infectious people to the main population, at the entrance to the system it can be strongly reduced (compared to the outside) by, e.g., checking the body temperature at the entrance and by forbidding people with symptoms like fever or cough to enter the system. Indeed, in the case of infection by Covid-19, it has been estimated that about 43, 8% of infectious people have fever before hospitalization. Other discriminants, for checks at the entrance, can be age and gender, indeed the median age was computed as 47 years, 58.1% were males, and only 0.9% of patients were aged below 15 years 29 .
• Entrance probability p in , "hopping" probability without changing direction, p h , and probability of changing direction, p ℓ : these depend on the number of individuals in the system and on the time spent by them in units of ∆t. Realistic estimates for those probabilities (or for the corresponding rates) can be extracted from the crowd fluxes measured, for instance, by the transit crowdedness function of Google Maps, in previous years. In addition, one should also take into account the possibility of "artificially modifying" these probabilities by, e.g., influencing people with disclaimers, turnstiles, watchmen or "nudges" 30 .
As a main sample of CA results, in Fig. 6 we plot C r as a function of p h for all the three scenarios described in Sec. II. To make the windows of values of the independent variable consistent with the one we use to draw Figs. 1,4, we let 0 ≤ p h ≤ 1 which, given the relation p h = ω h ∆t and since ∆t = 2s, corresponds to 0 ≤ ω h ≤ 0.5 s −1 of Figs. 1,4. As expected from the main features of the three scenarios, at values of all the system parameters all equal in the three different cases, for what concerns the infection rate, scenario A is the worst, since individuals moving toward opposite directions are not spatially separated from each other, scenario C is the best, due to the condition µ λ,λ = 0 (see the discussion in Sec. II for details), scenario B is halfway between the two of them. Remarkably, while we recover an acceptable qualitative agreement with the results discussed in Sec. II, we observe how, differently from the mean field approximation, the CA approach is able to correctly discriminate between scenarios B and C and to reproduce the right limit C r → 0 limit for p h → 1.
To ease the comparison between the CA results and the (approximate) analytical one, to check the reliability of the approximations we employed along the derivation of Sec. II, in drawing the plots in Fig. 6 we chose the parameters so that, once all the probabilities are converted into rates by dividing all of them by ∆t, they exactly correspond to the ones we used to draw Fig. 1. In particular, Fig. 1 was drawn for µ = 0, which correctly describes scenario C only. For this reason, in Fig. 7 we draw a synoptic plot of the curve of Fig. 1 corresponding to the parameters chosen to draw Fig. 6 and of the points of Fig. 6 corresponding to scenario C (note that we use p in as the sole independent variable for both plots, after converting ω in into p in using p in = ω in ∆t). Importantly enough, the synoptic comparison shows that, at given system parameters, the mean field approximation systematically overestimates C r at a given p h , which shows that the simple analytical approach, in a sense, provides in a simple way a "safe" upper bound on the infection risk.
To further compare the CA approach to the analytical mean field approximation, in Fig. 8 we plot the numerical data from the CA for C r as a function of p in at p ℓ = 0 for various values of p h and with all the other parameters chosen as in the derivation of Sec. II (see the figure caption for details). Qualitatively speaking, we see that the trend of the data in Fig. 8 is the same as we display in Fig. 2, which was derived by applying the mean field approximation to the generalized SEIR model. Specifically, at a given value of p in , increasing the probability for individuals to move from one cell to the neighboring one (that is, increasing the average speed of the pedestrian motion in the pathway) determines a remarkable lowering of C r , as expected in view of the fact that, as discussed in the previous section, due the possibility for individuals to exit from the same side of the street, the mean time spend by each of them inside the system is reduced, as well as the probability of infecting, or of being infected.
Finally, to complement the results reported in Fig. 4 with the corresponding analogs derived within CA frame- Cr as a function of p h computed in scenario C (green pointplot) and within mean field approximation with µ = 0 (full red line) with pin = 0.5, p ℓ = 0, pc = 0.05, δE = 0, δI = 0.05. Apparently, the mean field calculation always overestimates Cr, compared to the "exact" CA result.
work, in Fig. 9 we plot C r as a function of p h for scenario C, for various values of p ℓ and all the other parameters set to quantitatively ground the comparison with Fig. 4 (see the figure caption for details). As expected, the CA results confirm that increasing p ℓ by keeping all the other parameters fixed acts to lower C r , as it basically lowers the average time spent by individuals in the system (see Sec. II for a detailed discussion about this point). Eventually, we recover an excellent consistency between the (approximate) analytical results and the (basically exact) numerical ones obtained within the CA framework. This evidences that, on one hand, the mean field approach we employ to solve the generalized SEIR equation provides an acceptable level of qualitative description of the system behavior, on the other hand, that the  CA approach can be effectively implemented to improve the quantitative reliability of the results, when necessary.

IV. CONCLUSIONS
In this paper we have formulated a generalized SEIR model on a graph that allowed us to describe the population dynamics of an open crowded place with an, in principle, arbitrary topology. To illustrate the effectiveness of our model, we have discussed a few, simple paradigmatic cases, which we have treated both analytically, within a mean field approach to the full set of SEIR model differential equations, and numerically, by means of a cellular automata simulation of the individual dynamics in the system. As a main result of our derivation, we were able to provide the infection ratio C r as a function of "tunable" system parameters, which eventually enables us to show to what extent controlling human-depending effects may act to lower the disease spread in the system.
As an immediate further development of our work, we note that, within our approach, one may readily extrapolate the ratio between individuals that become exposed inside the system and infectious individual coming from outside as Dividing R, obtained within CA simulation, by the time by which we run the simulation and then multiplying the result by the overall fraction of time usually spent by an individual into the system under analysis, in a period as long as the incubation time, one may obtain the number of people infected by each infectious individual, that is the basic reproduction number R 0 32 , related to a specific environment. As a next development of our work, we plan to estimate this quantity, that is only a fraction of the cumulative R 0 (that is the sum of the R 0 of all the places the individual spent time in), for different scenarios, e.g., a shopping center, a pedestrian track, a gym, a school, and so on. Eventually, we plan to use it our results as a tool to evaluate the infection hazard of a given place compared to others and to, e.g., suggest which place would be safer to open first after a global lockdown. Finally, it is worth stressing that, as R 0 is an effective parameter that depends on the virus and on the social behavior, our approach allows for discriminating the contribution due the virus intrinsic properties [25][26][27][28] from the human-depending effects like social distancing, personal protective equipment and so on, and to provide a quantitative information about how to act to reduce the latter contribution.
Finally, we remark that, while, in order to make the presentation of our approach the straightest possible, we confined ourselves to three simple sample scenarios, our approach can be easily generalized to more realistic and, unavoidably, more complex graphs, by means of a proper implementation of the neighborhood between cells within CA simulation.
• The Recovered compartment (R), that is made out of individuals who either recovered from the infection, or died.
In its standard and simplest formulation, the SEIR model describes the evolution in time of the number of individuals in each sector by means of a set of differential, rate equations, given by As discussed in the main text, we describe each lattice cell as a single SEIR model in which, however, the total number of individuals can change in time, due to the nonzero probability of hopping between a cell and the nearest neighboring ones. In particular, this implies that, even when considering stationary solutions to the dynamical evolution equations for the local population, they are only on the average (in time) consistent with the conservation of the total number of individuals per each cell. The various rates in Eqs. (A1) are clearly identified as follows: • ω is the "specific infection rate", that is, the probability per unit time that an individual belonging to S is infected by getting close to another individual belonging to I; • α is the probability per unit time that an individual switches from E to I, that is, the rate that the virus incubation ends and the individual becomes infectious; • γ is the rate for an individual from I to switch to R. According to the specificities of the model, γ is identified with the healing-plus-death rate for an individual from E.
Even a rather simplified set of equations such as the ones in Eqs. (A1) can be able to provide reliable informations on the infection spillover, provided the various rates at the right-hand side of Eqs. (A1) (the "parameters") are pertinently estimated. For instance, in the case of Covid-19 infection spreadout in Italy, we employ the numerical values for the parameters rigorously estimated in Ref. [35], that is, ω = 2.25[day] −1 , α = 0.33[day] −1 , γ = 0.50[day] −1 , values that appear to fit well the infection dynamics in two among the mostly populated regions in Italy: Lombardy (Northern Italy) and Campania (Southern Italy).
While, over a long enough time, the system of Eqs. (A1) always implies, given the parameters listed above, a maximum in the contagion spreading curve and an asymptotic saturation of R(t) to R ∞ = R(t → ∞) = N , throughout our work we set α = γ = 0 and, accordingly, we neglect the equation for R(t). Indeed, in the specific system we describe, we assume that the time spent by each individual inside the system is pretty short compared to the virus incubation and healing-plus-death rate. Accordingly, I(t) can become different from zero only because of people that come from outside, with a negligible change due to the switch in time from E(t) to I(t). Given the estimate of the parameter α we provide above, the variation in I(t) over time intervals as long as a few hours is typically negligible and we accordingly neglect it, by setting α to zero from the very beginning and approximating I(t) ≈ I(0). This eventually motivates assuming R(t) = 0, as well, as we do throughout our derivation. Apparently, the above assumptions eventually lead to Eqs. (5), in the specific case corresponding to taking µ = 0 in Eq. (4), as we discuss in detail in the main text of the paper.