Superstatistical approach to air pollution statistics

Air pollution by Nitrogen Oxides (NOx) is a major concern in large cities as it has severe adverse health effects. However, the statistical properties of air pollutants are not fully understood. Here, we use methods borrowed from nonequilibrium statistical mechanics to construct suitable superstatistical models for air pollution statistics. In particular, we analyze time series of Nitritic Oxide ($NO$) and Nitrogen Dioxide ($NO_2$) concentrations recorded at several locations throughout Greater London. We find that the probability distributions of concentrations have heavy tails and that the dynamics is well-described by $\chi^2$ superstatistics for $NO$ and inverse $\chi^2$ superstatistics for $NO_2$. Our results can be used to give precise risk estimates of high-pollution situations and pave the way to mitigation strategies.

Complex driven non-equilibrium systems with time scale separation are often well-described by superstatistical methods, i.e. by mixing several dynamics on distinct time scales and constructing effective statistical mechanics models out of this mixing [1,2]. The intensive parameter β that fluctuates in a superstatistical way can be an inverse temperature of the system, a fluctuating diffusion constant, or simply a local variance parameter in a given time series generated by the complex system under consideration. This formalism is in particular relevant for heterogeneous spatio-temporally varying systems and has been successfully applied to many areas of physics and beyond, most notably Lagrangian turbulence [3], defect turbulence [4], wind velocity fluctuations [5,6], share price dynamics [7], diffusion of complex biomolecules [8], frequency fluctuations in power grids [9], rainfall statistics [10,11], and many more. Here, we apply a superstatistical analysis to an important topic of high relevance, namely the spatio-temporally dynamics of air pollution in big cities. Our main example of pollutants considered in the following are N O and N O 2 , but the method can be similarly applied to other substances.
N O and N O 2 are examples of nitrogen oxides (N O x ), which are gaseous air pollutants that are primarily discharged through combustion [12]. In urban areas, the main cause of this is through automobile emissions [13]. These chemicals have been shown to aggravate asthma and other respiratory symptoms and increase mortality. Further, short term exposure to N O 2 has been shown to correlate with ischaemic and hemorrhagic strokes [12,14,15]. While N O is a very important substance within Earth's ecosystem, for example regulating the development in plants and acting as a signaling hormone in certain processes in animals, it can also cause severe environmental damage in high concentrations [16][17][18]. N O may form acids, e.g. by chemically mixing with water (H 2 O) to form Nitric Acid (HN O 3 ), or reacting further to N O 2 [19] it can cause acid rain [15]. Though N O x are not thought to be carcinogenic, they indicate the pres- * b.schaefer@qmul.ac.uk † c.beck@qmul.ac.uk ence of other harmful air pollutants [13]. The data analyzed here was taken from the publicly available London Air Quality Network (LAQN) website [20]. The site provides readings of pollutant levels at different locations throughout Greater London at different time intervals, with the 15-minute interval data analyzed here. Inspecting the measured concentration time series for both N O and N O 2 reveals that both pollutants' concentrations vary on multiple time scales, see Fig. 1.
Concentrating on the case of N O for now, we notice that the probability distribution of the measured concentration exhibits power-law tails, see Fig. 2. The distribution is well-fitted by a q-exponential [21] of the form where u are the concentration levels [µg/m 3 ] of the pollutant, and the q-and λ q -values are parameters. q can be regarded as an entropic index [21][22][23] and it is related to λ q and the mean µ by see Appendices A and B for more details. The basic idea of the superstatistical modelling approach is that a given time series, generated by a complex driven non-equilibrium system, obeys a stochastic differential equation (SDE) where the parameters of the SDE change randomly as well, but on a much longer time scale [24]. In our case, the concentration time series, whilst being q-exponentially distributed as a whole, appears to be non-stationary, in the sense that it can be divided into shorter time slices, of length T , that each have locally an exponential density with different relaxation constant λ e each, see Fig. 3. These changes in the pollutants statistics make sense due to the changing environment, e.g. because of changing weather conditions or traffic flow varying from week to week and also across seasons. Observing simple (exponential) statistics locally, while noting heavy tails in the aggregated statistics clearly suggests a superstatistical description. One of the simplest local models is based on an exponential probability distribution of the form   . Pollution statistics is not exponentially distributed, but rather q-exponentially distributed. We display the histogram of the N O data, together with the best exponential fit (green) and a q-exponential fit (red). Also displayed are the value of the exponential parameter λe and the q-and λq-parameters for the q-exponential distribution.
where β = λ e is a local relaxation parameter that fluctuates on the larger superstatistical time scale. We determine this large time scale by computing the average local kurtosis κ of a cell of length ∆t, similarly as done in [2] for locally Gaussian distributions. We use a local average kurtosis defined as where t max is the full length of the time series. The notation . . . t0,∆t indicates the expectation for the time slice of length ∆t starting at t 0 . Assuming local expo-nential distributions with kurtosis 9, we apply Equation (4) to determine the long time scale T as the special time length ∆t such that We show an example of how T is calculated in Appendix B, with a plot illustrating how the average local kurtosis depends on ∆t. Once the time slice length T is calculated, we define an intensive parameter β for each local cell, by setting β = λ e = 1/ u t0,T . The distribution of this parameter, f (β), is then obtained from a histogram, as shown in Fig. 4a. In good approximation, we find β to be χ 2 distributed, i.e., where n is the number degrees of freedom and β 0 is the mean of β.
Integrating out the β-parameter, the marginal distribution p(u) is calculated as which evaluates to and Eq. (8) is the q-exponential distribution, which we derived from a superposition of local exponential distributions.
Fitting the n-parameter to find f (β), as illustrated in Fig. 4a, we determine the q-and λ q -parameters from Eqs. (9) and (10), respectively. Inserting the χ 2 -distribution f (β), we use (7) and (8) to compute a probability density, which both approximates the fitted q-exponential and the data very well, successfully serving as a consistency check of the superstatistical approach, see Fig. 4b.
We already noticed different statistical behavior when inspecting the trajectory of the N O 2 data in Fig. 1, as compared to that of N O. Following a similar procedure as for the N O data, we find that for N O 2 the aggregated distribution is approximated by a q-Maxwell Boltzmann distribution, of the form where Z is the normalization factor given by and where q is related to the scale parameter σ q and the mean µ by see Appendix C for details. Analogously to the N O-case, we explain the aggregated distribution as a superposition of simple local distributions, here chosen as ordinary Maxwell-Boltzmann distributions. Each local Maxwell-Boltzmann distribution is defined as   where σ mb =: β is a scale parameter. We determine the long time scale T by varying ∆t so that we locally obtain the kurtosis of a Maxwell-Boltzmann distribution, which is given as see Appendix C for the calculation. If the β values again follow a χ 2 -distribution, the superpositioned Maxwell-Boltzmann distributions would lead to an exact q-Maxwell-Boltzmann distribution, see Appendix C for a detailed calculation. The results of our superstatistical analysis for the N O 2 data are displayed in Fig. 5. The χ 2 -distribution is only a rough fit and the histogram of β is instead best fitted by an inverse-χ 2 -distribution This leads to a different superstatistics, compared to the case of N O. Integrating the conditional probability p(u|β), as in (7) but with local Maxwell-Boltzmann distributions p(u|β) and an inverse- gives the marginal distribution as where K v (z) is the modified Bessel function of the second order. This inverse-χ 2 -superstatistical model leads to a very good description of the data, see Fig. 5, and it approximates q-Maxwell-Boltzmann distributions for medium concentrations but it not exactly the same. In fact, the PDF of the observed statistics decays like p(u) ∼ exp(−const √ u) for large u values, see also Appendix C.
To conclude, we have illustrated that superstatistical methods, originally introduced in non-equilibrium statis-tical mechanics and applied to fully developed turbulent flows and other complex systems, find new applications to model the statistics of air pollution. We find excellent agreement of simple superstatistical models with the experimentally measured pollution data. A main result is that different pollutants obey different types of superstatistics (in our case χ 2 for N O and inverse-χ 2 for N O 2 ). In fact, different types of local dynamics also occur (in our case locally exponential density for N O and locally Maxwell-Boltzmann for N O 2 ). Once the precise superstatistical model for a given pollutant has been identified, precise risk estimates of high pollution situations can be given, by integrating the tails of the probability distribution above a given threshold. These estimates could help to design tailor-made thresholds for suitable policies to tackle the pollution problem in cities. We discuss some explicit examples in the Appendix D. While the precise local statistics (exponential or Maxwell-Boltzmann) does significantly influence the likelihood to observe very low concentrations, the probability to observe high concentrations is determined by heavy tails in both cases.
The superstatistical approach presented here can easily be generalized and extended. For example, we may formulate an explicit dynamical process to match the trajectories in time (in form of a superstatistical stochastic differential equation), which could lead to synthetic dynamical models and the possibility to employ short-term predictions of the pollution concentration. Further research should be devoted to understand how and why certain local statistics, such as exponential or Maxwell-Boltzmann distributions arise, and to potentially link these statistics to the physical and chemical properties of the pollutants.

ACKNOWLEDGMENTS
We acknowledge support from EPSRC Grant No. EP/N013492/1. This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 840825.

Appendix A: Superstatistical cases
These appendices provide calculations of some equalities used in the main text, such as relationships between mean, variance and kurtosis for the relevant distributions. We also display how the long time scale is determined and explicitly discuss how the knowledge of concentration probability distributions could be used to set pollution thresholds and policies.
We start by providing an overview of the different cases of local distributions (exponential or Maxwell-Boltzmann) and the β-distributions of the individual scale parameters: local distr./β distr.
and the aggregated statistics as q-exponentials derived from a χ 2 distribution of β = λ e :

Calculation of λq
Here, we show how to express the scale parameter of q-exponentials λ q as a function of the mean of the distribution µ = u and the q-parameter. We start by computing the mean: . (B5)

Calculation of mean µ of Exponential Distribution
We briefly recall the relationship between the exponential scale parameter β = λ e and the mean µ as In the main text, we determined the time scale T as the time scale where the local kurtosis of the local exponential distribution takes on the value κ = 9. Here, we provide the corresponding calculation using µ = 1 β .

Superstatistical calculation of q-Exponential Distribution
We show how integrating several local exponential distributions, whose exponents β follow a χ 2 -distribution, leads to an overall q-exponential distribution: Each local distribution is given as Integrating over all of these distributions can be expressed as which we evaluate to if we identify where β 0 is the mean of β.

Determining the long time scale T
We determine the long time scale T from the time series using Equ. (4) from the main text given as (B13) The long time scale T is defined as T := ∆t such that κ(∆t) = 9. Here, we compute the average local kurtosis κ as a function of the time window ∆t and thereby determine T . Average Kurtosis (6.0233,9) Figure 6. The average kurtosis κ as given by Eq.(4) is plotted as a function of the time window ∆t (blue). The crossing between the horizontal line at κ = 9 (the kurtosis of an exponential distribution) and the κ vs. ∆t curve gives the value for ∆t = T ≈ 6.

Appendix C: N O2
In the case of N O 2 concentrations, we approximate local concentrations as Maxwell-Boltzmann distributions and the aggregated statistics is a q-Maxwell-Boltzmann distribution if σ mb is χ 2 distributed. These q-Maxwell-Boltzmann distributions would strictly arise if the scale parameters β = σ mb were following a χ 2 -distribution. Empirically, we observe that for our data, the β values follow an inverse-χ 2 -distribution instead: We obtain this inverse-χ 2 -distribution from a simple transformation of random variables when β −1 , rather than β, is χ 2 -distributed.

Calculation of σq
Here, we show how to express the scale parameter of q-Maxwell-Boltzmann distributions σ q as a function of the mean of the distribution µ = u and the q-parameter. We start by computing the mean: which gives (C5)

Calculation of mean µ of Maxwell-Boltzmann Distribution
We briefly recall the relationship between the scale parameter β = σ mb of the Maxwell-Boltzmann distribution and the mean µ as

Calculation of kurtosis κ of Maxwell-Boltzmann Distribution
In the main text, we determined the time scale T as the time scale where the local kurtosis of the local Maxwell-Boltzmann distribution takes on the value κ ≈ 3.1082.
Here, we provide the corresponding calculation

Superstatistical calculation of q-Maxwell-Boltzmann Distribution
We show how integrating several local Maxwell-Boltzmann distributions, whose exponents β follow a χ 2distribution, leads to an overall q-Maxwell-Boltzmann distribution: Each local distribution is given as Integrating over all of these distributions can be expressed as which can be evaluated to if we identify where β 0 is the mean of β.

Calculation of inverse-χ 2 superstatistical distribution based on superimposed Maxwell-Boltzmann distributions
Instead of following a χ 2 -distribution, our data indicates that the scale parameters β of the local Maxwell-Boltzmann distribution rather follow an inverse-χ 2distribution. Here, we derive the probability density function for such a superposition. Taking the conditional density p(u|β) as given in (C8) and the inverseχ 2 -distribution for the distribution f (β), the probability density p(u) is given as This PDF has still heavy tails, which decay as p(u) ∼ exp(−const √ u) for large u values.

Extra Figures
We repeat Fig. 3 from the main text for local N O 2 concentrations, i.e. analyzing brief periods of the N O 2 trajectory. Fig. 7 illustrates that in good approximation the local behavior follows Maxwell-Boltzmann distributions with fluctuating variance. Further, we also determine the long time scale T for the local Maxwell-Boltzmann distributions by setting T := ∆t for the local kurtosis κ(∆t) ≈ 3.1082, following Eq. (4) from the main text again. See Figs. 7 and 8.

Appendix D: Thresholds and policies
Policies to tackle pollution in cities often focus on thresholds and compliance with these thresholds. This then often leads to pollution just below the threshold in many regions. However, in many cases it may be more reasonable to reduce the overall exposure to pollutants [26]. Weather is responsible for many fast variations of the air pollution [26]. The focus on threshold is especially dangerous as there is strong evidence that even small concentrations of pollutants, below national standards, increase the death rate [27]. The UK government (Dept. of Health & Social Care) calls air pollution a "health emergency" and the World Health Organization (WHO) judges the situation similarly on a global scale [28]. Note that about 9500 premature deaths are attributed to air pollution in London alone every year [29]. On a global scale this number rises to about 4.9 million premature deaths, making air pollution the fifth leading cause of mortality worldwide [30].
So let us finally sketch how the knowledge of the probability density function (PDF) allows estimates of threshold crossings and total exposure. Setting thresholds on pollutant concentrations is a popular tool when setting pollution policies. Simultaneously, data coverage might not be very good in all locations. Especially if data is not available for the full time period of interest but only for a few months, estimates on how frequently thresholds are violated are difficult to obtain. As soon as we are able to determine the PDF p(u) for the concentration levels u, we can easily compute the number of days where thresholds are violated as This integral can be easily evaluated numerically, using q-exponential distributions for N O, see Eq. (8) from the main text, and Eq. (17) from the main text for N O 2 , consistent with experimental observations. The number of threshold violations explicitly depends on the parameters λ q and q, which have to be determined from the data sets but can then be compared between different locations and time periods. Having these two parameters explicitly disentangles two aspects: The rate of (rare) extreme events, encoded in q and the overall variability of the pollutant concentration, encoded in λ q (or λ mb for N O 2 ).
Complementary to thresholds, policies could instead focus on the total pollution exposure (TPE) describing the average amount citizens are exposed to. Again, this is easily computed from the PDF as T P E = 365