Transport in disordered systems: the single big jump approach

In a growing number of strongly disordered and dense systems, the dynamics of a particle pulled by an external force field exhibits super-diffusion. In the context of glass-forming systems, supercooled glasses, and contamination spreading in porous media, it was suggested to model this behavior with a biased continuous time random walk. Here we analyze the plume of particles lagging far behind the mean, using the single big jump principle. Revealing the mechanism of the anomaly, we show how a single trapping time, the largest one, is responsible for the rare fluctuations in the system. These atypical fluctuations still control the behavior of the mean square displacement, which is the most basic quantifier of the dynamics in many experimental setups. We show how the initial conditions, describing either a stationary state or a non-equilibrium case, persist forever in the sense that the rare fluctuations are sensitive to the initial preparation. To describe the fluctuations of the largest trapping time, we modify Fréchet's law from extreme value statistics, taking into consideration the fact that the large fluctuations are very different from those observed for independent and identically distributed random variables.

I. INTRODUCTION
Diffusion and transport in a vast number of weakly disordered systems follow Gaussian statistics. As a consequence, the packet of spreading particles is symmetric with respect to (w.r.t.) the mean $\langle x(t)\rangle$. In contrast, for strongly disordered systems, the packet is found to be non-Gaussian and asymmetric [1,2]. Starting at $x = 0$, the slowest particles are trapped by the disorder, resulting in a plume of particles lagging far behind the mean $\langle x(t)\rangle$, i.e., the fluctuations are large and break the symmetry (see Fig. 1). Deep energetic and entropic traps, which hinder the motion, are expected to slow down the diffusion. The most frequently used quantifier of diffusion processes is the mean square displacement (MSD). However, in the presence of deep traps, the MSD exhibits super-diffusion. This is not an indication of a fast process; instead, it is due to the very slow particles lagging far behind the mean, which lead to very large fluctuations of the displacement. Thus the slow dynamics of a minority of particles leads to enhanced fluctuations and symmetry breaking w.r.t. $\langle x(t)\rangle$. Such processes are widespread; in particular, many works focused on the surprising discovery of super-diffusion in dense environments [3][4][5][6][7][8]. This was originally investigated in the context of diffusion in disordered materials [1][2][3][9][10][11][12], contamination spreading in porous media [13][14][15][16], and simulations of biased particles, pulled by a constant force, in glass-forming systems [4] and supercooled liquids [6].
Here we investigate the spreading of the packet of particles, using the biased continuous time random walk (CTRW) [17][18][19]. Our goal is to characterize precisely the mechanism leading to the large fluctuations. We promote the idea of the single big jump principle. This means that one and only one trapping time is responsible for the rare fluctuations. Thus in this work we show the relation between the theory of extreme value statistics and anomalous transport. For that we need to modify the well-known Fréchet law [20,21], which describes extreme events for uncorrelated systems. Similarly, we present an analysis of the far tail of the spreading packet of particles, showing the deviations from the Lévy statistics that describe the bulk. This is done for both non-stationary and equilibrium initial conditions. While the typical fluctuations in our systems are not sensitive to the initial conditions, the rare fluctuations are, and this we believe is a general theme for systems with fat-tailed statistics.

[Figure caption] The left plume of particles is due to the long trapping times, which imply that some particles move far slower than the mean $\langle x(t)\rangle$. Somewhat paradoxically, these slow particles lead to super-diffusion, as the MSD grows like $t^{3-\alpha}$ [1]. In this work we show how rare events in this process are determined by the largest trapping times, which in turn control the behavior of the MSD. The typical fluctuations are defined for $x \sim \langle x(t)\rangle$, i.e., close to the peak of the packet, while we focus on the rare fluctuations shown by the red solid line. The parameters are $a = 5$, $\sigma = 1$, and $\alpha = 1.5$; see Eqs.
We will relate the position of the random walker $x(t)$ to the longest trapping interval $\tau_{\max}$. The typical fluctuations of both observables were considered previously and were shown to behave as if they were composed of independent and identically distributed (IID) events, namely the Lévy stable law and Fréchet's law hold for typical fluctuations [Eqs. (15) and (31) below]. We show below how these laws must be corrected when dealing with the far tail. In turn, the standard Cramér theorem from large deviation theory [22], which identifies the large fluctuations with the accumulation of many small steps, fails in the case studied here. More precisely, we claim below that one can obtain two limiting laws for both $x(t)$ and $\tau_{\max}$: the first is the just-mentioned Lévy or Fréchet law, and the second is an infinite density, i.e., a non-normalized state.
What is the principle of the big jump? Many works have focused on the dominance of one big jump in a stochastic process. For example, consider the activation of a particle over a barrier, modeled with an over-damped Langevin equation. If the noise is uncorrelated and Gaussian, the escape is achieved by many small displacements, which accumulate to give the rare escape from the well. On the other hand, if the noise is of the Lévy type, a single event giving rise to a large fluctuation dominates the escape [23]. Similar ideas hold for the analysis of random partition functions and were used in the study of the Sinai model [24,25]. In the context of a run-and-tumble model and combination phenomenon these insights are well understood [26,27]. Roughly speaking, one can see that the largest summand is of the order of the total sum, a theme which is already known.
To be more specific, consider $N$ random variables $\{\vartheta_1, \vartheta_2, \cdots, \vartheta_N\}$. Let $\vartheta_{\max}$ be the maximum of the set and $S_N = \sum_{i=1}^{N} \vartheta_i$ the sum. The dominance effect, found for example if the $\vartheta_i$ are IID random variables drawn from a fat-tailed distribution, is the claim that $S_N$ and $\vartheta_{\max}$ are of the same order [28]. More exactly, $S_N$ and $\vartheta_{\max}$ scale with $N$ in the same way. A more profound case is when the distribution of $\vartheta_{\max}$ is the same as that of $S_N$, up to a trivial constant and in a limit to be specified later. This is what we and others refer to as the principle of big jump. This statement was shown to be valid for sub-exponential IID variables [28]; see also [29,30]. In the IID case, the statement is valid for any $N$, so the limit $N \to \infty$ is not required at all. Here our aim is to show how the big jump principle holds for diffusion in disordered systems, using the CTRW model. We will modify the principle to discuss the largest trapping time and its relation to the position of the random walker, so the principle discussed below is very different from the original; in particular, we depart from the IID case.
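The IID statement above can be probed numerically. The following is a minimal sketch, not part of the original analysis: it assumes a pure Pareto law with survival function $(1/\vartheta)^\alpha$ for $\vartheta \ge 1$, and the function names and parameter values are illustrative only. It counts how often the sum $S_N$ and the maximum $\vartheta_{\max}$ exceed the same large threshold.

```python
import random


def draw_pareto(rng, alpha, scale=1.0):
    """Inverse-transform sample with survival function (scale / x)**alpha, x >= scale."""
    u = 1.0 - rng.random()  # uniform on (0, 1], so u is never zero
    return scale * u ** (-1.0 / alpha)


def tail_counts(n_terms, alpha, threshold, n_trials, seed=0):
    """Count trials where the sum, and where the maximum, exceed `threshold`."""
    rng = random.Random(seed)
    sum_hits = 0
    max_hits = 0
    for _ in range(n_trials):
        draws = [draw_pareto(rng, alpha) for _ in range(n_terms)]
        if sum(draws) > threshold:
            sum_hits += 1
        if max(draws) > threshold:
            max_hits += 1
    return sum_hits, max_hits
```

Since all terms are positive, every trial counted in `max_hits` is also counted in `sum_hits`. If the big jump picture applies, the two counts should approach each other as the threshold grows, e.g. `tail_counts(10, 1.5, 300.0, 100000)`.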
In [31,32], we promoted a rate method for the big jump approach, which was used to predict non-analytical behaviors of the far tail of the Lévy walk process and the so-called quenched Lévy-Lorentz gas model. In these works, the very basic approach is different from what we have here; see Eq. (9) below. Further, the connection to the modified Fréchet law and the difference between stationary and non-equilibrium initial conditions are discussed here for the first time.
The organization of the paper is as follows. In Sec. II, we outline the single big jump principle and give the corresponding definitions. Non-equilibrium and equilibrium initial conditions are investigated in Secs. III and IV, respectively. Finally, we conclude the manuscript with a discussion. We also present simulation results confirming the theoretical predictions.

A. Model and definition
We consider two types of biased CTRWs [17-19, 33, 34]: the first is initiated at time $t = 0$, while the second is an equilibrium process. These two models differ in the statistics of the first trapping time, but otherwise they are identical. Let $\phi(\tau)$ be the probability density function (PDF) of all the sojourn times, while $h(\tau)$ is the PDF of the first one. It should be emphasized that the correct choice of $h(\tau)$ depends on the initial conditions. For the widely investigated non-equilibrium initial condition, we assign $h_{\rm or}(\tau) = \phi(\tau)$ [35]. This process is sometimes called an ordinary renewal process, hence we use the subscript 'or' to denote this type of initial condition. In the equilibrium situation we use [36][37][38][39]
$$h_{\rm eq}(\tau) = \frac{1}{\langle\tau\rangle}\int_\tau^\infty \phi(y)\,dy, \tag{1}$$
where $\langle\tau\rangle = \int_0^\infty \tau\phi(\tau)\,d\tau$ is the mean trapping time. We will soon explain the physical meaning of these processes.

[Figure caption, fragment] The bottom panel describes an ongoing equilibrium process, i.e., a stationary case where the dynamics started long before the start of the observation (see the blue dashed line for an illustration). Here $t_i$ is the time when the $i$-th event occurs, and the backward recurrence time is $B_t = t - t_4$. The only difference between these two processes is the statistics of the waiting time of the first step. However, due to the disorder, in particular the power-law trapping time distribution, this difference crucially influences the rare events and also the behavior of the MSD.
We are interested in the position of the random walker $x(t)$, which starts at $x = 0$ when $t = 0$. After waiting for a time $\tau_1$, drawn from $h(\tau)$, the particle makes a spatial jump. The PDF of the jump size $\chi$ is Gaussian,
$$f(\chi) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left[-\frac{(\chi-a)^2}{2\sigma^2}\right], \tag{2}$$
where $a > 0$ is the average size of the jumps. Physically this is determined by an external constant force field that induces a net drift. From Eq. (2) the Fourier transform of $f(\chi)$ is $f(k) = \exp(ika - \sigma^2k^2/2)$. This yields
$$f(k) \sim 1 + ika - \frac{(\sigma^2 + a^2)k^2}{2} \tag{3}$$
with $k \to 0$. After the jump, say to $x_1$, the particle pauses for a time $\tau_2$ drawn from $\phi(\tau)$. Then the process is renewed. We consider the widely applicable case where the PDF of trapping times has the fat tail
$$\phi(\tau) \sim \alpha\frac{(\tau_0)^\alpha}{\tau^{1+\alpha}} \tag{4}$$
with $1 < \alpha < 2$. As is well known, such a fat-tailed distribution yields a wide range of anomalous behaviors. See [10,17] for reviews on the CTRW, and the further discussion of physical systems below. From the Abelian theorem, the Laplace transform ($\tau \to s$) of $\phi(\tau)$ is
$$\phi(s) \sim 1 - \langle\tau\rangle s + b_\alpha s^\alpha \tag{5}$$
with $b_\alpha = (\tau_0)^\alpha|\Gamma(1-\alpha)|$ and $s \to 0$. The leading term is the normalization condition. We focus on $1 < \alpha < 2$, where the mean $\langle\tau\rangle$ of the waiting time is finite, but not the variance. The term $s^\alpha$ comes from the long tail of the waiting times (and it is responsible for the deviations from the normal behavior). Specific values of $\alpha$ for a range of physical systems and models are given in [10,40].
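As an illustration of the trapping-time statistics just described, here is a hedged sketch that assumes one specific realization of the fat tail: a pure Pareto PDF $\alpha(\tau_0)^\alpha/\tau^{1+\alpha}$ for $\tau \ge \tau_0$ (the text only fixes the large-$\tau$ behavior). It samples trapping times by inverse transform and compares a numerical Laplace transform with the small-$s$ form $1 - \langle\tau\rangle s + b_\alpha s^\alpha$. All function names are hypothetical.

```python
import math
import random


def mean_trapping_time(alpha, tau0):
    """Mean of the Pareto PDF alpha * tau0**alpha / tau**(1 + alpha), tau >= tau0."""
    return alpha * tau0 / (alpha - 1.0)


def draw_trapping_time(rng, alpha, tau0):
    """Inverse-transform sample: the survival probability is (tau0 / tau)**alpha."""
    u = 1.0 - rng.random()  # uniform on (0, 1]
    return tau0 * u ** (-1.0 / alpha)


def laplace_phi(s, alpha, tau0, n=100000):
    """Numerical Laplace transform of phi via the substitution u = (tau0 / tau)**alpha."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h  # midpoint rule on (0, 1)
        total += math.exp(-s * tau0 * u ** (-1.0 / alpha))
    return total * h


def small_s_expansion(s, alpha, tau0):
    """1 - <tau> s + b_alpha s**alpha with b_alpha = tau0**alpha |Gamma(1 - alpha)|."""
    b_alpha = tau0 ** alpha * abs(math.gamma(1.0 - alpha))
    return 1.0 - mean_trapping_time(alpha, tau0) * s + b_alpha * s ** alpha
```

For small $s$ the two functions should agree up to terms of order $s^2$, for instance `laplace_phi(0.01, 1.5, 1.0)` against `small_s_expansion(0.01, 1.5, 1.0)`.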
For an equilibrium initial condition the rate of performing a jump is stationary in the sense that for any time $t$ the average number of jumps is
$$\langle N(t)\rangle = \frac{t}{\langle\tau\rangle}, \tag{6}$$
so the effective rate $1/\langle\tau\rangle$ is a constant. In contrast, for the ordinary renewal process we have $\langle N(t)\rangle \sim t/\langle\tau\rangle$ only in the long time limit, hence for short times the two processes are not identical. Since the mean $\langle\tau\rangle$ is finite, one would naively expect that the statistical laws for the two processes will be identical in the long time limit. While this is correct for some observables, for others it is false. The prominent example is the MSD. In particular, for the calculation of the rare events one must make the distinction between the two models; see below. An equilibrium initial condition is found when the particle is inserted in the medium long before the measurement begins, more specifically when the process starts at some time $-t_a$ before the measurement begins at time $t = 0$, in the limit $t_a \to \infty$. All along we consider the displacement of the particle compared to its initial position, namely we assign $x(0) = 0$. For a schematic presentation of the random processes see Fig. 3. Non-equilibrium initial conditions are found when the processes are initiated at time $t = 0$. For example, in the Scher-Montroll theory [41], a flash of light excites charge carriers at time $t = 0$ and then the process of diffusion begins, so we have an ordinary process. Mathematically, these two models differ merely by the statistics of the waiting time of the first step, and hence it is interesting to compare them, to see whether this seemingly small modification of the model is important in the long time limit. For a Poisson process the two models are identical. In contrast, for the heavy-tailed processes under investigation, we find from Eqs. (1) and (4) that
$$h_{\rm eq}(\tau) \sim \frac{(\tau_0)^\alpha}{\langle\tau\rangle\,\tau^{\alpha}}. \tag{7}$$
As $1 < \alpha < 2$, we see that the average of the first waiting time diverges (but not that of the second, etc.).
This means that in a stationary state the process is slower compared with the ordinary case, hence we expect that in this case particles will lag even further behind the mean displacement. Let us discuss the applicability of the CTRW model. As mentioned, Scher and Montroll showed how this theory describes the diffusion of charge carriers in disordered media. In some experiments, one finds $\alpha = T/T_g$, where $T$ is the temperature and $T_g$ is a measure of the disorder. This is also the case for the Bouchaud trap model describing glassy dynamics [10]. In the context of contamination spreading, the biased CTRW is used with $\alpha = 1.73$ [16]. Based on numerical simulations, Winter and Schroer showed the super-diffusive behavior and related the dynamics to the biased CTRW [4,6]. In these systems one expects to find normal diffusion at very long times. There are also many examples of CTRW without bias [17,40,42,43]. It would be interesting to add a bias in these systems to compare with the effect discussed here.

[Figure caption] Here we choose $a = 1$, $\alpha = 1.5$, $\sigma = \sqrt{2}$, and τ = 1. The dots are simulation results obtained by generating $10^5$ trajectories, and the red solid line is obtained from Eq. (9) by switching the random variables to a dimensionless form. The evidently strong correlations, circled on the right panel, indicate that a single trapping event is responsible for the statistics of rare events.
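The difference between the two initial conditions can be sketched by sampling only the first waiting time. The code below is an illustration under an explicit assumption: $\phi(\tau)$ is taken as a pure Pareto law with cutoff $\tau_0$, so that the equilibrium first-waiting-time PDF is flat below $\tau_0$ and decays as $\tau^{-\alpha}$ above it. The function names are hypothetical.

```python
import random


def draw_ordinary_first_wait(rng, alpha, tau0):
    """h_or = phi: Pareto survival (tau0 / tau)**alpha for tau >= tau0."""
    u = 1.0 - rng.random()  # uniform on (0, 1]
    return tau0 * u ** (-1.0 / alpha)


def draw_equilibrium_first_wait(rng, alpha, tau0):
    """h_eq(tau) = Psi(tau) / <tau>, where Psi is the survival function of phi."""
    mean_tau = alpha * tau0 / (alpha - 1.0)
    u = rng.random()  # uniform on [0, 1)
    if u < tau0 / mean_tau:
        # Psi = 1 below tau0, so h_eq is flat there and its CDF is tau / <tau>.
        return u * mean_tau
    # Above tau0 the survival function of h_eq is (tau0 / tau)**(alpha - 1) / alpha.
    return tau0 * (alpha * (1.0 - u)) ** (-1.0 / (alpha - 1.0))


def fraction_not_yet_jumped(draw, t, n, alpha=1.5, tau0=1.0, seed=0):
    """Fraction of n walkers whose first waiting time exceeds the measurement time t."""
    rng = random.Random(seed)
    return sum(draw(rng, alpha, tau0) > t for _ in range(n)) / n
```

Under this assumption the fraction of walkers that have not moved by time $t$ decays as $(\tau_0/t)^{\alpha-1}/\alpha$ for the equilibrium start but as $(\tau_0/t)^{\alpha}$ for the ordinary start, which is one way to see why the equilibrium packet lags further behind.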
Transport and diffusion processes, either normal or anomalous, are composed of a large number of displacements. Hence statistical laws, like the central limit theorem, are useful tools describing universal aspects of the phenomenon. In our case a single event controls the statistics of the spreading packet at its tail. Let $\{\tau_1, \tau_2, \cdots, \tau_N, B_t\}$ be the set of waiting times between jump events, where $\sum_{i=1}^{N} \tau_i + B_t = t$ is the measurement time. Here $B_t$, called the backward recurrence time, is the time elapsing between the moment of the last jump, $t_N = \sum_{i=1}^{N} \tau_i$, and the measurement time $t$, and $N$ is the random number of jumps in $(0, t)$ [44]. We define the largest waiting time according to
$$\tau_{\max} = \max\{\tau_1, \tau_2, \cdots, \tau_N, B_t\}. \tag{8}$$
One main conclusion of this manuscript is that the statistics of $\tau_{\max}$ determines the fluctuations of the position $x(t)$ of the biased random walker. This holds for rare fluctuations of $x(t)$, which still control the behavior of the most typical observable in the field: the MSD. Due to the fat-tailed distribution of the trapping time $\phi(\tau)$, and using basic arguments from extreme value statistics of IID random variables, one expects that the typical fluctuations scale like $\tau_{\max} \propto t^{1/\alpha}$, while for a thin-tailed distribution of waiting times, e.g., $\phi(\tau) = \exp(-\tau)$, we have $\tau_{\max} \propto \log(t)$ [20]. For the latter example '$\propto$' means that $\tau_{\max}$ is of the order of $\log(t)$, and similarly for the former case. While we are not dealing with IID random variables, the constraint is weak in the sense that it does not modify the typical fluctuations; see below and Refs. [28,45]. Note that all these scalings, i.e., $\tau_{\max} \propto t^{1/\alpha}$ and $\tau_{\max} \propto \log(t)$, describe typical fluctuations, sometimes called bulk fluctuations. These fluctuations are described by normalized densities, specified by Fréchet's law and the Gumbel law. On the contrary, here we focus on rare fluctuations, that is to say, the case where $\tau_{\max}$ and $t$ are comparable.
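The two IID scalings quoted above can be checked exactly, without simulation, by computing the median of the maximum of $n$ IID variables. This is a sketch under stated assumptions: the fat-tailed case is taken as a pure Pareto law with survival $(\tau_0/\tau)^\alpha$, the thin-tailed case as $\phi(\tau) = \exp(-\tau)$, and $n$ plays the role of the number of renewals, which grows linearly with $t$. The function names are illustrative.

```python
import math


def median_of_max_pareto(n, alpha, tau0=1.0):
    """Median of the max of n IID Pareto variables, survival (tau0 / tau)**alpha."""
    # Solve [1 - (tau0 / L)**alpha]**n = 1/2 for L.
    p = -math.expm1(-math.log(2.0) / n)  # 1 - 2**(-1/n), computed stably
    return tau0 * p ** (-1.0 / alpha)


def median_of_max_exponential(n):
    """Median of the max of n IID variables with PDF exp(-tau)."""
    # Solve [1 - exp(-L)]**n = 1/2 for L.
    p = -math.expm1(-math.log(2.0) / n)
    return -math.log(p)
```

Increasing `n` by a factor of 1000 multiplies the Pareto median by roughly $1000^{1/\alpha}$, whereas the exponential median only shifts by roughly $\log(1000)$.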
When Eq. (4) holds, for the biased CTRW we will demonstrate that for small $x$, i.e., the left plume in Fig. 1,
$$x \doteq a\,\frac{t - \tau_{\max}}{\langle\tau\rangle}, \tag{9}$$
where "$\doteq$" indicates that the random variables on both sides follow the same distribution. However, the PDFs of the position $x$, when $x$ is not small, and of $\tau_{\max}$ are far from identical; they will be calculated below. The meaning of small $x$ and large $\tau_{\max}$ will soon become clear when we formulate the problem more precisely. For now, based on Figs. 1 and 2, we see that Eq. (9) works well when $x \ll \langle x(t)\rangle$ and $\tau_{\max} \simeq t$, for example when $x < 50 \ll \langle x(t)\rangle \simeq 1667$ in the bottom panel of Fig. 1, or for the trajectory in the left panel of Fig. 2, where $\tau_{\max} = 988$ when $t = 1000$ and then $x \simeq 40 \ll \langle x(t)\rangle \simeq 1667$. Eq. (9) means that the distribution of $x \ll \langle x(t)\rangle$ is the same as that of the average jump size $a$ times the typical number of jumps made in $(0, t - \tau_{\max})$, which is the time 'free' of the longest waiting time. A correlation plot based on Eq. (9) is shown in Fig. 4. Using simulations of the ordinary CTRW process, we generate trajectories, record the position of the random walker at time $t$, and record $\tau_{\max}$. Then we plot the random entries, observing that for small $x$ there is a perfect correlation, as predicted by Eq. (9). Such correlation plots indicate that Eq. (9) works. We call this the principle of big jump, and it is valid for both stationary and ordinary processes. Here the big jump means a large trapping time; see the further discussion of the term big jump and its origin in the discussion and summary. Now we will analytically derive Eq. (9) and discuss its consequences. For that we obtain the distribution of $\tau_{\max}$ and then of $x$.
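The correlation test described above can be sketched as follows. This is an illustrative simulation, not the authors' code: it assumes Pareto waiting times with cutoff $\tau_0$, Gaussian jumps with mean $a$ and standard deviation $\sigma$, and it selects rare trajectories by the hypothetical criterion $\tau_{\max} > y_{\min} t$. For each selected trajectory it pairs the measured $x$ with $a(t - \tau_{\max})/\langle\tau\rangle$.

```python
import math
import random


def ordinary_ctrw(t, a, sigma, alpha, tau0, rng):
    """One ordinary biased CTRW; returns (x(t), tau_max) with tau_max including B_t."""
    clock, x, tau_max = 0.0, 0.0, 0.0
    while True:
        wait = tau0 * (1.0 - rng.random()) ** (-1.0 / alpha)  # Pareto waiting time
        if clock + wait > t:
            break
        clock += wait
        tau_max = max(tau_max, wait)
        x += rng.gauss(a, sigma)  # biased Gaussian jump
    backward_recurrence = t - clock  # B_t, time since the last jump
    return x, max(tau_max, backward_recurrence)


def rare_event_pairs(n_traj, t, a=5.0, sigma=1.0, alpha=1.5, tau0=1.0, y_min=0.5, seed=1):
    """Collect (x, a (t - tau_max) / <tau>) for trajectories with tau_max > y_min * t."""
    rng = random.Random(seed)
    mean_tau = alpha * tau0 / (alpha - 1.0)
    pairs = []
    for _ in range(n_traj):
        x, tau_max = ordinary_ctrw(t, a, sigma, alpha, tau0, rng)
        if tau_max > y_min * t:
            pairs.append((x, a * (t - tau_max) / mean_tau))
    return pairs


def pearson(pairs):
    """Pearson correlation coefficient of a list of (u, v) pairs."""
    n = len(pairs)
    mu = sum(u for u, _ in pairs) / n
    mv = sum(v for _, v in pairs) / n
    cov = sum((u - mu) * (v - mv) for u, v in pairs)
    var_u = sum((u - mu) ** 2 for u, _ in pairs)
    var_v = sum((v - mv) ** 2 for _, v in pairs)
    return cov / math.sqrt(var_u * var_v)
```

A call such as `pearson(rare_event_pairs(3000, 200.0))` should give a clearly positive correlation among the rare trajectories. At finite $t$ the agreement is only approximate, because the number of jumps in the remaining time $t - \tau_{\max}$ still fluctuates.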
Remark 1 Our main results in this manuscript are Eqs. (27), (37), (46), and (56), which give explicit formulas for the PDFs of $x$ and $\tau_{\max}$ for the two types of processes under investigation. In [31] we promoted a rate formalism to treat similar problems, e.g., the Lévy walk. Here the focus is on the exact calculation of the statistics of rare events for both $\tau_{\max}$ and $x$, and on the relation between these two random variables, i.e., Eq. (9).

C. The statistics of τmax
Let us proceed to derive the general formulas describing the statistics of the longest waiting time, which are valid for both the ordinary and equilibrium renewal processes. The case of an ordinary renewal process was considered previously by Godrèche, Majumdar, and Schehr in Ref. [45]. They investigated the typical fluctuations of $\tau_{\max}$, which, as explained below, behave identically to a classical case of extreme value statistics, namely Fréchet's law holds for typical fluctuations. Here our goal is very different: we aim to obtain the rare events, namely to investigate the behavior when $\tau_{\max}$ is of the order of $t$. In this case the fluctuations differ greatly from the IID case.
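To make the contrast between typical and rare fluctuations concrete, the sketch below uses the IID idealization only. It assumes $n = t/\langle\tau\rangle$ independent Pareto waiting times with survival $(\tau_0/\tau)^\alpha$ and compares the exact distribution of their maximum with the Fréchet form $\exp[-n(\tau_0/L)^\alpha]$. The function names are hypothetical.

```python
import math


def iid_max_cdf(L, n, alpha, tau0):
    """Exact Prob(max < L) for n IID Pareto variables with survival (tau0 / tau)**alpha."""
    if L <= tau0:
        return 0.0
    return (1.0 - (tau0 / L) ** alpha) ** n


def frechet_cdf(L, n, alpha, tau0):
    """Frechet form exp[-n (tau0 / L)**alpha], the large-n limit of iid_max_cdf."""
    return math.exp(-n * (tau0 / L) ** alpha)


def frechet_weight_beyond(t, alpha, tau0, mean_tau):
    """Probability the Frechet law assigns to tau_max > t when n = t / <tau>."""
    return 1.0 - frechet_cdf(t, t / mean_tau, alpha, tau0)
```

The Fréchet form is an excellent approximation for typical values of $L$, yet `frechet_weight_beyond` is strictly positive, i.e., the IID law assigns weight to $\tau_{\max} > t$, which is impossible in the renewal process. This is the regime where the law has to be modified.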
We define the probability that $\tau_{\max}$ is smaller than $L$,
$$F(t, L) = {\rm Prob}(\tau_{\max} \le L). \tag{10}$$
The corresponding PDF is $P_{\tau_{\max}}(t, L)$ and, as usual, $F(t, L) = \int_0^L P_{\tau_{\max}}(t, y)\,dy$. Clearly the probability depends on the measurement time $t$, and this dependence is especially important for fat-tailed waiting time PDFs. It is helpful to introduce the joint probability distribution of $\tau_{\max}$ and the number of renewals $N$, $F_n(t, L) = {\rm Prob}(\tau_{\max} \le L, N = n)$. Here $h(\cdot)$ in the third line of Eq. (11) is governed by the process we investigate, and $\Phi(t)$ is determined by the type of the process and the number of renewals. Taking the Laplace transform w.r.t. $t$, we find $F_n(s, L)$. The case $n = 0$ corresponds to realizations with no renewals during the time interval $(0, t)$. One can check that $\sum_{n=0}^{\infty} F_n(s, L \to \infty) = 1/s$. This means that the density of $\tau_{\max}$ is normalized. The sum over $n$ from zero to infinity gives $F(s, L)$ [Eq. (14)], in which the first term is related to the survival probability and the second term corresponds to the probability that at least one renewal happened in $(0, t)$. For the equilibrium renewal process, we insert Eq. (1) into Eq. (14), while for the ordinary case we use $h_{\rm or}(\tau) = \phi(\tau)$. Below, from Eq. (14) we will calculate the far tail of the distribution of $\tau_{\max}$ for the two different processes, i.e., the ordinary process and the equilibrium one, and prove that Eq. (9) is valid for both cases.

III. THE ORDINARY PROCESS
Here we consider the ordinary renewal process and the ordinary CTRW to establish the relation between the rare events of the position and the largest waiting time. The aim is to investigate the PDF of $\tau_{\max}$ for the non-equilibrium process, denoted $P_{{\rm or},\tau_{\max}}(t, L)$. We first treat the problem heuristically to calculate the typical fluctuations. Let $\langle N\rangle = t/\langle\tau\rangle$ be the average number of renewals in the long time limit. For simplicity, we neglect $B_t$ in Eq. (8) and ignore the constraint $\sum_{i=1}^{N} \tau_i + B_t = t$; further, we replace the random $N$ with $\langle N\rangle$. This means that we treat the problem as if the waiting times were IID random variables, an approximation which turns out to be insufficient in our case. Still ignoring the correlation, we have [45]
$${\rm Prob}(\tau_{\max} < L) = \left[{\rm Prob}(\tau_i < L)\right]^{\langle N\rangle} \sim \exp\left[-\langle N\rangle\left(\frac{\tau_0}{L}\right)^{\alpha}\right]. \tag{15}$$
This is the well-known Fréchet distribution [21]. A closer look reveals a drawback of this treatment of the typical fluctuations, since within this approximation the PDF of $\tau_{\max}$ is $P_{\tau_{\max}}(t, L) \sim \alpha\langle N\rangle(\tau_0)^\alpha/L^{1+\alpha}$ for $L \to \infty$. However, in our setting $\tau_{\max} \le t$. This means that we must modify Fréchet's law at its tail; in other words, the constraint that the sum of all the waiting times and the backward recurrence time is equal to the measurement time $t$ comes into play when $\tau_{\max} \propto t$, as expected. Note that the number of renewals in our case is a random variable; see Fig. 5. Now we use an exact solution of the problem to calculate the rare events. Considering the non-equilibrium renewal process, we insert $h(\tau) = \phi(\tau)$ into Eq. (13) to get [45] the Laplace transform of $\int_L^\infty P_{{\rm or},\tau_{\max}}(z)\,dz$ [Eq. (16)], expressed through a function $G(s, L)$ [Eq. (17)] and the survival probability [Eq. (18)]. We are interested in the limit $s \to 0$ (corresponding to long measurement time) and $L \to \infty$ in such a way that $sL$ remains a constant. As mentioned, the typical fluctuations are described by Fréchet's law Eq. (15), and here instead we consider the rare fluctuations. Using Eq. (18), for $L \to \infty$, Eq. (17) reduces to Eq. (19), where we have used a limit valid for $s \to 0$ [Eq. (20)]. From Eq. 
(19) we see that $G(s, L)$ is large for $sL \to$ constant and $\alpha > 1$. According to Eq. (19), we find Eq. (21), which can also be derived directly from Eq. (17). Utilizing Eq. (16), and after some simple rearrangements, $P_{{\rm or},\tau_{\max}}(s, L)$ is expressed through $G(s, L)$ and $\partial G(s, L)/\partial L$ [Eq. (23)], where we used the relation that $P_{{\rm or},\tau_{\max}}(t, L)$ is the derivative of Eq. (22) w.r.t. $L$. Combining Eqs. (21) and (23), we obtain Eq. (24). Note that the first two terms on the right-hand side of Eq. (24), namely $1/G(s, L)$ and $\alpha/[G(s, L)sL]$, are comparable when $sL \to$ constant. Hence from Eqs. (19) and (24), we get Eq. (25). Taking the inverse Laplace transform $s \to t$ of Eq. (25) gives our second main result, with the scaling $L \propto t$,
$$P_{{\rm or},\tau_{\max}}(t, L) \sim \frac{(\tau_0)^\alpha}{\langle\tau\rangle\, t^{\alpha}}\, I_{{\rm or},\alpha}\!\left(\frac{L}{t}\right), \tag{27}$$
where
$$I_{{\rm or},\alpha}(y) = \alpha y^{-\alpha-1} - (\alpha - 1)y^{-\alpha} \tag{28}$$
with $0 < y < 1$. This scaling solution describes the far tail of the distribution, where Fréchet's law does not work. In fact, these two laws are related, as the $y^{-\alpha-1}$ term matches the far tail of the Fréchet law, as it should. Since $0 < y < 1$ implies $\tau_{\max} < t$, moments of $\tau_{\max}$ are computed w.r.t. this scaling solution. In contrast, the Fréchet law gives a diverging variance of $\tau_{\max}$, which is certainly not possible since $\tau_{\max}$ is bounded. The expression in Eq. (27) is an infinite density describing a non-normalized limiting law. More exactly, $I_{{\rm or},\alpha}(\cdot)$ is not normalizable, and the moments of $\tau_{\max}$ of order $q > \alpha$ are calculated w.r.t. this non-normalized state. For more details on infinite densities see Refs. [46][47][48][49][50].

B. The rare fluctuations of the position

We now investigate the distribution of $x$, proving the validity of the big jump principle Eq. (9). Let $P_{\rm or}(x, t)$ be the PDF of the position $x$ at time $t$, with the Fourier-Laplace transform
$$P_{\rm or}(k, s) = \frac{1 - \phi(s)}{s}\,\frac{1}{1 - \phi(s)f(k)}. \tag{29}$$
Here $f(k)$ is the Fourier transform of the jump length PDF $f(\chi)$, and $\phi(s)$ is the Laplace transform of the waiting time PDF. The long wavelength approximation, i.e., the small $s$ and $k$ limit, is routinely applied to investigate the long time limit of $P_{\rm or}(x, t)$. However, how to take the limits $k \to 0$ and $s \to 0$ is actually slightly subtle. Utilizing Eqs. 
(3) and (5), and assuming that the ratio $|s^\alpha|/|k|$ is fixed, we get Eq. (30). Inverting, we then find a known limit theorem [51,52],
$$P_{\rm or}(x, t) \sim \frac{1}{a(t/\bar t)^{1/\alpha}}\, L_{\alpha,1}\!\left[\frac{x - at/\langle\tau\rangle}{a(t/\bar t)^{1/\alpha}}\right], \tag{31}$$
where $\bar t = \langle\tau\rangle^{1+\alpha}/[(\tau_0)^\alpha|\Gamma(1-\alpha)|]$, $L_{\alpha,1}(\cdot)$ is the asymmetric Lévy stable law with characteristic function $\exp[(ik)^\alpha]$, and $a > 0$. This central limit theorem, just like Fréchet's law, has its limitations. As a stand-alone formula, it predicts $\langle x^2(t)\rangle = \infty$, since the second moment of the Lévy distribution does not exist. This means that we must consider a different method to describe the far tail. To proceed we reanalyze Eq. (29), but now fixing $|s|/|k|$. This is a large deviation approach, since such a scaling implies a ballistic scaling of $x$ and $t$, unlike $x - at/\langle\tau\rangle \propto t^{1/\alpha}$ in Eq. (31). The strategy we use now, i.e., the determination of $P_{\rm or}(x, t)$ for $x \propto t$, is similar to the approach in the previous section, where we calculated $P_{{\rm or},\tau_{\max}}(t, L)$. The obvious difference is that there we started with Eq. (16), while here we start with the Montroll-Weiss Eq. (29). More specifically, in Sec. III A we assumed that $sL \propto$ constant, while here $|s|$ and $|k|$ are small and of the same order, where $s$ and $k$ are the Laplace and Fourier conjugates of $t$ and $x$, respectively.
We restart from Eq. (29), which gives Eq. (32); the derivation is given in Appendix A. The inversion of the leading term is trivial, but it yields a delta function $\delta(x - at/\langle\tau\rangle)$. Mathematically, we chose a scaling that shrinks the density to an uninteresting object. Luckily, the correction term is important, as it describes the far tail. So for $x \neq at/\langle\tau\rangle$, $P_{\rm or}(x, t)$ follows from the correction term by applying $F^{-1}_{k\to x}$ and $L^{-1}_{s\to t}$, the inverse Fourier and inverse Laplace transforms, respectively. We first perform the inverse Laplace transform, using the convolution theorem and known transform pairs. The inverse Fourier transform of $\exp(ikay/\langle\tau\rangle)$ is a delta function, and the $ik$ in front of this expression is a spatial derivative in $x$ space. Then, after simple rearrangements, we get
$$P_{\rm or}(x, t) \sim \frac{(\tau_0)^\alpha}{a\,t^{\alpha}}\, I_{{\rm or},\alpha}(\xi) \tag{37}$$
with $0 < \xi < 1$, $\xi = 1 - (x/a)/(t/\langle\tau\rangle)$, and $I_{{\rm or},\alpha}(\cdot)$ defined by Eq. (28). As Fig. 7 demonstrates, this equation describes the far tail of the density of the spreading packet, and it is complementary to the Lévy law Eq. (31). The MSD of the process is obtained by integration over Eq. (37), and in that sense this equation "cures" the drawback of the Lévy density. More important is the fact that the distributions of $\tau_{\max}$, Eq. (27), and of $x$, Eq. (37), have the same structure, up to a trivial Jacobian. In other words, given that these observables have the same distribution, we have proven the single big jump principle Eq. (9) for the ordinary process. The statistics of one waiting time, $\tau_{\max}$, determines the fluctuations at small $x$. And since Eq. (37) gives the MSD, which is used in most experimental, theoretical, and numerical works to characterize the fluctuations, we see that the MSD is directly related to the single big jump principle and extreme value statistics. One should note that low-order moments like $\langle|x - \langle x\rangle|^q\rangle$ with $q < \alpha$ are finite w.r.t. the Lévy density, and these are given by integration w.r.t. Eq. (31).
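The statement that moments of order $q > \alpha$ follow from the infinite density can be written out as a short calculation. The sketch below takes $I_{{\rm or},\alpha}(y) = \alpha y^{-\alpha-1} - (\alpha-1)y^{-\alpha}$ from Eq. (28) and assumes the prefactor $(\tau_0)^\alpha/(\langle\tau\rangle t^\alpha)$ with $y = L/t$, together with a Pareto mean $\langle\tau\rangle = \alpha\tau_0/(\alpha-1)$; these choices and the function names are illustrative assumptions.

```python
import math


def infinite_density_or(y, alpha):
    """I_or(y) = alpha y**(-alpha-1) - (alpha-1) y**(-alpha), for 0 < y < 1."""
    return alpha * y ** (-alpha - 1.0) - (alpha - 1.0) * y ** (-alpha)


def scaled_moment(q, alpha):
    """Closed form of the integral of y**q * I_or(y) over (0, 1); needs q > alpha."""
    if q <= alpha:
        raise ValueError("the integral diverges at y -> 0 unless q > alpha")
    return alpha / (q - alpha) - (alpha - 1.0) / (q - alpha + 1.0)


def scaled_moment_numeric(q, alpha, n=200000):
    """Midpoint-rule check of scaled_moment (slowly convergent near y = 0)."""
    h = 1.0 / n
    return h * sum(((i + 0.5) * h) ** q * infinite_density_or((i + 0.5) * h, alpha)
                   for i in range(n))


def tau_max_moment(q, t, alpha, tau0):
    """<tau_max**q> ~ tau0**alpha t**(q+1-alpha) / <tau> * scaled_moment(q, alpha)."""
    mean_tau = alpha * tau0 / (alpha - 1.0)
    return tau0 ** alpha * t ** (q + 1.0 - alpha) / mean_tau * scaled_moment(q, alpha)
```

For $q = 2$ the scaled moment reduces to $2/[(2-\alpha)(3-\alpha)]$, and `tau_max_moment(2.0, t, alpha, tau0)` grows like $t^{3-\alpha}$. Through Eq. (9) this is the same power law as the MSD.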

Remark 2
We now study the case of the CTRW in two dimensions, focusing on an ordinary process. The joint jump length PDF is $f(\chi_x, \chi_y) = f_x(\chi_x)f_y(\chi_y)$, where $f_x(\chi_x)$ is the same as in Eq. (2) and $f_y(\chi_y) = \exp[-(\chi_y)^2/(2(\sigma_y)^2)]/\sqrt{2\pi(\sigma_y)^2}$, with $\sigma_y$ being a constant. This means that the drift is only in the $x$ direction. Similar to our previous calculations, we use the Montroll-Weiss equation and find
$$P_{\rm or}(x, y, t) \sim \frac{(\tau_0)^\alpha}{a\,t^{\alpha}}\, I_{{\rm or},\alpha}(\xi)\,\delta(y). \tag{38}$$
The marginal density $P_{\rm or}(x, t)$ is the same as in the one-dimensional case, Eq. (37). Note that $\tau_{\max}$ is of the order of $t$ (for the far tail), so in the $y$ direction the particles are practically frozen. Hence we get a delta function, since there is no drift in the $y$ direction.

IV. THE EQUILIBRIUM CASE
Up to now we have considered the case when the physical clock was started immediately at the beginning of the process, i.e., an ordinary CTRW. Here we consider the equilibrium initial condition. We note that for $0 < \alpha < 1$, i.e., when the average trapping time diverges, this is related to the aging CTRW [33,34,53,54,55], which is used as a tool to describe complex systems ranging from Anderson insulators to colloidal suspensions, and which was first introduced by Monthus and Bouchaud to illustrate diffusion in glasses [56]. In contrast, when $1 < \alpha < 2$ and Eq. (1) holds, we have a stationary process. Then, as mentioned already, the mean waiting time for the first event is infinite; see Eq. (7). In practice, if we start the process at time $-t_a$ and $t_a$ is large but finite, the averaged first waiting time observed after time $t_a$ will increase with $t_a$, and when $t_a$ tends to infinity it will diverge. Here we focus on the statistics of particles with an equilibrium initial condition, i.e., $t_a \to \infty$.

A. The rare fluctuations of the position
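In Fourier-Laplace space, the density of spreading particles is given by [33]
$$P_{\rm eq}(k, s) = \frac{1 - h_{\rm eq}(s)}{s} + h_{\rm eq}(s)f(k)\,\frac{1 - \phi(s)}{s[1 - \phi(s)f(k)]}. \tag{39}$$
This equation is a modification of the Montroll-Weiss equation, taking into consideration the equilibrium initial state. Using the Laplace transform of Eq. (1), $h_{\rm eq}(s) = [1 - \phi(s)]/(\langle\tau\rangle s)$, we have
$$P_{\rm eq}(k, s) = \frac{\langle\tau\rangle s - 1 + \phi(s)}{\langle\tau\rangle s^2} + \frac{[1 - \phi(s)]^2}{\langle\tau\rangle s^2}\,\frac{f(k)}{1 - \phi(s)f(k)}. \tag{40}$$
The first term on the right-hand side is $k$ independent, hence its inverse Fourier transform gives a delta function at the initial position $x = 0$, describing non-moving particles. This population of motionless particles is non-negligible in the sense that they contribute to the MSD; see Eq. (B6).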
Based on Eq. (40), we consider typical fluctuations, i.e., $k, s \to 0$ with $|k| \propto |s^\alpha|$, which leads to Eq. (41), where we used the asymptotic behaviors of $\phi(s)$ and $f(k)$. The inverse Laplace-Fourier transform of Eq. (41) yields Eq. (42). According to Eq. (42), the typical fluctuations are the same as those of the ordinary case; see Eq. (31) and the dashed lines in Fig. 8. That is, the bulk fluctuations do not depend on the initial state. On the other hand, the MSDs of the two cases are different, which means that the far tail of $P_{\rm eq}(x, t)$ must be modified compared with the ordinary case. As mentioned before, the normalized density Eq. (42) gives an unphysical infinite MSD due to the slowly decaying tail of the asymmetric Lévy distribution. This means that we expect modifications of this limiting law at the far tail.
For the rare events of the equilibrium CTRW, i.e., when both $s$ and $k$ are small and comparable, inserting $\phi(s)$ and $f(k)$ into Eq. (40) gives Eq. (43). Rewriting the second term on the right-hand side of Eq. (43) as in Eq. (44) and using the relation Eq. (45), we get the main result of this section, Eq. (46), describing the packet when $x$ is of the order of $t$, where the non-normalized state function $I_{{\rm eq},\alpha}(\xi)$ is given by Eq. (47) and $\xi = 1 - (x/a)/(t/\langle\tau\rangle)$. Comparing with Eq. (37), we see that the infinite densities for the equilibrium and non-equilibrium processes are different. This indicates that the initial conditions influence the statistics at small positions even when the measurement time is long, $t \gg \langle\tau\rangle$. The rare fluctuations for the equilibrium case are larger than those of the ordinary process; in particular, they include a delta function contribution; see the data marked with a red circle in Fig. 8. This means that particles not moving at all contribute to the rare events. Note that Eq. (46) can be matched to the far tail of the Lévy distribution Eq. (42), as it should. We further check that the MSD is determined by the rare fluctuations Eq. (46), resulting in a different MSD compared with the ordinary process. Using the random variable $\eta = (x - at/\langle\tau\rangle)/(at/\langle\tau\rangle)$ with $-1 < \eta < 0$, from Eq. (46) we get $\langle\eta^2\rangle_{\rm eq}$ [Eq. (48)]; see Appendix C. Similarly, $\langle\eta^2\rangle_{\rm or}$ is obtained from Eq. (37). Utilizing Eqs. (29) and (48), we compare the MSDs of the two processes. Though the MSDs for both cases grow as the power law $t^{3-\alpha}$, the MSD for the equilibrium case is larger than the ordinary one. Since the mean of the first waiting time following Eq. (1) is infinite, the probability of particles experiencing a long trapping time increases considerably compared with the ordinary situation. In turn, this yields a considerable number of inactive particles, which are trapped at the origin for the whole observation time $t$ and lag far behind the mean. Hence, the MSD for the equilibrium process has a deep relationship with the motionless particles; see Eq. (B6). 
It is interesting to find that the MSDs for both cases are determined by the far tail of the packet, which is described by the infinite densities. As expected, when α → 2 the two processes show the same normal diffusion, so the initial condition becomes unimportant.
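The ordering of the two MSDs can be checked numerically. The following Python sketch is illustrative only and rests on assumptions not fixed by the text above: Pareto waiting times φ(τ) = α τ_0^α τ^(−1−α) for τ ≥ τ_0, a deterministic displacement a per jump in place of a general jump PDF f(x), and hypothetical helper names. The equilibrium process differs from the ordinary one only in its first waiting time, which is drawn from Ψ(τ)/⟨τ⟩, with Ψ the survival probability of φ.

```python
import random


def pareto_wait(rng, alpha, tau0):
    """Waiting time with PDF alpha*tau0**alpha/tau**(1+alpha), tau >= tau0 (assumed form)."""
    return tau0 * (1.0 - rng.random()) ** (-1.0 / alpha)


def equilibrium_first_wait(rng, alpha, tau0):
    """First waiting time of the stationary process, PDF Psi(tau)/<tau>, by CDF inversion."""
    u = rng.random()
    knee = (alpha - 1.0) / alpha  # value of the CDF at tau = tau0
    if u < knee:
        return u * alpha * tau0 / (alpha - 1.0)  # uniform part, tau < tau0
    return tau0 * (alpha * (1.0 - u)) ** (-1.0 / (alpha - 1.0))  # power-law part


def count_jumps(rng, t, alpha, tau0, equilibrium):
    """Number of jumps completed up to time t for one walker."""
    if equilibrium:
        clock = equilibrium_first_wait(rng, alpha, tau0)
    else:
        clock = pareto_wait(rng, alpha, tau0)
    jumps = 0
    while clock <= t:
        jumps += 1
        clock += pareto_wait(rng, alpha, tau0)
    return jumps


def displacement_variance(t, alpha=1.5, tau0=1.0, a=1.0, walkers=5000,
                          equilibrium=False, seed=0):
    """Sample variance of x = a * (number of jumps), i.e. the MSD about the mean."""
    rng = random.Random(seed)
    xs = [a * count_jumps(rng, t, alpha, tau0, equilibrium) for _ in range(walkers)]
    mean = sum(xs) / walkers
    return sum((x - mean) ** 2 for x in xs) / walkers


if __name__ == "__main__":
    t = 200.0
    print("ordinary   :", displacement_variance(t, equilibrium=False, seed=1))
    print("equilibrium:", displacement_variance(t, equilibrium=True, seed=2))
```

With α = 1.5 and t = 200, the equilibrium variance should come out clearly larger than the ordinary one, with much of the excess carried by walkers that never leave the origin. Because the estimate is dominated by rare events, it converges slowly and the printed numbers fluctuate noticeably between seeds.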
where G(s, L) is defined in Eq. (17). It gives the PDF by the derivative, Eq. (51). Note that Eq. (52) applies since G(s, L) is large with L ∝ 1/s. Using Eqs. (51) and (52), P_eq,τmax(s, L) reduces to Eq. (53), P_eq,τmax(s, L) ∼ 1 G(s, L) + α sL G(s, L). Note that Eq. (53) is a uniform approximation in Laplace space, valid for a wide range of L and large t. More precisely, the only condition for this approximation is that the observation time t is large enough, with no assumption on the scaling between t and L. For the typical fluctuations, the leading term of Eq. (53) is the same as for the ordinary process, which leads to Eq. (54). We see that the typical fluctuations of the longest time interval are the same for the equilibrium and the ordinary renewal processes and are independent of the initial conditions; they describe the behavior when L^α is of the order of t.
Next we turn our attention to the case L ∝ t. Starting again from Eq. (53), the inverse Laplace transform gives our main result, Eq. (55), describing the far tail of the density: P_eq,τmax(t, L) ∼ (τ_0)^α/(t^α ⟨τ⟩) I_eq,α(y) + δ(t − L) with L ≤ t. Utilizing Eqs. (7) and (55), we obtain Eq. (56); see Fig. 9. From Eqs. (46) and (56), it can be seen that the principle Eq. (9) is also valid for the equilibrium case. Though the typical fluctuations of τ_max for the equilibrium and ordinary processes show no difference, their far tails are distinct from each other [see Eqs. (27) and (56)].
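The delta function at L = t can be made concrete with a short simulation. The Python sketch below is a hedged illustration, not the procedure used for Fig. 9. It assumes Pareto waiting times φ(τ) = α τ_0^α τ^(−1−α) for τ ≥ τ_0, takes τ_max to include the interval still running at time t (cut at t), and uses hypothetical function names. Under these assumptions, the weight of trajectories with τ_max = t equals the probability that the first equilibrium waiting time exceeds t, which is (τ_0/t)^(α−1)/α for t ≥ τ_0.

```python
import random


def pareto_wait(rng, alpha, tau0):
    """Waiting time with PDF alpha*tau0**alpha/tau**(1+alpha), tau >= tau0 (assumed form)."""
    return tau0 * (1.0 - rng.random()) ** (-1.0 / alpha)


def equilibrium_first_wait(rng, alpha, tau0):
    """First waiting time of the stationary process, PDF Psi(tau)/<tau>, by CDF inversion."""
    u = rng.random()
    knee = (alpha - 1.0) / alpha  # value of the CDF at tau = tau0
    if u < knee:
        return u * alpha * tau0 / (alpha - 1.0)
    return tau0 * (alpha * (1.0 - u)) ** (-1.0 / (alpha - 1.0))


def longest_interval(rng, t, alpha, tau0):
    """tau_max of one equilibrium trajectory on (0, t), including the interval cut at t."""
    start = 0.0
    wait = equilibrium_first_wait(rng, alpha, tau0)
    longest = 0.0
    while start + wait <= t:
        longest = max(longest, wait)
        start += wait
        wait = pareto_wait(rng, alpha, tau0)
    return max(longest, t - start)  # t - start is the backward recurrence time


def stuck_fraction(t, alpha=1.5, tau0=1.0, samples=20000, seed=0):
    """Fraction of trajectories with tau_max == t, i.e. no renewal at all in (0, t)."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(samples)
               if longest_interval(rng, t, alpha, tau0) == t)
    return hits / samples


def stuck_weight(t, alpha=1.5, tau0=1.0):
    """Probability that the first equilibrium waiting time exceeds t (needs t >= tau0)."""
    return (tau0 / t) ** (alpha - 1.0) / alpha


if __name__ == "__main__":
    t = 100.0
    print("simulated:", stuck_fraction(t), " expected:", stuck_weight(t))
```

The weight decays only as t^(1−α), much more slowly than the probability t^(−α) of a single ordinary waiting time exceeding t, which is why the motionless trajectories remain visible in the far tail at long times.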

V. DISCUSSION AND SUMMARY
We have related the theory of extreme value statistics to the fluctuations of a particle diffusing in a disordered system with traps. As mentioned, the observation of a non-Gaussian packet P(x, t) and a super-diffusive MSD is widely reported [1-15]. Here we showed that a modification of Fréchet's law is required to fully characterize these fluctuations. The largest waiting time τ_max cannot exceed the observation time t, namely the sum Σ_{i=1}^{N} τ_i + B_t is constrained; hence we naturally have deviations from the Fréchet law. In other words, the theory of IID random variables completely fails to describe the far tail of the packet. More profound is the observation that the statistics of τ_max determine the far tail of P(x, t) for both the ordinary and the equilibrium processes. One trapping event, the longest of the lot, controls the statistics of large deviations. This is very different from standard large deviation theory [22], where many small jumps in the same direction control the statistics.
Our work is related to the so-called single big jump principle, which was originally formulated for N IID random variables {ϑ_1, ϑ_2, ..., ϑ_N} [28]. It states that the tail of the distribution of the sum Σ_{i=1}^{N} ϑ_i is asymptotically equivalent to that of max{ϑ_1, ϑ_2, ..., ϑ_N} when the distribution of ϑ_i is sub-exponential and the maximum is large. Note that in the CTRW model considered in this manuscript we do not have any large spatial jump; instead we have long sticking events during which the particles do not move. More importantly, in our case the waiting times are constrained by the total measurement time and hence correlated, and their number N fluctuates. Hence the situation encountered here is different from (though related to) the original one. Thus one aspect of our work was to modify the principle, as we did in Eq. (9), and then describe the rare events with the new Eqs. (27), (37), (46), and (56). This allowed us to connect the big jump theory to infinite densities. The solutions describing the far tails of the distributions of x and τ_max are non-normalizable; still, with proper scaling they are the limits of perfectly normalised probability densities. For example, in Eq. (27) we multiply the normalized density P_or,τmax(t, L) by ⟨τ⟩(t/τ_0)^α and obtain the infinite density I_or,α(L/t). The variance of τ_max and the super-diffusive MSD are calculated with these non-normalised states, meaning that these quantifiers of the anomaly are sensitive to rare events.
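The original IID form of the principle is easy to probe numerically. The Python sketch below is a minimal illustration under stated assumptions: the ϑ_i are Pareto distributed with unit scale and exponent α (one convenient sub-exponential choice, not prescribed by the text), N is fixed, and the function name is hypothetical. It estimates Prob(Σ ϑ_i > z) and Prob(max ϑ_i > z) from the same samples; for large z the two should approach each other.

```python
import random


def tail_probabilities(z, n=10, alpha=1.5, trials=100000, seed=0):
    """Estimate Prob(sum > z) and Prob(max > z) for n IID Pareto(alpha) variables."""
    rng = random.Random(seed)
    sum_hits = 0
    max_hits = 0
    for _ in range(trials):
        # Pareto variable with survival probability x**(-alpha) for x >= 1.
        draws = [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]
        sum_hits += sum(draws) > z
        max_hits += max(draws) > z
    return sum_hits / trials, max_hits / trials


if __name__ == "__main__":
    for z in (50.0, 500.0):
        p_sum, p_max = tail_probabilities(z)
        print(f"z = {z}: Prob(sum > z) = {p_sum:.5f}, Prob(max > z) = {p_max:.5f}")
```

Since the sum is never smaller than the maximum, the first estimate is always at least as large as the second; the gap should shrink as z grows. This sketch covers only the unconstrained IID setting. In the CTRW the waiting times are correlated through the constraint on their sum and N fluctuates, which is the reason the modified principle Eq. (9) is needed.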
We showed that the initial condition is an important factor controlling the far tails of the distributions of interest. We calculated these tails for the stationary and ordinary renewal processes, showing that for the stationary process motionless particles give an important contribution to the rare fluctuations and to the MSD. On the one hand, this implies that the far tails are non-universal in their shapes, and can therefore be used to characterize the nature of the underlying process. On the other hand, universality shows up in the big jump principle, Eq. (9): the relation between the trapping time and the position x is independent of the underlying process.
We note that the surprising super-diffusion of a biased tracer in a crowded medium was also found in a many-body theory [5,8,57], in the diffusion of contamination in disordered systems, and in numerical simulations of glass-forming systems [4,6]; in these systems it would be interesting to check the relation between the dynamics and the big jump principle. The investigation of the single big jump principle in the context of other models of random walks in random environments is of great interest. For example, the biased quenched trap model exhibits typical fluctuations which are the same as those found for the biased CTRW [9,10,58,59]. Whether this also holds for the rare events is still unknown. Recently the case of N IID random variables constrained to have a given sum was investigated, and under certain conditions the Fréchet law was found [60-62]. From the constraint it is clear that the far tail of the distribution of the maximum cannot be modeled with the Fréchet law, since there is a cutoff at the far tail. It would be of interest to investigate the far tail of this model (there N was fixed while here N is random) and see whether the non-normalized density is found there as well.