Large Deviations in the Early Universe

Fluctuations play a critical role in cosmology. They are relevant across a range of phenomena from the dynamics of inflation to the formation of structure. In many cases, these fluctuations are coarse grained and follow a Gaussian distribution as a consequence of the Central Limit Theorem. Yet, some classes of observables are dominated by rare fluctuations and are sensitive to the details of the underlying microphysics. In this paper, we argue that the Large Deviation Principle can be used to diagnose when one must appeal to the fundamental description. Concretely, we investigate the regime of validity for the Fokker-Planck equation that governs Stochastic Inflation. For typical fluctuations, this framework leads to the central limit-type behavior expected of a random walk. However, fluctuations in the regime of the Large Deviation Principle are determined by instanton-like saddle points accompanied by a new energy scale. When this energy scale is above the UV cutoff of the EFT, the tail is only calculable in the microscopic description. We explicitly demonstrate this phenomenon in the context of determining the phase transition to eternal inflation, the distribution of scalar field fluctuations in de Sitter, and the production of primordial black holes.


Introduction
The cosmological evolution of our Universe was shaped by fluctuations. The formation of dark matter halos, and hence galaxies and galaxy clusters, was the result of large density fluctuations, which can be modeled using Gaussian random fields. Rare fluctuations, which are determined by the tail of the probability distribution, may also be important for cosmology. For example, determining if some (or all) of the abundance of dark matter is due to the presence of primordial black holes requires a precise knowledge of the tail of the distribution. Such rare fluctuations could also have played a critical role in the dynamics of (eternal) inflation at very early times.
One natural hope is that this tail of the probability distribution can be captured by some kind of resummation of the perturbative series. However, implementing such a resummation simply yields the Fokker-Planck equation for the distribution of cosmological fluctuations, with small corrections that are equivalent to performing a 'local' Kramers-Moyal expansion of the underlying master equation [39] (for an alternative approach, see [45]). To assess the range of validity for these methods, a first step is to compute the corrections to these equations systematically, which allows one to explore the properties of the resulting probability distributions. To this end, we computed the corrections to Stochastic Inflation from the leading effects of primordial non-Gaussianity in [46]. By studying the phase transition to eternal inflation (an observable that is exponentially sensitive to the tails of the probability distribution), we observed that the resulting probability distribution was not under perturbative control. While one might anticipate that the tail is sensitive to nonperturbative effects, the precise origin and location of this breakdown should be calculable from the Effective Field Theory (EFT) point of view. This paper will explain how the tails of these distributions are determined by a new instanton-like saddle point. These new saddles have their own associated energy scale, which can invalidate the naive EFT expansion, where the IR scale is set by the Hubble scale H. As we will quantify below, there are circumstances where these saddles are under control within the EFT description, and they simply reproduce the behavior of Stochastic Inflation. However, when computing the probability for observables that are sensitive to sufficiently large deviations, the saddle lies beyond the EFT description and must be calculated by appealing to the full theory.
We will provide a framework for anticipating such a breakdown by recasting these questions in terms of random walks. In fact, the appearance of the Fokker-Planck equation in inflationary cosmology suggests that the phenomenology of fluctuations in dS spacetime can be mapped onto the behavior of random walks. Specifically, we will show that random walks with independent and identically distributed (i.i.d.) steps give rise to the same behavior as cosmological systems. Concretely, consider an i.i.d. walk with zero mean that traverses a distance $X$ in a number of steps $N$. For typical fluctuations ($X \propto \sqrt{N}$) the Central Limit Theorem (CLT) tells us that $X$ is Gaussian distributed, even if the individual steps are not. As we will review below, one can view the CLT as the result of a renormalization group flow to a Gaussian fixed point, with all non-Gaussianity being irrelevant: typical fluctuations are insensitive to the microscopic details of the walk. On the other hand, large deviations ($X \propto N$) defy this general behavior and do depend on the microscopic details. The fact that large deviations are not determined by universal long-distance behavior will have a precise analogy in cosmology, and will explain the breakdown of EFT for large fluctuations.
Figure 1: [...] In contrast, the figure on the right depicts the walker covering the same distance $L$ in a series of smaller steps that are mostly in the same direction. Although both of these are extremely unlikely, the one on the right is much more probable than the one on the left. If we assume that the individual steps are sampled from a Gaussian distribution, the probability of making one large jump is $\sim e^{-L^2}$, whereas the walker can get there using a series of smaller aligned steps with probability $\sim e^{-L}$. Therefore, the latter dominates the tail of the coarse grained probability distribution. This shows that the most probable paths are described by small fluctuations around a single classical deterministic trajectory, which is associated with a novel saddle point solution.
A deeper understanding of such fluctuations can be gleaned from the Large Deviation Principle (LDP) [47,48]. Stated simply, the LDP is a scaling law of the form $P_N \asymp e^{-N I}$, where $P_N$ is a probability distribution parameterized by some large number $N$, and $I$ is a positive number called the rate function. In the context of random walks $I \sim O(1)$ for large fluctuations, which means $P_N \sim e^{-N}$ at the tail. As illustrated in Fig. 1, the dominant contributions to this tail come from walks that resemble a classical trajectory, which looks quite different from the usual zero mean walks. This is the new saddle that the CLT (or EFT) can fail to capture. The central goal of this work is to develop a precise map between the LDP framework and the physics of Stochastic Inflation. There is a vast literature on large deviations that extends well beyond i.i.d. random walks, including applications to a number of physical problems in equilibrium and nonequilibrium statistical physics; see, e.g., the review article [58]. For our purposes, the language of the LDP will be useful for two reasons. First, it makes precise the sense in which the physics leading to large deviations requires a departure from the usual long distance description in terms of the CLT and implies that one must appeal to the microscopic nature of the walk. Second, the rate functions $I$ are often calculable in terms of a novel saddle point approximation. Combining these two insights will allow us to characterize the regime of validity of Stochastic Inflation, highlighting the situations where it improves perturbation theory and why it ultimately breaks down. We will demonstrate this concretely in the context of eternal inflation and $\lambda\phi^4$ in a fixed de Sitter background. The LDP will explain the behavior of the probability distribution of scalar fluctuations calculated using the stochastic framework. We then extend this understanding to models that generate primordial black holes.
This paper is organized as follows. We begin with a discussion of models that have a vanishing potential. We first explore this scenario using random walks in Sec. 2. This allows us to show how the CLT emerges from a coarse graining procedure, and to both explain the LDP and apply it in a simple context. Using this formalism, we can then understand the probability distribution for the fluctuations of the inflaton, which is the topic of Sec. 3. We then turn on a non-trivial potential for a random walk in Sec. 4, and explore the role of the LDP when computing the probability distribution for this example. This is exactly the framework we need to understand the behavior of massless scalar field theory in a dS background in Sec. 5. We then apply the same techniques to explore models whose goal is to generate primordial black holes in Sec. 6. Finally, Sec. 7 provides our conclusions and a discussion of future directions.

Random Walks and the Renormalization Group
Random walks offer a simple setting within which we can understand the conceptual details of the present work. We will review the Central Limit Theorem (CLT) and demonstrate how the long wavelength behavior of a wide class of walks falls into the universality class modeled by a Gaussian distribution [59]. The failure of the CLT for large fluctuations is exactly analogous to the breakdown of EFT we discuss later in the paper. We can understand this regime better with the Large Deviation Principle (LDP), as illustrated with some simple examples worked out in detail below. This framework will eventually allow us to gain insight into the evolution of the scalar fluctuations of the inflaton from a new perspective.

Central Limit Theorem
Consider a one dimensional i.i.d. random walk. Starting from the origin, the walker takes one step per discrete time interval, each with a displacement $x$ chosen independently from some 'microscopic' distribution $p(x)$ with finite moments. To facilitate the discussion below, we will consider two specific examples:
$$p_f(x) = \tfrac{1}{2}\left[\delta(x-1) + \delta(x+1)\right], \tag{2.1a}$$
$$p_g(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}. \tag{2.1b}$$
Both of these distributions have mean $\langle x\rangle = 0$ and variance $\langle x^2\rangle - \langle x\rangle^2 = 1$. A walker governed by $p_f(x)$ can take a step with either $x = 1$ or $x = -1$ at each turn. The Gaussian walker's probability for each step is governed by $p_g(x)$, so it is able to take a step of any size, with the most probable values being $|x| \lesssim 1$. Examples of a typical random walk generated by $p_f(x)$ and $p_g(x)$ are given in Fig. 2. If we zoom in on each walk we can make out the difference: it is evident that the red trajectory is born of fixed size steps whereas the blue one is not. However, at the macroscopic level the two walks look like they could have been generated by the same $p(x)$, suggesting that the long wavelength behaviors of both walkers have something in common.
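We can make this intuition precise by studying the net displacement of a random walker after $N$ steps, which we denote as
$$X = \sum_{i=1}^{N} x_i . \tag{2.2}$$
If each step is sampled from $p(x)$, the probability of finding the walker at a distance $X$ from the origin after $N$ steps is
$$P(X) = \int \prod_{i=1}^{N} dx_i\, p(x_i)\; \delta\Big(X - \sum_{i=1}^{N} x_i\Big) . \tag{2.3}$$
$P(X)$ can be evaluated by Fourier transforming the delta function,
$$\delta\Big(X - \sum_{i=1}^{N} x_i\Big) = \int \frac{dk}{2\pi}\, e^{-ik\left(X - \sum_i x_i\right)} , \tag{2.4}$$
to obtain
$$P(X) = \int \frac{dk}{2\pi}\, e^{-ikX} \Big\langle \prod_{i=1}^{N} e^{ikx_i} \Big\rangle = \int \frac{dk}{2\pi}\, e^{-ikX} \big\langle e^{ikx} \big\rangle^{N} . \tag{2.5}$$
The second equality holds because we are assuming that the $x_i$ are i.i.d. random variables, which means their expectation values are independent. The quantity
$$\big\langle e^{ikx} \big\rangle = \int dx\, p(x)\, e^{ikx} \tag{2.6}$$
is called the characteristic function of $p(x)$. For example, $\langle e^{ikx}\rangle_g = e^{-k^2/2}$ if the steps are sampled from the Gaussian $p_g(x)$ in Eq. (2.1b). Plugging this into Eq. (2.5) yields
$$P_g(X) = \frac{1}{\sqrt{2\pi N}}\, e^{-X^2/(2N)} , \tag{2.7}$$
which is an exact result valid for all $X$.
Figure 2: Random walks generated by a fixed step length distribution and a Gaussian distribution. If we zoom in, the difference in step size is evident from the shape of the trajectories, but at the macroscopic level it is difficult to tell which $p(x)$ generated each walk.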
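We can repeat this exercise with $p_f(x)$; after $N$ steps, the walker will be at $X$ with a probability
$$P_f(X) = \frac{1}{2^N}\, \frac{N!}{\left(\frac{N+X}{2}\right)!\, \left(\frac{N-X}{2}\right)!} . \tag{2.8}$$
In the large $N$ limit, we can use Stirling's approximation $n! \xrightarrow{\,n\to\infty\,} n^n e^{-n}$ to write
$$P_f(X) \simeq e^{-I_f(X)} , \qquad I_f(X) = \frac{N+X}{2}\ln\Big(1 + \frac{X}{N}\Big) + \frac{N-X}{2}\ln\Big(1 - \frac{X}{N}\Big) . \tag{2.9}$$
If $X \ll N$, we can Taylor expand $I_f(X)$ to find
$$I_f(X) = \frac{X^2}{2N} + \frac{X^4}{12 N^3} + \dots$$
So long as we restrict ourselves to $X \lesssim \sqrt{N}$, the first term in the exponent dominates, and we can approximate the distribution in Eq. (2.8) as a Gaussian. In particular, $P_f(X)$ and $P_g(X)$ have the same behavior up to $X \sim \sqrt{N}$. A plot of $X$ over time can be obtained by downsampling the trajectories in Fig. 2 by a factor of $N$. From this perspective it is clear that $P(X)$ captures the long wavelength dynamics of the walk, which is the same at leading order for the fixed step length and the Gaussian walkers.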
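The agreement near the peak and the disagreement in the tail can be checked directly. The following is a minimal sketch, not part of the original analysis: it compares the exact binomial probability of the fixed-step walker with a Gaussian of variance $N$. The choice $N = 400$, the function names, and the factor of 2 (the lattice spacing of the reachable sites of the $\pm 1$ walk) are illustrative assumptions.

```python
import math

def p_fixed(X, N):
    """Exact probability that a +/-1 walker sits at X after N steps."""
    if abs(X) > N or (N + X) % 2:
        return 0.0
    return math.comb(N, (N + X) // 2) / 2**N

def p_gauss_lattice(X, N):
    """Gaussian density of variance N times the lattice spacing 2 of the +/-1 walk."""
    return 2.0 * math.exp(-X**2 / (2 * N)) / math.sqrt(2 * math.pi * N)

N = 400  # illustrative choice; sqrt(N) = 20
for X in (0, 20, 40, 200, 400):
    pf, pg = p_fixed(X, N), p_gauss_lattice(X, N)
    print(f"X={X:4d}  P_f={pf:.3e}  Gaussian={pg:.3e}  ratio={pf / pg:.3e}")
```

For $X$ of order $\sqrt{N}$ the ratio stays close to one, while for $X$ of order $N$ the Gaussian overestimates the exact probability by a factor that grows exponentially with $N$.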
We can generalize these conclusions to a wide class of $p(x)$. Going back to Eq. (2.6), the Taylor expansion in $k$ of the logarithm of the characteristic function yields the cumulant expansion
$$\log \big\langle e^{ikx} \big\rangle = \sum_{m=1}^{\infty} \frac{(ik)^m}{m!}\, \langle x^m \rangle_C .$$
For instance, $\langle x\rangle_C \equiv \langle x\rangle$ and $\langle x^2\rangle_C \equiv \langle x^2\rangle - \langle x\rangle^2 = \sigma_0^2$ is the variance of $p(x)$. We can exponentiate the above expression to write Eq. (2.5) as
$$P(X) = \int \frac{dk}{2\pi}\, \exp\Big[ -ikX + N \sum_{m=1}^{\infty} \frac{(ik)^m}{m!}\, \langle x^m\rangle_C \Big] . \tag{2.13}$$
For observables where the Gaussian contribution dominates, the integrand has support for $k \lesssim 1/\sqrt{N}$. Therefore, the $m$th term in the cumulant expansion contributes to the exponent as
$$N\, \frac{(ik)^m}{m!}\, \langle x^m\rangle_C \sim N^{1 - m/2}\, \langle x^m\rangle_C , \tag{2.14}$$
where the $\langle x^m\rangle_C$ are independent of $N$ since the moments of the distribution $p(x)$ are finite constants. Therefore, the first and second terms of the exponent dominate at large $N$, and we obtain
$$P(X) \simeq \frac{1}{\sqrt{2\pi N \sigma_0^2}}\, \exp\Big[ -\frac{\big(X/\sqrt{N} - \sqrt{N}\,\langle x\rangle\big)^2}{2\sigma_0^2} \Big] , \tag{2.15}$$
which is a Gaussian probability distribution for $X/\sqrt{N}$ centered at $\sqrt{N}\langle x\rangle$ up to corrections that vanish for large $N$. This result is known as the Central Limit Theorem (footnote 5).
It is very useful to interpret this result in the language of Renormalization Group (RG) evolution as applied to EFTs. In this context, one identifies a power counting parameter which facilitates the use of dimensional analysis. In the classic case of integrating out a heavy particle of mass $M$, the power counting is determined by the small dimensionless number $E/M$, where $E \ll M$ is the typical energy associated with the process of interest. This allows one to organize the local operator expansion of the EFT into terms which are relevant (grow polynomially at lower energies), marginal (evolve at most logarithmically), and irrelevant (shrink polynomially at lower energies).
We can see the same principles in action by viewing our random walk examples through the lens of RG. If we organize the terms that appear in the exponent of Eq. (2.15) by how they scale with N , then we see that the mean is a relevant parameter, the variance is marginal, and the cumulants x m>2 C are irrelevant. Therefore, as N → ∞, the distribution is localized about the mean, and the Gaussian distribution emerges as a universal fixed point of the RG evolution. The interpretation is that coarse graining the distribution by zooming out (equivalently taking a large number of steps) erases any detailed memory of the microscopic distribution p(x), beyond the gross features captured by its mean and standard deviation. This is in exact analogy with EFTs, where only a small number of parameters contribute significantly to low energy observables, regardless of the detailed UV completion.
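The irrelevance of the higher cumulants can be made concrete for the fixed-step walker. The sketch below is an illustration under stated assumptions rather than part of the original derivation: it computes the variance and the standardized fourth cumulant of $X$ from the exact binomial distribution, and the function name is hypothetical. For $\pm 1$ steps the fourth cumulant of a single step is $\langle x^4\rangle - 3\langle x^2\rangle^2 = -2$, so the standardized fourth cumulant of $X$ should fall off as $-2/N$, in line with the $N^{1-m/2}$ scaling for $m = 4$.

```python
import math

def standardized_cumulants(N):
    """Variance and standardized fourth cumulant of X = sum of N steps of +/-1."""
    probs = {2 * k - N: math.comb(N, k) / 2**N for k in range(N + 1)}
    m2 = sum(p * X**2 for X, p in probs.items())
    m4 = sum(p * X**4 for X, p in probs.items())
    kappa4 = m4 - 3 * m2**2  # fourth cumulant; the mean vanishes by symmetry
    return m2, kappa4 / m2**2

for N in (10, 100, 1000):
    var, k4 = standardized_cumulants(N)
    # var / N stays fixed (marginal), while k4 * N stays fixed (k4 itself is irrelevant)
    print(N, var / N, k4 * N)
```

The variance per step is independent of $N$, while the non-Gaussian correction decays like $1/N$, which is the random walk analog of an irrelevant operator.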

Large Deviation Principle
However, this is not the whole story. Returning to the examples introduced in Eq. (2.1), let us consider the probability $P(X > N)$ of finding a random walker at a distance farther than $N$ from the origin after $N$ steps. The walker taking fixed (unit) size steps has no hope of going beyond $N$ even if they were to take all $N$ steps in the same direction, and therefore $P_f(X > N) = 0$. However, the Gaussian walker, with the same mean and standard deviation, can have $P_g(X > N) \neq 0$. Evidently, some information about the microscopic distribution $p(x)$ is encoded in the region $X \sim N$, which we refer to as the tail of $P(X)$.
For both examples above, it is extremely unlikely that the walker makes it to $X \sim N$ in $N$ steps, which implies $P(X \sim N)$ is very small. In order to probe the tail, we need to devise an observable that would be sensitive to such rare events. To this end, we can compute $\langle e^{\theta X}\rangle$ assuming $P_g(X)$ with $\theta > 0$. Naively, we might think that $\langle e^{\theta X}\rangle \sim e^{\theta\, O(\sqrt{N})}$ since $P_g(X)$ is dominated by $X \lesssim \sqrt{N}$. However, we can perform the following computation:
$$\big\langle e^{\theta X} \big\rangle = \int dX\, P_g(X)\, e^{\theta X} = e^{N\theta^2/2} , \tag{2.16}$$
which in fact scales as $e^{N}$. The naive intuition breaks down because $e^{\theta X}$ takes on very large values with small probabilities, so that contributions from such values cannot be ignored [61]. In other words, $\langle e^{\theta X}\rangle$ probes the tail of $P(X)$. We now introduce a new random variable
$$\bar{X} = \frac{X}{N} = \frac{1}{N}\sum_{i=1}^{N} x_i , \tag{2.17}$$
called the sample mean of the i.i.d. random variables $x_i$. Noting that the distributions transform under a change of variables as $P(\bar{X})\, d\bar{X} = P(X)\, dX$, we may rewrite $P_g$ and $P_f$ for large $N$ as
$$P_g(\bar{X}) \asymp e^{-N I_g(\bar{X})} , \qquad P_f(\bar{X}) \asymp e^{-N I_f(\bar{X})} , \tag{2.18}$$
where $I$ is a positive quantity called the rate function that can be read off from Eq. (2.7) and Eq. (2.9):
$$I_g(\bar{X}) = \frac{\bar{X}^2}{2} , \tag{2.19a}$$
$$I_f(\bar{X}) = \frac{1+\bar{X}}{2}\ln(1+\bar{X}) + \frac{1-\bar{X}}{2}\ln(1-\bar{X}) . \tag{2.19b}$$
A probability distribution $P_N$ that satisfies a scaling law of the form $P_N \asymp e^{-N I}$ is said to obey the Large Deviation Principle (LDP). The distributions of the sample mean described by Eq. (2.18) and Eqs. (2.19) are examples of the LDP. According to Cramér's theorem, the distribution of a sample mean $\bar{X}$ of i.i.d. random variables satisfies an LDP with a rate function $I(\bar{X})$ given by
$$I(\bar{X}) = \sup_{\theta}\big[\theta \bar{X} - \lambda(\theta)\big] , \qquad \lambda(\theta) = \ln \big\langle e^{\theta x} \big\rangle . \tag{2.20}$$
Footnote 5: The CLT is more general; it is not necessary for the steps to be independent [60]. In that case Eq. (2.15) is still valid so long as the $m$th cumulant in Eq. (2.13) satisfies [...]
Figure caption: The probability distributions of the sample mean $\bar{X} \equiv X/N$ for a fixed step length and Gaussian random walk for $N = 10$, computed using the LDP. For values of $\bar{X}$ close to the mean both curves overlap, in accordance with the CLT. However, they differ for large deviations from the mean. In particular the tail of the distribution, shown magnified, reveals that $P_f(\bar{X})$ vanishes at $\bar{X} = 1$ whereas $P_g(\bar{X})$ does not. The dashed lines are the rate functions for each distribution.
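To see why this is plausible, let us assume that the LDP holds for some sample mean $\bar{X}$. Then, $P(\bar{X}) \asymp e^{-N I(\bar{X})}$ and
$$\big\langle e^{N\theta\bar{X}} \big\rangle = \int d\bar{X}\, P(\bar{X})\, e^{N\theta\bar{X}} \asymp \int d\bar{X}\, e^{N[\theta\bar{X} - I(\bar{X})]} \asymp e^{N \sup_{\bar{X}}[\theta\bar{X} - I(\bar{X})]} .$$
In the last step we have used the saddle point approximation to compute the integral for large $N$. Noting that $\langle e^{\theta X}\rangle = \langle e^{\theta x}\rangle^N \equiv e^{N\lambda(\theta)}$ for i.i.d. random variables, we have
$$\lambda(\theta) = \sup_{\bar{X}}\big[\theta\bar{X} - I(\bar{X})\big] . \tag{2.23}$$
Then the rate function given in Eq. (2.20) follows from a Legendre transform of this result.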
As an example, we can work out the rate function $I_f(\bar{X})$ for the sample mean of the fixed step length walk using Cramér's theorem. Starting with $\langle e^{\theta x}\rangle = \cosh(\theta)$, we have
$$I_f(\bar{X}) = \sup_{\theta}\big[\theta\bar{X} - \ln\cosh(\theta)\big] . \tag{2.24}$$
The supremum can be obtained using ordinary calculus: taking the first derivative of the expression in brackets and setting it to zero gives $\theta_{\max} = \tanh^{-1}(\bar{X}) = \tfrac{1}{2}\big(\ln(1+\bar{X}) - \ln(1-\bar{X})\big)$, which is where the r.h.s. is maximized. Substituting this $\theta_{\max}$ into Eq. (2.24) and simplifying, we recover the rate function given in Eq. (2.19b). Finally, notice that the rate function $I_f$ from Eq. (2.19b) is just the negative of the entropy for a binary random variable. We touch upon this fact in Sec. 4.1. A more general discussion is given in [58].
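The Legendre transform in Cramér's theorem can also be evaluated numerically, which is useful when no closed form is available. The sketch below is illustrative only: it approximates the supremum over $\theta$ by a brute-force scan on a uniform grid, and the function names, grid range, and grid size are assumptions rather than anything taken from the text.

```python
import math

def cramer_rate(xbar, log_mgf, theta_max=20.0, steps=200_001):
    """Approximate sup_theta [theta * xbar - lambda(theta)] on a uniform theta grid."""
    best = -math.inf
    for i in range(steps):
        theta = -theta_max + 2 * theta_max * i / (steps - 1)
        best = max(best, theta * xbar - log_mgf(theta))
    return best

def log_mgf_fixed(theta):
    """lambda(theta) = ln cosh(theta), written so that large |theta| does not overflow."""
    a = abs(theta)
    return a + math.log1p(math.exp(-2 * a)) - math.log(2)

def log_mgf_gauss(theta):
    """lambda(theta) for unit-variance Gaussian steps."""
    return 0.5 * theta**2

xbar = 0.5
closed_form = 0.5 * (1 + xbar) * math.log(1 + xbar) + 0.5 * (1 - xbar) * math.log(1 - xbar)
print(cramer_rate(xbar, log_mgf_fixed), closed_form)   # fixed-step walk
print(cramer_rate(xbar, log_mgf_gauss), xbar**2 / 2)   # Gaussian walk
```

A grid scan is used here for transparency; for a smooth convex $\lambda(\theta)$ one could instead solve $\lambda'(\theta) = \bar{X}$ with a root finder.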
The CLT arises from the LDP when the rate function $I(\bar{X})$ is convex and has a single global minimum (at, say, $\bar{X}_0$). If this is the case, we can Taylor expand $I(\bar{X})$ around $\bar{X}_0$ to obtain
$$I(\bar{X}) \simeq I(\bar{X}_0) + \frac{1}{2}\, I''(\bar{X}_0)\, (\bar{X} - \bar{X}_0)^2 + \dots ,$$
where the linear term vanishes because $\bar{X}_0$ is a minimum. For small deviations of $\bar{X}$ from $\bar{X}_0$, this quadratic expansion is a good approximation of $I(\bar{X})$, and therefore the CLT provides the same information as the LDP. On the other hand, large deviations of $\bar{X}$ are those values at which the rate function deviates significantly from the quadratic approximation. The CLT does not correctly describe such large fluctuations, and we need to rely on the LDP instead.
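To see how the scaling law emerges from the exact distribution, one can compare $-\tfrac{1}{N}\ln P_f$ with the closed-form rate function as $N$ grows. This is a minimal sketch under illustrative assumptions (the function names and the choice $\bar{X} = 0.5$ are not from the text); it uses the log-gamma function so that the factorials never overflow.

```python
import math

def rate_fixed(xbar):
    """Closed-form rate function of the +/-1 walk; valid for |xbar| < 1."""
    return 0.5 * (1 + xbar) * math.log1p(xbar) + 0.5 * (1 - xbar) * math.log1p(-xbar)

def empirical_rate(xbar, N):
    """-(1/N) ln P_f(X = xbar * N), evaluated with log-gamma to avoid overflow."""
    k = round(N * (1 + xbar) / 2)  # number of +1 steps
    log_p = (math.lgamma(N + 1) - math.lgamma(k + 1)
             - math.lgamma(N - k + 1) - N * math.log(2))
    return -log_p / N

xbar = 0.5
for N in (100, 10_000, 1_000_000):
    print(N, empirical_rate(xbar, N), rate_fixed(xbar), xbar**2 / 2)
```

The exact exponent converges to $I_f(\bar{X})$, which at $\bar{X} = 0.5$ already exceeds the quadratic (CLT) value $\bar{X}^2/2$; the difference is the part of the tail that the Gaussian approximation misses.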

Eternal Inflation
Having set up the general ideas of the LDP, we now turn to our first cosmological application. The connection derives from the fact that scalar fluctuations during single-field inflation act locally like a 1-dimensional random walk around a classical trajectory. For a typical path, the end of inflation is determined by the classical evolution where the field distance changes linearly in time, $\Delta\phi_{\rm classical} = \dot\phi\, t$. However, it is possible for quantum fluctuations of the scalar field to work against the classical motion, giving rise to inflationary periods that last significantly longer than the classical expectation. In fact, when the amplitude of fluctuations is large enough, it is known that inflation never ends everywhere [62,63] in the universe and instead gives rise to an infinite reheating volume [64], also known as eternal inflation. Remarkably, the fluctuations responsible for eternal inflation are necessarily examples of large deviations, as we will see in this section.

Review of Stochastic Inflation
The idea that inflation is essentially a random walk has a long history, starting from nearly the inception of the subject [21,23,26]. The intuition follows from considering the freeze out of modes as they cross the horizon, at which point the quantum fluctuations of these modes begin to evolve classically. In any small patch of the universe, the gradients of the field redshift away and the process is effectively a random walk. (Of course globally, there are correlations across super horizon scales, which is one of the main reasons we invoke inflation in the first place.) Specifically, as long as the parameters of the inflationary model change slowly in time, the fluctuations of each mode in any given patch of space follow the same distribution as a random walk with i.i.d. variables, which is given the interpretation of noise generated by the Hubble temperature associated with the horizon. The distribution of fluctuations is also sensitive to the presence of a potential. In the random walk language, this is the analog of a classical external force. So the picture is a competition between the noise and this so-called drift. This idea was formalized in the framework of Stochastic Inflation [20,21], which is the statement that the probability distribution for the scalar fluctuations obeys
$$\frac{\partial}{\partial t} P(\phi, t) = \frac{H^3}{8\pi^2}\, \frac{\partial^2}{\partial\phi^2} P(\phi, t) + \frac{1}{3H}\, \frac{\partial}{\partial\phi}\big[ V'(\phi)\, P(\phi, t) \big] . \tag{3.1}$$
Despite its intuitive appeal, the derivation of Stochastic Inflation from quantum field theory, a full understanding of its domain of applicability, and a framework for computing corrections to the formalism had long been elusive. These puzzles have recently been solved by interpreting Stochastic Inflation as arising from RG flow (or resumming logs) in quantum field theory [35,36,38,39,41-44]. Concretely, by taking moments of Eq. (3.1), one can relate mixing of operators under time-evolution to the stochastic equation. In single-field inflation, the fluctuations of $\phi$ can be rewritten in terms of the adiabatic metric fluctuation $\zeta$.
However, $\zeta$ must respect the single-field consistency conditions [65-68], which are the nonlinearly realized SO(4,1) symmetries that act on the metric leaving the gauge fixed. For example, under the dilatation transformation in this group, $\zeta$ transforms as $\delta\zeta = -1 - \vec{x}\cdot\partial_{\vec{x}}\zeta$. The evolution of operators under (dynamical) RG must respect these symmetries and restricts the form of mixing to [...] where $\mathfrak{t} = Ht$, and $\gamma_n$ are the "anomalous dimensions" which govern the composite operator mixing; the $\gamma_n$ are time-independent for scale invariant correlators. This implies the most general form of single field Stochastic Inflation is (footnote 9) [...] As discussed in [39], we can view this as the expansion of a general Markovian process with transition amplitudes $W(\zeta|\zeta')$, such that [...] Here we used the shift symmetry, $\zeta \to \zeta + c$, to write the transition amplitudes [...] In this sense, we see that $\gamma_{n>2}$ corresponds to non-Gaussian corrections to the transition amplitude, which is the same as the non-Gaussianity of the probability for a step in an i.i.d. random walk.
Footnote 9: In order to go from the dynamical RG for correlators to a Fokker-Planck equation for a probability distribution $P(\zeta, \mathfrak{t})$, one simply identifies $\langle \zeta^n \rangle = \int d\zeta\, P(\zeta, \mathfrak{t})\, \zeta^n$.
The coefficients $\gamma_n$ are determined by computing the $n$th connected quantum field theory correlator through [...] Explicit calculation shows that $\gamma_1 = 0$, which is a restatement of the conservation of $\zeta$ outside the horizon. The quadratic term $\gamma_2$ is determined by the variance [...] where $\Delta_\zeta = H^4/(2 f_\pi^4)$ sets the amplitude of the power spectrum for $\zeta$, and we evaluated this integral by introducing a hard UV cutoff $\Lambda = aH$ and an IR cutoff $K_{\rm IR}$ (footnote 10). Comparing Eq. (3.2) to Eq. (3.3), we see that this term generates the noise term that appeared in the original formulation of Stochastic Inflation, Eq. (3.1).
From the point of view of the quantum field theory correlators and the resultant RG evolution, computing higher order corrections is completely straightforward. Applying this approach to the EFT of inflation [70,71], the first non-trivial correction to the stochastic framework in single-field inflation was found in [46]. Since $\zeta$ is derivatively coupled, we can generically generate $\gamma_n$ by introducing an interaction of the form $\dot\zeta^n/\Lambda^{4-n}$, where $\Lambda = f_\pi c_s$ is the approximate UV cutoff (footnote 11) of the EFT of inflation when $c_s \ll 1$. Then by dimensional analysis, [...] (3.8) with $c_n = O(1)$. This leads to the naive expectation that perturbation theory should hold as long as one is working in the parameter space where $H \ll \Lambda$. As was emphasized in [46], this is only true when the observable of interest is insensitive to the tail of the probability distribution. In the language of the LDP, these tails are dominated by a new saddle point. The energy scale associated with the LDP saddle can be significantly larger than $\Lambda$, signaling that one is sensitive to the details of the UV completion, as in the case of the random walk examples studied above.
Footnote 10: While this is the correct result, one might be concerned that this hard cutoff breaks spacetime symmetry. For a discussion of a dimensional regularization-like regulator that preserves the symmetries, see [38,69].
Footnote 11: When $c_s \to 1$, there are additional factors of $(1 - c_s^2)$ so that $\Lambda \to \infty$ as $c_s \to 1$ in slow-roll models.

Central Limit Theorem as a Resummation
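We would now like to solve for the time evolution of $P(\zeta, \mathfrak{t})$, with $\mathfrak{t} = Ht$, assuming $\zeta = \zeta_0$ at $\mathfrak{t} = 0$. This will tell us the probability of different possible values of $\zeta$, which should resemble a random walk. If the theory is Gaussian, so that $\gamma_{n>2} = 0$, then we are solving the heat equation
$$\frac{\partial}{\partial\mathfrak{t}} P(\zeta, \mathfrak{t}) = \frac{\sigma^2}{2}\, \frac{\partial^2}{\partial\zeta^2} P(\zeta, \mathfrak{t}) ,$$
where $\sigma^2 \equiv \gamma_2 = \Delta_\zeta/(4\pi^2)$ is the variance. The solution to this equation is a Gaussian
$$P_G(\zeta, \mathfrak{t}; \zeta_0) = \frac{1}{\sqrt{2\pi\sigma^2\mathfrak{t}}}\, \exp\Big[ -\frac{(\zeta - \zeta_0)^2}{2\sigma^2\mathfrak{t}} \Big] .$$
Using this general form, we will show that the physics of the random walk is reproduced by the solutions to this equation. First, let us consider the behavior around the peak of the Gaussian solution where $(\zeta - \zeta_0)^2 \lesssim \sigma^2\mathfrak{t}$. If we expand the full solution near the peak, we notice that as $\mathfrak{t} \to \infty$, [...] If we associate $\gamma_n$ with the $n$th cumulant of a random walk, and $\mathfrak{t} \to N$ is the number of steps, then the suppression of these terms precisely matches our expectations from the CLT as $\mathfrak{t} \to \infty$, see Eq. (2.14).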
Note that this implies that $\zeta \sim 1$ is under control for suitably large $\mathfrak{t}$. In contrast, perturbative calculations of the probability distribution using the Edgeworth series [...] where $\vec{k}_3 = -\vec{k}_1 - \vec{k}_2$, break down for much smaller values of $\zeta$. This shows how Stochastic Inflation improves the behavior of perturbation theory by resumming the individual modes into a single random walk.

Large Deviations and the EFT of Inflation
Now let us consider the tail of the $P(\zeta, \mathfrak{t})$ distribution, where $\zeta = \alpha\mathfrak{t}$ for some constant $\alpha$ in the limit $\mathfrak{t} \to \infty$. The region $\alpha \geq 1$ corresponds to the regime of eternal inflation, as the random fluctuations conspire to prevent the end of inflation, even in the $\mathfrak{t} \to \infty$ limit. The transition at $\alpha = 1$ is where the quantum fluctuations exactly cancel the classical evolution of the background field. Note that because the distance is linear in $\mathfrak{t}$, rather than $\sqrt{\mathfrak{t}}$, we are considering a large deviation for the probability distribution of $\zeta$. It is straightforward to see that for these large deviations the CLT fails to calculate the dominant contribution to the tail, just as it did for the i.i.d. random walk (see Fig. 2). Plugging $\zeta = \alpha\mathfrak{t}$ into Eq. (3.12), we have
$$\frac{\gamma_n \mathfrak{t}}{n!}\, \frac{\partial^n}{\partial\zeta^n} P_G(\zeta, \mathfrak{t}; \zeta_0) = O\big(\gamma_n\, \mathfrak{t}\, \alpha^n \sigma^{-2n}\big)\, P_G(\zeta, \mathfrak{t}; \zeta_0) . \tag{3.14}$$
For $\alpha = O(1)$, there is no suppression of the higher-order terms. Concretely, the entire series in $\gamma_n$ will break down for sufficiently large $\alpha$. For $\alpha = 1$, this series will break down when $\Lambda < f_\pi$, even though this parameter space is consistent with the condition that the EFT of inflation is weakly coupled at horizon crossing, $\Lambda > H$ [46].
The breakdown of Stochastic Inflation is a precise reflection of what we found in our analysis of large deviations for random walks. To see this more clearly, we can write the solution to Stochastic Inflation in terms of the Fourier transform of Eq. (3.11) [...] We recognize this as precisely the result for a random walk we described above, see Eq. (2.13). At the same time, we can identify [...] so that when $\zeta = \mathfrak{t}\alpha$, we have $\rho(k, \mathfrak{t}) \to e^{\mathfrak{t}\lambda(\theta = ik)}$. Assuming $\mathfrak{t} \gg 1$, we can calculate the integral over $k$ in Eq. (3.15) using the method of steepest descents: [...] where [...] The integrand has been expanded around $k = k_\star(\alpha) + \delta k$ defined by a (complex) value of $k$ that is an extremum of the argument of the exponential, [...] We see that using the method of steepest descents to calculate the inverse Fourier transform is equivalent to using Cramér's theorem, Eq. (2.20). Furthermore, for large $\alpha$ the Gaussian solution, $k = (-i)\alpha/\sigma^2$, is far from the true saddle, as all the terms in the $k^n\gamma_n$ expansion will become equally important. This is, of course, the Fourier transform of the result in Eq. (3.14).
When we applied the LDP to random walks in Sec. 2.2, it was clear that we become sensitive to the microphysics. We would like to understand this breakdown purely in terms of the EFT of Inflation. Concretely, the expansion in $\gamma_n$ is under control at horizon crossing, which is the physical energy scale where the fluctuations are produced. A natural guess is that $\zeta \propto \mathfrak{t}$ behaves like a classical solution with $\dot\zeta = H$ or $\dot\phi = f_\pi^2$, where $\phi$ is the inflaton. To make sense of this, we can rewrite the evolution of $\zeta$ in terms of a Langevin equation,
$$\frac{d}{d\mathfrak{t}}\zeta = \xi(\mathfrak{t}) ,$$
where $\xi(\mathfrak{t})$ is a random variable that models a noise source. Assuming that the noise is Gaussian, we have
$$\langle \xi(\mathfrak{t})\, \xi(\mathfrak{t}') \rangle = \sigma^2\, \delta(\mathfrak{t} - \mathfrak{t}') .$$
The probability of finding $\zeta = \zeta_f$ at $\mathfrak{t} = \mathfrak{t}_f$ given the initial condition $\zeta(\mathfrak{t} = 0) = 0$ is then
$$P(\zeta_f) = \int \mathcal{D}\zeta\, \exp\Big[ -\frac{1}{2\sigma^2} \int_0^{\mathfrak{t}_f} d\mathfrak{t}\, \Big(\frac{d\zeta}{d\mathfrak{t}}\Big)^2 \Big] .$$
For large deviations, $\zeta = \alpha\mathfrak{t}$, and the probability is determined by the saddle point $\frac{d^2}{d\mathfrak{t}^2}\zeta = 0$ or $\frac{d}{d\mathfrak{t}}\zeta = \alpha$, so that $P(\zeta_f) \asymp \exp\big(-\mathfrak{t}\, \alpha^2/(2\sigma^2)\big)$. The key observation is that the probability of this large deviation is determined by a classical solution where $\frac{d}{d\mathfrak{t}}\zeta = \alpha$ (see Fig. 1). Translating this into the canonically normalized field, $\zeta_c = \zeta f_\pi^2/H$, this is the condition that
$$\dot\zeta_c = \alpha f_\pi^2 .$$
The EFT of inflation is defined in terms of an expansion in $\dot\zeta_c/\Lambda^2$ and therefore when $\dot\zeta_c > \Lambda^2$, we cannot define these classical solutions within the EFT. Concretely, we can modify the Langevin equation with nonlinear terms [...] where $c_n = O(1)$ by the definition of $\gamma_n$ in terms of $\Lambda$ in Eq. (3.8). We can now calculate the probability distribution as before, [...] The saddle point is still $\ddot\zeta_c = 0$, but we can see that the probability distribution becomes ill-defined when $\alpha f_\pi^2 > \Lambda^2$. It is also noteworthy that the breakdown of EFT in this specific example is not associated with the breakdown of Markovian dynamics, as higher time derivatives vanish around the classical solution (see the discussion in Sec. 4.2). For single field inflation, it is a breakdown of the EFT of Inflation itself, rather than SdSET, that is responsible for the ill-defined probability distribution for sufficiently large deviations.
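For the purely Gaussian case, the large deviation exponent $\alpha^2/(2\sigma^2)$ can be checked against the exact tail probability. The sketch below is an illustration, assuming free Gaussian noise so that $\zeta(\mathfrak{t})$ is normally distributed with variance $\sigma^2\mathfrak{t}$; the function name and parameter values are hypothetical.

```python
import math

def tail_exponent(alpha, sigma, t):
    """-(1/t) ln P(zeta >= alpha * t) for zeta ~ Normal(0, sigma^2 * t)."""
    p = 0.5 * math.erfc(alpha * math.sqrt(t) / (sigma * math.sqrt(2)))
    return -math.log(p) / t

alpha, sigma = 1.0, 1.0  # illustrative values
for t in (10, 100, 1000):
    print(t, tail_exponent(alpha, sigma, t), alpha**2 / (2 * sigma**2))
```

The exponent approaches $\alpha^2/(2\sigma^2)$ from above, with a correction of order $\ln(\mathfrak{t})/\mathfrak{t}$ coming from the prefactor, so the probability decays exponentially in $\mathfrak{t}$ as expected for an LDP. The non-Gaussian corrections discussed in the text are not included in this check.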

Random Walks with External Forces
In the previous sections, we focused on the application of the LDP to random walks with no external deterministic forces. We argued that the late time behavior for the typical fluctuations of these systems could be determined by RG evolution. This analysis yielded the CLT, such that the resultant probability distribution was a Gaussian with zero mean.
In this section, we will study the physics of a random walk that is driven by an external deterministic force. This is easiest to understand in the case of a constant force, which is equivalent to an i.i.d. random walk where the average over steps is non-zero, $\langle x\rangle \neq 0$. Then from Eq. (2.15), we see that the term proportional to the non-zero mean scales as $\sqrt{N}$, and so this term grows as we take $N \to \infty$. In the RG language, this implies that an external deterministic force has the effect of introducing a relevant deformation into the theory.
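The $\sqrt{N}$ growth of the drift relative to the noise can be illustrated with a biased fixed-step walk. This is a minimal sketch with hypothetical names and parameter values; it uses only the exact mean and variance of a $\pm 1$ step taken to the right with probability `p_right`.

```python
import math

def drift_to_noise(N, p_right):
    """Mean displacement divided by its standard deviation for a biased +/-1 walk."""
    mean_step = 2 * p_right - 1     # <x>
    var_step = 1 - mean_step**2     # <x^2> - <x>^2, since x^2 = 1
    return N * mean_step / math.sqrt(N * var_step)

for N in (100, 400, 1600):
    print(N, drift_to_noise(N, 0.55))
```

Quadrupling $N$ doubles the ratio, so even a small bias eventually dominates over the typical fluctuations, which is the sense in which the force is a relevant deformation.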

Equilibrium Distributions
In the presence of confining forces, such as a potential with a local minimum, we might expect to see ergodic behavior, such that the probability for being at a given location at a fixed time approaches a time-independent (equilibrium) distribution. Such behavior is also consistent with our expectations from thermodynamics for large numbers of confined particles. In fact, it turns out that the equilibrium distribution for this thermodynamic system is itself a quantity that is calculable using the LDP. If we imagine that a walk of length $N$ reaches equilibrium, then the probability of finding the particle at location $y$ during the walk at a sufficiently large number of steps, $1 \ll n \leq N$, should simply follow the equilibrium distribution, $P(x_n = y) = P_{\rm eq}(y)$. The generating function for the walk is then controlled by a function $\lambda(\theta)$ that is determined from the equilibrium distribution for a single step in the walk. This is a qualitative argument that can be formalized in terms of the eigenvalues of the transition amplitudes. The rate function for the walk $X = \sum_{i=1}^{N} x_i$ again follows from Eq. (2.23) and Eq. (2.17). Importantly, in the limit $N \to \infty$, the probability for any average quantity, $S = \sum_{i=1}^{n} f(x_i)$, is just the $N$th power of the probability of finding one particle with $f(x) = S/N$. The appearance of an equilibrium quantity in the LDP calculation is not a special feature of random walks, but is common to most statistical mechanics problems [58]. In a precise sense, rate functions of the LDP are proportional to the thermodynamic free energies for large numbers of particles. The overall power of $N$ in the probability is just the familiar relationship between extensive and intensive thermodynamic quantities. In fact, this connection was already present when we calculated the rate function for the discrete walk in Eq. (2.19b), which is (minus) the entropy associated with a binary random variable.
This perspective helps explain why the LDP can be used to calculate the equilibrium probability distribution. This can be made concrete$^{12}$ in terms of a Langevin equation, $\frac{d}{dt}x(t) = f\big(x(t), t\big) + \xi(t)$ (4.4), where again $\xi(t)$ is a random variable that accounts for noise, and now $f\big(x(t), t\big)$ is an external deterministic force. If we assume the noise is Gaussian, then $\xi(t)$ obeys $\langle \xi(t)\xi(t') \rangle = \sigma^2 \delta(t - t')$ (4.5). In this case, the probability of a walk $x(t)$ is $P[x(t)] \propto \exp\left[-\frac{1}{2\sigma^2}\int dt\, \xi(t)^2\right] = \exp\left[-\frac{1}{2\sigma^2}\int dt\, \big(\dot x - f(x, t)\big)^2\right]$, where we used the equations of motion given in Eq. (4.4) in the second expression. One can confirm the first expression by taking functional derivatives with respect to $\xi(t)$ to reproduce the two-point correlator in Eq. (4.5). For making future contact with cosmology, we will assume the external force is due to a potential such that $f(x) = -V'(x)$, where $V' \equiv \partial_x V(x)$. Now suppose we are given $x(0) = 0$, and we want to integrate over all possible paths to find the probability that $x(T) = L$. We can solve this problem using the method of steepest descents. First, we must find the maximum likelihood path, which is the same as finding the classical saddle for the 'effective action' $S = \int_0^T dt \left[\frac{1}{2}\big(\dot x^2 + V'(x)^2\big) + \dot x\, V'(x)\right]$ (4.8), where the last term is a total derivative. From the equations of motion, we have $\ddot x = V'(x) V''(x)$ (4.9). Using this result, we can write $\dot x^2 = V'(x)^2$ along the saddle in the equilibrium limit. Including the total-derivative term from Eq. (4.8), the probability of finding $x(T) = L$ is given in terms of $\log P(L) \simeq -2\big[V(L) - V(0)\big]/\sigma^2$. In the same sense as for the path integral, it is the paths near the classical solution that yield the dominant probability, while the contributions from the fluctuations about the classical path determine the sub-leading terms in the expansion with respect to $\sigma$. This result matches the equilibrium probability distribution we derive from the Fokker-Planck equation by setting $dP/dt = 0$: $\frac{\sigma^2}{2}\partial_x^2 P + \partial_x\big(V' P\big) = 0$. Integrating twice with respect to $x$ gives $\log P(L) = -2V(L)/\sigma^2 + \text{const}$ (4.14). We see that the LDP reproduces the equilibrium distribution that is predicted by the Fokker-Planck equation.
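The agreement between the saddle-point calculation and the Fokker-Planck equilibrium can be verified numerically. The sketch below assumes the conventions $\dot x = -V'(x) + \xi$ with $\langle \xi(t)\xi(t') \rangle = \sigma^2\delta(t - t')$, for which the stationary distribution is $\log P = -2V/\sigma^2$ up to a constant. The potential and all function names are illustrative choices. It checks that the probability current of the candidate equilibrium vanishes, and that the action on the path $\dot x = V'$ equals $V(L) - V(0)$.

```python
import math

def potential(x):
    """Illustrative confining potential (an assumption, not from the text)."""
    return 0.25 * x**4 + 0.5 * x**2

def potential_prime(x):
    return x**3 + x

def log_p_eq(x, sigma):
    """Candidate equilibrium distribution, log P = -2 V / sigma^2."""
    return -2.0 * potential(x) / sigma**2

def probability_current(x, sigma, h=1e-5):
    """J = -(sigma^2 / 2) dP/dx - V'(x) P, which must vanish in equilibrium."""
    p = math.exp(log_p_eq(x, sigma))
    dp = (math.exp(log_p_eq(x + h, sigma)) - math.exp(log_p_eq(x - h, sigma))) / (2 * h)
    return -0.5 * sigma**2 * dp - potential_prime(x) * p

def saddle_action(x_start, x_end, n=20000):
    """int dt (xdot^2 + V'^2)/2 on the path xdot = V', rewritten as int V' dx (midpoint rule)."""
    dx = (x_end - x_start) / n
    return sum(potential_prime(x_start + (i + 0.5) * dx) for i in range(n)) * dx

current = probability_current(0.7, sigma=1.0)
action = saddle_action(0.0, 1.5)
print(current, action, potential(1.5) - potential(0.0))
```

Adding the total-derivative contribution, which is also $V(L) - V(0)$, gives the exponent $2[V(L) - V(0)]/\sigma^2$.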

Markovian Evolution
Within this framework, we can also easily understand the role of Markovian evolution when assessing the validity of the calculation of the equilibrium solution. Markovian refers to a class of theories where the next time step is fully determined by the state of the system at the previous time step. In terms of differential equations, Markovian evolution is therefore equivalent to the statement that we have a first-order (in time) equation of motion. If there were higher derivatives, then one would need to know about the state of the velocity field along with the state of the system itself to determine the next step in the evolution [72]. We can therefore model non-Markovian evolution by adding a small acceleration term to our equations of motion given in Eq. (4.4): $\vartheta\, \ddot x + \dot x = -V'(x) + \xi(t)$ (4.15), so that the dynamics are Markovian in the $\vartheta \to 0$ limit. Repeating our previous calculation, we find an effective action, Eq. (4.16), such that $P(L) = \exp\left[-I(L)/\sigma^2\right]$. There are two new non-Markovian terms that contribute to the action, which are small corrections when the conditions in Eqs. (4.17a) and (4.17b) hold, where the first and second lines correspond to the first and second $\vartheta$-dependent terms in Eq. (4.16), and the equalities in Eq. (4.17a) are due to Eq. (4.9). Notice that both terms remain small when we impose the condition $\vartheta V'' \ll 1$ everywhere along the path. If we enforce Eqs. (4.17), then the effective action is first order in time, and hence the evolution is Markovian.

Light Scalar Fields in de Sitter
Light scalar fields in de Sitter with non-trivial potentials present an additional complication beyond single-field inflation. The stochastic framework applied to these models is known to give rise to a non-Gaussian equilibrium probability distribution, acting as a kind of non-trivial fixed point of the dynamical RG. This presents a vastly different situation, compared to single-field inflation, where interactions are negligible for typical fluctuations due to the CLT. This section will show how the dynamics of these models can be mapped onto the language of random walks with external forces that we developed in the previous section.

Effective Potentials and Markovian Dynamics
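Stochastic Inflation provides a compelling framework with which to understand the dynamics of light scalar fields in dS. In its original form, it describes the probability distribution for an interacting scalar in dS via the Fokker-Planck equation $\frac{\partial}{\partial t}P(\phi, t) = \frac{H^3}{8\pi^2}\frac{\partial^2}{\partial\phi^2}P(\phi, t) + \frac{1}{3H}\frac{\partial}{\partial\phi}\big[V'(\phi)P(\phi, t)\big]$ (5.1). The equilibrium probability distribution is given by $P_{\rm eq}(\phi) = C \exp\left[-8\pi^2 V(\phi)/(3H^4)\right]$ (5.2). If we take $V(\phi) = \lambda\phi^4/4!$, the most likely field values are $|\phi| \lesssim H\lambda^{-1/4}$. In light of the discussion of non-Gaussian noise in Sec. 3, one would naturally wonder about the regime of validity of this formula and the corrections to it.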
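The size of typical fluctuations for the quartic potential can be made quantitative. The sketch below assumes the standard equilibrium distribution of Stochastic Inflation, $P_{\rm eq} \propto \exp[-8\pi^2 V/(3H^4)]$, with $V = \lambda\phi^4/4!$, and evaluates $\langle\phi^2\rangle$ in closed form using Gamma functions. The function name and the sample couplings are illustrative.

```python
import math

def mean_phi_squared(lam, hubble=1.0):
    """<phi^2> for P_eq ∝ exp(-a phi^4), with a = pi^2 lam / (9 H^4).

    Uses int_0^inf phi^(s-1) exp(-a phi^4) dphi = Gamma(s/4) / (4 a^(s/4)).
    """
    a = math.pi**2 * lam / (9.0 * hubble**4)
    return math.gamma(0.75) / (math.gamma(0.25) * math.sqrt(a))

for lam in (1e-2, 1e-4):
    # The combination <phi^2> * sqrt(lam) / H^2 is independent of lam.
    print(lam, mean_phi_squared(lam), mean_phi_squared(lam) * math.sqrt(lam))
```

Under these assumptions $\langle\phi^2\rangle \approx 0.32\, H^2/\sqrt{\lambda}$, so the typical field value scales as $H\lambda^{-1/4}$.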
It is useful to discuss Stochastic Inflation and its corrections in the context of SdSET. The relationship between the variables of SdSET and scalar field theory in dS can be understood from the SdSET ansatz for a free massless scalar, Eq. (5.3). There the scalar is rewritten in terms of the two power-law solutions as $k \to 0$, where $\varphi_+$ is the constant (or growing) mode and $\varphi_-$ is the decaying mode. In SdSET, Stochastic Inflation is a consequence of Callan-Symanzik-like equations for the dynamical RG of the $\varphi_+^n$ operators. This information can be rewritten as a master equation for the probability distribution $P(\varphi_+, t)$, which at lowest order reproduces Eq. (5.1), while also containing an infinite series of corrections. For the UV example of a massless scalar with $V(\phi) = \lambda\phi^4/4!$, the leading corrections (as defined below) were calculated in [39], resulting in Eq. (5.5) with the effective potential of Eq. (5.6), where a redefinition was used to remove the scheme-dependent corrections $b_1 = O(\lambda)$ and $b_2 = O(\lambda^2)$. First, let us establish in what sense these are small corrections to the original Fokker-Planck equation. If we ignore the corrections to the evolution Eq. (5.5) and the potential Eq. (5.6), so that $V_{\rm eff}(\varphi_+) = \lambda\varphi_+^4/4!$, then the equilibrium solution is the same as in Eq. (4.14), after substituting $\sigma^2 = \gamma_2 = (4\pi^2)^{-1}$ and $V \to V_{\rm eff}/3$ to match Eq. (5.5). Notice that the typical fluctuations reside in the region $|\varphi_+| \lesssim \lambda^{-1/4}$. We can determine the scaling behavior of the solutions using $\varphi_+ \sim \lambda^{-1/4}$ such that the corrections to $V_{\rm eff}(\varphi_+)$ are $O(\lambda_{\rm eff}^{1/2})$ and $O(\lambda_{\rm eff})$, which we will call next-to-leading order (NLO) and next-to-next-to-leading order (NNLO), respectively. The cubic-derivative term, on the second line of Eq. (5.5), is similarly NNLO. By the same $\lambda$-scaling argument, the equilibrium solution can be written as $P_{\rm eq} = C\, P_{\rm LO}(\varphi_+)\, P_{\rm NLO}(\varphi_+)\, P_{\rm NNLO}(\varphi_+)$, with the individual factors given in Eq. (5.9). The terms in $P_{\rm NNLO}$ come with different powers of $\varphi_+$ but have the same $\lambda_{\rm eff}$ counting for typical fluctuations. Importantly, the second term, which is $O(\varphi_+^8)$, receives contributions from both the change to $V_{\rm eff}(\varphi_+)$ and the higher-derivative term.
Given that our UV theory only has a marginal coupling, $\lambda\phi^4$, it is not obvious that there should be a breakdown of the stochastic framework akin to what happened for the EFT of Inflation in Sec. 3. Furthermore, we saw in Sec. 4 that the equilibrium distribution is itself a result of the LDP, and therefore it is not a given that the framework could break down.
However, as we saw in Sec. 4.2, a critical assumption for the validity of the stochastic framework is that the evolution is Markovian. Therefore, the stochastic description can fail when the acceleration terms become important. The condition for non-Markovian terms to be negligible was given in Eqs. (4.17) above. In the language of SdSET, non-Markovian evolution would arise from nontrivial mixing between $\varphi_+$ and $\varphi_-$, defined in Eq. (5.3).
Evaluating these conditions requires that we identify the parameter $\vartheta$. For a light scalar field $\phi$ in dS, the equations of motion in the limit that $k \to 0$ are $\ddot\phi + 3H\dot\phi + V'(\phi) = 0$ (5.10). In terms of the dimensionless time $Ht$, we would therefore$^{13}$ expect $\vartheta \simeq 1/3$. Assuming that $\vartheta^{-1} = O(1)$, Eq. (4.17b) implies that we should worry that the evolution becomes non-Markovian when $\varphi_+ \gtrsim \lambda^{-1/2}$. Using the explicit form of the corrections provided in Eq. (5.9), we see that this is precisely where our expansion in powers of $\lambda$ breaks down. Similarly, this is the scale where the infinite series of corrections to the effective potential in Eq. (5.6) become equally important. It is therefore natural to conclude that the breakdown in our perturbative expansion in $\lambda$ when $\varphi_+ > \lambda^{-1/2}$ is due to the failure of the Markovian assumption.
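The parametric separation between typical fluctuations and the onset of non-Markovian behavior can be illustrated with a few lines of arithmetic. The sketch below assumes $V = \lambda\varphi^4/4!$ and $\vartheta = 1/3$, and defines the breakdown scale by $\vartheta V''(\varphi) = 1$. The order-one factors depend on normalization conventions and should not be taken literally; only the scaling with $\lambda$ is meaningful. Function names are illustrative.

```python
import math

def markov_breakdown_field(lam, theta=1.0 / 3.0):
    """Field value where theta * V''(phi) = 1 for V = lam phi^4 / 4!, i.e. V'' = lam phi^2 / 2."""
    return math.sqrt(2.0 / (theta * lam))

def typical_field(lam):
    """Parametric size of typical equilibrium fluctuations, phi ~ lam^(-1/4)."""
    return lam ** -0.25

for lam in (1e-2, 1e-4, 1e-6):
    ratio = markov_breakdown_field(lam) / typical_field(lam)
    print(lam, typical_field(lam), markov_breakdown_field(lam), ratio)
```

The ratio grows as $\lambda^{-1/4}$, so at weak coupling the Markovian description fails only far out on the tail of the equilibrium distribution.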

Light Scalars in de Sitter with Derivative Interactions
We argued in Sec. 3.3 that the breakdown of Stochastic Inflation for large fluctuations in single-field inflation was associated with the breakdown of the EFT of inflation. In this section, we will explore whether a similar breakdown occurs for the equilibrium distribution of a scalar field $\phi$ described by an EFT with higher-derivative interactions in addition to a potential. We take the action for our scalar to be an EFT that includes arbitrarily high powers of derivatives$^{14}$, of the form $\mathcal{L} \supset \sum_n \frac{y_n}{\Lambda^{4(n-1)}} (\partial_\mu\phi\, \partial^\mu\phi)^n + \ldots$ (5.13), where $\Lambda$ is the UV cutoff of the EFT and the $\ldots$ include operators with more than one derivative per field. This description is under control in de Sitter when $\Lambda \gg H$. We will again take $V(\phi) = \lambda\phi^4/4!$ such that $\phi$ is massless and its growing mode $\varphi_+$ will evolve at zeroth order in $y_n$ according to the $\lambda$-corrected equations of Stochastic Inflation given in Eqs. (5.5) and (5.6). The impact of the higher-derivative couplings $y_n$ on Stochastic Inflation is nearly identical to the corrections in single-field inflation. The leading corrections in $y_n$ survive the $\lambda \to 0$ limit and therefore can be determined independently of the potential. In this limit, $\phi$ has a shift symmetry $\phi \to \phi + c$. Repeating the argument used for single-field inflation in Sec. 3, one finds the corrections of Eq. (5.14), where $\gamma_n \propto y_n (H/\Lambda)^{4(n-1)}$ (5.15) at leading order in $y_n$. The presence of higher-derivative terms in the scalar EFT introduces an infinite series of derivatives in the effective Fokker-Planck equation. For this to be under control, we expect that the equilibrium solution with $\gamma_{n>2} = 0$ should be corrected by an expansion in powers of a small parameter. If we write $P(\varphi_+, t) = P^{\rm eq}_{\rm LO}(\varphi_+)\, Q(\varphi_+, t)$, then in the limit $\varphi_+ \gg \lambda^{-1/4}$, Eq. (5.14) becomes a series of corrections acting on $Q$, where we used Eq. (5.2) for $P^{\rm eq}_{\rm LO}(\varphi_+)$. For $y_n = O(1)$, this series is under control when successive terms are parametrically suppressed. Taking $V_{\rm eff} \simeq \lambda\varphi_+^4$, this tells us that the equilibrium solution is under control for $H\varphi_+ \ll \Lambda^2/(\lambda H)$, which is parametrically larger than $\Lambda$. We can make sense of the regime of validity of this result using Eq. (4.9) to relate $\dot\varphi_+$ and $V'_{\rm eff}$ along the classical trajectory. Now using $\phi \simeq H\varphi_+$ and the dimensionless time $Ht$, we can rewrite this condition for the expansion to be under control as a bound on $\dot\phi$ relative to $\Lambda^2$. We see that the breakdown is precisely where we would expect from the derivative expansion of the microscopic theory. Critically, the derivation of this breakdown required knowledge of the LDP to see that the equilibrium distribution could be derived from a classical saddle, in the regime where $\dot\varphi_+$ was larger than allowed by the UV cutoff of the EFT of inflation.

Models of Primordial Black Hole Generation
The impact of rare fluctuations is particularly important for models of primordial black hole (PBH) generation. The PBHs are formed from order-one fluctuations directly in the primordial distribution and, as such, the fluctuations are exponentially unlikely for scale-invariant Gaussian random fields. Models for the generation of PBHs therefore exploit breakdowns of both scale-invariance and Gaussianity, see e.g. [76, 77] for recent reviews.
The typical approach to estimating the abundance of PBHs follows from the critical collapse model [76]. In the conventional description, one takes a smoothed density field, $\delta_R(\vec x)$, defined in Eq. (6.1), where $W(k)$ is a filter that removes power on small scales, $kR \gg 1$, and $R$ is some distance scale. The critical collapse model assumes that any region where $\delta_R > \delta_{\rm cr}$, for some constant threshold $\delta_{\rm cr} = O(1)$, will form a collapsed object with a total mass determined by the size of the region $R$.
Within this framework, the abundance of primordial black holes is determined by the probability of finding $\delta_R > \delta_{\rm cr} = O(1)$, where the precise value of $\delta_{\rm cr}$ is model-dependent. This threshold can also be written as a critical value of $\zeta_R(\vec x)$ [78, 79], $\zeta_{\rm cr}$, defined as in Eq. (6.1) such that the probability of finding $\zeta_R(\vec x) > \zeta_{\rm cr}$ determines the production of PBHs. For concreteness, a value of $\zeta_{\rm cr} = 0.1$ to $0.2$ arises in some analytic collapse models in radiation domination [79]. In comparison with the LDP, note that the relevant time-scale for the random walk is the number of e-folds of inflation after horizon crossing for a mode with $kR \sim 1$, or $N_e(R) = \log\big[R\, a(t_{\rm end}) H\big]$, where $t_{\rm end}$ is the time when inflation ends. For scales with $N_e(R) = O(10)$, fluctuations above the $\zeta_{\rm cr}$ threshold would correspond to $\alpha \gtrsim 10^{-2}$ using the parameterization of large deviations described in Sec. 3.3. In models of inflation consistent with observations, these values of $\alpha$ may still lie outside the domain of the EFT of inflation.
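The estimate of $\alpha$ follows from dividing the threshold by the number of e-folds, $\alpha = \zeta_{\rm cr}/N_e$. The sketch below makes this arithmetic explicit and also evaluates the Gaussian large-deviation exponent $N_e\,\alpha^2/(2\sigma^2)$. The per-e-fold variance `sigma_sq` is set to $2.1 \times 10^{-9}$, the approximate measured amplitude of scalar fluctuations. Its use as the step variance here is an illustrative assumption and is not a value taken from the text.

```python
def large_deviation_alpha(zeta_cr, n_efolds):
    """Large-deviation parameter alpha = zeta_cr / N_e."""
    return zeta_cr / n_efolds

def gaussian_tail_exponent(zeta_cr, n_efolds, sigma_sq):
    """Exponent N_e * alpha^2 / (2 sigma^2) of the Gaussian tail, P ~ exp(-exponent)."""
    alpha = large_deviation_alpha(zeta_cr, n_efolds)
    return n_efolds * alpha**2 / (2.0 * sigma_sq)

# Illustrative (assumed) inputs: zeta_cr = 0.1, N_e = 10, sigma^2 = 2.1e-9 per e-fold.
alpha = large_deviation_alpha(0.1, 10.0)
exponent = gaussian_tail_exponent(0.1, 10.0, 2.1e-9)
print(alpha, exponent)
```

With these inputs the Gaussian exponent is of order $10^5$, which illustrates why a purely Gaussian, scale-invariant distribution produces a negligible abundance of PBHs.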
Non-Gaussian tails arise in a variety of contexts, including single- and multi-field inflation. In light of the connection between tails of distributions and the LDP explored in this paper, we would like to understand when such large non-Gaussian contributions can be calculated reliably given only an effective description at horizon crossing. We will argue that framing these questions in the language of RG for a random walk provides useful intuition.

Non-Gaussian Tails
We have explained how the stochastic approach to inflation translates the problem of finding the distribution of scalar fluctuations onto characterizing the behavior of a random walk. The CLT tells us that the Gaussian probability distribution is a fixed point of the conventional random walk. Just as with RG flows in quantum field theories, we can classify the deformations that could produce a non-Gaussian tail, in analogy with Sec. 2.1, into three types: relevant, marginal, or irrelevant.
Relevant: A non-zero mean, e.g. due to a deterministic force, takes us away from the Gaussian fixed point of the CLT. For inflation, this corresponds to a potential $V(\phi)$ such that the equilibrium probability distribution takes the form of Eq. (4.14). If the potential includes any operators other than a mass term, this distribution is non-Gaussian. Yet, since it is due to the presence of the unique relevant deformation, a large deviation from Gaussianity does not indicate a breakdown of the effective description.
In practice, non-trivial production rates for PBHs require some more complicated and possibly non-analytic potential V (φ). Some UV models may motivate particular nonperturbative shapes for V (φ), but in practice the formation of PBHs has mostly been explored using phenomenological models for the inflationary potential, see e.g. [80][81][82].
Marginal: A marginal deformation of a random walk corresponds to changing the covariance matrix that governs the steps in the walk. In the context of inflation, this means changing the amplitude of scalar fluctuations, $P_\zeta(k)$, or mixing the inflaton with additional fields. The former is a common strategy but only leads to enhanced Gaussian tails. Mixing with additional fields can give rise to non-Gaussian tails in a variety of ways.
Canonical examples of models that use mixing are the curvaton or modulated reheating scenarios, where the late-time adiabatic mode is determined by a spectator field $\chi$, so that $\zeta = F(\chi)$ (6.3) for some model-dependent function $F$. As a concrete example, suppose our spectator field has a potential $V(\chi) = \lambda\chi^4/4!$ so that at leading order we have $P(\chi) = \exp\left(-\pi^2\lambda\chi^4/9\right)$ (6.4). Now suppose that by some process after inflation, the adiabatic mode is determined by $\zeta = \kappa\chi^3$ for some constant $\kappa$. By the change of variables (integrating over $\chi$ subject to the mixing with $\zeta$), we have $P(\zeta) = \exp\left[-\frac{\pi^2\lambda\,|\zeta|^{4/3}}{9\kappa^{4/3}}\right]$ (6.5). In this way, we can produce non-analytic behavior in the tail from otherwise local interactions. Of course, this assumes that the functions $V(\chi)$ and $F(\chi)$ are known exactly, when in fact they are themselves expansions in $\chi$. For a given model, one must check the self-consistency of truncating these expansions, including the corrections to $V_{\rm eff}(\chi)$ discussed in Sec. 5. The derivation of Eq. (6.3), while a trivial restatement of the mixing, has an important interpretation in the context of the LDP. In the LDP literature, this change of variables is known as the contraction principle, which says that when $\zeta = F(\chi)$ and the rate function for $\chi$ is known, then the rate function for $\zeta$ is given by $I(\zeta) = \inf_{\chi : F(\chi) = \zeta} \tilde I(\chi)$, where $\tilde I(\chi)$ is the rate function that determines the large deviations of the $\chi$ field. These types of probability distributions can arise from interactions that mix the adiabatic and isocurvature modes during or after inflation [74, 80, 83-90]. The probability distributions found in these examples match the discussion given here as they are well described by the LDP. In SdSET, one can remove these types of mixing interactions via a field redefinition, which effectively introduces a transformation of the form Eq. (6.3) on the observable fluctuations.
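The change of variables leading to Eq. (6.5) can be checked directly. The sketch below assumes the leading-order rate function $\tilde I(\chi) = \pi^2\lambda\chi^4/9$ (the exponent of the equilibrium distribution for $V = \lambda\chi^4/4!$) and the invertible map $\zeta = \kappa\chi^3$ with $\kappa > 0$. Because the map is one-to-one, the rate function for $\zeta$ is obtained by evaluating $\tilde I$ on the unique preimage, with the Jacobian neglected at leading exponential order. Function names and parameter values are illustrative.

```python
import math

def rate_chi(chi, lam):
    """Leading-order rate function for the spectator field, pi^2 lam chi^4 / 9."""
    return math.pi**2 * lam * chi**4 / 9.0

def rate_zeta(zeta, lam, kappa):
    """Rate function for zeta = kappa chi^3 (kappa > 0), from the unique preimage chi(zeta)."""
    ratio = zeta / kappa
    chi = math.copysign(abs(ratio) ** (1.0 / 3.0), ratio)  # real cube root
    return rate_chi(chi, lam)

def rate_zeta_closed_form(zeta, lam, kappa):
    """Exponent of Eq. (6.5): pi^2 lam |zeta|^(4/3) / (9 kappa^(4/3))."""
    return math.pi**2 * lam * abs(zeta) ** (4.0 / 3.0) / (9.0 * kappa ** (4.0 / 3.0))

for zeta in (-0.2, 0.05, 0.2):
    print(zeta, rate_zeta(zeta, 0.1, 0.5), rate_zeta_closed_form(zeta, 0.1, 0.5))
```

The two expressions agree, and the exponent grows only as $|\zeta|^{4/3}$, more slowly than a Gaussian, which is the origin of the enhanced tail in this example.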
Irrelevant: Non-Gaussian noise is an irrelevant perturbation of a random walk. The CLT ensures that even a highly non-Gaussian probability distribution will produce a Gaussian distribution for the total distance of the walk. We saw that this is not true for large deviations, which lie outside the regime of the CLT, but also require exact knowledge of the non-Gaussian probability distribution.
It is tempting to use non-Gaussian statistics for quantum fluctuations as a mechanism to produce PBHs. However, as discussed in Sec. 3 (and [46]), when the non-Gaussian terms in Stochastic Inflation become important, both the stochastic framework and the EFT of inflation are breaking down. In principle one can use the LDP techniques to calculate the rate functions within the microscopic model of inflation; however, given that the stochastic framework does not apply to the microscopic theory, one must go beyond the classical probabilistic description we have used here.

Relation to Factorial Enhancement
The increased probability distribution for large fluctuations has been tied to the factorial enhancement of higher-order correlators in a number of examples [85, 91-93]. Concretely, if one is calculating the correlators of some field $\chi$ as a perturbative expansion in a parameter, as in Eq. (6.10), then the correlators will be factorially enhanced unless $a_M \propto 1/M!$. This condition implies that the expansion in Eq. (6.10) converges everywhere in the complex plane and therefore $W[J]$ is an entire function. It was confirmed by explicit calculation in [93] that $W[J]$ has a logarithmic branch cut$^{16}$ for $V(\chi) \propto |\chi|^p$ when $p > 2$. In this precise sense, the non-Gaussian tails that are calculable via Stochastic Inflation also imply a factorial enhancement of the large-$M$-point correlators. As we saw in Eq. (3.18), when the LDP holds, the inverse Fourier transform can be calculated from the method of steepest descents and reproduces Cramér's theorem. Given that $W[J]$ is itself calculable by the non-trivial saddle, Eq. (6.11) can be interpreted as a Legendre transform from the generating functional to the rate function, $W[J] \to I(\chi)$. The appearance of the Legendre transform in calculating this rate function using the LDP is equivalent to the role of the Legendre transform in relating free energies in statistical mechanics.
$^{15}$ The notation in this section follows [93]. It can be related to the symbols introduced in Sec. 2.2 via the following map: $J \to \theta$, $Z[J] \to \langle e^{\theta X} \rangle$, $W[J] \to -N\lambda[\theta]$, $V(\chi) \to N I(X)$.
$^{16}$ As discussed in [93], the existence of a point in the complex plane where $Z[J] = 0$ is sufficient to demonstrate that $W[J]$ is not entire.
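The Legendre transform between a generating function and a rate function can be illustrated numerically. The sketch below assumes the Cramér form $I(x) = \sup_\theta\,[\theta x - \lambda(\theta)]$ and evaluates the supremum by ternary search, which is valid because the objective is concave in $\theta$. It is applied to a fair $\pm 1$ step, for which $\lambda(\theta) = \log\cosh\theta$, and compared with the closed-form binary-entropy expression. The function names and the search window are illustrative choices.

```python
import math

def legendre_transform(cgf, x, lo=-50.0, hi=50.0, iters=200):
    """I(x) = sup_theta [theta * x - cgf(theta)], by ternary search on a concave objective."""
    def objective(theta):
        return theta * x - cgf(theta)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective(0.5 * (lo + hi))

def coin_cgf(theta):
    """lambda(theta) = log <e^(theta x)> for a fair step x = +1 or -1."""
    return math.log(math.cosh(theta))

def binary_rate(x):
    """Closed form: (minus) the entropy of a binary variable, up to an additive constant."""
    return 0.5 * ((1 + x) * math.log(1 + x) + (1 - x) * math.log(1 - x))

print(legendre_transform(coin_cgf, 0.5), binary_rate(0.5))
# Gaussian check with sigma^2 = 4: lambda = sigma^2 theta^2 / 2 gives I(x) = x^2 / (2 sigma^2).
print(legendre_transform(lambda th: 2.0 * th * th, 3.0), 3.0**2 / 8.0)
```

The same construction applies to any convex generating function, which is the sense in which the rate function plays the role of a free energy.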
The role of the Fourier transform in relating the language of Stochastic Inflation and rare fluctuations also appears in the tail expansion of the probability distribution reviewed in e.g. [81]. In that case, one is Fourier transforming the time variable, rather than the field χ, but the δN -formalism ultimately relates the two at the end of inflation. In principle, the techniques of the LDP should also apply directly to the tail expansion and might offer insights into the regime of control of those calculations.
Finally, the above discussion is also related to the factorial enhancement of scattering amplitudes at high multiplicity [94][95][96][97][98][99]. In that context, semi-classical solutions have also proven to be important and are closely related to the semi-classical calculation of W [J] described above. It is likely there is a deeper connection to the LDP, as we have seen in the case of cosmological correlators.

Conclusions
In this paper, we demonstrated that the Large Deviation Principle can be used to diagnose the validity of the underlying Effective Field Theory expansion being used to derive the evolution equations of Stochastic Inflation. We showed how to interpret the dynamical Renormalization Group equations that derive Stochastic Inflation as coarse-graining a random walk. When the potential is essentially zero, for example in the case of the inflaton, we argued that this procedure leads to a Gaussian distribution as a consequence of the Central Limit Theorem. In this case, EFT expectations hold and everything is under perturbative control. However, if one asks questions that are sensitive to the tails of the probability distribution, then the LDP tells us that a new saddle point of the action dominates, and making a reliable prediction requires knowledge of the EFT to all orders (or equivalently one must appeal to the UV completion). We then showed how the LDP applies for models with a non-trivial potential, and again explored the regime of EFT validity. Finally, we showed how the LDP could be used to diagnose the validity of models that were introduced with the goal of yielding a non-trivial production of primordial black holes.
There are many important future directions to explore. It would be of great interest to apply the LDP to compute the stochastic evolution equations in UV complete examples in such a way that the impact on the tails of the distributions was completely under control. This would provide a test case analog of the random walk examples that were presented in Sec. 2 above. It would also be interesting to explore other applications of the LDP in cosmology and quantum field theory. For example, the appearance of additional saddles describing the tail of the distribution is reminiscent of scattering amplitudes with high-multiplicity [94][95][96][97][98][99] and the large charge expansion of conformal field theories [100][101][102][103][104]. These are natural settings where one might expect the LDP to play a role, and it would be exciting to make this precise. We anticipate that having connected the validity of Stochastic Inflation to the LDP will yield many new insights into the nature of quantum field theory, both in dS spacetime and beyond.