Probability Distributions for Space and Time Averaged Quantum Stress Tensors

We extend previous work on quantum stress tensor operators which have been averaged over finite time intervals to include averaging over finite regions of space as well. The space and time averaging can be viewed as describing a measurement process for a stress tensor component, such as the energy density of a quantized field in its vacuum state. Although spatial averaging reduces the probability of large vacuum fluctuations compared to time averaging alone, we find that the probability distribution decreases more slowly than exponentially as the magnitude of the measured energy density increases. This implies that vacuum fluctuations can sometimes dominate over thermal fluctuations and potentially have observable effects.

of the form τ d T ≥ −x 0 , where x 0 > 0 is a dimensionless number of the order of or somewhat less than unity. For a recent review see [5]. If the quantum inequality gives the optimal lower bound on expectation values, then P (x) = 0 if x < −x 0 . This means that −x 0 is the lowest eigenvalue of the averaged operator T , and is hence both the lower bound on expectation values, and the smallest possible outcome of a measurement in any state.
For the energy density (at least for the averages considered to date) the tail of P (x) for x 1 was found to fall as an exponential in two spacetime dimensions [1,4], but more slowly in four dimensions [2,3]. Specifically, P (x) ∼ c 0 x b e −ax c for some constants c 0 , b, a, c, of which c is the most crucial. For stress tensor operators averaged in time with a Lorentzian function, it was found in Ref.
[2] that c = 1/3. This implies that the distribution is highly skewed and so fluctuations which are several orders of magnitude larger than the standard deviation can have a non-negligible probability of occurring. This is a result which would not be possible in random processes where measurements at different moments in time are uncorrelated, in which case the central limit theorem would give a Gaussian probability distribution. By contrast our results reflect the highly correlated nature of quantum vacuum fluctuations.
Although a Lorentzian function of time is a useful model, it suffers from the defect that it describes a measurement which began in the infinite past and is only completed in the infinite future. A more realistic description involves smooth (infinitely differentiable) functions which have compact support, that is, are zero outside of a finite interval.
The probability distributions for quantum stress tensors measured in a finite interval with such functions was studied in Ref. [3]. A class of compactly supported functions was treated, whose Fourier transforms fall as e −γ|ω| α , where 0 < α < 1 and γ > 0, as |ω| → ∞. It was argued that such functions could arise in physical situations, as illustrated by a simple electrical circuit whose switch-on corresponds to α = 1/2. For this class of functions, it was shown that the tail of the probability distribution now decays with c = α/3. Thus if, for example, a measurement of the energy density in the vacuum state of the electromagnetic field is described by the α = 1/2 function, then the probability of finding a very large energy density associated with x 1 will be roughly proportional to e −ax 1/6 .
The previous results on stress tensor probability distributions [1-3] were obtained either from a moment generating function [1], or by asymptotic calculation of high moments [2,3]. In four dimensions, the moments approach suffers from the ambiguity that the moments do not necessarily uniquely determine P (x). The Hamburger moment theorem [6] guarantees that P (x) is uniquely determined by the moments of the operator provided that the n-moment grows no faster than n!D n as n → ∞, for some constant D. However, the moments of stress tensor operators averaged with the compactly supported functions of time discussed in Ref. [3] grow as (3n/α)!. The non-compactly supported Lorentzian function used in Ref.
In all of these cases, P (x) may not be uniquely determined from the moments. In general, when the moments grow too rapidly to ensure uniqueness, there can be several distinct choices for P (x) which all produce the same moments, and differ from one another by an oscillatory function of x. Even if P (x) is not uniquely determined, its integrals over a finite interval tend to cancel the oscillations and can give a reliable estimate of the probability of a result in this interval. For example, in some applications one is interested in the probability of a fluctuation which exceeds a given threshold and is given by the complementary cumulative distribution, P > (x) = ∞ x P (y) dy, and it is possible to extract bounds on this function from the moment sequence in some cases, even if the moment sequence does not determine the probability distribution uniquely [2].
There is also an independent approach to finding P (x) which does not use the moments, which is direct diagonalization of the averaged operator T by a Bogoliubov transformation to find its eigenvalues and eigenstates. The probability of finding a given eigenvalue in a measurement on the original vacuum state is then the squared overlap of the eigenstate with the vacuum. In practice, this approach must be performed numerically on a system with a finite number of degrees of freedom. This was done in Ref. [7] for a massless scalar field in a spherical cavity including about one hundred modes for time sampling associated with several values of α. The results are in reasonable agreement with those found for the tail of P (x) in Refs. [2,3]. This lends support to the conclusion in the latter references that fluctuations several orders of magnitude larger than the the typical fluctuation can have a non-negligible probability of occurrence.
Such large fluctuations may have potentially observable effects. For example, the role of large radiation pressure fluctuations in enhancing the barrier penetration by charged particles was treated in Ref. [8], where it was argued that these fluctuations have the potential in some circumstances to increase the barrier penetration rate by several orders of magnitude compare to the rate predicted by the usual quantum tunneling process. It was further suggested that this effect may have already been observed in the nuclear fusion of heavy ions with heavy nuclei. By contrast, the vacuum fluctuations of the linear electric field, which obey a Gaussian probability distribution, cause only a modest increase in penetration rates [9,10]. Quantum stress tensor fluctuations are also of interest in gravity theory, as they can drive passive fluctuations of the gravitational field, which is a variety of quantum gravity effect. Stress tensor fluctuations in the early universe could play a role in the creation of primordial density perturbations [11,12] or tensor perturbations [13]. The references just cited all deal with integrals of the stress tensor correlation function, and hence the variance of the stress tensor fluctuations. It will be of interest to study the probability of large fluctuations in these and other gravitational applications. One possible application is to the effects of vacuum fluctuations on the small scale causal structure of spacetime. In two-dimensional models, it has been found that large positive fluctuations can cause focussing of geodesics, and closure of lightcones on small scales [14,15].
Most of the previous work on the probability of quantum stress tensors fluctuations was restricted to operators averaged in time at one spatial point. The purpose of the present paper is to extend this treatment to include the effects of averaging in space as well. The outline of the paper is as follows: In Sec. II, we discuss stress tensor probability distributions in two spacetime dimensions, particularly in conformal field theory where exact results are possible. Space and time averaging of stress tensor operators in four-dimensional Minkowski spacetime is developed in Sec. III, and the sampling functions needed for this averaging are discussed. An iteration procedure for the calculation of the moments of the averaged operators is introduced. This procedure is analyzed in detail in Sec. IV. It is argued that if the spatial averaging scale is smaller than the temporal scale, then the lower moments are sensitive only to the time averaging, but the high moments will also depend upon spatial averaging. The implications of these results for the rate of growth of the moments is treated in Sec. V. It is found that the initial growth rate can be the (3n/α)! behavior found in Ref. [3] with time averaging alone. However, for larger n, there is a transition to a somewhat lower growth rate of (n/α)!. This is still too fast to satisfy the Hamburger criterion, but our results suggest that a weaker criterion due to Stieltjes holds for 1/2 ≤ α < 1, implying that the moments uniquely determine the probability distribution among those that vanish on a half-line. The implications of these results for the tail of the probability distribution are discussed in Sec. VI, where it is shown that the asymptotic form of P (x) now falls more rapidly than in the worldline case, but still more slowly than an exponential function. This reflects that fact that spatial averaging somewhat reduces the probability of large fluctuations, but this probability remains high enough to have important physical effects. The latter point is discussed in more detail in the final section, Sec. VII, where the key results of the paper are summarized and discussed. Appendix A contains an explicit construction of specific forms of the temporal and spatial sampling functions. Appendix B discusses some results on the asymptotic forms of integrals which are used in Sec. V.
Units in which = c = 1 are used throughout the paper.

II. EXACT RESULTS IN 2-DIMENSIONAL CONFORMAL FIELD THEORY
Two-dimensional conformal field theory (CFT) provides an interesting example, in which the relative effects of time and space averaging can be determined in detail. Recall that the energy density of a CFT in 1 + 1-dimensions splits into mutually commuting left-and right-moving components where we assume flat spacetime and let u = t − x, v = t + x. Any spacetime average of the energy density can be written in terms of these components as where Here, the leading factor of 1/2 is a Jacobian determinant. Now let P L be the probability density function for measurements of T L , averaged against F L , in the vacuum state, i.e., and write P R and P for the analogous probability density functions of T R (averaged against F R ) and T 00 (averaged against f ). As T L and T R commute, the probability distributions are independent and the combined probability distribution is obtained as their convolution, The probability distribution of these components of the energy tensor can be determined -at least in principleeither by a moment generating function method [1] or by conformal welding techniques [4]. The latter method can be applied to the cases of the vacuum and certain other special states, including thermal equilibrium states and also highest weight states [4]. Each method rests on the solution to certain subsidiary problems and closed form results are only available in particular cases [1, 4,16], though the method of [4] is also amenable to numerical treatment.
Here, we draw attention to a special case where the probability distribution can be determined in closed form for different spatial and temporal averaging scales. Let that is, a product of Gaussians in space and time, normalized to have unit integral over spacetime, in which and τ determine the spatial and temporal averaging scales. In this case, a simple calculation gives which is also a normalized Gaussian with characteristic width σ = √ 2 + τ 2 . It is easily seen that F R (v) = F L (v).
For any unitary positive energy CFT, the probability distribution of T L (F L ) in the vacuum state is known in closed form [1] (see [4,16] for some other closed form expressions) and is given by the shifted Gamma distribution where c is the central charge of the CFT [e.g., c = 1 for a massless scalar field], ω 0 = c/(48πσ 2 ) and ϑ is a Heaviside function. As P L and P R are identical, the overall probability distribution is the convolution of P L with itself and is again a shifted Gamma distribution To see this, it is easiest to proceed from the moment generating function c/24 (9) for P L (defined for µ < 2πσ 2 ) and note that the moment generating function for P must be Therefore the probability density function for P is just that of P L but with c replaced by 2c throughout.
We may read off a sharp quantum inequality bound on the averaged energy density from (8), namely for any physically reasonable state ψ. This inequality may also be obtained as a special case of a general quantum inequality bound proved by different methods in [17], in which a precise specification of the relevant states may be found. It is interesting to compare this bound with the worldline bound obtained in [1,17] for Gaussian smearing on timescale τ . If one attempted to derive a spacetime bound by simply averaging all these bounds in x with the appropriate Gaussian weight, one would obtain a (non-sharp) bound As one might expect, the sharp bound (11) improves on this for all > 0, and becomes progressively tighter as increases. In the limit → ∞, we see that the sharp lower bound in (11) vanishes, which is to be expected as the Hamiltonian is a positive operator. Similarly, the probability distribution (8) converges to the delta-distribution δ(ω) in this limit, reflecting the fact that vacuum measurements of the Hamiltonian result in 0 with probability 1.
Our main interest, however, is in the effect of the spatial averaging on the moments and the probability distribution for finite spatial averaging scales. Inspecting the moment generating function (10), it is clear that the n-th moment scales with the characteristic scale σ as For n( /τ ) 2 1, the moments are little changed from those obtained by pure worldline smearing. This is a special case of a more general effect whereby a worldline result can be obtained as a limit of a small spatial averaging scale, which will be discussed in Sec. VI B. At higher n, of course, the effects of the spatial averaging become apparent.
Likewise, for a range of values ω slightly greater than zero, the probability distribution of ρ is well-approximated by its values for = 0 (with τ fixed), but as ω increases, the two distributions depart from one another, with the > 0 and for spacetime averaging with the same temporal sampling scale τ and = 2τ (right-hand curve, blue). The latter is displaced to the right and decays more rapidly. The vertical asymptotes occur at the quantum inequality bound in each case.
distribution decaying exponentially faster. An illustrative plot appears in Fig. 1. Note, however, that the probability of finding a negative measurement outcome is given in terms of the lower incomplete Γ-function as which is independent of s and τ , and depends only on the central charge c (provided we maintain Gaussian sampling).
Some results for non-Gaussian worldline sampling can be found in Ref. [4,16].
Extrapolating from these results, we may expect that for general quantum field theories, spatial averaging reduces the magnitude of the quantum inequality bound and also causes the positive tail of the probability distribution to decay more rapidly. Nonetheless, we may also expect that for sufficiently low moments or for a range of smaller values in the probability distribution, one may neglect the effect of spatial averaging on scales small in relation to the temporal averaging. Nonetheless, not all features of the CFT might be expected to generalize. In particular, here the spacetime averaged probability distribution is of the same functional form as the worldline averaged case, but with different parameters. As we will see, this is a special feature of conformal fields and is not true in general.

A. Averaged operators and their moments
Let T (x, t) be a quadratic normal ordered bosonic operator in four dimensional flat spacetime, such as a stress tensor component for a free scalar or electromagnetic field. We consider a space and time average of this operator defined by where f (t) and g(x) are compactly supported functions of time and of space, respectively. They are assumed to be non-negative and satisfy and Note that the averaging process breaks Lorentz symmetry. This is to be expected, as the averaging describes a measurement made in a specific spacetime region and in a selected frame of reference. The space and time averaged operator may be expanded in terms of annihilation and creation operators in the form where [a i , a † j ] = δ ij 1 1, A is hermitian and B is symmetric. The moments of T are defined as the vacuum expectation values of powers of T : The various moments can be expressed as polynomials in the matrices, A ij and B ij . The second moment, for example, is given by The primary example which we investigate in this paper is T =:φ 2 :, the squared time derivative of a massless scalar field. We may write a mode expansion forφ aṡ where ω = |k| and V is a quantization volume with periodic boundary conditions, which fixes the summation lattice for k.
Let the Fourier transforms of the sampling functions be defined bŷ Equations (17) and (18) imply thatf (0) =ĝ(0) = 1. Here we assume that the sampling functions, and hence their Fourier transforms, are even, real functions. The matrices A ij and B ij which appear in T and hence in the expressions for its moments, may be expressed in terms off andĝ. For the case of T =:φ 2 :, we have and both of which are real and symmetric.
We can now understand why time averaging is essential in four spacetime dimensions. The time average contributes a factor off 2 (ω j + ω ) to µ 2 which renders the sum over all modes in Eq. (21) finite. If we had averaged only in space, then µ 2 would just contain a factor ofĝ 2 (k j + k ), and receive a divergent contribution from the region where k j = −k , that is, from modes with antiparallel wavevectors.
In Ref. [3], it was argued that there is a dominant contribution to µ n , which is This contribution contains the maximum number of factors of A j , which tend to be larger that the corresponding B j , because of the minus sign in thef (ω j − ω ) factor, which allows it to be larger on average than thef (ω j + ω ) factor in B j . We will assume M n continues to be the dominant contribution when spatial averaging is included. If f andĝ are non-negative, all of the omitted terms are non-negative, so M n is always a lower bound on the exact moment. The construction of non-negativef andĝ is discussed in Ref. [3] and in Sec. III B.
We now give the generalization of the discussion in Sec. IIIA of Ref. [3] to the case with spatial and temporal averaging. Use Eqs. (25) and (26) to write and we have taken the V → ∞ limit. In the case that n = 2m is even, we can write the above expression as where k = |k|, q = |q|, and we define These functions satisfy a recurrence relation for m ≥ 0, where

B. Compactly supported averaging functions
In this paper, we assume that both f (t) and g(x) are functions with compact support, and hence describe measurements made in both a finite time interval and a finite spatial region. This implies that their Fourier transforms, f (ω) andĝ(k), decay more slowly than exponentially for large values of their arguments. Starting with f , we assume that its support has characteristic width τ (in a specific example given below, this will be the length of the support), and that its Fourier transform behaves asymptotically aŝ for some constants 0 < α < 1 and C f > 0, the latter of which is fixed by the requirement that f has unit integral, i.e., f (0) = 1. It is further assumed that f is even and nonnegative, and that the same is true off . A class of functions with these properties was constructed and discussed in detail in Sect. II of Ref. [3].
Turning to g, we require similar properties and, additionally, spherical symmetry. Functions of this type may be constructed as follows. Start with a nonnegative even and smooth function of compact support, h, with support of characteristic width (in an example below, this will be half the width of the support) and Fourier transform obeyinĝ for some constants η > 0, 0 < λ < 1 and C h > 0. We also assume thatĥ(ω) has a maximum at ω = 0 and is monotone decreasing on the positive half-line, so thatĥ (ω) ≤ 0 andĥ (0) < 0. Setting we then haveĝ Using L'Hôpital's rule and the fact thatĥ (0) = 0 it is easily seen thatĝ(0) = 1, so g has unit integral over 3-space.
Note also thatĝ(k) ≥ 0 for all k. Furthermore, we may deducê Here we define s = /τ as the ratio of the spatial and temporal sampling widths. We will henceforth adopt units of time in which τ = 1, so s = , unless otherwise noted. In this situation, 1/λ measures the ratio of spatial and temporal sampling scales.
A specific example for the case α = λ = 1 2 may be based on results in [3], where a nonnegative smooth and even function L was constructed, with support [−1, 1], unit integral, and nonnegative Fourier transform obeyinĝ where the numerical value of C L = 2.9324 to 5 significant figures. See in particular Figs. 4 & 5 of Ref. [3]. Setting then f has support [−τ /2, τ /2], while g is supported in a ball of radius s. Noting thatf (ω) =L(ωτ /2) and h(ω) = sL(ωs), the transforms of f and g have asymptotic behavior , and the regime where q is smaller than the ball radius and spatial averaging is less significant (right-hand figure).
where = √ 2s and C g has numerical value The construction of some specific approximate forms forf (ω) andĝ(k) is described in more detail in Appendix A.

IV. ANALYSIS OF THE ITERATION PROCEDURE
A. Heuristic treatment Any smooth compactly supported function has a Fourier transform that decays faster than any inverse power.
Therefore the integrals in Eq. (32) are dominated by contributions from certain regions of the integration domain.
Proceeding somewhat heuristically for the moment, the factor off restricts the effective integration region to a shell of typical radius ∼ q and thickness ρf ∝ 1/τ , while the factor ofĝ restricts the effective integration region to a ball centered at q and of radius ρĝ ∝ 1/s. Overall, the integration will be dominated by contributions arising from the intersection of the ball and shell, as illustrated by Fig. 2.
If q is small in relation to the ball radius ρĝ, the shell is contained within the ball so the integration therefore extends over the whole of the shell, which has a volume ∼ q 2 ρf . Therefore one expects, roughly, that for such q and a constant C. This is the growth rate expected in the worldline limit treated in Ref. [3], and corresponds to the factor of Ω p in Eqs. (77) and (78) of that paper, as we are currently dealing with the case p = 3. On the other hand, as q becomes large in relation to the radius of the ball determined byĝ, the effective integration region volume tends to a constant ∼ (ρĝ) 2 ρf , where ρĝ is the effective support radius ofĝ and similarly for ρf . Therefore, for large q, we expect for another constant C . The consequence of this is that low moments (which are largely fixed by the small q regime) will behave like those of the worldline averaged quantities, whereas higher moments grow rather less rapidly. The distinction between low and high moments is determined by the ratio ρĝ/ρf ≈ τ /s: the smaller the scale of spatial averaging relative to temporal averaging, i.e., the larger the ratio ρĝ/ρf of momentum space averaging scales, the larger q must be to detect the effect of spatial averaging and therefore the higher the threshold beyond which the moments M n are affected by the spatial averaging. This fits in with some basic intuition: on one hand, if one shrinks the spatial averaging to a δ-function, one ought to obtain the worldline results, consistent with Eq. integrating stress tensor components other than the energy density over all space. Nonetheless, we will find that spatial averaging of these stress tensor components also reduces the probability of large vacuum fluctuations.
In the rest of this section we investigate these heuristic ideas more quantitatively by both numerical and analytic means.

B. The first iteration
To start, we consider in more detail how to approximate the first iterate G 1 (k, q), given by in the regime where q and k both tend to infinity though not necessarily at the same rate. Each of the Fourier transforms in the integrand decays rapidly as the magnitude of its argument increases. Therefore the dominant contributions to the integral are expected to arise from regions where ≈ q or ≈ −k. Unless k ≈ −q, a case that we defer for the moment, these two regions are well-separated as q, k → ∞ and their contributions may be analysed separately.
Consider first the contribution from ≈ q. In this region,f (k + )ĝ(k + ) ≈f (k + q)ĝ(k + q) = G 0 (k, q), and therefore the contribution to G 1 is expected to be approximately where the function I(q) is defined as and will be called the iteration coefficient; note that it depends only on the magnitude q of q due to spherical symmetry of g. The iteration coefficient will be studied in more detail below; in particular, it has a finite, non-zero limit as On the other hand, in the region where ≈ −k we may approximateĝ(q − )f (k + ) ≈ĝ(q + k)f (2k), maintaining the assumption that k ≈ −q. The contribution is then approximately Under the additional assumption that q k thef factor may be taken outside the integral, usingf (q − ) ≈f (q −k) ≈ f (q + k), giving an approximate contribution to G 1 . Owing to the rapid decay off (2k), this contribution is subdominant relative to that of Eq. (48) and we deduce that as q, k → ∞ with q k. Alternatively, suppose that q and k have comparable magnitudes. Provided that k ≈ −q, we may then approximate Eq. (50) usingf (2k) ≈f (k + q), and replacing q by k under the integral. Then Eq. (50) contributes approximately kI(k)G 0 (k, q) to G 1 (k, q). Combining with Eq. (48) we have in total as q, k → ∞ with k ≈ −q. In particular, as q → ∞.
If k ≈ −q the two contributing regions overlap and should not be analysed separately. Instead, we expect that where the inequality arises because 0 ≤ĝ ≤ 1.
The ability to pull factors such asf (2q) out of the integral arises because these functions become flat for large arguments, as was noted above Eq. (77) in [3]. More precisely,f (ω)/f (ω) → 0 as ω → ∞, sof = o(f ). In addition, the functionĥ defined in Appendix A satisfies |ĥ |/ĥ 0.33, and is hence relatively flat for all values of its arguement.
C. The iteration coefficient

Form for large q
Our basic hypothesis is that under the iteration Eq. (32), for q k, where the iteration coefficient, I(q), was defined in Eq. (49). Changing variables to m = q − , Our aim is to show that I(q) → I(∞) as q → ∞, where andq = q/q is a unit vector along q.
To prove this, note that for each fixed m, one has as q → ∞. Therefore the integrand approaches the required form pointwise. Noting also thatf (ω) ≤f (0) for all ω, and thatf (0)ĝ(m) is integrable, the required result follows by the dominated convergence theorem. We call I(∞) the asymptotic iteration coefficient, and identify it with the constant C which appeared in Eq. (46).

A coordinate space form of I(∞)
We may write Eq. (58) as where c is the cosine of the angle between m and q, and we let ξ = m c. Next we use Eq. (23) and perform the ξ-integration to write Next use Eq. (37) and the fact thatĥ (ms) is an odd function to write In the second step above, an integration by parts was performed usingĥ(ms) → 0 as m → ±∞. Finally, we recognize that the m-integration is an inverse Fourier transform yielding 2πh(−t/s) = 2πh(t/s) to obtain We may use Eq. (23) to writeĥ which allows I(∞) to be calculated directly from the coordinate space sampling functions, f (t) and h(t).
Recall that f (t) has a characteristic width τ = 1, and h(t/s) has width s. It is of interest to consider the limits in which one of these widths is large compared to the other. First consider the case of a large spatial sampling region, In the opposite limit of a small spatial sampling scale, we note that the function h(t/s) forces the integral to get its dominant contribution from small t, so f (t) ≈ f (0), and now we use ∞ −∞ dt h(t/s) = s to find The powers of s −3 and s −2 which appear in Eqs. (65) and (66), respectively, will be numerically confirmed in Sec. VI A.

D. Test of the iteration procedure
Here we wish to test numerically a special case of our proposed iteration procedure. Specifically, we expect that in the limit that q k. Define The ratio R as a function of q is repeated for the case that q and k are antiparallel. Now there is a local minimum when q ≈ k, surrounded by local maxima, but again R → 1 when q k.
The ratio R is plotted in Fig. 3 as a function of q for different values of k when the vectors q and k are parallel, and in Fig. 4 when they are antiparallel. We see that R ≈ 1 for large q, which supports our iteration hypothesis. We may use the results in Sec. IV B to understand some of the other features in Figs. 3 and 4. First, there are maxima in Fig. 3 near q ≈ k where R ≈ 2. This follows from Eq. (54), which further shows that the height of this ridge is bounded, so R → 2 when q → ∞ with k = q. A second feature are the minma in Fig. 4 near q ≈ −k, where R < 1.

E. A growth bound
Alongside the numerical evidence supporting our iteration procedure, it is useful to have analytic worst-case bounds on the growth of G m . We assume that there exist constants C > 0, 0 < α < 1, τ > 0, 0 < λ < 1 and > 0 such that for all ω ∈ R, k ∈ R 3 . As previously, we adopt units in which τ = 1. The parameter 1/λ measures the ratio of spatial and temporal sampling scales.
It is useful to establish some rough bounds on the way in which the functions G m can grow with m. Because it is no more difficult, we study a slightly more general problem than the recurrence relation expressed by (32) and (33).
For integer p ≥ 1, and with fixed test functions f and g whose Fourier transforms satisfy Eq. (69), we define an integral operator Ξ (p) by and consider the iteration G m+1 = Ξ (p) G m , with G 0 as in (33).
Starting from the assumption in Eq. (69), our aim is to prove that m is a polynomial of degree at most mp with coefficients independent of q and k. In our situation of interest, p = 1, so the polynomial factor in q has degree at most m, which supports the heuristic expectation given in Eq. (46). We will need two useful inequalities. The first was proved as Eq. (B6) in [3], and asserts x α + y α ≥ (x + y) α + (1 − α) min{x, y} α which holds for x, y > 0 and 0 < α < 1. Here, we also require an analogous inequality on vector norms, for x, y ∈ R 3 , 0 < α < 1, where in the first step we apply (72) to x = x and y = y and in the second, we have applied the ordinary triangle inequality, and the fact that 0 < α < 1.
The proof of Eq. (71) is inductive. The statement is true by assumption for m = 0, because it follows from Eq. (69) and Eq. (33) that for all k, q ∈ R 3 . So let us now suppose that (71) holds for some m ≥ 0. We obtain Expanding the degree-mp polynomial Q (p) m , it is clearly sufficient for our inductive argument to show that integrals of the form with r ≥ p ≥ 1, obey bounds of the form for all k, q, where P (r) is a polynomial of degree r with coefficients independent of k and q, whose leading coefficient is also independent of r.
To prove the estimate (77), we apply (73) to obtain Now split the integral into the regions < 2 1/r q and ≥ 2 1/r q. In the first of these, we can use the fact that r < 2q r if r ≥ 1, further, we apply (72) to find Thus the contribution is bounded from above by In the second region, we use e −(k+ ) α ≤ e −(k+q) α to see that the contribution is bounded by where S r,α := sup x r/α e −x = (1 − 2 −1/r ) −r (r/α) r/α e −r/α .
As the upper bound suggests, S r,α will grow rapidly in r for fixed α. We may recombine the estimates (79) and (80) as where we have simply estimated the individual integrals by their extension to all of R 3 . Using the elementary fact and the freedom to translate the origin of coordinates, one has Accordingly, L (r) (k, q) is bounded by a polynomial in q (with coefficients independent of k and q, and leading coefficient independent of r) multiplied by e −(q+k) α − k+q λ . This concludes the inductive proof of the bound (71).
We make no claim that this is the tightest possible upper bound that could be derived. However, the argument is relatively simple and indicates a worst-case growth rate for the functions G m (k, q) that is nonetheless broadly in line with the heuristic discussion of Sec. IV A, in the case p = 1.

A. Approximate Forms of the Moments
Recall that in the iteration procedure for G m (k, q), using Eq. (32), we expect for the initial iterations to each bring out a factor proportional to q 3 , and the later iterations to each bring out a factor of I(∞) q. Thus, for m 1, we expect the asymptotic form for G m (k, q), to be where C and µ are constants which correct for the possibility that the first several iterations bring out different constants and powers of q than do the later iterations. If we use this form in Eq. (30), we find where We will estimate this integral for the case that N 1. As we expect that the dominant contribution comes from q k, we approximate |q + k| ≈ q. If we assume thatf andĝ may be approximated by their asymptotic forms,

Eqs. (34) and (38), then we have
where we have written Next let k = r − q to write Define a new variable u by r = q(1 + u) 1/α to write to final integral above as where in the second step we used the fact that the dominant contribution comes from the region where u 1 because r ≈ q when q k. Thus we have For the case α = λ, this integral may be evaluated explicitly to obtain When α = 1/2, this becomes The result in Eq. (88), that S N ≈ T N , relies upon the dominant contribution to S N coming from regions where q k. when N 1. However, it is worth examining more carefully the contribution from the region where q + k ≈ 0, where the argument ofĝ becomes small, in order to show that this contribution is small in relation to T N . In this region k ≈ q and the contribution to S N is therefore bounded by for constants C and C = C/(16 1+1/α α), depending on f , g and α but not N . Here we have changed variables from k to k + q in the second line. We need this contribution to S N be small compared to T N , our estimate for S N , when N is large. Next we will examine several special cases.

1.
Case: Here we have an explicit formula for T N , given in Eq. (94), while This is suppressed compared to T N by a factor proportional to This factor decreases as N grows provided that ≤ √ 2 − 1 ≈ 0.414. Under this condition, in which spatial sampling takes place over modest scales relative to temporal sampling, we expect T N to be a good approximation to S N for large N for α = λ = 1 2 .
Although there is a discontinuity between these two forms at λ = α/2 in the form of a factor of e 2 /8 , both forms have the same dependence upon N : We may combine this result with Eqs. (95) and (98) to write The ratio of gamma functions can at most grow as a power of N , and here λ/α ≤ 1/2, so the behavior of the ratio S N 1 /T N is dominated by the e −N ln 2 factor, which decays exponentially as N increases, leading to S N 1 T N for large N .

3.
Case: α/2 ≤ λ ≤ 2α/3 The asymptotic form for I N in this case is given by Eq. (B13), where β = λ/α. Note that the exponential in the right-hand-side of Eq. (B13) contains two terms. The first is a negative term proportional to (N/α) β , which also appears in Eqs. (B11) and (B12). The second is a positive term to proportional to (N/α) 2β−1 . However, β > 2β − 1 in the range of interest here, so the first term dominates the exponential and again leads to the same leading order asymptotic behavior for I N as that given in Eq. (100). Hence, the ratio S N 1 /T N is again given by Eq. (101) for large N . In all of these cases, we conclude that S N 1 is asymptotically small compared to T N , so the region where q + k ≈ 0 does not give a large contribution to S N .

C. Numerical Tests of SN → TN
We can test the approach of S N to its limiting form, T N , for large N by numerically evaluating Eqs. (87) and (92). In the special case that λ = α = 1/2, T N is given by Eq (94), and we may use the explicit forms forf andĝ constructed in Appendix A to evaluate S N . In all cases, we may approximate the sampling functions in Eq. (87) by their asymptotic forms for large arguments if N is large. In this case, we use Eq. (34) forf . However, we need to modify the form given in Eq. (38) forĝ to avoid a singularity at q + k = 0. For this purpose, we use the cutoff-dependent form and test the dependence of the integral upon the parameter Q 0 .
The results obtained from both approaches are plotted in Fig. 5 for the case that λ = α = 1/2, where = √ 2s, and agree reasonably well. The cutoff parameter Q 0 was varied between values of about 1 and 10 without a significant effect. We can see that for smaller values of s, S N /T N becomes close to one for large N . For larger values of s, S N /T N is noticeably larger than one for the range of N considered.
Some results for α = 1/2, but λ < α are plotted in Figs. 6 and 5. In this case, Eq. (87) was evaluated using Eqs. (34) and (38). Again, the result seems to be relatively independent of Q 0 . Here we appear to find that S N → T N for N 1, but that this limit is attained more quickly for smaller values of and of λ. Note that in all cases, we find In the special case that λ < α/2 < 1/2, we are able to give a rigorous proof that S N /T N → 1 as N → ∞, but the details will be omitted here.

D. Asymptotic Behavior of the Moments
We may now use Eq. (86) and assume that S N ≈ T N to write  for n 1. If we let q → 2 −1/α q in Eq. (92), then we have where = 2 1−λ/α and I N ( ) is defined in Eq. (B7). Now we have where we have used Eq. (29), and defined As already mentioned, the asymptotic behavior of I N for large N is discussed for several cases in Appendix B, where it is found that I N /Γ(N/α) is bounded as N → ∞. This leads to a factor of Γ n+2(1+µ+λ) α − 4 in M n , which reveals that for large n, the moments grow no faster than (n/α)! (times a factor growing exponentially in n). This is slower than the (3n/α)! growth rate found in Ref. [3] for the case of time averaging alone. However, if α < 1, it is still faster than n! growth.

VI. THE TAIL OF THE PROBABILITY DISTRIBUTION
A. The form of the tail Note that Eq. (105) for M n , the dominant contribution to the n-th moment, can be written as If we let x = B q, then this expression becomes where K 0 and K are constants independent of n. Recall that the moments of the probability distribution, P (x), are The last step holds when n is sufficiently large that the the interval [−x 0 , 0] makes a negligible contribution to the integral. Comparison of Eqs. (108) and (109) suggests that for large x.
This identification is subject to the possible ambiguity that rapidly growing moments may not uniquely determine the probability distribution. However, for a probability distribution which is nonzero on a half line, as is the case here, the condition that the moments uniquely determine P (x) is the Stieltjes criterion [6], which requires for all n for some choice of constants C and D. We found in the previous section that here the moments grow no faster than (n/α)!, so this criterion is satisfied for α ≥ 1/2 and hence P (x) is uniquely determined by the moments. If α < 1/2, then we have the same situation as in the worldline case, where the moments might not uniquely determine P (x). Nonetheless, it is possible to gain some information about the tail of the distribution, as discussed in Sec. VI of Ref. [2].
The constants K and µ are not determined by the methods used here, because the transition between the low order and high order iteration regimes, discussed in Sec. V A, is not fully understood. However, the argument of the exponential in Eq. (110) is determined, and governs the primary rate of decay of the tail. If λ < α, the (x/B) α term in Eq. (110) will eventually dominate the (x/B) λ term, and we will have for sufficiently large x. In the case that λ = α, we have the asymptotic form as = in this case. Recall that B is determined by Eqs. (60) and (106). In the special case that λ = α = 1/2, we may numerically compute B as a function of s = /τ , using the the approximate forms off (ω) andĝ(k) given in Appendix A. The results are illustrated in Figs. 8 and 9.
In all regions, B decreases as s increases. As smaller values of B suppress the probability of a fluctuation with a given dimensionless magnitude x, this is consistent with the intuition that increasing relative to τ decreases the probability of a large fluctuation.

B. The transition from worldline behavior to spacetime averaged behavior
Recall that in Ref. [3], the averaging along a worldline alone was treated, and the asymptotic form of the probability distribution was found to be of the form with c = α/3. In contrast, the asymptotic form of the spacetime averaged distribution, for λ ≤ α, has a similar form, but with c = α. The effect of the spatial averaging has been to enhance the rate of decrease of the tail of P (x).
However, if the spatial sampling scale s is small compared to the temporal scale τ , we expect a finite region in x where the worldline form holds approximately. This is the regime depicted in the right part of Fig. 2 in τ = 1 units, and when each iteration produces a factor of q 3 , as predicted by Eq. (45). In this regime, the n-th moment, given by Eq. (30), will contain an integral on q of the form where we assume n 1 and use Eq, (34). The peak of this integrand, and hence the region which gives the dominant contribution to the integral, occurs at q = q * = 3(n + 1) 2α if n 1. The requirement that the worldline approximation is valid implies that q * 1/s and hence This condition gives the range of moments which are determined by the temporal sampling alone. It is interesting to determine the interval of x that largely determines these moments. If we use the approximation in Eq. (114) for P (x), the n-th moment is The maximum of this integrand is at if n b. If we set n equal to its upper limit in Eq. (117), then we obtain an estimate for the value of x at which the transition from worldline to spacetime averaged behavior occurs: where we have used c = α/3 and assumed that a factor of a/2 is of order one. As was discussed in Ref. [8], x x * is the range of validity of the worldline approximation. More generally x ≈ x * marks the transition in P (x) from its worldline form to the spacetime averaged form.
In this region, This tells us that the value of P (x) decreases exponentially with increasing n. The significance of this result lies in the fact that in a given application of the tail of probability distribution, we are typically interested in the probability of fluctuations which might be large compared to the typical fluctuation, but for which P (x) is still above some threshold of observability. Thus the regime of greatest physical interest may be one where x 1, but is not the x → ∞ limit.
Recall that the form of the tail of tail of P (x) given by Eq. (110) was derived assuming that S N ≈ T N for large N . The numerical results given in Figs. 5, 6, and 7 indicate this happening in some cases. However, in other cases, especially the λ = α = 1/2 case in Fig. 5, S N is somewhat larger than T N for N 200. Although the ratio S N /T N is still decreasing, and might approach one eventually, it is perhaps more important that S N > T N in many cases of physical interest. This implies that Eq. (110) is better viewed as a lower bound on the actual probability distribution in these cases. For example, suppose that S N ≈ A T N in some range of N 1, where A > 1 is a constant. The corresponding range of x is given by Eq. (121), given that n ≈ N for N 1. In this case, we can expect that Eq. (110) underestimates the correct distribution in this range by a factor of 1/A. Note that the overall constant in Eq. (110) is not determined by the arguments presented in this paper. An alternative approach to computing P (x) is numerical diagonalization, which was used in Ref. [7] for the case of time averaging. Work is currently in progress to extend this approach to the case of spacetime averaged operators. In principle, the diagonalization approach is free of the ambiguities encountered in the present work.
D. The case when the sampling length is large compared to the sampling time In much of this paper, we have implicitly assumed that s < 1, or < τ . However, the opposite limit of large sampling length, s > 1 is also of some interest. In this case, the diameter of the ball depicted in Fig. 2 is less than than the thickness of the shell. If s 1, the relevant illustration is the left-hand panel of this figure, but with the ball entirely contained within the shell, as the case where the very small ball is partly outside the much thicker shell will give a small contribution. In this case, the iteration will always be described by Eq. (46) with C = I(∞), and the dominant contribution to the moments, M n , will be given by Eq. (86) with C = 1 and µ = 0 for all n. However, the arguments in Sec. VI that S N ≈ T N still require that N 1. We may now write Eq. (110) for the asymptotic form of the tail of the probability distribution as for x 1, where the constant K is found from Eqs. (105) and (106) to be Unlike the more general case, here K can be computed explicitly once the sampling functions are known. Note that when s > 1, Eq. (65) tells us that where B 1 is a constant. However, the factor of C 2 f g is also a function of s. Now we consider the special case where α = λ = 1/2, where = = √ s 1. Now Eq. (123) becomes where Recall that C f g = C f C g . Further assume that these constants have the values given in Sec. III B: C f ≈ 2.93 and C g as given in Eq. (44), and that B 1 ≈ 1, as illustrated in Figs. 8 and 9. Finally, note that s 4 x = 4 T , as x = τ 4 T and T is the spacetime average of :φ 2 :. We may write the asymptotic probability distribution for T as The factor of s 6 presumably reflects the fact that the limit τ → 0 for fixed is not meaningful. Equation (128) is only valid when T is sufficiently large that P (T ) 1.

VII. SUMMARY AND DISCUSSION
In this paper, we have discussed the fluctuations of quantum stress tensor operators which have been averaged over finite intervals in both time and space. One can view this spacetime averaging as modeling a measurement process which takes place in a finite spacetime region. Some averaging is essential for the operator to have finite moments and hence a meaningful probability distribution. In the two spacetime dimensional CFT models treated in Sec. II, the averaging could be performed in time alone or equivalently in space alone, or it could be both in time and in space. In the latter case, the probability of large fluctuations is suppressed compared to the cases of time averaging alone or space averaging alone. In the four-dimensional models treated in the remainder of the paper, time averaging is essential. Space averaging alone would not suppress an infinite contribution to the moments coming from pairs of modes associated with equal and opposite momenta. For the same reason, there are no quantum inequalities for purely spatial averaging in four dimensions [18].
We have developed a formalism for treating the effects of both space and time averaging. In both cases, we assume that the averaging intervals are finite, and hence are described by compactly supported functions of time and of space.
We have assumed that there is an inertial frame (a laboratory frame) in which the space time averaging can be written as a product of a compactly supported function of time and of a spherically symmetric, compactly supported function of space. The Fourier transform of the former is taken to be asymptotically proportional to e −|ωτ | α , and that of the latter to be asymptotically proportional to e −( k) λ , where 0 < λ ≤ α < 1, τ is the characteristic width of the time sampling functions, and is that of the spatial sampling function.
We developed an iteration procedure which generalizes that used in Ref. [3] for the worldline case, and used this procedure to infer the rate of growth of the moments and the asymptotic form of the stress tensor probability distribution, P (x). Here x = τ 4 T is a dimensionless measure of the averaged operator T . We found that if the spatial sampling scale is small compared to the temporal scale, τ , then there is finite range in x which reproduces the worldline result that P (x) ∼ c 0 x b e −ax c with c = α/3. However, as x increases further, there is a transition region, beyond which P (x) again takes the same functional form, but with different values of the constants. We argued that the transition occurs at a value x * ≈ (τ / ) 3 . In particular, as x → ∞, we find c ≈ α. This larger value of c compared to the worldline case reflects the role of spatial averaging in suppressing large fluctuations. Nonetheless, with α < 1, the probability distribution still falls more slowly than an exponential function. This allows the possibility of large physical effects from the fluctuations of space and time averaged stress tensors.
A typical vacuum fluctuation of the energy density or other stress tensor components is described by the root mean square value, x rms , which is expected to be of order of one in τ = 1 units. In the case where the switching function corresponds to α = 1/2, then the probability density for a large fluctuation of the space and time averaged energy density is roughly proportional to e − √ x . A large fluctuation with x = 100 x rms is expected to be suppressed by a factor of order e −10 = 4.5 × 10 −5 compared to a typical fluctuation. By comparison, in a process described by a Gaussian distribution, such a large fluctuation would be suppressed by a factor of e −10 4 .
The results in this paper potentially have applications to several areas of physics, including phonon fluctuations in condensed matter physics, quantum tunneling, density fluctuations in the early universe [11,12], and the small scale structure of spacetime [14,15].  is less than about 0.003. For larger values of ω,ĥ fit (ω) was selected to approximateĥ asy (ω). However,ĥ(ω) undergoes some oscillations before approachingĥ asy (ω), as may be seen in Fig. 11.

Appendix B: Fulks' generalization of Laplace's method
The classical method of Laplace for asymptotic evaluation of integrals applies to expressions of the form as the parameter h becomes large. As is well-known, the asymptotic behavior of I h is determined by the properties of f and h near the global minimum of φ on the integration range, as well as the character of this minimum -in particular, whether it is a stationary or nonstationary minimum, and whether it is located at an endpoint or in the interior. In this section we discuss more the general problem in which the integral depends on two large parameters, both of which are becoming large, but at different rates. To be specific, we will assume that k grows more slowly than h, to the extent that k = o(h) as h → ∞.
Fulks [19] considered integrals of the form (B2) where −∞ < a < b ≤ ∞, in which φ has a single global minimum at a. As he remarks, it is easy to generalize to the situation in which −∞ ≤ a < b ≤ ∞ and φ has a single interior global minimum at t * ∈ (a, b), and we will state the results for this case.
Theorem 1. Suppose that • φ has a single global minimum at t * ∈ (a, b), near which it is C 3 , and is nonincreasing in [a, t * ] and nondecreasing in [t * , b] • ψ is C 2 near t * , and continuous on [a, b] • f is continuous at t * and f (t * ) = 0; it is also locally integrable and the integral I h,k exists for sufficiently large h, k.
Then if h, k → ∞ with k = o(h), the asymptotics may be given as follows: exp (−hφ(t * ) + kψ(t * )) ; 2. if 0 < lim inf k/ √ h and lim sup k/ √ h < ∞ then exp −hφ(t * ) + kψ(t * ) + ψ (t * ) 2 k 2 2φ (t * )h ; (B4) 3. if √ h = o(k) and ψ (t * ) = 0 then where τ is determined by hφ (τ ) = kψ (τ ) and is the position of the global minimum of −hφ(t) + kψ(t). If, more specifically, k = o(h 2/3 ), one has (Other special cases can be given, for different conditions on the growth of k relative to h and suitable higher regularity of φ and ψ. In general we can solve for τ as a series in k/h and the exponent will contain terms proportional to h(k/h) a for all a ∈ N 0 so that h(k/h) a is constant or growing as h → ∞).
Proof. Apart from the parenthetic comment, all the statements are lightly adapted from Theorems 1-4 and the Corollary of [19], noting the comments that follow the Corollary. The comment is evident by expanding the inverse function to η(t) = φ (t)/ψ (t) using Taylor's theorem with remainder, noting that τ = η −1 (k/h).
As a check on the result for λ = α/2, we note that I N can be evaluated in terms of Kummer functions in this case.
Changing variables to v = q α/2 , one has in agreement with our results above.