Eigenstate Thermalization and Disorder Averaging in Gravity

Naively, a resolution of the black hole information paradox appears to involve microscopic details of a theory of quantum gravity. However, recent work has argued that a unitary Page curve can be recovered by including novel replica instantons in the gravitational path integral. Moreover, replica instantons seem to rely on disorder averaging the microscopic theory, without a definite connection to a single, underlying unitary quantum system. In this letter, we show that disorder averaging and replica instantons emerge naturally from a gravitational effective theory built out of typical microscopic states. We relate replica instantons to a moment expansion of the simple operators appearing in the Eigenstate Thermalization Hypothesis, describe Feynman rules for computing the moments, and find an elegant microcanonical description of replica instantons in terms of wormholes and Euclidean black holes.


I. INTRODUCTION AND SUMMARY
Recent discussions of the black hole information paradox have led to significant progress in understanding information loss in semiclassical effective field theory [1][2][3]. Surprisingly, they have also shed light on how the semiclassical calculation may be rectified in order to be consistent with unitarity, without appealing directly to an underlying microscopic theory. 1 In these recent discussions, the inclusion of replica instantons is central to maintaining consistency with unitary evolution [4,5]. These are Euclidean configurations which contribute to correlations between several copies of the theory, represented by distinct asymptotically AdS boundaries. A single unitary boundary theory with fixed couplings can have no such correlations, so gravitational calculations involving connected multi-boundary correlators are naturally interpreted in the context of an ensemble of theories. The goal of this letter is to clarify the origin of this statistical description.
The statistical description of a single quantum theory is familiar in the context of the Eigenstate Thermalization Hypothesis (ETH) [10][11][12][13]. The basic idea is that a closed (isolated) chaotic many-body system, when probed only with simple (macroscopic) operators, looks for all intents and purposes thermal. This replaces the ideas of Gibbsian ensembles, or couplings to external heat baths, as a foundation for quantum statistical mechanics. We are still discussing a single quantum system, evolving unitarily in a pure state: the effective coarse-graining comes from our limitations in gathering information about the system. Specifically, the matrix elements of a collection of simple operators $\{O_a\}$ in energy eigenstates $\{|E_i\rangle\}$ can be described in a chaotic system (without any simple conserved quantities apart from the energy) as in Eq. (1). In a given theory, the variances of the matrix elements $R^{(a)}_{ij}$ are a fixed set of $O(1)$ numbers. If we lack sufficient information to distinguish a specific state, we can effectively replace the matrix elements by random variables that have the correct statistics.
* jpollack@phas.ubc.ca † rozali@phas.ubc.ca ‡ sully@phas.ubc.ca § daw@phas.ubc.ca
1 For related discussions see also [6][7][8][9].
In the statistical description, to leading order the R ij are independent Gaussian random variables. We emphasize that using this statistical description does not necessarily mean that we are considering an ensemble of theories or coupling our system to an external bath; here, rather, we are interested in the properties of typical states in a single theory.
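For orientation, the ETH ansatz is usually written in the standard form below. This is a sketch under assumptions rather than a transcription of Eq. (1): the label $f^{(a)}_1$ for the smooth diagonal function and the notation $\bar{E}$ for the average energy are our choices, $\Delta E = E_i - E_j$, and the normalization used in Eq. (1) may differ.

```latex
\langle E_i | O_a | E_j \rangle
  = f^{(a)}_1(\bar{E})\,\delta_{ij}
  + e^{-S(\bar{E})/2}\, f^{(a)}_2(\bar{E}, \Delta E)\, R^{(a)}_{ij}
```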
In this letter, we argue that the correct objective for the semiclassical saddle-point expansion of low-energy effective field theory is the reproduction of the correlation functions of simple operators in typical microscopic states. These correlators are well-described by the ETH and we write down effective partition functions that generate their moments.
We derive a set of 'Feynman rules' for diagrammatically computing the partition functions for the moments of correlators. And we show that, in holographic theories, the partition functions and their diagrammatic expansion may be understood in terms of gravitational path integrals. In the path-integral description, the higher moments are macroscopic quantities closely related to the replica instantons of [4,5]. Making a coarse assumption of chaos in the microcanonical ensemble, we find a particularly simple description of the higher moments and of replica instantons in terms of familiar Euclidean black holes connected by 'wormholes'.

II. ENSEMBLES, QUANTUM CHAOS, AND THE ETH
In this section, we expand on the ETH and its relationship to low-energy effective field theory.

arXiv:2002.02971v1 [hep-th] 7 Feb 2020
Consider a microscopic Hilbert space $\mathcal{H}$ for a theory with a gravitational description. We are ultimately interested in computing quantities related to the physics of this quantum system. For concreteness, we take the theory to be a conformal field theory (CFT), in any dimension, with large central charge $c$. Within the microscopic Hilbert space, let us concentrate on the subspace of states within some microcanonical energy window of width $\delta E$ about energy $E$, denoted $\mathcal{H}_E$. 2 We will consider sufficiently high energies $E$ so that the microcanonical Hilbert space has dimension $\exp[S(E, \delta E)]$ with $S \sim c$. For the remainder of this letter, we suppress any dependence on $\delta E$.
In our microscopic theory, we are naturally interested in correlators and transition amplitudes for simple operators and states within the window, e.g.
for states $|\psi_i\rangle, |\psi_j\rangle \in \mathcal{H}_E$ and some appropriately-chosen collection of simple operators $O_a$. We usually think of the operators as 'small' products of local operators, each with $\Delta \sim O(1)$. In the Heisenberg picture this means we also exclude operators evolved for too long in time. 3 The precise choice we make is unimportant to the argument of this letter.
To design an effective field theory we also require a specification of a state or of a distribution over states. A coarse-graining over states reflects uncertainty in determining the true microscopic state using our simple low-energy operators, as well as uncertainty in how the original microscopic state was produced.
What is the correct ensemble of states to study when we coarse-grain? Our goal in this letter is not necessarily to solve this problem exactly, but to explore the consequences in effective field theory. Nevertheless, to be concrete, we will attempt to build a sensible distribution at the coarsest level.
Although energy is a conserved charge, restricting our effective field theory to simple operators for finite times limits the ability of low-energy observers to probe the exact energy of microstates. Only after times exponentially large in the entropy $S$ can operators probe the energy splittings in the microcanonical window. 4 Were there other conserved charges accessible to our simple operators $O_a$, these could be measured to further refine the microcanonical ensemble into sub-ensembles conditioned on the measurement of these charges, just as grand ensembles are used in statistical mechanics.
2 Here we are considering the CFT quantized on a sphere of radius $R$, where the energy is simply related to the conformal dimension $\Delta$ by $E \sim \Delta/R$.
3 In the Schrödinger picture, an initial state which is typical in the microcanonical window may become atypical after exponentially long times due to quantum ergodicity.
4 Crudely, a typical energy splitting is of order $\delta E/e^{S}$, which requires time $\delta t \sim e^{S}/\delta E$ to probe.
We will assume, instead, that our system is chaotic. For the purposes of this letter, we identify chaos by the fact that no such charges are measurable by our simple operators. As a result, typical states relevant to physical processes are indistinguishable from those drawn at random from $\mathcal{H}_E$ by applying a Haar-random unitary in $L(\mathcal{H}_E)$ to a reference state $|\psi_0\rangle \in \mathcal{H}_E$. 5 In this case, for typical states $|\psi_i\rangle, |\psi_j\rangle \in \mathcal{H}_E$ we expect simply by the central limit theorem. Exactly as in the discussion of the ETH above, $f_{ij}$ has the statistics of a matrix of iid random complex numbers with zero mean and unit variance. Note that although the entries for an individual matrix are iid, matrices for different operators will typically exhibit correlations, with a smooth covariance, Eq. (4). As we will show below, this covariance is simply related to microcanonical operator traces.
Our assumption that states cannot be distinguished within the microcanonical ensemble by simple operators is in essence a restatement of the ETH, as in Eq. (1). There, the energy eigenstates themselves, as probed by simple operators, look like typical microcanonical states. Note that in the ETH, the functions $f^{(a)}_2$, and higher moments $f^{(a)}_n$, depend on the energy differences $\Delta E = E_i - E_j$, as well as the average energy $E$. However, if our energy window is narrower than the Thouless energy this dependence disappears [14]. We will limit ourselves to this regime in the following.
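The typicality assumption can be probed numerically in a toy setting. The sketch below is not taken from the letter: it assumes a small dimension `dim` standing in for $e^{S}$, a hypothetical diagonal "simple operator", and Haar-random states built by normalizing complex Gaussian vectors. It compares the sampled mean of diagonal matrix elements with the microcanonical average $\mathrm{tr}(O)/\dim$, and the mean squared off-diagonal element with $\mathrm{tr}(O^2)/\dim^2$.

```python
import math
import random

def haar_state(dim, rng):
    """Return a Haar-random unit vector in C^dim (normalized complex Gaussian)."""
    amps = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(dim)]
    norm = math.sqrt(sum(abs(a) ** 2 for a in amps))
    return [a / norm for a in amps]

def diag_element(psi, phi, diag):
    """Matrix element <psi|O|phi> for a diagonal operator O = diag(diag)."""
    return sum(p.conjugate() * o * q for p, o, q in zip(psi, diag, phi))

rng = random.Random(0)
dim = 32                              # assumption: toy stand-in for e^S
diag = [k % 2 for k in range(dim)]    # hypothetical simple operator, tr(O)/dim = 1/2
samples = 2000

diagonal, offdiag_sq = [], []
for _ in range(samples):
    psi, phi = haar_state(dim, rng), haar_state(dim, rng)
    diagonal.append(diag_element(psi, psi, diag).real)
    offdiag_sq.append(abs(diag_element(psi, phi, diag)) ** 2)

mean_diag = sum(diagonal) / samples
mean_offdiag_sq = sum(offdiag_sq) / samples
trace_mean = sum(diag) / dim                     # tr(O)/dim
trace_sq = sum(o * o for o in diag) / dim ** 2   # tr(O^2)/dim^2

print(mean_diag, trace_mean)
print(mean_offdiag_sq, trace_sq)
```

In this toy model the off-diagonal elements are suppressed by one power of the dimension relative to the diagonal ones, which is the scaling the statistical description relies on.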
In summary, we will build an effective field theory to describe the typical, expected value of correlators or, with even better precision, sufficiently large sums or averages over microscopic states, such as The effective field theory should describe statistical properties of this set of correlators, that is, the moments of correlation functions f . We will see that this results in apparent "disorder averaging" in the effective description. For the holographic theories under consideration, the typical correlators will be simply determined by semiclassical gravitational saddles.

III. GENERATING FUNCTIONS FOR MEAN CORRELATORS
We start with the simplest case: the use of our effective field theory to calculate the averaged correlators To summarize those observables, we can write a generating function for the microcanonical mean values as where By "generating function," we mean as usual that derivatives with respect to sources give expectations: We are implicitly including O 0 = I in the sum, with J 0 = 1 fixed. We make this choice for all microcanonical generating functions in the remainder of this letter. 6

A. Feynman rules for the mean partition function
We now introduce some 'Feynman rules' to compute the mean partition function. These follow from more standard diagrammatics for unitary integrals (e.g. [15]), but we have chosen a notation particularly suited to the case at hand. We will extend these rules in subsequent sections as we explain how to obtain the higher moments.
We indicate a correlator $\langle\psi_i|O|\psi_j\rangle$ by a vertex, associated with a numerical factor $e^{-S}$: The outer lines carry an index of the state, with an arrow indicating whether it is a 'bra' or a 'ket'. The inner lines carry the index structure of the operator that is inserted. When the indices $i, j$ of the outer lines are equal, they can be contracted (and similarly for indices $m, n$ of the operator trace): When contracted, they form a geometry with the topology of a disk, inside of which there is a loop indicating the contraction of the indices of the inserted operator. Thus, our mean partition function is computed by the diagram

B. Example: the microcanonical partition function
The definition (7) immediately leads to a simple expression for the microcanonical partition function by summing over $e^{S}$ random states, converging to their mean: Diagrammatically, we can write It is often more standard to consider a canonical generating function, especially for gravitational effective theories. Then we have the coarse-grained, thermal CFT partition function where we have approximated the sum as an integral when $E \sim c$ is large and $\delta E/E \sim c^{-\alpha}$, $1/2 < \alpha < 1$. Here $\rho(E) = \exp(S)/\delta E$ is the density of states.

C. The gravitational description
The partition function for a CFT at finite temperature is prepared by a path integral on S d × S 1 . When a bulk dual exists, the gravitational picture is well-known (see e.g. [16][17][18]). At high temperatures, the leading semiclassical saddle to the bulk gravitational partition function with boundary S d ×S 1 is a Euclidean black hole. One can compute simple bulk correlation functions in this background, and find that their boundary limit matches the leading-order thermal CFT correlation function [18][19][20].
Like our effective generating function, the bulk saddle is not sensitive to the exponentially small level splittings; this can be seen most easily in the real-time analytic continuation, where bulk correlators continue to decay for all time without random noise at Heisenberg time scales and without quantum Poincaré recurrences [18,21,22,23].
Note that the canonical picture is not essential to this story. With a little more effort, one can similarly find the bulk solution dual to the microcanonical partition function [24]. As long as the microcanonical width scales as $O(1) < \delta E < O(c^{1/2})$, the projection of the bulk gravitational path integral onto a microcanonical band results in a single semiclassical bulk geometry.
Thus, we can equate our Feynman diagram to a gravitational geometry: with an equivalence of partition functions

IV. GENERATING FUNCTIONS FOR SECOND MOMENTS
So far, we have only required that the saddles of our effective field theory describe the mean microcanonical value of simple correlation functions. However, physical processes may probe slightly more fine-grained 'mesoscopic' questions about the CFT, namely the quantities f now involving two copies of the theory. Using Haar averages over the unitary group, one can check (Appendix A) that where the traces are over $\mathcal{H}_E$ (we drop the subscripts when clear from context) and products of operators have indices contracted only in the microcanonical subspace. The first line in the above expression depends only on mean values. At leading order, it is just the product of disconnected mean generating functions. However, it also receives an $e^{-S}$ correction. 7 The second line is a connected contribution not derivable from the mean generating function. Like the first line, it also receives an $e^{-S}$ correction.
7 Technically the correction is $1/(e^{S}+1)$, and so we might wish to think of this as re-summing an infinite number of $e^{-nS}$ contributions.
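The resummation mentioned in footnote 7 can be checked with exact rational arithmetic. The sketch below rests on an assumption not quoted above: that the exact Haar prefactor for two copies of a single random state is $1/(d(d+1))$ with $d = e^{S}$, a standard result. The function names are ours. It verifies that this prefactor equals the leading $1/d^2$ times $(1 - 1/(d+1))$, and that $1/(d+1)$ is the sum of the alternating series $\sum_{n\geq 1} (-1)^{n+1} d^{-n}$.

```python
from fractions import Fraction

def exact_correction(d):
    """The correction 1/(d + 1), with d standing in for e^S."""
    return Fraction(1, d + 1)

def resummed_correction(d, terms):
    """Partial sum of the alternating series sum_{n>=1} (-1)^(n+1) d^(-n)."""
    return sum(Fraction((-1) ** (n + 1), d ** n) for n in range(1, terms + 1))

d = 10  # assumption: toy Hilbert-space dimension
# Assumed Haar prefactor 1/(d(d+1)) versus leading 1/d^2 times (1 - 1/(d+1)).
prefactor = Fraction(1, d * (d + 1))
leading_times_correction = Fraction(1, d ** 2) * (1 - exact_correction(d))

# The gap between partial sum and exact value shrinks by a factor d per term.
for terms in (1, 2, 4, 8):
    gap = abs(resummed_correction(d, terms) - exact_correction(d))
    print(terms, float(gap))
```

Each extra term of the series is one more power of $e^{-S}$, so truncating at the first term already captures the correction up to relative order $e^{-S}$.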
We can write a generating function for the general second moment as for

A. Feynman rules for the second moment partition function
With the Feynman rules we introduced to compute the mean correlator, we can already compute the leading order contribution on each line of (14): To compute the subleading terms, we need to introduce a new vertex which carries a power of $1/(e^{S}+1)$ and enforces that the state indices passing along the lines it joins are equal: We thus have Feynman diagrams for the corrections:

B. The gravitational description
We already argued that the gravitational description of $Z^{(1)}$ is part of the standard AdS/CFT dictionary. Now the gravitational description of $Z^{(2)}$ requires just a slight elaboration. We seek a gravitational partition function to generate correlators of the form Recall that these are microcanonical traces where operator indices have also only been contracted within the microcanonical subspace. Just as before, where the microcanonical partition function was given at leading order by the microcanonical black hole saddle, the insertion of another simple operator in the trace does not shift to another saddle and we again can compute the correlator using the same bulk solution. And as the same energy runs between both operators, they must be equally spaced on opposite sides of the circular Euclidean-time boundary.
Again, the reader may be more familiar with the canonical picture. Translating into the canonical language, we have the approximate identity (in the thermodynamic limit) Here it is even more apparent that the operators are equally spaced on opposite sides of the thermal circle. Thus, we can write where $J^{(1/2)}$ is the source for the operator Our immediate take-away is that the Euclidean wormhole needed to compute $Z^{(2)}$ is just the standard wormhole for the microcanonical (or thermal) black hole.
Furthermore, while we have no direct gravitational interpretation of the corrections to the leading terms, we have suggested that they might be thought of as topologically non-trivial wormholes that glue the geometries together: See [4] for a related discussion of 'handles' joining replica instantons.

C. Example 1: Product of two CFTs
We know that in the full microscopic theory, the microcanonical partition function for a CFT on $(S^d \times S^1)^2$ is just a square of the partition function on one copy. But while the sum over eigenstates factorizes as expected, when we insert simple operators, the ETH tells us that the off-diagonal terms cancel to good approximation, leaving only the diagonal second moment: Thus, the factorization of the partition function is actually misleading in terms of the non-apparent factorization of correlators in this expansion.
We can nevertheless generate the connected contribution to the partition function squared from a partition function that is itself connected. We may replace the sum over eigenstates with an equal number of typical states since summation of large numbers of typical states is sufficient to calculate microcanonical averages. This leads us to the partition function we considered in the mean case, $Z_{\mathrm{CFT}}(E, J_{1,a})$. Using (10) and (15), we see that we need to compute $\mathrm{tr}\,\rho_K^2$. We can view this more generally as a partition function of the form From Eq. (15), this is Note that, while the connected correlator was suppressed in the squared CFT partition function, here it can grow to equal size when $K \sim S$.
To calculate the second Rényi entropy, $S_2(\rho_K) = -\log Z_{K,2}$, we take $O_{a,b} = I$ and find
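The competition between the two contributions near $K \sim S$ can be illustrated in a simple model. This is a sketch under a loud assumption, since the definition of $\rho_K$ is not reproduced above: we take $\rho_K$ to be the equal-weight mixture of $e^{K}$ independent Haar-random states in a Hilbert space of dimension $e^{S}$, for which the Haar average of $|\langle\psi_i|\psi_j\rangle|^2$ with $i \neq j$ is $e^{-S}$. The function names are ours.

```python
import math

def mean_purity(K, S):
    """Haar-averaged tr(rho^2) for an equal mixture of e^K random states in dimension e^S."""
    n_states, dim = math.exp(K), math.exp(S)
    same_state = 1.0 / n_states                  # terms with i = j
    cross_state = (1.0 - 1.0 / n_states) / dim   # terms with i != j, each averaging to 1/dim
    return same_state + cross_state

def renyi2(K, S):
    """Second Renyi entropy S_2 = -log tr(rho^2), evaluated on the averaged purity."""
    return -math.log(mean_purity(K, S))

S = 20.0  # assumption: toy microcanonical entropy
for K in (2.0, 10.0, 20.0, 30.0, 40.0):
    print(K, round(renyi2(K, S), 3))
```

In this model the entropy grows like $K$ for $K \ll S$ and saturates at $S$ for $K \gg S$, with the two terms equal in size at $K = S$.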

V. HIGHER MOMENTS
We can similarly compute higher moments of our correlation functions. Using Haar averages, one can show that where $\sigma$ is a permutation. The normalization $N_i$ and details of the proof can be found in Appendix A. In short: each permutation in (28) is described at leading order by one of our Feynman diagrams showing how identical states are joined together to form traces of the operators. For example, the fourth moment contains a term To compute the normalization, we must resum an infinite series of tree-level contributions from vertices that link the boundaries together. We already showed in Section IV A how to compute the leading order corrections, in terms of vertices that join two boundaries together. Further corrections take the form where $\delta^{(k)}$ We can represent each higher-order correction as a k-point vertex with a Feynman diagram The terms $c_k(S)$ are $O(1)$ combinatorial objects obeying a simple recursion relation.

A. The gravitational description
As each leading-order term needed to compute the higher moments is just a product of microcanonical traces, it will be calculated by the same gravitational saddles as the microcanonical black hole of energy $E$ without operator insertions. For each trace with $p$ operators inserted, we space them equally in order of the trace around the Euclidean time circle. The canonical picture is likewise simple. The canonical generating function for the trace with $p$ operators is just given by where $J^{(m/p)}_{a_m}$ is a source for the operator $e^{-\beta_E H m/p}\, O_{a_m}\, e^{\beta_E H m/p}$.
As in the case of the second moment, we can view the subleading corrections to each trace as an infinite series of k-boundary wormholes joining the true gravitational geometries together, for instance Again, we have no true gravitational solution dual to these corrections, and view this as a heuristic description to motivate further work.

B. Example: higher Rényi partition functions
As an example, let us consider the particular case of $Z_{K,n} = \mathrm{tr}[\rho_K^n]$. Again, we will think of this as a generating function for correlators with simple operators inserted between each $\rho_K$. Let us simplify our notation by inserting a fixed sequence of operators $\{O_m\}$, and expand the trace as For simplicity, we will ignore the 'wormhole' corrections and compute only the leading term for each trace type.
By use of (28), we can rewrite this as where $\sigma_n = (1\,2\,\ldots\,n)$ and $d(\cdot,\cdot)$ is the Cayley distance between the permutations, measuring the minimal number of transpositions to change one permutation to the other. Each such transposition $(j\,k)$ encodes an equality constraint $\psi_{i_j} = \psi_{i_k}$ that allows us to contract the corresponding indices and reduces the number of contributing terms in the sum by a factor of $e^{-K}$. We have decomposed each permutation uniquely as a product of disjoint cycles $\sigma = C_1 \cdot C_2 \cdots C_{c(\sigma)}$, where here we are also counting possibly-trivial cycles. Each trace corresponds to a cycle $C_i = (C_{i1} \ldots C_{id_i})$ of length $d_i$. See Appendix A for further details.
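The Cayley distance and the cycle count $c(\sigma)$ are straightforward to compute. A minimal sketch, assuming permutations are stored as zero-indexed tuples and using the standard identity $d(\sigma, \tau) = n - c(\sigma^{-1}\tau)$; the function names are ours.

```python
def compose_inverse(sigma, tau):
    """Return sigma^{-1} * tau for permutations given as tuples on {0, ..., n-1}."""
    inverse = [0] * len(sigma)
    for i, image in enumerate(sigma):
        inverse[image] = i
    return tuple(inverse[tau[i]] for i in range(len(tau)))

def cycle_count(perm):
    """Number of disjoint cycles c(perm), counting fixed points as trivial cycles."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = perm[j]
    return cycles

def cayley_distance(sigma, tau):
    """Minimal number of transpositions taking sigma to tau: n - c(sigma^{-1} tau)."""
    return len(sigma) - cycle_count(compose_inverse(sigma, tau))

n = 4
identity = tuple(range(n))
sigma_n = tuple((i + 1) % n for i in range(n))   # the n-cycle (1 2 ... n), zero-indexed
print(cayley_distance(identity, sigma_n))
print(cayley_distance(sigma_n, sigma_n))
```

The identity permutation sits at the maximal distance $n - 1$ from $\sigma_n$, while $\sigma_n$ itself sits at distance zero.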
For $O_i = I$, the above expression simplifies to allowing one to explicitly compute the 'leading trace' contribution to the Rényi entropies for any $n$.
We can also rewrite this partition function in terms of canonical gravitational partition functions (adding back in the dependence on sources) as

VI. DISCUSSION
Let us compare our results to [4]. There, statistical averages for the Rényi traces of a density matrix were computed by defining the ensemble in terms of a dual replica wormhole computation in JT gravity. Here, in Section V B, we have done the same calculation, but our averages compute the result for a single, fixed microscopic theory where we have assumed our states are all typical in the microcanonical ensemble. One immediate simplification from this perspective is that our connected replica wormholes are just standard microcanonical wormholes, joined by topologically non-trivial wormholes. In our case, the bulk branes of [4] glue the distinct copies together seamlessly into a single connected boundary. We compare bulk geometries in Fig. 1.
a. Other ensembles. Although we find different bulk geometries from [4], this should not be seen as evidence against its claims. Assuming our system is chaotic, we have argued that the right notion of a typical state in the microcanonical window is one drawn at Haar random. Adopting a different notion of typicality implies that more knowledge about the system is accessible to simple observers, such as more detailed information about the distribution of energy eigenstates in the microcanonical window, or other deviations from chaos. In these cases, there may very well be corrections to the moments of our distribution. Such corrections would require a modified gravitational interpretation (see for instance [25][26][27] and references therein).
Moreover, by restricting ourselves to states within a narrow microcanonical band, we have avoided questions about the dependence of the moments on energy differences, as displayed in the fuller version of ETH presented in Eq. (1). Understanding the relevant bulk geometries for these more general cases, and their relation to previous work, is of obvious interest. What we wish to emphasize is that an understanding of the correct ensemble and notion of typical states need not come from averaging over theories, but can arise from a clearer operational understanding of how to integrate out microscopic splittings in an effective theory.
b. Quenched and annealed averages. In Sec. IV D, we calculated Rényi entropies for highly entangled microcanonical density matrices. This led to the appearance of correlated disorder between replicas, and hence a quenched average from the statistical perspective. In this case, the connected contribution could compete with the disconnected part.
In contrast, the product of partition functions in Sec. IV C involved sums over microscopic states which were uncorrelated between copies. A connected contribution arose from the cancellation of random phases, but this was always exponentially suppressed compared to the disconnected part, because there were no correlations. This is an annealed average over disorder in the statistical picture, and a similar average arises when calculating Rényi entropies for near-product density matrices. See [4] for a related discussion.
c. EFT and the RG. Here, we have used a notion of coarse-graining somewhat different from the standard Wilsonian perspective of integrating out high-energy degrees of freedom. It would be illuminating to directly relate the integrating out of microscopic splittings at high energy, as used in the ETH, to other approaches to renormalization and coarse-graining that have been studied in the holography and field theory literature, e.g. [28][29][30][31][32][33].
d. Completing the EFT. We have employed a fairly skeletal notion of effective field theory in this letter, restricting our consideration to semiclassical bulk saddles but leaving the relation to the full gravitational effective field theory (including states with large deviations from the mean) unclear.
Relatedly, we have concentrated on short-time physics, when states in the Schrödinger picture have not had the opportunity to explore atypical corners of Hilbert space. Many authors have explored the connection between ETH and gravitational saddles in this late time regime [18,21,23,34], and it would be interesting to relate the present letter to this earlier work.
where the $|\psi\rangle$ are random. To select such a typical state, we apply a Haar-random unitary $U \in U(\mathcal{H}_E)$ to an arbitrary reference state $|\psi_0\rangle \in \mathcal{H}_E$.
To perform the ensemble average of (A1), we set $|\psi_{i_m}\rangle = U^{(i_m)}|\psi_0\rangle$ and $|\psi_{j_m}\rangle = U^{(j_m)}|\psi_0\rangle$, and integrate using the group-invariant measure over the choice of unitary operator. To simplify the calculation, we insert a resolution of the (microcanonical) identity on either side of the operators $O_{a_m}$ in any convenient basis (labeled by $k_m$ and $l_m$): Thus, we have
$$\prod_{m=1}^{n} \langle\psi_{i_m}|O_{a_m}|\psi_{j_m}\rangle = \prod_{m=1}^{n} \sum_{k_m, l_m} \langle\psi_0|U^{(i_m)\dagger}|k_m\rangle \langle k_m|O_{a_m}|l_m\rangle \langle l_m|U^{(j_m)}|\psi_0\rangle .$$
Symmetry under permutation of the indices $i_m, j_m$ dictates the form of the final answer: Here $(\cdot)_n$ is a rising Pochhammer symbol. We have also partitioned the set of states $\psi_{i_m}$ into component blocks of identical states. The sizes of these blocks are labeled $q_l$. This normalization can be found by choosing $O$ to be the identity.
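The role of the Pochhammer symbol can be illustrated in the simplest case, where all $n$ states are identical (a single block of size $n$). The sketch below assumes that the sum then runs over all permutations in $S_n$ and that, with every operator set to the identity, each cycle contributes one factor of the dimension $d = e^{S}$. Under those assumptions the sum is $\sum_\sigma d^{c(\sigma)}$, which should equal the rising Pochhammer symbol $(d)_n = d(d+1)\cdots(d+n-1)$. The function names are ours.

```python
from itertools import permutations

def cycle_count(perm):
    """Number of disjoint cycles of a permutation tuple, fixed points included."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = perm[j]
    return cycles

def rising_pochhammer(d, n):
    """Rising factorial (d)_n = d (d + 1) ... (d + n - 1)."""
    result = 1
    for k in range(n):
        result *= d + k
    return result

def identity_operator_sum(d, n):
    """Sum over S_n of d^{c(sigma)}: each trace of the identity gives a factor d."""
    return sum(d ** cycle_count(p) for p in permutations(range(n)))

for d, n in [(2, 3), (5, 4), (7, 5)]:
    print(d, n, identity_operator_sum(d, n), rising_pochhammer(d, n))
```

Dividing by $(d)_n$ therefore normalizes the identity-operator moment to one, which is the consistency condition used to fix the normalization.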

Sums of typical states
We can compute the "empirical" microcanonical average by summing indices $i_m, j_m$ over $e^{K}$ random states. Since the higher moments are exponentially suppressed, the central limit theorem gives

Leading trace terms
The equation (A4) can be rewritten using the structure of the permutations $\sigma$. Suppose we have a cycle decomposition $\sigma = C_1 \cdots C_{c(\sigma)}$, where the cycle $C_i = (C_{i1}, \ldots, C_{id_i})$ has length $d_i$, including trivial cycles with $d_i = 1$. For simplicity, we will focus on the 'leading trace' terms, where for each trace structure we only keep the leading term. After some algebra, we obtain We can project onto a specific ordering of our states $j_m = \tau(i_m)$ by including a product of Kronecker deltas $\delta_{j_m \tau(i_m)}$ in the summation. If such a term is already enforced by the permutation, it is absorbed, but each remaining delta reduces the number of terms by a factor $e^{K}$. We therefore reduce by a total factor $\exp[K d(\sigma, \tau)]$, where $d(\sigma, \tau)$ is the smallest number of transpositions required to change $\sigma$ into $\tau$, also known as the Cayley distance. Explicitly, the leading trace contribution is