Equilibrium to Einstein: Entanglement, Thermodynamics, and Gravity

Here we develop the connection between thermodynamics, entanglement, and gravity. By attributing thermodynamics to timeslices of a causal diamond, we show that the Clausius relation $T\Delta S_{\text{rev}}=Q$, where $\Delta S_{\text{rev}}$ is the reversible entropy change, gives rise to the non-linear gravitational equations of motion for a wide class of diffeomorphism invariant theories. We then compare the Clausius relation to the first law of causal diamond mechanics (FLCD), a geometric identity and necessary ingredient in deriving Jacobson's entanglement equilibrium proposal: the entanglement entropy of a spherical region with a fixed volume is maximal in vacuum. Specifically, we show that the condition of fixed volume can be understood as subtracting the irreversible contribution to the thermodynamic entropy. This provides a "reversible thermodynamic process" interpretation of the FLCD, and shows that the condition of entanglement equilibrium may be regarded as equilibrium thermodynamics for which the Clausius relation holds. Finally, we extend the entanglement equilibrium proposal to the timelike stretched horizons of future lightcones, providing an entanglement interpretation of stretched lightcone thermodynamics.


I. OVERVIEW
The discovery that black holes carry entropy [1,2],
$$S_{BH} = \frac{A_H}{4G}, \tag{1}$$
provides the following two realizations: (i) a world with gravity is holographic [3], and (ii) spacetime is emergent [4]. The former comes from the observation that the thermodynamic entropy of a black hole (1) goes as the area of its horizon $A_H$, and the latter from noting that black holes are spacetime solutions to Einstein's equations. In fact, black holes are not the only spacetime solutions which carry entropy; any solution which has a horizon, e.g., Rindler space and the de Sitter universe, also possesses a thermodynamic entropy proportional to the area of its horizon. The fact that Rindler space carries an entropy is particularly striking, as there the notion of horizon is observer dependent. This leads to the proposal that an arbitrary spacetime, which may appear locally as Rindler space, is equipped with an entropy proportional to the area of a local Rindler horizon, and that thermodynamic relationships, e.g., the Clausius relation $T\Delta S = Q$, have geometric meaning. Specifically,
$$T\Delta S = Q \;\Rightarrow\; R_{ab} - \frac{1}{2}R\,g_{ab} + \Lambda g_{ab} = 8\pi G\,T_{ab}. \tag{2}$$
That is, Einstein gravity arises from the thermodynamics of spacetime [4].
Recently it was shown how to generalize (2) to higher derivative theories of gravity [5]. By attributing a temperature and entropy to a stretched future lightcone (a timelike hypersurface composed of the worldlines of constant and uniformly radially accelerating observers), one finds that the equations of motion for a broad class of higher derivative theories of gravity are a consequence of the Clausius relation $T\Delta S_{\text{rev}} = Q$, where $\Delta S_{\text{rev}}$ is the reversible entropy change, i.e., the entropy growth solely due to a flux of matter crossing the horizon of the stretched lightcone. This result shows that arbitrary theories of gravity arise from the thermodynamics of some underlying microscopic theory of spacetime.
Despite some successes in deriving (1) in specific cases [6,7], it is still unclear what the physical degrees of freedom encoded in $S_{BH}$ correspond to microscopically. Similarly, the underlying microscopics of spacetime giving rise to Einstein's equations is obscure. A potential explanation comes from studying the entanglement entropy (EE) of quantum fields outside of the horizon. For a generic $(d+1)$-dimensional quantum field theory (QFT) with $d>1$, the EE of a region $A$ admits an area law [8,9]
$$S^{EE}_A = c_0\,\frac{\mathcal{A}(\partial A)}{\epsilon^{d-1}} + \ldots, \tag{3}$$
where $\epsilon$ is a cutoff for the theory, illustrating that the EE is in general UV divergent, and $\mathcal{A}(\partial A)$ is the area of the $(d-1)$-dimensional boundary $\partial A$ separating region $A$ from its complement. Identifying $c_0/\epsilon^{d-1}\to 1/4G$ suggests that $S_{BH}$ may be interpreted as the leading UV divergence in the EE of quantum fields outside of a horizon.
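As a rough illustration of the identification above, matching the coefficient of the area law to $1/4G$ fixes the cutoff to be of order the Planck length. This is only a sketch: it assumes units with $\hbar=c=1$, that Newton's constant in $d+1$ spacetime dimensions scales as $G\sim\ell_P^{\,d-1}$, and that $c_0$ is a theory-dependent constant of order one; none of these assumptions is spelled out in the text.

```latex
% Assumptions (not from the source): \hbar = c = 1, G \sim \ell_P^{d-1} in
% (d+1) spacetime dimensions, and c_0 an O(1) theory-dependent constant.
\begin{align}
  \frac{c_0}{\epsilon^{d-1}} = \frac{1}{4G}
  \quad&\Longrightarrow\quad
  \epsilon^{d-1} = 4\,c_0\,G , \\
  G \sim \ell_P^{\,d-1}
  \quad&\Longrightarrow\quad
  \epsilon \sim (4\,c_0)^{1/(d-1)}\,\ell_P .
\end{align}
```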
arXiv:1810.12236v2 [hep-th] 25 Jan 2019
Further progress can be made when we consider quantum field theories with holographic duals. Specifically, in the context of AdS$_{d+2}$/CFT$_{d+1}$ duality [10], one is led to the Ryu-Takayanagi (RT) conjecture [11]:
$$S^{EE}_A = \frac{\mathcal{A}(\gamma_A)}{4G}, \tag{4}$$
which relates the EE of holographic CFTs (HEE) to the area of a $d$-dimensional (static) minimal surface $\gamma_A$ in AdS$_{d+2}$ whose boundary is homologous to $\partial A$. The RT formula (4) is specific to CFTs dual to general relativity, and does not include quantum corrections. The proposal was proved in [13], and has been extended to include quantum corrections [14] and to CFTs dual to higher derivative theories of gravity [15]. When the minimal surface $\gamma_A$ is the horizon $H$ of a black hole, one observes that black hole entropy is equivalent to HEE, $S^{HEE}|_{\gamma_A=H} = S_{BH}$ [16]. Similar to the situation with black hole thermodynamics, this observation suggests that gravity emerges from quantum entanglement, i.e., spacetime is built from entanglement [17,18]. To take on this proposal, one can study the properties of HEE and look for the resulting geometric consequences. Indeed, the EE of a QFT generically satisfies a first law reminiscent of the first law of thermodynamics [19,20]:
$$\delta S^{EE}_A = \delta\langle H_A\rangle. \tag{5}$$
Here $\delta S^{EE}_A$ is the variation of the EE of region $A$, while $\delta\langle H_A\rangle$ is the variation of the expectation value of the modular Hamiltonian $H_A$, defined by $\rho_A \equiv e^{-H_A}$. When one specializes to the case where the region $A$ is a ball of radius $R$, the modular Hamiltonian can be identified with the thermal energy of the region.
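The first law of entanglement quoted above follows from a short first-order computation. The following is a sketch under stated assumptions, not a derivation taken from the source: it assumes the von Neumann form $S=-\mathrm{Tr}(\rho\ln\rho)$ for the EE, a normalized reduced density matrix $\rho_A=e^{-H_A}$, and a perturbation $\rho_A\to\rho_A+\delta\rho_A$ that preserves the normalization, $\mathrm{Tr}\,\delta\rho_A=0$.

```latex
% Assumptions: S = -Tr(rho ln rho), rho_A = exp(-H_A), Tr(delta rho_A) = 0.
% At first order, Tr(rho_A delta ln rho_A) = Tr(delta rho_A) by cyclicity.
\begin{align}
  \delta S^{EE}_A
  &= -\mathrm{Tr}\!\left(\delta\rho_A \ln\rho_A\right)
     -\mathrm{Tr}\!\left(\rho_A\,\delta\ln\rho_A\right) \\
  &= \mathrm{Tr}\!\left(\delta\rho_A\,H_A\right) - \mathrm{Tr}\,\delta\rho_A \\
  &= \delta\langle H_A\rangle .
\end{align}
```

The result holds only to linear order in $\delta\rho_A$; at higher orders the difference between the two sides is controlled by the relative entropy.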
For holographic CFTs the first law of entanglement entropy (5) can be understood as a geometric constraint on the dual gravity side. By substituting (4) into the left hand side (LHS) of (5), and relating the energy-momentum tensor of the CFT to a metric perturbation in AdS, one arrives at the linearized Einstein equations [21]. By considering the higher derivative gravity generalization of (4), similar arguments lead to the linearized equations of motion for higher derivative theories of gravity [22]. The non-linear behavior of gravitational equations of motion is encoded in a generalized form of (5), where one must take into account the relative entropy of excited CFT states [23,24]. In this way, gravity emerges from spacetime entanglement.
Recently it has been shown how to derive gravitational equations of motion from entanglement considerations without explicit reference to AdS/CFT duality; the resulting derivation is therefore slightly more general than those in [21,22]. This approach, first proposed by Jacobson, is the entanglement equilibrium conjecture [25], which can be stated as follows: In a theory of quantum gravity, the entanglement entropy of a spherical region with a fixed volume is maximal in the vacuum. This hypothesis relies on assuming that the quantum theory of gravity is UV finite (as is the case in string theory) and therefore yields a finite EE, where the cutoff $\epsilon$ introduced in (3) is near the Planck scale, $\epsilon\sim\ell_P$, and on being able to identify the entanglement entropy $S^{EE}_A$ with the generalized entropy $S_{\text{gen}}$, which is independent of $\epsilon$ [26,27]:
$$S^{EE}_A = S_{\text{gen}} = S^{(\epsilon)}_{BH} + S^{(\epsilon)}_{\text{mat}}. \tag{7}$$
Here $S^{(\epsilon)}_{BH}$ is the Bekenstein-Hawking entropy (1) expressed in terms of renormalized gravitational couplings, and $S^{(\epsilon)}_{\text{mat}}$ is the renormalized EE of matter fields. The generalized entropy $S_{\text{gen}}$ is independent of $\epsilon$, as the renormalization of gravitational couplings is achieved via the matter loop divergences.
When one interprets the EE as the generalized entropy, one may therefore assign EE to surfaces other than cross sections of black hole horizons, or the minimal surfaces identified in the RT formula (4). In this way, without assuming holographic duality, one discovers a connection between geometry and entanglement entropy. Furthermore, taking into consideration the underlying thermodynamics of spacetime [4], this link provides a route to derive dynamical equations of gravity -not from thermodynamics, but from entanglement.
With these considerations in mind, the variation of the EE of a spherical region at fixed volume is given by
$$\delta S^{EE}_A\big|_V = \delta S^{(\epsilon)}_{BH}\big|_V + \delta S^{(\epsilon)}_{\text{mat}} = 0, \tag{8}$$
i.e., the vacuum is in a maximal entropy state. In the case of small spheres, this entanglement equilibrium condition is equivalent to imposing the full non-linear Einstein equations at the center of the ball [25]. Recently this maximal entropy condition has been generalized to include higher derivative theories of gravity, where $S^{(\epsilon)}_{BH}$ in (7) is replaced by the higher derivative extension of gravitational entropy, the Wald entropy $S^{(\epsilon)}_{\text{Wald}}$, in which case the maximal entropy condition becomes
$$\delta S^{EE}_A\big|_W = \delta S^{(\epsilon)}_{\text{Wald}}\big|_W + \delta S^{(\epsilon)}_{\text{mat}} = 0, \tag{9}$$
where the volume $V$ must be replaced with a new local geometrical quantity called the generalized volume $W$. This condition, when applied to small spheres, is equivalent to imposing the linearized equations of motion for a higher derivative theory of gravity [28].
Here we aim to extend the work of [5] and [28] and develop the connection between thermodynamics, entanglement, and gravity. Specifically, we will consider the geometric set-up of [28] and provide a "physical process" derivation of the geometric identity known as the first law of causal diamond mechanics (FLCD), which is crucial in deriving the entanglement equilibrium condition. We accomplish this as follows. First, we attribute thermodynamics to sections of causal diamonds in an arbitrary spacetime, and compute the Clausius relation $T\Delta S_{\text{rev}} = Q$, where $\Delta S_{\text{rev}}$ is the reversible (gravitational) entropy change computed via a reversible thermodynamic process. Using the techniques developed in [5], we will then show that the Clausius relation is geometrically equivalent to the non-linear gravitational equations of motion for a broad class of diffeomorphism invariant theories of gravity, thereby connecting gravity to thermodynamics. Next, we show how the FLCD relates to the Clausius condition, by explicitly showing that the leading contribution to the generalized volume $\bar{W}$ is precisely the entropy change due to the natural increase of the causal diamond, presenting an equivalence between entanglement equilibrium and (reversible) equilibrium thermodynamics in theories of gravity.
Then, noting the geometric similarities of causal diamonds and stretched lightcones, we will derive a "first law of stretched lightcones", and show that it is equivalent to an entanglement equilibrium condition. This not only sheds light on the microscopic origins of the thermodynamics of stretched lightcones, but also provides another derivation of the non-linear (semi-classical) Einstein equations and (linearized) equations of motion of higher derivative theories of gravity from spacetime entanglement. We will also discuss why the Clausius relation gives rise to the non-linear equations while the entanglement equilibrium condition gives only the linearized equations for higher derivative theories of gravity. This will help us better understand how both [28] and the work here may be extended to include the non-linear contributions of higher derivative equations of motion.
The outline of the paper is as follows. We begin by reviewing the geometric set-up of the stretched lightcone and causal diamond in section (II) and observe the similarities between the two constructions. In section (III) we present an alternative derivation of the FLCD and show how it relates to the Clausius relation $T\Delta S_{\text{rev}} = Q$ applied to the diamond. We further show that the Clausius relation is geometrically equivalent to the full non-linear gravitational equations of motion for a broad class of diffeomorphism invariant theories of gravity. A first law of stretched lightcones is developed in section (IV), where we show that it is equivalent to an entanglement equilibrium condition, which in turn is equivalent to the linearized gravitational equations of motion being satisfied.

A. Stretched Lightcones
We begin with a review of the construction of the stretched lightcone (for more details see [5]). For concreteness, let us first restrict to pure $D$-dimensional Minkowski space. In Minkowski space there are $\binom{D+1}{2} = D(D+1)/2$ independent Killing vectors $\chi^a$ corresponding to spacetime translations and Lorentz transformations. The flow lines of Cartesian boost vectors, e.g., $x\,\partial_t^a + t\,\partial_x^a$, trace the worldlines of Rindler observers, i.e., observers traveling with constant acceleration in some Cartesian direction.
The stretched future lightcone can be viewed as a spherical Rindler horizon generated by the radial boost vector:
$$\xi^a \equiv r\,\partial_t^a + t\,\partial_r^a = \sqrt{x^ix_i}\,\partial_t^a + \frac{t\,x^i}{r}\,\partial_i^a, \tag{10}$$
where $r$ is the radial coordinate and $x^i$ are spatial Cartesian coordinates. We define the stretched future lightcone as a congruence of worldlines generated by these radial boosts. Unlike its Cartesian boost counterparts, which preserve local Lorentz symmetry, the radial boost vector is not a Killing vector in Minkowski space, because radial boosts are not isometries of Minkowski space. The flow lines of $\xi^a$ trace out hyperbolae in Minkowski space. Let us define a codimension-1 timelike hyperboloid via the set of curves which obey
$$r^2 - t^2 = \alpha^2, \tag{11}$$
where $t\geq 0$ and $\alpha$ is some length scale. This hyperboloid can be understood as a stretched future lightcone emanating from a point $p$ at the origin. The constant-$t$ sections of the hyperboloid are $(D-2)$-spheres with an area given by
$$A(t) = \Omega_{D-2}\,r^{D-2} = \Omega_{D-2}\left(\alpha^2+t^2\right)^{(D-2)/2}, \tag{12}$$
where $\Omega_{D-2}$ is the area of the unit $(D-2)$-sphere. Here we have that $\xi^2 = -\alpha^2$, so $\xi^a$ is an unnormalized tangent vector to the worldlines of the spherical Rindler observers. The normalized velocity vector is defined as $u^a = \xi^a/\alpha$, with $u^2=-1$, and has a proper acceleration with magnitude
$$a = \frac{1}{\alpha}. \tag{13}$$
The stretched future lightcone, in Minkowski space, can therefore be understood as a congruence of worldlines of a set of constant radially accelerating observers, all with the same uniform acceleration of $1/\alpha$.
Let us now consider what happens in an arbitrary spacetime. In the vicinity of any point $p$, spacetime is locally flat. The components of a generic metric tensor can always be expanded using Riemann normal coordinates (RNC):
$$g_{ab}(x) = \eta_{ab} - \frac{1}{3}R_{acbd}(p)\,x^cx^d + \mathcal{O}(x^3), \tag{14}$$
where the Riemann tensor is evaluated at the point $p$, the origin of the RNC system. Here $x^a$ are Cartesian coordinates and $\eta_{ab}$ is the Minkowski metric in Cartesian coordinates.
Since a generic spacetime is locally flat, there still exist the $\binom{D+1}{2}$ vectors $\chi^a$ which generate the isometries of Minkowski space locally; however, they are no longer exact Killing vectors. The presence of quadratic terms $\mathcal{O}(x^2)$ in the RNC expansion (14) indicates that these vectors will not satisfy Killing's equation and Killing's identity at some order in $x$. The specific order depends on the nature of the vector $\chi^a$, e.g., for Lorentz boosts the components are of order $\mathcal{O}(x)$. Therefore, for the generators of local Lorentz transformations, Killing's equation and Killing's identity will fail as
$$\nabla_a\chi_b + \nabla_b\chi_a \approx \mathcal{O}(x^2), \tag{15}$$
$$\nabla_a\nabla_b\chi_c - R^d{}_{abc}\,\chi_d \approx \mathcal{O}(x). \tag{16}$$
For the radial boost vector $\xi^a$, by contrast, one finds at leading order $\nabla_t\xi_t = 0$, $\nabla_t\xi_i+\nabla_i\xi_t = 0$, and $\nabla_i\xi_j+\nabla_j\xi_i = \frac{2t}{r}\left(\delta_{ij}-\frac{x_ix_j}{r^2}\right)$. Observe that the $t$-$t$ and $t$-$i$ components satisfy Killing's equation at $\mathcal{O}(1)$, while the $i$-$j$ components fail to obey Killing's equation even at leading order. This means that Killing's identity will also fail; in fact it fails at order $\mathcal{O}(x^{-1})$. We also note that on the $t=0$ surface our radial boost vector is an instantaneous Killing vector.
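The component statements above can be checked directly in flat space. The following is a sketch under stated assumptions: Cartesian coordinates with $\eta_{ab}=\mathrm{diag}(-1,1,\dots,1)$, so that covariant derivatives reduce to partial derivatives, and the radial boost taken to be $\xi^a = r\,\partial_t^a + t\,\partial_r^a$ with $r=\sqrt{x^ix_i}$.

```latex
% Assumptions: flat space, eta = diag(-1,1,...,1), xi^a = r d_t + t d_r.
\begin{align}
  \xi_t &= -r , \qquad \xi_i = \frac{t\,x_i}{r} , \\
  \partial_t\xi_t &= 0 , \\
  \partial_t\xi_i + \partial_i\xi_t &= \frac{x_i}{r} - \frac{x_i}{r} = 0 , \\
  \partial_i\xi_j + \partial_j\xi_i
    &= \frac{2t}{r}\left(\delta_{ij} - \frac{x_ix_j}{r^2}\right) , \\
  \partial_i\partial_j\xi_k &\sim \frac{t}{r^2} = \mathcal{O}(x^{-1}) .
\end{align}
```

The $i$-$j$ components vanish only at $t=0$, which is the sense in which $\xi^a$ is an instantaneous Killing vector on that surface.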
In an arbitrary spacetime our notion of stretched future lightcone must be modified. In a curved spacetime it is straightforward to show that Motivated by the stretched horizon defined in the black hole membrane paradigm [29], we define the stretched future lightcone Σ as follows: Pick a small length scale 2 .
Then select a subset of observers who at time $t=0$ have a proper acceleration $1/\alpha$. If we follow the worldlines of these observers we would find that generically they would not have the same proper acceleration at a later generic time. This problem can be remedied by choosing a timescale $\epsilon\ll\alpha$. Over this timescale the initially accelerating observers have an approximately constant proper acceleration, and the stretched future lightcone $\Sigma$ can be regarded as a worldtube of a congruence of observers with the same nearly constant, approximately outward radial acceleration $1/\alpha$, as can be seen in figure (1). With this definition, therefore, $\Sigma$ can be interpreted as a surface with constant Unruh-Davies temperature $T\equiv a/2\pi$.
Figure 1: A congruence of radially accelerating worldlines $\xi^a$ with the same uniform proper acceleration $1/\alpha$ generates the stretched future light cone of point $p$, and describes a timelike hypersurface, $\Sigma$, with unit outward-pointing normal $n^a$. The boundary of $\Sigma$ consists of the two codimension-two surfaces $\partial\Sigma(0)$ and $\partial\Sigma(\epsilon)$ given by the constant-time slices of $\Sigma$ at $t=0$ and $t=\epsilon$, respectively. The co-dimension-1 spatial ball $B$ is the filled in co-dimension-2 surface $\partial\Sigma$.
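The flat-space statements $\xi^2=-\alpha^2$ and $a=1/\alpha$ used in this subsection can be verified in a few lines. This is a sketch under stated assumptions: the radial boost is taken to be $\xi^a=r\,\partial_t^a+t\,\partial_r^a$, the worldlines lie on the hyperboloid $r^2-t^2=\alpha^2$, and spherical coordinates on Minkowski space are used, where every nonvanishing Christoffel symbol carries an angular index and so drops out for purely radial motion.

```latex
% Assumptions: Minkowski space, xi^a = r d_t + t d_r, r^2 - t^2 = alpha^2.
% Christoffel terms vanish because u has no angular components.
\begin{align}
  \xi^2 &= -r^2 + t^2 = -\alpha^2 , \qquad
  u^a = \frac{\xi^a}{\alpha}
      = \frac{1}{\alpha}\left(r\,\partial_t^a + t\,\partial_r^a\right) , \\
  a^t &= u^b\partial_b u^t = \frac{t}{\alpha^2} , \qquad
  a^r = u^b\partial_b u^r = \frac{r}{\alpha^2} , \\
  a^2 &= -\left(a^t\right)^2 + \left(a^r\right)^2
       = \frac{r^2 - t^2}{\alpha^4} = \frac{1}{\alpha^2} , \\
  T &= \frac{a}{2\pi} = \frac{1}{2\pi\alpha} .
\end{align}
```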

B. Causal Diamonds
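In a maximally symmetric background, a causal diamond can be defined as the union of future and past domains of dependence of its spatial slices, balls $B$ of size $\ell$ with boundary $\partial B$. The diamond admits a conformal Killing vector (CKV) $\zeta^a$ whose flow preserves the diamond (see figure (2)). Conformal Killing vectors are those which satisfy the conformal Killing equation
$$\nabla_a\zeta_b + \nabla_b\zeta_a = 2\Omega\,g_{ab}, \tag{18}$$
where $\Omega$ satisfies
$$\Omega = \frac{1}{D}\nabla_c\zeta^c, \tag{19}$$
and is related to the conformal factor $\omega^2$ of $\bar{g}_{ab}=\omega^2g_{ab}$ via $2\Omega = \zeta^c\nabla_c\ln\omega^2$. Conformal Killing vectors also satisfy the conformal Killing identity
$$\nabla_b\nabla_c\zeta_d = R_{ebcd}\,\zeta^e + (\nabla_c\Omega)\,g_{bd} + (\nabla_b\Omega)\,g_{cd} - (\nabla_d\Omega)\,g_{bc}. \tag{20}$$
We can define a timelike unit normal $U^a$ to $B$, proportional to $\nabla^a\Omega$ and normalized such that $U^2=-1$. In fact, it can be shown in general that on $B$ the gradient $\nabla_a\Omega$ is proportional to $\kappa K\,U_a/(D-2)$, where $\kappa$ is the surface gravity and $K$ is the trace of the extrinsic curvature. One also has that on $\partial B$ the derivative $\nabla_a\zeta_b$ is proportional to $\kappa N_{ab}$, where $N_{ab} = 2U_{[a}N_{b]}$ is the binormal and $N_a$ is the spacelike unit normal orthogonal to $U_b$. The spatial slice $B$ is taken to be the $t=0$ slice.
For concreteness, in $D$-dimensional Minkowski space, the CKV which preserves the causal diamond is [28]
$$\zeta^a = \left(\frac{\ell^2-r^2-t^2}{\ell^2}\right)\partial_t^a - \frac{2rt}{\ell^2}\,\partial_r^a. \tag{25}$$
We point out that $\zeta^a$ goes null on the boundary of the diamond, $t=\pm(\ell-r)$, and $\zeta^2=-1$ when $r=t=0$. We also have $\Omega = -2t/\ell^2$, which vanishes at $t=0$. We see that the causal diamond has constant extrinsic curvature, constant surface gravity $\kappa=2/\ell$, and that $\zeta^a$ is an exact Killing vector on the $t=0$ surface $B$.
Let us remark on the similarities between the radial boost vector $\xi^a$ (10) generating the stretched future lightcone, and the conformal Killing vector $\zeta^a$ (25) preserving the causal diamond. Specifically, we find that $\xi^a$ satisfies
$$\nabla_a\xi_b + \nabla_b\xi_a = \frac{2t}{r}\,\delta^i_a\delta^j_b\left(\delta_{ij}-\frac{x_ix_j}{r^2}\right),$$
where the $\delta^i_a\delta^j_b$ are present to project the non-zero contributions. We see that $\xi^a$ is a vector which satisfies Killing's equation in specific metric components, and one which fails as a modified CKV in other components.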
This comparison leads us to define a conformal factor associated with ξ: for which one finds and It is also straightforward to work out and where L ξ is the Lie derivative along ξ a , and the extrinsic curvature of the spherical boundary ∂Σ is K = h ab K ab = g ab ∇ b n a , since h ab = g ab − n a n b .
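The flat-space properties of the conformal Killing vector quoted in this subsection can be checked explicitly. This is a sketch under stated assumptions: the normalization $\zeta^a = \ell^{-2}\left[(\ell^2-r^2-t^2)\,\partial_t^a - 2rt\,\partial_r^a\right]$ is chosen here because it reproduces $\zeta^2=-1$ at $r=t=0$ and $\kappa=2/\ell$ as stated in the text; Cartesian coordinates with $\eta_{ab}=\mathrm{diag}(-1,1,\dots,1)$ are used, and the surface gravity is read off from the antisymmetric derivative of $\zeta^a$ on $\partial B$.

```latex
% Assumptions: flat space, eta = diag(-1,1,...,1), and the normalization
% zeta^a = [(l^2 - r^2 - t^2) d_t - 2 r t d_r] / l^2 (inferred, see lead-in).
\begin{align}
  \zeta_t &= -\frac{\ell^2 - r^2 - t^2}{\ell^2} , \qquad
  \zeta_i = -\frac{2t\,x_i}{\ell^2} , \\
  \partial_a\zeta_b + \partial_b\zeta_a &= 2\Omega\,\eta_{ab} ,
  \qquad \Omega = -\frac{2t}{\ell^2} , \\
  \zeta^2 &= -\frac{\left[\ell^2-(r+t)^2\right]\left[\ell^2-(r-t)^2\right]}{\ell^4} , \\
  \kappa^2 &= -\frac{1}{2}\left(\partial_a\zeta_b\right)\left(\partial^a\zeta^b\right)
              \Big|_{t=0,\,r=\ell} = \frac{4}{\ell^2} ,
  \qquad T = \frac{\kappa}{2\pi} = \frac{1}{\pi\ell} .
\end{align}
```

Since $\Omega$ vanishes at $t=0$, $\zeta^a$ satisfies Killing's equation on $B$, and $\zeta^2$ vanishes on $r\pm t=\ell$, the null boundary of the diamond.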

III. THERMODYNAMICS OF CAUSAL DIAMONDS
Consider the past of the causal diamond, i.e., the bottom half below the $t=0$ co-dimension-2 spherical slice $\partial B$ of Fig. 2. Our picture for a physical process will be comparing the entropy between a time slice at $t=-\epsilon$, for positive $\epsilon$, and the slice at $t=0$, after some energy flux has entered the past of the diamond. On the null boundary of the diamond, $\zeta^2=0$, and therefore, in Minkowski space, the boundary of the causal diamond represents a conformal Killing horizon of constant surface gravity $\kappa$, and hence an isothermal surface with Hawking temperature $T=\kappa/2\pi$. An arbitrary spacetime will include curvature corrections; however, to leading order in an RNC expansion about a point $p$, $\zeta^2\approx 0$ there and $\kappa$ remains approximately constant. If we followed the worldline of $\zeta$ from time $t=-\epsilon$ to $t=0$, we would generically find that $\kappa$ differs between these time slices. Motivated by the set-up of the stretched lightcone, we choose a timescale over which the surface gravity $\kappa$ is approximately constant. Therefore, in an arbitrary spacetime $\partial B$ of the causal diamond represents a local conformal Killing horizon, which may be interpreted as an isothermal surface with constant Hawking temperature $T=\kappa/2\pi$.
We associate with this conformal Killing horizon a gravitational entropy [30], i.e., time-slices $\partial B$ of the causal diamond have an attributed entropy. The form of the entropy depends on the theory of gravity under consideration, e.g., for Einstein gravity, the correct form is the Bekenstein-Hawking entropy (1). Here we consider a diffeomorphism invariant theory of gravity in $D$ spacetime dimensions defined by the action $I$:
$$I = \frac{1}{16\pi G}\int d^Dx\,\sqrt{-g}\,L\left(g^{ab},R_{abcd}\right) + I_{\text{matter}}.$$
Here the gravitational Lagrangian $L$ is written as a function of the metric and the curvature tensor $R_{abcd}$. This action encompasses a large class of theories of gravity which do not involve derivatives of the Riemann tensor, e.g., $f(R)$ gravity and Lovelock theories of gravity. The equations of motion for such theories are
$$P_a{}^{cde}R_{bcde} - 2\nabla^c\nabla^dP_{acdb} - \frac{1}{2}L\,g_{ab} = 8\pi G\,T_{ab}, \qquad P^{abcd}\equiv\frac{\partial L}{\partial R_{abcd}}.$$
It is straightforward to verify that in the case of Einstein gravity, $L=R$, this reduces to Einstein's field equations. For a general theory of gravity of this type we must generalize the Bekenstein-Hawking entropy formula. We take this generalization to be the Wald entropy [31], which is built from the Noether potential associated with a diffeomorphism $x^a\to x^a+\zeta^a$, where we will take $\zeta^a$ to be a timelike (conformal) Killing vector, integrated against the infinitesimal binormal element of $\partial B$. Wald's Noether charge construction of gravitational entropy was originally developed to yield an expression for the entropy of a stationary black hole in more general theories of gravity. Here we make the non-trivial assumption of local holography that this gravitational entropy can also be attributed locally to the spatial sections of causal diamonds whose structure is preserved by $\zeta^a$.
For computational convenience, we will first not work directly on the horizon, but instead work on the timelike stretched horizon of the causal diamond, a codimension-1 timelike surface we call $\Sigma$. At the end of the calculation we will take the limit where our stretched horizon coincides with the conformal Killing horizon.
The fact that we have to take the step in which we move to the conformal Killing horizon -a null hypersurface -is a marked difference with the analogous calculation using stretched future lightcones [5].
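As a consistency check on the earlier claim that the general equations of motion reduce to Einstein's equations when $L=R$, one can evaluate the tensor $P^{abcd}\equiv\partial L/\partial R_{abcd}$ explicitly. This is a sketch under stated assumptions: it takes the equations of motion in the form $P_a{}^{cde}R_{bcde}-2\nabla^c\nabla^dP_{acdb}-\tfrac{1}{2}L\,g_{ab}=8\pi G\,T_{ab}$, a common convention for $L(g^{ab},R_{abcd})$ theories that may differ from the source by signs or normalization.

```latex
% Assumption: EOM in the form P_a^{cde} R_{bcde} - 2 D^c D^d P_{acdb}
% - (1/2) L g_{ab} = 8 pi G T_{ab}; conventions may differ from the source.
\begin{align}
  L = R = g^{ac}g^{bd}R_{abcd}
  \quad&\Longrightarrow\quad
  P^{abcd} = \frac{1}{2}\left(g^{ac}g^{bd} - g^{ad}g^{bc}\right) , \\
  \nabla_e P^{abcd} &= 0 , \\
  P_a{}^{cde}R_{bcde}
  &= \frac{1}{2}\left(\delta_a^d\,g^{ce} - \delta_a^e\,g^{cd}\right)R_{bcde}
   = R_{ab} , \\
  R_{ab} - \frac{1}{2}R\,g_{ab} &= 8\pi G\,T_{ab} .
\end{align}
```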
The Wald entropy at time t is The total change in entropy between t = 0 and t = − is where we have invoked Stokes' theorem for an antisymmetric tensor field M ab : where the overall sign depends on whether Σ is timelike (−), or spacelike (+). For our discussion of causal diamond thermodynamics we are interested in the timelike version, however, it will be illustrative for future discussion if we do not specify, for now, the signature of codimension-1 surface Σ.
Moving on, we have We have yet to use any properties of ζ d , which to leading order is a conformal Killing vector, satisfying (18) and (20). We have then: and where we used that P abcd shares the same algebraic symmetries of the Riemann tensor. Substituting (44) and (45) into (41) yields where the overall + (−) sign indicates that Σ is a timelike (spacelike) surface. In appendix (A), we consider the spacelike surface and provide an alternative derivation to the first law of causal diamond mechanics for higher derivative theories of gravity as presented in [28].
Using that $d\Sigma_a = N_a\,dA\,d\tau = \partial^r_a\,dA\,d\tau = (x^i/r)\,\partial^i_a\,dA\,d\tau$, and that we are integrating over a spherically symmetric region, we find that, to leading order in the RNC expansion, the final two terms integrate to zero, since we are integrating over a timelike surface with compact spherical sections. Thus, to leading order, we arrive at expression (47). The two terms we neglect here, of course, have higher order contributions due to the RNC expansion, and in order to derive the non-linear equations of motion we must deal with these higher order contributions. We follow the technique developed in [5], in which we modify the conformal Killing vector $\zeta^a$ by adding corrections of order $\mathcal{O}(x^3)$ and higher such that they remove the undesired higher order effects of the two terms we neglect. The details may be found in appendix (B).
The above expression (47) represents the leading order contribution to the total entropy variation, including the effect due to the natural increase of the spatial sections of the (past) causal diamond, which is an irreversible thermodynamic process. Presently we are interested in the change in entropy due to a flux of matter crossing the conformal horizon, which is a reversible thermodynamic process (see footnote 4). We therefore remove the entropy due to the natural increase of the diamond, $\bar{S}$: where to get to the second line we used that $\partial_i\zeta_j\propto\delta_{ij}$, which vanishes upon contraction with $P^{tijk}$, and $\partial_t\zeta_j = -2x_j/\ell^2$, and in the third line we used that $2/\ell^2 = \kappa K/(D-2)$, and again the fact that we are integrating over a spherical subregion. To this order $P^{abcd}$ is constant, allowing us to pull it through the integral.
Footnote 4: We can consider the following analogy to help describe this process and our use of the terms 'irreversible' and 'reversible'. Imagine we have a box of gas sitting on a burner. When the box opens, the gas will leave the box simply due to free expansion, which has an associated irreversible entropy increase. The heating of the box will also lead to a reversible entropy increase. The natural increase of our diamond, to the past of $t=0$, is analogous to the free expansion of the gas, and we therefore identify this process as having an associated irreversible entropy increase.
We may arrange the above suggestively (see footnotes 5 and 6). The resulting expression is recognized to be the leading contribution of the generalized volume $\bar{W}$ (A3), with $\Delta\bar{S} = \bar{S}(0)-\bar{S}(-\epsilon)$ and $\Delta\bar{W} = \bar{W}(0)-\bar{W}(-\epsilon)$.
Since the area of the later time slice $\partial B(0)$ is larger than that of $\partial B(-\epsilon)$, one has $\Delta\bar{S}>0$. Note that this is not the case for time-slices to the future of $t=0$, and therefore the thermodynamics of causal diamonds is peculiar; we will have more to say about this in the discussion. We therefore define the reversible entropy variation as
$$\Delta S_{\text{rev}} \equiv \Delta S_{\text{Wald}} - \Delta\bar{S}. \tag{52}$$
Calling this variation the reversible change in entropy is motivated by analogy with the Clausius relation in ordinary thermodynamics, $Q = T\Delta S_{\text{rev}}$.
Footnote 5: As written, $\bar{S}$ is a bit misleading. It would appear that $\bar{S}$ goes like the volume rather than the area. However, this is in fact not the case. Indeed, in the case of general relativity, using $K=(D-2)/\ell$, and that on the $t=0$ slice $\partial B$, $r=\ell$, it is straightforward to show that $\bar{S}=A/4G$, where $A$ is the area of the spherical subregion $\partial B$.
Footnote 6: In the context of general relativity, we note that this expression is nothing more than the Smarr formula for a maximally symmetric ball in flat space; the "thermodynamic volume" is notably absent [32]. This is because we are considering perturbations about Minkowski spacetime. Even if we considered perturbations about a more general maximally symmetric space (MSS), the thermodynamic volume would be subdominant.

A. Gravity From Thermodynamics
Next, following [4,5], define the integrated energy flux $Q$ across $\Sigma$ in terms of the energy-momentum current $T^{ab}\zeta_b$, where the energy-momentum tensor can be approximated to leading order by its value at $p$. As we make the transition to the conformal Killing horizon, the interior of $\Sigma$ becomes causally disconnected from its exterior, allowing us to identify $Q$ as heat, i.e., energy which flows into macroscopically unobservable degrees of freedom. The Clausius relation $T\Delta S_{\text{rev}}=Q$ for our set-up results in a geometric constraint. Since this holds for all causal diamonds, we may equate the integrands. When the timelike stretched surface moves to the conformal Killing horizon, one has $g_{ab}N^a\zeta^b=0$. Therefore, at the conformal Killing horizon, the resulting equation is valid up to a term of the form $f\,g_{ab}$, where $f$ is some yet to be determined scalar function. The form of $f$ can be determined by demanding covariant conservation of $T_{ab}$. Specifically, we are led to
$$P_a{}^{cde}R_{bcde} - 2\nabla^c\nabla^dP_{acdb} - \frac{1}{2}L\,g_{ab} + \Lambda g_{ab} = 8\pi G\,T_{ab},$$
where $L = L(g^{ab},R_{abcd})$, $P^{abcd}\equiv\partial L/\partial R_{abcd}$, and $\Lambda$ is some integration constant. We recognize the above as the equations of motion for a general theory of gravity. In this way we see that the equations of motion for a theory of gravity arise from the thermodynamics of causal diamonds. We have reproduced the results of [5], however, using the geometric construction of causal diamonds.
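To illustrate how the undetermined function $f$ is fixed by conservation, consider the simplest case of Einstein gravity. This is a sketch under stated assumptions: it assumes that the horizon limit yields $R_{ab}+f\,g_{ab}=8\pi G\,T_{ab}$, which is the form the general argument would take for $L=R$, and uses only the contracted Bianchi identity together with $\nabla^aT_{ab}=0$.

```latex
% Assumption: for L = R the horizon limit gives R_{ab} + f g_{ab} = 8 pi G T_{ab}.
% Inputs: contracted Bianchi identity and covariant conservation of T_{ab}.
\begin{align}
  \nabla^a\!\left(R_{ab} + f\,g_{ab}\right) &= 8\pi G\,\nabla^aT_{ab} = 0 , \\
  \nabla^aR_{ab} = \frac{1}{2}\nabla_bR
  \quad&\Longrightarrow\quad
  \nabla_b\!\left(f + \frac{1}{2}R\right) = 0 , \\
  f &= -\frac{1}{2}R + \Lambda , \\
  R_{ab} - \frac{1}{2}R\,g_{ab} + \Lambda\,g_{ab} &= 8\pi G\,T_{ab} .
\end{align}
```

The constant $\Lambda$ appears here as an integration constant, exactly as in the general statement above.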
This approach to deriving the equations of motion offers a thermodynamic perspective on the derivation of linearized equations of motion from the entanglement equilibrium proposal as presented in [28]. In particular, we found that the generalized volume $\bar{W}$ can be interpreted as the natural increase of the causal diamond. To apply the Clausius relation for a reversible thermodynamic process, we removed this increase and, therefore, $\bar{W}$ is the contribution which generates irreversible thermodynamic processes in the causal diamond construction. We note that removing $\bar{W}$ also appears in the first law of causal diamond mechanics (A15), and consequently the entanglement equilibrium condition (A27).
It is interesting to compare the above construction with that of the stretched future lightcone. As shown in [5], the non-linear equations of motion for the same class of theories of gravity arise as a consequence of the Clausius relation applied to the stretched future lightcone, a co-dimension-1 timelike hyperboloid. Unlike the above derivation, one need not take the limit that the stretched horizon goes to a null surface. This is because the stretched horizon of the future lightcone acts as a causal barrier between observers living on the exterior of the cone and its interior, allowing for a well-defined notion of heat even in the absence of a Killing horizon. In the causal diamond set-up we had to take the limit that the stretched horizon moves to the conformal Killing horizon for technical reasons; it is unclear what the physical reason for this may be, as the energy passing through the past causal diamond seemingly has a well-defined notion of heat.
Moreover, in the future stretched lightcone set-up, one similarly removes the entropy change due to the natural expansion of the hyperboloid. In light of the result above, that the entropy change due to the natural increase in the diamond may be interpreted as the generalized volume, we naively guess that the natural entropy change of the hyperboloid might have a similar interpretation. This suggests that we can think about the derivation of the gravitational equations of motion using the stretched future lightcone construction from an entanglement entropy perspective, i.e., perhaps the gravitational equations of motion arise from an entanglement equilibrium condition, analogous to that given in [25,28]. We explore this idea in the next section.

IV. ENTANGLEMENT OF LIGHTCONES
Recently the equations of motion for a generalized theory of gravity were derived from the thermodynamics of the stretched future lightcone [5]. Of course, thermodynamics is a placeholder until a more precise quantum prescription of a system is developed; in this case, a quantum theory of spacetime. We can make progress, however, following the recent paradigm relating entanglement to geometry. Indeed, it is natural to interpret the thermodynamic entropy of the stretched lightcone as coming from the entanglement of quantum fields outside of the stretched horizon. Moreover, due to the geometric similarity between causal diamonds and stretched lightcones (II), we are motivated by the derivation of linearized gravitational equations from the entanglement properties of the causal diamond [25,28]. We therefore aim to derive linearized equations of motion via the entanglement of the stretched lightcone.
Our procedure is as follows. First we compute $\delta S_{\text{Wald}}$ and derive an off-shell geometric identity analogous to the first law of causal diamonds, which we call the first law of stretched lightcones. We will use the Noether-charge approach illustrated in (A). Next we will show how this off-shell identity is equivalent to the variation of the entanglement entropy, following arguments presented in [28]. Finally, we will find that the linearized form of the gravitational equations emerges from an entanglement equilibrium condition. In essence, we are simply considering Jacobson's entanglement equilibrium proposal [25] for the geometry of stretched lightcones in an arbitrary background (where we explicitly consider perturbations to Minkowski space). One expects to find a result similar to that established in [28], simply by noting that the stretched lightcone shares enough geometric similarities with the causal diamond.

A. First Law of Stretched Lightcones
Recall that ξ a satisfies (29) where Ω ξ = t/r, and we have defined The derivation presented in (A) relies on the fact that ζ a is an exact conformal Killing vector in flat space; specifically the fact that ζ a satisfies the conformal Killing identity. Here the vector ξ a is not a conformal Killing vector, and therefore, it will not satisfy the conformal Killing identity. The issue is thatg ab defined above is not the metric, and therefore this object will have a nonvanishing covariant derivative. However, since we are considering the time t = 0 surface, the fact that ξ a does not satisfy the conformal Killing identity is not a problem for us because Ω ξ will vanish at t = 0. Therefore, all terms Ω ξ ∇g which would appear can be neglected. Therefore, we may simply start from (46) above with the following substitutions Moreover, for the lightcone κ = 1, and on the codimension-1 spatial ball B, α is set to be constant 7 , and has volume element dB a = U a dV . Thus, we have that 7 The acceleration of the spherical Rindler observers is fixed as a constant at t = 0, via the construction described in (II).
where we dropped the term proportional to Ω ξ as it vanishes at t = 0. Let us study the bottom line. Using (∇ c Ω ξ )| t=0 = −1/r 2 ξ c , we find to leading order we have Note that this object is proportional to the surface area of the spherical subregions; in fact in the case of Einstein gravity, P abcd GR = 1 2 (g ac g bd − g ad g bc ), the above simply becomes − A ∂B 4G , the Bekenstein-Hawking entropy. Motivated by the derivation of the first law of causal diamonds in [28] we might be inclined to refer to this object as the generalized area 8 , however, this object appears in [5] (see equations (67)-(68) of their paper), and is identified as the entropy due to the natural background expansion of the hyperboloid,S. Specifically, and therefore, Next, introduce the matter energy H m u associated with spherical Rindler observers with proper velocity u, Then, following the same arguments given in [25,28] (and reviewed in (A)), we find is equivalent to the linearized gravitational equations of motion about flat spacetime for L(g ab , R abcd ) theories of gravity: The off-shell identity is simply where δC ξ represents the linearized constraint that the gravitational field equations hold. We can actually understand this first law of stretched lightcones as the Iyer-Wald identity [33] in the case of the stretched horizon of spherical Rindler observers, rather than the dynamical horizon of a black hole. As illustrated in (C), we may actually interpret the generalized area as the variation of the gravitational Hamiltonian.
Moreover, the first two terms on the LHS of (67) can be combined into a single object [28], namely, the variation of the Wald entropy while keeping the generalized area constant, i.e., leading to The Wald formalism contains the so-called JKM ambiguities [34]; one may add an exact form dY linear in the field variations and their derivatives to the Noether current, and Y to the Noether charge. This would lead to a modification of S Wald andS. However, it is clear the combined modification will cancel, allowing us to write whereS =S +S JKM . For more details on this calculation one need only follow the calculation presented in [28] as it is identical in the stretched lightcone geometry.

B. Gravity from Entanglement
Our aim here is to show how the first law of stretched lightcones -an off-shell geometric identity -can be understood as a condition on entanglement entropy. Before we consider the scenario with stretched lightcones, let us recall what happens in the case of a causal diamond. The entanglement equilibrium conjecture makes four central assumptions which we outline here and are reviewed in (A). These assumptions include [35]: (i) Entanglement separability, i.e., S EE = S U V + S IR ; (ii) equilibrium condition, i.e., a simultaneous variation of the quantum state and geometry of the entanglement entropy of the causal diamond is extremal, and the geometry of the causal diamond is that of a MSS; (iii) Wald entropy as UV entropy, i.e., the variation of the UV entropy is proportional to the Wald entropy at fixed generalized volume, and (iv) CFT form of modular energy, i.e., the modular energy is defined to be the variation of the expectation value of the modular Hamiltonian -which for spherical regions may be identified with the Hamiltonian generating the flow along the CKV which preserves the causal diamond -plus some scalar operator X.
Reference [25] showed that the above postulates can be used to derive the full non-linear Einstein equations, while [28] showed these postulates lead to the linearized gravitational equations for higher derivative theories of gravity. Here we will discuss how to justify the above assumptions (for a more pedagogical review, see [35]) and attempt to apply a similar set of assumptions for the case of stretched lightcones.
Assumption (i), where we require minimal entanglement between IR and UV degrees of freedom, is in fact a fundamental feature of renormalization group (RG) flows. More precisely, an RG flow requires a decoupling between high and low momentum states. Thus, in a Wilsonian effective action we would expect minimal entanglement between UV and IR modes. We also would assume that this basic feature of effective field theory to continue to hold in the theory's UV completion. This assumption is reasonably justified in both the causal diamond and stretched lightcone set-ups.
The second assumption (ii) asserts that the vacuum state in a small region of spacetime may be described by a Gibbs state, and that for a fixed energy, this state will have a maximum entropy, i.e., δS EE = 0. Moreover, the requirement that the causal diamond is described by a MSS is simply there to prevent curvature fluctuations from producing a large backreaction which would spoil the equilibrium condition. In other words, the semiclassical (linearized) equations hold if and only if the causal diamond is in thermodynamic equilibrium. Likewise, we may safely make this same assumption about the stretched lightcone: when the stretched lightcone is in thermal equilibrium, the gravitational equations hold (via the Clausius relation), and vice versa.
Assumption (iii), like assumption (i), is also not very controversial. All that is being said is that one should identify the area ∂B of the causal diamond, and, similarly, the cross-sectional area of the stretched lightcone ∂Σ, as the area of the planar Rindler horizons existing at the edge of the causal diamond, and the area of the timelike spherical Rindler horizon, respectively. Motivated by the Ryu-Takayanagi proposal, we then simply identify these areas with the entanglement entropy of each region. We should point out a difference between the two pictures, however. It is known that the entanglement entropy of the causal diamond D[B], i.e., the causal domain of a spherical ball region B, is equivalent to the entanglement entropy of B itself. Meanwhile, we are saying that the entanglement entropy of the stretched horizon, Σ, is equivalent to the ball B whose boundary is ∂Σ. This has been established in the context of spherical Rindler space, which we may interpret our stretched lightcone as being: The entanglement entropy of spherical Rindler space is equal to the area of the horizon ∂Σ [36].
Unlike the first three assumptions, which all rely on the underlying UV physics, assumption (iv) makes an assertion about the form of the modular Hamiltonian for IR degrees of freedom. In the case of causal diamonds one makes two observations. First, a causal diamond in Minkowski space may be conformally transformed to a (planar) Rindler wedge. Then, via an application of the Bisognano-Wichmann theorem [37], for CFTs the modular Hamiltonian H mod , defined via the thermal state ρ IR = Z −1 e −H mod , is proportional to the Hamiltonian generating the flow along the CKV ζ, i.e., H mod = (2π/κ) H m ζ [16]. This implies then that the variation of the modular Hamiltonian is equal to the variation of H m ζ , plus some additional spacetime scalar X, i.e., This specific assumption is interesting in that it may be explicitly checked, and has been justified [35,38], though with the stipulation that X may depend on .
In the case of stretched lightcones, our assumption is then that the modular Hamiltonian H mod u , defined by ρ Σ = Z −1 e −H mod , is proportional to the radial boost Hamiltonian, and that we may also include a spacetime scalar X. We would like to be able to similarly justify this assumption, as was accomplished in the causal diamond case. While currently this assumption is non-trivial and has not been computationally justified, we find that it is reasonable, as we now describe. The stretched lightcone Σ, like spherical Rindler space, can be understood as the union of Rindler planes; indeed, if we constrain ourselves to the y = z = 0 plane, the radial boost vector ξ a = rδ a t + t∂ a r reduces to a Cartesian boost vector. Each Rindler plane may be associated with a single causal diamond. The union of these causal diamonds yields a single "radial causal diamond" [36] 9 . Therefore, the congruence of uniformly and constantly, radially accelerating observers comprising the stretched lightcone have an associated radial causal diamond. Moreover, the radial boost ξ a preserves the flow of the hyperboloid Σ. Our assumption is that the entanglement entropy of the stretched lightcone is that of the radial causal diamond which is also that of spherical region B. Thus we define the modular Hamiltonian as above and assume that it is proportional to the Hamiltonian generating the flow of Σ. For similar arguments given in [35,38], we expect -but have not proved -that for CFTs we may also modify the modular Hamiltonian by a spacetime scalar.
Let us now briefly show how the first law of stretched lightcones, an off-shell geometric identity, can be understood as a condition on entanglement entropy. In particular, we can follow the discussion given in [28] (also reviewed in (A)), making only a few simple changes. We perform a simultaneous (infinitesimal) variation of the entanglement entropy S EE of a stretched lightcone with respect to the geometry and quantum state. By entanglement separability, δS EE takes the form where the UV contribution is state independent and is assumed to be given by δS U V = δ(S Wald + S JKM )S , while the IR contribution comes from the modular Hamiltonian via the first law of EE, δS IR = δ H mod = 2παδ H m u . Then, using the first law of entanglement entropy for a system in which the background geometry is also varied (A27), we arrive at valid for minimally coupled, conformally invariant matter fields. Thus, there is an equivalence between the following statements: (i) S EE is maximal in vacuum for all balls in all frames, and (ii) the linearized higher derivative equations hold everywhere. In other words, the entanglement equilibrium condition holds if and only if the linearized higher derivative equations of motion are satisfied. This equivalence may be verified via a simple modification of the calculations presented in [28]. We also note that here we considered perturbations about Minkowski space; one could, in principle, generalize this to a MSS. Likewise, while the above discussion was particular to theories of gravity described by L(g ab , R abcd ), i.e., those which do not depend on the derivatives of the Riemann tensor, we could have included those derivatives as well.

V. DISCUSSION AND FUTURE WORK
After reviewing the geometric similarities between causal diamonds and stretched future lightcones (II), we presented a derivation of the full non-linear gravitational equations of motion in (III) by assigning thermodynamics to the conformal Killing horizon of the causal diamond, i.e., a Hawking temperature T H = κ/2π, and a local holographic entropy S Wald . The equations of motion were a geometric consequence of the Clausius relation T H ∆S rev = Q, where ∆S rev (52) is defined as the entropy solely due to a matter flux crossing the horizon. We found that the quantity K 2GW , whereW is the generalized volume, can be understood as the entropy of the natural increase of the causal diamond. This provides a microscopic interpretation of the generalized volume. Our physical process derivation of the equations of motion was motivated by [5], where the gravitational equations were derived from the thermodynamics of stretched future lightcones.
Motivated by [25,28], in (IV) we showed how to derive the linearized gravitational equations of motion from the entanglement equilibrium proposal, i.e., that the entanglement entropy for spherical entangling regions is maximal in the vacuum. We did this by first deriving an off-shell geometric identity, the first law of stretched lightcones, and showed that it was equivalent to the first law of entanglement entropy in the case of spherical subregions and conformally invariant matter. In the derivation of the first law of stretched lightcones we found an expression for the generalized area, which is nothing more than the entropy due to the natural expansion of the stretched lightcone. To complete this derivation, however, we had to make the non-trivial assumption that the entanglement entropy of the spherical entangling region ∂Σ is the entanglement entropy of Σ, and the modular Hamiltonian H mod is proportional to the radial boost Hamiltonian H m u . This is a speculation which requires justification and will be pursued in the future.
We can summarize our findings of (IV) and the equivalent statement for causal diamonds [25,28] as whereS is the irreversible entropy due to the natural change of the background geometry -identified as the generalized volume in the case of causal diamonds, or the generalized area in the case of stretched lightcones -and where T is the temperature associated with the horizon of the surface, namely, the Hawking temperature T H = κ/2π in the case of causal diamonds, or the Unruh-Davies temperature T = 1/2πα in the case of stretched lightcones. Entropy being maximal in the vacuum implies that the linearized constraint is satisfied, leading to the linearized form of the equations of motion of higher derivative theories of gravity, or, in the special case of Einstein gravity, the full non-linear equations. Apart from the entanglement equilibrium interpretation of the (mechanical) first laws of causal diamonds and stretched lightcones, it is natural to interpret them as thermodynamic relationships, namely we may view T δS EE |S = 0 as the infinitesimal variation version of T ∆S rev = Q, where the condition of constant generalized volume (or generalized area) is equivalent to studying reversible thermodynamic processes. More specifically, the UV contribution to δS EE would be identified with change in reversible gravitational entropy ∆S rev , while the IR contribution to entanglement entropy would be identified with the heat Q. Moreover, if entanglement entropy is UV finite, as assumed here, and satisfies the Clausius relation, then whatever the content of the underlying theory is, the entanglement entropy must be proportional to the Bekenstein-Hawking entropy, in the case the theory of spacetime is governed by Einstein gravity [40].
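To give a feeling for the size of the temperature entering the Clausius relation for stretched lightcones, the following minimal sketch restores SI units in T = 1/2πα. It assumes that α is the radius of the hyperboloid, so that the proper acceleration of the spherical Rindler observers is c²/α. The function names and the idea of feeding in a heat Q in joules are illustrative assumptions, not part of the derivation above.

```python
import math

# CODATA 2018 values in SI units.
HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 299792458.0          # speed of light, m / s
K_B = 1.380649e-23       # Boltzmann constant, J / K


def unruh_temperature(alpha_m):
    """T = 1/(2 pi alpha) in natural units, converted to kelvin.

    alpha_m is the hyperboloid radius in metres (assumed to set the
    proper acceleration a = c^2 / alpha).
    """
    acceleration = C ** 2 / alpha_m
    return HBAR * acceleration / (2.0 * math.pi * C * K_B)


def clausius_entropy_change(heat_joule, alpha_m):
    """Reversible entropy change Delta S_rev = Q / T, in J / K."""
    return heat_joule / unruh_temperature(alpha_m)


if __name__ == "__main__":
    for alpha in (1.0, 1.0e-3, 1.0e-6):
        print(alpha, unruh_temperature(alpha))
```

Halving α doubles the temperature, so for a fixed heat flux the reversible entropy change scales linearly with α.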
Unlike the derivation of the non-linear equations using stretched lightcones [5], in the case of working with causal diamonds we had to take the limit that we are working on the conformal Killing horizon. One reason for this may be that the future stretched lightcone is defined as a surface of constant proper acceleration, and the binormal dS ab is constructed from u a (the proper velocity) and n a (a normal vector proportional to the proper acceleration). In this way the stretched lightcone is directly analogous to the stretched horizon of a black hole, which has well-defined thermodynamics. Due to the geometric similarities, it appears that the (past) stretched causal diamond has a similar interpretation.
Another way in which the two physical process derivations are different is that in the case of stretched lightcones, ∆S rev > 0 and ∆S > 0. Therefore, positive (classical) energy flux causes the (reversible) thermodynamic entropy to increase. Consequently, the null energy condition (NEC) is satisfied, and the stretched lightcone behaves as an ordinary thermodynamic system. These same features hold when we restrict ourselves to the past (before t = 0) of the causal diamond. In contrast, the future of the causal diamond, i.e., to the future of t = 0, one has ∆S < 0. From a geometric point of view, the reason for this decrease in thermodynamic entropy is clear: The cross-sectional area of the causal diamond is decreasing as one moves forward in time, reaching zero at the tip of the cone. Moreover, matter that is inside of the future of the diamond is free to leave the system -there is no horizon preventing it from leaving, and matter entering from the outside must be moving faster than the speed of light 10 ; there is even a question as to whether a diamond is thermodynamically stable 11 . In fact, the causal diamond has further non-classical thermodynamic properties: A causal diamond behaves as a system with negative temperature. However, in the context of entanglement equilibrium, a (conformal) matter flux yields a positive change in entanglement entropy δS EE > 0. In a related context these observations were recently discussed in [32], where it is suggested that the "classical" part of entropy is governed by negative temperature, while the quantum corrections present in the entanglement entropy are seemingly ruled by a positive temperature.

A. Future Directions
We conclude by noting ways in which our work may be extended.

Local First Laws
We now have two derivations of the gravitational equations of motion via a thermodynamic process, and an application of the Clausius relation T ∆S rev = Q. Recently [41], it was shown that one may write down a hybrid first law of gravity and thermodynamics connecting matter energy E and work W with the gravitational entropy S evaluated on the stretched future lightcone of any point in an arbitrary spacetime. It would be interesting to see if we can find a similar first law of causal diamonds. In fact, recently, Jacobson has established a first law for a causal diamond in a maximally symmetric space, analogous to the first law of black hole mechanics [32]. In this set-up, the causal diamond is equipped with a cosmological constant, and one discovers that a local gravitational first law of causal diamonds is reminiscent of the Smarr formula for a ball in a maximally symmetric space. Moreover, if one wishes to interpret this first law as a Clausius relation, then the causal diamond, classically, is a thermodynamic system with a negative temperature. It would be interesting to study the thermodynamic behavior of the causal diamond, as well as look for a similar local first law for stretched lightcones, and verify that the stretched lightcone is a thermodynamic system with positive temperature.

Non-Linear Equations of Motion
It is interesting that we were able to derive the full non-linear gravitational equations of motion via a reversible process, while we only found the linearized equations of motion via the entanglement equilibrium condition. This is because we restricted ourselves to first order perturbations of the entanglement entropy and background geometry. Higher order perturbations to the entanglement entropy lead to a modified form of the first law of entanglement entropy, e.g., the second order change in entanglement entropy is no longer proportional to the expectation value of the modular Hamiltonian (5), but rather one must include the relative entropy. Moreover, as pointed out in [28], using higher order terms in the RNC expansion and higher order perturbations to the entanglement entropy could make it possible to derive the fully non-linear equations of an arbitrary theory of gravity. Indeed, these ideas were recently incorporated in the context of holographic entanglement entropy to derive the non-linear contributions to gravitational equations [23,24,42]. Due to the similarity between the holographic and entanglement equilibrium approaches, developments in one are likely to inform the other.
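The role of the relative entropy beyond first order can be checked in a finite-dimensional toy model. The sketch below, which assumes random 4 × 4 density matrices in place of the field-theoretic states of the text, verifies the exact identity S(ρ‖σ) = ∆⟨H mod⟩ − ∆S with H mod = −log σ, so that the first law δS = δ⟨H mod⟩ holds only up to a non-negative second-order remainder. All names and parameter values are illustrative.

```python
import numpy as np


def random_state(dim, rng, mix=0.5):
    """Random full-rank density matrix, mixed with the identity for conditioning."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = a @ a.conj().T
    rho = rho / np.trace(rho).real
    return mix * np.eye(dim) / dim + (1.0 - mix) * rho


def matrix_log(rho):
    """Logarithm of a positive-definite Hermitian matrix via its spectrum."""
    evals, evecs = np.linalg.eigh(rho)
    return (evecs * np.log(evals)) @ evecs.conj().T


def entropy(rho):
    """Von Neumann entropy -tr(rho log rho)."""
    evals = np.linalg.eigvalsh(rho)
    return float(-np.sum(evals * np.log(evals)))


def relative_entropy(rho, sigma):
    """S(rho || sigma) = tr[rho (log rho - log sigma)]."""
    return float(np.trace(rho @ (matrix_log(rho) - matrix_log(sigma))).real)


rng = np.random.default_rng(0)
sigma = random_state(4, rng)      # reference ("vacuum") state
rho_1 = random_state(4, rng)      # direction of the perturbation
h_mod = -matrix_log(sigma)        # modular Hamiltonian of sigma


def compare(eps):
    """Return (Delta S, Delta <H_mod>, S(rho || sigma)) for a perturbation of size eps."""
    rho = (1.0 - eps) * sigma + eps * rho_1
    d_entropy = entropy(rho) - entropy(sigma)
    d_modular = float(np.trace((rho - sigma) @ h_mod).real)
    return d_entropy, d_modular, relative_entropy(rho, sigma)


if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3):
        print(eps, compare(eps))
```

The printed relative entropy shrinks roughly quadratically with the perturbation size, which is why it drops out of the linearized analysis.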
We should also point out that the way we derived the non-linear gravitational equations via a physical process was by modifying ζ a and ξ a to deal with the fact that ζ a and ξ a are both approximate Killing vectors. It would be interesting to see whether these modifications have a microscopic interpretation and could be employed in the context of entanglement equilibrium such that the non-linear equations of motion arise without needing to consider second order perturbations to the entanglement entropy.
Acknowledgments
I would like to thank Victoria Martin, Batoul Banihashemi, and Maulik Parikh for helpful comments on the manuscript, and Ted Jacobson for insightful discussions.
Appendix A: FLCD and Entanglement Equilibrium

First Law of Causal Diamond Mechanics
Here we present a slightly different derivation of the first law of causal diamond mechanics (FLCD) for higher derivative theories of gravity than given in [28]. Let us take the minus sign of (46), when Σ is the co-dimension-1 spacelike ball B. In this picture, the ∆ is not referring to a comparison of S Wald at two different time slices, i.e., not a physical process -all we have done is make use of Stokes' theorem. To make this point clear we drop the ∆.
Following the same steps shown in (III), we have where we have chosen to write the volume element of B as dB a = U a dV . On B(t = 0), Ω = 0, leading to: The final term is where we used (∇ c Ω)| B = κKU c /(D−2), and introduced the induced metric h bc on B. This contributionW is proportional to a part of the generalized volume introduced in [28]: Here P 0 is a theory dependent constant defined by the P abcd tensor in a maximally symmetric solution to the field equations via P abcd M SS = P 0 (g ac g bd − g ad g bc ). It can be verified that in the case of Einstein gravity (A4) is the spatial volume V of the diamond. Our expressionW does not include the P 0 term 12 .
We observe that, like W ,W is also proportional to the physical volume in the case of Einstein gravity. Specifically, in Einstein gravity, P abcd = 1/2(g ac g bd − g ad g bc ), we findW This expression is reminiscent of the Smarr formula for a maximally symmetric ball with a vanishing cosmological constant: (D − 2)A = (D − 1)KV [32]. This suggests thatW is really related to the entropy; indeed, in the body of this report we will find such an interpretation when we study the thermodynamics of causal diamonds.
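The Smarr-like relation (D − 2)A = (D − 1)KV quoted above can be checked directly for a flat ball. The sketch below assumes a ball of radius R in (D − 1)-dimensional Euclidean space, with A the area of its boundary sphere, V its volume, and K = (D − 2)/R the trace of the extrinsic curvature of the boundary. The helper names are illustrative.

```python
import math


def unit_sphere_area(n):
    """Area of the unit n-sphere S^n embedded in R^(n+1)."""
    return 2.0 * math.pi ** ((n + 1) / 2.0) / math.gamma((n + 1) / 2.0)


def ball_data(D, R):
    """Boundary area A, volume V and extrinsic curvature trace K of a flat ball.

    D is the spacetime dimension, so the ball is (D-1)-dimensional and its
    boundary is a (D-2)-sphere of radius R.
    """
    omega = unit_sphere_area(D - 2)
    area = omega * R ** (D - 2)
    volume = omega * R ** (D - 1) / (D - 1)
    curvature = (D - 2) / R
    return area, volume, curvature


if __name__ == "__main__":
    for D in range(3, 8):
        A, V, K = ball_data(D, R=1.7)
        print(D, (D - 2) * A, (D - 1) * K * V)
```

Both columns agree for every D, independently of the radius.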
Moving on, to linear order in the Riemann normal coordinate expansion, a perturbation about flat space leads to [28] where we have separated $P^{abcd} = P^{abcd}_{\text{GR}} + P^{abcd}_{\text{higher}}$. Introducing the matter conformal Killing energy $H^{m}_{\zeta}$, we find Notice then that for all timelike unit vectors one finds that is equivalent to the tensor equation [25]:
$$\delta R_{ad} - 2\partial^{b}\partial^{c}\left(\delta P^{\text{higher}}_{abcd}\right) + (\delta X)\,\eta_{ad} = 8\pi G\,\delta T_{ad}, \qquad \text{(A10)}$$
where we have introduced the spacetime scalar X, an assumption to be explained momentarily. Demanding local conservation of energy leads to
$$\delta\left(R_{ad} - \tfrac{1}{2}\eta_{ad}R + \Lambda\eta_{ad}\right) - 2\partial^{b}\partial^{c}\left(\delta P^{\text{higher}}_{abcd}\right) = 8\pi G\,\delta T_{ad}, \qquad \text{(A11)}$$
which we recognize as the linearized gravitational equations of motion around flat space.
12 We can arrive at the generalized volume (A4) by subtracting $P^{abcd}_{\text{MSS}}$ from $P^{abcd}$ in the expression for the Wald entropy; specifically, replace $P^{abcd}$ with $P^{abcd} - \frac{1}{(D-1)}P^{abcd}_{\text{MSS}}$ in $S_{\text{Wald}}$. Repeating the steps that lead to (A2) will include an additional term which is precisely the extra term found in W that is missing from our expression.
More explicitly, suppose that we are only considering higher curvature theories of gravity. Then, following the arguments of [28]: Meanwhile, which exactly matches what is found in appendix C of [28]. The Einstein contribution can be dealt with following the method described in [25], and as briefly described above.
The condition (A9) can be understood as the Iyer-Wald identity for a theory of gravity for the geometric set-up of a causal diamond: where δC ζ is the linearized constraint that the gravitational field equations hold. Following [28] one finds that the first law of causal diamond mechanics can be understood as the Iyer-Wald identity [33] in the case of a conformal Killing horizon as opposed to the dynamical horizon of a black hole. In this picture the generalized volume can be interpreted as the variation of the gravitational Hamiltonian. The first two terms on the LHS of (A15), moreover, can be combined into a single object, namely, the variation of the Wald entropy keepingW held constant, i.e., leading to As identified in [28], the Wald formalism contains (JKM) ambiguities in how the Noether current and Noether charge are defined. In particular we may add an exact form dY that is linear in the field variations and their derivatives to the Noether current, and Y to the Noether charge. This would modify both the entropy S Wald andW . However, as verified in [28], the combined modification cancel, and one may write whereW =W +W JKM . This shows that the resolution of the JKM ambiguity yields the same on-shell first law, provided the Wald entropy and generalized volume are modified by an exact form dY .

Entanglement Equilibrium
Let us now show how the first law of causal diamond mechanics, an off-shell geometric identity, is related to a condition on entanglement. In an effective field theory the entanglement entropy can be computed using the replica trick [43], where one defines the entropy as where the effective action I eff (n) is evaluated on an orbifold with a conical singularity at the entangling surface with excess angle 2π(n − 1). If a covariant regulator is used to define the theory, the resulting expression for the entanglement entropy is a local integral of diffeomorphism invariant contributions. When the entangling surface is the bifurcation surface of a stationary horizon, the entanglement entropy is simply the Wald entropy. In the case of nonstationary entangling surfaces, the computation can be accomplished using squashed cone techniques [44], leading to extrinsic curvature modifications of the Wald entropy [15], the so-called Jacobson-Myers entropy [34]. As discussed in [28], the extrinsic curvature modifications of the Wald entropy may be identified with the JKM ambiguities mentioned above. Thus, the entanglement entropy is given by the Wald entropy modified by specific JKM terms, i.e., the Jacobson-Myers entropy. This realization allows us to relate the entanglement entropy to our off-shell geometric identity (A18). The discussion below closely follows [25,28]. As briefly described in the introduction, we are performing a simultaneous geometric and quantum state variation of the entanglement entropy in a causal diamond. Therefore, the variation of the entanglement entropy δS EE includes a UV, state-independent contribution and an IR state-dependent contribution The IR contribution describes states of a QFT in a background spacetime, while the UV contribution represents short distance physics, including quantum gravitational degrees of freedom.
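The logic of the replica trick can be illustrated in a finite-dimensional analogue, where no conical geometry is needed. The sketch below assumes a density matrix specified only by a hypothetical eigenvalue spectrum and uses the standard relation S = −∂ n log tr ρ n at n = 1, evaluated by a central finite difference, to recover the von Neumann entropy. It is a toy check of the n → 1 limit, not the effective-action computation described above.

```python
import math


def von_neumann_entropy(probs):
    """S = -sum p log p for the eigenvalues of a density matrix."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def log_trace_rho_n(probs, n):
    """log tr(rho^n), defined for real replica index n."""
    return math.log(sum(p ** n for p in probs))


def replica_entropy(probs, h=1e-5):
    """-d/dn log tr(rho^n) at n = 1, via a central finite difference of step h."""
    forward = log_trace_rho_n(probs, 1.0 + h)
    backward = log_trace_rho_n(probs, 1.0 - h)
    return -(forward - backward) / (2.0 * h)


# Hypothetical spectrum of a 4-level reduced density matrix.
spectrum = [0.5, 0.3, 0.15, 0.05]

if __name__ == "__main__":
    print(von_neumann_entropy(spectrum), replica_entropy(spectrum))
```

The two numbers agree up to the finite-difference error, which is the finite-dimensional counterpart of continuing the replicated partition function to non-integer n.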
We should point out here that we are positing that the Hilbert space of states on B can be factorized into IR and UV contributions, H B = H U V ⊗H IR , i.e., entanglement separability: there is minimal entanglement among degrees of freedom at widely separated energy scales. Upon a UV completion, the entanglement entropy in a spatial region is finite in any state, with leading term proportional to the area of the boundary of the region, and higher order contributions described by the Wald entropy. Therefore, when the geometry is varied, the entanglement entropy in the diamond (which is equivalent to entanglement in B) from the UV degrees of freedom near the boundary ∂B will change by The scale of UV completion, which we take to be below the Planck scale, is such that H IR and H U V contain degrees of freedom with energies above and below . We take the size of the causal diamond to be such that L Planck < < 1/ . The separation between UV and IR degrees of freedom allows us to define the IR vacuum state of the ball B where ρ is the total quantum state of the diamond. Formally we may write ρ IR as a thermal state where H mod is the modular Hamiltonian and Z is the partition function. In Minkowski space, the causal diamond may be conformally transformed to the (planar) Rindler wedge. The Bisognano-Wichmann theorem then allows us to interpret ρ IR as a true thermal state with respect to the Hamiltonian generating time-translation; in the case of a conformal field theory the modular Hamiltonian will take a specific form in terms of the matter Hamiltonian H m ζ (A7) [16] i.e., the Hamiltonian generating flow along the CKV ζ. The entanglement entropy due to IR degrees of freedom S IR = −trρ IR log ρ IR will satisfy the first law of entanglement entropy [19,20] δS IR = δ H mod . (A25)
We shall make the further conjecture that the variation of the modular Hamiltonian carries an additional term δX that is a spacetime scalar such that Such a conjecture was made in [25]. There one assumes, to leading order, that δ H mod ∝ (δ T 00 +δX), which has been shown to be a correct assumption [35,38], though δX may depend on . Adding this to our total variation of δS EE , we have a modified first law of EE We may now postulate the equilibrium condition: A small diamond is in equilibrium if the quantum fields are in a vacuum state and the curvature is that of a MSS, e.g., Minkowski space. Moreover, motivated by the first law of causal diamond mechanics, we require that B has the sameW as in vacuum. With this, we substitute (A27) into (A18), using (A24), leading to which is valid for minimally coupled, conformally invariant matter fields. When the variation δS EE vanishes, we recover (A11). We therefore arrive at an equivalence between the following statements: (i) the entanglement entropy S EE is maximal in vacuum for all (small) balls in all frames, and (ii) the linearized higher derivative equations hold everywhere. That is, the entanglement equilibrium condition holds if and only if the linearized higher derivative equations of motion are satisfied. The verification of this equivalence can be found in the appendix of [28], which we will not repeat here but was described earlier.

Appendix B: Failure of Killing's Identity
In our derivation of the gravitational equations of motion via the thermodynamics of causal diamonds, we made use of the conformal Killing equation and the conformal Killing identity (B2) An arbitrary spacetime, however, does not admit a global conformal Killing vector, therefore ζ a can be understood as an approximate conformal Killing vector. More precisely, ζ a will fail to be a conformal Killing vector to some order in a Riemann normal coordinate expansion of the arbitrary spacetime (14). The order at which these quantities fail depends on the order of the vector itself. The conformal Killing vector ζ a we used with Ω = −2t/ 2 , was specific to D-dimensional Minkowski space, and is of order where the O(0) contribution is a constant. From this one finds that in an arbitrary spacetime ζ a will fail the conformal Killing equation to order O(x) + O(x 3 ) and the Killing identity to order O(0) + O(x 2 ). Note that the term we keep in deriving the equations of motion, namely the integrand of 13 is, O(0) + O(x 2 ). However, since dΣ a = N a dAdτ , with N a ∝ x i /r, the O(0) contributions vanish due to the fact we are integrating over a spherical subregion for which ∂B x i dA = 0. Therefore, we need only concern ourselves with the O(x 2 ) contributions coming from the failure of the conformal Killing identity. We realize, in fact, that the only contribution of the conformal Killing identity we made use of was the term proportional to the Riemann tensor, R ebcd ζ e -we neglected all other contributions. This means that we effectively treated ζ a as an approximate Killing vector rather than an approximate conformal Killing vector. We therefore find ourselves in a similar situation as the authors of [5]: We must remove the higher order contributions coming from the failure of Killing's identity. Specifically, in the integrand (B4), the term P abcd ∇ b ∇ c ζ d should be replaced with with from which we see that f bdc = −f bcd . 
Here $f_{bcd}$ quantifies the failure of Killing's identity. Our task is therefore to find a way to eliminate $\int_\Sigma d\Sigma_a\, P^{abcd} f_{bcd}$, at least to the order at which we keep the desired contribution $\int_\Sigma d\Sigma_a\, P^{abcd} R_{ebcd}\zeta^e$. Specifically, the integrand we wish to keep is of order $O(0) + O(x^2)$. The $O(0)$ contribution, as mentioned above, vanishes because we are integrating over a spherical subregion. Therefore, the integrand we are interested in keeping is $O(x^2)$, and we must remove the $O(x^2)$ contributions of the undesired term.
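The order counting quoted above can be traced with a short schematic sketch. It assumes the standard leading-order Riemann normal coordinate expansions of the metric and connection about the point $p$, which are not written out in this appendix, and tracks only the power of $x$ carried by each term; higher-order terms in the RNC expansion are not followed here.

```latex
% Schematic RNC expansions about p (standard leading-order forms, assumed here)
g_{ab}(x) = \eta_{ab} + O(x^{2}), \qquad
\Gamma^{a}{}_{bc}(x) = O(x), \qquad
\zeta^{a} = O(0) + O(x^{2}).

% The partial-derivative part satisfies the flat-space conformal Killing
% equation, so the failure is controlled by the connection term:
\nabla_{a}\zeta_{b} = \partial_{a}\zeta_{b} - \Gamma^{c}{}_{ab}\,\zeta_{c},
\qquad
\Gamma\,\zeta \sim O(x)\left[\,O(0) + O(x^{2})\,\right] = O(x) + O(x^{3}).

% One further derivative lowers each power of x by one:
\nabla_{b}\nabla_{c}\zeta_{d} - R_{ebcd}\,\zeta^{e}
\sim \partial\!\left(\Gamma\,\zeta\right) \sim O(0) + O(x^{2}).
```

Under these assumptions the conformal Killing equation fails at $O(x) + O(x^3)$ and the Killing identity at $O(0) + O(x^2)$, matching the orders stated in the text.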
To study this problem we introduce a notation in which $f^{(0)}_{bcd}$ denotes the $O(0)$ contribution to $f_{bcd}$, $f^{(1)}_{bcd}$ the $O(x)$ contribution, and so forth. We will use this notation to decompose each object appearing in the integrand (B7), i.e., $N_a = N^{(0)}_a$ and $P^{abcd} = P^{abcd}_{(0)} + P^{abcd}_{(1)} + \ldots$. In order to remove the contribution (B7) to the desired order, we follow the method developed in [5] and modify $\zeta^a$ and $N_a$ by adding undetermined higher-order contributions.

The algorithm for removing the terms can be described as follows. The integrand of (B7) is a collection of monomials. Because we are integrating over a spherical subregion, many of these monomial contributions vanish, e.g., when the integrand goes like $t x^i/r$. Some terms remain, however, and the only way to remove them is to add higher-order modifications to $\zeta^a$, where the Greek indices $\mu, \nu$ run over all spacetime indices. We can likewise modify $N_a$. These modifications to $\zeta^a$ generate additional contributions to $f_{bcd}$ with the same monomial structure as before. We then choose the undetermined coefficients $D_{a\mu\nu\rho}$, etc., so as to cancel these terms. In essence, we add counterterms to $\zeta^a$ to remove (B7) to the desired order. One problem which may arise is whether there are enough undetermined coefficients to cancel all of the monomials that appear.

Putting all of this together, the lowest-order contribution to the integrand of the offending term (B7) vanishes, as already discussed, via parity arguments. The next-order terms are the $O(x)$ contribution (B12) and the $O(x^2)$ contribution (B13). As we will see, we can in fact drop the terms proportional to $N^{(1)}_a$, since retaining them creates more work than it saves. To summarize, in order to claim that we have succeeded in deriving the nonlinear equations of motion for higher-derivative gravity, we must show how to eliminate the two contributions (B12) and (B13).
We do this by modifying $\zeta^a$ to include higher-order contributions, and counting the number of undetermined coefficients to see whether we have enough of them to eliminate (B12) and (B13). At first glance this does seem possible, by a naive comparison of the number of monomials appearing in the integrand with the number of undetermined coefficients available.
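The parity argument used above to discard monomials such as $x^i$ or $t x^i/r$ can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes $D = 4$, so that $\partial B$ is an ordinary 2-sphere, takes unit radius, and uses a simple midpoint rule. The function name `sphere_integral` and the grid sizes are illustrative choices.

```python
import math


def sphere_integral(f, n_theta=200, n_phi=400):
    """Midpoint-rule integral of f(x, y, z) over the unit 2-sphere.

    Illustrative helper (assumes D = 4, so the boundary of the ball is S^2).
    """
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            x = sin_t * math.cos(phi)
            y = sin_t * math.sin(phi)
            z = cos_t
            # Area element on the unit sphere: sin(theta) dtheta dphi
            total += f(x, y, z) * sin_t * d_theta * d_phi
    return total


# Monomials odd in any single coordinate integrate to zero ...
odd_term = sphere_integral(lambda x, y, z: x)
mixed_term = sphere_integral(lambda x, y, z: x * y)
# ... while (x^i)^2 survives; analytically it equals 4*pi/3 on the unit sphere.
even_term = sphere_integral(lambda x, y, z: x * x)
```

Only the even monomial survives, which is why terms of the type $(x^i)^2/r$ are the ones that must be cancelled by modifying $\zeta^a$.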

Removing O(x) Contributions
First we write $f_{bcd}$ in a more useful form. We can drop the whole second term because it is symmetric in the indices $cd$ and is contracted with $P^{abcd}$. We then consider modifying $\zeta^a$ order by order, where the $\zeta^{(0)}_a$ contribution is constant. A similar expansion holds for $N_a$.

Let us now classify $f^{(0)}_{bcd}$. Clearly we get a contribution from $\partial_b\partial_c\zeta_d$ and from the $\nabla\Omega$ terms. Consider first the part of the $O(x)$ contribution that would be present in (B12) even without modifying $\zeta^a$ or $N_a$. It is straightforward to work out that the only non-zero contribution is the one in which $f^{(0)}_{tjk}$ is contracted with $P^{itjk}_{(1)}$. But this term vanishes because $f^{(0)}_{tjk}$ is symmetric in the indices $jk$, while $P^{itjk}_{(1)}$ is antisymmetric. Thus the entire contribution vanishes. In fact, any term of the form $N^{(0)}_a P^{abcd}_{(n)} f^{(0)}_{bcd}$ vanishes, since we never specified the form of $P^{abcd}$ above. We will therefore be able to drop some terms appearing in the $O(x^2)$ contribution (B13) as well.
There is another term in (B12) which appears because $\zeta^a$ is only an approximate (conformal) Killing vector, namely the one proportional to $f^{(1)}_{bcd}$. Without modifying $\zeta^a$, this term receives a single contribution. To leading order, we have $(\nabla\Omega)\, g \sim (\nabla\Omega)_\mu(p)\, x^\mu\, \eta$, where $\eta$ is the Minkowski metric. Calling $(\nabla_d\Omega)_\mu(p) \equiv \Omega_{d\mu}(p)$, and noting that $\zeta^{(0)e} = \delta^e_t$, we obtain, without modifying $\zeta^a$, an expression for $f^{(1)}_{bcd}$ in terms of $\Omega_{d\mu}(p)$ and $(R_{tbcd})_\mu$, where it is understood that $(R_{tbcd})_\mu$ is evaluated at the point $p$.

We now determine which components of $f^{(1)}_{bcd}$ must be cancelled by working out each of them in turn. Using the symmetries of $P^{abcd}$ and $f^{(1)}_{bcd}$, together with spherical symmetry and $N^{(0)}_i = x^i/r$, we see that the only non-vanishing contributions arise when $\mu = m$ is a spatial index; more precisely, only when $i = m$. The only type of monomial that appears is therefore $(x^i)^2/r$, of which there are $(D-1)$ in a $D$-dimensional spacetime. This shows that we must modify $\zeta^a$ so as to eliminate such contributions. Consider, then, the modification $\zeta_d \to \zeta_d + C_{\mu\nu\rho d}\, x^\mu x^\nu x^\rho$, where $C_{\mu\nu\rho d}$ is a collection of $D^4$ completely undetermined coefficients. It is easy to see that this contributes to $f^{(1)}_{bcd}$ through a single term. Putting this into the integrand (B30) and using spherical symmetry, we see that there are more than enough $C$ coefficients to eliminate the undesired terms.

The only other contribution in (B12) is the one which arises from the $N^{(1)}_a$ modification. Clearly, this term is unnecessary, and therefore we simply do not modify $N_a$ at this level. This takes care of the (B12) term: by modifying $\zeta^a$ at $O(x^3)$ as shown above, we can remove the undesired (B12). Let us move on to the $O(x^2)$ contribution, (B13).

Removing O(x 2 ) Contributions
We first point out some simplifications we can make to (B13). Since terms of the form $N^{(0)}_a P^{abcd}_{(n)} f^{(0)}_{bcd}$ all vanish, we can neglect all such terms. Likewise, we can drop any term proportional to $N^{(1)}_a$. A priori we have no reason to drop the $N_a$ modification; however, as we will see, we may drop it simply because we have enough coefficients to eliminate all undesired terms, leaving us with two terms. Note that the remaining terms include contributions both from the failure of $\zeta^a$ to be a Killing vector and from our modification of $\zeta^a$. This means we bring in a large number of $C$ coefficients, potentially all $D^4$ of them. However, $(D-1)$ of these coefficients have potentially already been used, while many others cannot be used because we are integrating over a codimension-2 sphere. Thus, while a handful of remaining $C$ coefficients can be used to eliminate the $O(x^2)$ integrand, we cannot assume that every coefficient is available; we must instead modify $\zeta^a$ by adding a further term, which has $D^5$ undetermined coefficients. Therefore, by a naive counting argument, we will have more than enough $D$ and remaining $C$ coefficients to eliminate all undesired contributions at the $O(x^2)$ level.

Following a strategy similar to the one used to remove the $O(x)$ contributions, and using [5] as a guide, several lines of algebra yield two expressions in which the form of $P^{abcd}$ fixes what $\mu, \nu$ have to be: either $\mu = 0$, $\nu = j = i$ or $\mu = j = i$, $\nu = 0$. All other contributions vanish upon integration.
We would now add together (B42) and (B43) in the integrand (B37). We see that we have enough $D$ coefficients to cancel these terms without introducing $N^{(2)}_a$. This can be explicitly checked in the case of $f(R)$ gravity in $2+1$ dimensions, the most restrictive example. Since we have more than enough coefficients to account for the above monomial contributions, we need not modify $N_a$ at all, and have therefore eliminated (B37). This completes the derivation of the equations of motion.
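The naive counting invoked in this appendix can be made concrete with a small sketch. The naive counts $D^4$ and $D^5$ and the $(D-1)$ surviving monomials at $O(x)$ are taken from the text. The symmetric count below is an additional observation not made in the original argument: a coefficient contracted with a product of coordinates only contributes through its part symmetric in those indices. All function names are hypothetical.

```python
from math import comb


def naive_count(dim: int, n_indices: int) -> int:
    """Naive number of components of a coefficient with n_indices spacetime indices."""
    return dim ** n_indices


def symmetric_count(dim: int, degree: int) -> int:
    """Independent coefficients of a vector-valued homogeneous polynomial.

    One free vector index times the number of degree-`degree` monomials in
    `dim` coordinates (assumption: only the symmetric part contributes).
    """
    return dim * comb(dim + degree - 1, degree)


def surviving_monomials_order_x(dim: int) -> int:
    """Number of (x^i)^2 / r monomials quoted in the text at O(x)."""
    return dim - 1


# The 2+1 dimensional case singled out in the text as the most restrictive.
D = 3
cubic_naive, cubic_sym = naive_count(D, 4), symmetric_count(D, 3)      # C coefficients
quartic_naive, quartic_sym = naive_count(D, 5), symmetric_count(D, 4)  # D coefficients
to_cancel = surviving_monomials_order_x(D)
```

Even with the more conservative symmetric count, the number of available cubic coefficients in $2+1$ dimensions exceeds the $(D-1)$ monomials to be cancelled at $O(x)$, consistent with the naive counting in the text. This sketch does not check which coefficients survive the angular integration.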

Appendix C: Iyer-Wald Identity For Stretched Lightcones
Here, after reviewing the basic set-up of the Iyer-Wald formalism [33], we consider the Iyer-Wald identity for the geometry of future stretched lightcones. We will closely follow the arguments presented in [28] due to the geometric similarity between the stretched lightcone and causal diamond.

Iyer-Wald Formalism
Let $L[\phi]$ be the local spacetime $D$-form Lagrangian of a general diffeomorphism-invariant theory, where $\phi$ represents a collection of dynamical fields, e.g., the metric and matter fields. Varying the Lagrangian yields $\delta L = E\,\delta\phi + d\theta(\phi,\delta\phi)$, where $E$ denotes the equations of motion for all of the dynamical fields and $\theta$ is the symplectic potential $(D-1)$-form. The antisymmetric variation of $\theta$ leads to the symplectic current, a $(D-1)$-form, $\omega(\phi,\delta_1\phi,\delta_2\phi) = \delta_1\theta(\phi,\delta_2\phi) - \delta_2\theta(\phi,\delta_1\phi)$ (C2), whose integral over a Cauchy surface $B$ gives the symplectic form for the phase space description of the theory. Given an arbitrary vector field $\xi^a$, evaluating the symplectic form on the Lie derivative $\mathcal{L}_\xi\phi$ yields the variation of the Hamiltonian $H_\xi$ which generates the flow of $\xi^a$: $\delta H_\xi = \int_B \omega(\phi,\delta\phi,\mathcal{L}_\xi\phi)$ (C3).

Now take $B$ to be a ball-shaped region, and let $\xi^a$ be a future-pointing, timelike vector that vanishes on the boundary $\partial B$. When the background geometry satisfies the field equations $E = 0$ and $\xi^a$ vanishes on $\partial B$, we arrive at Wald's variational identity $\int_B \omega(\phi,\delta\phi,\mathcal{L}_\xi\phi) = \int_B \delta J_\xi$ (C4), where we have introduced the Noether current $J_\xi = \theta(\phi,\mathcal{L}_\xi\phi) - i_\xi L$, with $i_\xi$ representing the contraction of the vector $\xi^a$ on the first index of the differential form. Recall that the Noether current $J_\xi$ can always be written as [45] $J_\xi = dQ_\xi + C_\xi$ (C6), where $Q_\xi$ is the Noether charge $(D-2)$-form and $C_\xi$ are the constraint field equations associated with the diffeomorphism gauge symmetry. When we assume that the matter equations are imposed, one finds $C_\xi = -2\xi^a E_a{}^{b}\epsilon_b$, where $E^{ab}$ is the variation of the Lagrangian density with respect to the metric and $\epsilon_a$ is the volume form on $B$.
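How these ingredients combine can be sketched in one line. The sketch assumes the standard Iyer-Wald forms of the variation of the Hamiltonian, of Wald's variational identity, and of the decomposition of the Noether current into an exact piece plus constraints. It then applies Stokes' theorem on $B$; the orientation of $\partial B$ is a convention assumed here.

```latex
\delta H_{\xi}
  = \int_{B} \omega(\phi, \delta\phi, \mathcal{L}_{\xi}\phi)
  = \int_{B} \delta J_{\xi}
  = \int_{B} \left( d\,\delta Q_{\xi} + \delta C_{\xi} \right)
  = \int_{\partial B} \delta Q_{\xi} + \int_{B} \delta C_{\xi}
\quad\Longrightarrow\quad
-\int_{\partial B} \delta Q_{\xi} + \delta H_{\xi} = \int_{B} \delta C_{\xi} .
```

When the linearized constraints are satisfied, the right-hand side vanishes and the variation of the Hamiltonian reduces to a pure boundary term.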
Combining (C3), (C4), and (C6) leads to the Iyer-Wald identity: $-\int_{\partial B}\delta Q_\xi + \delta H_\xi = \int_B \delta C_\xi$ (C8). When the linearized constraints hold, $\delta C_\xi = 0$, the variation of the Hamiltonian is a boundary integral of $\delta Q_\xi$. We will show that this off-shell identity leads to the first law of stretched lightcones. Observe that, unlike the case of black hole thermodynamics, $\delta H_\xi$ here is nonvanishing; this is because $\xi^a$ is not a true Killing vector.

Let us proceed and evaluate the Iyer-Wald identity (C8) for an arbitrary theory of gravity in the geometric set-up of the stretched lightcone described above (II). Here we make the simplifying assumption that the matter fields are minimally coupled, such that the Lagrangian splits into metric and matter contributions, $L = L_g + L_m$, with $L_g$ an arbitrary diffeomorphism-invariant function of the metric, the Riemann tensor, and the covariant derivatives of the Riemann tensor $^{14}$. This separation allows us to also decompose the symplectic potential and the Hamiltonian as $\theta = \theta_g + \theta_m$ and $\delta H_\xi = \delta H^g_\xi + \delta H^m_\xi$. Therefore, the Iyer-Wald identity (C8) becomes $-\int_{\partial B}\delta Q_\xi + \delta H^g_\xi + \delta H^m_\xi = \int_B \delta C_\xi$. We can relate the integrated Noether charge to the Wald entropy via [31] $\int_{\partial B} Q_\xi = -4G\, S_{\text{Wald}}$, where $G$ is Newton's gravitational constant and $S_{\text{Wald}}$ is the Wald entropy functional, with $dS_{ab} = \frac{1}{2}(n_a u_b - n_b u_a)\,dA$ $^{15}$. Following [33], this relationship also holds for first-order perturbations: $\int_{\partial B}\delta Q_\xi = -4G\,\delta S_{\text{Wald}}$.
(C14)

14 In our discussion above we did not consider theories of gravity which also depend on derivatives of the Riemann tensor; however, it is easy to modify our arguments to include such theories, in the case where one perturbs around maximally symmetric spacetimes.

15 A brief comment on notation: for comparison to [28], we note that there the authors choose the convention where $1/4G \to 2\pi$, and write the Wald entropy in terms of $\mu$, the volume form on $\partial B$, such that $\epsilon_{ab} = -n_{ab}\wedge\mu$.
Our next task is to evaluate the variation of the gravitational Hamiltonian δH g ξ . As we detail below, this leads us to the derivation of the generalized area of stretched lightcones, analogous to the generalized volume of causal diamonds constructed in [28].

Generalized Area of Stretched Lightcones
Here we closely follow the arguments presented in [28] to work out the variation of the gravitational Hamiltonian for an arbitrary theory of gravity in the geometric set-up of the stretched lightcone. In the calculation that follows we consider perturbations about a maximally symmetric spacetime (MSS) background, specifically Minkowski space. Along the way we will mention how some of these assumptions might be relaxed.
For a Lagrangian that depends on the Riemann tensor and its covariant derivatives, the symplectic potential $\theta_g$ is given by (C15), where we use the notation of [28] such that $P^{bcd} = \epsilon_a P^{abcd}$, and $S^{ab}$ and $T_i^{abcd\ldots}$ are locally constructed from the metric, its curvature, and covariant derivatives of the curvature. Due to the antisymmetry of $P^{bcd}$ in $c$ and $d$, the symplectic current (C2) takes the form $\omega_g = 2\,\delta_1 P^{bcd}\,\nabla_d\,\delta_2 g_{bc} - 2P^{bcd}\,\delta_1\Gamma^{e}{}_{db}\,\delta_2 g_{ec} + \delta_1 S^{ab}\,\delta_2 g_{ab} + \ldots$ (C16). Let us now employ the geometric set-up discussed above. We use the fact that we are perturbing around a maximally symmetric background. This allows us to write $R_{abcd} = \frac{R}{D(D-1)}\left(g_{ac}g_{bd} - g_{ad}g_{bc}\right)$, with a constant Ricci scalar $R$, such that $\nabla_e R_{abcd} = 0$ and $\mathcal{L}_\xi R_{abcd}|_{t=0} = 0$.
Moreover, since the tensors $P^{abcd}$, $S^{ab}$ and $T_i^{abcd\ldots}$ are all constructed from the metric and curvature, they also have vanishing Lie derivatives along $\xi^a$ when evaluated on $B$.
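As a quick consistency check of the maximally symmetric form of the Riemann tensor used here, one can contract it with the inverse metric and confirm that it returns the expected Ricci tensor and Ricci scalar. This is a standard computation that uses only $g^{ac}g_{ac} = D$:

```latex
R_{bd} = g^{ac} R_{abcd}
       = \frac{R}{D(D-1)} \left( D\, g_{bd} - g_{bd} \right)
       = \frac{R}{D}\, g_{bd},
\qquad
g^{bd} R_{bd} = R .
```

Since $R$ is constant and the metric is covariantly constant, $\nabla_e R_{abcd} = 0$ follows immediately from this form.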
If we replace δ 2 g ab in (C16) with L ξ g ab , and make use of (33) Following similar computations performed in [28] we find to leading order in the RNC Showing this takes quite a few lines of algebra, however, when all is said and done, we can take (33) of [28] and simply replace g bc withg bc . Thus, we are varying the object B dB a α r 2 P abcd u dgbc .
However, after converting back to the conventions used in the body of this paper, we find that the result is the entropy due to the natural expansion of the hyperboloid S (62).
In summary, we have arrived at an off-shell variational identity. Imposing the linearized constraint $\delta C_\xi = 0$, this simply becomes the first law of stretched future lightcones for higher-derivative gravity.