Causality in Curved Spacetimes: The Speed of Light & Gravity

Within the low-energy effective field theories of QED and gravity, the low-energy speed of light or that of gravitational waves can typically be mildly superluminal in curved spacetimes. Related to this, small scattering time advances relative to the curved background can emerge from known effective field theory coefficients for photons or gravitons. We clarify why these results are not in contradiction with causality, analyticity or Lorentz invariance, and highlight various subtleties that arise when dealing with superluminalities and time advances in the gravitational context. Consistent low energy effective theories are shown to self-protect by ensuring that any time advance and superluminality calculated within the regime of validity of the effective theory is necessarily unresolvable, and cannot be argued to lead to a macroscopically larger lightcone. Such considerations are particularly relevant for putting constraints on cosmological and gravitational effective field theories and we provide explicit criteria to be satisfied so as to ensure causality.


Introduction
Causality and unitarity play a crucial role in fixing the structure of a Lorentz invariant quantum field theory, as was recognized early on. This is most immediately apparent in the dispersion relation methods [1] utilized, for example, in the spectral representation of Källén and Lehmann [2,3], where the Fourier space Feynman propagator is recognized to be an analytic function of complex momentum squared up to a pole and a right-hand branch cut. These dispersion relation methods evolved into the S-matrix analyticity program of the 1960s which, albeit in a different form, plays a crucial role today in amplitude methods. More recently these ideas have been used to put constraints on low energy effective theories (EFTs), either through positivity bounds [4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21][22], by demanding that scattering time delays are positive (asymptotic (sub)luminality) [23][24][25][26][27], or through related methods.
One unfortunate feature of these methods is that there is no clear way to extend them to curved spacetimes, except perhaps maximally symmetric cases, see [22,28,29] for some attempts to deal with this. The powerful analyticity properties combined with crossing symmetry are spoiled when the background spacetime is time-dependent. In the absence of clear causality constraints, one reasonable guess is to demand that relativistic causality should require the propagation speed of all degrees of freedom to be (sub)luminal. For a nongravitational theory this is largely a reasonable criterion, and indeed it is known that certain constraints from positivity bounds appear to be connected with (sub)luminality of fluctuations about different backgrounds [6].
For a gravitational theory these questions are altogether more subtle. The ability to perform field redefinitions that change the off-shell metric means that it is no longer clear which lightcone to use as a reference. Due to the ambiguity of field redefinitions, causality in a gravitational theory is usually phrased for on-shell invariant quantities, such as the requirement that the tree level scattering matrix is analytic in terms of its Mandelstam variables. The relation between causality and analyticity is highlighted for example in Refs. [30][31][32].
But such considerations are typically of little use in understanding causality in spacetimes with significant curvature effects such as FLRW and Schwarzschild. The peculiar subtleties associated with causality in curved spacetime are well known in the context of the low-energy effective theory for QED below the electron mass [33,34], where the low-energy phase and group velocity of light is known to be superluminal for certain polarization states on some curved geometries. The relation with analyticity and causality in this example has been discussed extensively in the literature (see for example refs. [35][36][37][38][39][40][41][42]).
From the cosmological perspective, it is known that various cosmological models such as inflation and dark energy theories can easily exhibit different speeds of propagation in different sectors. It is also known that the speed of gravitational waves (GWs) in a given low energy EFT can be different than the luminal speed inferred from the metric from which the theory is constructed. Once again this leads to the obvious question of how causality should be imposed in the cosmological context. In the literature, it has often been set, by fiat, that the speed of GWs and of all other species should be (sub)luminal in an arbitrary background with respect to the metric out of which the theory is constructed. Doing this often imposes constraints on the signs of coefficients in the effective action, but as already mentioned such a criterion is not invariant under field redefinitions.
This criterion can often be correct, specifically when the EFT operator that creates the effect can lead to a macroscopic increase in the light cone. However, in the canonical case of gravitational effective theories it is not necessarily correct, and indeed demanding $c_s \le 1$ for all species can sit in contradiction with those same requirements of causality and analyticity. In two recent papers [43,44] it was pointed out that integrating out matter fields typically leads to curvature operators in the low-energy EFT of gravity that can lead to a (small!) superluminal low energy speed for GWs. For the low-energy EFT of gravity, constraining the signs of the low-energy operators so as to entirely forbid a superluminal low-energy speed, no matter how small, would lead to a criterion in contradiction with known partial high energy completions and, more generally, with expectations based on analyticity. Instead, we highlight that a small amount of low-energy superluminal speed does not directly imply that the support of the retarded propagator lies outside the usual lightcone.
The key point is that superluminality of the low energy speed is only in conflict with relativistic causality if it can be integrated over time to make a large macroscopic effect, i.e. if the light cone of causal influence is measurably larger than the background geometry lightcone. In solving for the retarded propagator perturbatively in the EFT expansion around the background one, this only occurs when there are secular effects that need to be resummed, leading to significant differences in the structure of the propagator at late times (or large distances). We will demonstrate that in those situations where a small superluminal low energy speed arises in a given gravitational effective theory, the requirement that the EFT is under control automatically precludes any secular effects, preserving causality by ensuring that at any finite order in the convergent EFT expansion, the retarded propagator has the same causal structure as the unperturbed one. That this is the case hinges on the smallness of the superluminal speed correction that arises within gravitational effective theories. We may regard this as a self-protection mechanism against causality violation for consistent low energy EFTs.
In the case of asymptotically flat geometries where an S-matrix may be defined this discussion may be further sharpened. We may precisely define a relativistic generalization of the Eisenbud-Wigner scattering time-delay [45][46][47][48] $\Delta T$. In particular, for spherically symmetric spacetimes we may define a time delay $\Delta T_\ell$ for each partial wave via the derivative of the scattering phase shift at fixed $\ell$, $\Delta T_\ell = 2\,\partial\delta_\ell(\omega)/\partial\omega$ (1.1), in terms of the incident particle's energy $\omega$. We may also consider the inequivalent but related time delay for fixed impact parameter b. In General Relativity (GR) these time-delays are the well-known 'Shapiro' or 'gravitational' time-delays [49]. There is a long history of using positivity or boundedness 1 of this time delay as a means of imposing causality (see for example [50] for a review), going back to Eisenbud and Wigner and improved by Smith [47]. More recently, in the context of relativistic field theories, the eikonal approximation was used in [23] to determine this scattering time-delay in various low energy effective theories, and subsequently in for example [24-27, 51, 52], where it was generally argued along the same lines outlined by Eisenbud and Wigner [45,46] that an overall negative time-delay is a signal of causality violation and inconsistency of the low energy effective theory. This criterion is sometimes dubbed absence of 'asymptotic superluminality' [53]. The perspective of some literature is that any time delay with a net negative sign, $\Delta T < 0$, i.e. a net time advance, leads to a causality violation regardless of its magnitude, since it implies propagation faster than in the asymptotic spacetime [51].
In the case of weakly coupled UV completions, it was argued in Ref. [23] that apparent time-advances in the low energy effective theory could be resolved by an infinite tower of higher spins as in the case of string theory. While useful for understanding the nature of possible UV completions, from the perspective of a low energy physicist this tower of higher spins would just show up as an infinite number of local operators. Clearly including more and more irrelevant operators in the low-energy EFT cannot change the statement that the phase shift (as computed using the leading order operators in the EFT) is of a particular sign at sufficiently low-energy. Furthermore since the low energy effective theory is designed to describe large distance physics, it should reliably compute the large distance behaviour of the retarded propagator. Any time advance in addition to the Shapiro contribution would signal that the retarded propagator has support outside the lightcone set by the background geometry. Whether or not the UV completion admits an infinite tower of spins cannot by itself resolve this tension with causality at large distances which is where the real issue lies.
Hence the import of the observation that the UV theory resolves the time advance is, for the low energy physicist, merely the generic statement that the low energy effective theory has a cutoff (set by the lowest mass in the tower for a weakly coupled UV completion), and only a resolvable time advance calculable within the regime of validity of the effective theory would signal a true causality violation. This criterion is of course true regardless of whether or not we consider a weakly coupled UV completion. This implies that, as far as the low-energy EFT is concerned, the resolution of the apparent tension between a negative phase shift and causality cannot lie in the existence of a higher-spin tower per se. Rather, it must lie in the actual order of magnitude of the phase shift/time advance itself and in the existence of a cutoff, irrespective of what precisely happens at that cutoff (whether it represents the onset of a tower of higher spins or instead the mass of other heavy particles whose loops are relevant). Moreover, the causality criterion proposed in [23] is only that the total time delay $\Delta T$ is positive within the regime of validity of the effective theory, $\Delta T > 0$. This is the essential point of the 'asymptotic superluminality' condition [53]: local perturbations and corrections to the effective theory may speed up local propagation relative to the background geometry, but this is viewed as acceptable as long as they remain slower than the asymptotic geometry.
Our perspective is that this is incorrect, or at least incomplete, for two reasons: (a) strict positivity of the total time delay fails to account for whether the delay is resolvable, and (b) more generally what is required is (resolvable) positivity of the EFT corrections to the time delay relative to the background 2. This may be seen by a more careful consideration of the well known example of QED in curved spacetime. It is well known that loop corrections from charged particles induce low energy superluminal propagation for photons in a curved background, e.g. for a Schwarzschild background, as first noted by Drummond and Hathrell [33]. The total time delay is then $\Delta T = \Delta T_g + \Delta T_{\rm EFT}$, where $\Delta T_g$ is the time-delay induced by the curved background spacetime, i.e. the Shapiro time-delay, and $\Delta T_{\rm EFT}$ is in this case the Drummond-Hathrell correction from loop effects.
According to [23] this is consistent with causality because, in order for the negative $\Delta T_{\rm EFT}$ to overpower the positive (in D > 4) $\Delta T_g$, it is necessary that the impact parameter is smaller than the inverse mass of the charged particle integrated out, which means the low energy effective theory can no longer be trusted. However, this argument does not resolve the causality problem that appears to arise at larger impact parameter, where $\Delta T_{\rm EFT}$ is negative and its calculation can be trusted within the regime of validity of the EFT, but the overall $\Delta T$ is positive. Any negative $\Delta T_{\rm EFT}$ implies that the photons are travelling superluminally relative to the background metric (accounted for by the Shapiro contribution), and this sits in clear contradiction with the fact that in the known UV completion, namely QED in curved spacetime, causality remains intact with the causal lightcone defined by the background geometry and not the asymptotic Minkowski geometry. Causality in this case should then be a statement about $\Delta T_{\rm EFT}$ itself, and not about the full $\Delta T$. Yet demanding $\Delta T_{\rm EFT} > 0$ sits in contradiction with the well-known result of Drummond and Hathrell.
Hence the 'asymptotic' causality condition of [23] fails to address the resolution of the apparent causality violation in the low-energy Drummond-Hathrell EFT. As demonstrated later, and also cleanly argued in [54] for calculations in the shockwave (eikonal) limit, the real resolution is that the negative Drummond-Hathrell contribution is unresolvable within the effective theory: within the regime of validity of the EFT it always remains true that the advance is smaller than the 'geometric optics' resolution scale 3, $|\Delta T_{\rm EFT}| \lesssim \omega^{-1}$, where $\omega$ is the asymptotic energy of the scattering particle. Equivalently, this is the statement that the EFT contribution to the scattering phase shift is bounded by unity, $|\Delta\delta_{\rm EFT}| \lesssim 1$. It is therefore clear that by itself the sign of the time-delay correction cannot be sufficient to determine whether or not acausality will follow; it is crucial to consider also its magnitude.
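The resolvability criterion described above can be illustrated with a minimal numerical sketch. It assumes a toy EFT phase shift that is linear in the energy, with a hypothetical constant `KAPPA` setting its size; the function names and the sharp threshold of 1 are illustrative choices, since the bounds in the text hold only up to order-one factors.

```python
def time_delay(phase_shift, omega, h=1e-6):
    """Central-difference estimate of Delta T = 2 d(delta)/d(omega)."""
    return 2.0 * (phase_shift(omega + h) - phase_shift(omega - h)) / (2.0 * h)


def is_resolvable(delta_t, omega):
    """Treat a delay or advance as resolvable only if omega * |Delta T| > 1.

    The threshold 1 is an order-of-magnitude convention (assumption).
    """
    return abs(delta_t) * omega > 1.0


KAPPA = 1.0  # hypothetical constant, in the same units as 1/omega


def toy_eft_phase_shift(omega):
    # Toy model (assumption): a negative phase shift linear in omega,
    # which gives a constant time advance of 2 * KAPPA.
    return -KAPPA * omega


for omega in (0.1, 5.0):
    dt = time_delay(toy_eft_phase_shift, omega)
    print(omega, dt, abs(toy_eft_phase_shift(omega)), is_resolvable(dt, omega))
```

In this toy model $\omega|\Delta T| = 2|\Delta\delta|$, so the statement that the advance is unresolvable and the statement that the phase shift is at most of order unity coincide up to a factor of two.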
Only if there is a time advance, calculable within the EFT, larger than the wavelength of the scattering state, can we infer genuine causality violation. The eikonal approximation used in [23] 4 relies on resumming an infinite number of ladder diagrams which are "typically" enhanced as compared to all the other diagrams. Implicitly this resummation amounts to an exponentiation of the lowest order phase shift in the partial wave expansion. More precisely, it is the statement that a resummation of the t-channel exchange and its associated higher order ladder diagrams appropriately exponentiates, giving $e^{2i(\delta_g + \Delta\delta_{\rm EFT})}$. This resummation does make sense when the enhancement does actually occur, or in other words when the resulting phase shift is $|\delta_\ell| \gg 1$, so that these terms may be taken large relative to other small corrections. However, validity of the low energy effective theory requires $|\Delta\delta_{\rm EFT}| \ll 1$ in the case of QED, and we should really replace the exponentiated form with $e^{2i(\delta_g + \Delta\delta_{\rm EFT})} \to e^{2i\delta_g}\left(1 + 2i\Delta\delta_{\rm EFT} + \tfrac{1}{2}(2i)^2(\Delta\delta_{\rm EFT})^2 + \text{terms of the same order in the EFT expansion}\right)$.
3 Some literature uses the cutoff to define resolvability; this is not the definition used here, for reasons explained around (5.26).
4 In Ref. [23], as well as in, for example, the nice recent related discussion of [27], derivations are principally performed within the eikonal approximation, which has the virtue of having a relatively clean interpretation in terms of Feynman diagrams as a resummation of ladder diagrams. As such, these derivations are well suited to perturbative S-matrix calculations. All of the calculations we perform here are in the related semiclassical approximation. This approximation is harder to understand in terms of a resummation of Feynman diagrams; however, it is straightforward to calculate it by applying the WKB approximation to the corrected Green's functions. This is the method we will use in what follows, and we refer to Appendix B for more details on the semiclassical approximation. The eikonal approximation may be obtained from a high energy limit of the semiclassical approximation, as we outline in Appendix C. In the relativistic context the latter may be viewed as a Penrose limit of the former. Due to the close connection, many of our statements apply implicitly to both the semiclassical and eikonal approximation methods.
Although the eikonal resummation is valid for the usual $\delta_g$ contributions, we cannot take seriously the exponentiated $\Delta\delta_{\rm EFT}$ contributions as an indicator of high energy behaviour, which is necessary in order to interpret them as contributing to a physical time delay. Explicit calculations in the UV theory such as those performed in [54] confirm that the eikonal-resummed Drummond-Hathrell result bears no relation to the true high energy behaviour of the scattering phase shift.
We shall argue that this is a more general phenomenon, applying equally for GWs and indeed any EFT. Naive scattering time advances do occur in the low energy effective theory of gravity, and in generic EFTs in curved spacetime, arising for example in Schwarzschild spacetime from matter loops in close analogy with the QED case. However, as we show later, these time advances are not resolvable and are seen to satisfy $|\Delta T_{\rm EFT}| \lesssim \omega^{-1}$ (1.8), which is equivalent to the statement that $|\Delta\delta_{\rm EFT}| \lesssim 1$. This shows up generically in the fact that local fluctuations about the background geometry allow for mildly superluminal fluctuations. Once again though, this superluminality is not resolvable, and cannot be used to argue for any causality violations. By contrast, genuinely acausal EFTs lead to resolvable time advances calculable within the regime of validity of the EFT, and we give an example later. Consistent EFTs are seen then to self-protect from any macroscopic causality violation, and in this sense contain remnant information of their consistent UV completion. We stress again that the precise nature of the UV completion is immaterial to this particular part of the argument. The previous bounds and 'self-protection' mechanism should indeed hold irrespective of whether we are dealing with a weakly coupled infinite tower of spins, or a more mundane heavy loop contribution as in the Drummond-Hathrell case.
These observations have significant impact on how we put constraints on low energy effective theories. The overly enthusiastic low energy physicist who demands that the Wilson action should be constrained by the requirement that all fluctuations around every background should be (sub)luminal relative to the background, or similarly that the scattering time delay correction relative to the background (Shapiro/gravitational) time delay is positive, may easily risk ruling out EFTs with consistent Lorentz invariant and causal UV completions. A more nuanced discussion is required that establishes whether either of these effects leads to a macroscopically observable causality violation within the regime of validity of the effective field theory. In what follows we give well-known examples of both situations.
The rest of the manuscript is organized as follows: In section 2 we review why causality and analyticity typically require a subluminal low-energy phase velocity, while pointing out some caveats that occur in curved spacetimes. In section 3, we highlight subtleties that arise when dealing with superluminal low-energy speed in gravitational EFTs and the relevance of being able to take a decoupling limit where the gravitational degrees of freedom decouple before being able to restrict low-energy coefficients based on superluminal criteria. We highlight cases where macroscopic superluminalities are allowed (and even sometimes imposed) by analyticity and causality. Such cases are particularly important when attempting to restrict the allowed coefficients in cosmological EFTs. In section 4 we show how the small amount of superluminal low-energy speed we expect from the EFT of gravity leads to no physical propagation outside the light-cone and is therefore not in contradiction with causality. The same type of arguments is then shown to apply for Black Hole (BH) spacetimes in section 5. These BH spacetimes are asymptotically flat and the connection with the sign and magnitude of the scattering phase shift can be made manifest within the EFT of gravity. The same type of arguments and absence of secular growth is also made explicit in the EFT of QED below the electron mass as highlighted in section 6 where we make it clear that a negative phase shift of sufficiently large magnitude to be in tension with causality can never be realized within the regime of validity of the low-energy QED EFT. This is contrary to what occurs in other EFTs where the semiclassical or eikonal approximation can remain under control for sufficiently large phase shift and hence lead to a resolvable physical time advance and tension with causality as illustrated in section 7. We end with a summary and discussion in section 8. 
Appendix A provides a review of the low-energy EFT for gravity as well as the graviton dispersion relation and the direction of the RG flow. The semiclassical approximation used to compute the phase shift and time delay is reviewed in Appendix B, and its relation to the eikonal approximation is outlined in Appendix C. Finally, Appendix D provides useful formulae to compute the EFT corrections to the time delay relative to generic effective backgrounds.

Refractive Index
Implications from Analyticity and Unitarity: It is a familiar result that the speed of propagation of a wave in a medium is in general different than in vacuum. For instance, for a rotationally and translationally invariant medium, it is sufficient to describe the propagation speed through the refractive index $n(\omega)$, for which the phase velocity is given by $v_p = c/n(\omega)$. The speed of propagation of a wavefront is determined by the front velocity $v_f$, which is given by (from now on we set c = 1) $v_f = \lim_{\omega\to\infty} 1/n(\omega)$ (2.1). Relativistic causality demands that $v_f \le 1$. However this does not preclude the possibility that the low energy phase or group velocity is superluminal. Superluminal group velocities in particular are a well studied experimental phenomenon [55][56][57][58] and do not in any way contradict causality; they rather indicate the failure of group velocity as a useful concept in a dispersive medium. This was recognized long ago by Sommerfeld and Brillouin [56], who resolved the apparent paradox between superluminal group velocities and relativity well before any experimental evidence for this phenomenon.
However the front velocity is not the end of the story. The full requirement of causality is that the retarded propagator vanishes outside of the forward lightcone, $G_{\rm ret}(x, x') = 0$ whenever $x$ lies outside the forward lightcone of $x'$ (2.2). In addition to the front velocity being luminal, the latter generally requires that the refractive index is an analytic function in the upper half complex $\omega$ plane [59]. Applying Cauchy's theorem assuming analyticity leads to the Kramers-Kronig relations, which for future comparison are most usefully written as ${\rm Re}[n(\omega)] = n(\infty) + \frac{2}{\pi}\,\mathcal{P}\!\int_0^\infty d\omega'\,\frac{\omega'\,{\rm Im}[n(\omega')]}{\omega'^2 - \omega^2}$ (2.3). A travelling wave moving in the z direction takes the form $e^{i\omega t - i n(\omega)\omega z}$ and so it is the real part ${\rm Re}[n(\omega)]$ that determines the speed, and the imaginary part the dispersion. Now in a normal medium, unitarity demands that the imaginary part of the refractive index is positive: ${\rm Im}[n(\omega)] \ge 0$ for real $\omega > 0$. At zero frequency, the real part is given more precisely as $n(0) = n(\infty) + \frac{2}{\pi}\int_0^\infty d\omega'\,\frac{{\rm Im}[n(\omega')]}{\omega'}$ (2.4). Hence we conclude a bound on the low energy phase velocity, $v_p(0) = 1/n(0) \le 1/n(\infty) = v_f$ (2.5). Since the front velocity cannot be superluminal, $v_f \le 1$, it is typically inferred that the low energy phase velocity cannot be superluminal unless we violate either (1) analyticity, or (2) unitarity.
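The zero-frequency dispersion relation can be checked numerically in a simple example. The sketch below assumes a toy Lorentz-oscillator refractive index $n(\omega) = 1 + A/(\omega_0^2 - \omega^2 - i\gamma\omega)$, which is not taken from the text; the parameter values are hypothetical, and the sign convention is chosen so that ${\rm Im}[n] \ge 0$ for $\omega > 0$. For this model the exact answer is $n(0) = 1 + A/\omega_0^2$.

```python
import math

# Hypothetical medium parameters for the toy Lorentz-oscillator index (assumption).
A, OMEGA0, GAMMA = 0.1, 1.0, 0.5
N_INF = 1.0  # n(infinity) = 1, i.e. a luminal front velocity


def im_n_over_omega(w):
    """Im[n(w)] / w for the toy index; finite at w = 0 and non-negative."""
    return A * GAMMA / ((OMEGA0**2 - w**2) ** 2 + (GAMMA * w) ** 2)


def n_zero_from_dispersion(omega_max=200.0, steps=400_000):
    """n(0) = n(inf) + (2/pi) * integral_0^inf Im[n(w)]/w dw (trapezoid rule).

    The integral is truncated at omega_max; the neglected tail falls off as 1/w^4.
    """
    h = omega_max / steps
    total = 0.5 * (im_n_over_omega(0.0) + im_n_over_omega(omega_max))
    for i in range(1, steps):
        total += im_n_over_omega(i * h)
    return N_INF + (2.0 / math.pi) * h * total


n0 = n_zero_from_dispersion()
print("n(0) from dispersion relation:", n0)
print("exact n(0) for this model:   ", 1.0 + A / OMEGA0**2)
print("low energy phase velocity:   ", 1.0 / n0)
```

Because the integrand is non-negative, the integral can only raise $n(0)$ above $n(\infty)$, which is the origin of the subluminal low energy phase velocity in this flat-space setting.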
Low-energy effective theory: This particular argument is strengthened if we imagine a situation in which the dispersive imaginary part of the refractive index is only non-zero for frequencies above some scale M, i.e. ${\rm Im}[n(\omega)] = 0$ for $|\omega| < M$.
In this situation, there exists a low energy effective theory valid for frequencies $\omega \ll M$ for which the refractive index is well approximated by a Taylor series with leading low energy term $n(0)$ and positive dimensionless EFT coefficients. The equation of motion that describes the propagation of the wave with amplitude $\phi(t, x)$ then involves a series of higher time-derivative terms with some dimensionless coefficients $c_n$. The higher time-derivatives arise here due to the low energy expansion and do not imply additional states. Indeed within the context of the $1/M^2$ expansion this equation may be rewritten in the conceptually nicer form (2.11). In this low energy regime, there is by assumption no dispersion, and effects from high energy physics are captured by local higher derivative operators. The leading order group velocity is the same as the phase velocity, $v_g = 1/n(0)$. Thus low energy sources propagate at the speed $1/n(0)$. If we compute the retarded propagator for $\phi$ as a perturbative expansion in $1/M$ then each term at finite order will vanish only outside the forward lightcone defined by $|\vec{x} - \vec{x}'| = (t - t')/n(0)$ (2.12). Thus we must have $n(0) \ge 1$, otherwise even a tiny superluminal velocity $n(0) = (1 + \epsilon)^{-1}$ with $\epsilon > 0$ would integrate up over sufficiently long periods of time to an arbitrarily large increase $\Delta x$ in the spatial size of the causal lightcone from a given event, $\Delta x = \epsilon\,|t - t'|$, leading to causality paradoxes. In running this argument, it is crucial that $|t - t'|$ may be made arbitrarily large. We will see that when considering the same argument in a Friedmann-Lemaître-Robertson-Walker (FLRW) geometry or on the background of a BH, it is exactly this assumption that breaks, for reasons to be explained. Furthermore it is unclear whether (2.5) holds in curved spacetimes due to the generic absence of conventional analyticity.
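The positivity of the Taylor coefficients can be sketched as follows. This is a hedged reconstruction rather than the authors' own derivation: it assumes the standard Kramers-Kronig form of the dispersion relation, ${\rm Im}[n(\omega')] \ge 0$, and vanishing ${\rm Im}[n]$ below $M$; the coefficient name $d_k$ is introduced here purely for illustration.

```latex
% Assumption: Kramers-Kronig relation with Im[n] supported only on omega' >= M.
% For |omega| < M the integrand can be expanded as a geometric series.
\begin{align}
  n(\omega)
    &= n(\infty) + \frac{2}{\pi}\int_M^\infty d\omega'\,
       \frac{\omega'\,\mathrm{Im}[n(\omega')]}{\omega'^2-\omega^2}
     = n(\infty) + \frac{2}{\pi}\sum_{k=0}^{\infty}\omega^{2k}
       \int_M^\infty d\omega'\,\frac{\mathrm{Im}[n(\omega')]}{\omega'^{2k+1}} \\
    &= n(0) + \sum_{k=1}^{\infty} d_k\,\frac{\omega^{2k}}{M^{2k}}\,,
  \qquad
  d_k \equiv \frac{2}{\pi}\,M^{2k}\int_M^\infty d\omega'\,
       \frac{\mathrm{Im}[n(\omega')]}{\omega'^{2k+1}} \;\ge\; 0\,.
\end{align}
```

Every $d_k$ is an integral of a non-negative function, so under these assumptions unitarity fixes the sign of each coefficient, and the $k = 0$ term reproduces $n(0) \ge n(\infty)$.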
The fact that the situation is more subtle in curved spacetimes is well known from work on the low energy effective theory for QED in curved spacetime [33,34] which has been discussed extensively in the literature [35][36][37][38][39][40][41][42], where it is noted that the low energy phase velocity in a curved spacetime can be superluminal without contradicting the requirement that the front velocity is luminal. Our subsequent discussions in sections 4, 5 and Appendix A will parallel this for the speed of GWs themselves.

Analyticity with Gravity
Since one of our principal interests is the speed of gravity, i.e. the speed of GWs in a curved background, we would ideally repeat the argument of the previous section. A knowledge of the spectral properties of the propagator for GWs in a curved spacetime could allow us to infer concrete statements about the low energy speed. Unfortunately a direct application of these arguments to curved spacetimes is not available since there is no requirement that analyticity should hold in general. For cosmological spacetimes, this is made transparent by the inherent time dependence of the background meaning that frequency ω is no longer a good Fourier variable.
Fortunately all information from analyticity in Minkowski spacetime is not completely lost. Consider a particular diffeomorphism invariant low energy gravitational theory. Analyticity constraints will impose restrictions on the form of the low energy action based on analytic scattering amplitudes in Minkowski spacetime or spectral density requirements. Since the underlying gravitational theory is diffeomorphism invariant, this can immediately be used to infer constraints on covariant operators in the effective Lagrangian, which in turn have consequences around curved spacetimes. One set of arguments of this kind is reviewed in appendix A, and we summarise the essential points here. These arguments are closely related to positivity bound arguments that apply to scattering amplitudes [4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21].
Rather than working with a dispersion relation for the refractive index (2.3), we can determine a Källén-Lehmann type dispersion relation for the gravitational wave propagator on Minkowski spacetime. Due to gauge invariance, this is most conveniently expressed as the exchange interaction 'TT-amplitude' between two conserved sources with gravitational propagator as given in (A.8). Despite being a different quantity, this amplitude (2.13) is conceptually similar to (2.3), where the functions $\rho_2(\mu)$ and $\rho_0(\mu)$ are positive by unitarity and are analogous to ${\rm Im}[n(\omega)]$, and the subtraction constants $C_2(\mu_0)$ and $C_1(\mu_0)$ are analogous to $n(\infty)$. As for the standard Källén-Lehmann spectral representation, the momentum space argument is an analytic function of complex momentum squared $s = -k^2 + i\epsilon$ up to a pole at $k^2 = 0$ and a right-hand branch cut 5. In defining (2.13) we have introduced an arbitrary subtraction scale even though the result is independent of that scale. This is encoded in the renormalization group style equation (2.14) (see appendix A for derivation). Although $\mu_0$ is not the sliding scale or cutoff of the usual renormalization group, it encodes the same flow, and we see that unitarity demands positivity of the flow from the UV into the IR. Although this does not constitute a proof, it certainly leads to the expectation that $C^{\rm IR}_S > 0$. Attempts to prove this have been given in [60] for the case of gravity coupled to a Maxwell field via S-matrix positivity arguments 6. Specifically, positivity of $C^{\rm IR}_2$ would follow if the graviton scattering amplitude with the massless t-channel pole removed has a positive second s derivative in the forward scattering limit.
5 The branch cut lies on the real axis for $-k^2 \ge 0$ with the physical value understood to be the limit from the upper half complex plane, hence $s = -k^2 + i\epsilon$.
6 Since the curvature squared corrections associated with $C_S$ can be removed with field redefinitions, the S-matrix constraints of [60] are strictly speaking applied to the $F^4$ and $F^2 R$ terms in the Einstein-Maxwell EFT. However, if we take the perspective that the coefficients of these operators are zero before the field redefinition, then positivity implied by the arguments of [60] would indeed imply that $C^{\rm IR}_2 > 0$. The Källén-Lehmann dispersive arguments are clearly weaker than the S-matrix bounds since the former are sensitive to field frame.
What does this have to do with GWs propagating in curved spacetimes? The answer is that the coefficients $C^{\rm IR}_S$ determine precisely the coefficients of the covariant curvature squared terms in the low energy effective theory. If we begin with the tree level Wilsonian effective action which includes the leading curvature squared terms expected to arise from integrating out loops and higher spin heavy states (see Eq. (A.19)), the above dispersive arguments enforce positivity of the coefficients of $R^2$ and $R_{\mu\nu}^2 - R^2/3$, or equivalently stated, positivity of the Weyl curvature squared term $W_{\mu\nu\rho\sigma}^2$. We may then use the resulting local equations for the low energy effective theory to determine an effective equation for the propagation of low energy GWs precisely analogous to (2.11). By direct analogy we can define the low energy speed by the refractive index coefficient $n(0)$ that relates the leading 'two derivative' part of this effective equation. The central result of [43] is that the low energy refractive index $n(0)$ is determined at leading order by $C^{\rm IR}_2$, and the sign of the latter directly determines the sub- or super-luminality of the low energy speed of propagation. In situations where the leading curvature squared terms vanish, such as Schwarzschild spacetime, it is the curvature cubed terms which determine the leading effect, as calculated in [43] and considered in detail in [44].

Non-Gravitational Criterion
Before getting to our main discussion on the speed of GWs we will review here several key points that arise when dealing with EFTs on a background that spontaneously breaks Lorentz invariance. Indeed, it is straightforward to write down Lorentz invariant EFTs which exhibit superluminal propagation around spontaneously broken Lorentz backgrounds. The canonical example is that of a P(X) model [61] (ignoring gravity for now and setting the field on Minkowski) with a non-minimal kinetic term of coefficient a and suppression scale M. For the 'wrong' sign choice a < 0, this leads to superluminal propagation around a simple time-dependent background 7 $\phi = \phi(t)$. Indeed the sound speed about such a background is superluminal for a < 0. This is connected with a violation of positivity bounds [4][5][6].
7 The same EFT with a < 0 also leads to superluminal propagation around any other background that spontaneously breaks Lorentz invariance by picking up a preferred direction for $\partial_\mu\phi$, no matter whether $\partial_\mu\phi$ is timelike or spacelike.
The departure of the speed from unity is always suppressed by factors of $\dot\phi^2/M^4$ and is therefore always small within the regime of validity of the EFT. Nevertheless, even such a small correction can lead to significant macroscopic consequences. This is because, in this setting of a field theory on Minkowski spacetime, there is no limit to how long we may wait to integrate this effect up, and even a small local effect can therefore build up to a macroscopic size. Roughly speaking, at a time $t$, the future lightcone emanating from a given spacetime point at $t_0$ will be larger by a radius $\Delta r\sim(c_s-1)\,|t-t_0|$, and the smallness of $\dot\phi/M^2$ may easily be compensated for by the largeness of $|t-t_0|$, which is otherwise unrestricted. This distance will be macroscopically observable: not only will it satisfy $\Delta r\gg M^{-1}$ but, crucially, it is resolvable, $\Delta r\gg\lambda$, where $\lambda$ is the wavelength of the propagating fluctuation. We are therefore in a situation where we may imagine violations of causality. At the very least, this would imply a violation of causality as implied by the (asymptotic) Lorentz invariant lightcone.
Indeed, stated this way, it is clear that any EFT in Minkowski spacetime that leads to any amount of homogeneous superluminality, no matter how small, will necessarily lead to macroscopic causality violation after a long enough time. Since there is no restriction on how long we can wait, we must conclude that standard relativistic causality imposes the strict requirement$^8$ that $c_s\le1$ (3.4).
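To make the argument above concrete, the following sketch works out the sound speed and the accumulated lightcone displacement for a quartic $P(X)$ model. It assumes the normalization $\mathcal{L}=-\tfrac12(\partial\phi)^2+\tfrac{a}{M^4}(\partial\phi)^4$ in mostly-plus signature and a constant background velocity $\dot{\bar\phi}$. Both are illustrative assumptions, so the numerical factors need not coincide with those of (3.1) and (3.2).

```latex
% Assumed (illustrative) normalization, signature (-,+,+,+):
%   L = -(1/2) Y + (a/M^4) Y^2,   Y = (\partial\phi)^2 .
% Split \phi = \bar\phi(t) + \pi with constant \dot{\bar\phi}.
\begin{align}
Y &= -\dot{\bar\phi}^2 - 2\dot{\bar\phi}\,\dot\pi
     + \big[-\dot\pi^2 + (\nabla\pi)^2\big], \\
% quadratic Lagrangian for the fluctuation \pi
\mathcal{L}^{(2)} &=
   \Big(\tfrac12 + \frac{6a\,\dot{\bar\phi}^2}{M^4}\Big)\dot\pi^2
 - \Big(\tfrac12 + \frac{2a\,\dot{\bar\phi}^2}{M^4}\Big)(\nabla\pi)^2, \\
% ratio of gradient to kinetic coefficients
c_s^2 &= \frac{1 + 4a\,\dot{\bar\phi}^2/M^4}{1 + 12a\,\dot{\bar\phi}^2/M^4}
      \simeq 1 - \frac{8a\,\dot{\bar\phi}^2}{M^4}, \\
% accumulated displacement for the 'wrong' sign a<0
\Delta r &= (c_s-1)\,|t-t_0|
      \simeq \frac{4|a|\,\dot{\bar\phi}^2}{M^4}\,|t-t_0| .
\end{align}
```

With these assumed factors, the displacement becomes resolvable, $\Delta r\gg\lambda$, once $|t-t_0|\gg\lambda M^4/(4|a|\dot{\bar\phi}^2)$, a waiting time that is always available on Minkowski spacetime.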
Our central claim is that the same argument does not apply to the low energy speed in a curved spacetime or gravitational setting for a number of reasons. As we have already alluded to, in gravitational theory (in a setup that spontaneously breaks Lorentz invariance), the question is more subtle for two reasons: first, the notion of low energy speed (i.e. any speed inferred from a low energy effective theory) is not frame independent and second, due to the nature of the spacetime in question, it may not be possible to integrate up a small departure in speed to make a macroscopic effect. In other words, the would-be superluminality may not be resolvable. When this is the case we cannot necessarily legitimately conclude that there is any causality violation in the regime of validity of the effective theory. In fact we will see that in the two most interesting curved spacetimes, namely FLRW and Schwarzschild, this is exactly what happens.
When computed from the leading terms in a low-energy EFT, the low-energy speed may not manifest any explicit frequency dependence. This occurs in particular if only at most second order derivative terms have been included in the low-energy EFT and any other higher derivative term has been considered as irrelevant, as was the case in the P (X) EFT considered previously in (3.1). We stress however that this is always an artifact of the truncation to the leading EFT interactions. However from the very definition of a low-energy EFT, (3.2) can only give an appropriate description for c s at low-energies, in this case at frequencies of at most ω M 2 /|φ|. Beyond the regime of validity of the low-energy EFT, the computation of the speed is simply not valid. Even if the formula does not break down mathematically, it does break down physically when flowing to higher energy as higher order irrelevant operators ought to be taken into account, until ultimately the theory ought to be traded for its higher energy counterpart, explicitly including heavier modes. See Ref. [62] for a discussion.

Decoupling Limit
In many situations, in a given gravitational effective field theory, it may be possible to take a decoupling limit$^9$ $M_{\rm Pl}\to\infty$, keeping some other interaction scale $M$ in the system fixed, for which the helicity-2 gravitational degrees of freedom decouple from all other degrees of freedom while maintaining the interactions that arise at the lowest energy scale $M$. Whenever this is possible, the resulting decoupled effective theory can be analyzed from the perspective of an interacting field theory on a fixed (Minkowski or other curved) background, and in this situation the above argument (3.4) is expected to be valid. With this in mind we may then declare that if the condition (3.5), i.e. (sub)luminality in the decoupling limit, holds in the field frame for which the decoupling limit is well defined for all species, then causality is (expected to be) satisfied. The condition on the field frame is crucial, as we will see in Section 3.3. More precisely, we will see that the rate at which the effect goes to zero as $M_{\rm Pl}\to\infty$ is crucial, and a more refined version of this statement is the bound (3.8).
• Superluminal Speed in the Decoupling Limit.-As a first example, we may consider the gravitational version of our canonical example in the previous subsection 3.1. Promoting the previous example to a gravitational effective field theory it is straightforward to take the limit M Pl → ∞ with g µν = η µν + h µν /M Pl keeping the scale M fixed. In this limit, we are left with two decoupled sectors, on one side a free massless spin-2 degree of freedom h µν and on the other side an interacting P (X) scalar field theory on Minkowski identical to that considered in 3.1 to which the usual superluminality and positivity bounds violation arguments apply.
• Luminal Speed in the Decoupling Limit.-On the other hand, we may now consider a modification to the low-energy speed that is parametrically suppressed by powers of $M^2/M_{\rm Pl}^2$ relative to the previous effect, as for instance in (3.7). As we shall see, this is closer to the typical situation for the low-energy speed of GWs with corrections as in [43]. In this case it is not possible to take the limit $M_{\rm Pl}\to\infty$ and have $c_s$ differ from unity without something else blowing up. For instance, we may try to scale $M\sim1/M_{\rm Pl}$, but with the understanding that $M$ sets the scale of other irrelevant operators in the effective theory, this would inevitably lead to a breakdown of the low-energy EFT at arbitrarily low scales, invalidating the calculation of the speed.
Our central claim is that whenever a situation like (3.7) occurs, where $M$ is related to the cutoff of the low-energy EFT in which the speed has been computed (or to the scale of irrelevant operators), the condition (3.5) is actually satisfied, and it would then not be legitimate to demand that $c_s\le1$ for all degrees of freedom away from the decoupling limit (i.e. at finite $M_{\rm Pl}$). Specifically, we will see that in actual examples the low-energy speed $c_s$ is typically expected to be superluminal without leading to any macroscopic violation of causality. The key to this is the smallness of the effect, which is in turn tied to the fact that the speed is luminal in the limit $M_{\rm Pl}\to\infty$. As long as (3.5) continues to hold, we do not anticipate any violation of causality. More precisely, we shall see that if (3.8) holds with $\alpha\ge2$, there will be no macroscopically observable effects. A single power of $M_{\rm Pl}$ suppression would not be sufficient; however, all gravitationally induced corrections to the sound speed arise with at least an $M_{\rm Pl}^2$ suppression (in a local theory).

Macroscopic Superluminality allowed by Analyticity
In order to illustrate the subtleties that emerge when dealing with superluminalities in a gravitational theory, we give here an example of an effective theory for which superluminal GWs are required by analyticity! Consider the effective Lagrangian of the form which includes a non-minimal coupling between gravity and the scalar. It is straightforward to show that on considering perturbations around a cosmological solution sourced by a time dependent scalar φ(t), the GWs propagate superluminally if a > 0 and subluminally if a < 0. Furthermore this effect is a macroscopically observable one since Surely then, since this superluminality is macroscopically observable, causality/analyticity considerations will demand that a < 0? In fact it is straightforward to see that this is not the case and the precise opposite actually holds. Indeed, one can change frame so that the Lagrangian (3.9) exactly matches (3.6) by the simple redefinition whence standard positivity bounds following from analyticity impose that a > 0 as the 'causal' choice. In terms of the canonical normalized gravitational fluctuations This example nicely illustrates two points (a) the ambiguity of speed under field redefinitions and (b) the importance of causality constraints being imposed in the frame in which the decoupling limit is well defined. Indeed, unlike the field frame implicit in (3.6) for which the decoupling limit M Pl → ∞ is well defined, the Lagrangian (3.9) does not have a well defined decoupling limit. Indeed in taking the limit M Pl → ∞ in this frame we would have The lesson to learn from this is that simply demanding that for a given EFT the speed of propagation of fields is (sub)luminal in a given field frame is not only unjustified, it may even explicitly violate the requirements that do come from causality. For this reason, the only safe requirement to impose on the EFT is that given in (3.5) which only applies in a field frame for which a decoupling limit exists.

Speed of Gravity in Cosmology
Let us now focus our discussion on the specific case of cosmological spacetimes. As reviewed in appendix A, the leading corrections to the low-energy EFT for gravity may be expressed in the form (4.1), where GB designates the Gauss-Bonnet term and $W$ the Weyl tensor. In addition, when matter is included, we may allow for non-minimal matter-curvature interactions, as for example $RF^2$ terms in the case of Einstein-Maxwell. In order to focus on the genuine gravitational interactions, we shall not consider these non-minimal matter interactions in what follows (including them can lead to additional sources of superluminalities that can be dealt with in a more standard way). Within the low-energy EFT, one cannot determine the sign of the coefficients $C^{\rm IR}_{R^2,W^2}$, but as argued in appendix A the positivity of the RG flow (A.11) implies (4.2), and in what follows we shall make the a priori not-so-unreasonable assumption that $C^{\rm IR}_{W^2}$ may be positive.
Given a covariant action of the local form (4.1) encoding the EFT corrections, it is straightforward to compute the corrections to the equation of motion for tensor fluctuations on a cosmological background, as done in [43]. Identifying what we mean by speed is however slightly subtle, since the truncated equation of motion contains higher time derivatives. These may be removed with field redefinitions and traded for space derivatives, just as in the conversion from (2.10) to (2.11). After this is done, the equation of motion for tensor GWs on FLRW may, by virtue of symmetry, be put in the form (4.3), where we work in conformal time, $ds^2=a(\eta)^2(-d\eta^2+d\vec x^2)$. This relation may be rearranged and expressed in the form (4.4), where $m^2_{\rm eff}=\beta_0(\eta)$ is an effective mass and $\tilde c_s^2(k,\eta)=\sum_{n=1}^{\infty}\beta_n(\eta)k^{2n-2}$ an effective $k$-dependent sound speed. We then define the low-energy sound speed to be $c_s^2(\eta)\equiv\tilde c_s^2(0,\eta)=\beta_1(\eta)$, namely the speed of propagation implied by the truncated equation (4.5). Explicit calculation using (A.19) gives the low-energy speed (4.6) [43]. Since the null energy condition requires $\dot H<0$, $C_{W^2}>0$ would imply that this low-energy speed is slightly superluminal. Note that we do not include the effective background-generated mass $m_{\rm eff}$ in consideration of the speed of propagation, because what is relevant is the causal support of the retarded propagator. If (4.5) were the exact equation, this would be determined by $c_s^2(\eta)$ alone [64]. This is in agreement with what is typically meant by the low-energy speed. If the retarded propagator for the exact equation (4.3) is determined as a perturbative expansion in spatial derivatives, with the (up to) two-derivative terms (4.5) taken as the leading part, then at any finite order in perturbations the causal support of the retarded propagator will be determined by (4.5). Clearly the relevant question is when this is a good indication of the true causal support of the exact retarded propagator.
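The definitions of $m_{\rm eff}^2$, $\tilde c_s^2$ and $c_s^2$ above largely fix the structure of the three mode equations being referred to. As a sketch, and assuming the tensor mode $h_k$ has been rescaled so that no friction term appears (an assumption, chosen to match the operator $\partial_\eta^2+m_{\rm eff}^2+k^2$ used for the zeroth-order propagator later in this section), they would read:

```latex
% Primes denote derivatives with respect to conformal time \eta.
\begin{align}
% exact equation, organized as a series in the comoving momentum k
& h_k'' + \sum_{n=0}^{\infty}\beta_n(\eta)\,k^{2n}\,h_k = 0, \\
% same equation, rearranged: \beta_0 is the mass, the rest defines \tilde c_s
& h_k'' + m_{\rm eff}^2(\eta)\,h_k + \tilde c_s^{\,2}(k,\eta)\,k^2\,h_k = 0,
  \qquad \tilde c_s^{\,2}(k,\eta) = \sum_{n=1}^{\infty}\beta_n(\eta)\,k^{2n-2}, \\
% truncation to (at most) two spatial derivatives
& h_k'' + m_{\rm eff}^2(\eta)\,h_k + c_s^2(\eta)\,k^2\,h_k = 0,
  \qquad c_s^2(\eta) = \beta_1(\eta).
\end{align}
```

In this form the low-energy speed is simply the $k\to0$ limit of $\tilde c_s$, and every $\beta_{n\ge2}$ is a correction that the truncation discards.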

Validity of EFT in FLRW
Before proceeding, we need to address the conditions for the validity of the EFT, e.g. the validity of equation (4.3) to describe the evolution of GWs. In the application of the Wilsonian effective theory with cutoff $M$, we can only describe momenta for which covariant operators are small relative to the cutoff scale, e.g. $\Box\ll M^2$ and $R\ll M^2$. In the cosmological context, since Lorentz invariance is broken by the background, this means we can only use the effective theory to describe the evolution of modes in the region where (4.7) holds. In the typical situation in which $\dot H\sim O(H^2)$, this may be stated as (4.8). Note that this scale is much higher$^{10}$ than that typically considered in trans-Planckian type arguments, where it would be argued that the EFT breaks down when $k\sim a(t)M$ [65,66]. The reason is that we assume the underlying theory is Lorentz invariant, and so we require a locally Lorentz invariant combination to be comparable to $M^2$. In the cosmological context, where only time translations are broken, we may for example decompose the Ricci tensor in the manner $R_{\mu\nu}=\Omega\,g_{\mu\nu}+\kappa_\mu\kappa_\nu$, where $\kappa_\mu$ is a non-normalized time-like vector (since we are dealing with cosmology here). Given an on-shell wave of momentum $k_\mu$ for which $k_\mu k^\mu\approx0$, (4.8) is the locally Lorentz invariant bound $|\kappa_\mu k^\mu|\ll M^2$. This may be taken together with the requirements that $\kappa_\mu\kappa^\mu\ll M^2$ and $|\Omega|\ll M^2$, which require $H^2\ll M^2$ and $|\dot H|\ll M^2$. Indeed, the argument for why (4.7) is the more general condition and not (4.8) is that de Sitter invariance in the limit $\dot H\to0$ is sufficient to ensure validity of the EFT at arbitrarily high on-shell momenta.
To clarify this, let us think of a typical example EFT organized in the standard manner, where all irrelevant operators are suppressed by the common scale $M$ to the appropriate power. Schematically the effective action takes the form$^{11}$ (4.11), with the usual understanding that we allow for all local scalar operators constructed out of the appropriate number of powers of the Riemann tensor and covariant derivatives in any order. Given the underlying locality and Lorentz invariance, any term in the effective dispersion relation for GWs around a curved background that is not of the local Lorentz invariant form $(\omega^2-k^2/a^2)$ will necessarily come suppressed by some power of the background curvature quantities $H^2$, $\dot H$ and derivatives thereof. Since these terms spontaneously break Lorentz invariance, they may come together with $k^2/a^2$ and $\omega^2$ terms and so will naturally package into dimensionless combinations of the form (4.12). For example, these will arise from terms in (4.11) with factors of $M^{-4}R^{\mu\nu}\nabla_\mu\nabla_\nu$. Terms with the same number of powers of curvature but higher powers of $k$, such as $\dot Hk^4/(a^4M^6)$, will necessarily only arise in the quasi-Lorentz invariant combination (4.13), as for example coming from terms like $M^{-6}R^{\mu\nu}\nabla_\mu\nabla_\nu\Box$ (acting for instance on the scalar curvature). This is in essence due to index contraction: if we limit ourselves to a fixed number of powers of curvature then, since the Weyl tensor vanishes for FLRW, once the Ricci tensor indices have been contracted all remaining indices must be contracted with the metric, which locally takes a Lorentz invariant form. Thus schematically the effective form of the dispersion relation (in terms of the physical momentum $k/a$) will be as given in (4.14).

Footnote 11: This is for example the schematic form of curvature dependence in the low-energy EFT for string theory, in which $M$ is the string scale $1/\sqrt{\alpha'}$, which is parametrically below the Planck scale [67,68].
The condition that the EFT remains under control requires at a minimum that the $\beta$ corrections are small, or more precisely that the $\beta$ series is at least convergent in an asymptotic series sense. Since we are allowed arbitrary integer powers of the $a_i$, this will only be true if each of the dimensionless ratios in brackets is kept smaller than unity. Hence, in addition to the expected requirements that the curvature remains small, $H^2\ll M^2$ and $|\dot H|\ll M^2$, we recover the condition (4.7). This implies that in the typical situation for which $\dot H\sim O(H^2)$, the momentum cutoff appropriate for an on-shell state, i.e. a propagating gravitational wave, is as specified in (4.8). Due to redshifting in the cosmological context of an expanding Universe, the bound (4.7) is strongest at the earliest times, which is where we shall make use of it.
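As a cross-check of the scaling of this momentum cutoff, one can evaluate the Ricci tensor of a spatially flat FLRW background on a null vector. The sketch below assumes mostly-plus signature, takes $u^\mu$ to be the unit normal to constant-time slices, and writes $p^\mu$ for the null four-momentum of the wave (a symbol introduced here to avoid a clash with the comoving wavenumber $k$), with physical frequency $|u\cdot p|=k/a$.

```latex
\begin{align}
% Ricci tensor of spatially flat FLRW, with u_\mu u^\mu = -1
R_{\mu\nu} &= (\dot H + 3H^2)\,g_{\mu\nu} - 2\dot H\,u_\mu u_\nu , \\
% contraction with a null momentum: the g_{\mu\nu} term drops out
R_{\mu\nu}\,p^\mu p^\nu &= -2\dot H\,(u\cdot p)^2 = -2\dot H\,\frac{k^2}{a^2}
   \qquad (p_\mu p^\mu = 0), \\
% demanding this Lorentz-invariant combination be below the cutoff
\big|R_{\mu\nu}\,p^\mu p^\nu\big| \ll M^4
  &\;\Longrightarrow\; \frac{|\dot H|\,k^2}{a^2} \ll M^4
  \;\xrightarrow{\;\dot H\sim H^2\;}\; \frac{k}{a} \ll \frac{M^2}{H}.
\end{align}
```

Since $H\ll M$, the resulting cutoff $M^2/H$ on the physical momentum lies parametrically above $M$, and it is removed altogether as $\dot H\to0$, in line with the de Sitter remark above.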

Causality constraint
While it is known and observed that in many media the low energy phase and group velocities may temporarily become superluminal, this is only in conflict with causality if the superluminality may be integrated up to a macroscopic effect for which the lightcone is clearly larger than the Lorentzian lightcone. One way to characterise this in asymptotically flat spacetimes is to ask whether there is an 'asymptotic superluminality' [23,53]. In practice, this amounts to asking whether there can be an integrated time advance in a scattering event, which would imply that the signal from a scattering process could arrive before that of an unscattered wave -in a Lorentz invariant theory this would then be associated to some type of acausality.
In the cosmological context (or any other curved geometry which is not asymptotically flat), we do not have such a clean tool, and any S-matrix calculation of this form would only be approximately valid at subhorizon scales. We can however ask, by virtue of the symmetry of the FLRW spacetime, how much larger the lightcone of propagation is emanating from some event after many Hubble times. On first sight, we may imagine that even the tiniest amount of superluminality in the low-energy phase could be integrated up to some large observable effect over the entire age (or even future) of the Universe. Crucially, this is not the case as we now explain.
Let us work with the effective metric seen by GWs in the EFT of gravity, i.e. an effective metric with speed c s (t) as in (4.6). We now consider the future lightcone emanating from a spacetime event at time t i as determined with respect to this effective metric. If C W 2 > 0 then at a given time t > t i this lightcone is larger than the usual FLRW lightcone by a radial distance ∆r Using (4.6) at leading order in the EFT expansion this distance is In an expanding Universe, the integrand on the right hand side is bounded by This implies given H(t) < H(t i ) for t > t i , assuming the null energy condition is satisfied.
Post-inflation period.-For a mode of a given comoving momentum $k$, the earliest time $t_i$ at which we can trust the EFT calculation of the speed is set by (4.8) (assuming that for most of cosmic history $\dot H\sim O(H^2)$, which is true post inflation). From this we infer a bound on $\Delta r(t)$ in units of the physical wavelength $\lambda(t)=2\pi a(t)/k$. Finally, the cutoff of the EFT should satisfy at most $M^2\ll M_{\rm Pl}^2/C_{W^2}$, so the bound essentially becomes $\Delta r(t)\ll\lambda(t)$ (4.22). We recall that $\Delta r$ represents the distance that low-energy GWs may propagate outside the light cone set by the FLRW background metric (i.e. the light cone seen by minimally coupled fields) if $C_{W^2}>0$. For any GW this distance is always much less than the actual physical wavelength $\lambda$ of the GW (provided we remain within the regime of validity of the low-energy EFT). This distance is therefore not resolvable, and if it ever were resolvable, one would not be able to trust the result, as it would rely on applying the low-energy EFT beyond its regime of validity. Thus causality remains intact provided we limit ourselves to asking questions that are fully within the regime of validity of the EFT.
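The chain of inequalities behind this conclusion can be sketched as follows. The sketch takes $c_s-1\simeq\gamma\,C_{W^2}|\dot H|/M_{\rm Pl}^2$ with $\gamma>0$ an unspecified order-one number standing in for the coefficient in (4.6), defines $\Delta r$ as the physical displacement at time $t$, and uses only the null energy condition ($\dot H<0$) and expansion ($a(t')\ge a(t_i)$). These normalizations are assumptions made for illustration.

```latex
\begin{align}
% leading-order departure from luminality (gamma: assumed O(1) factor)
c_s - 1 &\simeq \gamma\,C_{W^2}\,\frac{|\dot H|}{M_{\rm Pl}^2}, \\
% comoving displacement, bounded using a(t') >= a(t_i) and -\dot H > 0
\frac{\Delta r(t)}{a(t)} &= \int_{t_i}^{t}\frac{dt'}{a(t')}\,\big(c_s(t')-1\big)
  \le \frac{\gamma\,C_{W^2}}{M_{\rm Pl}^2\,a(t_i)}\int_{t_i}^{t}dt'\,\big(-\dot H\big)
  = \frac{\gamma\,C_{W^2}}{M_{\rm Pl}^2}\,\frac{H(t_i)-H(t)}{a(t_i)}
  < \frac{\gamma\,C_{W^2}}{M_{\rm Pl}^2}\,\frac{H(t_i)}{a(t_i)}, \\
% EFT validity at the initial time fixes H(t_i)/a(t_i) in terms of k
\frac{k}{a(t_i)} \lesssim \frac{M^2}{H(t_i)}
  &\;\Longrightarrow\;
  \Delta r(t) \lesssim \gamma\,C_{W^2}\,\frac{M^2}{M_{\rm Pl}^2}\,\frac{a(t)}{k}
  = \frac{\gamma}{2\pi}\,\frac{C_{W^2}M^2}{M_{\rm Pl}^2}\,\lambda(t).
\end{align}
```

The time integral telescopes into $H(t_i)-H(t)$, which is why waiting longer does not help: the displacement in units of the wavelength is capped by $C_{W^2}M^2/M_{\rm Pl}^2$, which is small whenever $M^2\ll M_{\rm Pl}^2/C_{W^2}$.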
Quasi-inflationary period.-The situation is only slightly more subtle when there is a quasi-inflationary (or late-time acceleration) period for which $|\dot H|\ll H^2$. Consider for example a constant equation of state $w\approx-1$, for which the scale factor takes the form $a(t)\propto t^{2/(3(1+w))}$. The additional contribution to the comoving displacement coming from an inflationary epoch $t_e>t>t_b$ is further suppressed by $\sqrt{1+w}$ relative to the previous estimate, and so on applying the condition (4.7) at the beginning of inflation we are led to the same conclusion (4.22).

Suppression is key
The key as to why causality is not being violated by the superluminal speed here is the smallness ($\dot H/M_{\rm Pl}^2$) of the effect, i.e. the gravitational suppression. To clarify this, let us return to a genuinely acausal example, the 'wrong' sign $P(X)$ model (3.1) with $a=-|a|$ [61], for which the speed of propagation of the scalar about, say, a time-dependent background $\phi(t)$ is given in (3.2).
Although the departure from unity for the speed is small in an EFT sense, its macroscopic secular effect can be arbitrarily large, even on an FLRW background. To illustrate this, let us suppose that this scalar is also the dominant source for the background expansion; the leading-order Raychaudhuri equation is then $\dot H=-\dot\phi^2/(2M_{\rm Pl}^2)+\dots$, and the speed may be expressed in terms of $\dot H$ as in (4.25). Assuming an order-unity Wilson coefficient $a\sim O(1)$, the departure from luminality is larger by a factor of $M_{\rm Pl}^2/M^2$ compared to the previous example. We infer a maximal displacement of the light cone of the order given in (4.26). Whenever $M\ll M_{\rm Pl}$ we can engineer a situation where there is an observable violation of causality with $\Delta r(t)\gg\lambda(t)$, justifying the inherent acausality of the wrong-sign $P(X)$ model. By comparison, in the low-energy EFT for gravity, the corrections to the speed are suppressed by an additional factor of $M^2/M_{\rm Pl}^2$ (or, in the BH case of section 5, by a factor $(M/M_{\rm Pl})^4$) relative to (4.25), which is precisely what makes the displacement unobservable. It is clear from (4.26) that we need at least two powers of $M_{\rm Pl}$ suppression to ensure the unobservability of this effect, justifying the claim made in (3.8).
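For comparison with the gravitational case, the same estimate can be repeated for the wrong-sign $P(X)$ model. The sketch below assumes the illustrative normalization $\mathcal{L}=-\tfrac12(\partial\phi)^2+\tfrac{a}{M^4}(\partial\phi)^4$, for which $c_s^2\simeq1+8|a|\dot\phi^2/M^4$ when $a<0$, and it reuses the momentum bound $k/a(t_i)\lesssim M^2/H(t_i)$ at the initial time. Both the numerical factors and the use of that particular bound are assumptions rather than the content of (4.25) and (4.26).

```latex
\begin{align}
% trade \dot\phi^2 for \dot H using the Raychaudhuri equation
\dot\phi^2 = 2M_{\rm Pl}^2\,|\dot H|
  &\;\Longrightarrow\;
  c_s - 1 \simeq \frac{4|a|\,\dot\phi^2}{M^4}
          = \frac{8|a|\,M_{\rm Pl}^2\,|\dot H|}{M^4}, \\
% same steps as for the gravitational EFT: a(t') >= a(t_i), \dot H < 0
\frac{\Delta r(t)}{a(t)} &= \int_{t_i}^{t}\frac{dt'}{a(t')}\,\big(c_s(t')-1\big)
  < \frac{8|a|\,M_{\rm Pl}^2}{M^4}\,\frac{H(t_i)}{a(t_i)}, \\
% saturating the assumed momentum bound at t_i
\frac{H(t_i)}{a(t_i)} \sim \frac{M^2}{k}
  &\;\Longrightarrow\;
  \Delta r_{\rm max}(t) \sim 8|a|\,\frac{M_{\rm Pl}^2}{M^2}\,\frac{a(t)}{k}
  = \frac{4|a|}{\pi}\,\frac{M_{\rm Pl}^2}{M^2}\,\lambda(t).
\end{align}
```

Here the ceiling on $\Delta r/\lambda$ is of order $M_{\rm Pl}^2/M^2$ rather than a small number, so nothing prevents the displacement from exceeding the wavelength when $M\ll M_{\rm Pl}$.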

Secular effects
To restate the previous results slightly differently, suppose we tried to infer the retarded Green's function describing the response of a field to a source. From the exact equation (4.4), the momentum-space retarded Green's function may be defined by (4.27). Ideally this equation would be solved exactly; however, we only know its form within the context of an EFT expansion. The picture closest to the classical one is where we infer the propagator by means of a WKB approximation, as discussed for example in [69]. The exact retarded propagator is given by (4.28), where $h_k(\eta)$ are the normalized 'positive frequency' solutions of (4.4). If we implicitly resum the secular contribution from the sound speed, these will be built out of modes of the WKB form (4.29).
The resulting propagator will have causal support on the lightcone determined by the speed $c_s$ in the exponent of the exponentials. However, the secular resummation implicit in (4.29) only makes sense if the argument of the exponential in square brackets becomes of order unity or larger; otherwise this effect is clearly a perturbative one. As we have seen previously, provided we demand the EFT bound (4.7), the bound (4.30) holds, and so in the calculation of the Green's function we may always treat the exponential perturbatively, as in (4.31). When computed in this manner, the resulting Green's function will have at any finite order the same lightcone structure as the FLRW background metric, as determined by the leading exponential $e^{i\vec k\cdot\vec x\mp ik\int^t dt'/a(t')}$. We can only justify the resummation of the terms that arise from expanding the exponential if those were the only terms to arise from the EFT expansion. But this is of course not the case: they represent only a subset of contributions, and since their individual contribution remains perturbative we have no reason to expect that, for example, the term of quadratic order in $\left(ik\int^t dt'\,\frac{c_s(t')-1}{a(t')}\right)$ is any larger than other terms that arise at the same order in the EFT expansion.
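The link between the smallness of this exponent and the unresolvability of the lightcone displacement can be made explicit. The sketch below defines $\Delta r(t)=a(t)\int_{t_i}^{t}dt'\,(c_s-1)/a$ as the physical displacement and $\lambda(t)=2\pi a(t)/k$ as the physical wavelength. The lower limit $t_i$ and the symbol $\Delta\theta$ are notational assumptions introduced here.

```latex
\begin{align}
% secular phase accumulated by the departure of c_s from unity
\Delta\theta(t) &\equiv k\int_{t_i}^{t}dt'\,\frac{c_s(t')-1}{a(t')}
   = \frac{k\,\Delta r(t)}{a(t)} = 2\pi\,\frac{\Delta r(t)}{\lambda(t)}, \\
% perturbative treatment of the WKB exponential when \Delta\theta << 1
e^{\,i\vec k\cdot\vec x \mp ik\int_{t_i}^{t}dt'\,c_s(t')/a(t')}
  &= e^{\,i\vec k\cdot\vec x \mp ik\int_{t_i}^{t}dt'/a(t')}
     \Big[\,1 \mp i\,\Delta\theta - \tfrac12\,\Delta\theta^2 + \dots\Big].
\end{align}
```

In this notation, requiring the secular phase to stay small, $\Delta\theta\ll1$, is the same statement as $\Delta r\ll\lambda$.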
The implication of (4.7) is that, since (4.30) is satisfied, (4.27) is best solved as a perturbation series defined by iterating the equation about the zeroth-order propagator $G^0_{\rm ret}$, which satisfies $\left[\partial_\eta^2+m_{\rm eff}^2+k^2\right]G^0_{\rm ret}(\eta,\eta')=\delta(\eta-\eta')$ and has the causal support of the background metric. This is legitimate as long as there is no secular growth in the perturbative expansion, which amounts to the requirement (4.30), which in the EFT of gravity follows from (4.7). We see that the EFT validity condition (4.7) is crucial to understanding how causality is preserved. It is the presence or absence of secular growth in the perturbative expansion that tells us whether or not the sound speed departure from unity is physical.
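A minimal sketch of this iteration is given below. It assumes that (4.27) has the form $[\partial_\eta^2+m_{\rm eff}^2+\tilde c_s^{\,2}k^2]\,G_{\rm ret}=\delta(\eta-\eta')$, which is what the mode equation and the stated equation for $G^0_{\rm ret}$ suggest. The symbol $\ast$ denotes integration over the intermediate conformal time.

```latex
\begin{align}
% zeroth-order problem: luminal propagation on the background
\big[\partial_\eta^2 + m_{\rm eff}^2 + k^2\big]\,G^0_{\rm ret}(\eta,\eta')
   &= \delta(\eta-\eta'), \\
% exact equation recast as an integral equation about G^0
G_{\rm ret}(\eta,\eta') &= G^0_{\rm ret}(\eta,\eta')
   - k^2\!\int d\eta''\,G^0_{\rm ret}(\eta,\eta'')\,
     \big[\tilde c_s^{\,2}(k,\eta'')-1\big]\,G_{\rm ret}(\eta'',\eta'), \\
% iterating generates the perturbation series
G_{\rm ret} &= G^0_{\rm ret}
   - k^2\,G^0_{\rm ret}\ast(\tilde c_s^{\,2}-1)\,G^0_{\rm ret}
   + k^4\,G^0_{\rm ret}\ast(\tilde c_s^{\,2}-1)\,G^0_{\rm ret}
        \ast(\tilde c_s^{\,2}-1)\,G^0_{\rm ret} - \dots
\end{align}
```

Acting with the zeroth-order operator on the second line reproduces the assumed exact equation, and every term of the series is built only from $G^0_{\rm ret}$, so any finite truncation inherits the lightcone of the background metric.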

Speed of Gravity near Black Holes in the EFT of Gravity
As a second class of configurations that spontaneously break Lorentz invariance, we consider $D=4$ BH types of solutions and focus on static and spherically symmetric Ricci-flat vacuum configurations. This situation is not only particularly interesting phenomenologically, it also provides an explicit asymptotically flat example where S-matrix arguments can be applied. In this Ricci-flat vacuum case the $R^2$ operators in the EFT of gravity affect neither the background solution nor the propagation of GWs to first order in curvature corrections. Instead, the leading contributions arise from the dimension-six operators of the form (5.1), where $M$ is the 'naive' cutoff of the EFT. We consider the background solution to be that solution which would be Schwarzschild if the $R^3$ operators were absent, i.e. that of a corrected non-rotating black hole. In Schwarzschild coordinates, the equation of motion for the odd and even polarizations of the GWs $h$ is the same and is governed by an effective metric $Z^{\mu\nu}$ [44], $Z^{\mu\nu}D_\mu D_\nu h+Vh=0$, where the effective metric is expressed as in (5.3) in terms of corrected metric functions. Here $r_g$ is the Schwarzschild radius of the BH solution in GR without the corrections from the EFT. In this EFT, the BH horizon $r_g$ is slightly displaced by an amount proportional to $(MM_{\rm Pl}r_g^2)^{-2}\ll1$, and this displacement is necessarily the same for every species, no matter how they couple to gravity.
We can show that on the background of BH-like solutions, the speed of GWs can be either superluminal or subluminal depending on the signs of the coefficients in the EFT of gravity. Typically when the angular speed is subluminal, the radial speed is superluminal, as is the case if we think of this EFT as arising from integrating out a spin-1/2 field. Generically, the radial speed is given in terms of the coupling constants $d_{9,10}$ as in (5.6) [44], and is superluminal whenever $2d_9+d_{10}<0$. In the past, these types of arguments have been used to constrain EFTs. We emphasize here that this would be the wrong approach. First, the choice $2d_9+d_{10}>0$ would still lead to superluminalities in other configurations (e.g. superluminal angular low-energy speed), and second, as in the case of the EFT of gravity in cosmological settings, the amount of low-energy superluminality is so suppressed that it can never lead to any macroscopic violation of causality, as we explain below.

Validity of EFT in BH spacetime
The discussion of the validity of the EFT in a BH spacetime is closely analogous to that in FLRW in section 4.2. One crucial difference however is that the leading-order background geometry has vanishing Ricci tensor. Thus corrections to the propagation equations will be governed by the Weyl tensor and its covariant derivatives. Given an on-shell mode of momentum $k_\mu$ with $k_\mu k^\mu\approx0$, the naive highest-order-in-$k_\mu$ tensor we can construct that is linear in curvature is $W_{abcd}k^ak^bk^ck^d$, but this vanishes by virtue of the symmetries of the Weyl tensor. Hence the tensor with the highest powers of $k$ is $A_{ab}=W_{acbd}k^ck^d$, and by the symmetries of the Weyl tensor we therefore have $A_{ab}k^b=k^bA_{ba}=A^a{}_a=0$. In a general EFT expansion, we anticipate all scalar local operators to arise suppressed by the cutoff scale. This includes operators that are combinations of $A_{ab}$ contracted with the metric and with itself. Hence, following a reasoning identical to that in section 4.2, the highest possible on-shell momentum is determined by (at least) the EFT requirements (5.8) for integer $n$. In addition, we must impose the more obvious curvature requirement $|W_{abcd}W^{abcd}|\ll M^4$ and related covariant derivative requirements.
Since Schwarzschild is time-translation invariant, on-shell modes are best characterized in terms of their frequency $\omega\sim i\partial_t$. For a transverse wave $k_\mu=(-\omega,0,0,\pm\omega r\sin\theta/\sqrt{1-r_g/r})$ with $k_\mu k^\mu=0$, the tensor $A_{ab}$ and the resulting bound are given in (5.9) and (5.10). Interestingly, due to the symmetry of the spacetime, for a radially travelling wave $k_\mu=(-\omega,\pm\omega/(1-r_g/r),0,0)$ the situation is more subtle. Since for such a radial mode $A_{ab}\propto k_ak_b$, the conditions (5.8) are automatically satisfied. This does not mean that $\omega$ can be arbitrarily large, but rather that we should look more closely at the higher-derivative bounds such as (5.12) and (5.13), as well as $(k^\mu\nabla_\mu)^p(W_{abcd}W^{abcd})\ll M^{4+2p}$ (5.14). These last two conditions are seen to be the strongest ones in the limit $p\to\infty$ and amount to $k^\mu\nabla_\mu\ll M^2$, i.e. to the relation $\omega\ll M^2r$ (5.15). This is significantly stronger than (5.10), except near the black hole horizon, but applies only for modes with a significant radial component, for which $k^\mu\nabla_\mu=\pm\omega\partial_r+\dots$ picks out the radial dependence of the background geometry.
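The two momenta quoted above can be checked to be null on the leading-order Schwarzschild background, and the radial bound follows from a simple scaling estimate. The sketch uses $f(r)=1-r_g/r$ and the inverse metric components $g^{tt}=-1/f$, $g^{rr}=f$, $g^{\varphi\varphi}=1/(r^2\sin^2\theta)$. The estimate $\partial_r\sim1/r$ when acting on background curvature invariants is an assumption appropriate to power-law profiles.

```latex
\begin{align}
% transverse mode k_\mu = (-\omega, 0, 0, \pm\omega r \sin\theta / \sqrt{f})
k_\mu k^\mu &= -\frac{\omega^2}{f}
   + \frac{1}{r^2\sin^2\theta}\,\frac{\omega^2 r^2\sin^2\theta}{f} = 0, \\
% radial mode k_\mu = (-\omega, \pm\omega/f, 0, 0)
k_\mu k^\mu &= -\frac{\omega^2}{f} + f\,\frac{\omega^2}{f^2} = 0,
   \qquad k^r = g^{rr}k_r = \pm\,\omega, \\
% derivative bound for the radial mode (assumed \partial_r ~ 1/r on the background)
k^\mu\nabla_\mu \sim \pm\,\omega\,\partial_r \sim \frac{\omega}{r} \ll M^2
   &\;\Longrightarrow\; \omega \ll M^2\,r .
\end{align}
```

Because $k^r=\pm\omega$ carries no factor of $f$, the radial bound on $\omega$ is controlled by the local radius alone.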

Causality and Time delay
We now focus on the case of a superluminal radial velocity, where $\Delta c_s^2(r)$ given in (5.6) is positive for $r>r_g$ (i.e. $2d_9+d_{10}<0$). As a warm up, let us consider the simple analysis of a radially moving trajectory. If we were to simultaneously send an outgoing radial photon and a GW with wavelength $\lambda$ from a distance $r_0>r_g\gg M_{\rm Pl}^{-1}$ just outside the BH horizon, the GW would arrive at infinity ahead of the photon by an amount $\Delta T_{\rm adv}$. Remaining within the regime of validity of the EFT and being able to trust this answer requires the bound (5.15) to be satisfied at the distance $r_0$, leading to the requirement (5.18), where we have used the requirement $W^2\ll M_{\rm Pl}^4$. Consequently the would-be scattering time advance is smaller than the physical wavelength of this mode and is therefore not resolvable. This result remains true regardless of how close to the horizon the initial wave starts out (so long as we only consider waves of frequencies that are within the regime of validity of the EFT at that point, as dictated by (5.15)). We see in fact that the effect described by (5.18) is even further suppressed relative to the FLRW effect by a factor $r_g/(M_{\rm Pl}^2r_0^3)$, which is consistent with the fact that we are here relying on curvature-cubed (dimension-6) operators rather than curvature-squared (dimension-4) operators.
Moving beyond radial trajectories, let us consider the scattering time-delay induced by the black hole background. By symmetry the retarded propagator may be determined in an angular momentum eigenbasis with angular momentum for which the Eisenbud-Wigner time delay [46,48,70] as shown in Appendix B applied to the metric (5.3) is where r t is the turning point where the denominator vanishes r 2 Z Ω = Z t b 2 . As discussed in Section C.2 on asymptotically-Schwarzschild spacetimes this is logarithmically divergent in D = 4, but this need not concern us as the time delay correction to the usual GR Shapiro time-delay is well defined with f (r) = 1 − r g /r and r t (0) the usual GR turning point. Following the approach given in Appendix D at leading order in the EFT expansion this is where the function δR is defined in (D.7). At leading order in a r g /b expansion is dδR(r) dr = 144r g (d 10 + 2d 9 ) (4b 2 − r 2 ) M 2 M 2 Pl b 2 r 5 + . . . , (5.21) and yields at leading order in r g /b a time delay As expected when d 10 + 2d 9 < 0, for which radial modes are superluminal, this corresponds to a time advance ∆T EFT < 0. However we may follow a reasoning similar to that in (5.18) to see that this time advance remains unresolvable within the EFT. Indeed even if we content ourself with the weaker EFT condition (5.10) we would find for r g /b 1 and it is for this reason that causality is not being violated.

Failure of the Eikonal/Semiclassical Approximation
Since the contribution to the scattering phase shift is essentially $\Delta\delta_{\rm EFT} \sim 2\omega\Delta T_{\rm EFT}$, the condition (5.23) is equivalent to the statement that the EFT contribution to the phase shift remains perturbatively small in the regime of validity of the EFT.
Indeed not only is it smaller than unity but, even if we were to take the extreme limit $b \to \mathcal{O}(r_g)$, it would still be parametrically suppressed by $(M_{\rm Pl} b)^{-2}$. Specifically, terms of order $(\Delta\delta_{\rm EFT})^2$ will be of the same order as other EFT contributions that come in at the order $(M_{\rm Pl} b)^{-4}$. This brings us to the essential point: it is implicit in the eikonal resummation (for example that performed in [23]) that the t-channel exchange ladder diagrams dominate over other Feynman diagrams in the perturbative expansion, so that it is consistent to resum them into the exponentiated form [...] This is a well justified procedure for the GR contributions which give rise to the Shapiro time-delay, and indeed since the Shapiro time-delay is of order $r_g$ times a logarithmic factor, there is no problem engineering a situation for which $\delta_g \gg 1$, or equivalently $\Delta T_g \gg \omega^{-1}$, by having $\omega \gg r_g^{-1}$, which is not in conflict with any consistency requirement of pure GR.
By contrast, since the EFT contributions to the t-channel exchange are perturbatively small, it is not legitimate to resum them while neglecting other Feynman diagrams that can arise at the same order. We see that the situation of the low-energy EFT of gravity on a Schwarzschild background is closely analogous to that in FLRW. Performing an eikonal resummation of the contribution $\delta_{\rm EFT}$ is equivalent to the resummation of the would-be secular terms implicit in (4.29) and (4.31). In both cases this resummation is simply not justified, at least as an indicator of causal support. Here, this shows up in the fact that the EFT correction to the scattering time delay is not resolvable within the approximation used to calculate the time delay, in analogy with the support of the FLRW lightcone.
We stress that our condition for resolvability is not the same as that sometimes required in the literature, namely that the magnitude of the time delay is large in comparison to the naive cutoff, e.g.
While we need to ensure that the scattering state/wave remains within the regime of validity of the EFT throughout its trajectory so as to be able to use the low-energy EFT to determine its time-advance, nothing demands that the time-advance itself should be measured within the low-energy EFT. By itself, the bound (5.26) is therefore irrelevant. Moreover, $1/M$ is not the cutoff for time measurements within the EFT because the time delay is not a Lorentz invariant quantity. This is why we are careful in sections 4.2 and 5.2 to identify the appropriate cutoff based on locally Lorentz invariant combinations. Resolvability here means whether the time advance can be consistently computed in the semiclassical/eikonal approximation, and at its heart the latter assumes frequencies that are large, and wavelengths that are small, in comparison to the corresponding scales of variation of the background quantities.
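The resolvability criterion used in this section can be summarized in a few lines of Python. This is a minimal sketch, not a statement of the precise inequality used in the text: it takes the estimate $\Delta\delta_{\rm EFT} \sim 2\omega\Delta T_{\rm EFT}$ quoted above at face value, works in natural units, and treats "of order unity" as a tunable `threshold`. The function names are hypothetical.

```python
def eft_phase_shift(omega: float, delta_t: float) -> float:
    """Estimate |delta_EFT| ~ 2 * omega * |Delta T_EFT| (natural units, O(1) factors indicative only)."""
    return abs(2.0 * omega * delta_t)


def is_resolvable(omega: float, delta_t: float, threshold: float = 1.0) -> bool:
    """A time delay/advance counts as resolvable only if the phase shift reaches order unity."""
    return eft_phase_shift(omega, delta_t) >= threshold


if __name__ == "__main__":
    # A time advance much shorter than the period of the wave: not resolvable.
    print(is_resolvable(omega=1.0e3, delta_t=-1.0e-6))
    # A time advance spanning many periods: resolvable, hence physically meaningful.
    print(is_resolvable(omega=1.0e3, delta_t=-1.0e-2))
```

The sign of `delta_t` plays no role in resolvability; it only decides whether a resolvable effect is a delay or an advance.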

Low-Energy EFT for QED Below the Electron Mass
It is helpful at this point to compare the above discussion with the classic case of superluminal speeds in a low energy EFT, that of QED in curved spacetime first pointed out by Drummond and Hathrell [33] and extended in [34]. The result of [33] is particularly clean in that it does not require gravity to be dynamical, i.e. it would be obtained in a decoupling limit M Pl → ∞ for fixed background curvature. Furthermore on the same background, different polarizations of light can be shown to have low energy speeds which are both superluminal and subluminal. This gives rise to gravitational birefringence [33] and from higher order operators gravitationally induced dispersion of light [34]. The causal implications of this result for the photon have been discussed extensively in the literature [35][36][37][38][39][40][41][42].
For our present purposes it is sufficient to note the leading effect from an effective action of the form [...] which is the relevant part of the low energy effective action for QED on a curved spacetime, where $M$ is the electron mass. The operator $RFF$ leads to polarization-dependent corrections to the low-energy sound speed. For a transverse-travelling wave (with momentum in the angular direction) on a Schwarzschild background, the low-energy sound speed is of the form [33] [...] where $\beta_P$ is an order unity polarization dependent constant. Specifically, for radially polarized light $\beta_P > 0$, while $\beta_P < 0$ for angular polarization. Interestingly, were (6.1) the exact Lagrangian, the equations for the electromagnetic field would remain second order, and this speed would be the group, phase and front velocity. In practice this is not the case due to dispersion from higher order operators in the EFT that are not included [34]. At this level, the low-energy speed (6.2) includes no frequency dependence (none of the higher order terms $\mathcal{O}\big(r_g^2/(M^4 r^6)\big)$ in (6.2) would include any frequency dependence), yet higher-order operators that have not been included in the low-energy EFT (6.1) would affect the speed at high energy, and (6.2) can certainly not be the speed of light at arbitrarily high energy.

Unresolvable Time Advance
As noted in [33], the propagation of a photon can be understood in terms of evolution in an effective metric of the form [...] with the standard $f(r) = 1 - r_g/r$. Using the results of appendix B for the effective metric (6.3), the naive expression for the Eisenbud-Wigner scattering time delay is [...] where $r_t(\beta_P)$ is the turning point as computed in the EFT with parameter $\beta_P$. Again in $D = 4$ this is logarithmically divergent due to the slow Coulomb $1/r$ fall-off in four dimensions, but what is relevant is the extra time delay relative to scattering in Schwarzschild with $\beta_P = 0$, i.e. [...] (6.5) which is a finite expression. As shown in appendix D, using (D.8), to first order in $\beta_P$ this is given by [...] For large impact parameter in comparison to the Schwarzschild radius, $b \gg r_g$ (see footnote 12), we have to leading order in an expansion in $r_g/b$ [...] in the large $\ell$ limit and to leading order in $r_g/b$. As expected, this is a time advance for $\beta_P > 0$.

12 We note that the integral on the right hand side of (6.5) does become arbitrarily large as $r_t(0)$ approaches $3r_g/2$, which is when $b \to 3\sqrt{3}\, r_g/2$. However this is purely an artifact of the fact that this is the peak of the effective potential (B.19), and is the point at which we may easily transmit across the potential barrier. As such, the boundary conditions used for the solutions to derive (6.5) are not appropriate.
At the point of closest approach, the wave is purely transverse and the bound (5.10) applies. Again to leading order in $r_g/b$, we infer that for the EFT to remain valid at the impact parameter, the frequency should satisfy [...] Putting this together, the time advance is bounded by [...] from which we again conclude that, while the low-energy superluminality pointed out by Drummond and Hathrell in the context of the EFT for QED below the electron mass on a curved Schwarzschild background does technically lead to a time-advance $\Delta T < 0$, this advance is not resolvable. In fact, since in this example we know the UV completion, we can calculate the phase shift exactly, and as shown in [54] in the shockwave limit the contribution to the phase shift is always small, ensuring the time advance is indeed unresolvable. In addition it is shown in [54] that in the limit of high frequencies $\lim_{\omega\to\infty} \Delta T_{\rm EFT}(\omega) = 0$. This confirms the underlying Lorentz invariant causality of the UV completion, but is secondary to the resolvability criterion in understanding how the low-energy EFT is consistent with causality.

Case of a Resolvable Physical Time Advance
In the previous sections we have seen how well known examples such as QED in curved spacetime and the EFT of gravity lead to time advances that are nevertheless unresolvable and hence not in tension with causality. To demonstrate that the time-delay analysis is not without content, we consider here an example of the opposite case, an EFT which does lead to a resolvable time advance and can therefore be concluded to be in tension with causality. For this we need to put ourselves in the situation where the bound (3.5) is violated in the decoupling limit for which it is sufficient to return to our canonical example of a P (X) scalar field minimally coupled as in (3.6) with the negative parameter a = −|a|.
In order to parallel the previous Schwarzschild discussion we consider sourcing the scalar field in the manner [...] where $T$ is the trace of the stress-energy tensor of all other matter fields present in that spacetime and $\beta$ is a dimensionless coefficient which may be taken parametrically larger than unity. In particular we consider a situation similar to that of the previous sections, with a Schwarzschild geometry generated by a mass $M_*$ located at $r = 0$, with Schwarzschild radius $r_g = 2M_*/M_{\rm Pl}^2$, so that $T$ itself is a delta function source. Ignoring for now the backreaction of the scalar field onto the geometry (i.e. working in the field theory on curved spacetime limit), the background profile $\phi_0(r)$ for the scalar field is determined by the solution of (see footnote 13) [...] so that in the weak field regime, [...] To remain within the safe region of the EFT we require $|X| \ll M^4$ and so [...] Considering fluctuations about this background, $\phi = \phi_0(r) + \delta\phi(t, r, \varphi)$, the angular speed of $\delta\phi$-waves remains luminal, while the radial speed is not only superluminal for $a < 0$, but its departure from luminality is enhanced by a factor of $(M_{\rm Pl}^2/M^2)(r_g/r)$ as compared with that in QED (6.2) (and by a factor of $(M_{\rm Pl}^2/M^2)(M_{\rm Pl}^2 r_g^2)(r/r_g) \gg 1$ as compared to the EFT of gravity (5.6)). More precisely, the effective metric for the fluctuations of the scalar field $\delta\phi$ to leading order in the EFT expansion is of the form [...] Once again following the approach of appendix D, the leading EFT correction to the time delay relative to Schwarzschild is [...] where here [...] This simplifies considerably in the limit of large impact parameter $b \gg r_g$ to be $\frac{d\delta R(r)}{dr} = \frac{a\beta^2 M_{\rm Pl}^2 r_g^2 (3b^2 - r^2)}{4 b^2 M^4 r^4}$. (7.9)

13 We stress it is not important that there is a global solution of this equation for all $r$, as we are working with a truncated EFT and the solution can only be trusted in the regime given typically by (7.4).
which may easily be engineered to be $\Delta\delta_{{\rm EFT},P(X)} \gg 1$, by taking $M_*^2/M^2 \gg M^2 b^2$ while maintaining $b \gg M^{-1}$ to ensure (7.4). For such a phase shift, the implicit summation that goes into the semiclassical/eikonal approximation is justified, as the ladder diagrams dominate over other contributions at each order in loops. This situation is the opposite of that of the EFT of gravity and that of QED below the electron mass and, as discussed in section 4.4, is due to the lack of suppression of the speed correction, a consequence of the fact that (3.5) is violated.

Discussion
To summarize, when dealing with a gravitational effective theory, superluminal low energy speeds (with respect to the metric out of which it is constructed) are not only possible, but are sometimes demanded by underlying causality and analyticity criteria. Indeed this has been well known since the work of [33,34] for QED in curved spacetime [35][36][37][38][39][40][41][42], and more recently noted in [43,44] for GWs. Relativistic causality nevertheless remains intact because the causal support of the retarded propagator vanishes outside of the lightcone of the metric, defined in the field frame with a well-defined decoupling limit. This apparent contradiction is resolved in the examples of FLRW and Schwarzschild we have discussed by identifying the regime of validity of the effective theory, and asking, given the low energy form of the retarded propagator, whether it is possible to influence events outside of the metric lightcone. When the propagator is computed perturbatively, this can only happen when there are secular terms in the perturbative expansion which need to be resummed, and it is this resummation that extends the support of the lightcone. This is exactly what happens in known pathological cases where relativistic causality is violated [6,61]. Here we have shown that in the case of EFT corrections discussed in [43,44], the condition for the validity of the EFT automatically precludes any secular behaviour and hence relativistic causality is left intact, despite explicitly superluminal low energy speeds.
With this in mind, faced with a given gravitational effective theory, how should we apply causality requirements in the absence of clean S-matrix requirements? As stated in the introduction many works in the literature simply demand that around a given background, the speed of propagation of all modes is (sub)luminal relative to the metric out of which the theory is constructed. This is a demonstrably false criterion as it is not invariant under field redefinitions as the example in section 3.3 clearly demonstrates. As noted above, the issue about field frame dependence can be partially resolved by working in a frame in which it is possible to consistently take a decoupling limit M Pl → ∞ for which gravitational effects decouple (see section 3.2 for clarifications on what is meant by a decoupling limit). It is then consistent to demand that the resulting Minkowski spacetime field theory respects all of the standard causality/analyticity/positivity requirements. In particular this leads to the condition (3.5). Once this is done, this implies that all effects that lead to mild superluminalities are in this frame M Pl suppressed effects.
The next question is how do we impose causality away from the decoupling limit, at finite $M_{\rm Pl}$, or in situations in which there is no clean decoupling limit? Once again, simply demanding $c_s \le 1$ for all perturbations around a given background is an incorrect criterion, as the known examples illustrate [33,34,43,44]. Ideally we may just appeal to the full UV theory to demonstrate that the exact retarded propagator is causal [35][36][37][38][39][40][41][42]; however we rarely have that luxury, and furthermore in time-dependent spacetimes such as FLRW such analyticity methods are not applicable. We can however cleanly answer this question within the low energy effective theory itself, provided that we appropriately identify its regime of validity. Indeed causality resolution should lie entirely within the purview of the low energy EFT, as by construction it gives the large distance macroscopic description of the theory, and this is where any causality violation will become apparent. If the superluminal speed were physical, it would lead to a macroscopic consequence, specifically an enlarged support for the retarded propagator. Since at zeroth order in the EFT expansion we assume that the retarded propagator has the support implied by the usual metric lightcone, and since we can structure the calculation of the corrected retarded propagator as a perturbative expansion around this, the telltale sign of a modified causal structure is secular behaviour in the perturbative expansion that needs to be resummed. We have shown explicitly that in the cases considered in [43,44] this secular behaviour is absent provided we restrict ourselves to effects which can be consistently calculated in the regime of validity of the EFT in question. Concretely, in FLRW validity of the EFT imposes a cutoff in on-shell momenta of the form [...] This cutoff is parametrically higher than the invariant cutoff $M$ due to the underlying assumption of Lorentz invariance of the UV completion (see footnote 14).
The entire discussion for cosmological spacetimes parallels exactly the more straightforward case of scattering in asymptotically flat/Schwarzschild geometries. There we may use the positivity of the Eisenbud-Wigner scattering time delay as a clean criterion for causality. However any computed time advance from EFT corrections can only be interpreted as a macroscopic causality violation if it is resolvable, meaning if $|\Delta T_{\rm EFT}| \gtrsim \omega^{-1}$. In all the known cases of time advances arising from consistent UV completions, the time advance is not resolvable and its associated contribution to the scattering phase shift is less than unity, $|\Delta\delta_{\rm EFT}| \lesssim 1$ and $|\Delta T_{\rm EFT}| \lesssim \omega^{-1}$. Hence the eikonal/semiclassical resummation, which is the parallel of the secular resummation of the retarded propagator, is not justified as an indicator of causal properties. Explicit UV completions show that the high energy behaviour of the corrections to the phase shift is very different, and generically we expect $\lim_{\omega\to\infty} \Delta T_{\rm EFT}(\omega) \to 0$, equivalent to the expectation that in an underlying Lorentz invariant theory $\lim_{\omega\to\infty} c_s(\omega) \to 1$. We stress again however that we are not using these expectations to resolve the naive causality issue with the low energy EFT.
In using causality to constrain EFTs, then, the issue in the end comes down to not just the sign of Wilson coefficients in an effective action, but crucially their size. If a given operator induces a large superluminal effect that is either (a) nonzero in the decoupling limit, or (b) leads to secular behaviour within the regime of validity, then we can safely conclude that this would violate the traditional requirements of relativistic causality, as in the case of the 'wrong sign' $P(X)$ model. However, if a given operator leads to a small superluminal speed which is insufficient to give rise to any secular growth in the regime of validity of the EFT, then it is safe to conclude that this is allowed by the requirements of relativistic causality. Although we have not emphasized this here, there is a strong interplay between these requirements and the application of positivity bounds in a gravitational setting, as discussed in [43]. This weaker requirement on the 'signs' of EFT coefficients likely connects with weaker requirements for positivity bounds in the presence of a massless graviton.

A.1 Graviton Dispersion Relation
The leading modification to the dispersion relation of a gravitational wave in a curved background can be determined by first determining the effect in Minkowski spacetime. That is because diffeomorphism invariance may be used to covariantize the Minkowski answer and the ambiguities in this procedure turn out to be subleading corrections. With this in mind let us first determine the form of the graviton propagator or two point function in Minkowski spacetime. At the linear level this may be described by a Lorentz tensor $h_{\mu\nu}$ which transforms under linear diffeomorphisms as $h_{\mu\nu} \to h_{\mu\nu} + \partial_\mu \xi_\nu + \partial_\nu \xi_\mu$. Although nonlinearly we cannot construct local gauge invariants for gravity, at the linear level we can, by means of introducing a conserved external source $T_{\mu\nu}$ for which $\partial^\mu T_{\mu\nu} = 0$, and considering the '$TT$ amplitude' that describes the interaction between two sources (equivalently the free field connected generating function) [...] Following standard arguments, causality, unitarity and Lorentz invariance fix the form of this amplitude to be [...] where formally [...] where the polarization tensors are [...] In writing this expression we assume that this is the true propagator and so is independent of UV cutoffs or RG sliding scales used in computing loops. The first term in (A.3) is the usual massless graviton pole, which by diffeomorphism invariance must arise uncorrected other than by a wavefunction renormalization (see footnote 15).

A.2 Renormalization group without renormalization
In practice however (A.3) is only valid if the integrals $\int_{\mu_0}^{\infty} d\mu\, \rho_2(\mu)\mu^{-1}$ and $\int_{\mu_0}^{\infty} d\mu\, \rho_0(\mu)\mu^{-1}$ converge for some finite $\mu_0 > 0$, i.e. provided they converge in the UV. In general they do not, and their divergence is directly related to the renormalization of the curvature squared terms in the effective action. As such, on dimensional grounds we expect them to be logarithmically divergent, which is borne out by explicit calculation at one-loop [43].
To deal with this we perform one subtraction defined at an arbitrary scale $\mu_0$ to give [...] Since the propagator cannot depend on the arbitrary subtraction scale $\mu_0$, we obtain the dispersion relation analogue of the renormalization group equations [...] the right hand side being finite if our assumption about the overall number of subtractions was correct. We stress though that $\mu_0$ should not be confused with any UV cutoff or sliding scale used in computing loops, although it clearly plays a similar role. The integration constants that arise in the solution of these equations are the undetermined subtraction constants. It is natural to define the IR values of these constants at $\mu_0 = M_{\rm IR}^2$, and the UV values at some high energy scale $\mu_0 = M_{\rm UV}^2$. Integrating the RG equation we have [...] (A.10) and so we clearly have by unitarity [...] Thus the dispersion relation demands positivity of the flow from the UV to the IR. It does not however guarantee that $C_S^{\rm IR} > 0$.

A.3 Low energy effective field theory
Let us now make an assumption similar to that described in the previous section, that the dominant contribution to the spectral densities comes from energies for which $\mu \ge M^2$, where $M$ is viewed as the cutoff of the low energy effective theory. In other words we assume that there is a weakly coupled low energy effective theory for which the loop contributions from light fields are small relative to the effects from heavy fields whose masses satisfy $M_I \ge M$. This is quite natural here for situations in which the number of heavy fields with masses greater than $M$ is much larger than the number of light fields, precisely because each field at one-loop contributes logarithmically to $C_S$. In this case the propagator may be split up as [...] where the IR part comes exclusively from loops of light fields of masses smaller than $M$, and is thus parametrically suppressed if the low energy effective theory is weakly coupled, and the remaining part is that which will essentially be described by the tree level low energy effective theory [...] (A.14) The IR part $G^{\rm IR}_{\mu\nu\alpha\beta}(k)$ is generically non-local since it includes loops of light fields, in particular those of the graviton itself. For example if $\rho_{0,2}(\mu)$ are approximately constant over the range $0 \le \mu < M^2$ then we have approximately [...] By contrast $G^{\rm EFT}_{\mu\nu\alpha\beta}(k)$ is local when viewed at energies $|k^2| \ll M^2$. In other words we may perform a standard EFT expansion in the form [...] Furthermore taking $\mu_0 = M_{\rm IR}^2 \approx 0$, then in the region $M_{\rm IR}^2 \ll |k^2| \ll M^2$ we may approximate this as [...] This is the standard form of a tree level effective theory description of the propagator as a local derivative expansion, and is the direct analogue of (2.7).

A.4 1PI and Wilson Effective action
In performing the decomposition (A.12), we are implicitly assuming that the light loops are computed through a unitarity cut method, consistent with the dispersion relation. In short, rather than computing the loop process through standard means, we compute its imaginary part, and then infer its remaining contribution through its dispersion relation. With this proviso, we then recognize that G EFT µναβ (k) will be the two point function computed from the Wilsonian effective action valid below the scales |k 2 | < M 2 , and G µναβ (k) = G EFT µναβ (k) + G IR µναβ (k) will be the result of the 1PI effective action. As always this split is arbitrary, here depending on the subtraction scale µ 0 which hence we take as some IR scale µ 0 = M 2 IR .
Assuming we only compute loops of heavy fields (not the graviton itself), then the 1PI effective action and Wilsonian effective actions will be diffeomorphism invariant. We may thus write a local covariant action which can reproduce the propagator $G^{\rm EFT}_{\mu\nu\alpha\beta}(k)$, which is found to be [...] The higher derivative terms are relevant for BH solutions, but these cannot be inferred from our above argument and must be computed explicitly as done in [43,44]. However they do have the virtue of being prescription independent. We can also add to this action a Gauss-Bonnet term which cannot be inferred from our calculation. Indeed up to a Gauss-Bonnet term (A.19) may equivalently be written as [...] where [...] The virtue of writing this covariantly is that, assuming that matter remains minimally coupled to this metric, we may infer the effect of these additional corrections on other backgrounds such as FLRW. The positivity of the flow (A.11) then implies [...] The covariant form of the 1PI effective action, being non-local, is more complicated (see [73][74][75][76][77][78][79][80][81]), but it is sufficient to note that the following covariant expression reproduces the desired $TT$ amplitude and is consistent with the Wilsonian effective action [...] The last term is a non-local extension of the Gauss-Bonnet term whose coefficient cannot be inferred from the arguments made so far. In particular we cannot assume that $\rho_{\rm GB}(\mu) > 0$. As a special case, a non-local action of this form is used for example in [82] for the specific form of $\rho_2(\mu)$ and $\rho_1(\mu)$ that arises from one-loop integrals of massive and massless states. We stress again that despite appearances (A.22) is independent of the arbitrary subtraction scale $\mu_0$.

B Semiclassical Phase Shift and Time delay

B.1 Langer approach
The scattering phase shift δ in the semiclassical (WKB) approximation for scattering in a spherically symmetric background is easily computed and we sketch the essential result here. We will assume that the equation of motion of the propagating degrees of freedom may be put in the form of a scalar field living on an effective metric Z µν as is the case in all the discussed examples. Although generically this equation will have an effective mass, this mass term makes a negligible contribution to the scattering phase shift for high frequencies, consequently it is sufficient to consider a massless scalar for the purposes of our discussion.
To accommodate the behaviour of modes with small $\ell$ we follow the approach of Langer [83]. The fluctuations of the effective scalar in a $D$-dimensional spherically symmetric background expressed in the form [...] can be decomposed in terms of the $D$-dimensional generalization of spherical harmonics, i.e. eigenstates of $\nabla^2_{D-2} = -\ell(\ell + (D-3))$, and satisfy a wave equation for a given $\ell$ [...] We will initially assume that $Z_{\mu\nu}$ has no singularity, and no horizon. In this case a naive application of the WKB approximation to the equation in this form will result in an expression that does poorly for low $\ell$, although it gives the correct classical phase shift at large $\ell$. The origin of this problem, as first noted by Langer [83], is that the scattering problem here is defined on the line $r \ge 0$, and the behaviour of the solutions near $r = 0$ is not well approximated by WKB. This problem is easily resolved by performing a coordinate transformation $r = e^{\rho}$ which maps the origin at $r = 0$ to $\rho = -\infty$. The correct asymptotic solution near $\rho = -\infty$ is now the exponentially decaying WKB solution.
To proceed we change variables and then define $\phi = e^{-(D-3)\rho/2}(Z_r)^{-1/2}\chi$, which puts the equation in the canonical form [...] In the usual case for which $D = 4$ and $Z_r = Z_t = Z_\Omega = 1$ this gives [...] As pointed out by Langer, this corresponds to using the standard WKB formula with the replacement $\ell(\ell+1) \to (\ell + 1/2)^2$, which is relevant at low $\ell$.
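As a worked illustration of the Langer replacement in the simplest setting ($D = 4$ with all $Z_I = 1$), the steps can be written out explicitly. This is a sketch under stated assumptions rather than the text's own derivation: the auxiliary function $u = r\phi$ is introduced here for convenience, and the bracket in the last line is assumed to be what plays the role of $W(\rho)$ in the canonical form.

```latex
% Flat-space radial equation for u(r) = r\,\phi(r) in D = 4:
\begin{align}
  \frac{d^2 u}{dr^2} + \left[\omega^2 - \frac{\ell(\ell+1)}{r^2}\right] u &= 0 , \\
% Langer map r = e^{\rho}, so that d/dr = e^{-\rho}\, d/d\rho:
  \frac{d^2 u}{d\rho^2} - \frac{du}{d\rho}
    + \left[\omega^2 e^{2\rho} - \ell(\ell+1)\right] u &= 0 , \\
% Rescaling u = e^{\rho/2}\chi (equivalently \phi = e^{-\rho/2}\chi) removes the
% first-derivative term and shifts the centrifugal constant by 1/4:
  \frac{d^2 \chi}{d\rho^2}
    + \left[\omega^2 e^{2\rho} - \left(\ell + \tfrac{1}{2}\right)^2\right] \chi &= 0 .
\end{align}
```

The extra $1/4$ generated by the rescaling combines with $\ell(\ell+1)$ into $(\ell + 1/2)^2$, which is the replacement quoted above, and the bracket changes sign at $\omega e^{\rho} = \ell + 1/2$, i.e. at $r = b$.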
The turning point, i.e. what is interpreted classically as the point of closest approach, is defined by $W(\rho_t) = 0$. For $\rho < \rho_t$ we have $W < 0$ and the desired WKB solution is the one that decays exponentially as $\rho \to -\infty$, $\chi \approx$ [...] up to a normalization constant. Using the WKB matching formula this matches onto [...] for $\rho > \rho_t$, where $r_t$ is the turning point expressed in terms of $r$ and [...] In the idealized case in which all components $Z_I$ asymptote to unity faster than $1/r$, the scattering phase shifts are determined by requiring that this solution has the asymptotic form $\chi \propto e^{2i\delta_\ell} e^{i\omega r} + e^{i\pi(D-2)/2} e^{i\pi\ell} e^{-i\omega r}$. (B.9) Performing the comparison we obtain the standard WKB formula for the partial wave phase shifts [...] The total Eisenbud-Wigner time delay for each partial wave is then given by [...] This simplifies in the limit of large $\ell$ at fixed apparent impact parameter $b = (\ell + (D-3)/2)/\omega$, so that $\beta_R(r)$ may be neglected, and we find [...] which is the 'classical' time-delay result for a particle moving along a null geodesic in the metric $Z_{\mu\nu}$.

B.2 Dealing with a Horizon
To deal with spacetimes with a horizon, it is necessary to modify the Langer transformation and, instead of taking $r = 0$ to $-\infty$, we map the horizon to $-\infty$. This is achieved by defining an analogue of the tortoise coordinates for which the two dimensional $r, t$ metric is conformally flat, i.e. for which [...] In deriving this we assume that $r_t$ is sufficiently larger than the location of the peak of the potential $V_{\rm eff}(r)$, which for example in $D = 4$ occurs at approximately $r = 3r_g/2$ for large $\ell$, so that there is sufficient barrier that (B.6) is approximately the solution inside the potential barrier. Clearly as the turning point approaches the top of the barrier we must take appropriate consideration of the absorbed waves [84,85]. Translated back into the original coordinates this is [...] which gives a time delay [...] This expression is to be compared with (B.12). The two differ only in the sub-leading semiclassical contribution, which accounts for the different boundary conditions describing the two different physical situations.

C Eikonal as a limit of Semiclassical Phase Shifts
Since many discussions of causality in effective field theories are phrased in the eikonal or shockwave (Penrose limit) approximation (see for example Refs. [23,27,86]), it is worth showing here that the eikonal approximation can be obtained straightforwardly as a limiting case of the semiclassical approximation, and hence the latter may be regarded as more general. We begin with the phase shift relevant to the 2D conformally flat coordinates $\hat r$ (B.21). We split the potential in the form of the usual centrifugal potential plus corrections, $V_{\rm eff}(r) = \frac{b^2\omega^2}{r^2} + U_{\rm eff}(r)$, (C.1) with $b = (\ell + (D-3)/2)\,\omega^{-1}$. The eikonal approximation corresponds to assuming a high energy limit $\omega^2 \gg U_{\rm eff}(r)$ so that we may treat $U_{\rm eff}(r)$ perturbatively. In the present relativistic context this limit is more subtle than it is in non-relativistic quantum mechanics since $U_{\rm eff}(r)$ itself scales with $\omega^2$. Nevertheless its different radial dependence ensures there is always a regime in which we may imagine $\omega^2 \gg U_{\rm eff}(r)$. Naively we can just perturb the square root in (B.21); however this becomes problematic since the point of closest approach $\hat r_t$ is itself dependent on the potential $U_{\rm eff}(r)$ and a naive expansion will lead to an ill-defined expression. The solution is to use the relation between $b$ and $\hat r_t$ [...] then perturbing around Minkowski we have [...] $\int d\tau\, (\delta Z_{\mu\nu} p^\mu p^\nu) = \frac{1}{2}\int_b^\infty dr\, \frac{d\tau}{dr}\, (\delta Z_{\mu\nu} p^\mu p^\nu)$, (C.10) where the momentum $p^\mu$ and velocity are their solutions in Minkowski spacetime written in radial coordinates, i.e. $\frac{dr}{d\tau} = \omega\sqrt{1 - b^2/r^2}$, $p^t = \omega$, $p^r = \omega\sqrt{1 - b^2/r^2}$, $p^\theta = \omega b/r^2$, which for a spherically symmetric background in 2D conformally flat coordinates [...] which is consistent with (C.8) given $U_{\rm eff}(r) = \frac{b^2\omega^2}{r^2}\big(\delta\hat Z_t(r) - \delta\hat Z_\Omega(r)\big)$ for large $\ell$ from (B.18).
[...] (C.29) Unlike the result in dimensions $D > 4$ (C.23), this corresponds to a time advance $\Delta T_\ell = -r_g\left[2\ln(\ell + 1/2) + 1 + \frac{1}{(\ell + 1/2)^2}\right]$. (C.30) We should not interpret this as any violation of causality though. Firstly, even in non-relativistic quantum mechanics time advances do occur, as is well known in the example of scattering from a hard sphere where, for a sphere of radius $a$, the scattered wave of speed $v$ is reflected at a time $a/v$ before a free wave would reach the center at $r = 0$ and so is reduced in travel time by $2a/v$. For this reason, Wigner's original causality condition for S-wave scattering is stated as [46,50] $\Delta T_{\ell=0} \ge -2a/v$, (C.31) up to fluctuations at the scale $\omega^{-1}$. The implication is that, unlike in higher dimensions, the black hole acts like scattering off of a hard sphere with an $\ell$-dependent radius of order $r_g$. However this is also at the scale of the ambiguity in the definition of the phase due to the divergence from the Coulombic behaviour, i.e. the logarithmic term included in (C.27), and so we should be careful not to read too much into this. Indeed in defining the tortoise coordinates analogous to (C.14) we are faced with a logarithmic divergence which must be cut off at a scale $C$, $\hat r = r - \int_C^r dr\, \frac{1}{1 - r_g/r} = r - r_g \ln\big((r - r_g)/(C - r_g)\big)$. (C.32) In pure Schwarzschild, $C$ is usually fixed to be $2r_g$, but in a spacetime which is only asymptotically Schwarzschild there is no requirement for this particular choice. The inherent logarithmic ambiguity in the $\hat r$ coordinate translates into the same ambiguity in the phase shift and hence time delay. As such we will content ourselves with determining time delays relative to the asymptotic Schwarzschild geometry, for which the ambiguity related to the definition of the phase shift cancels out. This corresponds to the classical criterion [53]. It is noteworthy that these issues are entirely avoided in higher dimensions.
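The hard-sphere example quoted above is easy to check numerically. The sketch below assumes non-relativistic kinematics with $\hbar = 1$, $E = k^2/2m$ and $v = k/m$, uses the textbook S-wave phase shift $\delta_0 = -ka$ for a hard sphere of radius $a$, and takes the Eisenbud-Wigner time delay as $\Delta T = 2\, d\delta/dE$. The function names and the finite-difference step are hypothetical choices made for illustration.

```python
def hard_sphere_s_wave_phase(k: float, a: float) -> float:
    """Textbook S-wave phase shift for a hard sphere of radius a: delta_0 = -k a."""
    return -k * a


def wigner_time_delay(k: float, a: float, m: float = 1.0, dk: float = 1.0e-6) -> float:
    """Eisenbud-Wigner delay 2 d(delta)/dE, with E = k^2 / (2 m), via a central difference."""
    d_delta = hard_sphere_s_wave_phase(k + dk, a) - hard_sphere_s_wave_phase(k - dk, a)
    d_energy = ((k + dk) ** 2 - (k - dk) ** 2) / (2.0 * m)
    return 2.0 * d_delta / d_energy


if __name__ == "__main__":
    k, a, m = 2.0, 0.5, 1.0
    v = k / m
    print("numerical delay :", wigner_time_delay(k, a, m))
    print("bound  -2a/v    :", -2.0 * a / v)
```

The hard sphere saturates the bound (C.31): the computed delay equals $-2a/v$, a time advance that reflects the wave never reaching the center rather than any superluminal propagation.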
Another way to understand these subtleties is to introduce an IR regulator through an effective mass $\mu$ by replacing the Coulombic $r_g/r$ with a Yukawa form $r_g e^{-\mu r}/r$. This renders the naive phase shift definitions (B.10) and (B.22) finite, and gives a time delay logarithmically sensitive to the IR cutoff. This divergence cancels when considering time delay differences, as we do throughout, allowing us to remove the regulator by taking the limit $\mu \to 0$.
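To illustrate how the logarithmic sensitivity to the regulator drops out of differences, consider the following sketch. It uses a straight-line integral of the Yukawa profile at impact parameter $b$ as a stand-in for the regulated quantity; the symbol $I(b;\mu)$ and the choice of a straight-line path are assumptions introduced for this example, not the definition of the phase shift used in the text.

```latex
I(b;\mu) \equiv \int_{-\infty}^{\infty} dz\,
   \frac{r_g\, e^{-\mu\sqrt{b^2+z^2}}}{\sqrt{b^2+z^2}}
 = 2\, r_g\, K_0(\mu b)
 \simeq -2\, r_g\left[\ln\frac{\mu b}{2} + \gamma_E\right]
 \qquad (\mu b \ll 1)\,,
\\[4pt]
I(b_1;\mu) - I(b_2;\mu)\;\xrightarrow{\;\mu \to 0\;}\; 2\, r_g \ln\frac{b_2}{b_1}\,.
```

Each term diverges like $\ln\mu$ as the regulator is removed, while the difference between two impact parameters is finite and independent of $\mu$.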
Alternatively, from a classical point of view we can regulate the divergence by asking for the time delay for a trajectory that begins and ends at finite radii $r_b$ and $r_e$ respectively, relative to the same result in Minkowski spacetime with impact parameter $b$ […] Evaluating this for a Schwarzschild geometry we have the well known Shapiro time-delay written in terms of coordinate time $t$ [49], $\Delta T_g = $ […] with $f(r) = 1 - r_g/r$, which to first order in $r_g/r_t$ is$^{17}$
$$\Delta T_g = r_g \ln\left(\frac{r_e + \sqrt{r_e^2 - r_t^2}}{r_t}\right) + r_g \ln\left(\frac{r_b + \sqrt{r_b^2 - r_t^2}}{r_t}\right) + \frac{r_g}{2}\sqrt{\frac{r_e - r_t}{r_e + r_t}} + \frac{r_g}{2}\sqrt{\frac{r_b - r_t}{r_b + r_t}} + (r_e^2 - r_t^2)^{1/2} + (r_b^2 - r_t^2)^{1/2} - (r_e^2 - b^2)^{1/2} - (r_b^2 - b^2)^{1/2} + O(r_g^2/r_t)\,. \tag{C.35}$$
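The first-order Shapiro expression can be checked numerically. The sketch below assumes the standard coordinate-time integral for a null ray in Schwarzschild, $t(r, r_t) = \int_{r_t}^{r} dr' \big/ \big[f(r')\sqrt{1 - (r_t^2/r'^2)\, f(r')/f(r_t)}\big]$ with $f = 1 - r_g/r$, and the relation $b = r_t/\sqrt{f(r_t)}$ between impact parameter and turning point. The function names, the substitution $r = r_t\cosh u$ and the Simpson integrator are choices made for this illustration only.

```python
import math


def simpson(g, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of sub-intervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0


def coordinate_time(r_end, r_t, r_g):
    """Coordinate time for a null ray between the turning point r_t and r_end.

    Assumes the standard Schwarzschild integral with f = 1 - r_g/r. The
    substitution r = r_t*cosh(u) removes the inverse-square-root singularity
    at the turning point, leaving a smooth integrand in u.
    """
    f_t = 1.0 - r_g / r_t

    def integrand(u):
        r = r_t * math.cosh(u)
        # Exact factorisation:
        # 1 - (r_t^2/r^2) f(r)/f(r_t) = (1 - r_t^2/r^2) * corr
        corr = 1.0 - r_g * r_t / (r * (r + r_t) * f_t)
        return r / ((1.0 - r_g / r) * math.sqrt(corr))

    return simpson(integrand, 0.0, math.acosh(r_end / r_t))


def first_order_time(r_end, r_t, r_g):
    """Expansion of coordinate_time to first order in r_g."""
    root = math.sqrt(r_end**2 - r_t**2)
    return (root
            + r_g * math.log((r_end + root) / r_t)
            + 0.5 * r_g * math.sqrt((r_end - r_t) / (r_end + r_t)))


def shapiro_delay(r_b, r_e, r_t, r_g):
    """Delay relative to a Minkowski ray with the same impact parameter b."""
    b = r_t / math.sqrt(1.0 - r_g / r_t)  # assumed turning-point relation
    curved = coordinate_time(r_b, r_t, r_g) + coordinate_time(r_e, r_t, r_g)
    flat = math.sqrt(r_b**2 - b**2) + math.sqrt(r_e**2 - b**2)
    return curved - flat


if __name__ == "__main__":
    r_g, r_t, r_e = 0.01, 10.0, 1000.0
    exact = coordinate_time(r_e, r_t, r_g)
    approx = first_order_time(r_e, r_t, r_g)
    print("exact - first order:", exact - approx)
    print("r_g^2 / r_t        :", r_g**2 / r_t)
```

For $r_g \ll r_t$ the residual between the numerical integral and the first-order formula should be of order $r_g^2/r_t$, consistent with the remainder term quoted in (C.35).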

D Time Delay Corrections
We are generally interested in corrections to the time delay from EFT corrections to the effective background geometry, and hence consider the geometry (B.1) with now $Z^{\mu\nu} \to Z^{\mu\nu} + \delta Z^{\mu\nu}$. In terms of the corrected factors $Z_I \to Z_I + \delta Z_I$, as defined below the wave equation (B.2), the correction to the time delay is defined by […] where $r_t$ is the turning point in GR (where $Z_t(r_t)\, r_t^2 = Z_\Omega(r_t)\, b^2$) and $r_t + \delta r_t$ is the turning point associated with the effective metric $Z^{\mu\nu} + \delta Z^{\mu\nu}$.
$^{17}$ This result is more often quoted as the total signalling time back and forth between $r_b$ and $r_e$ in the proper time of an observer at $r_b$, which in our notation is $\Delta\tau_{b\to e\to b} = 2\,(1 - r_g/r_b)^{1/2}\left((r_e^2 - b^2)^{1/2} + (r_b^2 - b^2)^{1/2} + \Delta T_g\right)$.
For simplicity, we shall denote the integrand in GR as $A$ and that in the EFT as $A + \delta A$, […] To determine the correction $\Delta T_{\rm EFT}$, we perform in each integral a distinct coordinate transformation $r \to \rho$. For the second integral in (D.1) we perform a change of variable defined as $r = R(\rho)$ so that the relation (D.4) below is satisfied, and for the first integral in (D.1) we