Gravitational-Wave Constraints on an Effective--Field-Theory Extension of General Relativity

Gravitational-wave observations of coalescing binary systems allow for novel tests of the strong-field regime of gravity. Using data from the Gravitational Wave Open Science Center (GWOSC) of the LIGO and Virgo detectors, we place the first constraints on an effective--field-theory based extension of General Relativity in which higher-order curvature terms are added to the Einstein-Hilbert action. We construct gravitational-wave templates describing the quasi-circular, adiabatic inspiral phase of binary black holes in this extended theory of gravity. Then, after explaining how to properly take into account the region of validity of the effective field theory when performing tests of General Relativity, we perform Bayesian model selection using the two lowest-mass binary--black-hole events reported to date by LIGO and Virgo---GW151226 and GW170608---and constrain this theory with respect to General Relativity. We find that these data can rule out the appearance of new physics on distance scales of 70-200 km. Finally, we describe a general strategy for improving constraints as more observations will become available with future detectors on the ground and in space.


I. INTRODUCTION
The detections of gravitational waves (GWs) from merging black-hole (BH) and neutron-star binaries by the LIGO and Virgo collaborations [1] provide a novel opportunity to test the highly-dynamical, strong-field regime of gravity. All tests to date indicate that these observations are fully consistent with the predictions of General Relativity (GR) [2][3][4][5]. Combined with tests from the Solar System and cosmological observations [6][7][8], we now have a great amount of evidence that GR accurately describes gravity over a broad range of scales.
While positive checks for consistency are significant results by themselves, it remains important also to consider how GW observations can directly inform our understanding of how gravity behaves in the high-curvature regime. Work towards this second goal typically proceeds in one of two directions. One option is to consider a particular alternative to GR (typically at the level of an action), then calculate how the differences in the underlying physical theory translate to an observable signal, i.e., the gravitational waveform, and finally use a detected GW to measure the physical parameters that define that theory. While straightforward, this approach suffers due to its specificity; a plethora of proposals for modifying GR in the high-curvature regime have been considered [9], and testing each individually is highly inefficient. The second option instead considers phenomenological deviations [10][11][12][13][14] from the expected GW signal in GR, it uses observations to constrain these deviations, and then maps those bounds to constraints on specific non-GR theories [3,15]. But, this approach does not always provide a clear connection to the fundamental physics/principles one wishes to test.
In this work, we adopt an alternative approach, employing the powerful tools of Effective Field Theory (EFT) to test nat- * noah.sennett@aei.mpg.de ural extensions of GR with GW observations. Broadly speaking, EFT provides a systematic framework to encode all modifications to an existing theory that could arise after introducing some form of new physics. Though this task may seem daunting, the construction of an EFT is dramatically simplified by restricting ourselves to modifications that: (i) respect certain general principles or symmetries (e.g., locality and Lorentz invariance), and (ii) are associated with a particle (or some form of undetermined excitation) whose mass is too large to be directly probed with experiments. Under these assumptions, the Lagrangian of our EFT is determined by simply adding to the original Lagrangian all possible terms allowed by the symmetries that we wish to preserve, constructed using the fields already present in the original theory. Dimensional analysis dictates that these new terms are each suppressed by some energy scale, dubbed collectively as the cutoff scale Λ c , which, roughly speaking, corresponds to the mass of the particles that modify the original theory. Note that describing physics at energies above this cutoff would require the introduction of new particles to the theory, or equivalently, new fields to the effective Lagrangian, vastly complicating the problem. However, provided we work at energies below the cutoff scale [assumption (ii) above], all physical observables can be estimated without specifying these new particles in detail. The high-energy-agnosticism of the EFT provides the primary advantage of this approach over the tests of specific alternative theories described earlier; by working within the confines of an EFT, we can simultaneously test a large range of possible extensions of GR at once.
Reference [16] constructed an EFT suitable for testing gravity with current GW experiments following the guidelines above. In this paper, we examine the constraints that can be set on this extension of GR given existing GW observations. The work is organized as follows. In Sec. II, we review the main properties of the EFT extension of GR constructed in Ref. [16]. In Sec. III we construct gravitational waveforms arXiv:1912.09917v1 [gr-qc] 20 Dec 2019 that describe the quasi-circular, adiabatic inspiral of BBHs in this theory. Then, in Sec. IV, we employ Bayesian inference, notably the model-selection method, to place constraints on this theory when Λ c 1/r s , r s being the BH radius, demonstrating that current GW data rule out modifications to GR entering at scales in the range Λ −1 c ∼ [70, 200] km (or equivalently, Λ c ∼ [1,3] peV). Finally, we provide some concluding remarks and discuss future directions in Sec. V. Lastly, in the Appendix A, we consider setting constraints in the regime Λ c 1/r s , while in the Appendix B we discuss how our findings are modified if an inspiral-merger-ringdown waveform model instead of an inspiral-only model were used. Henceforth, we use natural units G = c = 1 except when stated otherwise.

II. EFFECTIVE-FIELD-THEORY EXTENSION OF GENERAL RELATIVITY
A. Extension of General Relativity using high powers in Riemann tensor The EFT constructed in Ref. [16] encodes the most general extension to gravity under a number of assumptions: (i) the following principles and symmetries are preserved by the new physics: locality, causality, Lorentz invariance, unitarity and diffeomorphism invariance, (ii) there is no new particle lighter than the cutoff of the theory, and (iii) the extension to the theory is testable with experiments such as LIGO and Virgo. The resulting EFT takes the form where Λ,Λ, Λ − are cutoff scales for each operator, and the dots in Eq. (2.1) denote terms with powers in the Riemann tensor beyond four. Though the cutoff scale for each term is, in principle, independent, it is quite natural to assume that they are comparable (i.e., Λ ∼Λ ∼ Λ − .) In Eq. (2.2), we employ the Levi-Civita tensor µνρσ , defined such that 0123 = 1/ √ −g.
We refer to the EFT (2.1) as "EFT of General Relativity", in short EFTGR. Except where noted, we assume the coupling constants to be positive (i.e., Λ 6 ,Λ 6 , Λ 6 − > 0), however, in principle one should not dismiss the parameter space Λ 6 ,Λ 6 , Λ 6 − < 0 [16] 1 . It is well known that the terms in Eq. (2.1), as well as others, are needed by renormalizability of the theory (e.g., 1 It has been argued that causality [17] and analyticity of the graviton amplitudes [18] requires Λ 6 ,Λ 6 > 0 and Λ 12 − ≤ 2/(Λ 6Λ6 ) [16]. However, these arguments seem to require the graviton's momenta to be larger that the cutoff scale Λ and therefore, strictly speaking, to be outside of the regime of validity of the theory [16]. We therefore regard these arguments just as highly disfavoring these theories. see Ref. [19]). Renormalizability, however, suggests that the scales suppressing those operators are different from the ones in Eq. (2.1). In fact, the main novelty of Eq. (2.1) lies in realizing that such low-energy suppression for those operators and such scalings were possible from an EFT point of view, and, as we will describe, are not yet ruled out by tests of gravity. The label of EFTGR is referred in this paper only to the particular scaling assumed in Eq. (2.1), which gives a theory testable by LIGO and Virgo, and not to the general, and already wellestablished fact, that GR is an EFT (e.g., see Ref. [16] for detailed discussions).
For clarity, we now briefly review the construction of the action (2.1); the interested reader should refer to Ref. [16] for more detail. As discussed above, our assumption that no new particle enters with mass below the cutoff of the EFT precludes the introduction of any new fields to the theory; thus our effective action must only be a function of the metric g µν . Furthermore, our assumption of diffeomorphism invariance implies that new terms must be expressible as functions of the Riemann tensor and its covariant derivatives. Consider a Taylor expansion of such a function: dimensional analysis dictates that terms polynomial in the Riemann tensor must be suppressed by powers of some cutoff scale Λ c . In principle, we could write an infinite number of higher-order curvature terms suppressed by correspondingly large powers of some cutoff scale (i.e., terms given schematically by (R µνρσ /Λ 2 c ) n ). But, if we are interested in observables whose energy scale E is much smaller than Λ c (as in the present context), then each term, schematically of the form (R µνρσ /Λ 2 ) n , contributes as (E 2 /Λ 2 ) n 1. Therefore, for a given precision of the prediction, only a finite number of terms needs to be kept.
For our purposes, we keep only the leading-order corrections to the Einstein-Hilbert action; these are precisely the terms given in Eq. (2.1). However, our truncation of the higher-order curvature corrections remains valid only so long as the sub-dominant terms remain negligible at the level of precision at which we work. Yet, working within this restricted regime of validity, we see that any extension of GR that upholds conditions (i) and (ii) described above must admit a low-energy 2 EFT of the form (2.1).
Note that the leading-order corrections to the Einstein-Hilbert Lagrangian come from terms that are quartic in the Riemann tensor (or equivalently, eight derivatives of the metric) rather than terms quadratic or cubic in curvature. In vacuum (to which we restrict our attention), terms quadratic in the Riemann tensor can be removed via an appropriate redefinition of the gravitational field 3 . Our dismissal of terms cubic in Riemann is more subtle. There are several theoret-ical arguments that disfavor the presence of these terms, see Refs. [20,21], as well as Sec. 2.2 of Ref. [16]. Additionally, there are several cancellations in the PN calculation of the contribution of these terms to the GW signal [16]. This leads to the suppression of these effects, going as v 4 instead of v 2 , as instead expected from naive estimates, and makes the computation of the non-vanishing effects a harder task due to proliferation of Feynman diagrams, as typically occurs once one goes beyond the leading order. In spite of these facts, further study of these terms remains an interesting direction that we leave for future work.
Finally, one can, in principle, include in the action (2.1) additional terms containing covariant derivatives of the Riemann tensor that equate to eight derivatives of the metric; however, these terms can always be absorbed into the existing terms in Eq. (2.1) via integration by parts (see Sec. 2.1 of Ref. [16]), and so we can work with the simplified action without any loss of generality.
Clearly, the higher-order curvature terms in Eq. (2.1) produce the greatest deviation from GR when the energy scales in the problem approach the cutoff scale of the EFT. In the context of a BBH system, this occurs when the cutoff scale is comparable to the inverse Schwarzschild radius of the merging BHs (the shortest curvature scale in the problem). For typical stellar-mass BHs observable by the LIGO and Virgo detectors, this corresponds to an energy scale of the order of inverse kilometers (or tens of picoelectron volts).
By particle physics standards, this energy cutoff is quite low. Indeed, there is no a priori expectation that GR should receive corrections at such low-energy scales. Yet, there remain several factors that merit further exploration of this EFT despite its contrivance. First, as we will discuss in more detail below, under some mild assumptions on the ultraviolet (UV) completion of this EFT, there is no experimental evidence that gravity excludes the possibility of new physics on the scale of inverse kilometers. Second, a presupposition of naturalness in our EFT is itself a theoretical bias; given the wealth of observational data now available from the LIGO and Virgo experiments, it is worthwhile to adopt a more agnostic viewpoint, letting the data determine the correct description of Nature rather than our theoretical preconceptions. Finally, the Lagrangian in Eq. (2.1) with cutoff Λ c is consistent at classical and quantum level even for very low-energy cutoff Λ c .

B. Soft ultraviolet completion of the effective-field-theory extension of General Relativity
The predictions of GR have been tested with exquisite precision in laboratory on the Earth and Solar-System experiments [6]. Naively, one may expect that such experiments could be sensitive to modifications of GR that enter at scales of inverse kilometers in the form of Eq. (2.1). However, there exist conditions under which the extensions of among the terms in the Lagrangian, they are experimentally ruled out, and then one moves on to the next testable theory, as we do. GR that we consider completely circumvent these weak-field constraints while still impacting strong-field phenomena detectable through GWs from stellar-mass BHs. In this subsection, we outline this additional condition precisely, which we denote as the existence of a soft ultraviolet (UV) completion.
The key distinction between the weak-field and strong-field regimes alluded to above is not the distance scales of the systems involved, but rather the curvature scales manifested in each. Typical curvature scales in the Solar System and terrestrial experiments are of the order of less than 10 −8 km −1 [7], whereas stellar-mass BHs generate curvatures up to scales of inverse kilometers. Under the assumption of a soft UV completion, deviations from GR like those in Eq. (2.1) can still arise at cutoff scales Λ c of inverse kilometers, and yet have negligible effects on weak-field tests. A different point of contention may arise from tests of gravity that involve X-ray binaries [22,23]; however, the precision of current observations is not high enough to rule out the models we consider here (see Sec. 8 of Ref. [16] for further discussion of all relevant experiments).
To outline what is meant by a soft UV completion, we first recall that the EFT in Eq. (2.1) breaks down at the scale Λ c . Thus, to make predictions on shorter distances scales, one needs to introduce new degrees of freedom and use a more fundamental theory. As a result, one cannot guarantee that an EFT of the form (2.1) reduces to GR on distance scales shorter than Λ −1 c . At energies (or inverse distances) below Λ c , all EFT effects grow as (E/Λ c ) n with n > 0; if one extrapolates this growth to E Λ c , then the EFT effects grow very large at short distances. The assumption of a soft UV completion states exactly that this blow up at high energies does not occur. Namely, we assume that all EFT effects saturate at the EFT breakdown point E ∼ Λ c and either stay constant or even decay at higher energies.
A similar sort of soft UV completion occurs in many familiar theories that have already been experimentally verified, most notably the Standard Model of particle physics. There, the scale Λ c is nothing but a mass of some heavy particle (e.g., the Higgs boson or the W-boson), Λ c m. In these cases, at energies E below the cutoff, the mass enters the predictions for many physical observables as (E/m) n with n > 0; however, when E > m, the mass no longer enters in the denominator of any terms. Unfortunately, we are not able to provide any explicit example of a UV completion for our EFT (whether we require the softness assumption or not). But this fact only stresses the versatility of the EFT approachwe can make explicit predictions without knowledge of the UV physics, which can be extremely complex. By working only within its regime of validity, we are able to place bounds on EFTGR with GW observations with no further assumption about the underlying short-distance theory besides that of a soft UV completion.
Let us mention that the soft UV completion assumption appears to be a very generic necessary requirement for extensions of GR to be potentially testable by GW observatories -for example there exist simple extensions of our EFT that differ by the addition of a single light scalar or pseudoscalar field which couples to the curvature invariant C or C correspondingly. These theories are usually called dynamical Gauss-Bonnet and dynamical Chern-Simons gravities in the literature (see, e.g., Ref. [9]). Simple EFT analysis reveals that for the values of parameters testable by GW observatories these theories also must have a cutoff at the scale of order of the inverse Schwarzschild radius and require the same assumption about the short-distance behavior as our EFT [15]. These features, required by theoretical consistency, are often missed since these theories are not properly treated as EFTs (see, however, the recent works [15,[24][25][26]). Possible infrared modifications of gravity, like theories of massive gravity, also require a similar sort of assumption about the UV completion [27].

C. Phenomenological consequences of the effective-field-theory extension of General Relativity
There are several phenomenological consequences associated to the new terms present in Eq. (2.1). Let us focus on binaries comprised of two BHs of similar size, whose Schwarzschild radii we collectively denote with r s . If Λ c 1/r s , the observable deviations from GR present in our EFT mainly arise from modifications to the Kerr geometry of each individual BH, as well as, the quasi-normal modes (QNMs) of the remnant BH produced by their merger. Reference [28] investigated how the GR Kerr geometry is modified in this regime, showing that: i) the equatorial frequency at the light ring and innermost stable circular orbit (ISCO), and the spininduced quadrupole moment receive corrections, ii) the Z 2 symmetry of the geometry is broken, iii) the tidal Love numbers become non-vanishing, and iv) the QNM spectrum is modified with respect to GR even for non-spinning BHs. Various thermodynamical properties of BHs in this EFT were also recently studied in Ref. [21]. We collectively refer to the corrections to binary dynamics and GW signal corresponding to these deformations of the Kerr geometry in EFTGR as finitesize effects; in Appendix A we show that these finite-size effects are too weak to be observable with current GW events, and therefore we are not yet able to put any constraints in the regime Λ c 1/r s . Therefore in this paper we will instead focus on the opposite regime Λ c 1/r s , whose phenomenological consequences were studied in Ref. [16] in larger detail (in particular, see Sec. 7 of Ref. [16] for the discussion of the two regimes). In this case, the EFT breaks down at distances larger than the gravitational radius and, due to the assumption of soft UV completion, the finite-size effects saturate 4 , and the dominant 4 The application of the assumption of a soft UV completion to this case goes as follows: one can think of a BH as a system with the characteristic scale E ∼ 1/r s , then, for r s > Λ −1 c , finite-size effects enter the predictions for a binary system with the scaling v n /(r s Λ c ) m for some positive powers n, m. If not for the assumption, extrapolating this scaling naively, for r s < Λ −1 c , one could expect very large finite size effects, however, the assumption states that for r s < Λ −1 c one can replace all the factors of 1/(r s Λ c ) with some number not larger than order one. Thus, modifications to the tidal Love signature of EFTGR in the GW signal arises from modifications of the two-body interactions between the BHs. As we will discuss in more detail below, these are corrections that enter at 2PN order, though they are additionally suppressed by a factor of 1/(Λ c r) 6 1, with r being the separation of the two BHs. Explicitly, the relative correction to the amplitude and phase of the GWs scales as v 4 /(Λ c r) 6 ∼ v 16 /(Λ c r s ) 6 , which can also be thought of as an 8PN term, albeit with an extraordinarily large coefficient. The relative size of finite-size effects and these new orbital effects-as well as the regimes of applicability of perturbative methods through which they are computed-are depicted schematically in Fig. 1; this figure also illustrates the saturation of corrections beyond the cutoff scale of our EFT, as imposed by the assumption of a soft UV completion.
In this regime (i.e., when Λ c 1/r s ), the corrections due to the modification to the intrinsic Kerr geometry cannot be computed explicitly, as this would require the full UV completion of the EFT. One can however estimate them at the level of order of magnitude. Finite-size effects from modifications to the spin-induced and tidally-induced quadrupole moments produce corrections to binary motion (and the subsequent GW signal) that scale as χ 2 v 4 and v 10 , respectively, where χ is the dimensionless spin of the BHs. Note that we have used our assumption of a soft UV completion to eliminate the scaling with Λ c r s in these terms; as detailed in the footnote above, we have assumed that these effects saturate for BHs of size r s ∼ Λ −1 c , and thus this scaling can simply be replaced by an order one number in the regime we work r s Λ −1 c . Using this simplification, we estimate the relative impact of these finite-size effects as compared to the orbital effects discussed above by computing the total number of GW cycles each contribute to the inspiral. For BHs of mass greater than ∼ 10M , we find that the orbital effects typically contribute at least 100 times as many cycles as finite-size effects over the frequency bandwidth of the Advanced LIGO and Virgo detectors. Thus, we can safely neglect finite-size effects in our current study, leaving these details for future work.

III. GRAVITATIONAL-WAVE PREDICTION FOR THE INSPIRAL REGIME IN THE EFFECTIVE-FIELD-THEORY EXTENSION OF GENERAL RELATIVITY
In this section we explicitly show that because EFTGR modifies the conservative and dissipative dynamics of a two-body system, it leads to new terms in the GW phasing, which might be observed or constrained by GW observations.
In the early-inspiral phase the dynamics of a binary system of two objects with mass m 1 and m 2 can be characterized by an effective action that in center-of-mass frame takes the numbers and spin-induced multipole moments of the BH geometry need not dramatically affect the evolution of a binary system.
orbital effects [16] ∼ (rΛ c ) −6 v 4 BH perturbation theory PN approximation FIG. 1. The phenomenological consequences of EFTGR in binary BH systems and the regimes of applicability of perturbative approximations in which they are computed. The vertical axis schematicaly depicts the corrections to a generic observable (here taken to be the orbital phase φ orb ) relative to its prediction in GR as function of the inverse cutoff scale Λ −1 c . For extensions to GR at distance scales below the Schwarzschild radii of the BHs r s , one can use BH perturbation theory to compute the finite-size effects that influence the binary dynamics [28], shown in dark blue. For Λ −1 c r s , this approximation scheme breaks down, shown in gray, and the finite-size effects cannot be computed explicitly; however, the assumption of a soft UV completion ensures that these effects saturate at Λ c ∼ 1/r s , changing by no more than a factor of order one for cutoff energies below this value. Similarly, new orbital effects [16] are computed using the PN approximation; these corrections are subdominant to the aforementioned finite-size effects when the cutoff distance scale Λ −1 c is much smaller than the orbital separation r, but dominate when Λ c ∼ 1/r. However, by construction, EFTGR is only valid over distance scales larger than the Λ −1 c , and so the PN prediction of new orbital effects cannot be extended to spearations below Λ −1 c . Thus, as shown in cyan, the PN approximation can be applied when Λ −1 c r s , where both finite-size and orbital effects are known or when Λ −1 c r, where orbital effects are known and finite-size effects are small enough to be neglected.
form [16,29]: 1) where µ = m 1 m 2 /(m 1 + m 2 ) is the reduced mass of the system, v(t) is the relative velocity between the objects, V(r(t)) is the potential energy, Q i j (t) is the mass quadrupole moment of the system and . . . denote higher-order multipole moments.
As shown in Ref. [16], in the PN and Λ c 1/r s regimes, the leading-order correction to the gravitational potential of a binary system with non-spinning or slowly-spinning components is generated by the term C 2 in Eq. (2.1), and reads: The new terms in the action (2.1) also modify the gravitational radiation emitted by the binary. The leading order correction comes again from the term C 2 [16], and it can be written as a renormalization of the Newtonian quadrupole-moment Q Newt i j of the binary system: 3) where . . . denote higher order PN corrections in the qudrupole moment. Henceforth, we focus solely on these leading-order EFT corrections, and compute their effect on the conservative and dissipative dynamics. We then use the balance equation and calculate the leading-order correction to the GW phase.

A. Conservative and dissipative dynamics
On the orbital timescale, the various time-dependent quantities in the action (3.1) can be treated as constant and the radiative terms can be neglected. Restricting to quasi-circular orbits and varying the action (3.1) with respect to r(t), the equations of motion of the binary system can be written as: where . . . denote higher PN corrections. The above equation can be inverted to get a relation between the orbital radius r and the orbital frequency Ω. To leading order in 1/Λ one gets (3.5) Using the equations above we find that the binding energy (per unit total mass) is given by where M = m 1 + m 2 is the total mass of the system, ν = m 1 m 2 /M 2 is the symmetric mass ratio, and we define v ≡ (MΩ) 1/3 . For convenience we also introduce the parameter d Λ ≡ 1/Λ. Restoring the higher PN corrections in GR we have where E GR denotes the PN expression for the binding energy in GR. Furthermore, the renormalized quadrupole moment (3.3) leads to corrections to the GW flux. As in GR, Eq. (3.1) predicts a leading-order GW flux given by the quadrupole formula where · · · indicates the time-average over an orbit. The resulting GW flux can be written as where F GR (v) is the PN expression for the flux in GR, and the leading-order correction to the flux F Λ (v) reads (3.10)

B. Gravitational waveform in the stationary phase approximation
We are now in the position of computing the leading-order correction to the GW phase in EFTGR.
In the quasi-circular, adiabatic inspiraling stage, it is common to compute the PN approximation of the GW signal in the Fourier domain, using the stationary phase approximation (SPA) (see, e.g., Refs. [30,31]). In this approximation the frequency-domain waveform can be written as [31] h where φ(t) is the orbital phase and πF(t) = dφ(t)/dt defines the instantaneous GW frequency F(t).
The quantity t f is the saddle point where dΨ(t)/dt = 0 (i.e., the time at which the instantaneous GW frequency F(t) is equal to the Fourier variable f ). In the adiabatic approximation Ψ and t f are given by [31] where we define v f ≡ (πM f ) 1/3 , while t ref and φ ref are integration constants, and v ref is an arbitrary reference velocity, commonly taken to be the velocity at the ISCO.
Using the PN-expanded binding energy and flux, and expanding the ratio E (v)/F (v) at the consistent PN order, the integral in Eq. (3.13) can be solved explicitly. We find that the leading-order correction to the GW phase is given by 14) where we Ψ GR ( f ) represents the GW phase in GR 5 .
As already anticipated above, an important conclusion of this calculation is that although the EFTGR corrections are formally at 8PN order, when d Λ v 2 /M ≡ (Λr) −1 ∼ 1 the corrections are numerically at 2PN order.
Lastly, we restrict the amplitude of the EFTGR SPA waveform in Eq. (3.11) to the GR expression, because we expect that non-GR corrections to the phase dominate the signal.

C. On the validity of the waveform model in the effective-field-theory extension of General Relativity
As already mentioned the EFTGR described by the action (2.1), and hence the SPA waveform (3.14), loses validity for orbital separations r d Λ . Therefore, when employing the EFTGR SPA waveform, we should keep in mind that it is a reliable characterization of the gravitational radiation from the binary system only for orbital scales larger than d Λ .
When comparing the signal with the data it is therefore convenient to translate the orbital separation r into a characteristic frequency in the data such that we can identify the frequencies at which our waveform model can be trusted. There is no unique way to do this since r is not a gauge-invariant quantity. A simple choice is to employ the Kepler's law and relate the orbital (angular) frequency to the orbital separation as r = (M/Ω 2 ) 1/3 + O(v 2 /c 2 ). Since, for quasi-circular orbits, the GW frequency is given by twice the orbital frequency we define the quantity such that the EFTGR SPA waveform given by Eqs. (3.11) and (3.14) remain valid for GW frequencies f f Λ . In addition, when using a PN waveform model we also need to restrict to velocities such that v 1. There is no unique choice for when the PN expansion is no longer valid, but a typical choice is to use the frequency of the ISCO, f ISCO , as the cutoff. In the test-particle limit for a Schwarschild BH this is given by f ISCO = 1/(6 √ 6πM), where M is the total mass of the system. We therefore assume the model to be only valid for frequencies f < f cutoff = min[ f ISCO , f high ], where f high < f Λ is the cutoff frequency up to which we trust EFTGR.
Finally, there is no obvious prescription for the choice of f high , but generically an EFT can be employed without significant modifications until higher-order corrections become comparable to the leading one. The calculation of the corrections in EFTGR that are subleading in 1/(Λr) goes beyond the scope of the present paper. However, we can estimate the frequency at which the EFTGR breaks down by comparing the leading correction to some quantity to the GR contribution. As we have already discussed, our expansion contains, in fact, two small parameters: 1/(Λr) [or (d Λ /r)] and v(r).
From Eq. (3.3), we see that the leading-(d Λ /r) correction to the quadrupole goes as δQ d Λ /Q Newt ∼ (d Λ /r) 6 v(r) 4 , thus the subleading-(d Λ /r) correction would be, schematically, of order δQ sub d Λ /Q Newt ∼ (d Λ /r) 8 v 4 (r), where, again, we highlighted only the dependence on the parameters and neglected numerical coefficients. We wish to estimate the radial separation r low , or equivalently the frequency f high , at which δQ sub d Λ /Q Newt ∼ δQ d Λ /Q Newt . Clearly, we are going to obtain r low ∼ 1/d Λ , but the numerical factors are quite important due the high-exponent in r by which our effect scales. Furthermore, since we do not compute explicitly the subleading correction, we need to estimate this by looking only at our leading answer. Since the terms we consider depends both on d Λ /r and on v 4 (r), we can estimate when the expansion in d Λ /r breaks down only if we also have an estimate of when the v expansion does. In fact, if we do this estimate when the v expansion breaks down, our resulting estimate for f high is expected to be independent of the PN order of the expression we use for the estimate (as it should). Therefore, since we expect the PN expansion to break down at the ISCO orbit, we estimate that the d Λ /r expansion breaks down when δQ d Λ /Q Newt ∼ 1 [or (dV Λ /dr)/(dV Newt /dr) ∼ 1], with v evaluated at f = f ISCO . Let us notice that, given our choice of f high , at the maximum frequency we analyze the signal (i.e., f ISCO ), the signal is at most comparable to GR. At lower frequencies, the d Λ corrections decrease more rapidly than GR. So the modifications are smaller or equal than the GR contribution at all frequencies we analyze. Finally, for equal-mass binaries, the two conditions from the quadrupole and the gradient of the potential lead to almost identical estimates: f high ≈ 0.35 f Λ . Ultimately, the question of what is the correct f high can only be addressed by explicitly computing the next-order corrections, which we leave to future work. Therefore, to account for the fact that the current criteria are clearly not sharp, though are expected to be reasonably accurate, we use several values of f high nearby 0.35 f Λ to account for the sensitivity of our analysis to this choice.

IV. GRAVITATIONAL CONSTRAINTS USING BAYESIAN-SELECTION METHODS
A. Setting bounds on gravity theories constructed within the effective-field-theory approach To constrain the parameter d Λ we use a Bayesian modelselection approach, as we now briefly review.
Consider a model or gravity theory H i that we wish to test in light of the observed data d. Bayes' theorem tells us that the posterior probability density of the model given the data can be computed through: where I denotes all the prior information that one holds, P(H i |I) is the prior probability of the model H i , P(d|I) is a normalization constant and P(d|H i , I) is the marginal likelihood, also known as the evidence, for the model H i . If the model H i is described by a set of parameters θ = {θ 1 , θ 2 , . . . }, the marginalized likelihood P(d|H i , I) is given by where P(θ|H i , I) is the prior probability of the parameters θ within the model H i and P(d|θ, H i , I) is the likelihood of the observed data d assuming a given value of the parameters θ in the model H i . Consider now two competing theories H i and H j that we wish to compare given the observed data. To quantify the strength of one model against the other in describing the data, one can compute the ratio of the posterior probabilities, also known as the odds ratio: where in the last step we define the Bayes factor B j i = P(d|H j , I)/P(d|H i , I), and the ratio P(H j |I)/P(H i |I) denotes the prior odds. A large Bayes factor B j i 1 indicates that the data favors model H j over H i , while a small Bayes factor B j i 1 implies that H i is preferred over H j . Here we are interested in comparing a GR waveform model against the EFTGR waveform model described in the previous section.
The framework presented above can also be used to take advantage of multiple detections in order to increase our confidence for or against a given model. Consider a set of N independent GW events with independent data sets given by d = {d 1 , d 2 , . . . , d N }. We can write a combined odds ratio for the catalog of GW events as [13]: 4) where we define the combined Bayes factor as (C) B j i = P(d|H j , I)/P(d|H i , I). This can be further simplified by noting that for a set of N independent events with unrelated parameters θ one has [13,32] P(d|H i , I) = N k=1 P(d k |H i , I) , (4.5) and therefore where (k) B j i denotes the Bayes factor for the event k. Finally, to make inferences on the unknown parameters θ for a given event, we use again Bayes' theorem to compute the posterior probability P(θ|d, H, I) = P(θ|H, I)P(d|θ, H, I) P(d|H, I) . (4.7) For the prior probability density P(θ|H, I) we follow the choices in Ref. [33]. The likelihood function P(d|θ, H, I) is defined as the distribution of the residuals, assuming they are distributed as Gaussian noise colored by the noise-power spectral-density S n ( f ) for each detector [33]: where the sum in the above expression is over all the detectors and we define the noise-weighted inner product as [34]: whereg andh denote the Fourier transform of g and h, respectively. To sample the posterior density functions and compute the evidences we use a nested sampling algorithm as implemented in the LALInference code [33] of the LIGO Algorithm Library Suite (LALSuite) [35]. For the minimum frequency in the analysis we use f low = 20 Hz. On the other hand, as we already discussed, when setting f high we must take into account the fact that EFTGR is expected to break down when f > f Λ , where we remind that f Λ is defined through Eq. (3.15). Therefore when computing the likelihood function we set f high < f Λ . When choosing the value of f high , one must also take into account the fact that the precise value of the GW frequency f at which the model breaks down is unknown. The choice of f high therefore adds a systematic error in the constraints on d Λ . To study the impact of this choice we work with several values of f high . It is clear that the smaller the ratio f high / f Λ is, the more conservative is the bound we place on the EFTGR parameter Λ. In particular, for our lowest choice, f high = 0.2 f Λ , the effects on the physical observables from the EFTGR corrections are smaller than the GR values for all the data points we use. Moreover, f high ≈ 0.35 f Λ allows the EFTGR effects to be comparable to the GR effect for the highest frequencies included in the analysis, consequently, one may expect significant corrections from the subleading EFTGR terms in this case.
Furthermore, since f high depends on d Λ , to constrain this parameter we use here a different approach compared to what is usually done in parameterized tests of GR [2,4,5,13,15]. Given the EFTGR waveform model (3.14), the standard approach would be to consider d Λ as a free parameter and compute a posterior distribution through Eq. (4.7) given some GW event. However, considering the dependence of f high on d Λ , an easier approach is to instead perform a Bayesian model selection where one compares, on a given subset of the data where the EFT analysis can be reliably performed, the hypothesis that the data contains a signal described by the EFTGR model (3.14) with a given value of d Λ against the hypothesis that the signal is described by GR, which corresponds to the particular case d Λ = 0. Given a GW event for which one gets a measure of the total mass using a full inspiral-mergerringdown (IMR) template in GR, this allows us to fix f high for the value of d Λ that one wishes to probe.
In addition to these difficulties, the EFTGR corrections to the GW phase given by Eq. (3.14) are only valid in the PNexpanded approximation of the inspiral phase, and do not provide any information about the expected non-GR corrections in the merger-ringdown phase of the signal. Therefore to constrain d Λ we employ the Flexible Theory Agnostic (FTA) code developed for LIGO and Virgo analyses [4,5], and add the non-GR correction in Eq. (3.14) to a GR inspiral-only PN waveform model valid up to 3.5PN order 6 . This waveform is artificially set to zero for frequencies f > f ISCO , the regime at which the PN expansion is expected to loose significant accuracy. We should emphasize that choosing f ISCO as the frequency cutoff of our waveform model is largely arbitrary, and therefore this specific choice adds another systematic effect in the constraints that we can place on d Λ . To quantify this uncertainty in Appendix B we also show our results when using an IMR waveform, finding negligible differences. One might worry that the use of an inspiral-only PN approximant containing an abrupt cutoff at f ISCO might lead to a bias in the measured parameters if part of the post-inspiral signal is loud enough, even when assuming GR [40]. We have indeed checked that for GW150914 [41], a GW event with a significant signal-to-noise ratio (SNR) in the merger-rigdown part of the signal, using a GR inspiral-only PN template leads to biases in the measured BH masses when compared to the measurement done with an IMR model. However, for lowmass GW events that, in the frequency band of the detectors, are mostly dominated by the inspiral part of the signal, we do not expect this to be a problem. To justify this statement we have checked that for the two lowest-mass binary BH (BBH) events reported thus far by the LIGO and Virgo collaborations, GW151226 [36] and GW170608 [37], we indeed find no biases in the measurement of the waveform parameters when using a GR inspiral-only PN waveform model 7 . This can be 6 In the LIGO Algorithm Library [35] the specific name of the waveform models that we use in this paper are TaylorF2 (see, e.g., Ref. [31]), IMR-PhenomD [38] and SEOBNRv4 [39] 7 GW151226 and GW170608 were detected with a network SNR of ∼ 12 and ∼ 15, and with total source frame masses of M tot 22M and M tot 19M , respectively. seen in Fig. 2 where we show the marginalized posterior density distributions for the chirp mass that we obtain when using the inspiral-only PN-approximant TaylorF2 (with d Λ = 0) for different values of f high , and compare these results with the posterior obtained when using an IMR template (SEOB-NRv4) with f high = 1 024 Hz. As expected, the statistical error increases with decreasing f high , due to the decreasing SNR, but the measurements when using the IMR template and the inspiral-only PN template are in good agreement. The same conclusion holds for the other parameters of the model. Given these conclusions in what follows we focus our tests on the two longest signals detected so far by LIGO and Virgo, notably GW151226 and GW170608.
B. Bounds on the effective-field-theory extension of General Relativity using GW151226 and GW170608 Using the Bayesian framework presented above we now show that GW151226 and GW170608 can already be used to set meaningful constraints on the parameter d Λ .
Our main results are summarized in Fig. 3 where we show the natural logarithm of the Bayes factor of the EFTGR hypothesis against GR, ln B EFTGR GR . We remind that the EFTGR hypothesis corresponds to the hypothesis that the data is well described by the EFTGR waveform model (3.14) with a given value of d Λ against the hypothesis that the signal is described by GR, i.e., d Λ = 0. We show our results when using an inspiral-only PN approximant in Fig. 3, but similar results hold when using an IMR waveform (see Appendix B for more details). To quantify the systematic error in the uncertainty of the frequency at which the EFTGR model cannot be trusted, we consider different choices of f high = [0.2, 0.25, 0.35] f Λ . To compute f Λ (see Eq. (3.15)) we fix the total BH mass to be the median value for the total BH mass obtained when estimating the parameters with an IMR waveform model in GR as given in Refs. [36,37]. Our results show that ln B EFTGR GR < 0 for a wide range of values for d Λ , implying that the EFTGR hypothesis is disfavored for some values of d Λ . The constraints we obtain are similar between the two events, due to their similar total mass. However, for GW170608 one obtains slightly stronger evidence against the EFTGR hypothesis due to the larger SNR of this event when compared with GW151226. Setting a threshold ln B EFTGR GR −5 8 to constrain d Λ , we can draw the strong conclusion that values of d Λ between roughly 70 and 200 km are strongly disfavored by the data, independently on our choice of f high . The evidence against the EFTGR hypothesis is, however, highly dependent on f high . This should be expected given that the effect of d Λ only appears in the waveform phase (3.14) at high PN order and therefore, for a given d Λ , the larger f high the larger the contribution from the non-GR term and the stronger the constraints that we can place.
The qualitative features of our results can be largely understood in terms of two competing effects: (1) for small d Λ the Bayes factor asymptotically goes to B EFTGR GR ∼ 1, since the non-GR contribution decreases as d Λ becomes smaller, and (2) in the opposite direction, for increasing values of d Λ , one must also decrease f high and therefore for some large value of d Λ the amount of data that we use is no longer sufficient to dig out the signal from the noise, no matter which model we use. Therefore, for some large value of d Λ the Bayes factor should also approach B EFTGR GR ∼ 1, as we indeed see in Fig. 3. Due to these two competing effects, we are only able to constrain an interval of values d low Λ < d Λ < d high Λ . The lower-limit d low Λ in this constraint is more robust against different choices of f high , 8 In the typically used classification of Ref. [42], B EFTGR GR < 1/100 corresponds to decisive evidence for GR against the EFTGR hypothesis. mostly depends on the maximum value of f high below which the matched-filtered SNR is too small to separate the signal from the noise.
Using Eq. (4.6), the results presented in Fig. 3 can be combined to improve the constraints on d Λ . The resulting combined Bayes factors are shown in Fig. 4. One can see that by combining both events the evidence against the EFTGR hypothesis around d Λ ∼ 150 km increases significantly when compared to the constraints obtained with a single event. The importance of combining the information from different GW events is even more striking for the case f high = 0.2 f Λ . For that choice of f high , GW151226 alone does not give any meaningful constraints, however when combining it with GW170608, one gets strong constraints for values of d Λ around d Λ ∼ 150 km. We expect these constraints to further improve as more GW events are observed in the future.
The constraints of the individual and the combined results are summarized in Table I. A strong conclusion of our analysis is that coupling constants in the region d Λ ∼ [70, 200] km are strongly disfavored. This is in agreement with the prediction made in Ref. [16], and can be easily understood from the fact that in this range d Λ /r ISCO ∼ O(1) for the BBHs we considered. This regime is the region where the deviations from GR are maximized while still keeping the perturbative control of the EFT. Interestingly, this sets a typical scale that a given event can constrain. Combining the information from more events will therefore not only be important to increase the confidence for or against a given value of d Λ , but also to increase the range of values of d Λ that one can probe and possibly constrain.
Finally, so far we have assumed that Λ 6 > 0, but as mentioned in Sec. II, there is no strong reason to dismiss the case Λ 6 < 0. Thus, we have also extended the analysis discussed in this section to the case Λ 6 < 0 9 , and have found that when combining the results from both events,

V. CONCLUSIONS
We presented the first constraints on the EFTGR proposed in Ref. [16], which is described by the action (2.1), using the two lowest-mass GW events so far detected by LIGO and Virgo.
Our results show that current observations already disfavour coupling constants of the order d Λ ≡ 1/Λ ∼ [70, 200] km. The Bayesian method we employed can be easily used to combine information from several observations, and therefore we expect that future GW detections will allow us to further constrain this theory.
Our constraints assumed that, in the regime where Λ 1/r s , the leading order non-GR effect come from the properties of the binary and not modifications from the BH geometry. As argued in Sec. II, the situation may be different for highly spinning BHs in which case, within the soft-UV completion assumption, modifications to the spin-induced BH quadrupole moment may dominate. By adding order one corrections to the GR spin-induced BH quadrupole moment, we have checked that for the events we considered the modifications to the binary dynamics always dominates. This is in agreement with recent work, where it was shown that the spininduced BH quadrupole moments cannot be well constrained for GW151226 and GW170608 [43]. Constraining the spininduced BH quadrupole moment with future GW events is an interesting direction that may further improve our constraints.
We focused our analysis to the regime Λ 1/r s . Complementary constraints in the regime Λ 1/r s , can in principle be obtained from, e.g., measurements of the spin-induced BH quadrupole moment, the tidal Love numbers or measurements of the BH-remnant's QNMs [28]. The events observed by LIGO and Virgo so far have too low SNR to put any constraints in this regime (see Appendix A). In fact, it would be interesting to study how well future detectors on the ground and in space, such as Cosmic Explorer, Einstein Telescope and LISA, can constrain Λ, not only because BBHs will have much higher SNR, but also much longer inspiral phase, thus making our constraints more robust by choosing a lower value of f high .
Although we only showed constraints for Λ, the same method can be applied to constrainΛ and Λ − , but constraints on the latter will likely require events with a larger SNR and binaries with spinning components for which the effect of the termsC 2 andCC is expected to be non-negligible [16]. It would also be interesting to analyse EFTGR with cubic terms in the Riemann tensor, though they are theoretically disfavored.
Finally, our results relied on computing the leading-order non-GR PN correction, but stronger constraints could in principle be obtained by building IMR waveforms for these EFTs. This will require numerical simulations of BBHs in the EFTGR. Numerical simulations could be done along the lines of the proposals in Refs. [24][25][26]44] where significant progress has been made to perform numerical simulations of BH mergers in an EFT extension of gravity with higherderivative operators and an additional light degree of freedom. FIG. 6. Same as Fig. 3, but also showing the case where we use an IMR waveform model (IMRPhenomD). For the IMR model the EFTGR correction is taped at the merger frequency that we take to be the peak of the waveform's amplitude . binary, the modifications to the GW template are mainly associated to modifications of the BH geometry. For this case, the leading-order modifications in the inspiral phase come from a modified BH spin-induced quadrupole moment and nonvanishing tidal Love number [16,28]. Within the PN expansion, the influence of the spin-induced quadrupole moment enters the GW phase at 2PN order whereas the leading-order tidal Love number enters at 5PN order (see, e.g., Ref. [45]). By inserting the explicit expressions for the BH spin-induced quadrupole moment and tidal Love number found in Ref. [28] in a PN template, we can therefore check whether we can impose any constraints in this regime. For simplicity we only consider the Λ term in the action (2.1), but results would be similar for the other terms. We also only consider GW170608 since this event has a larger SNR than GW151226.
We used the same Bayesian model selection procedure presented in Sec. IV, except that now no constraints need to be imposed on the highest frequency in the data. Instead, since we are considering the regime d Λ M, we define a mass associated with Λ given by m Λ ≡ Λ −1 and impose a minimum mass M min in the prior probabilities of the BH masses such that only masses satisfying M min ≥ m Λ are considered in the analysis. This ensures that d Λ < M is always satisfied. Our results are shown in Fig. 5. They confirm that, independently on M min , there is no preference for either GR or the EFTGR model. Therefore no constraints can be imposed in this regime.