Multipolar Effective-One-Body Waveforms for Precessing Binary Black Holes: Construction and Validation

As gravitational-wave detectors become more sensitive, we will access a greater variety of signals emitted by compact binary systems, shedding light on their astrophysical origin and environment. A key physical effect that can distinguish among formation scenarios is the misalignment of the spins with the orbital angular momentum, causing the spins and the binary's orbital plane to precess. To accurately model such systems, it is crucial to include multipoles beyond the dominant quadrupole. Here, we develop the first multipolar precessing waveform model in the effective-one-body (EOB) formalism for the inspiral, merger and ringdown (IMR) of binary black holes: SEOBNRv4PHM. In the nonprecessing limit, the model reduces to SEOBNRv4HM, which was calibrated to numerical-relativity (NR) simulations, and waveforms from perturbation theory. We validate SEOBNRv4PHM by comparing it to the public catalog of 1405 precessing NR waveforms of the Simulating eXtreme Spacetimes (SXS) collaboration, and also to new 118 precessing NR waveforms, which span mass ratios 1-4 and spins up to 0.9. We stress that SEOBNRv4PHM is not calibrated to NR simulations in the precessing sector. We compute the unfaithfulness against the 1523 SXS precessing NR waveforms, and find that, for $94\%$ ($57\%$) of the cases, the maximum value, in the total mass range $20-200 M_\odot$, is below $3\%$ ($1\%$). Those numbers become $83\%$ ($20\%$) when using the IMR, multipolar, precessing phenomenological model IMRPhenomPv3HM. We investigate the impact of such unfaithfulness values with two parameter-estimation studies on synthetic signals. We also compute the unfaithfulness between those waveform models and identify in which part of the parameter space they differ the most. We validate them also against the multipolar, precessing NR surrogate model NRSur7dq4, and find that the SEOBNRv4PHM model outperforms IMRPhenomPv3HM.


I. INTRODUCTION
Since the Laser Interferometer Gravitational wave Observatory (LIGO) detected a gravitational wave (GWs) from a binary-black-hole (BBH) in 2015 [1], multiple observations of GWs from BBHs have been made with the LIGO [2] and Virgo [3] detectors [4][5][6][7][8]. Two binary neutron star (BNSs) systems have been observed [9,10], one of them both in gravitational and electromagnetic radiation [11,12], opening the exciting new chapter of multi-messenger GW astronomy. Mergers of compact-object binaries are expected to be detected at an even higher rate with LIGO and Virgo ongoing and future, observing runs [13], and with subsequent third-generation detectors on the ground, such as the Einstein Telescope and Cosmic Explorer, and the Laser Interferometer Space Antenna (LISA). In order to extract the maxi-mum amount of astrophysical and cosmological information, the accurate modeling of GWs from binary systems is more critical than ever. Great progress has been made in this direction, both through the development of analytical methods to solve the two-body problem in General Relativity (GR), and by ever-more expansive numerical-relativity (NR) simulations.
One of the key areas of interest is to improve the modeling of systems where the misalignment of the spins with the orbital angular momentum causes the spins and the orbital plane to precess [14]. Moreover, when the binary's component masses are asymmetric, gravitational radiation is no longer dominated by the quadrupole moment, and higher multipoles need to be accurately modeled [15]. Precession and higher multipoles lead to very rich dynamics, which in turn is imprinted on the GW signal. Their measurements will be able to shed light on the formation mechanism of the observed systems, probe the astrophysical environment, break degeneracy among parameters, allowing more accurate measurements of cosmological parameters, masses and spins, and more sophisticated tests of GR.
Faithful waveform models for precessing compact-object binaries have been developed within the effective-one-body (EOB) formalism [16][17][18], and the phenomenological approach [19][20][21][22][23] through calibration to NR simulations. Recently, an inspiral-merger-ringdown phenomenological waveform model that tracks precession and includes higher modes was constructed in Ref. [24] (henceforth, IMRPhenomPv3HM) 1 The model describes the six spin degrees of freedom in the inspiral phase, but not in the late-inspiral, merger and ringdown stages. In the co-precessing frame [27][28][29][30][31], in which the BBH is viewed face-on at all times and the GW radiation resembles the nonprecessing one, it includes the modes ( , m) = (2, ±2), (2, ±1), (3, ±3), (3, ±2), (4, ±4) and (4, ±3). Here, building on the multipolar aligned-spin EOB waveform model of Ref. [32,33], which was calibrated to 157 NR simulations [34,35], and 13 waveforms from BH perturbation theory for the (plunge-)merger and ringdown [36], we develop the first EOB waveform model that includes both spin-precession and higher modes (henceforth, SEOBNRv4PHM). The model describes the six spin degrees of freedom throughout the BBH coalescence. It differs from the one of Refs. [17,18], not only because it includes in the coprecessing frame the (3, ±3), (4, ±4) and (5, ±5) modes, beyond the (2, ±2) and (2, ±1) modes, but also because it uses an improved description of the two-body dynamics, having been calibrated [32] to a large set of NR waveforms [34]. We note that IMRPhenomPv3HM and SEOBNRv4PHM are not completely independent because the former is constructed fitting (in frequency domain) hybridized waveforms obtained by stitching together EOB and NR waveforms. We stress that both SEOBNRv4HM and IMRPhenomPv3HM are not calibrated to NR simulations in the precessing sector. Finally, the surrogate approach, which interpolates NR waveforms, has been used to construct several waveform models that include higher modes [37] and precession [38]. In this paper, we consider the state-of-the-art surrogate waveform model with full spin precession and higher modes [39] (henceforth, NRSur7dq4), developed for binaries with mass ratios 1-4, (dimensionless) BH's spins up to 0.8 and binary's total masses larger than ∼ 60M . It includes in the co-precessing frame all modes up to = 4. Table I summarizes the waveform models used in this paper.
The best tool at our disposal to validate waveform models built from approximate solutions of the Einstein equations, such as the ones obtained from post-Newtonian (PN) theory, BH perturbation theory and the EOB approach, is their com-
The paper is organized as follows. In Sec. II we discuss the new NR simulations of BBHs, and assess their numerical error. In Sec. III we develop the multipolar EOB waveform model for spin-precessing BBHs, SEOBNRv4PHM, and highlight the improvements with respect to the previous version [17,18], which was used in LIGO and Virgo inference analyses [5,46,47]. In Sec. IV we validate the accuracy of the multipolar precessing EOB model by comparing it to NR waveforms. We also compare the performance of SEOBNRv4PHM against the one of IMRPhenomPv3HM, and study in which region of the parameter space those models differ the most from NR simulations, and also from each other. In Sec. V we use Bayesian analysis to explore the impact of the accuracy of the precessing waveform models when extracting astrophysical information and perform two synthetic NR injections in zero noise. In Sec. VI we summarize our main conclusions and discuss future work. Finally, in Appendix A we compare the precessing waveform models to the NR surrogate NRSur7dq4, and in Appendix B we list the parameters of the 118 NR simulations done for this paper.
In what follows, we use geometric units G = 1 = c unless otherwise specified. For runs from the SpEC catalog [44] the opacity was changed so that runs with similar parameters are clearly visible. We indicate with squares precessing BBH runs performed as part of this paper.
Significant progress has been made in recent years by several NR groups to improve the coverage of the BBH parameter space [40][41][42][43][44][45], mainly motivated by the calibration of analytical waveform models and surrogate models used in LIGO and Virgo data analysis. While large strides have been made for 2 www.black-holes.org aligned-spin cases, the exploration of precessing waveforms has been mostly limited to q ≤ 4, χ 1,2 ≡ |χ 1,2 | ≤ 0.8, typically 15-20 orbital cycles before merger, and a large region of parameter space remains to be explored. Simulations with high mass ratio (q ≥ 4) and high spin (|χ 1 | > 0.5) are challenging, primarily due to the need to resolve the disparate length scales in the binary system, which increases the computational cost for a given level of accuracy. Furthermore, for high spin, the apparent horizons can be dramatically smaller, which makes it more difficult to control the excision boundaries, further increasing the computational burden.
Here, we want to improve the parameter-space coverage of the SXS catalog [44], while maintaining a manageable computational cost, thus we restrict to simulations in the range of mass ratios q = 1-4 and (dimensionless) spins χ 1,2 = 0.3-0.9, with the spin magnitudes decreasing as the mass ratio increases. In Fig. 1 we display, in the q-χ 1 parameter space, the precessing and non-precessing waveforms from the published SXS catalogue [44], and the new precessing waveforms produced as part of this work.
We choose to start all the simulations at the same (angular) orbital frequency, MΩ 0 ≈ 0.0157, where the value is not exact as it was modified slightly during the eccentricity-reduction procedure in SpEC [65]. This corresponds to a physical GW starting frequency of 20 Hz at 50M and results in the number of orbits up to merger varying between 15 and 30 in our new catalog.
We parametrize the directions of the spins by three angles, the angles θ 1,2 between the spins and the unit vector along the Newtonian orbital angular momentum,L N , and the angle ∆φ between the projections of the spins in the orbital plane. Explicitly, where cos θ 12 = χ 1 · χ 2 . We make the choice that χ 1 lies in theL N − n plane, where n is the unit vector along the binary's radial separation, at the start of the simulation. The angles are chosen to be θ i,0 ∈ {60 • , θ max }, and ∆φ 0 ∈ {0, 90 • }. Here θ max is the angle that maximizes the opening angle of L N around the total angular momentum J and is computed assuming that the two spins are co-linear, giving with |L N | = µM 2/3 Ω −1/3 for circular orbit, being Ω the orbital angular frequency. For each choice of {q, χ} we choose 10 different configurations divided into two categories: i) χ 1 = χ 2 = χ, θ i,0 ∈ {60 • , θ max }, ∆φ 0 ∈ {0, 90 • } giving eight runs, and ii) χ 1 = χ, χ 2 = 0, θ 1,0 ∈ {60 • , θ max } giving two runs. The detailed parameters can be found in Appendix B. These choices of the spin directions allow us to test the multipolar precessing model SEOBNRv4PHM in many different regimes, including where the effects of precession are maximized, and where spin-spin effects are significant or diminished.

B. Unfaithfulness for spinning, precessing waveforms
The gravitational signal emitted by non-eccentric BBH systems and observed by a detector depends on 15 parameters: the component masses m 1 and m 2 (or equivalently the mass ratio q = m 1 /m 2 ≥ 1 and the total mass M = m 1 + m 2 ), the dimensionless spins χ 1 (t) and χ 2 (t), the direction to observer from the source described by the angles (ι, ϕ 0 ), the luminosity distance d L , the polarization ψ, the location in the sky (θ, φ) and the time of arrival t c . The gravitational strain can be written as: where to simplify the notation we introduce the function ξ ≡ q, M, χ 1 (t), χ 2 (t) . The functions F + (θ, φ, ψ) and F × (θ, φ, ψ) are the antenna patterns [66,67]: Equation (2.3) can be rewritten as: where κ(θ, φ, ψ) is the effective polarization [68] defined as: , (2.6) which has support in the region [0, 2π), while A(θ, φ) reads: Henceforth, to ease the notation we suppress the dependence on (θ, φ, ψ) in κ. Let us introduce the inner product between two waveforms a and b [66,67]: where a tilde indicates the Fourier transform, a star the complex conjugate and S n ( f ) is the one-sided power spectral density (PSD) of the detector noise. We employ as PSD the Advanced LIGO's "zero-detuned high-power" design sensitivity curve [69].
Here we use f in = 10Hz and f max = 2kHz, when both waveforms fill the band. For cases where this is not the case (e.g the NR waveforms) we set f in = 1.05 f start , where f start is the starting frequency of the waveform.
To assess the closeness between two waveforms s (e.g., the signal) and τ (e.g., the template), as observed by a detector, we define the following faithfulness function [33]: (2.9) While in the equation above we set the inclination angle ι of signal and template waveforms to be the same, the coalescence time t c and the angles ϕ 0τ and κ τ of the template waveform are adjusted to maximize the faithfulness. This is a typical choice motivated by the fact these quantities are not interesting from an astrophysical perspective. The maximizations over t c and ϕ 0τ are performed numerically, while the maximization over κ τ is done analytically following the procedure described in Ref. [68] (see Appendix A therein).
The condition ξ s (t s = t 0 s ) = ξ τ (t τ = t 0 τ ) in Eq. (2.9) enforces that the mass ratio q, the total mass M and the spins χ 1,2 of the template waveform at t = t 0 (i.e., at the beginning of the waveform) are set to have the same values of the ones in the signal waveform at its t 0 . When computing the faithfulness between NR waveforms with different resolutions this condition is trivially satisfied by the fact that they are generated using the same initial data. In the case of the faithfulness between NR and any model from the SEOBNR family, it is first required to ensure that t 0 has the same physical meaning for both waveforms. Ideally t = t 0 τ in the SEOBNR waveform should be fixed by requesting that the frequency of the SEOBNR (2, 2) mode at t 0 τ coincides with the NR (2,2) mode frequency at t 0 s . This is in practice not possible because the NR (2,2) mode frequency may display small oscillations caused by different effects -for example the persistence of the junk radiation, some residual orbital eccentricity or spinspin couplings [65]. Thus, the frequency of the SEOBNR (2, 2) mode at t = t 0 τ is chosen to guarantee the same time-domain length of the NR waveform. 3 . In practice, we require that the peak of ,m |h m | 2 , as elapsed respectively from t 0 s and t 0 τ , occurs at the same time in NR and SEOBNR. For waveforms from the IMRPhenom family we adopt a different approach, and following the method outlined in Ref. [21], also optimize the faithfulness numerically over the reference frequency of the waveform.
The faithfulness defined in Eq. (2.9) is still a function of 4 parameters (i.e., M s , ι s , ϕ 0s , κ s ), therefore it does not allow to describe the agreement between waveforms in a compact form. For this purpose we define the sky-and-polarizationaveraged faithfulness [18] as: Despite the apparent difference, the sky-and-polarizationaveraged faithfulness F defined above is equivalent to the one given in Eqs. (9) and (B15) of Ref. [18]. The definition in Eq. (2.10) is less computationally expensive because, thanks to the parametrization of the waveforms in Eq. (2.5), it allows one to write the sky-and-polarization-averaged faithfulness as a double integral instead of the triple integral in Eq. (B15) of Ref. [18]. We also define the sky-and-polarization-averaged, signal-to-noise (SNR)-weighted faithfulness as: where the SNR(ι s , ϕ 0s , θ s , φ s , κ s , D Ls , ξ s , t cs ) is defined as: This is also an interesting metric because weighting the faithfulness with the SNR takes into account that, at fixed distance, the SNR of the signal depends on its phase and on the effective polarization (i.e., a combination of waveform polarization and sky-position). Since the SNR scales with the luminosity distance, the number of detectable sources scale with the SNR 3 , therefore signals with a smaller SNR are less likely to be observed. Finally, we define the unfaithfulness (or mismatch) as

C. Accuracy of new numerical-relativity waveforms
To assess the accuracy of the new NR waveforms, we compute the sky-and-polarization-averaged unfaithfulness defined in Eq. (2.10) between the highest and second highest resolutions in the NR simulation. Figure 2 shows a histogram of the unfaithfulness, evaluated at ι s = π/3 maximized over the total mass, between 20 and 200 M . It is apparent that the unfaithfulness is below 1% for most cases, but there are several cases with much higher unfaithfulness. This tail to high unfaithfulness has been observed previously, when evaluating the accuracy of SXS simulations in Ref. [39]. Therein, it was established that, when the non-astrophysical junk radiation perturbs the parameters of the simulation sufficiently, the different resolutions actually correspond to different physical systems. Thus, taking the difference between adjacent resolutions is no longer an appropriate estimate of the error.
We also find that the largest unfaithfulness occurs when the difference in parameters is largest, thus confirming that it is the difference in parameters that dominates the unfaithfulness.

D. Effect of mode asymmetries in numerical-relativity waveforms
The gravitational polarizations at time t and location (ϕ 0 , ι) on the coordinate sphere from the binary can be decomposed in −2-spin-weighted spherical harmonics, as follows For nonprecessing binaries, the invariance of the system under reflection across the orbital plane (taken to be the x-y plane) implies h m = (−1) h * −m . The latter is a very convenient relationship -for example it renders unnecessary to model modes with negative values of m. However, this relationship is no longer satisfied for precessing binaries.
As investigated in previous NR studies [70,71], we expect the asymmetries between opposite-m modes to be small as compared to the dominant (2, 2)-mode emission (at least during the inspiral) in a co-rotating frame that maximizes emission in the (2, ±2) modes, also known as the maximumradiation frame [29,72]. However, while the asymmetries are expected to be small during the inspiral, the difference in phase and amplitude between positive and negative m-modes might become non-negligible at merger.
As we discuss in the next section, when building multipolar waveforms (SEOBNRv4PHM) for precessing binaries by rotating modes from the co-precessing [27][28][29][30][31] to the inertial frame of the observer, we shall neglect the mode asymmetries. To quantify the error introduced by this assumption, we proceed as follows. We first take NR waveforms in the co-precessing frame and construct symmetrized waveforms. Specifically, we consider the combination of waveforms in the co-precessing frame defined by (e.g., see Ref. [39]) Note that if the assumption of conjugate symmetry holds (i.e., if h P −m = (−1) h P * m ), then for even (odd) modes, h + m (h − m ) is non-zero while the other component vanishes. If the assumption does not hold, it is still true that at given , one of the components is much larger than the other, as shown in top panel of Fig. 3. Motivated by this, we define the symmetrized modes (for m > 0) as [39] The other modes are constructed as h P −m = h P * m for m < 0, and we set m = 0 modes to zero. The bottom panel of Fig. 3 shows an example of asymmetrized waveform for the case PrecBBH000078 of the SXS catalogue, in the co-precessing frame. It is obvious that the asymmetry between the modes has been removed and that the symmetrized waveform does indeed represent a reasonable "average" between the original modes. The symmetrized waveforms in the inertial frame are obtained by rotating the co-precessing frames modes back to the inertial frame.
In Fig. 4, we show the sky-and-polarization averaged unfaithfulness between the NR waveforms and the symmetrized waveforms described above, maximized over the total mass, including all modes available in the NR simulation, that is up to = 8. For the vast majority of the cases, the unfaithfulness is ∼ 0.5%, and all cases have unfaithfulness below 2%. This demonstrates that the effect of neglecting the asymmetry is likely subdominant to other sources of error such as the modeling of the waveform phasing, although the best way of quantifying the effect is to perform a Bayesian parameterestimation study, which we leave to future work.

III. MULTIPOLAR EOB WAVEFORMS FOR SPINNING, PRECESSING BINARY BLACK HOLES
We briefly review the main ideas and building blocks of the EOB approach, and then describe an improved spinning, precessing EOBNR waveform model, which, for the first time, also contains multipole moments beyond the quadrupolar one. The model is already available in the LIGO Algorithm Library [73] under the name of SEOBNRv4PHM. We refer the reader to Refs. [17,18,33,74,75] for more details of the EOB framework and its most recent waveform models. Here we closely follow Ref. [18], highlighting when needed differences with respect to the previous precessing waveform model FIG. 3: Top: the behavior of h ± m in the NR simulation PrecBBH000078. Note that especially during the inspiral, |h + 22 | is much larger than |h − 22 | while |h − 33 | is much larger than |h + 33 |. Bottom: an example of waveform symmetrization for the same NR case, shown in the co-precessing frame. The symmetrized waveform obeys the usual conjugation symmetry as expected, and represents a reasonable average to the behavior of the unsymmetrized modes. developed in Ref. [18] (SEOBNRv3P 4 ).

A. Two-body dynamics
The EOB formalism [76][77][78][79] can describe analytically the GW emission of the entire coalescence process, notably inspiral, merger and ringdown, and it can be made highly accurate by including information from NR. For the two-body conservative dynamics, the EOB approach relies on a Hamiltonian H EOB , which is constructed through: (i) the Hamiltonian H eff of a spinning particle of mass µ ≡ m 1 m 2 /(m 1 + m 2 ) and spin S * ≡ S * (m 1 , m 2 , S 1 , S 2 ) moving in an effective, deformed Kerr spacetime of mass M ≡ m 1 + m 2 and spin S Kerr ≡ S 1 + S 2 [80][81][82], and (ii) an energy map between H eff and H EOB [76] where ν = µ/M is the symmetric mass ratio. The deformation of the effective Kerr metric is fixed by requiring that at any given PN order, the PN-expanded Hamiltonian H EOB agrees with the PN Hamiltonian for BBHs [15]. In the EOB Hamiltonian used in this paper [81,82], the spin-orbit (spin-spin) couplings are included up to 3.5PN (2PN) order [81,82], while the non-spinning dynamics is incorporated through 4PN order [33]. The dynamical variables in the EOB model are the relative separation r and its canonically conjugate momentum p, and the spins S 1,2 . The conservative EOB dynamics is completely general and can naturally accommodate precession [17,18] and eccentricity [83][84][85]. When BH spins have generic orientations, both the orbital plane and the spins undergo precession about the total angular momentum of the binary, defined as J ≡ L + S 1 + S 2 , where L ≡ µ r × p. We also introduce the Newtonian orbital angular momentum L N ≡ µ r ×ṙ, which at any instant of time is perpendicular to the binary's orbital plane. Black-hole spin precession is described by the following equations In the EOB approach, dissipative effects enter in the equations of motion through a nonconservative radiation-reaction force that is expressed in terms of the GW energy flux through the waveform multipole moments [86][87][88][89] as where Ω ≡ |r ×ṙ|/|r| 2 is the (angular) orbital frequency, d L is the luminosity distance of the BBH to the observer, 4 We note that whereas in LAL the name of this waveform approximant is SEOBNRv3, here we add a "P" to indicate "precession", making the notation uniform with respect to the most recent developed model SEOBNRv4P [18].
FIG. 5: Frames used in the construction of the SEOBNRv4PHM model: the observer's frame (blue), defined by the directions of the initial orbital angular momentumL(0) and separation r(0), and co-precessing frame (red), instantaneously aligned withL(t) and described by the Euler angles (α, β, γ) (see text below for details). and the h m 's are the GW multipole modes. As discussed in Refs. [32,33], the h m used in the energy flux are not the same as those used for building the gravitational polarizations in the inertial frame, since the latter include the nonquasi-circular corrections, which enforce that the SEOBNR waveforms at merger agree with the NR data, when available.

B. Inspiral-plunge waveforms
For the inspiral-plunge waveform, the EOB approach uses a factorized, resummed version [33,[87][88][89] of the frequencydomain PN formulas of the modes [90,91]. As today, the factorized resummation has been developed only for quasicircular, nonprecessing BBHs [88,89], and it has been shown to improve the accuracy of the PN expressions in the test-particle limit, where one can compare EOB predictions to numerical solutions of the Regge-Wheeler-Zerilli and Teukolsky equations [16,36,92,93].
The radiation-reaction force F in Eq. (3.3) depends on the amplitude of the individual GW modes |h m |, which, in the non-precessing case, are functions of the constant alignedspin magnitudes χ 1,2 ·L. In the precessing case, these modes depend on time, as χ 1,2 (t) ·L(t), and they depend on the generic, precessing orbital dynamics through the radial separation r and orbital frequency Ω, which carry modulations due to spin-spin couplings whenever precession is present. However, we stress that with this choice of the radiation-reaction force and waveform model, not all spin-precession effects are included, since the PN formulas of the modes [90] also contain terms that depend on the in-plane spin components.
For data-analysis purposes, we need to compute the GW polarizations in the inertial-frame of the observer (or simply observer's frame). We denote quantities in this frame with the superscript I. The observer's frame is defined by the triad {ê I 1) . Moreover, in this frame, the line of sight of the observer is parametrized asN ≡ (sin ι cos φ o , sin ι sin φ o , cos ι) (see Fig. 5). We also introduce the observer's frame with the po- 1) , which spans the plane orthogonal toN. To compute the observer's-frame modes h I m during the inspiral-plunge stage, it is convenient to introduce a noninertial reference frame that tracks the motion of the orbital plane, the so-called co-precessing frame (superscript P), described by the triad {ê P (i) } (i = 1, 2, 3). At each instant, its z-axis is aligned withL:ê P (3) ≡L(t) 5 . In this frame, the BBH is viewed face-on at all times, and the GW radiation looks very much nonprecessing [27][28][29][30][31]. The other two axes lie in the orbital plane and are defined such as they minimize precessional effects in the precessing-frame modes h P m [27,29]. After introducing the vector Ω e ≡L × dL/dt, we enforce the minimum-rotation condition by requiring that dê P (1), (2) As usual, we parametrize the rotation from the precessing to the observer's frame through time-dependent Euler angles (α(t), β(t), γ(t)), which we compute using Eqs. (A4)-(A6) in Appendix A of Ref. [18]. We notice that the minimumrotation condition can also be expressed through the following differential equation for γ: We compute the precessing-frame inspiral-plunge modes just like we do for the GW flux, namely by evaluating the factorized, resummed nonprecessing multipolar waveforms along the EOB precessing dynamics, and employing the time-dependent spin projections χ 1,2 (t) ·L(t). Finally, the observer's-frame inspiral-plunge modes are obtained by rotating the precessing-frame inspiral-plunge modes using Eq. (A13) in Appendix A of Ref. [18].

C. Merger-ringdown waveforms
The description of a BBH as a system composed of two individual objects is of course valid only up to the merger. After that point, the EOB model builds the GW emission (ringdown stage) via a phenomenological model of the quasinormal modes (QNMs) of the remnant BH, which forms after the coalescence of the progenitors. The QNM frequencies and decay times are known (tabulated) functions of the mass M f and spin S f ≡ M 2 f χ f of the remnant BH [94]. Since the QNMs are defined with respect to the direction of the final spin, the specific form of the ringdown signal, as a linear combination of QNMs, is formally valid only in an inertial frame whose z-axis is parallel to χ f .
A novel feature of the SEOBNRv4PHM waveform model presented here is that we attach the merger-ringdown waveform (notably each multipole mode h mergr−RD m ) directly in the coprecessing frame, instead of the observer's frame. As a consequence, we can employ here the merger-ringdown multipolar model developed for non-precessing BBHs (SEOBNRv4HM) in Ref. [33] (see Sec. IVE therein for details). By contrast, in the SEOBNRv3P waveform model [18], the merger-ringdown waveform was built as a superposition of QNMs in an inertial frame aligned with the direction of the remnant spin. This construction was both more complicated to implement and more prone to numerical instabilities.
To compute the waveform in the observer's frame, our approach requires a description of the co-precessing frame Euler angles (α, β, γ) that extends beyond the merger. To prescribe this, we take advantage of insights from NR simulations [95]. In particular, it was shown that the co-precessing frame continues to precess roughly around the direction of the final spin with a precession frequency approximately equal to the differences between the lowest overtone of the (2,2) and (2,1) QNM frequencies, while the opening angle of the precession cone decreases somewhat at merger. We find that this behavior is qualitatively correct for the NR waveforms used for comparison in this paper.
To keep our model generic for a wide range of mass ratios and spins, we need an extension of the behavior noticed in Ref. [95] to the retrograde case, where the remnant spin is negatively aligned with the orbital angular momentum at merger. Such configurations can occur for high mass-ratio binaries, when the total angular momentum J is dominated by the spin of the primary S 1 instead of the orbital angular momentum L. This regime is not well explored by NR simulations, and includes in particular systems presenting transitional precession [14]. In our model we keep imposing simple precession around the direction of the remnant spin at a rate ω prec ≥ 0, but we distinguish two cases depending on the direction of the final spin χ f (approximated by the total angular momentum J = L + S 1 + S 2 at merger) relative to the final orbital angular momentum L f : ) where χ f = |χ f |, and the zero-overtone QNM frequencies for negative m are taken on the branch ω QNM lm > 0 that continuously extends the m > 0, ω QNM lm > 0 branch [94] (the QNM refers to zero overtone). In both cases,α ≥ 0. We do not attempt to model the closing of the opening angle of the precession cone and simply consider it to be constant during the post-merger phase, β = const. The third Euler angle γ is then constructed from the minimal rotation conditionγ = −α cos β. The integration constants are determined by matching with the inspiral at merger. We find that the behavior of Eq. (3.4) in the case χ f · L f < 0 is qualitatively consistent with an NR simulation investigated by one of us [96]. However, we stress that this prescription for the retrograde case is much less tested than for the prograde case.
Furthermore, one crucial aspect of the above construction is the mapping from the binary's component masses and spins to the final mass and spin, which is needed to compute the QNM frequencies of the merger remnant. Many groups have developed fitting formulae based on a large number of NR simulations (e.g., see Ref. [97] for an overview). To improve the agreement of our EOB merger-ringdown model with NR, and to ensure agreement in the aligned-spin limit with SEOBNRv4 [32] and SEOBNRv4HM [33], we employ the fits from Hofmann et al. [98]. In Fig. 6 we compare the performance of the fit used in the previous EOB precessing model SEOBNRv3P [17,18,75] to the fit from Hofmann et al. that we adopt for SEOBNRv4PHM. It is clear that the new fit reproduces NR data much better. This in turn improves the correspondence between NR and EOB QNM frequencies.
For the final mass we employ the same fit as in previous EOB models, and we provide it here since it was not given explicitly anywhere before: where a =L · (χ 1 + χ 2 /q 2 )/(1 + 1/q) 2 , and E ISCO (a) is the binding energy of the Kerr spacetime at the innermost stable circular orbit [99]. Finally, for precessing binaries, the individual components of the spins vary with time. Therefore, in applying the fitting formulae to obtain final mass and spin, one must make a crucial choice in selecting the time during the inspiral stage at which the spin directions are evaluated. In fact, even if one considers a given physical configuration, evaluating the final spin formulae with spin directions from different times yields different final spins and consequently different waveforms. We choose to evaluate the spins at a time corresponding to the separation of r = 10M. This choice is guided by two considerations: by the empirical finding of good agreement with NR (e.g., performing better than using the time at which the inspiral-plunge waveform is attached to the merger-ringdown waveform [33]), and by the restriction that the waveform must start at r > 10.5M in order to have small initial eccentricity [18]. Thus, our choice ensures that a given physical configuration always produces the same waveform regardless of the initial starting frequency.
To obtain the inspiral-merger-ringdown modes in the inertial frame, h I m , we rotate the inspiral-merger-ringdown modes h P m from the co-precessing frame to the observer's frame us- ing the rotation formulas and Euler angles in Appendix A of Ref. [18]. The inertial frame polarizations then read The SEOBNRv4PHM waveform model inherits the EOB Hamiltonian and GW energy flux from the aligned-spin model SEOBNRv4 [32], which features higher (yet unknown) PN-order terms in the dynamics calibrated to NR waveforms. These calibration parameters were denoted K, d SO and d SS in Ref. [32], and were fitted to NR and Teukolskyequation-based waveforms as polynomials in ν, χ where χ ≡ S z Kerr /(1 − 2ν) with S Kerr = S 1 + S 2 the spin of the EOB background spacetime. In contrast to the SEOBNRv3P waveform model, which used the EOB Hamiltonian and GW energy flux from the aligned-spin model SEOBNRv2 [75], the fits in Ref. [32] include odd powers of χ and thus the sign of χ matters when the BHs precess.
The most natural way to generalize these fits to the precessing case is to project S Kerr onto the orbital angular momentum L in the usual spirit of reducing precessing quantities to corresponding aligned-spin ones. To test the impact of this prescription, we compute the sky-and-polarization-averaged unfaithfulness with the set of 118 NR simulations described in Sec. II, and find that while the majority of the cases have low unfaithfulness (∼ 1%), there are a handful of cases where it is significant(∼ 10%), with many of them having large in-plane spins.
To eliminate the high mismatches, we introduce the aug- The source parameters are ι s = π/3, φ s = π/4, κ s = π/4. The NR waveform includes all modes up to and including = 4, and extends for 44 GW cycles before merger. For models that include only = 2 modes, the unfaithfulness are several percent 8% for IMRPhenomPv3 and 6% for SEOBNRv4P. Meanwhile, adding the higher mode content drastically improves the agreement, with mismatches going down to 2% for IMRPhenomPv3HM and 1% for SEOBNRv4PHM. The agreement is particular good for SEOBNRv4PHM, which reproduces the higher mode features at merger and ringdown faithfully.
mented spin that includes contribution of the in-plane spins: . (3.7) Here S ⊥ i ≡ S i − (S i · L)L and α is a positive coefficient to be determined. Note that the extra term in the definition of the augmented spin ≥ 0 for any combination of the spins. We setχ = 0 when S Kerr = 0. Fixing α = 1/2 insures that the augmented spin obeys the Kerr bound. Using the augmented spin eliminates all mismatches above 6%, and thus greatly improves the agreement of the model with NR data.

IV. COMPARISON OF MULTIPOLAR PRECESSING MODELS TO NUMERICAL-RELATIVITY WAVEFORMS
To assess the impact of the improvements incorporated in the SEOBNRv4PHM waveform model, we compare this model and other models publicly available in LAL (see Table I) to the set of simulations described in Sec. II, as well as to all publicly available precessing SpEC simulations 6 .
We start by comparing in Fig. 7, the precessing NR waveform PrecBBH00078 with mass ratio 4, BH's spin magni- 6 The list of all SXS simulations used can be found in https://arxiv. org/src/1904.04831v2/anc/sxs_catalog.json tudes 0.7, total mass M = 70M and modes ≤ 4 from the new 118 SXS catalog (see Appendix B) to the precessing waveforms IMRPhenomPv3 and SEOBNRv4P with modes = 2 (upper panels), and to the precessing multipolar waveforms IMRPhenomPv3HM and SEOBNRv4PHM (lower panels). This NR waveform is the most "extreme" configuration from the new set of waveforms and has about 44 GW cycles before merger, and the plot only shows the last 7 cycles. More specifically, we plot the detector response function given in Eq. (2.5), but we leave out the overall constant amplitude. We indicate on the panels the unfaithfulness for the different cases. We note the improvement when including modes beyond the quadrupole. SEOBNRv4PHM agrees particularly well to this NR waveform, reproducing accurately the higher-mode features throughout merger and ringdown.
We now turn to the public precessing SXS NR catalog of 1404 waveforms. First, to quantify the performance of the new precessing waveform model SEOBNRv4P with respect to previous precessing models used in LIGO and Virgo inference studies, we compute the unfaithfulness 7 against the precessing NR catalog, including only the dominant = 2 multipoles in the co-precessing frame. Figure 8 shows the histograms of the largest mismatches when the binary total mass varies in  FIG. 9: Sky-and-polarization averaged, SNR weighted unfaithfulness for an inclination ι = π/3 between NR waveforms and SEOBNRv4PHM, including and omitting higher modes. The vertical dashed lines show the medians. Not including higher modes in the model results in high unfaithfulness. However, when they are included, the unfaithfulness between SEOBNRv4PHM and NR is essentially at the same level as when only = 2 modes are compared (see Fig. 8).
the range [20,200]M . Here, we also consider the precessing waveform models used in the first GW Transient Catalog [5] of the LIGO and Virgo collaboration (i.e., SEOBNRv3P and IMRPhenomPv2). Two trends are apparent: firstly, SEOBNRv3P and IMRPhenomPv2 distributions are broadly consistent, with both models having mismatches which extend beyond 10% , although SEOBNRv3 has more cases at lower unfaithfulness; secondly, SEOBNRv4P has a distribution which is shifted to much lower values of the unfaithfulness and does not include outliers with the largest unfaithfulness below 7%. Next, we examine the importance of higher modes. To do so, we use SEOBNRv4PHM with and without the higher modes, while always including all modes up to = 5 in the NR waveforms. As can be seen in Fig. 9, if higher modes are omitted, the unfaithfulness can be very large, with a significant number of cases having unfaithfulness > 7%, as has been seen in many past studies. On the other hand, once higher modes are included in the model, the distribution of mismatches becomes much narrower, with all mismatches below 9%. Furthermore, the distribution now closely resembles the distribution of mismatches when only = 2 modes were included in the NR waveforms. Thus, we see that higher modes play an important role and are accurately captured by SEOBNRv4PHM waveform model. Moreover, in Fig. 10 we display, for a specific choice of the inclination, the unfaithfulness versus the binary's total mass between the public precessing SXS NR catalog and SEOBNRv4PHM and IMRPhenomPv3HM. We highlight with curves in color the NR configurations having worst maximum mismatches for the two classes of approximants. For the majority of cases, both models have unfaithfulness below 5%, but SEOBNRv4PHM has no outliers beyond 10% and many more cases at lower unfaithfulness (< 2 × 10 −3 ). We find that the large values of unfaithfulness above 10% for IMRPhenomPv3HM come from simulations with q 4 and large anti-aligned primary spin, i.e. χ z 1 = −0.8. An examination of the waveforms in this region reveals that unphysical features develop in the waveforms, with unusual oscillations both in amplitude and phase. For lower spin magnitudes these features are milder, and disappear for spin magnitudes 0.65. These features are present also in IMRPhenomPv3 and are thus connected to the precession dynamics, a region already known to potentially pose a challenge when modeling the precession dynamics as suggested in Ref. [100], and adopted in Ref. [24].
We now focus on the comparisons with the 118 SXS NR waveforms produced in this paper. In Fig. 11 we show the unfaithfulness for IMRPhenomPv3(HM) and SEOBNRv4P(HM) in the left (right) panels. We compare waveforms without higher modes, to NR data that has only the = 2 modes, and the other models to NR data with ≤ 4 modes. The performance of both waveform models on this new NR data set is largely comparable to what was found for the public catalog. Both families perform well on average, with most cases having unfaithfulness below 2% for models without higher modes and 3% for models with higher modes. However, for some configurations IMRPhenomPv3(HM) reaches unfaithfulness values above 3% for total masses below 125M . Once again, the overall distribution is shifted to lower unfaithfulness values for SEOBNRv4P(HM).
When studying the distribution of unfaithfulness for these 118 cases across parameter space, it is useful to introduce the widely used effective χ eff [79,101,102] and precessing χ p [103] spins. These capture the leading order aligned-spin and precession effects respectively, and are defined as  The sky-and-polarization averaged, SNR-weighted unfaithfulness as a function of binary's total mass for inclination ι = π/3, between IMRPhenomPv3HM and NR (left) and SEOBNRv4PHM and NR (right) for 1404 quasi-circular precessing BBH simulations in the SXS public catalog. The colored lines highlight the cases with the worst maximum mismatches for both models. Note that for the majority of cases, both models have unfaithfulness below 5%, but SEOBNRv4PHM has no outliers beyond 10% and many more cases at lower unfaithfulness. where with B 1 = 2 + 3m 2 /m 1 , B 2 = 2 + 3m 1 /m 2 and we indicate with χ i⊥ the projection of the spins on the orbital plane.
We find that the unfaithfulness shows 2 general trends. First, it tends to increase with increasing χ eff and χ p . Secondly, that cases with positive χ eff (i.e. aligned with Newtonian orbital angular momentum) tend to have larger unfaithfulness. This is likely driven by the fact that inspiral is longer for such cases and the binary merges at higher frequency. We do not find any other significant trends based on spin directions. It is interesting to note that the distribution of mismatches from the 118 cases is quite similar to the distribution from the much larger public catalog. This suggests that the 118 cases do indeed explore many different regimes of precession.
To further quantify the results of the comparison between the precessing multipolar models SEOBNRv4PHM and IMRPhenomPv3HM and the NR waveforms, we show in Figs. 12 and 13 the median and 95%-percentile of all cases, and the highest unfaithfulness as function of the total mass, respectively. These studies also demonstrate the better performance of SEOBNRv4PHM with respect to IMRPhenomPv3HM.
To summarize the performance against the entire SXS catalog (including the new 118 precessing waveforms) we find that for SEOBNRv4PHM, out of a total of 1523 NR simulations we have considered, 864 cases (57% ) have a maximum unfaithfulness less than 1%, and 1435 cases (94% ) have unfaithfulness less than 3%. Meanwhile for IMRPhenomPv3HM the numbers become 300 cases (20% ) below 1%, 1256 cases (83% ) below 3% 8 . The accuracy of the semi-analytical wave-form models can be improved in the future by calibrating them to the precession sector of the SXS NR waveforms.
An interesting question is to examine the behavior of the precessing models outside the region in which their underlying aligned-spin waveforms were calibrated. To this effect we consider 1000 random cases between mass ratios q ∈ [1,20] and spin magnitudes χ 1,2 ∈ [0, 0.99] and compute M SNR between SEOBNRv4PHM and IMRPhenomPv3HM. Figure 14 shows the dependence of the unfaithfulness on the binary parameters, in particular the mass ratio, and the effective and precessing spins. We find that for mass ratios q < 8, 50% of cases have unfaithfulness below 2% and 90% have unfaithfulness below 10%. The unfaithfulness grows very fast with mass ratio and spin, with the highest unfaithfulness occurring at the highest mass ratio and precessing spin. This effect is enhanced due to the fact that we choose to start all the waveforms at the same frequency and for higher mass ratios, the number of cycles in band grows as 1/ν where ν is the symmetric mass ratio. These results demonstrate the importance of producing long NR simulations for large mass ratios and spins, which can be used to validate waveform models in this more extreme region of the parameter space. To design more accurate semi-analytical models in this particular region, it will be relevant to incorporate in the models the information from gravitational selfforce [104][105][106], and also test how the choice of the underlying EOB Hamiltonians with spin effects [107,108] affects the accuracy.
Finally, in Appendix A we quantify the agreement of the precessing multipolar waveform models SEOBNRv4PHM and IMRPhenomPv3HM against the NR surrogate model NRSur7dq4 [39], which was built for binaries with mass ratios 1-4, BH's spins up to 0.8 and binary's total masses larger than ∼ 60M . We find that the unfaithfulness between the semi-analytic models and the NR surrogate largely mirrors the results of the comparison in Figs. 12 and 13. Notably, as it can be seen in Fig. 17, the unfaithfulness is generally below 3% for both waveform families, but SEOBNRv4PHM outperforms IMRPhenomPv3HM with the former having a median at 3.3 × 10 −3 , while the latter is at 1.5 × 10 −2 .

V. BAYESIAN ANALYSIS WITH MULTIPOLAR PRECESSING WAVEFORM MODELS
We now study how the accuracy of the waveform model SEOBNRv4PHM (and also IMRPhenomPv3HM), which we have quantified in the previous section through the unfaithfulness, affects parameter inference when synthetic signal injections are performed. To this end, we employ two mock BBH signals and do not add any detector noise to them (i.e., we work in zero noise), which is equivalent to average over many different noise realizations. This choice avoids arbitrary biases introduced by a random-noise realization, and it is reasonable since the purpose of this analysis is to estimate possible biases of cases analyzed for this model is 1507 instead of 1523.  Fig. 10 and Fig 11. The solid (dotted) line represents the median (95%-percentile) of all cases. For all total masses, we find that the median mismatch with SEOBNRv4PHM is lower than 1%, about a factor of 2 lower than IMRPhenomPv3HM. The 95th-percentile shows a stronger dependence on total mass for SEOBNRv4PHM, with mismatches lower than IMRPhenomPv3HM at low and medium total masses, becoming comparable at the highest total masses. FIG. 13: The highest unfaithfulness over total mass for all cases shown in Fig. 12. The median of unfaithfulness is around 1% for SEOBNRv4PHM and 2% for IMRPhenomPv3HM (shown as dashed vertical lines). Note that for SEOBNRv4PHM, the worst unfaithfulness is below 10% and the distribution is shifted to lower values.
in the binary's parameters due to inaccuracies in waveform models.
We generate the first precessing-BBH mock signal with the NRSur7dq4 model. It has mass ratio q = 3 and a total source-frame mass of M = 70M . The spins of the two BHs are defined at a frequency of 20 Hz, and have components χ 1 = (0.30, 0.00, 0.50) and χ 2 = (0.20, 0.00, 0.30). The masses and spins" magnitudes (0.58 and 0.36) of this injection are compatible with those of BBH systems observed so far with LIGO and Virgo detectors [4][5][6][7][8]. Although the binary's parameters are not extreme, we choose the inclination with respect to the line of sight of the BBH to be ι = π/3, to emphasize the effect of higher modes. The coalescence and polarization phase, respectively φ and ψ, are chosen to be 1.2 rad and 0.7 rad. The sky-position is defined by its right ascension of 0.33 rad and its declination of -0.6 rad at a GPStime of 1249852257 s. Finally, the distance to the source is set by requesting a network-SNR of 50 in the three detectors (LIGO Hanford, LIGO Livingston and Virgo) when using the Advanced LIGO and Advanced Virgo PSD at design sensitivity [69]. The resulting distance is 800 Mpc. The unfaithfulness against this injection is 0.2% and 1% for SEOBNRv4PHM and IMRPhenomPv3HM, respectively. Although the value of the network-SNR is large for this synthetic signal, it is not excluded that the Advanced LIGO and Virgo detectors at design sensitivity could detect such loud BBH. With this study we want to test how our waveform model performs on a system with moderate precessional effect when detected with a large SNR value, considering that it has an unfaithfulness of 0.2%. For the second precessing-BBH mock signal, we use a binary with larger mass ratio and spin magnitude for the primary BH. We employ the NR waveform SXS:BBH:0165 from the public SXS catalog having mass ratio q = 6, and we choose the source-frame total mass M = 76M . The BH's spins, defined at a frequency of 20 Hz, have values χ 1 = (−0.06, 0.78, −0.47) and χ 2 = (0.08, −0.17, −0.23). The BBH system in this simulation has strong spin-precession effects. We highlight that this NR waveform is one of the worst cases in term of unfaithfulness against SEOBNRv4PHM, as it is clear from Fig. 10. For this injection we choose the binary's inclination to be edge-on at 20 Hz to strongly emphasize higher modes. All the other binary parameters are the same of the previous injection, with the exception of the luminosity distance, which in this case is set to be 1.2 Gpc to obtain a network-SNR of 21. The NR waveform used for this mock signal has unfaithfulness of 4.4% for SEOBNRv4PHM and 8.8% for IMRPhenomPv3HM, thus higher than in the first injection.
For the parameter-estimation study we use the software PyCBC's pycbc generate hwinj [109] to prepare the mock signals, and we perform the Bayesian analysis with parallel Bilby [110], a highly parallelized version of the parameterestimation software Bilby [111]. We choose a uniform prior in component masses in the range [5,150]M . Priors on the dimensionless spin magnitudes are uniform in [0, 0.99], while for the spin directions we use prior isotropically distributed on the unit sphere. The priors on the other parameters are the standard ones described in Appendix C.1 of Ref. [5].
We summarize in Fig. 15 the results of the parameter estimation for the first mock signal for SEOBNRv4PHM (blue), IMRPhenomPv3HM (red) and NRSur7dq4 (cyan). We report the marginalized 2D and 1D posteriors for the component masses m 1 and m 2 in the source frame (top left), the effective spin parameters χ eff and χ p (top right), the spin magnitude of the more massive BH a 1 and its tilt angle θ 1 (bottom left) and finally the angle θ JN and the luminosity distance (bottom right). In the 2D posteriors, solid contours represent 90% credible intervals and black dots show the value of the parameter used in the synthetic signal. In the 1D posteriors, they are represented respectively by dashed lines and black solid lines. As it is clear from Fig. 15, when using the waveform models SEOBNRv4PHM and NRSur7dq4, all the parameters of the synthetic signal are correctly measured within the statistical uncertainty. Moreover, the shape of the posterior distributions obtained when using SEOBNRv4PHM are similar to those recovered with NRSur7dq4 (the model used to create the synthetic signal). This means that the systematic error due to a non perfect modeling of the waveforms is negligible in this case.
For the model IMRPhenomPv3HM while masses and spins are correctly measured within the statistical uncertainty, the luminosity distance D L and the angle θ JN are biased. This is consistent with the prediction obtained using Lindblom's criterion in Refs. [100,[112][113][114] 9 . In fact, according to this criterion, an unfaithfulness of 1% for IMRPhenomPv3HM would be sufficient to produce biased results at a network-SNR of 19. Thus, it is expected to observe biases when using IMRPhenomPv3HM at the network-SNR of the injection, which is 50. In the case of SEOBNRv4PHM the unfaithfulness against the signal waveform is 0.2% and according to Lindblom's criterion we should also expect biases for network-SNRs larger than 42, but in practice we do not observe them. We remind that Lindblom's criterion is only approximate and it has been shown in Ref. [115] to be too conservative, therefore the lack of bias that we observe is not surprising.
In Fig. 16 we summarize the results of the second mocksignal injection. The plots are the same as in Fig. 15 with the only exception that we do not have results for the NRSur7dq4 model since it is not available in this region of the parameter 1)/(2SNR 2 ), where N intr is the number of binary's intrinsic parameters, which we take to be 8 for a precessing-BBH system. Note, however, that in practice this factor can be much larger, see discussion in Ref. [115].
space. In this case the unfaithfulness between SEOBNRv4PHM (IMRPhenomPv3HM) and the NR waveform used for the mock signal is 4.4% (8.8%). According to Lindblom's criterion, at the network-SNR of this mock signal we should expect the bias due to non-perfect waveform modeling to be dominant over the statistical uncertainty for an unfaithfulness 1%. Therefore we might expect some biases in inferring parameters for both models. Lindblom's criterion does not say which parameters are biased and by how much. The results in Fig. 16 clearly show that both models have biases in the measurement of some parameters, but unfaithfulness of 4.4% and 8.8% induce different amount of biases and also on different set of parameters (intrinsic and extrinsic).
In particular for the component masses (top left panel of Fig. 16), the 2D posterior distribution obtained with SEOBNRv4PHM barely include the value used for the mock signal in the 90% credible region. This measurement looks better when focusing on the 1D posterior distributions for the individual masses for which the injection values are well within the 90% credible intervals. The situation is worst for the IMRPhenomPv3HM model, for which the 2D posterior distribu- tion barely excludes the injection value at 90% credible level. In this case also the true value of m 1 is excluded from the 90% credible interval of the marginalized 1D posterior distribution. Furthermore, χ eff and χ p (top right panel of Fig. 16) are correctly measured with SEOBNRv4PHM while the measurement with IMRPhenomPv3HM excludes the true value from the 2D 90% credible region. From the 1D posterior distributions it is clear that the source of this inaccuracy is the incorrect measurement of χ p , while χ eff is correctly recovered within the 90% credible interval. A similar situation is observed in the measurement of a 1 the spin magnitude of the heavier BH and θ 1 its tilt angle (bottom left panel of Fig. 16). Also in this case SEOBNRv4PHM correctly measures the parameters used in the mock signal, while IMRPhenomPv3HM yields an incorrect measurement due to a bias in the estimation of a 1 . Finally, we focus on the measurement of the angle θ JN and the luminosity distance D L (bottom left panel of Fig. 16). In this case the value of these parameters used in the synthetic signal is just slightly measured within the 90% credible region of the 2D posterior distribution obtained with SEOBNRv4PHM. As a consequence the luminosity distance is also barely measured within the 90% credible interval from the marginalized 1D posterior distribution and the measured value of θ JN results outside the 90% credible interval of the 1D posterior distribution. The posterior distributions obtained using IMRPhenomPv3HM are instead correctly measuring the parameters of the mock signal. We can conclude that even with an unfaithfulness of 4.4% against the NR waveform used for the mock signal the SEOBNRv4PHM model is able to correctly measure most of the binary parameters, notably the intrinsic ones, such as masses and spins.

VI. CONCLUSIONS
In this paper we have developed and validated the first inspiral-merger-ringdown precessing waveform model in the EOB approach, SEOBNRv4PHM, that includes multipoles beyond the dominant quadrupole.
The improvement in accuracy between SEOBNRv4 and SEOBNRv3P (i.e., the models with only the = 2 modes) is evident from Fig. 8, where we have compared those models to the public SXS catalog of 1405 precessing NR waveforms, and the new 118 SXS NR waveforms produced for this work. The impact of including higher modes in semi-analytical models to achieve higher accuracy to multipolar NR waveforms is demonstrated in Fig. 9. Figures 10, 11, 12 and 14 quantify the comparison of the multipolar precessing SEOBNRv4PHM and IMRPhenomPv3HM to all SXS NR precessing waveforms at our disposal. We have found that for the SEOBNRv4PHM model, 94% (57% ) of the cases have maximum unfaithfulness value, in the total mass range 20-200M , below 3% (1%). Those numbers change to 83% (20% ) when using the IMRPhenomPv3HM. The better accuracy of SEOBNRv4PHM with respect to IMRPhenomPv3HM is also confirmed by the comparisons with the NR surrogate model NRSur7dq4, as shown in Fig. 17. We have investigated in which region of the parameter space the unfaithfulness against NR waveforms and NRSur7dq4 lies, and have found, not surprisingly, that it occurs where both mass ratios and spins are large (see Fig. 18). When comparing SEOBNRv4PHM and IMRPhenomPv3HM outside the region in which their corresponding aligned-spin underlying models were calibrated, we have also found that the largest differences reside when mass ratios are larger than 4 and spins larger than 0.8 (see Fig. 14). To improve the accuracy of the models in those more challenging regions, we would need NR simulations, but also more information from analytical methods, such as the gravitational self-force [104][105][106], and resummed EOB Hamiltonians with spins [107,108].
To quantify how the modeling inaccuracy, estimated by the unfaithfulness, impacts the inference of binary's parameters, we have perfomed two parameter-estimation studies using Bayesian analysis. Working with the Advanced LIGO and Virgo network at design sensitivity, we have injected in zero noise two precessing-BBH mock signals with mass ratio 3 and 6, having SNR of 50 and 21, with inclination of π/3 and π/2 with respect to the line of sight respectively, and recovered them with SEOBNRv4PHM and IMRPhenomPv3HM. The unfaithfulness values of those models against the synthetic signals considered (i.e., NRSurd7q4 and SXS:BBH:0165) range from 0.2% to 8.8%. The results are summarized in Figs. 15 and 16. Overall, we have found that Lindblom's criterion [100,[112][113][114][115] is too conservative and predicts visible biases at SNRs lower than what we have obtained through the Bayesian analysis. In particular, we have found, when doing inference with SEOBNRv4PHM, that an unfaithfulness of 0.2% may produce no biases up to SNR of 50, while an unfaithfulness of 2.2% can produce biases only for some extrinsic parameters, such as distance and inclination, but not for binary's masses and spins at SNR of 21. A more comprehensive Bayesian study will be needed to quantify, in a more realistic manner, the modeling systematics of SEOBNRv4PHM, if this model were used during the fourth observation run of Avanced LIGO and Virgo in 2022 (i.e., the run at design sensitivity).
The improvement in accuracy between SEOBNRv4 and SEOBNRv3P (i.e., the models with only the = 2 modes) is evident from Fig. 8, where we have compared those models to the public SXS catalog of 1405 precessing NR waveforms, and the new 118 SXS NR waveforms produced for this work. The impact of including higher modes in semi-analytical models to achieve higher accuracy to multipolar NR waveforms is demonstrated in Fig. 9. Figures 10, 11, 12 and 14 quantify the comparison of the multipolar precessing SEOBNRv4PHM and IMRPhenomPv3HM to all SXS NR precessing waveforms at our disposal. We have found that for the SEOBNRv4PHM model, 94% (57% ) of the cases have maximum unfaithfulness value, in the total mass range 20-200M , below 3% (1%). Those numbers change to 83% (20% ) when using the IMRPhenomPv3HM. We have found several cases with large unfaithfulness (> 10%) for IMRPhenomPv3HM, coming from a region of parameter space with q 4 and large ( 0.8) spins anti-aligned with the orbital angular momentum, which appear to be connected to unphysical features in the underlying precession model, and cause unusual oscillations in the waveform's amplitude and phase. The better accuracy of SEOBNRv4PHM with respect to IMRPhenomPv3HM is also confirmed by the comparisons with the NR surrogate model NRSur7dq4, as shown in Fig. 17. We have investigated in which region of the parameter space the unfaithfulness against NR waveforms and NRSur7dq4 lies, and have found, not surprisingly, that it occurs where both mass ratios and spins are large (see Fig. 18). When comparing SEOBNRv4PHM and IMRPhenomPv3HM outside the region in which the alignedspin underlying model was calibrated, we have also found that the largest differences reside when mass ratios are larger than 4 and spins larger than 0.8 (see Fig. 14). To improve the accuracy of the models in those more challenging regions, we would need NR simulations, but also more information from analytical methods, such as the gravitational self-force [104][105][106], and resummed EOB Hamiltonians with spins [107,108].
The newly produced 118 SXS NR waveforms extend the coverage of binary's parameter space, spanning mass ratios q = 1-4, (dimensionless) spins χ 1,2 = 0.3-0.9, and different orientations to maximize the number of precessional cycles. As we have emphasized, the waveform model SEOBNRv4HM is not calibrated to NR waveforms in the precessing sector, only the aligned-spin sector was calibrated in Refs. [32,33]. De- spite this, the accuracy of the model is very good, and it can be further improved in the future if we calibrate the model to the 1404 plus 118 SXS NR precessing waveforms at our disposal. This will be an important goal for the upcoming LIGO and Virgo O4 run in early 2022. Furthermore, SEOBNRv4HM assumes the following symmetry among modes h m = (−1) h * −m in the co-precessing frame, which however no longer holds in presence of precession. As discussed in Sec. II D, forcing this assumption causes unfaithfulness on the order of a few percent. Thus, to achieve better accuracy, when calibrating the model to NR waveforms, the mode-symmetry would need to be relaxed.
Finally, SEOBNRv4HM uses PN-resumed factorized modes that were developed for aligned-spin BBHs [88,89], thus they neglect the projection of the spins on the orbital plane. To obtain high-precision waveform models, it will be relevant to extend the factorized modes to precession. Considering the variety of GW signals that the improved sensitivity of LIGO and Virgo detectors is allowing to observe, it will also be important to include in the multipolar SEOBNR waveform models the more challenging (3,2) and (4, 3) modes, which are characterized my mode mixing [116][117][118]. Their contribution is no longer negligible for high total-mass and/or large mass-ratio binaries, especially if observed away from face-on (face-off).
Lastly, being a time-domain waveform model generated by solving ordinary differential equations, SEOBNRv4HM is not a fast waveform model, especially for low total-mass binaries. To speed up the waveform generation, a reduced-order modeling version has been recently developed [119]. Alternative methods that employ a fast evolution of the EOB Hamilton equations in the post-adiabatic approximation during the long inspiral phase have been suggested [120], and we are currently implementing them in the simpler nonprecessing limit in LAL.

Acknowledgments
It is our pleasure to thank Andrew Matas for providing us with the scripts to make the parameter-estimation plots, and Sebastian Khan for useful discussions on the faithfulness calculation. We would also like to thank the SXS collaboration for help and support with the SpEC code in producing the new NR simulations presented in this paper, and for making the large catalog of BBH simulations publicly available. The new 118 SXS NR simulations were produced using the high-performance compute (HPC) cluster Minerva at the Max Planck Institute for Gravitational Physics in Potsdam, on the Hydra cluster at the Max Planck Society at the Garching facility, and on the SciNet cluster at the University of Toronto. The data-analysis studies were obtained with the HPC clusters Hypatia and Minerva at the Max Planck Institute for Gravitational Physics. The transformation and manipulation of waveforms were done using the GWFrames package [72,121]. faithfulness below 1% and IMRPhenomPv3HM about a factor of 3 larger. The right panel of Fig. 17 shows the maximum unfaithfulness distribution and the same trends are also observed. SEOBNRv4PHM outperforms IMRPhenomPv3HM, with the median of the former being 4 times smaller than the one of the latter. Finally, to gain further insight into the behavior of the waveform models across the parameter space, we show in Fig. 18 the maximum unfaithfulness as a function of mass ratio and the effective spin.