A multimode quasi-normal spectrum from a perturbed black hole

When two black holes merge, the late stage of gravitational wave emission is a superposition of exponentially damped sinusoids. According to the black hole no-hair theorem, this ringdown spectrum depends only on the mass and angular momentum of the final black hole. An observation of more than one ringdown mode can test this fundamental prediction of general relativity. Here we provide strong observational evidence for a multimode black hole ringdown spectrum using the gravitational wave event GW190521, with a maximum Bayes factor of $56\pm1$ ($1\sigma$ uncertainty) preferring two fundamental modes over one. The dominant mode is the $\ell=m=2$ harmonic, and the sub-dominant mode corresponds to the $\ell=m=3$ harmonic. The amplitude of this mode relative to the dominant harmonic is estimated to be $A_{330}/A_{220} = 0.2^{+0.2}_{-0.1}$. We estimate the redshifted mass and dimensionless spin of the final black hole as $330_{-40}^{+30}~\mathrm{M}_{\odot}$ and $0.86_{-0.11}^{+0.06}$, respectively. We find that the final black hole is consistent with the no hair theorem and constrain the fractional deviation from general relativity of the sub-dominant mode's frequency to be $-0.01^{+0.08}_{-0.09}$.


INTRODUCTION
A perturbed black hole approaches equilibrium by emitting a spectrum of damped sinusoidal gravitational-wave signals [1][2][3].Unlike other astrophysical objects, the ringdown spectrum of a black hole is remarkably simple.General relativity predicts that the frequencies and damping times of the entire spectrum of damped sinusoids, or "quasi-normal modes", are fully determined by just two numbers: the black hole mass M and angular momentum J, as described by the Kerr solution [4].This prediction, a consequence of the black hole "no-hair theorem", does not hold in many alternate theories [5].If astrophysical black holes are observed to violate this property, it indicates new physics beyond standard general relativity.
In order to observationally test this prediction using binary black hole mergers, an important observational challenge must be met: at least two ringdown modes must be observed [6].The higher the binary mass ratio asymmetry, the more likely it is that sub-dominant ringdown modes are observable.However, more asymmetric binary systems are less likely to be formed, and also lead to weaker signals.Population studies suggested that such multimode ringdown modes were unlikely to be observed until the next generation of gravitational-wave observatories [7,8], since black-hole population models did not anticipate observations of massive, asymmetric binaries.
Here we confound this expectation with the gravitational-wave event GW190521, detected by the two LIGO detectors and Virgo at 03:02:29 UTC on May 21st, 2019 [9,10].This is the heaviest confidently detected black-hole merger event observed to date [11,12].The signal is consistent with the merger of two high mass black holes which merge at a low frequency relative to the detector sensitivity band.As such, it has a barely observable inspiral; the signal is dominated by the merger and ringdown phase.
GW190521 was initially reported as the merger of two comparable mass black holes [9,10].If the spins are aligned, one would not expect to detect sub-dominant ringdown modes.However, there was significant posterior support for precession in GW190521.Furthermore, subsequent re-analysis of the data found that the progenitor masses could have been unequal [13,30].Both of these scenarios suggest the possibility of detectable subdominant modes [15,16].Here we find strong evidence for multimode damped sinusoids in the ringdown phase of the gravitational wave event GW190521.

MULTIMODE AGNOSTIC SEARCH
A quasi-normal mode description of the gravitational wave from a binary black hole is not expected to be valid until after the binary has merged to form a perturbed black hole.On the flip side, the damping time of an O(100 M ⊙ ) black hole is O(10 ms), leaving a window of only a few tens of milliseconds after merger in which the ringdown is detectable above noise.Accurate identification of the merger time is therefore crucial to extract quasi-normal modes from the data.To account for uncertainty in the merger time of GW190521 due to modelling systematics, we perform a series of analyses in short time increments starting at a geocentric GPS reference time t ref = 1242442967.445.This time is taken from the maximum likelihood merger time obtained via the analysis in Nitz & Capano [13].We also fix the sky location to the maximum likelihood values from the same analysis.
The ringdown spectrum of a Kerr black hole consists of an infinite set of frequencies f ℓmn and damping times τ ℓmn labeled by three integers (ℓ, m, n).Here ℓ and m are the usual angular harmonic numbers.The third index n ≥ 0 denotes overtones, with n = 0 being the fundamental mode.The most agnostic way to search for quasi-normal modes from a perturbed black hole is to search for them individually, without assuming any relation between them.Such a search is complicated by the nature of quasi-normal modes: they are not orthogonal, meaning that modes that overlap in time must be sufficiently separated in frequency or damping time in order to be distinguishable.Simulations of binary black hole mergers have shown that the fundamental ℓ = m = 2 mode is typically significantly louder than other modes.In order to extract sub-dominant modes from noisy data in an agnostic search it is useful to separate the dominant mode in frequency from the others.A visual inspection of the time-and frequency-domain data taken at the reference time revealed significant power in the two LIGO detectors between 60-70 Hz (see the Supplemental Material).In order to isolate this and search for sub-dominant modes we constructed three frequency ranges: "range A" between 50 − 80 Hz, "range B" between 80−256 Hz, and "range C" between 15−50 Hz.We search for one quasi-normal mode in each range using Bayesian inference.We use uniform priors on the relative amplitudes of the modes in range B and C between 0 and 0.9 times the mode in range A. No other relation is assumed between the modes.
We repeat this analysis at time steps of t ref + 0, 6, 12, 18, and 24 ms.As expected from the visual inspection of the data, we find a significant mode in range A at all grid points, which decreases in amplitude at later times.A clear second mode is found in range B. This mode is most visible at t ref + 6 ms, the result of which is shown in Fig. 1 (results at other times are shown in the Supplemental Material).The frequency of the secondary mode at this time is 98 +89 −7 Hz with a damping time of 40 +50 −30 ms, while the primary mode has a frequency of 63 +2 −2 Hz and damping time 26 +8 −6 ms.The signal-to-noise ratio (SNR) of the primary and secondary modes of the maximum likelihood waveform is 12.2 and 4.1, respectively.Results from range C (not shown) are consistent with noise.
The dominant mode found at 63 +2 −2 Hz is expected to be the quadrupolar ℓ = m = 2, n = 0 fundamental mode.Measurement of f 220 and τ 220 provides an estimate of the mass and angular momentum of the remnant black hole [17].This in turn predicts the entire ringdown spectrum of subdominant modes.Figure 1 shows that the subdominant mode at 98 +89 −7 Hz is consistent with the ℓ = m = 3, n = 0 mode.This is also in agreement with expectations from numerical simulations of binary black hole mergers [15,18,19].
To quantify the agreement between the expected (330) mode and the observed mode in Range B, we multiply the observed posterior in range B (color map in Fig. 1) with the expected distribution using range A (indicated by the blue contour in Fig. 1) and integrate [20].This yields a statistic ζ that is proportionate to the agreement between the expected mode and the observed mode.Repeating on all the grid points, we find that ζ obtains a maximum value of 1.27 at t ref +6 ms, consistent with our visual inspection.In Ref. [20] we perform a large simulation campaign in the data surrounding GW190521.Repeating the agnostic analysis on these simulations, we find that the probability of finding ζ ≥ 1.27 in noise is ∼ 0.004.

CONSISTENCY WITH THE KERR SOLUTION
The search for damped sinusoids in the previous section assumed no particular relation between different modes, with a corresponding large prior parameter volume.In this section, we assume that the frequency and damping times of the damped sinusoids are related as in the ringdown of a Kerr black hole.This reduces the prior parameter volume and focuses in on particular modes.The amplitudes and phases of the modes are left as free parameters, since they depend on the specific initial state of the remnant black hole immediately after the merger.
For this analysis, we model the ringdown signal based on the final Kerr black hole mass, M f , and dimensionless spin, χ f = J f /M 2 f .We expect only a subset of the entire spectrum of quasi-normal modes to be visible above noise.Including all possible modes in our signal model can lead to overfitting the data.For this reason we perform several analyses which include different combinations of the (330), (440), (210), and (550) modes, in addition to the dominant (220) mode.Numerical simulations of binary black hole mergers have generally shown these modes to be the strongest [19].Giesler et al. [31] showed that including overtones of the dominant harmonic allows a quasi-normal mode description of the signal to be used at earlier times, close to merger.We therefore also perform analyses in which we include the first overtone of the dominant harmonic (ℓmn) = (221).We use a prior range on the (330) amplitude of [0, 0.5] A 220 , while for the (221) mode we use [0, 5] A 220 ; prior ranges for the other modes can be found in the data release accompanying this paper.These choices for the amplitude priors are sufficiently broad that they comfortably include results from numerical simulations [19,31].Other prior choices are possible; e.g., a broader amplitude prior was adopted in Ref. [27].
We repeat these analyses in 1 ms intervals between t ref + [−9 ms, 24 ms].We use Bayes factors to determine which model is most favored at each time step.For the model that includes the fundamental dominant harmonic (220) and its overtone (221), the Bayes factor is evaluated against the model with only the (220) mode.For models that include the (330) (or other sub-dominant modes), the Bayes factor is evaluated against the stronger of the (220) or ( 220)+(221) models.
The Bayes factors for the various multimode Kerr models are shown in Fig. 2. Consistent with the agnostic results, we find strong evidence for the presence of the (330) mode, with the Bayes factor for the (220) + (330) model peaking at 56 ± 1 at t ref + 6 ms.The marginal posterior on the (330) amplitude is peaked away from zero at this time with a value of A 330 /A 220 = 0.2 +0.2 −0.1 ; see Fig. 3.The maximum likelihood ringdown waveforms at this time are shown in the Supplemental Material.
A Bayes factor of ∼ 50 should correspond to a false alarm probability of 0.02.In Ref. [20] we validate this by repeating the analysis on a large number of simulated signals added to off-source data surrounding GW190521.There, we find that the distribution of Bayes factors in noise matches expectations; e.g., the probability of obtaining a maximized Bayes factor as large as 56 from noise is ∼ 0.02.Our quoted Bayes factor is therefore robust against background noise fluctuations.
As can be seen in Fig. 2, the model that includes the (220), (330), and (210) modes is nearly as strong as the model with just the (220) and (330) mode in it, and is slightly favored at t ref + 18 ms, indicating some support for the presence of the (210) mode in addition to the (330).However, the ratio of evidences between the (220)+(330) and the (220)+(330)+(210) models is order unity -i.e., the data is uninformative as to whether the (210) mode is observable in addition to the (330).A model consisting of just the (220) and ( 210) is disfavored at all times compared to any involving the (330), as can be seen in Fig. 2. We therefore only claim detection of the (330) mode, and make no claim regarding the observability of the (210) mode.
Figure 4 shows the redshifted mass and the dimensionless spin of the final black hole, measured with the (220) + (330) Kerr model at 6 ms after t ref .We find that the remnant black hole has a redshifted mass −40 M ⊙ and dimensionless spin χ f = 0.86 +0.06 −0.11 .If a quasi-normal model without overtones is used too close to merger, the resulting final mass estimate can be biased toward larger values [3,31].We find the final mass estimate to be stable between 6 and 12 ms using the (220) + (330) model (see the Supplemental Material for plot).This indicates that by this time the black hole has reached a regime of constant ringdown frequency -a requirement for the validity of linear-regime, quasi-normal modes.
Given the strong evidence for the presence of the (330) mode at t ref +6 ms, we can perform the classic no-hair the- orem test [6] (see also Ref. [32] for a reformulation in terms of parametric deviations).Here, we keep the dependence of f 220 and τ 220 on (M f , χ f ) as in the Kerr solution but introduce fractional deviations δf 330 and δτ 330 of f 330 and τ 330 , respectively.We find good agreement with general relativity.Figure 4 shows the Kerr black hole mass M f and dimensionless spin χ f associated to the (330) mode frequency f 330 (1 + δf 330 ) and damping time τ 330 (1 + δτ 330 ) measured at 6 ms after t ref .Plots of the posterior distributions on the fractional deviations are provided in the Supplemental Material.We constrain the fractional deviation from Kerr to δf 330 = −0.01+0.08 −0.09 .The damping time is only weakly constrained, with δτ 330 = 0.6 +1.9 −1.2 .

DISCUSSION AND CONCLUSIONS
The redshifted final mass of GW190521 measured by the LIGO and Virgo Collaborations (LVC) using a (220) ringdown fit was (1 + z)M f = 282.2+50.0 −61.9 M ⊙ , or 259.2 +36.6 −29.0 M ⊙ when analyzing the full signal [11].The low-mass-ratio part of the posterior of Nitz and Capano [13] found (1 + z)M f ∼ 260M ⊙ using the full signal [21,22].These results are somewhat in tension with the final mass and spin inferred from the ringdown modes found here.
However, the complete waveform models used in the above analyses may not include all relevant physical ef-fects.This, coupled with the fact that GW190521 has a very short inspiral signal, can lead to systematic errors for parameter estimation.For example, the waveform models used in the LVC analysis and Nitz & Capano assume quasicircular orbits, but several studies have indicated that the binary may have been eccentric at merger [23][24][25].These studies have also found slightly larger estimates for the binary's total mass, making them more consistent with our estimate for the final mass.Even without eccentricity, the reanalysis in Estelles et al. [30] using a recalibrated timedomain model found a bimodal distribution for the final mass and spin.One of these modes yields similar estimates for the mass and spin as we obtain here; see Ref. [20] for a more detailed comparison.
The ringdown waveforms used in this paper are simpler and more robust than full inspiral-merger-ringdown models for signals like GW190521, provided they are applied sufficiently late in the post-merger regime.This argument would tend to favor the estimates derived in this paper for the total mass.Nevertheless, a full resolution of this tension is beyond the scope of this work.
Evidence for overtones of the (220) mode very close to merger were previously found for the events GW150914 [26] and GW190521 074359 [27] (not to be confused with GW190521), although the strength of the evidence for the overtone is disputed [28,29].Black hole spectroscopy tests showed consistency with the Kerr hypothesis for these events [26,27].However, the resulting constraints were weaker than what we find with the (330) mode here.Furthermore, while there is strong numerical evidence for the presence of ringdown overtones close to the merger [31], a number of theoretical questions remain as to the validity of a quasi-normal description of the black hole close to merger [33][34][35][36][37].
The true nature of the gravitational wave event GW190521 has been the subject of much speculation [38][39][40].The interpretation of GW190521 as a head-on collision of two highly spinning Proca stars [40] predicts the presence of a (200) mode [41].We do not find evidence for such a mode.Additionally, the high-mass, multiple-mode ringdown signal observed here does not agree with the scenario of a very massive star collapsing to a black hole of mass ∼ 50M ⊙ and an unstable massive disk [42].
Expectations based on population models were that black hole ringdown signals with multiple modes were unlikely to be observed with the Advanced LIGO and Virgo detectors [7,8] (although those population models did not include massive binaries).However, Forteza et al. [15] predicted that for even moderately asymmetric binaries (with mass ratios ≳ 1.2), the (330) mode would be the best observable mode.They further predicted that the (330) mode's frequency could be constrained to the ∼ 10% level if the ringdown SNR is ≳ 8. Our results are remarkably consistent with this prediction: we get a ringdown SNR for GW190521 of ∼ 12, and we constrain the (330) frequency to be within ∼ 10% of the expected value from General Relativity.
In summary, we have shown that GW190521 displays a distinct subdominant mode and that this mode is consistent with the (330) ringdown mode of a Kerr black hole.

SUPPLEMENTAL MATERIAL The ringdown signal model
In the quasi-normal mode (QNM) spectrum of a perturbed Kerr black hole, the allowed frequencies f ℓmn = ω ℓmn /(2π) and damping times τ ℓmn are labeled by three integers ℓ = 2, 3, . . ., −ℓ ≤ m ≤ ℓ, and n = 0, 1, 2, . ... These can be combined together in a complex frequency Ω ℓmn = ω ℓmn + i/τ ℓmn such that the ringdown signal of a perturbed Kerr black hole can be expressed as a sum of damped sinusoids: where h + and h × are the plus and cross polarizations of the gravitational wave, M f is the mass of the black hole in the detector frame and D L is the luminosity distance to the source.The functions −2 S ℓmn (ι, φ; χ f ) are the spinweighted spheroidal harmonics of spin weight −2, which depend on the inclination angle ι between the black hole spin and the line-of-sight from the source to the observer, the azimuth angle φ between the black hole and the observer, and the dimensionless spin χ f of the black hole.
The complex QNM frequencies Ω ℓmn can be determined from the Teukolsky equation [43,44].According to the nohair theorem, the frequencies and damping times are determined by the mass M f and spin χ f of the black hole, with χ f ∈ (−1, 1).Positive (negative) spin means the perturbation is co(counter)-rotating with respect to the black hole.The amplitudes A ℓmn and phases ϕ ℓmn depend on the initial perturbation and take different values for different ℓmn modes.
For a given ℓ and n, the +m and −m modes are related to each other by ω ℓ−mn = −ω ℓmn and τ ℓ−mn = τ ℓmn [17].Furthermore, if the initial perturbation is symmetric under reflection at the equatorial plane, the amplitude and phase of the ±m modes are related to each other by A ℓ−mn e iϕ ℓ−mn = (−1) ℓ A ℓmn e −iϕ ℓmn .Numerical simulations of black hole binaries have shown that it is difficult to break reflection symmetry in the ringdown [45].To simultaneously sum over the ±m modes for a given ℓn, we parameterize the waveform as where A 0 ℓ|m|n is the intrinsic amplitude of the (ℓmn) mode and If the parameters ∆β ℓmn and ∆ϕ ℓmn are both zero, the waveform reduces to the case found for reflection symmetry.
When using the Kerr model, we do two analyses at several times: one with reflection symmetry, for which ∆β ℓmn and ∆ϕ ℓmn are both set to zero, and one in which a common ∆β and ∆ϕ for all modes are allowed to vary uniformly between [−π/4, π/4) and [−π, π), respectively.
We find the analysis with reflection symmetry is slightly favored by the data.For example, the maximum (330) Bayes (which also peaks at t ref + 6 ms) is 42 when reflection symmetry is broken, compared to 56 when reflection symmetry is assumed (as discussed in the main text).We also find that we recover consistent values for parameters -final mass, spin, and (330) amplitude -for both models, and that ∆β and ∆ϕ are centered on zero in the model without reflection symmetry.We therefore only report results from the model with reflection symmetry in the main text.
In all analyses we fix the azimuthal angle φ = 0, as it is degenerate with the modes' initial phases.To obtain the frequency and damping times for a given mass and spin we use tabulated values from Berti et al. [17], which we interpolate using a cubic spline.For the spheroidal harmonics we use tabulated values of the angular separation constants (also from Berti et al. [17]) and solve the recursion formula given in Leaver [44].Our code for doing this is publicly available on GitHub [46].
For the agnostic analysis, we do not assume any mode corresponds to any particular ℓmn.We therefore use arbitrary complex numbers X ℓ±mn = e iψ ℓ±mn in place of the −2 S ℓ±mn .Here, the ψ ℓ±mn are allowed to vary uniformly in [0, 2π).We vary a common ∆β parameter, but fix the ∆ϕ parameter to zero, as it is degenerate with the X ℓmn .

Computational analysis methods
Standard parameter estimation with gravitational waves begins with Bayes' theorem.Given some data s and a signal model h that depends on some set of parameters λ, we wish to know the posterior probability density function p(λ|s, h).Applying Bayes' theorem we have where p(s|λ, h) is the likelihood function, p(λ|h) is the prior, and Z is a normalization constant known as the evidence.Estimates on a single parameter are obtained by marginalizing the posterior over all other parameters; marginalizing over all parameters yields the evidence.Taking the ratio of evidences Z A /Z B for two different signal models yields the "Bayes factor".If our prior belief for the validity of the two models is the same, the Bayes factor gives the odds that model A is favored over model B. Using a scale by Kass and Raftery [47], a Bayes factor greater than 3.2 is considered "substantial" evidence in favor of model A; greater than 10 is "strong" evidence; greater than 100 is "decisive".
Evaluating the posterior requires a likelihood function p(s|λ, h).Consider a gravitational-wave detector, which we sample every ∆t seconds over a time T to obtain N = ⌈T /∆t⌉ time-ordered samples s = {s 0 , ..., s N −1 }.A network of K detectors sampled in this way will produce a set of samples s net = {s 1 , ..., s K }.To obtain a likelihood function we first consider the hypothesis that the set of samples only contain noise p(s net |n) = p(n net ).
In gravitational-wave astronomy it is common to assume that, in the absence of a signal, the detectors output stochastic Gaussian noise that has zero mean and is independent across detectors.Under this assumption the probability density function describing the network of time-ordered noise samples n net is a product of K N −dimensional multivariate normal distributions, Here, C d is the covariance matrix of the noise in detector d.In order to evaluate this function it is necessary to know what the inverse of the covariance matrix is.
If we assume that a detector's noise is wide-sense stationary and ergodic, then its covariance C is a symmetric Toeplitz matrix with elements given by the autocorrelation function of the data.If the autocorrelation function goes to zero in some finite amount of time that is less than T /2 (for the LIGO and Virgo detectors, this typically happens within a few seconds), then the covariance matrix is asymptotically equivalent to a circulant matrix.The inverse of the covariance matrix can then be well-approximated by that of the equivalent circulant matrix.All circulant matrices have the same eigenvectors, e −2πikp/N / √ N [48].Solving for the eigenvalues yields an analytic expression for C −1 : the j, k-th element is the discrete inverse Fourier transform of 1/S n evaluated at the k − j time step, where S n is the power spectral density of the detector's noise.Substituting this back into Eq.( 4), yields a canonical likelihood function for the noise [49], Here, the inner product ⟨•, •⟩ is defined as where ũ is the discrete Fourier transform of the time series u and * means complex conjugation 1 .The signal hypothesis is that the data consists of the signal plus the noise.The likelihood function p(s|λ, h) is therefore Eq. ( 6) with the n d replaced by the residuals s d −h d .However, this assumes that h is an accurate model of the signal across the entire observation time T .As stated above, quasi-normal modes only model the gravitational wave from a binary black hole after the merger, when the two component black holes have formed a single, perturbed black hole.Performing Bayesian inference using quasinormal modes as the signal model therefore requires excising times from the data when the ringdown prescription is not valid.In other words, instead of considering the full set of time samples s = {s 0 , ..., s N −1 }, we wish to only evaluate the truncated set s tr = {s 0 , ..., s a , s a+M , ..., s N −1 }, with M > 1.The data between the time steps [a, a + M ) is said to be "gated".
The probability density function of the truncated noise is still a multivariate normal distribution (excising dimensions from a multivariate normal is equivalent to marginalizing over those dimensions), and so Eq. ( 4) still applies.The challenge is that the covariance matrix of the truncated noise C tr is no longer Toeplitz.Its eigenvectors can no longer be approximated by that of a circulant matrix, and so the expression for the likelihood Eq. ( 6) is no longer valid.The inverse of the covariance matrix needs to be found by other means.One possibility is to numerically invert the covariance matrix.However, this is numerically unstable due to the large dynamic range of the matrix's elements, and computationally impractical for observation times of more than a ∼second.Instead, we use "gating and in-painting" to find the likelihood of the truncated time series.This was applied to the problem of matched filtering in Zackay et al. [51].Here we apply it to parameter estimation.
Define n ′ = n g +x, where n g is the noise with the gated times t ∈ [a, a + M )∆t zeroed out, and x is a vector that is zero everywhere except in the gated times.If the non-zero elements of x are such that (C −1 n ′ )[k] = 0 for all k ∈ [a, a + M ), then n ′T C −1 n ′ will be the same as the truncated version n T tr C −1 tr n tr .Our aim is to solve the equation C −1 (n g + x) = 0 in the gated region.Since x is zero outside of the gated region, C −1 x only involves the [a, a + M ) rows and columns of C −1 , which form an M × M Toeplitz matrix [cf.Eq. ( 5)].We therefore solve for x such that where the overbar indicates the [a, a + M ) rows (and columns) of the given vector (matrix).This can be solved numerically using a Toeplitz solver [56,57].Adding x to the gated noise ("in painting") will then yield the same result as if we had truncated the noise and the covariance matrix.
Note that if the gate spans the entire beginning of the data segment, the truncated covariance matrix C tr is Toeplitz, and so could be inverted numerically using a Toeplitz solver.This is the method used by Isi et al. [26].The advantage of using in-painting is that it involves solving for an M × M matrix rather than an (N − M ) × (N − M ) matrix.Gating and in-painting also have other applications beyond what we use it for here, such as excising glitches from data when doing parameter estimation.
To evaluate the likelihood for a signal, we use n g = s g − h g (i.e., the residual with the gated region zeroed out) in Eq. ( 8) and solve for x.We can then use x + s g − h g in the standard likelihood, Eq. ( 6).We do not attempt to normalize the likelihood, which would involve finding the determinant of the truncated covariance matrix.For this reason, we calculate and report Bayes factors by comparing models that start at the same time offset from t ref , for which the determinant cancels.
We use the open source PyCBC Inference library for performing Bayesian inference [52,53], to which we have added the gated likelihood described above.For all analyses we use a gate of two seconds, ending at the start time of the ringdown.For sampling the parameter space we use the dynesty nested sampler [54] and we numerically marginalize over polarization to speed up convergence.To better estimate the Bayes factor of the (220)+(330) model, we repeat the (220), (220)+(221), and (220)+(330) analyses 11 times, with one of the 11 using twice as many live points as the other 10.Bayes factors are averaged over the 11 analyses to reduce the uncertainty; parameter estimates are quoted using the run with the larger number of live points.
We use data for the event GW190521 made publicly available by the Gravitational Wave Open Science Center [55].We fix the sky location to the values given by the maximum likelihood result of Nitz & Capano [13], although we have obtained similar results using the LVC's maximum likelihood sky location [10].When we use that sky location the (220)+(330) Bayes factor peaks at the same time as reported in our work, with a value of ∼ 45.
We use a geocentric GPS reference time of t ref = 1242442967.445[13].With the sky location used in our analyses, this corresponds to the detector GPS reference times 1242442967 + 0.4259 at LIGO Hanford, +0.4243 at LIGO Livingston and +0.4361 at Virgo.Credible intervals in the text are quoted to 90%.Despite the tighter prior constraints than that used for the (330) mode, we obtain a larger 90% credible interval on both δf 221 and δτ 221 .The posterior also peaks toward the prior boundaries.This may be due to the noise at low frequency, or may indicate that the signal is not fully captured by a sum of quasi-normal modes at this time.

FIG. 1 .
FIG. 1. Marginal posterior probability distributions on frequency and damping time from an agnostic quasi-normal mode analysis of GW190521 at 6ms after t ref .A single mode is searched for in each of the shown frequency ranges, range A (50 to 80Hz) and range B (80 to 256Hz).Top panels show the marginal posterior on the mode frequencies, with priors indicated by dotted lines.White dotted (dashed) contours in the bottom panels show the 50th (90th) credible regions.Assuming the dominant mode in range A corresponds to the (220) mode of a Kerr black hole, we estimate what the frequency and damping times would be of the (330), (440), and (550) modes (blue, green, and red regions, respectively).The mode in range B is clearly consistent with the expected frequency and damping time of the (330) mode.Here we do not see the (440) and (550) modes, indicating they are weaker than the (330) mode.This is consistent with an asymmetric binary black hole merger.

FIG. 2 .
FIG. 2. Bayes factor of Kerr models that include the indicated modes.The Bayes factor for the (220) + (221) model is calculated against the (220)-only model.For all other models, the Bayes factors are calculated by comparing the evidence for the model against the maximum of the (220)-only model and the (220)+(221) model.The hexagon marks the point with the largest overall Bayes factor, which is for the (220) + (330) model.Quoted mass and spin estimates are taken at this point, as well as the no-hair test described in the text.

FIG. 3 .
FIG. 3. Marginal posterior distribution of the amplitude ratio of the (330) mode relative to the (220) mode, A 330 /A 220 .The gray dotted line shows the prior, which was uniform in [0, 0.5).Vertical dashed lines indicate the 90% credible interval.The posterior distribution is obtained from the (220) + (330) model starting at t ref + 6 ms (red hexagon in Fig. 2), which is the most favored model in the Kerr analysis.

FIG. 4 .
FIG. 4. Posterior distribution of final redshifted mass (1 + z)M f and dimensionless spin χ f measured at 6 ms after t ref assuming the identified modes are the (220) and (330) modes of a Kerr black hole.Dashed lines indicate the 90% credible interval.For the Kerr with δ(330) results, we use fitting formulae [17] to convert the frequency f 330 (1 + δf 330 ) and damping time τ 330 (1 + δτ 330 ) into mass and spin.

FIG. S. 4 .
FIG. S.4.Spectra plots at 0, 6, 12, and 18 ms showing marginal posterior distributions from frequency ranges A, B, and C in the agnostic analysis.Also shown are the expected regions for the (210) (magenta), (330) (blue), (440) (green), and (550) (red) modes, assuming the peak in region A is the (220) mode.A mode is clearly visible in range A and in range B; C is consistent with noise (note the small difference between the posterior and prior in range C).Both the mode in range A and the mode in range B shift to higher frequency between 0 and 6 ms, then remain largely constant while the posterior widens.A secondary peak consistent with the (440) mode is noticeable at 18 ms, but the signal is weak at this point.
Whitened data in each detector.Dark traces show the data gated at the reference time t ref in each detector, which is indicated by the gray vertical lines.Right: Frequency domain representation of the Hanford and Livingston data shown in the left panel.Also shown is the frequency domain representation at an off-source time, one second later.The signal is clearly visible in the LIGO time-domain data, and is seen as a spike in the frequency domain data between ∼ 60 − 70 Hz.The primary frequency bin ("A") boundaries were set to isolate this spike.Frequencies below (region "C") and above (region "B") were searched for additional QNMs in the agnostic analysis.Whitened data in each detector, with a gate applied at 6 ms (gray lines) after the detector reference time t FIG. S.2.Left: ref (gray dotted lines).Semi-transparent traces show the whitened data without the gate.Plotted are the maximum likelihood waveforms using just the (220) mode (black) and the (220) plus (330) mode (orange).