Quasinormal modes and their overtones at the common horizon in a binary black hole merger

It is expected that all astrophysical black holes in equilibrium are well described by the Kerr solution. Moreover, any black hole far away from equilibrium, such as one initially formed in a compact binary merger or by the collapse of a massive star, will eventually reach a final equilibrium Kerr state. At sufficiently late times in this process of reaching equilibrium, we expect that the black hole is modeled as a perturbation around the final state. The emitted gravitational waves will then be damped sinusoids with frequencies and damping times given by the quasinormal mode spectrum of the final Kerr black hole. An observational test of this scenario, often referred to as black hole spectroscopy, is one of the major goals of gravitational wave astronomy. It was recently suggested that the quasinormal mode description including the higher overtones might hold even right after the remnant black hole is first formed. At these times, the black hole is expected to be highly dynamical and nonlinear effects are likely to be important. In this paper we investigate this remarkable scenario in terms of the horizon dynamics. Working with high accuracy simulations of a simple configuration, namely the head-on collision of two nonspinning black holes with unequal masses, we study the dynamics of the final common horizon in terms of its shear and its multipole moments. We show that they are indeed well described by a superposition of ringdown modes as long as a sufficiently large number of higher overtones are included. This description holds even for the highly dynamical final black hole shortly after its formation. We discuss the implications and caveats of this result for black hole spectroscopy and for our understanding of the approach to equilibrium.


I. INTRODUCTION
The process of binary black hole coalescence, the formation of a remnant black hole and the associated emission of gravitational waves, provides a rich arena for tests of general relativity (GR). The inspiral regime where we have two distinct black holes inspiralling into each other is well described by the post-Newtonian approximation. A useful framework for tests of general relativity in this regime is provided by the parametrized post-Newtonian framework. It can be argued however that it is the merger regime, which involves the formation of the remnant black hole and its approach to equilibrium, that is the most promising in the search for new physics. It is during the merger that the nonlinear and nonperturbative effects of general relativity are most prominent. Moreover, the approach of the remnant black hole to equilibrium is closely related to one of the important predictions of general relativity, namely the so-called black hole no-hair theorem (see e.g. [1][2][3] for reviews with diverse viewpoints). The final state of the remnant black hole in astrophysical situations is predicted to be a Kerr black hole determined by just two parameters, namely the final mass and angular momentum. When the remnant black hole is initially formed, the spacetime in the vicinity of the horizon is highly dynamical and nonlinear, and it is responsible for the emitted gravitational radiation. In classical general relativity, the horizon itself cannot emit any radiation. Rather, it absorbs part of the emitted radiation to reach equilibrium. The gravitational wave emission at late times during this approach to equilibrium is expected to be described by a superposition of exponentially damped sinusoidal signals, with the frequency and damping times determined just by the final black hole's mass and angular momentum [4][5][6] (we can neglect the electric charge for astrophysical black holes). It is an important goal of gravitational wave astronomy to verify (or disprove) this scenario observationally.
Towards this goal, the notion of "black hole spectroscopy" has been proposed [7][8][9]. The basic idea is straightforward: Given that the ringdown frequencies and damping times are determined by just two parameters, if we are able to observe multiple ringdown modes, then the masses and spins inferred from each mode must be consistent. This is then potentially a stringent test of the no-hair theorem; see e.g. [3] for a more detailed discussion. Moreover, the test applies in principle to any astrophysical process which leads to the formation of a remnant black hole which approaches equilibrium. A binary black hole merger is the obvious target, but it also applies to binary neutron star mergers or the gravitational collapse of sufficiently massive stars. In its original formulation, it was assumed that black hole spectroscopy should only work once the black hole is sufficiently close to equilibrium. Consider for example the remnant black hole formed from a binary black hole merger. When the final black hole is initially formed, it is highly distorted and dynamical, and far from equilibrium. There is thus no a priori reason why the perturbatively defined quasinormal mode frequencies should be associated with the black hole at this point. This issue of isolating the perturbative regime where black hole spectroscopy can be applied is arXiv:2010.15186v2 [gr-qc] 13 Jan 2022 considered e.g. in [10].
An important recent development was the suggestion that it might in fact be possible to associate the remnant black hole almost immediately after merger with quasinormal modes [11][12][13]; see also [14][15][16]. Given the considerations mentioned at the end of the previous paragraph, this would seem to be a very unlikely proposition. However, as shown in these works, it is clearly true that it is possible to model the gravitational waveform immediately after the merger phase as a superposition of quasinormal modes. For this, it is essential to include the higher ringdown overtones which had, for the most part, not been included in previous analyses. If true, it could greatly improve the prospects of black hole spectroscopy and would indicate a remarkable simplicity in black hole mergers. It is therefore necessary to investigate this scenario from different perspectives, and one such perspective is in the strong field region near the black hole horizons. The goal of this paper is to investigate whether the dynamics of the remnant black hole horizon can be described by a superposition of quasinormal modes (including the higher overtones).
Towards this end, in this paper we carry out a numerical study of the remnant black hole formed by the head-on collision of two nonspinning black holes with unequal masses. This simple configuration, while not of great astrophysical significance, allows one to obtain very accurate numerical relativity simulations. The manifest axisymmetry of such systems also ensures that there is no ambiguity in the choice of coordinate systems and that physical gauge invariant quantities can be extracted in a straightforward manner. The nonlinearities and dynamics of general relativity are of course still present: a common horizon is formed when the two individual black holes get sufficiently close to each other; it settles to a final Schwarzschild black hole, and gravitational radiation is emitted in the process. This provides us with a simple case where the physical question of interest can be fruitfully explored without worrying about many of the complications present in astrophysically realistic situations. Two geometrical quantities related to the final black hole are of interest for our purposes: the angular modes σ l (l = 2, 3, . . .) of the shear σ of the outgoing light rays at the horizon, and the nontrivial mass multipole moments I l (l = 2, 3, . . .) of the horizon. We calculate σ l and I l as functions of time and we attempt to describe each of them by a superposition of quasinormal modes. We find that, indeed, including the higher overtones can allow for obtaining excellent fits for σ l (t) and I l (t) starting almost immediately after the merger. The high precision of our numerical simulation allows us to include angular modes with 2 ≤ l ≤ 12, i.e., a total of 11 independent time series, and we show that all of these modes are described by combinations of quas-normal modes provided higher overtones are included. Furthermore, while the multipole moments I l are not fully independent of the shear as we shall see, they do provide yet another 11 functions for testing the hypothesis. Similar studies of the gravitational waves at infinity, e.g. [11], typically consider only the dominant l = |m| = 2 wave mode, with a recent extension to a joint analysis of the |m| = 2, l = 2, 3, 4 wave modes (which are coupled, due to spheroidal/spherical mode mixing in the Kerr final state considered, unlike in our more symmetric case) [16]. Thus, this work represents a significant additional evidence compared to previous work in the literature.
The reader might legitimately ask: i) why should the behavior of σ l and I l at the horizon have anything to do with the actual observable quantity, namely the outgoing gravitational waves which could be observed by gravitational wave detectors? Are the horizons not causally disconnected from the outside observers and thus observationally irrelevant? ii) Even if one finds these calculations to be of interest, and even though we are careful in extracting gauge invariant quantities, is the apparent horizon not itself dependent on which time slicing the numerical simulation uses? How can we guarantee that the results would not be entirely different with a different choice of the time coordinate? Let us address these in turn.
For question i), we point out the remarkable correlations that exist between the outgoing radiation seen by a far away observer, and the in-falling radiation that could be seen by a hypothetical observer living near the horizon. Even though these two observers are not in causal contact with each other, the gravitational radiation they would see comes from the same source, namely the nonlinear and time dependent gravitational field in the vicinity of the binary system [17][18][19][20][21]. It is thus not a surprise that both observers will see qualitatively similar features. In fact it was shown in [22] that the two observations agree qualitatively. The present study can be viewed as further evidence of these correlations. Thus, by studying the behavior of the horizon, we can learn something about the outgoing radiation (and vice versa).
Regarding ii), it is likely true that one could have chosen a particularly "bad" slicing and time coordinate which could have obscured any of the correlations mentioned above. First, we could have made a different choice of spatial Cauchy surfaces for the numerical evolution which would generically lead to different dynamical horizons. Even though there are known constraints on how different the dynamical horizons can be [23], it is possible in principle to choose spatial slices such that the horizons could be extremely distorted [24,25]. However, we are not aware of any numerical simulations which use, or can practically use, such extreme choices. Second, even within a given choice of slicing, there is still the possibility of choosing a different time parameter adapted to the slicing, t → t = F(t). This would change the functional dependence of any relevant function of time f (t) into f (t ) = f (F −1 (t )). We shall make no attempt to do any such reparametrizations in this paper, and we shall simply work with the slicing and time coordinate used in the simulation.
What is significant is that our results show that there is at least one choice of slicing and of an adapted time coordinate, which happens to be a widely used one, for which the correlations are manifestly present. Specifically, we employ the 1 + log slicing, along with a Γ-driver shift condition [26,27]. These gauge conditions also set the time parameter and spatial coordinates of the simulation. An important property of these gauge conditions is that they are "symmetry seeking", i.e., they attempt to find a timelike Killing vector if there is one, and thus define reasonable local observers.
The remainder of the paper is structured as follows. A brief summary of the basic quantities we calculate and study is provided in Sec. II. This section defines and identifies the horizon shear and multipole moments as quantities of interest. In the following sections we describe the methods used and the results of attempts to fit these quantities using the ringdown frequencies and damping times associated with the final black hole. The fitting procedure is described in Sec. III and applied to the shear and multipole moments in Sec. IV. Sec. V discusses the implications of these results and whether it is possible to conclude, and in what sense, whether overtones are really associated with the highly distorted remnant black hole immediately after its formation. Sec. VI concludes with a summary and suggestions for future work.

II. BASIC NOTIONS
There are two main aspects relevant to our study: i) the quasinormal modes (QNMs) of a black hole, which are usually defined within the context of black hole perturbation theory, and ii) the nonperturbative study of quasilocal black hole horizons. This section briefly summarizes the basic notions and results for both of these aspects.

A. Quasinormal modes
The metric perturbations of a Schwarzschild black hole (which is the final geometry relevant for our study), for both polar and axial perturbations, can be combined into scalar functions ψ which satisfy equations of the form [28][29][30] Here, as usual, r = r + 2M log(r/2M − 1) with M being the black hole mass, and r is the usual Schwarzschild areal coordinate. The potentials V ± for the polar and axial perturbations are functions of r and they depend on M and on the mode index l. The potentials also differ depending on the nature of the perturbation, and we shall here be concerned almost exclusively with spin-2 fields (see Sec. II B 3). Quasinormal modes are obtained by imposing outgoing boundary conditions at both infinity, and at the horizon. Only a discrete set of (complex) values of the frequency ν allow for these dissipative boundary conditions, and these are labeled by the integers (l, m, n) where (l, m) are the usual angular mode indices in a decomposition into spherical harmonics 1 , and n = 0, 1, 2, . . . is the overtone index. See [31] for an analytic method for calculating this spectrum and [32,33] for a compilation of the values in different situations. We also show a sample of m = 0 quasinormal mode frequencies (real and imaginary parts) for a Schwarzschild black hole of unit mass below in Table I. Finally, there are several interesting mathematical and numerical issues related, in particular, to the non-self-adjoint nature of the problem. For example, of great potential interest is the recent suggestion that the higher overtones might in fact be unstable [34]. Similarly, the issue of the completeness of the quasinormal modes is also of great interest; see e.g. [35][36][37].
B. Nonperturbative framework for studying quasilocal horizons

Horizon definition
The study of horizons here is based on the notions of marginally trapped surfaces and dynamical horizons (see e.g. [38][39][40][41][42][43] for reviews). Here we shall only briefly summarize the basic notions required for our purposes. The first is that of a marginally outer trapped surface (MOTS). This is a closed spacelike 2-surface S whose outer null-normal a has vanishing expansion: Here q ab is the intrinsic metric on S. MOTSs are closely related to trapped surfaces with negative expansions for both the outgoing and ingoing null normals, and the significance of these notions goes back to the singularity theorems [44,45]. Their presence implies the existence of a spacetime singularity to its future, and thus indicates the presence of a black hole. Well developed methods exist to locate MOTSs in numerical relativity (NR) simulations [46]. Here we shall employ the method developed in [47,48] and available from [49], which in turn uses libraries described in [50][51][52][53][54][55][56][57]. As a MOTS evolves in time, it traces out a 2+1-dimensional world-tube H which we shall refer to as a dynamical horizon. Several mathematical and physical properties of H are known and summarized in the review articles referred to above. The behavior of dynamical horizons in black hole mergers has been studied in detail recently [48,[58][59][60].

Setup and numerical simulation employed
The configuration we consider here is the head-on merger of two nonspinning black holes initially at rest. The initial data is the time symmetric Brill-Lindquist puncture data [61]. This data describes a spatial slice Σ with vanishing extrinsic curvature K ab = 0, and conformally flat 3-metric h ab = Φ 4 δ ab . The conformal factor Φ is a harmonic function on threedimensional Euclidean space with two points removed (the punctures). At a point x, where r 1,2 are the respective distances from x to each of the two punctures, and m 1,2 are known as the bare masses of the two punctures. We will note the total Arnowitt Deser Misner (ADM) mass as M = m 1 + m 2 . We study here a particular configuration with m 2 /m 1 = 1.6. The ADM mass has a value of 1.3 in the code units used, but we instead set it as the mass unit here, i.e., M = 1 in this work.
The simulations are carried out based on the Baumgarte-Shapiro-Shibata-Nakamura (BSSN) formulation of the Einstein equations using the Einstein Toolkit [62,63], with the initial data being generated by TwoPunctures [64]. We evolve the spacetime using an axisymmetric version of McLachlan [65], which uses Kranc [66,67] to generate efficient C++ code. As mentioned earlier, our gauge conditions use a 1 + log slicing and a Γ-driver shift condition [26,27]. Further details of our simulation method are described in [48]. The results presented in the present paper use data obtained from a simulation with a spatial grid resolution of res = 240. Additional simulations with resolutions of res = 60, 120, 180, and restricted simulations with higher resolutions of res = 480, 960, have been used to ensure convergence of our results 2 . We do not use mesh refinement and instead choose our numerical domain large enough to ensure that boundary effects do not reach the horizons up to the final time of t f 38.5 M of the simulations.
In the resulting spacetime, we initially have two disjoint MOTSs S 1,2 . As the time evolution proceeds, S 1 and S 2 approach each other, touch at a particular time labeled t touch , and then go through each other after that. Sometime before t touch , at a time labeled t bifurcate 1.06M, a common horizon forms and immediately bifurcates into two MOTSs representing an outer and an inner branch S out and S in respectively. S in moves inwards, becomes increasingly distorted and eventually merges with S 1 S 2 at t touch , and then develops selfintersections. The focus of this paper is not any of these phenomena, but rather the behavior of S out which moves outwards and loses its distortions as it approaches its final state as that of a spherically symmetric Schwarzschild black hole. We shall in particular look at two particular quantities on S out as functions of time, namely the shear σ of the outward null normal a and the mass multipoles of S out . In the remainder of this section, we shall define these quantities and explain why they are of interest.

Observables on the outer common horizon
We begin with the definition of the shear. Here it will be convenient to introduce a complex basis for tangent vectors on a MOTS: m a andm a , that satisfy m ·m = 1 and m · m = 0. Then, the shear of the outgoing null normal is defined as Such a complex basis is determined up to a spin rotation freedom m → e ιψ m. Under this transformation, the shear transforms as σ → e 2ιψ σ, thus σ is said to have spin weight +2. This means that σ can be expanded in angular modes using spin-weighted spherical harmonics 2 Y lm (θ, φ) of spin weight +2.
There still remains the question of whether there is a preferred choice of angular coordinates (θ, φ); we will end up with different mode decompositions for different choices. The general solution to this is given in [68]. (In the present case, since we have manifest axial symmetry, a simpler approach suffices.) On a surface S of spherical topology equipped with an axial symmetry ϕ a , we can introduce preferred angular coordinates (θ, φ). First, we assume that ϕ a vanishes at only two points, which are taken to be the poles. On the integral curves of ϕ a , take φ to be the affine parameter along ϕ a , normalized to lie in the range 0 ≤ φ < 2π. One of the meridians, i.e., the lines joining both poles and everywhere orthogonal to ϕ a , can be arbitrarily selected. The intersection of this meridian with each integral curve of ϕ a then defines the point on that curve where φ is set to zero. The other coordinate θ, is defined via ζ = cos θ according to Here A S is the area of S, ab is the volume 2-form, and D a is the covariant derivative compatible with q ab . The first equation ensures that ϕ a D a ζ = 0. Hence, ζ is constant on each integral curve of ϕ a , and the meridians are integral curves of D a ζ. The second equation fixes the freedom to add an additive constant to ζ in the first equation. With these choices, it is shown in [69] that the metric q ab is written as where R S := √ A S /4π and It can be shown that −1 < ζ < +1, and it goes from +1 to −1 as we go from one pole to the other. Therefore we can set cos θ = ζ with 0 < θ < π (and extend it to θ = 0 or π at the poles). We have thus specified (θ, φ) on S (up to a rigid rotation by adding a constant to φ corresponding to choosing the φ = 0 meridian). A suitable choice for m a is given by the following form for its dual 1-form m: We can now expand σ as We take only the (l, 0) modes because of the manifest axisymmetry. This symmetry and our specific choice for m a also imply here that σ and the σ l are real. Under time evolution, the mode amplitudes σ l will then be real-valued functions of time that we aim to model with a combination of damped sinusoids.
The importance of σ lies in the fact that the shear, or more precisely |σ| 2 , yields the dominant part of the energy flux infalling into the black hole [70,71]. Based on the discussion in the introduction, we expect the energy fluxes across the horizon to be highly correlated with the outgoing radiation which is determined by the |N| 2 with N being the News function [72]. Thus one would expect σ to be closely correlated with N. This has been shown to be indeed the case for the inspiral regime [22]. Here our focus is on the postmerger regime. The outgoing radiation is represented by the two polarizations h +,× , or equivalently by a complex combination h = h + + ιh × . The News function is given by N =ḣ. Thus, when h is a combination of damped sinusoids then so is N and thus, if the proposed correlations mentioned above do exist, the same should be true for σ. Thus, if the higher overtones appear in h, then they should also appear in the shear σ, and vice versa.
Let us now turn to the multipole moments. As for any mass or charge distribution, it is possible to define suitable mass and current multipoles for black hole horizons [69]. For nonspinning configurations where the individual black holes are nonspinning and the orbital angular momentum is also vanishing, as in our case, we only need to consider the mass multipoles. These are moments I l of the intrinsic scalar curvature R of S calculated from Eq. (6): Just as for the shear, we calculate I l as functions of time and look for the presence of ringdown modes therein. We will only consider l ≥ 2 since I 0 is constant as a topological invariant (here I 0 = √ π) and I 1 vanishes at all times due to the symmetries of the angular coordinates used [68,69]. Preliminary investigations of σ l (t) and I l (t) are given in [60]. Given that we will analyze essentially the same dataset as in [60] (here obtained from performing the same simulation with a higher resolution), it will be useful to summarize the results. We begin with plots of σ l and I l as functions of time, shown in Fig. 1. The behavior of the modes σ l (t) and I l (t) all have similar qualitative behaviors: a rapid initial decay followed by a slower decay with oscillations. The higher the l, the more rapid the initial decay. At late times on the other hand, the damping rates of different modes seem very similar, but the higher modes have higher oscillation frequencies. While we shall not discuss it in this paper, we mention in passing that the in-falling energy flux also has a contribution from a vector field ξ a [70,71] (denoted ζ a in these references).
As for the shear, we can perform a mode decomposition for the vector field as well, but using spin-weight-1 spherical harmonics. The time dependence of these modes ξ l , l ≥ 1, is shown in Fig. 2. For l 2, it is evidently more complex than the shear. This vector contribution is however subdominant, and we shall study this in detail elsewhere.
It is also useful to note that the final black hole horizon is not in equilibrium at early times just after it is formed. An easy way to see this is by looking at the area growth of the final black hole. Fig. 3 shows the area of the final black hole as a function of time starting from when it is initially formed. We see a rapid initial increase showing unambiguously the dynamical nature of the black hole in this regime. The analysis of [60] shows, using many different criteria all of which give approximately the same answer, that the black hole can be considered close to equilibrium after ∼ 10M after its formation.
It was shown in [60] that the late-time behavior, on the other hand, is consistent with the principal (fundamental) quasinormal mode. It was also shown there that at early times after the merger, the observed high decay rates were at first glance not consistent with any of the higher overtones considered separately. However, this early-time postmerger behavior analysis was rather simplistic. Here we perform a more sophisticated analysis by considering the entire time series of the shear and multipoles (instead of breaking it up into early and late portions), and model it with a superposition of quasinormal modes including the higher overtones.
We conclude this section with a brief discussion of the relation between the multipoles and the shear. The shear is a spinweight-2 field, hence it is expanded in spin-weighted spherical harmonics, and it is natural to expect its decay rates to follow the spin-2 quasinormal modes. The scalar 2-curvature of the horizon R, on the other hand, is a spin-weight-0 field which is why Eq. (10) uses, in effect, the spin-weight-0 spherical harmonic in defining its moments. Should we then expect the decay rates of R to follow the spin-0 quasinormal modes? If not, then what should we expect? To answer this question, using the Gauss-Codazzi relations applied to 2-surfaces, we can relate R to the spacetime Riemann curvature: Specifically in this paragraph we denote the shear of a (elsewhere simply denoted σ) with a superscript ( ) in order to distinguish it from the shear σ (n) of the ingoing null normal. Ψ 2 is a component of the Weyl tensor and Re[Ψ 2 ] is its real part. We see that R depends linearly on σ ( ) . The shear σ (n) is not directly associated with the in-falling radiation, and Ψ 2 is also not associated with the radiative part of the gravitational field. Thus, σ ( ) controls the time dependence of R, and it is reasonable to expect the decay rates of R to follow the spin-2 quasinormal ringdown modes. It is also useful to note that for a black hole in equilibrium when there is no in-falling radiation (formally modeled as an isolated horizon [73][74][75][76][77][78][79][80][81][82][83]), the shear σ ( ) vanishes, R is time-independent and R = 4Re[Ψ 2 ].

III. OVERTONE MODELS AND FITTING PROCEDURE
In this section we introduce the basic concepts and framework commonly used for the modeling of ringdown-type waveforms (including the outer horizon shear modes and multipoles in our case) with overtones and we the discuss the statistical tools used to fit such models to NR data.

A. The overtone model
At late times, we can decompose a spin-weight-s field s X propagating in a Schwarzschild (or more generally Kerr) background as a sum of damped sinusoids, namely, Here, the (l, m) indices describe the angular decomposition of the modes (with m = −l, . . . , l), and sỸlm are the spin-weighted spheroidal harmonics 3 ; for a perturbed Schwarzschild black hole as in our case, they reduce to the spin-weighted spherical harmonics s Y lm . n = 0, 1, 2, . . . accounts for the n-tone excitations of a given (l, m) mode, with n = 0 being the fundamental "tone" and n = 1, 2, . . . corresponding to overtones, and t r is a suitable reference time where the linear perturbation theory is expected to describe the dynamics accurately [14,15,85,86]. If linear perturbation theory applies, the quasinormal mode frequencies and damping times 4 ω ± lmn and τ ± lmn are solely determined by the black hole's final mass and angular momentum. 3 In particular, the shear as defined above, following the usual convention, has spin weight +2 and is thus decomposed in spin-weight +2 harmonics. One could as well have worked with a spin-weight -2 field by using the complex conjugate of the shear instead, which would then have been expanded in spin-weight -2 harmonics. Both types of fields have the same QNM spectrum in a Kerr (or Schwarzschild) background. (See, e.g., Eqs. (3.29) and (3.30) in [84] relating the Weyl tensor components Ψ 4 and Ψ 0 , which respectively have spin weight -2 and +2, showing that both variables are isospectral.) 4 Alternatively, one can rewrite the exponential factors exp[−ιω ± lmn (t − t r ) − (t − t r )/τ ± lmn )] in Eq. (12) for each (l, m, n) component as an exp[−ιν ± lmn (t − t r )] with a complex frequency ν ± lmn of which ω ± lmn is the real part and the damping rate 1/τ ± lmn is (up to a sign change) the imaginary part. For a given choice of the (l, m, n) indices, one finds two families of solutions, those with ω + lmn > 0 and those with ω − lmn < 0, corresponding to the co-rotating and counter-rotating modes respectively, with their associated damping times τ ± lmn and complex amplitudes A ± lmn [8,14,16,31,33,87,88]. Note that for the Schwarzschild case, ω + lmn = −ω − lmn and τ + lmn = τ − lmn for any m; this also holds in the general Kerr case if m = 0. Hence, in these cases, the (absolute) values of the frequencies and damping times are independent of the family (co-or counter-rotating) of modes considered. Throughout this work, we will set t r to the time value used for the late-time fits in [60], that is in the units used in the present work, t r /M = 20/1. 3 15.4. The amplitudes at t r , A ± lmn , are unknown complex numbers that only depend on the perturbation conditions set up during the inspiral-plungemerger phase of the binary black hole evolution. Hence, they are fully determined by the initial parameters of the binary prior to the merger -in our head-on, nonspinning case, the mass ratio of the two colliding black holes and their relative boost at a given separation.
To allow for deviations on the complex frequencies from the QNM values, Eq. (12) may be replaced by where α ± lmn and β ± lmn are two sets of perturbation parameters for each co-or counter-rotating mode. These will measure the deviations to the QNM spectrum (as predicted by perturbation theory within GR), while the latter spectrum is recovered for α ± lmn = β ± lmn = 0. To perform black hole spectroscopy, one shall require a) that the posterior distributions of α ± lmn and β ± lmn are consistent with zero and b) that the frequency values can be resolved to a given nσ credible value [15]. The latter is technically difficult due to the low sparsity of the QNM real frequencies. For instance, for the (l = 2, m = 0) QNM of a s = 2 field in a Schwarzschild spacetime, the real frequencies of the fundamental mode and first overtone only differ by 7% (see the corresponding frequency values in Table I, left panel), making the separate resolution of the two tone frequencies a challenging task [14,15]. An attempt to estimate the overtone frequencies by means of the Bayesian framework on GW150914 data was partially tackled in [12] by performing a parameter estimation on a reduced parameter space. Other recent studies have provided estimates on the QNM parameters by performing fits to NR data [11,[14][15][16]60]. Fitting the data circumvents the extensive exploration of the parameter space by estimating the physical parameters from maximum likelihood estimation algorithms.
The fields originated from head-on collisions of nonspinning black holes -as in our case-are fully described by the m = 0 modes due to the rotational symmetry of such collisions. In this scenario, all angular m 0 modes vanish, i.e., s X l, m 0, n = 0. We will only allow for deviations from the QNM spectrum respecting its symmetries, ω + l0n = −ω − l0n and τ + l0n = τ − l0n : i.e., we set α + l0n = α − l0n and β + l0n = β − l0n . Moreover, the fields we are considering, i.e., the multipole moments I l , and the shear modes σ l as defined above in Sec. II B 3, are real-valued functions. For such variables, the two families (co-and counter-rotating) of modes combine, with their complex amplitudes at t r satisfying A − l0n = (A + l0n ) * , so that and for a real amplitude A l0n and phase φ l0n . With the above remarks, from now onwards we can drop the ± superscripts on all parameters and simplify the ansatz of Eq. (13) into, with s X l0 = Re( s X l0 ). As a sign flip on the real frequencies would still be possible in principle, we specify that the corotating (positive real part) choice is implied for the complex frequencies ν l0n and their real part ω l0n . We will further drop the fixed s = 2 and m = 0 subscripts on X l ≡ s X l0 in the following. The parameters α l0n and β l0n help one test the effects of eventual systematic errors sourced by a) including an insufficient number of tones n max when modeling the data with Eq. (14) or b) the presence of non-negligible nonlinearities in the data [14,15,85]. We note in passing that introducing the parameters α l0n and β l0n may also be used in a more general context to parametrize deviations from general relativity.
In this work we model the data for the shear modes σ l and multipoles I l with 2 ≤ l ≤ 12, using multiple values of n max , up to n max = 15. In general, we fit for the amplitude A l0n and for the phase φ l0n , and we either set the frequency deviation parameters β l0n and α l0n to zero or additionally fit for them.
We compute the QNM spectrum values {ω l0n , τ l0n } of the final black hole using the qnm Python script [89], which combines a Leaver solver with the Cook-Zalutskiy spectral approach to the angular sector [31,90]. Our final black hole has no spin and its mass is slightly lower than M due to the gravitational radiation. This relative mass decrease with respect to M can be estimated at about 7 · 10 −5 from the outer horizon area at late times. We simply approximate the final mass as M when computing the QNM spectrum, implying a similar relative error on τ l0n and ω l0n which are proportional and inversely proportional to the final mass, respectively. In Table I, we show as an example, a sample of the resulting QNM frequencies ω l0n and damping rates 1/τ l0n , for l = 2, . . . , 12 and n = 0, . . . , 5.
The fitting algorithm is explained in Sec. III B. In Sec. IV A we explore the fit results for a single-tone (n max = 0) analysis. We observe that the single-tone model is not sufficient to fully describe even the late-time data. In Sec. IV B we extend the results to the multiple-tone (n max > 0) analysis and to the whole dataset, with all the α l0n and β l0n parameters set to zero. In this case and for large enough n max , we find that the model is sufficient to describe the data even including the early times where the horizon is not in equilibrium. In Sec. V we discuss this, and investigate whether one can infer from it an actual presence and predominance of overtones over nonlinear contributions right from shortly after the horizon is formed.

B. The fitting algorithm
We use a maximum likelihood estimation algorithm to obtain the best-fit parameters λ i . Those correspond to the parameter values that minimize the χ 2 , namely, where h x λ stands for the model given by Eq. (14) and evaluated at the parameters λ = {A l0n , φ l0n } or λ = {A l0n , φ l0n , α l0n , β l0n }, and h NR = {σ l , I l } stands for the numerical data for the shear modes or multipoles, respectively. We sum over the data points k at all times t = t k ∈ [t 0 , t f ], for a certain fit starting time t 0 which may be picked at any value t bifurcate ≤ t 0 ≤ t f , and where, as above, t f 38.5M is the end time of the simulation. Minimization of (15) is performed running the Levenberg-Marquardt algorithm for nonlinear fitting 5 as implemented in Mathematica [91].
To assess the fit goodness we use the mismatch M as in [11,[14][15][16], which is defined as The standard errors δλ i on the parameters are computed from 5 Note that in the case where the free parameters are only the amplitude and phase of each mode, λ = {A l0n , φ l0n }, i.e. their complex amplitude, and writing the expansion under its complex form as in Eq. (12), the fitting problem is linear and a dedicated scheme could have been used instead [16]. In this work, we however use the same (nonlinear) algorithm for either choice of the set of free parameters to fit for. This allows for a consistent approach throughout our investigation and for direct comparisons between models where some of the frequencies are left as free parameters and models with all frequencies set to the QNM values (as in Sec. V B).
) (left) and damping rates 1/τ l0n ≡ 1/τ ± l0n = −Im(ν ± l0n ) (right) for a Schwarzschild black hole with unit mass, for l = 2, . . . , 12 and n = 0, . . . , 5, computed with the qnm Python script [89]. the diagonal terms of the covariance matrix as [91], where is the residual sum of squares; N is the number of data points in the time range considered, i.e. in [t 0 , t f ]; and p is the number of parameters one wants to fit for. H i j is the Hessian matrix defined as which is evaluated at the best fit parameters λ.

C. Exponential rescaling procedure and numerical errors
The NR data appears to be exponentially damped at late times, and so are the damped-sinusoidal models of the class (14) that we fit to this data. Due to this damping, the fitting procedure based on the residual sum of squares -rather than relative differences-will capture better the behavior of the NR data towards the beginning of the time interval considered (close to t 0 ) than towards the latest times (close to t r ). A reliable estimate of the oscillation frequencies, damping rates, and amplitudes of each tone in the model would rather require an accurate match of the relative amplitudes and positions of the successive extrema (or zeros), and thus a small relative deviation to the NR data, on the entire time interval considered.
To this aim, when looking specifically for the best-fit frequencies, damping rates and/or amplitudes 6 , we will first apply a time-dependent rescaling of the NR data and of the model, determined by the damping rate of the fundamental QNM. That is, we will fit a rescaled dataset with the similarly rescaled model according to where the unrescaled model h x (t) is given by Eq. (14) for a certain n max , and with the parameters α l0n and β l0n either set to zero, or left as free parameters. In this way, provided the late-time decay rate of the NR data is comparable to the fundamental QNM value on the time range considered, the rescaled dataset will have an approximately constant (rather than decaying) amplitude towards late times. This then allows for a more accurate retrieval of the complex frequency parameters α l0n and β l0n , if left free, and of the amplitudes A l0n . Such a rescaling will of course also scale up the numerical errors at late times. However, the high resolution used here means that the relative error on the computed shear modes and multipoles remains rather low for all modes l ≤ 12. We conservatively estimate the error on the NR results -assuming it to be dominated by discretization error-as the difference between the values obtained at the highest resolution, res = 240 -used throughout this paper-and the next highest resolution available 7 , res = 180. The relative error on the shear modes and multipoles computed in this way increases with l 6 The rescaling procedure presented here is dedicated to improving the accuracy of the determination of these parameters. To ease the interpretation, we will not use this rescaling when we rather simply wish to evaluate the quality of the fit of a model to the NR shear modes and multipoles (either by computing the corresponding mismatch or via direct visualization). In this case we shall simply compare the model h x (t) to the data h NR (t) directly. 7 To test the validity of this measure of the error (let us denote it as 180−240 ), as the amplitude of the modes decreases. Away from the zeros of the modes, it ranges from about 10 −6 at l = 2 to about 1% at l = 12 for the shear modes, and up to 1 order of magnitude larger for the multipoles. We accordingly consider l = 12 as a threshold value for sufficiently small numerical uncertainties and we shall not consider the higher-l modes in the present work. Fig. 4 shows the rescaled NR datah NR and the numerical error on this rescaled data as a function of time for the two extreme cases of the smallest (σ 2 , left panel) and the largest (I 12 , right panel) relative error among the modes we consider.

IV. FIT RESULTS
A. Single-mode fits with variable frequency and damping rate Before considering higher overtones, we first focus on the s = 2 fundamental QNMs of the remnant black hole. We aim at checking whether we recover the good agreement found in [60] (with a different method as used here) between the damped oscillating behavior of the shear modes and multipoles at late times on the one hand, and the complex frequencies of these fundamental modes on the other hand.
To this end, we consider a single-tone model where the frequency and damping rate are free parameters, i.e., the ringdown model (14) restricted to the n = 0 fundamental tone: We will then check to what extent the best-fit frequencies and damping times in such a model match the fundamental QNM values, corresponding to α 0 = 0 and β 0 = 0, for the latetime shear modes and multipoles. Thanks to the high numerical resolution, we can perform this analysis up to the l = 12 mode of the shear and multipoles, so that we can also consider whether the conclusions of [60] on the late-time behavior of these variables do extend beyond the l = 7 mode.
Note that for convenience, we here drop the l and the m = 0 indices on the free parameters, that is on A 0 ≡ A l00 , φ 0 ≡ φ l00 , α 0 ≡ α l00 and β 0 ≡ β l00 . It should however be understood that their values will, of course, depend on the variable X l considered, i.e., on the observable X (shear or multipoles) and on the mode l.
we have computed the differences 240−480 between the res = 240 datasets and higher resolved datasets with res = 480 but with a shorter simulation time t (480) f 15M. We have checked that the error 180−240 is larger than 240−480 , which is expected given that the discretization scheme of the datasets is globally fifth-order accurate and that all datasets are shown to be in the convergent regime [48]. This trend has been confirmed for the examples of the l = 2, 4, 8, 12 modes for both the shear and the multipoles data.

Late-time best-fit frequencies and comparison to the fundamental QNM values
We first turn our attention to the late-time data as selected in the same way as in [60], i.e., we consider t ∈ [t 0 , t f ] for t 0 /M = 20/1. 3 15.4. (In this case, the fit starting time t 0 then coincides with the constant value we have set for t r .) We begin by considering the shear modes σ l for 2 ≤ l ≤ 12. Fig. 5 shows the best-fit values for each mode l for the real (left panel) and imaginary (right panel) parts of the model's single frequency ν = (1 + α 0 ) ω l00 − ι τ −1 l00 (1 + β 0 ) −1 , using the rescaling procedure (21) to obtain a better accuracy in the recovery of ν. The results for Re(ν) and Im(ν) are normalized to the corresponding fundamental QNM values, and we also include as error bars on these results, the 1σ (standard) deviations on the best-fit estimates as computed from the covariance matrix using Eq. (18) (see Sec. III B). The results from [60] are also indicated for comparison when available, i.e., for l ≤ 7.
The disagreement to the QNM values for the imaginary part does however increase for larger values of l, up to the order of | Im(ν)/Im(ν l00 ) − 1| 20% to ∼ 30% for l = 10, 11 and 12. For these large-l modes the quasinormal fundamental mode is thus no longer an accurate model of the shear modes for the range t 15.4M, at least regarding their damping rates. The | Im(ν)/Im(ν l00 ) − 1| 10% deviations at several of the smaller-l values despite uncertainties much smaller than this number suggest that this may even be the case for most of the modes. This may be a consequence of the presence of residual nonlinear deviations to equilibrium at these times, or of the residual presence of higher overtones for these modes. The latter hypothesis would explain the larger decay rates found for most modes and the very small deviations on the real parts of the frequencies: the respective real frequencies ω l01 and ω l00 of the QNM first overtone and fundamental mode differ by 1 − ω 201 /ω 200 7% for l = 2 and by even smaller amounts for all higher-l modes, while the decay rates of the QNM first overtones are typically 3 times larger than those of the fundamental modes (see Table I). The decay rates smaller than the fundamental QNM value found for l = 7 and l = 10 would be harder to explain in this scenario, but could be caused by a modulation induced by a higher overtone if this latter is nearly in antiphase with the fundamental mode. This would cause a decrease in the overall amplitude in the early part of the time range considered (i.e., for t close to t 0 ) before the overtone fully decays away.
Note that the systematics due to the errors in the numerical data are not included in the error bars shown. We can estimate these errors by comparing the best-fit frequencies to those   18)). The values are normalized to the fundamental QNM values and the reference unity value for this ratio is marked as blue disks. The best-fit values we obtain without using the rescaling procedure are also shown for reference (green diamonds) with their associated 1σ uncertainties, and the values quoted from [60] for l ≤ 7 are also given for comparison (orange squares). For the latter, we do not show error bars on the figure as precise uncertainties were not given for every l. These uncertainties were estimated to be of the order of δ(Re(ν))/Re(ν) ±1% for the real part and δ(Im(ν))/Im(ν) ±10% for the imaginary part for each l ≤ 6, and larger for l = 7.
found by instead fitting the (rescaled) next-highest-resolution (res = 180) NR data. The relative deviations ∆(Re(ν))/Re(ν), ∆(Im(ν))/Im(ν) obtained in this way on the best-fit real and imaginary frequencies are very small for the low values of l.
They remain under 10 −4 for all modes for the real part, and under 10 −3 for all modes for the imaginary part. There are further systematic errors if, e.g., nonlinearities or higher overtones are present since they are not accounted for in the model. For comparison, we also include in Fig. 5 the results obtained from directly fitting the model (22) to the NR data on the same time range, this time without applying the rescaling (21). For the real frequency Re(ν), these results remain within ±1.5% of the QNM values ω l00 for all modes. They however display substantial systematic errors with much larger devia-tions to the QNM values Im(ν l00 ) for the imaginary parts Im(ν) for almost all modes, due to an inaccurate fitting of the decaying amplitude over time.
We then repeat this analysis for the multipole moments I l for 2 ≤ l ≤ 12. The results are shown in Fig. 6 where the available (l ≤ 7) results from [60] for the multipoles are again also given for comparison. As also found in the latter reference, the results we obtain for the multipoles are qualitatively very similar to those obtained for the shear modes. We here again focus on the results obtained after applying the rescaling procedure given by Eq. (21), which is expected to improve the determination of the frequencies.
The real frequencies Re(ν) again remain within ± ∼ 1.5% (although no longer within ± ∼ 1% at large l values) of the fundamental QNM real frequencies Re(ν l00 ) = ω l00 . The imaginary frequencies Im(ν) still feature relative deviations to the QNM values Im(ν l00 ) = τ −1 l00 of up to ± ∼ 10% for l ≤ 9 and larger deviations for l ≥ 10, although they do remain below ∼ 20% in magnitude for all modes. While these magnitudes do change, we note that the signs of the deviations Im(ν)/Im(ν l00 ) − 1 are the same as those found for the shear for every mode l.
The numerical relative errors on the best-fit frequency values, estimated in the same way as for the shear above, are again very small at small l. They reach slightly larger values than for the shear, from ∆(Re(ν))/Re(ν) ∼ 1 · 10 −3 to just above 2 · 10 −3 , for the real frequencies and l = 10 to 12. For the imaginary part, these relative error estimates stay below ∆(Im(ν))/Im(ν) ∼ 10 −3 , as for the shear, for l ≤ 9; but they reach about 4 to 9 · 10 −3 for the last three modes, implying a small but non-negligible possible systematic error on the values of Im(ν)/Im(ν l00 ) shown in these cases.
Note that the QNM complex frequencies used here are still those of a field of spin-weight 2, as for the shear. The late-time oscillations of the multipoles match well the corresponding real frequencies (as well as the damping rates to a lesser extent), even though -unlike the spin-weight-2 shear scalarthe multipoles are scalar fields of spin weight 0. For instance, the fundamental l = 2 QNM real frequency for a spin-0 perturbation is nearly 30% larger than the corresponding frequency for spin-2 perturbations; a difference which would be easily noticeable. Hence, the dynamics of the geometry of the outer common horizon as measured by the mass multipoles may be determined by the shear flux at the horizon, at least at the late times considered so far. This is entirely consistent with the discussion at the end of Sec. II B 3.

Dependence on the fit starting time
We can also let the fit starting time t 0 vary and span the available interval [t bifurcate , t f ]. One can expect the behavior of the shear scalar and multipoles to be fully described by Best-fit parameters α 0 (blue) and β 0 (orange) as functions of the fit starting time t 0 , for the shear σ 2 mode and for the singletone model (22). The rescaling procedure (21) has been applied prior to fitting. 1σ deviations (Eq. (18)) around the best-fit values are included.
the fundamental QNM (n = 0) at large t, i.e., in the nearequilibrium regime and after the higher overtones have decayed away. If this is the case, the best-fit complex frequencies to the shear modes and to the multipoles should converge towards the fundamental QNM values at late times. Fig. 7 shows the best-fit complex frequency deviation parameters α 0 and β 0 for the time range [t 0 , t f ] as a function of t 0 , for the shear l = 2 mode as an example, along with 1σ uncertainties on these parameters as given by Eq. (18). The rescaling given by Eq. (21) has again been applied to the model and to the NR data for a more accurate retrieval of the complex frequency. The results similarly obtained for the l = 2 multipole are shown in Fig. 8 -still using the s = 2 fundamental QNM as the reference complex frequency value-with very similar behaviors. In both cases, both parameters α 0 and β 0 do appear to converge towards zero, or at least to a value of modulus |β 0 | < 10 −2 in the case of β 0 , as t 0 approaches t f .
However, α 0 and -more prominently-β 0 clearly devi- ate from zero for earlier fit starting times. In particular, β 0 reaches increasingly negative values, corresponding to larger and larger damping rates, as t 0 is decreased. Such deviations are of course expected at early times when nonlinear deviations to equilibrium should still be present. The observed behavior of α 0 and β 0 is however also compatible with the presence of QNM overtones, which are damped faster than the fundamental mode. Their presence may indeed be expected at least at the intermediate times t/M 10 to 20, if the nonlinear dynamics have become sufficiently negligible, and before these overtones have decayed much below the fundamental mode.
The best-fit α 0 and β 0 values for the higher shear modes and multipoles typically display similar behaviors as those observed here for l = 2, although the residual magnitudes of these quantities towards t f can be a little larger than in the l = 2 case, with |α 0 | 1% and |β 0 | 5%, and with typically α 0 < 0 and β 0 > 0. For both the shear and multipoles, the l = 7 and l = 10 modes (already singled out in Figs. 5 and 6 for their atypical best-fit damping rates) are exceptions regarding the best-fit β 0 , which takes again substantial positive values (ranging from β 0 0.18 to β 0 0.37) for very late starting times 30M t 0 < t f -where few data points remain-after being first damped with increasing t 0 for t 0 30M.

B. QNM models including overtones
In the previous subsection, we noted that the late-time oscillations of each of the shear modes and multipoles for 2 ≤ l ≤ 12 were well modeled by the real frequency ω l00 of the corresponding s = 2 fundamental QNM of the remnant black hole. On the other hand, the damping rates of these oscillations showed some deviation to the fundamental QNM imaginary frequencies τ −1 l00 , especially at large l. Moreover, the deviations on both the real and the imaginary frequencies generally appeared to increase as earlier times were taken into account, and to nearly vanish if only the very end of the dataset was considered. We have accordingly suggested that the shear modes and multipoles are fully described asymptot-ically by the fundamental QNMs while the behavior at more intermediate times (say, around t = 15M) may correspond to the additional residual presence of higher overtones.
Moreover, as already pointed out in [60], each of the shear modes and multipoles clearly displays a steep nonoscillating decay at early times (at t/M 4), with a substantially larger damping rate than the late-time damped-oscillatory regime. Accordingly, none of the modes can be correctly described by a single damped sinusoid over the whole time range [t bifurcate , t f ]. It was noted in [60] that the larger decay rate observed at early times, while reminiscent of the large values of the imaginary frequencies of the QNM overtones, generally did not appear to quantitatively match the imaginary frequency of any particular overtone. It was left as a possibility that this early damping regime may nevertheless correspond to combined contributions of multiple overtones.
Accordingly, we will now examine the hypothesis that the behavior of each of the outer common horizon shear modes and multipoles is consistently described by a combination of QNMs, including overtones, over the whole available time range from the very formation of this horizon or shortly afterwards. To this end, we consider the multiple-tone model given by Eq. (14) with all complex frequencies set to the QNM values, i.e., α l0n = β l0n = 0 ∀l, n: where we have again dropped the m = 0 index and the implicit l dependence on the free parameters: A n ≡ A l0n and φ n ≡ φ l0n . We then check for the agreement of such a combination of QNMs to the numerically computed shear modes and multipoles as the total number n max of overtones is varied. We here aim at directly comparing the above class of models to the NR results, rather than at accurately estimating best-fit frequencies or amplitudes. Accordingly, for a more straightforward comparison and interpretation, in this subsection we will directly use the models and NR datasets without applying a rescaling procedure such as that of Eq. (21).

Shear modes
We first consider the shear modes σ l , with, as in the previous subsection, 2 ≤ l ≤ 12. We begin by considering, for each l, how the mismatch M between the best-fit model on [t 0 , t f ] and the NR data, improves as more and more overtones are included in the model, depending on the fit starting time t 0 . The results are shown in Fig. 9 for a sample of l values (l = 2, l = 4, l = 7 and l = 11), with M as a function of t 0 and for multiple values of the total number of overtones n max . We do not go beyond t 0 /M = 30 here as the number of data points in the remaining time interval would become too low for the fitting algorithm to always converge, especially for large n max . The mismatch between the highest-resolution NR data (res = 240) and the NR results at the next-highest resolution (res = 180) on the time range [t 0 , t f ] is also shown as a function of t 0 , as an estimate of the numerical error, for comparison. We have also checked the validity of this mismatchbased estimate of the NR error in a similar way as explained for the local error 180−240 in footnote 7. The other values of l not shown here typically display characteristics intermediate between those presented below.
For n max = 0 we find again that the fundamental QNM alone matches the shear modes better and better towards later times, but also that for a given t 0 , the quality of the fit provided by this fundamental mode alone overall degrades for increasing l. Both of these trends still hold for models additionally including a small number (e.g., n max = 1 or 2) of QNM overtones.
Furthermore, we observe that the quality of the fit clearly improves for almost any t 0 as the number of overtones is increased. For l = 2 and l = 4 here, one can notice a relatively sharp decrease of the mismatch at small t 0 to small values M < 10 −5 for a certain number n max of overtones, beyond which adding more overtones decreases the mismatch less significantly. This occurs at n max = 3 for l = 2 and around n max = 4 for l = 4. This suggests that these numbers of QNM overtones provide a good modeling of the entire dataset for these modes, at least beyond the first 0.5M or so after t bifurcate . Such a trend does not appear as clearly for the higher l values shown here, but the mismatch still gets down to small values at early t 0 for a sufficiently large number of overtones. The number of overtones needed for the mismatch to stay below a certain threshold appears to increase with l. For large l, the numerical error becomes too large for reliable constraints on large numbers of overtones, and for estimating the number of overtones needed for the mismatch to stay below a too small threshold. For l = 11 for instance, the mismatch with the res = 180 results reaches M 10 −4 for certain values of t 0 , and it appears that any improvement in the mismatch by increasing the number of overtones beyond n max = 6 lies within this numerical uncertainty.
While a useful synthetic quantitative tool, the mismatch is based on absolute deviations between the data and the model, and for this reason it does not necessarily clearly reflect how well the damped behavior of the modes is represented by the model at all times. For such damped data, it will more accurately measure the relative deviations to the model towards the early parts of the time interval considered.
For this reason, we also directly examine, for each shear mode σ l with 2 ≤ l ≤ 12, the best-fit multiple-tone QNM models as arising from Eq. (23), and we compare them to the corresponding NR data, as the number of included overtones increases. A fixed value of t 0 shortly after t bifurcate is used in the fitting process. We show in Fig. 10 on a logarithmic scale the NR l = 2 shear mode and the best-fit models for several successive n max values. Two additional examples are similarly presented in Figs. 11 and 12, corresponding to the l = 5 and l = 10 modes respectively. Numbers of overtones lower than those shown never provide a relevant match to the shear modes beyond a very short time range past t 0 . Conversely, higher numbers of overtones than those shown provide no visible improvement.
For this analysis we have selected a fit starting time t 0 = (3/1.3) M 2.3M. We thus start fitting the data fully within the early decay regime, but not immediately at the formation time t bifurcate 1.06M. This avoids the times immediately after t bifurcate , up to t 1.3M, where an even steeper decrease is observed due to the infinite slope occurring at the bifurcation with the inner horizon (see Sec. V A and Fig. 16). The choice of t 0 yet about 1M further beyond this very specific regime also allows us to evaluate the robustness of the fit results, by checking for the continued agreement to the data at times preceding t 0 .
In overall agreement with the mismatch investigations above, we find that the behavior of each of the shear modes 2 ≤ l ≤ 12 is well described, over the broad time range [t 0 , t f ] considered, by a combination of a sufficiently large number of QNM overtones. For the l = 2 mode, a combination of two overtones captures well the damping and oscillation features of the mode at all times, and three overtones are enough to ensure a very small relative deviation at all times even including the times t < t 0 . The same remarks hold for n max = 5 and n max = 6 overtones for l = 5, while a good modeling of the l = 10 mode requires n max = 11 to n max = 13 overtones (with a larger discrepancy to the data at t < t 0 for n max = 13 in this case). Similar results are found for all modes for adequate values of n max .
As suggested by the above examples, the number of overtones required for an accurate representation of the shear modes increases with l, reaching large values of n max > 10 overtones for l ≥ 10. In general, a reliable modeling of the mode l typically requires n max = l or n max = l + 1 overtones, occasionally up to n max = l + 3 (in particular for the "atypical" cases l = 7 and l = 10). These estimates may be unreliable for l ≥ 10 as the numerical uncertainty may become larger than the contribution of some of the highest overtones considered over most of the time range. In the mismatch analysis above, we pointed out that the improvements of the mismatch to the l = 11 mode while adding more overtones beyond n max = 6 are within our estimate of the numerical error. From a similar consideration, for the example of the l = 10 mode shown here, the models with n max ≥ 10 overtones may actually be poorly constrained for this value of t 0 due to the uncertainties in the NR results.
Nevertheless, it is quite remarkable that the late-time damped oscillations, the intermediate-time regime and the early steep decay without oscillations of each of the shear modes can be consistently captured by a sum of QNM tones, with a relatively small number of overtones for small l. As the real frequencies of the first few modes typically remain close to that of the fundamental mode and thus close to the frequency of the observed late-time oscillations, one would in particular expect such a sum of QNMs to feature oscillations and zeros over the range where the shear modes undergo a steep decay. Instead, the observed oscillation-free early-time regime is well reflected by the best-fit QNM model for large enough n max , including (in nearly all cases) the domain t < t 0 which is not involved in the fitting procedure. . For the lower two panels, for the sake of readability we have only included the models with the fundamental mode only (n max = 0) or with odd numbers of overtones. Mismatch curves with even numbers of overtones remain consistent with the overall trend of decreasing M with increasing n max and typically lie in between the curves corresponding to the adjacent odd numbers.

Multipole moments
We can repeat the above analysis for the mass multipoles for 2 ≤ l ≤ 12 (still using the spin-weight-2 QNM frequencies). The results are qualitatively very similar to those obtained for the shear modes. Fig. 13 shows the mismatch between each of two example multipoles, I 2 (left panel) and I 7 (right panel), and the corresponding best-fit models of the class described by Eq. (23), as n max and t 0 are varied. The numerical error is estimated in the same way as for the shear in Sec. IV B 1 above, and is again shown as an orange continuous line. We find again a decrease of M as t 0 increases towards t f at fixed n max at least for n max ≤ 2, and also a decrease in mismatch with increasing number of overtones at fixed t 0 . We note that in this case the sharp decrease in mismatch at early t 0 with increasing number of overtones, for l = 2, already occurs at two overtones (vs. three overtones for the shear l = 2 mode). We also generally find again larger mismatches for larger l, for a given number of overtones and a given t 0 .
We then directly compare the best-fit models to the NR multipole results for a fixed fit starting time t 0 as n max increases. We use again the same value of t 0 2.3M as for the similar direct comparisons made for the shear modes above in Sec. IV B 1. Two examples of such comparisons for the multipoles are shown on a logarithmic scale in Fig. 14 (I 2 ) and in Fig. 15 (I 4 ) for the relevant numbers of overtones. Consistently with the mismatch results for l = 2, I 2 shows a small qualitative difference with the results obtained for σ 2 . Namely, a two-overtone model already ensures a very small relative deviation to the NR I 2 data at all times, while a comparable accuracy required three overtones for σ 2 . For I 4 , on the other hand, a comparable match is not reached under n max = 6, and the best-fit seven-overtone model atypically features an oscillation within the range [t bifurcate , t 0 [. We note that this the NR l = 2 shear mode (black dots) and the associated best-fit models of the class (23) (blue continuous lines) including, from left to right and top to bottom, n max = 0 to n max = 5 overtones. The entire NR dataset is shown, i.e., from t = t bifurcate to t = t f . All of the fits shown in this figure were obtained using the same fit starting time value t 0 2.3M. The corresponding {t = t 0 } vertical line, indicating which part of the dataset (to the right of this line) was actually used to constrain the model, is indicated in red on each plot. One can note the good agreement of the model to the data both after and before this starting time for n max ≥ 3. multipole has a lower magnitude at early times than the surrounding ones I 2 to I 6 (see Fig.1, bottom panel), which may be related to this lower fit quality compared to the other modes.
More generally, as for the shear, good matches to the behavior of each multipole I l are obtained over the whole time domain, when including at least n max = l or n max = l+1 overtones (or occasionally slightly more, such as for l = 4). In most cases, such a good match also extends back to t < t 0 . Note that for the multipoles, the models with n max ≥ l overtones may be poorly constrained (due to numerical uncertainty) for l ≥ 9.  Fig. 10 for the l = 5 shear mode and n max = 2 to n max = 7 overtones. The same fit starting time is used. A relatively good agreement to the data is obtained both after and before the fit starting time at n max = 6, and this is further improved at n max = 7.

V. IS THE EARLY HORIZON DYNAMICS REALLY DUE
TO OVERTONES?

A. General considerations
In the previous section we found, rather surprisingly, that the first few shear modes and multipoles could be well described by a combination of QNMs including the fundamental mode and a few overtones, over the whole time interval available -or at least ignoring the first ∼ 0.3M immediately after horizon formation. This conclusion also holds for the larger values of l considered (i.e., at least up to l = 12), although larger numbers of overtones are needed as l increases. In this section, we consider various criteria, described below, in order to assess the robustness and physical relevance of such a QNM combination description. We shall see, however, that a clear answer remains elusive.
In section IV A we confirmed that each of the shear modes and multipoles appears to be fully described asymptotically by the corresponding spin-weight-2 fundamental QNM alone  Fig. 10 for the l = 10 shear mode and n max = 8 to 13 overtones. The same fit starting time is used. A good qualitative agreement to most of the data, including before the fit starting time, is obtained for n max = 11 and improves further at late times for n max = 12 and 13, although an oscillating behavior -atypically-reappears in the best-fit model at very early times for n max = 13.
for t → ∞. We however observed deviations from this at intermediate times (e.g., around t = t r 15.4M). We noted that a detectable residual presence of rapidly decaying QNM overtones could be expected in this regime, assuming that the nonlinear deviations to equilibrium are already negligible at these times [60,85]. At earlier times t 8M however, the area of the outer common horizon is still varying steeply (see Fig. 3), suggesting a still dynamical regime for the horizon at those times. Accordingly, one might not expect the QNMs of the final Schwarzschild black hole to account well for the evolution of the shear flux and geometry of the common horizon from almost immediately after its formation.
In particular, at the time t bifurcate of the common horizon formation, the observables on this horizon, such as the shear modes and multipoles, have an infinite slope as a function of our time coordinate t. This is not a numerical artifact but rather a direct consequence of the bifurcation of the inner and outer common horizons at their joint formation. This is il-  Fig. 10 for the l = 2 multipole, for n max = 0 to n max = 3 overtones. We use again the same fit starting time as for the shear modes, t 0 2.3M. The model matches well the data both after and before this time at n max = 2, and this improves further (especially at t < t 0 ) at n max = 3.  Figs. 10 and 14 for the l = 4 multipole, for n max = 2 to n max = 7 overtones. The same fit starting time t 0 is used. A relatively good agreement to the data is found after and before this time for n max = 4, improving for increasing n max , with the exception of n max = 7 where the best-fit model displays an unusual behavior for t < t 0 . lustrated in Fig. 16 by considering the shear modes on both of the common horizons near their formation and bifurcation time t bifurcate . As a consequence of this infinite slope, no finite sum of QNMs (or any damped sinusoids) can strictly match qualitatively the outer horizon shear modes or multipoles as a function of t for t t bifurcate . This might however only be due to a coordinate singularity on the horizon. The simulation time t which we use is a time coordinate adapted to our spacetime slicing, and such an infinite slope should indeed occur for any choice of slicing by Cauchy surfaces equipped with an adapted time coordinate. On the other hand, this coordinate is not suitable around t t bifurcate for the description of the smooth 3-surface formed by the union of both common horizons. One could imagine using instead, for example, the radius of the horizon as a more adapted coordinate for this purpose. This will be discussed elsewhere. In this work we shall be content with using the simulation time t, discarding a short time range of about 0.3M after t bifurcate from our analyses. This range corresponds to a short faster-than-exponential decrease that can be observed in the shear modes and multipoles and that matches the vertical tangent at t = t bifurcate . We have seen that the shear modes and multipoles can be well described by combinations of QNM tones at all times past this short regime.
In the present section, we thus aim at investigating whether the results we presented in section IV should really be interpreted as the physical presence of initially high-amplitude QNM overtones determining the entire evolution of the outer common horizon past the first ∼ 0.3M. We must also consider the alternative -that these results are simply an artifact of fitting the observables σ l and I l with damped sinusoids with sufficiently many free parameters.
We note in particular that the modes at large l can only be well matched by a sum of QNM tones if this sum extends to a large number n max of overtones, leaving a large number of degrees of freedom for fitting the data (that is, 2n max + 2 degrees of freedom; with the minimum n max required ranging from 10 to 14 for 10 ≤ l ≤ 12). The numerically computed shear modes and multipoles feature a steeper and steeper early-time (t birfurcate + 0.3M t t birfurcate + 3M) damping as l increases. On the other hand, the damping rates of the QNMs for a given n are independent of l to first approximation, but increase with n (see, e.g., Sec. 3.1 of [87]; an illustration of this can also be seen in Table I of the present work, right panel). It was thus expected from the observed behavior of the shear modes and multipoles that their modeling in terms of QNMs would require higher overtones for larger l. It is not obvious however from a theoretical perspective that a larger early-time amplitude of high overtones should have been expected a priori for higher shear modes or multipoles.
One may come back to the generalized model of Eq. (14) with multiple tones (n max ≥ 1), leaving the parameters α l0n and β l0n free, and attempt to check if the best-fit model indeed recovers the QNM frequency values, corresponding to α l0n = β l0n = 0. Unfortunately, already for n max = 1 and even more for larger n max , the frequency deviation parameters appear to be very hard to constrain -even when the rescaling procedure of Eq. (21) is applied. These parameters typically feature large fitting uncertainties and overlaps between tones or with zero-frequency models (α l0n = −1). This is likely due to the small differences between the QNM real frequencies (used as reference values) for successive tones, as well as to the rapid decay of QNM overtones. Accordingly we cannot really conclude on the actual presence of QNM overtones from such an analysis. This does however suggest that constraining the deviations of the complex frequencies of a combination of damped sinusoids from the theoretical QNM frequency values can be very challenging in general, even for zero-noise, low-systematic error data.
In the following subsections we will thus rather probe the robustness of the QNM modelling using several tests of fit stability and fit comparison. These tests provide more insight than the approach mentioned hereabove, even though they still do not allow us to reach a definitive conclusion. We will simply focus on the shear for this investigation, and specifically on the l = 2 shear mode as an example and as an easy case that can be modeled with a relatively small number of overtones.

B. Comparing models with different numbers of overtones but an equal number of free parameters
We first consider the relative quality of the fits provided either by a sum of a few QNM tones, or by another model of the general class of Eq. (14) with less modes but some of the parameters α l0n and β l0n left free rather than being set to zero. We choose the second model in such a way that both models have the same number of free parameters. We will consider two such pairs of models, with respectively four and six free parameters. We compare the models within each pair in terms of their mismatch to the NR l = 2 shear mode for the best-fit parameters (without applying a rescaling such as that of Eq. (21)), as a function of the fit starting time t 0 . These results suggest a small preference at nearly all times for the one-overtone model with QNM frequencies over a single-damped-sinusoid model even though the complex frequency of the latter is freely adjusted. The improvement in mismatch does however occur at most but not all values of t 0 , and barely goes beyond 1 order of magnitude when it occurs. In particular, for most values of t 0 /M 28, i.e. at very late times, we obtain similar values of the mismatch M for both models considered. This is consistent with Fig. 7 since, in this regime, deviations to a fundamental-mode-only model (with QNM complex frequency) are expected to be mostly negligible. Fig. 18 shows the similar mismatch for two six-parameter models. Both models assume the presence of the fundamental QNM, which we seem to recover asymptotically at late times. They both consider an additional contribution, which takes the form of either the first two QNM overtones, or of a single damped sinusoid with unconstrained complex frequency. The first model (green continuous line) thus corresponds to the (n max = 2)-overtone model with QNM frequencies of the class of Eq. (23), with free parameters {A 0 , φ 0 , A 1 , φ 1 , A 2 , φ 2 }. The second model (red dashed line) corresponds to the general ansatz of Eq. (14) for n max = 1 and with α l00 and β l00 set to zero. The free parameters in this case are {A l00 , φ l00 , A l01 , φ l01 , α l01 , β l01 }. No clear preference is found for either model, both of them alternately having the lowest mismatch for various ranges of t 0 , and with very small differences between both mismatch values.
Hence, a {fundamental QNM + first QNM overtone} model is only marginally preferred to a single-damped-sinusoid model, and assuming the presence of the fundamental QNM, we cannot conclude about the additional presence of two QNM overtones vs. that of an arbitrary single additional damped sinusoid. This neither confirms nor rules out the actual presence of QNM overtones, but hints again quite strongly at the difficulty of confidently determining a) the presence of overtones and b) the frequencies of multiple damped sinusoids that may be present in the data.

C. Comparison to a toy model with altered real frequencies
We now investigate how the quality of multiple-tone fits depends on deviations in the frequencies of the tones with respect to the QNM values. We here focus on the real frequencies, noting that the imaginary frequencies (or damping rates) of successive QNM tones n are well separated, while the corresponding real frequencies vary by smaller amounts for small n values (Sec. 3.1 of kokkotas:1999bd; see also Table I hereabove). For the (l = 2, m = 0) mode considered here, the real frequencies of the first three QNM overtones ω 20n , n = 1, 2, 3, for instance, are smaller than the fundamental-mode one ω 200 by about 7%, 19% and 33% respectively (see Table I, left  panel).
For this purpose, we consider a family of arbitrary toy models following the general ansatz of Eq. (14), with a variable total number of overtones n max . We define this family by setting α l00 and all the β l0n parameters to zero, i.e., we keep the fundamental QNM and we keep all damping rates at the QNM values, and by setting the other α l0n parameters, n > 0, to a specific choice of nonzero values. Our choice here is to set every α l0n such that the real frequency of each tone in the model stays equal to the fundamental QNM real frequency: ω l0n (1 + α l0n ) = ω l00 ∀n. We thus end up with the following family of models, parametrized by n max : The free parameters of the models are the amplitudes A l0n and the phases φ l0n . We probe the ability of these artificial models to match the shear l = 2 mode at all times as n max is varied, in a similar way as was done, e.g., for Fig. 10 in section IV B 1. That is, we set the same early fit starting time t 0 ∼ 2.3M as for the latter figure and we directly study the relative deviation of the best-fit model to the NR data (and to its general behavior) for each n max , without using a rescaling such as that of Eq. (21) in the fitting process. Fig. 19 shows the results similarly to Fig. 10 with the bestfit model for each n max as a continuous line and the NR data as dots, as a function of t and on a logarithmic scale. We show here only the most relevant values of n max = 2 and n max = 3. n max = 0 (fundamental QNM only, already considered earlier) and n max = 1, do not provide a good match to the overall behavior of the shear mode, while values of n max > 3 show little visible difference to the n max = 3 case.
Interestingly, we find again for this artificial model a rather good match to the data for n max = 2, and a very good match at all times (including prior to t 0 but after t t bifurcate + 0.3M) for n max = 3. The results are qualitatively very similar to those obtained with the multiple-QNM-tones model of Eq. (23) in section IV B, despite the unphysical real frequency values used here for the overtones. Hence, our conclusions of a good modeling of the shear modes or multipoles by combinations of sufficiently many QNM tones are not very sensitive to the actual frequencies (at least regarding the real part) used in the overtones model. We see here that similar conclusions can be reached with models that do not match the GR QNM values for n > 0.

D. Stability of the multiple-QNM fits with the fit time range
We finally study some aspects of the stability of the bestfit parameters when fitting the multiple-tone QNM model of Eq. (23) to the NR data over the range [t 0 , t f ] as t 0 is varied. Such a stability can be seen as a necessary condition for the consistent presence of a set of QNM overtones in the data. If these modes are present, then for instance their amplitudes should be recovered consistently over a range of t 0 values where they are detectable.
We have already mentioned some stability properties of the fits provided by this model for large enough numbers of overtones in section IV B. It is indeed noteworthy that when selecting a fit starting time t 0 2.3M t bifurcate + 1.2M, in almost all cases where any given shear mode or multipole is well matched by the model after t 0 , the nonoscillating damped regime extending before t 0 to t t bifurcate + 0.3M is also well recovered, qualitatively and quantitatively. These QNM models thus consistently match the behavior of the data even at times where they have not been constrained. This suggests that the corresponding QNM overtones are recovered consistently for some range of times around this t 0 .
Here we turn to the investigation of the best-fit amplitude parameters A n obtained for each tone n in multiple-tone models of the class of Eq. (23) (hence, with all frequencies equal to the QNM values), still for the example of the shear l = 2 mode. These parameters are by definition amplitudes computed at the fixed time t r , and we check for their constancy as we vary the time t 0 at which the fit is started. The same or a similar test has been used to check for the presence of overtones in numerical gravitational-wave ringdown models e.g. in [11,15,16].
As we want to retrieve the amplitudes of the tones, we here apply the rescaling procedure given by Eq. (21) prior to fitting 8 . Note that we still expect the amplitudes (at t r ) of the overtones not to be accurately determined for too large t 0 , as the overtones are damped much faster than the fundamental mode and hence still decay in the rescaled data.
We focus first on the model in the case of n max = 3, which we found to be the smallest number of overtones matching very well the behavior of σ 2 at all times. Fig. 20 shows the resulting best-fit amplitude parameters for the fundamental mode and for the three overtones considered, as functions of t 0 , along with their 1σ fitting uncertainties (Eq. (18)).
The amplitude parameter A 0 of the fundamental mode is remarkably constant throughout the figure, providing strong further support for the presence of this mode in the data. The amplitude parameters A n (n > 0) of the overtones, on the other hand, are clearly inconsistent between different values of t 0 4M. This does not really contradict the presence of overtones in the data as these amplitudes are expected to be poorly determined beyond early times once the overtones have decayed. Interestingly however, all overtones have a stable bestfit amplitude parameter over the range 1.5M t 0 /M 4M, which corresponds to the regime of early-time exponential decay. Each of the amplitudes is thus consistently determined over multiple values of t 0 if this regime is accounted for in the fit. We note that the ratios of the overtone amplitudes computed at the horizon formation, A n exp[(t r − t bifurcate )/τ l0n ] (n > 0), to the amplitude of the fundamental mode at the same time A 0 exp[(t r − t bifurcate )/τ l00 ], as determined here from the stable early-time best-fit values of A n and A 0 , are of the order of ∼ 2, ∼ 23 and ∼ 34 for n = 1, n = 2 and n = 3 respectively.
We show for comparison in Fig. 21 the best-fit amplitude parameters obtained in the same way with instead n max = 4 overtones included in the model. The resulting A 0 is still constant over the whole time range considered and A 1 is still relatively stable over the same early-time interval as above, with values roughly consistent with those obtained from the threeovertone model. The higher overtones (n ≥ 2) on the other hand do not appear to be stable over any time range. This may however simply indicate that their amplitudes cannot be constrained accurately enough even in the exponential damping regime due to a large number of free parameters and a too 8 More explicitly, the rescaled model reads in this caseh x (t) = A 0 cos[ω l00 ∆t+φ 0 ]+Σ nmax n=1 A n cos[ω l0n ∆t+φ n ] exp[−(τ −1 l0n −τ −1 l00 ) ∆t], which we fit to the rescaled datah NR (t) = h NR (t) exp[+τ −1 l00 ∆t], with ∆t = t − t r . The amplitude parameters A 0 , A n are by definition the amplitudes of each mode at the fixed time t r , and they are formally neither affected by this rescaling nor by changing the fit starting time t 0 . The best-fit values found for these parameters, on the other hand, may vary, e.g. if the modes are not well recovered by the fitting procedure when they have been highly damped, or if the data contains more than the QNMs included in the model.
FIG. 20. Best-fit amplitudes at t r (with 1σ uncertainties as given by Eq. (18)) as a function of the fit starting time t 0 for the fundamental mode and overtones in the (n max = 3)-overtone QNM model of Eq. (23), for the shear l = 2 mode. The rescaling procedure given by Eq. (21) has been used before fitting.
quickly decaying fourth overtone. For σ 2 , n max = 3 seems to be an optimal number of overtones that models well the data at all times while still allowing the amplitude of each tone to be correctly constrained. The considerations of this subsection -including the discussion recalled hereabove from section IV B -still do not provide any definitive conclusion about the actual presence of QNM overtones in the shear modes or multipoles, in particular since the stability of the best-fit model over some time range is a necessary, but not a sufficient condition for their presence. Yet these results perhaps represent the most supportive clue that we obtain in favor of the behavior of the horizon being indeed dominated by QNMs from shortly after its formation. The results of the previous subsections would not directly con- tradict such a statement. They would rather point towards the difficulty of separating QNM overtones from any other combination of damped sinusoids with roughly comparable complex frequencies (and thus of deciding on the presence of QNM overtones vs. such other decaying modes), and even of determining how many tones would have non-negligible contributions.

VI. CONCLUSION AND DISCUSSION
We have shown in this paper that the dynamics of the final apparent horizon in a binary black hole merger can be very well described by the quasinormal modes of the final black hole, from shortly after this horizon is formed onward. We have studied two quantities of interest, namely the shear of the outgoing normal at the horizon, and the horizon multipole moments; both of these are well modeled by quasinormal modes provided a large enough number of overtones is included. We have considered here a high-precision numerical simulation of a head-on collision of nonspinning black holes, but we expect these results to qualitatively hold for other configurations (of higher astrophysical interest) as well.
We have first confirmed that the behavior of each of the shear modes σ l and of the horizon mass multipole moments I l , for 2 ≤ l ≤ 12, is dominated at late times by the corresponding fundamental quasinormal mode. This is compatible with linear perturbation theory, which can be expected to hold in this regime and predicts an asymptotic predominance of the fundamental quasinormal modes since the associated overtones have shorter damping times. This result strengthens the conclusions of [60], and supports the presence of correlations between the emitted gravitational waves and the dynamics of the final black hole horizon (cf. [22]). Deviations from a description only in terms of the fundamental mode are however also evident, especially at early and intermediate times. This is accounted for by also including the higher overtones. We have shown that the shear and multipole moments, for essentially the entire time after the common horizon formation, are well described by superpositions of quasinormal modes including the overtones.
These results are in qualitative agreement with studies of the gravitational waveform extracted far away from the source. For example, in [11] it is found that the dominant (l = |m| = 2) harmonic of the gravitational waveform for a particular quasicircular initial configuration (with mass ratio 1.22 and moderate spins aligned with the orbital angular momentum) is well modeled right up to the peak of the strain by including up to seven overtones. The ability to detect and separate the successive overtones in the early stages of the ringdown, before they have decayed, would improve the prospects for black hole spectroscopy, and for observational probes of the black hole no-hair theorem.
In the present work, we have probed another part of spacetime by focusing on the horizon of the final black hole; we however expect strong correlations between both dynamics, arising from the same source [22]. The simpler geometry in our study, and the focus on the horizon, allowed for a high numerical precision and for an investigation of all geometric modes up to l = 12, rather than just the dominant l = 2 mode, for both the shear and the multipole moments. For all of these modes, we have obtained similar qualitative results. In particular, in the case of the l = 2 mode, we have found that two to three overtones suffice for an accurate modeling of both variables from shortly after the horizon formation onward. We have however also noticed the general increase in the number of overtones necessary for a good description of the geometric l mode as l increases.
Such results remain surprising because, shortly after the common horizon is formed, it is highly distorted and cannot be described as a linear perturbation of a Schwarzschild horizon. As evidence for this, we have noted that the area of the horizon increases at a very significant rate in this regime. The total relative change in area is only of about 6% however, so it could be argued that perturbation theory is still of some utility. One can then wonder whether obtaining a description of the horizon dynamics in terms of ringdown modes in this regime implies that the horizon is, in some suitable sense still to be understood, still a small perturbation of a stationary black hole. We have accordingly studied, through various possible criteria, whether one should conclude a) at the linear perturbation spectrum indeed already driving the horizon dynamics at early times, or b) at the more prosaic alternative that the quasinormal modes are just a suitable function basis for the shear and the multipoles, so that there is no deeper interpretation of these results.
Despite this investigation, and given the lack of a calculation from first principles, a conclusive answer to this question is still elusive. We have noted that the infinite slope featured by all shear modes and multipoles at t bifurcate prevents their formal description by a finite sum of QNMs at horizon formation, at least in terms of the t parameter used. Nevertheless, this constraint does not rule out such a model even at only slightly later times such as during the observed "early-time" decay phase at 0.3 (t − t bifurcate )/M 3. Hypothesis a) is supported by the stability observed to some extent in the bestfit amplitudes of each mode when the time t 0 at which the fit is started varies and spans the early-time range quoted above. This stability also manifests itself in the continued qualitative agreement of the model to the data at times prior to t 0 that is typically observed when t 0 lies in this range. On the other hand, models with lower numbers of overtones but some of the overtone frequencies let free to deviate from the QNM values, did not show a clear preference for the QNM overtones spectrum. The same was found using an example toy model with real frequencies artificially set slightly away from the QNM values. This is compatible with hypothesis b), but these results do not rule out the actual predominance of overtones at early times, hypothesis a), given that a clear preference for non-QNM frequencies was not found either. This rather hints at the difficulty of resolving individual modes in a sum of damped sinusoids with frequencies comparable to that of the QNMs, and of determining how many such modes can be included and constrained, even with essentially noise-free data. We expect these issues -including the overall difficulty of firmly ruling out the predominance of nonlinearities over overtones at early times-to hold similarly when the ringdown is analyzed from the emitted gravitational waves, complicating an overtone-based spectroscopy. The lack of a clear-cut recovery of the QNM overtone frequencies, in particular, was indeed also observed for the dominant (l = |m| = 2) gravitationalwave mode during ringdown in a binary black hole merger simulation in [15].
Turning now to future directions, there are a few straightforward possible extensions of the present work. First, the present analysis may be completed by a closer look at the more involved behavior of the vector modes ξ l (see Fig. 2 and the associated brief discussion in section II B 3). Within the same setup as considered here, it would also be natural to try other parametrizations of "time" to circumvent the infinite slope at t bifurcate in each of the observables as functions of t. This could also allow for a consistent joint treatment of both the inner and outer common horizons, which constitute indeed a single smooth hypersurface in spacetime. Second, one can look for a possible generalization of the results to a wider variety of configurations, including the astrophysically important quasicircular orbits and accounting for black hole spin. Third, a more fundamental investigation of the mechanisms driving the early-time dynamics of the outer common horizon could shed more light onto the fast exponential decays observed at these times for all of the shear modes and multipoles. We have found here that quasinormal overtones can indeed combine in such a way as to produce this behavior. An investigation of the mechanisms driving it could either provide more insight into why such a combination would take place, or rule out quasi-normal modes as a relevant explanation for this (possibly still nonlinear) regime altogether.