Unbiased estimation of gravitational-wave anisotropies from noisy data

One of the most exciting targets of current and future gravitational-wave observations is the angular power spectrum of the astrophysical GW background. This cumulative signal encodes information about the large-scale structure of the Universe, as well as the formation and evolution of compact binaries throughout cosmic time. However, the finite rate of compact binary mergers gives rise to temporal shot noise, which introduces a significant bias in measurements of the angular power spectrum if not explicitly accounted for. Previous work showed that this bias can be removed by cross-correlating GW sky maps constructed from different observing times. However, this work considered an idealised measurement scenario, ignoring detector specifics and in particular noise contributions. Here we extend this temporal cross-correlation method to account for these difficulties, allowing us to implement the first unbiased anisotropic search pipeline for LIGO-Virgo-KAGRA data. In doing so, we show that the existing pipeline is biased even in the absence of shot noise, due to previously neglected sub-leading contributions to the noise covariance. We apply our pipeline to mock LIGO data, and find that our improved analysis will be crucial for stochastic searches from the current observing run (O4) onwards.


I. INTRODUCTION
During its first three observing runs, the LIGO-Virgo-KAGRA Collaboration detected dozens of gravitationalwave (GW) signals from binary black hole coalescences [1][2][3], with the fourth run currently ongoing.This number is expected to increase to several thousand by the end of the fifth observing run [4].As a result, interest is now shifting to other, undetected observables, like the GW background (GWB)-a persistent signal made up of numerous superimposed GW sources thoughout the history of the Universe [5,6].With the recent strong evidence for a GWB in the nanohertz band provided by pulsar timing arrays [7][8][9], we can expect exciting developments in the field within the next few years.
While the GWB can originate from many different early-universe cosmological mechanisms [10], the astrophysical GW background (AGWB) [11] arising from the combined emission of compact binary coalescenses (CBCs) is expected to be the dominant component.In the past years, significant research has been devoted to probing the anisotropies in the AGWB, as these could provide us with valuable information with regards to the large-scale structure (LSS) of the Universe and, in particular, galaxy clustering [12][13][14][15][16][17][18].
Much like other cosmological observables such as the cosmic microwave background (CMB), the anisotropies in the AGWB are typically probed by means of their angular power spectrum, C ℓ .A significant challenge in the case of the AGWB is that the finite rate of CBCs leads to temporal shot noise, which can significantly bias inferences of the angular power spectrum of the AGWB [19][20][21].It is essential that this bias be properly accounted for in order to calculate the true angular power spectrum.Reference [22] introduced a technique to remove this shot noise bias by performing a cross-correlation of GW sky maps produced from nonoverlapping epochs within the full dataset.However, this work did not consider limitations related to GW detector noise and interferometer orientation.In this paper, we show both in theory and in practice how the optimal C ℓ estimator first introduced in Ref. [22] does indeed compute the correct angular power spectrum even after one accounts for the aforementioned effects, unlike the estimator currently used by both the LVK Collaboration and other authors [23][24][25][26] to probe the angular power spectrum of the AGWB [27].
As a by-product of this analysis, we show that the existing search pipeline is biased even in the absence of shot noise, due to sub-leading contributions to the covariance matrix.These contributions are sourced by the GWB signal itself, which is much weaker than the noise power in the detectors, and is thus usually neglected.However, as we show explicitly below, these terms can be significantly larger than the angular power spectrum one is searching for, and therefore bias the C ℓ estimator by more than 100%.The method we describe here is robust to these errors, and is thus the first unbiased search pipeline for GWB anisotropies with LVK data.
The paper is organised as follows.In Sec.II, we show how the current C ℓ estimator is biased with contributions from detector noise, shot noise and the signal itself, while the optimal estimator is by construction unbiased.Section III describes the anisotropic analysis we perform on mock data to validate our theoretical predictions, for detector noise sampled from O3 and LIGO A+ sensitivity.We also make predictions for the currently ongoing O4 observing run.We conclude in Sec.IV by discussing the importance of our findings for current and increased detector sensitivity, and providing possible future extensions of our work.

II. UNBIASED SEARCH METHOD
Our data vector P (referred to as the cross-spectral density (CSD) in relevant literature [28]) is formed from cross-correlations between the strain measured in the different detectors in our network, with I ̸ = J labelling these detectors, and τ the length of the time segments used to measure this cross-power (typically 192 s in anisotropic LVK analyses, with 50% of the data overlapping [26]).Each cross-correlation is formed from data at a particular frequency f (which can be positive or negative) and time segment t.We assume that the strain data, ŝ, are Gaussian, so that the first and second moments of P are where ⟨• • •⟩ N denotes an average over noise realisations, N I (f, t) is the noise power spectral density PSD in detector I, and γℓm IJ (f, t) are the spherical harmonic components of the overlap reduction function (ORF) of detector baseline IJ.We have absorbed a frequency-dependent factor into the definition of these compared to the definition usually found in the literature, as this simplifies many of the expressions below, where α is the assumed spectral index of the astrophysical/cosmological signal and f ref = 25 Hz.This definition factors out the frequency-dependence of the signal, so that the GWB spherical harmonics Ω ℓm depend only on time, due to different realisations of time-dependent shot noise.(Here we assume for simplicity that the signal is a power-law in frequency, as expected for an astrophysical background from quasi-circular binary inspirals well before merger; incorporating more detailed modelling of the frequency dependence of the signal may allow for more effective removal of the shot noise [29].) We estimate the spherical harmonic components of the GWB as [27] where the dirty map Xℓm and Fisher matrix Γ ℓm,ℓ ′ m ′ are defined as Xℓm = (γ ℓm | P ), Here we have introduced a noise-weighted inner product over time-, frequency-, and baseline-dependent complex functions, Note that there is a subtlety associated with the quantity we are using to do the noise weighting, PI .To leading order this is just the noise PSD N I , but since it is measured from the strain data, it will inevitably include some contribution from the GWB signal.In principle, since this is a data-dependent quantity, we should treat it as a random variable.However, in practice it is estimated independently of the data segment under consideration (e.g. using neighbouring segments), so that it can be effectively considered as a deterministic quantity.It is still unclear exactly what value this deterministic noise estimate will take, particularly when we consider corrections of order Ω ℓm .As we show below, this is a deficiency of the existing search method, which can be rectified in a way that also minimises the impact of shot noise.
To see this, it is necessary first to write down a form of Eqs.(2) and (3) which averages over shot noise and cosmological fluctuations, as well as detector noise.The first and second moments of the GWB spherical harmonics under these averages are [22] where C ℓ is the true GWB angular power spectrum that we are aiming to measure, and W τ ≫ C ℓ is the shot noise power in a segment of length τ , which can be calculated for a given model of the AGWB using expressions in Ref. [22].We see that the shot noise gives an excess contribution to the variance of GWB spherical harmonic measurements made at the same time t = t ′ , and that this contribution is spectrally white (i.e.independent of angular multipole ℓ).Combining Eqs. ( 2), (3), and ( 9), we find A naive first guess for an appropriate C ℓ estimator is to compute the average power in the (2ℓ + 1) spherical harmonics corresponding to a given multipole ℓ, Using Eqs. ( 6) and ( 10) we can show that the expected value of this quantity is The second term here is to be expected, and simply reflects the fact that the monopole (ℓ = 0) is not zero-mean like the other spherical harmonics.The third term is a noise-induced bias, which is obtained by expanding out the expressions in Eq. ( 6), This expression involves contributions from pure detector noise, shot noise, and from the signal itself.In the limiting case where there is no signal and no shot noise, this simplifies to which is the standard expression first derived in Ref. [27].
(We have assumed that PI = N I in this case.) In existing LVK analyses [26], this leading-order contribution to the bias is explicitly subtracted from the angular power spectrum estimates, so that the corresponding estimator is given by However, it is clearly not possible to set the GWB signal and shot noise to zero in real data, so Eq. ( 13) tells us that, even if we subtract the usual bias term (14), the current estimator Ĉ(curr) ℓ is still biased by an amount much larger than the signal we are searching for, since The exact size of this bias depends on the properties of the noise estimates PI (f, t), which themselves include further contributions from the GWB.It is possible in principle that one could define a procedure for generating the noise estimates such that the bias is on average given by Eq. ( 14); however, it is not at all clear how to do this in practice.
A better alternative is to notice that all of the contributions to the bias ( 13) can be traced back to terms proportional to δ tt ′ in Eq. (10).In other words, the bias comes from auto-correlations of each data segment with itself.One can thus remove the bias exactly, by explicitly excluding such auto-correlations when constructing the C ℓ estimate.This is the same method that was first suggested in Ref. [22] for removing the bias induced by shot noise; however, we see here that the same procedure removes all contributions to the bias, including those from detector noise.
Practically, one proceeds by constructing multiple dirty maps X(i) ℓm , i = 1, . . ., n, corresponding to disjoint subsets (or epochs) of the total dataset, each with its own Fisher matrix Γ (i) ℓm,ℓ ′ m ′ .The optimal C ℓ estimate is then formed by summing over all nonequal pairs of maps, One can then explicitly show, by a very similar calculation to those above, that this is unbiased, We can calculate the variance of this estimator, using the fact that the spherical harmonic estimates Ω(i) ℓm are Gaussian by the central limit theorem (since they are each formed by summing over many independent data segments).The general expression is very lengthy, but in the weak-signal regime where Let us consider the simplest case, where the Fisher matrices for each of the subsets of the data are equal to one another, and are thus given by 1/n times the total Fisher matrix, We can then simplify Eq. ( 18) to give This is equal to the variance found in Ref. [27] for the current estimator, times a factor of n/(n−1).We therefore see that our unbiased estimator is equally efficient in the limit of large n, just as in Ref. [22].

III. VALIDATION ON MOCK DATA
A. O3 sensitivity

Pipeline & settings
To run our anisotropic analysis, we make use of the publicly available code PyStoch [30,31].This pipeline has been designed and tested on O3 folded data [32] (CSD & PSD), for which it computes the dirty map and Fisher matrix either in pixel space or spherical harmonic space.The user specifies an assumed spectral index, α; a frequency range (the whole GWB frequency range in the standard LVK analysis corresponds to [20,1726] Hz); a detector baseline; and the number of pixels or maximum multipole they want to probe, ℓ max .
In our case, we adopt α = 2/3 to consider an astrophysical background [33] and run PyStoch in spherical harmonic space assuming the LIGO Hanford-Livingston (HL) baseline.We produce mock CSDs1 made up of the following components: • Detector noise.This is generated as random Gaussian realisations (different at every element in our time-frequency CSD array) of the folded data [34] at O3 sensitivity.For increased LIGO sensitivity (see section III B), we use the expected amplitude spectral densities (ASD's) from Ref. [35].
• Shot noise.We generate shot-noise sky maps (different at every time segment in our data) with Healpy [36,37] as random realisations of the same flat C ℓ spectrum (corresponding to white noise).We then multiply each map with the ORF (given by Eq. 4) to produce a shot-noise CSD.
• Mock AGWB signal .We simulate a sky map that represents the AGWB signal.Again, this is done with Healpy, using a model of the cosmological C ℓ spectrum from Ref. [19] (see section III A 3).We subsequently multiply this map with the ORF (Eq.4) to produce a CSD for the GWB.
We produce dirty maps, Fisher matrices and clean maps from our simulated CSDs by means of Eqs. ( 5) and ( 6).In the results that follow, we have not applied any regularisation schemes to the Fisher matrix inversion.See Appendix B for a discussion on the impact of regularisation in our study.Given that we are working in the weak-signal regime, we expect the associated statistical uncertainty in the C ℓ calculation to be large compared to the injected values for the short-duration (∼ 1 yr) datasets simulated here.
Our PyStoch-based analysis is sufficiently fast that we can simulate many independent realisations of shot noise and detector noise.We achieve this by choosing appropriate values for the maximum spherical multipole (ℓ max = 8) and the analysed frequency range ([20, 520] Hz), which allow us to perform our simulations relatively cheaply while still capturing the key features of the problem.We stress however that our methods are general, and can be applied to a broader range of multipoles and frequencies in a full analysis.

Validation with existing estimator
First we confirm that the existing estimator ( Ĉ(curr) ℓ ) performs as expected in the weak-signal regime by running PyStoch on simulated detector noise, without injecting shot noise or an underlying cosmological signal.In Fig. 1, we plot the clean-map angular power spectra of 1, 000 random detector noise realisations (blue curves), along with their mean (black curve).As expected, this mean curve exhibits excellent agreement with the weak-signal limit of the bias, N (lim) ℓ , as given by Eq. (14) (dashed yellow curve).In Fig. 2 we plot the estimator Ĉ(curr) ℓ , which corresponds to the same spectra but with the weaksignal limit of the bias subtracted.At O3 sensitivity, the shot-noise and GWB contributions to the bias in the current estimator are much smaller than the uncertainty of the estimate, so that the resulting curve is consistent with zero for pure detector noise.In the same figure, we plot the expected variance of this current estimator from our calculations (Eq.( 20) in the n → ∞ limit), as well as the sample variance observed in our ensemble of 1, 000 noise realisations.The excellent agreement between these is a useful consistency check for our simulation pipeline.

Modelling shot noise and large-scale structure
We now turn to the full problem, in which the simulated data contain a GWB signal with intrinsic angular correlations due to cosmic large-scale structure, as well as spurious correlations due to temporal shot noise.We simulate the maps that we inject in random detector noise realisations using realistic values for the shot noise and LSS (monopole and anisotropies) cited in Ref. [19].The reported values in that work for the angular power spectrum are in units of Ω 2 GW at a frequency of f ref = 65 Hz.We consider the shot noise and LSS values estimated for a subtracted foreground of r * = 250 Mpc (some degree of foreground subtraction must be assumed in order to avoid a divergence in the expected shot noise power at short distances).See Appendix A for a detailed conversion of these results into the right units and f ref for our analysis.
In Fig. 3, we plot the predicted bias for the existing estimator Ĉ(curr) ℓ in the weak-signal limit, as given by Eq. ( 14), along with the expected angular power spectra for the shot noise and underlying AGWB.Everything has been normalised with respect to the monopole, C ℓ → C ℓ /(4π Ω2 ).We note that for the detector noise and the shot noise, the assumed amount of time over which each data set has been collected is one year.For O3 sensitivity, the detector noise is orders of magnitude above the shot noise and AGWB signal, so that we are unambiguously in the weak-signal regime.It is also clear that for O3 sensitivity, we cannot expect to probe the angular power spectrum of the AGWB.This is consistent with the  non-detection of such a signal in O3 data [26].

Injected map simulations and angular power spectra
We generate detector noise and shot noise assuming one-month data sets, such that both vary randomly at (blue) estimators in a realistic O3-like scenario.We plot the expected variances as error bars and the sample variances as error bands.Everything has been normalised with respect to the AGWB monopole.For 1 year of O3 data, the AGWB (dashed grey curve) is very small compared to the uncertainties in the angular power spectrum estimate.
every time segment in our data sets.We then inject the same, fixed underlying LSS signal to each data set.
In total, we produce 1,200 clean maps corresponding to one month of data each.In Fig. 4, we plot the angular power spectra for these clean maps, along with the ensemble mean.This once again matches the bias in Eq. ( 14), as expected in the weak-signal regime.
We subsequently distribute the clean maps in 100 sets of 12 maps each to construct one-year data sets 2 .For each set, we calculate the Ĉ(curr) ℓ and Ĉ(opt) ℓ estimators.The results are plotted in Fig. 5, along with the predicted variance (given by Eq. ( 20) for Ĉ(opt) ℓ , and the same expression in the n → ∞ limit for Ĉ(curr) ℓ ) and sample variance for both estimators, which are in good agreement in both cases.Due to the weak signal and short timespan (relative to O3 sensitivity), both estimators have large uncertainties, and are both equally consistent with the injected signal.(The variance of the unbiased estimator is larger by a factor of 12/11, as expected; this factor can be reduced by dividing the data into a larger number of subsets.)This confirms that the effects of shot noise and the bias in the existing analysis method are both negligibly small at O3 sensitivity.In the section below, we show that this will no longer be true at design sensitivity. 2The corresponding dirty map and Fisher matrix for each one-year dataset that are used in the Ĉ(curr) ℓ estimator calculation are obtained by adding up the 12 different dirty maps and Fisher matrices, respectively.ℓ spectrum (black curve) and the expected (O5) bias in the weak-signal regime (dashed yellow curve) are also plotted.The discrepancy between the black and yellow curves indicates the presence of bias due to beyond-weak-signal-limit effects.Everything has been normalised with respect to the AGWB monopole.For 1 year of O5 data, the AGWB spectrum (dashed grey curve) is still small compared to the uncertainties in the angular power spectrum calculation.

B. O5 sensitivity
Advanced LIGO is expected to reach design sensitivity, A+, during the O5 observing run [38].We run simulations for this detector sensitivity to see how the two C ℓ estimators perform in this setting.We follow the procedure that is outlined in section III A, with the sole difference that the detector noise is generated for A+ instead of O3 sensitivity.
Again, we produce 1,200 clean maps corresponding to one month of data each.We plot the angular power spectra of all clean maps in Fig. 6.This time, their mean clearly lies above the weak-signal expression for the bias, N (lim) ℓ .This is unsurprising, as the monopole and shot noise terms in Eq. ( 13) are significantly larger compared to the detector noise at A+ sensitivity than at O3 sensitivity.As a result, the limiting-case expression for the expected bias is no longer sufficient to describe the actual bias in the analysis.Therefore, the estimator Ĉ(curr) ℓ we obtain after subtracting the detector bias will still be biased by the sub-leading contributions, which are no longer negligible.In contrast, the optimal estimator introduced in this work is always unbiased; hence, we expect the corresponding curves to lie below those of the current estimator for these simulations.
As before, we distribute the clean maps in 100 sets of 12 maps each and plot the current and optimal estimators in Fig. 7, along with their expected and sample variances.This time, there is a lack of agreement between the sample variance for each estimator and the theoretical prediction given by Eq. ( 20), with the latter being consistently smaller.This can be understood as being due to the breakdown of the weak-signal approximation; one now needs to account for additional contributions to the variance coming from the shot noise and the AGWB.We also see that the optimal estimator Ĉ(opt) ℓ returns a spectrum that is, on average, significantly below that of the current estimator Ĉ(curr) , and is thus closer to the injected C ℓ signal.This reflects the bias inherent to the current estimator beyond the weak-signal regime, and illustrates that it will be crucial to use the unbiased estimator Ĉ(opt) ℓ once A+ sensitivity is reached.

C. Discussion
We close this section with a comparison of our mock simulations for O3, O4, and O5 sensitivity.We repeat our analysis for the projected O4 sensitivity as well to include a forecast of the effect on present-day data.Much like for the case of A+, we notice that the mean C ℓ curve for the 1,200 produced clean maps lies above the detector bias.However, the discrepancy is not so clear for O4, starting as negligible at lower multipoles, and only becoming measurable at higher multipoles.Looking at the performance of the two estimators, we therefore find that the optimal estimator performs visibly better than the current one only for ℓ ≳ 5.
In Fig. 8, we summarise our results for the current (left panel) and optimal (right panel) estimators, for the three different detector sensitivities that we assumed during our analysis.The different C ℓ estimators are plotted as dotted curves of different colours (summarised in the figure legend).Again, the error bands denote the sample variances, while the AGWB (dashed grey curve) is also plotted.
For both estimators, increasing the detector sensitivity results in narrower error bands, as expected.In the case of the optimal estimator, these bands are always centered on the injected AGWB signal; however, for the current estimator, they deviate away from the AGWB, towards positive values (particularly for higher multipoles).Hence, while the optimal estimator yields more precise and accurate C ℓ measurements for increased detector sensitivity, the bias in the current estimator becomes increasingly evident.
We quantify the deviation of the mean of the Ĉ(curr) ℓ estimator from the AGWB signal in terms of the sample standard deviations difference between them, which is summarised in Table I.

IV. SUMMARY AND OUTLOOK
In this study, we have investigated methods for measuring GW background anisotropies in the presense of both detector noise and temporal shot noise.We started by describing the C ℓ estimator currently used in LIGO-Virgo-KAGRA analyses, and demonstrated that this estimator is biased due to contributions from the shot noise and from the target GW signal itself.We then presented a new estimator which avoids this bias by only cross-correlating between data segments at unequal times, validating our predictions on simulations using mock data.The two estimators produce similar results for O3 sensitivity, where we confidently lie in the weak-signal regime.However, we find that for LIGO A+ sensitivity it is crucial to use our new estimator to obtain accurate inferences of the GWB angular power spectrum, with some improvement over the current estimator already visible at O4 sensitivity.
In the future, we plan to extend this work by testing both estimators on real O4 data.Without the computational cost of running the analysis on thousands of simulated data sets, it will be possible to probe higher spherical harmonic ℓ modes, in the whole GWB frequency range ( [20,1726] Hz).It will also be straightforward to include more detectors in the analysis-such as Virgo and KAGRA-and perform the analysis on multiple detector baselines.This will improve the conditioning of the Fisher matrices involved in the angular power spectrum estimation, allowing access to higher angular resolution.
In addition, it will be interesting to test the capabilities of planned 'third-generation' detectors in probing GW anisotropies.It is already clear that the large bias in the angular power spectrum for future detectors will render the current search method unsuitable.On the other hand, the optimal estimator will be able to measure the LSS anisotropies with precision orders of magnitude better than for current detectors.performance of the current estimator in the presence of detector noise, shot noise, and the AGWB is not affected significantly in the case of O3 sensitivity (first tested in Fig. 5).However, the result changes significantly for O4 and O5 sensitivity.In Fig. 9, we plot the current (top) and optimal (bottom) estimator, for O4 (left) and O5 (right) sensitivity and the CNC and RE regularisation schemes.The CNC scheme has an almost negligible impact on the Fisher matrix inversion and the current and optimal estimators (same for O3), giving almost the same results as reported in Section III.This is because the chosen CNC is very close to the Fisher matrix's condition number in this case.On the other hand, imposing the RE scheme significantly affects the Fisher matrix inversion and results in an enhancement of the bias in the current estimator.As for the optimal estimator, this performs equally well for both regularisation schemes.Furthermore, the discrepancy among the theoretical uncertainties and sample variances for O4 and O5 sensitivity appears greater for the RE scheme.We conclude that when using a more aggressive regularisation scheme, error estimates are reduced and hence the noise-bias becomes more evident in the current estimator, and the weak-signal approximation looses validity.

Figure 1 .
Figure 1.The angular power spectra of 1,000 clean maps (blue curves), as recovered from simulations with detector noise only at O3 sensitivity (mock 1-month data sets).The mean Ĉ(raw) ℓ spectrum (black curve) matches the expected bias in the weak-signal limit, N (lim) ℓ (dashed yellow curve).

Figure 2 .
Figure 2. The mean Ĉ(curr) ℓ estimator of 1,000 simulations with detector noise only, at O3 sensitivity.We plot the predicted variance of the estimator with error bars, and the sample variance in our ensemble of simulations as a shaded region.

Figure 3 .
Figure 3. Expected angular power spectrum of the temporal shot noise (black curve) and the AGWB (dashed grey curve).The dashed yellow curve denotes the expected bias in the weak-signal limit, given by the O3 (HL) detector noise.Both the detector noise and the shot noise correspond to 1 year of data.All curves have been normalised with respect to the AGWB monopole.

Figure 4 .
Figure 4.The angular power spectra of 1,200 clean maps (blue curves), normalised to the monopole, as recovered from simulations with O3 detector noise, shot noise and the AGWB (mock 1-month data sets).The mean Ĉ(raw) ℓ spectrum (black curve) matches the expected (O3) bias in the weak-signal regime (dashed yellow curve).

Figure 6 .
Figure 6.The angular power spectrum of 1,200 clean maps (blue curves), normalised to the monopole, as recovered from simulations with O5 detector noise, shot noise and the AGWB (mock 1-month data sets).The mean Ĉ(raw)

Figure 7 .
Figure 7.Comparison of the performance of the Ĉ(curr) ℓ (orange curve) and Ĉ(opt) ℓ (blue) estimators.We plot the expected variances as error bars and the sample variances as error bands.Everything has been normalised with respect to the AGWB monopole.For 1 year of O5 data, the AGWB spectrum (dashed grey curve) is still small compared to the uncertainties in the angular power spectrum calculation.