Mock data study for next-generation ground-based detectors: The performance loss of matched filtering due to correlated confusion noise

The next-generation (3G/XG) ground-based gravitational-wave (GW) detectors such as Einstein Telescope (ET) and Cosmic Explorer (CE) will begin observing in the next decade. Due to the extremely high sensitivity of these detectors, the majority of stellar-mass compact-binary mergers in the entire Universe will be observed. It is also expected that 3G detectors will have significant sensitivity down to 2-7 Hz; the observed duration of binary neutron star signals could increase to several hours or days. The abundance and duration of signals will cause them to overlap in time, which may form a confusion noise that could affect the detection of individual GW sources when using naive matched filtering; matched filtering is only optimal for stationary Gaussian noise. We create mock data for CE and ET using the latest population models informed by the GWTC-3 catalog and investigate the performance loss of matched filtering due to overlapping signals. We find the performance loss mainly comes from a deviation in the noise's measured amplitude spectral density. The redshift reach of CE (ET) can be reduced by 15%-38% (8%-21%) depending on the merger rate estimate. The direct contribution of confusion noise to the total signal-to-noise ratio (SNR) is generally negligible compared to the contribution from instrumental noise. We also find that correlated confusion noise has a negligible effect on the quadrature summation rule of network SNR for ET, but might reduce the network SNR of high detector-frame mass signals for detector networks including CE if no mitigation is applied. For ET, the null stream can mitigate the astrophysical foreground. For CE, we demonstrate that a computationally efficient, straightforward single-detector signal subtraction method suppresses the total noise to almost the instrument noise level; this will allow for near-optimal searches.

The next-generation (3G/XG) ground-based gravitational-wave (GW) detectors such as Einstein Telescope (ET) and Cosmic Explorer (CE) will begin observing in the next decade. Due to the extremely high sensitivity of these detectors, the majority of stellar-mass compact-binary mergers in the entire Universe will be observed. It is also expected that 3G detectors will have significant sensitivity down to 2-7 Hz; the observed duration of binary neutron star signals could increase to several hours or days. The abundance and duration of signals will cause them to overlap in time, which may form a confusion noise that could affect the detection of individual GW sources when using naive matched filtering; matched filtering is only optimal for stationary Gaussian noise. We create mock data for CE and ET using the latest population models informed by the GWTC-3 catalog and investigate the performance loss of matched filtering due to overlapping signals. We find the performance loss mainly comes from a deviation in the noise's measured amplitude spectral density. The redshift reach of CE (ET) can be reduced by 15%-38% (8%-21%) depending on the merger rate estimate. The direct contribution of confusion noise to the total signal-to-noise ratio is generally negligible compared to the contribution from instrumental noise. We also find that correlated confusion noise has a negligible effect on the quadrature summation rule of network SNR for ET, but might reduce the network SNR of high detector-frame mass signals for detector networks including CE if no mitigation is applied. For ET, the null stream can mitigate the astrophysical foreground. For CE, we demonstrate that a computationally efficient, straightforward single-detector signal subtraction method suppresses the total noise to almost the instrument noise level; this will allow for near-optimal searches.

I. INTRODUCTION
It has been seven years since Advanced LIGO detected the first gravitational-wave (GW) event GW150914 [1] on September 14th, 2015. During these seven years, Advanced LIGO and Advanced Virgo have continuously moved toward their design sensitivities [2][3][4][5], and have performed three observation runs (O1, O2, and O3). Nearly 100 GW signals from compact binary coalescence (CBC) have been observed, of which more than 90 are from binary black hole mergers (BBH) [6,7], two come from binary neutron star inspiral (BNS) [8,9], and two black hole neutron star mergers (NSBH) [10] with relatively high significance. With the increasing number of observations, we know more about the population properties of these CBC sources [11][12][13]. KAGRA, a GW observatory in Japan that is under active development [14], conducted a joint observation with GEO600 [15] in Germany at the end of O3 [16]. Before 2030, a third LIGO detector, LIGO-India, is expected to be operating [17], and all these detectors will be further upgraded toward the A+ [5,18,19], AdV+ [20] and KAGRA+ [21] configurations. At that time, there will be five kilometer-scale observatories [22].
After 2030, these detectors will be joined by more advanced next-generation (3G/XG) GW detectors, such as Einstein Telescope (ET) [23][24][25][26][27][28] in Europe and Cosmic * alex.nitz@aei.mpg.de Explorer (CE) [29][30][31] in the United States. There will also be space-borne GW detectors, such as LISA [32,33] and either Taiji [34][35][36][37][38] or TianQin [39][40][41][42]. For the 3G detectors on the ground, the low-frequency sensitivity will be improved to enable observation from 2-7 Hz, down from the 10-20 Hz of second-generation (2G) detectors. GW signals will stay in the detectors' sensitive frequency band for hours or even days [43,44]. The higher sensitivity of 3G observatories means a significantly higher detection rate (O 10 5 BNS mergers per year); the increased detection results in numerous signals overlapping in time. This problem is in fact more serious in spaceborne GW detectors, where O 10 8 white dwarf binaries from the Milky Way and nearby galaxies produce a GW foreground noise [45][46][47][48]. For stationary and Gaussian noise, the matched filter is the optimal linear filter that maximizes the signal-to-noise (SNR) [49]. For the 2G GW detectors, the instrumental noise can at most times be assumed to be stationary and Gaussian, with notable additive nonstationary "glitches" [50][51][52]; methods based on matched filtering are widely used in the search of CBC signals [53][54][55].
If in sufficient abundance, numerous overlapping signals can form another kind of noise in addition to the detector noise, that is, confusion noise. The authors of [56] first proposed that Einstein Telescope might be affected by confusion noise. We note that "confusion noise" is used to refer to a large population of unresolvable GWs; here we collectively refer to the foreground population of overlapping signals, many of which are individually resolvable, but retain sufficient numbers to behave as an additive noise source. A few years later [57] launched the first mock data challenge for Einstein Telescope; they simulated binary neutron star (BNS) sources according to a state-of-art population model at that time, and used the ihope pipeline [54,58,59] based on matched filtering together with a null stream method (a technique to cancel the GW strain in the data for better detector noise estimation) [60][61][62][63][64][65][66][67][68][69][70] to analyze the simulated data. They found no significant impact of confusion noise in searching with data from Einstein Telescope. A later analysis produced consistent results [71]. These early mock data challenges were conducted before significant constraints on the CBC population were available. Recently, several groups have begun to study the influence of overlapping signals on the parameter estimation for CBC sources [72][73][74][75][76][77]. There is also a search study for overlapping signals in 2G cases [78].
Currently, there are a few methods to improve signal detection in the signal overlapping case. The "null stream" is a method of cancelling astrophysical sources in detector data to achieve better power spectral density (PSD) estimation [60][61][62][63][64][65][66][67][68][69][70]. It requires (1) at least 2 coaligned (colocated) detectors [62,65] or 3 detectors if not colocated, (2) the PSD's shape of each detector should be the same [57], (3) it is assumed that there are only two polarizations of GWs, which is consistent with the general theory of relativity [79,80]. For distant detectors, the null stream becomes sky-dependent [64], it can only cancel the GW signal incident from a specific direction at a time (because the null stream combination is dependent on the time delays). Thanks to the structure of ET (closed-loop composed of three V-shaped detectors [23]), the distances of the three subdetectors (E1, E2, E3) are very close, so the time delay between them can be ignored; all GW signals can be canceled by appropriately summing the data from these three subdetectors. However, these assumptions (especially the first one) might be unrealistic for CE, so we need to develop other complementary methods. For LISA, a "global fit" [81] is usually used to avoid the bias caused by signal overlapping. In this method, the number of signals in the data is also used as an unknown variable and the parameter estimation of all signals is done simultaneously. The degeneracy of parameters and convergence of the sampling in such high-dimensional parameter space are potential problems.
In our study, we create mock data for CE and ET. Unlike the previous two ET mock data challenges, we use CBC population models based on the latest GWTC-3 observational constraints. We use these simulated data to investigate several key factors that might negatively affect the performance of matched filtering, such as the bias in PSD estimation, the bias from the SNR contribution of overlapping GW signals, and the bias caused by the confusion noise's correlation between different detectors. We perform a computationally optimized matched filtering search to demonstrate identifying and subtracting the majority of high SNR sources. We find that postsubtrac-tion, we are able to obtain a PSD close to the design sensitivity for CE. This can be used as the first-stage foreground cleaning before more sophisticated searches.
The structure of this paper is as follows: In Sec. II, we introduce the CBC population models used in this paper and how to simulate the time-domain mock data of CE and ET based on these models. In Sec. III, we derive the matched filtering equations in the presence of overlapping signals. In Sec. III A, III B, and III C, we investigate the effects of overlapping signals on PSD estimation, cross term calculation, and network SNR calculation under correlated noise. After that, we present our signal subtraction method and results in Sec. IV. Finally, Sec. V is the summary and discussion.

II. POPULATION MODELS AND MOCK DATA GENERATION
In this paper, we need to simulate a realistic population of the known types of CBC GW sources, such as BBH, BNS, and NSBH, and then inject their GW signals into simulated instrumental noise. We create a simulated set of sources that follows the population estimates of the GWTC-3 catalog [13]. Researchers have started to use the observed mergers to constrain parametric models of the population [11,12,[82][83][84]; the nearly one hundred current observations place some constraints on the BBH population, however, the BNS and NSBH populations remain highly uncertain due to the small number of observed sources.
A. The population models Quasicircular CBC systems can be described by 15 parameters; these include intrinsic parameters such as mass and spin of the components and extrinsic parameters such as sky localization, luminosity distance, orientation angle, and polarization angle. In systems containing neutron stars (such as BNS and NSBH), there are also the tidal deformation parameters of the neutron star [85,86]. We choose the prior that sources have an isotropic distribution of viewing angle, polarization angle, and sky localization. For the remaining intrinsic parameters and the distance distribution, we base our choice for the population of BNS, NSBH, and BBH parameters on the most constrained models from the latest GW catalogs.
To specify the distribution of source-frame BBH primary mass and mass ratio, we use the results of the power-law+peak mass distribution model (shown in Fig. 10 of [13]). A recent study shows that the mass distribution of black holes in NSBH systems also agrees with this distribution [84], so we adopt the same mass distribution model in the BBH and NSBH systems. For the mass distribution of neutron stars in BNS and NSBH systems, we use the distribution of the power-law model, shown in Fig. 7 of [13]. There are only four confident observations containing at least one neutron star (two BNS events, GW170817 [8] and GW190425 [9], and two NSBH events, GW200105 and GW200115 [10]) and the LVK uses all of these events to constrain the mass distribution of neutron stars. Due to the limited number of observations, the NS's population properties are poorly constrained.
We choose the spin amplitude distribution of black holes in both BBH and NSBH systems following the distribution shown as the solid black curve in Fig. 15 of [13]. In this work, we use an isotropic spin orientation distribution for BBH sources, as related parameters are not highly constrained. For the NSBH and BNS systems, consistent with the current observations of low spinning neutron stars [8][9][10], we assume the neutron star to be orbit-aligned and slowly spinning in NSBH systems and nonspinning in BNS systems. We ignore the tidal deformation of the neutron star due to its relatively small effect on the signal's SNR [43,87].
The distribution of sources in luminosity distance or redshift and the total merger rate determines the number of signals present in the simulated data. We adopt the simulation method of [43,57,71]; we assume that all CBC systems are generated by stellar evolution, more precisely the evolution of field binaries [88], and we ignore dynamical encounters in the dense environment [89] and primordial black holes formed in the early Universe [90]. Since all the CBC systems in the simulation come from stellar evolution, the redshift distribution of these CBC sources must be directly related to the star formation rate (SFR). We use the SFR in [91]. The coalescence rate density in the source-frame is the convolution of the star formation rate and the time delay probability distribution [43], (1) whereρ 0 is the local coalescence rate density (in the unit of Mpc −3 yr −1 ), which is equivalent to a rescaling factor, and f (z) is the normalized coalescence rate density, such that f (0) = 1 andρ(0) =ρ 0 . The SFRρ * is in the unit of M Mpc −3 yr −1 . P (τ ) is the probability distribution of the time delay τ . The time delay τ = t(z) − t (z f ) refers to the total time from the formation of the binary progenitor system (when redshift is z f ) to the merger of the compact binary system due to GW emission (when redshift is z). This delay is determined by the difference between the lookback time of z and z f , and the lookback time at redshift z is defined as where H 0 is the Hubble constant, and Ω Λ and Ω m are the densities of dark energy and nonrelativistic matter respectively. In this paper, we assume the standard ΛCDM cosmology [92], so H 0 = 67.74 km s −1 Mpc −1 , Ω Λ = 0.6910, and Ω m = 0.3075. In Eq. (1), we convert the integral variable from the time delay τ to the redshift z for convenience. At present, there are several models of time delay distribution P (τ ), such as Gaussian delay model [93], log-normal delay model [94], power-law delay model [94] and inverse delay model [95], the first three are derived from actual observations, and the last one is suggested by the population synthesis [96][97][98][99][100] and used by the previous ET mock data challenges [57,71]. We also use the inverse delay model in this paper, which means P (τ ) ∝ 1/τ . In order to obtain the distribution of the event rate of CBC as a function of redshift in the detector frame, we need to multiply the coalescence rate density expressed by Eq. (1) by the comoving volume element dV (z)/dz, and then divide it by 1 + z caused by the time dilation, so we get the following equation where the comoving volume element dV (z)/dz is described by where c is the speed of light in the vacuum, and D L is the luminosity distance between the CBC source and the detector, which is defined as For the local merger rate of each CBC type, we choose the rates based on LVK's population paper of GWTC-3 [13] and the public presentation [101]. For BBH signals, we choose 22 Gpc −3 yr −1 and 45 Gpc −3 yr −1 as median and upper local merger rates respectively, 250 Gpc −3 yr −1 and 1900 Gpc −3 yr −1 for BNS signals, and 170 Gpc −3 yr −1 and 320 Gpc −3 yr −1 for NSBH signals. Note that in the latest version of [13], they have changed the rate of NSBH to lower values (several months after our project started), but in this paper, we still use their original values. The coalescence rate of BBH, NSBH, and BNS in the detector frame as a function of redshift is shown for the median local merger rate case as the upper plot of Fig. 1. We draw the redshift (luminosity distance) of the GW signal from this distribution. When the redshift is higher than 20, there are few CBC sources generated by the stellar evolution, so we choose to simulate GW sources only up to redshift z max = 20.
We can get the average time interval between two adjacent GW signals of the same type (the overline means the average) as, For median local merger rate cases, we find the average time intervals for BBH, BNS, and NSBH are 359.4 s, 31.6 s, and 46.5 s respectively. For the cases of upper local merger rate, we find the average time intervals are 175.7 s, 4.2 s, and 24.7 s respectively.

B. Mock data generation
We create mock data for both Einstein Telescope and Cosmic Explorer. According to population models in the previous Sec. II A, we use either the median or upper local merger rate to simulate a population of sources and then project the GW signal to each detector, according to the equation where α is the index of each detector, F α + (θ, φ, ψ) and F α × (θ, φ, ψ) are antenna pattern functions for the two GW polarizations, which depend on the sky localization (θ, φ) and polarization angle ψ. These parameters are sampled from distributions defined in Section II A. We generate and save each type of GW signal separately, then add them into the simulated Gaussian detector noise to create two datasets; each dataset has ∼ 6 hours of data, one for the median local merger rate case, the other one for the upper local merger rate case.
We use the latest time-domain phenomenological higher-order mode model IMRPhenomTPHM [102] to simulate BBH and NSBH sources. As the mass ratio of BBH and NSBH systems can be large, higher-order modes will have an impact on GW waveforms [103]. We expect the tidal deformation of neutron stars in NSBH systems is relatively weak [43,104], so we do not use the two existing NSBH models which include the tidal effect [105,106]. In addition, these two models only consider the dominant (2, ±2) mode. For BNS sources, we use the frequency-domain phenomenological model IMRPhenomD [107] instead of the post-Newtonian waveforms [108] or IMRPhenomPv2 NRTidalv2 [109]. Next-generation GW detectors may observe high-redshift BNS signals, so the merger and postmerger part of the redshifted signal might be within the detector's sensitive frequency band [44]. However, accurate models of merger and postmerger waveform of BNS are still under development [110][111][112], we use the merger and ringdown parts of IMRPhenomD to mimic them. We also neglect the tidal deformation of neutron stars for BNS systems in this paper. In the source frame, the postmerger signal of a BNS is in the kHz frequency band [112], according to our population model most BNSs are at z ∼ 2. The SNR of a BNS postmerger at 70 Mpc is ∼ 8 for ET, whereas the inspiral would have an SNR of ∼ 1000 (we have used IMRPhenomPv2 NRTidalv2 model to calculate SNR for its inspiral part); the relative SNR in the postmerger is negligible [112]. If we rescale this example signal to z ∼ 2, we find an inspiral SNR is ∼ 12, while postmerger is ∼ 0.2. For the purposes of detection, more than 99.9% of the SNR is recovered without the postmerger signal. We do not expect the presence of a post-merger signal to bias PSD estimation due to the negligible relative SNR which is contained within a shorter duration; most of the data will all be free of contamination by postmerger signals. We neglect tidal effects as they will be completely subdominant for the parts of the signal where there are a significant number of overlapping sources.

Einstein Telescope
Einstein Telescope (ET) [23][24][25][26][27][28] is the European plan for the next-generation observatory following Advanced Virgo [3,20,113]. ET consists of three V-shaped subdetectors (called E1, E2, and E3); they overlap with each other to form an equilateral triangle, with an arm length of 10 km. In order to achieve the required sensitivity at both low frequency and high frequency, each V-shaped subdetector uses a "xylophone" configuration, i.e. there is one interferometer for low frequencies and another for high frequencies [23,26]. This design is intended to bring the sensitive frequency band of ET down to ∼ 2 Hz. At present, the site selection of ET is still under evaluation. We use the fiducial coordinates and orientation defined in the LALSuite [114] and assume the ET design sensitivity curve [23,26]. Each of ET's subdetectors has a different antenna pattern [57]; the V-shaped detector has a sensitivity loss of 1/ sin π 3 compared to an equivalent length L-shaped detector (caused by the opening angle is π 3 , not π 2 ) [26,57]. We neglect the effect of the Earth's rotation in this paper. In our simulations, we set the low-frequency cutoff of the ET dataset to 2 Hz.

Cosmic Explorer
Cosmic Explorer (CE) is a proposed next-generation observatory in USA [29][30][31]. Compared with Einstein Telescope, CE adopts a more conservative design. CE uses the L-shaped configuration of the current-generation observatories, but with an arm length of 40 km. The design enables sensitivity to frequencies down to ∼ 5 − 7 Hz. CE and ET have their own advantages, and they can form a 3G detector network to further improve the overall signal detection capabilities [43,44,115,116]. Similar to ET, the location of CE has not yet been determined. In this paper, we place one CE at the position of the LIGO-Hanford observatory [2]. We use the latest CE design sensitivity [31] to recolor the stationary, Gaussian, and whitened noise, and set the low-frequency cutoff of the CE dataset to 5 Hz. We show an hour of simulated data in Fig. 2. Note that while BBH signals are still largely separated in time, BNS signals overlap in time. Due to the improved low-frequency sensitivity, each BNS signal in the detector sensitivity band will last for hours or even days [44,57,71]. However, we note that the majority of signals will remain distinguishable, because their timefrequency evolution is different from each other.

III. THE BIASES OF MATCHED FILTERING CAUSED BY CONFUSION NOISE
Assuming additive Gaussian and stationary noise, and for a known signal, the matched filter is the optimal linear filter that can maximize SNR [49]. As a consequence, matched filtering is widely used as the basis of modeled searches for CBC sources [53][54][55]. In this section, first we briefly review the basic principles of the matched filtering method, and then we study whether these overlapping GW signals will have negative effects on the performance of matched filtering.
For GW signals from the compact binary coalescence, there are state-of-art methods to numerically simulate the GW signals predicted by general relativity [117,118]. However, in order to meet the speed requirements of data analysis, there are several kinds of approximants, including post-Newtonian approximation models [108], phenomenological models [102,107,119,120], effectiveone-body numerical relativity models [121,122], and surrogate models [123,124]; these models may be compared and calibrated with the results of numerical relativity simulations.
In order to understand the behavior of the matched filter, we first define the scalar or inner product in the frequency domain as below 2. An hour of simulated data for CE assuming median local merger rates and corresponding population models. The gray line represents the sum of all GW signals and detector noise. The black, brown, and green lines represent the injected BBH, NSBH, and BNS signals, respectively. Because of the difference in the event rate and the signal duration, we can see that BNS signals overlap in time, with no gaps between signals, NSBH signals also have large overlaps, while many BBH signals remain isolated.
noise), defined by s(f ) is the Fourier-transformed detector data, defined bys(f ) = +∞ −∞ s(t)e −2πitf dt. The " ·· " here means the average of many noise realizations, not the same as the inner product " ·|· ". In the general case, the timedomain observational data s(t) from a GW detector is (10) which is the linear summation of detector noise n(t) and all GW signals j h j (t) (among them h k (t) is the signal of interest), j is the number of GW signals in this data, and it is unknown before data analysis. If there is only one GW signal h k (t) in the data, then j =k h j (t) in the Eq. (10) disappears.
If we have a template h(t, Θ) where the merger is at t c = 0 and Θ represents all parameters of the wave-form, the template with arbitrary merger time t c is h(f, Θ)e 2πif tc . By substitution into Eq. (8), we can define which is just the inverse Fourier transform of the inner product of s and h. An efficient search for the signal with unknown time can be efficiently done with a Fourier transform. According to Eq. (10), we can rewrite Eq. (11) as Note that the inner product has contributions from the detector noise n(t), confusion noise j =k h j (t) made by overlapping signals, and the particular GW signal h k (t) that we are interested in.
According to Eq. (7), the detector strain caused by the GW signal is a linear combination of two GW polarizations, and the combination coefficients are determined by the antenna response function of the detector, which depends on the sky location and polarization angle of the signal. Together with the source's luminosity distance, chirp mass, and inclination angle, these factors affect the overall amplitude of the signal's strain. For typical CBC matched filtering searches [54,55,125] (quasicircular, nonprecessing, and only the dominant (2, ±2) mode), it is usually assumed that the two polarizations of the signal satisfy the relationship h + = ih × , which means that the two polarizations are related by a phase difference of π 2 . To maximize over both an overall orbital phase and previously mentioned angles, we only need to use one of the polarizations h + along with the complex matched filtering SNR ρ(t), the σ 2 is the variance of the noise and is also known as the optimal SNR, which is the matched filtering SNR when the template perfectly matches the signal in the absence of noise. It can be expressed as In the following subsections, we examine the performance loss of matched filtering caused by the bias in the estimation of the PSD caused by numerous overlapping signals, the bias due to the cross terms between the waveform template and overlapping signals in the data, and the impact of correlated noise on a network of detectors.

A. Power spectral density estimation
In this subsection, we discuss the impact of overlapping signals on PSD estimation and the loss of the detector's horizon distance [31,54,126] due to a biased estimate of the instrumental noise. We can see in Eq. (9) and Eq. (10) that if there are overlapping signals, S tot h (f ) should be higher than the instrumental-only S n (f ). We use the median Welch estimation method [54,127] to calculate the PSD of our mock data. We use 512 s of data to estimate the PSD from 35 different times from each of our datasets. The reason for choosing 512 s is just following the PSD estimation in the current real searches, we also have tried other values, and their results are similar. For each 512 s segment, we use 16 s as the subsegment and 50% overlap to calculate a PSD using the Welch averaging method. We use these 35 PSDs to calculate the mean PSD for the entire dataset as well as the 1-σ confidence interval for each frequency.
On the left of Fig. 3, we show the total estimated amplitude spectral density (ASD, the square root of the PSD) of CE for the median and upper local merger rates. For comparison, we also plot the design ASD of CE and the mean ASD of simulated detector noise (without signals) for comparison. We see that the ASD of the median merger rate data is only slightly higher than the design ASD and the mean ASD of the instrumental noise, just at the boundary of the 1-σ estimate of the instrumental noise, however, there is a significant bias for the upper merger rate dataset.
If the value of the local merger rate is between the range used in our paper, the overall ASD will be between the green and blue lines shown in the figure. For ET, we show similar curves on the left of Fig. 4. It can be seen that the overall impact of overlapping signals on the ASD of ET is much smaller. Only in the case of upper local merger rate, will there be deviations noticeably above the detector noise's mean ASD between 5 Hz and 11 Hz.
To understand the contribution of each kind of source to the total ASD of CE, in the right panel of Fig. 3, we compare the mean ASD of the simulated data only containing each type of source (BNS, NSBH, BBH) to the instrumental noise. Each ASD is plotted as deviations relative to the signal-free detector noise curve. For the median local merger rate case, we notice a deviation in the total ASD, which includes contributions from all signal classes and the instrumental noise, of up to about 10% in the 5 Hz to 60 Hz range, and for the upper local merger rate, in this case, the deviation extends to 100 Hz, and in the range of 10 Hz to 20 Hz, the deviation can reach 20%.
BNS mergers are the main source of bias in the measured ASD. For the median local merger rate case, the BNS-only deviation is mainly concentrated below 10 Hz and less than 10%, while for the upper local merger case, the deviation is most pronounced between 10 Hz and 20 Hz, up to 40%. We notice that the confusion noise from BNS is almost stationary, which means its PSD does not vary too much over time. The contribution of NSBH is far less than that of BNS and for the BBH datasets, regardless of the merger rate, the contribution to the total ASD is negligible.
Similarly, for ET we find the ASD deviation mainly comes from the overlapping BNS signals. The deviation is mainly concentrated between 5 Hz and 10 Hz. For the upper local merger rate case, the deviation is at most about 10%, and for the median local merger rate, the deviation is at most about 5%. The overall deviation is much smaller than that of CE.
In order to study the loss of signal detection caused by biased estimation of the PSD or ASD, we calculate the horizon redshift (distance) under different conditions. The horizon redshift (distance) means the redshift (luminosity distance) of the GW source when the GW source is above the detector plane, face-on, and the optimal SNR is 8, a threshold often used to characterize the sensitivity of the detector [31,54,126]. The results are shown on the left side of Fig. 5, which shows the horizon redshift for GW sources with different source-frame total , and BBH (orange) sources for both the median (solid) and upper (dotted) merger rate cases. If there is a monochromatic GW signal, the total ratio (blue) at each frequency gives the SNR reduction factor due to the confusion noise. In the upper rate scenario, BNS sources dominate the ASD bias and form a quasistationary foreground noise. BBH sources have negligible effects on the ASD for both rate cases because their presence leaves most of the data uncontaminated. Noise estimation that uses median or median-mean Welch averaging is not significantly affected by a small number of outliers at a given frequency. The peak in the ASD bias for CE is around 15 Hz. masses. The confusion noise-free estimates are consistent with [31].
The existence of a large population of foreground signals, left unmitigated, would reduce the sensitivity to CBC mergers; the most extreme sensitivity loss is for sources with source-frame total mass ∼ 10 M . For the median local merger rate case of CE, the horizon redshift loss of CBCs with source-frame total mass less than 10 M will be 5% to 15%, and for the upper merger rate case, the loss will be as high as 15% to nearly 40%. For ET, the loss is generally lower than CE, for the median local merger rate case, the loss is about 2% to 7%, and for the upper local merger case, the loss can reach 5% to 20%.
The loss in sensitivity may impact science at various masses. For example, sources with the total source-frame mass between 2 and 3 M are useful to study the minimum mass of the neutron star [128]. Sources with a total source-frame mass between 1 and 2 M may be primordial in origin and are the target of subsolar searches  [129,130]. For this kind of source, we find a 2% to 25% loss in the horizon redshift. For CBCs with a redshift higher than 20, it is generally considered that they might originate from Population III (Pop III) stars or primordial black hole (PBH) mergers formed in very early Universe [131][132][133][134], because these two types of sources do not follow the stellar evolution, and therefore do not follow the redshift distribution of Fig. 1 (this distribution is only valid for field binaries). The detection of such events at high redshift can be clearly distinguished from the typical stellar-origin CBCs [132,134]; the presence of overlapping signals will also significantly reduce their detection efficiency.

B. Matched filtering cross terms
In this subsection, we discuss the effect that cross terms in Eq. (12) produced by overlapping signals have on the measured matched filtering SNR. In order to visually show the bias caused by this term, we show the matched filter output around an example signal in Fig. 6 for the median merger rate data. For simplicity, we directly use the mass parameters of this injected signal to generate the GW template and then calculate the complex matched filtering SNR according to Eq. (13); we use the corresponding mean PSD (the solid green one) in Sec. III A. To understand the impact of each component of the matched filtering SNR, we examine: (1) the data with only a specific injected signal and no detector noise, (2) the data containing all injected BNS signals but no detector noise, and (3) the data containing all injected BNS signals and the detector noise. The left side of Fig. 6 shows 10 s centered on the injection time.
The visual inspection makes it clear that the impact of overlapping signals is negligible in this example.
We plot the probability density function (PDF) from ∼ 2800 s of each complex SNR time series |ρ(t)| in the right panel of Fig 6. According to Eq. (13), if there is only Gaussian and stationary noise of the detector in the data, then s | h + 2 and s | h × 2 are the squares of two Gaussian variables, so ρ 2 (t) follows the chi-squared distribution with 2 degrees of freedom, and the degree of freedom is 2 because it is the sum of the squares of 2 Gaussian variables. So by definition, the modulus of ρ(t) follows the Rayleigh distribution [125], as shown by the red dashed line on the right side of Fig. 6. The total matched filtering SNR distribution only marginally deviates from the Rayleigh distribution, just slightly shifting to the direction of high SNR, this indicates the confusion noise will slightly increase the signal's SNR on average. As expected, the detector noise dominates in the 2800 s data. The SNR of the GW signal only has a peak around 8 on the right (not shown in this histogram, because of the number of bins). The PDF of the confusion noise's SNR does not follow the Rayleigh distribution expected from the Gaussian noise. The overall value is smaller than 1 and the peak value is close to 0.

C. Correlated noise in the detector network
In this subsection, we investigate the bias caused by the correlation of confusion noise among different detectors. For the matched filtering SNR of the detector network, we generally assume different detectors or data can be combined in quadrature as where ρ i is the matched filtering SNR of the ith detector in the detector network. This formula is strictly valid only when the noise of each detector is statistically independent of each other [135,136]. In the absence of confusion noise, if the distance between detectors is far enough, this is a reasonable assumption. Real GW searches use Eq. (15) to calculate the network SNR [55]. However, for the 3G detector network, even if the instrumental noise is uncorrelated, the confusion noise composed of overlapping GW signals will still be correlated between different detectors [see Eq. (7)], so there will be a correlation in the total noise, and Eq. (15) might no longer hold.
Next, we will discuss the network SNR of the detector network composed of two 3G detectors (such as two CEs) based on the method of [135,137]. From the Fourier transform of the time-domain strain [Eq. (10)], we can obtain the frequency-domain strain of the first detector I and the detector noise n(f ), the confusion noise j =k h j (f ), and the signal h k (f ) are given as n ID (f ), n IC (f ), and h I (f ), respectively. According to Eq. (9), we can get the one-sided PSD for n ID (f ) and n IC (f ) as similarly, for the second detector II we have the same results, just change the index from "I" to "II." We assume the PSDs of these two detectors are same and the confusion noise is isotropic. According to the independence of each component, we have (m, n = I, II) and for the confusion noise in two detectors, we have where γ(f ) is the overlap reduction function (ORF) [138,139], which only depends on the relative position and orientation between two detectors, −1 ≤ γ(f ) ≤ 1. Then we can rewrite Eq. (17) and corresponding equations for the detector II in a matrix form, which is the noise matrix or the PSD matrix [135,137], in the equation, we ignore 1 2 δ (f − f ) for simplicity, and we define P (f ) ≡ S C (f )/S D (f ) as the PSD ratio between the confusion noise and the instrumental noise. As we can see, if there is no confusion noise (P (f ) = 0), the PSD matrix becomes a diagonal matrix, the diagonal element is just the PSD of the detector noise. But in general cases, the diagonal element is the approximation for the PSD of the total noise S tot h (f ) ≈ S D (f ) + S C (f ). We use "≈" because we ignore the phase difference between S D (f ) and S C (f ), but the ensemble average can approximately eliminate this random phase difference.   24)]. The |γ| is the absolute value of the overlap reduction function (ORF) between two GW detectors. The R tot det is the ASD ratio between the total noise and the instrumental noise, note that P which is the PSD ratio between the confusion noise and the instrumental noise. When the weight w(f, P, γ) is very close or equal to 2, that means the quadrature summation rule of network SNR [Eq. (15)] is still valid. We use circles (triangles) to mark the most extreme weights of CE (ET) network for the median rate case (solid) and upper rate case (hollow). These maximum R tot det values are selected from the median and upper merger rate cases in Fig. 3 and 4, then we use the corresponding frequency to select |γ|.
Following [135,137], we generalize the discussion and conclusions with the sky-averaged SNR, rather than for a CBC signal with specific parameters. We replace the numerator in Eq. (14) with the sky-averaged h I (f )h I (f ) * β , here we use the "β" to abstractly represent sky localization and polarization angle, " ·· β " means average over these parameters. We additionally account for the variation of each term with the frequency and the integral over the frequency range, (21) similar to the noise matrix, we have so we have the signal matrix for this two-detector network and we define the weight w(f, P, γ) as "tr" means take the trace of a matrix. When there is no confusion noise, i.e., P (f ) = 0, then Eq. (24) is 2; this is equivalent to the quadrature summation rule. In order to more easily compare the information from the ASD ratio plots, we replace P is the ASD ratio of total noise to that of the instrumental noise. Here we use "≈" also for the ignorance of the phase. In Fig. 3 and 4 we see that the maximum of R tot det (f ) is around 1.30 (1.12) for the CE (ET) upper rate case, and around 1.10 (1.06) for the CE (ET) median rate case. We plot the weight w(f, P, γ) in Eq. (25) as the upper plot in Fig. 7, the lower one is the weight w(f, P, γ) divided by 1+P (f ), which is equivalent to σ 2 2β (with the confusion noise) divided by σ 2 1β (without the confusion noise). The upper plot shows the network SNR loss only caused by the noises' correlation, the lower one also includes the loss from the biased PSD in the single detector case.
For E1, E2, and E3 in ET, |γ(f )| is around 0.375 when f below 100 Hz [57], combined with the maximum ASD ratio mentioned above, we use triangles to mark the most extreme weights of ET in Fig. 7. We can see that the quadrature summation rule of network SNR is still valid for ET in both rate cases. As for CE, although the final sites of two CE detectors have not yet been selected, if we assume they have the same position and orientation as the current two Advanced LIGO detectors, we can use the |γ(f )| from [138], where the |γ(f )| can be around 0.8 at 15 Hz. The weight w(f, P, γ) might be reduced to around 1.8 (1.6) at most for median (upper) rate case. For the w(f, P, γ)/(1 + P ) of CE, it can be around 1.5 (1.0) at the most for median (upper) rate case.
If a signal has a low detector-frame total mass (such as a nearby BNS signal), then it will cross a wide range of frequencies; after the integration in Eq. (24), the final ratio should still be very close to 2. However, for a signal with a very high detector-frame total mass (such as the high redshift signal or the IMBH signal), the frequency range within the detector is very narrow, then the network SNR loss due to the correlated confusion noise will not be negligible. But in general, we can still assume the quadrature summation rule of network SNR is approximately true for two CE detectors, especially so if mitigation is applied as described in the next section.

IV. SUBTRACTION OF BINARY NEUTRON STAR SIGNALS
As shown in Sec. III, the large population of overlapping signals will have a negative impact on the sensitivity of the 3G ground-based GW detectors; this is primarily due to biases in the estimation of the PSD. One straightforward method to reduce the impact is to subtract the known signals from the data. It can be seen from Sec. III A that BNS signals have the greatest impact on 3G detectors, so for simplicity, in this section, we will focus on the impact of a BNS-only population.
Here we choose to do only a single-detector search and subtraction, which is conservative if multiple detectors are operating in a network, however, it demonstrates the worst scenario that only a single highly sensitive detector is operating.
Previous papers have used similar operations in the detection of the cosmological stochastic GW background. For example, [140] and [141] discuss how to reduce the BNS foreground signals from the Big Bang Observer (BBO) data, and then use the "residual noise projection" method to further reduce the residual noise in the previous subtraction step, making it possible to detect the cosmological stochastic GW background through the standard cross-correlation method. There are also related studies on signal subtraction to detect the cosmological stochastic GW background of the 3G ground-based GW detectors, such as [142] and [143]. According to Sec. III, signal subtraction may also be required for more typical CBC searches. As a first step, we test a straightforward signal subtraction method that can be performed without detailed knowledge of the foreground signal population and estimates of their source parameters.
In Sec. III, we described the basic principle of matched filtering. When conducting a search, while we may maximize over the extrinsic parameters analytically as men-tioned in Sec. III, to maximize over the intrinsic parameters, a bank of waveform templates is used which is designed to cover the target parameter space of the analysis [54]. The goal is to find a set of discrete lattice points in the intrinsic parameter space such that any point in this parameter space matches a template in the bank with a degree higher than a certain threshold (usually the minimal match is 0.97). This ensures that the number of missed signals due to the bank's discreteness can be minimized. At present, the methods of generating template banks can be divided into the stochastic [144][145][146][147][148], geometric [149][150][151], and hybrid [152][153][154][155].
In this paper, we use PyCBC's stochastic method used in the 4-OGC search [7]. Since the low-frequency cutoff of ET and CE will be as low as 2 Hz and 5 Hz, therefore, the BNS signal will last for hours or even days in the frequency band of the detector [44]. For the signal subtraction, we choose to generate a template bank starting only from 10 Hz, where the signals will be shorter than an hour. It would be expected that this initial analysis would be followed by a deep analysis after the initial subtraction has been performed. We use IMRPhenomD to generate the template bank, the reference frequency f ref and low-frequency cutoff f min are both 10 Hz, and the sampling rate f s is 4096 Hz. We consider the case of no spin, so the intrinsic parameter is the detector-frame total mass and mass ratio. According to the simulation in Sec. II A and III A, 3G detectors might detect the BNS from high redshift, so the detector-frame total mass might be several tens of times solar masses. For bank generation, we choose [2.4, 60] M as the detector-frame total mass range, and [1, 1.636] as the mass ratio range. The lower bound of the total mass is consistent with the two lowest detector-frame mass NSs in our population simulation, and the upper bound of total mass covers BNS from high redshift. The upper bound on mass ratio is calculated by the lowest and highest mass NS from our population model. Two template banks (with a minimal match of 0.97) are generated using the design sensitivities of ET and CE, requiring 95652 and 152537 templates for CE and ET, respectively.
We analyze mock data containing only BNS signals but otherwise following Sec. II. There are two sets of data for CE, corresponding to the median local merger rate dataset and the upper local merger rate dataset. Similarly, there are also two datasets for ET. We divide each dataset into several 1-hour data segments with 50% overlap, and use the Welch method to estimate the PSD of the first data segment for matched filtering. For each template, we calculate the corresponding SNR time series and record the filter's value, time, and template parameters for each SNR sample that exceeds |ρ| = 6. After the entire template bank is searched, we need to cluster the triggers because the SNR time series peak of the trigger itself has a width (as can be seen from Fig. 6), and different templates generate triggers for the same injected signal. We apply a sliding time window of 1 s and select the trigger with the largest |ρ| within this window. For this study, we neglect triggers that correspond to signals (from 10 Hz) which are only partially within the dataset. The number of false alarms caused by the detector's stationary and Gaussian noise is related to the number of templates N t in the template bank (strictly, we should use the number of effective templates, because some templates in the stochastic bank are redundant), the sampling rate f s , the data duration T , and the SNR search threshold ρ th , then the number of false alarms can be roughly estimated by N t f s T e −ρ 2 th /2 . In our search, at the SNR threshold of 6, most of the triggers are false alarms, however, we find that at a threshold of 7, the triggers are mostly true alarms, this behavior is also predicted by the equation in the last paragraph. At a threshold of 7 and for CE's median local merger rate BNS dataset, the number of true alarms is about 40% of the total injected signals. For CE's upper local merger rate BNS dataset, the number of true alarms is about 33% of all injections. In contrast, for the two BNS datasets of ET (strictly speaking, we just use the E1 detector, not include E2 and E3), the corresponding numbers are just around 3%, because the majority of injected signals for ET have the optimal SNR lower than 7, note that the f min used in our search also limits the capability of ET.
To illustrate the parameter accuracy of these true alarms (parameters obtained by the matched filtering can be regarded as point estimates of true parameters), we show the accuracy of the detector-frame chirp mass and the merger time in Fig. 8, M 0 and t 0 are the true detector-frame chirp mass and true merger time of the injected signal, respectively, while M m and t m are the point estimates obtained by the matched filtering. When we compare triggers and injections, we consider the detector-frame chirp mass and the merger time to be the same up to one decimal place as true alarms. We use the cumulative distribution function (CDF) to show the bias of these point estimates. More than 90% of true alarms in all datasets have a deviation of chirp mass smaller than 0.1% and a deviation of merger time smaller than 0.005 s. For the CE upper rate case, it has a longer tail to larger deviations, this is caused by the possibly very close signals [72][73][74][75][76][77] or the misassociation issue, sometimes there might be more than one trigger that meets our true alarm criteria (within 0.1 s and 0.1 M ) when compared with injections, so we might choose a nearby but wrong injection.
For each identified candidate trigger, we rescale the corresponding template according to the trigger's complex SNR, to reconstruct an estimate of the injected signal. If we express the complex SNR we obtained through the matched filtering as |ρ m | e iϕm and the template associated with the trigger is h + , we get the rescaled template with the following formula We sequentially subtract the rescaled template h(t, Θ m ) corresponding to all triggers from the dataset. As mentioned before, the number of false alarms is related to the threshold |ρ th |. If the threshold is too low (so the match between the template and data is not good enough, the SNR is just contributed by a small portion of the template), it means that we are almost injecting the antiphased waveform of false alarms that were not originally in the dataset, which will increase the deviation of the PSD instead. So in order to examine the effect of FIG. 9. The ratio of the ASD after signal subtraction at different SNR thresholds to the detector noise ASD for CE (upper panels) and E1 (lower panels) and both the median (left panels) and upper merger rate scenarios (right panels). The blue lines represent the ASD deviation before the signal subtraction. The orange, green, and red lines represent the results after subtracting the rescaled trigger waveforms with SNR thresholds of 10, 9, and 7, respectively. The gray lines represent the optimal subtraction of all signals with an optimal SNR greater than 7. For CE, the deviations peak around 15 Hz, our best subtraction results can almost achieve the optimal results above 10 Hz. For E1, the deviations peak around 7 Hz, we cannot achieve better ASD by using the current subtraction method, because most signals are below the SNR threshold 7.
different thresholds on the signal subtraction, we select |ρ th | ∈ {7, 8, 9, 10} to filter triggers, if the threshold is 6, we find that too many false alarms will make the deviation higher than the case without subtraction, so we do not show the results with a threshold of 6. The ASD ratios after the signal subtraction are shown in Fig. 9, we do not show the results for |ρ th | = 8 to make the plot clearer, because they are just between results of 7 and 9. For CE's median local merger rate dataset, the maximum deviation before subtraction reaches 2%-3%. After subtraction by different thresholds, the maximum deviation is reduced. The results of different thresholds are generally similar, but it can still be seen that the lower the threshold, the better the subtraction; the deviation of the PSD is nearly eliminated if using a singledetector SNR threshold of 7. For the CE upper local merger rate dataset, the maximum deviation before subtraction is about 15%. After subtracting triggers with |ρ th | above 10, it is significantly reduced to 6%. It can be seen that the deviation of PSD mainly comes from high SNR signals (|ρ| above 10). Similar to the median rate dataset, as the threshold decreases, the ASD ratio reduces overall, and the best result is the red line. For the ET median rate cases, the bias (about 1%-2%) is around 7 Hz, and the signal subtraction cannot reduce the ASD bias. For the ET upper rate cases, the results are similar to median rate cases, but with a higher bias peak of around 7 Hz. In order to understand the results of signal subtraction, we also show the results obtained by excluding all signals with the optimal SNR higher than 7 in the corresponding dataset, which we can regard as an "optimal subtraction" when |ρ th | = 7, because these results are not affected by false alarms, and there is no residual noise caused by template parameters' deviation. For the results of the CE median rate, the achieved subtraction and the optimal subtraction are consistent above 10 Hz, and in the 5-10 Hz interval, however, the subtraction is marginally worse than the optimal reference, this is because we just extrapolating the waveform to this frequency range [see Eq. (26)], that means much larger mismatch and residual noise here. For the CE upper rate results, similarly, in the frequency range above 10 Hz, there is an agreement between our subtraction and the optimal reference, however, in the 5-10 Hz interval, no significant subtraction is observed. For the ET median (upper) rate cases, all results are very similar to the optimal reference, which means the remaining bias is mainly due to signals with the optimal SNR lower than 7.
To quantitatively study the improvement in detection ability brought by the straightforward signal subtraction, we calculate the optimal SNR loss (assuming the optimal orientation) of equal mass CBC signals with different detector-frame total mass under different ASDs, as shown in Fig. 10. It can be seen that the overall effect of confusion noise on ET is lower than CE. Even for the upper rate total ASD, the observed SNR loss in ET is around 4%. For the median rate total ASD, the overall loss is lower than 2.5%. In contrast, the loss of CE's upper rate total ASD is around 12%. For a signal with a total detector-frame mass of 1000 M , the loss will even reach 18%, and for the median rate total ASD, the loss is also around 5%. For CE, we show the results before and after BNS signal subtraction (|ρ th | = 7), for the median rate BNS ASD, after signal subtraction, the overall loss can be reduced from 1.2% to around 0.2%. For the upper rate BNS ASD, the loss can be greatly reduced from around 8% to 2%. These results are consistent with Fig. 5. For ET, it can be seen from Fig. 9 that very few BNS signals can be detected and subtracted from only ET's subdetector E1. Even for the upper rate BNS ASD, the loss of ET is below 5% for most sources.
Here we discuss if we can further improve the signal subtraction results. As we have mentioned before, the residual noise and the SNR threshold (or false alarms) limit the capability of signal subtraction. First, let us discuss the origin of the subtraction residual. If we regard the template bank as a template manifold [135], the parameters of the real signal are located in h(t, Θ) in the manifold if there is no systematic error in the waveform modeling. Due to the existence of the detector noise n, the total signal s measured by the detector (here we ignore overlapping signals for simplicity) will be located outside the template manifold. Using the entire template bank to perform matched filtering on the data s is equivalent to finding the closest point h(t, Θ m ) to s in the template manifold, so the vector between point s and point h(t, Θ m ) should be perpendicular to the tangent plane at h(t, Θ m ) in the manifold (see Fig. 1 in [135]). In fact, the inner product 8 can be regarded as the projection of a on b, then we have Θ) is the residual noise (with the opposite sign). Here we decompose the detector noise n into n ⊥ and n , which are the perpendicular and parallel components at the point h(t, Θ m ), respectively. As we can see here, n ⊥ | ∂ Θm h(t, Θ m ) should be 0, so n = h(t, Θ m ) − h(t, Θ) = −r(t), which means the residual noise is caused by the parallel component of detector noise n at the point h(t, Θ m ). The residual noise can be further reduced by the residual noise projection method after the first-stage signal subtraction [140,141]. This follow-up method is based on the Fisher information matrix (FIM) and the signal manifold [135], which requires the signal to have a sufficiently high SNR, that is, satisfying the linear signal approximation (LSA) [156], which is not valid for many low SNR signals in our simulations. As we can see in Fig. 9, above 10 Hz, even without this follow-up step, the ASD biases are already minimal for 3G detectors. However, it might be worth investigating if we can further remove some part of the residual noise below 10 Hz by using a similar secondstage method (without lowering the f min = 10 Hz in our bank generation and search); for the SNR threshold or false alarms, according to the results from Sec. III C, the quadrature summation rule of the network SNR still approximately holds for the 3G network, so we can utilize multiple detectors to increase the total SNR and lower the SNR threshold in each detector, that means we can detect and subtract more signals. Also, we can use the coincidence test of the arrival time and the consistent test of the template's parameters between different detectors, to reduce the number of false alarms [55].

V. CONCLUSIONS
In this paper, we simulated the time-domain strain data of the 3G ground-based GW detectors CE and ET based on the latest GWTC-3 population results (see Fig. 1 and 2). Due to the improved low-frequency sensitivity and much higher detection rate of the 3G ground-based GW detectors, GW signals will overlap each other, forming confusion noise. Since the matched filter is the optimal linear filter only under stationary and Gaussian noise, the addition of the correlated non-Gaussian confusion noise (see Fig. 6) might have an impact on the performance of matched filtering-based searches. We quantitatively investigated the factors that might affect the performance of matched filtering, such as the deviation caused by confusion noise to the ASD, the deviation FIG. 10. The optimal SNR loss as a function of total detector-frame masses. The left plot shows the loss when accounting for all source types, while the right plot shows the results before and after the signal subtraction, but for data including only BNS signals. In general, the larger the detector-frame total mass (less than 1000-2000 M ), the higher the SNR loss, then it drops quickly to almost zero; for 1000-2000 M systems, the majority of the signal is contained within the most biased frequency ranges. The SNR loss peaks at higher masses for ET than CE because the PSD bias is shifted to lower frequencies. By comparing the left and right plots, it can be seen that the main cause of SNR loss is the ASD deviation caused by BNS signals. The SNR loss in CE caused by confusion noise is more significant than in ET due to its higher BNS sensitivity.
caused by cross terms in the inner product, and the correlation of confusion noise among the detector network, which might break the standard quadrature summation rule of the network SNR.
We found that the most significant impact from confusion noise on matched filtering comes from biases in PSD or ASD estimation (see Fig. 3 and 4). We used the horizon distance (redshift) for different sources to measure the loss caused by the biased ASD (see Fig. 5). For ET, the confusion noise made by median (upper) local merger rate estimates of CBC sources will reduce the horizon redshift by up to 8 (21)%. For CE, the deviation of the ASD is much larger; the horizon redshift can be reduced by 15 (38) % for the median (upper) merger rate scenarios. A portion of PBH and Pop III sources from redshift higher than 20 may be missed if the impact of confusion noise is left unmitigated; these GW sources are important scientific targets for future 3G detectors [157]. In addition, subsolar compact binaries and high-redshift BNSs will also be affected. The population of sources whose total source-frame mass is higher than 100 M will still be fully detected but with reduced SNR for nearby signals.
In addition to a biased ASD, the presence of a foreground population of signals could directly contribute to the matched filter if there is an overlap between sources. As expected, we found that confusion noise has different properties than the detector's stationary and Gaussian noise (see Fig. 6), but in general its contribution to the SNR is significantly smaller than the instrumental noise. If the merger times of adjacent CBC signals are close enough, however, the contribution of cross terms between sources can no longer be ignored; the effect on parameter estimation has been studied in numerous works [72][73][74][75][76][77].
We also investigated the SNR loss caused by the noises' correlation in 3G detector networks. We adopted the method from [137] and used the effective number of detectors as a function of the overlap reduction function and ASD ratio to quantify this loss. Combined with the results of our biased ASD, we found that the quadrature summation rule of the network SNR is still approximately valid for ET, but might be modified for high detectorframe mass signals for a network of two CE detectors (see Fig. 7).
In order to reduce the influence of confusion noise on the ASD, we tested a straightforward single-detector signal subtraction (see Fig. 9) that can be implemented at a minimal computational cost relative to a full search. We examined our method with different SNR thresholds for BNS datasets with different local merger rates. BNS signals are the main contributor to the confusion noise, especially for upper local merger rate cases, and it is straightforward to extend our method to NSBH cases. For CE, when the SNR threshold is 7, we obtained nearly optimal subtraction results, almost back to the instrumental noise level. Since the vast majority of signals in E1 are lower than our minimum threshold, the current signal subtraction results of ET are not ideal; the null stream method of ET is needed as a supplement to our method [57,71]. For CE, our method can limit the SNR loss to 0.2% (median BNS rate) and 2% (upper BNS rate) in general (see Fig. 10). Our demonstrated signal subtraction procedure can be used as a first-stage foreground cleaning, allowing for more sophisticated follow-up stages. Our results show that this straightforward single-detector implementation is sufficient to enable the archival detection of typical binary mergers. For the early warning of mergers with 3G detectors [158], we might expect more significant biases for high-redshift mergers, however, expect the detection of nearby, optically bright, sources would not be significantly impacted.
Our current signal extrapolation and subtraction method has several constraints: (1) if the threshold of SNR is too low, there will be a large number of false alarms. Since there must be some signals below the SNR threshold, this will limit our method's capability, and (2) the f min used in the bank generation and signal search will affect the extrapolation accuracy of the rescaled template at frequencies lower than f min , thereby the residual noise lower than f min after the subtraction is larger. Because the template bank is generated above f min , the max mismatch of the bank (3%) is only valid above this frequency, lower frequencies need a denser template bank at an increased computational cost. One may further lower the SNR threshold by using observations in a detector network; multiple detectors increase the total net-work SNR and allow for coincidence tests to reduce the contamination from false positives. It may also be worth investigating how to combine the residual noise projection method of [140,141] to further reduce the subtraction residual.