GWTC-2: Compact Binary Coalescences Observed by LIGO and Virgo During the First Half of the Third Observing Run

Disclaimer/Complaints regulations If you believe that digital publication of certain material infringes any of your rights or (privacy) interests, please let the Library know, stating your reasons. In case of a legitimate complaint, the Library will make the material inaccessible and/or remove it from the website. Please Ask the Library: https://uba.uva.nl/en/contact, or a letter to: Library of the University of Amsterdam, Secretariat, Singel 425, 1012 WP Amsterdam, The Netherlands. You will be contacted as soon as possible.


I. INTRODUCTION
Since the discovery of gravitational waves from a binary black hole (BBH) coalescence in 2015 [1], the Advanced LIGO [2] and Advanced Virgo [3] gravitational-wave detectors have opened a new window on our Universe [4][5][6][7][8]. Binary black hole observations allow us to probe gravity in the strong-field regime [9,10] and to establish the rate and population properties of BBH coalescences [11]. In addition to BBHs, Advanced LIGO and Advanced Virgo detected the first gravitational-wave signal from a binary neutron star (BNS) coalescence, GW170817 [12], which is also the first joint detection of gravitational waves and electromagnetic emission [13,14]. Gravitational-wave discoveries have a profound impact on physics, astronomy, and astrophysics [13,[15][16][17][18][19], and the public release of LIGO and Virgo data [20,21] enables groups other than the LIGO Scientific Collaboration and Virgo Collaboration (LVC) to perform analyses searching for gravitational-wave signals [22][23][24][25][26][27][28] and to report additional candidate events in some cases.
We present the results of searches for compact binaries in Advanced LIGO and Advanced Virgo data taken between 1 April 2019 15∶00 UTC and 1 October 2019 15∶00 UTC. This period, referred to as O3a, is the first six months of Advanced LIGO and Advanced Virgo's 11-month-long third observing run. The first gravitational-wave transient catalog (GWTC-1) of compact binary coalescences (CBCs) includes candidate events observed by Advanced LIGO and Advanced Virgo during the first (O1) and second (O2) observing runs [8]. The increased sensitivity of Advanced LIGO and Advanced Virgo during O3a enables us to increase the number of confident gravitational-wave detections more than threefold over GWTC-1. Together, GWTC-1 and the new candidate events presented here comprise GWTC-2. Figure 1 shows this consistent increase in both the effective binary neutron star volume time (BNS VT) of the gravitational-wave network and the number of detections across these observing runs. The BNS VT is a Euclidean sensitive volume of the detector network [29,30] multiplied by the live time of the network and should be approximately proportional to the total number of detections. Our analysis of the O3a dataset results in 39 gravitational-wave candidate events passing our false alarm rate (FAR) threshold of 2.0 per year. Given our use of multiple search pipelines to identify candidate events, we expect approximately three false alarms, i.e., candidate events caused by instrumental noise, to be present in this catalog. It is not possible to determine with certainty which specific candidate events are due to noise; instead, we provide statistical measures of false alarm rate and probability of astrophysical origin. Among these candidate events, 26 have been reported previously in real-time processing via GCN notices and circulars [31]. Furthermore, four gravitational-wave candidate events from O3a are already published separately due to their interesting properties: GW190425 [32] is the second gravitational-wave event consistent with a BNS coalescence; GW190412 [32] is the first BBH observation with definitively asymmetric component masses, which also produced detectable gravitational radiation beyond the leading quadrupolar order; GW190814 [33] is an even more asymmetric system having an approximately 23 M ⊙ object merging with an approximately 2.6 M ⊙ object, making the latter either the lightest black hole or heaviest neutron star known to be in a double compact object system; GW190521 [34,35] is a BBH with total mass of approximately 150 M ⊙ having a primary mass above 65 M ⊙ at 99% credibility.
Here, we present 13 candidate events for the first time along with the 26 previously reported candidates. Among the 39 candidates, we find gravitational-wave emission consistent with the coalescence of BBHs, BNSs, and neutron star-black hole binaries (NSBHs).
We report on the status of the Advanced LIGO and Advanced Virgo gravitational-wave detectors (Sec. II) and the properties and quality of the data taken during the analyzed period (Sec. III). We then describe the analysis methods that lead to the identification of the 39 gravitational-wave candidates (Sec. IV), as well as the inference of their parameters (Sec. V). Next, we report the significance of the identified candidates, as well as a comparison to the public gravitational-wave alerts (Sec. VI). Finally, we discuss the properties (Sec. VII) and the reconstructed waveforms (Sec. VIII) of each event. Further interpretation of the binary population is conducted in companion papers [36,37]. We will analyze the second half of Advanced LIGO and Advanced Virgo's third observing run (O3b) in future publications.

II. INSTRUMENTS
The Advanced LIGO [2] and Advanced Virgo [3] detectors are kilometer-scale laser interferometers [38]. The current generation of detectors started operations in 2015 and since then have been alternating periods of observation with periods of tuning and improvement. Since O1 [39] and O2 [8], the sensitivity and robustness of the detectors improved significantly.
The LIGO detectors underwent several upgrades between the end of O2 and the start of O3a [40]. The main laser sources were replaced to allow for higher operating powers. The LIGO Hanford detector operate with 37 W of input power and the Livingston detector with 40 W. Those levels can be compared to 30 and 25 W during O2 for Hanford and Livingston, respectively. The laser sources replacement also reduces fluctuations in the input beam pointing and size that were previously detrimental for the detector sensitivity [41]. At both LIGO detectors, the FIG. 1. The number of compact binary coalescence detections versus the effective VT to which the gravitational-wave network is sensitive to BNS coalescences. The effective VT is defined as the Euclidean sensitive volume [30] of the second-most-sensitive detector in the network at a given time, multiplied by the live time of that network configuration. The Euclidean sensitive volume of the network is the volume of a sphere with a radius given by the BNS inspiral range [29,30] (shown in Fig. 3) of the second-mostsensitive detector in the network. To account for the addition of single-detector candidates in O3a, a single-detector Euclidean sensitive volume is also included, defined using the inspiral range of the most sensitive detector divided by 1.5. The effective BNS VT does not account for differences in sensitivity across the entire population of signals detected, necessary cosmological corrections, or changes to analysis pipeline efficiency between observing runs but, as shown in this figure, is consistent with the currently observed rate of detections. The colored bands indicate the three runs: O1, O2, and O3a. The black line is the cumulative number of confident detections of all compact binary coalescences (including black holes and neutron stars) for GWTC-1 [8] and this catalog. The blue line, dark blue band, and light blue band are the median, 50% confidence interval, and 90% confidence interval, respectively, of draws from a Poisson fit to the number of detections at the end of O3a. The increase in detection rate is dominated by improvements to the sensitivity of the LIGO and Virgo detectors with changes in analysis methods between observing runs being subdominant. two end test masses were replaced with mirrors with lower scattering losses [42], allowing for higher circulating power. Additionally, annular test masses were installed to reduce noise induced by residual gas damping [43]. In the Hanford interferometer, one of the two input test masses was also replaced, because the one which was previously installed had a large point absorber [44] that limited the amount of power that could be handled in the arm cavities.
The buildup of electric charge on the test masses was also an issue during previous runs; therefore, several actions were undertaken to mitigate the contribution of this noise source: Electric field meters were installed in end stations to monitor the local electric field; baffles were installed in front of the vacuum system ion pumps to mitigate charging, and a test mass discharging system was put in operation and successfully deployed on all LIGO Hanford test masses.
Parametric instabilities [45], i.e., radiation-pressureinduced excitation of the test masses' mechanical modes, also limit the maximum power allowed into the interferometer. This problem is mitigated with the installation of acoustic mode dampers [46] that reduce the mechanical quality factor of the test mass resonant modes and, thus, suppress parametric instabilities.
The high-frequency (≳1 kHz) sensitivity of both detectors is significantly improved compared to the O2 observing run (a factor of 1.68 for the Hanford detector and 1.96 for the Livingston detector), partially due to the increased circulating power made possible by the improvements already discussed and by the installation of squeezed light sources [47,48] to reduce the quantum vacuum noise entering the interferometers [49], making O3 the first observing run of the LIGO detectors that routinely implements quantum noise-reduction techniques. GEO600 has been using the same approach since 2011 [50,51] but has not detected gravitational waves so far due to an overall too-low sensitivity.
Additionally, many beam dumps and baffles were installed at both LIGO sites, to mitigate the effect of scattered light [52] that can be the source of nonstationary disturbances. Finally, the feedback control systems for the seismic isolation and for the angular and longitudinal control of the instruments were improved, increasing the detectors' duty cycle and robustness against external disturbances. With respect to O2, the LIGO Hanford median BNS inspiral range, as defined in Ref. [29], increased by a factor 1.64 (from 66 to 108 Mpc) and the LIGO Livingston median range by 1.53 (from 88 to 135 Mpc).
Also, in Virgo between the O2 and the O3a observing runs, many upgrades have been implemented to boost the sensitivity. The most important upgrade is the replacement of the steel wires suspending the four test masses with fused-silica fibers [53] to improve the sensitivity below 100 Hz. This upgrade was achieved by changing the design of the final stage of the mirror suspension to improve the screening of the fused-silica fibers [54] from residual particles injected by the vacuum system. In parallel, the vacuum system was modified to avoid dust pollution of the environment.
Another upgrade to improve the low-frequency sensitivity is the suspension of the external injection bench (used to manipulate and steer the input laser beam into the interferometer). In this way, the seismic motion of the optics is reduced and, consequently, the beam jitter noise contribution [55,56].
The major upgrade to boost the Virgo high-frequency sensitivity is the installation of a more powerful laser that can deliver more than 65 W output power. After some commissioning activities at different power values, the laser power injected into the interferometer is set to 19 W, improving both the sensitivity and the stability of the interferometer at the same time. The laser power is almost doubled compared to the 10 W injected during O2 [57]. In this configuration, due to the marginally stable power-recycling cavity, the aberration induced by thermal effects prevents a reliable and robust interferometer longitudinal control and worsens the alignment performances. Therefore, the thermal compensation system actuators were used to stabilize the powerrecycling cavity. The end test masses' radii of curvature were tuned with the ring heaters, maximizing the power circulating in the arm cavities [59,60].
The squeezing technique adopted by LIGO to improve the high-frequency sensitivity of the detectors is also implemented in the Virgo interferometer [61]. This system was fully operational also in Virgo for the first time during O3. To reduce the optical losses optimizing the squeezing performance, the photodiodes installed at the interferometer dark port to measure the gravitational-wave signals were replaced by high quantum efficiency ones [62].
Moreover, complementary activities have been carried out in parallel to the main upgrades, reducing the overall contribution of the various technical noises. In particular, the improvement of the control strategy for the suspended benches allows the reduction of the noise contribution below 30 Hz, and the installation of baffles and diaphragms on the optical benches and inside the vacuum tanks reduces the impact of the scattered light on the sensitivity. Finally, sources of environmental noises have been identified and removed. All these upgrades increase the Virgo median BNS range by 1.73 (from 26 to 45 Mpc) with respect to the O2 run. Figure 2 shows representative sensitivities of the three detectors during O3a, as measured by the amplitude spectral density of the calibrated strain output. Figure 3 shows the evolution of the detectors' sensitivity over time, as measured by the binary neutron star range. The up time of the detectors is kept as high as possible but is nevertheless limited by many factors, such as earthquakes, instrumental failures, and planned maintenance periods. The duty cycle for the three detectors is 76% (139.5 days) for Virgo, 71% (130.3 days) for LIGO Hanford, and 76% (138.5 days) for LIGO Livingston. With these duty cycles, the full three-detector network is in observing mode for 44.5% of the time (81.4 days). Moreover, for 96.9% of the time (177.3 days), at least one detector is observing, and for 81.9% (149.9 days) at least two detectors are observing. For comparison, during the O2 run, the duty cycles are 62% for LIGO Hanford and 61% for LIGO Livingston, so that two detectors are in observing mode 46.4% of the time and at least one detector is in observing mode 75.6% of the time. The goal is, of course, to obtain the highest possible duty cycle. The current performance is limited by planned maintenance periods and the time needed to recover the control of the interferometers after large transients triggered, for example, by earthquakes. Work is ongoing at all detectors to improve the duty cycle.

III. DATA
Before analyzing LIGO and Virgo time-domain data for gravitational waves, we apply multiple data-conditioning steps to accurately calibrate the data into strain and mitigate periods of poor data quality [63]. Segments of data where each interferometer is operating in a nominal state, free from external intervention, are recorded [64]. Data from outside these time periods are not used in analyses unless additional investigations are completed to understand the state of the interferometer [6,33]. The data-conditioning process involves calibration of the data, both in near-real time and in higher latency; subtraction of noise from known instrumental sources; and identification of short-duration noise transients, which we refer to as glitches [65], that should be excluded from any searches for astrophysical candidates either by not considering the data containing the glitches or mitigating the glitches with methods such as gating.
For time periods containing gravitational-wave candidate events, additional investigation of the data quality is completed as a part of event validation to evaluate if instrumental artifacts could impact the detection and analysis of the candidate events [66]. These investigations sometimes lead to additional data-processing steps such as modeling and subtraction of glitches beyond what is completed to mitigate glitches before searching the data.
This section further outlines the procedures used to calibrate the data collected by LIGO and Virgo, characterize the data quality, validate any identified gravitationalwave candidates, and subtract glitches that may impact the analysis of candidates.

A. Calibration and noise subtraction
The optical power variations at the gravitational-wave readout ports of the LIGO and Virgo detectors are calibrated into a time series of dimensionless strain measured by the detectors before use by astrophysical analyses [67,68]. The calibration process requires data-conditioning filters whose response is complex valued, frequency dependent, and informed by detailed modeling of the feedback control system along with the interferometric, optomechanical response of the detectors [69]. Some control system model parameters vary slowly with time throughout operation of the interferometer. These parameters must be monitored, and, when possible, the filters are corrected in near-real time (low latency) [70]. Other parameters may change at discrete times and cause systematic error in the data stream that cannot be accounted for in low latency. Examples of such an error can arise from poorly compensated changes in electronics configurations, accidental application of incorrect control parameter values, model errors not yet known at the start of the observing period, and hardware problems such as failures of analog electronics within the control system. Most of these sources of error are subtle and can be assessed only once they are measured and quantified a posteriori.
All three detectors use photon calibrator (Pcal) systems [71][72][73] for absolute reference. These reference systems are used to develop each static detector model, measure parametric time dependence, and establish residual levels of systematic error in each strain data stream once constructed. Each of these measurements required to evaluate the systematic error is done by using the Pcal systems to drive forces on the end test masses via radiation pressure, creating displacement above other detector noise. Validating the strain data stream in low latency for all time, by establishing carefully quantified estimates of the systematic error with these excitations, competes with the desire for unhampered astrophysical sensitivity. As a compromise, systematic error is measured continuously only at a select few frequencies at the edges of the sensitive frequency band of the detector with monochromatic excitations during observation. The data stream is validated only at high-frequency resolution, and across the entire detection band, at roughly weekly cadence: The detectors are fully functional but are declared out of observation mode, and swept-sinusoid and colored-random-noise Pcal excitations are driven above the noise. These measurements can provide only approximate, point-estimate bounds on the data stream's error in low latency; they cannot reflect the error distribution for all time.
Once an observing period with a stable detector configuration is completed, estimates of the probability distribution of systematic error for all observation time are created [74]. These estimates leverage the power of hindsight, the collection of measurements mentioned above, and other measurements of individual components gathered while the detector is offline. During this error characterization process, if any identified systematic error is egregious and well quantified, where possible, the control system model and data-conditioning filters are modified to remove the error. The data stream is then regenerated offline from the optical power variations and control signals, and the systematic error estimate is updated accordingly [67].
Results in this paper are derived from either low-latency (C00) or offline, recalibrated (C01) strain data, depending on whether offline data were available and whether the results are sensitive to calibration error. Detection algorithms for gravitational-wave candidates, described in Sec. IV, are insensitive to typical levels of systematic error in calibration [75], so low-latency data may be used for additional offline analyses without concern. However, if available, offline data are preferred for improved accuracy and completeness. At the time the data were searched for candidate events, offline data were available only for a portion of O3a. The candidate events presented in this paper detected prior to 5 June 2019 are identified using LIGO offline data, whereas those from 5 June 2019 until 1 October 2019 are identified using LIGO low-latency data.
Once candidate events are found by detection algorithms using either LIGO data stream, all estimations of the candidates' astrophysical parameters use the C01 LIGO version of strain data using methods described in Sec. V. The C01 LIGO version of strain data was available for the entirety of O3a at the time these additional analyses were completed. As such analyses are more sensitive to calibration error [76], it is advantageous to use the definitive characterization of error at the time of each event available with LIGO C01 data. The probability distributions of error for LIGO C01 strain data in O3a are characterized in Ref. [74]. Analysis of Virgo's collection of validation measurements during the run does not lead to a significant improvement to the low-latency strain data stream offline. As such, only lowlatency strain and its bounds on systematic error from pointestimate measurements are used for all detection and astrophysical parameter estimation results presented in this paper. The bounds of systematic error of Virgo strain in O3a are reported in Ref. [77].
Numerous noise sources that limit detector sensitivity are measured and linearly subtracted from the data using witness auxiliary sensors that measure the source of the noise [41,78,79]. In O3a, the sinusoidal excitations used for calibration and noise from the harmonics of the power mains are subtracted from LIGO data as a part of the calibration procedure [79]. This subtraction is completed for both online and offline versions of the calibration. For time periods around a subset of identified candidate events, additional noise contributions due to nonstationary couplings of the power mains are subtracted [80].
The Virgo online strain data production also performs broadband noise subtraction during O3a [77]. The subtracted noises include frequency noise of the input laser, noise introduced controlling the displacement of the beam splitter, and amplitude noise of the 56 MHz modulation frequency. An additional offline strain dataset is produced for 14 September 2019 through 1 October 2019 using the same calibration as the online data but with improved noise subtraction, resulting in a BNS range increase of up to 3 Mpc [81], and is used in source parameter estimation of candidate events that occur during this time period.

B. Data quality
During O3a, the data quality is closely monitored using summarized information from the detectors and their subsystems [82,83]. Deeper studies are conducted to identify the causes of data quality issues, which enable instrumental mitigation of the sources during the run. For example, at Livingston, glitches from a mechanical camera shutter and beats of varying radio-frequency signals are identified and eliminated. At Hanford and Livingston, strong frequency peaks that wander in time are tracked down to the amplitude stabilizer for the laser used to provide squeezed light. At Hanford, broad features in the spectrum at 48 Hz and multiples are tracked to scattered light from vacuum chamber doors and mitigated with absorptive black glass. These studies are a part of ongoing efforts to improve the data quality and the up time of the detectors [40,82].
For analyses of gravitational-wave transients, the most common data quality issue is the presence of glitches. The rate of glitches with signal-to-noise ratio (SNR) > 6.5 in the LIGO and Virgo detectors in O3a is shown in Fig. 4. In all three detectors, the glitch rate is dominated by glitches with peak frequencies below 100 Hz. This rate is higher than in previous observing runs [66] for both LIGO detectors and is especially problematic at LIGO Livingston, where the rate of glitches is 0.8 per minute in O3a, compared to 0.2 per minute in O2. The Virgo glitch rate decreases significantly between O2 and O3a, thanks to the work done during the O2-O3a shutdown to improve the accuracy of Virgo's operating point control and to identify, fix, or mitigate several sources of noise. The increased rate of glitches in the LIGO detectors limits the overall sensitivity of searches for gravitational waves in O3a and creates challenges for analysis of candidate events.
The most problematic source of glitches in O3a is caused by laser light scattered out of the main laser beam, which is reflected off walls of the vacuum systems and other equipment back into the main beam [52,[86][87][88].
Scattered light noise is correlated with periods of high seismic activity. For this reason, daily cycles of scattered light glitches are present throughout O3a, especially at LIGO Livingston, tied to ground motion driven by human activity. This noise is often visible as arch-shaped features in time frequency [86,89], as shown in Fig. 5, and is present at or near the time of many of the candidate events in this catalog. A significant portion of the increase in glitch rate between O2 and O3a at the LIGO detectors can be accounted for by the increased rate of scattered light glitches [88]. A potential major source of this noise for O3a is light scattered from the gold-coated electrostatic drives mounted to the fused silica reaction masses that are suspended directly behind the LIGO test masses to provide a stable platform for low-noise actuation [88]. Changes are implemented during O3b that reduce scattered light noise entering through this path [88].
Broadband short-duration glitches also occur often in all detectors during O3a. A subclass of those, blip FIG. 4. The rate of single interferometer glitches with SNR >6.5 and frequency between 10 and 2048 Hz identified by Omicron [84,85] in each detector during O3a. Each point represents the average rate over a 2048 s interval. Dotted black lines show the median glitch rate for each detector in O2 and O3a.
glitches [90], is one of the most problematic sources of transient noise in previous observing runs and is still present in O3a at a rate of 1.4 per hour. These glitches are one of the limiting sources of noise for searches for gravitational waves from high-mass compact binaries [91], and no sources or witnesses for the majority of these glitches have been identified. In O3a, there was also an additional population of short-duration glitches with SNR > 100. These loud glitches occur in both LIGO detectors, with unknown origin. Additional description of these glitches, along with details of potential sources that have been ruled out, can be found in Ref. [82].
Many glitches in LIGO and Virgo data have wellunderstood sources and couplings, making it possible to identify short time periods where excess power from environmental or technical sources is present in the strain data. Flagging these time periods as containing poor data quality, either by removing the data from the search or decreasing the significance of any candidate events identified, is shown to improve the overall sensitivity of searches for gravitational waves from compact binaries [82,92,93]. While a select number of LIGO and Virgo data quality issues are flagged in low latency, such as hardware injections and digital signal overflows, the majority of data quality flags are available only for offline searches.
Before performing searches for gravitational-wave signals, periods of poor data quality are flagged at various levels, called categories [63,66,82]. In O3a, data from an observatory not operating in a nominal state are flagged (category 1) and not used in any search. Additional periods likely to contain excess power in each LIGO detector are flagged based on detailed follow-up of identified sources of noise (category 2), statistical correlation between witness sensors (category 3), and machine-learning-based predictions (iDQ) [93,94]. No additional data quality products for Virgo are used beyond category 1. The specific set of data quality products used in O3a is search specific, as described in Sec. IV. Category 2 flags are tuned separately for searches for both gravitational waves from CBC and minimally modeled (burst) sources. Category 3 flags are tuned only for burst searches. The amount of time removed by each category of veto in O3a is shown in Table I.

C. Event validation
Event validation procedures similar to those used for previous gravitational-wave candidate events [8,66,82] are used for all candidate events in this catalog to evaluate if instrumental artifacts could impact analysis. Within tens of minutes of low-latency candidate event identification, timefrequency visualizations and monitors of the gravitationalwave strain data [95][96][97][98][99][100] are used to identify any data quality issues present. On the same timescale, data from hundreds of auxiliary sensors monitoring the detectors and their environments are used to identify potential auxiliary witnesses to instrumental artifacts [94,101,102]. Tools that rely upon deeper information about glitches and data nonstationarity gathered offline, such as long-term monitors of the instruments and their subsystems [83,103,104] and identification of likely sources of glitches by correlation with auxiliary sensors [85,105,106], are also used to vet candidate events in this catalog. These procedures do not identify evidence of instrumental origin for any of the candidate events in this catalog but do identify a number of FIG. 5. Top: time-frequency representation of the data surrounding event GW190701_203306 at LIGO Livingston, showing the presence of scattered light glitches (modulated arches). Bottom: the same data after glitch identification and subtraction as described in Sec. III D. In both plots, the time-frequency track of the matched-filter template used to identify GW190701_ 203306 is overlaid in orange. Investigations identify seven candidate events in coincidence with similar scattering glitches and require mitigation before further analysis. Despite the clear overlap of the signal with the glitch, the excess power from the glitch is successfully modeled and subtracted. data quality issues that could potentially impact analyses of these candidate events. Candidate events with data quality issues identified by these event validation procedures require further mitigation before analysis. In cases when glitches occur in time coincidence with a candidate event (but cannot account for the candidate event itself), additional data-processing steps are completed to mitigate the effect of those glitches on estimation of the candidate event parameters. If possible, the identified glitches are subtracted using the methods described in Sec. III D. In cases when sufficient subtraction is not possible, customized configurations of parameter estimation analyses are used to exclude the time period or frequency bandwidth impacted by glitches. An example of a glitch coincident with a signal that requires glitch subtraction is shown in Fig. 5. While the presence of excess power from transient noise does not prevent confident identification of this event, glitch subtraction is required before the source properties of the event can be evaluated. Although only data recorded from detectors in observing mode are used to identify candidate events in this catalog, some candidate events occur at times when one detector in the network is operating but not in observing mode. If it is concluded that data from the additional detector would substantially impact the scientific conclusions reached from analyzing the candidate, the additional data are investigated. For those cases, the data quality and calibration for the nonobserving detector are evaluated to determine whether the data could be used in the estimation of candidate event source properties. Such data are used for one candidate event, GW190814 [33]. The full list of candidate events requiring specific mitigation steps, due to either the presence of glitches or the state of a detector, is found in Sec. VII. The total number of candidate events requiring mitigation is consistent with the number expected based on the glitch rate in each detector.

D. Glitch subtraction
Data containing gravitational-wave candidate events and glitches in the same time-frequency volume are preprocessed through a glitch-subtraction procedure prior to being analyzed by the parameter estimation pipelines. The glitchsubtraction procedure evolved from the BayesWave (BW) algorithm [107][108][109] used for glitch subtraction in the Livingston detector at the time of the GW170817 binary neutron star merger [12,110], where the non-Gaussian, incoherent, noise is modeled as a linear combination of wavelets which is subtracted from the data. The number of wavelets used in the fit is determined using a transdimensional Markov chain Monte Carlo (MCMC) algorithm that balances using fewer wavelets against the quality of the fit [107].
To prevent the glitch-subtraction procedure from corrupting the signal candidate event, the time segment and bandwidth of the wavelet-based analysis are chosen, when possible, to exclude from subtraction the strongest part of the signal. For cases where the signal and glitch overlap in time-frequency space, a more robust application of the glitch-subtraction algorithm is used which simultaneously fits for the signal and the glitch. In the signalplus-glitch application, signal wavelets are included in the model if they are coherent over the detector network (marginalizing over sky location, etc.), while the glitch wavelets are independent in each detector [108]. Only the glitch model wavelets are then used in the subtraction. The signal-plus-glitch procedure is tested by injecting simulated coherent BBH signals onto known singledetector glitches from O2 and verifying that the signals are unaffected in the process.
The glitch-subtraction procedure is used only as a preprocessing step for the parameter estimation analysis (described in Sec. V) and is not part of the analyses that determine the presence, or significance, of gravitationalwave candidate events. As shown in Fig. 5, the glitchsubtraction methods described here are able to successfully remove excess power caused by glitches present near the time of candidate events.

IV. CANDIDATE IDENTIFICATION
Candidate identification happens on two timescales. First, five low-latency gravitational-wave pipelines [111][112][113][114][115] process the data immediately after acquisition with the goal of generating public detection alerts to the broader astronomical community within minutes [116]. Second, an offline reanalysis of gravitational-wave data is conducted to produce the curated candidate event list here. The offline analysis may benefit from updated data calibration, data quality vetoes, the ability to estimate event significance from the full data, and further algorithmic development that takes place over the course of an observing run. Although the candidate events presented here are derived from offline analysis, we provide a comparison with the public alerts in Sec. VI. Candidates are identified using two methods. The first method searches for minimally modeled sources. The second method searches for signals from a bank of template waveforms [117] modeled after the expected gravitational-wave emission from coalescing compact binaries in general relativity. In Sec. VI, we present results from one search for minimally modeled transient sources, coherent WaveBurst (cWB) [112,[118][119][120][121], and two searches for modeled sources, GstLAL [111,122,123] and PyCBC [29,[124][125][126][127]. cWB, GstLAL, and PyCBC are also three of the five low-latency pipelines in O3. The two remaining low-latency pipelines, MBTAOnline [114,128] and SPIIR [115], were not configured for offline reanalysis at the time of this publication and are, therefore, not included in GWTC-2. Below, we summarize the methods used by each of cWB, GstLAL, and PyCBC to identify candidate events.
A. Coherent WaveBurst search for minimally modeled transient sources cWB is a search pipeline for detection and reconstruction of transient GW signals that operates without a specific waveform model [129] and was used in previous searches by the LVC [1,8,130]. cWB identifies coincident signal power in multiple detectors, searching for transient signals with durations up to a few seconds in the detector bandwidth. The analysis is performed on the timefrequency data obtained with the Wilson-Daubechies-Meyer wavelet transform [118,131] and normalized by the amplitude spectral density of the detector noise. cWB selects the time-frequency data samples above fluctuations of the detector noise and groups them into clusters. For clusters correlated in multiple detectors, cWB reconstructs the source sky location and signal waveforms with the constrained maximum likelihood method [112]. The signal SNR is estimated from the signal waveforms reconstructed by cWB, and the network SNR is calculated combining the SNRs of individual detectors.
The cWB detection statistic is based on the coherent energy E c obtained by cross-correlating the normalized signal waveforms reconstructed in different detectors. It is normalized by a chi-squared statistic where E n is the residual noise energy estimated after the reconstructed waveforms are subtracted from the data and N df is the number of independent wavelet amplitudes describing the event. The cWB detection statistic is η c ∝ ½E c =maxðχ 2 ; 1Þ 1=2 , where the χ 2 correction is applied to reduce the contribution of non-Gaussian noise. To improve the robustness of the algorithm against glitches, cWB uses signal-independent vetoes, which reduce the FAR of the pipeline; this includes category 2 burst data quality flags in the processing step and hierarchical vetoes in the postproduction phase [105,132]. Other vetoes applied to candidate events are on the network correlation coefficient c c ¼ E c =ðE c þ E n Þ and the χ 2 . To further reduce the background, the cWB analysis employs additional signal-dependent vetoes based on the properties of the time-frequency evolution of compact binary signals: (a) The frequency of the signal is increasing in time [133], and (b) the central frequency of the signal f c is inversely proportional to the total mass of the system [134]. cWB searches are performed with two pipeline configurations targeting detection of high-mass (f c < 80 Hz) and lowmass (f c > 80 Hz) BBH systems. They use different signal-dependent vetoes defined a priori to alleviate the large variability of nonstationary noise in the detectors' bandwidth.
We estimate the significance of candidate events by systematically time shifting the data of one detector with respect to the other in each detector pair, with a time lag so large that actual astrophysical events are excluded, and repeating this for a large number of different time lags over a total time T bkg . We count the number of events N due to instrumental noise that have a ranking statistic value, such as the SNR that is at least as large as that of the candidate event, and we compute the FAR as the number of background events divided by the total background time [135]. The detection significance of a candidate event identified by either configuration in a single-frequency range is determined by its FAR measured by the corresponding cWB configuration. Whenever the low-mass and high-mass configurations overlap, the trials factor of 2 is included to determine the final FAR [34]. In the end, each configuration reports the selected candidate events and their FAR.
The sensitivity of the cWB search pipeline approaches that of matched-filter methods for coalescing stellar mass BBHs with high chirp masses, where most of the signal energy is concentrated in just a few wavelets of the cWB representation [136]. It is less competitive for low-chirpmass events, where the signal power is spread over large time-frequency areas. cWB can also detect sources that are not well represented in current template banks (e.g., eccentric systems or high-mass ratio precessing systems) [137]. Tests with cWB show that the detection efficiency for a given FAR threshold is slightly smaller with the inclusion of Virgo. Therefore, also to reduce computing time, all cWB detection candidates and waveform consistency tests reported in this catalog use the Hanford-Livingston network only.
The matched-filter method relies on a model of the signal, dependent on the source physical parameters. Most important for the phase evolution of the source (and, therefore, the matched filter) are the intrinsic parameters: two individual component masses m 1 and m 2 and two dimensionless spin vectors ⃗ χ f1;2g , where the dimensionless spin is related to each component's spin angular momen- We also make use of combinations of these intrinsic parameters that are typically well constrained by gravitational-wave measurements; the binary chirp mass [139] determines to lowest order the phase evolution during the inspiral and is typically better constrained than the component masses. At higher orders, the mass ratio q ¼ m 2 =m 1 (where m 2 ≤ m 1 ) and effective inspiral spin χ eff affect the binary phase evolution. The effective inspiral spin is defined as [140] where M ¼ m 1 þ m 2 is the total mass andL N is the unit vector along the Newtonian orbital angular momentum. The spin tilt angle for each component object θ LS i ¼ cos −1 ð⃗ χ i ·L N =j⃗ χ i jÞ quantifies the angle between the orbital angular momentum vector and its spin vector. Since the spin and angular momentum vectors vary if the system precesses, by convention we use the spin parameters at a reference frequency of 20 Hz, with the exception of GW190521, where we use 11 Hz for consistency with previous publications [34,35]. Additional intrinsic parameters are needed to describe eccentricity, which we assume to be zero in our modeled analyses. The timescale for circularization of isolated binaries with nonzero eccentricity at birth is sufficiently short that sources are expected to have negligible eccentricity when they enter the sensitive bands of the LIGO and Virgo detectors [141]. However, dynamically formed binaries may have residual eccentricity as the signal enters the sensitive band of the detector. These systems are the target of unmodeled searches of previous observing runs but with no candidate events reported [137]. Seven extrinsic parameters provide the orientation and position of the source in relation to Earth: the luminosity distance D L , two-dimensional sky position (right ascension α and declination δ), inclination between total angular momentum and line of sight θ JN , time of merger t c , a reference phase ϕ, and polarization angle ψ.
In general, as the signal travels from the source to the detector, its frequency is redshifted by a factor of (1 þ z). For a system involving only black holes, the observed signal is identical to that from a source in the rest frame of the detector with total mass M det ¼ ð1 þ zÞM [142,143]. For convenience, the templates used by the modeled searches are defined in the rest frame of the detectors which subsumes the factor (1 þ z) into the definition of masses.
For this work, the GstLAL analysis uses a template bank with component masses between 1 M ⊙ and 400 M ⊙ with total masses M det between 2 M ⊙ and 758 M ⊙ and spins that are aligned or antialigned with the binary's orbital angular momentum, such that only the spin components The bank is constructed in five regions via a stochastic placement algorithm [144,145] satisfying different minimal match (MM) [117] criteria with waveforms starting at f min as described in Table II. Template placement is augmented to improve the collection of background statistics in the last region shown in Table II by a grid of templates distributed uniformly in the logarithm of component mass to improve detection efficiency for systems with primary mass m det 1 above 50 M ⊙ [8,146]. The TaylorF2 waveform approximant [139,[147][148][149][150][151][152][153][154][155][156] is used for templates with M det < 1.73 M ⊙ and the SEOBNRv4_ROM waveform approximant [157] for templates with M ≥ 1.73 M ⊙ .
The PyCBC analysis uses a template bank covering the same parameter space as for GWTC-1 [8] shown in Figs. 3 and 7 of Ref. [158]. Unlike the previous work, the template bank here is created using a hybrid geometric-random method described in Refs. [159,160]. This new method provides a more efficient template bank-in terms of covering the full parameter space with fewer template waveforms-than the stochastic method [144,145]. This bank is broadly similar to the parameter space covered in the GstLAL search described above with a key difference being that only templates longer than 0.15 s are kept. Details about this cut and its effect on the explored mass and spin range can be found in Ref. [158].
Both GstLAL and PyCBC scan data from each gravitational-wave detector against the above-described banks of template waveforms to produce SNR time series [29]. The SNR time series are maximized over short time windows to produce a set of triggers for each template and each detector. Triggers that pass an SNR threshold of 4 in one detector form the basis of candidate events according to the procedures for each pipeline described below. PyCBC removes the time period during category 2 veto flags from the final results, while GstLAL uses only iDQ for singledetector triggers and no data quality products for coincident triggers.
GstLAL defines a candidate event as consisting of triggers from one or more gravitational-wave detectors ranked by the SNRs of the triggers, signal-consistency tests, time delays between each detector, phase differences between detectors, the (possibly zero) time-averaged volumetric sensitivity of each detector, and the signal population model. These parameters are used as variables in the likelihood-ratio ranking statistic L [111,122,123,161], which is a monotonic function of the inverse false alarm probability [162].
There are two differences between the ranking statistic used here and in O2 [8]. First, we implement a template likelihood pðTjsignal; SNRÞ [163], which is the probability that a trigger is recovered by a template T, given the trigger SNR and that the signal belongs to some population. Previous versions of GstLAL approximate pðTjsignal; SNRÞ by a constant in L [111,122,161], implying all templates are equally likely to recover a signal. Now, the template likelihood is informed by the template bank (to account for the fact that templates are not uniformly distributed in parameter space [164]) and a signalpopulation model, which for this search considers ⃗ θ ¼ fm det 1 ; m det 2 ; χ 1;z ; χ 2;z g and is given by The distribution pð ⃗ θjsignalÞ is deliberately broad to minimize the number of missed signals. Second, single-detector candidate events are ranked using both an empirically determined penalty and information from iDQ [94] as described previously. The penalty from iDQ is added to the denominator of L. Candidates from the GstLAL search have their likelihood ratios and significance estimated using the entire approximately 6-month dataset. PyCBC identifies candidate events by requiring triggers in both LIGO Hanford and LIGO Livingston with a time delay smaller than the light travel time between observatories. These candidate events are then ranked using a set of signal-based vetoes, data quality information, and by comparing the properties of the event against those expected from astrophysical signals [127]. A FAR is then computed for each of these candidate events by estimating the background noise distribution using timeshifted analyses similarly to cWB [29,126,165], triggers from LIGO Livingston being time shifted relative to LIGO Hanford by multiples of 0.1 s. Virgo data are not searched with the PyCBC pipeline due to three-detector searches not being completely integrated in the version of code used.
A focused search for BBH coalescences [27] is also used here, denoted later as PyCBC BBH. This search is motivated by the fact that all signals observed in O1 and O2, with the single exception of the binary neutron star merger GW170817, are consistent with BBH coalescences with mass ratio close to 1 and effective inspiral spins close to 0. The full parameter space search used for GWTC-1 [8], in contrast, is tuned to observe signals anywhere in the possible space of signal parameters, which might include signals that do not have very high matches with search templates. The PyCBC BBH search uses a recently developed detection statistic [27,166] which includes a number of tuning choices to reject triggers that do not match the filter waveforms well and also includes a template weighting implementing a prior that signals detectable in any given range of SNR are uniformly distributed in chirp mass. This search enables PyCBC to identify more BBHs in the O1 and O2 datasets than reported in the GWTC-1 paper [8,27]. This search includes some of the BBHs first reported in Refs. [23,25] by independently developed searches [24,167]. We use the focused BBH search in this work to better extract BBH coalescences from the data: This search considers only a reduced set of filter templates defined prior to the analysis of O3 data, namely, systems with mass ratio q > 1=3 and with both component masses (in detector frame) larger than 5 M ⊙ .

C. Estimation of modeled search sensitivity
In order to estimate the sensitivity of the GstLAL and PyCBC searches, we conduct a campaign of simulated signals injected into the O3a gravitational-wave data and analyzed by both matched-filter pipelines. The simulated population, intended to cover (or overcover) the detected population of stellar-mass BBHs [11,37], contains component masses m 1 and m 2 between 2 M ⊙ and 100 M ⊙ and extends out to a maximum redshift of 2.3. In order to reduce statistical uncertainties, the mass, spin, and redshift distributions should be sufficiently similar to population models for which we intend to estimate merger rates: See Ref. [37] (Appendix A) for further discussion of selection functions in population inference. For the simulations, we choose pðm 1 Þ ∝ m −2.35 1 , pðm 2 jm 1 Þ ∝ m 2 2 (for m 2 < m 1 ), and χ i;z values distributed uniformly between −0.998 and 0.998. The cosmological distribution of sources simulates a merger rate in the comoving frame that evolves as RðzÞ ¼ Rð0Þð1 þ zÞ 2 ; thus, the source redshift distribution follows pðzÞ ∝ ð1 þ zÞdV c =dz, where V c is the comoving volume (see, e.g., Ref. [168] for further discussion of cosmological effects). The simulation set is generated in two stages: First, points are chosen according to this distribution out to z ¼ 2.3; then, these 7.70 × 10 7 samples are reduced to a set of potentially detectable signals by imposing that the expected LIGO Hanford-LIGO Livingston network SNR, calculated using representative noise power spectral density (PSD), be above a threshold of 6. The approximately 157 000 signals remaining after this cut are assigned merger times ranging uniformly over the duration of O3 for analysis by the searches. The SEOBNRv4_opt aligned-spin waveform model [157] is used for the simulated signals.
The expected number of signals from such a population detected by a given analysis may be written aŝ where Rð0Þ is the rate of signals per unit volume and unit observing time at present and V is the effective surveyed hypervolume, a measure of analysis sensitivity for the injected population [169]. Since each analysis recovers over 10 4 injections, statistical counting uncertainties in the hypervolumes V are at the subpercent level. Our estimates of sensitivity are, though, affected by possible systematic uncertainties in calibration of the strain data. Strain calibration affects the detectability of simulated signals via the magnitude of the response function, which is affected by uncertainties of at most a few percent over the frequency range where the SNR of binary merger signals is accumulated; see Ref. [74] for details. The hypervolume V surveyed by each analysis, at a detection FAR threshold of 2.0 per year, is 0.456 Gpc 3 yr for GstLAL, 0.296 Gpc 3 yr for PyCBC, and 0.386 Gpc 3 yr for PyCBC BBH. For a combination of all matched-filter analyses presented in this catalog, with any injection found in one or more analysis below the FAR threshold considered as detected, we find a surveyed hypervolume 0.567 Gpc 3 yr.

D. Estimation of signal probability
For each candidate event, the probability of origin from an astrophysical source p astro and corresponding probability of terrestrial noise origin p terr ¼ 1 − p astro may be estimated using the outputs of search pipelines. We obtain these probabilities for the candidate events in this catalog consistent with a BBH, via the Poisson mixture model formalism [170] used in O1 [39,168,171]. Only BBH candidate events are considered because, other than GW190425 [172] whose component masses are consistent with those of NSs, all significant detections can be classified as BBHs [173]. A low-significance candidate NSBH event, GW190426_152155, is reported in low latency [174]; however, its astrophysical probability is strongly dependent on prior assumptions of the rate of such signals. We, therefore, do not estimate its p astro here.
We start by collecting the ranking statistics ⃗ x ¼ fx 1 ; x 2 ; …; x N g of all candidate events more significant than a predefined threshold: For the GstLAL pipeline events are thresholded on FAR, while for PyCBC a ranking statistic threshold is applied. The threshold for GstLAL is chosen to ensure that the total number of background events considered exceeds the number of signals by a large (approximately 100) factor, which enables an accurate estimate of the total rate of background events above threshold. On the other hand, PyCBC estimates the background rate from timeshifted analyses, as outlined above in Sec. IV B; thus, the requirement to include a large number of background events is relaxed. The statistic threshold is then set low enough to include (at least) all events with p astro ≳ 0.1.
Additionally, for both GstLAL and PyCBC searches, a threshold of 4.35 M ⊙ is applied on the chirp mass of the templates, corresponding to a 5 M ⊙ þ 5 M ⊙ binary, ensuring that the selected candidate events have template masses consistent with those of putative BBHs. Using the distribution of ranking statistics x under the foreground model, fðxÞ ¼ pðxjsignalÞ, and the distribution under the background model, bðxÞ ¼ pðxjnoiseÞ, estimated by each matched-filter pipeline, we assign a Bayes factor kðxÞ ¼ fðxÞ=bðxÞ to each event. Assuming that foreground and background triggers are drawn from independent Poisson processes, one can then calculate the posterior over the Poisson expected counts for each process, Λ 1 and Λ 0 .
For the PyCBC searches presented here, we proceed as for O1 and O2 [8] and estimate foreground and background event densities empirically for all putative BBH candidate events with ranking statistic above a given threshold. The PyCBC full parameter space and BBH focused searches differ in how their ranking statistics are calculated: For the full search [127], a threshold of 7.9 is applied to the ranking statistic, while for the BBH search [27,166], a threshold value of 9 is applied. We empirically measure the rate of noise events satisfying these cuts via time-shifted analyses and infer the posterior over the rate of signals; finally, we marginalize over the signal rate to obtain probabilities of astrophysical and terrestrial origin for each event [175]. Since the PyCBC BBH search is more sensitive to realistic BBH signal populations, implying a more accurate estimate of the relative densities of signal and noise events within its targeted mass region, we consider the p astro values from the BBH analysis to be more accurate for events recovered by both searches.
For the GstLAL analysis, we estimate the astrophysical and terrestrial probabilities p astro ðxj⃗ xÞ from the joint posterior on the Poisson expected counts, pðΛ 0 ; Λ 1 j⃗ xÞ, where the set of triggers ⃗ x have a chirp mass M > 4.35 M ⊙ and a FAR < 8766 yr −1 . The prior used to construct the joint counts posterior is taken to be the corresponding posterior from O1 and O2 [176].

V. ESTIMATION OF SOURCE PARAMETERS
Once triggers of interest are identified, the physical parameters of the candidate event gravitational-wave signals are inferred by computing their posterior probability density functions. The uncertainty in the source parameters is quantified by the posterior probability distribution pð ⃗ ϑj ⃗ dÞ, which is calculated using Bayes' theorem as where pð ⃗ dj ⃗ ϑÞ is the likelihood of the data given the model parameters ⃗ ϑ and πð ⃗ ϑÞ is the prior probability distribution for the parameters. The likelihood is calculated from a coherent analysis of data from each of the detectors. As in our previous analyses, e.g., Ref. [177], we assume that the noise can be treated as Gaussian, stationary, and uncorrelated between detectors [63,178] in the stretch of data used to calculate the likelihood and to measure the noise PSD. This yields a Gaussian likelihood [143,179] for the data from a single detector: where d i is the data of the ith instrument and h i M ð ⃗ ϑÞ is the waveform model calculated at ⃗ ϑ projected on the ith detector and adjusted to account for the uncertainty in offline calibration described in Sec. III A. The noiseweighted inner product hajbi [143,180] requires specifying the frequency range in which the analysis is performed as well as the noise PSD.
Upon detection of a binary merger, exploratory analyses are first conducted to identify which models and settings are most suitable for use in our production analyses. In general, we use a low-frequency cutoff of f low ¼ 20 Hz, unless data quality requirements at the time of specific candidate events are different, in which case a specific range is noted in Sec. VII. The high-frequency cutoff is always equal to the Nyquist frequency of the analysis, which is chosen on a per-event basis to be greater than the highest frequency of the l ¼ m ¼ 2 mode, since the signals (especially the higher-mass BBH signals) do not require the full bandwidth available at the native sampling rate of 16 kHz. The PSD characterizing the noise at the time of each event is measured by BW using the same data that are used for likelihood computation [181,182]. We then obtain the final joint likelihood over all detectors by multiplying together the likelihood from each detector in Eq. (5).
The gravitational-wave signal emitted by a circularized compact binary composed of two black holes depends on 15 unknown parameters, defined in Sec. IV B. The initial masses and spins of the inspiraling black holes determine the peak gravitational-wave luminosity and mass and spin of the postmerger remnant black hole, which we calculate from fits to numerical relativity (NR) [5,[183][184][185][186][187]. When one or both objects are neutron stars, matter effects modify the binary inspiral and are included via the dimensionless quadrupole tidal deformability Λ i , adding one extra parameter for each neutron star in the binary. Other matter effects, such as octupolar and higher tidal deformabilities, nonblack hole spin-induced multipole moments, and f-mode resonances, are parameterized by the Λ i using quasiuniversal relations [188]. The dominant tidal contribution to the waveform is given by the dimensionless tidal deformability parameter [189,190] Λ ¼ 16 13 Nonspinning black holes have Λ i ¼ 0 [191,192], and the waveform models we use adopt the convention that this result is true for all black holes [193,194]. We do not calculate final masses and spins when matter effects are included in the analyses, as this calculation requires an accurate prediction of the ejected mass following possible tidal disruption, taking into account the equation of state and uncertain details of the dynamics of the merger.

A. Waveform models
We characterize the detected binaries using multiple waveform models, each of which uses a different set of modeling techniques and includes different physical effects. For every event with inferred component masses above 3 M ⊙ in preliminary analyses, we perform production parameter estimation runs using a subset of BBH waveforms. IMRPhenomPv2 [195][196][197] is a phenomenological model for gravitational waves from precessing BBH systems, calibrated to NR and using an effective single-spin description to model effects from spin precession [198]. SEOBNRv4P [157,199] is based on the effective-one-body (EOB) formalism [200,201] and calibrated to NR, with a generic two-spin treatment of the precession dynamics. These models rely on twisting-up procedures, where aligned-spin, NR-calibrated waveform models defined in the coprecessing frame are mapped (through a suitable frame rotation) to approximate the multipoles of a precessing system in the inertial frame [202][203][204][205][206]. These models do not include contributions to the strain from spherical harmonic modes beyond l ¼ 2, so we also analyze each event with at least one of the following models that incorporate higherorder multipole (HM) moments and precession effects: IMRPhenomPv3HM [207,208], SEOBNRv4PHM [199,209], and NRSur7dq4 [210]. IMRPhenomPv3HM (based on IMRPhenomHM [211]) and SEOBNRv4PHM (based on SEOBNRv4HM [212]) both rely on the twisting-up approach described above. NRSur7dq4 is a surrogate waveform model for BBH systems that directly interpolates a large set of precessing NR simulations. Unlike the other two HM models, NRSur7dq4 waveforms are restricted by the length of the NR simulations in the training set, covering only approximately 20 orbits before merger.
Any sources with evidence for at least one binary component below 3 M ⊙ are characterized using several waveforms capable of modeling matter effects. For the BNS system GW190425 [32], we use the following: IMRPhenomD_NRTidal and IMRPhenomPv2_NRTidal [213,214], which are based on the BBH models IMRPhenomD and IMRPhenomPv2, respectively, and incorporate NR and tidal EOB-tuned contributions from tidal interaction as well as equation-of-state-dependent self-spin effects; TaylorF2 [139,[147][148][149][150][151][152][153][154][155][156], which describes waveforms from the inspiral of nonprecessing compact binaries, with matter effects derived in the post-Newtonian formalism, including quadrupole-monopole coupling parameterized in terms of the tidal deformabilities [188,215,216]; TEOBResumS [217], an aligned-spin EOB model that incorporates post-Newtonian and self-force contributions to the tidal potential; and, finally, a frequency-domain surrogate model of aligned-spin SEOBNRv4T waveforms [157,[218][219][220], which are derived in the EOB approach and include dynamical tides.
For potential NSBH sources with m 1 > 3 M ⊙ > m 2 , we use both BBH waveforms and the NSBH-specific alignedspin waveform models SEOBNRv4_ROM_NRTidalv2_NSBH [221] and IMRPhenomNSBH [222], which feature the dominant quadrupole modes. SEOBNRv4_ROM_NRTidalv2_NSBH uses SEOBNRv4_ROM as the BBH baseline, while IMRPhenomNSBH employs phase evolution from IMRPhenomD [195,196] and amplitude from IMRPhenomC [223]. Both models contain a phenomenological description of the tidal effects tuned to NR simulations [224] and include corrections to the amplitude through inspiral, merger, and ringdown to account for a possibility of tidal disruption.
To account for the systematic uncertainties in the waveform models, we combine equal numbers of posterior samples (described in Sec. V B) from all parameter estimation runs for an event that use waveforms with comparable physics. This event treats the constituent waveform models in the combined results as having equal weight rather than weighting them by marginal likelihood as suggested in Ref. [225]. Table III shows the waveforms employed in this work, the keys under which we group results using these waveforms, and descriptions of the physical effects incorporated in the models.

B. Sampling methods
We use several methods to draw samples from the posterior distributions on source parameters using the models described above. The LALInference [179] package is used for most analyses presented in this paper. This package provides two independent stochastic sampling algorithms: a MCMC algorithm and a nested sampling [228] algorithm. We employ LALInference 's nested sampling algorithm for most of the BBH analyses performed with the IMRPhenomD and IMRPhenomPv2 waveforms and the MCMC algorithm for those performed with SEOBNRv4P. However, the serial nature of these methods makes them unsuitable for use with some of the more computationally costly waveform models, such as waveforms with HMs and precession effects, especially for long-duration signals. For these, we also use RIFT, which performs a hybrid exploration of the parameter space split into intrinsic and extrinsic parameters [229][230][231], and Parallel Bilby, based on a distributed implementation of nested sampling [232][233][234][235], which is also used in previously published analyses of GW190412 [236], GW190425 [32], and GW190814 [33]. The raw posterior samples from the analyses described above are then collated to a common format using the PESummary package [237].

C. Priors
Each event is analyzed independently using a prior distribution on the source parameters that is chosen to ensure adequate sampling of the parameter space and simplicity in using the posterior samples for further analyses. We choose a prior that is uniform in spin magnitudes and redshifted component masses and isotropic in spin orientations, sky location, and binary orientation. The prior on luminosity distance corresponds to a uniform merger rate in the comoving frame of the source, using a flat ΛCDM cosmology with Hubble constant H 0 ¼ 67.9 km s −1 Mpc −1 and matter density Ω m ¼ 0.3065 [238]; this physically motivated prior differs from that used in previous published results, which uses a prior ∝ D L 2 . However, our data release includes parameter estimation samples for both the flat-in-comoving-volume and ∝ D L 2 priors. For details on the conversion, see Appendix C. Intrinsic source masses are computed by dividing the redshifted masses measured in the detector frame by (1 þ z), where z is calculated using the same cosmological model.
For the LALInference and Parallel Bilby analyses, we marginalize over uncertainty in the strain calibration. The calibration errors in amplitude and phase are described by frequency-dependent splines, whose coefficients are allowed to vary alongside signal parameters in the inference. The prior distribution on the calibration error at each TABLE III. Waveform models used in this paper. We indicate which multipoles are included for each model. For precessing models, the multipoles correspond to those in the coprecessing frame. The combined key column specifies which results generated with these waveforms are combined in our data release under a common key. The models below the horizontal line include matter effects. For these datasets, we enforce j⃗ χ i j ¼ 0. b The data release contains versions of these keys with HS and LS in the name, which correspond to the high-spin (j⃗ χ i j ≤ 0.89) and low-spin (j⃗ χ i j ≤ 0.05) priors, respectively. spline node is set by the measured uncertainty at each node [177]. Table IV presents the results from each of cWB, GstLAL, and PyCBC passing a FAR threshold of 2.0 yr −1 ; the full gravitational-wave name encodes the UTC date with the time of the event given after the underscore. GW190521 [34,35], GW190425 [32], GW190412 [236], and GW190814 [33] were published previously, and these names are used here verbatim. The 2.0 yr −1 threshold is chosen to be higher (more permissive) than the threshold used for public alerts, 1.2 yr −1 [239], but sufficiently low to provide an expected contamination fraction below 10%. An extended candidate event list, which may contain marginally significant triggers, will be provided later once final data calibration and quality checks are available. Unlike GWTC-1 [8], a separate p astro threshold is not applied; however, p astro is greater than 50% for all candidate events for which p astro is calculated in this work, satisfying the same criteria as GWTC-1 [240]. Among the 39 reported candidate events passing the FAR threshold of 2.0 yr −1 , 15 are detected by cWB, 36 candidate events are detected by GstLAL, and 27 candidate events are detected by PyCBC; 25 candidate events are recovered by at least two pipelines. Given the FAR threshold and number of candidate events detected, the expectation value for noise events is < 4 in this list. Based on the FAR or the probability of being a signal described in Sec. IV D, GW190426_152155, GW190719_215514, and GW190909_114149 are the most likely to be noise among the candidate event list. cWB recovers fewer candidate events than either GstLAL or PyCBC. This result is expected, because cWB has the highest sensitivity for short-duration, high-mass signals, and its sensitivity decreases for lower-mass systems with longer duration. cWB also requires candidate events to be found in coincidence between at least two detectors.

VI. CANDIDATE EVENT LIST
The difference in candidate event recovery between GstLAL and PyCBC is primarily due to PyCBC analyzing only times when both LIGO Hanford and LIGO Livingston are operating and requiring signals to be observed in both. PyCBC does not analyze Virgo data due to the fact that the code version used for this catalog had not fully integrated three-detector analysis including Virgo. GstLAL analyzes LIGO and Virgo data and allows for the detection of candidate events from one, two, or three gravitationalwave detectors. These algorithmic choices account for the GstLAL-only detection of GW190424_180648, GW190425, GW190620_030421, GW190708_232457, and GW190910_112807, which are detected above the required SNR threshold only in the LIGO Livingston detector, and for GW190630_185205, GW190701_ 203306, and GW190814, where the inclusion of Virgo is essential for determining event significance. The difference in candidate event recovery between PyCBC and GstLAL is also consistent with the results of the simulation presented in Sec. IV C. After accounting for differences in analyzed data, the GstLAL and PyCBC methods detect a comparable number of candidate events.
The remaining differences between the candidate event lists from GstLAL (GW190426_152155, GW190527_ 092055, GW190909_114149, and GW190929_012149) and PyCBC (GW190413_052954, GW190514_065416, and GW190719_215514) arise from the low SNR of each event (SNR ≲ 10). Small fluctuations in SNR caused by different PSD estimation and data segmentation between pipelines lead to differences in significance estimation. This list of low SNR candidate events, which are identified by only one pipeline, also contains the three candidate events with the highest minimum FAR among the pipelines (GW190426_152155, GW190719_215514, and GW190909_114149) which have the highest likelihood among the full candidate event list of being caused by noise.
Since 2 April 2019 20:00 UTC, the LVC produced automated, public preliminary GCN notices for gravitational-wave candidate events appearing in two or more interferometers with FARs less than 6 per year before a multiple analysis trials factor was applied, resulting in an effective threshold of 1.2 yr −1 [248]. On 11 June 2019, this threshold was extended to include gravitational-wave candidate events appearing in only one interferometer and satisfying the same FAR threshold. During O3a, 33 candidate events are disseminated as plausible astrophysical signals; seven are not recovered above the threshold considered in this work [249]. S190510g, S190718y, S190901ap, S190910d, S190910h, S190923y, and S190930t [241][242][243][244][245][246][247] are the seven candidate events disseminated via GCNs which are not recovered here. S190718y, S190901ap, S190910h, and S190930t are initially identified as single-detector candidate events (with an SNR above threshold in only one detector) with FARs of 1.14, 0.22, 1.14, and 0.47 yr −1 , respectively. Relaxing the demand for coincident observation across interferometers allows LVC analyses to report on additional astrophysically interesting candidate events [172,250] but also removes a powerful check on the search background and leads to larger uncertainties in the FAR. All public alerts are subsequently followed up in low latency to assess whether the analysis pipelines and detectors are operating as expected. The low-latency follow-up of S190718y, S190901ap, S190910h, and S190930t does not uncover any reason to retract these candidate events based on data quality. However, after offline reanalysis with additional background statistics, these four single-detector candidate events are no longer significant enough to merit inclusion in Table IV. The remaining three public alert candidate events not recovered here-S190510g, S190910d, and S190923yare found in coincidence in low latency, albeit at modest TABLE IV. Gravitational-wave candidate event list. We find 39 candidate events passing the FAR threshold of 2.0 yr −1 in at least one of the four searches. Except for previously published events, the gravitational-wave name encodes the UTC date with the time of the event given after the underscore. Bold-faced names indicate the events that were not previously reported. The second column denotes the observing instruments. For each of the four pipelines, cWB, GstLAL, PyCBC, and PyCBC BBH, we provide the FAR and network SNR. Of the 39 candidate events, the five that are found above the required SNR threshold in only one of the gravitational-wave detectors are denoted by a dagger ( †). For candidate events found above threshold in only one detector (single-detector candidate events), the FAR estimate involves extrapolation. All single-detector candidate events in this list by definition are rarer than the background data collected in this analysis. Therefore, a conservative bound on the FAR for triggers denoted by † is approximately 2 yr −1 . GW190521, GW190602_175927, GW190701_203306, and GW190706_222641 are identified by the cWB high-mass search as described in Sec. IVA. GstLAL FARs are capped at 1 × 10 −5 yr −1 to be consistent with the limiting FARs from other pipelines. Á Á Á indicate that a pipeline does not find the event below the specified 2.0 yr −1 threshold. Blank entries indicate that the data are not searched by a pipeline. The probability that an event is astrophysical in origin as described in Sec. IV D is indicated in the column p astro . PyCBC and cWB SNRs do not include Virgo. S190510g, S190718y, S190901ap, S190910d, S190910h, S190923y, and S190930t [241][242][243][244][245][246][247] are candidate events disseminated via GCNs which are not recovered here. The SNR for GW190814 (22.2) differs from the previously published value [33] because the non-observing-mode LIGO Hanford data are not analyzed in this work. The FAR of GW190425 differs from Ref. [32] due to different background data and pipeline configuration. The SNR of GW190521 differs from Ref. [34] due to the inclusion of subthreshold Virgo SNR. significance. S190510g is found in low latency by GstLAL and assigned a FAR of 0.28 yr −1 . There are initially data quality concerns with S190510g [251], and offline followup using an additional 24 h of background collection reveals the candidate event to be less significant than originally estimated [252]. Comparison to the full O3a background corroborates that the candidate event no longer passes the FAR threshold of 2.0 yr −1 . S190910d is identified in low latency by the SPIIR compact binary search pipeline [115] with a FAR of 0.12 yr −1 . The compact binary search pipeline MBTAOnline also records a lowsignificance candidate event at this time. However, the candidate event is not observed in low latency or offline by GstLAL, PyCBC, or cWB as significant. Presently, MBTAOnline and SPIIR are configured to run in low latency only. S190923y is reported in low latency by PyCBC and assigned a FAR of 1.51 yr −1 . GstLAL and MBTAOnline also record low-significance candidate events at this time. No pipeline retains this candidate event in the offline analysis below the threshold of 2.0 yr −1 , and it is therefore excluded from Table IV. These and other subthreshold events will be explored further in a future publication.
The remaining 26 public alerts are recovered in our offline analysis and included in the candidate event list presented in Table IV. The table also includes 13 gravitational-wave detections not previously reported. Four of these detections-GW190424_180648, GW190620_ 030421, GW190708_232457, and GW190910_112807have detection-level SNR (≥4) in only one of the two LIGO detectors, making them single-detector observations from the perspective of candidate event significance. The incorporation of iDQ data quality into event ranking, combined with tuning of the signal consistency tests to further reject the O3a glitch background, improves the sensitivity of the offline GstLAL analysis to singledetector candidate events compared to the low-latency configuration, which accounts for these new discoveries. Nine new detections are observed in two or more interferometers and appear for the first time in Table IV:  GW190413_052954, GW190413_134308, GW190514_  065416, GW190527_092055, GW190719_215514, GW190731_140936, GW190803_022701, GW190909_ 114149, and GW190929_012149. Several of these detections are observed in low latency but do not meet the criteria for public release. They all exhibit moderate network SNRs (≲10). The offline analyses here differ from their low-latency counterparts through having improved template banks, improved use of data quality information, improved data calibration, data cleaning, and improved tuning to reject the nonstationary noise background observed in O3a. These differences account for the new moderate SNR candidate events.
During O3a, the LVC issues eight retractions for public alerts that are promptly determined to be unlikely to have originated from astrophysical systems. The LIGO Livingston detector is more problematic for low-latency analyses than in previous observing runs with the rate of noise glitches being significantly higher than in O2 (see Sec. III B). Low-latency detection, especially for candidate events originating in only one interferometer, is especially challenging during O3a. S190405ar is the first retraction [253]. This notice is distributed in error and is never considered to be of astrophysical origin, because the event's FAR is significantly above threshold at 6800 yr −1 . The remaining retractions are caused by severely nonstationary noise. A glitch in LIGO Hanford data leads to the identification and subsequent retraction of S190518bb [254]. S190524q, S190808ae, S190816i, S190822c, S190829u, and S190928c all exhibit extreme nonstationary noise in the LIGO Livingston detector [255][256][257][258][259][260]. S190822c is assigned a significant FAR at identification but is observed only in a single interferometer. As previously mentioned, single-detector triggers are subject to higher uncertainties in FAR. Out of these triggers, an automatic gating method deployed later in the observing run used by searches in order to mitigate nonstationary noise in the detectors would have vetoed S190822c. However, at the time, this automatic gating was not being employed. FARs calculated by the pipelines are correct if the data collected so far are representative of the future data. Online pipelines assign FARs using previously collected data, while offline analyses have access to asynchronous background. If a new nonstationary noise source emerges in the data during an observing run, it is possible for it to be misidentified as a candidate event. Although this misidentification can also affect candidates observed in multiple interferometers, single-detector candidates are especially susceptible to novel noise sources seen for the first time in low latency. In some cases where the impact of glitches is unknown, follow-up analyses are performed to remove instrumental artifacts and reassess the candidate event significance. After follow-up, none of these eight retracted candidate events remain significant. Subsequent offline analyses do not identify them as significant, either.

VII. SOURCE PROPERTIES
We analyze the 39 candidate events shown in Table IV with the parameter estimation techniques described in Sec. V. For a subset of candidate events, event validation procedures outlined in Sec. III C identify transient noise that may impact the results of parameter inference. In order to minimize the effect of this transient noise, candidatespecific procedures are explored for all impacted candidate events. In most cases, transient noise is mitigated through the glitch-subtraction methods outlined in Sec. III D. After application of these methods, the identified transient noise is considered mitigated if the data surrounding the event are consistent with Gaussian noise, as measured by the variance of the measured power spectral density during the time period containing the identified transient [97]. If data are not Gaussian after glitch subtraction, we evaluate the SNR lost by restricting the frequency range of data considered in parameter inference to fully excise the identified transient noise. In cases where the single-detector SNR loss is below 10%, this reduced frequency range is used in analyses. Otherwise, the nominal frequency range is used. The full list of candidate events using candidatespecific mitigation, along with the mitigation configuration, is found in Table V.
Based on the investigations described in Appendix A, we find that most gravitational-wave candidate events in this catalog exhibit small changes in source parameter estimates when spherical harmonic modes above l ¼ 2 are included. However, these differences in aggregate could still affect population-level studies, so we present as fiducial results the combined posterior samples of HM runs for all BBHs candidate events except GW190707_093326, GW190720_ 000836, GW190728_064510, GW190915_235702, GW190924_021846, and GW190930_133541. For these six exceptions, we present combined IMRPhenomPv2-SEOBNRv4P samples, because the effect of higher modes is either negligible or subdominant to the systematics between IMRPhenomPv2 and SEOBNRv4P results, as detailed in Appendix A. GW190412 [236], GW190521 [34,35], and GW190814 [33] are analyzed extensively in separate publications with the HM waveform families SEOBNRv4PHM, IMRPhenomPv3HM, or NRSur7dq4, so we present HM runs for these candidate events as fiducial results here and defer readers to those publications for details on those candidate events [261].
GW190425, GW190426_152155, and GW190814 show indications of including at least one neutron star, and so are also analyzed using tidal waveforms in addition to IMRPhenomPv2 and SEOBNRv4P, and are discussed in Sec. VII B. Further details on waveform systematics and the waveforms employed in this work can be found in Appendix A, and the full suite of posterior samples is publicly available [262].
In the following subsections, we summarize the results of our parameter estimation analyses and highlight candidate events of particular interest. To identify candidate events with the most extreme parameter values, we repeatedly select one posterior sample at random from each event and record which candidate events have the lowest and highest values of each parameter. From these repeated trials, we determine each event's probability of having the lowest or highest value for a given parameter. Table VI shows 90% credible intervals on the source parameters of all 39 candidate events using the priors described in Sec. V C and waveforms specified above.
To provide an overview of the posterior distributions of the source parameters for all GWTC-2 candidate events, we show 90% credible regions for all candidate events in the M-q and M-χ eff planes in Figs. 6 and 7, respectively, and the corresponding one-dimensional marginal posterior distributions on m 1 , q, and χ eff in Fig. 8.
A. Masses of sources with m 2 > 3 M ⊙ Our candidate event list includes compact binary mergers that have higher total masses than those in GWTC-1 [8] as well as mergers with component masses in the purported lower mass gap of approximately 2.5-5 M ⊙ [263][264][265][266]. Here, we describe the masses for candidate events with m 2 > 3 M ⊙ , which we can confidently expect to be BBHs. The remaining candidate events are described separately in Sec. VII B.
A majority of the masses of black holes reported herein are larger than those reported via electromagnetic observations [267][268][269]. By repeatedly selecting one posterior sample at random from each event and recording the most massive among the ensemble, we find that the most massive binary system is probably the one associated with GW190521 [34,35]. This system has a 98% probability of being the most massive, with a total mass of 163.9 þ39.2 −23.5 M ⊙ and remnant mass 15.6 þ36.8 −22.4 M ⊙ , where we average over SEOBNRv4PHM, NRSur7dq4, and IMRPhenomPv3HM waveform families. This averaging is done for consistency with other sources contained in this catalog, in contrast to the individual results reported in Ref. [35], where the NRSur7dq4 results are highlighted. The more massive component in the source of GW190521 has a 66% probability of being the most massive BH detected in gravitational waves to date (m 1 ¼ 95.3 þ28.7 −18.9 M ⊙ ). GW190519_153544, GW190602_ 175927, and GW190706_222641 also have notably high total masses with over 50% posterior support for total mass M > 100 M ⊙ .
The least massive O3a system with m 2 > 3 M ⊙ is probably (96%) the one associated with GW190924_021846  VI. Median and 90% symmetric credible intervals on selected source parameters. The columns show source total mass M, chirp mass M and component masses m i , dimensionless effective inspiral spin χ eff , luminosity distance D L , redshift z, final mass M f , final spin χ f , and sky localization ΔΩ. The sky localization is the area of the 90% credible region. For GW190425, we show the results using the high-spin prior (j⃗ χ i j ≤ 0.89). We also report the network matched-filter SNR for all events. These SNRs are from LALInference IMRPhenomPv2 runs, since RIFT does not produce the SNRs automatically, except for GW190425 and GW190426_152155, which use the SNRs from fiducial runs, and GW190412, GW190521, and GW190814, which use IMRPhe-nomPv3HM SNRs. For GW190521, we report results averaged over three waveform families, in contrast to the results highlighting one waveform family in Ref. [34].  . For most sources detected in O3a, the mass ratio posteriors have support at unity and, therefore, are consistent with equal mass mergers. An exception is the source of GW190412, which is the first event detected that has a confidently unequal mass ratio (q ¼ 0.28 þ0. 12 −0.06 ) and exhibits strong signs of HM contributions to the waveform [236]. Although its mass ratio is confidently bounded away from unity, GW190412 has only a 34% chance of having the smallest mass ratio among O3a sources with m 2 > 3 M ⊙ .
As seen in Figs. 6 and 8, the mass ratios are not well constrained for many systems, so one or more could have a smaller mass ratio than GW190412.
The least massive O3a system is associated with GW190425 and is likely a binary neutron star system given the inferred masses (m 1 ¼ 2. FIG. 7. Credible region contours for all candidate events in the plane of chirp mass M and effective inspiral spin χ eff . Each contour represents the 90% credible region for a different event. We highlight the previously published candidate events (cf. Fig. 6), as well as GW190517_055101 and GW190514_065416, which have the highest probabilities of having the largest and smallest χ eff , respectively. not rule out a NSBH or BBH origin. These estimates are obtained with the PhenomPv2NRT model with a high-spin prior that restricts dimensionless spin magnitudes of the compact objects to be less than 0.89 and previously reported in Ref. [32]. Although the inferred component masses of GW190425's source are consistent with masses of known neutron stars [270][271][272][273], the total mass 3.4 þ0. 3 −0.1 M ⊙ is greater than that of observed Galactic BNSs [274,275]. This difference raises the question of whether GW190425's source was formed in a different environment from the double neutron star systems observed to date [32,[276][277][278][279].
Extending the discussion of waveform systematics reported with the discovery of GW190425 [32], we perform four supplementary analyses using the nonprecessing EOB models SEOBNRv4T_surrogate and TEOBResumS for low-spin and high-spin priors. The results are summarized in Table VII. All the analyses produce quantitatively similar results to the corresponding nonprecessing analysis reported in the discovery paper, and the two EOB models produce consistent results between them.
For the low-spin prior, the results agree with the nonprecessing IMRPhenomDNRTidal analysis reported in the discovery paper, with SEOBNRv4T_surrogate and TEOBResumS recovering chirp mass as M ¼ 1.44 þ0.02 −0.02 M ⊙ and effective inspiral spin of χ eff ¼ 0.01 þ0.01 −0.01 . When allowing larger compact object spins, the results with IMRPhenom-DNRTidal and EOB models exhibit some differences but, overall, give consistent posteriors.
Our inferences about tidal parameters are likewise consistent with the previous nonprecessing analyses. We follow the procedure of Ref. [32] and reweight the posteriors to a flat inΛ prior. For the low-spin prior, the two EOB models give very similar bounds, withΛ constrained below 630. For the high-spin prior, the TEOBResumS waveform model constrains the dimensionless tidal deformability parameter better (Λ ≤ 870) as compared to the other waveform models, as seen in the top in Fig. 9.
The two EOB models also constrain the mass ratio better than other waveforms (0.54-1.0) as can be seen in the bottom in Fig. 9. The previously reported analyses that allow for significant precessing spins have much greater flexibility and, thus, for the high-spin prior, produce a more asymmetric mass ratio posterior distribution than the two nonprecessing updates reported here.
Finally, both the EOB models find the luminosity distance of D L ¼ 0.16 þ0.07 −0.07 Gpc independent of the spin prior used.

GW190814
Among the O3a events, GW190814's source [33] has the least massive secondary component after the sources of GW190425 and GW190426_152155. GW190814's less TABLE VII. Source properties of GW190425 with different waveform families. For the primary mass we give the 0%-90% interval, while for the secondary mass and mass ratio we give the 10%-100% confidence intervals. The quoted 90% upper limits forΛ are obtained by reweighing its posterior distribution as detailed in Appendix F of Ref. [32]. massive component has mass m 2 ¼ 2.59 þ0.08 −0.09 M ⊙ , making its interpretation as a black hole or a neutron star unclear [33]. GW190814 also has the most extreme mass ratio of all the candidate events, q ¼ 0.112 þ0.008 −0.009 [33].

GW190426_152155
GW190426_152155 is the candidate event with the highest FAR: 1.4 yr −1 . Assuming it is a real signal of astrophysical origin, we estimate its component masses to be m 1 ¼ 5.7 þ3.9 −2.3 M ⊙ and m 2 ¼ 1.5 þ0.8 −0.5 M ⊙ , raising the possibility that it could have originated from either a BBH or an NSBH source. The mass of the secondary component is consistent with masses of (previously) reported neutron stars [10,271,272,274], but the data are uninformative about potential tidal effects, showing essentially no difference between the prior and posterior on Λ 2 obtained from NSBH waveforms SEOBNRv4_ROM_ NRTidalv2_NSBH or IMRPhenomNSBH which we use for our fiducial results. A more definitive assessment of this event likely requires further observations to establish the rate of astrophysical signals with comparable properties.

C. Spins
Most of the compact objects detected in O3a have spin magnitudes consistent with zero, within uncertainties, but in some cases the spins can be constrained away from zero. Two systems have a > 50% chance of having at least one black hole with dimensionless spin magnitude χ i¼f1;2g > 0.8: GW190517_055101 has χ i¼f1;2g > 0.8 with 77% credibility, and GW190521 with 58% credibility. In addition to spin magnitudes, we also consider the effective inspiral spins. As described in Sec. IV B, the effective inspiral spin χ eff is the mass-weighted combination of aligned spins and is approximately conserved under precession. Effective inspiral spin posterior distributions for all candidate events are shown in Figs. 7 and 8. We find 11 systems that show signs of nonzero χ eff . At 90% credibility, GW190412, GW190425, GW190517_055101, GW190519_153544, GW190620_030421, GW190706_ 222641, GW190719_215514, GW190720_000836, GW190728_064510, GW190828_063405, and GW190930_133541 have sources with χ eff > 0. No individual systems are found to have χ eff < 0 with ≥ 95% probability, but the event with the lowest χ eff in O3a is probably GW190514_065416 with χ eff ¼ −0.19 þ0. 29 −0.32 . The surplus of events with χ eff > 0 compared to none with χ eff < 0 suggests that the spin orientations of black holes in binaries are not isotropically distributed with respect to their orbital angular momenta. This possibility is explored further in Ref. [37].
In some cases, the joint χ eff and mass-ratio measurement for an event enables a tighter measurement of the spin magnitude of the primary mass than for candidate events with mass ratios closer to unity. For example, we find primary spin magnitudes of χ 1 ¼ 0.40 þ0. 40 −0.35 for the source of GW190720_000836 and χ 1 ¼ 0.32 þ0. 37 −0.28 for the source of GW190728_064510. Posterior distributions on spin magnitudes and tilt angles are shown in Fig. 10 for these two candidate events and other select systems that exhibit nonzero spins.
The magnitude of spin precession in the waveform can be partially captured by the effective precession spin parameter χ p , which includes the projection of component spin vectors onto the orbital plane [197,198]. To identify events for which the data constrain χ p , we compare the χ p priors-conditioned on the posteriors of χ eff -to the χ p posteriors, as done in Ref. [8]. Conditioning the χ p prior on the χ eff posterior accounts for the correlated prior between χ eff and χ p in the default spin prior choices presented in Sec. V. Figure 11 shows the one-dimensional posterior and (χ eff -conditioned) prior distributions on χ p for events with Jensen-Shannon (JS) divergence [280] JS is calculated between the χ p posterior and conditioned prior. For most of the candidate events, the posterior on χ p is similar to the prior, indicating that the data are largely uninformative about precession, but there are a few notable exceptions. The χ p inference on GW190814 is most striking with D χ p JS ¼ 0.72 bit: The spin magnitude of the primary mass of the system is constrained to be near zero, resulting in a correspondingly small χ p value. After GW190814, D χ p JS is largest for GW190412 [236,281] and GW190521 [34,35], with D χ p JS ¼ 0.18 bit and D χ p JS ¼ 0.15 bit, respectively. Unlike GW190814, the χ p posterior distributions for these events are constrained away from zero, showing preference for precession in these systems. The tilt angle of GW190412's more massive component's spin with respect to the Newtonian orbital angular momentum is particularly well constrained to θ LS 1 ¼ 0.80 þ0. 54 −0.35 , as seen in Fig. 10. To further investigate possible precession in these signals, we compute the precession SNR ρ p [282,283], which characterizes the observability of precession in gravitational-wave data. GW190412 has the largest precession SNR in O3a with a median ρ p ¼ 3.0. Despite parameter estimation preferring high χ p for GW190521, we calculate ρ p ¼ 0.7 showing that the measurable precession in the signal is small due to the short signal duration. Using empirical relations between ρ p and Bayes' factors for precession [284,285], we see that the corresponding Bayes' factors for precession also only mildly favor precession. A population-level analysis of the GWTC-2 spins is presented in Ref. [37] and finds evidence for the presence of spin precession in the population.

D. Three-dimensional localization
The most distant event, after accounting for measurement uncertainties in distance, is most probably GW190413_134308, with an estimated luminosity distance and redshift of D L ¼ 4.45 þ2.48 −2.12 Gpc and z ¼ 0.71 þ0.31 −0.30 , respectively, approximately twice the luminosity distance of the most distant source from GWTC-1, GW170729 [8], and comparable to the distance for gravitational-wave candidate GW170817A [22] (not to be confused with the BNS signal GW170817). However, GW190909_ 114149, GW190514_065416, GW190521, GW190706_ 222641, and GW190719_215514 have similarly large distances to GW190413_134308. With candidate events at these cosmological distances, we can more readily measure the Hubble constant and the evolution of the BBH merger rate over cosmic time. Such analyses are performed in Ref. [37]. The closest source detected in O3a is GW190425, with an inferred luminosity distance of D L ¼ 0.16 þ0.07 −0.07 Gpc, about 4 times the distance for GW170817.
Overall, GW190814 is the best localized event detected in O3a. The contour encompassing 90% of this event's two-dimensional sky position posterior is ΔΩ ¼ 19 deg 2 , and 90% of this event's 3D sky position posterior is contained in ΔV 90 ¼ 3.2 × 10 −5 Gpc 3 . Although GW190814 was initially detected in only LIGO Livingston and Virgo, it was reanalyzed with LIGO Hanford data, enabling the strong constraint on 3D source position. GW190412 and GW190701_203306 are also relatively well localized with ΔΩ ¼ 21 deg 2 , ΔV 90 ¼ 0.035 Gpc 3 and ΔΩ ¼ 46 deg 2 , ΔV 90 ¼ 0.035 Gpc 3 , respectively, and are both detected in all three detectors. GW190424_180648 is detected and analyzed only in Livingston data and, therefore, has the largest localization area and volume with ΔΩ ¼ 28000 deg 2 and ΔV 90 ¼ 28 Gpc 3 . Credible intervals on each source's distance and sky area are shown in Table VI. Probability density sky maps for all events are available as part of the data release [262].

VIII. WAVEFORM RECONSTRUCTIONS
Template-based [179,286] and minimally modeled methods [107,112,287]  provide a mapping between the shape of the waveforms and the parameters of the source, such as the masses and spins of a binary system, but are limited to those sources for which we have models. Minimally modeled reconstructions make it possible to discover unexpected phenomena, but they do not provide a direct mapping to the physical properties of the source. Currently available waveform templates for binary mergers are based on various approximations and numerical solutions to Einstein's equations that cover a subset of the full parameter space. These waveform templates may, thus, fail to capture some features of the signal. A more exotic possibility is that gravity behaves differently than predicted by general relativity. One way to test these scenarios is to compare the template-based and minimally modeled waveform reconstructions.
A standard measure of the agreement between two waveforms h 1 and h 2 is the match, or overlap where hajbi denotes the noise-weighted inner product [143,180]. The match is constrained to be ≤1.
For each event, the matches are computed between the maximum likelihood template-based waveforms and two minimally modeled waveform reconstruction methods, cWB [112] and BW [107], using data that contain the FIG. 11. Posterior and prior distributions on the effective precession spin parameter χ p for select events. The χ p prior shown on the right half of each leaf is conditioned on the posterior of χ eff , since the default prior in sampling contains correlations between χ p and χ eff .
FIG. 12. p-value plot for the candidate events reconstructed by the minimally modeled pipelines in O3a. The upper panel is from the BW analysis, and the lower panel is from the cWB analysis. The cWB analysis includes the candidate events that are both detected and reconstructed (listed in Table IV) and those that are detected but reconstructed offline. (GW190513_205428, GW190707_093326, GW190728_064510, GW190814, and GW190828_065509). The BW analysis uses the same selection of BBH candidate events as are used for testing general relativity [36]. The p values are sorted in increasing order and plotted vs the order number (which is also the cumulative number of candidate events). Each p value is obtained from the observed on-source match value and the corresponding off-source distribution of the match values from off-source injections. The green band indicates the theoretical 90% symmetric distribution about the null hypothesis (dark green line). Only deviations below the green 90% confidence band indicate disagreement, and we see that there are none. A few of the cWB p values are above the band, indicating either a statistical fluctuation or an overfitting, which we attribute to a small asymmetry between the way the on-source and off-source matches are computed. event (on-source data). To ascertain whether these match values are in line with expectations, waveforms from the template-based analysis are added to data near, but not including, each event (off-source data). The minimally modeled waveform reconstructions are repeated multiple times on these off-source data to estimate the distribution of match values we would expect for each event. For each event, these distributions are used to compute a p value, given by the fraction of off-source match values that are below the on-source match. For some of the lower SNR candidate events, the minimally modeled methods are unable to reconstruct the signals, and these candidate events are excluded from the analysis. Details of the analysis procedure, and additional results, can be found in Appendix B. Figure 12 shows the p values for the analyzed candidate events sorted in increasing order [287,288]. Events with p values above the diagonal have on-source matches that are higher than expected, while those below the line have matches that are lower than expected. The cWB analysis shows some events outside the theoretical 90% band for the single-order statistic, which can be due to statistical fluctuations. Because of correlations between p values in the ordered plot, for 20p values the probability of finding exactly four p values outside the 90% band as in Fig. 12 is 5.2%, although it may also point to a systematic effect in the analysis. Since the p values are higher than expected, there is no evidence of a discrepancy with the templatebased analysis. Our tests also indicate that, if there is indeed a systematic effect, it may originate from an overestimate of the off-source matches no larger than 2%. Further details are given in Appendix B. Overall, the p-value distributions support the null hypothesis that the minimally modeled waveform reconstructions are consistent with the generalrelativity-derived waveform templates.

IX. CONCLUSION
We have presented the results from a search for compact binary coalescence signals in the first part of the third observing run of Advanced LIGO and Advanced Virgo. During the period of observations, spanning 1 April to 1 October 2019, the three detectors had sensitivity that significantly exceeded previous observing runs, with median BNS inspiral ranges of 108 (Hanford), 135 (Livingston), and 45 Mpc (Virgo). This improved sensitivity allowed us to greatly expand the number of known compact binary mergers, adding 39 new gravitational-wave events to the 11 we have previously reported in GWTC-1 [8].
We performed parameter estimation on these 39 new GWTC-2 signals using a range of waveform models, allowing us to incorporate the effect of HMs in the inference of source parameters for BBH systems and to compare to the systematic differences between waveform families. We find that the sources of these signals include BBHs that are more massive, farther away, and more asymmetric in mass ratio than any sources in GWTC-1, as well as three binaries with at least one component of mass < 3 M ⊙ . These latter systems may include the first detected NSBH mergers; however, there is insufficient SNR to perform an informative measurement of the tidal deformability that would definitively indicate whether they had an NSBH or BBH origin. We expanded our analysis of the potential BNS signal to encompass further waveforms which include tidal effects, finding agreement with previously reported results [32].
Our analyses recovered 11 sources with positive effective inspiral spin but zero with negative (at 90% credibility). Although no individual binary in O3a has χ eff < 0 with high credibility, a hierarchical population analysis [37] of LVC-reported events to date indicates a nontrivial fraction of binaries have negative effective inspiral spin. This result would be consistent with the independent observation of gravitational-wave candidate GW170121, which was found to have significant probability of having χ eff < 0 [23,27]. We also examined the evidence for misalignment of orbital and component angular momenta, which would be an indication of a dynamical formation channel and produce precession of the orbital plane. We find only mild evidence in favor of precession in the most significant case of GW190412 [236,281]. We also performed consistency checks between the waveform models and the observed data, finding no statistically significant differences.
A pair of companion papers make use of the events in GWTC-2 to study source populations and fundamental physics. The inferred population distribution of compact object mergers is described in Ref. [37], which reports an updated BBH merger rate density of 23.9 þ14.9 −8.6 Gpc −3 yr −1 and rate density for BNS of 320 þ490 −240 Gpc −3 yr −1 . This paper also investigates mass and spin distributions and finds evidence for a population of BBH systems with spins misaligned from the orbital plane. Eight tests of general relativity are reported in Ref. [36], showing no evidence for violations of Einstein's theory of gravity, leading to some of the best constraints on alternative theories to date.
The online gravitational-wave transient catalog represents a cumulative set of events found in LIGO-Virgo data, including detections presented in GWTC-1 and now also events from the first six months of the O3 observing run. O3 continued from November 2019 until March 2020. Analysis of the second portion (O3b) is currently in progress, and events found during this period will be added in the next update.
The Advanced LIGO and Advanced Virgo detectors are currently undergoing commissioning to further improve their sensitivity and will be joined in their fourth observing run by the KAGRA detector [289]. This upgrade should lead to improvements in the detection rate and source localization, improving the prospects for multimessenger observation of future sources [290,291].
Data products associated with this catalog are available through the Gravitational Wave Open Science Center (GWOSC) [262]. Data associated with all events described in this paper are available through the GWOSC event portal, including calibrated strain time series, parameter estimation posterior samples, tables of 90% credible intervals for physical parameters, and search pipeline results. These data products may be accessed through a Web browser or open-source client package [292]. We provide a public data release associated with the results contained in this paper [293], which includes the data behind the figures, the simulation data used in estimating search sensitivity, and the posterior samples used in estimating the source properties. Full results from the injection campaign are available via a data release [294]. and Technology (MOST), Taiwan, and the Kavli Foundation. The authors gratefully acknowledge the support of the NSF, STFC, INFN, and CNRS for provision of computational resources. The detection of the signals and subsequent significance evaluations in this catalog were performed with the GstLALbased inspiral software pipeline [111,122,123] built on the LALSuite software library [295] and with the PyCBC [29,[124][125][126][127] and cWB [112,[118][119][120][121] packages. Parameter estimation was performed with the LALInference [179] and LALSimulation libraries within LALSuite [295], the Bilby and Parallel Bilby libraries [232][233][234] using the dynesty nested sampling package [235], and the RIFT library [229][230][231]. PESummary was used to postprocess and collate parameter estimation results [237]. Estimates of the noise spectra and glitch models were obtained using BayesWave [107,108,182]. The cWB and BayesWave packages were also used to generate waveform reconstructions. Plots were prepared with MATPLOTLIB [296], SEABORN [297], and GWPY [298]. NumPy [299] and SciPy [300] were used in the preparation of the manuscript.

APPENDIX A: WAVEFORM SYSTEMATICS
The choice of waveform influences our inferences of source properties [8,301,302]. As such, we employ multiple waveform families in the inference of each event's source parameters. The full list of models represented in our publicly available results for each event is shown in Table VIII.
There are many tools designed to quantify difference between probability distributions, one of which is the Kullback-Leibler (KL) divergence [303]. It is defined by  The convenient properties of the JS divergence make it a useful measure of the difference between posterior probability distributions, and we adopt it to quantify the systematics in our results resulting from different analysis choices. We use the JS divergence to investigate two sources of systematics that can impact the results of inference: the differences in the models with equivalent physics due to modeling choices and differences due to the inclusion of different physical effects. To assess the importance of the former, we compare the results using our two fiducial precessing waveform models (IMRPhenomPv2 and SEOBNRv4P). For the latter, we consider inclusion of HMs beyond the dominant quadrupole by comparing SEOBNRv4P to models that include HMs: SEOBNRv4PHM and NRSur7dq4. We compute the JS divergence between one-dimensional marginal distributions on two key parameters: mass ratio and effective inspiral spin. We use a threshold of 0.007 bit to deem the differences significant, which for a Gaussian corresponds to a 20% shift in the mean, measured in units of one standard deviation. This threshold is larger than 0.002 bit, which is the variation that is shown to arise due to stochastic sampling [234]. In the following subsection, we describe how we use our computed JS divergences and JS threshold to choose the default parameter estimation results shown in this work.

Choice of waveform models for each event
We use the above considerations to select which waveform model(s) are used as the fiducial results presented in Sec. VII. [304] In particular, we present results from models with HMs if the following conditions are satisfied for any of the three key parameters and any of the HM models: When the conditions do not hold, we combine equal number of samples from the results of IMRPhenomPv2 and SEOBNRv4P and use the joint samples. The only exceptions are GW190425 and GW190426_152155, for which we use the tidal waveforms described in Sec. V, and GW190412, GW190521, and GW190814, for which we know HMs are significant [33,34,236]. Figure 13 shows the JS divergences for every event (except GW190425, GW190426_152155, GW190521, and GW190814) to compare the effects of model systematics versus the effects of HMs. The gray regions correspond to the criteria in Eq. (A4a): Any events with D JS values in those regions are presented using HM results in Sec. VII.

Waveform comparison-Model systematics
The in-depth studies of GW190412, GW190521, and GW190814 quantitatively demonstrate how much the choice of gravitational waveform impacts our interpretation of real gravitational-wave events. We identify several additional instances where analyses with our two fiducial precessing waveform models (IMRPhenomPv2 and SEOBNRv4P) have notable differences, as measured by the JS divergence.
To illustrate the impact of systematics, we show in Fig. 14 the posterior distributions for the events with extremal JS divergence in the key parameters q and χ eff : GW190924_021846 and GW190521_074359. For GW190924_021846, SEOBNRv4P prefers more moderate mass ratios around 1=2 and shows only minor differences in the other parameters. Meanwhile, GW190521_074359 shows the largest differences in the effective inspiral spin, with SEOBNRv4P more clearly preferring nonzero values.
While the cases above demonstrate the moderate shifts that result from model systematics, for all the events presented in this paper, there is always substantial overlap between the posteriors for all the key quantities we consider. Therefore, systematic uncertainty remains subdominant to statistical uncertainty.
We systematically apply one or two models with HM to all sources in our catalog. We find that a majority of the sources investigated have modest shifts in mass ratio or χ eff due to the impact of HMs.
To illustrate the impact of nonquadrupole modes, Fig. 15 shows posterior inferences on GW190519_ 153544, GW190602_175927, GW190706_222641, and GW190929_012149 using one or two independent models which include nonquadrupole modes (SEOBNRv4PHM and NRSur7dq4). Using more complete waveform models, the source of GW190519_153544 is inferred to be edge on, at a smaller distance, a more asymmetric mass ratio and, thus, a higher source-frame mass m 1 . Because of this source's favorable orientation, nonquadrupole modes have a significant impact, with a Bayes factor for HMs of approximately 15. Similarly, nonquadrupole modes allow us to more strongly exclude both extreme and comparable mass ratios for the source responsible for GW190929_012149 and to disfavor comparable masses for GW190706_222641. Conversely, using the same two waveform models, the high-mass source responsible for GW190602_175927 is more confidently inferred to have mass ratio q closer to unity, and, therefore, m 2 is skewed to larger values. Nonquadrupole modes also have a noticeable impact on parameters of GW190630_185205 and GW190828_ 065509, in particular, the mass ratio and luminosity distance.

APPENDIX B: WAVEFORM CONSISTENCY TESTS
There are several different quantitative measures that can be used to measure waveform consistency. These include the residual SNR, which is found by subtracting the best-fit waveform template h T from the data d and then applying minimally modeled methods to search for any coherent excess in the noise residual r ¼ d − h T . Additional measures of waveform consistency include the distance between waveforms, Δ 2 hh 1 ; h 2 i ¼ hh 1 − h 2 jh 1 − h 2 i, and the match, or overlap, Ohh 1 ; h 2 i.
The residuals test is applied as a test of general relativity [9,10], at least within the precision with which the waveform models approximate general relativity. Distance and match provide more sensitive measures of the waveform consistency than the residuals test, since the extrinsic parameters of the source, such as the arrival time, sky location, and polarization, are constrained by the full signal, while for the residuals test the extrinsic parameters have to be constrained from the (usually small) difference between the signal and the template. To compare the signal reconstructions for the current catalog of sources, we adopt the waveform match as our measure of waveform consistency, since it does not depend on the overall amplitude of the signals, making it a convenient choice when comparing events with a range of amplitudes.
To make a quantitative assessment of the waveform match values, we need to know how the match depends on quantities such as the SNR and time-frequency volume of the signals. Instrument noise leads to nonzero mismatches, MM ¼ 1 − O, even when using perfect templates. For example, the maximum likelihood solution for a perfect template has a mismatch with a mean and variance given by [318] E½MM where D is the number of parameters that define the signal model and where the reduction from D to D − 1 is because the match is independent from the overall amplitude of the signal. For the minimally modeled waveform reconstructions, the distributions of match values are more difficult to predict. Using simulations, it is found that the mismatch decreases with SNR, but more slowly than for templates, since the effective dimension of the model increases with SNR [108]. The mismatch also scales with the timefrequency volume. For binary systems of a given SNR, the mismatch is generally smaller for high-mass systems [108,319,320]. Given these complexities, we choose to empirically estimate the match distribution from simulations for each event. As a proxy for the signal, we use fair draws from the on-source template-based analysis and inject these into data surrounding the event; the right ascensions for the simulated signals are adjusted such that the simulated source is at the same sky location in the frame of the detectors. For the majority of events, the waveform model used for the injections is IMRPhenomPv2.
FIG. 16. The upper panels show waveform reconstructions for GW190519_153544 in the LIGO Livingston detector. The waveform posterior from the template-based analysis (shown in orange) is compared to the BW analysis in the upper left and to the cWB analysis in the upper right. The panels also show the 90% credible bands for the Bayesian LALInference and BW algorithms and the 90% confidence band for cWB derived from off-source injections, i.e., by injecting samples from the template-based analysis into data surrounding the event and repeating the analysis multiple times (these bands are computed on an individual time sample basis). The lower two panels show the distribution of overlap values when running BW and cWB on waveforms drawn from the template-based analysis that are injected into data surrounding the event. The fraction of runs with matches below that of the on-source analysis give the p value for the event.
The exceptions are GW190412, GW190521, and GW190814, where IMRPhenomPv3HM is used. The cWB and BW algorithms are used to produce point estimates for the waveform reconstructions for each of the simulated events. For cWB the point estimate is a constrained maximum likelihood reconstruction, while for BW the point estimate is the median of the waveform posterior distribution. Figure 16 illustrates the results obtained for GW190519_153544. The upper panels compare the template-based LALInference waveform reconstruction in the LIGO Livingston detector to the minimally modeled BW and cWB reconstructions. The solid lines are point estimates for the waveforms: for LALInference the maximum likelihood; for BW the median of posterior draws; and for cWB the constrained maximum likelihood. The panels also show the 90% credible bands for the Bayesian LALInference and BW algorithms and the 90% confidence band for cWB derived by injecting samples from the template-based analysis into data surrounding the event and repeating the analysis multiple times (these bands are computed on an individual time sample basis). The lower panels in Fig. 16 show the distribution of overlaps found when running BW and cWB on simulated data with similar properties to the event. Waveforms drawn from the onsource LALInference analysis are injected into data surrounding the event. The overlap between the injected waveform and the point estimates from the BW and cWB analyses of these injections are then used to produce the histograms seen in Fig. 16. The distribution of the match values defines a null distribution for each detected event, which takes into account the variability of the LALInference posterior distribution, the fluctuations of the detector noise, and the waveform reconstruction errors. The fraction of off-source analyses with overlaps below the on-source match values, which are shown as vertical lines in the lower panels in Fig. 16, defines the p value for this event.
The same analysis procedure is repeated for a subset of additional events. cWB uses only events that are above the cWB search thresholds (resulting in a morphologydependent SNR threshold which is about 7-10 for the events reported in this catalog), while for BW the analysis is restricted to events where the on-source BW analysis yields SNR > 7. Figure 17 shows the on-source match values vs the off-source median match values with 90% intervals. The upper panel shows the results of the BW analysis, while the lower panel shows the results of the cWB analysis. In both cases, the p values point to a good agreement between the minimally modeled and templatebased reconstructions.
The discrepancies between the two plots may be ascribed to different choices made in the two reconstruction algorithms. cWB is both a detection and a reconstruction pipeline. For this reason, reconstructions are performed with the same production settings used in searches.
The production settings are optimized for noise rejection and to enforce strong network coherency constraints. To construct the match distributions, cWB uses about 2000-3000 waveform injections per event; however, those injections that are below the cWB thresholds are not reconstructed: In the majority of cases reported here (13), the reconstruction efficiency is greater than 80%, while in a few other cases (7), the efficiency ranges from FIG. 17. Off-source vs on-source match values for the candidate events in O3a. The upper panel displays the results of the BW analysis, and the lower panel shows the results of the cWB analysis. The on-source match on the horizontal axis is the value obtained comparing the maximum likelihood waveform from parameter estimation with point estimates from the minimally modeled waveform reconstructions. The off-source match on the vertical axis is the median value of the match distribution obtained from off-source injection of sample waveforms from the template-based posterior distribution. In both panels, the errors bars denote the 90% equal-tailed confidence interval. 15% to 50%. For each event, this efficiency depends on the variability of the noise background. In cases of lower efficiency, we also check that the waveforms that are successfully reconstructed by cWB have parameter distributions that are statistically indistinguishable from those of the injected waveforms.
BW employs Bayesian inference to characterize detections made by the CBC and cWB search pipelines. As such, no cuts are made on the waveform reconstructions, which means that, for quiet signals, some of the samples are drawn from the prior distribution, resulting in a wide spread in the match distributions.
The distribution of match values for each event are used to compute p values. The overall consistency of the template-based and minimally modeled waveform reconstructions can then be summarized by plotting the p values, ranked in ascending order, against the theoretical distribution. In such plots, any significant deviations below the plot diagonal point to events that should undergo further analysis. The p values for the events reconstructed by cWB and BW are shown in Fig. 12 in the main text. Applying the Fisher test [321] to the ensemble of p values yields combined p values of 0.57 for the BW analysis and 0.99 for the cWB analysis, indicating that there is no reason to reject the null hypothesis that the template-based and minimally modeled analysis are in agreement. The Fisher test is one sided in that it penalizes only p values that are lower than expected. The cWB analysis includes instances where the p values are higher than predicted, indicating that the on-source matches are higher than expected based on the off-source distributions. The cause of this bias is likely due to an asymmetry between the on-source and off-source analyses. The onsource analysis computes the match between the maximum likelihood template-based waveform and its reconstruction, while the off-source analysis computes the match between the injected template and its reconstruction. Ideally, the analysis would be symmetric, with the maximum likelihood template used both onsource and off-source, but the computational cost of running the full template-based analysis on the thousands of off-source injections is prohibitively expensive.

APPENDIX C: COSMOLOGICAL DISTANCE RESAMPLING
For the luminosity distance, a prior that goes as D 2 L is enforced in the sampling, following the same procedure as described in previous publications. As this assumption becomes increasingly unrealistic as events are detected at greater distances, the posterior distributions shown in this paper are derived from a physically motivated prior that incorporates cosmological effects. We perform rejection sampling on the initial posterior samples to instead use a prior corresponding to a uniform merger rate per comoving volume in the rest frame of the source. Using a standard flat The initial 1=ð1 þ zÞ factor accounts for time dilation of the observed merger rate, dV c is the comoving volume element, and dV E ¼ D L 2 dΩ is the Euclidean volume element. D H ¼ c=H 0 is the Hubble distance and EðzÞ ≃ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ffi Ω m ð1 þ zÞ 3 þ Ω Λ p for ΛCDM, and we use Hubble constant H 0 ¼ 67.9 km s −1 Mpc −1 and matter density Ω m ¼ 0.3065 ¼ 1 − Ω Λ [238].