GWTC-2: Compact Binary Coalescences Observed by LIGO and Virgo During the First Half of the Third Observing Run

We report on gravitational wave discoveries from compact binary coalescences detected by Advanced LIGO and Advanced Virgo in the first half of the third observing run (O3a) between 1 April 2019 15:00 UTC and 1 October 2019 15:00. By imposing a false-alarm-rate threshold of two per year in each of the four search pipelines that constitute our search, we present 39 candidate gravitational wave events. At this threshold, we expect a contamination fraction of less than 10%. Of these, 26 candidate events were reported previously in near real-time through GCN Notices and Circulars; 13 are reported here for the first time. The catalog contains events whose sources are black hole binary mergers up to a redshift of ~0.8, as well as events whose components could not be unambiguously identified as black holes or neutron stars. For the latter group, we are unable to determine the nature based on estimates of the component masses and spins from gravitational wave data alone. The range of candidate events which are unambiguously identified as binary black holes (both objects $\geq 3~M_\odot$) is increased compared to GWTC-1, with total masses from $\sim 14~M_\odot$ for GW190924_021846 to $\sim 150~M_\odot$ for GW190521. For the first time, this catalog includes binary systems with significantly asymmetric mass ratios, which had not been observed in data taken before April 2019. We also find that 11 of the 39 events detected since April 2019 have positive effective inspiral spins under our default prior (at 90% credibility), while none exhibit negative effective inspiral spin. Given the increased sensitivity of Advanced LIGO and Advanced Virgo, the detection of 39 candidate events in ~26 weeks of data (~1.5 per week) is consistent with GWTC-1.

We report on gravitational wave discoveries from compact binary coalescences detected by Advanced LIGO and Advanced Virgo in the first half of the third observing run (O3a) between 1 April 2019 15:00 UTC and 1 October 2019 15:00 UTC. By imposing a false-alarm-rate threshold of two per year in each of the four search pipelines that constitute our search, we present 39 candidate gravitational wave events. At this threshold, we expect a contamination fraction of less than 10%. Of these, 26 candidate events were reported previously in near real-time through GCN Notices and Circulars; 13 are reported here for the first time. The catalog contains events whose sources are black hole binary mergers up to a redshift of ∼ 0.8, as well as events whose components could not be unambiguously identified as black holes or neutron stars. For the latter group, we are unable to determine the nature based on estimates of the component masses and spins from gravitational wave data alone. The range of candidate event masses which are unambiguously identified as binary black holes (both objects ≥ 3 M ) is increased compared to GWTC-1, with total masses from ∼ 14M for GW190924 021846 to ∼ 150M for GW190521. For the first time, this catalog includes binary systems with significantly asymmetric mass ratios, which had not been observed in data taken before April 2019. We also find that 11 of the 39 events detected since April 2019 have positive effective inspiral spins under our default prior (at 90% credibility), while none exhibit negative effective inspiral spin. Given the increased sensitivity of Advanced LIGO and Advanced Virgo, the detection of 39 candidate events in ∼26 weeks of data (∼1.5 per week) is consistent with GWTC-1.

I. INTRODUCTION
Since the discovery of gravitational waves from a binary black hole (BBH) coalescence in 2015 [1], the Advanced LIGO [2] and Advanced Virgo [3] gravitational wave detectors have opened a new window on our Universe [4][5][6][7][8]. Binary black hole observations have allowed us to probe gravity in the strong-field regime [9,10] and to establish the rate and population properties of BBH coalescences [11]. In addition to BBHs, Advanced LIGO and Advanced Virgo detected the first gravitational wave signal from a binary neutron star (BNS) coalescence, GW170817 [12], which was also the first joint detection of gravitational waves and electromagnetic emission [13,14]. Gravitational wave discoveries have had a profound impact on physics, astronomy and astrophysics [13,[15][16][17][18][19], and the public release of LIGO and Virgo data [20,21] has enabled groups other than the LIGO Scientific Collaboration and Virgo Collaboration (LVC) to perform analyses searching for gravitational wave signals [22][23][24][25][26][27][28] and to report additional candidate events in some cases.
We present the results of searches for compact binaries in Advanced LIGO and Advanced Virgo data taken between 1 April 2019 15:00 UTC and 1 October 2019 15:00 UTC. This period, referred to as O3a, is the first six months of Advanced LIGO and Advanced Virgo's eleven month long third observing run. The first Gravitational-Wave Transient Catalog (GWTC-1) of compact binary coalescences (CBC) includes candidate events observed by Advanced LIGO and Advanced Virgo during the first (O1) and second (O2) observing runs [8]. The increased sensitivity of Advanced LIGO and Advanced Virgo during O3a has enabled us to increase the number of confident gravitational wave detections more than three-fold over GWTC-1. Together, GWTC-1 and the new candidate events presented here comprise GWTC-2. Fig. 1 shows this consistent increase in both the effective binary neutron star volume-time (BNS VT) of the gravitational wave network, and the number of detections across these observing runs. The BNS VT is Euclidean sensitive volume of the detector network [29,30] multiplied by the live time of the network and should be approximately proportional to the total number of detections. Our analysis of the O3a dataset has resulted in 39 gravitational wave candidate events passing our false alarm rate (FAR) threshold of 2.0 per year. Given our use of multiple search pipelines to identify candidate events, we expect ∼ 3 false alarms, i.e. candidate events caused by instrumental noise, to be present in this catalog. It is not possible to determine with certainty which specific candidate events are due to noise; instead we provide statistical measures of false alarm rate and probability of astrophysical origin. Among these candidate events, 26 have been reported previously in real-time processing via GCN Notices and Circulars [31]. Furthermore, four gravitational wave candidate events from O3a have already been published separately due to their interesting properties: GW190425 [32] is the second gravitational wave event consistent with a BNS coalescence; GW190412 [32] is the first BBH observation with definitively asymmetric component masses, which also produced detectable gravitational radiation beyond the leading quadrupolar order; GW190814 [33] is an even more asymmetric system having a ∼23 M object merging with a ∼2.6 M object, making the latter either the lightest black hole or heaviest neutron star known to be in a double compact object system; GW190521 [34,35] is a BBH with total mass ∼ 150 M having a primary mass above 65 M at 99% credibility.
Here we present 13 candidate events for the first time along with the 26 previously reported candidates. Among the 39 candidates, we find gravitational wave emission consistent with the coalescence of BBHs, BNSs, and neutron star-black hole binaries (NSBHs).
We report on the status of the Advanced LIGO and Advanced Virgo gravitational wave detectors (Sec. II) and the properties and quality of the data taken during the analyzed period (Sec. III). We then describe the analysis methods that led to the identification of the 39 gravitational wave candidates (Sec. IV), as well as the inference of their parameters (Sec. V). Next, we report the significance of the identified candidates, as well as a comparison to the public gravitational wave alerts (Sec. VI). Finally, we discuss the properties (Sec. VII) and the reconstructed waveforms (Sec. VIII) of each event . Further interpretation of the binary population is conducted in companion papers [36,37]. We will analyze the second half of Advanced LIGO and Advanced Virgo's third observing run (O3b) in future publications.
We provide a public data release associated with the results contained in this paper at [38]. This includes the data behind the figures, the simulation data used in estimating search sensitivity, and the posterior samples used in estimating the source properties. The number of compact binary coalescence detections versus the effective volume-time (VT) to which the gravitational wave network is sensitive to BNS coalescences. The effective VT is defined as the Euclidean sensitive volume [30] of the second-most sensitive detector in the network at a given time, multiplied by the live time of that network configuration. The Euclidean sensitive volume of the network is the volume of a sphere with a radius given by the BNS inspiral range [29,30] (shown in Fig. 3) of the second most sensitive detector in the network. To account for the addition of single detector candidates in O3a, a single-detector Euclidean sensitive volume was also included, defined using the inspiral range of the most sensitive detector divided by 1.5. The effective BNS VT does not account for differences in sensitivity across the entire population of signals detected, necessary cosmological corrections, or changes to analysis pipeline efficiency between observing runs, but, as shown in this figure, is consistent with the currently observed rate of detections. The colored bands indicate the three runs, O1, O2 and O3a. The black line is the cumulative number of confident detections of all compact binary coalescences (including black holes and neutron stars) for GWTC-1 [8] and this catalog. The blue line, dark blue band, and light blue band are the median, 50% confidence interval, and 90% confidence interval of draws from a Poisson fit to the number of detections at the end of O3a. The increase in detection rate is dominated by improvements to the sensitivity of the LIGO and Virgo detectors with changes in analysis methods between observing runs being sub-dominant.

II. INSTRUMENTS
The Advanced LIGO [2] and Advanced Virgo [3] detectors are kilometer-scale laser interferometers [39]. The current generation of detectors started operations in 2015, and since then have been alternating periods of observation with periods of tuning and improvement. Since O1 [40] and O2 [8], the sensitivity and robustness of the detectors improved significantly.
The LIGO detectors underwent several upgrades between the end of O2 and the start of O3a [41]. The main laser sources were replaced to allow for higher operating powers. The LIGO Hanford detector operated with 37 W of input power and the Livingston detector with 40 W. Those levels can be compared to 30 W and 25 W during O2 for Hanford and Livingston respectively. The laser sources replacement also reduced fluctuations in the input beam pointing and size that were previously detrimental for the detector sensitivity [42]. At both LIGO detectors the two end test masses were replaced with mirrors with lower scattering losses [43], allowing for higher circulating power. Additionally, annular test masses were installed to reduce noise induced by residual gas damping [44]. In the Hanford interferometer, one of the two input test masses was also replaced, because the one which was previously installed had a large point absorber [45] that limited the amount of power that could be handled in the arm cavities.
The build up of electric charge on the test masses was also an issue during previous runs, therefore several actions were undertaken to mitigate the contribution of this noise source: electric field meters were installed in end stations to monitor the local eletric field; baffles were installed in front of the vacuum system ion pumps to mitigate charging, and a test mass discharging system was put in operation and successfully deployed on all LIGO Hanford test masses.
Parametric instabilities [46], i.e., radiation-pressureinduced excitation of the test masses' mechanical modes, also limited the maximum power allowed into the interferometer. This problem was mitigated with the installation of acoustic mode dampers [47] that reduce the mechanical quality factor of the test mass resonant modes and thus suppress parametric instabilities.
The high frequency ( 1 kHz) sensitivity of both detectors was significantly improved compared to the O2 observing run (a factor 1.68 for the Hanford detector, and 1.96 for the Livingston detector), partially due to the increased circulating power made possible by the improvements already discussed, and by the installation of squeezed light sources [48,49] to reduce the quantum vacuum noise entering the interferometers [50], making O3 the first observing run of the LIGO detectors that routinely implement quantum noise reduction techniques. GEO600 has been using the same approach since 2011 [51,52] but has not detected gravitational waves so far due to an overall too-low sensitivity.
Additionally, many beam dumps and baffles were installed at both LIGO sites, to mitigate the effect of scattered light [53] that can be the source of non-stationary disturbances. Finally, the feedback control systems for the seismic isolation and for the angular and longitudinal control of the instruments were improved, increasing the detectors' duty cycle and robustness against external disturbances. With respect to O2, the LIGO Hanford median BNS inspiral range, as defined in [29], increased by a factor 1.64 (from 66 Mpc to 108 Mpc) and the LIGO Livingston median range by 1.53 (from 88 Mpc to 135 Mpc).
Also in Virgo between the O2 and the O3a observing runs, many upgrades have been implemented to boost the sensitivity.. The most important upgrade was the replacement of the steel wires suspending the four test masses with fused-silica fibers [54] to improve the sensitivity below 100 Hz. This was achieved by changing the design of the final stage of the mirror suspension to improve the screening of the fused-silica fibers [55] from residual particles injected by the vacuum system. In parallel, the vacuum system was modified to avoid dust pollution of the environment.
Another upgrade to improve the low-frequency sensitivity was the suspension of the external injection bench (used to manipulate and steer the input laser beam into the interferometer). In this way, the seismic motion of the optics was reduced, and, consequently, the beam jitter noise contribution [56,57].
The major upgrade to boost the Virgo high-frequency sensitivity was the installation of a more powerful laser that can deliver more than 65 W output power. After some commissioning activities at different power values, the laser power injected into the interferometer was set to 19 W, improving both the sensitivity and the stability of the interferometer at the same time. The laser power was almost doubled compared to the 10 W injected during O2. 1 In this configuration, due to the marginally stable power recycling cavity, the aberration induced by thermal effects prevented a reliable and robust interferometer longitudinal control, and worsened the alignment performances. Therefore, the thermal compensation system actuators were used to stabilize the power recycling cavity. The end test masses' radii of curvature were tuned with the ring heaters, maximizing the power circulating in the arm cavities [59,60].
The squeezing technique adopted by LIGO to improve the high frequency sensitivity of the detectors, was also implemented in Virgo interferometer [61]. This system was fully operational also in Virgo for the first time during O3. To reduce the optical losses optimizing the squeezing performance, the photodiodes installed at the interferometer dark port to measure the gravitational wave signals were replaced by high quantum efficiency ones [62].
Moreover, complementary activities have been carried out in parallel to the main upgrades, reducing the overall contribution of the various technical noises. In particular, the improvement of the control strategy for the suspended benches allowed the reduction of the noise contribution below 30 Hz, the installation of baffles and diaphragms on the optical benches and inside the vac-uum tanks reduced the impact of the scattered light on the sensitivity. Finally, sources of environmental noises have been identified and removed. All these upgrades increased the Virgo median BNS range by 1.73 (from 26 Mpc to 45 Mpc) with respect to the O2 run. Fig. 2 shows representative sensitivities of the three detectors during O3a, as measured by the amplitude spectral density of the calibrated strain output. Fig. 3 shows the evolution of the detectors' sensitivity over time, as measured by the binary neutron star range. The up-time of the detectors was kept as high as possible, but was nevertheless limited by many factors, such as earthquakes, instrumental failures and planned maintenance periods. The duty cycle for the three detectors was 76% (139.5 days) for Virgo, 71% (130.3 days) for LIGO Hanford, and 76% (138.5 days) for LIGO Livingston. With these duty cycles, the full 3-detector network was in observing mode for 44.5% of the time (81.4 days). Moreover, for 96.9% of the time (177.3 days) at least one detector was observing and for 81.9% (149.9 days) at least two detectors were observing. For comparison, during the O2 run the duty cycles were 62% for LIGO Hanford and 61% for LIGO Livingston, so that two detectors were in observing mode 46.4% of the time and at least one detector was in observing mode 75.6% of the time. The goal was of course to obtain the highest possible duty cycle. The current performance is limited by planned maintenance periods and the time needed to recover the control of the interferometers after large transients triggered, for example, by earthquakes. Work is on-going at all detectors to improve the duty cycle.

III. DATA
Before analyzing LIGO and Virgo time-domain data for gravitational waves, we apply multiple data conditioning steps to accurately calibrate the data into strain and mitigate periods of poor data quality [63]. Segments of data where each interferometer was operating in a nominal state, free from external intervention, are recorded [64]. Data from outside these time periods is not used in analyses unless additional investigations are completed to understand the state of the interferometer [6,33]. The data conditioning process involves calibration of the data, both in near-real time and in higher latency; subtraction of noise from known instrumental sources; and identification of short-duration noise transients, which we refer to as glitches [65], that should be excluded from any searches for astrophysical candidates either by not considering the data containing the glitches or mitigating the glitches with methods such as gating.
For time periods containing gravitational wave candidate events, additional investigation of the data quality is completed as a part of event validation to evaluate if instrumental artifacts could impact the detection and analysis of the candidate events [66]. These investigations sometimes lead to additional data processing steps such  as modeling and subtraction of glitches beyond what is completed to mitigate glitches before searching the data.
This section further outlines the procedures used to calibrate the data collected by LIGO and Virgo, characterize the data quality, validate any identified gravitational-wave candidates, and subtract glitches that may impact the analysis of candidates.

A. Calibration and noise subtraction
The optical power variations at the gravitational wave readout ports of the LIGO and Virgo detectors are calibrated into a time series of dimensionless strain measured by the detectors before use by astrophysical analyses [67,68]. The calibration process requires data conditioning filters whose response is complex-valued, frequency-dependent, and informed by detailed modeling of the feedback control system along with the interferometric, opto-mechanical response of the detectors [69]. Some control system model parameters vary slowly with time throughout operation of the interferometer. These parameters must be monitored and, when possible, the filters are corrected in near real-time (low-latency) [70]. Other parameters may change at discrete times and cause systematic error in the data stream that cannot be accounted for in low latency. Examples of such error can arise from poorly compensated changes in electronics configurations, accidental application of incorrect control parameter values, model errors not-yet-known at the start of the observing period, and hardware problems such as failures of analog electronics within the control system. Most of these sources of error are subtle, and can only be assessed once they are measured and quantified a posteriori.
All three detectors use photon calibrator (Pcal) systems [71][72][73] for absolute reference. These reference systems are used to develop each static detector model, measure parametric time dependence, and establish residual levels of systematic error in each strain data stream once constructed. Each of these measurements required to evaluate the systematic error is done by using the Pcal systems to drive forces on the end test masses via radiation pressure, creating displacement above other detector noise. Validating the strain data stream in low latency for all time, by establishing carefully quantified estimates of the systematic error with these excitations, competes with the desire for unhampered astrophysical sensitivity. As a compromise, systematic error is measured continuously only at a select few frequencies at the edges of the sensitive frequency band of the detector with monochromatic excitations during observation. The data stream is only validated at high frequency resolution, and across the entire detection band, at roughly weekly cadence: the detectors are fully functional, but are declared out of observation mode, and swept-sinusoid and colored-random-noise Pcal excitations are driven above the noise. These measurements can only provide approximate, point-estimate bounds on the data stream's error in low-latency; they cannot reflect the error distribution for all time.
Once an observing period with a stable detector configuration has completed, estimates of the probability dis-tribution of systematic error for all observation time are created [74]. These estimates leverage the power of hindsight, the collection of measurements mentioned above, and other measurements of individual components gathered while the detector is offline. During this error characterization process, if any identified systematic error is egregious and well-quantified, where possible, the control system model and data conditioning filters are modified to remove the error. The data stream is then regenerated offline from the optical power variations and control signals, and the systematic error estimate is updated accordingly [67].
Results in this paper are derived from either lowlatency (C00) or offline, recalibrated (C01) strain data, depending on whether offline data were available and whether the results are sensitive to calibration error. Detection algorithms for gravitational wave candidates, described in Sec. IV, are insensitive to typical levels of systematic error in calibration [75], so low-latency data may be used for additional offline analyses without concern. However, if available, offline data are preferred for its improved accuracy and completeness. At the time the data were searched for candidate events, offline data were only available for a portion of O3a. The candidate events presented in this paper detected prior to 5 June 2019 were identified using LIGO offline data, whereas those from 5 June 2019 until 1 October 2019 were identified using LIGO low-latency data.
Once candidate events are found by detection algorithms using either LIGO data stream, all estimations of the candidates' astrophysical parameters use the C01 LIGO version of strain data using methods described in Sec. V. The C01 LIGO version of strain data was avail-able for the entirety of O3a at the time these additional analyses were completed. As such analyses are more sensitive to calibration error [76], it is advantageous to use the definitive characterization of error at the time of each event available with LIGO C01 data. The probability distribution of error for LIGO C01 strain data in O3a are characterized in [74]. Analysis of Virgo's collection of validation measurements during the run did not lead to a significant improvement to the low-latency strain data stream offline. As such, only low-latency strain and its bounds on systematic error from point-estimate measurements are used for all detection and astrophysical parameter estimation results presented in this paper. The bounds of systematic error of Virgo strain in O3a are reported in [77].
Numerous noise sources that limit detector sensitivity are measured and linearly subtracted from the data using witness auxiliary sensors that measure the source of the noise [42,78,79]. In O3a, the sinusoidal excitations used for calibration, and noise from the harmonics of the power mains were subtracted from LIGO data as a part of the calibration procedure [79]. This subtraction was completed for both online and offline versions of the calibration. For time periods around a subset of identified candidate events, additional noise contributions due to non-stationary couplings of the power mains were subtracted [80].
The Virgo online strain data production also performed broadband noise subtraction during O3a [77]. The subtracted noises included frequency noise of the input laser, noise introduced controlling the displacement of the beam splitter, and amplitude noise of the 56 MHz modulation frequency. An additional offline strain data set was produced for 14 September 2019 through 1 October 2019 using the same calibration as the online data but with improved noise subtraction resulting in a BNS range increase of up to 3 Mpc [81] and is used in source parameter estimation of candidate events that occurred during this time period.

B. Data quality
During O3a, the data quality was closely monitored using summarized information from the detectors and their subsystems [82,83]. Deeper studies were conducted to identify the causes of data quality issues, which enabled instrumental mitigation of the sources during the run. For example, at Livingston, glitches from a mechanical camera shutter and beats of varying radio-frequency signals were identified and eliminated. At Hanford and Livingston, strong frequency peaks that wandered in time were tracked down to the amplitude stabilizer for the laser used to provide squeezed light. At Hanford, broad features in the spectrum at 48 Hz and multiples were tracked to scattered light from vacuum chamber doors and mitigated with absorptive black glass. These studies were a part of ongoing efforts to improve the data quality and the up-time of the detectors [41,82].
For analyses of gravitational wave transients, the most common data quality issue is the presence of glitches. The rate of glitches with signal-to-noise ratio (SNR) > 6.5 in the LIGO and Virgo detectors in O3a is shown in Fig. 4. In all three detectors, the glitch rate is dominated by glitches with peak frequencies below 100 Hz. This rate was higher than in previous observing runs [66] for both LIGO detectors and was especially problematic at LIGO Livingston, where the rate of glitches was 0.8 per minute in O3a, compared to 0.2 per minute in O2. The Virgo glitch rate decreased significantly between O2 and O3a, thanks to the work done during the O2-O3a shutdown to improve the accuracy of Virgo's operating point control and to identify, fix, or mitigate several sources of noise. The increased rate of glitches in the LIGO detectors limited the overall sensitivity of searches for gravitational waves in O3a and created challenges for analysis of candidate events.
The most problematic source of glitches in O3a was caused by laser light scattered out of the main laser beam, which is reflected off walls of the vacuum systems and other equipment back into the main beam [53,[86][87][88]. Scattered light noise is correlated with periods of high seismic activity. For this reason, daily cycles of scattered light glitches were present throughout O3a, especially at LIGO Livingston, tied to ground motion driven by human activity. This noise is often visible as arch-shaped features in time-frequency [86], as shown in Fig. 5, and was present at or near the time of many of the candidate events in this catalog. A significant portion of the increase in glitch rate between O2 and O3a at the LIGO detectors can be accounted for by the increased rate of scattered light glitches [88]. A potential major source of this noise for O3a was light scattered from the goldcoated electrostatic drives mounted to the fused silica reaction masses that are suspended directly behind the LIGO test masses to provide a stable platform for lownoise actuation [88]. Changes were implemented during O3b that reduced scattered light noise entering through this path [88].
Broadband short-duration glitches also occurred often in all detectors during O3a. A sub-class of those, blip glitches [89], was one of the most problematic sources of transient noise in previous observing runs, and is still present in O3a at a rate of 1.4 per hour. These glitches are one of the limiting sources of noise for searches for gravitational waves from high mass compact binaries [90] and no sources or witnesses for the majority of these glitches have been identified. In O3a, there was also an additional population of short duration glitches with SNR > 100. These loud glitches occurred in both LIGO detectors, with unknown origin. Additional description of these glitches, along with details of potential sources that have been ruled out, can be found in [82].
Many glitches in LIGO and Virgo data have wellunderstood sources and couplings, making it possible to identify short time periods where excess power from en- The rate of single interferometer glitches with SNR > 6.5 and frequency between 10 Hz and 2048 Hz identified by Omicron [84,85] in each detector during O3a. Each point represents the average rate over a 2048 s interval. Dotted black lines show the median glitch rate for each detector in O2 and O3a. vironmental or technical sources will be present in the strain data. Flagging these time periods as containing poor data quality, either by removing the data from the search or decreasing the significance of any candidate events identified, has been shown to improve the overall sensitivity of searches for gravitational waves from compact binaries [82,91,92]. While a select number of LIGO and Virgo data quality issues are flagged in low latency, such as hardware injections and digital signal overflows, the majority of data quality flags are only available for offline searches.
Before performing searches for gravitational wave signals, periods of poor data quality are flagged at various levels, called categories [63,66,82]. In O3a, data from an observatory not operating in a nominal state are flagged (category 1) and not used in any search. Additional periods likely to contain excess power in each LIGO detec- showing the presence of scattered light glitches (modulated arches). Bottom: The same data after glitch identification and subtraction as described in Sec. III D. In both plots, the time-frequency track of the matched filter template used to identify GW190701 203306 is overlaid in orange. Investigations identified 7 candidate events in coincidence with similar scattering glitches, and required mitigation before further analysis. Despite the clear overlap of the signal with the glitch, the excess power from the glitch is successfully modeled and subtracted.
tor were flagged based on detailed follow up of identified sources of noise (category 2), statistical correlation between witness sensors (category 3), and machine-learning based predictions (iDQ) [92,93]. No additional data quality products for Virgo were used beyond category 1. The specific set of data quality products used in O3a is search-specific, as described in Sec. IV. Category 2 flags are tuned separately for searches for both gravitational waves from CBC and minimally-modeled (Burst) sources. Category 3 flags are only tuned for Burst searches. The amount of time removed by each category of veto in O3a is shown in Table I.

C. Event Validation
Event validation procedures similar to those used for previous gravitational wave candidate events [8,66,82] were used for all candidate events in this catalog to evaluate if instrumental artifacts could impact analysis. Within tens of minutes of low-latency candidate event identification, time-frequency visualizations and monitors of the gravitational wave strain data [94][95][96][97][98][99] were used to identify any data quality issues present. On the same timescale, data from hundreds of auxiliary sensors monitoring the detectors and their environments were used to identify potential auxiliary witnesses to instrumental artifacts [93,100,101]. Tools that relied upon deeper information about glitches and data nonstationarity gathered offline, such as long-term monitors of the instruments and their subsystems [83,102,103] and identification of likely sources of glitches by correlation with auxiliary sensors [85,104,105], were also used to vet candidate events in this catalog. These procedures did not identify evidence of instrumental origin for any of the candidate events in this catalog, but did identify a number of data quality issues that could potentially impact analyses of these candidate events.
Candidate events with data quality issues identified by these event validation procedures required further mitigation before analysis. In cases when glitches occurred in time coincidence with a candidate event (but could not account for the candidate event itself), additional data processing steps were completed to mitigate the effect of those glitches on estimation of the candidate event parameters. If possible, the identified glitches were subtracted using the methods described in Sec. III D. In cases when sufficient subtraction was not possible, customized configurations of parameter estimation analyses were used to exclude the time period or frequency bandwidth impacted by glitches. An example of a glitch coincident with a signal that required glitch subtraction is shown in Fig. 5. While the presence of excess power from transient noise did not prevent confident identification of this event, glitch subtraction was required before the source properties of the event could be evaluated. Although only data recorded from detectors in observing mode were used to identify candidate events in this catalog, some candidate events occurred at times when one detector in the network was operating, but not in observing mode. If it was concluded that data from the additional detector would substantially impact the scientific conclusions reached from analyzing the candidate, the additional data were investigated. For those cases, the data quality and calibration for the non-observing detector was evaluated to determine whether the data could be used in the estimation of candidate event source properties. Such data were used for one candidate event, GW190814 [33]. The full list of candidate events requiring specific mitigation steps, due to either the presence of glitches or the state of a detector, is found in Sec. VII. The total number of candidate events requiring mitigation is consistant with the number expected based on the glitch rate in each detector.

D. Glitch subtraction
Data containing gravitational wave candidate events and glitches in the same time-frequency volume are pre-processed through a glitch-subtraction procedure prior to being analyzed by the parameter estimation pipelines. The glitch-subtraction procedure evolved from the BayesWave (BW) algorithm [106][107][108] used for glitch subtraction in the Livingston detector at the time of the GW170817 binary neutron star merger [12,109], where the non-Gaussian, incoherent, noise was modeled as a linear combination of wavelets which was subtracted from the data. The number of wavelets used in the fit was determined using a trans-dimensional Markov chain Monte Carlo (MCMC) algorithm that balances using fewer wavelets against the quality of the fit [106].
To prevent the glitch-subtraction procedure from corrupting the signal candidate event, the time segment and bandwidth of the wavelet-based analysis are chosen, when possible, to exclude from subtraction the strongest part of the signal. For cases where the signal and glitch overlapped in time-frequency space, a more robust application of the glitch-subtraction algorithm was used which simultaneously fits for the signal and the glitch. In the signal-plus-glitch application, signal wavelets are included in the model if they are coherent over the detector network (marginalizing over sky location, etc.), while the glitch wavelets are independent in each detector [107]. Only the glitch model wavelets are then used in the subtraction. The signal-plus-glitch procedure was tested by injecting simulated coherent BBH signals onto known single detector glitches from O2 and verifying that the signals were unaffected in the process.
The glitch-subtraction procedure is only used as a preprocessing step for the parameter estimation analysis (described in Sec. V), and is not part of the analyses that determine the presence, or significance, of gravitational wave candidate events. As shown in Fig. 5, the glitch subtraction methods described here are able to successfully remove excess power caused by glitches present near the time of candidate events.

IV. CANDIDATE IDENTIFICATION
Candidate identification happens on two timescales. First, five low-latency gravitational wave pipelines [110][111][112][113][114] process the data immediately after acquisition with the goal of generating public detection alerts to the broader astronomical community within minutes [115]. Second, an offline reanalysis of gravitational wave data is conducted to produce the curated candidate event list here. The offline analysis may benefit from updated data calibration, data quality vetoes, the ability to estimate event significance from the full data, and further algorithmic development that takes place over the course of an observing run. Although the candidate events presented here are derived from offline analysis, we provide a comparison with the public alerts in Sec. VI. Candidates are identified using two methods. The first method searches for minimally-modeled sources. The second method searches for signals from a bank of template waveforms [116] modeled after the expected gravitational wave emission from coalescing compact binaries in general relativity. In Sec. VI, we present results from one search for minimally-modeled transient sources, Coherent WaveBurst (cWB) [111,[117][118][119][120], and two searches for modeled sources, GstLAL [110,121,122] and Py-CBC [29,[123][124][125][126]. cWB, GstLAL, and PyCBC were also three of the five low-latency pipelines in O3. The two remaining low latency pipelines, MBTAOnline [113,127] and SPIIR [114] were not configured for offline reanalysis at the time of this publication and are therefore not included in GWTC-2. Below, we summarize the methods used by each of cWB, GstLAL, and PyCBC to identify candidate events.
A. Coherent WaveBurst search for minimally-modeled transient sources cWB is a search pipeline for detection and reconstruction of transient GW signals that operates without a specific waveform model [128] and was used in previous searches by the LVC [1,8,129]. cWB identifies coincident signal power in multiple detectors, searching for transient signals with durations up to a few seconds in the detector bandwidth. The analysis is performed on the time-frequency data obtained with the Wilson-Daubechies-Meyer wavelet transform [117,130] and normalized by the amplitude spectral density of the detector noise. cWB selects the time-frequency data samples above fluctuations of the detector noise and groups them into clusters. For clusters correlated in multiple detectors, cWB reconstructs the source sky location and signal waveforms with the constrained maximum likelihood method [111]. The signal SNR is estimated from the signal waveforms reconstructed by cWB, and the network SNR is calculated combining the SNRs of individual detectors.
The cWB detection statistic is based on the coherent energy E c obtained by cross-correlating the normalized signal waveforms reconstructed in different detectors. It is normalized by a chi-squared statistic where E n is the residual noise energy estimated after the reconstructed waveforms are subtracted from the data, and N df is the number of independent wavelet amplitudes describing the event. The cWB detection statistic is η c ∝ [E c / max(χ 2 , 1)] 1/2 , where the χ 2 correction is applied to reduce the contribution of non-Gaussian noise. To improve the robustness of the algorithm against glitches, cWB uses signal-independent vetoes, which reduce the FAR of the pipeline; this includes category 2 Burst data quality flags in the processing step and hierarchical vetoes in the post-production phase [104,131].
Other vetos applied to candidate events are on the network correlation coefficient c c = E c /(E c + E n ) and the χ 2 . To further reduce the background, the cWB analysis employs additional signal-dependent vetos based on the properties of the time-frequency evolution of compact binary signals: a) the frequency of the signal is increasing in time [132] and b) the central frequency of the signal f c is inversely proportional to the total mass of the system [133]. cWB searches are performed with two pipeline configurations targeting detection of high-mass (f c < 80 Hz) and low-mass (f c > 80 Hz) BBH systems. They use different signal-dependent vetos defined a priori to alleviate the large variability of non-stationary noise in the detectors' bandwidth. We estimate the significance of candidate events by systematically time-shifting the data of one detector with respect to the other in each detector pair, with a time lag so large that actual astrophysical events are excluded, and repeating this for a large number of different time lags over a total time T bkg . We count the number of events N due to instrumental noise that have a ranking statistic value such as the SNR that is at least as large as that of the candidate event and we compute the FAR as the number of background events divided by the total background time [134]. The detection significance of a candidate event identified by either configuration in a single frequency range is determined by its FAR measured by the corresponding cWB configuration. Whenever the low-mass and high-mass configurations overlap, the trials factor of two is included to determine the final FAR [34]. In the end, each configuration reports the selected candidate events and their FAR.
The sensitivity of the cWB search pipeline approaches that of matched-filter methods for coalescing stellar mass BBHs with high chirp masses, where most of the signal energy is concentrated in just a few wavelets of the cWB representation [135]. It is less competitive for low chirp mass events, where the signal power is spread over large time-frequency areas. cWB can also detect sources that are not well represented in current template banks (e.g., eccentric systems or high-mass ratio precessing systems) [136]. Tests with cWB showed that the detection efficiency for a given FAR threshold is slightly smaller with the inclusion of Virgo. Therefore, also to reduce computing time, all cWB detection candidates and waveform consistency tests reported in this catalog use the Hanford-Livingston network only.
The matched filter method relies on a model of the signal, dependent on the source physical parameters. Most important for the phase evolution of the source (and therefore the matched filter) are the intrinsic parameters: two individual component masses m 1 , m 2 , and two dimensionless spin vectors χ {1,2} , where the dimensionless spin is related to each component's spin angular momentum S by χ i = c S i /(Gm 2 i ). We also make use of combinations of these intrinsic parameters that are typically well-constrained by gravitational wave measurements; the binary chirp mass [138], determines to lowest order the phase evolution during the inspiral, and is typically better constrained than the component masses. At higher orders, the mass ratio q = m 2 /m 1 (where m 2 ≤ m 1 ) and effective inspiral spin χ eff affect the binary phase evolution. The effective inspiral spin is defined as [139]: where M = m 1 + m 2 is the total mass andL N is the unit vector along the Newtonian orbital angular momentum. The spin tilt angle for each component object θ LSi = cos −1 χ i ·L N /| χ i | quantifies the angle between the orbital angular momentum vector and its spin vector. Since the spin and angular momentum vectors vary if the system precesses, by convention we use the spin parameters at a reference frequency of 20 Hz, with the exception of GW190521 where we use 11 Hz for consistency with previous publications [34,35]. Additional intrinsic parameters are needed to describe eccentricity, which we assume to be zero in our modeled analyses. The timescale for circularization of isolated binaries with non-zero eccentricity at birth is sufficiently short that sources are expected to have negligible eccentricity when they enter the sensitive bands of the LIGO and Virgo detectors [140]. However, dynamically formed binaries may have residual eccentricity as the signal enters the sensitive band of the detector. These systems have been the target of unmodelled searches of previous observing runs, but with no candidate events reported [136]. Seven extrinsic parameters provide the orientation and position of the source in relation to the Earth: the luminosity distance D L , two-dimensional sky position (right ascension α and declination δ), inclination between total angular momentum and line-of-sight θ JN , time of merger t c , a reference phase φ, and polarization angle ψ.
In general, as the signal travels from the source to the detector its frequency is redshifted by a factor (1 + z).
For a system involving only black holes, the observed signal is identical to that from a source in the rest frame of the detector with total mass M det = (1 + z)M [141,142]. For convenience, the templates used by the modeled searches are defined in the rest frame of the detectors which subsumes the factor (1 + z) into the definition of masses.
For this work, the GstLAL analysis used a template bank with component masses between 1 M and 400 M with total masses, M det , between 2 M and 758 M and spins that are aligned or anti-aligned with the binary's orbital angular momentum, such that only the spin components χ i,z = χ i ·L N are non-zero. The bank was constructed in five regions via a stochastic placement algorithm [143,144] satisfying different minimal match (mm) [116] criteria with waveforms starting at f min as described in Table II. Template placement was augmented to improve the collection of background statistics in the last region shown in Table II by a grid of templates distributed uniformly in the logarithm of component mass to improve detection efficiency for systems with primary mass m det 1 above 50 M [8,145]. The TaylorF2 waveform approximant [138,[146][147][148][149][150][151][152][153][154][155]. was used for templates with M det < 1.73 M and the SEOBNRv4 ROM waveform approximant [156] for templates with M ≥ 1.73 M .
The PyCBC analysis used a template bank covering the same parameter space as for GWTC-1 [8] shown in Figs. 3 and 7 of [157]. Unlike the previous work, the template bank here was created using a hybrid geometric-random method described in [158,159]. This new method provides a more efficient template bankin terms of covering the full parameter space with fewer template waveforms-than the stochastic method [143,144]. This bank is broadly similar to the parameter space covered in the GstLAL search described above with a key difference being that only templates longer than 0.15 s were kept. Details about this cut and its effect on the explored mass and spin range can be found in [157].
Both GstLAL and PyCBC scan data from each gravitational wave detector against the above-described banks of template waveforms to produce SNR time-series [29]. The SNR time series are maximized over short time windows to produce a set of triggers for each template and each detector. Triggers that pass an SNR threshold of 4 in one detector form the basis of candidate events according to the procedures for each pipeline described below. PyCBC removes the time period during category 2 veto flags from the final results, while GstLAL uses only iDQ for single-detector triggers and no data quality products for coincident triggers.
GstLAL defines a candidate event as consisting of triggers from one or more gravitational wave detectors ranked by the SNRs of the triggers, signal-consistency tests, time delays between each detector, phase differences between detectors, the (possibly zero) timeaveraged volumetric sensitivity of each detector, and the signal population model. These parameters are used as variables in the likelihood-ratio ranking statistic L [110,121,122,160] which is a monotonic function of the inverse false alarm probability [161].
There are two differences between the ranking statistic used here and in O2 [8]. First, we implemented a template likelihood p(T|signal, SNR) [162], which is the probability that a trigger is recovered by a template T, given the trigger SNR and that the signal belongs to some population. Previous versions of GstLAL approximated p(T|signal, SNR) by a constant in L [110,121,160], implying all templates were equally likely to recover a signal. Now, the template likelihood is informed by the template bank (to account for the fact that templates are not uniformly distributed in parameter space [163]) and a signal-population model, which for this search considers θ = {m det 1 , m det 2 , χ 1,z , χ 2,z } and is given by The distribution p( θ|signal) is deliberately broad to minimize the number of missed signals. Second, single-detector candidate events are ranked using both an empirically determined penalty and information from iDQ [93] as described previously. The penalty from iDQ is added to the denominator of L. Candidates from the GstLAL search had their likelihood ratios and significance estimated using the entire ∼ 6 month data set.
PyCBC identifies candidate events by requiring triggers in both LIGO Hanford and LIGO Livingston with a time delay smaller than the light travel time between observatories. These candidate events are then ranked using a set of signal-based vetoes, data quality information, and by comparing the properties of the event against those expected from astrophysical signals [126]. A FAR is then computed for each of these candidate events by estimating the background noise distribution using timeshifted analyses similarly to cWB [29,125,164], triggers from LIGO Livingston being timeshifted relative to LIGO Hanford by multiples of 0.1 s. Virgo data were not searched with the PyCBC pipeline due to three-detector searches not being completely integrated in the version of code used.
A focused search for BBH coalescences [27] is also used here, denoted later as PyCBC BBH. This was motivated by the fact that all signals observed in O1 and O2, with the single exception of the binary neutron star merger GW170817, were consistent with BBH coalescences with mass ratio close to 1 and effective inspiral spins close to 0. The full parameter space search used for GWTC-1 [8], in contrast, was tuned to observe signals anywhere in the possible space of signal parameters, which might include signals that do not have very high matches with search templates. The PyCBC BBH search uses a recently developed detection statistic [27,165] which includes a number of tuning choices to reject triggers that do not match the filter waveforms well, and also includes a template weighting implementing a prior that signals detectable in any given range of SNR are uniformly distributed in chirp mass. This search enabled PyCBC to identify more BBHs in the O1 and O2 datasets than reported in the GWTC-1 paper [8,27]. This included some of the BBHs first reported in [23,25] by independentlydeveloped searches [24,166]. We use the focused BBH search in this work to better extract BBH coalescences from the data: this search considers only a reduced set of filter templates defined prior to the analysis of O3 data, namely systems with mass ratio q > 1/3, and with both component masses (in detector frame) larger than 5 M .

C. Estimation of modeled searches sensitivity
In order to estimate the sensitivity of the GstLAL and PyCBC searches, we conducted a campaign of simulated signals injected into the O3a gravitational wave data and analyzed by both matched filter pipelines. The simulated population, intended to cover (or over-cover) the detected population of stellar-mass BBHs [11,37], contains component masses m 1 , m 2 between 2 M and 100 M and extends out to a maximum redshift of 2.3. In order to reduce statistical uncertainties, the mass, spin and redshift distributions should be sufficiently similar to population models for which we intend to estimate merger rates: see [37] (Appendix A) for further discussion of selection functions in population inference. For the sim- (for m 2 < m 1 ), and χ i,z values distributed uniformly between −0.998 and 0.998. The cosmological distribution of sources simulates a merger rate in the comoving frame that evolves as R(z) = R(0)(1 + z) 2 , thus the source redshift distribution follows p(z) ∝ (1 + z)dV c /dz, where V c is the comoving volume (see, e.g., [167] for further discussion of cosmological effects). The simulation set was generated in two stages: first, points were chosen according to this distribution out to z = 2.3; then, these 7.70 × 10 7 samples were reduced to a set of potentially detectable signals by imposing that the expected LIGO Hanford-LIGO Livingston network SNR, calculated using repre-sentative noise power spectral densities (PSDs), be above a threshold of 6. The ∼157,000 signals remaining after this cut were assigned merger times ranging uniformly over the duration of O3 for analysis by the searches. The SEOBNRv4 opt aligned-spin waveform model [156] was used for the simulated signals.
The expected number of signals from such a population detected by a given analysis may be written aŝ where R(0) is the rate of signals per unit volume and unit observing time at present, and V is the effective surveyed hypervolume, a measure of analysis sensitivity for the injected population. 2 Since each analysis recovered over 10 4 injections, statistical counting uncertainties in the hypervolumes V are at the sub-percent level. Our estimates of sensitivity are, though, affected by possible systematic uncertainties in calibration of the strain data. Strain calibration affects the detectability of simulated signals via the magnitude of the response function, which is affected by uncertainties of at most a few percent over the frequency range where the SNR of binary merger signals is accumulated; see [74] for details. The hypervolume V surveyed by each analysis, at a detection FAR threshold of 2.0 per year, is 0.456 Gpc 3 yr for GstLAL, 0.296 Gpc 3 yr for PyCBC, and 0.386 Gpc 3 yr for PyCBC BBH. For a combination of all matched filter analyses presented in this catalog, with any injection found in one or more analysis below the FAR threshold considered as detected, we find a surveyed hypervolume 0.567 Gpc 3 yr. Full results from the injection campaign are available via a data release at [168].

D. Estimation of signal probability
For each candidate event, the probability of origin from an astrophysical source p astro and corresponding probability of terrestrial noise origin p terr = 1 − p astro may be estimated using the outputs of search pipelines. We obtain these probabilities for the candidate events in this catalog consistent with a BBH, via the Poisson mixture model formalism [169] used in O1 [40,167,170]. Only BBH candidate events are considered because, other than GW190425 [171] whose component masses are consistent with those of NSs, all significant detections can be classified as BBHs. 3 A low significance candidate NSBH event, GW190426 152155, was reported in low-latency [172]; however, its astrophysical probability is strongly dependent on prior assumptions of the rate of such signals. We therefore do not estimate its p astro here.
We start by collecting the ranking statistics x = {x 1 , x 2 , . . . , x N } of all candidate events more significant than a predefined threshold: for the GstLAL pipeline events are thresholded on FAR, while for PyCBC, a ranking statistic threshold is applied. The threshold for Gst-LAL is chosen to ensure that the total number of background events considered exceeds the number of signals by a large (∼ 100) factor, which enables an accurate estimate of the total rate of background events above threshold. On the other hand, PyCBC estimates the background rate from time-shifted analyses, as outlined above in Section IV B, thus the requirement to include a large number of background events is relaxed. The statistic threshold is then set low enough to include (at least) all events with p astro 0.1.
Additionally, for both GstLAL and PyCBC searches, a threshold of 4.35 M is applied on the chirp mass of the templates, corresponding to a 5 M + 5 M binary, ensuring that the selected candidate events have template masses consistent with those of putative BBHs. Using the distribution of ranking statistics x under the foreground model, f (x) = p(x|signal), and the distribution under the background model, b(x) = p(x|noise), estimated by each matched filter pipeline, we assign a Bayes factor k(x) = f (x)/b(x) to each event. Assuming that foreground and background triggers are drawn from independent Poisson processes, one can then calculate the posterior over the Poisson expected counts for each process, Λ 1 and Λ 0 .
For the PyCBC searches presented here we proceed as for O1 and O2 [8] and estimate foreground and background event densities empirically for all putative BBH candidate events with ranking statistic above a given threshold. The PyCBC full parameter space and BBH focused searches differ in how their ranking statistics are calculated: for the full search [126], a threshold of 7.9 is applied to the ranking statistic; while for the BBH search [27,165], a threshold value of 9 is applied. We empirically measure the rate of noise events satisfying these cuts via time-shifted analyses, and infer the posterior over the rate of signals; finally we marginalize over the signal rate to obtain probabilities of astrophysical and terrestrial origin for each event [173]. Since the Py-CBC BBH search is more sensitive to realistic BBH signal populations, implying a more accurate estimate of the relative densities of signal and noise events within its targeted mass region, we consider the p astro values from the BBH analysis to be more accurate for events recovered by both searches.
For the GstLAL analysis, we estimate the astrophysical and terrestrial probabilities, p astro (x| x), from the joint posterior on the Poisson expected counts, p(Λ 0 , Λ 1 | x), where the set of triggers x have a chirp mass M > 4.35 M , and a FAR < 8766 yr −1 . The prior used to construct the joint counts posterior is taken to be the corresponding posterior from O1 and O2 [174].

V. ESTIMATION OF SOURCE PARAMETERS
Once triggers of interest have been identified, the physical parameters of the candidate event gravitational wave signals are inferred by computing their posterior probability density functions. The uncertainty in the source parameters is quantified by the posterior probability distribution p( ϑ| d), which is calculated using Bayes' theorem as where p( d| ϑ) is the likelihood of the data given the model parameters ϑ, and π( ϑ) is the prior probability distribution for the parameters. The likelihood is calculated from a coherent analysis of data from each of the detectors. As in our previous analyses, e.g., [175], we assume that the noise can be treated as Gaussian, stationary, and uncorrelated between detectors [63,176] in the stretch of data used to calculate the likelihood and to measure the noise PSD. This yields a Gaussian likelihood [142,177] for the data from a single detector, is the waveform model calculated at ϑ projected on the ith detector and adjusted to account for the uncertainty in offline calibration described in Sec. III A. The noiseweighted inner product a|b [142,178] requires specifying the frequency range in which the analysis is performed as well as the noise PSD.
Upon detection of a binary merger, exploratory analyses are first conducted to identify which models and settings are most suitable for use in our production analyses. In general, we use a low frequency cutoff of f low = 20 Hz, unless data quality requirements at the time of specific candidate events are different, in which case a specific range is noted in Sec. VII. The high frequency cutoff is always equal to the Nyquist frequency of the analysis for each event, which is determined by the sampling rate. The sampling rate is tailored for specific candidate events, since the signals (especially the higher-mass BBH signals) do not require the full bandwidth available at the native sampling rate of 16 kHz. The PSD characterizing the noise at the time of each event is measured by BW using the same data that is used for likelihood computation [179,180]. We then obtain the final joint likelihood over all detectors by multiplying together the likelihood from each detector in Eq. (5).
The gravitational wave signal emitted by a circularized compact binary composed of two black holes depends on fifteen unknown parameters, defined in Sec. IV B. The initial masses and spins of the inspiraling black holes determine the peak gravitational wave luminosity and mass and spin of the post-merger remnant black hole, which we calculate from fits to numerical relativity (NR) [5,[181][182][183][184][185]. When one or both objects are neutron stars, matter effects modify the binary inspiral and are included via the dimensionless quadrupole tidal deformability Λ i , adding one extra parameter for each neutron star in the binary. Other matter effects, such as octupolar and higher tidal deformabilities, non-black hole spin-induced multipole moments, and f -mode resonances, are parameterized by the Λ i using quasiuniversal relations [186]. The dominant tidal contribution to the waveform is given by the dimensionless tidal deformability parameter [187,188] Λ = 16 13 Non-spinning black holes have Λ i = 0 [189,190], and the waveform models we use adopt the convention that this is true for all black holes [191,192]. We do not calculate final masses and spins when matter effects are included in the analyses, as this requires an accurate prediction of the ejected mass following possible tidal disruption, taking into account the equation of state and uncertain details of the dynamics of the merger.

A. Waveform models
We characterize the detected binaries using multiple waveform models, each of which uses a different set of modeling techniques and includes different physical effects. For every event with inferred component masses above 3 M in preliminary analyses, we perform production parameter estimation runs using a subset of BBH waveforms. IMRPhenomPv2 [193,194,201] is a phenomenological model for gravitational waves from precessing BBH systems, calibrated to NR and using an effective single-spin description to model effects from spinprecession [213]. SEOBNRv4P [156,198] is based on the effective-one-body (EOB) formalism [214,215] and calibrated to NR, with a generic two-spin treatment of the precession dynamics. These models rely on twisting up procedures, where aligned-spin, NR-calibrated waveform models defined in the co-precessing frame are mapped (through a suitable frame rotation) to approximate the multipoles of a precessing system in the inertial frame [216][217][218][219][220]. These models do not include contributions to the strain from spherical harmonic modes beyond = 2, so we also analyze each event with at least one of the following models that incorporate higher-order multipole (HM) moments and precession effects: IMR-PhenomPv3HM [202,203], SEOBNRv4PHM [198,199] and NRSur7dq4 [204]. IMRPhenomPv3HM (based on IMRPhenomHM [195]) and SEOBNRv4PHM (based on SEOBNRv4HM [196]) both rely on the twisting-up approach described above. NRSur7dq4 is a surrogate waveform model for BBH systems that directly interpolates a large set of precessing NR simulations. Unlike the other two HM models, NRSur7dq4 waveforms are restricted by the length of the NR simulations in the training set, covering only ∼ 20 orbits before merger.
Any sources with evidence for at least one binary component below 3 M are characterized using several waveforms capable of modeling matter effects. For the BNS system GW190425 [32] we use the following: IMR-PhenomD NRTidal and IMRPhenomPv2 NRTidal [205,206], which are based on the BBH models IMRPhenomD and IMRPhenomPv2 respectively, and incorporate NR and tidal EOB-tuned contributions from tidal interaction as well as equation-of-state dependent self-spin effects; TaylorF2 [138,[146][147][148][149][150][151][152][153][154][155], which describes waveforms from the inspiral of non-precessing compact binaries, with matter effects derived in the post-Newtonian formalism, including quadrupole-monopole coupling parameterized in terms of the tidal deformabilities [186,221,222]; TEO-BResumS [207], an aligned-spin EOB model that incorporates post-Newtonian and self-force contributions to the tidal potential; and finally a frequency-domain surrogate model of aligned-spin SEOBNRv4T waveforms [156,[208][209][210], which were derived in the EOB approach and include dynamical tides.
SEOBNRv4 ROM NRTidalv2 NSBH uses SEOBNRv4 ROM as the BBH baseline, while IMR-PhenomNSBH employs phase evolution from IMR-PhenomD [193,194] and amplitude from IMRPhe-nomC [223]. Both models contain a phenomenological description of the tidal effects tuned to NR simulations [224] and include corrections to the amplitude through inspiral, merger, and ringdown to account for a possibility of tidal disruption.
To account for the systematic uncertainties in the waveform models, we combine equal numbers of posterior samples (described in Sec. V B) from all parameter estimation runs for an event that use waveforms with comparable physics. This treats the constituent waveform models in the combined results as having equal weight rather than weighting them by marginal likelihood as suggested in [225]. Table III shows the waveforms employed in this work, the keys under which we group results using these waveforms, and descriptions of the physical effects incor-porated in the models.

B. Sampling methods
We use several methods to draw samples from the posterior distributions on source parameters using the models described above. The LALInference [177] package was used for most analyses presented in this paper. This package provides two independent stochastic sampling algorithms: a MCMC algorithm and a nested sampling [226] algorithm. We employ LALInference's nested sampling algorithm for most of the BBH analyses performed with the IMRPhenomD and IMRPhe-nomPv2 waveforms, and the MCMC algorithm for those performed with SEOBNRv4P. However, the serial nature of these methods makes them unsuitable for use with some of the more computationally costly waveform models, such as waveforms with HMs and precession effects, especially for long-duration signals. For these, we also use RIFT, which performs a hybrid exploration of the parameter space split into intrinsic and extrinsic parameters [227][228][229], and Parallel Bilby, based on a distributed implementation of nested sampling [230][231][232][233], which was also used in previously published analyses of GW190412 [234], GW190425 [32], and GW190814 [33]. The raw posterior samples from the analyses described above are then collated to a common format using the PESummary package [235].

C. Priors
Each event is analysed independently using a prior distribution on the source parameters that is chosen to ensure adequate sampling of the parameter space and simplicity in using the posterior samples for further analyses. We choose a prior that is uniform in spin magnitudes and redshifted component masses, and isotropic in spin orientations, sky location and binary orientation. The prior on luminosity distance corresponds to a uniform merger rate in the co-moving frame of the source, using a flat ΛCDM cosmology with Hubble constant H 0 = 67.9 km s −1 Mpc −1 and matter density Ω m = 0.3065 [236]; this physically motivated prior differs from that used in previous published results which used a prior ∝ D 2 L . However, our data release includes parameter estimation samples for both the flat-in-comoving-volume and ∝ D 2 L priors. For details on the conversion see Appendix C. Intrinsic source masses are computed by dividing the redshifted masses measured in the detector frame by (1 + z), where z is calculated using the same cosmological model.
For the LALInference and Parallel Bilby analyses, we marginalize over uncertainty in the strain calibration. The calibration errors in amplitude and phase are described by frequency-dependent splines, whose coefficients are allowed to vary alongside signal parameters in the inference. The prior distribution on the calibration error at each spline node is set by the measured uncertainty at each node [175]. Table IV presents the results from each of cWB, Gst-LAL, and PyCBC passing a FAR threshold of 2.0 yr −1 ; the full gravitational wave name encodes the UTC date with the time of the event given after the underscore. GW190521 [34,35], GW190425 [32], GW190412 [234], and GW190814 [33] were published previously and these names are used here verbatim. The 2.0 yr −1 threshold was chosen to be higher (more permissive) than the threshold used for public alerts, 1.2 yr −1 , 4 but sufficiently low to provide an expected contamination fraction below 10%. An extended candidate event list, which may contain marginally significant triggers, will be provided later once final data calibration and quality checks are available. Unlike GWTC-1 [8], a separate p astro threshold was not applied, however, p astro is greater than 50% for all candidate events for which p astro was calculated in this work, satisfying the same criteria as GWTC-1. 5 Among the 39 reported candidate events passing the FAR threshold of 2.0 yr −1 , 15 were detected by cWB, 36 candidate events were detected by GstLAL, and 27 candidate events were detected by PyCBC; 25 candidate events were recovered by at least two pipelines. Given the FAR threshold and number of candidate events detected, the expectation value for noise events is < 4 in this list. Based on the FAR or the probability of being a signal described in Sec. IV D, GW190426 152155, GW190719 215514, and GW190909 114149 are the most likely to be noise among the candidate event list. cWB recovered fewer candidate events than either Gst-LAL or PyCBC. This is expected because cWB has highest sensitivity for short duration, high mass signals, and its sensitivity decreases for lower mass systems with longer duration. cWB also required candidate events to be found in coincidence between at least two detectors.

VI. CANDIDATE EVENT LIST
The difference in candidate event recovery between GstLAL and PyCBC is primarily due to PyCBC analyzing only times when both LIGO Hanford and LIGO Livingston were operating and requiring signals to be observed in both. PyCBC did not analyze Virgo data due to the fact that the code version used for this catalog had not fully integrated 3-detector analysis including Virgo. GstLAL analyzed LIGO and Virgo data and allowed for the detection of candidate events from one, two, or three gravitational wave detectors. These algorithmic choices account for the GstLAL-only detection of GW190424 180648, GW190425, GW190620 030421, GW190708 232457 and GW190910 112807, which were detected above the required SNR threshold only in the LIGO Livingston detector, and for GW190630 185205, GW190701 203306, and GW190814, where the inclusion of Virgo was essential for determining event significance. The difference in candidate event recovery between Py-CBC and GstLAL is also consistent with the results of the simulation presented in Sec. IV C. After accounting for differences in analyzed data, the GstLAL and Py-CBC methods detect a comparable number of candidate events.
The remaining differences between the candidate event lists from GstLAL (GW190426 152155, GW190527 092055, GW190909 114149, and GW190929 012149) and PyCBC (GW190413 052954, GW190514 065416, and GW190719 215514) arise from the low SNR of each event (SNR 10). Small fluctuations in SNR caused by different PSD estimation and data segmentation between pipelines leads to differences in significance estimation. This list of low SNR candidate events, which were only identified by one pipeline, also contains the three candidate events with the highest minimum FAR among the pipelines (GW190426 152155, GW190719 215514, and GW190909 114149) which have the highest likelihood among the full candidate event list of being caused by noise.
Since 2 April 2019 20:00 UTC, the LVC produced automated, public preliminary GCN Notices for gravitational wave candidate events appearing in two or more interferometers with FARs less than 6 per year before a multiple analysis trials factor was applied resulting in an effective threshold of 1.2 yr −1 [244]. On 11 June 2019, this was extended to include gravitational wave candidate events appearing in only one interferometer and satisfying the same FAR threshold. During O3a, 33 candidate events were disseminated as plausible astrophysical signals; 7 were not recovered above the threshold considered in this work. 6 S190510g, S190718y, S190901ap, S190910d, S190910h, S190923y, and S190930t [237][238][239][240][241][242][243] are the 7 candidate events disseminated via GCNs which are not recovered here. S190718y, S190901ap, S190910h, and S190930t were initially identified as single-detector candidate events (with an SNR above threshold in only one detector) with FARs of 1.14 yr −1 , 0.22 yr −1 , 1.14 yr −1 , and 0.47 yr −1 , respectively. Relaxing the demand for coincident observation across interferometers allows LVC analyses to report on additional astrophysically interesting candidate events [171,245], but also removes a powerful check on the search background and leads to larger uncertainties in the FAR. All public alerts were subsequently 6 https://gracedb.ligo.org/superevents/public/O3/ followed-up in low-latency to assess whether the analysis pipelines and detectors were operating as expected. The low-latency followup of S190718y, S190901ap, S190910h, and S190930t did not uncover any reason to retract these candidate events based on data quality. However, after offline re-analysis with additional background statistics, these four single detector candidate events are no longer significant enough to merit inclusion in Table IV.
The remaining three public alert candidate events not recovered here-S190510g, S190910d, and S190923ywere found in coincidence in low-latency, albeit at modest significance. S190510g was found in low-latency by GstLAL and assigned a FAR of 0.28 yr −1 . There were initially data quality concerns with S190510g [246] and offline follow-up using an additional 24 hours of background collection revealed the candidate event to be less significant than originally estimated [247]. Comparison to the full O3a background corroborates that the candidate event no longer passes the FAR threshold of 2.0 yr −1 . S190910d was identified in low-latency by the SPIIR compact binary search pipeline [114] with a FAR of 0.12 yr −1 . The compact binary search pipeline MBTAOnline also recorded a low significance candidate event at this time. However, the candidate event was not observed in low-latency or offline by GstLAL, PyCBC or cWB as significant. Presently, MBTAOnline and SPIIR are configured to run in low-latency only. S190923y was reported in low-latency by PyCBC and assigned a FAR of 1.51 yr −1 . GstLAL and MBTAOnline also recorded low significance candidate events at this time. No pipeline retains this candidate event in the offline analysis below the threshold of 2.0 yr −1 and it is therefore excluded from Table IV. These and other sub-threshold events will be explored further in a future publication.
The remaining 26 public alerts are recovered in our offline analysis and included in the candidate event list presented in Table IV. The table also includes 13 gravitational wave detections not previously reported. Four of these detections, GW190424 180648, GW190620 030421, GW190708 232457 and GW190910 112807, had detection-level SNR (≥ 4) in only one of the two LIGO detectors, making them single-detector observations from the perspective of candidate event significance. The incorporation of iDQ data quality into event ranking, combined with tuning of the signal consistency tests to further reject the O3a glitch background, improved the sensitivity of the offline GstLAL analysis to single-detector candidate events compared to the low-latency configuration, which accounts for these new discoveries. Nine new detections were observed in two or more interferometers and appear for the first time in Table IV: GW190413 052954, GW190413 134308,  GW190514 065416,  GW190527 092055,  GW190719 215514,  GW190731 140936,  GW190803 022701, GW190909 114149, and GW190929 012149. Several of these detections were observed in low-latency, but did not meet the criteria for public release. They all exhibit moderate network Name Inst. cWB GstLAL PyCBC PyCBC BBH FAR (yr −1 ) SNR * FAR (yr −1 ) SNR pastro FAR (yr −1 ) SNR * pastro FAR (yr −1 ) SNR * pastro GW190408 181802 HLV < 9.5 × 10 −4 14.8 < 1.0 × 10 −5 14.7 1.00 < 2.5 × 10 −5 13.5 1.00 < 7.9 × 10 −5 13.6 1.00 GW190412 HLV < 9.5 × 10 −4 19.7 < 1.0 × 10 −5 18.9 1.00 < 3.1 × 10 −5 17.9 1.00 < 7.9 × 10 −5 17.  TABLE IV. Gravitational wave candidate event list. We find 39 candidate events passing the FAR threshold of 2.0 yr −1 in at least one of the four searches. Except for previously published events, the gravitational wave name encodes the UTC date with the time of the event given after the underscore. Bold-faced names indicate the events that were not previously reported. The second column denotes the observing instruments. For each of the four pipelines, cWB, GstLAL, PyCBC, and PyCBC BBH, we provide the FAR and network SNR. Of the 39 candidate events, the 5 that were found above the required SNR threshold in only one of the gravitational wave detectors are denoted by a dagger ( † ). For candidate events found above threshold in only one detector (single-detector candidate events), the FAR estimate involves extrapolation. All single-detector candidate events in this list by definition are rarer than the background data collected in this analysis. Therefore, a conservative bound on the FAR for triggers denoted by † is ∼ 2 yr −1 . GW190521, GW190602 175927, GW190701 203306, and GW190706 222641 were identified by the cWB high-mass search as described in Sec. IV A. GstLAL FARs have been capped at 1 × 10 −5 yr −1 to be consistent with the limiting FARs from other pipelines. Dashes indicate that a pipeline did not find the event below the specified 2.0 yr −1 threshold. Blank entries indicate that the data were not searched by a pipeline. The probability that an event is astrophysical in origin as described in Sec. IV D is indicated in the column pastro. * PyCBC and cWB SNRs do not include Virgo. S190510g, S190718y, S190901ap, S190910d, S190910h, S190923y, and S190930t [237][238][239][240][241][242][243] were candidate events disseminated via GCNs which are not recovered here. The SNR for GW190814 (22.2) differs from the previously published value [33] because the non-observing-mode LIGO Hanford data were not analyzed in this work. The FAR of GW190425 differs from [32] due to different background data and pipeline configuration. The SNR of GW190521 differs from [34] due to the inclusion of sub-threshold Virgo SNR.
SNRs ( 10). The offline analyses here differ from their low-latency counterparts through having improved template banks, improved use of data quality information, improved data calibration, data cleaning, and improved tuning to reject the non-stationary noise background observed in O3a. These differences account for the new moderate SNR candidate events. During O3a, the LVC issued 8 retractions for public alerts that were promptly determined to be unlikely to have originated from astrophysical systems. The LIGO Livingston detector was more problematic for low-latency analyses than in previous observing runs with the rate of noise glitches being significantly higher than in O2 (see Sec. III B). Low-latency detection, especially for candidate events originating in only one interferometer, was especially challenging during O3a. S190405ar was the first retraction [248]. This Notice was distributed in error and was never considered to be of astrophysical origin because the event's FAR was significantly above threshold at 6800 yr −1 . The remaining retractions were caused by severely non-stationary noise. A glitch in LIGO Hanford data led to the identification and subsequent retraction of S190518bb [249]. S190524q, S190808ae, S190816i, S190822c, S190829u, and S190928c all exhibited extreme non-stationary noise in the LIGO Livingston detector [250][251][252][253][254][255]. S190822c was assigned a significant FAR at identification but was only observed in a single interferometer. As previously mentioned, single detector triggers are subject to higher uncertainties in FAR. Out of these triggers, an automatic gating method deployed later in the observing run used by searches in order to mitigate non-stationary noise in the detectors would have vetoed S190822c. However, at the time, this automatic gating was not being employed. FARs calculated by the pipelines are correct if the data collected so far is representative of the future data. Online pipelines assign FARs using previously collected data, while offline analyses have access to asynchronous background. If a new non-stationary noise source emerges in the data during an observing run, it is possible for it to be misidentified as a candidate event. Although this can also affect candidates observed in multiple interferometers, single detector candidates are especially susceptible to novel noise sources seen for the first time in lowlatency. In some cases where the impact of glitches was unknown, follow-up analyses were performed to remove instrumental artifacts and reassess the candidate event significance. After follow-up, none of these 8 retracted candidate events remained significant. Subsequent offline analyses did not identify them as significant either.

VII. SOURCE PROPERTIES
We analyze the 39 candidate events shown in Table IV with the parameter estimation techniques described in Sec. V. For a subset of candidate events, event validation procedures outlined in Sec. III C identified transient noise that may have impacted the results of parameter inference. In order to minimize the effect of this transient noise, candidate-specific procedures were explored for all impacted candidate events. In most cases, transient noise was mitigated through the glitch subtraction methods outlined in Sec. III D. After application of these methods, the identified transient noise was considered mitigated if the data surrounding the event was consistent with Gaussian noise, as measured by the variance of the measured power spectral density during the time period containing the identified transient [96]. If data were not Gaussian after glitch subtraction, we evaluated the SNR lost by restricting the frequency range of data considered in parameter inference to fully excise the identified transient noise. In cases where the single detector SNR loss was below 10%, this reduced frequency range was used in analyses. Otherwise, the nominal frequency range was used. The full list of candidate events using candidate-specific mitigation, along with the mitigation configuration, is found in Table V. Based on the investigations described in Appendix A, we find that most gravitational wave candidate events in this catalog exhibit small changes in source parameter estimates when spherical harmonic modes above = 2 are included. However, these differences in aggregate could still affect population-level studies, so we present as fiducial results the combined posterior samples of HM runs for all BBHs candidate events except GW190707 093326, GW190720 000836, GW190728 064510, GW190915 235702, GW190924 021846, and GW190930 133541. For these six exceptions, we present combined IMRPhenomPv2-SEOBNRv4P samples, because the effect of higher modes is either negligible or subdominant to the systematics between IMRPhenomPv2 and SEOBNRv4P results, as detailed in Appendix A. GW190412 [234], GW190521 [34,35], and GW190814 [33] were analyzed extensively in separate publications with the HM waveform families SEOBNRv4PHM, IMRPhenomPv3HM, or NRSur7dq4, so we present HM runs for these candidate events as fiducial results here and defer readers to those publications for details on those candidate events. 7 GW190425, GW190426 152155, and GW190814 showed indications of including at least one neutron star, and so were also analyzed using tidal waveforms in addition to IMRPhenomPv2 and SEOBNRv4P, and discussed in Sec. VII B. Further details on waveform systematics and the waveforms employed in this work can be found in Appendix A, and the full suite of posterior samples is publicly available at [256].
In the following subsections, we summarize the results of our parameter estimation analyses and highlight candidate events of particular interest. To identify candidate events with the most extreme parameter values, we repeatedly select one posterior sample at random from each event and record which candidate events have the lowest and highest values of each parameter. From these repeated trials, we determine each event's probability of having the lowest or highest value for a given parameter. Table VI shows 90% credible intervals on the source parameters of all 39 candidate events using the priors described in Sec. V C and waveforms specified above.
To provide an overview of the posterior distributions of the source parameters for all GWTC-2 candidate events, we show 90% credible regions for all candidate events in the M -q and M-χ eff planes in Figs. 6 and 7, respectively, and the corresponding one-dimensional marginal posterior distributions on m 1 , q, and χ eff in Fig. 8.
A. Masses of sources with m2 > 3 M Our candidate event list includes compact binary mergers that have higher total masses than those in GWTC-1 [8] as well as mergers with component masses in the purported lower mass gap of ∼ 2.5-5 M [257][258][259][260]. Here we describe the masses for candidate events with m 2 > 3 M , which we can confidently expect to be BBHs. The remaining candidate events are described separately in Sec. VII B.
A majority of the masses of black holes reported herein are larger than those reported via electromagnetic observations [261][262][263]. By repeatedly selecting one posterior sample at random from each event and recording the most massive among the ensemble, we find that the most massive binary system is probably the one associated with GW190521 [34,35]. This system has a 98% probability of being the most massive, with a total mass of 163 [35], where the NRSur7dq4 results are highlighted. The more massive component in the source of GW190521 has an 66% probability of being the most massive BH detected in gravitational waves to date (m 1 = 95.3 +28.7 −18.9 M ). GW190519 153544, GW190602 175927, and GW190706 222641 also have notably high total masses with over 50% posterior support for total mass M > 100 M .
The least massive O3a system with m 2 > 3 M is probably (96%) the one associated with GW190924 021846 (M = 13.9 +5.1 −1.0 M ), and likely also has the least massive object over 3 M (85% probability and m 2 = 5.0 +1.4 −1.9 M ). For most sources detected in O3a, the mass ratio posteriors have support at unity and therefore are consistent with equal mass mergers. An exception is the source of GW190412 which was the first event detected that had a confidently unequal mass ratio (q = 0.28 +0. 12 −0.06 ) and exhibited strong signs of HM contributions to the waveform [234]. Although its mass ratio is confidently bounded away from unity, GW190412 only has a 34% chance of having the smallest mass ratio among O3a sources with m 2 > 3 M . As seen in Figs. 6 and 8, the mass ratios are not well constrained for many systems, so one or more could have a smaller mass ratio than GW190412.

GW190425
The least massive O3a system is associated with GW190425 and is likely a binary neutron star system given the inferred masses (m 1 = 2.0 +0.6 −0.3 M and m 2 = 1.4 +0.3 −0.3 M ), but constraints on the tidal parameters do not rule out a NSBH or BBH origin. These estimates were obtained with the PhenomPv2NRT model with a high-spin prior that restricts dimensionless spin magnitudes of the compact objects to be less than 0.89, and previously reported in [32].
Although the inferred component masses of GW190425's source are consistent with masses of known neutron stars [264][265][266][267], the total mass 3.4 +0. 3 −0.1 M is greater than that of observed Galactic BNSs [268,269]. This raises the question of whether GW190425's source was formed in a different environment from the double neutron star systems observed to date [32,[270][271][272][273].  Credible region contours for all candidate events in the plane of total mass M and mass ratio q. Each contour represents the 90% credible region for a different event. We highlight the previously published candidate events: GW190412, GW190425, GW190521 and GW190814, the potential NSBH GW190426 152155, and finally GW190924 021846, which is most probably the least massive system with both masses > 3 M . The dashed lines delineate regions where the primary/secondary can have a mass below 3 M . For the region above the m2 = 3 M line, both objects in the binary have masses above 3 M . Credible region contours for all candidate events in the plane of chirp mass M and effective inspiral spin χ eff . Each contour represents the 90% credible region for a different event. We highlighted the previously published candidate events (cf. Fig. 6), as well as GW190517 055101 and GW190514 065416, which have the highest probabilities of having the largest and smallest χ eff respectively.  Extending the discussion of waveform systematics reported with the discovery of GW190425 [32], we perform four supplementary analyses using the non-precessing EOB models SEOBNRv4T surrogate and TEOBResumS for low spin and high spin priors. The results are summarised in the Table VII. All the analyses produce quantitatively similar results to the corresponding nonprecessing analysis reported in the discovery paper, and the two EOB models produce consistent results between them.
For the low-spin prior, the results agree with the non-precessing IMRPhenomDNRTidal analysis reported in the discovery paper, with SEOBNRv4T surrogate and TEOBResumS recovering chirp mass as M = 1.44 +0.02 −0.02 M and effective inspiral spin of χ eff = 0.01 +0.01 −0.01 . When allowing larger compact object spins, the results with IMRPhenomDNRTidal and EOB models exhibit some differences, but overall give consistent posteriors.
Our inferences about tidal parameters are likewise consistent with the previous non-precessing analyses. We follow the procedure of [32] and re-weight the posteriors to a flat inΛ prior. For the low-spin prior, the two EOB models give very similar bounds, withΛ constrained below 630. For the high spin prior, the TEOBResumS waveform model constrains the dimensionless tidal deformability parameter better (Λ ≤ 870) as compared to the other waveform models, as seen in the top panel of Fig. 9.
The two EOB models also constrain the mass ratio better than other waveforms (0.54-1.0) as can be seen in the bottom panel of the Fig. 9. The previously-reported analyses that allow for significant precessing spins have much greater flexibility and thus, for the high-spin prior, produce a more asymmetric mass ratio posterior distribution than the two non-precessing updates reported here.
Finally, both the EOB models find the luminosity distance of D L = 0.16 +0.07 −0.07 Gpc independent of the spin prior used.

GW190814
Amongst the O3a events, GW190814's source [33] has the least massive secondary component after the sources of GW190425 and GW190426 152155. GW190814's less massive component has mass m 2 = 2.59 +0.08 −0.09 M , making its interpretation as as a black hole or a neutron star unclear [33]. GW190814 also has the most extreme mass ratio of all the candidate events, q = 0.112 +0.008 −0.009 [33].

GW190426 152155
GW190426 152155 is the candidate event with the highest FAR: 1.4 yr −1 . Assuming it is a real signal of astrophysical origin, we estimate its component masses to be m 1 = 5.7 +3.9 −2.3 M and m 2 = 1.5 +0.8 −0.5 M , raising the possibility that it could have originated from either a BBH or an NSBH source. The mass of the secondary component is consistent with masses of (previously) reported neutron stars [10,265,266,268], but the data are uninformative about potential tidal effects, showing essentially no difference between the prior and posterior on Λ 2 obtained from NSBH waveforms SEOBNRv4 ROM NRTidalv2 NSBH or IMRPhe-nomNSBH which we use for our fiducial results. A more definitive assessment of this event likely requires further observations to establish the rate of astrophysical signals with comparable properties.

C. Spins
Most of the compact objects detected in O3a have spin magnitudes consistent with zero, within uncertainties, but in some cases the spins can be constrained away from zero. Two systems have a > 50% chance of having at least one black hole with dimensionless spin magnitude χ i={1,2} > 0.8: GW190517 055101 has χ i={1,2} > 0.8 with 77% credibility, and GW190521 with 58% credibility. In addition to spin magnitudes, we also consider the effective inspiral spins. As described in Sec. IV B, the effective inspiral spin χ eff is the mass-weighted combination of aligned spins and is approximately conserved under precession.
Effective inspiral spin posterior distributions for all candidate events are shown in Figs. 7   GW190620 030421, GW190706 222641, GW190719 215514, GW190720 000836, GW190728 064510, GW190828 063405, and GW190930 133541 have sources with χ eff > 0. No individual systems were found to have χ eff < 0 with ≥ 95% probability, but the event with the lowest χ eff in O3a is probably GW190514 065416 with χ eff = −0.19 +0. 29 −0.32 . The surplus of events with χ eff > 0 compared to none with χ eff < 0 suggests that the spin orientations of black holes in binaries are not isotropically distributed with respect to their orbital angular momenta. This possibility is explored further in [37].
The BBH which most likely has the largest measured χ eff is GW190517 055101 (χ eff = 0.52 +0. 19 −0. 19 ). It has a 60% posterior probability of having the highest χ eff , followed by GW190719 215514 (χ eff = 0.32 +0. 29 −0.31 ) with 12%. In some cases, the joint χ eff and mass-ratio measurement for an event enables a tighter measurement of the spin magnitude of the primary mass than for candidate events with mass ratios closer to unity. For example, we find primary spin magnitudes of χ 1 = 0.40 +0. 40 −0.35 for the source of GW190720 000836, and χ 1 = 0.32 +0. 37 −0.28 for the source of GW190728 064510. Posterior distributions on spin magnitudes and tilt angles are shown in Fig. 10 for these two candidate events and other select systems that exhibit non-zero spins.
The magnitude of spin-precession in the waveform can be partially captured by the effective precession spin parameter χ p , which includes the projection of component spin vectors onto the orbital plane [201,213]. To identify events for which the data constrain χ p , we compare the χ p priors-conditioned on the posteriors of χ effto the χ p posteriors, as done in [8]. Conditioning the χ p prior on the χ eff posterior accounts for the correlated prior between χ eff and χ p in the default spin prior choices presented in Sec. V. Fig. 11 shows the one-dimensional posterior and (χ eff -conditioned) prior distributions on χ p for events with Jensen-Shannon (JS) divergence [274] D χp JS > 0.05 bit, where D χp JS is calculated between the χ p posterior and conditioned prior. For most of the candidate events, the posterior on χ p is similar to the prior, indicating that the data are largely uninformative about precession, but there are a few notable exceptions. The χ p inference on GW190814 is most striking with D χp JS = 0.72 bit: the spin magnitude of the primary mass of the system is constrained to be near zero, resulting in a correspondingly small χ p value. After GW190814, D χp JS is largest for GW190412 [234,275] and GW190521 [34,35], with D χp JS = 0.18 bit and D χp JS = 0.15 bit, respectively. Unlike GW190814, the χ p posterior distributions for these events are constrained away from zero, showing preference for precession in these systems. The tilt angle of GW190412's more massive component's spin with respect to the Newtonian orbital angular momentum is particularly well constrained to θ LS1 = 0.80 +0. 54 −0.35 , as seen in Fig. 10. To further investigate possible precession in these signals, we compute the precession SNR ρ p [276,277], which characterizes the observability of precession in gravitational wave data. GW190412 has the largest precession SNR in O3a with a median ρ p = 3.0. Despite parameter estimation preferring high χ p for GW190521, we calculate ρ p = 1.6 showing that the measurable precession in the signal is small. Using empirical relations between ρ p and Bayes' factors for precession [278,279], we see that the corresponding Bayes' factors for precession also only mildly favor precession. A population-level analysis of the GWTC-2 spins is presented in [37] and finds evidence for the presence of spin precession in the population.

D. Three-Dimensional Localization
The most distant event, after accounting for measurement uncertainties in distance, is most probably GW190413 134308, with an estimated luminosity distance and redshift of D L = 4.45 +2.48 Gpc and z = 0.71 +0.31 −0.30 , respectively, approximately twice the luminosity distance of the most distant source from GWTC-1, GW170729 [8], and comparable to the distance for gravitational wave candidate GW170817A [22] (not to be confused with the BNS signal GW170817). However, GW190909 114149, GW190514 065416, GW190521, GW190706 222641, and GW190719 215514 have similarly large distances to GW190413 134308. With candidate events at these cosmological distances, we can more readily measure the Hubble constant, and the evolution of the BBH merger rate over cosmic time. Such analyses are performed in [37]. The closest source detected in O3a is GW190425, with an inferred luminosity distance of D L = 0.16 +0.07 −0.07 Gpc, about four times the distance for GW170817.
Overall, GW190814 is the best localized event detected in O3a. The contour encompassing 90% of this event's two-dimensional sky position posterior is ∆Ω = 19 deg 2 , and 90% of this event's 3D sky position posterior is contained in ∆V 90 = 3.2 × 10 −5 Gpc 3 . Although GW190814 was initially detected in only LIGO Livingston and Virgo, it was reanalyzed with LIGO Hanford data, enabling the strong constraint on 3D source position. GW190412 and GW190701 203306 were also relatively well localized with ∆Ω = 21 deg 2 , ∆V 90 = 0.035 Gpc 3 and ∆Ω = 46 deg 2 , ∆V 90 = 0.035 Gpc 3 , respectively, and were both detected in all three detectors. GW190424 180648 was detected and analyzed only in Livingston data, and therefore had the largest localization area and volume with ∆Ω = 28000 deg 2 and ∆V 90 = 28 Gpc 3 . Credible intervals on each source's distance and sky area are shown in Table VI. Probability density sky-maps for all events are available as part of the data release [256].

VIII. WAVEFORM RECONSTRUCTIONS
Template-based [177,280] and minimally-modeled methods [106,111,281] are complementary techniques for producing waveform reconstructions. Waveform templates provide a mapping between the shape of the waveforms and the parameters of the source, such as the masses and spins of a binary system, but are limited to those sources for which we have models. Minimallymodeled reconstructions make it possible to discover unexpected phenomena, but they do not provide a direct mapping to the physical properties of the source. Currently available waveform templates for binary mergers are based on various approximations and numerical solutions to Einstein's equations that cover a subset of the full parameter space. These waveform templates may thus fail to capture some features of the signal. A more exotic possibility is that gravity behaves differently than predicted by general relativity. One way to test these scenarios is to compare the template-based and minimallymodeled waveform reconstructions. A standard measure of the agreement between two waveforms h 1 , h 2 is the match, or overlap where a|b denotes the noise-weighted inner product [142,178]. The match is constrained to be ≤ 1.
For each event, the matches were computed between the maximum likelihood template-based waveforms and two minimally-modeled waveform reconstruction methods, cWB [111] and BW [106], using data that contain the event (on-source data). To ascertain whether these match values are in line with expectations, waveforms from the template based analysis were added to data near, but not including, each event (off-source data). The minimally-modeled waveform reconstructions were repeated multiple times on these off-source data to estimate the distribution of match values we would expect for each event. For each event, these distributions were used to compute a p-value, given by the fraction of off-source match values that are below the on-source match. For some of the lower SNR candidate events the minimallymodeled methods were unable to reconstruct the signals, and these candidate events were excluded from the analysis. Details of the analysis procedure, and additional results, can be found in Appendix B. Fig. 12 shows the p-values for the analyzed candidate events sorted in increasing order [281,282]. Events with p-values above the diagonal have on-source matches that are higher than expected, while those below the line have matches that are lower than expected. The cWB analysis shows some events outside the theoretical 90% band for the single-order statistic, which can be due to statistical fluctuations. Because of correlations between p-values in the ordered plot, for 20 p-values the probability of finding exactly four p-values outside the 90% band as in figure  12 is 5.2%, although it may also point to a systematic effect in the analysis. Since the p-values are higher than expected, there is no evidence of a discrepancy with the template-based analysis. Our tests also indicates that if there is indeed a systematic effect, it may originate from an overestimate of the off-source matches no larger than 2%. Further details are given in Appendix B. Overall, the p-value distributions support the null hypothesis that the minimally-modeled waveform reconstructions are consistent with the GR derived waveform templates.

IX. CONCLUSION
We have presented the results from a search for compact binary coalescence signals in the first part of the third observing run of Advanced LIGO and Advanced Virgo. During the period of observations, spanning 1 April to 1 October 2019, the three detectors had sensitivity that significantly exceeded previous observing runs, with median BNS inspiral ranges of 108 Mpc (Hanford), 135 Mpc (Livingston) and 45 Mpc (Virgo). This improved sensitivity allowed us to greatly expand the number of known compact binary mergers, adding 39 new gravitational wave events to the 11 we have previously reported in GWTC-1 [8].
We performed parameter estimation on these 39 new GWTC-2 signals using a range of waveform models, allowing us to incorporate the effect of HMs in the inference of source parameters for BBH systems, and to compare to the systematic differences between waveform families. We find that the sources of these signals include BBHs that are more massive, farther away, and more asymmetric in mass ratio than any sources in GWTC-1, as well as three binaries with at least one component of mass < 3 M . These latter systems may include the first de-tected NSBH mergers; however, there is insufficient SNR to perform an informative measurement of the tidal deformability that would definitively indicate whether they had an NSBH or BBH origin. We expanded our analysis of the potential BNS signal to encompass further waveforms which include tidal effects, finding agreement with previously-reported results [32].
Our analyses recovered 11 sources with positive effective inspiral spin but zero with negative (at 90% credibility). Although no individual binary in O3a has χ eff < 0 with high credibility, a hierarchical population analysis [37] of LVC-reported events to date indicates a nontrivial fraction of binaries have negative effective inspiral spin. This would be consistent with the independent observation of gravitational-wave candidate GW170121, which was found to have significant probability of having χ eff < 0 [23,27]. We also examined the evidence for misalignment of orbital and component angular momenta, which would be an indication of a dynamical formation channel, and produce precession of the orbital plane. We find only mild evidence in favor of precession in the most significant case of GW190412 [234,275]. We also performed consistency checks between the waveform models and the observed data, finding no statistically significant differences.
A pair of companion papers make use of the events in GWTC-2 to study source populations and fundamental physics. The inferred population distribution of compact object mergers is described in [37], which reports an updated BBH merger rate density of 23.9 +14.9 −8.6 Gpc −3 yr −1 , and rate density for BNS of 320 +490 −240 Gpc −3 yr −1 . This paper also investigates mass and spin distributions, and finds evidence for a population of BBH systems with spins misaligned from the orbital plane. Eight tests of general relativity are reported in [36], showing no evidence for violations of Einstein's theory of gravity, leading to some of the best constraints on alternative theories to date.
Data products associated with this catalog are available through the Gravitational Wave Open Science Center (GWOSC) at https://gw-openscience.org [256]. Data associated with all events described in this paper are available through the GWOSC Event Portal, including calibrated strain time-series, parameter estimation posterior samples, tables of 90% credible intervals for physical parameters, and search pipeline results. These data products may be accessed through a web browser or open source client package [283].
The online Gravitational Wave Transient Catalog represents a cumulative set of events found in LIGO-Virgo data, including detections presented in GWTC-1, and now also events from the first six months of the O3 observing run. O3 continued from November 2019 until March 2020. Analysis of the second portion (O3b) is currently in progress, and events found during this period will be added in the next update.
The Advanced LIGO and Advanced Virgo detectors are currently undergoing commissioning to further im-prove their sensitivity, and will be joined in their fourth observing run by the KAGRA detector [284]. This should lead to improvements in the detection rate and source localization, improving the prospects for multimessenger observation of future sources [285,286].
of Hong Kong, the National Natural Science Foundation of China (NSFC), the Leverhulme Trust, the Research Corporation, the Ministry of Science and Technology (MOST), Taiwan and the Kavli Foundation. The authors gratefully acknowledge the support of the NSF, STFC, INFN and CNRS for provision of computational resources.
We would like to thank all of the essential workers who put their health at risk during the COVID-19 pandemic, without whom we would not have been able to complete this work.

Appendix A: Waveform systematics
The choice of waveform influences our inferences of source properties [8,293,294]. As such, we employ multiple waveform families in the inference of each event's source parameters. The full list of models represented in our publicly available results for each event is shown in Table VIII.
There have been many tools designed to quantify difference between probability distributions, one of which is the Kullback-Leibler (KL) divergence [295]. It is defined by where p(ϑ), q(ϑ) are two probability distributions. However, the KL divergence is not symmetric, D KL (p|q) = D KL (q|p) and further D KL can diverge if the probability distributions are disjoint, which makes it impractical to use when comparing posterior distributions of parameter estimation runs. Both of these issues are addressed by the JS divergence [274] which is a smoothed and symmetrized version of D KL . It is defined as (A2) where s = (p + q)/2 is the average distribution. By construction, the JS divergence is symmetric and it satisfies 0 ≤ D JS ≤ 1 bit, which follows from The convenient properties of the JS divergence make it a useful measure of the difference between posterior probability distributions, and we adopt it to quantify the systematics in our results resulting from different analysis choices.
We use the JS divergence to investigate two sources of systematics that can impact the results of inference: the differences in the models with equivalent physics due to modeling choices, and differences due to the inclusion of different physical effects. To assess the importance of the former, we compare the results using our two fiducial precessing waveform models (IMRPhenomPv2 and SEOB-NRv4P). For the latter, we consider inclusion of HMs beyond the dominant quadrupole by comparing SEOB-NRv4P to models that include HMs: SEOBNRv4PHM and NRSur7dq4. We compute the JS divergence between one-dimensional marginal distributions on two key parameters: mass ratio and effective inspiral spin. We use a threshold of 0.007 bit to deem the differences signifi-cant, which for a Gaussian corresponds to a 20% shift in the mean, measured in units of one standard deviation. This threshold is larger than 0.002 bit, which is the variation that has been shown to arise due to stochastic sampling [232]. In the following subsection, we describe how we use our computed JS divergences and JS threshold to choose the default parameter estimation results shown in this work.

Choice of waveform models for each event
We use the above considerations to select which waveform model(s) are used as the fiducial results presented in Sec. VII. 8 In particular, we present results from models with HMs if the following conditions are satisfied for any of the three key parameters and any of the HM models:  When the conditions do not hold, we combine equal number of samples from the results of IMRPhenomPv2 and SEOBNRv4P and use the joint samples. The only exceptions are GW190425 and GW190426 152155, for which we use the tidal waveforms described in Sec. V, and GW190412, GW190521 and GW190814, for which we know HMs are significant [33,34,234]. Fig. 13 shows the JS divergences for every event (except GW190425, GW190426 152155, GW190521, and GW190814) to compare the effects of model systematics versus the effects of HMs. The gray regions correspond to the criteria in Eq. (A4a): any events with D JS values in those regions are presented using HM results in Sec. VII.

Waveform comparison-Model systematics
The in-depth studies of GW190412, GW190521, and GW190814 have quantitatively demonstrated how much the choice of gravitational waveform impacts our interpretation of real gravitational wave events. We have identified several additional instances where analyses with our two fiducial precessing waveform models (IMRPhe-nomPv2 and SEOBNRv4P) have notable differences, as measured by the JS divergence.
To illustrate the impact of systematics, we show in Fig. 14 the posterior distributions for the events with extremal JS divergence in the key parameters q and χ eff : GW190924 021846 and GW190521 074359. For GW190924 021846, SEOBNRv4P prefers more moderate mass ratios around 1/2, and shows only minor differences in the other parameters. Meanwhile, GW190521 074359 shows the largest differences in the effective inspiral spin, with SEOBNRv4P more clearly preferring non-zero values.
While the cases above demonstrate the moderate shifts that result from model systematics, for all the events presented in this paper, there is always substantial overlap between the posteriors for all the key quantities we consider. Therefore, systematic uncertainty remains subdominant to statistical uncertainty.
We systematically applied one or two models with HM to all sources in our catalog. We find that a majority of the sources investigated have modest shifts in mass ratio or χ eff due to the impact of HMs.
To illustrate the impact of non-quadrupole modes, Fig. 15 shows posterior inferences on GW190519 153544, GW190602 175927, GW190706 222641, and GW190929 012149 using one or two independent models which include non-quadrupole modes (SEOB-NRv4PHM and NRSur7dq4). Using more complete waveform models, the source of GW190519 153544 is inferred to be edge on, at a smaller distance, a more asymmetric mass ratio, and thus a higher source-frame mass m 1 . Due to this source's favorable orientation, non-quadrupole modes have a significant impact, with a Bayes factor for HMs of ∼ 15. Similarly, non-quadrupole modes allow us to more strongly exclude both extreme and comparable mass ratios for the source responsible for GW190929 012149, and to disfavor comparable masses for GW190706 222641. Conversely, using the same two waveform models, the high-mass source responsible for GW190602 175927 is more confidently inferred to have mass ratio q closer to unity and therefore m 2 is skewed to larger values. Non-quadrupole modes also have a noticeable impact on parameters of GW190630 185205 and GW190828 065509, in particular the mass ratio and luminosity distance.

Appendix B: Waveform Consistency Tests
There are several different quantitative measures that can be used to measure waveform consistency. These include the residual SNR, which is found by subtracting the best-fit waveform template h T from the data d, then   applying minimally-modeled methods to search for any coherent excess in the noise residual r = d − h T . Additional measures of waveform consistency include the distance between waveforms, ∆ 2 h 1 , h 2 = h 1 − h 2 |h 1 − h 2 and the match, or overlap, O h 1 , h 2 .
The residuals test has been applied as test of general relativity [9,10], at least within the precision with which the waveform models approximate general relativity. Distance and match provide more sensitive measures of the waveform consistency than the residuals test since the extrinsic parameters of the source, such as the arrival time, sky location and polarization, are constrained by the full signal, while for the residuals test the extrinsic parameters have to be constrained from the (usually small) difference between the signal and the template. To compare the signal reconstructions for the current catalog of sources we adopt the waveform match as our measure of waveform consistency since it does not depend on the overall amplitude of the signals, making it a convenient choice when comparing events with a range of amplitudes.
To make a quantitative assessment of the waveform match values we need to know how the match depends on quantities such as the SNR and time-frequency volume of the signals. Instrument noise will lead to non-zero mis-matches, MM = 1 − O, even when using perfect templates. For example, the maximum likelihood solution for a perfect template will have a mis-match with mean and variance given by [ where D is the number of parameters that define the signal model, and where the reduction from D to D − 1 is because the match is independent from the overall amplitude of the signal. For the minimally-modeled waveform reconstructions the distributions of match values are more difficult to predict. Using simulations it has been found that the mis-match decreases with SNR, but more slowly than for templates since the effective dimension of the model increases with SNR [107]. The mis-match also scales with the time-frequency volume. For binary systems of a given SNR, the mis-match will generally be smaller for high mass systems [107,310,311]. Given these complexities, we chose to empirically estimate the match distribution from simulations for each event. As a proxy for the signal we use fair draws from the on-source template-based analysis and inject these into data surrounding the event; the right ascensions for the simulated signals are adjusted such that simulated source is at the same sky location in the frame of the detectors. For the majority of events the waveform model used for the injections is IMRPhenomPv2. The exceptions are GW190412, GW190521, GW190814 where IMRPhenomPv3HM was used.
The cWB and BW algorithms are used to produce point estimates for the waveform reconstructions for each of the simulated events. For cWB the point estimate is a constrained maximum likelihood reconstruction, while for BW the point estimate is the median of the waveform posterior distribution. Fig. 16 illustrates the results obtained for GW190519 153544. The upper panels compare the template-based LALInference waveform reconstruction in the LIGO Livingston detector to the minimallymodeled BW and cWB reconstructions. The solid lines are point estimates for the waveforms: for LALInference the maximum likelihood; for BW the median of posterior draws; and for cWB the constrained maximum likelihood. The panels also show the 90% credible bands for the Bayesian LALInference and BW algorithms and the 90% confidence band for cWB derived by injecting samples from the template based analysis into data surrounding the event and repeating the analysis multiple times (these bands are computed on an individual time sample basis). The lower panels of Fig. 16 show the distribution of overlaps found when running BW and cWB on simulated data with similar properties to the event. Waveforms drawn from the on-source LALInference analysis were injected into data surrounding the event. The overlap between the injected waveform and the point estimates from the BW and cWB analyses of these injections were then used to produce the histograms seen in Fig. 16. The distribution of the match values defines a null distribution for each detected event, which takes into account the variability of the LALInference posterior distribution, the fluctuations of the detector noise, and the waveform reconstruction errors. The fraction of off-source analyses with overlaps below the on-source match values, which are shown as vertical lines in the lower panels of Fig. 16, define the p-value for this event.
The same analysis procedure was repeated for a subset of additional events. cWB uses only events that are above the cWB search thresholds (resulting in a morphologydependent SNR threshold which is about 7-10 for the events reported in this catalog), while for BW the analysis was restricted to events where the on-source BW analysis yielded SNR > 7. Fig. 17 shows the on-source match values vs. the off-source median match values with 90% intervals. The upper panel shows the results of the BW analysis, while the lower panel shows the results of the cWB analysis. In both cases the p-values point to a good agreement between the minimally-modeled and template-based reconstructions.
The discrepancies between the two plots may be ascribed to different choices made in the two reconstruction algorithms. cWB is both a detection and a reconstruction pipeline. For this reason, reconstructions are performed with the same production settings used in searches. The production settings are optimized for noise rejection and to enforce strong network coherency constraints. To construct the match distributions, cWB uses about 2000-3000 waveform injections per event; however, those injections that are below the cWB thresholds are not reconstructed: in the majority of cases reported here (13) the reconstruction efficiency is greater than 80%, while in a few other cases (7) the efficiency ranges from 15% to 50%. For each event, this efficiency depends on the variability of the noise background. In cases of lower efficiency, we have also checked that the waveforms that are successfully reconstructed by cWB have parameter distributions that are statistically indistinguishable from those of the injected waveforms.
BW employs Bayesian inference to characterize detections made by the CBC and cWB search pipelines. As such, no cuts are made on the waveform reconstructions, which means that for quiet signals, some of the samples will be drawn from the prior distribution, resulting in a wide spread in the match distributions.
The distribution of match values for each event are used to compute p-values. The overall consistency of the template based and minimally-modeled waveform reconstructions can then be summarized by plotting the p-values, ranked in ascending order, against the theoretical distribution. In such plots, any significant deviations below the plot diagonal point to events that should undergo further analysis. The p-values for the events reconstructed by cWB and BW are shown in Fig. 12 in the main text. Applying the Fisher test [312] to the ensemble of p-values yields combined p-values of 0.57 for the BW analysis and 0.99 for the cWB analysis, indicating that there is no reason to reject the null hypothesis that the template based and minimally-modeled analysis are in agreement. The Fisher test is one-sided in that it only penalizes p-values that are lower than expected. The cWB analysis includes instances where the p-values are higher than predicted, indicating that the on-source matches are higher than expected based on the off-source distributions. The cause of this bias is likely due to an asymmetry between the on-source and off-source analyses. The on-source analysis computes the match between the maximum likelihood template-based waveform and its reconstruction, while the off-source analysis computes the match between the injected template and its reconstruction. Ideally the analysis would be symmetric, with the maximum likelihood template used both on-source and off-source, but the computational cost of running the full template-based analysis on the thousands of off- The panels also show the 90% credible bands for the Bayesian LALInference and BW algorithms and the 90% confidence band for cWB derived from off-source injections, i.e., by injecting samples from the template-based analysis into data surrounding the event and repeating the analysis multiple times (these bands are computed on an individual time sample basis). The lower two panels show the distribution of overlap values when running BW and cWB on waveforms drawn from the template based analysis that are injected into data surrounding the event. The fraction of runs with matches below that of the on-source analysis give the p-value for the event.
source injections is prohibitively expensive.

Appendix C: Cosmological distance resampling
For the luminosity distance, a prior that goes as D 2 L is enforced in the sampling, following the same procedure as described in previous publications. As this assumption becomes increasingly unrealistic as events are detected at greater distances, the posterior distributions shown in this paper are derived from a physically motivated prior that incorporates cosmological effects. We perform rejection sampling on the initial posterior samples to instead use a prior corresponding to a uniform merger rate per comoving volume in the rest frame of the source. Using a standard flat ΛCDM cosmology, samples are accepted The initial 1/(1 + z) factor accounts for time dilation of the observed merger rate, dV c is the comoving volume element, and dV E = D 2 L dΩ is the Euclidean volume element. D H = c/H 0 is the Hubble distance and E(z) Ω m (1 + z) 3 + Ω Λ for ΛCDM, and we use Hubble constant H 0 = 67.9 km s −1 Mpc −1 and matter density Ω m = 0.3065 = 1 − Ω Λ [236].