Binary Black Hole Mergers in the First Advanced LIGO Observing Run

detections of gravitational waves from binary black hole full a for binary black hole merger signals total M detailed observations of on general-relativistic models of gravitational-wave signals from binary black hole systems, unambiguously identified two signals, GW150914 and GW151226, with a significance of greater than 5 σ over the observing period. It also identified a third possible signal, LVT151012, with substantially lower significance and with an 87% probability of being of astrophysical origin. We provide detailed estimates of the parameters of the observed systems. Both GW150914 and GW151226 provide an unprecedented opportunity to study the two-body motion of a compact-object binary in the large velocity, highly nonlinear regime. We do not observe any deviations from general relativity, and we place improved empirical bounds on several high-order post-Newtonian coefficients. From our observations, we infer stellar-mass binary black hole merger rates lying in the range 9 – 240 Gpc − 3 yr − 1 . These observations are beginning to inform astrophysical predictions of binary black hole formation rates and indicate that future observing runs of the Advanced detector network will yield many more gravitational-wave detections.

models of gravitational-wave signals from binary black hole systems, unambiguously identified two signals, GW150914 and GW151226, with a significance of greater than 5σ over the observing period. It also identified a third possible signal, LVT151012, with substantially lower significance and with an 87% probability of being of astrophysical origin. We provide detailed estimates of the parameters of the observed systems. Both GW150914 and GW151226 provide an unprecedented opportunity to study the two-body motion of a compact-object binary in the large velocity, highly nonlinear regime. We do not observe any deviations from general relativity, and we place improved empirical bounds on several highorder post-Newtonian coefficients. From our observations, we infer stellar-mass binary black hole merger rates lying in the range 9-240 Gpc −3 yr −1 . These observations are beginning to inform astrophysical predictions of binary black hole formation rates and indicate that future observing runs of the Advanced detector network will yield many more gravitational-wave detections. DOI: 10.1103/PhysRevX. 6.041015 Subject Areas: Gravitation

I. INTRODUCTION
The first observing run (O1) of the Advanced LIGO detectors took place from September 12, 2015, to January 19, 2016. The detectors provided unprecedented sensitivity to gravitational waves over a range of frequencies from 30 Hz to several kHz [1], which covers the frequencies of gravitational waves emitted during the late inspiral, merger, and ringdown of stellar-mass binary black holes (BBHs). In this paper, we report the results of a matched-filter search using relativistic models of BBH waveforms during the whole of the first Advanced LIGO observing run. The compact binary coalescence (CBC) search targets gravitational-wave emission from compact-object binaries with individual masses from 1M ⊙ to 99M ⊙ , total mass less than 100M ⊙ , and dimensionless spins up to 0.99. Here, we report on results of the search for BBHs. The search was performed using two independently implemented analyses, referred to as PyCBC [2][3][4] and GstLAL [5][6][7]. These analyses use a common set of template waveforms [8][9][10] but differ in their implementations of matched filtering [11,12], their use of detector data-quality information [13], the techniques used to mitigate the effect of non-Gaussian noise transients in the detector [5,14], and the methods for estimating the noise background of the search [3,15]. We obtain results that are consistent between the two analyses.
The search identified two BBH mergers: GW150914, observed on September 14, 2015 at 09∶50:45 UTC [16], and GW151226, observed on December 26, 2015 at 03∶38:53 UTC [17]. Both of these signals were observed with a significance greater than 5σ. In addition, a third candidate event, LVT151012, consistent with a BBH merger was observed on October 12, 2015 at 09∶54:43 UTC with a significance of ≲2σ. Although LVT151012 is not significant enough to claim an unambiguous detection, it is more likely to have resulted from a gravitational-wave signal than from an instrumental or environmental noise transient. The key parameters of the events are summarized in Table I.
The properties of the sources can be inferred from the observed gravitational waveforms. In particular, the binary evolution, which is encoded in the phasing of the gravitational-wave signal, is governed by the masses and spins of the binary components. The sky location of the source is primarily determined through time of arrival differences at the two Advanced LIGO sites. The observed amplitudes and relative phase of the signal in the two Advanced LIGO detectors can be used to further restrict the sky location and infer the distance to the source and the binary orientation. We provide a detailed evaluation of the source properties and inferred parameters of GW150914, GW151226, and LVT151012. We use models of the waveform covering the inspiral, merger, and ringdown phases based on combining post-Newtonian (PN) theory [19][20][21][22][23][24], the effective-onebody (EOB) formalism [25][26][27][28][29], and numerical relativity simulations [30][31][32][33][34][35][36]. One model is restricted to spins aligned with the orbital angular momentum [8,9], while the other allows for nonaligned orientation of the spins, which can lead to precession of the orbital plane [37,38]. The parameters of GW150914 have been reported previously in Refs. [39,40]. We provide revised results which make use of updated instrumental calibration.
The emitted signals depend upon the strong field dynamics of general relativity; thus, our observations provide an extraordinary opportunity to test the predictions of general relativity for binary coalescence waveforms. Several tests of general relativity were performed using GW150914, as described in Ref. [41]. One of these was a parametrized test for the consistency of the observed waveform with a general-relativity-based model. We perform a similar test on GW151226. Since this source is of lower mass than GW150914, the observed waveform lasts for many more cycles in the detector data, allowing us to better constrain the PN coefficients that describe the evolution of the binary through the inspiral phase. In addition, we combine the results from GW150914 and GW151226 to place still tighter bounds on deviations from general relativity.
The observed events begin to reveal a population of stellar-mass black hole mergers. We use these signals to constrain the rates of BBH mergers in the universe and begin to probe the mass distribution of black hole mergers. The inferred rates are consistent with those derived from GW150914 [42]. We also discuss the astrophysical implications of the observations and the prospects for future Advanced LIGO and Virgo observing runs.
The results presented here are restricted to BBH systems with total masses less than 100M ⊙ . Searches for compact binary systems containing neutron stars are presented in Ref. [43], and searches for more massive black holes and unmodeled transient signals will be reported elsewhere.
This paper is organized as follows: Section II provides an overview of the Advanced LIGO detectors during the first observing run, as well as the data used in the search. Section III presents the results of the search, details of the two gravitational-wave events, GW150914 and GW151226, and the candidate event LVT151012. Section IV provides detailed parameter-estimation results for the events. Section V presents results for the consistency of the two events, GW150914 and GW151226, with the predictions of general relativity. Section VI presents the inferred rate of stellar-mass BBH mergers, and Sec. VII discusses the implications of these observations and future prospects. We include appendixes that provide additional technical details of the methods used. Appendix A describes the CBC search, with A 1 and A 2 presenting details of the construction and tuning of the two independently implemented analyses used in the search, highlighting differences from the methods described in Ref. [44]. Appendix B provides a description of the parameter-estimation analysis and includes a summary table of results for all three events. Appendixes C and D provide details of the methods used to infer merger rates and mass distributions, respectively.

II. OVERVIEW OF THE INSTRUMENTS AND DATA SET
The two Advanced LIGO detectors, one located in Hanford, Washington (H1) and one in Livingston, Louisiana (L1), are modified Michelson interferometers with 4-km-long arms. The interferometer mirrors act as test masses, and the passage of a gravitational wave induces a differential arm length change which is proportional to the gravitational-wave strain amplitude. The Advanced LIGO detectors came online in September 2015 after a major upgrade targeting a tenfold improvement in sensitivity over the initial LIGO detectors [45]. While not yet operating at design sensitivity, both detectors reached an instrument noise 3-4 times lower than ever measured before in their most sensitive frequency band between 100 Hz and 300 Hz [1]. The corresponding observable volume of space for BBH mergers, in the mass range reported in this paper, was about 30 times greater, enabling the successful search reported here.
The typical instrument noise of the Advanced LIGO detectors during O1 is described in detail in Ref. [46]. In the left panel of Fig. 1, we show the amplitude spectral density of the total strain noise of both detectors, ffiffiffiffiffiffiffiffiffi SðfÞ p , calibrated in units of strain per ffiffiffiffiffiffi Hz p [47]. Overlaid on the noise curves TABLE I. Details of the three most significant events. The false alarm rate, p-value, and significance are from the PyCBC analysis; the GstLAL results are consistent with this. For source parameters, we report median values with 90% credible intervals that include statistical errors, and systematic errors from averaging the results of different waveform models. The uncertainty for the peak luminosity includes an estimate of additional error from the fitting formula. The sky localization is the area of the 90% credible area. Masses are given in the source frame; to convert to the detector frame, multiply by (1 þ z). The source redshift assumes standard cosmology [18]. of the detectors, the waveforms of GW150914, GW151226, and LVT151012 are also shown. The expected signal-to-noise ratio (SNR) ρ of a signal, hðtÞ, can be expressed as wherehðfÞ is the Fourier transform of the signal. Writing it in this form motivates the normalization of the waveform plotted in Fig. 1, as the area between the signal and noise curves is indicative of the SNR of the events. The gravitational-wave signal from a BBH merger takes the form of a chirp, increasing in frequency and amplitude as the black holes spiral inwards. The amplitude of the signal is maximum at the merger, after which it decays rapidly as the final black hole rings down to equilibrium. In the frequency domain, the amplitude decreases with frequency during inspiral, as the signal spends a greater number of cycles at lower frequencies. This is followed by a slower falloff during merger and then a steep decrease during the ringdown. The amplitude of GW150914 is significantly larger than the other two events, and at the time of the merger, the gravitational-wave signal lies well above the noise. GW151226 has a lower amplitude but sweeps across the whole detector's sensitive band up to nearly 800 Hz. The corresponding time series of the three waveforms are plotted in the right panel of Fig. 1 to better visualize the difference in duration within the Advanced LIGO band: GW150914 lasts only a few cycles, while LVT151012 and GW151226 have lower amplitudes but last longer.
The analysis presented in this paper includes the total set of O1 data from September 12, 2015 to January 19, 2016, which contain a total coincident analysis time of 51.5 days accumulated when both detectors were operating in their normal state. As discussed in Ref. [13] with regard to the first 16 days of O1 data, the output data of both detectors typically contain nonstationary and non-Gaussian features, in the form of transient noise artifacts of varying durations. Longer duration artifacts, such as nonstationary behavior in the interferometer noise, are not very detrimental to CBC searches as they occur on a time scale that is much longer than any CBC waveform. However, shorter duration artifacts can pollute the noise background distribution of CBC searches. Many of these artifacts have distinct signatures [49] visible in the auxiliary data channels from the large number of sensors used to monitor instrumental or environmental disturbances at each observatory site [50]. When a significant noise source is identified, contaminated data are removed from the analysis data set. After applying this data quality process, detailed in Ref. [51], the remaining coincident analysis time in O1 is 48.6 days. The analyses search only stretches of data longer than a minimum duration, to ensure that the detectors are operating stably. The choice is different in the two analyses and reduces the available data to 46.1 days for the PyCBC analysis and 48.3 days for the GstLAL analysis.

III. SEARCH RESULTS
Two different, largely independent, analyses have been implemented to search for stellar-mass BBH signals in the data of O1: PyCBC [2][3][4] and GstLAL [5][6][7]. Both these analyses employ matched filtering [52][53][54][55][56][57][58][59][60] with waveforms given by models based on general relativity [8,9] to search for gravitational waves from binary neutron stars, BBHs, and neutron star-black hole binaries. In this paper, we focus on the results of the matched-filter search for BBHs. , and the recovered signals of GW150914, GW151226, and LVT151012 plotted so that the relative amplitudes can be related to the SNR of the signal (as described in the text). Right panel: Time evolution of the recovered signals from when they enter the detectors' sensitive band at 30 Hz. Both figures show the 90% credible regions of the LIGO Hanford signal reconstructions from a coherent Bayesian analysis using a nonprecessing spin waveform model [48].
Results of the searches for binary neutron stars and neutron star-black hole binaries are reported in Ref. [43]. These matched-filter searches are complemented by generic transient searches which are sensitive to BBH mergers with total mass of about 30M ⊙ or greater [61].
A bank of template waveforms is used to cover the parameter space to be searched [54,[62][63][64][65]. The gravitational waveforms depend upon the masses m 1;2 (using the convention that m 1 ≥ m 2 ) and angular momenta S 1;2 of the binary components. We characterize the angular momentum in terms of the dimensionless spin magnitude and the component aligned with the direction of the orbital angular momentum, L, of the binary [66,67], We restrict this template bank to circular binaries for which the spin of the systems is aligned (or antialigned) with the orbital angular momentum of the binary. The resulting templates can nonetheless recover systems with misaligned spins, which will exhibit orbital precession, with good sensitivity over much of the parameter space, particularly for near equal-mass binaries [44]. At leading order, the phase evolution during inspiral depends on the chirp mass of the system [68][69][70] At subsequent orders in the PN expansion, the phase evolution depends predominantly upon the mass ratio [19] q ¼ and the effective spin parameter [71][72][73][74][75][76] where M ¼ m 1 þ m 2 is the binary's total mass. The minimum black hole mass is taken to be 2M ⊙ , consistent with the largest known masses of neutron stars [77]. There is no known maximum black hole mass [78]; however, we limit this template bank to binaries with a total mass less than M ≤ 100M ⊙ . For higher-mass binaries, the Advanced LIGO detectors are sensitive to only the final few cycles of inspiral plus merger, making the analysis more susceptible to noise transients. The results of searches for more massive BBH mergers will be reported in future publications. In principle, black hole spins can lie anywhere in the range from −1 (maximal and antialigned) to þ1 (maximal and aligned). We limit the spin magnitude to less than 0.9895, which is the region over which the EOBNR waveform model [8,9] used in the search is able to generate valid template waveforms [8]. The bank of templates used for the analysis is shown in Fig. 2. Both analyses separately correlate the data from each detector with template waveforms that model the expected signal. The analyses identify candidate events that are detected at both the Hanford and Livingston observatories consistent with the 10-ms intersite propagation time. Additional signal consistency tests are performed to mitigate the effects of nonstationary transients in the data. Events are assigned a detection-statistic value that ranks their likelihood of being a gravitational-wave signal. For PyCBC, the observed SNR in each detector is reweighted using the signal consistency tests. These reweighted SNRs are added in quadrature to obtain the detection statisticρ c . For GstLAL, ln L is the log-likelihood ratio for the signal and noise models. The detection statistics are compared to the estimated detector noise background to determine, for each candidate event, the probability that detector noise would give rise to at least one equally significant event. Further details of the analysis methods are available in Appendix A.
The results for the two different analyses are presented in Fig. 3. The figure shows the observed distribution of events, as well as the background distribution used to FIG. 2. The four-dimensional search parameter space covered by the template bank shown projected into the component-mass plane, using the convention m 1 > m 2 . The colors indicate mass regions with different limits on the dimensionless spin parameters χ 1 and χ 2 . Symbols indicate the best matching templates for GW150914, GW151226, and LVT151012. For GW150914 and GW151226, the templates were the same in the PyCBC and GstLAL searches, while for LVT151012 they differed. The parameters of the best matching templates are consistent, up to the discreteness of the template bank, with the detector frame mass ranges provided by detailed parameter estimation in Sec. IV. assess significance. In both analyses, there are three events that lie above the estimated background: GW150914, GW151226, and LVT151012. All three of these are consistent with being BBH merger signals and are discussed in further detail below. The templates producing the highest significance in the two analyses are indicated in Fig. 2, the gravitational waveforms are shown in Fig. 1, and key parameters are summarized in Table I. There were no other significant BBH candidates in the first advanced LIGO observing run. All other observed events are consistent with the noise background for the search. A followup of the coincident eventsρ c ≈ 9 in the PyCBC analysis suggests that they are likely due to noise fluctuations or poor data quality, rather than a population of weaker gravitational-wave signals.
It is clear from Fig. 3 that at high significance, the background distribution is dominated by the presence of GW150914 in the data. Consequently, once an event has been confidently identified as a signal, we remove triggers associated with it from the background in order to get an accurate estimate of the noise background for lower amplitude events. The lower panel of Fig. 3 shows the search results with GW150914 removed from both the foreground and background distributions. In both analyses, GW150914 is the most significant event in the data, and it is more significant than any background event in the data. It is identified with a significance greater than 5σ in both analyses. As GW150914 is so significant, the high significance background is dominated by its presence in the data. Once it has been identified as a signal, we remove it from the background estimation to evaluate the significance of the remaining events. The lower plots show results with GW150914 removed from both the foreground and background, with the PyCBC result on the left and the GstLAL result on the right. In both analyses, GW151226 is identified as the most significant event remaining in the data. GW151226 is more significant than the remaining background in the PyCBC analysis, with a significance of greater than 5σ. In the GstLAL search, GW151226 is measured to have a significance of 4.5σ. The third most significant event in the search, LVT151012, is identified with a significance of 1.7σ and 2.0σ in the two analyses, respectively. The significance obtained for LVT151012 is not greatly affected by including or removing background contributions from GW150914 and GW151226.

A. GW150914
GW150914 was observed on September 14, 2015 at 09∶50:45 UTC with a matched-filter SNR of 23.7 [79]. It is recovered with a reweighted SNR in the PyCBC analysis of ρ c ¼ 22.7 and a log likelihood of 84.7 in the GstLAL analysis. A detailed discussion of GW150914 is given in Refs. [16,39,44], where it was presented as the most significant event in the first 16 days of Advanced LIGO observing. The results presented here differ from the previous ones in two ways: They make use of the full O1 data set, and they use the final instrumental calibration. Thus, while GW150914 remains the most significant in this search, the recovered SNR and significance of the event differ slightly from the previously reported values. In particular, for the PyCBC analysis, the event is recovered with slightly lower SNR than with the preliminary calibration and with a higher value of the χ 2 signal consistency test in the H1 detector. This leads to a reduction of the detection statisticρ c , from 23.6 in Ref. [16] to the current value of 22.7. Additionally, for the PyCBC analysis, a redefinition of the mass bins used to group templates with similar background caused the significance of GW150914 to be evaluated against a different background; for details see Appendix A 1. For the GstLAL analysis of the full O1 data set, a decrease in the background probability for GW150914 increased the log likelihood to 84.7 from the original value of 78.
GW150914 remains the most significant event in both analyses. Furthermore, in both cases, there are no background events with significance equal to or greater than GW150914. Consequently, we can only calculate a limit on the false alarm rate (FAR) for GW150914. Using the timeshift method to estimate background, we limit the FAR of GW150914 to be less than 6.0 × 10 −7 yr −1 . This corresponds to a p-value of 7.5 × 10 −8 , or a significance of 5.3σ. The significance is greater than the 5.1σ derived in Ref. [44] due to a tripling of the analysis time, which allows time shifts to probe smaller false alarm rates.
The GstLAL analysis estimates the p-value assuming that noise triggers are equally likely to occur in any of the templates within a background bin. Under this assumption, the p-value of GW150914 is estimated to be 8.8 × 10 −12 , which is the minimum p-value that can be informed by the data. However, as stated in Ref. [44], breaking that assumption implies that the minimum p-value would be higher. For this reason, we quote the more conservative PyCBC bound on the false alarm rate of GW150914 here and in Ref. [16].

B. GW151226
GW151226 was observed on December 26, 2015 at 03∶38:53 UTC with a combined matched-filter SNR of 13.0. The signal was identified as the second most significant event in both the PyCBC and GstLAL analyses withρ c ¼ 12.8 and ln L ¼ 22.6, respectively.
Signal consistency tests show no sign of transient noise affecting the analyses at this time, and checks of the instrumental data reveal no serious data quality issues at the time of the event. When single interferometer triggers from GW150914 are used in our background estimation methods, the tail of the distribution is dominated by their presence. As GW150914 is confidently identified as a gravitational-wave signal [16], we remove any background events associated with it from the distribution.
The background distribution, under the assumption that GW150914 is a gravitational wave, is shown in the bottom row of Fig. 3. Now, GW151226 is more significant than all background events in the PyCBC analysis. Its significance cannot be measured and, as for GW150914, we limit the FAR to be less than 6.0 × 10 −7 yr −1 . This corresponds to a p-value of 7.5 × 10 −8 , or a significance of 5.3σ. In the GstLAL analysis, the background extends past the observed log likelihood of GW151226, and the event is recovered with a FAR of 1 per 44000 years, which corresponds to a p-value of 3.5 × 10 −6 and a significance of 4.5σ.

C. LVT151012
The third most significant event in O1 is LVT151012 observed on October 12, 2015 at 09∶54:43 UTC. It was observed with a combined matched-filter SNR of 9.7 and detection statistic valuesρ c ¼ 9.7 and ln L ¼ 18.1. The SNR of this event is considerably lower than GW150914 and GW151226 and, even though the signal consistency tests show no signs of noise origin, the search background is such that the FAR of LVT151012 is 1 per 2.7 years and 1 per 5.9 years in the PyCBC and GstLAL analyses, respectively. This equates to p-values of 0.045 and 0.025, or significances of 1.7σ and 2.0σ. These results are consistent with expectations for candidate events with low matched-filter SNR since PyCBC and GstLAL use different ranking statistics and background estimation methods. At the significance of LVT151012, the background has contributions from a large number of triggers in each detector and is no longer dominated by the presence of GW150914 and GW151226 in the data. Consequently, removing them does not have a large effect on the significance. For PyCBC, the estimate of the significance is essentially unaffected by the removal of the events. For GstLAL, inclusion of GW150914 changes the p-value of LVT151012 by a factor of 2, but inclusion of GW151226 has little effect.
The significance of LVT151012 is such that we do not confidently claim this event as a gravitational-wave signal. However, it is more likely to be a gravitational-wave signal than noise based on our estimate for the rate of gravitational-wave signals (see Sec. VI). Detector characterization studies have not identified an instrumental or environmental artifact as causing this candidate event [13]. Parameterestimation results for LVT151012 are presented in the following section and are consistent with our expectations for an astrophysical BBH source. The inferred component masses of LVT151012 lie roughly between the masses of GW150914 and GW151226, as shown in Fig. 4.

IV. SOURCE PROPERTIES
In this section, we present the inferred properties of the sources of GW150914, LVT151012, and GW151226, assuming that the signals each originate from a binary coalescence as described by general relativity. Tests of the consistency of the signal with the predictions of general relativity are presented in Sec. V. Full results for GW150914 have been provided in Refs. [39,40], and key results for LVT151012 have been given in Ref. [44]. Here, we give results based upon an updated calibration of the data. The analyses of all three signals closely mirror the original analysis of GW150914, as detailed in Ref. [39] and described in Appendix B.
The analysis makes use of two waveform models, the double aligned spin waveform model (EOBNR) [8,9] and an effective precessing spin model (IMRPhenom) [36][37][38]. Results from the two waveforms are similar, and the data give us little reason to prefer one model over the other. We therefore average the posterior distributions from two waveforms for our overall results. These are used for the discussion below, except in Sec. IV B, where we also consider measurements of spin alignment from the precessing IMRPhenom waveform.
The results match our expectations for a coherent signal in both detectors and give us no reason to suspect that any of the signals are not of astrophysical origin. All three signals are consistent with originating from BBHs. Key parameters for the three events are included in Table I and plotted in Figs. 4,5, and 6. Detailed results are provided in Table IV in Appendix B.

A. Masses
The binary component masses of all three systems lie within the range expected for stellar-mass black holes. The least massive black hole is the secondary of GW151226, which has a 90% credible lower bound that m source 2 ≥ 5.6M ⊙ . This is above the expected maximum neutron star mass of about 3M ⊙ [80,81] and beyond the mass gap where there is currently a dearth of black holes observed in x-ray binaries [82][83][84]. The range of our inferred component masses overlaps with those for stellarmass black holes measured through x-ray observations but extends beyond the nearly 16M ⊙ maximum of that population [85][86][87].
GW150914 corresponds to the heaviest BBH system we observed, and GW151226 corresponds to the least massive (M source ¼ 21.8 þ5.9 −1.7 M ⊙ ). Higher mass systems merge at a lower gravitational-wave frequency. For lower-mass systems, the gravitational-wave signal is dominated by the inspiral of the binary components, whereas for higher-mass systems, the merger and ringdown parts of the signal are increasingly important. The transition from being inspiral dominated to being merger and ringdown dominated depends upon the sensitivity of the detector network as a function of frequency; GW150914 had SNR approximately equally split between the inspiral and post-inspiral phases [41]. Information about the masses is encoded in different ways in the different parts of the waveform: The inspiral predominantly constrains the chirp mass [70,88,89], and the ringdown is more sensitive to the total mass [90]; hence, the bestmeasured parameters depend upon the mass [91][92][93]. This is illustrated in the posterior probability distributions for the three events in Fig. 4. For the lower-mass GW151226 and LVT151012, the posterior distribution follows curves of constant chirp mass, but for GW150914, the posterior is shaped more by constraints on the total mass [94]. The mass ratio q also differs between the events. We infer that GW150914 came from a near equal-mass system (the 90% credible lower bound of the mass ratio is q ≥ 0.65), but GW151226 and LVT151012 have posterior support for more unequal-mass ratios (q ≥ 0.28 and q ≥ 0.24, respectively). The mass ratio has a large uncertainty, as it is degenerate with the spin of the compact objects [89,95,96]. This degeneracy could be broken if a signal contains a clear imprint of precession [97][98][99][100], but we have yet to observe this signature (see Sec. IV B). Measurement of the mass ratio could inform our understanding of the origin of BBH systems.
Following the inspiral, the BBHs merge to form a final remnant black hole. We estimate the masses of these using fitting formulas calibrated to numerical relativity simulations [36,101]. Each final mass is 0.95-0.98 of the initial total mass of the binary components, as similar fractions of 0.02-0.05 are radiated away as gravitational waves. While predominantly determined by the total mass, the radiated energy also depends upon the mass ratio and component spins; our results are consistent with expectations for moderately spinning black holes [102,103]. The remnant black holes are more massive than any black hole observed to date in an x-ray binary, the least massive being The final black hole masses, as well as their spins, are shown in Fig. 4. The remnant for GW150914 has a mass of M source 1 M ⊙ and is the most massive stellar-mass black hole observed to date. BBH mergers have extremely high gravitational-wave luminosities: The peak values are 3.6 þ0.5 −0.4 × 10 56 erg s −1 , 3.1 þ0.8 −1.8 × 10 56 erg s −1 , and 3.3 þ0.8 −1.6 × 10 56 erg s −1 for GW150914, LVT151012, and GW151226, respectively. These luminosities are calculated using a fit to nonprecessing numerical-relativity simulations [104], and the uncertainty includes the estimated error from this fit. Whereas the energy radiated scales with the total mass, the luminosity is comparable for all three systems. There is some variation from differences in the mass ratios and spins, and uncertainties in these dominate the overall uncertainty. The luminosity is independent of the total mass, as this sets both the characteristic energy scale and characteristic time scale for the system [105].

B. Spins
An isolated black hole has three intrinsic properties: mass, spin, and electric charge [106][107][108][109]. We expect the charge of astrophysical black holes to be negligible [110][111][112]. Both the masses and spins of the black holes leave an imprint on the gravitational-wave signal during a coalescence. The components of the spins parallel to the orbital angular momentum affect the phasing of the binary, whereas orthogonal components lead to orbital precession. The effects of the spins of the binary components are subdominant, and they are more difficult to constrain than the masses.
Only weak constraints can be placed on the spin magnitudes of the binary components: In all cases, the uncertainty spans the majority of the allowed range of [0, 1]. We can better infer the spin of the more massive black hole, as this has a greater impact upon the inspiral. We find that smaller spins are favored, and we place 90% credible bounds on the primary spin a 1 ≤ 0.7 for GW150914 and LVT151012, and a 1 ≤ 0.8 for GW151226.
Observations for all three events are consistent with small values of the effective spin: jχ eff j ≤ 0.17, 0.28, and 0.35 at 90% probability for GW150914, LVT151012, and GW151226, respectively. This result indicates that large parallel spins aligned or antialigned with the orbital angular momentum are disfavored. Only in the case of GW151226 do we infer a nonzero value of χ eff , and from this, we infer that at least one of the components has a spin of ≥ 0.2 at the 99% credible level.
Misalignment of the component spins with respect to the orbital angular momentum leads to precession [113]. As a first approximation, the amount of precession may be quantified through a single effective precession spin parameter χ p [114]. The inferred distributions for χ p are roughly consistent with our prior expectations after incorporating the measured constraints on χ eff . The absence of clear information about precession could be because there is intrinsically little precession since the binary is orientated nearly face-on or face-off (see Sec. IV C), which minimizes the visible effect of precession, or because of a combination of these effects. Our aligned-spin search has reduced sensitivity to highly precessing systems [44], which makes it more probable that we detect nonprecessing systems. We have yet to find strong evidence for precession but cannot exclude the possibility of misaligned spins.
The posterior probabilities for the spin magnitudes and tilts relative to the orbital angular momentum using the precessing IMRPhenom model are shown in Fig 5. In all cases, larger spin magnitudes are allowed when the spin is misaligned: The additional in-plane spin does not change χ eff . For LVT151012 and GW151226, there is significantly greater uncertainty for the spin of the secondary than for the primary. This is because the mass ratios for these systems can be more extreme: For equal mass binaries, both spins play an equal role in the dynamics, but, as the mass ratio tends towards zero, the effects of the secondary spin become negligible.
All three events have final black holes with spins of about 0.7, as expected for mergers of similar-mass black holes [115,116]. The final spin is dominated by the orbital angular momentum of the binary at merger. Consequently, it is more precisely constrained than the component spins and is broadly similar across the three events. The masses and spins of the final black holes are plotted in Fig. 4.
The spin of the final black hole, like its mass, is calculated using fitting formulas calibrated against numerical relativity simulations. In Ref. [39], we used a formula that only included contributions from the aligned components of spins [101]; we now use an extension of this formula, which also incorporates the effects of in-plane spins [117]. The change has a small impact on the final spin of GW150914 (changing from 0.67 þ0.05 −0.06 to 0.68 þ0.05 −0.06 ) and a larger effect on GW151226 (changing from 0.72 þ0.05 −0.05 to 0.74 þ0.06 −0.06 ) as its components have more significant spins.

C. Distance, inclination, and sky location
The luminosity distance to the source is inversely proportional to the signal's amplitude. GW150914 and GW151226 have comparable distance estimates [118]. GW151226 originates from a lower-mass system than GW150914; hence, the gravitational-wave signal is intrinsically quieter, and its SNR is lower than GW150914's even though the distances are comparable. LVT151012 is the quietest signal and is inferred to be at a greater distance . In all cases, there is significant fractional uncertainty for the distance. This is predominantly a consequence of the degeneracy between the distance and the binary's inclination, which also impacts the signal amplitude [95,119,120].
The inclination is only weakly constrained; in all cases, there is greatest posterior support for the source being either face-on or face-off (angular momentum pointed parallel or antiparallel to the line of sight). This is the orientation that produces the greatest gravitational-wave amplitude, so it is consistent with the largest distance. The inclination could potentially be better constrained in a precessing system [98,121]. Only for GW150914 is there preference for one of the configurations, with there being greater posterior support for the source being face-off [39].
Sky localization from a gravitational-wave detector network is primarily determined by the measured delay in the signal arriving at the sites, with additional information coming from the signal amplitude and phase [122][123][124]. For a two-detector network, the sky localization forms a characteristic broken annulus [125][126][127][128]. Adding additional detectors to the network would improve localization abilities [129][130][131][132]. The sky localizations of the three events are shown in Fig. 6, including both celestial coordinates (indicating the origin of the signal) and geographic coordinates (illustrating localization with respect to the two detectors). The arrival time at Hanford relative to Livingston was Δt HL ¼ 7.0 þ0.2 −0.2 ms for GW150914, Δt HL ¼ −0.6 þ0. 6 −0.6 ms for LVT151012, and Δt HL ¼ 1.1 þ0.3 −0.3 ms for GW151226. Both LVT151012 and GW151226 are nearly overhead of the two detectors, which is where we are most sensitive and hence expect to make most detections [53,133].
The 90% credible region for sky localization is 230 deg 2 for GW150914, 850 deg 2 for GW151226, and 1600 deg 2 for LVT151012. As expected, the sky area is larger for quieter events. The sky area is expected to scale inversely with the square of the SNR [128,134], and we see that this trend is followed.

V. TESTS OF GENERAL RELATIVITY
GW150914 provided us with the first empirical access to the genuinely strong field dynamics of gravity. With the frequency of the waveform peak amplitude well aligned with the best instrument sensitivity, the late inspiral and merger-ringdown regime could be studied in considerable detail, as described in Ref. [41]. This allows for checks of the consistency between masses and spins estimated from different portions of the waveform [135], as well as parametrized tests of the waveform as a whole [136][137][138][139]. Even though not much of the early inspiral was in the detectors' sensitive band, interesting bounds were placed on departures from general relativity in the PN coefficients up to 3.5PN. Since the source of GW151226 merged at about 450 Hz, the signal provides the opportunity to probe the PN inspiral with many more waveform cycles, albeit at relatively low SNR. Especially in this regime, GW151226 allows us to further tighten our bounds on violations of general relativity.
As in Ref. [41], to analyze GW151226, we start from the IMRPhenom waveform model of Refs. [36][37][38], which is capable of describing inspiral, merger, and ringdown, and partly accounts for spin precession. The phase of this waveform is characterized by coefficients fp i g, which include PN coefficients, as well as phenomenological coefficients describing merger and ringdown. The latter were obtained by calibrating against numerical waveforms and tend to multiply specific powers of f. They characterize the gravitational-wave amplitude and phase in different stages of the coalescence process. We allow for possible departures from general relativity, parametrized by a set of testing coefficients δp i , which take the form of fractional deviations in the p i [140,141]. Thus, we replace p i by ð1 þ δp i Þp i and let one or more of the δp i vary freely, in addition to the source parameters that also appear in pure general relativity waveforms, using the general relativistic expressions for p i in terms of masses and spins. Our testing coefficients are those in Table I of Ref. [41]. For convenience, we list them again: (i) fδφ 0 ; …; δφ 7 g [142] and fδφ 5l ; δφ 6l g for the PN coefficients (where the last two multiply a term of the form f γ log f), (ii) intermediateregime parameters fδβ 2 ; δβ 3 g, and (iii) merger-ringdown parameters fδα 2 ; δα 3 ; δα 4 g [143].
In our analyses, we let each one of the δp i in turn vary freely, while all others are fixed to their general relativity values, δp j ¼ 0 for j ≠ i. These tests model general relativity violations that would occur predominantly at a particular PN order (or in the case of the intermediate and merger-ringdown parameters, a specific power of frequency in the relevant regime), although together they can capture deviations that are measurably present at more than one order.
In Ref. [41], for completeness, we have also shown results from analyses where the parameters in each of the regimes (i)-(iii) are allowed to vary simultaneously, but these tests return wide and uninformative posteriors. By contrast, analyses where the testing parameters δp i are varied one at a time have much smaller statistical uncertainties. Moreover, as demonstrated in Ref. [144], checking for a deviation from zero in a single testing parameter is an efficient way to uncover GR violations that occur at multiple PN orders, and one can even find violations at powers of frequency that are distinct from the one that the testing parameter is associated with [145,146]. Hence, such analyses are well suited to search for generic departures from GR, though it should be stressed that if a violation is present, the measured values of the δp i will not necessarily reflect the predicted values of the correct alternative theory. To reliably constrain theoryspecific quantities such as coupling constants or extra charges, one should directly apply full inspiral-mergerringdown waveform models from specific modified gravity theories [147], but in most cases, these are not yet available. However, in the present work, the focus is on modelindependent tests of general relativity itself.
Given the observation of more than one BBH merger, posterior distributions for the δp i can be combined to yield stronger constraints. In Fig. 7, we show the posteriors from GW150914, generated with final instrumental calibration, and GW151226 by themselves, as well as joint posteriors from the two events together. We do not present similar results for the candidate LVT151012 since it is not as confident a detection as the others; furthermore, its smaller detection SNR means that its contribution to the overall posteriors is insignificant.
For GW150914, the testing parameters for the PN coefficients, δφ i and δφ il , showed moderately significant (2σ-2.5σ) deviations from their general relativity values of zero [41]. By contrast, the posteriors of GW151226 tend to be centered on the general relativity value. As a result, the offsets of the combined posteriors are smaller. Moreover, the joint posteriors are considerably tighter, with a 1σ spread as small as 0.07 for deviations in the 1.5PN parameter φ 3 , which encapsulates the leading-order effects of the dynamical self-interaction of spacetime geometry (the "tail" effect) [148][149][150][151], as well as spin-orbit interaction [67,152,153].
In Fig. 8, we show the 90% credible upper bounds on the magnitude of the fractional deviations in PN coefficients, jδφ i j, which are affected by both the offsets and widths of the posterior density functions for the δφ i . We show bounds for GW150914 and GW151226 individually, as well as the joint upper bounds resulting from the combined posterior density functions of the two events. Not surprisingly, the quality of the joint bounds is mainly due to GW151226 because of the larger number of inspiral cycles in the detectors' sensitive frequency band. Note how at high PN order, the combined bounds are slightly looser than the ones from GW151226 alone; this is because of the large offsets in the posteriors from GW150914.
Next, we consider the intermediate-regime coefficients δβ i , which pertain to the transition between inspiral and merger-ringdown. For GW151226, this stage is well inside the sensitive part of the detectors' frequency band. Returning to Fig. 7, we see that the measurements for GW151226 are of comparable quality to GW150914, and the combined posteriors improve on the ones from either detection by itself. Last, we look at the merger-ringdown parameters δα i . For GW150914, this regime corresponded to frequencies of f ∈ ½130; 300 Hz, while for GW151226, it occurred at f ≳ 400 Hz. As expected, the posteriors from GW151226 are not very informative for these parameters, and the combined posteriors are essentially determined by those of GW150914.
In summary, GW151226 makes its most important contribution to the combined posteriors in the PN inspiral regime, where both offsets and statistical uncertainties have significantly decreased over the ones from GW150914, in some cases almost to the 10% level.
An inspiral-merger-ringdown consistency test as performed on GW150914 in Ref. [41] is not meaningful for GW151226 since very little of the signal is observed in the post-merger phase. Likewise, the SNR of GW151226 is too low to allow for an analysis of residuals after subtraction of the most probable waveform. In Ref. [41], GW150914 was used to place a lower bound on the graviton Compton wavelength of 10 13 km GW151226 gives a somewhat weaker bound because of its lower SNR, so combining information from the two signals does not significantly improve on this; an updated bound must await further observations. Finally, BBH observations can be used to test the consistency of the signal with the two polarizations of gravitational waves predicted by general relativity [154]. However, as with GW150914, we are unable to test the polarization content of GW151226 with the two, nearly aligned aLIGO detectors. Future observations, with an expanded network, will allow us to look for evidence of additional polarization content arising from deviations from general relativity.

VI. BINARY BLACK HOLE MERGER RATES
The observations reported here enable us to constrain the rate of BBH coalescences in the local Universe more precisely than was achieved in Ref. [42] because of the longer duration of data containing a larger number of detected signals.
FIG. 8. The 90% credible upper bounds on deviations in the PN coefficients, from GW150914 and GW151226. Also shown are joint upper bounds from the two detections; the main contributor is GW151226, which had many more inspiral cycles in band than GW150914. At 1PN order and higher, the joint bounds are slightly looser than the ones from GW151226 alone; this is due to the large offsets in the posteriors for GW150914.
To do so, we consider two classes of triggers: those whose origin is astrophysical and those whose origin is terrestrial. Terrestrial triggers are the result of either instrumental or environmental effects in the detector, and their distribution is calculated from the search background estimated by the analyses (as shown in Fig. 3). The distribution of astrophysical events is determined by performing large-scale simulations of signals drawn from astrophysical populations and added to the data set. We then use our observations to fit for the number of triggers of terrestrial and astrophysical origin, as discussed in detail in Appendix C. The details of the astrophysical population have a minimal impact on the fit, as in all cases we assume a population distributed uniformly in comoving volume. Figure 9 shows the inferred distributions of signal and noise triggers, as well as the combined distribution. The observations are in good agreement with the model. GW150914 stands somewhat above the inferred distribution, as it is an unusually significant event-only 6% of the astrophysical population of sources appearing in our search with a false rate of less than one per century will be more significant than GW150914.
It is clear from the figure that three triggers are more likely to be signal (i.e., astrophysical) than noise (terrestrial). We evaluate this probability and find that, for GW150914 and GW151226, the probability of astrophysical origin is unity to within one part in 10 6 . Meanwhile, for LVT151012, it is calculated to be 0.87 and 0.86, for the PyCBC and GstLAL analyses, respectively. For all of the remaining events, the probability of astrophysical origin is less than 15%.
Given uncertainty in the formation channels of the various BBH events, we calculate the inferred rates using a variety of source population parametrizations. For a given population, the rate is calculated as R ¼ Λ=hVTi, where Λ is the number of triggers of astrophysical origin and hVTi is the population-averaged sensitive space-time volume of the search. We use two canonical distributions for BBH masses: (i) a distribution uniform (flat) over the logarithm of component masses, pðm 1 ; m 2 Þ ∝ m 1 −1 m 2 −1 and (ii) assuming a power-law distribution in the primary mass, pðm 1 Þ ∝ m −2.35

1
, with a uniform distribution on the second mass. We require 5M ⊙ ≤ m 2 ≤ m 1 and m 1 þ m 2 ≤ 100M ⊙ . The first distribution probably overestimates the fraction of high-mass black holes and therefore overestimates hVTi, resulting in an underestimate of the true rate, while the second probably overestimates the fraction of low-mass black holes and therefore underestimating hVTi and overestimating the true rate. The inferred rates for these two populations are shown in Table II, and the rate distributions are plotted in Fig. 11.
In addition, we calculate rates based upon the inferred properties of the three significant events observed in the data: GW150914, GW151226, and LVT151012, as discussed in Appendix C. Since these classes are distinct, the total event rate is the sum of the individual rates: R ≡ R GW150914 þ R LVT151012 þ R GW151226 . Note that the total rate estimate is dominated by GW151226, as it is the least massive of the three likely signals and is therefore observable over the smallest space-time volume. The  results for these population assumptions are also shown in Table II and in Fig. 10. The inferred overall rate is shown in Fig. 11. As expected, the population-based rate estimates bracket the one obtained by using the masses of the observed black hole binaries.
The inferred rates of BBH mergers are consistent with the results obtained in Refs. [42,155], following the observation of GW150914. The median values of the rates have decreased by approximately a factor of 2, as we now have three likely signals (rather than two) in 3 times as much data. Furthermore, because of the observation of an additional highly significant signal GW151226, the uncertainty in rates has reduced. In particular, the 90% range of allowed rates has been updated to 9-240 Gpc −3 yr −1 , where the lower limit comes from the flat in log mass population and the upper limit from the power-law population distribution.
With three significant triggers, GW150914, LVT151012, and GW151226, all of astrophysical origin to high probability, we can begin to constrain the mass distribution of coalescing BBHs. Here, we present a simple, parametrized fit to the mass distribution using these triggers; a nonparametric method that can fit general mass distributions will be presented in future work. Our methodology is described more fully in Appendix D.
We assume that the distribution of black hole masses in coalescing binaries follows with M min ≤ m 2 ≤ m 1 and m 1 þ m 2 ≤ 100M ⊙ , and a uniform distribution on the secondary mass between M min ¼ 5M ⊙ and m 1 . With α ¼ 2.35, this mass distribution is the power-law distribution used in our rate estimation. Our choice of M min is driven by a desire to incorporate nearly all the posterior samples from GW151226 and because there is some evidence from electromagnetic observations for a minimum BH mass near 5M ⊙ [82,156] (but see Ref. [84]). We use a hierarchical analysis [156][157][158][159] to infer α from the properties of the three significant events-GW150914, GW151226, and LVT151012-where all three are treated equally and we properly incorporate parameter-estimation uncertainty on the masses of each system. Our inferred posterior on α is shown in Fig. 12. The value α ¼ 2.35, corresponding to the power-law mass distribution used above to infer rates, lies near the peak of the posterior, and the median and broad 90% credible interval is It is not surprising that our fit peaks near α ∼ 2.5 because the observed sample is consistent with a flat distribution and the sensitive space-time volume scales roughly as M 15=6 . The mass distribution of merging black hole binaries cannot be constrained tightly with such a small number of observations. This power-law fit is sensitive to a number of arbitrary assumptions, including a flat distribution in the mass ratio and a redshift-independent merger rate and mass distribution. Most critically, the fit is sensitive to the choice of the lower-mass cutoff M min : Larger values of M min lead FIG. 10. The posterior density on the rate of GW150914-like BBH, LVT151012-like BBH, and GW151226-like BBH mergers. The event-based rate is the sum of these. The median and 90% credible levels are given in Table II. FIG. 11. The posterior density on the rate of BBH mergers. The curves represent the posterior assuming that BBH masses are distributed flat in logðm 1 Þ − logðm 2 Þ (Flat), match the properties of the observed events (Event based), or are distributed as a power law in m 1 (Power law). The posterior median rates and symmetric 90% symmetric credible intervals are given in Table II. to a preference for steeper power laws with indices different by a few.
GW151226 differs from GW150914 primarily in the significantly lower inferred companion masses: These masses are similar to the black hole masses measured dynamically in x-ray binaries (for reviews, see Refs. [82,156]). If LVT151012 is of astrophysical origin, its inferred companion masses m 1 ¼ 23 þ18 −6 M ⊙ and m 2 ¼ 13 þ4 −5 M ⊙ fall between those of GW150914 and GW151226. This result indicates that merging BBHs exist in a broad mass range.
GW151226 and LVT151012 could have formed from lower-mass progenitor stars than GW150914 and/or in higher-metallicity environments in which progenitors lose a greater fraction of their mass to winds. Black holes with such masses can be formed at solar metallicity; see, e.g., Ref. [187]. The low masses of GW151226 are probably inconsistent with the chemically homogeneous evolution scenario, under which higher masses are thought to be required [176,177]. However, the masses are still consistent with both classical isolated binary evolution and dynamical formation.
The broad power-law index range α ¼ 2.5 þ1.5 −1.6 inferred from the fit to the merging binary black hole mass distribution attempted in Sec. VI demonstrates the statistical uncertainty associated with extrapolating a distribution from just three events. There are additional systematic uncertainties associated with the power-law model. In particular, while population-synthesis models of binary evolution can be consistent with power-law mass distributions over a range of masses, as in Figs. 8 and 9 of Ref. [188], the power law is likely to be broken over the very broad range between M min ¼ 5 M ⊙ and a total mass of 100 M ⊙ . Other formation models may not be consistent with power-law distributions altogether (see, e.g., Ref. [183]). Similar methods have been employed to fit the population of black holes with dynamical mass measurements in x-ray binaries: Reference [156] obtained, for a power-law model, M min ∼ 5 and power-law slopes in the range 1.8 ≲ α ≲ 5.0 without accounting for possible selection effects.
Isolated binary evolution is thought to prefer comparable masses, with mass ratios q < 0.5 unlikely for the classical scenario [189] and implausible for chemically homogeneous evolution [181]. The dynamical formation channel also prefers comparable masses but allows for more extreme mass ratios; observations of merging binary black holes with extreme mass ratios could therefore point to their dynamical origin. However, the mass ratios of GW151226, q ≥ 0.28, and LVT151012, q ≥ 0.24, are not well determined, and q ¼ 1 cannot be ruled out for either event. Similarly, spin measurements, which point to a moderate degree of net spin alignment with the orbital angular momentum for GW151226, χ eff ¼ 0.21 þ0.20 −0.10 , cannot be used to distinguish formation channels. On the other hand, a zero effective spin is ruled out for GW151226; the data indicate that at least one of the merging black holes must have been spinning with a > 0.2 at the 99% credible level.
The inferred GW151226 merger luminosity distance of D L ¼ 440 þ180 −190 Mpc, corresponding to a merger redshift of z ¼ 0.09 þ0.03 −0.04 , is similar to that of GW150914; in contrast, LVT151012 merged about a factor of 2 further away, at D L ¼ 1000 þ500 −500 Mpc, or z ¼ 0.20 þ0.09 −0.09 . Both are consistent with either a relatively recent formation followed by a prompt merger or formation in the early Universe with a significant time delay between formation and merger.
The BBH merger rate inferred from the full analysis of all O1 triggers, R ¼ 9-240 Gpc −3 yr −1 , is consistent with the rate inferred from the first 16 days of the O1 run [42]. The full O1 merger rate can be used to update the estimate of the energy density Ω GW in the stochastic gravitationalwave background from unresolvable BBH mergers, improving on early results in Ref. [190]. Using the FIG. 12. The posterior distribution for α in Eq. (7) using the inferred masses for our three most significant triggers, GW150914, LVT151012, and GW151226. The vertical line indicates the value of α ¼ 2.35 that corresponds to the powerlaw mass distribution used to infer the rate of BBH coalescence. This value is fully consistent with the posterior, which allows a broad range of possible values with a median and 90% credible interval of α ¼ 2.5 þ1.5 −1.6 . event-based, log-flat, and power-law mass distributions presented in Sec. VI and the corresponding combined rates in Table III, and employing the other "Fiducial" model assumptions from Ref. [190], we obtain 90% credible intervals on Ω GW . The three models agree at frequencies below 100 Hz, where Ω GW ðfÞ ∼ f 2=3 and which contain more than 99% of the signal-to-noise ratio for stochastic backgrounds, with Ω GW ðf ¼ 25 HzÞ ∼ 1.2 þ1.9 −0.9 × 10 −9 . These predictions do not significantly change the median value of Ω GW from Ref. [190] while slightly decreasing the range; we still conclude that this background is potentially measurable by the Advanced LIGO/Virgo detectors operating at their projected design sensitivity.
Despite the uncertainty in the merger rate, its lower limit can be used to rule out some corners of the parameter space if a single formation channel is assumed for all BBHs. For example, if all merging BBHs arise from dynamical formation in globular clusters, then the lower limit on the merger rate disfavors low-mass clusters [165]. On the other hand, if all merging BBHs arise from isolated binaries evolving via the common-envelope phase, the lower limit on the merger rate disfavors a combination of very-low common envelope binding energy with a high efficiency of common envelope ejection [189] (high values of α × λ, as defined in Refs. [192][193][194]), or very high black hole natal kicks of several hundred km/s [195]. However, since population synthesis studies have typically varied one parameter at a time, individual parameter values cannot be ruled out until the full parameter space is explored (see, e.g., Ref. [196]). Moreover, the parametrizations used in existing models may not even capture the full physical uncertainties (see, e.g., Refs. [197,198]).
It is likely, however, that multiple formation channels are in operation simultaneously, and GW150914, LVT151012, and GW151226 could have been formed through different channels or in different environments. A lower limit on the merger rate cannot be used to rule out evolutionary parameters if multiple channels contribute. Future observations will be required to test whether binaries can be classified into distinct clusters arising from different formation channels [199] or to compare the population to specific evolutionary models [200][201][202][203]. Such observations will make it possible to further probe the underlying mass distribution of merging BBHs and the dependence of the merger rate on redshift. Meanwhile, spaceborne detectors such as eLISA could observe heavy BBHs several years before merger; multispectrum observations with ground-based and space-borne observatories would aid in measuring binary parameters, including location, and determining the formation channel by measuring the eccentricity at lower frequencies [204][205][206].
We can use the inferred rates to estimate the number of BBH mergers expected in future observing runs. We make use of the future observing plans laid out in Ref. [132] to predict the expected rate of signals in the second and third advanced LIGO and Virgo observing runs. To do so, we restrict our attention to those signals which will be observed with a false alarm rate smaller than 1=100 yr. In the simulations used to estimate sensitive space-time volumes, 61% of the events above the low threshold used in the PyCBC rates calculation are found with a search false alarm rate lower than one per century. The expected number of observed events will then scale linearly with the sensitive space-time volume hVTi of a future search. The improvement in sensitivity in future runs will vary across the frequency band of the detectors and will therefore have a different impact for binaries of different mass. For concreteness, we use a fiducial BBH system with total mass 60M ⊙ and mass ratio q ¼ 1 [160], to estimate a range of sensitive space-time volumes for future observing runs [207]. The second observing run (O2) is anticipated to TABLE III. The standard deviations used for the (zero-mean) Gaussian priors on calibration uncertainty for each of the three events. The calibration of each of the two detectors has been independently assessed [47]. These priors set the expected variation for the frequency-dependent spline model used to incorporate the effects of calibration uncertainty [191]. begin in late 2016 and last six months, and the third run (O3) is to begin in 2017 and last nine months. We show the predictions for the probability of obtaining N or more highsignificance events as a function of hVTi (in units of the space-time volume surveyed during O1) in Fig. 13. Current projections for O2 suggest that the sensitivity will be consistent with the lower end of the band indicated in Fig. 13.

VIII. CONCLUSION
During its first observing run, Advanced LIGO has observed gravitational waves from the coalescence of two stellar-mass BBHs, GW150914 and GW151226, with a third candidate LVT151012 also likely to be a BBH system. Our modeled binary coalescence search detects both GW150914 and GW151226 with a significance of greater than 5.3σ, while LVT151012 is found with a significance of 1.7σ. The component masses of these systems span a range from the heaviest black hole in GW150914 with a mass of 36.2 þ5.2 −3.8 M ⊙ , to 7.5 þ2.3 −2.3 M ⊙ , the lightest black hole of GW151226. The spins of the individual coalescing black holes are weakly constrained, but we can rule out two nonspinning components for GW151226 at the 99% credible level. All our observations are consistent with the predictions of general relativity, and the final black holes formed after merger are all predicted to have high spin values with masses that are larger than any black hole measured in x-ray binaries. The inferred rate of BBH mergers based on our observations is 9-240 Gpc −3 yr −1 , which gives confidence that future observing runs will observe many more BBHs. resources.

APPENDIX A: SEARCH DESCRIPTION
In this appendix, we give further details of the two analyses, PyCBC and GstLAL, used in the search. Both analyses separately correlate the data from each detector with template waveforms that model the expected signal. The analyses identify candidate events that are detected at both the Hanford and Livingston observatories, consistent with the 10-ms intersite propagation time. Additional signal consistency tests are performed to mitigate the effects of nonstationary transients in the data. Events are assigned a detection-statistic value that ranks their likelihood of being a gravitational-wave signal. This detection statistic is compared to the estimated detector noise background to determine, for each candidate event, the probability that detector noise would give rise to at least one equally significant event.
The choice of parameters for the templates depends on the shape of the power spectrum of the detector noise. The average noise power spectral density of the LIGO detectors was measured over the period September 12 to September 26, 2015. The harmonic mean of these noise spectra from the two detectors was used to place a single template bank that was employed for the duration of the search [3].
The matched-filter SNR ρ for each template waveform and each detector's data as a function of time is calculated according to [11,208]  h c and h s are the normalized orthogonal sine and cosine parts of the template, andãðfÞ is used to denote the Fourier transform of the time domain quantity aðtÞ. Here, S n ðfÞ denotes the one-sided average power spectral density of the detector noise. The waveform components h c and h s are normalized such that the expected value of hsjh s;c i 2 ðtÞ in stationary, Gaussian noise is unity [95]. The analyses identify times when the matched-filter SNR achieves a local maximum and store each of these as a trigger. The analyses search only stretches of data longer than a minimum duration, to ensure that the detectors are operating stably. The choice is different in the two analyses and reduces the available data of 48.6 days to 46.1 days for the PyCBC analysis and 48.3 days for the GstLAL analysis.
To suppress large SNR values caused by non-Gaussian detector noise, the analyses perform additional tests to quantify the agreement between the data and the template. These tests are different in the two analyses and are discussed in their respective subsections below. Both analyses enforce coincidence between detectors by selecting trigger pairs that occur within a 15-ms window and come from the same template. The 15-ms window is determined by the 10-ms intersite propagation time plus 5 ms for uncertainty in accurately determining the measured arrival time of weak signals. A detection statistic for each coincident event is derived as a function of the SNR observed in each detector, the value of the signal consistency tests, and details of the template.
The significance of a candidate event is determined by comparing it to the search background. From this, we are able to determine the rate at which detector noise produces events with a detection-statistic value equal to or higher than the candidate event (the FAR). Estimating this background is challenging for two reasons: First, the detector noise is nonstationary and non-Gaussian; therefore, its properties must be empirically determined. Second, it is not possible to shield the detector from gravitational waves to directly measure a signal-free background. The specific procedure used to estimate the background is different for the two analyses, as described in detail below.
The results of the independent analyses are two separate lists of candidate events, with each candidate event assigned a p-value and FAR. Candidate events with low FARs are identified as possible gravitational-wave signals for further investigation.

PyCBC analysis
The PyCBC analysis is described in detail in Refs. [2][3][4], and the configuration used to analyze the first 16 days of O1 data, containing GW150914, is described in Ref. [44]. Following the observation of GW150914, some improvements were made to the analysis, as we better understood the Advanced LIGO data. All changes were tested and tuned only on background data, prior to being incorporated into the analysis. These changes do not affect the significance bound of GW150914. Consequently, we chose to present the full results, on the final calibrated data, using the improved analysis. Here, we provide a brief overview of the analysis, including details of changes made following the discovery of GW150914.
In the PyCBC analysis, a trigger is stored when the maximum of the SNR time series is above the threshold of 5.5 (chosen as a compromise between a manageable trigger rate and assurance that no real event will be missed), with a maximum of one trigger stored in a 1-s window (reduced from 4 s in the previous analysis). A χ 2 statistic is computed to distinguish between astrophysical signals and noise transients. This result tests whether the signal power in a number of nonoverlapping frequency bands is consistent with that expected from the waveform template [14]. The χ 2 test is written explicitly as where p denotes the number of frequency bandsconstructed such that the expected signal power in each band is equal-and ρ i is the matched-filter SNR in the ith frequency band. For data containing only Gaussian noise, or Gaussian noise and a signal exactly matching the template waveform, the expected value of this statistic will be 1. For data containing non-Gaussian artefacts, or a signal not matching well with the template waveform, this value will be elevated. Each trigger is then ranked according to a combination of the SNR and the χ 2 test, namely, The number of frequency bands p used to compute the χ 2 signal-based veto [14] was optimized using data from the first month of O1. An improved background rejection was found when adopting the following, template-dependent expression for the number of χ 2 bands, where f peak is the frequency corresponding to the maximum amplitude of the template waveform using the models described in Ref. [8], and p is rounded to the nearest integer. This choice was adopted for the full O1 analysis presented here, where all waveforms have peak frequencies greater than 60 Hz. Loud and short instrumental transients are identified and excised from the data, as part of the data conditioning prior to SNR computation. In this analysis, we compute a whitened time series of the strain data and compare the magnitude of each sample against a threshold value of 100. Samples above threshold and within a time window of AE0.5 s are clustered together, and a gating window is placed at the time of the loudest sample in the cluster [209]. The threshold value of 100 is chosen to be much larger than the typical value of the magnitude in Gaussian noise and also larger than the value expected from any gravitationalwave signal from binaries at astrophysical distances and with intrinsic parameters within our search space.
Coincident triggers are formed when a trigger exists in both observatories, associated with the same template waveform and with arrival times within 15 ms. Each coincidence is ranked with a network statisticρ c , defined as the quadrature sum of theρ in each observatory. The rate of background events, as a function of network statistic, is estimated from the data themselves by repeating the analysis after artificially time-shifting the triggers from one detector relative to the other. Time shifts in multiples of 100 ms are performed, leading to a total of T b ¼ 5.0 × 10 6 years of background time analyzed.
The distribution of background noise events overρ c can vary strongly as a function of the template waveform; to account for this variation, the parameter space is divided into a number of regions which are treated as independent searches [44]. Each coincident trigger is assigned a FAR based on the background distribution in the region containing the coincidence and incorporating a trial factor equal to the number of regions. Studies of the background distribution as a function of the template parameters, and a reduced rate of noise events in O1 data, compared to the engineering run data previously used in tuning the search configuration [44], motivated a redefinition of the regions used to divide the search space. In the current analysis, we split the parameter space into three regions, defined by (i) M < 1.74M ⊙ , (ii) M ≥ 1.74M ⊙ and f peak ≥ 100 Hz, and (iii) M ≥ 1.74M ⊙ and f peak < 100 Hz. In the GW150914 analysis, the boundary between regions (ii) and (iii) was set at 220 Hz. By reducing this frequency, we significantly reduce the number of templates assigned to region (iii), which is dominated by short templates that are most affected by noise transients. The frequency at peak amplitude of the best-matching template for GW150914 is f peak ¼ 144 Hz. With the tuning used for the original result, this placed it in noise-background class (iii) of the PyCBC analysis [44]. However, with the improved O1 tuning, which changed the boundaries of the noise-background classes, this event is in noise-background class (ii). In Fig. 3, we plot the background only from class (ii), while the quoted significances take into account a trial factor of three because of the three noise-background classes.

GstLAL analysis
The GstLAL [210] analysis method is a low-latency, multidetector matched-filtering search for gravitational waves emitted by the coalescence of compact objects. The analysis exploits time-domain operations [5] that give it a latency of seconds after the acquisition of gravitationalwave data. This allows the GstLAL analysis to run in both low-latency mode to provide rapid identification of signals and in off-line mode on data that have been conditioned with data-quality vetoes [13]. The results presented here are for the off-line mode. No changes were made to the GstLAL analysis relative to the results presented in Ref. [44].
For the off-line analysis, the data sðtÞ are partitioned into chunks, and along with the templates hðtÞ, the data sðtÞ are then whitened in the frequency domain. The analysis splits the template bank into sub-banks containing waveforms that have morphological similarities. The templates are binned in a two-dimensional space by effective spin parameter χ eff and chirp mass M, as these parameters can be used to effectively describe a binary system in which the spins are aligned with the binary's orbital angular momentum. Templates are allowed to overlap in adjacent bins to mitigate boundary effects, although no redundant waveforms are filtered.
An orthonormal basis of filtersĥðtÞ is then constructed using singular value decomposition [5]. This basis is significantly smaller than the number of input waveforms and allows for a significant reduction in the time-domain filtering cost. The set of filtersĥðtÞ in each bin is convolved with the whitened data, producing a time series; the matched-filter SNR time series ρ for each template can then be constructed using linear combinations of the convolution time series. A trigger is stored when the maximum of the SNR time series crosses a predetermined threshold of 4. A maximum of one coincident trigger per template is stored in each second.
A signal consistency test is performed by comparing the SNR time series of data to the SNR time series expected from a real signal using the autocorrelation function of the template at its time of peak amplitude, RðtÞ. A consistency test value ξ 2 ac is determined for each trigger using the SNR time series ρðtÞ, the peak SNR ρ p , and the autocorrelation function RðtÞ in some window of time δt (corresponding to ρ p ) around the trigger: where the factor μ ensures that a well-fit signal has an expectation value of 1 [44]. The window δt is a tunable parameter that has been chosen based on Monte Carlo simulations in real data and finding the value that (on average) best rejects glitches. Triggers that survive consistency checks are assigned a ranking based upon their SNR, ξ 2 ac value, and the instantaneous horizon distance values at each detector, fD H1 ; D L1 g, which encode the detector sensitivity [15,211].
A likelihood ratio is constructed to rank candidate events by the ratio of the probability of observing matched-filter SNR and ξ 2 ac from signals, h, versus obtaining the same parameters from noise, n. The templates have already been grouped into regions that contain high overlap, so it is likely that templates within each group will respond similarly to noise; in fact, the template group itself is used as a parameter in the likelihood ratio to qualitatively establish how different regions of the parameter space are affected by noise. The likelihood ratio can thus be written as where x d ¼ fρ d ; ξ 2 d g are the matched-filter SNR and ξ 2 ac in each detector, θ i corresponds to the template group, and D d is the horizon distance of the given detector at the time of the trigger. The signal distribution in the numerator is calculated using an astrophysical model of signals distributed isotropically in the nearby universe. The denominator is calculated under the assumption that the noise in each detector is independent. It can then be calculated from the distribution of triggers in each template bin observed in each detector. In the case where multiple high-likelihood events are produced at the same time, a clustering process is used to remove events with lower likelihoods within a 4-s window so that only the event with the highest likelihood is retained.
In a typical search, the majority of events found in coincidence correspond to noise and not an actual signal. To accurately distill signals from the data, the p-value at the value of L for each event is ascertained; the p-value describes the probability of observing the event's L or greater in noise alone. The GstLAL method determines the p-value by taking the probability density functions of parameters in Eq. (A7) obtained from triggers that are noiselike in nature [212].

APPENDIX B: PARAMETER-ESTIMATION DESCRIPTION
To extract information from the signal, we perform a coherent Bayesian analysis of the data from the two instruments using LALInference [48,213]. The properties of the source leave imprints on the signal from which we can infer their values [39]. We match the measured strain to model waveforms and use the agreement to define probability distributions for the parameters that describe the signal. A summary of results for the three events is given in Table IV.
The result of our analysis is the posterior probability distribution for parameters describing the source. The posterior is computed from Bayes' theorem [216,217]: It is proportional to the product of the likelihood of the data given the parameters and the prior for the parameters. The likelihood is calculated using a noise-weighted inner product between the data and the model waveform [95]. This depends upon the waveform and the noise spectral density at the time of event, and both could potentially be sources of systematic error. We incorporate the effects of uncertainty in the detectors' calibration using a frequencydependent model [191]. The posterior probability density is mapped out using stochastic sampling algorithms, and our parameter estimates are constructed from the distribution of samples.
The analysis makes use of two inspiral-merger-ringdown waveform models, a reduced-order model of the double aligned spin EOB waveform used for the detection analyses, which we refer to as EOBNR [8,9], and an effective precessing spin model, which we refer to as IMRPhenom [36][37][38]218]. For all events, the results from the EOBNR and IMRPhenom waveforms are similar. An analysis using a fully precessing EOBNR waveform [219], as done in Ref. [40], will be reported in the future; this analysis is currently too computationally expensive for results to be presented now.
To compare how well the different waveform models match the data, we use the Bayes factor B s=n and the deviance information criterion (DIC). The Bayes factor is the ratio of the evidence (the marginalized likelihood) for a coherent signal hypothesis to that for Gaussian noise [215]. A larger Bayes factor indicates that there is more support for the signal model [220]. The DIC is a measure of the goodness of fit of a model, defined as an average log likelihood plus a penalty factor for higher dimensional models [221][222][223]. A smaller value of the DIC indicates a greater expectation that the model would predict data similar to that being analyzed, and hence that it is a better fit. The values for both quantities are similar for all three events. The data do not allow us to conclusively prefer one waveform model over the other; therefore, in the column titled Overall of Table IV, results are constructed by averaging the two, marginalizing over our choice of waveform.
Inaccuracies in the waveform models could be a source of systematic error in the parameter estimates [224][225][226]. However, an alternative analysis of GW150914 using a set of waveforms from numerical-relativity simulations yielded results consistent with those using the EOBNR and IMRPhenom approximants [227]. For our results, we use the difference between results from the two waveform models as a proxy for the theoretical error from waveform modeling, although some known physics such as higher modes and eccentricity are missing from both of these waveform families. For each parameter, we quote systematic errors on the boundaries of the 90% credible intervals; this is the 90% range of a normal distribution estimated from the variance of results from the different models [39]. For parameters with bounded ranges, like the spins or mass ratio, the normal distributions should be truncated, but for simplicity, we still quote the 90% range of the uncut distributions. More sophisticated means of incorporating waveform uncertainty into the analysis, such as Gaussian TABLE IV. Parameters that characterize GW150914, GW151226, and LVT151012. For model parameters, we report the median value with the range of the symmetric 90% credible interval [214]; we also quote selected 90% credible bounds. For the logarithm of the Bayes factor for a signal compared to Gaussian noise, we report the mean and its 90% standard error from four parallel runs with a nested sampling algorithm [215], and for the deviance information criterion, we report the mean and its 90% standard error from a Markov-chain Monte Carlo and a nested sampling run. The source redshift and source-frame masses assume standard cosmology [18]. Results are given for spin-aligned EOBNR and precessing IMRPhenom waveform models. The "Overall" results are computed by averaging the posteriors for the two models. For the overall results, we quote both the 90% credible interval or bound and an estimate for the 90% range of systematic error on this determined from the variance between waveform models. Further explanations of the parameters are given in Ref. [39]. process regression [228], may be used in the future. For all three events, we find that the theoretical uncertainty from waveform modeling is less significant than statistical uncertainty from the finite SNR of the events. The calibration error is modeled using a cubic spline polynomial [39,191], and we marginalize over uncertainty in the calibration. Each analysis assumes a prior for the calibration uncertainty, which is specific for each detector at the time of that signal. Standard deviations of the prior distributions for the amplitude and phase uncertainty are given in Table III. The updated calibration uncertainty is better than the original 10% in amplitude and 10 deg in phase [47] used for the first results.
Aside from the difference in calibration, the analysis of GW150914 follows the specification in Ref. [39]. We analyze 8 s of data centered on the time reported by the detection analyses [44], using the frequency range between 20 Hz and 1024 Hz. The time interval is set by the in-band duration of waveforms in the prior mass range. We assume uninformative prior distributions for the parameters (uniform distributions for the time and phase of coalescence, uniform distribution of sources in volume, isotropic orientations for the binary and the two spins, uniform distribution of spin magnitudes, and uniform distribution of component masses m 1;2 ∈ ½10; 80M ⊙ ). For quantities subject to change because of precession, we quote values at a reference gravitation-wave frequency of f ref ¼ 20 Hz. There are small differences in the source's parameters compared to the runs on the older calibration, but these are well within the total uncertainty; the greatest difference is in the sky area, where the reduced calibration uncertainty improves the localization area by a factor of about 2-3.
There are two differences in the configuration of the analysis of LVT151012 from that for GW150914: The prior on the component masses was set to be uniform over the range m 1;2 ∈ ½5; 80M ⊙ , and the length of data analyzed was T ¼ 22 s. We find that LVT151012 is consistent with a lower-mass source, which necessitates a lower prior bound on the component masses and requires us to analyze a longer stretch since the signal could be in band for longer.
GW151226 is also consistent with being a lower-mass source. However, we can still consider just 8 s of data by confining the component masses such that the chirp mass is M ∈ ½9.5; 10.5M ⊙ and the mass ratio is q ∈ ½1=18; 1. Preliminary analyses found no support outside of these ranges, and the final posteriors lie safely within this region. This choice of segment length limits the computational expense of the analysis.

APPENDIX C: RATES CALCULATION DESCRIPTION
In this appendix, we give further details of how the BBH coalescence rates are estimated. The framework of Ref. [229] considers two classes of triggers (coincident search events): those whose origin is astrophysical and those whose origin is terrestrial. Terrestrial triggers are the result of either instrumental or environmental effects in the detector. In order to calculate the rate of astrophysical triggers, we first seek to determine the probability that any given trigger arises from either class. The two classes of source produce triggers with different densities as a function of the detection statistic used in the analysis, which we denote as x. Triggers appear in a Poisson process with number density where Λ 1 and Λ 0 are the Poisson mean numbers of triggers of astrophysical and terrestrial origin, respectively. Here, Λ 1 is related to the merger rate density through where hVTi is the population-averaged sensitive spacetime volume of the search [42,155], where V c is the comoving volume [230], θ describes the population parameters, sðθÞ is the distribution function for the astrophysical population in question, and 0 ≤ fðz; θÞ ≤ 1 is the selection function giving the probability of detecting a source with parameters θ at redshift z.
Because the distribution of astrophysical triggers is independent of source parameters without parameter-estimation follow-up, we must assume an astrophysical distribution of sources, and the rate enters the likelihood only in the form Λ 1 ¼ RhVTi. We also marginalize over a calibration uncertainty of 6% on the recovered luminosity distances (18% uncertainty on hVTi) when computing the rates. The distribution of terrestrial triggers is calculated from the search background estimated by the analyses (as shown in Fig. 3). The distribution of astrophysical events is determined by performing large-scale simulations of signals drawn from the various astrophysical populations added to the O1 data set and using the distribution of triggers recovered by our detection analyses applied to this data set. This method correctly accounts for various thresholds applied in the analyses. Note that the observed distribution of astrophysical triggers over the detection statistic will be essentially independent of the astrophysical population used: All populations are assumed to be distributed uniformly in comoving volume; thus, to a good approximation, the measured SNRs and other detection statistics follow the flat-space, volumetric density p 1 ðρÞ ∝ ρ −4 [129].
The likelihood for a search result containing M triggers with detection statistic values fx j jj ¼ 1; …; Mg is [229] Lðfx j jj ¼ 1; …;MgjΛ 1 ; The posterior over Λ 1 and Λ 0 is then obtained by multiplying the likelihood in Eq. (C4) by a prior proportional to 1= ffiffiffiffiffiffiffiffiffiffi ffi Λ 0 Λ 1 p and marginalizing over the x j to obtain pðΛ 0 ; Λ 1 Þ. For a trigger with statistic value x, the probability that it is of astrophysical origin is P 1 ðxjfx j jj ¼ 1; …; MgÞ ≡ Z dΛ 0 dΛ 1 Λ 1 p 1 ðxÞ Λ 0 p 0 ðxÞ þ Λ 1 p 1 ðxÞ × pðΛ 1 ; Λ 0 jfx j jj ¼ 1; …; MgÞ: Finally, we evaluate the rate assuming a population containing only BBH mergers with mass and spin parameters matching the three triggers for which P 1 > 0.5; i.e., astrophysical origin is more likely than terrestrial. To do so, we must generalize the formalism presented above to account for three different astrophysical populations, each having a different mean number of triggers Λ i . In this case, the likelihood of Eq. (C4) is generalized to allow for each trigger to arise from one of the astrophysical classes, or be of terrestrial origin. Additionally, we change the prior distribution to account for the number of astrophysical trigger classes via where N c ¼ 3 is the number of different classes of astrophysical triggers. This functional form is chosen to prevent the posterior expectation of the total count of astrophysical events, P N c i Λ i , from growing without limit as more classes are considered in the calculation.
The three triggers associated with GW150914, GW151226, and LVT151012 are restricted to originate either from their specific class or be of terrestrial origin. Thus, for instance, we neglect any probability of GW150914 arising from the class containing GW151226. We justify this by noting that the probability distributions for the component masses of the three likely signals are disjoint from one another at high confidence.
Multiplying this prior by the generalization of the likelihood, Eq. (C4), we obtain the posterior distribution on Λ i , the number of astrophysical triggers in each class. We again calculate the sensitive hVTi for each of the classes of signals and thus infer merger rates for each class. Figure 14 shows how the sensitive hVTi is accumulated as a function of redshift. For the less massive GW151226, the peak occurs at z ∼ 0.1, while for GW150914, it occurs at z ≈ 0.2, with the search being sensitive to some signals with redshifts as high as 0.6.

APPENDIX D: MASS DISTRIBUTION CALCULATION DESCRIPTION
Here, we describe the details of the analysis of the mass distribution that appears in Sec. VI. Further details on population analysis in the context of measurement uncertainty and selection effects are given in Ref. [231]. After this paper was accepted, we became aware of Ref. [232], which derives the same result as Ref. [231] and this appendix in a different context. Useful references for hierarchical analysis in astronomy include Refs. [156][157][158][159].
We assume that the distribution of black hole masses in coalescing binaries follows [see Eq. (7)] and pðm 2 jm 1 Þ ¼ 1 with M min ¼ 5M ⊙ the minimum black hole mass we consider, as in the models of the mass distributions used to infer rates. The joint population distribution on m 1 and m 2 therefore follows FIG. 14. The rate at which sensitive space-time volume accumulates with redshift. Curves labeled by component masses in M ⊙ are computed using an approximate prescription described in Ref. [42], assuming sources with fixed masses in the comoving frame and with zero component spins; the GW150914, GW151226, and LVT151012 curves are determined from the Monte Carlo injection campaign described in Sec. VI.
Here, we take all masses to be source-frame masses. The distribution of masses observed in our experiment will differ from the population distribution because our detector sensitivity is a strong function of system mass. A simplified model of our detection pipeline is that it is a deterministic function of the data, fðdÞ, such that when fðdÞ > f 0 , for some threshold f 0 , we detect a trigger. Given our population parameter α, the joint distribution of system parameters and data for a single detected trigger with data d is pðd; m 1 ; m 2 jαÞ ¼ pðdjm 1 ; m 2 Þpðm 1 ; m 2 jαÞ βðαÞ ; ðD4Þ where the first term in the numerator is the standard (unnormalized) likelihood function used in our parameterestimation analysis, the second term is the population distribution in Eq. (D3) and plays a role of a prior in our hierarchical analysis, and βðαÞ is a normalization factor, ensuring that the joint distribution is properly normalized. This factor is βðαÞ ¼ Z dm 1 dm 2 ddpðdjm 1 ; m 2 Þpðm 1 ; m 2 jαÞ; ðD5Þ where the integral is taken over all allowed masses and the set of data producing a detected trigger fdjfðdÞ > f 0 g. Consider first the integral over d, which is where we have defined the detection probability as a function of mass, This quantity is proportional to the hVTi defined in Eq. (C3) evaluated with a source distribution that fixes the source masses: To evaluate this factor, we use the approximate recipe from Ref. [42]. Thus, βðαÞ ∝ Z dm 1 dm 2 pðm 1 ; m 2 jαÞhVTij m 1 ;m 2 : This normalization factor accounts for the selection effects of our searches on the observed distribution of masses.
Here, we are interested only in the population parameters, not in reanalyzing the system masses; thus, we can integrate the masses out of the joint distribution in Eq. (D4) to obtain pðdjαÞ ¼ 1 βðαÞ Z dm 1 dm 2 pðdjm 1 ; m 2 Þpðm 1 ; m 2 jαÞ ∝ 1 βðαÞ hpðm 1 ; m 2 jαÞi; ðD10Þ where the notation h…i refers to an average over posterior samples (properly reweighted to correspond to a flat prior in m 1 and m 2 ) [231].
With multiple triggers analyzed, the likelihood is a product of single-event likelihoods from Eq. (D10). We impose a flat prior on α. The posterior from an analysis using GW150914, LVT151012, and GW151226 appears in Fig. 12.