

First Observation of the Rare Purely Baryonic Decay $B^0 \to p\bar{p}$

R. Aaij et al. (LHCb Collaboration)

(Received 7 September 2017; published 4 December 2017)

The first observation of the decay of a $B^0$ meson to a purely baryonic final state, $B^0 \to p\bar{p}$, is reported. The proton-proton collision data sample used was collected with the LHCb experiment at center-of-mass energies of 7 and 8 TeV and corresponds to an integrated luminosity of 3.0 fb$^{-1}$. The branching fraction is determined to be $\mathcal{B}(B^0 \to p\bar{p}) = (1.25 \pm 0.27 \pm 0.18) \times 10^{-8}$, where the first uncertainty is statistical and the second systematic. The decay mode $B^0 \to p\bar{p}$ is the rarest decay of the $B^0$ meson observed to date. The decay $B^0_s \to p\bar{p}$ is also investigated. No signal is seen and the upper limit $\mathcal{B}(B^0_s \to p\bar{p}) < 1.5 \times 10^{-8}$ at 90% confidence level is set on the branching fraction.

DOI: 10.1103/PhysRevLett.119.232001

Studies of $B$ mesons decaying to baryonic final states have been carried out since the late 1990s [1]. It was quickly realized that baryonic and mesonic $B$-meson decays differ in a number of ways. Two-body baryonic decays are suppressed with respect to decays to multibody final states [2,3], and the characteristic threshold enhancement in the baryon-antibaryon mass spectrum [4,5] is still not fully understood. The study of such decays provides information on the dynamics of $B$ decays and tests QCD-based models of the hadronization process [5]. It helps to discriminate among the available models and makes it possible to extract both tree and penguin amplitudes of charmless two-body baryonic decays when combining the information on the $B^0 \to p\bar{p}$ and $B^+ \to p\bar{\Lambda}$ branching fractions [6].
Baryonic $B$ decays are also interesting in the study of $CP$ violation. First evidence of $CP$ violation in baryonic $B$ decays has been reported from the analysis of $B^+ \to p\bar{p}K^+$ decays [7] and awaits confirmation in other decay modes [8].
This Letter presents a search for the suppressed decays of $B^0$ and $B^0_s$ mesons to the two-body charmless baryonic final state $p\bar{p}$. Prior to searches at the LHC, the ALEPH, CLEO, BABAR, and Belle Collaborations searched for the $B^0 \to p\bar{p}$ decay [9-12]. The most stringent upper limit on its branching fraction was obtained by the Belle experiment and is $\mathcal{B}(B^0 \to p\bar{p}) < 1.1 \times 10^{-7}$ at 90% confidence level (C.L.) [12]. The only search for the $B^0_s \to p\bar{p}$ decay, performed by the ALEPH Collaboration, yielded the upper limit $\mathcal{B}(B^0_s \to p\bar{p}) < 5.9 \times 10^{-5}$ at 90% C.L. [9]. The LHCb Collaboration has greatly increased the knowledge of baryonic $B$ decays in recent years [7,13-17]. The collaboration has reported the first observation of a two-body charmless baryonic $B^+$ decay, $B^+ \to p\bar{\Lambda}(1520)$ [7], and the first evidence for $B^0 \to p\bar{p}$, a two-body charmless baryonic decay of the $B^0$ meson [13]. The experimental data on two-body final states are nevertheless scarce. The study of these suppressed modes requires large data samples that are presently only available at the LHC.
In this analysis, in order to suppress common systematic uncertainties, the branching fractions of the $B^0 \to p\bar{p}$ and $B^0_s \to p\bar{p}$ decays are measured relative to the topologically identical decay $B^0 \to K^+\pi^-$. The branching fractions are determined from
$$\mathcal{B}(B^0_{(s)} \to p\bar{p}) = \frac{N(B^0_{(s)} \to p\bar{p})}{N(B^0 \to K^+\pi^-)} \, \frac{\varepsilon_{B^0 \to K^+\pi^-}}{\varepsilon_{B^0_{(s)} \to p\bar{p}}} \, \frac{f_d}{f_s} \, \mathcal{B}(B^0 \to K^+\pi^-),$$
where $N$ represents yields determined from fits to the $p\bar{p}$ or $K^+\pi^-$ invariant-mass distributions, $f_d/f_s$ (included only for the $B^0_s$ mode) is the ratio of $b$-quark hadronization probabilities into the $B^0$ and $B^0_s$ mesons [18], and $\varepsilon$ represents the geometrical acceptance, reconstruction, and selection efficiencies. The notation $B^0_{(s)} \to p\bar{p}$ stands for either $B^0 \to p\bar{p}$ or $B^0_s \to p\bar{p}$. The inclusion of charge-conjugate processes is implied, unless otherwise indicated.
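As a rough numerical illustration of this normalization, the sketch below assumes the usual ratio form, signal yield over normalization yield times the efficiency ratio times the normalization branching fraction, with $f_d/f_s$ applied only to the $B^0_s$ mode. The yields and $\mathcal{B}(B^0 \to K^+\pi^-)$ are the values quoted in this Letter. The efficiency ratio `eff_ratio` is a hypothetical placeholder, since the Letter does not quote it, and the function name is illustrative.

```python
def branching_fraction(n_sig, n_norm, eff_ratio, bf_norm, fd_over_fs=1.0):
    """Signal branching fraction relative to a normalization channel.

    eff_ratio is eps(normalization) / eps(signal);
    fd_over_fs stays at 1.0 for the B0 mode and is set for the B0s mode.
    """
    return n_sig / n_norm * eff_ratio * fd_over_fs * bf_norm


# Yields and normalization branching fraction quoted in the text.
n_pp, n_pp_err = 39.0, 8.0
n_kpi = 88961.0
bf_kpi = 1.96e-5

# Hypothetical efficiency ratio (NOT quoted in the Letter); chosen only
# so that the example lands near the published central value.
eff_ratio = 1.45

bf_pp = branching_fraction(n_pp, n_kpi, eff_ratio, bf_kpi)
# Statistical uncertainty from the signal yield only; the much smaller
# relative uncertainty on the normalization yield is neglected here.
bf_pp_stat = bf_pp * n_pp_err / n_pp

print(f"B(B0 -> p pbar) ~ {bf_pp:.2e} +/- {bf_pp_stat:.2e} (stat)")
```

Because only ratios of yields and efficiencies enter, uncertainties common to the signal and normalization channels largely cancel, which is the motivation stated above for normalizing to $B^0 \to K^+\pi^-$.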
The data sample analyzed corresponds to an integrated luminosity of 1.0 fb$^{-1}$ of proton-proton collision data collected by the LHCb experiment at a center-of-mass energy of 7 TeV in 2011 and 2.0 fb$^{-1}$ at 8 TeV in 2012. The LHCb detector [19,20] is a single-arm forward spectrometer covering the pseudorapidity range $2 < \eta < 5$, designed for the study of particles containing $b$ or $c$ quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the $pp$ interaction region [21], a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes [22] placed downstream of the magnet. The tracking system provides a measurement of the momentum $p$ of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200 GeV/$c$. The minimum distance of a track to a primary vertex (PV), the impact parameter (IP), is measured with a resolution of $(15 + 29/p_T)\,\mu$m, where $p_T$ is the component of the momentum transverse to the beam, in GeV/$c$. Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors [23]. Photons, electrons, and hadrons are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic calorimeter, and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and detector planes consisting of multiwire proportional chambers and gas electron multipliers. Simulated data samples, produced as described in Refs. [24-29], are used to evaluate the response of the detector and to investigate and characterize possible sources of background.
Candidates are selected in a similar way for both signal $B^0_{(s)} \to p\bar{p}$ decays and the normalization channel $B^0 \to K^+\pi^-$. Real-time event selection is performed by a trigger [30] consisting of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which performs a full event reconstruction. The hardware trigger stage requires events to have a hadron, photon, or electron with high transverse energy (above a few GeV) deposited in the calorimeters, or a muon with high transverse momentum. For this analysis, the hardware trigger decision can be made either on the signal candidates or on other particles in the event. The software trigger requires a two-track secondary vertex with a significant displacement from the PVs. At least one charged particle must have high $p_T$ and be inconsistent with originating from a PV. A multivariate algorithm [31] is used for the identification of secondary vertices consistent with the decay of a $b$ or $c$ hadron. The final selection of candidates in the signal and normalization modes is carried out with a preselection stage, particle identification (PID) criteria, and a requirement on the response of a multilayer perceptron (MLP) classifier [32]. To avoid potential biases, $p\bar{p}$ candidates with invariant mass in the range [5230, 5417] MeV/$c^2$ (a $\pm 50$ MeV/$c^2$ window, approximately three times the invariant-mass resolution, around the known $B^0$ and $B^0_s$ masses [33]) were not examined until the analysis procedure was finalized.
At the preselection stage, the $B^0_{(s)}$ decay products are associated with tracks with good reconstruction quality that have $\chi^2_{\rm IP} > 9$ with respect to any PV, where the $\chi^2_{\rm IP}$ is defined as the difference between the vertex-fit $\chi^2$ of a PV reconstructed with and without the track in question. The minimum $p_T$ of the decay products is required to be above 900 MeV/$c$ and at least one of the decay products is required to have $p_T > 2100$ MeV/$c$. A loose PID requirement, based primarily on information from the Cherenkov detectors, is also imposed on both particles. The $B^0_{(s)}$ candidate must have a vertex with good reconstruction quality, $p_T > 1000$ MeV/$c$, and $\chi^2_{\rm IP} < 36$ with respect to the associated PV. The associated PV is that with which it forms the smallest $\chi^2_{\rm IP}$. The angle $\theta_B$ between the momentum vector of the $B^0_{(s)}$ candidate and the line connecting the associated PV and the candidate's decay vertex is required to be close to zero [$\cos(\theta_B) > 0.9995$].
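The numerical preselection requirements above can be collected in a short sketch. The class and field names are hypothetical and are not taken from any LHCb software. The track-quality, vertex-quality, and loose PID requirements are omitted because the Letter gives no thresholds for them.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Track:
    pt: float           # transverse momentum in MeV/c
    min_chi2_ip: float  # smallest chi2_IP over all PVs in the event


@dataclass
class Candidate:
    tracks: Tuple[Track, Track]  # the two decay products
    pt: float                    # candidate transverse momentum in MeV/c
    chi2_ip: float               # chi2_IP w.r.t. the associated PV
    cos_theta_b: float           # cosine of the pointing angle theta_B


def passes_preselection(cand: Candidate) -> bool:
    """Apply only the numerical cuts quoted in the text."""
    pts = [t.pt for t in cand.tracks]
    return (
        all(t.min_chi2_ip > 9.0 for t in cand.tracks)  # displaced from every PV
        and min(pts) > 900.0
        and max(pts) > 2100.0
        and cand.pt > 1000.0
        and cand.chi2_ip < 36.0                        # candidate points to its PV
        and cand.cos_theta_b > 0.9995
    )
```

Requiring a large $\chi^2_{\rm IP}$ for the decay products but a small one for the candidate selects displaced tracks whose combination still points back to a primary vertex.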
After preselection, tight PID requirements are applied to the two final-state particles to suppress so-called combinatorial background, formed from the accidental association of tracks unrelated to the signal decays, and to reduce contamination from $b$-hadron decays where one or more decay products are misidentified. The PID requirements are determined by optimizing the figure of merit $\varepsilon_{\rm sig} / (a/2 + \sqrt{N_{\rm bkg}})$ [34], where $a = 5$ quantifies the target level of significance in standard deviations and $\varepsilon_{\rm sig}$ is the PID efficiency of the signal selection. The quantity $N_{\rm bkg}$ denotes the expected number of background events in the signal region. This is estimated by extrapolating the result of a fit to the invariant-mass distribution of the data in the sideband regions above and below the signal region. The PID criteria are allowed to be different for protons and antiprotons. The optimization of the PID criteria applied to the normalization decay candidates relies on maximizing the signal significance, while minimizing the contamination from misidentified backgrounds.
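A minimal sketch of such an optimization follows, assuming the figure of merit takes the form $\varepsilon_{\rm sig}/(a/2 + \sqrt{N_{\rm bkg}})$ with $a = 5$. The three working points and their efficiencies and background counts are invented for illustration and are not values from the analysis.

```python
import math


def figure_of_merit(eff_sig, n_bkg, a=5.0):
    """eff_sig / (a/2 + sqrt(n_bkg)); needs no assumed signal yield."""
    return eff_sig / (a / 2.0 + math.sqrt(n_bkg))


# Hypothetical working points: (label, signal efficiency, expected
# background in the signal region extrapolated from the sidebands).
working_points = [
    ("loose", 0.90, 400.0),
    ("medium", 0.70, 40.0),
    ("tight", 0.30, 4.0),
]

best = max(working_points, key=lambda wp: figure_of_merit(wp[1], wp[2]))
print("best working point:", best[0])
```

This form of figure of merit depends only on the signal efficiency and the expected background, so it can be optimized before the signal region is examined.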
Further separation between signal and combinatorial background candidates relies on an MLP implemented with the TMVA toolkit [35]. There are ten input quantities to the MLP classifier: the minimum values of the $p_T$ and $\eta$ of the decay products; the scalar sum of their $p_T$ values; the $\chi^2_{\rm IP}$ of the decay products; the distance of closest approach between the two decay products; a parameter expressing the quality of the $B^0_{(s)}$ vertex fit; the $\chi^2_{\rm IP}$ and $\theta_B$ angle of the $B^0_{(s)}$ candidate; and the $p_T$ asymmetry within a cone around the $B^0_{(s)}$ direction, defined by
$$A_{p_T} = \frac{p_T^{B} - p_T^{\rm cone}}{p_T^{B} + p_T^{\rm cone}},$$
where $p_T^{\rm cone}$ is the transverse component of the vector sum of the momenta of all tracks measured within the cone radius $R = 1.0$ around the $B^0_{(s)}$ direction, except for the $B^0_{(s)}$ decay products. The cone radius is defined in pseudorapidity and azimuthal angle $(\eta, \phi)$ as $R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$. The $A_{p_T}$ requirement exploits the relative isolation of signal decay products as compared with background. The MLP is trained using simulated $B^0 \to p\bar{p}$ decays and data candidates in the $p\bar{p}$ invariant-mass sideband above 5417 MeV/$c^2$ to represent the background. The requirement on the MLP response is optimized using the same figure of merit as that used for the optimization of the PID selection. The MLP selection keeps approximately 60% of the signal candidates, while suppressing combinatorial background by 2 orders of magnitude. The MLP applied to the normalization decay candidates is the same as that trained to select the $B^0_{(s)} \to p\bar{p}$ signal candidates, with the requirement on the response chosen to maximize the $B^0 \to K^+\pi^-$ significance. A vanishingly small fraction of events contains a second candidate after all selection requirements are applied, and all candidates are kept.
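The isolation variable can be sketched as follows, assuming the asymmetry is defined as $(p_T^{B} - p_T^{\rm cone})/(p_T^{B} + p_T^{\rm cone})$ with $p_T^{\rm cone}$ computed from the vector sum of the other tracks inside the cone. The function names and the track representation as `(px, py, eta, phi)` tuples are illustrative choices, not part of the analysis software.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Cone distance sqrt(d_eta^2 + d_phi^2), with d_phi wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def pt_asymmetry(b_pt, b_eta, b_phi, other_tracks, radius=1.0):
    """pT asymmetry of a candidate with respect to the tracks around it.

    other_tracks: iterable of (px, py, eta, phi) for every track in the
    event EXCEPT the candidate's own decay products.
    """
    sum_px = sum_py = 0.0
    for px, py, eta, phi in other_tracks:
        if delta_r(eta, phi, b_eta, b_phi) < radius:
            sum_px += px
            sum_py += py
    cone_pt = math.hypot(sum_px, sum_py)  # transverse part of the vector sum
    return (b_pt - cone_pt) / (b_pt + cone_pt)
```

A perfectly isolated candidate has no other tracks in the cone and gives an asymmetry of 1, while a candidate embedded in hadronic activity gives a smaller or negative value.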
Large control data samples of kinematically identified pions, kaons, and protons originating from the decays $D^0 \to K^-\pi^+$, $\Lambda \to p\pi^-$, and $\Lambda_c^+ \to pK^-\pi^+$ are employed to determine the efficiency of the PID requirements [23]. All the other components of the selection efficiencies are determined from simulation. The agreement between data and simulation is verified by comparing kinematic distributions from selected $B^0 \to K^+\pi^-$ decays. The distributions in data are obtained with the sPlot technique [36], with the $B^0$ candidate invariant mass used as the discriminating variable. The overall efficiencies of this analysis, including the trigger selection and the reconstruction, are of the order of $10^{-3}$.
Sources of noncombinatorial background to the $p\bar{p}$ spectrum are investigated using simulation samples. These sources include partially reconstructed backgrounds in which one or more particles from the decay of a $b$ hadron are not associated with the signal candidate, or $b$-hadron decays where one or more decay products are misidentified. The sum of such backgrounds does not peak in the $B^0$ and $B^0_s$ signal regions but rather contributes a smooth $p\bar{p}$ mass spectrum, which is indistinguishable from the dominant combinatorial background.
The yields of the signal and normalization candidates are determined using unbinned maximum likelihood fits to the invariant-mass distributions. The $p\bar{p}$ invariant-mass distribution is described with three components, namely the $B^0 \to p\bar{p}$ and $B^0_s \to p\bar{p}$ signals and combinatorial background. The $B^0_{(s)} \to p\bar{p}$ signals are modeled with the sum of two Crystal Ball (CB) functions [37] describing the high- and low-mass asymmetric tails. The two components of each signal share the same peak and core width parameters. The core widths are fixed using $B^0_{(s)} \to p\bar{p}$ simulated samples. A scaling factor is applied to account for differences in the resolution between data and simulation as determined from $B^0 \to K^+\pi^-$ candidates. The $B^0_s \to p\bar{p}$ signal peak value is set relative to the $B^0 \to p\bar{p}$ signal peak value determined from the fit according to the $B^0_s$-$B^0$ mass difference [33]. The tail parameters and the relative normalization of the CB functions are determined from simulation. The combinatorial background is described with a linear function, with the slope parameter allowed to vary in the fit.
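The signal shape described above can be sketched as an unnormalized sum of two Crystal Ball functions that share a peak position and core width, one with a low-mass power-law tail and one with a high-mass tail. This uses the standard textbook Crystal Ball parametrization, which is assumed here to match the one used in the fit. The parameter names are illustrative.

```python
import math


def crystal_ball(x, mu, sigma, alpha, n):
    """Unnormalized Crystal Ball shape with value 1 at the peak.

    alpha > 0 puts the power-law tail on the low-mass side,
    alpha < 0 puts it on the high-mass side.
    """
    t = (x - mu) / sigma
    if alpha < 0:
        t = -t  # mirror so the tail is always at negative t
    a = abs(alpha)
    if t > -a:
        return math.exp(-0.5 * t * t)           # Gaussian core
    norm = (n / a) ** n * math.exp(-0.5 * a * a)
    shift = n / a - a
    return norm * (shift - t) ** (-n)           # power-law tail


def double_crystal_ball(x, mu, sigma, alpha_lo, n_lo, alpha_hi, n_hi, frac):
    """Two CB functions sharing mu and sigma; frac weights the low-tail one."""
    low = crystal_ball(x, mu, sigma, alpha_lo, n_lo)
    high = crystal_ball(x, mu, sigma, -alpha_hi, n_hi)
    return frac * low + (1.0 - frac) * high
```

Sharing the peak and core width between the two components, as the text describes, leaves only the tail parameters and the relative fraction to be taken from simulation.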
The $p\bar{p}$ invariant-mass distribution is presented in Fig. 1 together with the result of the fit. The yields of the $B^0_{(s)} \to p\bar{p}$ signals are $N(B^0 \to p\bar{p}) = 39 \pm 8$ and $N(B^0_s \to p\bar{p}) = 2 \pm 4$, where the uncertainties are statistical only. The significance of each of the signals is determined from the change in the logarithm of the likelihood between fits with and without the signal component [38]. The $B^0 \to p\bar{p}$ decay mode is found to have a significance of 5.3 standard deviations, including systematic uncertainties, and the $B^0_s \to p\bar{p}$ mode is found to have a significance of 0.4 standard deviations, where, given its size, the significance has been evaluated ignoring systematic effects. The high significance of the $B^0 \to p\bar{p}$ signal implies the first observation of a two-body charmless baryonic $B^0$ decay.
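A common way to turn a log-likelihood difference into a significance is the asymptotic relation $\sqrt{2\,\Delta\ln L}$. The sketch below assumes that convention; the Letter cites Ref. [38] for its procedure and also folds in systematic uncertainties, which this example does not attempt to reproduce. The input values are hypothetical.

```python
import math


def significance(nll_without_signal, nll_with_signal):
    """Approximate significance from two negative log-likelihood minima.

    Assumes one extra free parameter (the signal yield) and the
    asymptotic sqrt(2 * delta lnL) approximation.
    """
    delta = nll_without_signal - nll_with_signal
    return math.sqrt(2.0 * max(delta, 0.0))


# Hypothetical negative log-likelihood values, for illustration only.
print(f"{significance(1014.0, 1000.0):.2f} standard deviations")
```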
The $K^+\pi^-$ invariant-mass distribution of the normalization decay candidates is described with components accounting for the $B^0 \to K^+\pi^-$ and $B^0_s \to \pi^+K^-$ signals; the background due to the decays $B^0 \to \pi^+\pi^-$, $B^0_s \to K^+K^-$, $\Lambda_b^0 \to p\pi^-$, and $\Lambda_b^0 \to pK^-$ when at least one of the final-state particles is misidentified; background from partially reconstructed $b$-hadron decays; and combinatorial background. The $B^0 \to K^+\pi^-$ and $B^0_s \to \pi^+K^-$ decays are modeled with the sum of two CB functions sharing the same peak and core width parameters. The peak value and core width of the $B^0 \to K^+\pi^-$ signal model are free parameters in the fit. The difference between the peak positions of the $B^0 \to K^+\pi^-$ and $B^0_s \to \pi^+K^-$ signals is constrained to its known value [33], and the core width of the $B^0_s \to \pi^+K^-$ signal is related to the $B^0 \to K^+\pi^-$ signal core width by a scaling factor of 1.02, as determined from simulation. The tail parameters and the relative normalization of both double CB functions are determined from simulation. The invariant-mass distributions of the four misidentified decays are determined from simulation and are modeled with nonparametric functions [39]. The relative fractions of these background components depend upon the branching fractions, $b$-hadron hadronization probabilities, and misidentification rates of the backgrounds. The fractions are Gaussian constrained to the product of these three factors, with the widths of the Gaussian functions equal to their combined uncertainties. The misidentification rates are determined from calibration data samples, whereas all other selection efficiencies are obtained from simulation. Partially reconstructed backgrounds represent decay modes misreconstructed as signal with one or more undetected final-state particles, possibly in conjunction with misidentifications. The shapes of these backgrounds in $K^+\pi^-$ invariant mass are determined from simulation, where each contributing decay is assigned a weight dependent on its relative branching fraction, hadronization probability, and selection efficiency. The weighted sum of the partially reconstructed backgrounds is well modeled with the sum of two exponential functions, the slope parameters of which are fixed from simulation, while the yield is determined in the fit to the data. As for the signal fit, the combinatorial background is described with a linear function, with the slope parameter allowed to vary in the fit.
The fit to the $K^+\pi^-$ invariant mass, shown in Fig. 2, involves seven fitted parameters and yields $N(B^0 \to K^+\pi^-) = 88\,961 \pm 341$ signal decays, where the uncertainty is statistical only.
The sources of systematic uncertainty on the $B^0_{(s)} \to p\bar{p}$ branching fractions arise from the fit model, the limited knowledge of the selection efficiencies, and the uncertainties on the $B^0 \to K^+\pi^-$ branching fraction and on the ratio of $b$-quark hadronization probabilities $f_s/f_d$. Pseudoexperiments are used to estimate the effects of using alternative shapes for the fit components and of including additional backgrounds in the fit. Systematic uncertainties on the fit models are also assessed by varying the fixed parameters of the models within their uncertainties. The description of the combinatorial background is replaced by an exponential function. In the fit to the signal modes, the partially reconstructed decays $B^+ \to p\bar{p}\ell^+\nu_\ell$, where $\ell$ stands for an electron or a muon and $\nu_\ell$ for the corresponding neutrino, are added to the fit model. Intrinsic biases in the fitted yields are also investigated with pseudoexperiments and are found to be negligible.
Uncertainties on the efficiencies arise from residual differences between data and simulation in the trigger, reconstruction, and selection, and from uncertainties on the data-driven particle identification efficiencies. These differences are assessed using the $B^0 \to K^+\pi^-$ normalization decay, comparing the level of agreement between simulation and data. The distributions of selection variables for $B^0 \to K^+\pi^-$ signal candidates in data are obtained by subtracting the background using the sPlot technique [36], with the $K^+\pi^-$ candidate invariant mass as the discriminating variable. The effect of binning the PID calibration samples used to obtain the PID efficiencies is evaluated by varying the binning scheme and by adding an extra dimension accounting for event multiplicity to the binning of the samples.
The uncertainty on the branching fraction of the normalization decay, $\mathcal{B}(B^0 \to K^+\pi^-) = (1.96 \pm 0.05) \times 10^{-5}$ [33], is taken as a systematic uncertainty from external inputs. The uncertainty on the measurement $f_s/f_d = 0.259 \pm 0.015$ [18] is quoted as a separate source of systematic uncertainty from external inputs in the determination of the upper limit on $\mathcal{B}(B^0_s \to p\bar{p})$. The total systematic uncertainty on the $B^0 \to p\bar{p}$ ($B^0_s \to p\bar{p}$) branching fraction is given by the sum of all uncertainties added in quadrature and amounts to 14.2% (209%). The systematic uncertainties on the $B^0 \to p\bar{p}$ ($B^0_s \to p\bar{p}$) branching fraction are dominated by the uncertainties on the fit model, which are 7.3% (208%), and on the reconstruction and selection efficiencies, which amount to 6.1% (6.1%) and 8.6% (8.3%), respectively. Specifically, the systematic uncertainty arising from the description of the fit model backgrounds dominates the uncertainty on the $B^0_s \to p\bar{p}$ branching fraction.
In summary, the first observation of the simplest decay of a $B^0$ meson to a purely baryonic final state, $B^0 \to p\bar{p}$, is reported using a data sample of proton-proton collisions collected with the LHCb experiment, corresponding to a total integrated luminosity of 3.0 fb$^{-1}$. This rare two-body charmless baryonic decay is observed with a significance of 5.3 standard deviations, including systematic uncertainties. The $B^0 \to p\bar{p}$ branching fraction is determined to be
$$\mathcal{B}(B^0 \to p\bar{p}) = (1.25 \pm 0.27 \pm 0.18) \times 10^{-8},$$
where the first uncertainty is statistical and the second systematic. Since no $B^0_s \to p\bar{p}$ signal is seen, the world's best upper limit $\mathcal{B}(B^0_s \to p\bar{p}) < 1.5 \times 10^{-8}$ at 90% confidence level is set on the decay branching fraction using the Feldman-Cousins frequentist method [40].
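The quadrature combination of the systematic uncertainties quoted above can be checked with a few lines. Only the sources itemized in the text are included: fit model, reconstruction, selection, and the external inputs. The result therefore falls somewhat below the quoted total of 14.2% for the $B^0$ mode, with the remainder attributable to sources that are not itemized individually in this Letter.

```python
import math


def quadrature_sum(*relative_uncertainties):
    """Combine independent relative uncertainties (in percent) in quadrature."""
    return math.sqrt(sum(u * u for u in relative_uncertainties))


# Relative uncertainties in percent, taken from the text.
norm_bf = 100.0 * 0.05 / 1.96    # B(B0 -> K+ pi-) = (1.96 +/- 0.05) x 10^-5
fs_fd = 100.0 * 0.015 / 0.259    # f_s/f_d = 0.259 +/- 0.015

b0_listed = quadrature_sum(7.3, 6.1, 8.6, norm_bf)
b0s_listed = quadrature_sum(208.0, 6.1, 8.3, norm_bf, fs_fd)

print(f"B0  itemized sources: {b0_listed:.1f}%")
print(f"B0s itemized sources: {b0s_listed:.1f}%")
```

For the $B^0_s$ mode the fit-model term is so large that the other sources barely change the total, consistent with the statement that the background description dominates that uncertainty.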
The first observation of the decay $B^0 \to p\bar{p}$, the rarest $B^0$ decay ever observed, provides valuable input towards the understanding of the dynamics of hadronic $B$ decays. This measurement helps to discriminate among several QCD-based models and makes it possible to extract both tree and penguin amplitudes of charmless two-body baryonic decays when combining the information on the $B^0 \to p\bar{p}$ and $B^+ \to p\bar{\Lambda}$ branching fractions [6]. The measured $B^0 \to p\bar{p}$ branching fraction is compatible with recent theoretical calculations, as is the upper limit on the $B^0_s \to p\bar{p}$ branching fraction [2,3,6]. An improved measurement of the $B^0_s \to p\bar{p}$ branching fraction will make it possible to quantitatively compare the models proposed in Refs. [2,6].
We express our gratitude to our colleagues in the CERN accelerator departments for the excellent performance of the LHC. We thank the technical and administrative staff at the LHCb institutes. We acknowledge support from CERN and from the national agencies: