Search for the rare hadronic decay $B_s^0\to p \bar{p}$

A search for the rare hadronic decay $B_s^0\to p \bar{p}$ is performed using proton-proton collision data recorded by the LHCb experiment at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 6 fb$^{-1}$. No evidence of the decay is found and an upper limit on its branching fraction is set at ${\cal B}(B_s^0\to p \bar{p})<4.4~(5.1) \times 10^{-9}$ at 90% (95%) confidence level; this is currently the world's best upper limit. The decay mode $B^0\to p \bar{p}$ is measured with very large significance, confirming the first observation by the LHCb experiment in 2017. The branching fraction is determined to be ${\cal B}(B^0\to p \bar{p}) = \rm (1.27 \pm 0.15 \pm 0.05 \pm 0.04) \times 10^{-8}$, where the first uncertainty is statistical, the second is systematic and the third is due to the external branching fraction of the normalization channel $B^0\to K^+\pi^-$. The combination of the two LHCb measurements of the $B^0\to p \bar{p}$ branching fraction yields ${\cal B}(B^0\to p \bar{p}) = \rm (1.27 \pm 0.13 \pm 0.05 \pm 0.03) \times 10^{-8}$.


Introduction
Exclusive decays of B mesons to final states containing baryons as well as mesons have been studied since the late 1990s [1].Differences in the way B mesons decay to baryonic versus purely mesonic final states have been found since the first experimental measurements became available: multi-body baryonic decay modes (such as B + → ppK + ) can have their branching fractions significantly increased owing to a threshold enhancement effect in the baryon-antibaryon invariant mass spectrum [2][3][4][5].On the other hand, two-body baryonic decays (such as B 0 (s) → pp) are suppressed [5][6][7].To date, only three charmless two-body baryonic decays have been observed, namely the B + → pΛ(1520) [8], B + → pΛ [9] and B 0 → pp [10] modes.
In the Standard Model (SM), the dominant contributions to the B 0 → pp decay amplitude involve tree-level b → u transitions through factorizable W -exchange and non-factorizable internal W -emission.Other electroweak contributions such as penguin annihilation have a small influence on the decay amplitude.References [3,[11][12][13] indicate that the contribution from W -exchange or annihilation can be neglected due to helicity suppression.There is no non-factorizable contribution to the B 0 s → pp decay.Therefore, Refs.[13,14] point out that the penguin-level gluon-exchange and annihilation contributions could be neglected, thus leading to the branching fraction B(B 0 s → pp) to be vanishingly small.Reference [4] instead argues that contributions from exchange or annihilation processes could yield B(B 0 s → pp) ∼ 10 −8 , which is experimentally accessible.Consequently, a study of the suppressed B 0 s → pp decay provides valuable information on the role of exchange and annihilation diagrams in baryonic B decays.
This paper presents a search for the rare baryonic decay mode B 0 s → pp using protonproton (pp) collision data from years 2015-2018 collected by the LHCb experiment at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 6 fb −1 .Prior to this, the ALEPH, CLEO, DELPHI, Belle, BaBar and LHCb collaborations searched for the B 0 (s) → pp decays [10,[15][16][17][18][19].In 2017 the LHCb experiment reported the first observation of the B 0 → pp decay and a measurement of its branching fraction of B(B 0 → pp) = (1.25 ± 0.32) × 10 −8 [10].That analysis is based on the dataset recorded at center-of-mass energies of 7 and 8 TeV, corresponding to an integrated luminosity of 3 fb −1 .The most stringent upper limit of B(B 0 s → pp) < 1.5 × 10 −8 at 90% confidence level was set on the branching fraction of the B 0 s → pp decay [10].In this analysis, the B 0 → pp and B 0 s → pp yields are determined from a fit to the data and normalized to the yield of B 0 → K + π − decays, which have a similar topology and a precisely measured branching fraction.The use of a normalization channel reduces systematic uncertainties associated to reconstruction and selection efficiencies when determining branching fractions.The branching fractions of the signal decays are measured using the formula where f s /f d = 0.2539 ± 0.0079 (included only for the B 0 s mode) is the ratio of probabilities for b-quark hadronization into B 0 and B 0 s mesons [20], N represents the measured yields, and ε stands for the detection efficiency including the geometrical acceptance of the LHCb detector.The notation B 0 (s) → pp denotes either B 0 → pp or B 0 s → pp.The inclusion of charged-conjugate processes is implied, unless otherwise stated.

The LHCb detector and data samples
The LHCb detector [21,22] is a single-arm forward spectrometer covering the pseudorapidity range 2 < η < 5, designed for the study of particles containing b or c quarks.The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the pp interaction region [23], a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes [24] placed downstream of the magnet.The tracking system provides a measurement of the momentum, p, of charged particles with relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200 GeV/c.The minimum distance of a track to a primary pp collision vertex (PV), the impact parameter (IP), is measured with resolution of (15 + 29/p T ) µm, where p T is the component of the momentum transverse to the beam, in GeV/c.Different species of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors [25].Photons, electrons and hadrons are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic and a hadronic calorimeter [26].The resolution of the energy measurement in the electromagnetic calorimeter varies from around 12% at 3 GeV to around 5% at 80 GeV.Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers [27].
The online event selection is performed by a trigger [28], which consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction.At the hardware trigger stage, candidates are required to have a muon with high p T or a hadron, photon or electron with high transverse energy in the calorimeters.For hadrons, the transverse energy threshold is 3.5 GeV.The software trigger requires a two-track secondary vertex with significant displacement from any primary pp interaction vertex.At least one charged particle must have significant transverse momentum and be inconsistent with originating from a PV.A multivariate algorithm based on the boosted decision tree technique is used for the identification of secondary vertices consistent with the decay of a b hadron [29].
Simulated samples are used to study the properties of the signal, normalization and background channels.Proton-proton collisions are generated by Pythia [30] with a specific LHCb configuration [31].Decays of unstable particles are described by EvtGen [32], in which final-state radiation is generated using Photos [33].The interactions of the generated particles with the detector material, and their responses, are implemented using the Geant4 toolkit [34,35], as described in Ref. [36].

Candidate selection
Candidates for both the B 0 (s) → pp signal channels and the normalization mode B 0 → K + π − are required to pass a set of initial selection criteria, which include requirements on particle identification (PID), and are then further selected using a multivariate classifier relying on a boosted decision tree (BDT) algorithm [37].To avoid experimenters' bias, the m(pp) invariant mass range [5230, 5417] MeV/c 2 of the pp signal candidates was not inspected until all the analysis procedures were finalized.This range corresponds to the combined range spanning approximately three times the invariant mass resolution (±50 MeV/c 2 ) around the known B 0 and B 0 s masses [38].
In the initial selection the B 0 (s) decay products are associated to high-quality tracks with large χ 2 IP with respect to any PV, where the χ 2 IP is defined as the difference between the vertex-fit χ 2 of a PV reconstructed with and without the track in question.The p T of the decay products is required to be between 900 MeV/c and 40 GeV/c and at least one of the decay products is required to have p T > 2100 MeV/c.Furthermore, the decay products are required to satisfy p < 500 GeV/c and 1.5 < η < 5.0.Both particles are required to pass a loose PID selection based primarily on information from the Cherenkov detectors, with fiducial cuts in momentum and pseudorapidity.The angle θ B between the momentum vector of the B 0 (s) candidate and the line connecting the associated PV and the decay vertex of the candidate must be close to zero.Likewise, the decay vertex of the B 0 (s) candidate is required to be significantly displaced from the PV and its transverse momentum is required to be larger than 500 MeV/c.The decay time of the B 0 (s) candidate is limited to be less than ten times the B 0 (s) lifetime.After the initial selection a tight PID selection is applied to the two final-state particles to remove the contamination from b-hadron decays where one or more decay products are misidentified.This PID selection also significantly reduces combinatorial background formed from accidental associations of tracks unrelated to the signal decays.The optimal PID requirements are found by maximizing the figure of merit ε sig /(α/2 + N bkg ) [39], where ε sig represents the PID selection efficiency of the signal channel.The quantity N bkg stands for the expected number of background candidates in the signal region.It is estimated by extrapolating the result of a fit to the invariant mass distribution of the data from the data sideband regions (above and below the signal region, i.e. [5080, 5230] MeV/c 2 and [5417, 5570] MeV/c 2 ).The term α specifies the target signal significance in units of standard deviations; for this analysis a value α = 3 is chosen.
Following the PID selection a multivariate classifier, based on a BDT algorithm and implemented with the TMVA package [40,41], is applied as the final stage of the entire selection chain.The main purpose is to keep as much of the B 0 (s) → pp signals as possible whilst removing combinatorial background.Ten input quantities for the BDT classifier are chosen to maximize signal-to-background discrimination: two isolation variables [42,43] used to reject background candidates from decays with additional charged or neutral particles in the final state; the χ 2 IP of the B 0 (s) candidate with respect to the PV; the B 0 (s) candidate vertex fit χ 2 ; the cosine of the direction angle θ B of the B 0 (s) candidate; the separation between the two outgoing proton tracks at their point of closest approach; the smallest of the χ 2 IP with respect to the PV of the candidate final-state particles; the sum of transverse momenta of the candidate final-state particles; the smallest of the pseudorapidities of the candidate final-state particles; the cosine of the angle between the proton momentum in the B 0 (s) rest-frame and the vector perpendicular to the B 0 (s) momentum and the beam axis.The BDT classifier is trained using fully selected B 0 → pp simulated events as signal and sideband data as background.The sideband data are taken from across the wider pp invariant mass spectrum ranging from 5080 to 5570 MeV/c 2 , excluding candidates within the signal region [5230, 5417] MeV/c 2 .The requirement on the BDT response is optimized using the same figure of merit as that used in the optimization of the PID selection.The independence of the BDT response on the two-body invariant mass is verified.The same BDT selection is applied to the normalization decay candidates.In this way, systematic effects on the ratio of BDT selection efficiencies entering the relative branching fraction calculation cancel to a large extent.
Possible sources of background to the pp and K + π − spectra are studied with simulated samples.These background components contain partially reconstructed decays with one or more particles from the b-hadron decays escaping detection, and two-body b-hadron decays where one or both final-state particles are misidentified.The relevant backgrounds in both mass spectra are detailed in the following section.

Signal yield determination
The yields of the signal and background candidates after full selection in the signal and control model spectra are determined with unbinned maximum likelihood fits to the pp and K + π − invariant-mass distributions, respectively.

Mass fit to the control channel
The distribution of the invariant mass m(K + π − ) of selected B 0 → K + π − candidates, shown in Figure 1, is described with the following components: the B 0 → K + π − control channel and B 0 s → π + K − ; the background contributions due to the decays and Λ 0 b → pK − when at least one of the final-state particles is misidentified; partially reconstructed background from b-hadron decays, such as B + → ppK + ; and the combinatorial background.Contamination from other decays is treated as a systematic uncertainty.The fits are performed over the K + π − invariant-mass range from 5150 to 5750 MeV/c 2 .The lower mass limit is chosen to reject the majority of the partially reconstructed background, thus including only the high-mass tail of the contribution, which comes into the B 0 → K + π − signal region.The upper mass limit is chosen to provide adequate constraints on the model parameters of the combinatorial background model.
The B → h + h − signal mass distributions are modeled with a double-sided Crystal Ball (DSCB) function [44], characterized by exponential tails on both sides of a Gaussian core.The mean and the width of the Gaussian core function describing the B 0 → K + π − signal channel are free to vary in the fit.The difference between the peak positions of the B 0 → K + π − and B 0 s → π + K − signal decays is constrained to the known value of 87.35 MeV/c 2 [38].The core width of the B 0 s → π + K − decay mode is related to that of B 0 → K + π − by a scaling factor of 1.01 as determined from the simulation.The four DSCB tail parameters are also extracted from the simulation.
Misidentified B → h + h − and Λ 0 b → ph − decays (where h ( ) denotes a π or a K meson) with expected yields greater than 0.1% of the B 0 → K + π − yield are included individually in the fit.The shapes are modeled with non-parametric distributions, extracted from simulated samples, reconstructed and selected under the K + π − mass hypothesis, and smoothed using a kernel method [45] to suppress statistical noise due to finite simulated samples.The relative fractions of these background sources depend on the branching fractions, b-hadron hadronization probabilities, and misidentification rates.The fractions are left to vary in the fit, but their values are constrained with Gaussian functions: the means are set to the products of the three described factors and the widths defined to be equal to the combined uncertainties on the three factors.
The shape of the partially reconstructed background component in the K + π − mass hypothesis, which represents several background decay modes, is determined from a mixture of simulated samples.Each mode is assigned a weight given by its relative branching fraction, the hadronization fraction and selection efficiency.The spectrum of the ensemble of partially reconstructed background components is well modeled with the sum of two exponential functions.The overall yield is extracted from the fit to the data, while all other parameters are fixed from simulation.The invariant-mass distribution of the combinatorial background is described with an exponential function of which the slope parameter is determined in the fit.
The number of signal candidates in the normalization channel is extracted from unbinned maximum likelihood fits to the K + π − and π + K − invariant-mass spectra, shown together in Figure 1, yielding N B 0 →K + π − = 179902 ± 450 signal decays, where the uncertainty is statistical only.Both fits yield compatible results for the signal shape parameters and the fractions of misidentified backgrounds.The difference in yields from the individual fits corresponds to a raw asymmetry that is found to be consistent with the result in a previous LHCb analysis [46].

Mass fit to B 0 (s) → pp
The pp invariant-mass distribution for selected candidates is shown in Figure 2. The shape of this distribution is modeled with three components, namely the B 0 → pp and B 0 s → pp signals, and combinatorial background.Contributions from specific physics background sources are determined to be negligible after the full event selection has been applied.The B 0 (s) → pp signal shapes are modeled with DSCB functions, as the B 0 → K + π − normalization channel.The core widths are taken from simulation with scaling factors to  account for differences in the resolution between data and simulated samples as determined from B 0 → K + π − candidates.The peak of the B 0 → pp DSCB is free to vary in the fit whereas the peak of the B 0 s → pp DSCB is again constrained by the known B 0 s -B 0 mass difference [38].The tail parameters of both B 0 (s) → pp modes are fixed from simulation.The mass distribution of the combinatorial background is described by an exponential function with the gradient as the only free parameter in the fit.
The yields of the B 0 (s) → pp decays are N (B 0 → pp) = 98±11 and N (B 0 s → pp) = 4±5 candidates, where the uncertainties are statistical.The significance of each of the signals is calculated from the change in the logarithm of the likelihood between fits with and without the signal component, following Wilks' theorem [47].The statistical signal significance is 16.2 σ and 0.9 σ for the B 0 → pp and B 0 s → pp decays, respectively.This measurement hence confirms the observation of the B 0 → pp decay previously published by the LHCb collaboration [10], while no evidence for the B 0 s → pp decay is found.

Systematic uncertainties
Systematic uncertainties cancel to a large extent because the branching fractions are measured relative to that of a topologically identical normalization mode.The main sources of uncertainty on the relative branching fraction measurement are the selection efficiencies, the B 0 → K + π − branching fraction, f s /f d , and the fit model.They are summarized in Table 1 (except for the contributions from the normalization mode) and hereafter discussed.The branching fraction of the normalization mode, B(B 0 → K + π − ) = (1.96± 0.05) × 10 −5 , is taken from Ref. [38].Its relative uncertainty of 2.6% is taken as a systematic uncertainty on the B 0 (s) → pp branching fractions.For the B 0 s → pp branching fraction measurement an extra 3.1% relative uncertainty comes from the uncertainty on the ratio of fragmentation fractions f s /f d .
The trigger efficiencies are assessed with simulation.The p T distributions of the decay final-state particles are well described in simulation and all final states comprise the same number of tracks.Neglecting very small p and p T differences between the B 0 → pp and B 0 s → pp modes, the ratios of B 0 → K + π − /B 0 → pp and B 0 → K + π − /B 0 s → pp trigger efficiencies are identical within uncertainties.No uncertainties are considered for the software triggers given that the dominant source of systematic uncertainty is the reconstruction of charged particles, taken into account below as tracking systematic uncertainties.The main contribution to the uncertainties on the trigger efficiencies comes from the imperfect description of the hardware trigger in the simulation.The accuracy of the efficiency determinations is assessed exploiting trigger calibration data subdivided by particle charge and momentum distribution to obtain efficiency maps for protons, kaons and pions individually.The effect is studied with simulated toys for the signals and the normalization mode, accounting for observed variations in the efficiency maps.A systematic uncertainty on the relative L0 efficiencies of 1.0% is assigned for both signal modes based on the differences observed with respect to the default determination of the efficiencies.
The B 0 → K + π − decay mode is used to assess a systematic uncertainty related to the selection.The discrepancies between data and simulation are assessed with B 0 → K + π − signal distributions obtained from weighted collision data, using the sPlot technique with m(K + π − ) as discriminating variable [48].The 2.0% level of disagreement between simulation and weighted data is assigned as the systematic uncertainty for the ratios of the B 0 (s) → pp to B 0 → K + π − selection efficiencies.Differences in the reconstruction of the decay particles may arise to some extent given that the final states of the signal modes contain only baryons whereas the final state of the normalization mode contains only mesons.The track reconstruction efficiencies of the B 0 (s) → pp and B 0 → K + π − decays are corrected with tracking calibration samples [49].A systematic uncertainty of 1.9% is assigned on the relative efficiency for reconstructing B 0 (s) → pp decays versus B 0 → K + π − decays.
The systematic uncertainties on the PID efficiencies are determined using a method based on resampling, alternative to the default method of correcting on a particle-by-particle basis the PID variables in simulation in order to match the PID performance in data.The data-driven PID resampling method effectively reweights the PID distributions in simulation exploiting calibration data samples, from which the PID selection efficiencies are evaluated.The largest discrepancy observed between both methods, which is 2.4%, is assigned as the overall systematic uncertainty due to the relative PID selection efficiencies for both B 0 → pp and B 0 s → pp.The fit systematic uncertainties originate from the finite accuracy of the mass models and from the uncertainties on the fixed parameters in the fits.They are investigated with pseudoexperiments in which the fixed parameters of the models are varied within their uncertainties or the combinatorial background is modeled with an alternative (secondorder) polynomial function.The observed signal yields are used as the input means for the Poisson distributions.The fit systematic uncertainties for the B 0 → pp and B 0 s → pp signals are 1.0% and 22.0%, respectively; for the normalization mode they amount to 1.0%.The overall systematic uncertainties on the B 0 → pp, B 0 s → pp branching fractions coming from the B 0 → pp, B 0 s → pp and B 0 → K + π − contributions in Equation 1 are 3.9%, 22.5% and 2.8%, respectively, combining all sources in quadrature.

Results and conclusions
In summary, an updated study of the rare two-body charmless baryonic decays B 0 (s) → pp is performed using the full proton-proton collision data sample collected with the LHCb detector, during 2015-2018, at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of approximately 6 fb −1 .No evidence for a B 0 s → pp signal is found and an upper limit (UL) at 90% (95%) confidence level (CL) is computed by integrating the likelihood as a function of the branching fraction, where the likelihood probability distribution function is obtained from the fit profile likelihood using a flat prior on the branching fraction: Figure 3 shows the resulting likelihood as a function of the branching fraction with combined statistical and systematic uncertainties, where the effect of the systematic uncertainties is accounted for convolving the statistical-only likelihood with a Gaussian function of mean zero and width equal to the fit systematic uncertainty.The 90% (95%) upper limit on the B 0 s → pp decay branching fraction is set at B(B 0 s → pp) < 4.4 (5.1) × 10 −9 at 90% (95%) CL.This world's best upper limit supersedes and improves the previous limit by approximately a factor of four, and constrains contributions from beyond SM processes.The B 0 → pp decay mode observation is confirmed with very large significance.Using the measured quantities of Equation 1 the branching fraction of the B 0 → pp decay is measured to be where the first uncertainty is statistical, the second is systematic and the third is due to the uncertainty on the branching fraction of the normalization channel.The combination of this measurement with the previous LHCb measurement [10] (data sample from years 2011-2012) of the B 0 → pp branching fraction is based on the minimization of a χ 2 dependent on the two measurements and their uncertainties.It provides B(B 0 → pp) = (1.27± 0.13 ± 0.05 ± 0.03) × 10 −8 , where correlated sources of systematic uncertainties are assumed as fully correlated and all other uncertainties considered as uncorrelated.The sources of correlated systematic uncertainties are the following: the B 0 → K + π − branching fraction, the relative L0 trigger uncertainties and the mass fit model parameters.
This precise measurement of the B 0 → pp branching fraction and stringent limit on the B 0 s → pp branching fraction, together with the experimental knowledge of the B + → pΛ branching fraction, improves the extraction of both tree and penguin amplitudes of charmless two-body baryonic B decays [14].A possible observation of the very rare decay mode B 0 s → pp and a determination of its branching fraction requires future data to be collected by LHCb.

Figure 1 :
Figure 1: Invariant-mass distribution of B 0 → K + π − candidates.The fit result is shown together with each fit model component.

Figure 2 :
Figure 2: Invariant-mass distribution of pp candidates with each fit model component.

Figure 3 :
Figure 3: Negative logarithm of the profile likelihoods (−2∆lnL) and the corresponding likelihood as a function of the B 0 s → pp branching fraction.The solid blue line represents the statistical-only profile, whereas the dashed red line includes the systematic uncertainties.The solid magenta line shows the distribution of likelihood.The dashed violet vertical line at 0 shows the start point of the integral.The black (cyan) arrow shows the upper limits at 90% (95%) confidence level.

Table 1 :
Systematic uncertainties on the B 0 (s) → pp branching fractions, in percentage (%), coming from the B 0 → pp and B 0 s → pp contributions in Equation1.The totals correspond to the quadratic sums of each column.