First observation of the decay $B_s^0 \to K^-\mu^+\nu_\mu$ and measurement of $|V_{ub}|/|V_{cb}|$

The first observation of the suppressed semileptonic $B_s^0 \to K^-\mu^+\nu_\mu$ decay is reported. Using a data sample recorded in {\it pp} collisions in 2012 with the LHCb detector, corresponding to an integrated luminosity of 2 $\mathrm{fb}^{-1}$, the branching fraction \mbox{$\mathcal{B}(B_s^0 \to K^-\mu^+\nu_\mu)$} is measured to be $(1.06\pm0.05~(\mathrm{stat})\pm0.08~(\mathrm{syst}))\times 10^{-4}$, where the first uncertainty is statistical and the second one represents the combined systematic uncertainties. The decay $B_s^0 \to D_s^-\mu^+\nu_\mu$, where $D_s^-$ is reconstructed in the final state $K^+K^-\pi^-$, is used as a normalization channel to minimize the experimental systematic uncertainty. Theoretical calculations on the form factors of the $B_s^0 \to K^-$ and $B_s^0 \to D_s^-$ transitions are employed to determine the ratio of the CKM matrix elements ${|V_{ub}|}/{|V_{cb}|}$ at low and high $B_s^0 \to K^-$ momentum transfer.

The coupling of the electroweak interaction between up- and down-type quarks is modulated by the Cabibbo-Kobayashi-Maskawa (CKM) matrix [1,2]. Improving the precision of the measurements of its elements makes it possible to probe deviations from the Standard Model of particle physics [3]. Hadrons containing a $b$ quark can decay weakly via a virtual $W$ boson to semileptonic final states through the tree-level transitions $b \to c(W^{*} \to \ell\nu)$ and $b \to u(W^{*} \to \ell\nu)$, where $\ell\nu$ denotes a charged lepton and a neutrino. These transitions involve the CKM matrix elements $V_{cb}$ and $V_{ub}$, respectively, which obey the observed hierarchy $|V_{ub}|/|V_{cb}| \sim 0.1$, resulting in the transitions $b \to c\ell\nu$ being favored over $b \to u\ell\nu$. Semileptonic $b$-hadron decays are an excellent ground for measuring $|V_{cb}|$ and $|V_{ub}|$ because the hadronic and leptonic parts of the amplitudes factorize, which eases theoretical calculations [4,5]. Existing $|V_{ub}|$ and $|V_{cb}|$ measurements show a discrepancy between those performed with exclusive decays, where all the visible particles are reconstructed, and inclusive decays, where only the lepton is reconstructed [6]. The world average of the exclusive $|V_{ub}|$ results is dominated by $B^0 \to \pi^-\ell^+\nu$ measurements. The LHCb measurement using the baryonic decays $\Lambda_b^0 \to p\mu^-\bar{\nu}_\mu$ and $\Lambda_b^0 \to \Lambda_c^+\mu^-\bar{\nu}_\mu$ [7] gives the ratio $|V_{ub}|/|V_{cb}| = 0.079 \pm 0.006$, as updated in Ref. [6]. Besides the inclusive versus exclusive puzzle, measurements of $|V_{ub}|/|V_{cb}|$ are important to constrain the CKM unitarity triangle [8,9].
This Letter reports the first observation of the decay $B_s^0 \to K^-\mu^+\nu_\mu$, the measurement of its branching fraction and of the ratio $|V_{ub}|/|V_{cb}|$, with $B_s^0 \to D_s^-\mu^+\nu_\mu$ as a normalization channel. The measurement of the branching fraction is performed in two regions of $q^2$, the $B_s^0 \to K^-$ momentum transfer or invariant mass squared of the muon and the neutrino, as well as integrated over the full $q^2$ range. The ratio $|V_{ub}|/|V_{cb}|$ is derived in the two $q^2$ regions using calculations of the form factors of the $B_s^0 \to K^-$ and $B_s^0 \to D_s^-$ transitions based on both light cone sum rule (LCSR) [10] and lattice QCD (LQCD) [11] methods. The data sample consists of $pp$ collisions recorded by the LHCb detector in 2012 at a center-of-mass energy of 8 TeV, corresponding to 2 $\mathrm{fb}^{-1}$ of integrated luminosity. The LHCb detector is a single-arm forward spectrometer covering the pseudorapidity range $2 < \eta < 5$, described in detail in Refs. [12,13]. The trigger [14] consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which reconstructs charged particles. Simulation, produced with software packages described in Refs. [15-17], is used to model the effects of the detector acceptance and the imposed selection requirements.
In this analysis, candidates for $B_s^0 \to K^-\mu^+\nu_\mu$ and $B_s^0 \to D_s^-\mu^+\nu_\mu$ decays are formed by combining a muon with a kaon or a $D_s^-$ candidate reconstructed through the decay $D_s^- \to K^+K^-\pi^-$. The trigger and initial selection requirements are chosen to be similar between these two modes. Events are retained by the hardware trigger due to the presence of a high-$p_T$ muon, where $p_T$ is the momentum component transverse to the beam. The software trigger [18] selects partially reconstructed $B$ decays by combining a track or a $D_s^-$ candidate with a well identified muon candidate. The initial selection includes requirements on the track kinematics and quality, particle identification, as well as on the $B_s^0$ candidate kinematics and decay topology. The obtained samples for each of the decays include background contributions dominated by $b$-hadron decays with additional tracks or neutral particles in the final state. For the $K^-\mu^+$ combinations, the main background originates from $H_b \to \mu^+ H_c(\to K^- X)X'$, where $H_{b,c}$ represents a hadron containing a $b$ or a $c$ quark and $X^{(\prime)}$ denotes unreconstructed particles. Decays to excited $K^{*}$ resonances, $B_s^0 \to K^{*-}(\to K^-\pi^0)\mu^+\nu_\mu$, and charmonium modes $B \to [c\bar{c}](\to \mu^+\mu^-)K^-X$, where $[c\bar{c}] = J/\psi, \psi(2S)$, are secondary background contributions. Other sources arise from $b$-hadron decays where a track is misidentified as a kaon or a muon, and from random combinations of a muon and a kaon. In the $D_s^-\mu^+$ combinations, the main (and irreducible) source of background arises from $B_s^0 \to D_s^{*-}(\to D_s^-\gamma)\mu^+\nu_\mu$ decays. Additional contributions include decays to higher excitations of the $D_s^-$ meson, $B_s^0 \to D_s^{**-}(\to D_s^- X)\mu^+\nu_\mu$, double-charm decays of the type $B_{u,d,s} \to D_s D X$ and semitauonic $B_s^0 \to D_s^-\tau^+\nu_\tau$ decays. To suppress background, the $K^-\mu^+$ and $D_s^-\mu^+$ candidates are required to be isolated from other tracks in the event.
A multivariate algorithm (MVA) is trained to determine if a given track originates from the candidate, or from the rest of the event (ROE). All ROE tracks are required to pass a minimum isolation threshold. For $K^-\mu^+$ candidates, two boosted decision tree (BDT) classifiers [19,20] are used sequentially to further reduce the remaining background. A charged BDT classifier is trained against a mixture of the main background components using, in addition to the isolation MVA output, invariant masses formed by the least isolated ROE track with respect to each of the muon or the kaon, and variables related to the $B_s^0$, $K^-$ and $\mu^+$ kinematics. The background passing the charged BDT requirement comprises decays without an additional track, mainly of the type where $P$ is either a long-lived or a neutral particle. A second BDT classifier, denoted neutral BDT, involves kinematic variables of the $K^-$ and $B_s^0$ candidates, the $B_s^0$ vertex position and quality, and the invariant mass formed by the signal kaon and any $\pi^0$ meson in its vicinity; it also exploits the asymmetry between the kaon momentum and an average momentum direction formed by neutral particles in the vicinity of the kaon. The shapes of the BDT outputs are calibrated with the decay $B^- \to J/\psi(\to \mu^+\mu^-)K^-$, which is reconstructed both as a $K^-\mu^+$ candidate and fully reconstructed, where the least isolated track near the $K^-\mu^+$ pair is identified as $\mu^-$. Kinematic weighting accounts for data-simulation discrepancies for the training of the classifiers.
The $B_s^0$ mass is represented by the corrected mass [21], defined as $m_{\mathrm{corr}} = \sqrt{m_{Y\mu}^2 + p_\perp^2} + p_\perp$, where $m_{Y\mu}$ is the invariant mass of the $Y\mu$ pair, with $Y = K^-$ or $D_s^-$, and $p_\perp$ is the momentum of this pair transverse to the $B_s^0$ flight direction. The flight direction is defined as the vector between the positions of the primary $pp$ collision vertex and the $B_s^0$ decay vertex. In order to improve the separation between the $B_s^0 \to K^-\mu^+\nu_\mu$ signal and background, the uncertainty on $m_{\mathrm{corr}}$ is required to be $\sigma(m_{\mathrm{corr}}) < 100~\mathrm{MeV}/c^2$. To derive $q^2$, the neutrino momentum is estimated using the $B_s^0$ flight direction and the known $B_s^0$ mass. A two-fold ambiguity resulting from this estimate is resolved by choosing the solution that is most consistent with the $B_s^0$ momentum predicted by a linear regression method [22]. The fit to the $m_{\mathrm{corr}}$ distribution, used for the extraction of the $B_s^0 \to K^-\mu^+\nu_\mu$ signal, is performed in two $q^2$ regions, respectively above and below $7~\mathrm{GeV}^2/c^4$ ("high" and "low"), which are chosen to contain approximately the same expected signal yields.
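The corrected mass can be illustrated with a short numerical sketch. It assumes the standard form $m_{\mathrm{corr}} = \sqrt{m_{Y\mu}^2 + p_\perp^2} + p_\perp$; the function name, argument layout and the example momentum are hypothetical and are not taken from the analysis software.

```python
import math

def corrected_mass(p_vis, m_vis, flight_dir):
    """Return (m_corr, p_perp) for a visible Y-mu pair.

    p_vis      : (px, py, pz) momentum of the visible pair, in MeV/c
    m_vis      : invariant mass of the visible pair, in MeV/c^2
    flight_dir : vector from the primary vertex to the decay vertex
                 (any non-zero length; it is normalised here)
    """
    norm = math.sqrt(sum(c * c for c in flight_dir))
    unit = [c / norm for c in flight_dir]
    # Momentum component along the flight direction.
    p_par = sum(p * u for p, u in zip(p_vis, unit))
    # Component transverse to the flight direction (guard against rounding).
    p_perp_sq = sum(p * p for p in p_vis) - p_par * p_par
    p_perp = math.sqrt(max(p_perp_sq, 0.0))
    m_corr = math.sqrt(m_vis ** 2 + p_perp ** 2) + p_perp
    return m_corr, p_perp

# Hypothetical candidate: flight direction along z, 1.5 GeV/c of transverse imbalance.
m_corr, p_perp = corrected_mass(
    p_vis=(1500.0, 0.0, 40000.0), m_vis=3000.0, flight_dir=(0.0, 0.0, 1.0)
)
print(f"p_perp = {p_perp:.1f} MeV/c, m_corr = {m_corr:.1f} MeV/c^2")
```

The transverse momentum of the visible pair compensates for the unreconstructed neutrino, so $m_{\mathrm{corr}}$ is always at least as large as the visible mass $m_{Y\mu}$.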
For the $B_s^0 \to D_s^-\mu^+\nu_\mu$ decay, a fit to the invariant mass of the $D_s^- \to K^+K^-\pi^-$ candidates is performed in 40 intervals of $m_{\mathrm{corr}}$ from 3000 to 6500 $\mathrm{MeV}/c^2$. This provides the $D_s^-$ yield as a function of $m_{\mathrm{corr}}$ and thus subtracts the background originating from combinations of random kaon and pion tracks. The obtained $m_{\mathrm{corr}}$ distribution is fit to extract the $B_s^0 \to D_s^-\mu^+\nu_\mu$ signal yield. For the $B_s^0 \to K^-\mu^+\nu_\mu$ decay, the combinatorial background is largely reduced by applying a topological criterion: the opening angle between the directions of the $K^-$ and $\mu^+$ candidates in the plane transverse to the $pp$ collision axis is required to be less than 90 degrees. The efficiency of this requirement on the signal is 93%, while it removes approximately 90% of the combinatorial background. The efficiencies of the signal and normalization channels are derived from simulation and take into account the effects of the triggers, reconstruction, selection, particle identification, isolation procedure, MVA requirements and detector acceptance. Data-driven corrections are applied to account for any mismodelling related to the kinematics, number of tracks in the event and particle identification variables. The efficiency ratio between the signal and normalization decays is $\varepsilon_K/\varepsilon_{D_s} = 1.109 \pm 0.018$, $0.553 \pm 0.009$ and $0.733 \pm 0.009$ for $q^2 < 7~\mathrm{GeV}^2/c^4$, $q^2 > 7~\mathrm{GeV}^2/c^4$ and the full $q^2$ range, respectively. The uncertainties reflect the limited size of the simulated samples.
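The topological criterion on the kaon-muon opening angle can be sketched as follows. The sketch assumes that the $pp$ collision axis is the $z$ axis, so that the transverse plane is the $(x, y)$ plane; the function names and the example momenta are hypothetical.

```python
import math

def transverse_opening_angle(p_kaon, p_muon):
    """Opening angle in degrees between two momenta projected onto the (x, y) plane.

    Each momentum is a (px, py, pz) tuple; z is assumed to be the collision axis.
    """
    phi_k = math.atan2(p_kaon[1], p_kaon[0])
    phi_mu = math.atan2(p_muon[1], p_muon[0])
    dphi = abs(phi_k - phi_mu)
    # Fold the azimuthal difference into the range [0, pi].
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.degrees(dphi)

def passes_topological_cut(p_kaon, p_muon, max_angle_deg=90.0):
    """True if the transverse opening angle is below the threshold."""
    return transverse_opening_angle(p_kaon, p_muon) < max_angle_deg

# Hypothetical candidates (momenta in arbitrary units).
print(passes_topological_cut((1.0, 0.0, 10.0), (1.0, 1.0, 5.0)))   # collimated pair
print(passes_topological_cut((1.0, 0.0, 10.0), (-1.0, 0.1, 5.0)))  # back-to-back pair
```

A kaon and a muon from the same boosted $B_s^0$ decay tend to be collimated in the transverse plane, whereas random combinations populate all azimuthal differences, which is consistent with the quoted signal efficiency and background rejection.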
The fit template for the $m_{\mathrm{corr}}$ distribution of the $B_s^0 \to K^-\mu^+\nu_\mu$ signal is obtained from simulation, while the shapes for the background components are derived from either simulation or control samples. The statistical uncertainties originating from the finite samples used to obtain the templates are accounted for in the fits [23]. The main background, $H_b \to H_c(\to K^- X)\mu^+ X'$, whose yield is free in the fit, is obtained with a simulated inclusive sample. The $B_s^0 \to K^{*-}(\to K^-\pi^0)\mu^+\nu_\mu$ background is modelled by simulating a mixture of three resonances ($K^{*-}(892)$, $K_0^{*-}(1430)$ and $K_2^{*-}(1430)$) with a substantial branching fraction to the $K^-\pi^0$ final state. Though the overall yield is free, the mixture is fixed to certain proportions, which are varied by up to a factor of 2.5 for systematic studies, according to available measurements of the decays $B^- \to K^{*-}\mu^+\mu^-$ and $B^- \to K^{*-}\eta/\phi$ [24]. The impact of a possible $B_s^0 \to K^-\pi^0\mu^+\nu_\mu$ nonresonant decay has also been considered and found to be absorbed by the resonant mixture. The charmonium background is dominated by $B^- \to J/\psi(\to \mu^+\mu^-)K^-X$ decays, with the fraction of the $B^- \to J/\psi(\to \mu^+\mu^-)K^-$ channel exceeding 75%. Its shape is determined with simulated $B^- \to J/\psi(\to \mu^+\mu^-)K^-X$ events, while its yield is derived from the yield of the $B^- \to J/\psi(\to \mu^+\mu^-)K^-$ signal peak in data. To recover that peak from $K^-\mu^+$ combinations, the missing momentum of the $\mu^-$ is calculated from the $B^-$ flight direction and the known $J/\psi$ mass. The background originating from the misidentification (MisID) of a pion, proton or muon as a kaon, or of a kaon, proton or pion as a muon, is modelled using data samples of $h\mu^+$ ($K^-h$) candidates with the same selection as for the main sample but where $h$ is a charged track which fails the kaon (muon) identification criteria. These control samples are thus enriched in misidentified tracks of the different species.
The different contributions to the kaon and muon MisID are unfolded using control samples of kinematically identified hadrons and muons [25]. These samples are used to derive the probabilities that a particle belonging to a given species and with particular kinematic properties would pass the kaon or muon criteria. With this method both the $m_{\mathrm{corr}}$ shape and the yield of the MisID are constrained. The combinatorial background is modelled with a separate data sample where a kaon and a muon from different events are combined. The obtained pseudocandidates undergo the same selection as the signal candidates and are corrected to reproduce the kinematic properties of the standard candidates.
The fit to the normalization channel $B_s^0 \to D_s^-\mu^+\nu_\mu$ employs shapes obtained from simulation. The $B_s^0 \to D_s^-\mu^+\nu_\mu$ decay is modelled with the recent form factor predictions of Ref. [26].
The corrected mass distributions of the signal and normalization candidates are shown in Fig. 1, with the binned maximum-likelihood fit projections overlaid. The $B_s^0 \to K^-\mu^+\nu_\mu$ yields for the $q^2 < 7~\mathrm{GeV}^2/c^4$ and $q^2 > 7~\mathrm{GeV}^2/c^4$ regions are found to be $N_K = 6922 \pm 285$ and $6399 \pm 370$, respectively, while the $B_s^0 \to D_s^-\mu^+\nu_\mu$ yield is $N_{D_s} = 201450 \pm 5200$. The uncertainties include both the effect of the limited data set and the finite size of the samples used to derive the fit templates. Unfolding the two effects in quadrature shows that they have similar sizes. This is the first observation of the decay $B_s^0 \to K^-\mu^+\nu_\mu$. The ratio of branching fractions is inferred as $R_{BF} \equiv \mathcal{B}(B_s^0 \to K^-\mu^+\nu_\mu)/\mathcal{B}(B_s^0 \to D_s^-\mu^+\nu_\mu) = (N_K/N_{D_s}) \times (\varepsilon_{D_s}/\varepsilon_K) \times \mathcal{B}(D_s^- \to K^+K^-\pi^-)$

Table 1: Systematic uncertainties for the full, low and high $q^2$ ranges.

where the uncertainties are statistical, systematic and due to the $D_s^- \to K^+K^-\pi^-$ branching fraction. Table 1 summarizes the systematic uncertainties. It includes uncertainties on the calibration and correction of the track reconstruction, trigger, particle identification, selection variables, migration of events between $q^2$ regions, efficiencies and the fit template distributions. The largest systematic uncertainty originates from the fit templates and is evaluated by varying the shape of the fit components according to alternative models and also by modifying within its uncertainty the mixture of exclusive decays representing some of the background contributions. In particular, the signal shape is varied using various form factor models [27-30]. A similar procedure is applied to the normalization channel. The tracking uncertainty comprises the limited precision on tracking efficiency corrections obtained from control samples in data, and the uncertainty on modelling the hadronic interactions with the detector material. The uncertainty on the $q^2$ migration is related to the limited accuracy of the evaluation of the cross-feed between low and high $q^2$ regions in simulation.
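The arithmetic linking the fitted yields to the ratio of branching fractions can be sketched with the numbers quoted in the text. The sketch assumes that the ratio is the efficiency-corrected yield ratio multiplied by the $D_s^- \to K^+K^-\pi^-$ branching fraction; that branching fraction is an external input whose value is not quoted here, so it is left as a function argument. Function names are hypothetical, only central values are treated, and corrections such as the $q^2$ migration are ignored.

```python
def efficiency_corrected_ratio(n_sig, n_norm, eff_ratio):
    """Signal-to-normalisation yield ratio corrected by eff_ratio = eps_sig / eps_norm."""
    return n_sig / n_norm / eff_ratio

def branching_fraction_ratio(n_sig, n_norm, eff_ratio, bf_ds_to_kkpi):
    """Assumed relation: R_BF = (N_K / N_Ds) * (eps_Ds / eps_K) * B(Ds -> K K pi).

    bf_ds_to_kkpi is an external input that must be supplied by the caller.
    """
    return efficiency_corrected_ratio(n_sig, n_norm, eff_ratio) * bf_ds_to_kkpi

# Yields and efficiency ratios quoted in the text.
N_DS = 201450.0
low = efficiency_corrected_ratio(6922.0, N_DS, 1.109)   # q^2 < 7 GeV^2/c^4
high = efficiency_corrected_ratio(6399.0, N_DS, 0.553)  # q^2 > 7 GeV^2/c^4
print(f"low q2:  {low:.5f}")
print(f"high q2: {high:.5f}")
```

Although the two $q^2$ regions contain similar signal yields, the lower signal efficiency at high $q^2$ makes the efficiency-corrected ratio in that region almost twice as large as at low $q^2$.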
where the uncertainties are statistical, systematic, from the external inputs ($D_s^-$ branching fraction, $B_s^0$ lifetime and $|V_{cb}|$) and the $B_s^0 \to D_s^-$ form factor integral, respectively. Combining the systematic uncertainties, the branching fraction is $\mathcal{B}(B_s^0 \to K^-\mu^+\nu_\mu) = (1.06 \pm 0.05~(\mathrm{stat}) \pm 0.08~(\mathrm{syst})) \times 10^{-4}$. The ratio of CKM elements $|V_{ub}|/|V_{cb}|$ is obtained through the relation $R_{BF} = |V_{ub}|^2/|V_{cb}|^2 \times \mathrm{FF}_K/\mathrm{FF}_{D_s}$. For the $\mathrm{FF}_K$ value, a recent LQCD prediction is used for the high $q^2$ range, $\mathrm{FF}_K(q^2 > 7~\mathrm{GeV}^2/c^4) = 3.32 \pm 0.46~\mathrm{ps}^{-1}$ [29], while a LCSR calculation is used for the low $q^2$ range, $\mathrm{FF}_K(q^2 < 7~\mathrm{GeV}^2/c^4) = 4.14 \pm 0.38~\mathrm{ps}^{-1}$ [30], due to the lower accuracy of LQCD calculations in this region. The obtained values are where the latter two uncertainties are from the $D_s^-$ branching fraction and the form factor integrals. The discrepancy between the values of $|V_{ub}|/|V_{cb}|$ for the low and high $q^2$ ranges is due to the difference in the theoretical calculations of the form factors. To illustrate this, the LQCD calculation in Ref. [29] gives $\mathrm{FF}_K = 0.94 \pm 0.48~\mathrm{ps}^{-1}$ at low $q^2$, which can be compared to the chosen LCSR value, $4.14 \pm 0.38~\mathrm{ps}^{-1}$ [30]. Figure 2 depicts the $|V_{ub}|/|V_{cb}|$ measurements of this Letter, $|V_{ub}|/|V_{cb}|(\mathrm{low}) = 0.061 \pm 0.004$ and $|V_{ub}|/|V_{cb}|(\mathrm{high}) = 0.095 \pm 0.008$, with the uncertainties combined. The $|V_{ub}|/|V_{cb}|$ measurement obtained with the $\Lambda_b^0$ baryon decays [7], for which a form factor model based on a LQCD calculation [31] was used, is also shown.
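The sensitivity of the result to the form factor input follows from inverting the relation above, which gives $|V_{ub}|/|V_{cb}| = \sqrt{R_{BF}\,\mathrm{FF}_{D_s}/\mathrm{FF}_K}$, so that the ratio scales as $1/\sqrt{\mathrm{FF}_K}$. The sketch below uses central values only and hypothetical function names; it is an illustration of this scaling, not a reproduction of the analysis, and it ignores the large uncertainty on the low-$q^2$ LQCD value.

```python
import math

def ckm_ratio(r_bf, ff_k, ff_ds):
    """|V_ub|/|V_cb| from R_BF = |V_ub|^2/|V_cb|^2 * FF_K/FF_Ds."""
    return math.sqrt(r_bf * ff_ds / ff_k)

def rescale_for_new_ff_k(ratio, ff_k_old, ff_k_new):
    """Rescale a |V_ub|/|V_cb| value when FF_K changes, using the 1/sqrt(FF_K) scaling."""
    return ratio * math.sqrt(ff_k_old / ff_k_new)

# Low-q^2 central values quoted in the text (in ps^-1).
FF_K_LCSR = 4.14  # value used for the low-q^2 result
FF_K_LQCD = 0.94  # alternative LQCD value at low q^2

shifted = rescale_for_new_ff_k(0.061, FF_K_LCSR, FF_K_LQCD)
print(f"scale factor: {math.sqrt(FF_K_LCSR / FF_K_LQCD):.2f}")
print(f"low-q2 ratio with the LQCD form factor: {shifted:.3f}")
```

With these central values, replacing the LCSR input by the LQCD one would roughly double the low-$q^2$ value of $|V_{ub}|/|V_{cb}|$, which shows how strongly the comparison between the two $q^2$ regions depends on the choice of form factor calculation.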
In conclusion, the decay $B_s^0 \to K^-\mu^+\nu_\mu$ is observed for the first time. The branching fraction ratios in the two $q^2$ regions reported in this Letter represent the first experimental ingredient to the form factor calculations of the $B_s^0 \to K^-\mu^+\nu_\mu$ decay. Moreover, the $|V_{ub}|/|V_{cb}|$ results will improve both the averages of the exclusive measurements in the $(|V_{cb}|, |V_{ub}|)$ plane and the precision on the least known side of the CKM unitarity triangle.