Search for $B^{-}\to\mu^{-}\bar\nu_\mu$ Decays at the Belle Experiment

We report the result of a search for the decay $B^{-}\to\mu^{-}\bar\nu_\mu$. The signal events are selected based on the presence of a high momentum muon and the topology of the rest of the event showing properties of a generic $B$-meson decay, as well as the missing energy and momentum being consistent with the hypothesis of a neutrino from the signal decay. We find a 2.4 standard deviation excess above background including systematic uncertainties, which corresponds to a branching fraction of ${\cal B}(B^{-}\to\mu^{-}\bar\nu_\mu) =(6.46 \pm 2.22 \pm 1.60 )\times10^{-7}$ or a frequentist 90% confidence level interval on the $B^{-}\to\mu^{-}\bar\nu_\mu$ branching fraction of $[2.9, 10.7]\times 10^{-7}$. This result is obtained from a $711\ \text{fb}^{-1}$ data sample that contains $772 \times 10^6$ $B\bar{B}$ pairs, collected near the $\Upsilon(4S)$ resonance with the Belle detector at the KEKB asymmetric-energy $e^+ e^-$ collider.

PACS numbers: 14.40.Nd, 12.15.Hh, 12.38.Gc

In the Standard Model (SM), the branching fraction for the purely leptonic decay of a $B^{-}$ meson [1], assuming a massless neutrino, is

$$ {\cal B}(B^{-}\to\ell^{-}\bar\nu_\ell) = \frac{G_F^2\, m_B\, m_\ell^2}{8\pi}\left(1-\frac{m_\ell^2}{m_B^2}\right)^2 f_B^2\, |V_{ub}|^2\, \tau_B, \tag{1} $$

where $G_F$ is the Fermi constant, $m_B$ and $m_\ell$ are the masses of the $B$ meson and charged lepton, respectively, $f_B$ is the $B$-meson decay constant obtained from theory, $\tau_B$ is the lifetime of the $B$ meson and $V_{ub}$ is the CKM matrix element governing the coupling between $u$ and $b$ quarks. The FLAG [2] average of lattice QCD calculations gives $f_B = 0.186 \pm 0.004$ GeV, and the world-average value of $\tau_B$ is $1.638 \pm 0.004$ ps [3]. For the value of $|V_{ub}|$, we repeat the fit procedure described in Ref. [4], equipped with the most recent lattice QCD calculation by the FNAL/MILC collaborations [5] that provides a tight constraint on the hadronic form-factor $f_+(q^2)$ governing exclusive $\bar B^0\to\pi^+\ell^-\bar\nu_\ell$ decays. The form-factor parameters for $\bar B^0\to\pi^+\ell^-\bar\nu_\ell$ decay are also obtained with this procedure. The value of $|V_{ub}|$ thus obtained is $|V_{ub}|\times 10^3 = 3.736 \pm 0.142$ with fit quality $\chi^2 = 47.9$ for 45 degrees of freedom. Using these values as input parameters for Eq. 1, the expected branching fractions for $B^-\to\ell^-\bar\nu_\ell$ decays are displayed in Table I. Also shown in the Table are the expected event yields for $B^-\to\ell^-\bar\nu_\ell$ decays in the full Belle data set, where we use ${\cal B}(\Upsilon(4S)\to B^+B^-) = 0.514 \pm 0.006$ [3].

Due to the relatively small theoretical uncertainties within the SM framework, $B^-\to\ell^-\bar\nu_\ell$ decays are good candidates for testing SM predictions and searching for phenomena that might modify them. For instance, the effects of charged Higgs bosons in two-Higgs-doublet models of type-II [6], the R-parity-violating Minimal Supersymmetric Standard Model (MSSM) [7], or leptoquarks [8] may significantly change the $B^-\to\ell^-\bar\nu_\ell$ decay rates.
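As a rough numerical cross-check of Eq. 1 for the muon channel, the following sketch evaluates the SM expression with the inputs quoted above. The values of $G_F$, $m_B$, $m_\mu$ and $\hbar$ are not given in the text; they are approximate world-average numbers assumed here for illustration only.

```python
import math

# Inputs quoted in the text
F_B = 0.186          # GeV, B-meson decay constant (FLAG average)
TAU_B_PS = 1.638     # ps, B-meson lifetime
V_UB = 3.736e-3      # |V_ub| from the fit described in the text

# Assumed inputs, NOT quoted in the text (approximate world averages)
G_F = 1.1663787e-5       # GeV^-2, Fermi constant
M_B = 5.27934            # GeV, charged B-meson mass
M_MU = 0.1056584         # GeV, muon mass
HBAR = 6.582119569e-25   # GeV s, converts the lifetime to GeV^-1


def leptonic_branching_fraction(m_lepton):
    """Evaluate Eq. 1 in natural units for a charged lepton of mass m_lepton (GeV)."""
    tau_b = TAU_B_PS * 1e-12 / HBAR  # lifetime in GeV^-1
    helicity = m_lepton**2 * (1.0 - m_lepton**2 / M_B**2) ** 2
    return G_F**2 * M_B * helicity / (8.0 * math.pi) * F_B**2 * V_UB**2 * tau_b


bf_mu = leptonic_branching_fraction(M_MU)
print(f"B(B -> mu nu) ~ {bf_mu:.2e}")
```

With these assumed constants the result is close to $3.8\times10^{-7}$; the $m_\ell^2$ factor is the helicity suppression that makes the electron channel far rarer and the tau channel far more frequent.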
Moreover, by taking the ratios of purely leptonic $B^-$ decays, most of the input parameters in Eq. 1 cancel and very precise values are predicted. Predictions of these ratios obtained within a general MSSM at large $\tan\beta$ [9] with heavy squarks [10] deviate from the SM expectations, and the deviation can be as large as an order of magnitude in the grand unified theory framework [11].
In this article, we present a search for the decay $B^-\to\mu^-\bar\nu_\mu$ that also uses the untagged method. This study is based on a $711\ \text{fb}^{-1}$ data sample that contains $(772 \pm 11)\times 10^6$ $B\bar{B}$ pairs, collected with the Belle detector at the KEKB asymmetric-energy $e^+e^-$ (3.5 on 8 GeV) collider [17] operating at the $\Upsilon(4S)$ resonance.
The Belle detector is a large-solid-angle magnetic spectrometer that consists of a silicon vertex detector (SVD), a 50-layer central drift chamber (CDC), an array of aerogel threshold Cherenkov counters (ACC), a barrel-like arrangement of time-of-flight scintillation counters (TOF) and an electromagnetic calorimeter comprised of CsI(Tl) crystals (ECL) located inside a superconducting solenoid coil that provides a 1.5 T magnetic field. An iron flux-return yoke located outside of the coil is instrumented to detect $K^0_L$ mesons and to identify muons (KLM). The detector is described in detail elsewhere [18]. Two inner detector configurations were used. A 2.0 cm beampipe and a 3-layer silicon vertex detector were used for the first sample of $152\times 10^6$ $B\bar{B}$ pairs, while a 1.5 cm beampipe, a 4-layer silicon detector and a small-cell inner drift chamber were used to record the remaining $620 \times 10^6$ $B\bar{B}$ pairs [19].
The data were collected at a center-of-mass energy of 10.58 GeV, corresponding to the $\Upsilon(4S)$ resonance. The size of the data sample is equivalent to an integrated luminosity of $711\ \text{fb}^{-1}$. We also use a sample of $79\ \text{fb}^{-1}$ collected below the $B\bar{B}$ threshold to characterize the contribution of the $e^+e^-\to q\bar{q}$ process, the so-called continuum, where $q$ is a $u$, $d$, $s$, or $c$ quark; this is one of the major backgrounds.
We use Monte Carlo (MC) samples based on the detailed detector geometry description implemented with the GEANT3 package [20] to establish the analysis technique and study major backgrounds. Events with $B$-meson decays are generated using EvtGen [21]. The generated samples include $2 \times 10^6$ signal events, a sample of generic $B\bar{B}$ decays corresponding to ten times the integrated luminosity of the data, continuum corresponding to six times the data, $\bar B\to X_u\ell^-\bar\nu_\ell$ decays corresponding to twenty times the data, other $B$ decays with probability $4 \times 10^{-4}$ corresponding to fifty times the data, and $e^+e^-\to\tau^+\tau^-$ corresponding to five times the data, as well as other QED and two-photon processes with various multiples of the data. The simulation accounts for the evolution in background conditions and beam collision parameters. Final-state radiation from charged particles is modelled using the PHOTOS package [22].
MC samples for one of the largest backgrounds from $B$ decays, charmless semileptonic decays, are generated according to the number of $B\bar{B}$ pairs in data, scaled by a factor of 20, assuming inclusive semileptonic branching frac- [23,24]. Other decays to exclusive meson states are modelled using the updated quark model by Isgur-Scora-Grinstein-Wise [25]. The inclusive component of charmless semileptonic decays is modelled to leading order in $\alpha_s$ based on a prediction in the Heavy-Quark Expansion framework [26]. The fragmentation process of the resulting parton to the final hadron state is modelled using the PYTHIA6.2 package [27].
In addition, $8 \times 10^6$ $\bar B\to\pi\ell^-\bar\nu_\ell$ MC events are generated uniformly as a function of $q^2$. These events are reweighted to the most recent lattice QCD form-factor calculation, in order to decrease MC statistical fluctuations at high $q^2$ and to study the behavior of the fit procedure described below when form-factors are varied within uncertainties.
Finally, $10^6$ events of the three-body decay $B^-\to\mu^-\bar\nu_\mu\gamma$ are generated with photon energy above 25 MeV in the $B$ decay frame with the form-factor parameters $R = 3$ and $m_b = 5$ GeV based on the work in Ref. [28].
The muon in $B^-\to\mu^-\bar\nu_\mu$ decay is monochromatic in the absence of radiation, with an energy of half the $B$-meson rest mass energy in the $B$-meson rest frame. In the $\Upsilon(4S)$ center-of-mass frame, where the $B$ meson is in motion, the boost smears the momentum of the muon, $p^*_\mu$, to the range (2.476, 2.812) GeV/$c$. We select well-reconstructed muon candidates in the wider region of (2.2, 4.0) GeV/$c$ to include enough data to validate the analysis procedure and estimate backgrounds. A blind analysis is performed with the $\Upsilon(4S)$ data in the $p^*_\mu$ interval (2.45, 2.85) GeV/$c$ excluded until the analysis procedure has been finalized. Signal muons are identified by a standard procedure based on their penetration range and degree of transverse scattering in the KLM detector with an efficiency of about 90% [29]. An additional selection is applied with information from the CDC, ECL, ACC, and TOF subdetectors, combined using an artificial neural network, to reject the charged-kaon muonic decay in flight. Background suppression of 33% is achieved by this procedure, with a signal-muon selection efficiency of 97%.
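The quoted momentum window follows from two-body kinematics and the boost of the $B$ meson in the $\Upsilon(4S)$ frame. The sketch below reproduces it approximately; the $B$ and muon masses are assumed world-average values that are not quoted in the text, and the center-of-mass energy is taken as 10.58 GeV.

```python
import math

SQRT_S = 10.58     # GeV, center-of-mass energy quoted in the text
M_B = 5.27934      # GeV, assumed charged B-meson mass (not from the text)
M_MU = 0.1056584   # GeV, assumed muon mass (not from the text)


def muon_momentum_range():
    """Return (p_min, p_max) of the signal muon in the Upsilon(4S) frame, in GeV/c."""
    # Each B meson carries half of the center-of-mass energy.
    e_b = SQRT_S / 2.0
    p_b = math.sqrt(e_b**2 - M_B**2)
    gamma = e_b / M_B
    beta_gamma = p_b / M_B

    # Two-body decay B -> mu nu in the B rest frame (massless neutrino).
    p_rest = (M_B**2 - M_MU**2) / (2.0 * M_B)
    e_rest = (M_B**2 + M_MU**2) / (2.0 * M_B)

    # Extremes occur when the muon is emitted against or along the B direction.
    p_min = gamma * p_rest - beta_gamma * e_rest
    p_max = gamma * p_rest + beta_gamma * e_rest
    return p_min, p_max


p_min, p_max = muon_momentum_range()
print(f"p*_mu range: ({p_min:.3f}, {p_max:.3f}) GeV/c")
```

Under these assumptions the endpoints agree with the quoted interval to within about 1 MeV/$c$.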
Charged particles, including the signal muon candidate, are required to originate from the region near the interaction point (IP) of the electron and positron beams. This region is defined by $|z_{\rm PCA}| < 2$ cm and $r_{\rm PCA} < 0.5$ cm, where $z_{\rm PCA}$ is the distance of the point of closest approach (PCA) from the IP along the $z$ axis (opposite the positron beam) and $r_{\rm PCA}$ is the distance from this axis in the transverse plane. The charged daughters of reconstructed long-lived neutral particles (converted $\gamma$, $K^0_S$, and $\Lambda$) are included in this list even if they fail the IP selection. All other charged particles are ignored. We discard the event if the total momentum of these particles exceeds 1.3 GeV/$c$ to suppress the background from mis-reconstructed long-lived neutral particles.
Each surviving track that is not classified as a long-lived neutral-particle daughter is assigned a unique identity. Electrons are identified using the ratio of the energy detected in the ECL to the track momentum, the ECL shower shape, position matching between the track and ECL cluster, the energy loss in the CDC, and the response of the ACC [30]. Muons are identified as described earlier for the signal muon candidates. Pions, kaons and protons are identified using the responses of the CDC, ACC, and TOF. In the expected momentum region for particles from $B$-meson decays, charged leptons are identified with an efficiency of about 75% while the probability to misidentify a pion as an electron (muon) is 1.9% (5%). Charged pions (kaons, protons) are selected with an efficiency of 86% (75%, 98%) and a pion (kaon, proton) misidentification probability of 6% (13%, 72%).
Photon candidates are selected using a polar-angle-dependent energy threshold chosen such that a photon with energy above (below) the threshold is more likely to originate from $B$-meson decay (calorimeter noise). In the barrel calorimeter, the energy threshold is about 40 MeV; in the forward and backward endcaps, it rises to 110 MeV and 150 MeV, respectively. Additionally, we require the total energy deposition in the calorimeter that is neither associated with charged particles nor recognized as photons to be under 0.6 GeV.
The neutrino in $B^-\to\mu^-\bar\nu_\mu$ decay is not detected. The photons and surviving charged particles other than the signal muon should come from the companion $B$ meson in the $e^+e^-\to\Upsilon(4S)\to B^+B^-$ process. We select companion $B$-meson candidates that have invariant mass close to the nominal $B$-meson mass and total energy close to the nominal $B$-meson energy from the $\Upsilon(4S)\to B\bar{B}$ decay. These quantities are represented by the beam-constrained mass and energy

$$ M_{\rm bc} = \sqrt{E_{\rm beam}^2/c^4 - \Big|\sum_i \vec{p}^{\,*}_i\Big|^2/c^2}, \tag{2} $$

$$ E_B = \sum_i \sqrt{m_i^2 c^4 + |\vec{p}^{\,*}_i|^2 c^2}, \tag{3} $$

where $E_{\rm beam}$ is the beam energy in the $\Upsilon(4S)$ center-of-mass frame, and $\vec{p}^{\,*}_i$ and $m_i$ are the center-of-mass frame momentum and mass, respectively, of the $i$-th particle that makes up the accompanying $B$-meson candidate. We retain events that satisfy $M_{\rm bc} > 5.1$ GeV/$c^2$ and $-3\ \text{GeV} < E_B - E_{\rm beam} < 2\ \text{GeV}$.
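The companion-$B$ selection can be illustrated with a short sketch that computes the beam-constrained mass and the energy difference from a list of center-of-mass-frame particles and applies the two requirements quoted above. The function names, the tuple layout `(px, py, pz, m)` and the toy candidate are hypothetical; natural units with $c = 1$ are assumed.

```python
import math

E_BEAM = 10.58 / 2.0  # GeV, beam energy in the center-of-mass frame


def companion_b_kinematics(particles, e_beam=E_BEAM):
    """Return (M_bc, E_B - E_beam) for particles given as (px, py, pz, m) in GeV."""
    sum_px = sum(p[0] for p in particles)
    sum_py = sum(p[1] for p in particles)
    sum_pz = sum(p[2] for p in particles)
    p_sq = sum_px**2 + sum_py**2 + sum_pz**2

    # Beam-constrained mass; clamp at zero for badly mis-reconstructed candidates.
    m_bc = math.sqrt(max(e_beam**2 - p_sq, 0.0))

    # Total energy of the candidate from the assigned particle masses.
    e_b = sum(math.sqrt(m**2 + px**2 + py**2 + pz**2) for px, py, pz, m in particles)
    return m_bc, e_b - e_beam


def passes_companion_selection(m_bc, delta_e):
    """Loose requirements quoted in the text."""
    return m_bc > 5.1 and -3.0 < delta_e < 2.0


# Toy candidate: a single object with roughly the mass and momentum of a B meson.
m_bc, delta_e = companion_b_kinematics([(0.0, 0.0, 0.3357, 5.279)])
print(m_bc, delta_e, passes_companion_selection(m_bc, delta_e))
```

The windows are deliberately wide because the companion $B$ meson is reconstructed inclusively, so missing or extra particles shift both quantities.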
To exploit the jet-like structure of non-$B\bar{B}$ background, where particles tend to be produced collinearly, we define the direction $\hat{n}$ of the thrust axis by maximizing the quantity

$$ T = \frac{\sum_i |\hat{n}\cdot\vec{p}^{\,*}_i|}{\sum_i |\vec{p}^{\,*}_i|}, \tag{4} $$

while satisfying the condition $\hat{n}\cdot\big(\sum_i \vec{p}^{\,*}_i\big) > 0$. We require $\hat{n}\cdot\hat{p}^*_\mu > -0.8$, where $\hat{p}^*_\mu$ is the signal-muon direction, to remove muons collinear with the other particles in the event.
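A minimal sketch of the thrust-axis requirement is shown below. It finds the axis by a brute-force scan over a grid of directions, which is a simplification chosen for clarity and not necessarily the algorithm used in the analysis; all function names and the toy momenta are hypothetical.

```python
import math


def scan_directions(n):
    """Return n roughly uniform unit vectors (Fibonacci-sphere grid)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    dirs = []
    for k in range(n):
        z = 1.0 - 2.0 * (k + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        phi = golden_angle * k
        dirs.append((r * math.cos(phi), r * math.sin(phi), z))
    return dirs


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def thrust(momenta, n_dirs=20000):
    """Approximate thrust value and axis for a list of (px, py, pz) momenta."""
    norm = sum(math.sqrt(dot(p, p)) for p in momenta)
    best_t, best_n = -1.0, None
    for n in scan_directions(n_dirs):
        t = sum(abs(dot(n, p)) for p in momenta) / norm
        if t > best_t:
            best_t, best_n = t, n
    # Fix the sign of the axis so that it points along the summed momentum.
    total = tuple(sum(p[i] for p in momenta) for i in range(3))
    if dot(best_n, total) < 0:
        best_n = tuple(-c for c in best_n)
    return best_t, best_n


def muon_passes(n_hat, p_mu):
    """Requirement n_hat . p_hat_mu > -0.8 from the text."""
    return dot(n_hat, p_mu) / math.sqrt(dot(p_mu, p_mu)) > -0.8


# Toy event: two back-to-back particles along z (thrust close to 1).
t_val, n_hat = thrust([(0.0, 0.0, 2.0), (0.0, 0.0, -1.5)])
print(t_val, n_hat)
```

The grid resolution limits the precision of the axis; a production implementation would refine the maximum iteratively.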
The missing energy of a neutrino from semileptonic decays of B or D mesons can be similar to that of the signal, and an excess of reconstructed charged leptons is a signature of these decays. We therefore require no more than one additional lepton in the event besides the signal muon.
The information from the KLM detector subsystem is also used to improve signal purity. We require no more than one $K^0_L$ cluster in the KLM and no $K^0_L$ clusters associated with ECL clusters. This selection rejects about 24% of background events and keeps about 90% of signal. The $K^0_L$ detection efficiency is calibrated using a $D^0\to\phi K^0_S$ control sample. The total signal selection efficiency for $B^-\to\mu^-\bar\nu_\mu$ decays is estimated at this stage to be around 38%, with an expected signal yield of $115 \pm 9$.
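The expected yield can be cross-checked from numbers quoted in the text: the number of $B\bar{B}$ pairs, the charged fraction ${\cal B}(\Upsilon(4S)\to B^+B^-)$, the branching fraction assumed in the MC, and the selection efficiency. The sketch below is only an order-of-magnitude check, since the efficiency is quoted as "around 38%".

```python
N_BB_PAIRS = 772e6   # number of B Bbar pairs in the data sample
F_CHARGED = 0.514    # B(Upsilon(4S) -> B+ B-)
BF_SIGNAL = 3.80e-7  # B(B -> mu nu) assumed in the MC prediction
EFFICIENCY = 0.38    # approximate total signal selection efficiency


def expected_signal_yield():
    # Each B+ B- event contains two charged B mesons that can decay to mu nu.
    n_charged_b = 2.0 * N_BB_PAIRS * F_CHARGED
    return n_charged_b * BF_SIGNAL * EFFICIENCY


n_expected = expected_signal_yield()
print(f"expected signal yield ~ {n_expected:.0f}")
```

The result is consistent with the quoted expectation of about 115 events.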
After all of the selections described above are applied, the remaining background is still more than three orders of magnitude larger than the expected signal yield. A multivariate data analysis is employed to further separate signal from background. We combine various kinematic parameters of an event into a single variable $o_{nn}$ using an artificial neural network. We choose 14 input parameters that are uncorrelated with the absolute value of the muon momentum, and that collectively yield the best signal-to-background ratio. These parameters are five event-shape moments, the polar angle of the missing momentum vector, the angle between the thrust axis and the signal-muon direction, the energy difference $E_B - E_{\rm beam}$, the angle between the signal-muon direction and the thrust axis calculated using only photons, the angle between the momentum of the companion $B$ meson and the signal-muon direction, the $z$-axis distance between the signal muon's $z_{\rm PCA}$ and the reconstructed vertex of the companion $B$ meson, the square of the thrust as defined in Eq. (4), the sum of charges of charged particles in an event, and the polar angle of the muon momentum vector.

FIG. 1. The distributions of the neural network output variable for the signal and major background processes predicted by MC in the signal-enhanced region $2.644\ \text{GeV}/c < p^*_\mu < 2.812\ \text{GeV}/c$.
The employed configuration of the network consists of the input layer and two hidden layers having 56 and 28 neurons and the tanh activation function; in total, it has 2465 parameters to optimize. The MC sample is divided into equal training and testing parts with almost 2 million events in each. The distributions of the neural network output variables in the signal-enhanced momentum region are shown in Fig. 1. The only background components peaking in the signal region are $\bar B\to\pi\ell^-\bar\nu_\ell$ and, much less prominently, $\bar B\to\rho\ell^-\bar\nu_\ell$. All other major backgrounds decrease significantly approaching the $o_{nn}\sim 1$ region and do not have a peaking behavior in the $o_{nn}$ variable that can mimic the signal.
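The quoted number of parameters is what one obtains for a fully connected network with 14 inputs, hidden layers of 56 and 28 neurons, and a single output neuron, counting one bias per neuron. The single output neuron and the bias terms are assumptions made here, since the text does not spell them out.

```python
def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected feed-forward network."""
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += (n_in + 1) * n_out  # n_in weights and one bias per output neuron
    return total


# 14 inputs, hidden layers of 56 and 28 neurons, one output (assumed)
n_params = count_parameters([14, 56, 28, 1])
print(n_params)  # 2465
```

The breakdown is 840 parameters into the first hidden layer, 1596 into the second, and 29 into the output.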
The signal yield is extracted by a binned maximum-likelihood fit in the $p^*_\mu$-$o_{nn}$ plane using the method described in Ref. [31], taking into account the uncertainty arising from the finite number of events in the template MC histograms. The fit region covers muon momenta from 2.2 to 4 GeV/$c$ with 50 MeV/$c$ bins and the full range of the $o_{nn}$ variable from $-1$ to 1 with 0.04 bins. The region at high muon momentum $p^*_\mu$ and high $o_{nn}$ is sparsely populated; to avoid bins with zero or a few events, which are undesirable for the fit method employed, we increased the bin size in this region. The fine binning in the signal region is preserved. After the rebinning, the $p^*_\mu$-$o_{nn}$ histogram is reduced from 1800 to 1226 bins. The fit method tends to scale low-populated templates to improve the fit to data; because of this, background components with a predicted fraction below 1% of the total number of events are fixed in the fit to the MC prediction. The fitted-yield components are the signal, $\bar B\to\pi\ell^-\bar\nu_\ell$, $\bar B\to\rho\ell^-\bar\nu_\ell$, the rest of the charmless semileptonic decays, $B\bar{B}$, $c\bar{c}$, $uds$, $\tau^+\tau^-$, and $e^+e^-\mu^+\mu^-$. The fixed-yield components are $\mu^+\mu^-$, $e^+e^-e^+e^-$, $e^+e^-u\bar{u}$, $e^+e^-s\bar{s}$, and $e^+e^-c\bar{c}$.
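The initial bin count follows directly from the quoted ranges and bin widths, as the short sketch below shows. The merged binning that yields 1226 bins is not specified in the text and is therefore not reproduced.

```python
def n_bins(low, high, width):
    """Number of equal-width bins covering [low, high]."""
    return round((high - low) / width)


n_p = n_bins(2.2, 4.0, 0.05)    # muon momentum, 50 MeV/c bins
n_o = n_bins(-1.0, 1.0, 0.04)   # neural network output, 0.04-wide bins
n_total = n_p * n_o
print(n_p, n_o, n_total)  # 36 50 1800
```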
To obtain the signal branching fraction, we fit the ratio $R = N_{B\to\mu\bar\nu_\mu}/N_{B\to\pi\mu\bar\nu_\mu}$. This ratio also helps to reliably estimate the fit uncertainty. The result of the fit is $R = (1.66 \pm 0.57)\times 10^{-2}$, which is equivalent to a signal yield of $N_{B\to\mu\bar\nu_\mu} = 195 \pm 67$ and the branching fraction ratio of ${\cal B}(B^-\to\mu^-\bar\nu_\mu)/{\cal B}(\bar B\to\pi\ell^-\bar\nu_\ell) = (4.45 \pm 1.53_{\rm stat})\times 10^{-3}$. This result can be compared to the MC prediction of this ratio, $R_{\rm MC} = 114.6/11746 = 0.976\times 10^{-2}$, obtained assuming ${\cal B}(B\to\mu\bar\nu_\mu) = 3.80\times 10^{-7}$ and ${\cal B}(\bar B\to\pi\ell^-\bar\nu_\ell) = 1.45\times 10^{-4}$ (the PDG average [3]). The fitted value of $R$ results in the branching fraction ${\cal B}(B\to\mu\bar\nu_\mu) = (6.46 \pm 2.22)\times 10^{-7}$, where the quoted uncertainty is statistical only. The statistical significance of the signal is $3.4\sigma$, determined from the likelihood ratio of the fits with a free signal component and with the signal component fixed to zero. The fit result of the reference process $\bar B\to\pi\ell^-\bar\nu_\ell$ agrees with the MC prediction to better than 10%. The projections of the fitted distribution in the signal-enhanced regions are shown in Fig. 2. The fit qualities of the displayed projections are $\chi^2/{\rm ndf} = 27.6/16$ (top panel) and $\chi^2/{\rm ndf} = 29.1/25$ (bottom panel), taking into account only data uncertainties.
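The conversion from the fitted ratio to the branching fraction amounts to scaling the MC input branching fraction by the double ratio $R/R_{\rm MC}$. The sketch below repeats this arithmetic with the numbers quoted above; the variable names are illustrative.

```python
R_FIT = 1.66e-2       # fitted ratio R
R_FIT_ERR = 0.57e-2   # statistical uncertainty on R
N_SIG_MC = 114.6      # MC-predicted signal yield
N_PI_MC = 11746.0     # MC-predicted yield of the normalization mode
BF_MU_MC = 3.80e-7    # B(B -> mu nu) assumed in the MC
BF_PI = 1.45e-4       # B(B -> pi l nu), PDG average used in the text

r_mc = N_SIG_MC / N_PI_MC

# Scale the MC branching fraction by the double ratio R / R_MC.
bf_mu = BF_MU_MC * R_FIT / r_mc
bf_mu_err = BF_MU_MC * R_FIT_ERR / r_mc
bf_ratio = bf_mu / BF_PI

print(f"R_MC = {r_mc:.3e}")
print(f"B(B -> mu nu) = ({bf_mu * 1e7:.2f} +/- {bf_mu_err * 1e7:.2f}) x 10^-7")
print(f"B(mu nu) / B(pi l nu) = {bf_ratio:.2e}")
```

The outputs agree with the quoted values up to rounding of the inputs.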
The double ratio $R/R_{\rm MC}$ benefits from substantial cancellation of the systematic uncertainties from muon identification, lepton and neutral-kaon vetoes and the companion $B$-meson decay mis-modelling, as well as partially cancelling trigger uncertainties and possible differences in the distribution of the $o_{nn}$ variable.
In the signal region, the main background contribution comes from charmless semileptonic decays; in particular, the main components $\bar B\to\pi\ell^-\bar\nu_\ell$ and $\bar B\to\rho\ell^-\bar\nu_\ell$, which peak at high $o_{nn}$ values, are carefully studied. With soft and undetected hadronic recoil, these decays are kinematically indistinguishable from the signal in an untagged analysis. For the $\bar B\to\pi\ell^-\bar\nu_\ell$ component, we vary the form-factor shape within uncertainties obtained with the new lattice QCD result [5] and the procedure described in Ref. [4], which was used to estimate the value of $|V_{ub}|$. Since the form-factor is tightly constrained, the contribution to the systematic uncertainty from the $\bar B\to\pi\ell^-\bar\nu_\ell$ background is estimated to be only 0.9%. For the $\bar B\to\rho\ell^-\bar\nu_\ell$ component, the form-factors at high $q^2$ or high muon momentum have much larger uncertainties and several available calculations are employed [24,25,32], resulting in a systematic uncertainty of 12%. The rare hadronic decay $B^-\to K^0_L\pi^-$, where the $K^0_L$ is not detected and the high-momentum $\pi$ is misidentified as a muon, is also indistinguishable from the signal decay and has a similar $o_{nn}$ shape. This contribution is fixed in the fit and the signal yield difference, with and without the $B^-\to K^0_L\pi^-$ component, of 5.5% is taken as a systematic uncertainty since GEANT3 poorly models $K^0_L$ interactions with materials.
The not-yet-discovered process $B^-\to\mu^-\bar\nu_\mu\gamma$ with a soft photon can mimic the signal decay. To estimate the uncertainty from this hypothetical background, we perform the fit with this contribution fixed to half of the best upper limit ${\cal B}(B^-\to\mu^-\bar\nu_\mu\gamma) < 3.4\times 10^{-6}$ at 90% C.L. by Belle [33] and take the difference of 6% as the systematic uncertainty.
Previous studies [13,14] did not characterize these backgrounds in a detailed manner, which could have led to a substantial underestimation of the systematic uncertainties.
In the region $p^*_\mu > 2.85$ GeV/$c$, where only continuum events are present, we observe an almost linearly growing data/fit difference with maximum deviation of about 20% at $o_{nn}\sim 1$. To estimate the uncertainty due to the level of data/MC agreement in the $o_{nn}$ variable, we rescale linearly with $o_{nn}$ the continuum histograms used in the fit and refit, obtaining a 15% lower value of $R$. For peaking components such as the signal $B^-\to\mu^-\bar\nu_\mu$ and the normalization decay $\bar B\to\pi\ell^-\bar\nu_\ell$, we use the fit/data ratio in the region $p^*_\mu < 2.5$ GeV/$c$ and apply it to the peaking components in the signal-region histograms ($B^-\to\mu^-\bar\nu_\mu$, $\bar B\to\pi\ell^-\bar\nu_\ell$ and $\bar B\to\rho\ell^-\bar\nu_\ell$). Refitting produces an 11% higher value of $R$. Simultaneously applying both effects leads to only a 2% shift in the refitted central value; thus, we include the individual deviations as systematic uncertainties in the continuum and signal peak descriptions.
In some cases, the signal muon and detected fraction of the particles from the companion $B$-meson decay do not provide enough particles for an event to be identified as a $B$-meson decay and hence to be recorded. The efficiency for recording these events is 84% as calculated using MC, and we take the event-recording uncertainty to be half of the inefficiency (8%) since it will be partially cancelled by taking the ratio with the normalization process $\bar B\to\pi\ell^-\bar\nu_\ell$.
The branching fraction of the normalization process $\bar B\to\pi\ell^-\bar\nu_\ell$ is known with 3.4% precision [3] and this is included as a systematic uncertainty.