Search for the lepton-flavor-violating decay $B^{0}\to K^{\ast 0} \mu^{\pm} e^{\mp}$

We have searched for the lepton-flavor-violating decay $B^{0}\to K^{\ast 0} \mu^{\pm} e^{\mp}$ using a data sample of 711 $fb^{-1}$ that contains $772 \times 10^{6}$ $B\bar{B}$ pairs. The data were collected near the $\Upsilon (4S)$ resonance with the Belle detector at the KEKB asymmetric-energy $e^{+}e^{-}$ collider. No signals were observed, and we set 90% confidence level upper limits on the branching fractions of ${\cal B}(B^{0}\to K^{\ast 0} \mu^{+} e^{-})<1.2\times 10^{-7}$, ${\cal B}(B^{0}\to K^{\ast 0} \mu^{-} e^{+})<1.6\times 10^{-7}$, and, for both decays combined, ${\cal B}(B^{0}\to K^{\ast 0} \mu^{\pm} e^{\mp})<1.8\times 10^{-7}$. These are the most stringent limits on these decays to date.

DOI: 10.1103/PhysRevD.98.071101

In recent years, measurements from the LHCb experiment [1,2] have exhibited possible deviations from lepton universality in flavor-changing neutral-current $b \to s\ell^{+}\ell^{-}$ transitions. Such universality is an important symmetry of the Standard Model. These deviations have generated much interest within the theoretical community, and several models of new physics [3–10] have been proposed to explain these discrepancies. In many such models, violation of lepton universality is accompanied by lepton flavor violation (LFV) [11]. The idea of LFV in $B$ decays was discussed in Refs. [12–19]. Experimentally, one way to search for LFV is via the decays $B^{0}\to K^{\ast 0} \mu^{\pm} e^{\mp}$ [20], which have large available phase space and also avoid the helicity suppression that a two-body decay such as $B^{0}\to \mu^{\pm} e^{\mp}$ might be subject to. The most stringent previous upper limits for $B^{0}\to K^{\ast 0} \mu^{\pm} e^{\mp}$ were set by the BABAR experiment based on a data sample of $229 \times 10^{6}$ $B\bar{B}$ events [21]. Here, we report a search for $B^{0}\to K^{\ast 0} \mu^{\pm} e^{\mp}$ using a data sample of $(772 \pm 11) \times 10^{6}$ $B\bar{B}$ events (711 fb$^{-1}$), which is more than three times larger than that of BABAR. The data sample was collected by the Belle experiment running near the $\Upsilon(4S)$ resonance at the KEKB $e^{+}e^{-}$ collider [22].
The Belle detector is a large-solid-angle magnetic spectrometer consisting of a silicon vertex detector (SVD), a 50-layer central drift chamber (CDC), an array of aerogel threshold Cherenkov counters (ACC), a barrel-like arrangement of time-of-flight scintillation counters (TOF), and an electromagnetic calorimeter (ECL) comprising CsI(Tl) crystals. All are located inside a superconducting solenoid coil, which provides a 1.5 T magnetic field. An iron flux return yoke located outside the coil is instrumented with resistive-plate chambers (KLM) to detect $K^{0}_{L}$ mesons and muons. Further details of the detector are given in Ref. [23]. Two inner detector configurations were used: a 2.0 cm radius beam pipe and a three-layer SVD were used to record the first sample of 140 fb$^{-1}$, while a 1.5 cm radius beam pipe, a four-layer SVD, and a small-cell inner drift chamber were used to record the remaining 571 fb$^{-1}$ [24].
To study properties of signal events and optimize selection criteria, we generate samples of Monte Carlo (MC) simulated events. These samples are generated with the EVTGEN package [25] using three-body phase space and assuming that the $K^{\ast 0}$ is unpolarized. The detector response is simulated with the GEANT3 package [26].
We begin reconstructing $B^{0}\to K^{\ast 0} \mu^{\pm} e^{\mp}$ [27] decays by selecting charged particles that originate from a region near the $e^{+}e^{-}$ interaction point. This region is defined using impact parameters: we require $dr < 1$ cm in the $x$-$y$ plane (transverse to the positron beam), and $|dz| < 4$ cm along the $z$ axis (antiparallel to the positron beam). To reduce backgrounds from low-momentum particles, we require that tracks have a transverse momentum ($p_{T}$) greater than 0.1 GeV/$c$.
From selected tracks, we identify $K^{\pm}$, $\pi^{\pm}$, $\mu^{\pm}$, and $e^{\pm}$ candidates using information from the CDC, ACC, and TOF detectors. The $K^{\pm}$ and $\pi^{\pm}$ candidates are identified by constructing the likelihood ratio $R_{K} = L_{K}/(L_{K} + L_{\pi})$, where $L_{K}$ and $L_{\pi}$ are relative likelihoods for kaons and pions, respectively, calculated based on the number of photoelectrons in the ACC, the specific ionization in the CDC, and the time of flight as determined from TOF hit times. We select kaons (pions) by requiring $R_{K} > 0.6$ ($< 0.4$). This criterion is 92% (89%) efficient for kaons (pions), and has a misidentification rate of 7% (8%) for pions (kaons).
Muon candidates are identified based on information from the KLM detector. We require that candidates have momentum greater than 0.8 GeV/$c$, and that they have a penetration depth and degree of transverse scattering consistent with those of a muon, given the track momentum measured in the CDC [28]. A criterion on normalized muon likelihood, $R_{\mu} > 0.9$, is used to select muon candidates. For this requirement, the average muon detection efficiency is 89%, and the average pion misidentification rate is 1.4% [29].
Electron candidates are required to have momentum greater than 0.4 GeV/$c$ and are identified using the following information: the ratio of ECL energy to the CDC track momentum; the ECL shower shape; position matching between the CDC track and the ECL cluster; the energy loss in the CDC; and the response of the ACC [30]. A requirement on normalized electron likelihood, $R_{e} > 0.9$, is imposed. This requirement has an efficiency of 92% and a pion misidentification rate of about 0.25% [29]. To recover electron energy lost due to possible bremsstrahlung, we search for photons inside a cone of radius 50 mrad centered around the electron momentum direction. If a photon is found within this cone, its four-momentum is added to that of the electron.
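The bremsstrahlung recovery described above can be illustrated with a minimal sketch. It assumes four-momenta stored as `(E, px, py, pz)` tuples in GeV and adds every photon found inside the 50 mrad cone; the function names are hypothetical and are not taken from the Belle software.

```python
import math

def angle_between(p1, p2):
    """Opening angle in radians between two 3-momenta."""
    dot = sum(a * b for a, b in zip(p1, p2))
    n1 = math.sqrt(sum(a * a for a in p1))
    n2 = math.sqrt(sum(a * a for a in p2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / (n1 * n2))))

def recover_bremsstrahlung(electron, photons, cone=0.050):
    """Add the four-momenta of photons inside the cone to the electron.

    Assumption: four-momenta are (E, px, py, pz) tuples in GeV, and the
    cone is defined around the original (uncorrected) electron direction.
    """
    e_dir = electron[1:]
    corrected = list(electron)
    for gamma in photons:
        if angle_between(e_dir, gamma[1:]) < cone:
            corrected = [c + g for c, g in zip(corrected, gamma)]
    return tuple(corrected)
```

For example, a photon emitted 10 mrad from the electron direction is absorbed into the electron four-momentum, while a photon at 90 degrees is left alone.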
Kaon and pion candidates are combined to form $K^{\ast 0}$ candidates by requiring that their $K$-$\pi$ invariant mass be within a 100 MeV/$c^{2}$ window centered around the $K^{\ast 0}$ mass [31]. $B$ candidates are subsequently reconstructed by combining $K^{\ast 0}$, $\mu^{\pm}$, and $e^{\mp}$ candidates. To discriminate signal decays from background, two kinematic variables are defined: the beam-energy-constrained mass $M_{\rm bc} = \sqrt{(E_{\rm beam}/c^{2})^{2} - (p_{B}/c)^{2}}$, and the energy difference $\Delta E = E_{B} - E_{\rm beam}$, where $E_{\rm beam}$ is the beam energy and $E_{B}$ and $p_{B}$ are the energy and momentum, respectively, of the $B$ candidate. All of these quantities are evaluated in the $e^{+}e^{-}$ center-of-mass (CM) frame. For signal events, the $\Delta E$ distribution peaks near zero, and the $M_{\rm bc}$ distribution peaks near the $B$ mass. We retain events satisfying the loose requirements $\Delta E \in [-0.05, 0.04]$ GeV and $M_{\rm bc} > 5.2$ GeV/$c^{2}$.
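As a worked illustration of the two kinematic variables, the sketch below evaluates $M_{\rm bc}$ and $\Delta E$ in natural units ($c = 1$, energies and momenta in GeV), assuming the standard definition $\Delta E = E_{B} - E_{\rm beam}$, and applies the loose requirements quoted above. The function names and the input numbers in the usage note are illustrative only.

```python
import math

def mbc_and_delta_e(e_beam, e_b, p_b):
    """Return (M_bc, Delta E) in GeV, with c = 1, in the CM frame.

    e_beam: beam energy; e_b: energy of the B candidate;
    p_b: (px, py, pz) momentum of the B candidate.
    """
    p2 = sum(c * c for c in p_b)
    # Guard against a slightly negative argument from mismeasured momenta.
    m_bc = math.sqrt(max(e_beam**2 - p2, 0.0))
    delta_e = e_b - e_beam
    return m_bc, delta_e

def passes_loose_selection(m_bc, delta_e):
    """Loose requirements: Delta E in [-0.05, 0.04] GeV and M_bc > 5.2 GeV."""
    return -0.05 <= delta_e <= 0.04 and m_bc > 5.2
```

With a hypothetical beam energy of 5.29 GeV, a candidate with momentum (0.2, 0.1, 0.25) GeV and energy 5.30 GeV has $M_{\rm bc}$ close to the $B$ mass and $\Delta E \approx 0.01$ GeV, so it passes the loose selection.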
After the above selection criteria are imposed, about 3% of events have more than one signal $B$ candidate. To select a single candidate, we choose the one with the smallest $\chi^{2}$ from a vertex fit of the four charged tracks. From MC simulation, we find that this criterion identifies the correct signal decay 63% of the time.
At this stage of the analysis, there is significant background from $e^{+}e^{-} \to q\bar{q}$ ($q = u, d, s, c$) continuum events. As lighter quarks are produced with large initial momentum, these events tend to consist of two back-to-back jets of pions and kaons. In contrast, $e^{+}e^{-} \to b\bar{b}$ events result in $B\bar{B}$ pairs produced almost at rest in the CM frame; this results in more spherically distributed daughter particles. We thus distinguish $B\bar{B}$ events from $q\bar{q}$ background based on event topology. We use a multivariate analyzer constructed from a neural network (NN) that uses the following information: (1) A likelihood ratio constructed from modified Fox-Wolfram moments [32,33]. (2) The angle between the thrust axis of the $B$ decay products and that of the rest of the event (the thrust axis is defined as the direction that maximizes the sum of the longitudinal momenta of all particles). (3) The angle $\theta_{B}$ between the $z$ axis and the $B$ flight direction in the CM frame (for $B\bar{B}$ events, $dN/d\cos\theta_{B} \propto 1 - \cos^{2}\theta_{B}$, whereas for continuum events, $dN/d\cos\theta_{B} \approx$ constant). (4) Flavor-tagging information from the other (nonsignal) $B$ decay. Our flavor-tagging algorithm [34] outputs two variables: the flavor $q$ of the tag-side $B$, and the tag quality $r$. The latter ranges from zero for no flavor information to one for unambiguous flavor assignment. We choose a selection criterion on the NN output (${\cal O}^{q\bar{q}}_{NN}$) by optimizing a figure of merit $\varepsilon/\sqrt{N_{B}}$, where $\varepsilon$ is the signal efficiency as determined from MC simulation, and $N_{B}$ is the total number of background events expected in a restrictive signal region $M_{\rm bc} > 5.27$ GeV/$c^{2}$. We obtain a criterion ${\cal O}^{q\bar{q}}_{NN} > 0.5$, which rejects 94% of $q\bar{q}$ background while retaining 73% of signal events.
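The optimization of the figure of merit $\varepsilon/\sqrt{N_{B}}$ can be sketched as a simple threshold scan. This is only an illustration: it assumes lists of NN output values for simulated signal and background events, uses the raw background count in place of a luminosity-scaled expectation, and the function name is hypothetical.

```python
import math

def best_threshold(signal_scores, background_scores, thresholds):
    """Return (threshold, fom) maximizing eps / sqrt(N_B).

    eps is the fraction of signal events above the threshold and N_B the
    number of background events above it (assumed already restricted to
    the signal region and scaled to the data luminosity).
    """
    best = None
    for t in thresholds:
        eps = sum(s > t for s in signal_scores) / len(signal_scores)
        n_b = sum(b > t for b in background_scores)
        if n_b == 0:
            continue  # figure of merit is undefined without background
        fom = eps / math.sqrt(n_b)
        if best is None or fom > best[1]:
            best = (t, fom)
    return best
```

Thresholds that remove all background are skipped rather than treated as infinitely good, since a figure of merit of this form is not meaningful when the expected background vanishes.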
After this criterion is applied, the remaining background arises mainly from $B$ decays that produce two leptons. Such background falls into three categories: (a) both $B$ and $\bar{B}$ decay semileptonically; (b) a $B \to \bar{D}^{(\ast)} X \ell^{+} \nu$ decay is followed by a $\bar{D}^{(\ast)} \to X \ell^{-} \bar{\nu}$ decay; and (c) hadronic $B$ decays where one or more daughter particles are misidentified as leptons. To suppress these backgrounds, we use a second NN that utilizes the following information: (1) The separation in $z$ between the signal $B$ decay vertex and the vertex of the other $B$. At this stage, we also optimize the criterion on the variable $\Delta E$, obtaining $|\Delta E| < 0.025$ GeV.
After applying this NN selection, only a small amount of background survives. We study this remaining background using MC simulation and find that the main source is $B^{0} \to K^{\ast 0}(\to K^{+}\pi^{-})\,J/\psi(\to \ell^{+}\ell^{-})$ decays in which one of the leptons is misidentified and swapped with the $K^{+}$ or $\pi^{-}$. To suppress this background, we apply a set of vetoes. For $B^{0}\to K^{\ast 0} \mu^{+} e^{-}$ signal events, we apply three: the dilepton invariant mass must satisfy $M(\ell^{+}\ell^{-}) \notin [3.04, 3.12]$ GeV/$c^{2}$; the kaon-electron invariant mass must satisfy $M(K^{+}e^{-}) \notin [2.90, 3.12]$ GeV/$c^{2}$; and the pion-muon invariant mass must satisfy $M(\pi^{-}\mu^{+}) \notin [3.06, 3.12]$ GeV/$c^{2}$. For $B^{0}\to K^{\ast 0} \mu^{-} e^{+}$ signal events, we apply two vetoes: the dilepton invariant mass must satisfy $M(\ell^{+}\ell^{-}) \notin [3.02, 3.12]$ GeV/$c^{2}$, and the pion-electron invariant mass must satisfy $M(\pi^{-}e^{+}) \notin [3.02, 3.12]$ GeV/$c^{2}$. While calculating these invariant masses, the mass hypothesis for a hadron is taken to be that of the associated lepton. These vetoes have relative efficiencies of 90.4% and 94.8% for $B^{0}\to K^{\ast 0} \mu^{+} e^{-}$ and $B^{0}\to K^{\ast 0} \mu^{-} e^{+}$, respectively. We use a high-statistics MC sample to study backgrounds originating from charmless hadronic $B$ decays and find them to be negligible. The largest contribution is from $B^{0} \to K^{\ast 0} \pi^{+} \pi^{-}$ in which the pions are misidentified as leptons; this contribution is only 0.01 event. To avoid bias, all selection criteria are determined in a "blind" manner, i.e., they are finalized before looking at events in the signal region.
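The $J/\psi$ vetoes can be expressed compactly as mass windows. The sketch below is illustrative: the dictionary keys and function names are hypothetical, masses are in GeV/$c^{2}$ with $c = 1$, and the mass hypothesis assigned to each track is left to the caller, since the choice for swapped hadrons is only described qualitatively above.

```python
import math

# Veto windows in GeV/c^2 for the two signal modes, as quoted in the text.
VETOES_MUP_EM = {"ll": (3.04, 3.12), "K_e": (2.90, 3.12), "pi_mu": (3.06, 3.12)}
VETOES_MUM_EP = {"ll": (3.02, 3.12), "pi_e": (3.02, 3.12)}

def invariant_mass(p_a, m_a, p_b, m_b):
    """Invariant mass of two tracks with 3-momenta p_a, p_b and assumed masses."""
    e_a = math.sqrt(m_a**2 + sum(c * c for c in p_a))
    e_b = math.sqrt(m_b**2 + sum(c * c for c in p_b))
    p2 = sum((a + b) ** 2 for a, b in zip(p_a, p_b))
    return math.sqrt(max((e_a + e_b) ** 2 - p2, 0.0))

def passes_vetoes(masses, vetoes):
    """True if no pairwise mass falls inside its veto window."""
    return all(not (lo <= masses[name] <= hi) for name, (lo, hi) in vetoes.items())
```

A candidate is kept only if every listed pair mass lies outside its window, so a $\mu^{+}e^{-}$ candidate with a dilepton mass of 3.10 GeV/$c^{2}$ would be rejected.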
To test our understanding of remaining backgrounds, we compare the $M_{\rm bc}$ distributions for data and MC events, as shown in Fig. 1. The plots show good agreement between data and MC for both the number of events observed and the shapes of the distributions.
[Fig. 1 caption, partial: Points with error bars are the data, while the color-filled stacked histograms depict MC components from generic $B$ decays (blue), $q\bar{q}$ continuum (green), and negligible contributions from charmless hadronic $B$ decays (purple).]
We calculate the signal yield by performing an unbinned extended maximum-likelihood fit to the $M_{\rm bc}$ distribution. The probability density function (PDF) used to model signal decays is a Gaussian, and that for all backgrounds combined is an ARGUS function [35]. The signal shape parameters are obtained from MC simulation. We check these parameters by fitting the $M_{\rm bc}$ distribution of a control sample of $B^{0} \to K^{\ast 0}(\to K^{+}\pi^{-})\,J/\psi(\to \ell^{+}\ell^{-})$ decays. For this control sample, we fit both data and MC events and find excellent agreement between them for the shape parameters obtained. All background shape parameters, along with the signal and background yields, are floated in the fit. The fitted $M_{\rm bc}$ distributions are shown in Fig. 2. The fitted yields are $N_{\rm sig} = -1.5^{+4.7}_{-4.1}$ and $0.4^{+4.8}_{-4.5}$ for $B^{0}\to K^{\ast 0} \mu^{+} e^{-}$ and $B^{0}\to K^{\ast 0} \mu^{-} e^{+}$, respectively. By combining both final states, we obtain $N_{\rm sig} = -1.2^{+6.8}_{-6.2}$. As there is no evidence of a signal, we calculate 90% confidence level (C.L.) upper limits on the branching fractions using a frequentist method as follows. We scan through a range of possible signal yields, and for each yield generate 10 000 sets of signal and background events according to their PDFs. Each set of events is statistically equivalent to our data set of 711 fb$^{-1}$. We combine signal and background samples and perform our fitting procedure on these combined sets of events. We then calculate, for each input value of signal yield, the fraction of sets ($f_{\rm sig}$) that have a fitted yield less than that observed in the data. The input signal having $f_{\rm sig} = 0.10$ is taken as an upper limit $N^{\rm UL}_{\rm sig}$ (statistical error only).
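The logic of the frequentist scan can be sketched with toy experiments. The sketch below is a deliberate simplification and not the procedure used in the analysis: it replaces the unbinned $M_{\rm bc}$ fit with a counting experiment in which the "fitted" yield is the observed count minus a known expected background. The background level, scan grid, and function names are hypothetical.

```python
import math
import random

def poisson(rng, mean):
    """Draw a Poisson variate with Knuth's algorithm (adequate for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def upper_limit(obs_yield, background, scan, n_toys=10000, cl=0.90, seed=1):
    """Smallest scanned signal yield whose toys fall below obs_yield
    in at most a fraction (1 - cl) of cases.

    Assumption: the fitted yield of each toy is approximated by
    (count - background) instead of a maximum-likelihood fit.
    """
    rng = random.Random(seed)
    for s in scan:  # scan values must be in increasing order
        below = sum(
            poisson(rng, s + background) - background < obs_yield
            for _ in range(n_toys)
        )
        if below / n_toys <= 1.0 - cl:
            return s
    return None
```

For instance, `upper_limit(-1.5, 20.0, [0.5 * i for i in range(21)])` scans signal yields from 0 to 10 in steps of 0.5 for an assumed background of 20 events and returns the first yield for which the fraction of toys below the observed yield drops to 10% or less.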
We convert $N^{\rm UL}_{\rm sig}$ into an upper limit on the branching fraction (${\cal B}^{\rm UL}$) via the formula
$${\cal B}^{\rm UL} = \frac{N^{\rm UL}_{\rm sig}}{{\cal B}(K^{\ast 0} \to K^{+}\pi^{-}) \times 2 \times N_{B\bar{B}} \times f^{00} \times \varepsilon},$$
where ${\cal B}(K^{\ast 0} \to K^{+}\pi^{-}) = 0.6651$ is the assumed branching fraction (from isospin symmetry) for the intermediate decay $K^{\ast 0} \to K^{+}\pi^{-}$; $N_{B\bar{B}}$ is the number of $B\bar{B}$ pairs, $(7.72 \pm 0.11) \times 10^{8}$; $f^{00}$ is the branching fraction ${\cal B}(\Upsilon(4S) \to B^{0}\bar{B}^{0}) = 0.486 \pm 0.006$ [31]; and $\varepsilon$ is the signal reconstruction efficiency as calculated from MC simulation. We include systematic uncertainty in ${\cal B}^{\rm UL}$ by smearing the $N_{\rm sig}$ distributions of the aforementioned statistically equivalent samples by the total fractional systematic uncertainty (see below) before calculating $f_{\rm sig}$. The resulting upper limits are listed in Table I. For the upper limit on both decays $K^{\ast 0} \mu^{+} e^{-}$ and $K^{\ast 0} \mu^{-} e^{+}$ combined, ${\cal B}(B^{0}\to K^{\ast 0} \mu^{\pm} e^{\mp}) \equiv {\cal B}(B^{0}\to K^{\ast 0} \mu^{+} e^{-}) + {\cal B}(B^{0}\to K^{\ast 0} \mu^{-} e^{+})$, and the branching fractions for the two modes are assumed to be identical when calculating the efficiency.
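The conversion from a yield limit to a branching-fraction limit is simple arithmetic, sketched below. It assumes the conventional normalization ${\cal B}^{\rm UL} = N^{\rm UL}_{\rm sig} / ({\cal B}(K^{\ast 0} \to K^{+}\pi^{-}) \times 2 \times N_{B\bar{B}} \times f^{00} \times \varepsilon)$, in which the factor of 2 counts both neutral $B$ mesons in each $B^{0}\bar{B}^{0}$ pair. The function name is hypothetical, and the yield limit and efficiency in the usage note are illustrative inputs, not values taken from the text.

```python
def branching_fraction_limit(n_ul, efficiency,
                             n_bb=7.72e8, f00=0.486, b_kstar=0.6651):
    """Upper limit on the branching fraction from an upper limit on the yield.

    n_ul: upper limit on the signal yield; efficiency: signal
    reconstruction efficiency (fraction, not percent).
    Defaults are the N_BB, f00, and B(K*0 -> K+ pi-) values quoted in the text.
    """
    n_b0 = 2.0 * n_bb * f00  # number of B0 and anti-B0 mesons produced
    return n_ul / (b_kstar * n_b0 * efficiency)
```

With an illustrative yield limit of 5 events and an efficiency of 9%, `branching_fraction_limit(5.0, 0.09)` gives a value of order $10^{-7}$, the same scale as the limits reported here.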
There are a number of systematic uncertainties, as listed in Table II. The uncertainty on $\varepsilon$ due to limited MC statistics is 0.3%, and the uncertainty on the number of $B^{0}\bar{B}^{0}$ pairs is 1.4%. The systematic uncertainties related to detector performance are determined from dedicated studies of control samples; specifically, these samples are used to measure tracking and particle identification efficiencies of charged particles. The systematic uncertainty due to charged track reconstruction is 0.35% per track. The uncertainty due to particle identification requirements is 2.8%. The uncertainty due to the requirements imposed on ${\cal O}^{q\bar{q}}_{NN}$ and ${\cal O}^{B\bar{B}}_{NN}$ is evaluated by imposing the same requirements on the control sample of $B \to K^{\ast 0} J/\psi$, $J/\psi \to \ell^{+}\ell^{-}$ decays. We compare the efficiencies of the ${\cal O}_{NN}$ criteria on the control sample to those obtained from corresponding MC samples; the ratio is used to correct our signal efficiency, and the statistical error on the ratio is taken as the systematic uncertainty. For ${\cal O}^{q\bar{q}}_{NN}$, this ratio is $1.002 \pm 0.022$; for ${\cal O}^{B\bar{B}}_{NN}$, the ratio is $0.919 \pm 0.026$. The total systematic uncertainty due to both NN criteria applied together is 2.8%. The uncertainty due to the PDF shapes is evaluated by varying the fixed PDF shape parameters by $\pm 1\sigma$ and repeating the fit; the change in the central value of $N_{\rm sig}$ is taken as the systematic uncertainty. Systematic uncertainties due to the aforementioned tiny contribution of the charmless hadronic $B$ decays are included. We initially assume that the $K^{\ast 0}$ is unpolarized. To investigate the effect of this, we calculate the reconstruction efficiency for fully longitudinal and fully transverse polarizations. The efficiency varies by only a few percent, and we include this variation as a systematic uncertainty.
Our reconstruction efficiency corresponds to $B^{0}\to K^{\ast 0} \mu^{\pm} e^{\mp}$ decays proceeding according to three-body phase space. The corresponding $q^{2} \equiv M^{2}(\ell^{+}\ell^{-})$ spectra peak at low values, where the reconstruction efficiency is also low; thus our upper limits are conservative. For larger values of $q^{2}$, the efficiency rises approximately linearly from a minimum of 8% to 14% near $q^{2}_{\rm max}$. Such higher efficiencies would give lower upper limits.
In summary, we have searched for the lepton-flavor-violating decays $B^{0}\to K^{\ast 0} \mu^{\pm} e^{\mp}$ using the full Belle data set recorded at the $\Upsilon(4S)$ resonance. We see no statistically significant signal and set the following 90% C.L. upper limits on the branching fractions: ${\cal B}(B^{0}\to K^{\ast 0} \mu^{+} e^{-})<1.2\times 10^{-7}$, ${\cal B}(B^{0}\to K^{\ast 0} \mu^{-} e^{+})<1.6\times 10^{-7}$, and ${\cal B}(B^{0}\to K^{\ast 0} \mu^{\pm} e^{\mp})<1.8\times 10^{-7}$. These results are the most stringent constraints on these LFV decays to date.