Search for lepton-flavor violating decays $D^{0}\rightarrow X^{0}e^{\pm}\mu^{\mp}$

We present a search for seven lepton-flavor-violating neutral charm decays of the type $D^{0}\rightarrow X^{0} e^{\pm} \mu^{\mp}$, where $X^{0}$ represents a $\pi^{0}$, $K^{0}_{\rm S}$, $\bar{K^{*0}}$, $\rho^{0}$, $\phi$, $\omega$, or $\eta$ meson. The analysis is based on $468$ fb$^{-1}$ of $e^+e^-$ annihilation data collected at or close to the $\Upsilon(4S)$ resonance with the BaBar detector at the SLAC National Accelerator Laboratory. No significant signals are observed, and we establish 90\% confidence level upper limits on the branching fractions in the range $(5.0 - 22.5)\times 10^{-7}$. The limits are between one and two orders of magnitude more stringent than previous measurements.

In this report we present a search for seven D 0 → X 0 e ± µ ∓ LFV decays, where X 0 represents a π 0 , K 0 S , K * 0 , ρ 0 , φ, ω, or η meson, with data recorded with the BABAR detector at the PEP-II asymmetric-energy e + e − collider operated at the SLAC National Accelerator Laboratory.The intermediate mesons X 0 are reconstructed through the decays and η → γγ.The branching fractions for the signal modes are measured relative to the normalization decays used as it has the smallest branching fraction uncertainty [23] and the largest number of reconstructed candidates of the three normalization modes.Although decays of the type D 0 → X 0 h − h + have momentum distributions that more closely follow those of the signal decays under study, they suffer from smaller branching fractions, greater uncertainties on their branching fractions, and reduced reconstruction efficiencies relative to the three chosen normalization modes.
The D 0 mesons are identified using the decay D * + → D 0 π + produced in e + e − → cc events.Although D 0 mesons are also produced via other processes, the use of this decay chain increases the purity of the D 0 samples at the cost of a smaller number of reconstructed D 0 mesons.

II. THE BABAR DETECTOR AND DATA SET
The BABAR detector is described in detail in Refs.[24,25].Charged particles are reconstructed as tracks with a five-layer silicon vertex detector and a 40-layer drift chamber inside a 1.5 T solenoidal magnet.An electromagnetic calorimeter comprised of 6580 CsI(Tl) crystals is used to identify and measure the energies of electrons, positrons, muons, and photons.A ring-imaging Cherenkov detector is used to identify charged hadrons and to provide additional lepton identification information.Muons are primarily identified with an instrumented magnetic-flux return.
The data sample corresponds to 424 fb −1 of e + e − collisions collected at the center-of-mass (c.m.) energy of the Υ (4S) resonance (10.58 GeV, on peak) and an additional 44 fb −1 of data collected 0.04 GeV below the Υ (4S) resonance (off peak) [26].
Monte Carlo (MC) simulation is used to investigate sources of background contamination and evaluate selection efficiencies.Simulated events are also used to validate the selection procedure and for studies of systematic effects.The signal and normalization channels are simulated with the EvtGen package [27].We generate the signal channel decays uniformly throughout the three-body phase space, while the normalization modes include twobody and three-body intermediate resonances, as well as nonresonant decays.We also generate e + e − → qq (q = u, d, s, c), Bhabha and µ + µ − pairs (collectively referred to as QED events), and BB background, using a combination of the EvtGen, Jetset [28], KK2F [29], AfkQed [30], and TAUOLA [31] generators, where appropriate.The background samples are produced with an integrated luminosity approximately 6 times that of the data.Final-state radiation is generated using PHOTOS [32].The detector response is simulated with GEANT 4 [33,34].All simulated events are reconstructed in the same manner as the data.

III. EVENT SELECTION
In the following, unless otherwise noted, all observables are evaluated in the laboratory frame.In order to optimize the event reconstruction, candidate selection criteria, multivariate analysis training, and fit procedure, a rectangular area in the m(D 0 ) versus ∆m = m(D * + ) − m(D 0 ) plane is defined, where m(D * + ) and m(D 0 ) are the reconstructed masses of the D * + and D 0 candidates, respectively.This region is kept hidden (blinded) in data until the analysis steps are finalized.The hidden region is approximately 3 times the root mean square (RMS) width of the ∆m and m(D 0 ) resolutions.Its ∆m region is 0.1447 < ∆m < 0.1462 GeV/c 2 for all modes.The m(D 0 ) signal peak distribution is asymmetric due to bremsstrahlung emission, with the left-side RMS width typically 1 to 2 MeV/c 2 wider than the right-side.The m(D 0 ) RMS widths vary between 5 and 21 MeV/c 2 , depending on the signal mode.
Particle identification (PID) criteria are applied to all charged daughter tracks of the intermediate meson 0 decays.The charged pions and kaons are identified by measurements of their energy loss in the tracking detectors, and the number of photons and the Cherenkov angle recorded in the ring-imaging Cherenkov detector.These measurements are combined with information from the electromagnetic calorimeter and the muon detector to identify electrons and muons [24,25].Photons are detected and their energies are measured in the electromagnetic calorimeter.For D 0 → φ e ± µ ∓ , the PID requirement on the kaons from the φ meson decay is relaxed compared to the single-kaon modes.This increases the reconstruction efficiency for this signal mode, with little increase in backgrounds or misidentified candidates.The muon PID requirement depends on the signal mode, with tighter requirements imposed for modes with more charged pions in the final state.The PID efficiency depends on the track momentum, and is in the range 0.87−0.92for electrons, 0.60−0.95for muons, 0.86−0.98 for pions, and 0.84−0.92for kaons.The misidentification probability [35], defined as the probability that particles are identified as one flavor (e.g.muon) that are in reality of a different flavor (i.e.not a muon), is typically less than 0.03 for all selection criteria, except for the pion selection criteria, where the muon misidentification rate can be as high as 0.35 at low momentum.
We select events that have at least five charged tracks, except for D 0 → π 0 e ± µ ∓ and D 0 → η(→ γγ)e ± µ ∓ , which must have at least three.Two or more of the tracks must be identified as leptons.The separation along the beam axis between the two leptons at their distance of closest approach to the beam line is required to be less than 0.2 cm.The leptons must have opposite charges, and their momenta must be greater than 0.3 GeV/c.Electrons and positrons from photon conversions are rejected by removing electron-positron pairs with an invariant mass less than 0.03 GeV/c 2 and a production vertex more than 2 cm from the beam axis.
The minimum photon energy in a signal decay is required to be greater than 0.025 GeV.For the decays D 0 → π 0 e ± µ ∓ and D 0 → η(→ γγ)e ± µ ∓ , the momentum of the π 0 or η must be greater than 0.4 GeV/c and the energy of each photon from the π 0 must be greater than 0.045 GeV.The reconstructed π 0 invariant mass for all signal decays is required to be between 120 and 160 MeV/c 2 .
Candidate D 0 mesons for the signal modes are formed from the electron or positron, muon or antimuon, and intermediate resonance candidates.For the normalization modes, the D 0 candidate is formed from four charged tracks.Particle identification is applied to all charged tracks and the D 0 candidates are reconstructed with the appropriate charged-track mass hypotheses for both the signal and normalization decays.The tracks are required to form a good-quality vertex with a χ 2 probability for the vertex fit greater than 0.005.For the decay D 0 → K 0 S e ± µ ∓ , the K 0 S must have a transverse flight distance from the D 0 decay vertex greater than 0.2 cm.A bremsstrahlung energy recovery algorithm is applied to electrons and positrons, in which the energy of photon showers that are within a small angle (35 mrad in polar angle and 50 mrad in azimuth [24]) with respect to the tangent of the initial electron or positron direction is added to the energy of the electron or positron candidate.For the normalization modes, the reconstructed D 0 meson mass is required to be in the range 1.81 < m(D 0 ) < 1.91 GeV/c 2 , while for the signal modes, m(D 0 ) must be in the hidden m(D 0 ) range defined above.
The candidate D * + is formed by combining the D 0 candidate with a charged pion having a momentum greater than 0.1 GeV/c.For the normalization mode D 0 → K − π + π + π − , this pion is required to have a charge opposite that of the kaon.The pion and D 0 candidate are subject to a vertex fit, with the D 0 mass constrained to its known value [23] and the requirement that the D 0 meson and the pion originate from the beam spot [36].The χ 2 probability of the fit is required to be greater than 0.005.After the application of the D * + vertex fit, the D 0 candidate momentum in the c.m. system p * (D 0 ) must be greater than 2.4 GeV/c.For the normalization modes, the mass difference ∆m is required to be 0.143 < ∆m < 0.148 GeV/c 2 , while for the signal modes the range is 0.1395 < ∆m < 0.1610 GeV/c 2 .The extended ∆m range for the signal modes provides greater stability when fitting the background distributions.
The requirement on the number of charged tracks strongly suppresses backgrounds from QED processes.The p * (D 0 ) criterion removes most sources of combinatorial background, as well as charm hadrons produced in B decays, which are kinematically limited to p * (D 0 ) < ∼ 2.2 GeV/c [37].
Simulated samples indicate that the remaining background arises from e + e − → cc events in which charged tracks and neutral particles can either be lost or selected from elsewhere in the event to form a D 0 candidate.To reject this background, a multivariate selection based on a Boosted Decision Tree (BDT) discriminant is applied to the signal modes [38].A common set of eight input observables is used for all modes: the momenta of the electron or positron, muon or antimuon, and reconstructed intermediate meson; the momentum of the lowest-momentum charged track or photon from the X 0 candidate; the maximum angle between the direction of D 0 daughters and the D 0 direction; the total energy of all charged tracks and photons in the event, normalized to the beam energy; the ratio e + e − − m 2 (D * + ), where p * (D * + ) is the c.m. momentum of the D * + candidate and E * e + e − is the c.m. beam energy; and the reconstructed mass of the intermediate meson.Three additional input observables are used for the D 0 decays with ω or η decaying to π + π − π 0 : the momentum and reconstructed mass of the π 0 candidate, and the energy of the lowest-energy photon from the π 0 .The discriminant is trained and tested independently for each signal mode, using simulated samples for the signal modes, and ensembles of data outside the hidden region and e + e − → cc simulated samples for the background.Depending on the signal mode, the requirement on the discriminant output accepts between 70% to 90% of the simulated signal sample while rejecting between 50% to 90% of the background.
The cross feed to one signal mode from any other signal modes is estimated from simulated samples to be less than 4% in all cases, and typically less than 1%, assum-ing equal branching fractions for all signal modes.The cross feed to a specific normalization mode from the other two normalization modes is predicted from simulation to be less than 0.7%, where the branching fractions are taken from Ref. [23].In the data, no events with reconstructed normalization decays contain reconstructed signal decays.
From the data, we find that the fraction of normalization mode events with more than one candidate is 2.4%, 3.6%, and 4.4% for and D 0 → π − π + π + π − , respectively.For the signal mode with η → π + π − π 0 , 40% of events have multiple candidates.For η → γγ and ω decays, the number of events with multiple candidates is ∼ 10%, and for the remaining modes it is 1% to 5%.If two or more candidates are found in an event, the one with the highest D * + vertex χ 2 probability is selected.After applying the best-candidate selection, the correct D * + candidate in the simulated samples is selected with a probability of 95% or more for the normalization modes.For the signal modes, 70% of D * + candidates are correctly selected for η → π + π − π 0 , and between 86% and 94% for the remaining modes.After the application of all selection criteria and corrections for small differences between data and MC simulation in tracking and PID performance, the reconstruction efficiency sig for the simulated signal decays is between 1.6% and 3.6%, depending on the mode.For the normalization decays, the reconstruction efficiency norm is between 19.2% and 24.7%.The difference between sig and norm is mainly due to the minimum momentum criterion on the leptons required by the PID algorithms [25].

IV. SIGNAL YIELD EXTRACTION
The D 0 → X 0 e ± µ ∓ signal mode branching fraction B sig is determined relative to that of the normalization decay using where B norm is the branching fraction of the normalization mode [23], and N sig and N norm are the fitted yields of the signal and normalization mode decays, respectively.B(X 0 ) is the branching fraction of the intermediate meson decay channel.The symbols L sig and L norm represent the integrated luminosities of the data samples used for the signal (468.2±2.0 fb −1 ) and the normalization decays (39.3 ± 0.2 fb −1 ), respectively [26].For the signal modes, we use both the on-peak and off-peak data samples.For the normalization modes, a subset of the off-peak data is sufficient for achieving statistical uncertainties that are much smaller than the systematic uncertainties.
We perform an extended unbinned maximum likelihood fit to extract the signal and background yields for both the normalization and signal modes [39].The likelihood function is We define the likelihood for each event candidate i to be the sum of n j P j ( x i ; α j ) over two hypotheses j (signal or normalization and background).The symbol P j ( x i ; α j ) is the product of the probability density functions (PDFs) for hypothesis j evaluated for the measured variables x i of the i-th event.The total number of events in the sample is N , and n j is the yield for hypothesis j.The quantities α j represent parameters of P j .The distributions of each discriminating variable x i in the likelihood function is modeled with one or more PDFs, where the parameters α j are determined from fits to signal simulation or data samples.Each normalization mode yield N norm is extracted by performing a two-dimensional unbinned maximum likelihood fit to the ∆m versus m(D 0 ) distributions in the range 0.143 < ∆m < 0.148 GeV/c 2 and 1.81 < m(D 0 ) < 1.91 GeV/c 2 .Considering normalization and background events separately, the measured ∆m and m(D 0 ) values are essentially uncorrelated and are therefore treated as independent observables in the fits.The PDFs in the fits depend on the normalization mode and use sums of multiple Cruijff [15] and Crystal Ball [40] functions in both ∆m and m(D 0 ).The functions for each observable use a common mean.The background is modeled with an ARGUS threshold function [41] for ∆m and a Chebyshev polynomial for m(D 0 ).The ARGUS end point parameter is fixed at 0.1395 GeV/c 2 , the ∆m kinematic threshold for D * + → D 0 π + decays.All other PDF parameters, together with the normalization mode and background yields, are allowed to vary in the fit.
The fitted yields and reconstruction efficiencies for the normalization modes are given in Table I. Figure 1 shows projections of the unbinned maximum-likelihood fits onto the final candidate distributions as a function of ∆m for the normalization modes in the range 0.143 < ∆m < 0.148 GeV/c 2 .TABLE I. Summary of fitted candidate yields, with statistical uncertainties, and reconstruction efficiencies for the three normalization modes.

Decay mode
Nnorm (candidates) norm (%) After the application of the selection criteria, there are on the order of 100 events or fewer available for fitting in each signal mode.Each signal mode yield N sig is therefore extracted by performing a one-dimensional unbinned maximum likelihood fit to ∆m in the range 0.1395 < ∆m < 0.1610 GeV/c 2 .A Cruijff function is implemented for the signal mode PDF, except for D 0 → φ e ± µ ∓ , for which two two-piece Gaussians functions are used, and D 0 → ρ 0 e ± µ ∓ , for which two Cruijff functions are used.The background is modeled with an ARGUS function with the same end point used for the normalization modes.The signal PDF parameters and the end point parameter are fixed in the fit.All other background parameters and the signal and background yields are allowed to vary. Figure 2 shows the results of the fits to the ∆m distributions for the signal modes.We test the performance of the maximum likelihood fit for the normalization modes by generating ensembles of MC samples from the normalization and background PDF distributions.The mean numbers of normalization and background candidates used in the ensembles are taken from the fits to the data.The numbers of generated background and normalization mode candidates are sampled from a Poisson distribution.All background and normalization mode PDF parameters are allowed to vary, except for the ARGUS function end point.No significant biases are observed in the fitted yields of the normalization modes.The same procedure is repeated for the maximum likelihood fits to the signal modes, with ensembles of MC samples generated from the background PDF distributions only, assuming a signal yield of zero.The signal PDF parameters are fixed to the values used for the fits to the data, and the signal yield is allowed to vary.The biases in the fitted signal yields are less than ±0.3 candidates for all modes, and these are subtracted from the fitted yields before calculating the signal branching fractions.

V. SYSTEMATIC UNCERTAINTIES
The systematic uncertainties in the branching fraction determinations of the signal modes arise from so-called additive systematic uncertainties that affect the significance of the signal mode yields in the fits to the data samples and from multiplicative systematic uncertainties on the luminosity and signal reconstruction efficiencies.
The main sources of the additive systematic uncertainties in the signal yields are associated with the model parametrizations used in the fits to the signal modes, the fit biases, the allowed invariant-mass ranges for the D 0 and X 0 candidates, the amount of cross feed, and the limited MC and data sample sizes available for the optimization of the BDT discriminants.
The uncertainties associated with the fit model parametrizations of the signal modes are estimated by repeating the fits with alternative PDFs.This involves replacing the Cruijff functions with Crystal Ball functions, using a two-piece Gaussian function, and changing the number of functions used in the PDFs.For the background, the ARGUS function is replaced by a firstor second-order polynomial.The largest deviation occurs when using the Crystal Ball functions for the signal and the first-order polynomial for the background.The systematic uncertainty is taken as half this maximum deviation.The largest contribution comes from the normalization mode D 0 → π − π + π + π − due to the presence of increased background and greater uncertainty in the background shape.To account for potential inaccuracies in the simulation of the D 0 and X 0 invariant mass distributions, we change the mass selection ranges by ±0.5σ, where σ is the RMS width of the D 0 or X 0 meson.
The systematic uncertainties in the correction on the fit biases for the signal yields are taken from the ensembles of fits to the MC samples.Given the central value of the signal yield obtained from the fit in each mode, the cross feed yields from all other modes are calculated and are taken as a systematic uncertainty.To evaluate the systematic uncertainty in the application of the BDT discriminant, we vary the value of the selection criterion for the BDT discriminant output, change the size of the hidden region in data, and also retrain the BDT discriminant using a training sample with a different ensemble of MC samples.Summing the uncertainties in quadrature, the total additive systematic uncertainties in the signal yields are between 0.4 and 0.9 events.
Multiplicative systematic uncertainties are due to assumptions made about the distributions of the finalstate particles in the signal simulation modeling, the model parametrizations used in the fits to the normalization modes, the normalization mode branching fractions, tracking and PID efficiencies, limited simulation sample sizes, and luminosity.
Since the decay mechanism of the signal modes is unknown, we vary the angular distributions of the simulated final-state particles from the D 0 signal decay in three angular variables, defined following the prescription of Ref. [42].We weight the events, which are simulated uniformly in phase space, using combinations of sin, cos, sin 2 , and cos 2 functions of the angular variables.The reconstruction efficiencies calculated from simulation samples as functions of the three angles are constant, within the statistics available.The deviations of the reweighted efficiencies from the default average reconstruction efficiencies are therefore small.Half the maximum change in the average reconstruction efficiency is assigned as a systematic uncertainty.
Uncertainties associated with the fit model parametrizations of the normalization modes are estimated by repeating the fits with alternative PDFs.This involves swapping the Cruijff and Crystal Ball functions used in both ∆m and m(D 0 ).For the background, the order of the polynomials is changed and the ARGUS function is replaced by a second-order polynomial.Half the maximum change in the fitted yield is assigned as a systematic uncertainty.The normalization modes branching fraction uncertainties are taken from Ref. [23].
For both signal and normalization modes, we include uncertainties to account for discrepancies between reconstruction efficiencies calculated from simulation and data samples of 1.0% per K 0 S , 0.8% per lepton, and 0.7% per hadron track [43].We include a momentumdependent π 0 reconstruction efficiency uncertainty of 2.1% for D 0 → π 0 e ± µ ∓ and 2.3% for D 0 → ωe ± µ ∓ and D 0 → η(→ π + π − π 0 )e ± µ ∓ .For the PID efficiencies, we assign an uncertainty of 0.7% per track for electrons, 1.0% for muons, 0.2% for charged pions, and 1.1% for kaons [25].A systematic uncertainty of 0.4% is associated with our knowledge of the luminosities L norm and L sig [26].We assign systematic uncertainties in the range 0.8% to 1.8% to account for the limited size of the simulation samples available for calculating reconstruction efficiencies for the signal and normalization modes.
The simulation samples for the normalization modes contain a resonant structure of intermediate resonances that decay to two-or three-body final states, as well as four-body nonresonant decays.To investigate how changes in the resonant structure affect the reconstruction efficiencies, the simulation samples were generated using a four-body phase-space distribution only and the reconstruction efficiencies recalculated.The resulting changes in reconstruction efficiencies are less than the statistical uncertainties on norm due to the limited size of the simulation samples, and no systematic uncertainties are assigned.The total multiplicative systematic uncertainties are between 4.7% and 6.8% for the normalization modes and between 4.2% and 7.8% for the signal modes.
Table II summarizes the contributions of the systematic uncertainties of the normalization modes to the systematic uncertainties in the signal mode branching fractions, as defined in Eq. (1).Table III summarizes the systematic uncertainties in the signal mode yields, excluding those due to the normalization modes.
TABLE II.Summary of the contributions to the systematic uncertainties on the signal mode branching fractions, as defined in Eq. ( 1), that arise from uncertainties in the measurement of the normalization modes.II.

VI. RESULTS
Table IV gives the fitted signal yields, reconstruction efficiencies, branching fractions with statistical and systematic uncertainties, 90% C.L. upper limits on the branching fractions, and previous upper limits [19,20,23] for the signal modes.The yields for all the signal modes are compatible with zero.We assume that there are no cancellations due to correlations in the systematic uncertainties in the numerator and denominator of Eq. (1).We use the frequentist approach of Feldman and Cousins [44] to determine 90% C.L. bands.When computing the limits, the systematic uncertainties are combined in quadrature with the statistical uncertainties in the fitted signal yields.
In summary, we report 90% C.L. upper limits on the branching fractions for seven lepton-flavor-violating D 0 → X 0 e ± µ ∓ decays.The analysis is based on a sample of e + e − annihilation data collected with the BABAR detector, corresponding to an integrated luminosity of 468.2 ± 2.0 fb −1 .The limits are in the range (5.0 − 22.5) × 10 −7 and are between 1 and 2 orders of magnitude more stringent than previous D 0 → X 0 e ± µ ∓ decay results.For the four D 0 → X 0 e ± µ ∓ decays with the same final state as the D 0 → h − h + e ± µ ∓ decays reported in Ref. [22], the limits are 1.5 to 3 times more stringent.
TABLE IV.Summary of fitted signal yields Nsig with statistical and systematic uncertainties, reconstruction efficiencies sig, branching fractions with statistical and systematic uncertainties, 90% C.L. upper limits (U.L.) on the branching fractions, and previous limits [19,20,23].The additive and multiplicative uncertainties are combined to obtain the overall systematic uncertainties.The branching fraction systematic uncertainties include the uncertainties in the normalization mode branching fractions.

FIG. 1 .
FIG.1.Projections of the unbinned maximum-likelihood fits to the final candidate distributions as a function of ∆m for the normalization modes in the range 0.143 < ∆m < 0.148 GeV/c 2 .The solid blue line is the total fit, the dashed red line is the signal and the dotted green line is the background.

FIG. 2 .
FIG.2.Unbinned maximum-likelihood fits to the final candidate distributions as a function of ∆m for the signal modes in the range 0.1395 < ∆m < 0.1610 GeV/c 2 .The solid blue line is the total fit, the dashed red line is the signal and the dotted green line is the background.