Tests of lepton universality using $B^0\to K^0_S \ell^+ \ell^-$ and $B^+\to K^{*+} \ell^+ \ell^-$ decays

Tests of lepton universality in $B^0\to K^0_S \ell^+ \ell^-$ and $B^+\to K^{*+} \ell^+ \ell^-$ decays where $\ell$ is either an electron or a muon are presented. The differential branching fractions of $B^0\to K^0_S e^+ e^-$ and $B^+\to K^{*+} e^+ e^-$ decays are measured in intervals of the dilepton invariant mass squared. The measurements are performed using proton-proton collision data recorded by the LHCb experiment, corresponding to an integrated luminosity of $9\, \mathrm{fb}^{-1}$. The results are consistent with the Standard Model and previous tests of lepton universality in related decay modes. The first observation of $B^0 \to K^0_S e^+ e^-$ and $B^+ \to K^{*+} e^+ e^-$ decays is reported.

The B 0 → K 0 S + − and B + → K * + + − decays, where refers to either an electron or a muon, are flavour-changing neutral current (FCNC) transitions involving the transformation of a beauty quark into a strange quark. 1 These decays proceed via higher order electroweak processes in the Standard Model (SM) due to the absence of first order FCNC transitions, making them highly suppressed. Therefore, these decays may receive significant contributions from new quantum fields that lie beyond the Standard Model (BSM) and hence are promising laboratories for new physics (NP) searches.
In recent years, studies of similar b → s + − transitions, most prominently B + → K + + − and B 0 → K * 0 + − decays, have revealed tensions with the SM predictions. Deviations have been seen in ratios of branching fractions where B denotes a B + or a B 0 meson, H is either a K or a K * meson, and q 2 is the dilepton invariant mass squared. In the SM the charged leptons have identical interaction strengths, which is referred to as lepton universality. The only exception is their interaction with the Higgs field, which determines their differing masses. Therefore, these ratios are predicted to be very close to unity [1], with corrections from QED up to O(10 −2 ) [2,3] and further small corrections due to the muon-electron mass difference. Furthermore, these ratios benefit from precise cancellation of the hadronic uncertainties that affect predictions of the branching fractions and angular observables, which affect the ratios at O(10 −4 ) [4]. Significant deviation from unity in such ratios would therefore constitute unambiguous evidence of BSM physics. The ratio R K * 0 , measured by the LHCb collaboration using the data collected in the q 2 regions 0.045 < q 2 < 1.1 GeV 2 /c 4 and 1.1 < q 2 < 6.0 GeV 2 /c 4 [5], is in tension with the SM predictions at 2.2-2.4 and 2.4-2.5 standard deviations (σ), respectively, where the ranges are due to the use of different Standard Model predictions. A measurement of R K + performed in the region 1.1 < q 2 < 6.0 GeV 2 /c 4 deviates from the SM by 3.1 standard deviations [6]. The analogous ratio measured using Λ 0 b → pK − + − decays, R pK , is consistent with the SM within one standard deviation [7]. All four measurements show a deficit of b → sµ + µ − decays with respect to b → se + e − decays.
The B 0 → K 0 S + − and B + → K * + + − decays are the isospin partners of B + → K + + − and B 0 → K * 0 + − decays and are expected to be affected by the same 1 Charge conjugate processes are implied throughout. NP contributions. Testing lepton universality by measuring the ratios R K 0 S and R K * + can therefore provide important additional evidence for or against NP. However, while these decays have similar branching fractions to their isospin partners, O(10 −6 ) to O(10 −7 ), they suffer from a reduced experimental efficiency at LHCb due to the presence of a long-lived K 0 S meson in the final state. These ratios have previously been measured by the BaBar [100] and Belle [101,102] collaborations. The differential branching fractions of the muon modes, B 0 → K 0 S µ + µ − and B + → K * + µ + µ − , were found to be lower although still consistent with predictions at low q 2 in a measurement performed by the LHCb collaboration [22]. No single experiment has unambiguously observed the electron decay modes to date.
In this Letter, measurements of the ratios R K 0 S and R K * + and the differential branching fractions of B 0 → K 0 S e + e − and B + → K * + e + e − decays are presented. The measurements are performed using proton-proton (pp) collision data corresponding to an integrated luminosity of 9 fb −1 recorded by the LHCb experiment in 2011, 2012 (Run 1) and 2016-2018 (Run 2) at centre-of-mass energies of 7, 8 and 13 TeV, respectively. The K 0 S and K * + mesons are reconstructed in the π + π − and K 0 S π + final states, respectively. The ratio R K 0 S and the branching fraction B(B 0 → K 0 S e + e − ) are measured in the region 1.1 < q 2 < 6.0 GeV 2 /c 4 , while R K * + and B(B + → K * + e + e − ) are determined in the range 0.045 < q 2 < 6.0 GeV 2 /c 4 . A wider range is used in the case of the B + decay, the differential branching fraction of which is enhanced at low q 2 by the photon pole, since the K * + is a vector meson. Splitting the q 2 range into two bins at 1.1 GeV 2 /c 4 , as was done in the R K * 0 measurement, is not possible due to the limited data sample.
The analysis is designed to minimise systematic uncertainties, particularly those associated with differences in the detector response between electrons and muons. The ratios and differential branching fractions are normalised to the control modes, B 0 → J/ψ (e + e − ) K 0 S , B 0 → J/ψ (µ + µ − ) K 0 S , B + → J/ψ (e + e − ) K * + , and B + → J/ψ (µ + µ − ) K * + , the branching fractions of which are known to respect lepton universality to an excellent approximation [103] and are taken to be equal for the muon and the electron decays of a given B meson. The parameters R −1 K 0 S and R −1 K * + are measured as double ratios where K ( * ) is either a K 0 S or K * + meson, N is the measured yield, and is the total efficiency for signal (sig) and control (con) decays. The inverse ratio R −1 K ( * ) is measured as its uncertainty better represents a Gaussian distribution due to the low yield of the electron decay mode. Many sources of systematic bias cancel in the ratio between the signal and control modes. The differential branching fractions of the signal electron modes are measured as The LHCb detector is a single-arm forward spectrometer covering the pseudorapidity range 2 < η < 5, described in detail in Refs. [104,105]. The simulated events used in this analysis are produced with the software described in Refs. [106][107][108][109][110]. In particular, final-state radiation is simulated using Photos [111].
The candidates used in the analysis must first pass a hardware trigger, which requires the presence of at least one muon with high transverse momentum, p T , in the case of B → K ( * ) µ + µ − candidates, or in the case of B → K ( * ) e + e − candidates at least one electron or hadron with large energy deposits in the electromagnetic calorimeter (ECAL) or hadronic calorimeter (HCAL), respectively. Further B → K ( * ) e + e − candidates are selected where the hardware trigger requirements are satisfied by objects from the underlying pp collision that do not form part of the reconstructed candidate. Candidates are then required to pass a software trigger, the first stage of which requires the presence of at least one track with high p T that is well separated from the primary pp interaction vertex (PV), followed by a second stage that imposes topological requirements on the final-state tracks to determine whether they are consistent with the decay of a b hadron.
Muons are initially identified from tracks that penetrate the calorimeters and the iron absorber plates of the muon system and further separated from hadrons (primarily pions and kaons) by a multivariate classifier that combines information from the other subdetectors. Electrons are identified from tracks with an associated deposit of energy in the ECAL, and separated from hadrons using a similar multivariate classifier.
Due to their small mass, electrons lose energy via bremsstrahlung radiation as they traverse the detector material, leading to a degradation in their energy and momentum resolution. A bremsstrahlung recovery procedure is used to identify energy deposits in the ECAL that are consistent with photons radiated from electron candidate tracks upstream of the magnet. This is done by extrapolating the direction of the electron track before the magnet to a position in the ECAL and then searching for energy deposits without associated tracks at that location. When such a deposit is identified, its energy is used to correct the electron's energy and momentum. This leads to an improvement in the B candidate invariant mass resolution, although the resolution for electronic modes remains larger than for the equivalent muonic channels.
Candidate K 0 S mesons are reconstructed from two oppositely charged tracks identified as pions, using either a pair of tracks that originate in the vertex locator (long tracks), or two tracks that originate downstream of the vertex locator in the first silicon-strip detector (downstream tracks). Around a third of reconstructed K 0 S mesons are formed from long tracks. Candidate B 0 → K 0 S + − decays are formed from two oppositely charged tracks identified as either muons or electrons combined with a candidate K 0 S meson. In the case of B + → K * + + − candidates, the additional charged pion is required to result in a K 0 S π + mass within 300 MeV/c 2 of the K * + mass [103]. An estimated S-wave contribution in this K * + mass window of approximately 22% based on previous studies of B 0 → K + π − µ + µ − decays by the LHCb collaboration [20] is included in the analysis. When measuring the B + → K * + e + e − differential branching fraction, the B + → J/ψ (e + e − ) K * + control mode is selected with a K 0 S π + mass in the range 792 − 992 MeV/c 2 in order to be consistent with the selection used in previous measurements of B (B + → J/ψK * + ) [112,113], the world average of which is taken as external input [103]. Background is further suppressed by requirements on the quality of the B decay vertex, the flight distance significance of the B candidate, how consistent the B candidate is with having originated at the PV, the invariant masses of the K 0 S , K * + and B candidates, and the p T and separation from the PV of the final-state tracks. In an additional step, the invariant masses of B candidates are recalculated with the K 0 S meson mass constrained to its measured value [103], leading to an improvement in the B mass resolution. Various requirements on decay kinematics, decay time, and particle identification (PID) information are used to reject potential background originating from misidenti- where both X and Y represent either an electron-neutrino pair or a pion. Background to where the companion pion from the K * + is swapped with a lepton from the J/ψ or ψ(2S) meson, H b → hh π + − decays, decays with Λ baryons in the final state, and B + → D 0 (K 0 S π + X)Y decays. These selection requirements reduce all these background sources to levels that have a negligible (sub-percent) effect on the measured signal yields. The B 0 → K 0 S π + π − and B + → K * + π + π − decays, where the pions are misidentified as electrons, are significantly reduced by the electron particle identification requirements and included as components in the mass fits.
Multivariate classifiers based on boosted decision tree (BDT) algorithms [114] are used to suppress background from coincidental track combinations (combinatorial background). Separate classifiers are trained for each signal decay mode, the two data-taking periods (Run 1, Run 2) and whether the K 0 S meson was reconstructed from long or downstream tracks. Each classifier is trained on a signal sample of simulated B → K ( * ) + − decays, and a background sample taken from data with a reconstructed invariant B mass greater than 5500 MeV/c 2 and q 2 regions consistent with either a J/ψ or ψ(2S) meson removed. The classifiers combine information on the B candidate's fit quality, distance of closest approach to the PV, flight distance, p T and decay time; the decay time of the dilepton pair; the p T and decay time of the K 0 S candidate; how isolated the B, dilepton and K 0 S candidates are from other tracks in the event; and the distance of closest approach to the PV of long tracks forming the K 0 S candidate. Requirements on the classifier outputs are optimised to provide the maximum signal significance, defined as S/ √ S + B, where S is the expected signal calculated from the control mode yield in data, the signal-to-control mode efficiency ratio from simulation and the signal-to-control mode branching fraction ratio [103], and B is the background yield in the signal region, extrapolated from a fit to the data mass sidebands.
The muon and electron control modes are selected in the ranges of 8.98 < q 2 < 10.21 GeV 2 /c 4 and 6.0 < q 2 < 11.0 GeV 2 /c 4 , respectively, and the spectra of their invariant masses m(J/ψK 0 S ) and m(J/ψK 0 S π + ) are shown in Fig. 1. Their yields are determined using fits to the B candidate mass, calculated with the masses of the K 0 S and J/ψ mesons constrained to their measured values [103], improving the resolution. The mass distribution of each control mode is modelled with a sum of two Crystal Ball functions [115] with common mean and opposite-side power law tails (referred to as a double Crystal Ball or DCB), with parameters fixed to values obtained from fits to simulated events. The muon control modes are modelled using a single DCB function, while the electron control modes are modelled using a sum of three DCB functions, where the shape of each component and their relative fractions are determined from fits to simulation in three different categories according to the number of bremsstrahlung photons added to the candidate's electrons: 0, 1, or ≥ 2. In the fit to data, shifts in the DCB means and widths are allowed to vary freely with respect to those obtained from simulation in order to accommodate data-simulation differences in the mass scale

Data 9 fb Total
S π + mass and (bottom right) J/ψ(e + e − )K 0 S π + mass with the fit models used to determine the control mode yields. and resolution. Combinatorial background is modelled with an exponential function. In the case of the B 0 control modes, B 0 s → J/ψ ( + − ) K 0 S decays are modelled with the B 0 DCB function with its mean offset by the measured m B 0 s − m B 0 mass difference [103] and with its yield allowed to vary freely. The lower limit of the mass range excludes partially reconstructed background from higher K * resonances, which are not modelled. The structures visible in the range 5400 MeV/c 2 < m(J/ψK 0 S ) < 5500 MeV/c 2 are due to small amounts of residual contamination from Λ 0 b decays, which have a negligible effect on the measured yields of B 0 → J/ψ (µ + µ − ) K 0 S and B 0 → J/ψ (e + e − ) K 0 S decays. The results of the fit are shown in Fig. 1 and the yields and B + → J/ψ (e + e − ) K * + decays are found to be 118 750 ± 360, 21 080 ± 170, 75 420 ± 290, and 14 330 ± 170, respectively.
The spectra of the invariant masses m(K 0 S µ + µ − ), m(K 0 S e + e − ), m(K 0 S π + µ + µ − ), and m(K 0 S π + e + e − ) of the muon and electron signal modes are shown in Fig. 2. The yields of B 0 → K 0 S µ + µ − and B + → K * + µ + µ − decays are determined using fits to the K 0 S µ + µ − and K 0 S π + µ + µ − mass distributions. The B 0 → K 0 S µ + µ − and B + → K * + µ + µ − signal decays are modelled using DCB functions where the shape parameters are determined from fits to simulation, with shifts in their means and widths taken from the corresponding control mode fits to data. Combinatorial background is modelled using an exponential function, while partially reconstructed background is excluded by the lower mass limit.
The ratios and branching fractions are determined using fits to K 0 S e + e − and K 0 S π + e + e − , and (bottom left) K 0 S π + µ + µ − and (bottom right) K 0 S π + e + e − mass with the fit models used to determine the B + → K * + µ + µ − yield and mass spectra, with the K 0 S candidate's mass constrained to its measured value. The B 0 → K 0 S e + e − and B + → K * + e + e − signal decays are modelled using the sum of three DCB functions, each corresponding to different numbers of recovered bremsstrahlung photons. The DCB parameters are taken from simulation with shifts in the means and widths taken from the control mode fits to data without a J/ψ mass constraint. Partially reconstructed background from higher K * resonances is modelled using a DCB function with shape parameters constrained from simulation in the B 0 → K 0 S e + e − fit, and Gaussian kernel density estimations (KDE) determined from simulation in the B + → K * + e + e − fit, with their yields allowed to vary freely. Leakage from the J/ψ control modes into the signal region is modelled using KDE functions, with their yields constrained based on the control mode fits and the efficiency in simulation. Residual contamination from B 0 → K 0 S π + π − and B + → K * + π + π − decays is modelled in each fit by a DCB function determined from simulated events, with its yield constrained using the control mode yields, the control mode and background branching fractions, and the efficiencies taken from simulation. Combinatorial background is modelled with an exponential function.
The efficiencies used in the measurements of the ratios and differential branching fractions are calculated using simulation, to which various corrections are applied to improve the agreement with data. The PID efficiencies for each channel are calculated from calibration data samples of electrons, muons and pions, and are applied as per-candidate weights to the simulation. Similarly, the electron tracking efficiency is corrected using calibration samples. The p T and pseudorapidity of the B mesons generated by Pythia 8 [106], and the occupancy of the underlying events are corrected by comparing their distributions between data and simulation using the muon control modes to calculate per-candidate weights, which are applied to both electron and muon samples. Similarly, the trigger efficiency is corrected by comparing the efficiency as a function of the p T of the muons, the transverse energy of the electrons and pions, and the p T of the B meson, between control mode data and simulation. Further weights are applied to correct any residual mismodelling of the BDT classifier response. Finally, the simulated q 2 distribution is corrected using control mode data to account for the larger observed resolution in data.
Multiple sources of systematic uncertainty are evaluated, the largest of which comes from the statistical uncertainties of the efficiencies due to the sizes of the available samples of simulated events. These affect the R −1 K ( * ) ratios and the differential branching fractions at 2-3%. The next largest are associated with the mass fit models, in particular the limited size of the simulation samples used to determine the shape parameters, and the choices of models used for the partially reconstructed and J/ψ leakage background, which affect the observables by 1-2%. The remaining sources of systematic uncertainty are found to be close to or below the 1% level. These include: the limited amount of data and simulation used to calculate correction weights; the choices of binning schemes used to evaluate the PID efficiency weights and potential biases in this procedure due to correlations in the PID response between the two electrons; the choice of methods used to calculate the trigger efficiency; imperfect modelling of the muon track reconstruction efficiency; residual mismodelling of the BDT classifier response in simulation; residual bias in the fitting procedure evaluated using pseudoexperiments, and residual contamination from B 0 → D − (K 0 S X)Y and B + → D 0 (K 0 S π + X)Y decays. Other sources of residual background contamination were found to have a negligible effect on the measurements. All systematic uncertainties, as well as the statistical precision of the efficiencies and various parameters used in the maximum likelihood fits, are included as Gaussian constraints in the fit. Finally, the statistical precision on each observable is scaled by factors determined from pseudoexperiments (with values in the range 1.01-1.02), in order to guarantee proper coverage.
A number of checks are performed to ensure that efficiencies are accurately estimated.
The ratios R −1 ψ(2S)K ( * ) , where the signal modes in Eq. , consistent with unity as expected due to lepton universality in J/ψ and ψ(2S) decays [6]. Additionally, the single ratios which do not benefit from cancellation of systematic biases between signal and control modes, and are therefore a stringent check of the efficiencies, are found to be r −1 J/ψK 0 S = 0.977 ± 0.008 (stat.) ± 0.027 (syst.) and r −1 J/ψK * + = 0.965 ± 0.011 (stat.) ± 0.032 (syst.), again consistent with unity in both cases. Furthermore, differential measurements of r J/ψK ( * ) as functions of a range of variables that are differently distributed in the signal and control decays are performed. The most powerful of these tests measures r J/ψK ( * ) as a function of the response of a BDT classifier trained on simulated candidates to distinguish signal and control decays. All these differential distributions are found to be flat within the statistical precision, providing further confidence that the yields and efficiencies are well estimated [116].
In order to avoid experimenter's bias, the results of the analysis and the electron signal mode mass spectra were not examined until the full procedure had been finalised and checks had been performed to ensure that the muon signal mode branching fractions were in agreement between the different data-taking years and also with the results of the previous measurements made by LHCb [22]. The fits to the B 0 → K 0 S e + e − , B 0 → K 0 S µ + µ − , B + → K * + e + e − and B + → K * + µ + µ − invariant-mass spectra are shown in Fig. 2. The fitted yields of B 0 → K 0 S e + e − B 0 → K 0 S µ + µ − B + → K * + e + e − and B + → K * + µ + µ − decays are 45 ± 10, 155 ± 15, 67 ± 13 and 221 ± 17, respectively. The ratios R −1 K 0 S and R −1 K * + are measured to be in the q 2 ranges [1.1, 6.0] GeV 2 /c 4 and [0.045, 6.0] GeV 2 /c 4 , respectively. These ratios are consistent with the SM at 1.5 and 1.4 standard deviations, respectively, evaluated using Wilks' theorem [117]. To aid comparison with other lepton-universality ratios, R K 0 S and R K * + are calculated by inverting the results above, yielding  [117] are 5.3σ and 6.0σ, respectively. Since the control mode branching fraction of B 0 → J/ψK 0 decays (8.91±0.21)×10 −4 [103] is used, the differential branching fraction of B 0 → K 0 e + e − instead of B 0 → K 0 S e + e − decays is reported. A combination of the R −1 K 0 S and R −1 K * + measurements is performed using the flavio software package [118] to fit for a single muon-specific Wilson coefficient C NP 9 = −C NP 10 , while fixing all other Wilson coefficients to their SM values. This scenario is used in several existing fits to b → s + − data, and is chosen specifically as the ratios R K ( * ) have poor sensitivity in discriminating between the Wilson Coefficients C 9 and C 10 . The fit results in C NP 9 = −C NP 10 = −0.8 +0.4 −0.3 and a significance of 2.0 standard deviations with respect to the SM under this specific scenario. It should be noted that this fit is model-dependent and the result could change if data-driven estimates of the hadronic uncertainties were used.
These measurements constitute the most precise tests of lepton universality in B 0 → K 0 S + − and B + → K * + + − decays to date, the most precise measurements of their differential branching fractions at low q 2 , and the first observations of B 0 → K 0 S e + e − and B + → K * + e + e − decays. While these measurements are individually consistent with the SM, the central values exhibit the same deficit of muonic decays relative to electronic decays as seen in the other lepton universality tests performed by the LHCb collaboration [5][6][7].

A Supplemental material
This appendix contains supplemental material to the measurements. Eq. 5 gives the the B 0 → K 0 S e + e − and B + → K * + e + e − differential branching fractions divided by the branching fractions of their respective normalisation modes. Fig. 3 shows the total efficiencies for the four signal decay modes as function of q 2 . Figs. 4 and 5 show differential measurements of r −1 J/ψK ( * ) . Figs. 6 and 7 show the profile likelihood scans for R −1 K ( * ) and the differential branching fractions. Fig. 8 compares the measurements of R −1 K ( * ) presented in this Letter with previous measurements by the Belle collaboration. Fig. 9 shows the J/ψK ( * ) invariant mass spectra and the maximum likelihood fits use to determine the B → J/ψK ( * ) control mode yields with a linear scale. Figs. 10 and 11 show the ψ(2S)K ( * ) invariant mass spectra and the maximum likelihood fits use to determine the B → ψ(2S)K ( * ) mode yields with linear and logarithmic scales.
In the main text, the differential branching fractions of B 0 → K 0 S e + e − and B + → K * + e + e − decays are reported. These are normalised using the world-average branching fractions for B 0 → J/ψ (e + e − ) K 0 and B + → J/ψ (e + e − ) K * + respectively [103]. The ratios of signal and control mode branching fractions used in this calculation are where B + → K * + e + e − candidates are selected with m(K 0 S π + ) within 300 MeV/c 2 of the world-average K * + mass, whereas B + → J/ψ (e + e − ) K * + candidates are selected in the range 792 < m(K 0 S π + )/ MeV/c 2 < 992. The efficiencies are evaluated using simulated candidates that have been corrected to improve agreement with data. These efficiencies account for detector resolution effects including the effect of bremsstrahlung, which may cause a candidate to be reconstructed in a different bin of q 2 than would be dictated by its 'true' value.