Search for lepton-universality violation in $B^+\to K^+\ell^+\ell^-$ decays

A measurement of the ratio of branching fractions of the decays $B^+\to K^+\mu^+\mu^-$ and $B^+\to K^+e^+e^-$ is presented. The proton-proton collision data used correspond to an integrated luminosity of $5.0\,$fb$^{-1}$ recorded with the LHCb experiment at centre-of-mass energies of $7$, $8$ and $13\,$TeV. For the dilepton mass-squared range $1.1<q^2<6.0\,$GeV$^2\!/c^4$ the ratio of branching fractions is measured to be $R_K = {0.846\,^{+\,0.060}_{-\,0.054}\,^{+\,0.016}_{-\,0.014}}$, where the first uncertainty is statistical and the second systematic. This is the most precise measurement of $R_K$ to date and is compatible with the Standard Model at the level of 2.5 standard deviations.

Decays involving $b\to s\ell^+\ell^-$ transitions, where $\ell$ represents a lepton, are mediated by flavour-changing neutral currents. Such decays are suppressed in the Standard Model (SM), as they proceed only through amplitudes that involve electroweak loop diagrams. These processes are sensitive to virtual contributions from new particles, which could have masses that are inaccessible to direct searches for resonances, even at Large Hadron Collider experiments.
The electroweak couplings of all three charged leptons are identical in the SM and, consequently, the decay properties (and the hadronic effects) are expected to be the same up to corrections related to the lepton mass, regardless of the lepton flavour (referred to as lepton universality). The ratio of branching fractions for $B\to H\mu^+\mu^-$ and $B\to He^+e^-$ decays, where $H$ is a hadron, can be predicted precisely in an appropriately chosen range of the dilepton mass squared $q^2_{\rm min}<q^2<q^2_{\rm max}$ [30,31]. This ratio is defined by

$$R_H \equiv \frac{\displaystyle\int_{q^2_{\rm min}}^{q^2_{\rm max}} \frac{{\rm d}\Gamma[B\to H\mu^+\mu^-]}{{\rm d}q^2}\,{\rm d}q^2}{\displaystyle\int_{q^2_{\rm min}}^{q^2_{\rm max}} \frac{{\rm d}\Gamma[B\to He^+e^-]}{{\rm d}q^2}\,{\rm d}q^2}\,, \tag{1}$$

where $\Gamma$ is the $q^2$-dependent partial width of the decay. In the range $1.1<q^2<6.0\,$GeV$^2\!/c^4$, such ratios are predicted to be unity with $\mathcal{O}(1\%)$ precision [32]. The inclusion of charge-conjugate processes is implied throughout this Letter.
This Letter presents the most precise measurement of the ratio $R_K$ in the range $1.1<q^2<6.0\,$GeV$^2\!/c^4$. The analysis is performed using $5.0\,$fb$^{-1}$ of proton-proton collision data collected with the LHCb detector during three data-taking periods in which the centre-of-mass energy of the collisions was $7$, $8$ and $13\,$TeV. The data were taken in the years 2011, 2012 and 2015-2016, respectively. Compared to the previous LHCb $R_K$ measurement [33], the analysis benefits from a larger data sample and an improved reconstruction; moreover, the lower limit of the $q^2$ range is increased, in order to be compatible with other LHCb $b\to s\ell^+\ell^-$ analyses and to suppress further the contribution from $B^+\to\phi(\to\ell^+\ell^-)K^+$ decays. The results supersede those of Ref. [33].
The analysis strategy is designed to reduce systematic uncertainties induced by the markedly different reconstruction of decays with muons in the final state compared to decays with electrons. These differences arise due to the significant bremsstrahlung emission of the electrons and the different signatures exploited in the online trigger selection. Systematic uncertainties that would otherwise affect the calculation of the efficiencies of the $B^+\to K^+\mu^+\mu^-$ and $B^+\to K^+e^+e^-$ decay modes are suppressed by measuring $R_K$ as a double ratio of branching fractions,

$$R_K = \frac{\mathcal{B}(B^+\to K^+\mu^+\mu^-)}{\mathcal{B}(B^+\to J/\psi(\to\mu^+\mu^-)K^+)} \bigg/ \frac{\mathcal{B}(B^+\to K^+e^+e^-)}{\mathcal{B}(B^+\to J/\psi(\to e^+e^-)K^+)}\,. \tag{2}$$

The measurement requires knowledge of the observed yield and the efficiency to trigger, reconstruct and select each decay mode. The use of this double ratio exploits the fact that $J/\psi\to\ell^+\ell^-$ decays are observed to have lepton-universal branching fractions within 0.4% [52,53]. Using Eq. (2) then requires the nonresonant $B^+\to K^+e^+e^-$ detection efficiency to be known only relative to that of the resonant $B^+\to J/\psi(\to e^+e^-)K^+$ decay, rather than the $B^+\to K^+\mu^+\mu^-$ decay. As the detector signatures of each resonant decay are similar to those of the corresponding nonresonant decay, systematic effects are reduced and the precision on $R_K$ is dominated by the statistical uncertainty.
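The cancellation that motivates the double ratio can be illustrated numerically. The sketch below assumes that each branching fraction is proportional to the observed yield divided by the total efficiency; the class `Mode`, the function `double_ratio` and all input numbers are hypothetical placeholders, not values or code from this analysis.

```python
from dataclasses import dataclass


@dataclass
class Mode:
    """Fitted signal yield and total efficiency of one decay mode."""
    n_obs: float
    efficiency: float

    def corrected_yield(self) -> float:
        # Efficiency-corrected yield; proportional to the branching fraction.
        return self.n_obs / self.efficiency


def double_ratio(k_mumu, jpsi_mumu, k_ee, jpsi_ee):
    """(nonresonant / resonant) for muons divided by the same for electrons."""
    muon_ratio = k_mumu.corrected_yield() / jpsi_mumu.corrected_yield()
    electron_ratio = k_ee.corrected_yield() / jpsi_ee.corrected_yield()
    return muon_ratio / electron_ratio


# Hypothetical placeholder inputs, chosen only to make the arithmetic easy.
k_mumu = Mode(n_obs=2000.0, efficiency=0.020)
jpsi_mumu = Mode(n_obs=2.0e6, efficiency=0.025)
k_ee = Mode(n_obs=800.0, efficiency=0.0080)
jpsi_ee = Mode(n_obs=6.0e5, efficiency=0.0075)
r_k = double_ratio(k_mumu, jpsi_mumu, k_ee, jpsi_ee)

# A common mis-modelling of both electron efficiencies cancels in the ratio.
bias = 0.9
r_k_biased = double_ratio(
    k_mumu,
    jpsi_mumu,
    Mode(k_ee.n_obs, k_ee.efficiency * bias),
    Mode(jpsi_ee.n_obs, jpsi_ee.efficiency * bias),
)
```

With these toy inputs both `r_k` and `r_k_biased` evaluate to unity, which shows that only the efficiency of each nonresonant mode relative to its resonant partner enters the result.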
After the application of selection criteria, which are discussed below, the four decay modes are separated from background on a statistical basis, using fits to the $m(K^+\ell^+\ell^-)$ distributions. For the resonant decays, the mass $m_{J/\psi}(K^+\ell^+\ell^-)$ is computed by constraining the dilepton system to the known $J/\psi$ mass [53]. This improves the electron-mode mass resolution (full width at half maximum) from 140 to $24.5\,$MeV$/c^2$ and the muon-mode mass resolution from 30 to $17.5\,$MeV$/c^2$. The $m(K^+\ell^+\ell^-)$ fit ranges and the $q^2$ selection used for the different decay modes are shown in Table 1. The selection requirements applied to the resonant and nonresonant decays are otherwise identical. The two ratios of efficiencies required to form Eq. (2) are taken from simulation. The simulation is calibrated using data-derived control channels, including $B^+\to J/\psi(\to\mu^+\mu^-)K^+$ and $B^+\to J/\psi(\to e^+e^-)K^+$. Correlations arising from the use of these decay modes both for this calibration and in the determination of the double ratio of Eq. (2) are taken into account. A further feature of the analysis strategy is that the results were not inspected until all analysis procedures were finalised.
The LHCb detector is a single-arm forward spectrometer covering the pseudorapidity range $2<\eta<5$, described in detail in Refs. [54,55]. The detector includes a silicon-strip vertex detector surrounding the proton-proton interaction region, tracking stations either side of a dipole magnet, ring-imaging Cherenkov (RICH) detectors, calorimeters and muon chambers. The simulation used in this analysis is produced using the software described in Refs. [56-61]. Final-state radiation is simulated using Photos++ 3.61 in the default configuration [59,62], which is observed to agree with a full quantum electrodynamics calculation at the level of 1% [32].
Candidate events are first required to pass a hardware trigger that selects either a muon with high transverse momentum ($p_{\rm T}$), or an electron, hadron or photon with high transverse energy deposited in the calorimeters. In this analysis, it is required that $B^+\to K^+\mu^+\mu^-$ and $B^+\to J/\psi(\to\mu^+\mu^-)K^+$ candidates are triggered by one of the muons, whereas $B^+\to K^+e^+e^-$ and $B^+\to J/\psi(\to e^+e^-)K^+$ candidates are required to be triggered in one of three ways: by either one of the electrons; by the kaon from the $B^+$ decay; or by particles in the event that are not part of the signal candidate. In the software trigger, the tracks of the final-state particles are required to form a vertex that is significantly displaced from any of the primary proton-proton interaction vertices (PVs) in the event. A multivariate algorithm is used for the identification of secondary vertices consistent with the decay of a $b$ hadron [63,64].

Table 1: Resonant and nonresonant mode $q^2$ and $m(K^+\ell^+\ell^-)$ ranges. The variables $m(K^+\ell^+\ell^-)$ and $m_{J/\psi}(K^+\ell^+\ell^-)$ are used for nonresonant and resonant decays, respectively.
Candidates are formed from a particle identified as a charged kaon, together with a pair of well-reconstructed oppositely charged particles identified as either electrons or muons. Each particle is required to have sizeable $p_{\rm T}$ and to be inconsistent with coming from a PV. The particles must originate from a common vertex with good vertex-fit quality, which is displaced significantly from all of the PVs in the event. The $B^+$ momentum vector is required to be aligned with the vector connecting one of the PVs in the event (subsequently referred to as the associated PV) and the $B^+$ decay vertex.
Kaons and muons are identified using the output of multivariate classifiers that exploit information from the tracking system, the RICH detectors, the calorimeters and the muon chambers [55,65-69]. Electrons are identified by matching tracks to electromagnetic calorimeter (ECAL) showers and adding information from the RICH detectors. The ratio of the energy detected in the ECAL to the momentum measured by the tracking system is central to this identification. If an electron radiates a photon downstream of the dipole magnet, the photon and electron deposit their energy in the same ECAL cells and the original energy of the electron is measured. However, if an electron radiates a photon upstream of the magnet, the energy of the photon will not be deposited in the same ECAL cells as the electron. For each electron track, a search is therefore made for ECAL showers around the extrapolated track direction (before the magnet) that are not associated with any other charged tracks. The energy of any such shower is added to the electron energy that is derived from the measurements made in the tracker.
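The bremsstrahlung-recovery step described above can be sketched as follows. This is a minimal illustration only: the function name, the cluster representation and the search radius `window` are assumptions made for the example and do not correspond to the LHCb reconstruction software or its tuned parameters.

```python
import math


def recover_bremsstrahlung(track_energy, expected_xy, clusters, window=50.0):
    """Return the electron energy after adding compatible ECAL showers.

    track_energy : energy derived from the tracker measurement (MeV)
    expected_xy  : (x, y) position at the ECAL of the track direction
                   extrapolated from before the magnet (mm)
    clusters     : list of dicts with keys 'x', 'y', 'energy' and
                   'matched_to_track' (True if a charged track points to it)
    window       : hypothetical search radius in mm, not an LHCb value
    """
    total = track_energy
    for cluster in clusters:
        if cluster["matched_to_track"]:
            # Showers associated with other charged tracks are skipped.
            continue
        distance = math.hypot(cluster["x"] - expected_xy[0],
                              cluster["y"] - expected_xy[1])
        if distance <= window:
            total += cluster["energy"]
    return total


# Toy usage: only the first cluster is both unmatched and inside the window.
toy_clusters = [
    {"x": 10.0, "y": 5.0, "energy": 5000.0, "matched_to_track": False},
    {"x": 12.0, "y": 0.0, "energy": 3000.0, "matched_to_track": True},
    {"x": 400.0, "y": 0.0, "energy": 2000.0, "matched_to_track": False},
]
corrected_energy = recover_bremsstrahlung(20000.0, (0.0, 0.0), toy_clusters)
```

Because only part of the radiated energy is found in this way, the corrected energy remains an underestimate on average, which is the origin of the low-mass tail of the electron modes.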
Backgrounds from exclusive decays of $b$ hadrons and the so-called combinatorial background, formed from the reconstructed fragments of multiple heavy-flavour hadron decays, are reduced using selection criteria that are discussed below. The muon modes benefit from superior mass resolution so that a reduced mass range can be used (see Table 1). Consequently, the only remaining backgrounds after the application of the selection criteria are combinatorial and, for the resonant mode, from the Cabibbo-suppressed decay $B^+\to J/\psi\pi^+$, where the pion is misidentified as a kaon. For the electron modes, where a wider mass range is used, significant residual exclusive backgrounds also contribute. Since higher-mass $K^*$ resonances are suppressed in the mass range selected, the dominant exclusive backgrounds for the resonant and nonresonant modes are from partially reconstructed $B^{(0,+)}\to J/\psi(\to e^+e^-)K^*(892)^{(0,+)}(\to K^+\pi^{(-,0)})$ and $B^{(0,+)}\to K^*(892)^{(0,+)}(\to K^+\pi^{(-,0)})e^+e^-$ decays, respectively, where the pion is not included in the candidate. At the level of $\mathcal{O}(1\%)$ of the $K^+e^+e^-$ signal, there are also exclusive background contributions from $B^+\to\bar{D}^0(\to K^+e^-\bar{\nu}_e)e^+\nu_e$ decays and, at low $m(K^+e^+e^-)$, from the radiative tail of $B^+\to J/\psi(\to e^+e^-)K^+$ decays. This tail is visible in the distribution of $m(K^+e^+e^-)$ versus $q^2$, which is given in the Supplemental Material to this Letter [70].
Cascade backgrounds of the form $H_b\to H_c(\to K^+\ell^-\bar{\nu}_\ell X)\,\ell^+\nu_\ell Y$, where $H_b$ is a beauty hadron, $H_c$ is a charm hadron, and $X$, $Y$ are particles that are not reconstructed, are suppressed by requiring that the kaon-lepton invariant mass satisfies the constraint $m(K^+\ell^-)>m_{D^0}$, where $m_{D^0}$ is the known $D^0$ mass [53]. Cascade backgrounds with a misidentified particle are suppressed by applying a similar veto, but with the lepton-mass hypothesis changed to that of a pion (denoted $\ell_{[\to\pi]}$). In the muon case, it is sufficient to reject $K\mu_{[\to\pi]}$ combinations with a mass smaller than $m_{D^0}$. In the electron case this veto is applied without the bremsstrahlung recovery, i.e. based on only the measured track momenta, and a window around the $D^0$ mass is used to reject candidates. The vetoes retain 97% of $B^+\to K^+\mu^+\mu^-$ and 95% of $B^+\to K^+e^+e^-$ decays passing the full selection. The relevant mass distributions are given in the Supplemental Material [70].
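The muon-mode vetoes described above amount to two invariant-mass requirements on the kaon and the oppositely charged lepton. The sketch below is illustrative: the helper names are hypothetical, the particle masses are approximate world-average values inserted as assumptions, and the electron-mode window around the $D^0$ mass is not implemented because its width is not quoted in the text.

```python
import math

# Approximate masses in MeV/c^2, inserted here as assumptions.
M_D0 = 1864.84
M_KAON = 493.677
M_MUON = 105.658
M_PION = 139.570


def four_vector(momentum, mass):
    """Build (E, px, py, pz) from a three-momentum and a mass hypothesis."""
    px, py, pz = momentum
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)


def invariant_mass(p1, p2):
    """Invariant mass of the sum of two four-vectors."""
    energy, px, py, pz = (a + b for a, b in zip(p1, p2))
    mass_squared = energy * energy - (px * px + py * py + pz * pz)
    return math.sqrt(max(mass_squared, 0.0))


def passes_muon_cascade_vetoes(kaon_momentum, muon_momentum):
    """Require m(K mu) > m(D0) for both the muon and the pion mass hypothesis."""
    kaon = four_vector(kaon_momentum, M_KAON)
    m_kmu = invariant_mass(kaon, four_vector(muon_momentum, M_MUON))
    m_kpi = invariant_mass(kaon, four_vector(muon_momentum, M_PION))
    return m_kmu > M_D0 and m_kpi > M_D0


# Toy momenta in MeV/c: a collinear pair has a low mass, a back-to-back pair a high one.
collinear = passes_muon_cascade_vetoes((0.0, 0.0, 3000.0), (0.0, 0.0, 3000.0))
back_to_back = passes_muon_cascade_vetoes((0.0, 0.0, 3000.0), (0.0, 0.0, -3000.0))
```

In this toy configuration the collinear pair is rejected and the back-to-back pair is kept.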
Other exclusive $b$-hadron decays require at least two particles to be misidentified in order to form backgrounds. These include the decays $B^+\to K^+\pi^+\pi^-$ and misreconstructed $B^+\to J/\psi(\to\ell^+\ell^-)K^+$ and $B^+\to\psi(2S)(\to\ell^+\ell^-)K^+$ decays, where the kaon is misidentified as a lepton and the lepton (of the same electric charge) as a kaon. The particle-identification criteria used in the selection render such backgrounds negligible. Backgrounds from decays with a photon converted into an $e^+e^-$ pair are also negligible.
Combinatorial background is reduced using Boosted Decision Tree (BDT) algorithms [71], which employ the gradient boosting technique [72]. For the nonresonant muon mode and for each of the three different trigger categories of the nonresonant electron mode, a single BDT is trained for the 7 and 8 TeV data, and an additional BDT is trained for the 13 TeV data. The same BDTs are used to select the resonant decays. The BDT training uses nonresonant $K^+\ell^+\ell^-$ candidates selected from the data with $m(K^+\ell^+\ell^-)>5.4\,$GeV$/c^2$ as a proxy for the background, and simulated nonresonant $K^+\ell^+\ell^-$ candidates as a proxy for the signal decays. The training and testing are performed using the $k$-folding technique with $k=10$ [73]. The variables used as input to these BDTs are: the $p_{\rm T}$ of the $B^+$, $K^+$ and dilepton candidates, and the minimum and maximum $p_{\rm T}$ of the leptons; the $B^+$, dilepton and $K^+$ $\chi^2_{\rm IP}$ with respect to the associated PV, where $\chi^2_{\rm IP}$ is defined as the difference in the vertex-fit $\chi^2$ of the PV reconstructed with and without the particle being considered; the minimum and maximum $\chi^2_{\rm IP}$ of the leptons; the $B^+$ vertex-fit quality; the significance of the $B^+$ flight distance; and the angle between the $B^+$ candidate momentum vector and the direction between the associated PV and the $B^+$ decay vertex. The selection applied to the BDT output variables is chosen to maximise the predicted significance of the nonresonant signal yield. The BDT selection reduces the combinatorial background by approximately 99%, while retaining 85% of the signal modes. The efficiency of each BDT response is independent of $m(K^+\ell^+\ell^-)$ in the regions used to determine the event yields.
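The bookkeeping behind $k$-folding with $k=10$ can be sketched as below: each candidate is assigned to one fold and is only ever evaluated by a classifier trained on the other nine folds. The function names are hypothetical and the classifier itself (a gradient-boosted BDT in this analysis) is deliberately left out; this is a sketch of the splitting logic, not of the training.

```python
import random


def assign_folds(n_candidates, k=10, seed=1):
    """Randomly assign candidate indices to k folds of near-equal size."""
    indices = list(range(n_candidates))
    random.Random(seed).shuffle(indices)
    folds = [[] for _ in range(k)]
    for position, index in enumerate(indices):
        folds[position % k].append(index)
    return folds


def k_fold_splits(folds):
    """Yield (train_indices, held_out_indices) for each of the k folds."""
    for held_out, held_out_indices in enumerate(folds):
        train_indices = [
            index
            for fold_number, fold in enumerate(folds)
            if fold_number != held_out
            for index in fold
        ]
        yield train_indices, held_out_indices


# Toy usage with 95 candidates: one classifier would be trained per split
# and applied only to the candidates in the held-out fold.
folds = assign_folds(95, k=10)
splits = list(k_fold_splits(folds))
```

This arrangement allows every candidate in the training samples to be used without a classifier ever being applied to a candidate on which it was trained.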
An unbinned extended maximum-likelihood fit to the $m(K^+e^+e^-)$ and $m(K^+\mu^+\mu^-)$ distributions of nonresonant candidates is used to determine $R_K$. In order to take into account the correlation between the selection efficiencies, the different trigger categories and data-taking periods are fitted simultaneously. The resonant decay mode yields are incorporated as constraints in this fit, such that the $B^+\to K^+\mu^+\mu^-$ yield and $R_K$ are fit parameters. The resonant yields are determined from separate unbinned extended maximum-likelihood fits to the $m_{J/\psi}(K^+\ell^+\ell^-)$ distributions. For all the mass-shape models described below, the parameters are derived from simulated decays that are calibrated using data control channels.
All four signal modes are modelled by functions with multi-Gaussian cores and power-law tails on both sides of the peak [74,75]. The electron-mode signal mass shapes are described with the sum of three distributions which model whether a bremsstrahlung photon cluster was added to neither, either or both of the $e^\pm$ candidates. The fraction of signal decays in each of the bremsstrahlung categories is constrained to the value obtained from the simulation.
The shape of the $B^+\to J/\psi\pi^+$ background is taken from simulation, while its size is constrained with respect to the $B^+\to J/\psi K^+$ mode using the known ratio of the relevant branching fractions [53,76] and efficiencies.
In each trigger category, the shape and relative fraction of the background from partially reconstructed $B^{(0,+)}\to K^*(892)^{(0,+)}(\to K^+\pi^{(-,0)})e^+e^-$ or $B^{(0,+)}\to J/\psi(\to e^+e^-)K^*(892)^{(0,+)}(\to K^+\pi^{(-,0)})$ decays are also taken from simulation. The overall yield of these partially reconstructed decays is left free to vary in the fit, in order to accommodate possible lepton-universality violation in such decays. In the fits to nonresonant $K^+e^+e^-$ candidates, the shape of the radiative tail of $B^+\to J/\psi(\to e^+e^-)K^+$ decays is taken from simulation and its yield is constrained to the expected value within its uncertainty. In all fits, the combinatorial background is modelled with an exponential function with a freely varying yield and shape.
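The structure of an unbinned extended likelihood with a signal peak and an exponential combinatorial background can be sketched as follows. This is a deliberately simplified stand-in: a single truncated Gaussian replaces the multi-Gaussian signal shapes with power-law tails, the fit range, mean, width and slope are hypothetical, and no minimiser, partially reconstructed component or constraint term is included.

```python
import math
import random

LO, HI = 5000.0, 5600.0  # hypothetical fit range in MeV/c^2


def signal_pdf(m, mean=5280.0, sigma=20.0):
    """Gaussian normalised on [LO, HI]; a stand-in for the real signal shape."""
    root2 = math.sqrt(2.0)
    norm = 0.5 * (math.erf((HI - mean) / (sigma * root2))
                  - math.erf((LO - mean) / (sigma * root2)))
    gauss = math.exp(-0.5 * ((m - mean) / sigma) ** 2)
    return gauss / (sigma * math.sqrt(2.0 * math.pi) * norm)


def background_pdf(m, slope=-0.003):
    """Exponential in m, normalised on [LO, HI] (slope must be non-zero)."""
    norm = (math.exp(slope * (HI - LO)) - 1.0) / slope
    return math.exp(slope * (m - LO)) / norm


def extended_nll(masses, n_sig, n_bkg):
    """Negative log of the extended likelihood, up to a constant term."""
    log_sum = sum(
        math.log(n_sig * signal_pdf(m) + n_bkg * background_pdf(m))
        for m in masses
    )
    return (n_sig + n_bkg) - log_sum


def toy_sample(n_sig, n_bkg, slope=-0.003, seed=7):
    """Generate toy masses from the two components above."""
    rng = random.Random(seed)
    masses = []
    while len(masses) < n_sig:
        m = rng.gauss(5280.0, 20.0)
        if LO <= m <= HI:
            masses.append(m)
    width = HI - LO
    for _ in range(n_bkg):
        u = rng.random()
        # Inverse-CDF sampling of the truncated exponential.
        masses.append(LO + math.log(1.0 + u * (math.exp(slope * width) - 1.0)) / slope)
    return masses


toy_masses = toy_sample(200, 300)
nll_at_truth = extended_nll(toy_masses, 200.0, 300.0)
nll_too_little_signal = extended_nll(toy_masses, 50.0, 450.0)
```

In a real fit the yields and shape parameters would be varied by a minimiser; here the function is only evaluated at two hypotheses to show that the generated yields are preferred.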
In order to evaluate the efficiencies accurately, weights are applied to simulated candidates to correct for the imperfect modelling of the $B^+$ production kinematics, the particle-identification performance, and the trigger response. The weights are computed sequentially, making use of control samples of $J/\psi\to\mu^+\mu^-$, $D^{*+}\to D^0(\to K^-\pi^+)\pi^+$ and $B^+\to J/\psi(\to\ell^+\ell^-)K^+$ decays, and are applied to both resonant and nonresonant simulated candidates. Only subsets of the $B^+\to J/\psi(\to\ell^+\ell^-)K^+$ samples are used to derive these corrections, which minimises the number of common candidates being used for both the determination of the corrections and the measurement. The correlations between samples are taken into account in the results and cross-checks presented below. The overall effect of the corrections on the $R_K$ measurement is at the 0.02 level, demonstrating the robustness of the double-ratio method in suppressing systematic biases that affect the resonant and nonresonant decay modes similarly.
Two classes of systematic uncertainty are considered: those that only affect the nonresonant decay yields, and those that affect the ratio of efficiencies for different trigger categories and data-taking periods in the fit for $R_K$. The uncertainty from the choice of mass-shape models falls into the former category and is estimated by fitting pseudoexperiments with alternative models that still describe the data well. The effect on $R_K$ is at the $\pm0.01$ level. Systematic uncertainties in the latter category affect the ratios of efficiencies and hence the value of $R_K$ that maximises the likelihood. These uncertainties are accounted for through constraints on the efficiency values used in the fit to determine $R_K$, taking into account the correlations between different trigger categories and data-taking periods. The combined statistical and systematic uncertainty is then determined from a profile-likelihood scan. In order to isolate the statistical contribution to the uncertainty, the profile-likelihood scan is repeated with the efficiencies fixed to their fitted values. For the subsamples of the electron-mode data where the trigger is based on the kaon or on other particles in the event that are not part of the signal candidate, the dominant systematic uncertainties come from the (data-derived) calibration of the trigger efficiencies. For the electron trigger, there are comparable contributions from the statistical uncertainties associated with various calibration samples and the calibration of data-simulation differences.
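A likelihood scan of the kind described above yields asymmetric uncertainties from the points at which the negative log-likelihood rises by 0.5 above its minimum. The sketch below assumes a toy, piecewise-parabolic likelihood whose widths are set to the statistical uncertainties quoted in this Letter; both function names are hypothetical, and the toy function is not the likelihood of the analysis, in which the nuisance parameters are re-minimised at each scan point.

```python
def toy_nll(r, best=0.846, sigma_lo=0.054, sigma_hi=0.060):
    """Toy negative log-likelihood: a parabola with different widths on each side."""
    sigma = sigma_lo if r < best else sigma_hi
    return 0.5 * ((r - best) / sigma) ** 2


def scan_interval(nll, lo, hi, steps=2000, delta=0.5):
    """Scan nll on a grid; return (best, lower, upper) where nll = min + delta."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    values = [nll(x) for x in grid]
    i_min = min(range(len(grid)), key=values.__getitem__)
    target = values[i_min] + delta

    def crossing(indices):
        previous = i_min
        for i in indices:
            if values[i] >= target:
                # Linear interpolation between the two bracketing grid points.
                x0, x1 = grid[previous], grid[i]
                y0, y1 = values[previous], values[i]
                return x0 + (target - y0) * (x1 - x0) / (y1 - y0)
            previous = i
        return None  # no crossing inside the scanned range

    lower = crossing(range(i_min - 1, -1, -1))
    upper = crossing(range(i_min + 1, len(grid)))
    return grid[i_min], lower, upper


best, lower, upper = scan_interval(toy_nll, 0.6, 1.1)
```

For this toy input the scan returns an interval of approximately $+0.060$ and $-0.054$ around the minimum, as constructed.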
The migration of events in $q^2$ is studied in the simulation. The effect of the differing $q^2$ resolution between data and simulation, which alters the estimate of the migration, gives a negligible uncertainty in the determination of the ratio of efficiencies. The uncertainties on parameters used in the simulation decay model (Wilson coefficients, form factors, other hadronic uncertainties etc.) affect the $q^2$ distribution and hence the selection efficiencies determined from simulation. The variation caused by the uncertainties on these parameters is propagated to an uncertainty on $R_K$ using predictions from the flavio software package [41]. The resulting systematic effect on $R_K$ is negligible, even when non-SM values of the Wilson coefficients are considered.
Several cross-checks are used to verify the analysis procedure. The single ratio $r_{J/\psi} = \mathcal{B}(B^+\to J/\psi(\to\mu^+\mu^-)K^+)/\mathcal{B}(B^+\to J/\psi(\to e^+e^-)K^+)$ is known to be compatible with unity at the 0.4% level [52,53]. This ratio does not benefit from the cancellation of systematic effects that the double ratio used to measure $R_K$ exploits, and is therefore a stringent test of the control of the efficiencies. The corrections applied to the simulation do not force $r_{J/\psi}$ to be unity and some of the corrections shift $r_{J/\psi}$ in opposing directions. The value of $r_{J/\psi}$ is found to be $1.014\pm0.035$, where the uncertainty includes the statistical uncertainty and those systematic effects relevant to the $R_K$ measurement. It does not include additional subleading systematic effects that should be accounted for in a complete measurement of $r_{J/\psi}$. As a further cross-check, the double ratio of branching fractions,

$$R_{\psi(2S)} = \frac{\mathcal{B}(B^+\to\psi(2S)(\to\mu^+\mu^-)K^+)}{\mathcal{B}(B^+\to J/\psi(\to\mu^+\mu^-)K^+)} \bigg/ \frac{\mathcal{B}(B^+\to\psi(2S)(\to e^+e^-)K^+)}{\mathcal{B}(B^+\to J/\psi(\to e^+e^-)K^+)}\,,$$

is determined to be $0.986\pm0.013$, where again the uncertainty includes the statistical uncertainty but only those systematic effects that are relevant to the $R_K$ measurement. This ratio provides an independent validation of the analysis procedure.
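As a rough guide to the size of these deviations, the two quoted cross-check values can be expressed as distances from unity in units of their uncertainties. The arithmetic below assumes that the quoted uncertainties can be treated as Gaussian, which is an approximation made here for illustration only.

```latex
\frac{1.014 - 1}{0.035} = 0.40 ,
\qquad
\frac{0.986 - 1}{0.013} \approx -1.08 .
```

Both cross-check ratios therefore lie within about one standard deviation of unity under this assumption.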
Leptons from $B^+\to J/\psi K^+$ decays have a different $q^2$ value than those from the nonresonant decay modes. However, the detector efficiency depends on laboratory-frame variables rather than on $q^2$, e.g. the momenta of the final-state particles, opening angles, etc. In these laboratory variables there is significant overlap between the nonresonant and resonant modes, even if the decays do not overlap in $q^2$ (see the Supplemental Material [70]). The $r_{J/\psi}$ ratio is examined as a function of a number of reconstructed variables. Any trend would indicate an uncontrolled systematic effect that would only partially cancel in the double ratio. For each of the variables examined, no significant trend is observed. Figure 1 shows one example and others are provided in the Supplemental Material [70]. Assuming the deviations that are observed indicate genuine mismodelling of the efficiencies, rather than fluctuations, and taking into account the spectrum of the relevant variables in the nonresonant decay modes of interest, a total shift on $R_K$ is computed for each of the variables examined. In each case, the resulting variation is within the estimated systematic uncertainty on $R_K$. The $r_{J/\psi}$ ratio is also computed in two- and three-dimensional bins of the considered variables. Again, no trend is seen and the deviations observed are consistent with the systematic uncertainties on $R_K$. An example is shown in Fig. S7 in the Supplemental Material [70]. Independent studies of the electron reconstruction efficiency using control channels selected from the data also give consistent results.
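One simple way to quantify whether a binned ratio shows a trend is a $\chi^2$ test against a constant. The sketch below is an illustrative choice made for this example: the text does not state which statistic was used, the function names are hypothetical, and the binned values and uncertainties are toy numbers rather than measured ones.

```python
def weighted_mean(values, errors):
    """Inverse-variance weighted mean of binned measurements."""
    weights = [1.0 / (e * e) for e in errors]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def flatness_chi2(values, errors):
    """Return (mean, chi2, ndf) for the hypothesis that all bins share one value."""
    mean = weighted_mean(values, errors)
    chi2 = sum(((v - mean) / e) ** 2 for v, e in zip(values, errors))
    return mean, chi2, len(values) - 1


# Toy ratio values in four bins of some reconstructed variable.
toy_ratio = [1.00, 1.02, 0.98, 1.01]
toy_error = [0.02, 0.02, 0.02, 0.02]
mean, chi2, ndf = flatness_chi2(toy_ratio, toy_error)
```

A $\chi^2$ close to the number of degrees of freedom, as in this toy case, would indicate no significant departure from a flat ratio.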
The results of the fits to the $m(K^+\ell^+\ell^-)$ and $m_{J/\psi}(K^+\ell^+\ell^-)$ distributions are shown in Fig. 2. A total of $1943\pm49$ $B^+\to K^+\mu^+\mu^-$ decays are observed. A study of the $B^+\to K^+\mu^+\mu^-$ differential branching fraction gives results that are consistent with previous LHCb measurements [12] but, owing to the selection criteria optimised for the precision on $R_K$, are less precise. The $B^+\to K^+\mu^+\mu^-$ differential branching fraction observed is consistent between the 7 and 8 TeV data and the 13 TeV data.
The value of $R_K$ is measured to be

$$R_K = 0.846\,^{+\,0.060}_{-\,0.054}\,^{+\,0.016}_{-\,0.014}\,,$$

where the first uncertainty is statistical and the second systematic. This is the most precise measurement to date and is consistent with the SM expectation at the level of 2.5 standard deviations [21,32,35,39,41]. The likelihood profile as a function of $R_K$ is given in the Supplemental Material [70]. The value for $R_K$ obtained is consistent across the different data-taking periods and trigger categories. A fit to just the 7 and 8 TeV data gives a value for $R_K$ compatible with the previous LHCb measurement [33] within one standard deviation. This consistency test takes into account the large correlation between the two data samples, which are not identical due to different reconstruction and selection procedures. The result from just the 7 and 8 TeV data is also compatible with that from only the 13 TeV data at the 1.9 standard deviation level.

Figure 2: Fits to the $m_{(J/\psi)}(K^+\ell^+\ell^-)$ invariant mass distribution for (left) electron and (right) muon candidates for (top) nonresonant and (bottom) resonant decays. For the electron (muon) nonresonant plots, the red-dotted line shows the distribution that would be expected from the observed number of $B^+\to K^+\mu^+\mu^-$ ($B^+\to K^+e^+e^-$) decays and $R_K=1$.
The branching fraction of the $B^+\to K^+e^+e^-$ decay is determined in the nonresonant signal region $1.1<q^2<6.0\,$GeV$^2\!/c^4$ by combining the value of $R_K$ with the value of $\mathcal{B}(B^+\to K^+\mu^+\mu^-)$ from Ref. [12], taking into account correlated systematic uncertainties. This gives The dominant systematic uncertainty is from the limited knowledge of the $B^+\to J/\psi K^+$ branching fraction [53]. This is the most precise measurement to date and is consistent with predictions based on the SM [41,77].
In summary, in the dilepton mass-squared region $1.1<q^2<6.0\,$GeV$^2\!/c^4$, the ratio of the branching fractions for $B^+\to K^+\mu^+\mu^-$ and $B^+\to K^+e^+e^-$ decays is measured to be $R_K = 0.846\,^{+\,0.060}_{-\,0.054}\,^{+\,0.016}_{-\,0.014}$. This is the most precise measurement of this ratio to date and is consistent with the SM prediction at the level of 2.5 standard deviations. Further reduction in the uncertainty on $R_K$ can be anticipated when the data collected by LHCb in 2017 and 2018, which have a statistical power approximately equal to that of the full data set used here, are included in a future analysis. In the longer term, there are good prospects for high-precision measurements as much larger samples are collected with an upgraded LHCb detector [78].
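The relation used to obtain the electron-mode branching fraction follows from rearranging the definition of $R_K$ in the same $q^2$ range. The block below sketches that rearrangement together with a first-order propagation of uncertainties; it assumes uncorrelated inputs, with $\oplus$ denoting a sum in quadrature, whereas the analysis itself takes correlated systematic uncertainties into account.

```latex
\mathcal{B}(B^+\to K^+e^+e^-) = \frac{\mathcal{B}(B^+\to K^+\mu^+\mu^-)}{R_K} ,
\qquad
\frac{\delta\mathcal{B}(B^+\to K^+e^+e^-)}{\mathcal{B}(B^+\to K^+e^+e^-)}
\approx
\frac{\delta R_K}{R_K} \oplus
\frac{\delta\mathcal{B}(B^+\to K^+\mu^+\mu^-)}{\mathcal{B}(B^+\to K^+\mu^+\mu^-)} .
```

Because the electron-mode branching fraction scales as $1/R_K$, the upper uncertainty on $R_K$ maps onto the lower uncertainty on the branching fraction, and vice versa.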

Supplemental Material for LHCb-PAPER-2019-009
The two-dimensional distributions of $[m(K^+\ell^+\ell^-),\,q^2]$ for muon and electron candidates are shown in Fig. S1. For the muon sample, nonresonant candidates can be seen to accumulate in a vertical band around the $B^+$ meson mass. For the electron candidates, only some of the bremsstrahlung energy is recovered by the procedure described in the Letter and this results in a worse mass resolution and a long tail to lower $K^+e^+e^-$ masses. The vertical band of signal candidates is then more difficult to discern. The resonant signals from $B^+\to J/\psi(\to\ell^+\ell^-)K^+$ and $B^+\to\psi(2S)(\to\ell^+\ell^-)K^+$ decays are visible as diagonal bands, where the extended tails originate from both radiative and resolution effects, which are especially marked for the electron decay modes. As the energy loss affects both $m(K^+\ell^+\ell^-)$ and $q^2$ measurements, the angle of these bands is fixed and it is not possible for candidates to migrate into the bulk of the signal region in $[m(K^+\ell^+\ell^-),\,q^2]$. For the electron mode, the lower radiative tail of $B^+\to J/\psi(\to e^+e^-)K^+$ decays enters the $1.1<q^2<6.0\,$GeV$^2\!/c^4$ region only at the lower part of the $m(K^+e^+e^-)$ fit range around $4.9\,$GeV$/c^2$ (see also the left side of the $B^+\to K^+e^+e^-$ fit projection in Fig. 2 of the Letter).
The reconstructed properties of simulated decays are shown in Fig. S2. The distributions for resonant and nonresonant decays are similar and consequently the determination of the efficiency of each nonresonant decay with respect to its corresponding resonant decay results in the cancellation of systematic effects.
Figure S3 shows the $m(K^+e^-)$ mass distribution for $B^+\to K^+e^+e^-$ signal decays and for several cascade background decays. For the mass reconstructed taking into account the bremsstrahlung correction, signal candidates are required to satisfy $m(K^+e^-)>m_{D^0}$, suppressing the majority of cascade backgrounds to negligible levels. However, for cascade backgrounds involving $D^0\to K^+\pi^-$ decays, where the $\pi^-$ is misidentified as an electron, the bremsstrahlung correction gives rise to a long tail of candidates with $m(K^+e^-)>m_{D^0}$. Such decays are suppressed by placing an additional veto on the $K^+e^-$ mass reconstructed without the bremsstrahlung correction, i.e. based on the measured track momentum alone. This veto removes background around the known $D^0$ mass, as shown in Fig. S3. After the application of both these vetoes, the cascade backgrounds are reduced to a negligible level while retaining 97% of $B^+\to K^+\mu^+\mu^-$ and 95% of $B^+\to K^+e^+e^-$ decays passing the remainder of the selection requirements.
The fits to the nonresonant (resonant) decay modes divided into different data-taking periods and trigger categories are shown in Fig. S4 (Fig. S5). For the resonant modes these projections come from independent fits to each period/category. The nonresonant figures show the projections from the simultaneous fit that is used to obtain $R_K$. The total yields for the resonant and nonresonant decays obtained from these fits are given in Table S1.
The distributions of the ratio $r_{J/\psi}$ as a function of the $B^+$ transverse momentum and the minimum $p_{\rm T}$ of the leptons are shown in Fig. S6, together with the spectra expected for the resonant and nonresonant decays. This single ratio does not benefit from the cancellation of systematic effects that the double ratio exploits in the measurement of $R_K$, and is therefore a stringent test of the control of the efficiencies. No significant trend is observed in either $r_{J/\psi}$ distribution and the results are compatible with $r_{J/\psi}=1$. Assuming the deviations observed indicate genuine mismodelling of the efficiencies, rather than fluctuations, and taking into account the spectrum of the relevant variables in the nonresonant decay modes of interest, a total shift of $R_K$ at the level of 0.002 would be expected for the $B^+$ $p_{\rm T}$ and lepton minimum $p_{\rm T}$. This variation is compatible with the estimated systematic uncertainties on $R_K$. Similarly, the variations seen in all other reconstructed quantities are compatible with the systematic uncertainties assigned. The ratio $r_{J/\psi}$ is also computed in two- and three-dimensional bins of reconstructed quantities. An example is shown in Fig. S7. Again, no significant trend is seen and the distributions are compatible with $r_{J/\psi}=1$.

Table S1: Total yields of the resonant and nonresonant decay modes obtained from the fits (columns: decay mode, event yield).

The profile likelihood for the fit is shown in Fig. S8. The likelihood is Gaussian to a reasonable approximation in the range $0.75<R_K<0.95$, but non-Gaussian effects can be seen outside of this range due to the comparatively low yield in the $B^+\to K^+e^+e^-$ decay.
The R K values derived from a fit to just the 7 and 8 TeV data, and a fit to just the 13 TeV data are where the first set of uncertainties are statistical and the second systematic.The combination of these values, or a combination of the latter value with the previously published LHCb result [33], requires that correlations are properly taken into account, as is done in the simultaneous fit used to derive the R K measurement given in the main body of the Letter.

Figure 1: (Top) distributions of the opening angle between the two leptons, in the laboratory frame, for the four modes in the double ratio used to determine $R_K$. (Bottom) the single ratio $r_{J/\psi}$ relative to its average value $\langle r_{J/\psi}\rangle$ as a function of the opening angle.

Figure S1: Two-dimensional distributions of $[m(K^+\ell^+\ell^-),\,q^2]$ for (left) muon and (right) electron candidates after the application of the pre-selection and trigger requirements but not the multivariate selection.

Figure S2: Distributions of various reconstructed properties for simulated decays. The first row shows the angle between the two leptons, or one lepton and the kaon. The second row shows the rapidity distributions, and the third row the transverse momentum distributions of all the final-state particles. The bottom left plot shows the distribution for the quality of the $B^+$ vertex fit and the bottom right plot shows the $\chi^2_{\rm IP}(B^+)$ variable, which quantifies the significance of the $B^+$ impact parameter.

Figure S3: Simulated $K^+e^-$ mass distributions for signal and various cascade background samples. The distributions are all normalised to unity. (Left) the bremsstrahlung correction to the momentum of the electron is taken into account, resulting in a tail to the right. (Right) the mass is computed only from the track information ($m_{\rm track}$). The notation $\pi_{[\to e]}$ ($e_{[\to\pi]}$) is used to denote an electron (pion) that is misidentified as a pion (electron).

Figure S4: Fit to the $m(K^+\ell^+\ell^-)$ invariant-mass distribution of nonresonant candidates in the (left) 7 and 8 TeV and (right) 13 TeV data samples. The top row shows the fit to the muon modes and the subsequent rows the fits to the electron modes triggered by (second row) one of the electrons, (third row) the kaon and (last row) other particles in the event.

Figure S5: Fit to the $m_{J/\psi}(K^+\ell^+\ell^-)$ invariant-mass distribution of resonant candidates in the (left) 7 and 8 TeV and (right) 13 TeV data samples. The top row shows the fit to the muon modes and the subsequent rows the fits to the electron modes triggered by (second row) one of the electrons, (third row) the kaon and (last row) other particles in the event. Some large pulls are observed but have a negligible impact on the yields extracted.

Figure S6: (Top) distributions of the spectra of (left) the $B^+$ transverse momentum and (right) the minimum $p_{\rm T}$ of the leptons. (Bottom) the single ratio $r_{J/\psi}$ relative to its average value $\langle r_{J/\psi}\rangle$ as a function of these variables.

Figure S8: Profile likelihood of the fit as a function of $R_K$.
Figure S7: (Left) the value of $r_{J/\psi}$, relative to the average value of $r_{J/\psi}$, measured in two-dimensional bins of the maximum lepton momentum ($p(\ell)$) and the opening angle between the two leptons ($\alpha(\ell^+,\ell^-)$). (Right) the bin definition in this two-dimensional space together with the distribution for $B^+\to K^+e^+e^-$ ($B^+\to J/\psi(\to e^+e^-)K^+$) decays depicted as red (blue) contours.
