Evidence for B + → K + ν ¯ ν Decays

We search for the rare decay B + → K + ν ¯ ν in a 362 fb − 1 sample of electron-positron collisions at the Υ (4 S ) resonance collected with the Belle II detector at the SuperKEKB collider. We use the inclusive properties of the accompanying B meson in Υ (4 S ) → BB events to suppress background from other decays of the signal B candidate and light-quark pair production. We validate the measurement with an auxiliary analysis based on a conventional hadronic reconstruction of the accompanying B meson. For background suppression, we exploit distinct signal features using machine learning meth-ods tuned with simulated data. The signal-reconstruction efficiency and background suppression are validated through various control channels. The branching fraction is extracted in a maximum likelihood fit. Our inclusive and hadronic analyses yield consistent results for the B + → K + ν ¯ ν branching fraction of [2 . 7 ± 0 . 5(stat) ± 0 . 5(syst)] × 10 − 5 and (cid:2) 1 . 1 +0 . 9 − 0 . 8 (stat) +0 . 8 − 0 . 5 (syst) (cid:3) × 10 − 5 , respectively. Combining the results, we determine the branching fraction of the decay B + → K + ν ¯ ν to be (cid:2) 2 . 3 ± 0 . 5(stat) +0 . 5 − 0 . 4 (syst) (cid:3) × 10 − 5 , providing the first evidence for this decay at 3 . 5 standard deviations. The combined result is 2 . 7 standard deviations above the standard model expectation.

R. Tiwary , D. Tonelli , E. Torassa , N. Toutounji , K. Trabelsi , I. Tsaklidis , M. Uchida , I. Ueda , Y. Uematsu , T. Uglov , K. Unger , Y. Unno , K. Uno , S. Uno , P. Urquijo , Y. Ushiroda , S. E. Vahsen  We search for the rare decay B + → K + ν ν in a 362 fb −1 sample of electron-positron collisions at the Υ (4S) resonance collected with the Belle II detector at the SuperKEKB collider.We use the inclusive properties of the accompanying B meson in Υ (4S) → BB events to suppress background from other decays of the signal B candidate and light-quark pair production.We validate the measurement with an auxiliary analysis based on a conventional hadronic reconstruction of the accompanying B meson.For background suppression, we exploit distinct signal features using machine learning methods tuned with simulated data.The signal-reconstruction efficiency and background suppression are validated through various control channels.The branching fraction is extracted in a maximum likelihood fit.Our inclusive and hadronic analyses yield consistent results for the B + → K + ν ν branching fraction of [2.7 ± 0.5(stat) ± 0.5(syst)] × 10 −5 and 1.1 +0.9 −0.8 (stat) +0.8 −0.5 (syst) × 10 −5 , respectively.Combining the results, we determine the branching fraction of the decay B + → K + ν ν to be 2.3 ± 0.5(stat) +0.5  −0.4 (syst) × 10 −5 , providing the first evidence for this decay at 3.5 standard deviations.The combined result is 2.7 standard deviations above the standard model expectation.

I. INTRODUCTION
Flavor-changing neutral-current transitions, such as b → sν ν and b → sℓℓ, where ℓ represents a charged lepton, are suppressed in the standard model (SM) of particle physics, because of the Glashow-Iliopoulos-Maiani mechanism [1].These transitions can only occur at higher orders in SM perturbation theory through weakinteraction amplitudes that involve the exchange of at least two gauge bosons.Rate predictions for b → sℓℓ have significant theoretical uncertainties from the breakdown of factorization due to photon exchange [2].This process does not contribute to b → sν ν, so the corresponding rate predictions are relatively precise.
The b → sν ν transition provides the leading amplitudes for the B + → K + ν ν decay in the SM, as shown in Fig. 1.The SM branching fraction of the B + → K + ν ν decay [3] is predicted in Ref. [4] to be including a contribution of (0.61 ± 0.06) × 10 −6 from the long-distance double-charged-current B + → τ + (→ K + ν)ν decay.The B + → K + ν ν decay rate can be significantly modified in models that predict non-SM particles, such as leptoquarks [5].In addition, the B + meson could decay into a kaon and an undetectable particle, such as an axion [6] or a dark-sector mediator [7].In all analyses reported to date [8][9][10][11][12][13], no evidence for a signal has been found, and the current experimental upper limit on the branching fraction is 1.6 × 10 −5 at the 90% confidence level [14].The study of the B + → K + ν ν decay is experimentally challenging as the final state contains two neutrinos that are not reconstructed.This prevents the full reconstruction of the kinematic properties of the decay, hindering the differentiation of signal distributions from background.
In this study the signal B meson is produced in the e + e − → Υ (4S) → B + B − process.The at-threshold production of BB pairs helps to mitigate the limitations due to the unconstrained kinematics, as the partner B meson can be used to infer the presence and properties of the signal B. An inclusive tagging analysis method (ITA) exploiting inclusive properties from the B meson pairproduced along with the signal B, is applied to the entire Belle II data set currently available, superseding the results of Ref. [13], where this method was first used.In addition, an auxiliary analysis using the well-established hadronic tagging analysis method (HTA) [9,10] is presented; this involves explicit reconstruction of the partner B meson through a hadronic decay.The HTA method offers an important consistency check of the newer inclusive tagging method and helps validate the ITA results.In addition, the small size of the overlap between the HTA and ITA samples allows for a straightforward combination of the results, achieving a 10% increase in precision over the ITA result alone.
The ITA commences with the reconstruction of charged and neutral particles, followed by the selection of a single signal kaon candidate in events with one or more kaons.Subsequently, relevant quantities are computed using the kaon candidate, along with the remaining particles in the event, to discriminate between signal and background processes.These quantities are used in boosted decision trees (BDTs) [15,16] that are optimized and trained using simulated data.A signal region is then defined, and a binned profile-likelihood sample-composition fit is carried out on data.This fit uses simulated samples to provide predictions to determine the branching fraction of the B + → K + ν ν decay along with the rates of background processes.The fit incorporates systematic uncertainties arising from detector and physics-modeling imperfections as nuisance parameters.To validate the modeling of signal and background processes in simulation, several control channels are employed.The method is further validated through a closure-test measurement of the branching fraction of the B + → π + K 0 decay.
The HTA follows a similar method, but begins with the reconstruction of the partner B meson, and then proceeds to the definition of the signal candidate.
Except for the tagging method, the two analyses are similar in terms of particle reconstruction, event selection, usage of control samples, fit strategy, and treatment of common systematic uncertainties.In what follows, common approaches and details of the ITA are given first, followed by the HTA-specific details.
The paper is organized as follows.The data and simulated samples are presented in Sec.II followed by the Belle II detector description in Sec.III.The initial event selection and reconstruction of the decays are described in Sec.IV.Corrections introduced to the simulated samples are discussed in Sec.V. Section VI details background suppression and final event-selection using machine learning methods.Section VII defines the signal region used to extract the B + → K + ν ν decay branching fraction.The following two sections, Sec.VIII and Sec.IX, are dedicated to the validation of the modeling of the signal-selection efficiency and background contributions, respectively.Section X documents the statistical approach used to extract the signal and Sec.XI describes the systematic uncertainties.The results are discussed in Sec.XII, and consistency checks used for validation are presented in Sec.XIII.The combination of the ITA and HTA results is discussed in Sec.XIV.A discussion of the results is presented in Sec.XV.Section XVI concludes the paper.

II. DATA AND SIMULATED SAMPLES
This search uses data from e + e − collisions produced between the years 2019 and 2022 by the SuperKEKB collider [17].The on-resonance data, with an integrated luminosity of 362 fb −1 [18], are recorded at a centerof-mass (c.m.) energy of √ s = 10.58GeV, which corresponds to the mass of Υ (4S) resonance, and contain N BB = (387 ± 6) × 10 6 BB pairs [19].An additional 42 fb −1 off-resonance sample, collected at an energy 60 MeV below the mass of Υ (4S) resonance, is used to study background from continuum: e + e − → τ + τ − events and e + e − → q q events, where q indicates an u, d, s, or c quark.Simulated samples are exploited for training multivariate classifiers, estimating signal-selection efficiencies, identifying backgrounds, and defining components of the fits to data.Various event generators are used.The production and decays of charged and neutral B mesons use PYTHIA8 [20] and EVTGEN [21].The KKMC generator [22] is used to generate the q q pairs followed by PYTHIA8 to simulate their hadronization and EVTGEN to model the decays of the resulting hadrons.Similarly, KKMC and TAUOLA [23] are employed to simulate production of e + e − → τ + τ − events and decays of τ leptons, respectively.Final-state QED radiation is simulated using PHOTOS [24].For all samples, the Belle II analysis software [25,26], interfaced with GEANT4 [27], is used to simulate the detector response and perform event reconstruction.
The simulated B + → K + ν ν signal decays are weighted according to the SM form factor calculations from Ref. [4].Similar weighting is applied to B → K * (892) ν ν background decays [in the following, K * (892) mesons are indicated with K * ].The long-distance B + → τ + (→ K + ν)ν decays are simulated separately, normalized using the branching fraction from Ref. [4], and added to the B + B − background.The simulation of several other background processes receives additional corrections.Nonresonant three-body B + → K + nn decays are simulated assuming the threshold-enhancement effect present in the isospin-partner decay B 0 → K 0 S pp [28].Three-body [29] and assuming equal probabilities for the ) resonant contribution and nonresonant p-wave contribution with parameters taken from the isospin-related decay B 0 → K 0 S K + K − , as measured in Ref. [29].The PHOKHARA event generator [30] is used to simulate e + e − → ϕ(→ K 0 S K 0 L )γ events, which are used for additional studies.
The simulated continuum samples are normalized based on the known cross sections and integrated luminosity.Both the simulated BB background and signal samples are scaled using N BB , where the number of B + B − pairs is calculated as f +− N BB , and the number of B 0 B 0 pairs is calculated as f 00 N BB , with f 00 = 1 − f +− and f +− = 0.516 ± 0.012 [31].

III. DETECTOR
A comprehensive description of the Belle II detector is given in Ref. [32].The detector consists of several subdetectors arranged in a cylindrical structure around the beam pipe.The innermost subsystem consists of a silicon pixel detector surrounded by a doublesided silicon strip detector, referred to as the silicon vertex detector, and a central drift chamber (CDC).The second layer of the pixel detector covers only onesixth of the azimuthal angle for the data used in this work.The silicon detectors allow for precise determination of particle-decay vertices while the CDC determines charged-particle momenta and electric charge.A time-of-propagation counter and an aerogel ring-imaging Cherenkov counter cover the barrel and forward endcap regions of the detector, respectively: these subdetectors are important for charged-particle identification (PID).An electromagnetic calorimeter (ECL), used to reconstruct photons and distinguish electrons from other charged particles, occupies the remaining volume inside a superconducting solenoid.This provides a uniform 1.5 T magnetic field, parallel to the detector's principal axis.A dedicated system to identify K 0 L mesons and muons is installed in the flux return of the solenoid.The z axis of the laboratory frame is collinear with the symmetry axis of the solenoid and almost aligned with the electron-beam direction.The polar angle, as well as the longitudinal and transverse directions, are defined with respect to the z axis.

IV. EVENT SELECTION
The online-event-selection systems (triggers) for this analysis are based either on the number of chargedparticle trajectories (tracks) in the CDC or on the energy deposits in the ECL, and have an efficiency close to 100% for signal decays.In the offline analysis, the reconstruction of charged particles follows the algorithm outlined in Ref. [33].For the ITA, to ensure that efficiency is high and well-measured, and to suppress beam-related background, charged particles are required to have a transverse momentum p T > 0.1 GeV/c and to be within the CDC acceptance (17 • < θ < 150 • ).All charged particles except those used to form K 0 S candidates are required to have minimum longitudinal and transverse distances (impact parameters) from the average interaction point of |d z | < 3.0 cm and d r < 0.5 cm, respectively.The K 0 S candidates are formed by combining pairs of oppositely charged particles in a vertex fit.These candidates are required to have a dipion reconstructed mass between 0.495 and 0.500 GeV/c 2 , vertex p-value greater than 0.001, flight time greater than 0.007 ns (corresponding to about 2 mm displacement from the primary vertex), and cosine of the angle between momentum and flight direction greater than 0.98.Photons are identified as energy deposits exceeding 0.1 GeV detected in the ECL regions within the CDC acceptance, and not matched to tracks.The minimum energy requirement suppresses the beam-related background and energy deposits from charged hadrons that fail the matching to tracks.Each of the charged particles and photons is required to have an energy of less than 5.5 GeV to reject misreconstructed particles and cosmic muons.The kaon candidates are selected using particle-identification likelihoods based on information coming primarily from the PID detectors, complemented with information from the silicon strip detector, CDC, and the K 0 L and muon identification system.To ensure reliable PID, at least 20 deposited-charge measurements are required in the CDC.The chosen PID requirement has 68% efficiency for signal kaons, while the probability to identify a pion as a kaon is 1.2%.Candidates are also required to have at least one deposit in the pixel detector: this improves the impact parameter resolution, and helps to reject background events.
Events are required to contain no more than ten tracks to suppress background (e.g., high-multiplicity continuum production) with only a 0.5% loss of signalselection efficiency.Low-track-multiplicity background events, such as those originating from two-photoncollision processes, are suppressed by demanding at least four tracks in the event.This reduces signalreconstruction efficiency by 7.6%.The total energy from all reconstructed particles in the event must exceed 4 GeV.The polar angle of the missing momentum, computed in the c.m. frame as the complement to the total momentum of all reconstructed particles, must be between 17 • and 160 • .This range is chosen to remove low-multiplicity events and to ensure that the missing momentum points toward the active detector volume.
To select the signal kaon in an event, the mass squared of the neutrino pair is computed as assuming the signal B meson to be at rest in the e + e − c.m. frame.Here M K is the known mass of K + mesons and E * K is the reconstructed energy of the kaon in the c.m. system.Uncertainties in the kinematic properties of the colliding beams have negligible impact on the q 2 rec reconstruction.The candidate having the lowest q 2 rec is retained for further analysis.Studies on simulated signal events show that prior to applying the q 2 rec requirement the fraction of events with multiple candidates is 39%.The average number of candidates in such events is 2.2.The lowest-q 2 rec candidate is the signal kaon in 96% of cases.Checks using a random selection of the signal candidate, if several candidates are found, indicate no bias in the procedure.The remaining charged particles are fit to a common vertex and are attributed, together with the photons and K 0 S candidates, to the rest of the event (ROE).For the signal events, these charged particles and K 0 S candidates correspond to the decay products of the second B meson.
The HTA commences with the full reconstruction of a B meson (B tag ), decaying into one of 36 hadronic B decays, through the full event interpretation (FEI) [34].The FEI is an algorithm based on a hierarchical multivariate approach in which final-state particles are constructed using the tracks and energy deposits in the ECL, and combined into intermediate particles until the final B tag candidates are formed.The algorithm calculates, for each decay chain, the probability of it correctly describing the true process using gradient-boosted decision trees.Only B tag mesons with a probability exceeding 0.001 are retained.In addition, the beam-constrained mass M bc = s/(4c and p * B are the energy and the magnitude of the threemomentum of the B tag in the c.m. frame, respectively.Signal candidates peak at the known B + mass and zero in M bc and ∆E, respectively, while continuum events are distributed more uniformly.The FEI algorithm imposes conditions on the charged particles and energy deposits in the ECL similar to those used by the ITA.The algorithm requires at least three tracks and three energy deposits in the ECL, including those that are associated with the tracks.Furthermore, events with more than 12 tracks having d r < 2 cm and |d z | < 2 cm are rejected.Such events would have greater multiplicity than the maximum of the reconstructed B tag final states plus the signal-kaon track.
The signal-kaon candidate track is required to have at least 20 measurements in the CDC, and impact parameters d r < 0.5 cm and |d z | < 4 cm, and is required to satisfy PID criteria for a kaon.The B tag and signal kaon are required to have opposite charges.The same restrictions on missing momentum are applied as in the ITA.Moreover, the number of tracks with d r < 2 cm, |d z | < 4 cm and with at least 20 measurements in the CDC, which are neither associated with the B tag nor with the signal kaon, is required to be zero.
The remaining reconstructed objects in the HTA include tracks, which neither meet the CDC nor impact parameter requirements (extra tracks), and energy deposits in the ECL, which are neither associated with the B tag nor the signal kaon (extra photon candidates).Only the energy deposits in the ECL that exceed a θ-dependent energy threshold ranging from 60 MeV to 150 MeV, have a distance from the nearest track extrapolation larger than 50 cm, and are reconstructed within the CDC acceptance are considered.The sum of the energies of these deposits, denoted as E extra , and the multiplicity of the extra tracks, denoted as n tracks extra , are utilized in the subsequent steps of the analysis.Events are rejected if a K 0 S -meson, π 0 -meson, or Λ-baryon candidate is reconstructed from the extra tracks and photons.

V. CORRECTIONS TO SIMULATED DATA
The simulation of the detector response is tested using control samples from data, and correction factors are introduced with corresponding systematic uncertainties.Correction factors are applied as weights to the selected events when appropriate, particularly when the corrections impact the efficiency of the signal-kaon selection.In other cases, when the corrections affect the kinematic properties of the particles, these corrections are applied prior to the event selection and computation of related variables.The correction procedure is carried out for the nominal analysis as well as when computing systematic variations.

A. Reconstruction of charged particles
The efficiency for reconstructing charged particles is studied using e + e − → τ + τ − events, where one τ lepton decays into a single charged particle while the other decays into three charged particles [35].Simulation agrees well with the data.A systematic uncertainty of 0.3% is introduced for each charged particle to account for uncertainties in the detection efficiency and in the knowledge of the detector geometrical acceptance.
The reconstruction of kinematic properties of charged particles is validated by comparing the measured pole masses of known resonances with simulation.The simulation reproduces the data with an accuracy better than 0.1%, and any residual differences have a negligible impact on the analysis.

B. Identification of charged particles
About 10% of background arises from incorrect particle identification of the signal-kaon candidate.The main contribution is from misidentified pions, while misidentified muons, electrons, and protons have a smaller impact.The efficiency of kaon identification and the misidentification ("fake") rate for pions misidentified as kaons are determined using D * + → π + D 0 (→ K − π + ) decays reconstructed in continuum data and simulation.The small mass difference between D * + and D 0 mesons enables the isolation of a pure signal.The charge of the low-momentum pion from the flavor-conserving D * + decay allows the precise identification of the products of the Cabibbo-favored D 0 decay, providing abundant and low-background K − and π + samples.
Correction factors and their uncertainties are applied to the simulation as functions of the particle's charge, momentum, and polar angle.The correction factors for the pion-to-kaon fake rates are close to a factor of 2, indicating that the simulation underestimates the rate at which pions are misidentified as kaons.The uncertainties associated with these corrections, which are around 1% for efficiencies and 10% for fake rates, are treated 0.100 0.075 0.050 0.025 0.000 0.025 0.050

Belle II
L dt = 362 fb -1 2. Distribution of ∆E in data (dots with error bars) obtained for B + → h + D 0 decays, where h + = π + or K + , computed assuming a pion mass hypothesis for h + .The blue solid line represents the fit result to the data, modeled as a sum of two Gaussian shapes corresponding to B + → K + D 0 (dashed red line) and B + → π + D 0 (dotted magenta line) decays.Events selected for the figure are reconstructed as B + → K + ν ν events, with the daughters from the D 0 decays removed, and chosen to be in the signal region of the ITA.
as systematic uncertainties.Correction factors for the lepton-to-kaon fake rates are also applied, although their impact is negligible.The correction factors for kaon identification efficiency and the pion-to-kaon fake rate are further validated for the signal region of the ITA, using the B + → h + D 0 (→ K + π − ) decays, where h + stands for π + or K + , following the procedure outlined below.The decays are reconstructed using the pion mass hypothesis and the nominal kaon identification for the h + candidate.For B + → π + D 0 decays, the distribution of the ∆E variable peaks at zero.Since the B + is produced almost at rest in the c.m. frame, the kaon momentum in the two-body B + → K + D 0 decay is expected to be equal to 2.3 GeV/c with a small spread.Combined with the pion mass, this leads to an energy deficit compared to the correct kaon hypothesis, resulting in a shift in ∆E of −0.049 GeV.This characteristic of the ∆E distribution is used to distinguish between B + → K + D 0 and B + → π + D 0 decays without relying on PID information.The kaon and pion candidates from the D 0 decay are differentiated by kaon identification: the particle with the higher value is assumed to be a kaon.Only D 0 candidates with an invariant mass within 3 standard deviations of the known D 0 mass [14] are kept.In addition, the selections M bc > 5.27 GeV/c 2 and |∆E| < 0.1 GeV are applied.If several candidates pass the selection, a random one is chosen.
For the selected B + → h + D 0 decays, the information on the ∆E variable is kept while the tracks from the D 0 decay are removed and each event is reconstructed again as a B + → K + ν ν event.The same procedure is re-peated for both data and simulation.The selected events show a q 2 rec distribution peaking between 3 GeV 2 /c 4 and 5 GeV 2 /c 4 corresponding to the D 0 mass squared.The events have a B + → K + ν ν signal-like signature and for this q 2 range are reconstructed with high efficiency.Distribution for q 2 is included in the Supplemental Material [36].The distribution of the ∆E variable for the signal region of the ITA is shown in Fig. 2. Two prominent peaks corresponding to h + = K + and h + = π + are observed.The yields of the two components are extracted in a fit using Gaussian shapes.The double ratio of the B + → π + D 0 to B + → K + D 0 decay rates in data to simulation is 1.03 ± 0.09, showing consistency with unity within the statistical uncertainty.

C. Reconstruction of neutral particles
The photon detection efficiency at Belle II, and the calibration of the photon energy reconstruction, are based on studies of e + e − → µ + µ − γ events.The efficiency is the fraction of events where a photon is reconstructed with momentum consistent with the expectation from recoil against the µ + µ − system [37].The resulting uncertainty is negligible for this analysis.
The uncertainty on the photon energy is 0.5%.The effect on signal yield in the fit is estimated by applying this uncertainty to energy deposits from photon candidates matched to simulated photons.
The simulated sample shows that photon candidates have 30% contamination from beam-related background, energy deposits from charged hadrons that are reconstructed away from the particle trajectory, and from neutral hadrons.These deposits are not matched to simulated photons ("unmatched").The bias in the reconstructed energy for these sources ("hadronic energy correction") is studied using the summed energy of the photon candidates in the ROE ("summed neutral energy", E γ ) of events containing a B + → K + J/ψ decay (details on B + → K + J/ψ decay reconstruction can be found in Sec.VIII).In the simulation, the energy of reconstructed photon candidates is treated differently based on their matching to the generated photons.The energy for matched candidates is not corrected.For unmatched candidates, a multiplicative hadronic energy correction is inferred empirically using data.In the simulation the correction is varied within a ±20% range around unity.For the ITA, an improvement is found when the hadronic energy is varied down by 10%.The corresponding correction with 100% uncertainty (relative) is introduced.Illustration of the hadronic energy correction for the ITA is included in the Supplemental Material [36].
Figure 3 shows the comparison of distributions of summed neutral energy for events in which a B + → K + J/ψ decay is reconstructed, for collision data and for the corresponding uncorrected and corrected simulation.The correction corresponds to a variation of the hadronic energy by −10%.Better data-simulation agreement is achieved by the corrected simulation.
The correction is validated using various control samples dominated by background, such as off-resonance data and data at early selection steps.An improvement is observed in the description of several variables related to neutral-particle energy deposits, such as the number of photon candidates.The latter is sensitive to the hadronic energy since the hadronic-energy deposits peak at low energy and are affected strongly by the minimal energy requirement of 0.1 GeV.
For the HTA, a different extra-photon selection is adopted.In the HTA sample, the energy spectrum of extra photon candidates exhibits good data-simulation agreement, but observed discrepancies in the multiplicity (n γ extra ) propagate to the E extra distribution.To correct this, a control sample is used where the signal kaon and the B tag have the same charge.A weight is computed for the n γ extra distribution as follows: where N data (n γ extra ) and N simulation (n γ extra ) correspond to the event yields with n γ extra candidates in data and simulation, respectively.Subsequently, simulated events where the signal kaon and B tag have opposite charges, are weighted based on their associated n γ extra value.This method is validated using an independent pionenriched control sample where the signal track is identified as a pion instead of a kaon.The pion-enriched sample is further divided into two samples based on whether the signal candidate and B tag have the same or opposite charge.Corrections are derived from the sample where signal and B tag have same charge and are then applied to the opposite-charge sample.The effect of the correction in the pion-enriched sample at the event-selection stage is shown in Fig. 4.
Although an improvement is observed after applying the correction, residual data-simulation discrepancies remain.To account for these, a systematic uncertainty is assigned corresponding to 100% of the residual difference in the data-to-simulation ratio observed in the oppositecharge pion-enriched control sample after the correction.
Given the prominence of background contributions containing K 0 L mesons, a dedicated study is performed to check their modeling.This study focuses on the ECL response only, as the analysis does not use K 0 L candidates from the dedicated identification system to avoid additional systematic uncertainties due to their modeling.Radiative-return production e + e − → γϕ(→ K 0 S K 0 L ) is used for this purpose for K 0 L with energy above 1.6 GeV.The events are selected by demanding a photon candidate with energy E * γ > 4.7 GeV in the c.m. frame, a well-reconstructed K 0 S candidate, and no extra tracks.The K 0 L four-momentum is inferred based on the photon and K 0 S four-momenta, where the photon energy is computed based on the two-body e + e − → γϕ process.The typical momentum resolution of an inferred K 0 L is better than 1%.An energy deposit in the ECL reconstructed at a radius R is matched to the trajectory of K 0 L extrapolated to the same R if the distance between them is less than 15 cm.The efficiency for finding a matched energydeposit is studied both in data and simulation, and is tested separately in the ITA and HTA.The ITA selection for the ECL deposits is looser than the HTA selection; therefore a higher efficiency is found.Figure 5 shows the ITA K 0 L efficiency as a function of momentum; the simulation overestimates the efficiency by 17%.This is taken into account by performing a −17% (relative) efficiency correction in the ITA sample, for all K 0 L , including those below 1.6 GeV.A ±8.5% systematic uncertainty (i.e.half of the correction) is assigned.Distribution of the energy deposits in ECL is shown in the Supplemental Material [36].
While the radiative-return production of ϕ mesons does not encompass K 0 L with energies below 1.6 GeV, approximately half of the K 0 L mesons in the main background processes populate this lower-energy range.As a consistency check, a 100% inefficiency is incorporated in the ITA for this kinematic region in the simulation.Specifically, all energy deposits in the ECL that fall within a 15 cm radius of the extrapolated K 0 L trajectory are removed for simulated K 0 L with energies smaller than 1.6 GeV.The impact of this additional requirement on the analysis is found to be covered by the hadronic-energy systematic uncertainty, discussed above.
The K 0 L reconstruction efficiency is smaller for the HTA.Since the effect on E extra is already addressed by the correction and systematic uncertainty derived from  the extra-photon-multiplicity spectrum, no direct correction to the K 0 L efficiency is applied.Instead, a systematic uncertainty is assigned, wherein the yields of B final states with a K 0 L are varied by 17%.

VI. BACKGROUND SUPPRESSION
Simulated signal and background events are used to train BDTs that suppress the background.Several inputs are considered, including general event-shape variables described in Ref. [38], as well as variables characterizing the kaon candidate and the kinematic properties of the ROE.Moreover, vertices of two and three charged particles, with one of the tracks being the kaon-candidate track, are reconstructed to identify kaons from D 0 and D + meson decays; variables describing the fit quality and kinematic properties of the resulting candidates are considered as possible BDT inputs.Variables are excluded if either their contribution to the classification's separation power is negligible or they are poorly described by the simulation.
The ITA uses two consecutive BDTs.A first binary classifier, BDT 1 , is designed as a first-level filter after event selection.It is trained on 10 6 simulated events of each of the seven considered background categories (decays of charged B mesons, decays of neutral B mesons, and the five continuum categories: e + e − → q q with q = u, d, s, c quarks and e + e − → τ + τ − ), weighted to a common equivalent luminosity such that the sum of weights is balanced to the 10 6 simulated signal events.The classifier uses 12 input variables.The most discriminating variable is the difference between the ROE energy in the c.m. frame and √ s/2 (∆E ROE ), which tends to be negative for signal events due to neutrinos, whereas it is positive for the background with additional reconstructed particles.Significant discrimination comes from variables sensitive to the momentum imbalance of the signal events due to neutrinos, as well as those that correlate the missing momentum with the signal-kaon momentum.Examples of such variables are the reduced first-order Fox-Wolfram moment [39] and the modified Fox-Wolfram moments [40].
The second classifier, BDT 2 , is used for the final event selection.It is trained on events with BDT 1 > 0.9, which corresponds to a signal (background) selection efficiency of 34% (1.5%), using 35 input variables.A simulated background sample of 200 fb −1 equivalent luminosity, corresponding to 4.2 × 10 6 events, and a sample of 1.7×10 6 signal events are used.Tests with larger samples used for BDT 2 training show no additional improvements in BDT 2 performance.For BDT 2 , the most discriminating variables are the cosine of the angle between the momentum of the signal-kaon candidate and the thrust axis of the ROE computed in the c.m. frame, which has a uniform distribution for the signal and a peaking shape for the jetlike continuum background.The thrust axis is defined as the unit vector t that maximizes the thrust value where ⃗ p * i is the momentum of ith final-state particle in the e + e − c.m. frame [41,42].Also important are variables identifying kaons from D 0 and D + meson decays, and the modified Fox-Wolfram moments.The BDT 1 and BDT 2 parameters are optimized based on a grid search in the parameter space and are described in Appendix A 1. Training of BDT 1 and BDT 2 classifiers is based on simulated samples that are statistically independent of those used in the samplecomposition fit.
For the HTA, the remaining background is suppressed using a multivariate classifier BDTh, which uses 12 input variables combining information about the event shape, the signal-kaon candidate, the B tag meson, and any extra tracks and extra photons.Simulated background samples of about 2×10 5 BB events and 3×10 5 continuum events, which correspond to an equivalent luminosity of, respectively, 3 ab −1 and 1 ab −1 , are used together with a signal sample of 5 × 10 5 events.The BDTh parameters are optimized through a grid search in the parameter space.Given the limited size of the simulated sample, it is beneficial to use information from the whole sample both to train the BDTh and estimate the remaining background in the signal region.The simulated sample is thus split into two subsamples that are used to train two separate BDTh's.Good agreement between the two outputs is observed.The data sample is then randomly divided into two halves and each BDTh is applied to one half.In the background sample, for each event, the BDTh other than the one the event is used to train is applied.Details regarding the input variables and BDTh parameters are reported in Appendix A 2.
The BDTh input variable providing the highest discriminating power is E extra .For correctly reconstructed signal events, no extra ECL deposits are expected, which results in a E extra distribution peaking at zero; backgrounds leave deposits with energies up to 1 GeV.The second most discriminating variable is the sum of missing energy and magnitude of the missing momentum (E * miss + cp * miss ), where the missing four-vector is defined as the difference between the beam four-vector and the sum of the signal kaon and B tag four-vectors in the c.m. frame.For correctly reconstructed signal events, E * miss + cp * miss is defined by the neutrino kinematic properties, and its distribution peaks around 5 GeV, while for background events the random loss of particles mimicking the neutrinos results in a broader distribution.

VII. SIGNAL REGION DEFINITION
Using the simulated signal sample, the BDT 2 variable is mapped to the complement of the integrated signalselection efficiency, where ξ(b) is the total signal-selection efficiency density for the BDT 2 value b.In this way the distribution of η(BDT 2 ) for simulated signal events is uniform; a similar mapping is used to define η(BDTh), based on the efficiency of the selection on BDTh.
For the ITA, the signal region (SR) is defined to be BDT 1 > 0.9 and η(BDT 2 ) > 0.92, as this criterion maximizes the expected signal significance, based on studies in simulation.The SR is further divided into 4 × 3 intervals (bins) in the η(BDT 2 ) × q 2 rec space.The bin boundaries are [0.92,0.94, 0.96, 0.98, 1.00] in η(BDT 2 ) and [−1.0, 4.0, 8.0, 25.0] GeV 2 /c 4 in q 2 rec .The bin η(BDT 2 ) > 0.98 provides the main information on the signal while the bin η(BDT 2 ) < 0.94 helps to constrain background contributions.The bin boundaries in q 2 rec are chosen to follow those of theoretical predictions [2] while ensuring a sufficient number of expected signal events in each bin.The expected yields of the SM signal and the backgrounds in the SR are 160 and 16793 events, respectively.More detailed information about the expected background composition for charged and neutral B decays is shown in the Supplemental Material [36].For the highest-purity η(BDT 2 ) > 0.98 region, the expected SM signal yield is reduced to 40 events with a background yield of 977 events.These signal and background yields include corrections to the simulation discussed in the following sections; they correspond to the sample entering the statistical analysis to extract the signal described in Sec.X.
For the HTA, the SR is defined to be η(BDTh) > 0.4 and is divided into six bins with bin boundaries at [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0].In events containing multiple B tag -K + candidates, the candidate formed by the B tag with highest FEI probability is selected.The expected yields of the SM signal and the background in the SR are 8 and 211 events, respectively.For the highest purity η(BDTh) > 0.7 region, the expected SM signal yield is reduced to 4 events with background yield of 33 events.
The expected background and signal distributions in the signal search region are shown in the Supplemental Material [36].
The signal-selection efficiency in the SR is shown in Fig. 6.Much higher efficiency is observed for the ITA; however, the ITA efficiency has a significantly stronger q 2 dependence compared to the efficiency for the HTA.The analysis relies on modeling of this variation by simulation, which is checked using a control channel, as discussed in the next section.

VIII. SIGNAL SELECTION EFFICIENCY VALIDATION
The decay ) is used to validate the BDT performance on signal-like events between data and simulation, exploiting its large branching fraction and distinctive experimental signature.These events are selected in data and B + → K + J/ψ simulation by requiring the presence of two oppositely-charged muons with an invariant mass within 50 MeV/c 2 of the known J/ψ mass [14].To suppress background events, the variable |∆E| is required to be less than 100 MeV and the beam-energy constrained mass M bc is required to exceed 5.27 GeV/c 2 .These criteria result in 7214 events being selected in the data sample with an expected background contamination of 2%.Each event is then reconsidered as a B + → K + ν ν event by ignoring the muons from the J/ψ decay and replacing the kaon candidate with the signal kaon candidate from a simulated B + → K + ν ν event, to reflect the three-body topology of the signal signature.The kinematic properties of the signal kaon are then adjusted such that the B + four-momentum and decay vertex in the simulated B + → K + ν ν decay match the fourmomentum and decay vertex of the corresponding B + from the B + → K + J/ψ decay.This substitution is performed for the reconstructed track, ECL energy deposits, and PID likelihood values associated with the simulated kaon such that the test samples have a format identical to the data and can be analyzed by the same reconstruction software.This signal-embedding method is performed for both data and B + → K + J/ψ simulation.
The results obtained by analyzing selected events are summarized for the ITA in Fig. 7, where the distributions of the output values of both BDTs are shown.Good agreement between simulation and data is observed for the selected events before (B + → K + J/ψ ) and after (B + → K + ¨J/ψ ) the signal embedding.Distributions with logarithmic y-axis are presented in the Supplemen- Candidates/0.05FIG. 7. Distribution of the classifier output BDT1 (main figure) and BDT2 for BDT1 > 0.9 (inset).The distributions are shown before (B + → K + J/ψ ) and after (B + → K + ¨J/ψ ) the muon removal and replacement of the kaon momentum of selected B + → K + J/ψ events in simulation and data.As a reference, the classifier outputs directly obtained from simulated B + → K + ν ν signal events are overlaid.The simulation histograms are scaled to the total number of B + → K + J/ψ events selected in the data.
tal Material [36].The ratio of the selection efficiencies for the SR in data and simulation is 1.00 ± 0.03; i.e., agreement is observed.
For the HTA, the signal embedding is used to check both the FEI and the combined FEI plus BDTh signal reconstruction efficiency.The ratios of data and simulation efficiencies at the two levels of the selection are found to be 0.68 ± 0.06 and 0.60 ± 0.10, respectively.The first ratio agrees with an independent FEI calibration derived from B → Xℓν FEI-tagged events [43] and is therefore used as a correction for signal efficiencies and BB normalization.From the relative uncertainty on the efficiency ratio computed after the η(BDTh) selection, a 16% systematic uncertainty on the signal-selection efficiency is derived.For HTA, the resulting distributions of this study are shown in the Supplemental Material [36].

IX. BACKGROUND STUDIES
The main background sources for the analysis arise from decays that involve an energetic kaon (or a misidentified pion), missing energy, or particles that leave no or small signatures in the ECL, such as K 0 L mesons.These processes occur in both continuum and B-meson decays.Dedicated studies, using a variety of control samples, are performed in order to validate the background description in simulated events.Where needed, correction factors are derived with corresponding systematic uncertainties.In the following subsections the modeling of backgrounds from continuum (Sec.IX A) and BB events (Sec.IX B) are discussed.In Sec.IX C the overall background normalization after all corrections are applied is checked.

A. Continuum background
Continuum represents 40% and 30% of the background in the entire signal region of the ITA and HTA, respectively.This contribution drops to 17% in the highestsensitivity region η(BDT 2 ) > 0.98 of the ITA, and to 15% in the highest-sensitivity region η(BDTh) > 0.7 of the HTA.The background modeling is validated using the off-resonance data and shows moderate disagreements in the shape of some of the input features of the various classifiers (locally up to 20%).The modeling of continuum-background simulation is thus improved following Ref.[44].A binary classifier, BDT c , is trained to separate the off-resonance data and off-resonance simulation.For the ITA, the BDT c input variables consist of all BDT 2 input variables, q 2 rec , and the output of BDT 2 .The BDT c classifier is trained with events that satisfy BDT 1 > 0.9 and η(BDT 2 ) > 0.75 in the off-resonance data and a 50 fb −1 sample of off-resonance simulation.As a check, BDT c is trained using a 200 fb −1 simulated sample of continuum events produced at a c.m. energy corresponding to the Υ (4S) resonance, yielding a similar performance.For the HTA, the BDT c exploits all BDTh input variables and is trained with the off-resonance data and a 1 ab −1 simulated sample of continuum events produced at a c.m. energy corresponding to the Υ (4S) resonance.If p, taking values between 0.0 and 1.0, denotes the BDT c classifier output for a given continuum event, the ratio p/(1 − p) approximates the likelihood ratio L(data)/L(simulation), where L(data) (L(simulation)) is the likelihood of the continuum event being from data (simulation), which is used as a weight [44].The weights range between 0.5 and 2.0 with a standard deviation of 0.3 for the ITA.This weight, for the ITA, is applied to the simulated continuum events after the final selection; for the HTA, it is applied before the BDTh training.Comparison of simulated continuum events with off-resonance data shows that the application of this weight improves the modeling of the input variables.rec distribution in data and corrected simulation for the ITA off-resonance sample.While the shapes of the distributions are similar, there is a normalization excess of the data over the simulation of (40 ± 5)%, which is included as a systematic uncertainty (see Sec. XI).A possible source of the discrepancy is a mismodeling of kaon fragmentation in the PYTHIA8 version used in Belle II.Illustration of improvement of the ITA distributions with BDT c -based reweighting is shown in the Supplemental Material [36].
For the HTA, the relative normalization between offresonance data and continuum simulation is 0.82 ± 0.01 before the BDTh selection.This factor accounts for mismodeling effects on the FEI performance for continuum events and is used to scale the expected continuum contamination.The relative normalization in the BDTh signal region is consistent with unity with 50% uncertainty, which is included as a systematic uncertainty (see Sec. XI).

B. B background
The backgrounds originating from B 0 and B + decays are dominant in the highest-sensitivity regions of the analysis.The composition of the B backgrounds is similar for both the ITA and HTA samples.It is also similar for B + and B 0 decays; however, the contribution from B + decays has a larger impact for both analyses.
In the ITA sample, the main background process consists of semileptonic B decays to charm, where the signalcandidate kaons originate from charmed-meson decays.This process is approximately 47% of the total B background in the SR.The other major background processes are hadronic B decays involving charmed mesons and other hadronic B decays, contributing about 38% and 14% to the total B background in the SR, respectively.The remaining sources of background are B + → τ + ν τ decays and B → K * ν ν decays.
In the HTA sample, semileptonic B decays represent the majority of the B background events, accounting for approximately 62% of the total background.The second most abundant contribution comes from hadronic B decays with final states including a charmed meson accompanied by multiple pions, representing about 20% of the total background.The remaining contributions are from other hadronic modes.
The lower-particle-multiplicity events involving the direct decay of a B meson into a D meson contribute more than those containing D * resonances.The decays involving higher excitations of D mesons (D * * modes), which are less well known, correspond to approximately 4% of the total B background for ITA and 6% for HTA, and are modeled according to their PYTHIA8 [20] simulation.In the following, the modeling of the main background categories and of specific background decays requiring special treatment is presented.

Modeling of D-meson decays involving a K + meson
The dominant background contributions in which the signal-candidate K + originates from D 0 and D + decays are suppressed using several variables that exploit characteristic features of these decays, such as displaced decay vertex and invariant-mass information, as discussed in Sec.VI.The modeling of this background is checked by comparing the distributions of these variables in data and simulation at various selection stages, and good agreement is observed.An example is presented in Fig. 9, which shows the invariant mass distribution of the signalkaon candidate paired with a charged particle from the ROE after the BDT 1 selection.The distinctive shape in data, including the peak from the two-body D 0 → K − π + decay, is well reproduced by the simulation.
Uncertainties related to the knowledge of the semileptonic B-decay branching fractions are included explicitly, as discussed in Sec.XI.Uncertainties due to the decay form factors are studied using the eFFORT computer program [45] and found to be negligible.FIG. 9. Distribution of invariant mass for the candidate K + plus a charged particle from the ROE, which is reconstructed under the most probable mass hypothesis among a pion, kaon, proton, electron or muon based on the PID information in data (points with error bars) and the simulation (filled histogram).The samples are shown after the BDT1 > 0.9 selection in the ITA.The data-to-simulation ratio is shown in the bottom panel.

Modeling of D-meson decays involving a K 0 L meson
Backgrounds from prompt production of K + mesons in B decays are important in the highest-sensitivity region.The branching fractions of B 0 → K + D ( * )− and B + → K + D ( * )0 decays are relevant due to the sizable and poorly known fraction of D-meson decays involving K 0 L mesons.The branching fraction of decays which involve B → D → K 0 L transitions is studied using independent control samples based on alternative particleidentification requirements.A pion-enriched control sample is used to determine corrections, while samples with the signal track identified as an electron or a muon are used to validate them.
The pion-enriched sample presents an overall excess of the data over expectations for both ITA and HTA.For the ITA, the excess is studied as a function of q 2 rec and found to appear at the D 0 threshold and above (see Fig. 10

left). If attributed to D-meson decays involving K 0
L , the excess is consistent with a (30 ± 2)% increase in rate compared to the expectation from simulation.This is determined in a three-parameter fit to the binned q 2 rec distribution for η(BDT 2 ) > 0.92 where the fit parameters are the fractions of summed continuum, summed charged and neutral B-meson decays with D-meson decays involving K 0 L , and summed charged and neutral B-meson decays without D-meson decays involving K 0 L mesons (see Appendix B for more details).In background simulation the branching fraction for D 0 → (K 0 /K 0 )X is 40% and for D + → (K 0 /K 0 )X is 58%.When these branching fractions are scaled by 1.30, the resulting branching fraction of 52% for D 0 is compatible with the known value of (47±4)% [14]; the value for D + , 75%, is above the known value of (61 ± 5)% [14].The distribution of q 2 rec in simulation for B-meson decays with subsequent D + → K 0 L X and D 0 → K 0 L X decays in the pion-enriched ITA control sample is shown in the Supplemental Material [36].
An excess at q 2 rec above the charm-production threshold is also evident in the samples in which the signal track is identified to be a muon or an electron.It is covered by (35 ± 1)% and (38 ± 1)% increases in the rate of charm decays involving K 0 L in the respective samples.Consequently, a correction of +30% is applied to the branching fraction of events containing D → K 0 L X in the simulated background sample, in both ITA and HTA.The correction is based on the excess size determined for the pion-enriched sample, as the rate of pionto-kaon misidentification is significantly larger than that of lepton-to-kaon misidentification.Due to the discrepancy in the correction factors between the different samples, a systematic uncertainty of 10% is assigned; i.e., the correction is (+30 ± 10)%.
Figure 11 shows the η(BDT 2 ) distribution for the pionenriched sample, after all corrections are applied, including the scaling of the branching fraction of D-meson decays involving K 0 L mesons.The resulting expectations are consistent with the data.The q 2 rec distribution for the sample is also discussed in Sec.XIII.The q 2 rec distribution for the sample in which the signal track is identified as a lepton is shown in Fig. 12.

Modeling of B
Another important class of background is charmless hadronic B decays with K 0 L mesons or neutrons in the final state, since these neutral particles can mimic the signal signature.The contributions from B + → K + K 0 K 0 , B + → K + nn, and B → K * ν ν decays are estimated separately, as described in Sec.II.
The modeling of the Details of the reconstruction are given in Appendix C. The sPlot method [46] is used to determine the invariant-mass distributions for the K 0 S K 0 S and K + K − pairs.The result for the B + → K + K 0 S K 0 S decay is illustrated in Fig. 13.Data and simulation show good shape and normalization agreement, validating the S ϕ and nonresonant p-wave B + → K + (K 0 L K 0 S ) P contributions, as described in Sec.II.This model is validated by reconstructing the isospin-related decay B 0 → K 0 S K + K − in data.In addition to this decay proceeds via scalar resonances and a nonresonant s-wave amplitude.The B + → K + K 0 S K 0 S decays in data are used to model the latter two contributions only, as this decay lacks a p-wave component due to Bose-Einstein statistics of the K 0 S K 0 S pair. Figure 14 shows a comparison between the observed K 0 S K + K − invariant mass and a sum of (1) the B + → K + K 0 S K 0 S spectrum obtained in data and corrected for efficiency and the ratio of the B + and B 0 lifetimes, (2) simulated B 0 → K 0 S ϕ contributions, and (3) simulated B 0 → K 0 S (K + K − ) P contributions.Satisfactory agreement is observed both in shape and normalization.
The B + → K + nn background constitutes 0.4% of the total B background in the signal region and 1.0% in the most sensitive region for the ITA.This contribution is significant because of the threshold enhancement used in the model: these contributions would be only 0.2% and 0.3% if the decay proceeded according to phase space.
Contaminations from B + → K * + ν ν and B 0 → K * 0 ν ν decays are also included in the background model according to the SM prediction [4].Their expected yield is approximately 5 times smaller than the expected signal yield in the entire SR and 10 times smaller in the most sensitive region.
The long-distance contribution of B + → τ + (→ K + ν)ν decay is included as part of the background model (see Sec. II).Compared to the signal, which by construction has a selection efficiency of 8.0% in the SR for the ITA, this background has a higher selection efficiency of 9.7%.This higher efficiency is due to a q 2 distribution that peaks at a lower value than the signal.However, due to the small branching fraction, the expected yield is approximately 6 times smaller than the expected signal yield in the most sensitive region.

C. Validation of background estimation
The modeling of the ITA BDT distributions of background events is tested using events outside the SR with η(BDT 2 ) in the interval 0.75 to 0.90.
For the HTA, the background normalization and BDTh input and output distribution are checked in two control samples: one in which the B tag and the signal kaon have the same charge and another one in which the requirement on the PID criteria on the signal-side kaon is reversed.
In both analyses, the distributions obtained in data and simulation agree.The normalization of the background contributions also agrees with the expectation.

X. SIGNAL YIELD DETERMINATION
The signal yields are estimated via binned maximum likelihood fits to data event counts in the bins of the SRs defined in Sec.VII.The ITA fit is a simultaneous fit to on-and off-resonance data samples; the HTA fit is to on-resonance data only.Templates are used to approximate the distributions, in the relevant observables, of each class of events.The likelihood function is constructed as a product of Poisson probability-density func-tions that combine the information from the SR bins.The systematic uncertainties are included in the likelihood as nuisance parameters, which are approximated as additive or multiplicative modifiers of the relevant yields and constrained to the available auxiliary information using Gaussian likelihoods.The parameter of interest is µ, the signal branching fraction relative to its SM expectation (signal strength).The SM expectation for the signal branching fraction used as a reference is 4.97 × 10 −6 , based on Ref. [4] and excluding the contribution from τ decays.The statistical analysis is performed with the pyhf computer program [47,48], and the results are checked using a dedicated sghf computer program [13], which is also used for fits to control samples.

XI. SYSTEMATIC UNCERTAINTIES
A number of possible sources of systematic uncertainty are considered and summarized in Table I for the ITA and Table II for the HTA.
For the ITA, the yields of the seven individual background categories are allowed to vary independently in the fit.In each case, a Gaussian constraint is added to the fit, centered at the expectation based on (corrected) simulation and with standard deviation corresponding to 50% of the central value.The 50% value is motivated by a global normalization difference between the off-resonance data and continuum simulation, as mentioned in Sec.IX A. For the charged-B-background yield, which has the largest correlation with the signal strength µ, the postfit uncertainty is reduced to about half the as- TABLE I. Sources of systematic uncertainty in the ITA, corresponding correction factors (if any), their treatment in the fit, their size, and their impact on the uncertainty of the signal strength µ.The uncertainty type can be "Global", corresponding to a global normalization factor common to all SR bins, or "Shape", corresponding to a bin-dependent uncertainty.Each source is described by one or more nuisance parameters (see the text for more details).The impact on the signal strength uncertainty σµ is estimated by excluding the source from the minimization and subtracting in quadrature the resulting uncertainty from the uncertainty of the nominal fit.signed prefit uncertainty.The data also significantly constrain the cc-background yield, reducing the postfit uncertainty to approximately half of the prefit uncertainty.

Source
The remaining systematic uncertainties may also influence the shape of the templates.Each source is described by several nuisance parameters.Several sources are used to cover background-modeling uncertainties.The branching fractions of decay modes contributing about 80% of B + decays and 70% of B 0 decays in the SR are allowed to vary according to their known uncertainties [14].These variations are then propagated to the SR bins, and their effects, along with correlations, are incorporated into a covariance matrix.This matrix is subsequently factorized into a canonical form using eigendecomposition and represented using six nuisance parameters.The uncertainty on the branching fraction of the B + → K + K 0 L K 0 L decay is estimated to be 20% to account for potential branching fraction differences between decays.The uncertainty on the branching fraction of the B + → K + K 0 S K 0 L decay is estimated to be 30%.This accounts for possible isospin-breaking effects (20%) and uncertainties in the p-wave nonresonant contribution (20%).The uncertainties on the branching fractions of B → D * * decays, which are poorly known, are assigned to be 50%.Uncertainties in the modeling of baryonic decays involving neutrons are covered by the 100% uncertainty on the B + → K + nn branching fraction.The fraction of D-meson decays involving K 0 L mesons is corrected by 30% with a 10% absolute uncertainty, motivated by the differences in the scaling factors determined using different samples, as discussed in Sec.IX B 2. All Distribution of the invariant mass of the K + K − pair from B 0 → K 0 S K + K − decays in background-subtracted data (points with error bars) and the sum of the simulated B 0 → K 0 S ϕ(→ K + K − ) decay (purple-filled histogram), the s-wave contribution estimated using B + → K + K 0 S K 0 S decays in data (blue-filled histogram) and the simulated p-wave nonresonant component (red-filled histogram).The distribution obtained using B + → K + K 0 S K 0 S decays in data is corrected for efficiency and the ratio of the B + and B 0 lifetimes.The simulated distributions are normalized to the number of BB events.The pull distribution is shown in the bottom panel.
of these uncertainties are propagated as correlated shape uncertainties.
Global normalization uncertainties on the luminosity measurement (1% assumed) and the number of BB pairs (1.5%) are treated with one nuisance parameter each.In addition, a 5% uncertainty is introduced on the difference in normalization between on-and off-resonance data samples.
The following five sources represent uncertainties in detector modeling; they are discussed in detail in Sec.V.The sources are track-finding efficiency, kaonidentification efficiency, modeling of energy for photons and hadrons, and K 0 L reconstruction efficiency.The final three sources account for signal-modeling uncertainties.These are signal form factors, which are based on Ref. [4], and global signal-selection efficiency uncertainties as determined in Sec.VIII.
The systematic uncertainty due to the limited size of simulated samples is taken into account by one nuisance parameter per bin per category (156 parameters).
To account for all the systematic sources described above, a total of 193 nuisance parameters, along with the signal strength µ, are varied in the fit.
The largest impact on the uncertainty of the signal strength µ arises from the knowledge of the normalization of the background from charged B decays.Other important sources are the simulated-sample size, branching fraction for B + → K + K 0 L K 0 L decays, branching fraction for B → D * * decays, reconstructed energy of hadrons, branching fractions of the leading B decays, and K 0 L reconstruction efficiency.These sources of uncertainty allow for substantial changes in the BB shape.The shape variations are larger than the data-simulation residuals in η(BDT 2 ) in the pion-enriched sample (Fig. 11).This suggests that uncertainties in the BB shape are adequately covered by the existing systematic contributions.
The summary of systematic uncertainties for the HTA is provided in Table II.Three background components are considered in the HTA: BB, accounting for both charged and neutral B decays; cc; and light-quark continuum (uū, d d, ss).The contribution from τ -pair decays is negligible.The primary contribution to the systematic uncertainty arises from the determination of the normalization of the BB background.This determination is based on the comparison of data-to-simulation normalization in the pion-enriched control sample, which shows agreement within the 30% statistical uncertainty.The other important sources are the uncertainty associated with the bin-by-bin correction of the extra-photoncandidate multiplicity, and the uncertainty due to the limited size of the simulated sample.The uncertainty on continuum normalization (50%), determined using off-resonance data, is the fourth most important contribution.The limited size of the HTA sample prevents the substantial reduction of postfit uncertainties seen in the ITA, compared to prefit values, for the background normalization.The other sources of systematic uncertainty are the same in both analyses, except for those related to photon and hadronic-energy corrections, not applied in the HTA, and the p-wave contribution from B + → K + K 0 S K 0 L , whose contribution is negligible.For both analyses, nuisance-parameter results are investigated in detail.No significant shift is observed for the background yields from charged and neutral B-meson decays.For the ITA, the shifts in the continuum background yields are consistent with the difference observed in the normalization of the continuum simulation with respect to the off-resonance data.

XII. RESULTS
The data in the off-resonance sample and in the SR of the ITA are shown in Fig. 15, with fit results overlaid.Good visual agreement between data and fit is observed in both samples.An excess over background is observed in the SR, consistent with the presence of B + → K + ν ν signal.The observed signal purity, in terms of the fraction of signal events, is 5% in the SR and 19% in the three bins with η(BDT 2 ) > 0.98.The signal strength is determined to be µ = 5.4 ± 1.0(stat) ± 1.1(syst) = 5.4 ± 1.5, where the statistical uncertainty is estimated using simplified simulated experiments based on Poisson statistics.The total uncertainty is obtained using a profile likelihood ratio, fitting the model with fixed values of µ around the best-fit value while keeping the other fit parameters free; see Fig. 16.The systematic uncertainty is calculated by subtracting the statistical uncertainty in quadrature from the total uncertainty.An additional 8% theoretical uncertainty, arising from the knowledge of the branching fraction is not included.Compatibility between the data and fit result is assessed using simplified experiments, and a p-value of 47% is found.(The test is based on the fraction of simplified experiments with the negative profile log-likelihood ratio of the nominal to the "saturated" model, in which the predictions are set to the observations, above the one observed in data.)Figures 17 and 18 present distributions of several variables for the events within the signal region.Distributions of q 2 rec with more differential background information are included in the Supplemental Material [36].Each simulated event is weighted using the ratio of post-to-prefit yields for the corresponding SR bin and event category.Good overall agreement is observed.However, certain discrepancies are evident in the q 2 rec distribution, showing a deficit in data-to-predictions for q 2 rec < 3 GeV 2 /c 4 and an excess for 3 GeV 2 /c 4 < q 2 rec < 5 GeV 2 /c 4 .A comparison of data and fit results for the HTA is shown in Fig. 19.The compatibility between the data and fit results is determined to be 61%.The HTA observes a signal strength of µ = 2.2 +1.8 −2.0 , lower than the ITA result.In the whole SR, a signal purity of 7% is measured, which increases to 20% in the three bins with η(BDTh) > 0.7, with the main background contribution from BB decays.Figure 20 shows distributions of several variables for the events within the signal region.Good agreement is observed.Limit setting for HTA is included in the Supplemental Material [36].
The significance of the observation is determined by evaluating the profile likelihood L for several µ values.The square root of the difference between the −2 log L values at µ = 0 and the minimum is used to estimate the significance of the observed excess with respect to the background-only hypothesis, which yields 3.5 standard deviations for the ITA.For the HTA, the observed signal is consistent with the background-only hypothesis at 1.1 standard deviations.Similarly, the square root of the difference between the −2 log L values at µ = 1 and at the minimum is used to estimate the significance of the observed signal with respect to the SM expectation.For the ITA, it is found to be 2.9 standard deviations, indicating a potential deviation from the SM.For the HTA, the result is in agreement with the SM at 0.6 standard deviations.
Events from the SR of the HTA represent only 2% of the corresponding events in the ITA; their removal does FIG.15.Observed yields and fit results in bins of the η(BDT2) × q 2 rec space obtained by the ITA simultaneous fit to the off-and on-resonance data, corresponding to an integrated luminosity of 42 and 362 fb −1 , respectively.The yields are shown individually for the B + → K + ν ν signal, neutral and charged B-meson decays and the sum of the five continuum categories.The yields are obtained in bins of the η(BDT2) × q 2 rec space.The pull distributions are shown in the bottom panel.
not alter the ITA result significantly.The ITA sample with removed overlapping events is used for the compatibility checks.The ITA and HTA measurements agree, with a difference in signal strength of 1.2 standard deviations.

XIII. CONSISTENCY CHECKS
Several checks are performed to test the validity of the analysis.
Simulation and data events are divided into approximately same-size statistically independent samples (split samples) according to various criteria: data-taking period; missing-momentum direction; momentum of the rest-of-event particles; number of photons, charged particles, and lepton candidates in the event; kaon direction; kaon charge; and total charge of the reconstructed particles in the event.Fits are performed for each split sample, and the results are presented in Fig. 21.Good compatibility is observed between the split samples for the HTA.A tension at 2.4 standard deviations is observed for the total charge split sample in the ITA.Several studies are conducted to investigate this tension, but they did not reveal any significant systematic effects.The total χ 2 value per degrees of freedom for all tests in the ITA is 12.5/9.
An important test involves the subdivision based on the number of leptons in the ITA.Since there are no leptons on the signal side, this test compares events in which a (semi)leptonic B decay occurs in the ROE with those in which a hadronic B decay occurs.The separation is confirmed by inspecting simulated events.Excellent agreement is observed between the results in the two split samples.This demonstrates the robustness of the ITA procedure with respect to a particular signature in the ROE.
For each common split sample, a comparison is also performed between the ITA and the HTA, showing compatibility between 1 and 2 standard deviations.
An ITA fit fixing the normalization of the B background to the expectation and the normalization of the continuum to the yield observed in off-resonance data yields a reduction of the uncertainty on µ by 25% with a downward change in µ that is consistent with zero at 1.5 standard deviations.Performing a fit where the 50% constraints on the normalizations of all background sources are released leads to a minimal change of µ by 0.1, with the uncertainty on µ increasing by only 5%.Another fit in which the leading systematic uncertainties are fixed also gives a consistent result.A fit to the 12 bins of Υ (4S) data only, i.e., excluding the off-resonance data, changes µ by less than 0.1, while the uncertainty increases by 2%.Similarly, a fit restricted to the 18 bins with η(BDT 2 ) > 0.94 yields a change in µ of less than 0.1, while the uncertainty increases by 3%.Additional fits are conducted to study the stability of the result with respect to q 2 rec .In these fits, the B background normalization is fixed to its expected value due to increased uncertainties, and the normalization of the continuum is set based on the yield observed in off-resonance data.The fits are separately performed for the low q 2 rec < 4 GeV 2 /c 4 and high q 2 rec > 4 GeV 2 /c 4 SR bins.The results from these fits are consistent within 1.4 standard deviations.
The ITA method is further validated by performing a branching fraction measurement of the B + → π + K 0 decay.This decay is reconstructed by measuring the recoil of the π + , while the K 0 is not directly detected.In this case, the B + → π + K 0 channel exhibits a signature similar to B + → K + ν ν, with comparable selection efficiency and purity.The known branching fraction, measured using K 0 S in the final state, is (2.34±0.08)×10−5 [14].With respect to the nominal B + → K + ν ν analysis, the following modifications are implemented for this validation: (i) positive pion identification is used instead of kaon identification; (ii) a bin boundary of the SR in q 2 rec is changed from 4 GeV 2 /c 4 to 2 GeV 2 /c 4 to increase sensitivity; (iii) the fit model uses only three sources of background (continuum, neutral B decays, charged B decays excluding B + → π + K 0 ), and the signal B + → π + K 0 decays; (iv) systematic uncertainties are restricted to those originating from limited sizes of the simulated samples and global normalization uncertainties; (v) the fit is restricted to the data sample collected at the Υ (4S) resonance.
Based on the simulation, 80% of K 0 within the SR are K 0 L while the remaining 20% are K 0 S .The B + → π + K 0 SR corresponds to a signal-selection efficiency of 4.4% with 0.9% purity, which can be compared to the 8% and 0.9% values for the B + → K + ν ν, respectively.However, the yield is almost 3 times higher, providing a sensitive test of the SR modeling.The fit quality is good, with a p-value of 83%.The branching fraction of the B + → π + K 0 decay is found to be (2.5 ± 0.5) × 10 −5 , consistent with the known value.The distribution of q 2 rec with the background and signal components normalized using the fit result is shown in Fig. 22.The distribution of q 2 rec for events with η(BDT 2 ) > 0.98 is shown in Supplemental Material [36].

XIV. COMBINATION
The consistency of the two analyses and the small size of the overlap between the HTA and ITA samples allows the combination of the results, which achieves a 10% increase in precision over the ITA result alone.This is done through a profile likelihood fit, incorporating correlations between common systematic uncertainties.In order to eliminate statistical correlation, common data events are excluded from the ITA dataset prior to the combination.Nuisance parameters corresponding to the number of BB events, signal form factors, and branch- Candidates/(0.2

ing fractions for processes
L X, and other leading B-meson decays are treated as fully correlated.To capture full correlations for the systematic uncertainties related to the branching fractions of leading B-meson decays between the ITA and HTA, eigendecomposition of the shared covariance matrix between ITA and HTA is performed and represented using ten nuisance parameters.
Conversely, other sources are considered uncorrelated due to their analysis-specific nature, distinct evaluation methods, or minor impact, such as PID uncertainties.
In order to ensure robustness, various scenarios are studied, including variations in which sources, such as global background normalization, are assumed to be fully correlated between the two analyses.These tests yield no substantial deviation from the default combination.

XV. DISCUSSION
The measured branching fraction is compared with previous measurements in Fig. 23.The comparison is performed using branching fractions from prior measurements to assess both compatibility and relative accuracy.For BABAR, the branching fractions are taken as given in Refs.[10,11].Since Belle did not report branching fractions in Refs.[9,12] they are computed for this com-parison based on the quoted observed number of events and efficiency taking into account statistical and systematic uncertainties.Note that BABAR uses a different value of f +− = 0.5 compared to the one adopted here.However, due to the large statistical uncertainties, minor differences in the correction factors have a small impact on the comparison of the results.
The ITA result is in agreement with the previous measurements obtained using hadronic and inclusive tagging methods.There are tensions of 2.3 and 1.8 standard deviations with the results obtained using semileptonic tagging by the BABAR [11] and Belle [12] Collaborations, respectively.The HTA result is in agreement with all measurements.The precision of the ITA measurement is comparable with the previous best results, despite being obtained with a smaller data sample.The precision of the HTA result exceeds that achieved by previous analyses using hadronic tagging.The combined Belle II result has comparable accuracy to the best single measurement, reported by Belle using semileptonic tags.
A simplified weighted average of the five independent measurements, obtained using symmetrized uncertainties (see Fig. 23), yields a branching fraction of (1.3 ± 0.4) × 10 −5 , and the corresponding χ 2 per degree of freedom is found to be 5.6/5, corresponding to a pvalue of 35%.
The analysis was initially performed in a manner designed to reduce experimenter's bias.The full analysis procedure was developed and finalized before determining the branching fraction from data.However, several checks and corrections were applied after the result was obtained.The original measurement was initially limited to the ITA and optimized through simulation using a partial data set of 189 fb −1 .In spring 2022, a fit to the data revealed a significant deviation from the expectations of the SM.To validate the findings, the ITA was repeated using a larger data sample while maintaining the selection criteria employed in the original measurement.As an additional consistency check, the HTA was introduced.The new analyses underwent rigorous consistency checks before the signal strength was once again unveiled in spring 2023.The ITA and HTA results were found to be in agreement, confirming the results of the original 2022 analysis.Further comprehensive checks were conducted in PID sidebands, leading to changes in background modeling and an increase in systematic uncertainties.
The postunveiling changes in the ITA are corrections to the K 0 L reconstruction efficiency in the ECL and its uncertainty, motivated by the observed excess in the pion-enriched sample (Sec.V C); correction to the rate of D-meson decays involving K 0 L and its uncertainty (Sec.IX B 2); and corrections to the B + → K + K 0 K 0 decay modeling and corresponding uncertainty (Sec.IX B 3).In addition, the treatment of the reconstructed hadronic energy in the ECL was adjusted.Instead of keeping the scale at the nominal value, it is now adjusted to the preferred value while keeping the   Pull FIG.20.Distribution of q 2 , computed using Btag kinematics, Eextra, E * miss + cp * miss , and n tracks extra for events in the signal region of the HTA.These distributions are obtained for B + → K + ν ν candidates reconstructed in data (points with error bars), simulation (filled histograms) of B + → K + ν ν signal, BB decays, cc continuum, and light-quark continuum backgrounds, normalized according to the fit yields in the HTA.The pull distributions are shown in the bottom panels.
100% uncertainty (Sec.V C).These modifications lead to a shift of the signal strength µ in the ITA of about −0.5.A mistake was found in the treatment of the B + → τ + (→ K + ν)ν background which was accidentally removed from the simulation.The mistake was corrected yielding a −0.15 change in µ.Given updates of the input variables, a new training of the BDT 2 was performed that led to an additional −0.5 change in µ with a statistical uncertainty of 0.6 estimated using simulated experiments.The modifications lead to an increase of the total uncertainty by 10%, driven by the uncertainty on the B + → K + K 0 L K 0 L branching fraction.The HTA is based on a standard FEI data selection that is widely used within Belle II.During the review of another Belle II analysis [49], it was concluded that it is necessary to remove selection criteria on the total energy in the ECL that is poorly modeled in simulation.The selection was removed and the BDTh was then retrained on new selected samples.This change resulted in a change of signal strength of −2.6.Additional HTA changes include systematic uncertainty due to the K 0 L reconstruction efficiency in the ECL (Sec.V C); correction to the rate of D-meson decays involving K 0 L and its uncertainty (Sec.IX B 2); corrections to the B + → K + K 0 L K 0 L decay modeling and corresponding uncertainty (Sec.IX B 3). Dedicated studies were performed targeting the E extra variable that is correlated with the total energy in the ECL, as described in Sec.V C, resulting in a data-driven correction and additional systematic uncertainty.These changes resulted in a change in the signal strength of −1.1 with a statistical uncertainty of 1.2, estimated using simulated experiments, which accounts for both data and simulated samples.The previously underestimated contributions from B + → K + K 0 L K 0 L and D → K 0 L X background reduce the signal strength by −0.6.Taking this reduction and the estimate of the statistical uncertainty into account, the significance of the change in µ is 1.9 standard deviations.The total uncertainty for the HTA is reduced by about 20%.The increase in the systematic uncertainty, also observed in ITA, is compensated by an increase in the data-sample size due to changes in the  analysis, with the PID requirement for the signal candidate suitably adjusted.The simulated samples are corrected for known data-to-simulation discrepancies for the pion identification and kaon-to-pion as well as lepton-topion fake rates.The modeling of the continuum background is improved using BDT c , which is taken from the nominal analysis without retraining.The normalization of the continuum background is determined using the off-resonance data sample and is found to be 1.30 ± 0.03 compared to the expectations from simulation.Simulated B-meson samples are divided into two subsamples based on the presence of a D meson in the decay chain associated with the signal pion, where the D meson's decay products include a K 0 L meson.They are normalized using the number of the BB pairs collected by Belle II.
Figure 10 shows the q 2 rec distribution in data and simulation.A discrepancy in the data-to-simulation ratio is observed when the normalization factors described above are used, as can be seen from the left panel of the figure.The q 2 rec variable corresponds to the invariant mass squared of the system recoiling against the candidate pion in the B decay.The data-to-simulation ratio is below unity below the threshold of the D-meson mass and above unity beyond the threshold.For q 2 rec < 0, which is dominated by continuum background, the data are below the simulation, suggesting that the scale factor determined based on the off-resonance sample is overestimated.A binned sample-composition fit is performed to this q 2 rec distribution to determine the normalization of the three components.In this fit, the sample of B-meson decays containing D-mesons with K 0 L in the decay chain is allowed to vary without constraints, while the complementary B-meson decay sample is constrained by the uncertainty associated with the number of BB pairs.The uncertainty on the continuum normalization is taken to be 50%, as in the nominal fit.The result of this fit is shown in the right panel of the figure, and shows a much improved data-to-simulation ratio.The shift in the normalization factor for the sample including D → K 0 L X decays is found to be (+30 ± 2)%: this shift is used in the nominal analysis.
The normalization factor for the sample without D mesons involving K 0 L is consistent with the expectations to within 1.2 standard deviations, while the normalization for the continuum background is reduced by (19 ± 2)% and is therefore closer to the expectation from the simulation.The difference in the continuum normalization based on the off-and on-resonance data from this fit motivates the systematic uncertainty on the off-resonance sample normalization applied in the nominal analysis.
Appendix C: RECONSTRUCTION OF B + → K + K 0 S K 0 S AND B 0 → K 0 S K + K − CONTROL DECAY CHANNELS Reconstruction of B + → K + K 0 S K 0 S and B 0 → K 0 S K + K − decay channels utilizes the K + selection adapted for the main analysis.The K 0 S candidates are reconstructed using the K 0 S → π + π − decay mode, and a multivariate classifier is employed to enhance sample purity.The signal region is defined by requiring M bc > 5.27 GeV/c 2 and |∆E| < 0.2 GeV.The average multiplicity of candidates is about 1.5 for this selection.In cases where multiple candidates are found, a single candidate per event is randomly selected.For the less pure B + → K + K 0 S K 0 S decay, the input variables and classifier parameters are set to the same values as those used for BDT 2 in the nominal measurement.For the B 0 → K 0 S K + K − decay, BDT 1 variables and parameters are utilized.Less stringent criteria on the output of the classifiers are applied and checked using simulated signal samples to ensure approximately constant efficiency as a function of M (K 0 S K 0 S ) and M (K + K − ) for each decay, respectively.
The signal is extracted by an unbinned maximum likelihood fit to the ∆E distribution.For the B + → K + K 0 S K 0 S decay, a Gaussian distribution is used to model the signal.For the B 0 → K 0 S K + K − decay, the signal is modeled by a sum of two Crystal Ball functions [51] plus a sum of two Gaussian distributions with the shape of the model fixed by the simulation and the total width allowed to vary in the fit to data.The background is modeled by a second-order Chebyshev polynomial in both cases.

FIG. 1 .
FIG. 1. Lowest-order quark-level diagrams for the B + → K + ν ν decay in the SM are either of the penguin (a), or box type (b): examples are shown.The long-distance doublecharged-current diagram (c) arising at tree level in the SM also contributes to the B + → K + ν ν decay.

1 CorrectedFIG. 3 .
FIG.3.Distribution of the summed energy of the photon candidates obtained in collision data (points with error bars), uncorrected simulated data (open histogram), and corrected simulated data (filled histogram), for events in which a B + → K + J/ψ decay is reconstructed.The correction corresponds to a variation of the hadronic energy by −10%.The simulation is normalized to the number of events in data.The ratio shown in the lower panel refers to data over corrected simulation.

FIG. 4 .
FIG. 4. Distributions of the number of extra photon candidates in the HTA after the selection described in Sec.IV in data (points with error bars) and simulation (filled histograms) for the opposite-charge pion-enriched control sample, on the left before the photon multiplicity correction and on the right after the correction.The yields are shown individually for the three background categories (BB decays, cc continuum, and light-quark continuum).The data-to-simulation ratios are shown in the bottom panels.

FIG. 5 .
FIG.5.Efficiency of reconstructing an energy deposit in the ECL matched to the K 0 L direction, as a function of the K 0 L energy, for e + e − → γϕ data and simulation.The energy deposits are selected following the ITA criteria.

FIG. 6 .
FIG.6.Signal-selection efficiency as a function of the dineutrino invariant mass squared q 2 for simulated events in the SR for the ITA (left) and HTA (right).The error bars indicate the statistical uncertainty.

FIG. 8 .
FIG. 8. Distribution of q 2rec for the off-resonance data (points with error bars) and continuum background simulation (filled histograms) in the SR for the ITA.The simulation is normalized to the number of events in the data.The distribution of the difference between data and simulation divided by the combined uncertainty (pull) is shown in the bottom panel.

Figure 8
Figure8shows a comparison of the q 2 rec distribution in data and corrected simulation for the ITA off-resonance sample.While the shapes of the distributions are similar, there is a normalization excess of the data over the simulation of (40 ± 5)%, which is included as a systematic uncertainty (see Sec. XI).A possible source of the discrepancy is a mismodeling of kaon fragmentation in the PYTHIA8 version used in Belle II.Illustration of improvement of the ITA distributions with BDT c -based reweighting is shown in the Supplemental Material[36].For the HTA, the relative normalization between offresonance data and continuum simulation is 0.82 ± 0.01 before the BDTh selection.This factor accounts for mismodeling effects on the FEI performance for continuum events and is used to scale the expected continuum contamination.The relative normalization in the BDTh signal region is consistent with unity with 50% uncertainty, which is included as a systematic uncertainty (see Sec. XI).

FIG. 10 .FIG. 11 .
FIG. 10.Distribution of q 2rec in data (points with error bars) and simulation (filled histograms) divided into three groups (B-meson decays with and without subsequent D → K 0 L X decays, and the sum of the five continuum categories) for the pionenriched sample in the ITA.The left (right) panel shows pre(post)fit distributions.The data-to-simulation ratios are shown in the bottom panels.

9 FIG. 12 .
FIG.12.Distribution of q 2 rec in data (points with error bars) and simulation (filled histograms) divided into three groups (B-meson decays with and without subsequent D → K 0 L X decays, and the sum of the five continuum categories), for the electron-(left) and muon-enriched (right) PID control samples with η(BDT2) > 0.92 in the ITA.The pull distributions are shown in the bottom panels.

FIG. 13 .
FIG. 13.Distribution of invariant K 0S K 0 S mass in backgroundsubtracted data (points with error bars) and signal simulation (filled histogram) for B + → K + K 0 S K 0 S candidates.The simulated distribution is normalized to the number of BB events.The pull distribution is shown in the bottom panel.

FIG. 16 .
FIG.16.Twice the negative profile log-likelihood ratio as a function of the signal strength µ for the ITA, HTA, and the combined result.The value for each scan point is determined by fitting the data, where all parameters but µ are varied.

FIG. 17 .
FIG. 17. Distributions of η(BDT2), q2  rec , beam-constrained mass of the ROE M bc,ROE , ∆EROE, Fox-Wolfram R2, and modified Fox-Wolfram H so m,2 in data (points with error bars) and simulation (filled histograms) shown individually for the B + → K + ν ν signal, neutral and charged B-meson decays, and the sum of the five continuum categories in the ITA.Events in the full signal region, with η(BDT2) > 0.92, are shown.Simulated samples are normalized according to the fit yields in the ITA.The pull distributions are shown in the bottom panels.

FIG. 18 .
FIG.18.Distributions of η(BDT2), q2  rec , beam-constrained mass of the ROE M bc,ROE , ∆EROE, Fox-Wolfram R2, and modified Fox-Wolfram H so m,2 in data (points with error bars) and simulation (filled histograms) shown individually for the B + → K + ν ν signal, neutral and charged B-meson decays, and the sum of the five continuum categories in the ITA.Events in the most signal-rich region, with η(BDT2) > 0.98, are shown.Simulated samples are normalized according to the fit yields in the ITA.The pull distributions are shown in the bottom panels.

FIG. 19 .
FIG.19.Observed yields and fit results in bins of η(BDTh) as obtained by the HTA fit, corresponding to an integrated luminosity of 362 fb −1 .The yields are shown for the B + → K + ν ν signal and the three background categories (BB decays, cc continuum, and light-quark continuum).The pull distribution is shown in the bottom panel.

TABLE II .
Sources of systematic uncertainty in the HTA (see caption of TableIfor details).