Measurement of the branching fractions of the $B^+ \to \eta \ell^+ \nu_{\ell} $ and $B^+ \to \eta^{\prime} \ell^+ \nu_{\ell} $ decays with signal-side only reconstruction in the full $q^2$ range

The branching fractions of the decays $B^{+} \to \eta \ell^{+} \nu_{\ell}$ and $B^{+} \to \eta^{\prime} \ell^{+} \nu_{\ell}$ are measured, where $\ell$ is either an electron or a muon, using a data sample of $711\,{\rm fb}^{-1}$ containing $772 \times 10^6 B\bar{B}$ pairs collected at the $\Upsilon(4S)$ resonance with the Belle detector at the KEKB asymmetric-energy $e^+ e^-$ collider. To reduce the dependence of the result on the form factor model, the measurement is performed over the entire $q^2$ range. The resulting branching fractions are ${\cal B}(B^{+} \rightarrow \eta \ell^{+} \nu_{\ell}) = (2.83 \pm 0.55_{\rm (stat.)} \pm 0.34_{\rm (syst.)}) \times 10^{-5}$ and ${\cal B}(B^{+} \rightarrow \eta' \ell^{+} \nu_{\ell}) = (2.79 \pm 1.29_{\rm (stat.)} \pm 0.30_{\rm (syst.)}) \times 10^{-5}$.


I. INTRODUCTION
The transition $b \to u \ell^+ \nu_\ell$, which spans two generations of quarks, has been observed to be strongly suppressed. Understanding these decays is important for resolving the tension in the determination of the CKM matrix element $V_{ub}$, by improving the model of inclusive $B \to X_u \ell^+ \nu_\ell$ decays, and for describing the backgrounds in measurements of other decays. This paper describes the measurements of the branching fractions of the decays $B^+ \to \eta \ell^+ \nu_\ell$ and $B^+ \to \eta^{\prime} \ell^+ \nu_\ell$ [1]. Previous measurements of these decays typically restricted the measured range of the squared momentum transfer, $q^2 = (p_B - p_{\eta^{(\prime)}})^2$, which made it difficult to quantify the uncertainty in the modeling of the decay. This analysis reconstructs these decays without restrictions in $q^2$. Taking advantage of the hermeticity of the detector and the known initial state, only one of the two $B$ mesons produced in the decay of the $\Upsilon(4S)$ is reconstructed. This achieves the statistical power needed for studying such a suppressed set of processes. The modes analyzed in this paper have previously been measured by BaBar [2][3][4], CLEO [5] and another Belle analysis [6] using hadronic tagging.

II. THE BELLE DETECTOR AND ITS DATA SET
The measurement presented here is based on the full data sample of $772 \times 10^6$ $B\bar{B}$ pairs collected with the Belle detector at the KEKB asymmetric-energy $e^+ e^-$ (3.5 on 8 GeV) collider [7] operating at the $\Upsilon(4S)$ resonance.
The Belle detector is a large-solid-angle magnetic spectrometer that consists of a silicon vertex detector (SVD), a 50-layer central drift chamber (CDC), an array of aerogel threshold Cherenkov counters (ACC), a barrel-like arrangement of time-of-flight scintillation counters (TOF), and an electromagnetic calorimeter composed of CsI(Tl) crystals (ECL) located inside a superconducting solenoid coil that provides a 1.5 T magnetic field. An iron flux-return located outside of the coil is instrumented to detect $K^0_L$ mesons and to identify muons (KLM). The detector is described in detail elsewhere [8]. Two inner detector configurations were used. A 2.0 cm radius beampipe and a 3-layer silicon vertex detector were used for the first sample of $152 \times 10^6$ $B\bar{B}$ pairs, while a 1.5 cm radius beampipe, a 4-layer silicon detector and a small-cell inner drift chamber were used to record the remaining $620 \times 10^6$ $B\bar{B}$ pairs [9].
Several sets of Monte Carlo (MC) simulated data are used in this analysis. Decays involving $b \to c$ transitions have been simulated with the equivalent of ten times the integrated luminosity acquired in data, while the transitions $e^+ e^- \to q\bar{q}$ ($q = u, d, s, c$), denoted continuum, are simulated with six times the data luminosity. Additionally, a set containing one $B$ meson decaying via $b \to u \ell^+ \nu_\ell$ with the other decaying via $b \to c$ is simulated with twenty times the integrated luminosity of data. This sample also contains the signal decay. The decays have been simulated using EVTGEN [10] and PYTHIA [11], while the detector response was modeled with GEANT3 [12]. Final-state radiation was added using PHOTOS [13]. The branching fractions for $B \to D^{(*)} \ell^+ \nu_\ell$ decays, as well as for exclusive and inclusive $b \to u \ell^+ \nu_\ell$ decays, have been updated to the most recent measurements [14], except for the semileptonic decays to $\rho$ mesons, which have been set to the values measured in Ref. [15]. The form factors of the semileptonic decays to $D^{*}$ and $D^{**}$ have been updated according to the values reported in Ref. [16] and Ref. [17].

III. EVENT SELECTION AND BACKGROUND SUPPRESSION
All charged particle tracks and neutral clusters used in the analysis are required to satisfy basic quality criteria. Tracks need to originate from the interaction point (IP). Tracks whose distance of closest approach to the IP along (perpendicular to) the beam direction, $|dz|$ ($|dr|$), is greater than 2 cm (0.5 cm) are discarded. Tracks with a transverse momentum less than 275 MeV/c are checked for duplicates. A track is considered a duplicate if the three-momentum difference to another such track is less than 100 MeV/c, and the angle between them is less than $15^\circ$ for equal-charge tracks or greater than $165^\circ$ for opposite-charge tracks. For each such pair, only the track with the lesser value of $(5\,|dr|)^2 + |dz|^2$ is kept.
Photons are accepted within a polar angle, $\theta$, relative to the direction of the positron beam of $17^\circ$ to $150^\circ$. Due to variations in the distribution of beam-related backgrounds, different energy requirements are used depending on the polar angle region. In the central barrel region, from $32^\circ$ to $130^\circ$, photons are required to have an energy above 50 MeV. In the forward region, $\theta < 32^\circ$, the requirement is $E_\gamma > 100$ MeV, while in the backward region, $\theta > 130^\circ$, the requirement is $E_\gamma > 150$ MeV, with the boundaries based on the ECL geometry.
As the total charge of the initial $e^+ e^-$ system is zero, the sum of the charges of all reconstructed particles should be zero as well. However, particles can be misreconstructed or escape detection completely. Therefore a requirement of $|\sum q_{\rm tracks}| < 3e$ is set.
A signal event is required to have only one lepton, which can be either an electron or a muon. Electrons must lie in the same acceptance region as photons, $17^\circ < \theta < 150^\circ$, while muons, for which KLM information is important, are accepted in the range $25^\circ < \theta < 145^\circ$. Both must have a center-of-mass (c.m.) momentum above 1.3 GeV/c, and electrons (muons) must have a lab-frame momentum above 0.4 (0.8) GeV/c. Electrons are identified using a likelihood function that combines the shower shape in the ECL, the light yield in the ACC, the energy loss $dE/dx$ due to ionization in the CDC, the ratio of the energy measured by the ECL to the momentum of the track measured in the CDC, and the quality of the matching of the CDC track to the ECL cluster position [18]. The muon likelihood compares the CDC track with the associated KLM hits, using both the penetration length determined by the CDC and the matching quality of the KLM hits to the trajectory extrapolated out of the CDC [19]. In the momentum region relevant to this analysis, charged leptons are identified with an efficiency of about 90%, and the probability to misidentify a pion as an electron (muon) is 0.25% (1.4%). For electron candidates, bremsstrahlung photons are recovered by searching for photons in a $5^\circ$ cone around the electron. The closest photon is added to the lepton momentum unless it is used in the $\eta$ reconstruction.
The $\eta$ meson is reconstructed in two channels, $\eta \to \gamma\gamma$ and $\eta \to \pi^+ \pi^- \pi^0$. A common background source for photons in the former is $\pi^0$ decays. A veto [20] against such photons is implemented by combining each candidate photon with all other photons in the event; if the combined mass lies between 110 MeV/c$^2$ and 160 MeV/c$^2$, the photon is deemed to have come from a $\pi^0$ decay. Both photons in such a decay are discarded. All remaining photons are combined, and pairs with a mass between 510 MeV/c$^2$ and 580 MeV/c$^2$ are saved as $\eta$ candidates.
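The veto and pairing logic described above can be sketched as follows. This is a minimal illustration rather than the Belle implementation: photons are assumed to be given as `(E, px, py, pz)` tuples in GeV, and the function names are hypothetical.

```python
import math
from itertools import combinations

def pair_mass(g1, g2):
    """Invariant mass in GeV/c^2 of two photons given as (E, px, py, pz) in GeV."""
    e = g1[0] + g2[0]
    px, py, pz = (g1[k] + g2[k] for k in (1, 2, 3))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def eta_to_gg_candidates(photons, pi0_window=(0.110, 0.160), eta_window=(0.510, 0.580)):
    """Return index pairs of photons that survive the pi0 veto and fall in the eta window."""
    pairs = list(combinations(range(len(photons)), 2))
    # Step 1: any photon that forms a pi0-like mass with another photon is vetoed.
    vetoed = set()
    for i, j in pairs:
        if pi0_window[0] < pair_mass(photons[i], photons[j]) < pi0_window[1]:
            vetoed.update((i, j))
    # Step 2: combine the remaining photons and keep pairs in the eta mass window.
    return [
        (i, j)
        for i, j in pairs
        if i not in vetoed and j not in vetoed
        and eta_window[0] < pair_mass(photons[i], photons[j]) < eta_window[1]
    ]
```

In this sketch a photon is vetoed as soon as a single partner gives a $\pi^0$-like mass, which matches the description that both photons of such a pair are discarded.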
For the $\eta \to \pi^+ \pi^- \pi^0$ channel, two pions of opposite charge are combined with one neutral pion built from two accepted photons. Charged pions are tested against the kaon hypothesis using a likelihood combining the energy loss $dE/dx$ from the CDC, the flight time measured by the TOF, and the ACC response, providing an efficiency of 86% and a misidentification probability of 10% [21]. The invariant mass of the combination is required to lie between 540 MeV/c$^2$ and 555 MeV/c$^2$. A vertex fit with $\chi^2/{\rm n.d.f.} < 3$ is required for the $\eta$ candidate.
The $\eta^{\prime}$ meson is reconstructed through the $\eta^{\prime} \to \pi^+ \pi^- \eta$ decay mode, with the $\eta$ candidates decaying into two photons. The combined mass is required to be between 913 MeV/c$^2$ and 996 MeV/c$^2$, with the additional requirement that the mass difference $m_{\eta^{\prime}} - m_{\eta}$ be between 400 MeV/c$^2$ and 420 MeV/c$^2$. Here too a vertex fit with $\chi^2/{\rm n.d.f.} < 3$ is required. The mass windows required for the $\eta$ and the $\eta^{\prime}$, as well as their mass difference, all correspond to a $\pm 3\sigma$ window around the reconstructed value. Reconstruction of the $\eta^{\prime}$ using $\eta \to \pi^+ \pi^- \pi^0$ candidates proved not to be feasible due to increased combinatorial background.
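Background is further suppressed using the angle between the $B$ meson and the combination of the lepton and $\eta^{(\prime)}$, defined as
$$\cos\theta_{B\,\eta^{(\prime)}\ell} = \frac{2 E_B E_{\eta^{(\prime)}\ell} - m_B^2 c^4 - m_{\eta^{(\prime)}\ell}^2 c^4}{2\,|\vec{p}_B|\,|\vec{p}_{\eta^{(\prime)}\ell}|\,c^2}, \qquad (1)$$
where $E$, $|\vec{p}|$ and $m$ denote the c.m.-frame energy, momentum and mass of the $B$ meson and of the combined $\eta^{(\prime)}\ell$ system, as indicated by the subscripts. All events must fulfil the requirement $|\cos\theta_{B\,\eta^{(\prime)}\ell}| < 1$, which ensures that they lie within the physical region. For the remaining $B^+ \to \eta^{(\prime)} \ell^+ \nu_\ell$ candidates, all charged decay products are fitted to a common vertex, and candidates for which the fit fails are discarded. Roughly half of the events contain more than one candidate in a single channel. Only the candidate with the lowest $\chi^2/{\rm n.d.f.}$ from this fit is kept.
FIG. 1: Distribution of $\cos\theta_{B\,\eta^{(\prime)}\ell}$ for the $\eta \to \gamma\gamma$ channel, with all other requirements applied but before applying the BDT. The accepted region lies between the vertical yellow lines. The signal contribution is overlaid with an arbitrary scale factor. The distributions for the other channels look very similar.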
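As an illustration of this angular variable, the sketch below evaluates the conventional expression $\cos\theta = (2 E_B E_Y - m_B^2 - m_Y^2)/(2 |\vec{p}_B| |\vec{p}_Y|)$ in natural units ($c = 1$), with $Y$ the combined $\eta^{(\prime)}\ell$ system and the $B$ energy fixed to half the c.m. energy. The numerical constants and the function name are assumptions made for the example, not values taken from the analysis.

```python
import math

M_B = 5.27934    # GeV, charged B mass (assumed, PDG-like value)
SQRT_S = 10.58   # GeV, c.m. energy at the Upsilon(4S) (assumed)

def cos_theta_by(e_y, p_y, m_b=M_B, sqrt_s=SQRT_S):
    """cos(theta) between the B and the eta(')-lepton system Y in the c.m. frame.

    e_y, p_y: c.m. energy and momentum magnitude of Y in GeV (natural units).
    The B energy is fixed to sqrt(s)/2 because the B is not measured directly.
    """
    e_b = sqrt_s / 2.0
    p_b = math.sqrt(e_b**2 - m_b**2)
    m_y2 = e_y**2 - p_y**2
    return (2.0 * e_b * e_y - m_b**2 - m_y2) / (2.0 * p_b * p_y)

def in_physical_region(e_y, p_y):
    """Selection used in the text: |cos(theta)| < 1."""
    return abs(cos_theta_by(e_y, p_y)) < 1.0
```

For a correctly reconstructed signal decay with a single massless neutrino, the value equals the true cosine of the angle between the $B$ and $Y$ momenta; resolution effects and background can push it outside $[-1, 1]$.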
All final-state particles of the signal decay chain are measured except for the neutrino, which is reconstructed indirectly with a technique [22] that uses the known c.m. state of the event. Assuming all other particles in the event were detected, the difference between the 4-momentum of the initial state and the sum of their 4-momenta corresponds to the neutrino 4-momentum. This difference is called the missing momentum, $p_{\rm miss}$, and is defined as
$$p_{\rm miss} = (E_{\rm miss}, \vec{p}_{\rm miss}) = p_{\Upsilon(4S)} - \sum_{i=1}^{N} (E_i, \vec{p}_i),$$
where $p_{\Upsilon(4S)}$ is the momentum of the initial state of the $\Upsilon(4S)$, i.e. the sum of the two beam momenta, and the energies and momenta of all $N$ particles remaining in the event are summed together. In this summation the track requirements are loosened to $|dr| < 1.5$ cm and $|dz| < 10$ cm. The invariant mass of the neutrino reconstructed in this way should be consistent with zero. Therefore, events with $|m^2_{\rm miss}| > 7$ GeV$^2$/c$^4$ are rejected. The missing mass is defined as
$$m^2_{\rm miss} = E^2_{\rm miss}/c^4 - |\vec{p}_{\rm miss}|^2/c^2.$$
Because the detector resolves momentum better than energy, the neutrino energy used in subsequent calculations is fixed by the massless constraint
$$p_\nu = (|\vec{p}_{\rm miss}|\,c, \vec{p}_{\rm miss}).$$
Combining the inferred neutrino kinematics with the reconstructed $\ell$ and $\eta^{(\prime)}$ yields the $B^\pm$ candidate. Signal yield extraction uses the beam-constrained mass and the energy difference,
$$M_{\rm bc} = \sqrt{E_{\rm beam}^2/c^4 - |\vec{p}_B|^2/c^2}, \qquad \Delta E = E_B - E_{\rm beam}.$$
Here $E_{\rm beam}$ is the energy of one beam in the c.m. system, equivalent to half the c.m. energy, while $\vec{p}_B$ and $E_B$ are the three-momentum and energy of the combined $B$-daughter particles, including the inferred neutrino. At this point, candidates outside the fit region, 5.1 GeV/c$^2$ $< M_{\rm bc} <$ 5.3 GeV/c$^2$ and $-1$ GeV $< \Delta E <$ 1 GeV, are rejected.
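The chain from missing momentum to the fit variables can be sketched as below. This is an illustration under stated assumptions: all four-vectors are `(E, px, py, pz)` tuples in GeV in the c.m. frame, natural units ($c = 1$) are used, the conventional definitions $M_{\rm bc} = \sqrt{E_{\rm beam}^2 - |\vec{p}_B|^2}$ and $\Delta E = E_B - E_{\rm beam}$ are assumed, and the helper names are hypothetical.

```python
import math

def missing_four_momentum(p_initial, particles):
    """Initial-state four-momentum minus the sum over all reconstructed particles."""
    return tuple(p_initial[k] - sum(p[k] for p in particles) for k in range(4))

def missing_mass_squared(p_miss):
    """m_miss^2 = E_miss^2 - |p_miss|^2; may be negative because of resolution."""
    e, px, py, pz = p_miss
    return e * e - (px * px + py * py + pz * pz)

def neutrino_four_momentum(p_miss):
    """Keep the missing three-momentum and set the energy to |p_miss| (massless)."""
    _, px, py, pz = p_miss
    return (math.sqrt(px * px + py * py + pz * pz), px, py, pz)

def mbc_and_delta_e(p_b, e_beam):
    """Beam-constrained mass and energy difference of a B candidate."""
    e, px, py, pz = p_b
    p2 = px * px + py * py + pz * pz
    return math.sqrt(max(e_beam**2 - p2, 0.0)), e - e_beam

def passes_missing_mass(p_miss, limit=7.0):
    """Requirement from the text: |m_miss^2| < 7 GeV^2/c^4."""
    return abs(missing_mass_squared(p_miss)) < limit
```

The $B$ candidate passed to `mbc_and_delta_e` would be the sum of the lepton, the $\eta^{(\prime)}$ and the output of `neutrino_four_momentum`.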
Continuum background is reduced by requiring the ratio of the second to the zeroth Fox-Wolfram moment [23] to be less than 0.4. Further background reduction uses boosted decision trees (BDTs) [24]. For each channel, two such BDTs are trained, one to discriminate against background originating from $B\bar{B}$ events, and the other against continuum background. The training uses MC corresponding to the on-resonance integrated luminosity for the $b \to c$ and continuum background, and ten times the data integrated luminosity for the signal modes and the $b \to u \ell^+ \nu_\ell$ background. These events are excluded from the analysis afterwards. The variables used by the BDT are: the number of particles with all general requirements applied except the distance to the IP; the number of charged particle tracks that fail the requirement on the distance to the IP; the number of $K^\pm$ candidates; $m^2_{\rm miss}$; the angles between the sum of all particles not assigned to the signal decay (representing the decay of the second $B$) and either the $\eta^{(\prime)}$ or the lepton candidate; the energy asymmetry of the two signal $\pi^\pm$ and the two $\gamma$ (where applicable), defined as $A_\eta = (E_{d1} - E_{d2})/(E_{d1} + E_{d2})$; and the difference between the squared momentum transfer, $q^2$, calculated with the inferred neutrino as $q^2 = (p_\ell + p_\nu)^2$ and that calculated with the method from Ref. [25]. The continuum classifier additionally uses the cosine of the angle between the thrust axes of the $\eta^{(\prime)}\ell^+$ system and the remaining event, and 13 of the modified Fox-Wolfram moments [26] found to be uncorrelated with $q^2$. The variable distributions are shown in Figs. 4, 5 and 6 in the appendix.
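For orientation, the ratio of the second to the zeroth Fox-Wolfram moment can be computed as in the sketch below. It uses the standard definition $H_l \propto \sum_{i,j} |\vec{p}_i||\vec{p}_j| P_l(\cos\theta_{ij})$; the common normalization cancels in the ratio. The function names are hypothetical, and the choice of which particles enter the sum is left to the caller.

```python
import math

def fox_wolfram_r2(momenta):
    """Return R2 = H2/H0 for a list of (px, py, pz) three-momenta."""
    h0 = 0.0
    h2 = 0.0
    for a in momenta:
        for b in momenta:
            mag_a = math.sqrt(sum(x * x for x in a))
            mag_b = math.sqrt(sum(x * x for x in b))
            if mag_a == 0.0 or mag_b == 0.0:
                continue
            cos_ab = sum(x * y for x, y in zip(a, b)) / (mag_a * mag_b)
            weight = mag_a * mag_b
            h0 += weight                                  # Legendre P0 = 1
            h2 += weight * 0.5 * (3.0 * cos_ab**2 - 1.0)  # Legendre P2
    if h0 == 0.0:
        raise ValueError("no particles with non-zero momentum")
    return h2 / h0

def passes_continuum_cut(momenta, r2_max=0.4):
    """Requirement from the text: R2 < 0.4."""
    return fox_wolfram_r2(momenta) < r2_max
```

Jet-like continuum events give values near 1, while the more spherical $B\bar{B}$ events give values near 0, which is why an upper cut suppresses continuum.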
The selection of classifier input variables was restricted by the requirement to keep the entire $q^2$ range unbiased throughout the selection procedure. For each channel, the selections on the two BDT output values are determined simultaneously by maximizing the figure of merit $N_{\rm Sig}/\sqrt{N_{\rm Sig} + N_{\rm Bkg}}$, where $N_{\rm Sig}$ ($N_{\rm Bkg}$) is the number of signal (background) events in the remaining sample. The efficiencies of the event selection are given in Table I. The agreement between data and MC was validated in sidebands. These sidebands consist of events outside the accepted $\eta$ mass range, or outside the range of the mass difference $m_{\eta^{\prime}} - m_{\eta}$ in the case of the $\eta^{\prime}$. All other selection criteria, including the BDT, were unchanged. The signal region was only investigated after sufficient agreement in the description was verified.
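A simultaneous optimisation of two classifier cuts can be sketched as a grid scan over both thresholds, maximizing $N_{\rm Sig}/\sqrt{N_{\rm Sig} + N_{\rm Bkg}}$. The example is illustrative only: it assumes each MC event carries a pair of scores `(bb_score, continuum_score)`, counts unweighted events, and uses hypothetical function names. In practice the MC samples would be weighted to the data luminosity before the figure of merit is evaluated.

```python
import math

def figure_of_merit(n_sig, n_bkg):
    """N_sig / sqrt(N_sig + N_bkg); zero when nothing is selected."""
    total = n_sig + n_bkg
    return n_sig / math.sqrt(total) if total > 0 else 0.0

def optimise_cuts(signal, background, grid):
    """Scan both thresholds together and return (best_fom, (cut_bb, cut_continuum))."""
    best_fom, best_cuts = 0.0, None
    for cut_bb in grid:
        for cut_cont in grid:
            n_sig = sum(1 for a, b in signal if a > cut_bb and b > cut_cont)
            n_bkg = sum(1 for a, b in background if a > cut_bb and b > cut_cont)
            fom = figure_of_merit(n_sig, n_bkg)
            if fom > best_fom:
                best_fom, best_cuts = fom, (cut_bb, cut_cont)
    return best_fom, best_cuts
```

Scanning both thresholds in one loop, rather than one after the other, accounts for correlations between the two classifier outputs.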

IV. SIGNAL DETERMINATION
The number of signal events in the remaining sample is determined with a two-dimensional binned maximum-likelihood fit [27] in the variables $M_{\rm bc}$ and $\Delta E$, taking into account MC statistical uncertainties. Each $\eta^{(\prime)}$ channel is fitted individually, while no distinction between decays to electrons or muons is made in the fit. The fitted range is 5.1 GeV/c$^2$ $< M_{\rm bc} <$ 5.3 GeV/c$^2$ and $-1$ GeV $< \Delta E <$ 1 GeV, divided into eight equal-sized bins in each variable. The four most signal-rich bins, in the area $M_{\rm bc} > 5.25$ GeV/c$^2$ and $|\Delta E| < 0.25$ GeV, are further split into four bins each, giving a total of 76 bins for the fit. Pseudo-data generated from the MC have been used to validate the fit procedure; no bias in the results was observed.
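The bin count quoted above follows from the stated binning, as the short check below shows. Edges are expressed in integer MeV to avoid floating-point comparisons; the function name is hypothetical.

```python
def count_fit_bins(n_sub=4):
    """Count bins of the 8x8 grid after splitting the signal-rich bins into n_sub each."""
    mbc_edges = [5100 + 25 * i for i in range(9)]    # 5.1 to 5.3 GeV/c^2 in 8 bins
    de_edges = [-1000 + 250 * j for j in range(9)]   # -1 to 1 GeV in 8 bins
    n_coarse = (len(mbc_edges) - 1) * (len(de_edges) - 1)
    n_split = 0
    for lo_m in mbc_edges[:-1]:
        for lo_e, hi_e in zip(de_edges[:-1], de_edges[1:]):
            # Signal-rich area: M_bc > 5.25 GeV/c^2 and |Delta E| < 0.25 GeV
            if lo_m >= 5250 and lo_e >= -250 and hi_e <= 250:
                n_split += 1
    return n_coarse - n_split + n_split * n_sub, n_split
```

The coarse grid has 64 bins, four of which lie in the signal-rich area; replacing each of those by four finer bins gives $64 - 4 + 16 = 76$.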
The fit uses one signal and three background histogram templates. The first two background templates are the $B\bar{B}$ decays via $b \to c$ only and those involving $b \to u$. The third is the continuum as defined in the BDT training. The contribution of the $b \to u \ell^+ \nu_\ell$ background is fixed to the inclusive measurement from Ref. [16], while the other two backgrounds and the signal are determined by the fit. The number of events for each component and channel is given in Table II together with the efficiency $\epsilon$. The distributions resulting from the fit are shown in Fig. 2. With the fitted event yield, $N_{\rm fit}$, and the efficiency, $\epsilon$, the branching fraction is expressed as
$$\mathcal{B}(B^+ \to \eta^{(\prime)} \ell^+ \nu_\ell) = \frac{N_{\rm fit}}{4\,\epsilon\,N_{B\bar{B}}\,\mathcal{B}(\Upsilon(4S) \to B^+ B^-)\,\mathcal{B}(\eta^{(\prime)} \to X)}, \qquad (4)$$
where $\eta^{(\prime)} \to X$ denotes the decay of the $\eta^{(\prime)}$ into the respective final state. The sample contains $N_{B\bar{B}} = 772 \times 10^6$ pairs; the fraction of $B^+ B^-$ among them is taken to be $\mathcal{B}(\Upsilon(4S) \to B^+ B^-) = 0.513 \pm 0.006$ [14] in this analysis.
Both charged $B$ mesons of such a pair can decay to the signal mode. Together with the combination of decays to electrons and muons, this gives the factor of four in Eq. 4.
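The conversion from a fitted yield to a branching fraction can be written as a one-line function. The sketch assumes the relation $\mathcal{B} = N_{\rm fit} / (4\,\epsilon\,N_{B\bar{B}}\,\mathcal{B}(\Upsilon(4S) \to B^+B^-)\,\mathcal{B}(\eta^{(\prime)} \to X))$, with the factor of four covering two charged $B$ mesons per event and two lepton flavours. The yield, efficiency and sub-decay branching fraction in the usage note are placeholders, not numbers from Table II.

```python
def branching_fraction(n_fit, efficiency, bf_sub_decay,
                       n_bb=772e6, f_charged=0.513):
    """Branching fraction from a fitted signal yield.

    n_fit:        fitted number of signal events
    efficiency:   total reconstruction and selection efficiency
    bf_sub_decay: branching fraction of the eta(') into the reconstructed final state
    n_bb:         number of BB pairs in the sample (from the text)
    f_charged:    B(Upsilon(4S) -> B+ B-) (from the text)
    """
    # Factor 4 = 2 charged B mesons per B+B- event x 2 lepton flavours (e, mu)
    return n_fit / (4.0 * efficiency * n_bb * f_charged * bf_sub_decay)
```

For example, `branching_fraction(n_fit=500, efficiency=0.03, bf_sub_decay=0.39)` evaluates the expression for a purely hypothetical yield of 500 events at 3% efficiency in the two-photon mode.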

V. SYSTEMATIC UNCERTAINTIES
The sources of systematic uncertainty considered for this analysis fall into three categories: uncertainties related (i) to the detector performance, (ii) to the quality of the MC model and related input parameters, and (iii) to the fit procedure used. Unless otherwise stated, these

A. MC modeling and theory
The most important background source for this analysis is other semileptonic decays of the type $B \to \ell^+ \nu_\ell X_{\rm Bkg}$. Their branching fractions are reweighted to the current values taken from Ref. [14] and are varied within their uncertainties.
Semileptonic decay form factors are another important input of the MC model for which uncertainties are estimated. The form factors for the $b \to c$ semileptonic decays, including $B \to D_2^{*} \ell^+ \nu_\ell$, have been updated to the most recent values [16,17] with the method described in Ref. [28] for both charged and neutral $B$ mesons. The signal decays $B^+ \to \eta \ell^+ \nu_\ell$ and $B^+ \to \eta^{\prime} \ell^+ \nu_\ell$ are reweighted from the ISGW2 model [29] to the model taken from Ref. [30] with the form factors updated to Ref. [31], using the BZ parametrisation and assuming uncorrelated parameters.
The decay $B^+ \to \omega \ell^+ \nu_\ell$ is modeled according to Ref. [32] in the MC used and reweighted to Ref. [33] for comparison. The shape of the inclusive component [34] of the $b \to u \ell^+ \nu_\ell$ transitions is also considered. The form factor uncertainties listed in Table III are based on those reported in the publications they were obtained from. Despite having a slowly varying efficiency, the $\eta \to \gamma\gamma$ mode appears to have the largest such uncertainty.
Missing momentum indicating a neutrino can be faked by a $K^0_L$. The effect of remaining background events containing a $K^0_L$ is considered by varying the yield of such events up and down by 20% when building the MC templates for the fit. The continuum MC consists of two separate components, events with a $c\bar{c}$ pair and those with a pair of the three lighter quarks. Effects of a mismodeled continuum are included by varying the ratio of the two components by 20%. The total measured number of $B\bar{B}$ pairs has an uncertainty of 1.4%, which is propagated through Eq. 4, as are the uncertainties of the branching fractions of $\Upsilon(4S) \to B^+ B^-$ and of the subsequent signal decay chain taken from Ref. [14]. The finite MC samples are assumed to have Poisson-distributed statistical uncertainties.
FIG. 2: Projections onto the two fit variables for all three channels used, with the contributions scaled to those obtained in the fit. The other variable is restricted to the signal-enriched region of 5.25 GeV/c$^2$ $< M_{\rm bc} <$ 5.3 GeV/c$^2$ and $-0.25$ GeV $< \Delta E <$ 0.25 GeV, respectively, for visibility.

B. Detector performance
Independent data samples have been used to validate the detector description and detection efficiency in the MC.A fully correlated uncertainty of 0.35% per charged particle due to track recognition and 2% per photon used in the signal reconstruction is assigned.For π 0 candidates a combined uncertainty of 2.5% is assigned instead.Studies on the performance of the particle identification (PID) for both charged leptons and pions led to the use of angle and momentum-dependent correction factors with an associated uncertainty.The two pions are ordered by their energy with the first pion always being the higher-energy one.Additionally, the yield of background events with a lepton candidate not being an actual lepton originating directly from a B meson decay is varied by 20% to estimate the effect of incorrectly assigning the lepton source.
The dependence of the reconstruction efficiency on the value of the momentum transfer $q^2$ for an event is shown in Fig. 3. The $\eta \to \gamma\gamma$ channel shows only a weak dependence on $q^2$, while the other two channels show a decrease at large $q^2$ values, which can be traced to the pion detection efficiency.

C. Fit validation
The much more common decay $B^+ \to \bar{D}^0 \ell^+ \nu_\ell$, with $\bar{D}^0 \to K^+ \pi^-$, was used as a control mode. The reconstruction follows the same method except for adjusted mass requirements and the addition of a kaon. The measured branching fraction is $(2.536 \pm 0.036 \pm 0.087)\%$.
There is a $1.9\sigma$ discrepancy with the world average [14] of $(2.29 \pm 0.09)\%$ after adjusting to $\mathcal{B}(\Upsilon(4S) \to B^+ B^-) = 0.513$. The measured branching fraction does, however, show good agreement with the previous measurement by Belle [35]. The difference in the central values of these two measurements is 5%, which is used as the systematic uncertainty due to shape mismodeling in the fit variables and selection efficiency discrepancies; this is listed in Table III as the control mode uncertainty.
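The quoted size of the discrepancy can be reproduced with a simple pull calculation, assuming that all uncertainties are independent and combined in quadrature (the text does not state how the significance was obtained, so this is an assumption). The function name is hypothetical.

```python
import math

def pull(value, value_errors, reference, reference_errors):
    """Difference between two results in units of their combined uncertainty."""
    variance = sum(e * e for e in value_errors) + sum(e * e for e in reference_errors)
    return (value - reference) / math.sqrt(variance)

# Control-mode result vs. world average, both in percent
control_pull = pull(2.536, (0.036, 0.087), 2.29, (0.09,))
```

With the numbers from the text, `control_pull` evaluates to roughly 1.9, in line with the stated discrepancy.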

VI. RESULTS
The branching fraction of the $B^+ \to \eta \ell^+ \nu_\ell$ decay resulting from the fit is
$$\mathcal{B}(B^+ \to \eta \ell^+ \nu_\ell) = (2.91 \pm 0.64 \pm 0.32) \times 10^{-5} \quad (\eta \to \gamma\gamma),$$
$$\mathcal{B}(B^+ \to \eta \ell^+ \nu_\ell) = (2.65 \pm 1.04 \pm 0.37) \times 10^{-5} \quad (\eta \to \pi^+ \pi^- \pi^0),$$
where the first uncertainty is the statistical uncertainty from the fit and the second is systematic. Since the measurements of the branching fraction in the two $\eta$ decay modes are consistent with each other, we average over both $\eta$ modes, assuming the statistical uncertainties to be uncorrelated and the systematic uncertainties to be fully correlated:
$$\mathcal{B}(B^+ \to \eta \ell^+ \nu_\ell) = (2.83 \pm 0.55 \pm 0.34) \times 10^{-5}.$$
The branching fraction of the $B^+ \to \eta^{\prime} \ell^+ \nu_\ell$ decay resulting from the fit is
$$\mathcal{B}(B^+ \to \eta^{\prime} \ell^+ \nu_\ell) = (2.79 \pm 1.29 \pm 0.30) \times 10^{-5}.$$
This result is compatible with and complements the earlier Belle result [6], which uses hadronic tagging for the second $B$ meson in the event. Because of the different methods applied, the statistical overlap between the two analyses is negligible and they can be considered independent. The branching fractions have also been measured previously by BaBar [2][3][4], although Ref. [4] reports a greater value. For the branching fraction of $B^+ \to \eta^{\prime} \ell^+ \nu_\ell$, CLEO [5] reports a value about an order of magnitude larger and incompatible with both this result and the result from BaBar. The precision of this measurement is limited by the sample size. Significantly more precise results can therefore be expected in the future with the Belle II experiment at SuperKEKB.
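The combination of the two $\eta$ modes can be illustrated with an inverse-variance weighted average. The exact weighting used in the analysis is not spelled out in the text, so the sketch assumes weights built from the statistical uncertainties only, with the fully correlated systematic uncertainty propagated as the weighted mean of the individual systematic uncertainties. The function name is hypothetical.

```python
import math

def average_modes(values, stat, syst):
    """Average results with uncorrelated stat. and fully correlated syst. errors.

    Returns (mean, stat_uncertainty, syst_uncertainty).
    """
    weights = [1.0 / s**2 for s in stat]           # inverse statistical variance
    norm = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / norm
    stat_avg = 1.0 / math.sqrt(norm)               # uncorrelated: variances combine
    syst_avg = sum(w * s for w, s in zip(weights, syst)) / norm  # fully correlated
    return mean, stat_avg, syst_avg

# Per-mode results in units of 1e-5: eta -> gamma gamma, eta -> pi+ pi- pi0
mean, stat_avg, syst_avg = average_modes([2.91, 2.65], [0.64, 1.04], [0.32, 0.37])
```

Under these assumptions the three numbers come out close to the quoted average; small differences in the last digit are expected because the inputs are rounded and the weighting scheme is assumed.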

VII. ACKNOWLEDGMENTS
We thank the KEKB group for the excellent operation of the accelerator; the KEK cryogenics group
In Eq. 1, $E_B$ and $|\vec{p}_B|$ are the energy and momentum of the $B$ meson in the c.m. frame, $m_{\eta^{(\prime)}\ell}$ is the mass of the combined lepton-$\eta^{(\prime)}$ system, while $E_{\eta^{(\prime)}\ell}$ and $|\vec{p}_{\eta^{(\prime)}\ell}|$ are its energy and momentum. As the four-momentum of the $B$ meson cannot be measured directly, half of the c.m. energy, $E_B = \sqrt{s}/2$, and $|\vec{p}_B|\,c = \sqrt{s/4 - m_B^2 c^4}$ are used in Eq. 1. The distribution is shown in Fig. 1.

FIG. 3: Efficiency of the entire reconstruction chain, including the BDT, as a function of $q^2$.

TABLE I: Efficiencies of the event selection.

TABLE II: Event yields, fit quality and selection efficiencies. For the fixed $b \to u \ell^+ \nu_\ell$ component the Poissonian uncertainty of the yield is quoted.
Columns: $\eta \to \gamma\gamma$, $\eta \to \pi^+ \pi^- \pi^0$, $\eta^{\prime} \to \pi^+ \pi^- \eta$.

TABLE III: Breakdown of the systematic uncertainty in %.