Search for the rare decay of $B^+ \to \ell^{\,+} \nu_{\ell} \gamma$ with improved hadronic tagging

We present the result of the search for the rare $B$ meson decay $B^+ \to \ell^{\,+} \nu_{\ell} \gamma$ with $\ell =e,\mu$. The search uses the full data set recorded by the Belle experiment, corresponding to $711 \, \mathrm{fb}^{-1}$ of integrated luminosity near the $\Upsilon (4S)$ resonance. Signal candidates are reconstructed for photon energies $E_{\gamma}$ larger than $1 \, \mathrm{GeV}$ using a novel multivariate tagging algorithm. The algorithm fully reconstructs the second $B$ meson produced in the collision using hadronic modes and was specifically trained to recognize the signal signature in combination with hadronic tag-side $B$ meson decays. This approach increases the selection efficiency by a factor of three compared to the previous analysis. Background processes that can mimic this signature, mainly charmless semileptonic decays and continuum processes, are suppressed using multivariate methods. The number of signal candidates is determined by analyzing the missing mass squared distribution as inferred from the signal-side particles and the kinematic properties of the tag-side $B$ meson. No significant excess over the background-only hypothesis is observed, and upper limits on the partial branching fraction $\Delta \mathcal{B}$ with $E_{\gamma}>1 \, \mathrm{GeV}$ are reported individually for electron and muon final states as well as for the average branching fraction of both lepton final states. We find a Bayesian upper limit of $\Delta \mathcal{B}( B^{+} \to \ell^{\, +} \nu_{\ell} \gamma)<3.0 \times 10^{-6}$ at 90% CL and also report an upper limit on the first inverse moment of the light-cone distribution amplitude of the $B$ meson of $\lambda_B > 0.24 \, \mathrm{GeV}$ at 90% CL.

I. INTRODUCTION
The precision study of leptonic $B$ meson decays offers a unique path to test the Standard Model (SM) of particle physics: new heavy mediators or sterile neutrinos could contribute to the decay amplitudes and lead, for instance, to lepton flavor universality breaking effects. In the SM, however, leptonic transitions are helicity suppressed, making an observation in final states involving electrons or muons challenging. The helicity suppression is lifted for final states that involve an additional photon, emitted for instance from the light $u$ quark. The decay rate for such $B^+ \to \ell^+ \nu_\ell \gamma$ processes [1] is suppressed by a factor of $\alpha_{\mathrm{em}}$, and the resulting decay amplitude [2] is expressed in terms of the projection operator $P_L = (1 - \gamma_5)/2$, Fermi's constant $G_F$, the lepton field $\ell$ (either an electron or a muon), the quark fields $u$ and $b$, and the relevant Cabibbo-Kobayashi-Maskawa (CKM) [3,4] matrix element $V_{ub}$ of the transition. The hadronic transition can be fully described by two form factors parameterizing the axial-vector and vector hadronic currents, denoted by $F_A$ and $F_V$, respectively. The differential decay rate as a function of these two form factors and the photon energy $E_\gamma$ is given by

$$\frac{d\Gamma}{dE_\gamma} = \frac{\alpha_{\mathrm{em}} G_F^2 |V_{ub}|^2}{6\pi^2}\, m_B E_\gamma^3 \left(1 - \frac{2E_\gamma}{m_B}\right) \left( |F_V|^2 + \left| F_A + \frac{e_\ell f_B}{E_\gamma} \right|^2 \right), \tag{2}$$

with $e_\ell$ denoting the charge of the lepton and $m_B$ the $B$ meson mass. At high photon energies of $E_\gamma > 1$ GeV, both form factors can be expanded [5] as

$$F_V(E_\gamma) = \frac{e_u f_B m_B}{2 E_\gamma \lambda_B(\mu)}\, R(E_\gamma, \mu) + \left[ \xi(E_\gamma) + \Delta\xi(E_\gamma) \right],$$

$$F_A(E_\gamma) = \frac{e_u f_B m_B}{2 E_\gamma \lambda_B(\mu)}\, R(E_\gamma, \mu) + \left[ \xi(E_\gamma) - \Delta\xi(E_\gamma) \right],$$

with $f_B$ denoting the $B$ meson decay constant and $e_u$ the charge of the $u$ quark line. The factor $R(E_\gamma, \mu)$ accounts for photon emissions from the light spectator quark in the $B$ meson and is unity at tree level. The $\xi$ and $\Delta\xi$ terms are power suppressed by $1/m_b$ and $1/(2E_\gamma)$ and contain a symmetry-conserving and a symmetry-breaking part for both form factors $F_A$ and $F_V$. In Refs. [6,7] the leading contributions to $\xi$ and $\Delta\xi$ were evaluated and predictions with an accuracy of $\mathcal{O}(20\%)$ were presented.
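As a rough numerical illustration of the rate described above, the following sketch evaluates the commonly quoted leading-power expression $d\Gamma/dE_\gamma \propto m_B E_\gamma^3 (1 - 2E_\gamma/m_B)\,(|F_V|^2 + |F_A + e_\ell f_B/E_\gamma|^2)$ with $F_V = F_A = e_u f_B m_B/(2E_\gamma\lambda_B)$. The function names, the numerical inputs ($f_B = 0.19$ GeV, $|V_{ub}| = 3.7 \times 10^{-3}$) and the charge convention ($e_u = 2/3$, $e_\ell = -1$) are assumptions made for this example, and the corrections $R$, $\xi$ and $\Delta\xi$ are set to their trivial values.

```python
import math

ALPHA_EM = 1.0 / 137.036   # fine-structure constant
G_F = 1.1663787e-5         # Fermi constant in GeV^-2
M_B = 5.279                # B meson mass in GeV

def leading_form_factor(e_gamma, lambda_b, f_b=0.19, e_u=2.0 / 3.0):
    """Leading-power term shared by F_V and F_A (R = 1, xi = delta_xi = 0 assumed)."""
    return e_u * f_b * M_B / (2.0 * e_gamma * lambda_b)

def differential_rate(e_gamma, f_v, f_a, v_ub=3.7e-3, f_b=0.19, e_lepton=-1.0):
    """dGamma/dE_gamma in natural units (GeV / GeV), illustrative inputs only."""
    prefactor = ALPHA_EM * G_F**2 * v_ub**2 / (6.0 * math.pi**2)
    kinematics = M_B * e_gamma**3 * (1.0 - 2.0 * e_gamma / M_B)
    return prefactor * kinematics * (f_v**2 + (f_a + e_lepton * f_b / e_gamma) ** 2)

def rate_at(e_gamma, lambda_b):
    """Convenience wrapper: rate with both form factors set to the leading term."""
    ff = leading_form_factor(e_gamma, lambda_b)
    return differential_rate(e_gamma, ff, ff)
```

In this approximation the rate scales roughly like $1/\lambda_B^2$, which is why a branching fraction measurement constrains $\lambda_B$.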
The parameter $\lambda_B$ is related to the first inverse moment of the leading-twist $B$ meson light-cone distribution amplitude $\phi_+$ in the high energy limit, $\lambda_B^{-1} = \int_0^\infty \frac{d\omega}{\omega}\, \phi_+(\omega)$, with $\omega$ denoting the light-cone momentum. This parameter is an important input to QCD factorization predictions of non-leptonic $B$ meson decays [8][9][10]. The partial branching fraction of $B^+ \to \ell^+ \nu_\ell \gamma$ is expected to be of the order of $\mathcal{O}(10^{-6})$ for photon energies of $E_\gamma > 1$ GeV and $\lambda_B$ in the range of several hundred MeV [5].
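To make the definition of $\lambda_B$ concrete, the sketch below integrates $\phi_+(\omega)/\omega$ numerically for an exponential ansatz $\phi_+(\omega) = (\omega/\omega_0^2)\, e^{-\omega/\omega_0}$, for which the inverse moment gives $\lambda_B = \omega_0$ exactly. The choice of this model, the value $\omega_0 = 0.35$ GeV and all function names are assumptions for illustration, not inputs taken from the analysis.

```python
import math

def phi_plus_exponential(omega, omega0):
    """Exponential model for the light-cone distribution amplitude (assumed ansatz)."""
    return omega / omega0**2 * math.exp(-omega / omega0)

def inverse_moment(phi, omega_max=50.0, n_steps=20_000):
    """Midpoint-rule estimate of the integral of phi(w) / w from 0 to omega_max."""
    h = omega_max / n_steps
    total = 0.0
    for i in range(n_steps):
        omega = (i + 0.5) * h          # midpoints avoid the w = 0 endpoint
        total += phi(omega) / omega
    return total * h

omega0 = 0.35  # GeV, illustrative value only
lambda_b = 1.0 / inverse_moment(lambda w: phi_plus_exponential(w, omega0))
```

The numerical result reproduces $\lambda_B = \omega_0$, which is a useful closure check before swapping in a different model shape.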
The search for the rare $B^+ \to \ell^+ \nu_\ell \gamma$ decay thus serves two purposes. First, a limit on or an observation of the partial branching fraction can be used in combination with the decay rate Eq. 2 and input from theory for $\xi$ and $\Delta\xi$, as well as for $f_B$ and $V_{ub}$, to experimentally determine the value of $\lambda_B$. Second, with the Belle II experiment the $B^+ \to \ell^+ \nu_\ell \gamma$ decay rate could offer a future additional path to determine $|V_{ub}|$ by using lattice calculations for $\lambda_B$. In this manuscript we present a novel method to determine $\lambda_B$, which uses the measured ratio of $B^+ \to \ell^+ \nu_\ell \gamma$ with respect to $B^+ \to \pi^0 \ell^+ \nu_\ell$. In this ratio the explicit dependence of the extracted $\lambda_B$ on $V_{ub}$ cancels, as do several experimental uncertainties.
The search in this manuscript supersedes the earlier result of Ref. [11], using an improved hadronic tagging and more accurate modeling of the charmless semileptonic backgrounds. The hadronic tagging is based on the Full Event Interpretation (FEI) [12], which uses multivariate methods trained to recognize the signal signature (consisting of a single photon with $E_\gamma > 1$ GeV and a lepton candidate) in conjunction with a hadronic $B$ meson decay. This signal-specific approach enhances the selection efficiency in comparison to the previous analysis by a factor of three, which results in an improvement of $1.9\sigma$ in the expected significance for a given partial branching fraction of $\Delta\mathcal{B}(B^+ \to \ell^+ \nu_\ell \gamma) = 5.0 \times 10^{-6}$, with a similar ratio of signal over background events. Backgrounds that can mimic signal, most importantly $B^+ \to \pi^0 \ell^+ \nu_\ell$ and other charmless semileptonic decays, are suppressed with a dedicated $\pi^0$ and $\eta$ veto, which combines the signal photon with other calorimeter clusters to form a multivariate classifier. The signal is extracted by analyzing the missing mass squared distribution, $M^2_{\mathrm{miss}}$, which for correctly reconstructed signal decays peaks around $0~\mathrm{GeV}^2$. Semileptonic background and event candidates from continuum processes are shifted towards positive values. The $B^+ \to \ell^+ \nu_\ell \gamma$ branching fraction is extracted simultaneously with a $B^+ \to \pi^0 \ell^+ \nu_\ell$ control sample, which constrains the dominant background contribution. The extracted yields are converted into (partial) branching fractions to determine the ratio of $B^+ \to \ell^+ \nu_\ell \gamma$ to $B^+ \to \pi^0 \ell^+ \nu_\ell$ decays.
This manuscript is organized as follows: Section II details the analysis strategy and selection. Section III summarizes the statistical analysis of the $M^2_{\mathrm{miss}}$ distribution and the limit setting procedure. In Section IV the calibration procedure of the multivariate tagging algorithm is summarized. Sections V and VI discuss the systematic uncertainties and the main results. The manuscript concludes with a summary in Section VII.

II. DATA SET AND ANALYSIS STRATEGY
The full Belle data set of $(772 \pm 10) \times 10^6$ $B$ meson pairs is analyzed, produced at the KEKB accelerator complex [13] at a center-of-mass energy of 10.58 GeV at the $\Upsilon(4S)$ resonance. The Belle detector is a large-solid-angle magnetic spectrometer that consists of a silicon vertex detector (SVD), a 50-layer central drift chamber (CDC), an array of aerogel threshold Čerenkov counters (ACC), a barrel-like arrangement of time-of-flight scintillation counters (TOF), and an electromagnetic calorimeter comprised of CsI(Tl) crystals (ECL) located inside a superconducting solenoid coil that provides a 1.5 T magnetic field. An iron flux return located outside of the coil is instrumented to detect $K^0_L$ mesons and to identify muons (KLM). A more detailed description of the detector can be found in Ref. [14].
All analysis steps are carried out using the Belle II analysis software framework [15]; the recorded Belle collision data and Monte Carlo (MC) simulated samples were converted using the tool described in Ref. [16]. MC samples of $B$ meson decays and non-resonant $e^+ e^- \to q\bar{q}$ continuum processes with $q = u, d, s, c$ are generated using the EvtGen generator [17], with sample sizes corresponding to approximately ten and six times the Belle collision data, respectively. The interactions of particles traversing the detector are simulated using Geant3 [18]. QED final state radiation (FSR) is simulated using the PHOTOS [19] package. The $B^+ \to \ell^+ \nu_\ell \gamma$ signal is simulated using the calculation of Ref. [20]. Charmless semileptonic decays are one of the main contributions to the background in the analysis; in particular, $B^+ \to \pi^0 \ell^+ \nu_\ell$ with $\pi^0 \to \gamma\gamma$ can mimic the signal final state. The $B^+ \to \pi^0 \ell^+ \nu_\ell$ background is modeled using the BCL form factor parametrization [21] with central values and uncertainties from the global fit of Ref. [22]. The remaining charmless semileptonic background is modeled using a mix of resonant and non-resonant modes: the resonant contributions for $B^+ \to \omega \ell^+ \nu_\ell$, $B^+ \to \rho^0 \ell^+ \nu_\ell$, and $B^0 \to \rho^- \ell^+ \nu_\ell$ are simulated according to a pole model documented in Ref. [17]. The contributions from $B^+ \to \eta \ell^+ \nu_\ell$, $B^+ \to \eta' \ell^+ \nu_\ell$, $B^+ \to f_2(1270) \ell^+ \nu_\ell$, and $B^+ \to b_1(1235) \ell^+ \nu_\ell$ are modeled using the ISGW2 model [23]. Non-resonant contributions are modeled using the DFN calculation [24] with a choice of its parameters $\lambda_1^{\mathrm{SF}}$ and $\bar{\Lambda}^{\mathrm{SF}}$ that approximately reproduces the first and second moments of the inclusive $m_{X_u}$ distribution. The fragmentation of the $X_u$ system is performed by Jetset [25]. Background processes from leptonic, semileptonic or hadronic $B$ meson decays are included in the simulation and all relevant decay branching fractions are corrected to correspond to the values of Ref. [26].
The efficiencies in the MC simulation are corrected using data-driven methods, which are described later.
The presence of a neutrino in $B^+ \to \ell^+ \nu_\ell \gamma$ decays prohibits the full reconstruction of the signal $B$ meson, and thus the entire $\Upsilon(4S)$ decay chain is considered to infer the missing momentum of the neutrino. First, a signal electron or muon and a photon candidate with at least 1 GeV in the laboratory frame are identified. The FEI algorithm then hierarchically reconstructs the rest of the event (ROE): The tag-side $B$ meson ($B_{\mathrm{tag}}$) is reconstructed using 29 explicit hadronic decay channels, leading to $\mathcal{O}(10000)$ final states. An optimized implementation of gradient-boosted decision trees (BDT) [27] is used for multivariate classification at the individual stages of the $B_{\mathrm{tag}}$ reconstruction, which progresses by forming intermediate particles ($J/\psi$, $\pi^0$, $K^0_S$, $D$, $D_s$, $D^*_s$) from stable particle candidates ($e^+$, $\mu^+$, $K^+$, $\pi^+$, $\gamma$) to reconstruct $B$ candidates in six distinct stages. Each reconstructed candidate is assigned a signal probability $\mathcal{P}_{\mathrm{FEI}}$, which is calculated by the respective classifier from the properties of the candidate (such as the invariant mass or vertex fit information) to discriminate signal from background candidates. A detailed description of the entire algorithm can be found in Ref. [12] and references therein. By reconstructing the signal side first, the FEI trains specifically to recognize $B^+ \to \ell^+ \nu_\ell \gamma$ decays in conjunction with hadronic $B$ meson decays. Using the kinematic information of the $B_{\mathrm{tag}}$, the four-momenta of the signal-side $B$ meson ($B_{\mathrm{sig}}$) and thus of the missing neutrino can be reconstructed in the center-of-mass frame as

$$p_\nu = p_{B_{\mathrm{sig}}} - p_\ell - p_\gamma = \left( \frac{\sqrt{s}}{2},\, -\vec{p}_{B_{\mathrm{tag}}} \right) - p_\ell - p_\gamma,$$

where $\sqrt{s}$ denotes the center-of-mass energy of the colliding $e^+ e^-$ pair, and $p_x$ and $\vec{p}_x$ are the four- and three-momentum of a given particle $x$.
Only events with at most 12 tracks are selected, since the signal side consists of only one charged track and signal events typically have a low track multiplicity.
Photons are identified as energy depositions in the calorimeter without an associated track. Only photons with an energy deposition of $E_\gamma > 100$ MeV, 150 MeV, and 50 MeV in the forward end-cap, backward end-cap and barrel part of the calorimeter, respectively, are considered. Charged tracks are required to have a distance of closest approach to the nominal interaction point transverse to and along the beam axis of $|dr| < 2$ cm and $|dz| < 4$ cm, respectively. Charged tracks are identified as electron or muon candidates by combining the information of multiple subdetectors into a likelihood ratio, the lepton identification ($\mathcal{L}_{\mathrm{LID}}$). For electrons the identifying features are the ratio of the energy deposition in the ECL with respect to the reconstructed track momentum, the energy loss in the CDC, the shower shape in the ECL, the quality of the geometrical matching of the track to the shower position in the ECL, and the photon yield in the ACC [28]. Muon candidates are identified from charged track trajectories extrapolated to the outer detector. The identifying features are the difference between expected and measured penetration depth as well as the transverse deviation of KLM hits from the extrapolated trajectory [29]. Charged tracks are identified as pions or kaons with a likelihood classifier that uses information from the CDC, ACC, and TOF subdetectors.
Electrons can radiate sizable fractions of their kinetic energy through bremsstrahlung and FSR processes. Candidates for such photons are identified in the ECL using a cone of 5 degrees around the initial trajectory of the electron candidate, such that only photons radiated near the interaction region can be found. The bremsstrahlung and FSR photons are required to have an energy of $E_\gamma < 1$ GeV and, if several photon candidates are identified, only the photon with the highest energy is used.
The four-momentum of the signal-side electron candidate is then corrected by adding the photon four-momentum accordingly. The prompt signal-photon candidates from the $B^+ \to \ell^+ \nu_\ell \gamma$ decay are required to have $E_\gamma > 1$ GeV in the rest frame of the $B_{\mathrm{sig}}$ meson. Further, signal-photon candidates must satisfy $R_{9/25} > 0.9$, where $R_{9/25}$ is defined as the ratio of the energy deposited in the $3 \times 3$ to the energy deposited in the $5 \times 5$ CsI(Tl) crystals around the maximal energy deposition. $\mathcal{L}_{\mathrm{LID}} > 0.8$ is required for electron and muon candidates. Since the efficiencies of the $\mathcal{L}_{\mathrm{LID}}$ requirement can differ between MC and data, an efficiency correction is applied, measured on four-lepton and inclusive $B \to X J/\psi\,(\to \ell^+ \ell^-)$ decays in bins of lepton momentum in the laboratory frame and polar angle. Furthermore, the invariant mass $M_B$ of the reconstructed lepton-photon pair has to be within $(1.0, 6.0)$ GeV.
Before the FEI tagging algorithm is applied, the corresponding tag side is cleaned to remove events which do not allow for a reasonable tag-side reconstruction. Photon candidates must satisfy $R_{9/25} > 0.9$. In addition, a cut on the difference between the beam energy and the energy of the ROE calculated in the center-of-mass frame, $\Delta E_{\mathrm{ROE}} < 2.0$ GeV, is applied. The beam-constrained mass $M_{\mathrm{bc,ROE}} = \sqrt{(\sqrt{s}/2)^2 - |\vec{p}_{\mathrm{ROE}}|^2}$ of the ROE calculated in the center-of-mass frame has to be larger than 4.8 GeV.
After the reconstruction of both $B_{\mathrm{sig}}$ and $B_{\mathrm{tag}}$, the $\Upsilon(4S)$ candidate can be reconstructed. If more than one candidate per event is reconstructed, a best-candidate selection based on $\mathcal{P}_{\mathrm{FEI}}$ (of the $B_{\mathrm{tag}}$ candidate) is performed. Further combinatorial background is removed with cuts on the reconstructed invariant mass of the $\Upsilon(4S)$ candidate of $M_{\Upsilon(4S)} \in (7.5, 10.5)$ GeV (considering the missing neutrino), and the difference between the beam energy and the energy of the $B_{\mathrm{tag}}$ candidate, $\Delta E \in (-0.15, 0.1)$ GeV. The beam-constrained mass $M_{\mathrm{bc}} = \sqrt{(\sqrt{s}/2)^2 - |\vec{p}_{B_{\mathrm{tag}}}|^2}$ of the $B_{\mathrm{tag}}$ candidate has to be within $(5.27, 5.29)$ GeV. Only events with no unassigned tracks (either for $B_{\mathrm{sig}}$ or $B_{\mathrm{tag}}$) and $E_{\mathrm{ECL}} \leq 0.9$ GeV, where $E_{\mathrm{ECL}}$ is the sum of the remaining unassigned energy depositions in the ECL, are retained. Continuum background is removed by applying an additional cut $\mathcal{P}_{\mathrm{FEI}} > 0.01$, whose value was chosen by studying the $M_{\mathrm{bc}}$ sideband, defined as $M_{\mathrm{bc}} \in (5.24, 5.27)$ GeV.
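The two tag-side quantities used above can be sketched in a few lines. The snippet assumes center-of-mass-frame inputs in GeV and the common sign convention $\Delta E = E_{B_{\mathrm{tag}}} - \sqrt{s}/2$; the function names and that convention are assumptions for illustration, while the window values are the ones quoted in the text.

```python
import math

def beam_constrained_mass(sqrt_s, p3):
    """M_bc = sqrt((sqrt(s)/2)^2 - |p|^2) for a center-of-mass three-momentum p3."""
    m2 = (sqrt_s / 2.0) ** 2 - sum(c * c for c in p3)
    return math.sqrt(max(m2, 0.0))   # clamp tiny negative values from resolution

def delta_e(sqrt_s, energy):
    """Energy difference to the beam energy (sign convention assumed)."""
    return energy - sqrt_s / 2.0

def passes_tag_selection(sqrt_s, e_tag, p3_tag):
    """Apply the M_bc and Delta E windows quoted for the B_tag candidate."""
    m_bc = beam_constrained_mass(sqrt_s, p3_tag)
    d_e = delta_e(sqrt_s, e_tag)
    return 5.27 < m_bc < 5.29 and -0.15 < d_e < 0.1
```

Because $M_{\mathrm{bc}}$ replaces the measured candidate energy by the beam energy, its resolution is driven by the momentum measurement and the beam energy spread, which is why the window can be this narrow.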
The background from $B^+ \to \pi^0 \ell^+ \nu_\ell$ and $B^+ \to \eta \ell^+ \nu_\ell$ peaks in $M^2_{\mathrm{miss}}$ and is suppressed in a two-step procedure. First, the signal-side photon is combined with any other photon in the ROE to reconstruct a $\pi^0$ candidate. The event is vetoed if the candidate satisfies $M_{\gamma\gamma} \in (110, 160)$ MeV, where $M_{\gamma\gamma}$ is the invariant mass of the two photons combined. Second, a multivariate method is trained to suppress the remaining $B^+ \to \pi^0 \ell^+ \nu_\ell$ and $B^+ \to \eta \ell^+ \nu_\ell$ background using the following variables: the number of ECL cluster hits used for the signal-side photon reconstruction, $R_{9/25}$, the lateral distribution of the energy of the ECL cluster hits, the angle between the signal-side photon and the missing momentum $\vec{p}_\nu$ calculated in the rest frame of the $B_{\mathrm{sig}}$, $E_{\mathrm{ECL}}$, and the energy asymmetry between the lepton and photon candidates of the $B_{\mathrm{sig}}$.
To improve control over the normalization of the peaking background, control samples for $B^+ \to \pi^0 \ell^+ \nu_\ell$ with $\ell = e, \mu$ are reconstructed. The signal-side selection is slightly adapted for the $B^+ \to \pi^0 \ell^+ \nu_\ell$ selection: instead of a single photon with $E_\gamma > 1$ GeV, two photon candidates are combined to form a $\pi^0$ candidate and only events with an invariant mass of $M_{\gamma\gamma} \in (115, 152)$ MeV (corresponding to approximately $\pm 3\sigma$ in $\pi^0$ mass resolution) are retained. Both control samples and the $B^+ \to \ell^+ \nu_\ell \gamma$ signal decays are analyzed simultaneously to extract the desired signal yields and to constrain the peaking $B^+ \to \pi^0 \ell^+ \nu_\ell$ contamination in the signal candidates.
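The first veto step described above amounts to a two-photon invariant mass test. A minimal sketch, assuming photons are given as massless three-momenta in GeV; the function names and the data layout are hypothetical, and only the $(110, 160)$ MeV window comes from the text.

```python
import math

def diphoton_mass(p3_a, p3_b):
    """Invariant mass of two massless photons given their three-momenta (GeV)."""
    e_a = math.sqrt(sum(c * c for c in p3_a))
    e_b = math.sqrt(sum(c * c for c in p3_b))
    p_sum_sq = sum((a + b) ** 2 for a, b in zip(p3_a, p3_b))
    return math.sqrt(max((e_a + e_b) ** 2 - p_sum_sq, 0.0))

def is_pi0_vetoed(signal_photon, roe_photons, window=(0.110, 0.160)):
    """Veto the event if the signal photon pairs with any ROE photon inside the window."""
    low, high = window
    return any(low < diphoton_mass(signal_photon, g) < high for g in roe_photons)
```

For example, a 1 GeV signal photon and a 0.5 GeV ROE photon separated by a small opening angle can reproduce the $\pi^0$ mass and trigger the veto, while the same pair at 90 degrees has a mass of 1 GeV and does not.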
For both the $B^+ \to \ell^+ \nu_\ell \gamma$ and the $B^+ \to \pi^0 \ell^+ \nu_\ell$ selections, non-resonant continuum processes are suppressed using a multivariate approach with the aforementioned BDT implementation. The event topology for continuum processes differs from that of $B$ meson decays. This can be exploited to suppress continuum events by using event shape variables, such as the magnitude of the thrust of final state particles forming the $B_{\mathrm{sig}}$ and ROE candidates, the angle between the $B_{\mathrm{sig}}$ and the $z$-axis and between the $B_{\mathrm{sig}}$ and the ROE, the reduced Fox-Wolfram moment $R_2$, the modified Fox-Wolfram moments [30] and CLEO Cones [31].
The cuts on the multivariate classifiers for the continuum and the peaking background suppression are simultaneously optimized with Punzi's figure of merit [32]. After all selection steps, we obtain a signal reconstruction efficiency for $B^+ \to \ell^+ \nu_\ell \gamma$ decays of 0.64% (0.67%) for the electron (muon) final state. On the normalization sample we obtain an efficiency of 0.38% for both final states for $B^+ \to \pi^0 \ell^+ \nu_\ell$ decays.
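Punzi's figure of merit is commonly written as $\epsilon / (a/2 + \sqrt{B})$, with signal efficiency $\epsilon$, expected background $B$, and $a$ the desired significance in standard deviations. The sketch below scans a list of candidate working points; the value $a = 3$, the scan numbers and the function names are assumptions for illustration and are not taken from the analysis.

```python
import math

def punzi_fom(efficiency, n_background, n_sigma=3.0):
    """Punzi figure of merit: efficiency / (n_sigma / 2 + sqrt(B))."""
    return efficiency / (n_sigma / 2.0 + math.sqrt(n_background))

def best_cut(scan, n_sigma=3.0):
    """Return the cut value maximizing the figure of merit.

    scan: iterable of (cut_value, signal_efficiency, expected_background).
    """
    return max(scan, key=lambda row: punzi_fom(row[1], row[2], n_sigma))[0]

# Illustrative working points only: (classifier cut, efficiency, background yield)
example_scan = [(0.1, 0.9, 100.0), (0.5, 0.6, 9.0), (0.9, 0.1, 0.0)]
chosen = best_cut(example_scan)
```

Unlike $S/\sqrt{S+B}$, this figure of merit does not depend on an assumed signal branching fraction, which makes it a natural choice for a search.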
To discriminate the signal from background decays, the missing mass squared $M^2_{\mathrm{miss}}$ of the event is calculated as

$$M^2_{\mathrm{miss}} = \left( p_{B_{\mathrm{sig}}} - p_\ell - p_X \right)^2,$$

with $p_X$ denoting $p_\gamma$ for $B^+ \to \ell^+ \nu_\ell \gamma$ signal events, and $p_\pi$ for $B^+ \to \pi^0 \ell^+ \nu_\ell$ normalization events, respectively. The signal and background yields are then obtained using the statistical analysis described in Section III. The analysis procedure is validated using two signal-depleted samples: an off-resonance sample, recorded 40 MeV below the $\Upsilon(4S)$ resonance, and the $M_{\mathrm{bc}}$ sideband. Both show good agreement between data and the MC expectation.
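A minimal sketch of the missing mass squared computation, assuming center-of-mass-frame four-vectors stored as `(E, px, py, pz)` tuples in GeV and the signal-side $B$ four-momentum taken as $(\sqrt{s}/2, -\vec{p}_{B_{\mathrm{tag}}})$. The function names and the tuple layout are hypothetical choices for this example.

```python
def minkowski_square(p4):
    """E^2 - |p|^2 for a four-vector (E, px, py, pz)."""
    return p4[0] ** 2 - sum(c * c for c in p4[1:])

def missing_mass_squared(sqrt_s, p3_btag, p4_lepton, p4_x):
    """M^2_miss = (p_Bsig - p_lepton - p_X)^2 with p_Bsig = (sqrt(s)/2, -p_Btag)."""
    p4_bsig = (sqrt_s / 2.0,) + tuple(-c for c in p3_btag)
    p4_miss = tuple(b - l - x for b, l, x in zip(p4_bsig, p4_lepton, p4_x))
    return minkowski_square(p4_miss)
```

When the only missing particle is a single massless neutrino, the result is zero up to resolution, which is the peak the fit looks for; an undetected photon from a $\pi^0$ adds missing energy and moves the value away from zero.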

III. STATISTICAL ANALYSIS AND LIMIT SETTING PROCEDURE
Signal and background yields are extracted using a binned maximum likelihood fit of the $M^2_{\mathrm{miss}}$ distribution. For an individual channel, the likelihood function is constructed as

$$\mathcal{L} = \prod_{i}^{\mathrm{bins}} P(n_i; \nu_i), \tag{7}$$

with $P(n_i; \nu_i) = \nu_i^{n_i} e^{-\nu_i} / n_i!$ denoting the Poisson distribution and $n_i$ and $\nu_i$ the number of observed and expected events in a given bin $i$ of $M^2_{\mathrm{miss}}$, respectively. Three different likelihood fits are carried out in this manuscript:
i. Semileptonic $B^+ \to \bar{D}^0 \ell^+ \nu_\ell$ decays are analyzed to determine a calibration factor for the FEI tagging efficiency. The selection and obtained calibration factors are further discussed in Section IV.
ii. The branching fraction of $B^+ \to \pi^0 \ell^+ \nu_\ell$ events is determined as a cross check of the FEI calibration procedure, cf. Section IV.
iii. The $B^+ \to \ell^+ \nu_\ell \gamma$ signal events are analyzed using a simultaneous fit to the $B^+ \to \ell^+ \nu_\ell \gamma$ and $B^+ \to \pi^0 \ell^+ \nu_\ell$ channels, with the global likelihood

$$\mathcal{L} = \prod_{c} \mathcal{L}_c \times \prod_{k} G(\theta_k), \tag{8}$$

with $c$ denoting the reconstructed event type corresponding to the four categories defined by the $B^+ \to e^+ \nu_e \gamma$, $B^+ \to \mu^+ \nu_\mu \gamma$, $B^+ \to \pi^0 e^+ \nu_e$ and $B^+ \to \pi^0 \mu^+ \nu_\mu$ channels. Further, $G(\theta_k)$ denotes the standard normal distribution for nuisance parameters $\theta_k$, which incorporate systematic uncertainties into the likelihood function. The various systematic uncertainties are further discussed in Section V.
The expected number of events in a given bin $i$ of the $M^2_{\mathrm{miss}}$ distribution and in a given category is constructed as

$$\nu_i = \sum_{j} \nu_j f_{ij}, \tag{9}$$

with $\nu_j$ the total number of events of type $j$ and $f_{ij}$ denoting the expected fraction of events of type $j$ in the $i$th bin. The fractions $f_{ij}$ are obtained from the MC simulation and the event types for the $B^+ \to \bar{D}^0 \ell^+ \nu_\ell$ and $B^+ \to \pi^0 \ell^+ \nu_\ell$ fits are further detailed in Section IV. For the search for the rare $B^+ \to \ell^+ \nu_\ell \gamma$ decay the yields of four event types are used as free parameters in the fit:
i. $B^+ \to \ell^+ \nu_\ell \gamma$ signal events.
iv. Other B meson or continuum background events.
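The template construction and the single-channel Poisson likelihood described in this section can be sketched as follows. The example assumes that each event type is described by a list of per-bin fractions summing to one; the function names and the data layout are hypothetical, and nuisance parameters are left out.

```python
import math

def expected_counts(yields, fractions):
    """nu_i = sum_j nu_j * f_ij, with fractions[j][i] the fraction of type j in bin i."""
    n_bins = len(fractions[0])
    return [sum(y * f[i] for y, f in zip(yields, fractions)) for i in range(n_bins)]

def poisson_nll(observed, expected):
    """Negative log of the product of Poisson terms over all bins."""
    return -sum(
        n * math.log(nu) - nu - math.lgamma(n + 1)
        for n, nu in zip(observed, expected)
    )
```

A fit then minimizes `poisson_nll` with respect to the yields; for a simultaneous fit the per-channel terms are summed, with shared yields entering several channels.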
The $B^+ \to \pi^0 \ell^+ \nu_\ell$ normalization mode is linked between the $B^+ \to \ell^+ \nu_\ell \gamma$ and $B^+ \to \pi^0 \ell^+ \nu_\ell$ categories and the global likelihood function $\mathcal{L}$ is maximized to determine the estimates for the number of signal events. Confidence intervals are constructed using the profile likelihood method,

$$\lambda(\nu_j) = \frac{\mathcal{L}(\nu_j, \hat{\boldsymbol{\nu}}_{\nu_j}, \hat{\boldsymbol{\theta}}_{\nu_j})}{\mathcal{L}(\hat{\nu}_j, \hat{\boldsymbol{\nu}}, \hat{\boldsymbol{\theta}})}, \tag{10}$$

where $\hat{\nu}_j$, $\hat{\boldsymbol{\nu}}$ and $\hat{\boldsymbol{\theta}}$ are the values of the normalization of interest, the remaining normalizations, and the nuisance parameters that unconditionally maximize the likelihood function, while $\hat{\boldsymbol{\nu}}_{\nu_j}$ and $\hat{\boldsymbol{\theta}}_{\nu_j}$ are the values of the other normalizations and nuisance parameters which maximize the likelihood under the condition that the parameter of interest is kept fixed at a given value $\nu_j$. In the asymptotic limit, approximate confidence intervals (CI) can be constructed using

$$1 - \mathrm{CL} = \int_{-2\ln\lambda(\nu_j)}^{\infty} f_{\chi^2}(x; 1\,\mathrm{dof})\, dx, \tag{11}$$

with $f_{\chi^2}(x; 1\,\mathrm{dof})$ denoting the $\chi^2$ distribution with one degree of freedom. In the case of two parameters of interest, a two-dimensional confidence level (CL) can be constructed via Eq. 11 by modifying $\lambda(\nu_j)$ to a likelihood ratio depending on two parameters, $\lambda(\nu_j, \nu_k)$, with $f_{\chi^2}$ correspondingly having two degrees of freedom.
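In the asymptotic limit, the conversion from a likelihood-ratio value to a confidence level has a closed form for one and two degrees of freedom. The sketch below takes the value of $-2\ln\lambda$ as input; the function name is hypothetical.

```python
import math

def confidence_level(minus_two_ln_lambda, dof=1):
    """Asymptotic CL enclosed by a given -2 ln(lambda), via the chi2 CDF."""
    x = minus_two_ln_lambda
    if x < 0:
        raise ValueError("-2 ln(lambda) must be non-negative")
    if dof == 1:
        return math.erf(math.sqrt(x / 2.0))   # chi2 CDF with one degree of freedom
    if dof == 2:
        return 1.0 - math.exp(-x / 2.0)       # chi2 CDF with two degrees of freedom
    raise ValueError("only dof = 1 or 2 are implemented in this sketch")
```

A scan value of $-2\ln\lambda = 1$ therefore bounds the 68.3% interval for one parameter, while the corresponding two-parameter contour sits near 2.30.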
If no significant signal is observed, we set a Bayesian limit by converting the likelihood Eq. 8, $\mathcal{L} = \mathcal{L}(n | \nu_j)$, into a probability density function $F$ of the parameter of interest $\nu_j$ using a flat prior $\pi(\nu_j)$ such that

$$F(\nu_j | n) = \frac{\mathcal{L}(n | \nu_j)\, \pi(\nu_j)}{\int_{-\infty}^{\infty} \mathcal{L}(n | \nu_j')\, \pi(\nu_j')\, d\nu_j'}, \tag{12}$$

with $\pi(\nu_j) = \mathrm{constant}$ for $\nu_j > 0$ and zero otherwise. In Eq. 12, $n$ denotes the vector of observed event yields in the given bins in all channels. The fit procedure was validated using ensembles of pseudoexperiments generated with different input branching fractions for $B^+ \to \ell^+ \nu_\ell \gamma$ and $B^+ \to \pi^0 \ell^+ \nu_\ell$ decays. No biases or undercoverage of the CI are observed.
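The flat-prior construction can be illustrated with a much simpler likelihood than the one used in the analysis: a single-bin counting experiment with known background. This substitution, the function name and the integration settings are assumptions made to keep the example self-contained.

```python
import math

def bayesian_upper_limit(n_obs, background, cl=0.90, s_max=50.0, n_steps=50_000):
    """Flat-prior (s >= 0) upper limit on the signal yield s for a Poisson count."""
    h = s_max / n_steps
    weights = []
    for i in range(n_steps):
        nu = (i + 0.5) * h + background          # midpoint grid keeps nu > 0
        weights.append(math.exp(n_obs * math.log(nu) - nu - math.lgamma(n_obs + 1)))
    total = sum(weights)                          # posterior normalization
    running = 0.0
    for i, w in enumerate(weights):
        running += w
        if running >= cl * total:                 # first point where the CDF reaches cl
            return (i + 1) * h
    return s_max
```

With zero observed events and no background the posterior is a pure exponential and the 90% limit is $\ln 10 \approx 2.30$ events, a convenient sanity check.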

IV. HADRONIC TAGGING EFFICIENCY CALIBRATION AND B
The multivariate classifiers that enter the hadronic tagging algorithm of the FEI are trained on simulated events. Due to imperfections in these simulations, e.g. inadequate modeling of hadronic decays or experimentally poorly constrained branching fractions, the tagging algorithm performs differently on simulated and recorded events. The tagging algorithm therefore has to be calibrated on data using well-known processes. In this analysis, such a calibration is derived using three semileptonic $B$ channels with different multiplicities, where $\ell = e, \mu$. The tag-side selection is identical to that of the nominal analysis. The signal side selects electron and muon candidates using the criteria detailed in Section II. Charged pion tracks are required to originate near the IP with $|dr| < 2$ cm and $|dz| < 4$ cm. Charged kaons and pions are separated using a likelihood ratio, $\mathcal{P}_{K\pi}$, which combines the relevant information from the ACC, TOF and CDC subdetectors. We require $\mathcal{P}_{K\pi} < 0.4$ for pion and $\mathcal{P}_{K\pi} > 0.6$ for kaon candidates, respectively. Neutral pions are reconstructed from the combination of two photons with an invariant mass of $M_{\gamma\gamma} \in (117.8, 152)$ MeV. [...] $1.854, 1.872)$ GeV for the three channels i. to iii., respectively. Additional loose cuts are applied on the beam-constrained mass of the $B_{\mathrm{sig}}$ candidate (reconstructed from the $D^0$ and the lepton), $M_{\mathrm{bc}} > 4.5$ GeV, and on the cosine of the angle between the true $B$ meson (calculated from beam energy and momentum) and the reconstructed $D\ell$ system, $|\cos\theta_{B, D\ell}| < 3.0$. An unconstrained vertex fit is applied to the $D$ and $B$ candidates, and candidates with a fit $p$-value of $p_{\chi^2} > 0.01$ are retained.
The tag efficiency calibration factor is calculated by extracting the number of signal decays in data and comparing it to the number of events expected from the MC simulation. The signal yield is determined using a binned maximum likelihood fit (cf. Section III) of the $M^2_{\mathrm{miss}}$ distribution, reconstructed as

$$M^2_{\mathrm{miss}} = \left( p_{B_{\mathrm{sig}}} - p_D - p_\ell \right)^2.$$

The obtained calibration factors of the three channels are shown in Fig. 1 and the global calibration factor is found to be $0.825 \pm 0.014\,(\mathrm{stat.}) \pm 0.049\,(\mathrm{syst.})$. To validate this calibration factor, we measure the branching fraction of the $B^+ \to \pi^0 \ell^+ \nu_\ell$ decay and compare it to the current world average. We obtain $\mathcal{B}(B^+ \to \pi^0 \ell^+ \nu_\ell) = (7.8 \pm 0.6\,(\mathrm{stat.})) \times 10^{-5}$, which is in agreement with the average $\mathcal{B}_{\mathrm{PDG}}(B^+ \to \pi^0 \ell^+ \nu_\ell) = (7.80 \pm 0.27) \times 10^{-5}$ of Ref. [26].

V. SYSTEMATIC UNCERTAINTIES
There are several systematic uncertainties that affect the measured yields and partial branching fractions: Table I summarizes the most important sources of uncertainty for the $B^+ \to \ell^+ \nu_\ell \gamma$ and $B^+ \to \pi^0 \ell^+ \nu_\ell$ branching fraction measurements.
The effects of all systematic uncertainties are directly incorporated into the likelihood Eq. 7 via nuisance-parameter-dependent replacements for multiplicative and additive uncertainties, respectively. Nuisance parameters $\theta_k$ are constrained using standard normal distributions $G(\theta_k)$ in Eq. 8 for relative and absolute uncertainties $\epsilon_{ijk}$ of a source $k$ for a component $j$ and a given bin $i$. Systematic and statistical uncertainties are separated from each other using scans of the likelihood contour in which the systematic nuisance parameters are kept fixed at their best fit value.
The largest multiplicative systematic uncertainty on both branching fractions stems from the uncertainty on the tagging calibration (see the previous section). It is evaluated by shifting the central value of the combined correction factor according to its statistical and systematic uncertainty. This results in a relative uncertainty of 6.2%. The second largest uncertainty for $B^+ \to \pi^0 \ell^+ \nu_\ell$ is given by the statistical uncertainty on the signal reconstruction efficiency, which is evaluated using binomial uncertainties, following the prescription of Ref. [33]. Another large multiplicative uncertainty stems from the $\mathcal{L}_{\mathrm{LID}}$ efficiency, which is corrected in the simulation using data-driven methods. The statistical and systematic uncertainties on these correction factors are propagated and result in an uncertainty of 1.81% and 1.97% for $\Delta\mathcal{B}(B^+ \to \ell^+ \nu_\ell \gamma)$ and $\mathcal{B}(B^+ \to \pi^0 \ell^+ \nu_\ell)$, respectively. The remaining two multiplicative uncertainties are from the number of $B\bar{B}$ pairs, used to convert the measured yield into (partial) branching fractions, and from differences in the charged-track reconstruction efficiency between the simulation and recorded collisions. The tracking efficiency differences are studied using $D^* \to D^0 \pi$ decays with $D^0 \to \pi\pi K^0_S$ and $K^0_S \to \pi^+ \pi^-$. The uncertainty on $N_{B\bar{B}}$ results in a relative error of 1.37%, and for the tracking efficiency an uncertainty of 0.35% for the single signal-side track is found. The largest additive systematic uncertainty for the $B^+ \to \ell^+ \nu_\ell \gamma$ partial branching fraction measurement stems from the systematic uncertainty assigned to the multivariate method that suppresses peaking background contributions. This uncertainty is evaluated by reweighting the MC samples to the distribution of the input variables used for the classification on data. The distribution which gives the largest deviation from the nominal result is used to estimate the uncertainty.
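The quoted 6.2% can be cross-checked from the global calibration factor of Section IV, $0.825 \pm 0.014\,(\mathrm{stat.}) \pm 0.049\,(\mathrm{syst.})$. The sketch assumes that the two components are combined in quadrature, which is an assumption of this example rather than a statement from the text; the function name is hypothetical.

```python
import math

def relative_uncertainty(value, stat, syst):
    """Relative uncertainty with stat. and syst. parts added in quadrature (assumed)."""
    return math.hypot(stat, syst) / value

calibration_rel = relative_uncertainty(0.825, 0.014, 0.049)
calibration_percent = round(100.0 * calibration_rel, 1)
```

Under this assumption the combined uncertainty is about 0.051 on 0.825, which rounds to the 6.2% quoted above.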
The second largest additive uncertainty for the $B^+ \to \ell^+ \nu_\ell \gamma$ partial branching fraction measurement is due to limited MC statistics. The uncertainty is evaluated for each MC sample individually by producing a large ensemble of templates, where the numbers of entries are varied using a Poisson distribution. The templates of the ensemble are used to repeat the fit to estimate the total uncertainty. The largest additive systematic uncertainty for the $B^+ \to \pi^0 \ell^+ \nu_\ell$ branching fraction is given by the uncertainty on the BCL form factors and is evaluated by variations using the covariance matrix from the global fit of Ref. [22].
The remaining additive uncertainties on both channels are evaluated as follows: The fraction of the individual channels in which the $B_{\mathrm{tag}}$ is reconstructed differs between MC and data. To estimate the impact of this mismatch, the MC samples are corrected to the fraction of the reconstructed tag channels in data and the difference is taken as an estimate of the systematic uncertainty. In the fit, the individual branching fractions of charmless semileptonic background decay modes are kept fixed and modeled as a single floating background template. To estimate uncertainties due to slight shape differences in $M^2_{\mathrm{miss}}$ from these templates, we vary the decay branching fractions of $B^+ \to \omega \ell^+ \nu_\ell$, $B^+ \to \rho^0 \ell^+ \nu_\ell$, $B^0 \to \rho^- \ell^+ \nu_\ell$, $B^+ \to \eta \ell^+ \nu_\ell$, $B^+ \to \eta' \ell^+ \nu_\ell$, and $B^0 \to \pi^- \ell^+ \nu_\ell$ individually within their uncertainties [26]. The uncertainty on the $B^+ \to \ell^+ \nu_\ell \gamma$ signal model is estimated by correcting the simulated events from the prediction of Ref. [20] to the state-of-the-art prediction of Ref. [5] and repeating the fit.
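The template variation used for the MC statistics uncertainty can be sketched with the standard library alone. The Poisson sampler below uses Knuth's multiplication method, which is adequate for bin contents up to a few hundred entries; the function names, the seed and the example bin contents are assumptions for illustration.

```python
import math
import random

def poisson_sample(rng, mean):
    """Draw one Poisson-distributed integer (Knuth's method, small means only)."""
    limit = math.exp(-mean)
    k = 0
    product = rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k

def fluctuate_template(rng, counts):
    """Vary every bin of an MC template independently within its Poisson uncertainty."""
    return [poisson_sample(rng, c) for c in counts]

def template_ensemble(counts, n_toys, seed=1234):
    """Build an ensemble of fluctuated templates for repeated fits."""
    rng = random.Random(seed)
    return [fluctuate_template(rng, counts) for _ in range(n_toys)]
```

Repeating the fit once per fluctuated template and taking the spread of the fitted yields gives the uncertainty attributable to the finite size of that MC sample.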
VI. RESULTS Figure 2 shows the M 2 miss distribution of the selected data events in the four categories of B + → e + ν e γ, B + → µ + ν µ γ, B + → π 0 e + ν e , and B + → π 0 µ + ν µ . The selected events are used to maximize the likelihood function Eq. 7 numerically, determining the four (B + → + ν γ) and The ellipses correspond to the given confidence level, including systematic uncertainties. Plot (b) shows the one-dimensional likelihood contour and its conversion into a Bayesian PDF F(ν j |n) using a flat prior for the B + → + ν γ measurement, see Section III for details. three (B + → π 0 + ν ) event types detailed in Section III.
The fitted $B^+ \to \ell^+ \nu_\ell \gamma$ signal, the $B^+ \to \pi^0 \ell^+ \nu_\ell$ normalization, and the other background contributions are shown as colored histograms, and the summed signal plus background template is shown as a filled gray histogram. The observed partial branching fraction of $B^+ \to \ell^+ \nu_\ell \gamma$ with $E_\gamma > 1$ GeV is

$$\Delta\mathcal{B}(B^+ \to \ell^+ \nu_\ell \gamma) = (1.4 \pm 1.0 \pm 0.4) \times 10^{-6},$$

where the first error is statistical and the second contains all systematic uncertainties discussed in Section V. The significance over the background-only hypothesis for the $B^+ \to \ell^+ \nu_\ell \gamma$ signal, as calculated using the likelihood ratio, is 1.4 standard deviations. The $B^+ \to \pi^0 \ell^+ \nu_\ell$ branching fraction is found to be $\mathcal{B}(B^+ \to \pi^0 \ell^+ \nu_\ell) = (7.9 \pm 0.6 \pm 0.6) \times 10^{-5}$, and has better statistical precision than the measurement of Ref. [34].$^{1}$ A summary of all fit results, including fits of the individual electron and muon samples, is presented in Table II. Figure 3a shows the two-dimensional likelihood ratio contours of $-2\lambda$ (see Eq. 10) for both branching fractions. The correlation between $\Delta\mathcal{B}(B^+ \to \ell^+ \nu_\ell \gamma)$ and $\mathcal{B}(B^+ \to \pi^0 \ell^+ \nu_\ell)$ is found to be $\rho = -2.7\%$.

Due to the low significance of the measured $B^+ \to \ell^+ \nu_\ell \gamma$ signal, we convert the likelihood into a Bayesian probability density function (PDF), following the procedure detailed in Section III. Figure 3b shows the one-dimensional likelihood ratio scan and the resulting Bayesian PDF, which was obtained using a flat prior in the partial branching fraction. The resulting limit for $B^+ \to \ell^+ \nu_\ell \gamma$ at 90% CL is

$$\Delta\mathcal{B}(B^+ \to \ell^+ \nu_\ell \gamma) < 3.0 \times 10^{-6}.$$

This limit is significantly more stringent than those of previous searches. A summary of previous limits and of the individual limits for the electron and muon signal channels can be found in Table III.

$^{1}$ The statistical overlap with the previous measurement is unknown. Since the current result is not measured in bins of $q^2$, the previous result should still be used for the determination of $|V_{ub}|$ and for world averages of the branching fraction.
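The conversion of a likelihood scan into a significance and a Bayesian upper limit can be illustrated with a minimal Python sketch. It is not the analysis code: the function names are hypothetical, and the scan is a toy parabola built from the quoted central value and the quadrature sum of the statistical and systematic uncertainties (a Gaussian approximation that is an assumption here, since the actual likelihood of Eq. 7 is not reproduced). The flat prior is restricted to non-negative branching fractions.

```python
import numpy as np

def significance(x, neg2_log_lr):
    """Significance (in standard deviations) over the zero-signal point x = 0."""
    x = np.asarray(x, dtype=float)
    q = np.asarray(neg2_log_lr, dtype=float)
    q_at_zero = np.interp(0.0, x, q)  # -2 ln(lambda) at zero signal
    return float(np.sqrt(max(q_at_zero - q.min(), 0.0)))

def bayesian_upper_limit(x, neg2_log_lr, cl=0.90):
    """Upper limit at credibility `cl`, using a flat prior on x >= 0."""
    x = np.asarray(x, dtype=float)
    q = np.asarray(neg2_log_lr, dtype=float)
    like = np.exp(-0.5 * (q - q.min()))   # likelihood up to a constant factor
    like = np.where(x >= 0.0, like, 0.0)  # flat prior, physical region only
    # Cumulative trapezoidal integral of the unnormalized posterior.
    steps = 0.5 * (like[1:] + like[:-1]) * np.diff(x)
    cdf = np.concatenate(([0.0], np.cumsum(steps)))
    cdf /= cdf[-1]
    return float(np.interp(cl, cdf, x))

# Toy scan in units of 1e-6 (ASSUMPTION: Gaussian likelihood around 1.4
# with stat. and syst. uncertainties added in quadrature).
sigma_total = float(np.hypot(1.0, 0.4))
scan_x = np.linspace(0.0, 10.0, 10001)
scan_q = ((scan_x - 1.4) / sigma_total) ** 2

z_toy = significance(scan_x, scan_q)
limit_toy = bayesian_upper_limit(scan_x, scan_q)
print(f"toy significance: {z_toy:.2f} sigma")
print(f"toy 90% upper limit: {limit_toy:.2f} x 1e-6")
```

With these toy inputs the sketch gives a significance of about 1.3 standard deviations and a limit of roughly $2.8 \times 10^{-6}$. These are close to, but not identical with, the quoted 1.4 standard deviations and $3.0 \times 10^{-6}$; the difference presumably reflects the non-Gaussian shape of the real likelihood.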
Using the $B^+ \to \ell^+ \nu_\ell \gamma$ and $B^+ \to \pi^0 \ell^+ \nu_\ell$ branching fractions, the first inverse moment $\lambda_B$ of the leading-twist $B$ meson light-cone distribution amplitude $\phi_+$ can be determined. Instead of directly using the measured $B^+ \to \ell^+ \nu_\ell \gamma$ partial branching fraction, we normalize it to the theoretically well understood $B^+ \to \pi^0 \ell^+ \nu_\ell$ decay rate, which yields a measurement of $\lambda_B$ that is independent of $V_{ub}$. The value of $\lambda_B$ is related to this ratio as

$$R_\pi = \frac{\Delta\mathcal{B}(B^+ \to \ell^+ \nu_\ell \gamma)}{\mathcal{B}(B^+ \to \pi^0 \ell^+ \nu_\ell)} = \frac{\Delta\Gamma(\lambda_B)}{\Gamma(B^+ \to \pi^0 \ell^+ \nu_\ell)}, \tag{19}$$

with $\Delta\Gamma(\lambda_B)$ denoting the partial decay rate as a function of $\lambda_B$ for $E_\gamma > 1$ GeV, and $\Gamma(B^+ \to \pi^0 \ell^+ \nu_\ell)$ denoting the total decay rate of $B^+ \to \pi^0 \ell^+ \nu_\ell$. Using the central values and the full experimental covariance we measure [equation not recovered].

For the prediction of the $B^+ \to \pi^0 \ell^+ \nu_\ell$ decay rate, we use the global fit of Ref. [22]. For the partial $B^+ \to \ell^+ \nu_\ell \gamma$ decay rate, the predictions and uncertainties of Ref. [7], extrapolated to $E_\gamma > 1$ GeV, are used. In Ref. [7] three different models are used to evaluate the dependence of the partial decay rate on the functional form of the light-cone distribution amplitude. Figure 4 shows the predicted and measured $R_\pi$ ratio as a function of $\lambda_B$. We solve Eq. 19 numerically; Table IV gives the value of $\lambda_B$ determined for each of the three models, including the corresponding theoretical uncertainties of Ref. [7]. We also quote a single value of $\lambda_B$ whose uncertainty incorporates the overall model dependence, estimated from the shift in the central value among the three models. For this we find [equation not recovered], where the first uncertainty is experimental, the second stems from the theoretical uncertainty on the $B^+ \to \ell^+ \nu_\ell \gamma$ prediction of Ref. [7] and the $B^+ \to \pi^0 \ell^+ \nu_\ell$ uncertainty from Ref. [22], and the third is due to the light-cone distribution amplitude model dependence. We further obtain a one-sided limit of

$$\lambda_B > 0.24 \,\mathrm{GeV} \tag{22}$$

at 90% CL. Note that these estimates might suffer from additional uncertainties from the extrapolation to $E_\gamma > 1$ GeV. Further details can be found in Ref. [7].

[TABLE III: summary of previous limits and of the individual electron and muon channel limits, with columns for BaBar [35], Belle [11], and this work; table body not recovered.]
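As a rough cross-check of the ratio of the two branching fractions, the sketch below forms the ratio from the central values quoted in this section and propagates the uncertainties to first order. Several points are assumptions rather than the procedure of the analysis: the statistical and systematic uncertainties are added in quadrature, the quoted correlation $\rho = -2.7\%$ is applied to these total uncertainties, and linear error propagation is used even though the relative uncertainty of the numerator is large. The function name is hypothetical.

```python
import math

def ratio_with_uncertainty(a, sigma_a, b, sigma_b, rho):
    """Return R = a / b and its first-order uncertainty for correlated inputs."""
    r = a / b
    rel_a = sigma_a / a
    rel_b = sigma_b / b
    # Linear propagation: (sigma_R / R)^2 = rel_a^2 + rel_b^2 - 2 rho rel_a rel_b
    rel_var = rel_a**2 + rel_b**2 - 2.0 * rho * rel_a * rel_b
    return r, abs(r) * math.sqrt(rel_var)

# Central values and uncertainties as quoted in the text.
delta_b = 1.4e-6
sigma_delta_b = math.hypot(1.0e-6, 0.4e-6)  # ASSUMPTION: stat (+) syst in quadrature
b_pi = 7.9e-5
sigma_b_pi = math.hypot(0.6e-5, 0.6e-5)     # ASSUMPTION: stat (+) syst in quadrature
rho = -0.027                                # ASSUMPTION: applies to total uncertainties

r_pi, sigma_r_pi = ratio_with_uncertainty(delta_b, sigma_delta_b, b_pi, sigma_b_pi, rho)
print(f"ratio = {r_pi:.4f} +/- {sigma_r_pi:.4f}")
```

Because the uncertainty of the ratio is dominated by the signal branching fraction, the small correlation has little effect in this approximation. Determining $\lambda_B$ from the ratio additionally requires the theory prediction $\Delta\Gamma(\lambda_B)$ of Ref. [7], which is not reproduced here.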

VII. SUMMARY
In this manuscript, an improved search for the radiative leptonic decay $B^+ \to \ell^+ \nu_\ell \gamma$ on the full Belle data set recorded at the $\Upsilon(4S)$ resonance is presented. The results improve on the previous analysis by our collaboration and increase the signal efficiency by a factor of three. In addition, the description of the important $B^+ \to \pi^0 \ell^+ \nu_\ell$ background was improved by simultaneously analyzing $B^+ \to \pi^0 \ell^+ \nu_\ell$ signal events and by using the global fit result of Ref. [22] to describe its form factors. The large improvement in sensitivity stems from employing the Full Event Interpretation [12], a tagging algorithm newly developed for the Belle II experiment. Although this drastically improves the sensitivity, no significant signal of $B^+ \to \ell^+ \nu_\ell \gamma$ decays is observed. As it is not possible to determine the statistical overlap with the previous Belle result, this work supersedes Ref. [11].

[TABLE IV caption: The determined values of $\lambda_B$ using the predictions of Ref. [7] are given. A detailed description of the three approaches to model the functional form of the light-cone distribution amplitude (LCDA) can be found in Ref. [7]. The first uncertainty is experimental and the second is from theory. Table body not recovered.]

[FIG. 4 caption, partially recovered: The prediction from Refs. [7] and [22] (red line with 1σ uncertainties) for $R_\pi$ is compared to the measured value and 1σ uncertainty (blue dashed line and band). The dark red band shows the theoretical uncertainty; the light red band additionally contains the light-cone distribution amplitude model dependence.]
The partial branching fraction for $B^+ \to \ell^+ \nu_\ell \gamma$ decays with photon energies $E_\gamma > 1$ GeV in the $B_{\mathrm{sig}}$ rest frame is found to be

$$\Delta\mathcal{B}(B^+ \to \ell^+ \nu_\ell \gamma) = (1.4 \pm 1.0 \pm 0.4) \times 10^{-6}, \tag{23}$$

with a significance of 1.4 standard deviations over the background-only hypothesis. Using the likelihood contour and a flat prior, we determine a Bayesian upper limit of

$$\Delta\mathcal{B}(B^+ \to \ell^+ \nu_\ell \gamma) < 3.0 \times 10^{-6}$$

at 90% confidence level. In addition, we report an improved determination of the first inverse moment $\lambda_B$ of the light-cone distribution amplitude of the $B$ meson. This is done in a $V_{ub}$-independent way, by normalizing the measured partial branching fraction to the branching fraction of $B^+ \to \pi^0 \ell^+ \nu_\ell$. This reduces the experimental uncertainties, and the theoretical prediction of the total decay width of $B^+ \to \pi^0 \ell^+ \nu_\ell$ is well understood.
Using the result of Ref. [7], its associated uncertainties, and an additional uncertainty to assess the model dependence, we obtain

$$\lambda_B = 0.36^{+0.25}_{-0.09} \,\mathrm{GeV} \tag{25}$$

or $\lambda_B > 0.24$ GeV at 90% CL. The search for $B^+ \to \ell^+ \nu_\ell \gamma$ is limited by the available data set, and its sensitivity will be greatly enhanced by the upcoming Belle II experiment. The anticipated data set of 50 ab$^{-1}$ will greatly reduce the experimental uncertainties on $\lambda_B$. Together with recent developments in lattice QCD calculations that reduce the theoretical uncertainties on $\lambda_B$, this makes $B^+ \to \ell^+ \nu_\ell \gamma$ increasingly feasible as an alternative channel for measuring $|V_{ub}|$, providing a consistency check of the SM.
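To give a sense of scale for the expected gain from the larger data set, the sketch below applies a naive luminosity scaling to the statistical uncertainty of the partial branching fraction. This is only a back-of-the-envelope estimate and not a projection from this analysis: it assumes that the statistical uncertainty scales as the inverse square root of the integrated luminosity, with unchanged selection efficiency and background levels, and it ignores systematic uncertainties entirely. The function name is hypothetical.

```python
import math

def scaled_stat_uncertainty(sigma_stat, lumi_now, lumi_future):
    """Scale a statistical uncertainty with 1/sqrt(integrated luminosity)."""
    return sigma_stat * math.sqrt(lumi_now / lumi_future)

belle_lumi_fb = 711.0        # fb^-1, data set used in this search
belle2_lumi_fb = 50_000.0    # fb^-1, i.e. the anticipated 50 ab^-1
sigma_stat_now = 1.0e-6      # statistical uncertainty quoted in Eq. 23

# ASSUMPTION: same efficiency and background; systematics not included.
sigma_stat_proj = scaled_stat_uncertainty(sigma_stat_now, belle_lumi_fb, belle2_lumi_fb)
print(f"naively scaled statistical uncertainty: {sigma_stat_proj:.2e}")
```

Under these assumptions the statistical uncertainty would shrink by a factor of about eight, so that the systematic uncertainties would have to be reduced as well in order to benefit fully from the larger data set.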