Measurement of the branching fraction of $B \rightarrow D^{(*)}\pi \ell\nu$ at Belle using hadronic tagging in fully reconstructed events

We report a measurement of the branching fraction of the decay $B \rightarrow D^{(*)}\pi \ell\nu$. The analysis uses 772$\times 10^6$ $B\bar{B}$ pairs produced in $e^+e^-\rightarrow \Upsilon(4S)$ data recorded by the Belle experiment at the KEKB asymmetric-energy $e^+e^-$ collider. The tagging $B$ meson in the decay is fully reconstructed in a hadronic decay mode. On the signal side, we reconstruct the decay $B \rightarrow D^{(*)}\pi \ell\nu$ $(\ell=e,\mu)$. The measured branching fractions are $\mathcal{B}(B^+ \rightarrow D^-\pi^+ \ell^+\nu)$ = [4.55 $\pm$ 0.27 (stat.) $\pm$ 0.39 (syst.)]$\times 10^{-3}$, $\mathcal{B}(B^0 \rightarrow \bar{D}^0\pi^- \ell^+\nu)$ = [4.05 $\pm$ 0.36 (stat.) $\pm$ 0.41 (syst.)]$\times 10^{-3}$, $\mathcal{B}(B^+ \rightarrow D^{*-}\pi^+ \ell^+\nu)$ = [6.03 $\pm$ 0.43 (stat.) $\pm$ 0.38 (syst.)]$\times 10^{-3}$, and $\mathcal{B}(B^0 \rightarrow \bar{D}^{*0}\pi^- \ell^+\nu)$ = [6.46 $\pm$ 0.53 (stat.) $\pm$ 0.52 (syst.)]$\times 10^{-3}$. These are in good agreement with the current world average values.


I. INTRODUCTION
Semileptonic decays of $B$ mesons are an important tool for precision measurements of CKM matrix elements and precision tests of the electroweak sector of the standard model. An important recent development was the observation of a more than 3$\sigma$ deviation between the standard model expectation for $R(D^{(*)})$ [1,2] and the combined experimental results from BaBar [3,4], Belle [5-7], and LHCb [8,9]. Here, $R(D^{(*)})$ is defined as the ratio of the branching fractions ($\mathcal{B}$) of $B \rightarrow D^{(*)}\tau\nu$ and $B \rightarrow D^{(*)}\ell\nu$ $(\ell = e, \mu)$. We report a new measurement of $B \rightarrow D^{(*)}\pi \ell\nu$, which is important as a background for $B \rightarrow D^{(*)}\tau\nu$ decays and, in its own right, as a vehicle to understand high-multiplicity semileptonic $B$ decays. The process $B \rightarrow D^{(*)}\pi \ell\nu$ proceeds predominantly via $B \rightarrow (D^{**} \rightarrow D^{(*)}\pi) \ell\nu$, where $D^{**}$ is an orbitally excited ($L = 1$) charmed meson. The $D^{**}$ mass spectrum contains two doublets of states having light-quark total angular momentum $j_q = \frac{1}{2}$ and $j_q = \frac{3}{2}$ [10]. All states can decay via $D^{**} \rightarrow D^{*}\pi$, while the $2^+$ state can also decay via $D^{**} \rightarrow D\pi$. Since the $D^{**}$ masses are not far from threshold, and the $j_q = \frac{3}{2}$ states have a significant D-wave component, these states are narrow and were observed with a typical width of about 20 MeV [11-13]. On the other hand, the states with $j_q = \frac{1}{2}$ decay mainly via S-wave and are therefore expected to be broad resonances with a width of several hundred MeV [10,14]. Compared to previous measurements of $\mathcal{B}(B \rightarrow D^{(*)}\pi \ell\nu)$ at Belle [11], the analysis presented in this report benefits from the use of the full Belle dataset, containing 772$\times 10^6$ $B\bar{B}$ pairs recorded at the $\Upsilon(4S)$ resonance, an improved hadronic-tagging method, and a direct extraction of the branching fractions using a fit to Monte Carlo templates.

II. EXPERIMENTAL APPARATUS
The Belle experiment [15] at the KEKB storage ring [16] recorded about 1 ab$^{-1}$ of $e^+e^-$ annihilation data. The data were taken mainly at the $\Upsilon(4S)$ resonance at $\sqrt{s} = 10.58$ GeV, but also at the $\Upsilon(1S)$ to $\Upsilon(5S)$ resonances and at $\sqrt{s} = 10.52$ GeV. The Belle instrumentation used in this analysis includes the central drift chamber (CDC) and the silicon vertex detector, which provide precision tracking for tracks in the polar-angle range $17.0^\circ < \theta_{\rm lab} < 150.0^\circ$, and the electromagnetic calorimeter (ECL) covering the same range. The polar angle $\theta_{\rm lab}$ is measured with respect to the $z$-axis, which is anti-parallel to the $e^+$ beam. Charged particle identification is performed using specific ionization measurements in the CDC, time-of-flight information from the interaction point (IP) to a barrel of scintillators, light yield in an array of aerogel Cherenkov counters in the barrel and the forward endcap, as well as a muon and $K^0_L$ identification system in the return yoke of the superconducting solenoid, which provides a 1.5 T magnetic field.

III. ANALYSIS
The analysis strategy is based on fully reconstructing one tagging $B$ meson in a hadronic mode and then, using the rest of the event, reconstructing the signal mode with the exception of the $\nu$, which escapes undetected. Since the rest of the event has been reconstructed, it is possible to infer the invariant mass $M_\nu$ of the escaped neutrino from the kinematic constraints of the initial $e^+e^-$ collision. The distribution of $M_\nu^2$ is then fitted with Belle Monte Carlo (MC) simulation templates to derive the branching fraction of interest. Simulations in this analysis use Pythia [17] and EvtGen [18] for the event generation, and GEANT3 [19] for the detector response. The simulation treats all $B \rightarrow D^{(*)}\pi \ell\nu$ decays as proceeding through a $B \rightarrow D^{**}$ decay, which is simulated using the model of Leibovich-Ligeti-Stewart-Wise (LLSW) [14]. By comparing known processes, we correct the simulation of the detector for the efficiency of the particle identification of charged tracks, $\pi^0$ and $K^0_S$ mesons, as well as for the misidentification probabilities of charged tracks. These corrections depend on the kinematics of the respective particles. We reweight the simulation of the underlying physical processes to account for newly measured values of branching fractions and related parameters. In particular, we use the latest world-average values of $D$ and $B$ meson branching fractions [20] as well as $D^{*}$ [2] and $D^{**}$ form factors [14].

A. Btag selection
A neural-network-based multivariate classifier, as implemented in the NeuroBayes package [21,22], is used to fully reconstruct $B$ mesons that decay hadronically. The algorithm considers 17 final states for charged $B$ candidates and 15 final states for neutral $B$ candidates. Incorporating the subsequent hadronic decays and $J/\psi$ leptonic decays, the algorithm investigates 1104 different decay topologies. The output variable $o_{\rm tag}$ of the algorithm takes a value between 0 and 1, with larger values corresponding to a higher likelihood that a $B$ meson was correctly reconstructed.
We select events with $\log(o_{\rm tag}) > -3.5$. For each $B_{\rm tag}$, we impose a requirement on the difference between the measured center-of-mass (CM) energy and its nominal value, $|\Delta E| = |E_{B_{\rm tag}} - E_{\rm CM}| < 0.18$ GeV, and on the beam-constrained mass, $M_{\rm bc} = \sqrt{(E_{\rm CM}/c^2)^2 - (\vec{P}_{B_{\rm tag}}/c)^2} > 5.27$ GeV/$c^2$. Here, $E_{B_{\rm tag}}$ and $\vec{P}_{B_{\rm tag}}$ are the energy and momentum of the tagged $B$ candidate.
Differences in the tagging efficiency between data and MC have been observed [23]. These depend on the tag-side reconstruction and the value of $o_{\rm tag}$. We use a calibration derived in Ref. [23], which uses a control sample of $B \rightarrow X_c \ell\nu$ decays on the signal side. Using this calibration, we assign an event-by-event weight that depends on the reconstructed $B_{\rm tag}$ decay mode and the value of $o_{\rm tag}$ to equalize the efficiency of the tagging algorithm between data and MC.
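The tag-side requirements can be sketched in a few lines of Python. This is only an illustration under stated assumptions: natural units ($c = 1$) are used, `log` is taken to be the natural logarithm (the text does not specify the base), and the function and argument names are hypothetical rather than taken from the Belle software.

```python
import math

def delta_e(e_btag, e_cm):
    """Delta E = E_Btag - E_CM in GeV, following the definition in the text."""
    return e_btag - e_cm

def beam_constrained_mass(e_cm, p_btag):
    """M_bc = sqrt(E_CM^2 - |p_Btag|^2) in GeV, with c = 1."""
    px, py, pz = p_btag
    m2 = e_cm * e_cm - (px * px + py * py + pz * pz)
    # Guard against small negative values from resolution effects.
    return math.sqrt(max(m2, 0.0))

def passes_tag_selection(o_tag, e_btag, e_cm, p_btag):
    """Apply the three tag-side cuts quoted in the text.

    Assumption: log(o_tag) is the natural logarithm.
    """
    return (
        math.log(o_tag) > -3.5
        and abs(delta_e(e_btag, e_cm)) < 0.18
        and beam_constrained_mass(e_cm, p_btag) > 5.27
    )
```

For example, `passes_tag_selection(0.1, 5.30, 5.29, (0.2, 0.1, 0.2))` evaluates the cuts for a candidate with a small momentum in the CM frame.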

B. B sig reconstruction
Having selected the $B_{\rm tag}$ in this way, the signal side $B_{\rm sig}$ is then reconstructed from the charged tracks and photons in the event that are not part of the $B_{\rm tag}$ decay chain. Charged tracks are identified using the Belle particle identification (PID) [24]. We accept electrons in the laboratory-frame polar-angle range $17^\circ < \theta_e < 150^\circ$, and muons in the range $25^\circ < \theta_\mu < 145^\circ$, where the relevant subsystems of the Belle PID have acceptance for these particles. To recover energy lost by bremsstrahlung of electrons, we add the 4-vector of the closest $\gamma$ found within $5^\circ$ of an identified electron. Charged tracks that cannot be unambiguously identified are treated as pions. We reconstruct $\pi^0$ candidates from pairs of photons, each of which satisfies a minimum energy requirement of 50 MeV, 75 MeV, or 100 MeV in the barrel ($32^\circ < \theta_\gamma < 130^\circ$), the forward endcap ($17^\circ < \theta_\gamma < 32^\circ$), or the backward endcap ($130^\circ < \theta_\gamma < 150^\circ$), respectively. We require the reconstructed mass to lie in the range 0.12 GeV/$c^2$ $< M_{\gamma\gamma} <$ 0.15 GeV/$c^2$, which corresponds to about five times the measured resolution around the nominal mass. To reduce overlap in the $\pi^0$ candidate list, we sort the candidates according to the energy of the most energetic daughter photon (and then, if needed, the second most energetic daughter) and remove any pion that shares photons with one that appears earlier in this list.
We reconstruct $K^0_S$ mesons from $\pi^+\pi^-$ pairs. We require the two-pion invariant mass to lie in the range 0.482 to 0.514 GeV/$c^2$ (about four times the experimental resolution around the nominal mass [20]). Different selections are applied depending on the momentum of the $K^0_S$ candidate in the laboratory frame [25]. For low ($p < 0.5$ GeV/$c$), medium ($0.5 \le p \le 1.5$ GeV/$c$), and high momentum ($p > 1.5$ GeV/$c$) candidates, we require the impact parameters of the pion daughters in the transverse plane (perpendicular to the beam) to be greater than 0.05 cm, 0.03 cm, and 0.02 cm, respectively. The angle in the transverse plane between the vector from the interaction point to the $K^0_S$ vertex and the $K^0_S$ flight direction is required to be less than 0.3 rad, 0.1 rad, and 0.03 rad for low, medium, and high momentum candidates, respectively; the separation distance along the $z$ axis of the two pion trajectories at their closest approach must be below 0.8 cm, 1.8 cm, and 2.4 cm, respectively. Finally, for medium (high) momentum $K^0_S$ candidates, we require the flight length in the transverse plane to be greater than 0.08 cm (0.22 cm).
Using the reconstructed pions and kaons, we reconstruct $D$ mesons in the channels [...]. Here and throughout this report, the charge-conjugated modes are implied. We require a maximum difference of 3$\sigma$ between the reconstructed mass and the nominal $D$ mass. This corresponds to 15 MeV for all modes except the $D^0 \rightarrow K^-\pi^+\pi^0$ channel, where the corresponding value is 25 MeV. Using the $D$ candidates, we reconstruct $D^*$ mesons in the channels $D^{*0} \rightarrow D^0\pi^0$, $D^{*+} \rightarrow D^+\pi^0$, and $D^{*+} \rightarrow D^0\pi^+$. The maximal difference allowed between the reconstructed mass and the nominal value is 3 MeV, which again corresponds to 3$\sigma$. For both the $D$ and $D^*$ reconstruction, we perform a mass-vertex constrained fit and discard candidates for which this fit fails. We require that there be no additional charged track in the event other than the decay products of the $B_{\rm tag}$, the $D^{(*)}$, the lepton, and the signal's bachelor pion. Furthermore, we require the lepton and bachelor pion to be positively identified. We require that the pion, lepton, and $D^{(*)}$ meson form an overall charge-neutral system with the $B_{\rm tag}$. We also require $M_{D^{(*)}\pi}$ to be larger than 2.05 GeV/$c^2$ and less than 3 GeV/$c^2$.
There is the possibility of signal overlap, i.e., the non-tag final state may be combined into different signal states. This overlap fraction is about 5%. In such cases, we select at most one $B_{\rm sig}$ candidate per event using two criteria. First, we prefer $D^*$ over $D$ in the final state since, otherwise, we would have an extra $\pi^0$ in the event, leading to additional missing energy. Second, we select the $D^{(*)}$ whose reconstructed mass is closer to its nominal value. The requirements described above for $M_{\rm bc}$, $\Delta E$, $o_{\rm tag}$, and $M_{D^{(*)}\pi}$ are determined by maximizing the figure of merit $S/\sqrt{S + B}$ using MC simulation; here, $S$ and $B$ are the signal and background yields, respectively.
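As an illustration of the figure-of-merit optimization, the following Python sketch scans a set of lower thresholds on a single selection variable and keeps the one that maximizes $S/\sqrt{S+B}$. It is a simplified, one-dimensional stand-in for the optimization described above; the function names and the unweighted event counting are assumptions made for this example.

```python
import math

def figure_of_merit(s, b):
    """S / sqrt(S + B); defined as 0 when no events are selected."""
    total = s + b
    return s / math.sqrt(total) if total > 0 else 0.0

def best_cut(signal_values, background_values, thresholds):
    """Scan cuts of the form 'x > t' on simulated signal and background.

    Returns (threshold, figure of merit) for the best threshold.
    """
    best = None
    for t in thresholds:
        s = sum(1 for x in signal_values if x > t)
        b = sum(1 for x in background_values if x > t)
        fom = figure_of_merit(s, b)
        if best is None or fom > best[1]:
            best = (t, fom)
    return best
```

In a realistic setting the counts would be replaced by sums of MC event weights, and several variables would be optimized together.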

C. Extraction of the branching fraction
The branching fractions are determined by fitting the distribution of the missing mass squared, $M_\nu^2 = (p_{e^+} + p_{e^-} - p_{B_{\rm tag}} - p_{D^{(*)}} - p_{\pi} - p_{\ell})^2$. Here, $(p_{e^+} + p_{e^-})$ is the sum of the four-momenta of the colliding beam particles and the other terms are the four-momenta of the indicated final-state particles. We fit the spectrum with probability density function (PDF) templates derived from simulation to extract the yields; we then determine $\mathcal{B}$ using the ratios of the fitted yields to those in the original MC and the branching fractions used in the MC.
The agreement of the simulations with data is checked by comparing the sidebands and the signal region for events that were discarded for failing to form a charge-neutral system. The reduced $\chi^2$ obtained from comparing data and MC in these tests is 1.02, showing that the agreement of data and MC is good.
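The missing-mass-squared variable is the Minkowski square of the beam four-momentum minus the four-momenta of all reconstructed particles. A minimal Python sketch, assuming four-vectors stored as `(E, px, py, pz)` tuples in GeV with $c = 1$ (the function names are illustrative, not from the analysis code):

```python
def minkowski_square(p):
    """Return E^2 - |p|^2 for a four-vector (E, px, py, pz)."""
    e, px, py, pz = p
    return e * e - px * px - py * py - pz * pz

def missing_mass_squared(p_beams, visible):
    """Squared invariant mass of everything that was not reconstructed.

    p_beams: summed four-momentum of the colliding e+ and e-.
    visible: four-momenta of B_tag, D(*), bachelor pion, and lepton.
    """
    missing = list(p_beams)
    for p in visible:
        for i in range(4):
            missing[i] -= p[i]
    return minkowski_square(missing)
```

For a correctly reconstructed signal event with a single undetected neutrino, the result peaks near zero, which is what makes this variable useful for the template fit.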
For the channels $B^+ \rightarrow D^{(*)-}\pi^+ \ell^+\nu$ and $B^0 \rightarrow \bar{D}^{(*)0}\pi^- \ell^+\nu$, we consider the following components in the MC: [...]. Since $B \rightarrow D^{*}\pi \ell\nu$ also contributes as feed-down to $B \rightarrow D\pi \ell\nu$ with a known ratio, we fit the $B \rightarrow D\pi \ell\nu$ and $B \rightarrow D^{*}\pi \ell\nu$ channels simultaneously. Charged and neutral $B$ channels are fitted separately.
The simulation sample corresponds to five times the integrated luminosity of the data. With the given statistics, not all templates can be determined precisely enough for a stable fit. We therefore float only the $B \rightarrow D\pi \ell\nu$, the $B \rightarrow D^{*}\pi \ell\nu$, and the continuum yields. The contribution from "other $B\bar{B}$" is not small; however, its shape is very similar to that of the continuum contribution and, given the agreement of the data and simulation in the sidebands, it is reasonable to fix this contribution to the MC prediction. We use a binned extended maximum likelihood fit to extract the yields. The range of the fit is [...] (GeV/$c^2$)$^2$ with 140 bins for the $B^+ \rightarrow D^-\pi^+ \ell^+\nu$ and $B^0 \rightarrow \bar{D}^0\pi^- \ell^+\nu$ channels. For the $B^+ \rightarrow D^{*-}\pi^+ \ell^+\nu$ and $B^0 \rightarrow \bar{D}^{*0}\pi^- \ell^+\nu$ channels, we use a range of $-0.3$ (GeV/$c^2$)$^2$ $< M_\nu^2 <$ 0.6 (GeV/$c^2$)$^2$ with 54 bins. In the given $M_\nu^2$ ranges, we select 1566, 438, 3750, and 87 candidates for the $B^+ \rightarrow D^-\pi^+ \ell^+\nu$, $B^+ \rightarrow D^{*-}\pi^+ \ell^+\nu$, $B^0 \rightarrow \bar{D}^0\pi^- \ell^+\nu$, and $B^0 \rightarrow \bar{D}^{*0}\pi^- \ell^+\nu$ channels, respectively. Figure 1 shows the result of the fit to the combined $B^+ \rightarrow D^-\pi^+ \ell^+\nu$ and $B^+ \rightarrow D^{*-}\pi^+ \ell^+\nu$ channels, and Fig. 2 shows the result for the combined $B^0 \rightarrow \bar{D}^0\pi^- \ell^+\nu$ and $B^0 \rightarrow \bar{D}^{*0}\pi^- \ell^+\nu$ channels. The $\chi^2/N_{\rm df}$ values for the $B^+$ and $B^0$ mode fits are 1.1 and 1.2, respectively, where $N_{\rm df}$ is the number of degrees of freedom in the fit. Since the counts for some entries in the fitted histograms are small, we use the equivalent quantity for Poisson statistics (see, e.g., Eq. (40.16) in Ref. [20]). Tables I and II summarize the fit results.
We check that the fits are unbiased and give the expected uncertainty by fitting ensembles of simulated events generated by sampling from the fitting templates. We plot the resulting residuals, fit them to a normal distribution, and check the mean and standard deviation. Finally, we correct for the fact that our efficiency in $M_{D^{(*)}\pi}$ is not constant. Since in the simulation the shape of $M_{D^{(*)}\pi}$ is determined by the poorly known widths and relative branching fractions of the $D^{**}$ mesons, it might be different in data. Therefore, the non-constant efficiency may introduce an overall efficiency difference between data and simulation. We use a quadratic function to fit the efficiency for each channel after determining that higher-order polynomials do not improve the fit quality significantly. Then we determine the shape of $M_{D^{(*)}\pi}$ in data by subtracting the background components determined from simulation using the $\mathcal{B}$ determined from our fit to $M_\nu^2$. Comparing the integrated efficiency in data and simulation for the signal, we determine overall-efficiency calibration factors of $1.008 \pm 0.007$ for $B^+ \rightarrow D^-\pi^+ \ell^+\nu$, $0.983 \pm 0.006$ for $B^0 \rightarrow \bar{D}^0\pi^- \ell^+\nu$, $0.997 \pm 0.002$ for $B^+ \rightarrow D^{*-}\pi^+ \ell^+\nu$, and $0.98 \pm 0.01$ for $B^0 \rightarrow \bar{D}^{*0}\pi^- \ell^+\nu$.
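The ingredients of a binned extended likelihood fit with templates can be sketched as follows. The expected content of each bin is the sum of the component yields times their normalized templates, and the goodness of fit for low-count bins can be judged with the Poisson likelihood-ratio statistic $2\sum_i[\mu_i - n_i + n_i\ln(n_i/\mu_i)]$. That this matches the cited equation is an assumption here, as are all function names; the minimization itself is omitted.

```python
import math

def expected_counts(yields, templates):
    """Sum of yield * template over components, bin by bin.

    Each template is a list of bin fractions that sums to one.
    """
    n_bins = len(templates[0])
    return [sum(y * t[i] for y, t in zip(yields, templates))
            for i in range(n_bins)]

def extended_nll(data, mu):
    """Negative log-likelihood of independent Poisson bins.

    The constant ln(n!) terms are dropped; all mu must be positive.
    """
    return sum(m - n * math.log(m) for n, m in zip(data, mu))

def poisson_chi2(data, mu):
    """Likelihood-ratio chi2: 2 * sum[mu - n + n ln(n / mu)].

    The logarithmic term is taken as zero for empty bins (n = 0).
    """
    total = 0.0
    for n, m in zip(data, mu):
        total += m - n
        if n > 0:
            total += n * math.log(n / m)
    return 2.0 * total
```

A fit would minimize `extended_nll` with respect to the floated yields while keeping the fixed components at their MC prediction.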
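The efficiency calibration can be illustrated with a short Python sketch: a quadratic efficiency curve is averaged over the $M_{D^{(*)}\pi}$ shape observed in data and over the shape in simulation, and the ratio of the two averages gives a calibration factor. The direction of the ratio (data over simulation), the binned averaging, and all names are assumptions made for illustration only.

```python
def quadratic(m, coeffs):
    """Efficiency model a + b*m + c*m^2 with coeffs = (a, b, c)."""
    a, b, c = coeffs
    return a + b * m + c * m * m

def average_efficiency(bin_centers, shape, coeffs):
    """Efficiency averaged over a binned mass spectrum (bin yields)."""
    total = sum(shape)
    if total <= 0:
        raise ValueError("spectrum must have a positive integral")
    weighted = sum(n * quadratic(m, coeffs)
                   for m, n in zip(bin_centers, shape))
    return weighted / total

def calibration_factor(bin_centers, data_shape, mc_shape, coeffs):
    """Ratio of the average efficiency for the data and MC shapes."""
    return (average_efficiency(bin_centers, data_shape, coeffs)
            / average_efficiency(bin_centers, mc_shape, coeffs))
```

If the efficiency were flat in mass, the factor would be exactly one regardless of the two shapes, which is why the correction matters only for a mass-dependent efficiency.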

D. Determination of systematics
There are three main sources of systematic uncertainty in our measurement: uncertainties in the simulation of our detector and the underlying physics processes, the statistical uncertainties of our fitting templates, and the uncertainty of the efficiency correction based on the $M_{D^{(*)}\pi}$ shapes in data and MC. For all three of these sources, our strategy to determine the systematic uncertainty is to use an MC approach that is based on running 1000 ensembles of simulated events, where the source of the systematic uncertainty is varied as described below for each source. We check that the refitted branching fraction in question follows a normal distribution and use the standard deviation of this distribution as our systematic uncertainty.
For the uncertainties of the simulation of the detector, we consider the uncertainty in the determination of the correction factors of the simulation of the PID discussed earlier as well as the uncertainty on the tracking efficiency.
Similarly, for the underlying physical processes, we consider the uncertainty of the $D$ and $B$ meson branching fractions and the $D^{*}$ and $D^{**}$ form factors. Furthermore, we consider the uncertainty of the calibration of the tagging algorithm, the uncertainty on the total number of $B\bar{B}$ pairs, and the uncertainty on the branching fractions of $\Upsilon(4S)$ to $B^+B^-$ and $B^0\bar{B}^0$. These sources of uncertainty of the simulation of the detector and underlying physical processes are described in more detail in Ref. [25]. Since it is reasonable to assume that the sources of uncertainty follow a normal distribution, we draw, for each ensemble of simulated events, source, and kinematic bin, a new weight from a normal distribution with the corresponding width. This weight is then used for an event-by-event weighting of the ensemble of simulated events. The advantage of this method is that correlations among the different sources of uncertainty as well as the dependence on the event kinematics are taken into account. By repeating this exercise while varying only one source at a time, we estimate the relative contribution of each source to the systematic uncertainty. This decomposition is shown in Tables III and IV. We omit the uncertainties due to the $K^0_S$ efficiencies and the $D^{*}$ form factors because these are consistent with zero relative to the tabulated uncertainties.
We estimate the systematic uncertainties propagated from the statistical uncertainty of the fitting templates to be 1.9%, 2.6%, 3.2%, and 3.5% for the $B^+ \rightarrow D^-\pi^+ \ell^+\nu$, $B^+ \rightarrow D^{*-}\pi^+ \ell^+\nu$, $B^0 \rightarrow \bar{D}^0\pi^- \ell^+\nu$, and $B^0 \rightarrow \bar{D}^{*0}\pi^- \ell^+\nu$ channels, respectively. These values are estimated using 1000 ensembles of simulated events for which we vary the templates using Poisson statistics. Finally, the uncertainty on the dependence of the detector efficiency on $M_{D^{(*)}\pi}$ is estimated by varying the $M_{D^{(*)}\pi}$ spectrum for each channel within Poisson statistics and adding the difference of the average efficiency between the $\pm$68% boundaries of the fit to the efficiency versus $M_{D^{(*)}\pi}$.
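The ensemble method can be sketched in Python as follows. For each toy, one Gaussian scale factor is drawn per (source, kinematic bin), every simulated event is reweighted by the product of the factors that apply to it, and the quantity of interest is recomputed; the spread over toys is the systematic uncertainty. This is a schematic stand-in: the `estimator` callable replaces the full refit, and all names and the data layout are hypothetical.

```python
import random
import statistics

def draw_scale_factors(rel_sigma, rng):
    """One Gaussian scale factor, centered at 1, per (source, bin) key."""
    return {key: rng.gauss(1.0, sigma) for key, sigma in rel_sigma.items()}

def toy_spread(events, rel_sigma, estimator, n_toys=1000, seed=12345):
    """Mean and standard deviation of the estimator over reweighted toys.

    events: list of events, each a list of the (source, bin) keys
            whose scale factors apply to that event.
    rel_sigma: relative uncertainty for each (source, bin) key.
    estimator: callable mapping the list of event weights to a number.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_toys):
        factors = draw_scale_factors(rel_sigma, rng)
        weights = []
        for keys in events:
            w = 1.0
            for key in keys:
                w *= factors[key]
            weights.append(w)
        results.append(estimator(weights))
    return statistics.mean(results), statistics.stdev(results)
```

Because events that share a key receive the same factor within a toy, correlations between events and between kinematic bins of different sources are propagated automatically; passing a `rel_sigma` with a single source isolates that source's contribution.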

IV. RESULTS AND CONCLUSION
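Using the combined fits, including the correction and systematics from the $M_{D^{(*)}\pi}$ efficiency, simulation uncertainties, and statistical uncertainty of the templates, we obtain the following values for the branching fractions: $\mathcal{B}(B^+ \rightarrow D^-\pi^+ \ell^+\nu)$ = [4.55 $\pm$ 0.27 (stat.) $\pm$ 0.39 (syst.)]$\times 10^{-3}$, $\mathcal{B}(B^0 \rightarrow \bar{D}^0\pi^- \ell^+\nu)$ = [4.05 $\pm$ 0.36 (stat.) $\pm$ 0.41 (syst.)]$\times 10^{-3}$, $\mathcal{B}(B^+ \rightarrow D^{*-}\pi^+ \ell^+\nu)$ = [6.03 $\pm$ 0.43 (stat.) $\pm$ 0.38 (syst.)]$\times 10^{-3}$, and $\mathcal{B}(B^0 \rightarrow \bar{D}^{*0}\pi^- \ell^+\nu)$ = [6.46 $\pm$ 0.53 (stat.) $\pm$ 0.52 (syst.)]$\times 10^{-3}$. These are within one standard deviation of the current world-average values [20], with the exception of $B^0 \rightarrow \bar{D}^{*0}\pi^- \ell^+\nu$, which deviates by 1.7$\sigma$. These results supersede the previous Belle result [11]. The total uncertainties of our measurement are slightly smaller than those of the current world average for the channels $B^0 \rightarrow \bar{D}^0\pi^- \ell^+\nu$ and $B^0 \rightarrow \bar{D}^{*0}\pi^- \ell^+\nu$, whereas they are the same for the channels $B^+ \rightarrow D^-\pi^+ \ell^+\nu$ and $B^+ \rightarrow D^{*-}\pi^+ \ell^+\nu$. A potential extension of this work would be to confirm the recent observation of $B \rightarrow D^{(*)}\pi\pi \ell\nu$ by BaBar [26], as well as to analyze the $M_{D^{(*)}\pi}$ distribution to extract the branching fractions and widths of the different $D^{**}$ mesons, for which there are still some discrepancies between the Belle [11] and BaBar [13] measurements.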

FIG. 1. (Color online) Binned extended maximum likelihood fit of the MC templates to the data for the combined fit to $B^+ \rightarrow D^-\pi^+ \ell^+\nu$ (left) and $B^+ \rightarrow D^{*-}\pi^+ \ell^+\nu$ (right). The data are shown with error bars. The legend in the left panel indicates each component in the fit. The dots at the bottom of each panel show the pulls between the data and the fit. For better visibility, we doubled the bin width for this plot.

TABLE III. Sources of uncertainty in the MC simulations considered for systematic uncertainties for the channels $B^+ \rightarrow D^-\pi^+ \ell^+\nu$ and $B^0 \rightarrow \bar{D}^0\pi^- \ell^+\nu$. The table lists the relative uncertainties in the branching fractions in percent for each channel for the combined fits. The last row gives the combined variation of all sources.

TABLE IV. Sources of uncertainty in the MC simulations considered for systematic uncertainties for the channels $B^+ \rightarrow D^{*-}\pi^+ \ell^+\nu$ and $B^0 \rightarrow \bar{D}^{*0}\pi^- \ell^+\nu$. The table lists the relative uncertainties in the branching fractions in percent for each channel for the combined fits. The last row gives the combined variation of all sources.