Form factors of $B\to\pi\ell\nu$ and a determination of $|V_{ub}|$ with M\"{o}bius domain-wall-fermions

Using a fully relativistic lattice fermion action, we compute the form factors of the semileptonic decay $B\to\pi\ell\nu$, which is required for the determination of the Cabibbo-Kobayashi-Maskawa matrix element $|V_{ub}|$. We employ the M\"{o}bius domain-wall fermion formalism for the generation of lattice ensembles with 2+1 sea quark flavours as well as for the valence heavy and light quarks. We compute the form factors at various values of the lattice spacing and multiple light and heavy quark masses, and extrapolate the results to the physical point. We combine our lattice results with the available experimental data to obtain $|V_{ub}| = (3.93\pm 0.41)\times 10^{-3}$.


I. INTRODUCTION
The determination of the Cabibbo-Kobayashi-Maskawa (CKM) matrix element |V ub | from the measurement of the exclusive semileptonic decay B → π ν requires precise knowledge of the corresponding decay form factors, which can be obtained using lattice simulations of Quantum Chromodynamics (QCD). For the first time in lattice QCD we calculate these quantities using a fully relativistic approach. |V ub | is an important Standard Model parameter, and the ratio |V ub |/|V cb | is a particularly sought-after result that requires continued refining of both these elements of the CKM matrix.
Heavy quarks require special consideration in lattice QCD since, on coarse lattices, discretization errors from large masses in lattice units, am Q , become uncontrollable. Therefore, B → π ν calculations typically use effective actions for b quarks, such as Nonrelativistic QCD (NRQCD) [1][2][3], the Columbia interpretation of Relativistic Heavy Quarks (RHQ) [4] and the Fermilab interpretation of the Sheikholeslami-Wohlert clover action [5]. Alternatively, it is possible to use multiple values of the heavy quark mass am Q < am b in a fully relativistic action and extrapolate to the physical mass. This requires that sufficiently fine lattices are available to keep am Q small enough that discretization effects can be controlled when combining the data at various lattice spacings. We take the latter approach in this work using the Möbius domain-wall fermion action [6][7][8][9][10]. We use the same action for the heavy and light quarks, and for both valence and sea light quarks.
With the Möbius domain-wall fermion formalism, the leading discretization effects are of O(a 2 ). In our analysis we extrapolate the results at finite lattice spacing to the continuum limit assuming that there are effects of O(a 2 ) as well as a term proportional to (am Q ) 2 , which is specific to the heavy quark. The maximum value of am Q used in this work is 0.688 so that discretization effects are kept under control. The continuum extrapolation is combined with the extrapolation to the physical heavy and light quark masses in a global fit function. The associated systematic errors are estimated by introducing higher order dependences on the lattice spacing and quark masses.
The momentum transfer range in B → π ν decays is large, owing to the large energy release from the b quark. The most precise experimentally available data points are in the small momentum transfer q 2 m 2 b corresponding to the kinematics where the pion recoil momentum is large. The small recoil data near maximum momentum transfer q 2 ≈ 26.46 GeV 2 is less copious and the relative statistical error is larger. On the lattice QCD side, the most accurate form factor results are obtained at large momentum transfer when the recoil momentum is small. In order to make most use of the available information from both experiment and lattice calculations, one can combine the data to constrain the q 2 dependence using the so-called z-parameter expansion [11][12][13][14][15][16][17][18], as first applied to the B → π ν process in Ref. [19]. This approach only makes assumptions about the analytic structure of the form factors and, because it only involves an expansion about a small parameter z, the results are robust. We follow this strategy in this work and estimate the associated errors.
In calculations of both |V cb | and |V ub | there exist persistent tensions between their exclusive and inclusive determinations [20,21]. The cause(s) of this tension is still unclear, although new theoretical and experimental analyses for |V cb | are revealing potential problems in previous analyses, such as the assumed functional form of the form factors. A recent review of the |V cb | puzzle can be found in Ref. [22], while general overviews of the CKM matrix elements from a lattice perspective can be found in Refs. [23,24].
A more elaborate analysis of |V ub | is premature due to the small branching fraction, but care is needed to ensure that the choice of the parametrization of the form factors allows systematic improvement when more data becomes available. On the exclusive side, the model-independent lattice calculation is a key element in the combined analysis with experimental data. In this work we provide a fully nonperturbative computation of the B → π ν form factors with controlled extrapolation to the physical mass parameters for both heavy and light quarks as well as to vanishing lattice spacing. A discussion of the inclusive determination of |V ub | is beyond the scope of this paper as it involves very different theoretical methods, such as perturbative QCD and the heavy quark expansion, but we note that a promising new direction for tackling the problem using lattice QCD is also being developed [25,26]. The rest of this paper is organised as follows. In Sec. II we discuss the relevant background, including details on the form factors obtained from the calculation and how they are extracted using the appropriate matrix elements. The lattice setup and procedure for our calculation is described in Sec. III, while further details of the ensemble generation and the properties of the generated ensembles are described in the supplemental material [27]. We discuss the results of the lattice form factors and the estimation of various sources of systematic uncertainties in Sec. IV. In Sec. V we discuss the continuum results for the form factors, the use of the z-parameter expansion to obtain results across the entire q 2 range, and our main result: the determination of |V ub | when our lattice form factors are combined with differential branching fractions from experiment. Finally, we conclude in Sec. VI.

II. FORM FACTORS
Form factors to describe the semileptonic decay of a B meson to a pion can be defined for the transition matrix element π(p π )|V µ |B(p B ) of the flavor-changing vector current V µ =qγ µ Q as π(p π )|V µ |B(p B ) = f + (q 2 ) p µ B + p µ π − where f + (q 2 ) and f 0 (q 2 ) are the vector and scalar form factors of this process, p B and p π are the four-momenta of the B and π respectively, and M B and M π are their masses. The momentum transfer is q µ = p µ B − p µ π . At q 2 = 0 there exists a kinematic constraint, f + (0) = f 0 (0).
A common alternative parametrization that is useful for lattice calculations relates the matrix elements to parallel and perpendicular form factors, f (E π ) and f ⊥ (E π ), through where v µ = p µ B /M B is the velocity of the B meson, and p µ π,⊥ ≡ p µ π − (v · p π )v µ . The pion energy E π is related to the momentum transfer of the leptons by Throughout this paper we keep the B meson on the lattice at rest and so can use the relations where the temporal, µ = 0, and spatial, µ = i, components of the vector current V µ are considered, respectively. Another possible parametrization-motivated by heavy quark effective theory-is [28] π(p π )|V µ |B(v) where the B meson state is defined as |B(v) = (1/ √ M B )|B(p B ) such that it is properly defined in the heavy quark limit. The form factors f 1 (v·p π ) and f 2 (v·p π ) are also consistently defined in the heavy quark limit and the heavy quark mass dependence would start from 1/m b . Comparing with Eqs. (4) and (5), we get The relation to the conventionally defined form factors f + (q 2 ) and f 0 (q 2 ) is given by or, equivalently, by In the limit v · p π → 0, the soft pion theorem and the pole dominance ansatz is justified using the heavy meson chiral Lagrangian approach and one obtains [28] lim v·pπ→0 with M B * the mass of the vector meson B * , ∆ B = M B * − M B the hyperfine splitting, f B * and f π the B * and π decay constants respectively, and g B * Bπ the B * Bπ coupling.

A. Ensembles and correlators
We use the Möbius domain-wall fermion action [10] in this work for both heavy and light quarks. The gauge ensembles were generated with 2 + 1 flavours of dynamical quarks by the JLQCD Collaboration. The tree-level Symanzik-improved gauge action is employed, and stout smearing [29] is applied to the gauge fields when coupled to fermions. The lattice ensembles used in this work are summarized in Table I. They form a subset of those generated by the JLQCD Collaboration. (The full list is found in the supplemental material [27].) Each ensemble is given an ID of the form "X-ud#-sa", where X (= C, M, or F) denotes the lattice spacing, the number after ud represents the pion mass in units of 100 MeV, and the letter after s distinguishes whether the strange quark mass is above (a) or below (b) its physical value.
The simulation parameters are chosen as follows. The lattice spacings for coarse "C", middle "M" and fine "F" lattices are 0.0804(1), 0.0547(1) and 0.0439(1) fm, corresponding to lattice cutoffs a −1 = 2.453(4), 3.610(9) and 4.496(9) GeV, respectively. We use a range of light quark masses that correspond to pion masses from 500 MeV down to 230 MeV. They are roughly tuned to 500 (ud5), 400 (ud4), 300 (ud3) and 230 (ud2) MeV. Two values of strange quark mass are taken to sandwich its physical value on the coarsest lattice, i.e., above (sa) or below (sb) the physical strange quark mass. Lattice volumes are 32 3 × 64, 48 3 × 96 and 64 3 × 128 for the three lattice spacings, respectively. They are chosen such that the spatial extent L of the lattice is kept constant, ∼ 2.6 fm, in physical units. The only exception is for the "C" ensemble with the lightest pion mass, "C-ud2-sa-L", which has a larger volume of 48 3 × 96. The temporal extent N T is chosen as N T = 2L. All ensembles satisfy the condition M π L > 4, which is often required to suppress the finite volume effects to a sufficient level, i.e., below a few per cent level for meson masses, decay constants, and form factors. We summarize the parameters of the gauge configurations including the light and strange sea quark masses, m l and m s , in Table I. The ID for each ensemble is the same as those in the supplemental material where further details about the lattice ensembles, including the measurement of the lattice spacing through the gradient flow, the observation of the topology tunnelings, and the light pseudoscalar meson masses and decay constants, are discussed [27].
The chiral symmetry of Möbius domain-wall fermions is not exact due to the finite fifth dimension L s . The resulting residual mass depends on the lattice spacing and the details of the implementation of the domain-wall fermion. In our case the residual mass on the coarsest lattice (β = 4.17, the "C" lattices) is at the level of 1 MeV and an order of magnitude smaller on finer lattices ("M" and "F"). Detailed measurements are described in the supplemental material [27]. The residual mass, however, does not directly affect the analysis of the B → π ν form factors because we use the pion and kaon masses as parameters to control the chiral extrapolation.
In addition to this work, the ensembles have so far been used for a determination of the renormalization constants [30], a calculation of the charmonium correlator and the extraction of the charm quark mass and the strong coupling constant [31], and a calculation of the D semileptonic decay form factors [32]. The lattice data have also been applied to a calculation of the topological susceptibility in QCD [33], a study of the Dirac eigenvalue spectrum and a precise calculation of the chiral condensate [34], another study of the Dirac eigenvalue spectrum but in the high energy region [35], the short-distance current correlator and its comparison with experimental data [36], and a proposal for lattice calculations of inclusive B meson decays [25,26]. The valence sector also uses the Möbius domain-wall fermion action and the light quark masses are the same as used in the sea. Heavy quark masses were chosen as am Q = 1.25 2n × am c , for n ≥ 0 and limited to values am Q ≤ 0.7 to keep discretization errors under proper control. This results in mass values am c ≤ am Q ≤ 2.44 × am c . The charm quark mass in lattice units, am c , is tuned such that the spin-averaged Charmonium 1S state reproduces its physical mass. (Details are discussed in Ref. [31].) Since the lowest heavy quark mass used is that of the charm, we have the process D → π ν included as part of our dataset. We can then use form factors from that decay plus the additional heavier quark masses to extrapolate to the physical b quark mass.
In order to extract the form factors, we compute the three-point functions of the form where P S π and P S Q are interpolating operators to create or annihilate the pseudoscalar pion and heavy mesons. These operators are smeared to enhance the overlap with the corresponding ground state. The smearing is applied in a gauge invariant manner using an operator (1 − (α/N )∆) N with a discretized Laplacian ∆ and parameters α = 20 and N = 200. The source of the quark propagator is generated on the entire source time slice with random Z 2 noise, and then the smearing is applied. The B meson is always set at rest so that q = −p π . The source-sink separation in the temporal direction T is kept approximately fixed in physical units across all lattice spacings. We use T = 28, 42 and 56 on ensembles with β = 4.17, 4.35, and 4.47, respectively. The ground state can then be well isolated by the fits as described in Sec. III B.
The heavy-to-light vector current is defined as V µ =qγ µ Q. Both light (q) and heavy (Q) quark fields are described by the Möbius domain-wall fermion action, and the current is local on the lattice. The renormalization constant Z V is multiplied with the current afterwards, as discussed in Sec. III C.
We also compute the pion and heavy meson two-point functions. These are used to constrain the energies of the initial and final states in the combined fit, as discussed in Sec. III B.
These measurements are performed on N cfg gauge configurations for each ensemble and repeated N tsrc times by always using time source t = 0 and then shifting the source and sink time slices by N T /N tsrc . The number of measurements is thus N cfg × N tsrc per ensemble. Details are listed in Table I for each choice of ensemble and valence heavy quark mass.

B. Two-point & three-point correlator fits
To extract the required form factors, we perform simultaneous fits of all two-point and three-point correlators for each set of ensembles and quark mass parameters using a constrained multi-exponential fit [37]. Doing so allows us to fit the majority of the time extent of the correlators while isolating the ground states-needed to determine the form factorsfrom the excited states, which can be discarded. We include data starting from time slice t min = 2, 3 or 4 in the fit, depending on the ensemble. The two-point correlators are fit to the cosh form where the subscripts P, n correspond to state n of pseudoscalar P , such that n = 0 is the ground state. The interpolating operators are always smeared at the source, and are either local or have the same smearing parameters at the sink. The amplitudes a P,n and b P,n are then equal for the smeared sink or different if the sink is local. We always fit both cases simultaneously to improve the determination of the ground-state energy. We use n exp = 3 and, since we only require ground-state energies and amplitudes for our calculation, we simply check that fits with two or four exponentials give consistent ground-state results.
This multiexponential approach to our fits ensures the uncertainty due to contamination of excited states is taken into account. For the three-point correlators, we fit to the form The energies, E π,n and E B,m , and smeared amplitudes, a π,n and a B,m , are the same as those from the pion and heavy meson two-point correlator fit form. The amplitude V 0,0 , which connects the ground-state heavy meson to the ground-state pion, is needed to determine the form factors. It relates to the corresponding matrix element by As with the two-point correlators, we use n exp = 3. We use the Python packages Gvar [38], Lsqfit [39] and Corrfitter [40] to fit our correlators. The fit parameters are given Bayesian priors as follows. The magnitudes of meson two-point amplitudes can be estimated by fitting that correlator alone with a single exponential at large time t, which leaves only the ground-state contribution. The heavy meson and pion two-point amplitudes are found to be of order 0.1-1.0 (smeared) and 10-30 (local), depending on the lattice spacing of the ensemble. These are taken as the central values, and the priors are given very conservative widths that are 5 times these values. Similarly, one can extract estimates of the magnitudes of the three-point amplitudes, which are found to be of order 1.0 (µ = 0) and 0.4 (µ = 1, 2, 3), and we again assign widths 5 times these values. Priors for the energies of ground states with zero momentum are given 10% widths. The energies of ground states with nonzero momentum get their priors according to the dispersion relation for energy using the prior of the zero momentum ground state. The gaps between energies of two consecutive states are given priors of ≈ 0.7 GeV with 70% widths.
We simultaneously fit a substantial amount of two-point and three-point correlator data, including multiple am Q and q 2 values. This can be difficult as we have to invert large covariance matrices in our fits. If the available statistics is limited, as it is in our case, the eigenvalues of the matrices tend to be underestimated and driven to zero. A standard way to deal with this is to impose singular value decomposition (svd) cuts c svd . In this procedure any eigenvalue smaller than c svd times the largest eigenvalue e max is replaced by c svd e max . The use of the svd cuts makes the matrices less singular. This is a conservative approach since it can only serve to increase the final error. We have chosen the value of c svd for each ensemble such that the fit quality is good while keeping as many eigenvalues as possible.
As we have to use fairly large svd cuts in these fits, using χ 2 per degree of freedom (χ 2 /N dof ) as a measure of goodness-of-fit becomes less reliable. An svd cut increases the uncertainties in the data without increasing the random fluctuations in the data means. This tends to make the contributions from the parts of the χ 2 function affected by the svd cut much smaller than naively expected, which pulls χ 2 /N dof down artificially. We therefore check the fits and the final χ 2 /N dof by adding extra noise to the priors and svd cut, which does not change the fits significantly.
In Figs. 1-6 we show how well our fit results agree with the correlator data for various values of lattice spacing, pion mass and heavy quark mass. In Fig. 1 (left panel) we plot a representative example of the ratio of three-point and two-point correlators, , alongside the fit result. The data at β = 4.17, am u,d = 0.007 and am Q = 0.44037 with zero momentum insertion are shown. In the time range where the ground states dominate, this ratio will be a constant: the three-point ground-state amplitude divided by the two-point ground-state amplitudes. Towards T = 28 we observe a significant curvature of the correlator ratio downward. If, on the other hand, we plot the ratio of the three-point correlator to the leading exponential functions e −Eπt and e −M B (T −t) , a much longer plateau is evident as shown in Fig. 1 (right panel). This implies that the significant excited state contribution comes from the B meson two-point function. The plateau represents a π,0 V µ 0,0 a * B,0 in Eq. (16). In either case, the fit results capture the excited-state effects in the data very well. We emphasize that we do not fit these correlator ratios. Rather, we use the simultaneous, multi-exponential fits to two-point and three-point correlators described earlier in this section for each ensemble.
Similar plots of the three-point function divided by the ground-state exponentials are  tions are the same as those on the coarse lattice, but we observe larger noise due to limited statistics, especially on the finest lattice at β = 4.47 (Fig. 6).

C. Current renormalization
For the lightest heavy quark mass, i.e., when am Q = am c , we find that it is sufficient to renormalize our currents using results from the massless coordinate space current correlators as described in Ref. [36]. However, as discussed in Refs. [25,41], discretization effects arising from larger quark masses can lead to the renormalization constant Z V from vector currents Qγ µ Q deviating substantially from 1. We therefore consider it prudent to use the matrix element B s |Qγ µ Q|B s to partially renormalize our vector current alongside the massless renormalization results. (Here B s stands for the pseudoscalar state comprising the heavy quark Q and the strange quark.) By calculating three-point B s → B s correlators and demanding that the inserted temporal vector current matrix element is 1-since it is conserved in the continuum-we can obtain the renormalization constant We then take the overall renormalization constant for the heavy-light current Z V = Z V QQ Z Vqq where the renormalization constant for light-light current are determined as Z −1 Vqq = 1.047(10), 1.038(6), 1.031(5) at β = 4.17, 4.35, 4.47, respectively [30]. We generated three-point correlators on each of the ensembles with the heavier of the available strange quark masses. On each ensemble and for each value of am Q > am c we used smeared sources and sinks with time separation T . We averaged over two time sources that were separated by half the temporal extent of the lattice. The exception was on the finest ensemble for which we used only a single time source. We also generated two-point correlators with the same sources so that we could extract the required matrix element by We show plots of the ratio from Eq. (19) in Fig. 7 for ensembles with β = 4.17, 4.35 and 4.47. We are able to find plateaus in all cases and thus simply fit to a constant in these regions. Table II gives the results for Z −1 V QQ .

A. Global fit to form factors
In order to obtain the B to π form factors f 1 (v · p π ) + f 2 (v · p π ) and f 2 (v · p π ) at the physical quark masses and in the continuum limit, we perform a global fit. The form factors are functions of v · p π = E π , which should also be parametrized. We assume the energy dependence of the form factor f 1 (v · p π ) + f 2 (v · p π ) is described by a simple polynomial, and use a fit function For f 2 (v · p π ), since we expect a contribution from the vector meson (B * ) pole as described in Eq. (13), we use Here C x and D x are fit parameters, and N x are normalization constants that fix the units for energies and masses. These have been chosen so that C x and D x are ∼ O(1). We choose N E = 1/(0.3 GeV) and N M 2 π = 1/(0.3 GeV) 2 , where 0.3 GeV is a typical pion mass/energy, and N m Q = 1 GeV −1 . We take Λ QCD = 0.5 GeV.
The heavy quark mass dependence as an expansion in terms of 1/m Q is justified because the form factors f 1 (v · p π ) and f 2 (v · p π ) can be defined even in the heavy quark limit. The 1/m Q term represents the first correction to that limit.
The strange quark masses have been set such that they are close to the physical strange quark mass. They are not, however, exactly tuned so we include the term in our fit to take this into account. Having two strange quark masses on either side of the physical mass on the coarsest lattice allows the fit to determine the coefficient of this correction term. For the light quark mass dependence, we take the expectation from SU (2) "hard-pion" chiral perturbation theory for heavy-light mesons [42] (see also Ref. [43]): plus a term linear in M 2 π . We take 1.0 GeV as the value for the scale Λ appearing in the chiral logarithm terms. For the pion decay constant f π appearing in the denominator, we take f π = 130.4 MeV. The logarithmic dependence expected from chiral effective theory is not very significant with the precision of the current lattice data, and in our main fit we use the result from SU (2) chiral perturbation theory by fixing C χlog = D χlog = 1. However, this depends on the value we choose for the B * Bπ coupling g B * Bπ . In the literature, the extracted values cover a wide range [44][45][46][47][48][49], and it is not straightforward to assess the overall uncertainty. On the other hand, it is not clear whether we can see the chiral log in our data. We therefore estimate the systematic uncertainty related to this term by setting g B * Bπ = 0.45 [44] as a representative value in our main fit with fixed C χlog and D χlog = 1, followed by another fit where C χlog and D χlog are free fit parameters. In this way the uncertainty due to g B * Bπ is taken into account in the estimated systematic error. This is discussed in Sec. IV B.
We assume that the leading discretization effects appear as an overall factor of the form (1 + C a 2 (Λ QCD a) 2 + C (am Q ) 2 (am Q ) 2 ), and do not consider cross terms, e.g., a term of the form E π a 2 with independent parameters. This is justified because the dependence on the lattice spacing is small. We confirmed that adding such cross terms with free fit parameters has a negligible effect on the fit.
We find a good fit when simply fitting up to the quadratic term in pion energy for f 1 (v · p π ) + f 2 (v · p π ), but larger uncertainties in data points with large pion momentum make it unclear what behaviour is exhibited at higher pion energies. For this reason we include the cubic term in Eq. (20). The impact of the choice to include this higher order term is minimal since, as we will discuss in Sec. V A, when extrapolating towards q 2 = 0 we restrict our choice of synthetic data for the z-expansion to the region of pion energies covered by our simulation data. For f 2 (v · p π ) we only include a term linear in the pion energy.
Fitting both form factors f 1 (v · p π ) + f 2 (v · p π ) and f 2 (v · p π ) simultaneously, we obtain a fit with χ 2 /N dof = 0.59 (N dof = 182). We use Bayesian priors for the fit parameters: we choose 1.0 ± 2.0 for C 0 and D 0 , and 0.0 ± 2.0 for all other fit parameters. Results for the parameters from the global fit are given in Table III. (14) --0.026(15) −0.09 (14) 0.10(7) 0.03(1.09) 0.14 (12) TABLE III. Our best fit parameters from the global fit functions [Eqs. (20) and (21)]. We illustrate the extrapolations in pion mass, heavy quark mass and lattice spacing in Figs. 8, 9 and 10, respectively. Figure 8 shows the form factors f 1 (v · p π ) + f 2 (v · p π ) and f 2 (v · p π ) as functions of v · p π = E π computed at different light quark masses corresponding to M π = 300, 400 and 500 MeV. The extrapolations to the chiral limit (or to the physical pion mass) are performed using the fit to Eqs. (20) and (21). One can see that the values of the form factors are rather stable as a function of the quark mass. The data points are well described by the global fit shown by dashed curves. The thick curves represent the results corresponding to the physical pion mass.
The heavy quark mass extrapolation is demonstrated in Fig. 9, which shows the form factors computed for three different heavy quark masses: m Q = m c ; 1.56×m c ; and 2.44×m c . We find that both form factors increase towards the physical b quark mass. As represented in Eqs. (20) and (21), we extrapolate assuming dependence of the form 1/m Q , and the results at the physical point are represented by the solid curves. The systematic error due to the effect of neglecting a 1/m 2 Q term is estimated in the next subsection. The continuum extrapolation is shown in Fig. 10 for a typical parameter choice (p 2 π = (2π/La) 2 , M π 300 MeV and m Q = 1.56 × m c ). Since the physical volumes of the three lattices are similar, so too are the values of the physical momenta of the three points shown.   Continuum extrapolation of the form factors f 1 (v · p π ) + f 2 (v · p π ) and f 2 (v · p π ) evaluated with a typical parameter choice: p 2 π = (2π/La) 2 (note that the physical volumes of the three lattices are similar); M π 300 MeV; and m Q = 1.56m c .
We find that the continuum extrapolation in a 2 is also mild, even though a potentially significant discretization effect due to the heavy quark mass of the form (am Q ) 2 is expected. This is partly because the renormalization factor discussed in the previous section absorbs the bulk of the discretization effects. The global fit forms of Eqs. (20) and (21) assume that the discretization effect applies as an overall factor (1 FIG. 11. Results of the global fit of the data for f 1 (v · p π ) + f 2 (v · p π ) (upper curve) to Eq. (20) and f 2 (v · p π ) (lower curve) to Eq. (21). The data from which these are obtained exist in the region 0.225 GeV < E π < 0.975 GeV.
independent of light quark masses and energies v · p π = E π . This choice is justified because the dependence on each such parameter is small as we saw above. In principle this allows the global fit to discriminate between the (am Q ) 2 and (Λ QCD a) 2 effects; in practice, both terms in our fits return coefficients consistent with zero.
The final results for f 1 (v · p π ) + f 2 (v · p π ) and f 2 (v · p π ) at the physical quark masses and in the continuum limit are shown in Fig. 11 as a function of v · p π = E π . The bands represent the one standard deviation regions with only the statistical uncertainties included. The region that our lattice data cover is from 0.225 GeV to 0.975 GeV. The results outside of this region are obtained from the fit functions in Eqs. (20) and (21). In the soft pion limit, the form factor f 2 (v · p π ) rapidly goes to zero as a result of the pole term included in Eq. (21), and is not directly confirmed by the lattice data.

B. Estimation of systematic errors
We now turn to the analysis of systematic uncertainties. To make an assessment of their impact we perform additional fits with particular terms added or amended. We attempt the following variations of the fits: 1. The original fit using the form of Eqs. (20) and (21).
3. Adding M 4 π terms such that the pion mass dependence of 4. Adding the next order term in E π , so that 5. Adding a 4 terms such that the discretization effects of . Similarly for f 2 (v · p π ).
6. Adding (am Q ) 4 terms such that the discretization effects of 7. Allowing the fit to determine the coefficient in front of the chiral log, i.e., letting C χlog and D χlog be free fit parameters instead of fixing them to 1.
We plot the result of these alternative fits in Fig. 12 at three representative q 2 values (19.15 GeV 2 , 23.65 GeV 2 and 26.40 GeV 2 ) after converting to f 0 (q 2 ) and f + (q 2 ). The results are very stable across the alternative fits. The inner, lighter grey band shows our statistical uncertainty only, which is exactly the result from fit 1. The outer, darker grey band displays our total error, which includes systematic effects that come from the deviation from fit 1 of each of fits 2-7 added in quadrature.
We also plot the systematic uncertainty coming from each of the listed sources as a function of pion energy in Fig. 13 for both form factors f 0 and f + , covering the q 2 range where we have data. They are estimated using the fits as described above, i.e., the deviation from the main fit "1" is plotted. They can therefore be either positive or negative. The estimated total systematic errors (red dash-dot lines), calculated from all sources of systematic uncertainty added in quadrature, are comparable in size to the statistical errors (blue solid lines).

V. FORM FACTORS IN THE CONTINUUM AND |V ub |
The differential decay width relates to the form factors f + (q 2 ) and f 0 (q 2 ), and |V ub | through where G F is Fermi's constant and m is the lepton mass. For electrons and muons the terms suppressed by m 2 can be discarded (at least at the current theoretical and experimental precision), which means that the contribution from the scalar form factor f 0 can be neglected. Thus the relation between the differential decay width and the form factors is reduced to a much simpler form: where the pion momentum in the rest frame of the B meson is To determine |V ub |, we need the branching fractions obtained from experiment as well as form factors from our lattice calculation. In this section, we first discuss the parametrization of the q 2 dependence of the form factors. The treatment of the experimental data is then described so that we can combine this with our lattice data to make a determination of |V ub |.

A. Form factor shape
We use the z-parameter expansion to parametrize the shape of the form factors. Here, q 2 is transformed to a small parameter z as where t + = (M B 0 + M π + ) 2 is the Bπ threshold. We are free to choose the value of t 0 ≤ t + . We choose t 0 = (M B + M π )( √ M B − √ M π ) 2 since this symmetrizes the values of z around 0, with |z| < 0.28.
For our final results of the f + (q 2 ) form factor we fit our data to the Bourrely-Caprini-Lellouch (BCL) expansion [18], where the denominator on the right hand side addresses a pole at q 2 = M 2 B * . The second term in parentheses is introduced to ensure that the form factor satisfies the appropriate asymptotic form near the threshold. For the scalar form factor, f 0 (q 2 ), we fit to a simple series expansion in z: Another widely used parametrization is the Boyd-Grinstein-Lebed (BGL) expansion [13,14]: where P 0 (q 2 ) is usually taken as 1, and the pole in the vector form factor is taken care of by the Blaschke factor P + = z(q 2 , M 2 B * ). The outer functions φ 0 (q 2 , t 0 ) and φ + (q 2 , t 0 ) are analytic. Often, the outer function for the scalar form factor is chosen as φ 0 (q 2 , t 0 ) = 1. For the vector form factor we follow [50] and choose where t ± = (M B 0 ± M π + ) 2 , t 0 = 0.65t − and χ (0) J = 6.9 × 10 −4 GeV −2 . Note that the choice of t 0 differs between the BCL and BGL z-expansion parametrizations in our analysis. Although our final results use the BCL parametrization, we confirmed that the BGL parametrization produces entirely consistent results.
The coefficients of the BCL ansatze in Eqs. (28) and (29) obey the unitarity constraint [18,51] Nz m,n=0 This holds for both b + k and b 0 k . The coefficients B mn are symmetric in the indices, B mn = B nm , and satisfy the relation B mn = B 0|m−n| . They depend on the choice of t 0 , and we list them for our choice for both form factors f + and f 0 in Table IV. We do not implement these constraints explicitly in our fits, but we do check that they are satisfied by our results.
From the results of the global fit, we generate synthetic data for a range of q 2 values. Note that we have six degrees of freedom left after the extrapolations, so we can pick six data points (choosing more would result in a singular correlation matrix). We choose to generate three data points for each f + (q 2 ) and f 0 (q 2 ) at q 2 values q 2 1 = 19.15 GeV 2 , q 2 2 = 23.65 GeV 2 , and q 2 3 = 26.40 GeV 2 . We pick these so that they are approximately   [18] and [5]. Our results for a fit to the BCL form of the z-expansion are given in Table VII. The correlation matrix of the resulting parameters b + k and b 0 k are in Table VIII. We do not use priors in this fit. We obtain a good fit when the order of the polynomial is chosen as N z = 3.   Here we impose the kinematic constraint f + (0) = f 0 (0), i.e., we have six data points and five fit parameters. If we do not include the constraint then we have six data points and six fit parameters so cannot use χ 2 /N dof as a measure of goodness of the fit. The fit result, however, remains unchanged. Although we do not impose them explicitly, we find that the unitarity constraints from Eq. (32) are satisfied and we get 0.034 (16) and 0.122 (44) for f + and f 0 , respectively. We find that N z = 2 is insufficient for a good fit. We also test fitting the form factor f + (q 2 ) alone using five synthetic data points. This makes very little difference to the f + (q 2 ) results. We plot results of the form factors across the entire z range in Fig. 14.
The blue squares show f 0 and the red circles show (1 − q 2 /M 2 B * )f + , while the bands are their corresponding fit results.
We can compare the form factors f 0 (q 2 ) and f + (q 2 ) to the results from other lattice QCD calculations when both statistical and systematic uncertainties are included. Results from the RBC and UKQCD Collaborations [4] and the Fermilab lattice and MILC Collaborations [5] are plotted alongside our results in Fig. 15. We restrict this comparison to the q 2 region that approximately corresponds to the inserted pion momentum in the lattice calculations and find general agreement for both form factors. Near q 2 max there are slight discrepancies with RBC/UKQCD for f 0 (q 2 ) and Fermilab/MILC for f + (q 2 ). This may hint at some systematic effects, although the statistical significance is limited.
It is also interesting to compare the lattice form factors with theoretical expectations from heavy-quark symmetry. In the soft-pion limit, the vector and scalar form factors, f + (q 2 ) and f 0 (q 2 ), are related by [28] lim up to corrections of O(1/m 2 b ). This ratio is plotted in Fig. 16 along with the theoretical expectation. We take g B * Bπ = 0.45(5) (from Ref. [44]) and f B * /f B = 0.941(26) (from Ref. [52]). The width of the green error band that represents the Heavy Quark Effective Theory (HQET) expectation reflects only the uncertainties from g B * Bπ and f B * /f B , and not  any other theoretical errors. For the lattice data, we take our result of fit "1" extrapolated to the chiral limit M 2 π = 0, showing only the statistical uncertainty. The agreement with the theoretical expectation in the soft pion limit and q 2 → M 2 B , which is at the rightmost end of the plot, is excellent.

B. Branching fractions from experiment
For the experimental results we use the following sets of data: the BaBar 2010 untagged analysis in 6 bins [50]; the Belle 2010 untagged analysis in 13 bins [53]; the BaBar 2012 untagged analysis in 12 bins [54]; and the Belle 2013 tagged analysis in which the B 0 → π + ν process was measured in 13 bins and the B − → π 0 ν process was measured in 7 bins [55]. We deal with this last set of data by assuming isospin symmetry, which allows us to convert as a function of q 2 compared with the prediction in the soft-pion limit from heavy-quark symmetry and χPT [28]. The width of the green error band reflects only the uncertainties from g B * Bπ = 0.45(5) (from Ref. [44]) and f B * /f B = 0.941(26) (from Ref. [52]), and not any other theoretical errors.
the B − decay to the B 0 decay through where the mean life of the neutral and charged B mesons are τ B 0 = 1.519(4) ps and τ B − = 1.638(4) ps, respectively [56]. These are the same sets of data as used by the Heavy Flavour Averaging Group (HFLAV) [20], the Flavour Lattice Averaging Group (FLAG) [21] and in the analysis presented in Ref. [57], as well as in the most recent lattice calculations of |V ub | [4,5]. We assume that systematic correlations between each of the individual datasets are negligible. We do, however, include correlations from the systematic uncertainties in the Belle 2013 analysis between the 13-bin and 7-bin data. The Belle collaboration indicated systematic correlations of 49%. We construct a total covariance matrix for the B 0 and B − data (after conversion to the isospin symmetric B 0 mode) by taking the direct sum of the statistical covariance matrices (where the off-diagonal blocks are 0) and of the systematic covariance matrices (with 49% correlation between each of the bins in the off-diagonal blocks), and then summing these two 20×20 matrices. The inclusion of these systematic correlations was found to have a negligible effect on the parameters and fit quality.
Our first step is fitting the four sets of data individually and then collectively without any lattice input. Using the BCL parametrization, we fit for the branching fraction in the ith bin through so that the combination of the form factor and CKM matrix element results in an overall normalization of b + 0 |V ub |. The slope and the curvature from the z-expansion fits are captured in the ratios b + 1 /b + 0 and b + 2 /b + 0 , respectively.  results with N z = 3. We find that the fit quality is acceptable for each set of data when fitted individually, but that fitting all data simultaneously ("All") results in a relatively poor fit. This is due to a tension between the BaBar 2010 data and the other results. We confirm this by fitting various combinations of datasets, finding poor fit quality whenever BaBar 2010 is included. Therefore, we also give results for the case where BaBar 2010 is dropped ("Excl. BaBar 2010"), which results in an acceptable fit.
Fitting with N z = 3 is sufficient, and higher order fits do not improve the fit quality. Although we agree with the values of the fitted parameters for the BaBar 2012 data reported by the Fermilab Lattice and MILC Collaborations in Ref. [5], we find that the fit quality is actually better. Our result is in agreement with that found by the RBC and UKQCD Collaborations [4] and the result presented in Ref. [57] where they each find a similar discrepancy with the fit quality reported by the Fermilab and MILC Collaborations.
In Fig. 17 we plot 68% and 95% confidence regions for b + 1 /b + 0 and b + 2 /b + 0 for each of the cases listed in Table IX. This visually demonstrates the tension between the BaBar 2010 dataset and the other measurements. We also show the consistency between these shapes and with the shapes determined from our lattice only fit to the form factors using the BCL parametrization with N z = 3 and with the kinematic constraint f + (0) = f 0 (0) imposed.

C. Determination of |V ub |
We now turn to fitting the above branching fraction results alongside our form factor results from the lattice. In this way we can determine the z-expansion parameters b + n and our main result of |V ub |, which appeared in the normalization of the experiment-only fits above. As discussed earlier, the contribution from the scalar form factor f 0 (q 2 ) to the branching fraction is suppressed by the squared lepton mass, and we neglect it. Therefore only f + (q 2 ) appears in Eq. (35). However, we do include lattice data for both form factors in the fit, and fit f + (q 2 ), f 0 (q 2 ) and experimental branching fraction data simultaneously. We impose the constraint f + (0) = f 0 (0) explicitly, although this makes a negligible difference to our final results since the low-q 2 region is primarily controlled by the branching fraction data.
As we have only three data points for f 0 , N z = 4 gives the maximum number of fit parameters we can use for f 0 if the constraint f + (0) = f 0 (0) is imposed (N z = 3 without the constraint). For f + we have data points from lattice and experiment, and are not limited to N z = 4. We therefore choose (N f + z , N f 0 z ) = (3, 3), (4, 4) and (5, 4) for our main fits (imposing the constraint at q 2 = 0), and (N f + z , N f 0 z ) = (3, 3), (4, 3) and (5, 3) for test fits without the constraint. We find that all these choices give a reasonable fit quality and the parameters are stable. We take (N f + z , N f 0 z ) = (4, 4) for our accepted final result. Numerical results for our combined lattice and experiment fits are given in Table X. We first fit the lattice form factors with each of the experimental branching fraction analyses in turn and find acceptable fit quality in each case. Next, we fit the lattice data alongside all experimental datasets simultaneously. As in the experiment-only fit, we do not find that the fit quality is particularly good when all experimental analyses are included. We therefore provide a further set of numerical values for the case where the BaBar 2010 analysis is excluded. This improves the fit quality while all parameters are consistent with the allexperiment fit. It should be noted that when BaBar 2010 is excluded, the value of |V ub | is determined to be marginally higher. The unitarity constraints from Eq. (32) are satisfied in each case, although we stress again that they are not explicitly imposed on the fits. The correlation matrices for the combined fit of all lattice and experimental data are in Table XI,  while those without BaBar 2010 are in Table XII. The differential branching fraction data from experiments, our lattice data (converted using |V ub | from our accepted fit) and bands representing our z-expansion fit results with all errors included are plotted in Fig. 18. The differences among the results with different (N f + z , N f 0 z ) are hardly visible, and they give essentially the same result for |V ub |. We reiterate that we take N f + z , N f 0 z = (4, 4) as our main result. In Fig. 19 we again show the form factors across the entire z range, this time using the above BCL fits combining lattice form factor data and branching fractions from experiment. The lattice data for f 0 (blue squares) and (1 − q 2 /M 2 B * )f + (red circles) are shown with corresponding fit bands from the combined fit. 1.16 (23) 0.56 (17) 0.59 (17)  z , N f 0 z ) = (4, 4). We list b 0 0 here for completeness, but it is fixed through the constraint f + (0) = f 0 (0).    [5]; the RBC and UKQCD Collaborations [4]; and the HPQCD Collaboration [3]. The value tagged Λ b → p ν is from Refs. [61,62]. This combines a lattice QCD calculation of the form factors of the Λ b to p process with experimental measurement of the ratio presented by the LHCb Collaboration, which allows the extraction of the ratio |V ub |/|V cb |. Using |V cb | = (39.5 ± 0.8) × 10 −3 from exclusive decays [62,63], the authors quoted a value for |V ub |. The FLAG average is from the 2021 report [21], and the HFLAV exclusive and inclusive results are from Ref. [20]. The inclusive data point is from their GGOU analysis, with a second (dashed) error bar to represent the spread of values from other frameworks.
Our final result for |V ub | is thus from the combined fit with all experimental data: The uncertainty includes the statistical and systematic errors originating from our lattice calculation as well as the total errors from the experimental data. If we exclude the BaBar 2010 data set from the analysis, we obtain |V ub | = (4.01±0.42)×10 −3 with a much improved p-value (see Table X). Our result for |V ub | is compared with other lattice QCD calculations and exclusive and inclusive determinations by HFLAV and FLAG in Fig. 20. Compared with other lattice QCD computations of the B → π ν process (Fermilab/MILC [5], RBC/UKQCD [4] and HPQCD [3]) our result is slightly higher but still consistent within the estimated errors. Our result is also compatible with the inclusive determination, which we have taken from HFLAV [20] using the "GGOU" analysis. We include dashed error bars to indicate the spread of results from other methods. We also note that our value is in good agreement with those of Refs. [57][58][59], while moderately higher than-but still consistent with-that in Ref [60], all of which use lattice form factor results as input.

VI. CONCLUSIONS
For the determination of |V ub |, the combination of the lattice computation of form factors and the experimental measurements of the differential cross section is crucial. This is not solely because the experiments can only measure the product of the form factor f + (q 2 ) and |V ub |, but because they provide complementary information about the form factor shape. The lattice calculation provides the form factor in the large q 2 region with controlled errors, while the experimental data are more sensitive to the low q 2 region. As one can see from the fit results, by combining the data from both experiment and lattice QCD, the form factor shape is much better controlled.
Our combined result for |V ub | is 3.93(41)×10 −3 when including data from all experiments, and 4.01(42) × 10 −3 when excluding the 6-bin untagged BaBar 2010 analysis. In both cases these results are consistent with the inclusive determination of |V ub | and with previous results on the exclusive B → π ν process.
The advantage of our lattice calculation over previous work is the use of a fully relativistic lattice fermion formulation, with which no extra matching procedure is required. (For the renormalization constant, we employed a strategy to eliminate the bulk of the large discretization effects appearing in the wave-function renormalization by making a nonperturbative determination of Z V using heavy-to-heavy three-point functions.) Our analysis therefore becomes rather straightforward: we simply assume the discretization effects are of O(a 2 ) and O((am Q ) 2 ) and let the numerical data determine their size by combining the lattice data at various a and am Q . We also explore the dependence on the heavy quark mass and find that it is consistent with a leading 1/m Q correction to the heavy quark limit.
A major challenge in this analysis was due to the multiple extrapolations that have to be performed at the same time in three parameter dimensions: the light quark mass; the heavy quark mass; and the lattice spacing. We find that these limits are reached rather smoothly with our global fit function. We estimate systematic errors due to potentially missing higher order terms in the ansatze by attempting the fit including one such term at a time. There is no single dominant source of error, but after adding them in quadrature the total systematic error is comparable to the statistical error in our calculation. The inclusion of heavier masses for m Q and smaller pion masses would help further control systematic effects, while additional statistics is the key to improving the calculation of these form factors in the future.
We anticipate more lattice calculations of the B → π ν process using fully relativistic actions in the near future. Crucially, this includes cases where the heavy quark is tuned to the physical b quark mass on the finest lattices, allowing for an improved approach to the physical point, and therefore even better control of systematic effects. by JSPS KAKENHI Grant Numbers 18H03710 and 21H01085, and by MEXT as "Program for Promoting Researches on the Supercomputer Fugaku" (Simulation for basic science: from fundamental laws of particles to creation of nuclei, JPMXP1020200105) through the Joint Institute for Computational Fundamental Science (JICFuS). The reference list was automatically populated using filltex [65].  0) corresponds to the standard domain-wall fermion action. One can also make these parameters s-dependent to further improve the chiral symmetry, but we do not consider such a possibility in this work. For practical reasons we multiply each row by D − and use the 5D operator rather than D GDW to avoid the need for the inverse term (D − ) −1 in our simulations.

arXiv:2203.04938v2 [hep-lat] 1 Jan 2023
We identify the physical four-dimensional quark fields q andq as the surface modes of the 5D fields ψ andψ: q R = P + ψ Ls , q L = P − ψ 1 ;q R =ψ Ls P − ,q L =ψ 1 P + , where the subscripts L and R denote the chirality components of the fermion fields. Then, one can show that the quark propagator qq can be written in terms of the inverse of the 5D Dirac operator as where R 5 denotes an operator to reverse the ordering in the fifth direction and the permutation operators P and P −1 are defined as follows: The 4D propagator corresponds to an inverse of the 4D effective operator after subtracting a contribution from the contact term: In Eq. (7), another 5D operator D GDW (m = 1) is introduced to cancel the irrelevant eigenmodes of D (5) GDW (m) living in the bulk of 5D by multiplying by its inverse (D (5) GDW (m = 1)) −1 . The 4D effective operator D (4) (m) thus constructed has a form similar to the overlap-Dirac operator [7,8] with a polar approximation of the matrix sign function where the kernel operator H M is given by In the limit of large L s , the approximation of the sign function becomes exact and the 4D effective operator reduces to the overlap-Dirac operator. The kernel in Eq. (10) is different from that in Refs. [7,8] by the presence of its denominator. The standard domain-wall fermion action corresponds to the special case where b + c = b − c = 1. In this work we choose a different parameter set, b + c = 2 and b − c = 1, motivated by a detailed study of the residual chiral symmetry violation [9], which is also partly described in Sec. I C.
We construct the fermion bilinear operator with the 4D quark fields q andq asqΓq with an appropriate γ-matrix Γ. The renormalization factors for (axial-)vector and (pseudo-)scalar operators are separately determined through short-distance current correlators [10].

B. Action and generation parameters
For the gauge part of the lattice action, we adopt the tree-level improved Symanzik gauge action [11][12][13] S Sym = β 3  [15], which is another popular choice of smearing procedure [16]. The depth in the fifth dimension is L s = 12 on the coarsest lattice at β = 4.17 and L s = 8 for finer latices. We generated 15 lattice ensembles with three different lattice spacings and various combinations of sea quark masses as listed in Table I. We assign an ID name for each ensemble, which distinguishes coarse (C), medium (M) and fine (F) lattices, as well as the masses of up and down (ud) and strange (s) quarks. In the C and M ensembles, we use two values of the strange quark mass that sandwich the physical value from above (a) or from below (b). The number given after "ud" represents the corresponding pion mass, e.g. "ud3" for a 300 MeV pion, and so on. The lattice size L is taken such that the physical lattice extent is kept approximately constant at ∼ 2.6 fm. There is a single coarse ensemble with a larger volume: 48 3 × 96 compared to the regular size of 32 3 × 64. This is indicated by "-L" in the ID.
Each ensemble of the "C" lattices has 10,000 Hybrid Monte Carlo (HMC) updates, where each update is of 0.5 molecular dynamics time in length. The "M" and "F" lattices use 1.0 and 2.0 molecular dynamics time units with 5,000 and 2,500 updates, respectively.

C. Residual masses
Although Möbius domain-wall fermions offer precise chiral symmetry, this is slightly violated due to the finite extent of the fifth direction, L s . This violation may be represented by an operator ∆ L defined through which can also be written as By measuring some matrix elements of this operator, which probes the violation of the Ginsparg-Wilson relation, one can characterize the size of such effects. A global measure of the chiral symmetry violation can be constructed [1] as where the trace runs over all indices, including x and the color/spinor indices of the 4D quark propagator. This is called the "residual mass" since the operator ∆ L plays the same role as the mass term, as can be seen from the definition of Eq. (12). As discussed in [17], we may consider a decomposition of m res in terms of the eigenmodes of D (4) (0). Since the eigenvalue λ of D (4) (0) is distributed more densely for high modes (as ∼ λ 3 ),m res as defined in Eq. (14) is most sensitive to the high end of the eigenvalue spectrum, which is of the order of the lattice cutoff. It therefore probes the violation of chiral symmetry at short distances. In addition to Eq. (14) we consider a measurement on a given time slice, m res (t) = x,y q x,t γ 5 ∆ L q x,tqy,0 γ 5 q y,0 x,y q x,t γ 5 q x,tqy,0 γ 5 q y,0 , which can be evaluated by calculating the 4D propagatorD −1 ov from randomly generated Z 2 noise spread over points y on time slice "0" and contracting at another time slice "t". Some examples are shown in Fig. 1. We find that the residual mass m res (t) as defined in Eq (15) is nearly independent of t. On the coarse lattice at β = 4.17 (upper panel), it shows a plateau at about 0.4 × 10 −3 , which corresponds to 1 MeV in physical units. On the medium lattice at β = 4.35 (lower panel), it is reduced by an order of magnitude to ∼ 0.5 × 10 −4 , which is about 0.2 MeV and therefore almost negligible in the analysis of physical quantities. We expect m res (t) to be even smaller on the finest lattice. Numerical results are summarized in Tables II and III. Measurements with ud and s valence quarks are separately averaged between t = T /4 and 3T /4 where a clear plateau is found on the plots. The results for the global measurementm res are also listed.
When measured with different valence quark masses, the residual mass values are consistent within their statistical uncertainties. Going to even heavier valence quark masses, such as those of the charm and bottom quark masses, we find that m res (t) is slightly smaller but has a larger statistical error. In any case, the residual mass for heavy quarks does not have any impact on their analysis because m res (t) is 10 −3 of their bare quark mass. We also find that the results are insensitive to the sea quark masses.
Another point to notice from Fig. 1 is that m res (t) turns out to be smaller at short distances, i.e. up to t = 2-3 in lattice units. This is consistent with the observation of [17] in which the matrix elements of ∆ L were calculated on individual eigenmodes of D (4) (0) and enhancement of the violation was found for low-lying eigenmodes. Although there is no unique definition of the "residual mass", we can take the plateau value since this is used when we analyze low-energy physical observables for which the violation at long distances is most relevant. Alternatively, we can take M 2 π in place of the quark mass when we perform a chiral extrapolation of the lattice data.
We also reiterate that the extent of the fifth dimension L s in our work is not large: L s = 12 on the coarse lattices and L s = 8 on the fine lattices. This suggests that precise chiral symmetry may be achieved by rather modest computational overhead compared to the conventional Wilson fermion formulation.

D. Scale setting
In this work the lattice scale is set by the Yang-Mills gradient-flow time t 0 , defined by t 2 E(t) t=t 0 = 0.3 [18], with the energy density operator E(t) of the gluon field evaluated at flow time t. (The symbol "t" is used here for the flow time and should not be confused with the time coordinate on the lattice. This usage is restricted to this subsection.) We adopt the gradient flow defined by the conventional Wilson gauge action and discretize the flow by a step size ∆t/a 2 = 0.01. For each ensemble, we perform 700 (500) measurements of the gradient flow on the coarse (medium/fine) lattices.
We list the numerical results for t 1/2 0 /a in Table IV. The lattice data is averaged taking into account autocorrelations using the jackknife method with a bin size of 140 or 250 HMC trajectories for the coarse or medium/fine ensembles, respectively.
The flow time w 0 /a, which is defined by a slope, t(d/dt){t 2 E(t) }| t=w 2 0 = 0.3 [19], is also calculated and the results are summarized in the same table. The statistical signal is slightly worse for w 0 /a and so we only use t 1/2 0 /a in the following analysis. We extrapolate the lattice data to the physical point assuming a linear dependence on the quark masses, or equivalently on the square of the pseudoscalar meson mass. We take the form where the superscripts "phys" denote values at the physical point. The form of Eq. (16), which does not contain chiral logarithms is justified in [20]. For inputs from experimental data we use the pion mass M phys π = 134.8 MeV and the kaon mass M phys K = 494.2 MeV, which are the recommended values in the isospin limit [21]. For the reference value of t 0 , we use the value t 1/2 0 = 0.1465(21)(13) fm from [19] as an input. We assume that the slope parameters c π and c s are common among different lattice spacings (or β values), and fit the results of all ensembles in Table IV together. In this way we can take account of the sea quark mass dependence for the finest lattice "F-ud3-sa", for which only one value of each sea quark mass is available.
The extrapolation is shown in Fig. 2 for the coarse (β = 4.17) and the medium (β = 4.35) lattices. The lattice data for two strange quark masses are plotted; the extrapolation

E. Topological charge distribution
Autocorrelations between gauge configurations generated by Monte Carlo methods for lattice QCD is a potentially serious problem. In particular, the expectation value of the F µνFµν operator summed over space-time may have a long autocorrelation time, as it is related to a topological quantity of the SU(3) gauge field. In continuum QCD, the definition of the topological charge Q is such that it becomes an integer and may not change its value under continuous deformations of the gauge field. Its lattice counterpart tends to have the same property as we approach the continuum limit. Since the Hybrid Monte Carlo algorithm relies on continuous evolution of the gauge fields, the topological charge would have long autocorrelation times when the continuum limit is approached. This is a problem because the vacuum of QCD should have a distribution of the topological charge characterized by the topological susceptibility χ t = Q 2 /V , where V is the 4D volume. Some examples of the Monte Carlo history of the global topological charge Q are shown in Fig. 3. We adopt the conventional definition of the topological charge density [22] con- structed from the gauge link variables after performing the gradient flow as described previously. After some gradient flow time, the global topological charge tends to become unchanging, and we take this frozen value for this study. These examples are on the ensembles with pion mass around 300 MeV at three values of the lattice spacing. They are therefore expected to show a similar variation of Q. In fact, the range of fluctuations is very similar among the different lattice spacings, while it is evident that Q changes much less frequently on finer lattices. We estimate the autocorrelation time as 14 (3) and 243(153) on the coarser two lattices at β = 4.17 and 4.35, respectively, while it is O(2000) or larger on the finest lattice [23]. Histograms of Q on these ensembles are shown in Fig. 4. The distribution of Q on the coarsest lattice adheres to a Gaussian-like form, while it is highly distorted on fine lattices. It is therefore important to address the question of how such a distorted topological distribution affects the measurements of other physical quantities. First, let us consider the worst-case scenario in which the global topological charge is frozen at a certain integer value Q on consecutive gauge configurations. Estimates of physical quantities calculated on these configurations may be biased since the topological charge is not sampled according to the correct distribution, which should satisfy Q 2 = χ t V .
The bias may be understood in a systematic way as being a series in 1/V . The formula for a CP-even observable G even is available from [24,25] as where the left-hand side is the measurement in a fixed topological sector Q. The right-hand side concerns quantities defined in the θ = 0 vacuum: G(0) is the observable at θ = 0, and G (2) (0) is its second derivative with respect to θ evaluated at θ = 0. If we sum over all possible values of Q, the second term on the right-hand side of Eq. (17) vanishes and the correct value G(0) is recovered. The formula in Eq. (17) can be used to estimate the size of the distortion due to the biased distribution of Q, provided that the derivative G (2) (0) is known. There is no general rule available for the θ-dependence of physical quantities, but for pions and kaons one can use an estimate based on chiral effective theory. The result for the pion mass M π is G (2) (0)/G(0) = −1/2N 2 f [24]. We use the estimate for χ t from chiral effective theory, χ t = m q Σ/N f [26], which can also be written as χ t = M 2 π F 2 π /2N 2 f . We thus obtain −(1 − Q 2 /χ t V ) × 1/(2M 2 π F 2 π V ). For our choice of the parameters (M π 300 MeV, F π 100 MeV, L 2.6 fm), this gives an estimate −(1 − Q 2 /χ t V ) × 0.45% for the worst case.
Using our lattice data, we attempt to confirm the prediction discussed above. For each configuration with a known Q, we calculate the pion "effective mass" M eff from a ratio of the zero-momentum pion correlators at neighboring time separations, i.e. M eff = ln[C(t)/C(t + 1)]. This does not give the correct pion mass, because the correlators are not averaged over configurations, but would reflect the effect of the background gauge field. In particular, if the pion mass is affected by the topological charge of the background gauge field, that should be visible in this quantity. Results for the ensembles with M π 300 MeV are shown in Fig. 5 as a function of Q 2 . Red dots measured on each configuration do not show any significant dependence on Q 2 , and their averages within individual Q 2 values are also independent of Q 2 . This is consistent with our expectation that the Q 2 -dependence is small ( < ∼ 1%). When we average over different Qs, the prefactor (1 − Q 2 /χ t V ) in the above estimate gives another suppression factor even when the distribution of Q is slightly biased. On the finest lattice, F-ud3-sa, it turns out that this factor is about 1/4, so that the actual bias is reduced to ∼ 0.1%, which is below the statistical error (∼ 0.2%) on this lattice. We may therefore ignore the potential finite volume effect from this source. Going to smaller pion masses, this effect would become more significant due to the suppression factor 1/χ t V containing a term that scales as 1/M 2 π . The small effect, as estimated above, is not a surprise because local topological fluctuations are active even when the global charge gets stuck. Some methods to probe such local fluctuations exist, and it turns out that the topological susceptibility χ t measured locally is consistent with chiral perturbation theory [23]. Most physical quantities, such as the hadron correlation functions, are measured in a small subvolume of the whole lattice. For the pions, the size of the subvolume is ∼ 1/M 4 π , which is much smaller than the total volume 2L 4 by about a factor of 1/2(M π L) 4 ∼ 0.002. The correlation functions for heavier hadrons are even more local, and the estimate above gives a conservative upper limit for the bias.

II. LIGHT HADRON SPECTRA AND DECAY CONSTANTS
Here we describe the measurements of basic physical quantities, i.e. the masses and decay constants of pions and kaons. Note that this study is separate from the B → π ν analysis in which we calculate form factors and |V ub |. In that case we once again fit all relevant two-point and three-point correlators simultaneously, but get fit results that are consistent with the independent analysis described here. Some of the simulation details, such as the number of time sources used are different between the two analyses.
In this work, the correlators are computed using Z 2 noise sources, which are then smeared with a gauge-invariant Gaussian smearing. The Gaussian smearing is defined by the operator Two-point correlation functions P L (x)P G † (0) are fit to a single exponential function simultaneously with correlators P G (x)P G † (0) , where L indicates an unsmeared local operator while G denotes Gaussian smeared operators. The fit range is determined by setting t max to half the lattice temporal extent minus one and t min is set to the first point in which three consecutive values of the effective mass are constant within 5% of their standard deviations. The 5% is chosen after visually inspecting the results and ensuring that t min appears to be in the plateau region. All fits are then performed again with t min + 2 to check that values remain constant. These differences in fitted mass are of the order of 0.1% and the differences in amplitudes 0.5%.
Thanks to the good chiral symmetry of our Möbius domain-wall action, we compute the pseudoscalar decay constants directly from the pseudoscalar current using Z A ∂ µ A µ = (m q 1 + m q 2 )P , where A µ is the lattice axial current, P is the pseudoscalar current and m q values are the unrenormalized quark masses. The unrenormalized quark masses include the residual masses: m = m bare + m res . We thus obtain the decay constant from F P = (m q 1 + m q 2 ) 2A P P /M 3 π , where A P P denotes the amplitude of the P L P L † correlator. Numerical results for the masses and decay constants obtained on each ensemble are summarized in Table V.

A. Chiral fits
We perform a global fit of the data for the pion mass M π and decay constant F π using the SU (2) chiral perturbation theory (χPT) formulae (see, e.g. [21]) These are written as an expansion in the parameter x = M 2 /(4πF ) 2 where M 2 = 2Bm q = 2m q Σ/F 2 . The light quark massm q is the averaged up and down quark mass, which is m ud . The parameters Λ 3 and Λ 4 are related to the effective coupling constants of χPT through n = ln Λ 2 n / M phys with Λ 12 being the combination Λ 2 12 = (7 ln Λ 2 1 + 8 ln Λ 2 2 ) /15. The chiral expansions above are fit to the data for F π and M 2 π /m q simultaneously at both next-to-leading order (NLO) and next-to-next-to-leading order (NNLO). At NLO only terms up to O(x 2 ) in Eqs. (18) and (19) are included, leaving the free parameters F , B, Λ 3 , and Λ 4 . At NNLO there are additional free parameters k M , and k F , while the values of Λ 1 and Λ 2 are fixed to the phenomenological values from Ref. [27]. FIG. 6. Plots of M 2 π /m q (left panel) and F π (right panel), using the expansion in x = 2m q B/(4πF ) 2 . Fit lines show the best NLO (blue) and NNLO (green) fits in the continuum limit and interpolated to physical strange quark mass. The NLO fits only include the ensembles for M π < 450 MeV.
The results of the NLO and NNLO fits alongside our lattice data are shown in Fig. 6. Details of the fit parameters and their statistical uncertainties are given in Table VI. The blue line shows the NLO fit and the black line shows the NNLO fit. Bands of the same color represent a statistical standard deviation from the global fit. Each line has a point with an error bar drawn at the physical light quark mass representing the value and statistical error of the global fit. The resulting values for F π are F x−NLO π = 87.6 ± 0.5 stat ± 1.1 scale fit ± 1.477 scale MeV, where the first error is the statistical error from the fit. The second error is the uncertainty estimated from how the fit changes by varying the scale by 1σ, and the third error is the systematic error directly from the scale setting. Similarly, estimates for F π from the ξ parametrization are F ξ−NLO π = 87.6 ± 0.6 stat ± 1.3 scale fit ± 1.5 scale MeV,   In both cases the NLO fits required that we limited the data to ensembles with lower pion masses and they still have poor χ 2 per degree of freedom compared to the NNLO fits. It is also useful to look at the ratio of the physical pion decay constant F π and the value of the SU (2) constant F in the chiral limit. This ratio should mitigate systematic effects such as the scale setting uncertainty. We get F π /F = 1.0635(13)(10) from the x-fit at NNLO and 1.0532(28)(16) from the ξ fit at NNLO.
The values of the SU (2) low energy constants are also listed in Tables VI and VII. The values of Λ 3 and Λ 4 can be expressed as the low energy constants to 3 and 4 . The values of the NLO low-energy constants from both expansions and fits are shown in Table VIII. Estimates of the uncertainties in the fit due to varying the scale setting and from the uncertainty of the values directly from the scale in the case of dimensionful constants are given. Due to the reasons above, we prefer the values of the NNLO fits.

B. Kaon Decay Constant
We also compute the kaon decay constant F K . Here the results appear incredibly linear in M π in the region covered by our lattice data. We tested a fit ansatz that includes the SU (3) chiral logarithms, but we have insufficient data at small pion masses to obtain a reliable fit. So for the kaon we simply perform a linear fit in m ud using x = M 2 /(4πF ) 2 , fixing M and F to the value obtained in the NNLO fit (see Fig. 7). Evaluating this extrapolation at the physical point for m ud gives F K = 154.96 ± 0.37 stat ± 1.13 scale fit ± 2.61 scale .
To extract F K /F π we use a linear extrapolation for F K and the NNLO x expansion for F π . Evaluating this at the physical point gives F K /F π = 1.2260 ± 0.0057 stat ± 0.0022 scale fit . This ratio may be a better quantity as it does not suffer from a strong uncertainty due to the scaling, although it is a biased comparison since we use only a linear extrapolation for F K and a full NNLO fit for F π .
F F x / FIG. 7. F K (top) and F K /F π (bottom) with linear extrapolations to the physical pion mass point. The fit to F k is a simple linear extrapolation in the chiral x expansion parameter to the chiral limit. The extrapolation in F K /F π uses a linear extrapolation for F K over the NNLO fit for the x expansion for F π .