$B \rightarrow \pi \ell \nu$ at zero recoil from lattice QCD with physical $u/d$ quarks

The exclusive semileptonic decay $B \rightarrow \pi \ell \nu$ is a key process for the determination of the Cabibbo-Kobayashi-Maskawa matrix element $V_{ub}$ from the comparison of experimental rates as a function of $q^2$ with theoretically determined form factors. The sensitivity of the form factors to the $u/d$ quark mass has meant significant systematic uncertainties in lattice QCD calculations at unphysically heavy pion masses. Here we give the first lattice QCD calculations of this process for u/d quark masses going down to their physical values, calculating the $f_0$ form factor at zero recoil to 3\%. We are able to resolve a long-standing controversy by showing that the soft-pion theorem result $f_0(q^2_{max}) = f_B/f_{\pi}$ does hold as $m_{\pi} \rightarrow 0$. We use the Highly Improved Staggered Quark formalism for the light quarks and show that staggered chiral perturbation theory for the $m_{\pi}$ dependence is almost identical to continuum chiral perturbation theory for $f_0$, $f_B$ and $f_{\pi}$. We also give results for other processes such as $B_s \rightarrow K \ell \nu$.


I. INTRODUCTION
The exclusive semileptonic process, B → π ν, is a key one for flavour physics because it gives access to the Cabibbo-Kobayashi-Maskawa matrix element V ub from a process involving two 'gold-plated' (stable in QCD) mesons, B and π. V ub is determined by comparing the experimental rate for the process to that determined from theoretical calculations of hadronic parameters known as form factors which are functions of the squared 4-momentum transfer, q 2 , between the B and the π. The form factors encapsulate the information about how likely it is for a π meson with a specific momentum to form when a b quark inside a B meson changes into a u quark with emission of a W boson. If the calculations of the form factors are done in lattice QCD then the full effect of QCD interactions that keep the quarks bound inside the mesons is taken into account.
Lattice QCD calculations for this process are particularly difficult, however, because the results are sensitive to the mass of the u/d quarks that form the π meson. Existing lattice QCD calculations have used u/d quarks that have heavier masses than in the real world and results then have to be extrapolated to the physical point. The value of the u/d quark masses, and hence the π mass, also affects the q 2 value for a given π spatial momentum and, since the form factors are strongly varying functions of q 2 , this gives further dependence on the quark masses. For recent lattice QCD results for B → π ν form factors see [1,2].
Lattice QCD calculations for the form factors are most reliable close to the 'zero recoil' point where the π has small spatial momentum (and hence relatively small dis- * christine.davies@glasgow.ac.uk † URL: http://www.physics.gla.ac.uk/HPQCD cretisation errors). This corresponds to being close to the maximum value of q 2 , (m B − m π ) 2 . In this regime it is possible to use soft pion relations coupled to Heavy Quark Effective Theory to derive expectations for the functional dependence of form factors on the heavy quark mass, m b , and on the π meson mass. Since these relationships come from a well understood theoretical framework it is important to test them against lattice QCD results. It was apparent in quenched lattice QCD calculations many years ago that there were large deviations between lattice results and the expected dependence on m b and m π ; see [3,4] for a review. Relatively heavy u/d quark masses were used in these early calculations, often close to the s quark mass.
Here we revisit this issue with results from lattice QCD that include the full effect of sea quarks (u, d, s and c) but, more importantly, include u and d quarks with masses at their very small physical values for the first time. We focus on the soft pion relation which gives the scalar form factor f 0 at the zero recoil point in terms of the ratio of B and π decay constants as m π → 0 [5][6][7][8]: This relationship has been shown to hold through O(Λ/m h ) in an expansion in powers of the inverse heavy quark mass, m h [9], and we therefore expect it to work at the 1% level for b quarks. A test of this in lattice QCD calculations can then provide important confirmation of how well we understand the u/d quark mass dependence for a process B → π ν for which this is critical.
Here we show that indeed the soft-pion relation does emerge from lattice QCD calculations as m π → 0 for a fixed heavy quark mass tuned to that of the b. This calculation builds on the very accurate results for f B and f π that we have been able to obtain at physical u/d quark masses [10,11] and benefits from the fact that the 2 renormalisation factors of the temporal vector and temporal axial-vector heavy-light currents are the same for our light quark formalism and so renormalisation uncertainties are minimised.
In addition we give a precise result for f 0 (q 2 max ) for physical u/d quark masses (where corrections to the softpion relation are substantial). This is a useful comparison point for lattice QCD calculations that are done with unphysical u/d quark masses, as a test of extrapolations in the u/d quark mass. Although f 0 itself is not accessible in experiment, lattice QCD calculations of it should agree, and it is important to test this using different discretisations of QCD. f 0 (q 2 max ) is the most accurately determined form factor for a lattice QCD calculation of B → π decay and so a good number for calibration.
For comparison we also give results for f 0 (q 2 max ) for B s → K ν decay, which is another physical process, and for B s → η s ν, which does not occur in the real world. The latter decay is again useful for comparison between lattice QCD calculations.
The paper is laid out as follows: In Section II we describe the lattice calculation of the correlation functions needed to extract the scalar form factor and its ratio to f B /f π . We use the Highly Improved Staggered Quark (HISQ) action for the light quarks and improved Non-Relativistic QCD (NRQCD) for the b quark. In Section III we give the results and determine the ratio both at the physical m π and as m π → 0. Section IV includes a comparison of our values with those from lattice QCD calculations that extrapolated results from heavier than physical values of m π and Section V provides our conclusions. In Appendix A we discuss the staggered quark chiral perturbation theory that we use for f B , f π and f 0 and show that it is in fact very continuum-like in its approach to m π = 0.

II. LATTICE CALCULATION
We use ensembles of lattice gluon configurations provided by the MILC collaboration [12] at three values of the lattice spacing, a ≈ 0.15 fm, 0.12 fm and 0.09 fm. The configurations include the effect of u, d, s and c quarks in the sea using the HISQ formalism [13] and a gluon action improved through O(α s a 2 ) [14]. These then give significant improvements in the control of systematic errors from finite lattice spacing and light quark mass effects over earlier configurations.
We work at three different values of the u/d quark masses (which are taken to be degenerate and will be referred to as light (l) quarks) in the sea. These correspond to approximately one-fifth and one-tenth of the s quark mass and the physical average u/d quark mass (m s /27.5 [16]). The lattice spacing on these configurations is determined for this calculation from the mass difference between the Υ and the Υ [15]. Table I lists the parameters of the ensembles.
On these configurations we calculate l and s quark   [12]. a is the lattice spacing, fixed from the mass difference between the Υ and Υ in [15]. The first error is from statistics and the second from NRQCD systematics in that determination and from experiment.   [15,19]. Column 4 gives the parameter u0L used for 'tadpoleimproving' the gluon field [10,15] and columns 5, 6 and 7 give the coefficients of kinetic and chromomagnetic terms used in the NRQCD action. c1 (c6 has the same value), c5 and c4 are correct through O(αs) [15,18].
propagators using the HISQ action. The l quarks are taken to have the same mass as is used in the sea. For the s quarks we retune the valence mass to be closer to the physical s quark mass and so it is slightly different from that in the sea [15]. Values are given in Table II. We also calculate b quark propagators using the improved NRQCD [17] action developed in [15,18].
The NRQCD Hamiltonian we use is given by [17]: Here ∇ is the symmetric lattice derivative and ∆ (2) and ∆ (4) the lattice discretization of the continuum i D 2 i and i D 4 i respectively. am b is the bare b quark mass in units of the lattice spacing. n h is a stability parameter set equal to 4 here. The gluon field is tadpole-improved, which means dividing all the links, U µ (x) by a tadpoleparameter, u 0 , before constructing covariant derivatives or chromo-electric or magnetic fields. For u 0 we took the mean trace of the gluon field in Landau gauge, u 0L [15]. E andB are the chromoelectric and chromomagnetic fields calculated from an improved clover term [20]. They are made anti-hermitian but not explicitly traceless, to match the perturbative calculations done using this action.
Given the NRQCD action above, the heavy quark propagator is readily calculated from a simple lattice time evolution equation [17] which is numerically very fast. The heavy quark propagator is given by: with starting condition: Here Γ is a matrix in (2-component) spin space and φ(x) is a function of spatial position, to be discussed below.
We can use such a function because we fix the gluon field configurations to Coulomb gauge. The terms in δH in eq. (3) have coefficients c i whose values can be fixed from matching lattice NRQCD to full QCD perturbatively, giving the c i the expansion 1 + c Here we include O(α s ) corrections to the coefficients of the sub-leading kinetic terms, c 1 , c 5 and c 6 , and the chromomagnetic term, c 4 [15,18]. The b quark mass parameter, am b , is nonperturbatively tuned to the correct value by calculating the spin-average of the 'kinetic masses' of the Υ and η b as described in [15]. We are able to do this to 1%, limited by the accuracy of the determination of the lattice spacing to convert the kinetic mass to physical units. The values used for am b , u 0L and c i on the different ensembles are given in Table II. We use the same NRQCD action for both heavyonium and for heavy-light meson calculations since it is accu-  rate for both. For heavy-light calculations, which concern us here, the power counting in the heavy-quark velocity is equivalent to power-counting in inverse powers of the heavy quark mass. The NRQCD action above is then fully improved through O(α s Λ/m b ). It has already been used for accurate calculations of the Υ spectrum and properties [15,19,21,22] and B, B s and B c meson masses [23] and decay constants [10,24], including those of their vector partners.
For the calculation of the B → π ν form factor we need (Goldstone) π meson correlation functions, B meson correlation functions and '3-point' correlation functions that connect the B meson to the π. Here we work with B and π mesons at rest. The π meson correlators were calculated in [11] and are simply given by: where g( x, t) is a staggered quark propagator with source at t = 0, tr denotes a trace over color indices and the angle brackets denote an average over gluon field configurations in the ensemble. We use random wall sources to improve statistical errors. The factor of 1/4 is needed to account for the number of staggered quark 'tastes' because the correlator is a staggered quark loop. Fitting these '2point' correlators to the standard multi-exponential form allows the extraction of the ground-state π mass (≡ E π,0 ). The amplitude a π,0 gives 0|P |π /( √ 2m π ) where P is the pseudoscalar density. Using the PCAC relation we can convert this to f π : and note that f π is absolutely normalised here. Results for m π and f π are given in [11]. Statistical errors below 0.1% are obtained. Staggered quarks have numerical efficiency advantages as a result of having no spin degree of freedom. A 4component naive quark propagator,g, can be simply obtained from a staggered quark propagator, g, from point 4 x to point y by reversing the staggering transformation: g(x, y) = g(x, y)Ω(x)Ω † (y) (9) We can then useg straightforwardly in combination with other propagators that carry a spin component [25], for example an NRQCD b quark propagator. B meson correlators at zero spatial momentum were made in this way in [10]. NRQCD b quark and staggered light quark propagators are simply combined to make a pseudoscalar meson (with operator Ψ b (x)γ 5 Ψ l (x)) as: Here Tr is a spin trace. In this equation both G b and g are calculated from a simple delta function source at the origin (and Ω(0) = 1).
To improve our statistical precision we use a random wall source for both propagators which adds the technical complication that Ω † ( x, t = 0) must be used, along with the U(1) random noise field, as the source for the NRQCD propagator [26]. In addition we use 3 different smearing functions for the source of the b quark in eq. (5): a delta function and two exponentials with different radii, a sm , given in Table III. Thus, in eq. (5), Γ = Ω( x, t = 0) and, for the smeared case, with η( y) a random field from U(1), and a 3-vector in colour space. For the staggered quark propagator the source is simply η( x). We obtain a 3 × 3 matrix of correlation functions from the use of the 3 different smearings for the b quark at source and sink. This can be fit to a multi-exponential form as for the π meson, except that the correlator has a simple exponential form in time rather than a cosh because NRQCD b quarks propagate in one direction in time only. To improve statistics we average over forwardin-time and backward-in-time directions. The fit enables us to extract B meson energies (these are not equal to the meson masses because of the NRQCD energy offset) and amplitudes that depend on the smearing function used [23]. We use a fit function where sc and sk denote source and sink respectively. The second line captures the presence of 'oscillating' opposite  (14)) from [10]. z0 = ρ0 − ζ10, z1 = ρ1 − z0, z2 = ρ2 from [28]. Column 5 gives the values of αs used in the matching. This is determined in V-scheme with 4 sea quarks at scale 2/a. parity states in the correlator that are a result of using a staggered light quark. Having multiple smearing functions improves the fit significantly because they enhance the signal for the ground-state at small t values which counteracts the relatively poor signal-noise in B correlators at large t.
The decay constant for the B meson is defined from the matrix element of the continuum temporal axial current between the vacuum and a B meson at rest: To determine this accurately in our lattice QCD calculations requires finding a good approximation to the continuum A 0 current in terms of bilinears made of HISQ light and NRQCD b quarks. The most accurate calculation to date is in [10] where we use a matching through α s Λ/m b , consistent with the level of accuracy in our improved NRQCD action: The O(α s ) matching coefficients z 0 , z 1 and z 2 are given in Table IV. In [10] we used α s ≡ α V (n f = 4, 2/a) to evaluate the matching coefficients above. We do that again here and also give in Table IV the value of α s on each ensemble. These are obtained by converting and running down the result α M S (n f = 5, M Z ) = 0.1185(6) [16,27].
To determine the matrix element of A 0 and hence f B we calculate the matrix elements of each of the J From simultaneous fits, using eq. (13), to the 3 × 3 matrix of B meson correlators described above, along with correlators that have J (1) A0 inserted at the sink, the amplitude c(A 0 , 0) that corresponds to the annihilation of the ground-state B meson with the A 0 current can be determined. The amplitude is given by Results for Φ = f B √ m B for B and B s are given on the sets of gluon field ensembles used here in [10] and we use the same B meson correlation functions here.
The new correlation functions that we have calculated here are the '3-point' functions for B → π ν decay, illustrated in Fig. 1. For these a light quark propagator from a random wall source at t = 0 is used, at time-slice T and convoluted with each of the b-quark smearing functions, as the source of a NRQCD b propagator. This propagates backwards in time to time-slice t where it is connected via a temporal vector current to another identical light quark propagator, joined to the first one at the source time-slice in the appropriate way to form a π meson. For this to work with staggered light quark propagators Ω matrices must be inserted at T and t to reinstate the light quark spin. Both the B meson and π meson are at rest. We use multiple T values so that we can fit the 3-point function both as a function of t and as a function of T . The T values used on each ensemble are given in Table III.
Because of the chiral symmetry of staggered quarks the matching of the NRQCD-HISQ temporal vector current to continuum QCD takes the same form as the temporal axial vector current: We calculate the 3-point function inserting each of the J V0 at the heavy-light vertex at t. We can then fit the 3point functions along with 2-point correlators discussed above to determine the matrix element of V 0 between π and B. For B and π at rest the matrix elements of J and J (2) V0 are equal and so we only calculate one of them. The 3-point correlation function is fit to the form: where n denotes normal parity states and again oscillating terms appear for the B meson from higher mass opposite-parity (o) states. Here The amplitudes a π,k and energies E π,k are the same amplitudes and energies as in the π 2-point fits (eq. (7)). The amplitudes c * B,k and d * B,k and energies E B,k and E B,k are the same as in the B 2-point fit (eq. (13)) for the corresponding smearing function for the B meson.
The fit parameter V nn (0, 0) is the result that we need for the ground-state B to ground-state π matrix element for a current, J, inserted at t. Using the standard relativistic normalisation of states: Combining results from J V0 and J V0 as in eq. (17) gives an amplitude corresponding to the continuum QCD current, V 0 , which we denote as V V0 . This is directly related to the matrix element of V 0 between π and B as in eq. (21). Since, at zero recoil We can therefore directly extract f 0 (q 2 max ) √ m B (1 + m π /m B ) from our fit results. Dividing by √ 2 times the amplitude from eq. (16) gives f 0 (q 2 max )(1 + m π /m B )/f B . Note that in forming this ratio the overall renormalisation factor of the temporal axial/vector current cancels (eqs. (14) and (17)). Hence this ratio has significantly lower systematic errors from renormalisation/matching than the individual quantities f 0 and f B . The division can be done inside the fit code and therefore correlations between the fit parameters can be taken into account to reduce statistical errors. Because of the inclusion of relativistic/radiative corrections to the currents, the ratio is accurate through α s Λ/m b in a power-counting in inverse powers of the b quark mass. Multiplication by the amplitude and energy combination that gives f π is also readily done inside the fit to give a result for This is the quantity that we will work with, examining its limit as m π → 0 where the factor of (1 + m π /m B ) vanishes and we expect the answer 1 from the soft-pion relation, eq. (1). From this we can also obtain the ratio at the physical value of m π and thereby the value of f 0 (q 2 max ) at that point. We have also calculated the appropriate 3-point correlation functions for the processes B s → K ν and B s → η s ν and combined these with the appropriate 2-point functions from [10,11] to obtain an analogous ratio for f 0 (q 2 max ) to that above (R BsK and R Bsηs respectively). In these cases we can also extract a result for f 0 (q 2 max ) for physical quark masses that is accurate through α s Λ/m b .

A. B → π
As discussed in Section II, we fit our results for 3-point functions for B → π and 2-point functions for B and π simultaneously to the forms given in eqs. (7), (13) and (20). We use a constrained fitting technique [29] so that we can include uncertainties in our fitted results for the groundstate coming from the presence of excited states in the correlation function. The prior value taken on the difference in mass between adjacent states (both in normal and oscillating channels) is 600(300) MeV and (for the B) on the difference between the ground-state and the first oscillating state is 400 (200)  Set (8)  V 0 (see eq. (18)). J (2) V 0 has equal amplitude to J (1) V 0 at zero recoil and so is not given separately. In column 4 results are combined as in eq. (17) into an equivalent parameter for the full QCD current, V0. Column 5 presents the results as the combination g Bπ these widths correspond to 3-5 times the ground-state value and so provide a loose constraint. Figure 2 illustrates the quality of our results by showing a plot of the ratio of 3-point to 2-point correlators for B → π decay on coarse set 4. Statistical errors are small and, although oscillating terms are strong on the B side of the 3-point correlator, it is clear that results at different T values are consistent so that we are able to isolate ground-state amplitudes.
The results from our 2-point fits have been detailed in [10] and [11] and we obtain results in good agreement with those values here. Table V gives values for the ground-state B to ground-state π 3-point amplitude V nn (0, 0) from eq. (20) for the case where the current   ) for B → π decay to fB/fπ in the combination that appears naturally from our fits: RBπ ≡ f0(q 2 max )(1 + mπ/mB)/[fB/fπ]. Column 3 gives the full result and column 2 the result from using the leading order NRQCD-HISQ current only, both in f0 and in fB.
V0 (these two cases are fit simultaneously). We see that the raw matrix element for the sub-leading current is 5-8% of the leading current. We also give results for the combination V V0 from eq. (17) that corresponds to the full QCD current (to the order to which we are working), V 0 . The final column of Table V uses eq. (23) to determine the combination . We see significant dependence on the mass of the π meson in these results, a feature that was missing in earlier calculations that could not reproduce the soft pion theorem relationship for f 0 (q 2 max ) [4]. As discussed in Section II, we can also evaluate, directly from our fits, the combination f 0 (q 2 max )(1 + m π /m B )/[f B /f π ] by dividing our 3-point amplitude by appropriate 2-point amplitudes which are correlated within the fit. This dimensionless ratio, in which the overall renormalisation factor for the lattice currents cancels, is tabulated in Table VI. There we give the result both the full calculation, using currents V 0 in the 3-point amplitude and A 0 in f B , and for the calculation using just the leading order current J (0) in both cases. The difference between the two is small, around 2%, because the impact of the sub-leading currents largely cancels between f 0 (q 2 max ) and f B . Our statistical/fitting uncertainty is about at the same level. The results for the full ratio are plotted in Figure 3 as a function of m π . Again, dependence on m π is clear, but no dependence on the lattice spacing is seen since discretisation effects evident for f B in [10] largely cancel in the ratio.
In order to test the soft pion theorem we need to fit our results as a function of m π to extrapolate to m π = 0. Having results for a range of small m π values allows us to do this.
One key element of m π dependence is purely kinematic [30]: the fact that q 2 max depends on m π through the formula This will mean that, if results for different m π fall on similar f 0 curves as a function of q 2 , there will be a significant linear term in m π coming from the shift in q 2 max as m π is reduced. A slope in m π of order −2m B df 0 /dq 2 | q 2 =q 2 max might then be expected. An estimate for this slope can be derived from a simple pole form for f 0 (q 2 ), behaving as 0 is the lowest scalar state in the B meson spectrum. This would give a slope in m π of f 0 (q 2 max )/Λ where Λ is the difference in mass between the B * 0 and the B. Taking Λ of order 500 MeV results in a slope of 2 GeV −1 . The term in m 2 π coming from the kinematics above would have a much smaller slope, of O(1/(2Λm B )), since there is no enhancement by m B . This slope is then smaller than that for powers of m π coming from chiral perturbation theory and so can be absorbed into those terms in our fit.
From low-order chiral perturbation theory we expect simple powers of m 2 π as well as chiral logarithms of the form m 2 π /(4πf π ) 2 ×log(m 2 π /Λ 2 χ ) (where we take f π = 130 MeV). The appropriate staggered quark chiral perturbation theory for these quantities is given in [31][32][33], see also [34]. The way in which the masses of different tastes of π meson appear in the chiral logarithm terms and associated 'hairpins' is discussed in Appendix A. It turns out that the staggered chiral perturbation theory for f π and f B is very like the continuum form, because discretisation effects from the masses of π mesons of different taste almost entirely cancel out. Terms of this kind in fact cancel between f 0 and f B /f π . This includes in particular all the chiral logarithms with coefficient g B * Bπ .
The remaining chiral logarithms that do not cancel in the ratio of f 0 (q 2 max ) and f B /f π take the form of an average over tastes, t, of m 2 π G /(4πf π ) 2 × log(m 2 π,t /Λ 2 χ ). Here m π G is the mass of the Goldstone π meson in the final state, and m π,t is the mass of a π of taste t. m π G ≡ m π is the expansion parameter for the chiral perturbation   (24), (33) and (31)) evaluated at the physical value of mπ. Uncertainties are given as a percentage of the final answer.
theory since its square is proportional to the light quark mass. The masses of the other taste π mesons, m π,t , could contribute discretisation errors to the chiral perturbation theory, when compared to the continuum chiral logarithms, since they differ by α 2 s a 2 from m π G . These discretisation errors would be relatively benign, since the chiral logarithm above vanishes as m u/d → 0 even at nonzero lattice spacing. In fact, as shown in Appendix A, hairpin diagrams also cancel most of these discretisation effects giving a dependence on m π which is almost identical to that of continuum chiral perturbation theory. We therefore use continuum chiral perturbation theory for our extrapolation to m π = 0 but allow for m π -dependent discretisation errors to include remaining staggered tastechanging effects.
Combining results for f B , f π and f 0 gives a coefficient for the chiral logarithm above in f 0 f π /f B of -4. A value of -4 is a substantial coefficient, double that in f π for example, and so it might be expected that we need to pay attention to finite-volume effects. We studied these for f π on these ensembles in [11] and found them to be well below 1%, except on set 1 which has the coarsest lattice spacing and is furthest from the physical point and so has very little impact on any fit. Finite-volume corrections were included in that study because they were significant at the level of the statistical errors possible there. Here statistical errors are much larger and we find finite volume effects are not significant. We include double the relative finite volume effect seen in f π for the ratio f 0 f π /f B in the values shown in Figure 3. This has negligible impact on the final result at physical m π and less than 1% effect on the value at m π = 0.
We fit the results for the ratio R Bπ as a function of lattice spacing and π meson mass to the following form: Here the c j coefficients provide for regular discretisation errors which appear for our actions as even powers of a only. We take Λ = 400 MeV. The coefficient d provides for light quark mass-dependent discretisation errors, as discussed above; more information on these is given below. The terms with coefficients e 1 and e 2 allow for discretisation errors from higher-order terms in the NRQCD action with coefficients that might depend on am b . δx m = (am b − 2.7)/1.5, so it varies from -0.5 to 0.5 across the range used here for am b . We take Λ/m b = 0.1. The f k coefficients provide for m π dependence expected from kinematic effects from the dependence of q 2 max on m π (which includes in particular a linear term in m π as discussed above). Any dependence on (even) powers of m π from chiral perturbation theory will be subsumed into this dependence. The final term is a chiral logarithm, for which we allow coefficient g. As discussed above, we use the continuum form for the chiral logarithm and allow for m π -dependent discretisation effects from staggered chiral perturbation theory with the δ a 2 ,m 2 π term. Our fits are insensitive to the form that this term takes. We have tried, for example, a form based on discretisation effects from averaging over π meson tastes in a logarithm: δ t is one unit of taste-splitting (see Appendix A). For our final results we in fact use the simpler δ a 2 ,m 2 π = a 2 m 2 π . In neither case does the fit return a significant coefficient for this term.
We take a prior of 0.0(1.0) on almost all coefficients in eq. (26). The exceptions are: a prior of 1.0(5) on R Bπ (phys, m π = 0); a prior of 0.0(3) on c 1 , since treelevel a 2 errors are missing from our action so that the leading term is α s a 2 ; a prior of 0.0(5.0) on f 1 since this linear term in m π is expected (from the arguments above) to have a coefficient around 2 GeV −1 from its origin in the m π dependence of q 2 max and a prior of 4.0(1.0) on g, the coefficient of the chiral logarithm, which is known up to chiral corrections.
The chiral fit has a χ 2 /dof of 0.7 for 8 degrees of freedom and gives the result R Bπ (phys, m π = 0) = 0.987(51), i.e. 1 in agreement with the soft pion relation of eq. (1) with an uncertainty of 5%. Most of the parameters in the fit are not well determined by the data, except for the linear term in m π which has coefficient -2.4(5), in agreement with our expectation. Missing out the chiral logarithm changes things slightly, giving a result 2σ below 1 at m π =0 of 0.92 (4). Leaving the chiral logarithm unconstrained, i.e. giving g a prior of 0.0(10.0), results in a fitted value for g of 2(8), i.e. it is not well determined by the results (but also not inconsistent with its expected value).
From the fit we can extract the result at the physical value of the π meson mass corresponding to equal mass u and d quarks in the absence of electromagnetism. This is the experimental mass of the π 0 , 135 MeV. There we g Bsηs 0 (q 2 max ) RB sηs 1 1.329(14) -0.042(2) 1.314(14) 1.886 (21) 0.740(8) 2 1.323(10) -0.042(1) 1.306(10) 1.867 (15) 0.743(6) 4 1.325(7) -0.050(0) 1.267 (7) 1.648 (9) 0.733(4) 6 1.322(2) -0.051(0) 1.282 (2)  find R Bπ (phys, m π = m π 0 ) = 0.805 (16). (28) This is very insensitive to any of the details of the fit because we have results at the physical π mass. The error budget for R Bπ (phys, m π = m π 0 ) is given in Table VII The first uncertainty comes from the fit to the ratio R Bπ and includes statistical and fitting errors. The second uncertainty comes from our value for f B and includes uncertainties from missing higher order terms in the current matching. We expect these to be very similar for f 0 , since we have seen this to be the case for currents we have included, so we do not include any additional systematic error specific to f 0 . We find finite-volume effects, as discussed above, to give negligible uncertainty. Taking instead our value of f B + of 0.184(4) GeV [10] gives

B. Bs → ηs
The decay mode B s → η s , in which we replace the l quark in B → π with an s quark is a useful calibration point in lattice QCD. The η s meson is a pseudoscalar meson with valence quark content ss but including the quark-line-connected correlator only so that it cannot mix with other flavour-singlet mesons. It is thus not a physical particle that appears in experiment. However its properties, mass and decay constant, have been wellstudied in lattice QCD [11,35].
The calculation for B s → η s proceeds exactly as described in Section II for B → π. The valence s quarks have the masses in lattice units given in Table II Table VIII). The grey band gives the results of a fit described in the text.
correlators for the B s and η s were calculated in [10,11]. In Table VIII we give the results from this calculation for the 3-point amplitude for B s → η s decay via a temporal vector current with B s and η s at rest. We also give the result derived directly from our fits for the ratio . (31) in which the overall renormalisation of currents cancels. The ratio R Bsηs is plotted in Figure 4 as a function of the mass of the π meson made from the sea l quarks and as a function of the lattice spacing, a. We have results only for a subset of the ensembles used for B → π, but we see no dependence of R Bsηs on either m π or on a.
Our fit has a χ 2 /dof of 0.35 for 5 degrees of freedom and gives a result at physical m π for R Bsηs of 0.740 (5). For this quantity the value at m π = 0 has no significance. The physical result for R Bsηs is very insensitive to any details of the fit since we have results at the physical π mass and there is almost no dependence on m π and a. The error budget is given in Table VII; the uncertainty is dominated by statistics. The value of R Bsηs can be converted into a value for f 0 (q 2 max ) using f Bs = 0.224(5) GeV [10] and f ηs = 0.1811(6) GeV and (1 + m ηs /m Bs ) = 1.1283 [11,16]. We find where the first error is from the ratio and the second from f Bs and f ηs . This is a substantially smaller value than that for B → π and it is also more precise, reflecting smaller statistical uncertainties in the raw data and the absence of any dependence on m π or a.

C. Bs → K
The decay mode B s → K ν is a physical process in which a b quark undergoes a weak transition to a u quark inside a B s meson. This process can then be used in a similar way to B → π ν to determine V ub . Lattice QCD calculations for the form factors for this process are now starting to appear [2,36]. Here we again give results for the form factor f 0 at zero recoil with u/d quark masses going down to physical values. We again study the ratio determined directly from our fits: Table IX gives our results and Figure 5 plots the results for R BsK against m π . We observe no significant a dependence but some dependence on m π , because the K meson contains a valence l quark. To fit the dependence  Table IX for results. The grey band gives the results of a fit described in the text.
of R BsK as a function of m π and a we use a similar fit form to that used earlier (eq. (26)). We drop the odd powers of m π from kinematic effects, since these now depend on m K which depends on m l and hence m 2 π . As for B s → η s we add a term allowing for mistuning of the s quark mass as h(m 2 ηs − [0.6885 GeV] 2 ) with prior on h of 0.0(2). We also keep the chiral logarithm term from B → π but we expect the coefficient to be a lot smaller here so we allow a prior for the coefficient of 0(1). Again, for this quantity, we will only extract a value at the physical value of m π (and not m π → 0) and, since we have results very close to that point, the value is insensitive to the details of the fit.
Our fit has a χ 2 /dof of 0.2 for 5 degrees of freedom and gives a result at physical m π for R BsK of 0.626(6) with error budget given in Table VII. This can be converted into a value for f 0 (q 2 max ) using f Bs = 0.224(5) GeV [10] and f K + = 0.1561 GeV and (1+m K /m Bs ) = 1.0927 [16]. We find Here the first error is from R BsK and the second from f Bs .

IV. DISCUSSION
We can compare our results for f 0 (q 2 max ) for B → π and B s → K to existing full lattice QCD results where an extrapolation in m π to the physical point has been done (as well as a study of f + and f 0 as a function of q 2 ). Figure 6 shows this comparison for B → π, comparing our result to that from Fermilab/MILC [1] and RBC/UKQCD [2]. Both of these calculations use a 'clover' formalism for the b quark, which has a nonrelativistic interpretation on coarse lattices seamlessly  ) for B → πlν decay. The top result is from this paper using gluon field configurations that include u, d, s and c quarks in the sea and include u/d quarks at physical masses. The lower two results [1,2] use gluon field configurations that include u, d and s quarks in the sea and extrapolate to physical mπ from heavier values. For all points statistical and systematic uncertainties have been added in quadrature.
matching to a relativistic formulation as a → 0. Details differ significantly in the two cases [37,38] with the RBC/UKQCD calculation tuning the clover term nonperturbatively and including O(α s a) current corrections to reduce discretisation errors. The Fermilab/MILC results couple their b quark to an asqtad staggered quark on the MILC 2+1 asqtad configurations. They cover a range of lattice spacing values from 0.12 fm to 0.045 fm and have a minimum light quark mass corresponding to a value for m π of 180 MeV on 0.09 fm lattices. The RBC/UKQCD results use domain-wall light quarks on their 2+1 domain-wall configurations with two lattice spacing values (0.11 fm and 0.086 fm) and a minimum light quark mass corresponding to a value for m π of 290 MeV.
As discussed above, our results have the advantage of including much lower values of m l , down to values of physical m π (indeed our maximum light quark mass corresponds to m π of 305 MeV) as well as using a more highly-improved action (to reduce discretisation errors) for both the b quark, the light quark, the current connecting them, and the gluon field.
We see agreement between our result and that from RBC/UKQCD within their larger uncertainties. There is a 2σ 'tension' between our result and that of Fermilab/MILC where we have similar 3% uncertainties. The Fermilab/MILC value for f 0 (q 2 max ) for B + → π 0 is 1.012 (28) [39] to be compared to our result of 1.108(33) (eq. 30) for the same case.
For the processes B s → K and B s → η s the extrapolation in m π is less of an issue and instead, for example, it becomes important to tune the s quark mass accurately. For our B s → K result we can compare to RBC/UKQCD as above [2] and also to an earlier HPQCD result [36] (which also gives B s → η s form factors). This latter result uses O(v 4 ) NRQCD for the b quark with HISQ valence light quarks on MILC 2+1 asqtad lattices at two lattice spacing values (0.12 fm and 0.09 fm) and with a minimum light sea quark mass corresponding to m π = 280 MeV. In all cases we see agreement with our results, given in eqs (32) and (34), within the larger O(5%) uncertainties of [2,36]. For example [2] quotes a result for f 0 for B s → K at q 2 = 23.4GeV 2 of 0.81(6) to be compared to our result at q 2 max of 23.7 GeV 2 of 0.822(20) (eq. (34)).
In the comparison of B/B s semileptonic form factors to experiment for extraction of V ub it should be emphasised that it is the f + form factor that is relevant for µν or eν final states, and not f 0 . However, similar issues arise in both cases for the extrapolation to physical m π and so the tests above for f 0 are also relevant to f + .

V. CONCLUSIONS
In this paper we have laid to rest a long-standing controversy over the relationship between the form factor f 0 at zero recoil in B → π decay and the ratio f B /f π from lattice QCD results. We calculate the ratio of f 0 to f B /f π directly and obtain particularly accurate results because: • our lattice b-light currents are accurate through α s Λ/m b and the renormalisation of the current cancels between f 0 and f B .
• we work with light quarks that have their physical mass as well as heavier masses to allow an extrapolation to m π = 0. This is well controlled because of the form the staggered quark chiral perturbation theory takes (see Appendix A).
• we use improved actions for our b quarks, light quarks and gluon fields, so that discretisation errors are reduced below α s a 2 .
We find in agreement with the soft-pion theorem result of 1. This test adds confidence to our control of lattice systematic uncertainties now that we have reached values of the π mass close to the physical point. We are then able to determine values of the f 0 form factor at zero recoil and for physical quark masses for 3 processes, obtaining the ratios defined in Section II: R Bπ (phys) = 0.805(16) R BsK (phys) = 0.626(6) R Bsηs (phys) = 0.740 (5).
Numbers vary by 30% as the quark content in initial and final state changes between l and s, with corresponding changes in the value of q 2 max . This allows us to extract: These can act as calibration values for lattice QCD calculations working at heavier-than-physical light quark masses. The uncertainties we have reached in this first 'second-generation' form-factor calculation are encouraging for the improvements that will be possible as we move away from zero recoil.
The / D operator in staggered-quark lattice discretisations has O(a 2 ) lattice-spacing corrections that do not vanish in the limit of zero quark mass m q . These are due to taste-changing interactions and affect the zero modes of the lattice Dirac operator i / D + m q in the chiral limit, altering the topological properties of the theory at nonzero lattice spacing [40,41]. In principle, therefore, one should take the continuum limit a → 0 before taking the chiral limit m q → 0; that is, there will be a smallest quark mass for each lattice spacing below which non-physical effects will show up.
In practice, these non-physical effects have been small, especially with the HISQ discretisation and similar actions designed to suppress taste-changing interactions [42]. Such possibilities have therefore had minimal effect on analyses of meson decay constants and masses (see, for example, [11]). This might seem surprising because we are now working at realistic u/d quark masses which are very small. Here we are also examining the limit as m π → 0 (based on results at non-zero values of m π and a).
Chiral perturbation theory, and, in particular, staggered chiral perturbation theory is a useful tool for analyzing such effects. The problem is then apparent from the fact that the masses of different tastes of pion are split by taste-changing interactions of O(a 2 α 2 s , a 2 α 3 s ), so that only the Goldstone pion's mass vanishes when the quark mass goes to zero. This affects the chiral logarithms in the theory since these typically involve averages over the masses of all tastes of pion in staggered chiral perturbation theory. As long as taste-splittings are small compared to the Goldstone pion mass then the difference is purely a discretisation effect. However, in the opposite limit of m π → 0, m 2 π log(m 2 π ) in the continuum theory is replaced by a 2 log(a 2 ) in the staggered-quark theory. While such terms vanish as a → 0, one might still worry that unusually large a 2 corrections and corrections that are not analytic in a 2 would make the approach to the continuum limit hard to control in practice, as well as distorting the m π dependence.
In fact, as we show here, the leading non-analyticities in a 2 cancel as m u/d → 0 when using the HISQ discretisation at current lattice spacings, at least for such phenomenologically important quantities as: f π , m 2 π , f K , f B and f B→π 0 . Consequently HISQ quarks are more continuum-like in their a and m π dependence than might have been expected, even for very small u/d masses.
The non-analytic a 2 dependence coming from the normal chiral logarithms is cancelled by contributions from the 'hairpin' diagrams in staggered chiral perturbation theory. We will show how this cancellation works in the next subsection. The cancellation occurs only for special values of the staggered chiral theory parameters δ V and δ A , but at the same values for each of the physical quantities mention above. Simulations show that the actual values are within 10% or so of the special values required for cancellation.

Staggered Chiral Perturbation Theory analysis
As examples, we first study f π , f K and f B in the 2+1 full QCD case using formulae from [31] (eqs. 27 and 28) and [32] (eq. 105). We will evaluate here the leading behaviour in a at Goldstone m π = 0 (m u/d = 0). The key terms are those that contain logarithms of masses of different tastes of pion -different tastes of kaon or other strange particle are irrelevant because their masses will be controlled by the s quark mass. We implement a model for different tastes of pion in which the tastes are equally spaced. This is a very good approximation to what happens in simulations with the Highly Improved Staggered Quark action [12] (and indeed the asqtad improved staggered quark action). This is an indication that the taste-breaking potential in the staggered chiral Lagrangian is dominated by one partic- ular term [13,43,44]. If we take the unit of splitting to be δ t (proportional to α 2 a 2 or even α 3 a 2 in an improved formalism) then the Goldstone (G) pion mass will be zero when m u/d = 0 and the other tastes (axial vector, tensor, vector and singlet) will have squared masses according to: The degeneracies of the different tastes from the Goldstone upwards in mass are: 1, 4, 6, 4, 1, making a total of 16 tastes.
Then we can evaluate the chiral log terms that appear in the NLO term multiplying 1/(16π 2 f 2 ) in staggered chiral perturbation theory at m u/d = 0. For f π this is simply: where C is a constant. The Cδ t term is then a normal discretisation error which we will ignore here. Similarly for f K and f B (which has an overall extra factor of 1 + 3g 2 B * Bπ ) the log term appears as: Thus it is clear that the chiral logarithm terms on their own will give rise to potentially troublesome δ t log δ t ≡ a 2 log a 2 terms (ignoring α s factors in δ t although in practice they make a significant numerical difference) as m u/d → 0.
In addition there are 'hairpin' terms at NLO that come in a variety of forms. The key ones that can contain terms of the form a 2 log a 2 as m u/d → 0 are those multiplying logarithms of the mass of a taste of pion or a taste of η (but not the singlet). For f π there are two such hairpin terms that we reproduce here in the axial taste case (there is also a vector taste version of them) [31]. Term 1 is : where (m) = m 2 log m 2 and this term appears with coefficient 2δ A which is itself O(a 2 ) (we absorb the a 2 into δ A here).
To evaluate T 1 we use [44]: where m 2 S A is the axial taste of the ss pseudoscalar which we take to have the same taste-splittings as the pion. Again this is a very good approximation in the HISQ case [12].
We can evaluate Z to first order in δ A at m u/d = 0 as: and then the η and η masses follow: Then we can evaluate each of the mass differences in T 1: This final mass difference, appearing in the denominator of T 1 looks dangerous but the whole term, as discussed above, is multiplied by δ A . Combining all the mass differences we have: to leading order in δ A . However δ A still appears inside m 2 η A . We approach T 2 similarly.
From above we have the mass differences: Adding T 1 and T 2, as required for the chiral expansion of f π gives: Since δ A also arises from staggered-quark taste effects we can assume that it is some multiple of the unit of taste-splitting, δ t . If we write it as a multiple, x A , of the largest taste-splitting (between the taste-singlet and the Goldstone), then δ A = 4x A δ t . The sum of T 1 and T 2 becomes An equivalent result would be obtained for the vector hairpin terms.
Combining the chiral log for f π with the δ A and δ V hairpin terms then gives: We see that x A + x V = −0.5 is a special value where the δ t log δ t (i.e. a 2 log a 2 ) terms cancel between the chiral logarithms and the hairpins. In fact this is the value that is obtained from staggered chiral perturbation theory fits to f π and f K . Such fits in [11] give (6) so that δ A ≈ δ V ≈ −δ t . Then the axial taste of η has a mass between that of the Goldstone and axial tastes of pion from eq. (A10).
The impact of this is made clear in Figure 7 in which we plot the chiral logarithm term as a function of the (Goldstone) m π . The solid black curve shows the continuum form −2(m 2 π /Λ 2 χ ) log(m 2 π /µ 2 ) with Λ χ = 4πf π (for f π = 130 MeV) and µ = 0.568 GeV. For comparison we show the result from simply replacing the continuum chiral logarithm by an average over chiral logarithms for each taste of pion (the term analysed in eq. (A1)). We take equally-spaced masses for the tastes with splittings appropriate to coarse and fine lattices from [12]. The distortion of the continuum chiral logarithm from tastesplitting discretisation effects is clear, particularly on the coarse lattices at small values of m π . In contrast, when the complete staggered chiral perturbation theory expression is taken, including the hairpin corrections discussed above, the taste-splitting discretisation effects are almost eliminated and the match with the continuum chiral perturbation theory behaviour is restored. The curves in Figure 7 used δ A = δ V = −δ t with δ t = 0.022 GeV 2 on coarse lattices and δ t = 0.007 GeV 2 on fine lattices. The impact of taste-changing discretisation effects on the chiral perturbation theory is then very small, even on the coarse lattices.
To understand why the full staggered chiral perturbation theory looks so continuum-like across the full range of m π values (and not just as m π → 0) we can consider another limit in which δ t < m 2 π with 0 < m 2 π < m 2 ηs . This region is in reasonable correspondence with the righthand side of our plots in Figure 7. Then the full staggered chiral perturbation theory gives the continuum chiral logarithms as well as terms that behave as δ t log m 2 π and m 2 π log(1 + nδ t /m 2 π ) ≡ nδ t . Both of these terms look like regular discretisation errors and are not problematic. However, once again, both of these terms cancel between the staggered chiral logarithms and the hairpin terms for the case above where δ A = δ V = −δ t . This explains why there is such good correspondence, with only very small discretisation errors visible, between the full staggered chiral perturbation theory and the continuum chiral logarithm for f π across the full range of m 2 π values in Figure 7.
The staggered chiral perturbation theory for f K and f B is similar to that for f π [31,32]. The hairpin contributions that can give a 2 log a 2 terms as m 2 π → 0 in these cases are identical to the T 1 and T 2 above for f π (eqs. (A3) and (A12)). They come with a coefficient of 1/2 instead of 2, however (again f B has the additional multiplier of 1 + 3g 2 B * Bπ ). This is exactly the overall factor of 1/4 needed to cancel the δ t log δ t piece of the chiral logarithm (eq. (A2)) when x A + x V = −0.5 as for f π . So again in this case staggered chiral perturbation theory behaves much more benignly than might be expected in its approach to the continuum limit. This will also be true, as for f π , across our range of m π values.
The staggered chiral perturbation theory for m π (in terms of m u/d ) is somewhat different from the examples above [44]. The 'chiral log' is simply (m π I ). For m u/d = 0 this becomes 4δ t log 4δ t multiplying the standard factor of 1/(16π 2 f 2 ).
The 'hairpin' terms are also somewhat different. For the axial case, keeping only the pieces that can give rise to δ t log δ t terms, we have: Using mass-splittings from above this becomes This shows how the log m π factors in the a → 0 limit cancel between the two halves of T3. To obtain the a 2 log a 2 pieces as m u/d → 0 we need to take the Goldstone m π to zero. Then we have, for the axial case: T 3 = −4δ t log δ t + 4(1 + 2x A )δ t log(1 + 2x A )δ t = 8x A δ t log δ t (A21) Again, adding in the vector taste piece, we find that the value x A +x V = −0.5 cancels the δ t log δ t behaviour from the chiral log term above. So the same values for the δ A and δ V coefficients also give rise here to a cancellation of taste-splitting artefacts in the staggered chiral perturbation theory. It is also true, as before, that this cancellation can be demonstrated explicitly for the m 2 π > 0 case when δ t < m 2 π and in practice numerically it works across the whole m π range used here.
For the form-factor f B→π 0 discussed here we use the staggered chiral perturbation theory given in [33] (eq.67). Not surprisingly a set of terms appear there containing logarithms of m 2 π,t that are the same as those appearing in the staggered chiral perturbation theory for f π and f B . Our analysis above then demonstrates that these terms (combining chiral logarithms and hairpins) behave as continuum chiral perturbation theory. In fact these terms (denoted by I 1 in [33]) cancel between f 0 and f B /f π in the ratio R Bπ that we consider here.
There are additional terms in f 0 denoted I 2 in [33] that also behave as chiral logarithms. They have the form (v.p) 2 log(m 2 π,t /µ 2 ) where v.p is the dot product of B meson velocity and π 4-momentum. For a B meson at rest and a Goldstone π in the final state this 16 gives E 2 π,G log(m 2 π,t /µ 2 ), reducing to m 2 π,G log(m 2 π,t /µ 2 ) at zero recoil. These are the chiral logarithm terms that survive in the ratio R Bπ and that we include in our fit of eq. (26). Such terms only contain δ t inside the logarithm and so cannot give rise to δ t log δ t terms. The approach to m π ≡ m π,G = 0 for R Bπ is therefore the same as in continuum chiral perturbation theory even at non-zero lattice spacing. Discretisation effects could arise from the m 2 π,t inside the logarithm but these can be shown to cancel in the δ t < m 2 π case for the same δ A and δ V values (in eq. (A18). Once again the numerical result, shown in Fig. 8, gives continuum-like behaviour for the staggered chiral perturbation across the full range of m π values. We nevertheless allow for possible m π -dependent discretisation errors in our chiral fit, as discussed in Section III.