$B$- and $D$-meson leptonic decay constants from four-flavor lattice QCD

We calculate the leptonic decay constants of heavy-light pseudoscalar mesons with charm and bottom quarks in lattice quantum chromodynamics on four-flavor QCD gauge-field configurations with dynamical $u$, $d$, $s$, and $c$ quarks. We analyze over twenty isospin-symmetric ensembles with six lattice spacings down to $a\approx 0.03$~fm and several values of the light-quark mass down to the physical value $\frac{1}{2}(m_u+m_d)$. We employ the highly-improved staggered-quark (HISQ) action for the sea and valence quarks; on the finest lattice spacings, discretization errors are sufficiently small that we can calculate the $B$-meson decay constants with the HISQ action for the first time directly at the physical $b$-quark mass. We obtain the most precise determinations to-date of the $D$- and $B$-meson decay constants and their ratios, $f_{D^+} = 212.7(0.6)$~MeV, $f_{D_s} = 249.9(0.4)$~MeV, $f_{D_s}/f_{D^+} = 1.1749(16)$, $f_{B^+} = 189.4 (1.4)$~MeV, $f_{B_s} = 230.7(1.3)$~MeV, $f_{B_s}/f_{B^+} = 1.2180(47)$, where the errors include statistical and all systematic uncertainties. Our results for the $B$-meson decay constants are three times more precise than the previous best lattice-QCD calculations, and bring the QCD errors in the Standard-Model predictions for the rare leptonic decays $\overline{\mathcal{B}}(B_s \to \mu^+\mu^-) = 3.64(11) \times 10^{-9}$, $\overline{\mathcal{B}}(B^0 \to \mu^+\mu^-) = 1.00(3) \times 10^{-11}$, and $\overline{\mathcal{B}}(B^0 \to \mu^+\mu^-)/\overline{\mathcal{B}}(B_s \to \mu^+\mu^-) = 0.00264(8)$ to well below other sources of uncertainty. As a byproduct of our analysis, we also update our previously published results for the light-quark-mass ratios and the scale-setting quantities $f_{p4s}$, $M_{p4s}$, and $R_{p4s}$. We obtain the most precise lattice-QCD determination to date of the ratio $f_{K^+}/f_{\pi^+} = 1.1950(^{+16}_{-23})$~MeV.


I. INTRODUCTION
Leptonic decays of B and D mesons are important probes of heavy-to-light quark flavor-changing interactions. The charged-current decays H þ → l þ ν l (H ¼ D þ ; D s ; B þ ; l ¼ e, μ, τ) proceed at tree level in the standard model via the axial-vector current A μ ≡Qγ 5 γ μ q, where Q is the heavy charm or bottom quark and q is the light quark in the pseudoscalar meson. When combined with a nonperturbative lattice-QCD calculation of the decay constant f H þ , an experimental measurement of the leptonic decay width allows the determination of the corresponding Cabibbo-Kobayashi-Maskawa (CKM) quark-mixing matrix element jV Qq j. Because the decays H 0 → l þ l − (H ¼ D 0 ; B 0 ; B s ) proceed via a flavor-changing-neutral-current interaction, and are forbidden at tree level in the standard model, these processes may be especially sensitive to (tree-level) contributions of new heavy particles. Both the standard model and new-physics predictions for the rare-decay branching ratios depend upon the decay constants f H 0 .
Leptonic B-meson decays, in particular, make possible several interesting tests of the standard model and promising new-physics searches. The determination of jV ub j from B þ → τ þ ν τ decay can play an important role in resolving the 2 − 3σ tension between the values of jV ub j obtained from inclusive and exclusive semileptonic Bmeson decays (see the recent reviews [1,2] and references therein). Alternatively, the decay B þ → τ þ ν τ , because of the large τ-lepton mass, may receive observable contributions from new heavy particles such as charged Higgs bosons or leptoquarks [3,4]. The branching ratios for B 0 → l þ l − and B s → l þ l − can be enhanced with respect to the standard model rates in new-physics scenarios with treelevel flavor-changing-neutral currents, such as in fourthgeneration models [5,6].
Lattice-QCD calculations of the B-meson decay constants are especially timely given the wealth of leptonic B-decay measurements from the B-factories and, more recently, by hadron-collider experiments at the LHC. The branching ratio for the charged-current decay B þ → τ þ ν τ has been measured by the BABAR and Belle experiments to about 20% precision [7][8][9][10]. The rare decay B s → μ þ μ − has now been independently observed by the ATLAS, CMS, and LHCb experiments with errors on the measured branching ratio ranging from around 20%-100% [11][12][13]; these works have also set limits on the process B 0 → μ þ μ − . Precise determinations of f B þ , f B 0 , and f B s are needed to interpret these results. Such determinations are also necessary to fully exploit coming measurements by Belle II [14], which will begin running at the Super-KEKb facility next year, as well as future measurements by ATLAS, CMS, and LHCb after the LHC luminosity and detector upgrades [15], which are planned for 2023-2025.
Several independent three-and four-flavor calculations of heavy-light-meson decay constants using different lattice actions are available [16][17][18][19][20][21][22][23][24][25][26][27][28], with uncertainties ranging from ∼0.5%-5% and ∼2%-8% for the D ðsÞ and B ðsÞ systems, respectively. The most precise results for f D and f D s have been obtained by us [23], and for f B s by the HPQCD Collaboration [17], in both cases using improved staggered sea quarks and the "highly-improved staggered quark" (HISQ) action [29] for the valence light and heavy quarks. The HISQ action makes possible this high precision because it has both small discretization errors, even at relatively large lattice spacings, and an absolutelynormalized axial current. Our previous calculation [23] of the D ðsÞ -meson decay constants employed physical-mass light and charm quarks and gauge-field configurations with lattice spacings down to a ≈ 0.06 fm; the dominant contribution to the errors on f D and f D s came from the continuum extrapolation. HPQCD's calculation of f B s with the HISQ action for the b quark employed five three-flavor ensembles of gauge-field configurations from the MILC Collaboration [30][31][32] with lattice spacings as fine as a ≈ 0.045 fm, enabling them to simulate with heavy-quark masses close to the physical bottom-quark mass. The statistical errors dominate in their calculation due to the comparatively small number of configurations per ensemble (roughly 200 on their finest up to 600 on their coarsest). Other important sources of uncertainty are from the extrapolation in heavy-quark mass up to m b and from the extrapolation to zero lattice spacing.
In this paper, we present a new calculation of the leptonic decay constants of heavy-light mesons containing bottom and charm quarks that improves upon prior works in several ways. As in our previous calculation of f D and f D s [23], we employ the four-flavor QCD gauge-field configurations generated by the MILC Collaboration with HISQ up, down, strange, and charm quarks [33]; we also use the HISQ action for the light and heavy valence quarks. We now employ three new ensembles with finer lattice spacings of a ≈ 0.042 and a ≈ 0.03 fm, and also increase statistics on the a ≈ 0.06 fm ensemble with physical-mass light quarks. Altogether, we analyze 24 ensembles, most of which have approximately 1000 configurations. We also calculate the B þ -and B 0 -meson decay constant with HISQ b quarks on the HISQ ensembles for the first time.
We fit our lattice data for the heavy-light meson decay constants to a functional form that combines information on the heavy-quark mass dependence from heavy-quark effective theory, on the light-quark mass dependence from chiral perturbation theory, and on discretization effects from Symanzik effective theory. This allows us to exploit our wide range of simulation parameters by including multiple lattice spacings and heavy-and light-quark mass values in a single effective-field-theory (EFT) fit. We present results for all charged and neutral heavy-light pseudoscalar-meson decay constants, as well as the SU(3)breaking decay-constant ratios and the differences between the charged decay constants and the decay constants in the isospin-symmetric (m u ¼ m d ) limit. In addition, we provide the correlations between our decay-constant results to facilitate their use in other phenomenological studies beyond this work. Preliminary reports of this analysis have been presented in Refs. [34,35]. This paper is organized as follows. First, Sec. II presents relevant details of the lattice actions, simulation parameters, and methodology of our calculation, including a discussion of how we deal with nonequilibrated topological charge. Next, we describe our two-point correlator fits used to obtain the heavy-light-meson decay amplitudes in Sec. III. In Sec. IV, we determine the lattice spacings and light-quark masses on the ensembles employed in this calculation, which are parametric inputs to the decay-constant analysis, and also to a determination of heavy-quark masses in a companion paper [36]. Physical quark-mass ratios and the light decay constant ratio f K þ =f π þ are obtained as a byproduct. We then calculate the physical B-and D-meson decay-constant values in Sec. V by fitting our lattice decayamplitude data at multiple values of the light-and heavyquark masses and lattice spacing to a function based on effective field theories, and interpolating to the physical light-, charm-, and bottom-quark masses and extrapolating to the continuum limit. In Sec. VI, we estimate the systematic uncertainties in the decay constants not included in the EFT fit, and provide complete error budgets. We present our final results for the B-and D-meson leptonic decay constants with total errors and discuss the impact of our results for determinations of CKM matrix elements and tests of the standard model in Sec. VII. Final results for light-quark mass ratios, f K þ =f π þ , and the scale-setting quantities f p4s and M p4s are also presented. Finally, in Sec. VIII, we conclude with an outlook to future work. Two Appendices provide useful information about (improved) staggered fermions when the bare lattice quark mass am 0 ≪1. Appendix A discusses the radius of convergence of the expansion in am 0 , while Appendix B derives the normalization factor for staggered bilinears. Appendix C provides the correlation and covariance matrices between our B-and D-meson decay constant results.

II. SIMULATION PARAMETERS AND METHODS
In this paper, we use the MILC Collaboration's ensembles of QCD gauge-field configurations with four flavors of dynamical quarks. This simulation program is described in detail in Ref. [33], and since then it has been extended to smaller lattice spacings. Here we provide information on our current calculation, and also document the new ensembles. First, in Sec. II A, we summarize the parameters of the actions and two-point correlation functions used in the analysis presented below. Three ensembles with approximate lattice spacings 0.042 and 0.03 fm are new since Ref. [33], while some of the older ensembles have been extended. In Sec. II B, we update the discussion in Ref. [33] on possible effects from using different algorithms in different parts of the simulation. Finally, in Sec. II C, we discuss effects of poor sampling of the distribution of topological charge and how to compensate for these effects.

A. Simulation parameters
The gauge action [37] is one-loop Symanzik [38] and tadpole [39] improved, using the plaquette to determine the tadpole quantity u 0 . The fermion action is the HISQ action introduced by the HPQCD collaboration [29]. The ensembles all have an isospin-symmetric sea. A single staggeredfermion field yields four species, known as tastes, in the continuum limit [40]. To adjust the number of species in the sea, we take the fourth (square) root of the quark determinant for the strange and charm (up and down) sea [41]. In addition to the perturbative arguments [40,42], this procedure passes several nonperturbative tests [43][44][45][46][47][48][49][50][51][52][53][54][55], providing confidence that continuum QCD is obtained as a → 0. Table I summarizes the ensembles used in this work. In this table, we identify the ensembles by the approximate lattice spacing a and the ratio of light sea-quark (m 0 l ) to strange seaquark mass (m 0 s ). The exact lattice spacing and physical strange-quark mass (m s ) are outputs of our decay-constant analysis and can be found in Table IX in Sec. V. The six lattice spacings range from approximately 0.15 fm to 0.03 fm, and the sea has light sea-quark masses 0.2m 0 s , 0.1m 0 s , and approximately physical. In most ensembles, m 0 s is chosen close to the physical strange-quark mass, but sometimes it is deliberately chosen far from physical to provide useful information about the sea-quark-mass dependence. In all ensembles, the charm-quark mass is chosen close to its physical value. In Table I, β ¼ 10=g 2 is the gauge coupling, T and L are the lattice temporal and spatial extents, and M π is the mass of the taste-Goldstone sea pion.
For each ensemble, the light, strange, and charm seaquark masses are estimated either from short tuning runs or from tuned masses on nearby ensembles. These values are always found to be slightly in error once higher statistics become available, so it is necessary to adjust for this small sea-quark-mass mistuning a posteriori, as we do in the fitting procedure described in Sec. V.
We compute pseudoscalar correlators for several valencequark masses on each ensemble. In almost all cases, we use light valence-quark masses of 0.1m 0 s , 0.2m 0 s , 0.3m 0 s , 0.4m 0 s , 0.6m 0 s , 0.8m 0 s and 1.0m 0 s , where the prime distinguishes the strange sea-quark mass from the post-production, bettertuned mass. To save computer time, however, for the finest ensemble with a ≈ 0.03 fm and m 0 l ¼ m 0 s =5, we only use valence-quark masses greater than or equal to the light seaquark mass 0.2m 0 s . For the physical quark-mass ensembles and the ensembles with a ≈ 0.06 and 0.042 fm, we use lighter valence-quark masses, usually going down to the estimated physical light-quark mass. The wide range of valence-quark masses on the ensembles with a ≥ 0.042 fm are used to determine the light-quark-mass dependence, while the 0.03 fm ensemble helps guide the continuum limit. This strategy saves computer time, since light-quark propagators on these lattices are expensive, the cost being approximately proportional to 1=am q . In all cases, we compute valence heavy-quark propagators with masses of 1.0m 0 c and 0.9m 0 c , to allow interpolation or extrapolation to the physical charm-quark mass. Finally, on six of the ensembles we use valence-quark masses heavier than charm to allow us to extrapolate, and on the finest lattices interpolate, to the b-quark mass. Table II shows the lightest valence-quark mass used on each ensemble in units of the strange sea-quark mass, and also the heavy-quark masses used on each ensemble.
On each configuration, we compute quark propagators from four or six evenly-spaced source time slices. We change the location of the first source time slice from configuration to configuration, shifting by an amount approximately equal to half the spacing between source time slices but incommensurate with the lattice size, so that all possible source locations are used. Table II also shows the number of source time slices used on each ensemble.

B. RHMC and RHMD algorithms
The coarser ensembles were all generated using the rational hybrid Monte Carlo (RHMC) algorithm [56][57][58][59][60][61][62][63][64][65], but some of the finer ensembles were generated with a mixture of the RHMC and the rational hybrid molecular dynamics (RHMD) [32,33,[56][57][58][59][60][61][62][63][64] algorithms. The two most recently generated ensembles, one with a ≈ 0.042 fm and physical light-quark mass and another with a ≈ 0.03 fm and m 0 l ¼ m 0 s =5, were generated entirely with the RHMD algorithm. The considerations behind these choices, and the effects of using the RHMD algorithm, are discussed in detail in Ref. [33]. Since the preparation of Ref. [33], three of the ensembles have been enlarged, which enables us to update the comparison of the RHMC and RHMD algorithms in that work. Table III shows the differences in the plaquettes between the parts of the ensembles generated with RHMC and RHMD algorithms for the ensembles where both algorithms were used. The numbers of configurations used in this comparison differ from those in Table I because  heavier-than-charm correlators were only run on parts of   TABLE I. Ensembles used in this calculation. The notation and symbols are discussed in the text. In the first column the approximate lattice spacings are mnemonic only; the precise values are tabulated in Table IX. The second column is used as a key to identify the ensembles at a given approximate lattice spacing. A dagger ( †) on am 0 s flags ensembles for which the simulation strange-quark mass is deliberately chosen far from the physical value. The M π and L values are different from those listed in Table I  the first two ensembles listed, and the third ensemble was extended slightly after this comparison was done. In addition, the plaquette was measured after every trajectory, giving 2-3 times larger statistics than used in our decayconstant calculation. Motivated by the expectation that using an approximate integration procedure amounts to simulating with a slightly different action, we can estimate the importance of these shifts by asking how much the bare coupling or, equivalently, the lattice spacing would need to be adjusted to change the average plaquette by this amount. From looking at the plaquette at a couple of lattice spacings, we find Δ lnðaÞ=Δplaq ≈ −4.2, which leads to the corresponding values of Δa=a given in the final column of Table III. Clearly, these differences are quite small. In fact, they are negligible, because in the analysis reported below we use f π to set the scale, and the fractional error on the current value for f π from the Particle Data Group (PDG) [66,67] is about 150 × 10 −5 .
The new a ≈ 0.042 fm physical-mass ensemble has the largest physical volume of the four-flavor MILC ensembles, with a spatial size of about 6 fm, while the new a ≈ 0.03 fm ensemble with m 0 l =m 0 s ¼ 1=5 has the smallest lattice spacing. When the physical volume is made larger, more low-momentum (long-distance) modes are added to the system. Based on these considerations, we do not expect this added physics to be very sensitive to the molecular dynamics step size. On the other front, the lattice spacing is made smaller by making β larger. If the ultraviolet gauge modes are viewed as free fields, the coefficient of the gauge fields in the molecular-dynamics Hamiltonian is proportional to β while the coefficient of the conjugate momenta added for the molecular-dynamics time evolution is held fixed. Thus, the frequency of the modes in molecular dynamics time is proportional to β 1=2 . Strictly speaking, if we wish to keep the fractional error fixed while increasing β, we should reduce the step size as β −1=2 . That dependence is very weak-the square root of ln a. It turns out that this scaling is more or less what was chosen empirically in going from a ≈ 0.09 fm to 0.042 fm. The step size was decreased from 0.0133 to 0.0125, or by about 6%, as β was increased from 6.3 to 7.0, corresponding to β 1=2 changing by 5%.

C. Correction for nonequilibrated topological charge
Because QCD simulations use approximately continuous update algorithms, the topological charge Q evolves more and more slowly as the lattice spacing becomes smaller. In our finest ensembles, the evolution has slowed so much that the distribution of Q has not been sampled properly. Time histories of the topological charge in many of the HISQ ensembles can be found in Ref. [68]. In Fig. 1, we show one case, a ≈ 0.06 fm and physical m 0 l , where the topological charge is well equilibrated, and a second case, a ≈ 0.042 fm and m 0 l ¼ m 0 s =5, where its distribution is clearly not well sampled.
As first discussed in Ref. [69], one can study the Q-dependence of observables in chiral perturbation theory (χPT). Bernard and Toussaint [68] recently extended this approach to heavy-light decay constants in the context of heavy-meson χPT. We use their results to adjust the raw decay-constant results to account at lowest order for the incomplete sampling of Q in the small-a ensembles. The amount of the adjustment is smaller than our statistical errors, but not negligible in comparison to other systematic effects.
We summarize here the key results that allow us to make this adjustment. Let be the heavy-light decay constant, in the normalization suitable for heavy quarks. Let B denote either the meson mass M, the decay constant f, or the combination Φ H . In a finite volume V at fixed Q, the masses and decay constants obey [69,70].
where on the right-hand side B is the infinite-volume value, properly averaged over Q, B 00 is its second derivative with respect to the vacuum angle θ, evaluated at θ ¼ 0, and χ T is the topological susceptibility where subscripts x and y denote flavor, and the meson mass and decay constant are at θ ¼ 0. A similar calculation in heavy-meson χPT gives [68] Φ 00 ð2:5Þ where m x is the mass of the light valence quark, and B 0 , λ 1 , and λ 0 1 are low energy constants, which are estimated in a companion paper on heavy-light meson masses [36]. These are the appropriate results even with 2 þ 1 þ 1 flavors of sea quark, because the charmed sea quark decouples from the chiral theory. Although the dependence of masses and decay constants are usually small compared to our statistical errors, we have been able to resolve them in some of our well-equilibrated ensembles and confirm, within limited statistics, that our data agree with these formulas [68,71]. l ¼ m 0 s =5 run at a ≈ 0.042 fm, shows a case where the time history is not well sampled, and where we will apply the correction factors discussed in Ref. [68].
Knowing the dependence of masses and decay constants on the average Q 2 , one can correct the simulation results to account for the difference of the simulation average hQ 2 i sample , and the correct hQ 2 i. The lowest order χPT result for the topological susceptibility is [72] where the effect of staggered taste-violations has been included at leading order by using the taste-singlet meson masses [73,74], indicated by "I." The correction to the decay constants is then given by with χ T from Eq. (2.7). MILC has calculated hQ 2 i sample on all ensembles listed in Table I. For more details, see Ref. [68]. For three of the finest ensembles, namely those at a ≈ 0.042 and 0.03 fm, the simulation time histories of Q 2 show that it is not well equilibrated. In the analysis below, we use Eq. (2.8) with hQ 2 i sample calculated by MILC to adjust the decay-constant data. The adjusted data are used in our central fit, and we take 100% of the difference between fit results with the adjusted data and with the unadjusted data as the systematic error in our results from incomplete equilibration of the topological charge.

III. TWO-POINT CORRELATOR FITS
Our procedures for calculating pseudoscalar meson correlators and for finding masses and amplitudes from these correlators are the same as those used in our earlier computation of charm-meson decay constants in Ref. [23]. Our analysis includes new and extended ensembles, however, so the fit ranges and the number of states employed have been updated.
We compute quark propagators with both "Coulombwall" and "random-wall" sources, using four source time slices per gauge-field configuration in most cases, but six source time slices on the 0.042 fm m s =5 ensemble and the 0.06 and 0.042 fm physical quark mass ensembles. The pseudoscalar decay constant is obtained from the amplitude of a correlator of a single-point pion operator, jM −1 ðx; yÞj 2 , where M is the lattice fermion matrix = D þ m. The randomwall source consists of a randomly oriented unit vector in color space at each spatial lattice point at the source time. When averaged over sources, contributions to the correlator where the quark and antiquark are on different spatial points average to zero, so the average correlator is just the point-to-point correlator multiplied by the spatial size of the lattice, and the improved statistics from averaging over all the spatial source points more than makes up for the noise introduced from contributions with the quark and antiquark at different spatial points. We use three random source vectors at each source time slice.
For the Coulomb-wall source we fix to the lattice Coulomb gauge, and then use a source in a fixed direction in color space at each spatial lattice point. We use three such vectors, chosen to lie along the three coordinate axes in color space. The Coulomb-wall source is effectively smeared over the whole spatial slice, which we expect to suppress the overlap with excited hadrons, allowing us to use smaller distances in our fits. The Coulomb-wall correlators also have smaller statistical errors. We fit the correlators from the Coulomb-wall and random-wall sources simultaneously with different amplitudes for each source but common masses. The ground-state amplitude from the random-wall source gives the decay constant, but the Coulomb-wall source helps in accurately fixing the ground state mass, which in turn improves the determination of the random-wall amplitude. Figure 2 shows an example of heavy-light pseudoscalar correlators from the a ≈ 0.042 fm physical quark-mass ensemble for the lightcharm and strange-charm masses, showing the smaller excited state contamination in the Coulomb-wall correlator.
In all cases the sink operator is point-like, with quark and antiquark propagators contracted at each lattice sites. We sum the correlators over all spatial slices to project onto zero three-momentum.
The source time slices are equally spaced throughout the lattice. The location of the first source time slice varies from configuration to configuration by adding an increment close to one half the source separation, but such that all source slices are eventually used. For example, on the a ≈ 0.042 fm physical quark-mass ensemble, where we use six source time slices with a separation t=a ¼ 48, the location of the first source time slice on the Nth configuration is 19N mod 48, or a shift of 19 slices between successive configurations. Meson masses and decay constants are obtained from fitting to these correlators. For the light-light mesons, we include contributions from the ground state and one opposite parity state in the fit function, taking a large enough minimum distance to suppress excited states. This procedure works well for the light-light pseudoscalars, for which broken chiral symmetry makes the ground state mass much lighter than all the excited state masses.
Because the heavy-light correlators are noisier than the light-light correlators, and the gap in mass between the ground state and excited states is smaller, we include smaller distances and more states in the two-point correlator fits. The fits that yield the central values employed in the subsequent EFT analysis include three states with negative parity (pseudoscalars) and two states with the opposite parity, corresponding to the oscillations in t seen in Fig. 2. We refer to these as "3 þ 2" state fits. For these fits, the minimum distances and fit ranges used vary with the heavy-quark mass. However, they are kept constant in physical units across all ensembles with different sea-quark masses and lattice spacings, subject to being truncated to an integer in lattice units. In these fits, the mass gaps are constrained with Gaussian priors [75], but the amplitudes are left unconstrained. Table IV shows the constraints on the mass gaps used in the heavy-light correlator fits. Although we use loose priors for the lower splittings, tighter priors are needed for the higher splittings to ensure stable fits. Figure 3 shows the masses of the five fitted states as a function of the minimum distance included in the fit on the a ≈ 0.042 fm (left) and 0.06 fm (right) physical quark-mass   ensembles. In this plot, the size of the symbols is proportional to the quality of the fit p. We compute the p values of our fits using the augmented χ 2 that includes both data and prior contributions, and counting the degrees of freedom as the number of data points minus the number of unconstrained fit parameters. Thus it provides a measure of the compatibility of the fit result with both the data and the prior constraints. At small t min the p-value is poor, and more states would be required to get a good fit. At intermediate distances, the masses are mostly determined by the data, while at the largest distances the fit simply returns the prior central values and errors for excited-state and opposite-parity masses. We also perform heavy-light fits using 2 þ 1 states with larger minimum distances as a check, and use the difference between results of the 3 þ 2 state fits and 2 þ 1 state fits to estimate systematic errors coming from excited state contamination. Based on studies like Fig. 3 on every ensemble, we choose the minimum distances t min =a so that t min is as close as possible to the minimum distances given in Table V. As seen in this table, we use a slightly smaller t min =a for the Coulomb wall source since these correlators have smaller excited state contamination than the random wall source correlators. We expect the p values to be approximately uniformly distributed, with possible systematic deviations from uniformity coming from artificially loose or tight priors on the mass gaps, and, more importantly, neglecting effects of autocorrelations on the covariance matrix of the correlator at different distances. Figure 4 shows the distribution of p values for our full set of correlator fits using the fit ranges and number of states in Table V. It is approximately uniform from 0 to 1, indicating that we have not introduced any systematic bias in our fits from the choice of fit ranges or number of states. Because the p-values from correlators with different valence-quark masses in the same ensemble are strongly correlated, the statistical fluctuations in this histogram are larger than the expectation 1= ffiffiffiffi N p for independent data.
In order to subsequently fit the decay constants and masses obtained from these two-point correlator fits to an EFT function of the quark masses and lattice spacing, we need an estimate of the covariance matrix between these data. (Here the heavy-light decay constant is to be understood as Φ.) To distinguish this covariance matrix from the matrix of covariances of the correlators at different distances used in the two-point fits, in this section we refer to matrices of covariances of masses M and decay constants Φ as "MΦ covariance matrices." In the MΦ covariance matrix, all of the amplitudes and decay constants for different sets of valence quark masses are correlated, while those from different ensembles are uncorrelated. Thus, the MΦ covariance matrix is a large block-diagonal matrix, with each block corresponding to a single ensemble.
To obtain each block of the MΦ covariance matrix, we use a single-elimination jackknife procedure, omitting one configuration at a time from the two-point fits. This approach does not account for autocorrelations. Unfortunately, however, it is not practical to eliminate large enough blocks in the jackknife to suppress the autocorrelations, since we need a number of jackknife blocks that is large compared to the dimension of the block of the MΦ covariance matrix for that ensemble. We therefore use an approximate procedure. We first compute the block of the MΦ covariance matrix from the single-elimination jackknife, and then compute the dimensionless correlation matrix by rescaling rows and columns so that the diagonal elements are one. Next we compute the diagonal elements of the MΦ covariance matrix (that is, the variances of the masses and decay constants) using a block size large enough to reasonably well suppress the effects of autocorrelation, and rescale the rows and columns of the MΦ covariance matrix to set its diagonal elements equal to the variances obtained from blocking. On all ensembles with a ≳ 0.09 fm, we blocked the configurations by four; we used larger block sizes of up to 24 configurations on ensembles with finer lattice spacings to account for the longer autocorrelation times. This approach uses the single-elimination jackknife to determine the (dimensionless) correlations of all the masses and decay constants, and the blocked jackknife, which accounts for autocorrelations between gauge-field configurations, to determine the variances of each mass or decay constant.
The MΦ covariance matrix used in the EFT fit affects the p-value of the fit and the central values obtained for the decay constants at the physical quark masses and in the continuum limit. The statistical errors on the masses and decay constants in the MΦ covariance matrix range from 0.005% to 0.12% and 0.04% to 1.4%, respectively. The statistical errors quoted on the physical, continuum-limit decay constants are, however, obtained by an overall jackknife procedure, where we repeat the entire fitting chain 20 times, each time omitting 1=20 of the configurations from each ensemble.

IV. LATTICE SPACING AND QUARK-MASS TUNING
Tuning the masses of the light and charm quarks and the determination of the lattice spacings follow the procedure described in detail in Ref. [23]. In this procedure, we use TABLE V. Minimum distances used in our two-point correlator fits. Here "light" quarks include masses up to m s and "heavy" quarks masses beginning at m c , and the two numbers in the second column are the number of pseudoscalar and oppositeparity states included in the fit. The "Ã" indicates that this minimum distance is actually taken to depend weakly on the heavy quark mass, with the quoted distance the one used for the D s correlator.  Table V. the meson masses and decay constants in the physical quark mass ensembles (with a small correction for mistuned light quark mass), extrapolated to the continuum, to find the u, d, s, and c quark masses used in subsequent steps, and the lattice spacings of each ensemble. For setting the overall scale we use the pion decay constant f π . We also compute an intermediate scale f p4s , the decay constant of a fictitious pseudoscalar meson with degenerate valence quark with mass m p4s ¼ 0.4m s . To obtain f p4s and the associated meson mass M p4s , we draw quadratic functions in the valence-quark mass through the decay-constant and meson-mass data with degenerate valence quarks at 0.3, 0.4 and 0.6 times m 0 s , and evaluate these quadratic functions at 0.4 times the tuned strange quark mass m s . The quantity f p4s is convenient since it has small statistical errors and can be computed without light valence quark mass correlators. This feature is essential for the 0.03 fm ensemble where the lightest valence quark mass is m 0 s =5, so an extrapolation to f π on this ensemble would have large errors.
An initial value for the charm quark mass comes from matching the D s mass. With this m c and the light quark masses, we evaluate the masses of the D 0 and D þ mesons. The difference between them, 2.6 MeV, can be considered to be the part of the D þ -D 0 mass difference coming from the difference in the up and down quark masses. In Sec. VI, this quantity is denoted Cðm d − m u Þ and used to estimate the electromagnetic contribution to the mass splitting.
As discussed in Sec. II, the main new aspects of this work are the addition of three new ensembles and the increased statistics on some of the others. We also make some minor updates of the input parameters. The value of f π , used to set the scale, has been updated to 130.50 AE 0.13 MeV following the PDG [66,67], and the experimental neutral kaon and charmed meson masses have also seen slight changes.
In contrast with Ref. [23], we now use the strong coupling α V at scale q ¼ 2.0=a obtained from Ref. [76] in our central fit, and use α T , inferred from taste splittings, in an alternative fit to estimate systematic errors.
We also update the quantities ðM 2 K 0 Þ γ and ϵ 0 , which describe electromagnetic effects, to reflect the most recent results from the MILC Collaboration [77][78][79]. The quantity ðM 2 K 0 Þ γ is the electromagnetic contribution to the squared mass of the neutral kaon. The quantity ϵ 0 captures higherorder corrections to Dashen's theorem: We use ϵ 0 rather than the closely related quantity ϵ defined in Ref. [80] as Because the experimental pion splitting is largely due to electromagnetism, ϵ and ϵ 0 are close in size. The difference is estimated in Refs. [80,81] to be which is used to find ϵ 0 .
In this paper, we use [79] Our adjusted kaon masses, or "QCD masses," are then found from These quantities are used to match pure QCD to the more fundamental QCD þ QED. Consequently, any pure QCD calculation will have uncertainties coming from the particular scheme for separating electromagnetic and isospin effects. Our scheme is the one introduced for u and d quarks in Ref. [82] and extended naturally to the s quark using the fact that mass renormalization for staggered quarks is multiplicative [79]. As an estimate of the change that would result from the use of a different, but still reasonable, scheme, MILC compares to a scheme where the EM mass renormalization is calculated perturbatively (at one loop). While the resulting scheme dependence of ϵ 0 is small, AE0.038 [79], that of ðM 2 K 0 Þ γ is ∼420 MeV 2 , much  [66], and the meson masses after adjusting for electromagnetic effects (right side).

Experimental inputs QCD masses
larger than the errors in this quantity in a fixed scheme, although still small compared to M 2 K 0 . 1 Table VI summarizes the experimental masses that we use, and also the "QCD masses" where we have made the adjustments for electromagnetic effects described above, and the adjustments for the heavy meson masses from Eq. (6.1) in Sec. VI.
We extrapolate the scale-setting quantities f p4s and M p4s and the quark-mass ratios m u =m d , m s =m l , and m c =m s on the physical quark-mass ensembles to the continuum using a quadratic function in α s a 2 . The fit of m c =m s including all lattice spacings is poor, with p ¼ 0.01, because discretization errors from the charm quark are large at our coarsest lattice spacing. The m c =m s fit improves substantially to p ¼ 0.8 when the a ≈ 0.15 fm data are omitted. In an analysis of the heavy-light-meson masses in Ref. [36], we encounter similar problems when including data from the a ≈ 0.15 fm ensembles. We therefore omit the a ≈ 0.15 fm ensembles from our central continuum extrapolations here, in Ref. [36], and in the EFT analysis of the heavy-light decay constants in Sec. V. For estimating systematic errors from our choice of continuum extrapolation of scale-setting quantities, we also consider a fit quadratic in α s a 2 including all five physical quark-mass ensembles (as was done in Ref. [23]), a fit linear in α s a 2 omitting the 0.15 fm ensemble, a fit linear in α s a 2 omitting both the 0.15 fm and 0.12 fm ensembles, and a fit using α s inferred from taste violations. Figure 5 shows these extrapolations for the intermediate scale f p4s . In this fit, as in the other quantities discussed in this section, the central fit, shown in red, is at one end of the various extrapolations to a ¼ 0. We therefore assign a onesided systematic error from continuum extrapolations equal to the difference between this continuum extrapolation and the furthest of our alternative fits.
We assign five distinct systematic uncertainties to scalesetting quantities and quark-mass ratios stemming from electromagnetic effects, and tabulate them in Table VII. The first of these, labeled "K þ -K 0 splitting," is obtained by shifting ϵ 0 by the lower error bar, −0.11, in Eq. (4.4), and the error in the other direction is obtained by scaling by −8=11. Varying the result for ðM 2 K 0 Þ γ in Eq. (4.5) by its total error gives the second error, labeled "K 0 mass." The uncertainty labeled "K-mass scheme" is an estimate of the variation that would be produced by matching QCD þ QED to pure QCD in an alternative reasonable scheme. This is not taken to be a systematic error in our results, since we work in a fixed, well-defined, scheme. However, when using our results in a setting that does not take into account the subtleties of the EM scheme, one may wish to incorporate the estimate of scheme-dependence as an additional uncertainty. The two remaining electromagnetic uncertainties, which are discussed in more detail in Sec. VI, arise from electromagnetic effects on the relevant heavylight meson masses. In fact, only the EM effect on the mass of the D s , used to fix the charm quark mass, is needed here. From the estimates in Sec. VI, this effect is about 1.3 MeV, which is subtracted from the experimental D s mass before tuning the charm-quark mass, and 100% of the resulting shift is included in our systematic error estimates in the column labeled "H x mass." Scheme dependence arises again in the EM contribution to the D s mass, and we estimate it at 4.2 MeV in Sec. VI. The resulting uncertainty is listed in the column labeled "H s -mass scheme." The three uncertainties that do not arise from the choice of scheme, namely K þ -K 0 splitting, K 0 mass, and H x mass, are summed in quadrature to give the error labeled "Electromagnetic corrections" in the full error budget, Table VIII.
Another systematic error comes from possible incomplete adjustments for the effects of incorrect sampling of the distribution of the topological charge. Using the corrections found in Ref. [68] and described in Sec. II C, we adjust the meson masses and decay constants on the 0.042 and 0.03 fm ensembles to compensate for the incorrect average of the squared topological charge. We conservatively take 100% of the effects of this adjustment as a systematic error coming from poor sampling of the topological charge distribution.

FIG. 5.
Continuum extrapolations for f p4s on the physical quark mass ensembles. Our central fit, shown in red, is quadratic in α s a 2 excluding the 0.15 fm data. Alternative fits used for estimating systematic error are shown in blue. These include a quadratic fit including all the data, a linear fit including data up to 0.12 fm, and a linear fit including data up to 0.09 fm. The large error bar on the central fit line shows the statistical error on this fit at 0.15 fm, the point that is not included in this fit. 1 A preliminary value for ðM 2 K 0 Þ γ was reported in Ref. [83]. That result did not yet take into account EM quark-mass renormalization and is thus not reliable.
Corrections for finite spatial volume are estimated by the same procedure as in Ref. [23], where our central fit includes adjustments calculated in NLO staggered chiral perturbation theory, and an associated systematic error is taken to be the difference between this adjustment and using nonstaggered finite-volume chiral perturbation theory, at NNLO for M π and f π , and NLO for M K and f K . These estimates are considerably smaller than in Ref. [23] because we have now dropped from the central fit the coarsest ensembles, with a ≈ 0.15 fm, which dominate the earlier estimate. The tastesplittings at the next coarsest lattice spacing, a ≈ 0.12 fm, are about a factor of 2 smaller than at a ≈ 0.15 fm [33], so the difference between staggered and nonstaggered chiral perturbation theory is correspondingly reduced when the a ≈ 0.15 fm data are dropped.
Finally, we propagate the uncertainty in the PDG value of f π . The main effect is an overall scale error in dimensionful quantities. Because the decay constants depend on quark masses, an indirect effect also arises, leading to an uncertainty on dimensionless ratios, and a reduction in the uncertainty on dimensionful quantities, compared to the direct scale error. For the ratio m u =m d the experimental uncertainty in M K 0 − M K þ is also included. Table VIII shows the error budgets for the outputs of the scale-setting and quark-mass-ratio analysis, which are used in the subsequent fitting of the heavy-light results. The central values for these quantities are listed in Sec. VII B.

V. EFFECTIVE-FIELD-THEORY ANALYSIS
In this section, we discuss how we combine the lattice data for the meson masses and decay constants described in the previous sections to obtain continuum-limit, physical-quark-mass results. There are two crucial features of our data set. First, as discussed in Sec. II, the range of parameters is broader than that commonly encountered in lattice-QCD calculations. Figure 6 shows the lattice spacings and pion masses of the ensembles used in our analysis. The lattice spacing spans the range 0.03 fm ≲ a≲ 0.15 fm, while the light sea-quark mass lies between TABLE VII. Electromagnetic errors on, and estimates of scheme dependence of scale-setting parameters, quark mass ratios, and, for convenience, the phenomenologically interesting ratio f K þ =f π þ , and the ratio of the kaon to pion decay constants in the isospin symmetric limit, f K =f π .
Error budgets in per cent for scale-setting parameters, quark mass ratios, f K þ =f π þ , and f K =f π .  [23] are indicated with black outlines. Ensembles with unphysical strange-quark masses are shown as gold disks with orange outlines. The area of each disk is proportional to the statistical sample size N conf × N src . The physical, continuum limit is located at ða ¼ 0; M π ≈ 135 MeVÞ.
With the HISQ action, it is possible to simulate physical charm and bottom quarks with controlled discretization errors. Figure 7 shows the range of valence heavy-quark masses used in our analysis. On the coarsest a ≈ 0.15 and 0.12 fm ensembles, we have only two values m h ¼ 0.9m 0 c and m 0 c ; on our finest a ≈ 0.042 and 0.03 fm ensembles, however, we have several heavy-quark masses between 0.9m 0 c ≤ m h ≤ 5m 0 c , reaching just above the physical b-quark mass. Second, as discussed in Sec. III, we have large statistical sample sizes, with about 4,000 samples on most ensembles and large lattice volumes; the resulting errors on the decay constants range from 0.04% to 1.4%.
Because of the breadth and precision of the data set, it is a challenge to find a theoretically well-motivated functional form that is sophisticated enough to describe the whole data set. We therefore rely on several EFTs to parameterize the dependence of our data on each of the independent variables just described: Symanzik effective field theory for lattice spacing dependence [38], chiral perturbation theory for light-and strange-quark mass dependence, and heavy-quark effective theory for the heavy-quark mass dependence. These EFTs are linked together within heavy-meson rooted all-staggered chiral perturbation theory (HMrASχPT) [84]. Here we use the one-loop HMrASχPT expression to describe the nonanalytic behavior of the interaction between pion (and other pseudo-Goldstone bosons) and the heavy-light meson, and supplement it with higher-order analytic functions in the light-and heavy-quark masses and lattice spacing to enable a good correlated fit.
Even with these additional terms, however, the extrapolation a → 0 and the interpolation m h → m b oblige us to restrict the range of am h . In practice, we are able to obtain a good correlated fit of our data with heavy-quark masses am h ≤ 0.9. Note, however, that our final fit function describes even the data with am h > 0.9 quite well.
The rest of this section is organized as follows. In Sec. VA, we construct an EFT-based fit function with enough parameters (60) to describe the data as a function of the light-and heavy-quark masses and lattice spacing. For convenience, the complete final expression is written out in Sec. V B. Next, Sec. V C explains how we convert our decay-constant data from lattice units to "p4s units" and, eventually, to MeV. Finally, we describe how the fit works in practice and present our final fit used to obtain the decayconstant central values and errors in Sec. V D.
A. Effective-field-theory fit function for heavy-light decay constants Recall that H x denotes a generic heavy-light pseudoscalar meson composed of a light valence quark x and a heavy valence antiquarkh, with masses m x and m h , respectively. The decay constant and mass of H x are f H x and M H x , respectively. In heavy-quark physics, the conventional decay constant is defined and normalized as Φ H We start with massless light quarks, with Φ 0 and M 0 denoting the decay constant and the meson mass in this limit. We parametrize Φ 0 as whereΦ 0 is the matrix element of the HQET current in the infinite-mass limit, Λ HQET is a physical scale for HQET effects that we set to 800 MeV in this analysis, and the Wilson coefficient C arises from matching the QCD current and the HQET current [85,86] at scale m h :  The μ dependence in the usual Wilson coefficient comes from the exponential of the integral of the anomalous dimension of the HQET current, and therefore may be factored out.
As mentioned in Sec. II, we use m 0 l , m 0 s , and m 0 c to denote the simulation masses of the light (up-down), strange, and charm quarks, respectively; without the primes m l ¼ 1 2 ðm u þ m d Þ, m s , and m c denote the correctly tuned masses of the corresponding quarks.
We now discuss the dependence of Φ H x on the deviation of m 0 c from m c . The charm quark can be integrated out for processes that occur at energies well below its mass. By decoupling [87], the effect of a heavy (enough) sea quark on low-energy quantities occurs only through the change it produces in the effective value of Λ QCD in the low-energy (three-flavor) theory [88]. We use Λ Noting thatΦ 0 has mass-dimension 3=2, we take into account the effects of the mistuned mass m 0 where the indices S and Ξ run over sea-quark flavors and meson tastes, respectively; Δ Ã is the lowest-order hyperfine splitting; δ Sx is the flavor splitting between a heavy-light meson with light quark of flavor S and one of flavor x; δ 0 V and δ 0 A are taste-breaking hairpin parameters; and g π is the H-H Ã -π coupling. Definitions of the residue functions R ½n;k j , the sets of masses in the residues, and the chiral functions l and J at infinite and finite volumes are given in Ref. [84] and references therein. At tree-level in HMrASχPT, the squared pion mass is linear in the sum of quark masses, M 2 π ≈ B 0 ðm u þ m d Þ þ a 2 Δ Ξ , where B 0 is a low-energy constant (LEC) and the splitting a 2 Δ P ¼ 0 for the taste-pseudoscalar pion. We exploit this relation to define dimensionless quark masses and a measure of the taste-symmetry breaking as ð5:8Þ where q denotes the valence or sea light quark 3 and a 2Δ is the mean-squared pion taste splitting. The x q s and xΔ are natural variables of HMrASχPT; the LECs L s , L x , and L a are therefore expected to be of order 1. The taste splittings have been determined to ∼1-10% precision [33] and are used as input to Eq. (5.6).
Because we have very precise data and approximately 500 data points, NLO HMrASχPT is not adequate to describe fully the quark-mass dependence, in particular for masses near m s . We therefore include all mass-dependent analytic terms at next-to-next-to-leading order (NNLO) and next-tonext-to-next-to-leading order (NNNLO) by defining The terms that depend upon the light valence-quark mass are needed to describe our wide range of correlated data with x l ≤ x x ≤ x s . The terms without x x are expected to be less important for obtaining a good fit because most of the ensembles have similar strange sea-quark masses, and because the ensembles are statistically independent, but we include them to make it a systematic approximation at the level of analytic terms. We also include a quartic term q 12 x 4 x , again to describe our wide range of valence-quark masses.
The staggered chiral form in Eq. (5.6) is given at fixed heavy-quark mass m h , or equivalently at fixed M H s . As discussed above, the LECs in Eq. (5.6) encode the effects of short-distance physics, and the dependence can be parametrized as expansions in inverse powers of the meson mass M H s and powers of the lattice spacing of each ensemble. To take the effects at scale M H s into account, we replace ð5:10Þ and similarly for L s and g π . We do not introduce any corrections to L a because it is suppressed by a factor of α 2 s a 2 at the finest lattice spacings where the heavy-quark mass dependence could be important. (At coarsest lattice spacings we only have valence heavy-quark masses near charm and thus the variation due to the valence heavy-quark masses is less important.) We also add a 1=M H s correction term (but not 1=M 2 H s ) to the four analytic terms at NNLO: Meson-mass dependence also appears implicitly through the hyperfine splitting Δ Ã and the flavor splitting δ Sx in Eq. (5.6). To fix the heavy-mass dependence of Δ Ã , which first appears at order 1=m h , we use ð5:12Þ with A Δ Ã and B Δ Ã fixed by demanding that Δ Ã reproduce the experimental values of the hyperfine splitting in the D and B systems. Similarly, we determine δ Sx by writing and fixing A δ and B δ from the known flavor splittings in the D and B systems.
To enable a description of our data with a wide range of lattice spacings from 0.03 fm ≲ a ≲ 0.15 fm, we incorporate lattice artifacts into the fit function as follows. Tastebreaking discretization errors in masses of light mesons, which affect the decay constants of heavy-light mesons at one-loop in χPT, are already included in the staggered chiral form in Eq. (5.6). In addition to these NLO effects, various discretization errors in the LECs must be taken into account. In Appendix B, we use HQET to study heavyquark discretization effects at the tree level [89,90]. At the leading order, tree-level heavy-quark discretization errors are eliminated via a normalization factor, and at the next order in HQET discretization errors start at order x 4 h and α s x 2 h , where x h ¼ 2am h =π. For these and generic lattice artifacts, we replace in Eq. (5.5) where Λ is the scale of generic discretization effects, set to 600 MeV in this analysis. A factor of α s is included in the c 1 and c 4 terms because the HISQ action is tree-level improved to order a 2 [91], so the leading generic discretization errors start at order α s ðaΛÞ 2 or α s ðam h Þ 2 . In addition, a factor of α s is included in the c 5 and c 6 terms because of the tree-level normalization factor. For k 1 and k 2 in Eq. (5.5), we likewise replace k 1 → k 1 ½1þc 0 1 α s ðaΛÞ 2 þc 0 2 ðaΛÞ 4 þc 0 3 x 4 h þα s ðc 0 4 x 2 h þc 0 5 x 4 h Þ; ð5:15Þ No factor of α s is included in the c 0 3 term, because k 1 parametrizes effects at NLO in HQET.
Let us return to the parameters L x , L s , and g π found in δΦ NLO . Owing to the Naik improvement term, it is enough to introduce corrections of order α s ðaΛÞ 2 and ðaΛÞ 4 . Similarly, we add α s ðaΛÞ 2 corrections to the NNLO analytic terms in Eq. (5.9). Finally, to incorporate effects of heavy-quark discretization errors, we include corrections to L x , L s , and g π , as explained in Appendix B.
Our final EFT fit function has 60 fit parameters. With reasonable prior constraints on the large number of parameters describing discretization effects [three parameters at NLO in SχPT (δ 0 V , δ 0 A , L a ); 16 parameters for generic discretization effects in powers of ðaΛÞ; 10 parameters for the heavy-quark discretization], the uncertainties from the continuum extrapolation are propagated to the statistical error reported by the fit. We test this expectation in Sec. VI by looking at the stability of the results to changes in the widths of the prior constraints, the number of fit parameters, and the data included in the fit.

B. Summary formula
In summary, letting F be our fit function from Sec. VA, and letting blue (arXiv) denote fit parameters, we have and the indices i and j correspond to the labels of the terms in Eq. (5.9). The chiral logarithm term δΦ NLO is given by Eq. (5.6) with the replacements L s →L s , L x →L x , and g π →g π . It depends upon the LECs f, L a , δ 0 V , and δ 0 A ; the hyperfine splitting Δ Ã ; and the tasteindependent flavor splitting δ Sx . The breved quantities include terms that allow for the χPT parametersΦ 0 , k 1 , k 2 , L x , L s , and g π to have heavy-quark mass and latticespacing dependence:Φ 0 ¼Φ 0 ½1 þ c 1 α s y þ c 2 y 2 þ c 3 y 3 þ α s ðc 4 x 2 h þ c 5 x 4 h þ c 6 x 6 h Þ; ð5:19aÞ h þL 0 s α s y þL 00 s y 2 þ L 000 s w h α s x 2 h ; ð5:19eÞ h þg 0 π α s y þg 00 π y 2 þ g 000 π w h α s x 2 h : ð5:19fÞ Thus, there are a total of 60 fit parameters. Of these f is constrained by expectations from χPT, g π is constrained by the results of other lattice-QCD calculations, and δ 0 V and δ 0 A are constrained by MILC's light-pseudoscalar-meson χPT fits.

C. Setting the lattice scale for the EFT analysis
We set the lattice scale with a two-step procedure that combines the pion decay constant with the so-called p4s method, in a way similar to Ref. [23]. In the first step of the procedure, we use the PDG value of f π , f π;PDG ¼ 130.50ð13Þ MeV [66,67], to set the overall scale and to determine tuned quark masses for each physical-mass ensemble. Then, as described in Sec. IV, we calculate M p4s and f p4s , which are the mass and decay constant of a pseudoscalar meson with both valence-quark masses equal to m p4s ≡ 0.4m s , and with physical sea-quark masses. The continuum-extrapolated values of f p4s , R p4s ≡ f p4s =M p4s , and quark mass ratios are then used as inputs to the second step of the procedure, which we refer to as the p4s method. In the p4s method, we find am p4s and af p4s on a given physical-mass ensemble by adjusting the valence-quark mass am x until ðaf x Þ=ðaM x Þ takes the same value as the continuum-limit ratio R p4s just determined. In the p4s method, we use a mass-independent scale setting, in which all ensembles at the same β as a physical-mass ensemble have, by definition, the same lattice spacing a ¼ ðaf p4s Þ=f p4s and am p4s .
To determine am p4s and af p4s accurately, the data must be adjusted for mistunings in the sea-quark masses. To make these adjustments, we use the derivatives with respect to quark masses, which were calculated in our earlier work and listed in Table VII of Ref. [23]. We then iterate, computing am p4s and af p4s , readjusting the data, and repeating the entire process until the values of am p4s and af p4s converge within their statistical errors. The results for the lattice spacing a and am s ¼ 2.5am p4s are listed in Table IX. For the smallest lattice spacing, a ≈ 0.03 fm, where we do not have an approximately physical-mass ensemble, we rely on the derivatives to determine a and am s from data on the m 0 l =m 0 s ¼ 0.2 ensemble, leading to larger relative systematic errors at β ¼ 7.28.

D. Effective-field-theory fit to heavy-light decay constants
In Sec. VA, we have constructed an EFT fit function that contains 60 fit parameters. We use this function to perform a combined, correlated fit to the partially-quenched data at the five lattice spacings, from a ≈ 0.12 fm to ≈0.03 fm, and at several values of the light sea-quark masses. The sixth lattice spacing, a ≈ 0.15 fm, is used in a check of the estimate of discretization errors, but not included in the base fit used to obtain our central values and statistical errors. At the coarsest lattice spacings, we have data with only two different values for the valence heavy-quark mass: m h ¼ m 0 c and m h ¼ 0.9m 0 c . Recall that m 0 c is the simulation value of sea charm-quark mass of the ensembles, and is itself not precisely equal to the physical charm mass m c because of tuning errors. At the finest lattice spacings, we have a wide range of valence heavy-quark masses from near charm to bottom. We include all data with 0.9m 0 c ≤ m h ≤ 5m 0 c , subject to condition am h < 0.9, which is chosen to avoid large lattice artifacts. Note that our analysis includes an a ≈ 0.03 fm, m 0 l =m 0 s ¼ 0.2, ensemble for which am b ≈ 0.6, and thus no extrapolation from lighter heavy-quark masses is needed, although a chiral extrapolation to physical light-quark masses is required.
We use a constrained fitting procedure [75] with priors set as follows. For the LEC g π of the D system, we use the prior g π ¼ 0.53 AE 0.08, which is based on lattice-QCD calculations [92][93][94]. For 1=f 2 in Eq. (5.6), our prior is where we set f π ¼ 130.5 MeV and f K ¼ 156 MeV. For the taste-breaking hairpin parameters, we use priors of δ 0 A =Δ ¼ −0.88 AE 0.09 and δ 0 V =Δ ¼ 0.46 AE 0.23, which are taken from chiral fits to light pseudoscalar mesons [95]. The fits of Ref. [95] have been performed at a ≈ 0.12 fm, where ensembles with unphysical strange quark masses are available (see Table I). We take advantage of the fact that both the taste splittings and the hairpin parameters scale like α 2 s a 2 at NLO in the chiral expansion, so their ratio remains constant as a changes. ForΦ 0 , we use an extremely wide prior of 0 AE 1000 in p4s units. The rest of the fit parameters are normalized to be of order 1, and for them we choose a prior of 0 AE 1.5. We discuss this choice in Sec. VI and argue that it is conservative. Finally, for α s we use the coupling α V at scale q ¼ 2.0=a, obtained from Ref. [76].
Altogether we have 492 lattice data points in the base fit and 60 parameters in the EFT fit function. The fit has a correlated χ 2 data =dof ¼ 466=432, giving p ¼ 0.12. Figure 8 shows a snapshot of the decay constants for physical-mass ensembles, plotted versus the corresponding heavy-strange meson masses M H s at three lattice spacings. The continuum extrapolation is also shown. The valence light mass m x is tuned either to m s (upper points) or to m u (lower points).
Data points with open symbols that are at the right of the dashed vertical line of the corresponding color are omitted from the fit because they have am h > 0. 9. The fact that the fit lines agree well with the omitted points is evidence that we have not overfit the data. In the continuum extrapolation, the masses of sea quarks are set to the correctly-tuned, physical quark masses m l , m s , and m c , while at nonzero lattice spacing the masses of the sea quarks take the simulated values.
The width of the fit lines in Fig. 8 shows the statistical error coming from the fit, which is only part of the total statistical error, since it does not include the statistical errors in the inputs of the quark masses and the lattice scale.
To determine the total statistical error of each output quantity, we divide the full data set into 20 jackknife resamples. The complete calculation, including the determination of the inputs, is performed on each resample, and the error is computed as usual from the variations over the TABLE IX. Lattice spacing a and am s (in lattice units) in the p4s mass-independent scale-setting scheme. The error associated with f π;PDG is a multiplicative error for all values of β; the relative error is about 0.15% for lattice spacing a and about 0.3% for am s . The uncertainty labeled "EM scheme" is an additional uncertainty that can be incorporated when these results are used without attention to the EM scheme dependence. β a (fm) am s 5.8 0.15293ð26Þ stat ð19Þ syst ð23Þ f π;PDG ½07 EM scheme 0.06852ð24Þ stat ð22Þ syst ð20Þ f π;PDG ½05 EM scheme 6.0 0.12224ð16Þ stat ð15Þ syst ð18Þ f π;PDG ½05 EM scheme 0.05296ð15Þ stat ð17Þ syst ð15Þ f π;PDG ½04 EM scheme 6.3 0.08785ð17Þ stat ð11Þ syst ð13Þ f π;PDG ½04 EM scheme 0.03627ð14Þ stat ð12Þ syst ð10Þ f π;PDG ½02 EM scheme 6.72 0.05662ð13Þ stat ð07Þ syst ð08Þ f π;PDG ½03 EM scheme 0.02176ð10Þ stat ð07Þ syst ð06Þ f π;PDG ½01 EM scheme 7.0 0.04259ð05Þ stat ð05Þ syst ð06Þ f π;PDG ½02 EM scheme 0.01564ð04Þ stat ð05Þ syst ð04Þ f π;PDG ½01 EM scheme 7.28 0.03215ð14Þ stat ð28Þ syst ð05Þ f π;PDG ½01 EM scheme 0.01129ð10Þ stat ð19Þ syst ð03Þ f π;PDG ½01 EM scheme resamples. (For convenience, we kept the covariance matrix fixed to that from the full data set, rather than recomputing it for each resample.) The same procedure is performed to find the total statistical error of a and am s at each lattice spacing. The fit function Eq. (5.5), evaluated at a ¼ 0 and physical sea-quark masses, yields a parameterization of the decay-constant data as a function of the heavy-strange meson mass M H s and the valence light-quark mass m x . We ignore isospin violation in the sea, taking the light seaquark masses to be degenerate with the average u=d-quark mass. Because the HMrASχPT expression for the heavylight meson decay amplitude is symmetric under the interchange m u ↔ m d , the leading contributions from isospin-breaking in the sea sector are of Oððm d − m u Þ 2 Þ, and are expected to be smaller than the NNLO terms in the chiral expansion. We can check numerically the effect of sea isospin-breaking using our data by evaluating the fit function with physical up and down sea-quark masses. The resulting shifts in the decay constants are less than about 0.02% for the B system and 0.015% for the D system, which are consistent with power-counting expectations and are negligible compared to other uncertainties. We obtain the physical charged and neutral B-and D-meson decay constants by setting m x to either m u , m d or m s , and M H s to the experimental values M B s ¼ 5366.82ð22Þ MeV and M D s ¼ 1968.27ð10Þ MeV [66], respectively, along with a prescription to subtract electromagnetic effects from the masses, as discussed below. Figure 9 shows the stability of our final results for f D þ , f D s , f B þ and f B s under variations in the data set and the fit models. In our base fit, we use the decay constants obtained from the (3 þ 2)-state fit to two-point correlators. To investigate the error arising from excited-state contamination, we perform a fit to the decay-constant data obtained from the (2 þ 1)-state fit to two-point correlators. There is some evidence for such contamination, contributing a systematic error that is comparable to the statistical errors for the B system. We take the difference between the results from the two types of correlator fits as an estimate of the systematic error due to excited states. For consistency, we do so both for the D system as well as the B system, even though there is little evidence for such contamination for the D system. It is reasonable that the B correlators suffer from larger excited state effects, because, as seen in Fig. 3, the fits to correlators with heavier quarks tend to have smaller p values at fixed T min , as well as larger errors in the ground state mass. Figure 9 also shows a test of the systematic error in the continuum extrapolation from repeating the fit after either adding in the coarsest (a ≈ 0.15 fm) ensembles or omitting the finest (a ≈ 0.03 fm) ensemble. The differences with the base fit are well within the statistical errors, providing support for our earlier assertion that the continuumextrapolation errors are already included in our estimate of the statistical uncertainty of our fit.

VI. SYSTEMATIC ERROR BUDGETS
In our base fit, constrained Bayesian curve fitting [75] is employed to incorporate systematic errors in the continuum extrapolation. If the prior values have been chosen in a reasonable way, and if we have sufficiently many parameters in the fit, central values and error bars of final quantities should not change when more parameters are included in the fit. The error bars are then expected to capture the systematic errors in the continuum extrapolation.
To test the priors chosen for discretization effects, we repeat the analysis with different numbers of discretization parameters. The result of this test is shown in Fig. 9. The base fit has 60 parameters. We show results from alternative fits with 44, 47, 50, and 61 parameters. The fit with 50 parameters is constructed from our base EFT fit function by removing 10 terms that describe higher-order discretization effects in powers of ðaΛÞ 2 : specifically, the ðaΛÞ 6 correction toΦ 0 ; the ðaΛÞ 4 corrections to k 1 , L s , L x and g π ; and the ðaΛÞ 2 corrections to k 2 and the NNLO analytic terms in Eq. (5.9). In the fit with 47 parameters, three additional terms describing higher-order heavy-quark discretization effects are removed: we set to zero c 0 3 , c 0 5 and c 00 2 in Eqs. (5.15) and (5.16). The fit with 44 parameters is then obtained by removing, from the 47-parameter fit, the α s ðaΛÞ 2 corrections to L s , L x and g π . Finally, we consider a fit function with 61 parameters, which is constructed from our base EFT fit function by adding a term α s x 8 h to Eq. (5.14), which is the most important term at the next order in our expansion variables.
The 44-parameter fit shows a significant deviation from the base fit for f D þ , but already with 47 parameters the deviations of all quantities are small: the errors are essentially unchanged from those of the base fit, and the central values change by no more than half the error bars. Differences between the base fit and the 61-parameter fit are not visible at all. In the context of constrained Bayesian curve fitting [75], these findings suggests that the posterior uncertainty captures most or all of the systematic error of the continuum extrapolation.
The priors may be further tested by monitoring the posteriors in various fits. Figure 10 (left) shows the distribution of posterior central values for essentially unconstrained fit parameters (priors 0 AE 100) in the 44parameter fit. 4 The distribution is compared to Gaussian distributions with widths of 1 and 1.5. Note that the width-1 Gaussian is already fairly consistent with the distribution, but there may be some indication of excess in the tails. On the other hand, the width-1.5 Gaussian clearly encompasses the posterior distribution. Thus the natural size of these parameters is indeed of order unity, and a prior of 0 AE 1.5 seems to be a conservative assumption for any additional parameters in other fits that are not well constrained by data. Figure 10  The quantities δ 0 V , δ 0 A , g π , 1=f andΦ 0 , which are set by external considerations rather than power counting, have the same priors as in the base fit.
In the Bayesian approach, prior information about fit parameters is explicitly put into the fit. A non-Bayesian alternative is to limit the number of fit parameters to those constrained by the data with no external information about what sizes of the parameters are expected. External information nevertheless enters implicitly by assuming that the parameters omitted from the fit are all exactly zero. We apply this alternative approach to test whether there are additional systematic errors in the continuum extrapolation due to the choice of fit function that are not captured by the Bayesian analysis. Figure 9 shows two fits with fewer parameters than the base fit, which may then be determined by the data, with essentially no Bayesian constraint. 4 The fits are labeled "44 param=Wide" and "47 param=Wide." They have the same parameter sets as the 44-parameter and 47-parameter fits discussed above, but now with very wide priors, 0 AE 100. (The 44 param=Wide fit yields Fig. 10 (left).) We also include a fit, "60 param=47-Wide" with the same parameters as the base fit, but with the 47 parameters that can be determined by the data alone now essentially unconstrained by priors and priors of 0 AE 1 for the remaining 13 parameters. These three new fits have p values larger than 0.05, so we consider them to be acceptable alternatives. Comparing these fits with the base fit, we find that the central values vary a bit more than we would expect from the Bayesian analysis. In particular, f D þ in the 44-param=Wide fit and f B s in the 60 param=47-Wide fit differ from the base fit by slightly more than the error bar of the base fit (indicated by the gray band). We take a conservative approach and take the largest of these differences for each quantity as an additional systematic error due to the choice of fit model.
A final fit in Fig. 9, labeled "2 × priors," starts with the base fit and doubles, to 0 AE 3, the prior widths of the 55 parameters constrained by power counting arguments. The results of this fit are very similar to those from the 60 param=47-Wide fit. In the Bayesian context, it is to be expected that weakening the prior information in the base fit results in an increase in the resulting errors. However, the shifts in the central values for the B system are large enough that the inclusion of the fit model error discussed in the previous paragraph seems prudent.
Tables X and XI give representative error budgets for the decay constants and their ratios in the D and B systems, respectively. The error listed as "statistics and EFT fit" is determined by a jackknife procedure (described at the end of Sec. V D) in which we repeat, on data resamples, the EFT fit and its extrapolation to the continuum and interpolation to physical quark masses. It includes statistical errors in the inputs as well as those from the fit itself. As explained above, it also includes much of the systematic error associated with the continuum extrapolation. The small errors from the chiral interpolation are likewise captured by our Bayesian procedure, which includes all analytic chiral terms at NNLO and NNNLO.
The error labeled "two-point correlator fits" in Tables X and XI is an estimate of the contamination due to excited states. It is determined by comparison of the results from the base, (3 þ 2)-state, fits and those from (2 þ 1)-state fits.
The error we associate with the choice of fitting function, is labeled "Fit model" in each table. As explained above, it comes from comparing the results of different non-Bayesian (essentially unconstrained) fits to those from the base fit. While the differences are not so large that they necessarily invalidate the Bayesian error analysis, they are large enough that we are inclined to be conservative and include them as a separate source of error. Since the fit model controls the continuum extrapolation, this error may be interpreted as an estimate of those continuum extrapolation errors not completely captured by our Bayesian analysis.
The fourth line in each table, labeled "scale-setting quantities and tuned quark masses," gives the systematic error associated with the continuum extrapolations of f p4s , R p4s , and the tuned quark masses. As described in Sec. IV, the central values of these input quantities to the TABLE X. Representative error budgets for decay constants of the D system, estimated as described in the text. Error budgets for f D 0 and the isospin-limit value f D are similar to that for f D þ with one exception. The uncertainty from the topological-charge correction is larger for lighter valence-quark masses: 0.09% (0.07%) for f D 0 (f D ). heavy-light analysis come from a quadratic fit in α s a 2 to the ensembles with a ≤ 0.12 fm. We repeat the heavy-light analysis with the inputs instead determined by three alternatives: a quadratic fit including all the data, a linear fit including data up to 0.12 fm, and a linear fit including data up to 0.09 fm. The errors shown in Tables X and XI are obtained by taking the largest difference between the base values and the results from each of the three alternatives. The error labeled "finite-volume corrections" gives our estimate of residual finite volume errors, those finite volume effects not included in our chiral fitting forms. The errors associated with light-quark and scale-setting inputs are estimated in the same way as those associated with continuum extrapolation errors of those quantities, using the input finite-volume errors from Table VIII. To determine the corresponding finite-volume errors arising directly in the heavy-light analysis, we omit the finitevolume corrections at NLO in χPT from the EFT fits, and then repeat the fits. We take 0.3 of the differences between the results of the two fits as estimates of the residual finitevolume errors coming from omitted higher-order terms in χPT. We consider the factor 0.3 to be conservative because higher order corrections in SU (3) χPT are typically less than that; for example, f K =f π − 1 ≈ 0.2. We then add the finite volume errors from the heavy-quark analysis in quadrature with those from the inputs to get the values shown in Tables X and XI. This is reasonable because we do not know the correlations between the effects of finite volume errors on the light-light and heavy-light quantities. For example, the ratios between heavy-light and light-light decay constants, which enter through our scale-setting procedure, are likely to be less-dependent on volume than either decay constant alone. In any case, if we instead assumed 100% correlation between the light-light and heavy-light finite volume errors, it would make little difference in the total systematic error.
We note that the finite-volume errors in Table VIII are considerably smaller than in earlier drafts of this paper. The previous version was inconsistent, in that it took the input estimate of light-quark finite volume errors from a comparison of fits including the data at a ≈ 0.15 fm, while our central fit drops that lattice spacing. As discussed in Sec. IV, keeping the a ≈ 0.15 fm data gives an overestimate of finite-volume effects due to staggered taste splittings that predominantly affect that lattice spacing.
Despite the fact that the decay constants are by definition pure QCD matrix elements of the axial current, there are electromagnetic uncertainties in the values that the meson masses (used primarily to fix the physical quark masses) would have in a pure QCD world. 5 The estimated systematic error labeled "electromagnetic corrections" in Tables X and XI accounts for the two sources of this uncertainty. First, there are electromagnetic errors in the tuned values we use for the light-quark masses that arise from errors in the determinations of the electromagnetic contributions to pion and kaon masses. These correspond to the "K þ -K 0 splitting" and "K 0 mass," and errors described in Sec. IV. We vary the values of the tuned light-quark masses by these two EM uncertainties in Table VII to obtain the corresponding uncertainties on the decay constants in Table XII. In this work, we choose a specific scheme [79,82] for the electromagnetic contribution to the neutral kaon masses; other works, e.g., the FLAG report [80], choose other schemes. Changing the scheme so that ðM 2 K 0 Þ γ goes from þ44 MeV 2 to þ461 MeV 2 changes the listed quantities by the percentages in row "K-mass scheme." There are also electromagnetic effects in the heavy-light meson masses, which affect our calculation both directly, in the meson-mass value we use to convert from a Φ value to a decay constant f ¼ Φ= ffiffiffiffi ffi M p , and indirectly, through the tuned values of the heavy-quark masses. To estimate the resulting electromagnetic errors on the decay constants, we first need to relate the experimental values of the heavylight meson masses to QCD-only values. For this, we use the phenomenological formula [16,96,97] where e x and e h are charges of the valence light and heavy quarks, respectively, and we have added a term proportional to ðm x − m l Þ to account for the mass difference between u and d quarks. Physical contributions proportional to e 2 h , which come from effects such as the EM correction to the heavy quark's chromomagnetic moment, are suppressed by 1=m h , and are therefore dropped from this simple model. There are also prescription (scheme) dependent EM contributions to the heavy quark mass renormalization, which are proportional to e 2 h m h ; our choice of scheme is to drop them entirely. To estimate the parameters A and B, we use the experimental D 0 -, D þ -, B þ -and B 0 -meson masses in Eq. (6.1), which gives TABLE XII. Error contributions to, and estimates of scheme dependence of, the decay constants from electromagnetic effects. The sources of uncertainty are described in the text. ð6:3Þ Taking Cðm d − m u Þ ¼ 2.6 MeV as described in Sec. IV, we then obtain A ¼ 4.44 MeV and B ¼ 2.4 MeV. Using Eq. (6.1), we estimate that the electromagnetic contribution to the D s -meson mass to be about 1.3 MeV, which is substantially smaller than the result, 5.5(6) MeV, found for this shift in Ref. [98]. We emphasize that we do not add any terms in Eq. (6.1) proportional to e 2 h m h . Such terms, which can explain the difference between results of Ref. [98] and Eq. (6.1), can be absorbed into the heavyquark mass and do not contribute to electromagnetic mass splittings for the heavy-light mesons. Consequently, these terms only affect the tuned heavy-quark masses, which inevitably depend on the scheme used for matching a pure QCD calculation onto real-world measurements, which include electromagnetism.
We take the difference between results obtained with and without the electromagnetic shift from Eq. (6.1) as an estimate of the uncertainty in applying our phenomenological model. This error includes effects of neglecting mass-dependent corrections to the parameters A and B. We tabulate this error in the row labeled "H x mass." We also estimate the effect of the scheme dependence of the heavy quark mass, which we call "H s -mass scheme," by taking the difference between the electromagnetic contributions to the D s meson mass obtained from Eq. (6.1) and the scheme of Ref. [98], which includes the heavy-quark self-energy. We do not have corresponding information for the B s meson, so we take the D s shift and simply assume that it is dominated by a mass renormalization term proportional to e 2 h m h . Because m c e 2 c ≈ m b e 2 b , this leads to the same shift, 4.2 MeV, for both D s and B s .
The individual electromagnetic EM uncertainties on the decay constants discussed above are tabulated in Table XII. Because we have no information about correlations between the various EM errors, we add the K þ -K 0 splitting, K 0 mass, and H x mass error in quadrature to obtain the total "electromagnetic corrections" entries given in Tables X and XI. Even if there were strong correlations between the EM errors, this would make little difference to the total systematic errors of the heavy-light decay constants, because these errors are subdominant, as can be seen in Tables X and XI.
The error labeled "topological-charge distribution" accounts for the nonequilibration of topological charge in our finest ensembles. Before our EFT fit, we adjust the lattice data to compensate for effects of nonequilibration of topological charge as discussed in Sec. II C. We conservatively estimate the uncertainty in our treatment of effects of nonequilibration of topological charge by taking the full difference between the final results of the analyses with and without adjustments.
The last "f π;PDG " error included in Tables X and XI is the uncertainty due to the error in the PDG average for the charged-pion decay constant, f π AE ¼ 130.50ð13Þ MeV [67], which is the physical scale that is used to determine f p4s .
All errors in Tables X and XI should be added in quadrature to obtain the total uncertainties. In the following section, when we quote our final results for the physical decay constants, we separate the errors into "statistical" errors, which are the ones listed as "statistics and EFT fit," "systematic" errors, which are those due to the systematics of our calculation (rows 2-6 in the tables, added in quadrature), and, finally, the errors due to the PDG value of f π , which is external to our calculation.
As a byproduct of our EFT analysis, we can also obtain the decay amplitudes Φ for the D and B systems in both the SU(2) and the SU(3) limits, which are reported in Table XIII.

VII. RESULTS AND PHENOMENOLOGICAL IMPACT
We now present our final results for the heavy-light meson decay constants with total errors and then discuss some of their phenomenological implications.

A. Band D-meson decay constants
Our final results for the physical leptonic decay constants of the D and B systems including all sources of systematic uncertainty discussed in the previous section are  (2) and the SU(3) chiral limits. Here m x ¼ m 0 l ¼ 0 and the strange sea mass is either m 0 s ¼ m s (in the SU(2) case) or m 0 s ¼ 0 (in the SU(3) case). The uncertainty labeled "EM scheme" is an additional uncertainty that can be incorporated when these results are used without attention to the EM scheme dependence. These results are obtained in a specific scheme for matching QCD þ QED to pure QCD via the light and heavy meson masses tabulated in Table VI. When using our results in a setting that does not take into account the subtleties of the EM scheme, one may wish to also include the last quantities, in brackets, which are obtained by adding in quadrature the fourth and fifth rows in Table XII, as rough estimates of scheme dependence. Most recent lattice-QCD calculations of heavy-light meson decay constants work, however, in the isospinsymmetric limit. To enable comparison with these results, we also present results for the B-and D-meson decay constants evaluated with the light valence-quark mass fixed to the average u=d-quark mass: where we note that the D ðsÞ averages are dominated by our earlier result in Ref. [23].

D system Φ
For the D-meson decay constants, the uncertainties in Eqs. (7.2)- (7.3) are about 2.5 times smaller than from our previous analysis. The improvement stems primarily from the inclusion of finer ensembles with a ≈ 0.042 fm and 0.03 fm, which reduce the distance of the continuum extrapolation.
For f B s , HPQCD's most precise determination was obtained with the HISQ action for b quarks [17]. The substantial improvement in our result comes from a combination of higher statistics and the ensemble with a ≈ 0.03 fm, which eliminates the need to extrapolate to the bottom-quark mass from lighter quark masses, and also shortens the continuum extrapolation. For f B þ and f B 0 , HPQCD has employed only NRQCD b quarks [21]. Thus, our results for these quantities are the first obtained with the HISQ action for the b quarks. With HISQ, the dominant errors in HPQCD's calculation-from operator matching and relativistic corrections to the current-simply do not arise.

ð7:17Þ
The light quarks in the D þ and D s mesons have identical charges, so the deviation of f D s =f D þ from unity quantifies the degree of SUð3Þ-flavor breaking in the D system. Similarly, the ratio f B s =f B 0 characterizes the size of SUð3Þ-breaking in the B-meson system. Both yield values of about 20%, which is consistent with power-counting expectations of ðm s − m d Þ=Λ QCD .

ð7:21Þ
These results can be employed to correct other lattice-QCD results obtained in the isospin limit, which will be essential once other calculations reach subpercent precision. For f D þ , the isospin-breaking correction is larger than our total uncertainty in Eq. (7.2), while for f B þ it is comparable to the total error in Eq. (7.4). We find a smaller isospin correction to the B-meson decay constant than obtained by HPQCD in Ref. [21], ðf B − f B þ Þ HPQCD ¼ 1.9ð5Þ MeV, 6 by more than 2σ. HPQCD's estimate was obtained, however, by setting both the valence-and sea-quark masses in f B þ to m u because the analysis only included unitary data. Hence their value includes effects both from valence isospin breaking and from reducing the average light sea-quark mass; when we follow this prescription, we obtain a similarly-large shift of about 1.6(2) MeV. On the other hand, our results for the isospin corrections to both f D and f B agree with calculations using Borelized sum rules [99,100].
Tables XV and XVI in Appendix C provide the correlation and covariance matrices, respectively, between the Band D-meson decay constants in Eqs. (7.1)-(7.8). They can be used to compute any combination of our results with the correct uncertainties.
B. Quark-mass ratios, f K =f π , and scale-setting quantities In Sec. IV, we analyze the ensembles with physical lightquark masses to obtain several input parameters for the EFT fit of heavy-light meson decay constants. We obtain for the mass and decay constant of a fictitious pseudoscalar-meson with degenerate valence-quark masses 0.4m s : where the last quantity, in brackets, is an additional uncertainty when these results are used without attention to EM scheme dependence. These quantities are used to set the scale in our analysis. 6 The correlated uncertainties were provided by HPQCD (private communication).
We obtain for the ratios of quark masses: where m l is the average u=d-quark mass. The errors on the quark-mass ratios in Eqs. (7.25)-(7.27) are smaller than from our previous analysis in Ref. [23] because the finer lattice spacings employed here reduce the continuumextrapolation error. Figures 13 and 14 compare our results for m u =m d and m s =m l , respectively, with previous unquenched lattice-QCD calculations. The difference in our value for m s =m l relative to Ref. [23] mostly comes from three changes, which all push the value in the same direction. In order of size, these are the addition of the 0.042 fm physical-quark-mass ensemble, removing the 0.15 fm ensembles from our central fits, and adding more data on the 0.06 fm physical-quark-mass ensembles. An even more precise value for m c =m s is reported in a companion paper on the determination of quark masses from heavy-light meson masses [36]. Finally, we obtain the ratio of charged pion to kaon decay constants. We also give the ratio in the isospin symmetric limit, and the difference between the two: −12 syst ð2Þ f π;PDG ;ΔM K ½3 EM scheme ; ð7:30Þ which are again more precise than our previous determination in Ref. [23] because of the shorter continuum extrapolation. Our results agree with previous three-and FIG. 15. Comparison of f K þ =f π þ in Eq. (7.28) (magenta burst) with previous three-and four-flavor lattice-QCD calculations [23,25,[108][109][110][111][112]. four-flavor lattice-QCD calculations (see Fig. 15), and with the 2016 FLAG averages [80].

C. CKM matrix elements
We now combine our decay-constant results with experimental measurements of the D þ ðsÞ -meson leptonic decay rates to obtain values for the CKM matrix elements jV cd j and jV cs j within the standard model.
The products of decay constants times CKM factors from the Particle Data Group [67], are obtained by averaging the experimentally-measured decay rates into electron and muon final states. The value for f D þ jV cd j in Eq. (7.31) includes the correction from structure-dependent bremsstrahlung effects that lowers the D þ → μ þ ν μ rate by ∼1% [113,114]. Other electroweak corrections, however, are not accounted for in the PDG averages shown above. The electroweak contributions to leptonic pion and kaon decays are estimated to be about one or two percent [115,116], and the uncertainties in these corrections lead to ∼0.1% uncertainties in jV us j=jV ud j and jV us j. Now that the errors on f D and f D s are well below half a percent, electroweak corrections must also be included when extracting jV cd j and jV cs j from leptonic D-meson decays.
We take the estimate of the electroweak corrections to the leptonic D þ ðsÞ -meson decay rates from our earlier work [23], which includes all contributions that are included for pion and kaon decays. We first adjust the experimental decay rates quoted in the PDG by the known long-and short-distance electroweak corrections [117,118]. The former lowers the D þ -and D s -meson leptonic decay rates by about 2.5%, while the latter increases them by about 1.8%, such that the net effect is a slight decrease in the rates by less than a percent. We then include a 0.6% uncertainty to account for unknown electromagnetic corrections that depend upon the mesons' structure. This estimate is based on calculations of the structure-dependent electromagnetic corrections to pion and kaon decays [115,119,120], but allowing for much larger coefficients than for the light pseudoscalar mesons.
With these assumptions, and taking our D þ -and D smeson decay-constant results from Eqs. (7.2) and (7.3), we obtain for the CKM matrix elements where "EM" denotes the error due to unknown structuredependent electromagnetic corrections. In both cases, the lattice-QCD uncertainties from the decay constants are an order of magnitude smaller than those from experiment. Further, the electromagnetic errors are only a rough estimate, and need to be put on a more robust and quantitative footing by a direct calculation of the hadronic structure-dependent effects. The CKM matrix elements jV cd j and jV cs j can also be obtained from semileptonic D þ → π 0 l þ ν and D þ → K 0 l þ ν decays. Recently the ETM Collaboration published the first four-flavor lattice-QCD determination of the vector and scalar form factors for these processes [121]. Combining their form factors over the full range of momentum transfer with experimental measurements of the decay rates yields for the CKM elements [122] jV cd j D→π ¼ 0.2341ð74Þ; ð7:35Þ where the errors are primarily from the theoretical uncertainties on the form factors. Although our result for jV cs j in Eq. (7.34) agrees with this determination, our result for jV cd j in Eq. (7.33) is about 2.1σ lower than the above value from semileptonic decays. We note, however, that combining f Dπ þ ð0ÞjV cd j ¼ 0.1425ð19Þ from the Heavy Flavor Averaging Group [1] with f Dπ þ ð0Þ ¼ 0.666ð29Þ from the most precise three-flavor lattice-QCD calculations by HPQCD [123] leads to a lower value of jV cd j D→π ¼ 0.2140ð97Þ that agrees with our result.
Our results for jV cd j and jV cs j make possible a test of the unitarity of the second row of the CKM matrix. Taking jV cb j inclþexcl ¼ 41.40ð77Þ × 10 −3 from a weighted average of determinations from inclusive and exclusive semileptonic B decays [124][125][126][127][128][129], we obtain for the sum of squares of the CKM elements jV cd j 2 þjV cs j 2 þjV cb j 2 −1.0 ¼ 0.049ð2Þ jV cd j ð32Þ jV cs j ð0Þ jV cb j ; ð7:37Þ which is compatible with three-generation CKM unitarity within 1.5σ. The precision on the above test is only at the few-percent level, and is limited by the experimental error on the leptonic decay widths for D s → μν μ and D s → τν τ .
We can also update the determination of the ratio of CKM elements jV us =V ud j from leptonic pion and kaon decays. Combining our result for f K þ =f π þ in Eq. (7.28) with the experimental rates and estimated radiative-correction factor from the Particle Data Group [67], we obtain jV us =V ud j SM ¼ 0.2310ð4Þ f K =f π ð2Þ expt ð2Þ EM ; ð7:38Þ where we have averaged the upper and lower errors from our decay-constant ratio.
D. Branching ratios for B q → μ + μ − The rare leptonic decays B 0 → μ þ μ − and B s → μ þ μ − proceed via flavor-changing-neutral-current interactions and are therefore promising new-physics search channels. In the B s -meson system, the difference between decay widths of the light and heavy mass eigenstates is large, ΔΓ s =Γ s ∼ 0.1 [1], and leads to a difference between the CP-averaged and time-averaged branching ratios. Because only the heavy B s eigenstate can decay to μ þ μ − pairs in the standard model, to a very good approximation [130], the two quantities are related simply asBðB s → μ þ μ − Þ SM ¼ τ H s ΓðB s → μ þ μ − Þ SM , where τ H s is the lifetime of the heavy mass eigenstate, and the bar denotes time averaging. The relative width difference ΔΓ d =Γ d ∼ 0.001 is 100 times smaller in the B 0 -meson system, soBðB The LHCb and CMS experiments reported the first observation of B s → μ þ μ − decay in 2014 [11]. This observation was subsequently confirmed by the ATLAS experiment [12], and LHCb has since improved upon their initial measurement using a larger data set [13]. The most recent results for the B s → μ þ μ − time-integrated branching fraction are marginally compatible: ð7:39Þ with the LHCb measurement being about 1.8σ larger. The LHCb and CMS experiments also reported 3σ evidence for the decay B 0 → μ þ μ − , which is suppressed in the standard model relative to B s → μ þ μ − by the CKM factor jV td =V ts j 2 ∼ 0.04. The significance, however, has subsequently weakened, and ATLAS and LHCb most recently only presented upper limits of [12,13] BðB 0 → μ þ μ − Þ ATLAS < 3.4 × 10 −10 ; ð7:41Þ at 95% confidence level.
Here we update the theoretical predictions for the standard model branching ratios using our results for the neutral B 0and B s -meson decay constants. We employ the formulas in Eqs. (6) and (7) of Ref. [130], which provide the branching ratios in terms of the decay constants, relevant CKM elements, and a few other parametric inputs. Using the CKM elements and other inputs listed in Table XIV, and f B 0 , f B s , and their ratio from Eqs. (7.5)-(7.6) and (7.16), we obtain BðB s → μ þ μ − Þ SM ¼ 3.64ð4Þ f B s ð8Þ CKM ð7Þ other × 10 −9 ; ð7:43ÞB where the errors are from the decay constants, CKM matrix elements, and the quadrature sum of all other contributions, respectively. BecauseBðB q → μ þ μ − Þ is proportional to the square of the decay constant, our three-fold improvement in the uncertainty on the B-meson decay constants reduces the error contributions from the decay constants by almost a factor of two, such that they are now well below the other sources of uncertainty.

VIII. SUMMARY AND OUTLOOK
In this paper, we have presented the most precise lattice-QCD calculations to-date of the leptonic decay constants of heavy-light pseudoscalar mesons with charm and bottom quarks. We use highly improved staggered quarks with finer lattice spacings than ever before, which enables us for the first time to work with the HISQ action directly at the physical b-quark mass. As shown in Figs. 11 and 12, our results agree with previous three-and four-flavor lattice-QCD determinations using different actions for the light, charm, and bottom quarks. The errors on our D-meson decay constants in Eqs. (7.1)- (7.3) are about 2.5 times smaller than those from our earlier analysis [23]. The error reduction is primarily due to the use of finer lattice spacings, which reduces the continuum-extrapolation uncertainty. Our Bmeson decay constants in Eqs. (7.4)-(7.6) are about three times more precise than the previous best lattice-QCD calculations by HPQCD [17,21]. Here the improvement again stems from the use of finer lattice spacings, which enable us to employ the HISQ action directly at the physical TABLE XIV. Numerical inputs used to calculate B q → μ þ μ − branching ratios. The strong coupling (in the MS scheme) is a weighted average of three-and four-flavor lattice-QCD results [76,[131][132][133][134][135]. The B-meson lifetimes are from the Heavy Flavor Averaging Group's Summer 2017 averages [1,136]. The CKM matrix elements are from the CKMfitter group's global unitaritytriangle analysis including results through ICHEP 2016 [137], where we have symmetrized the errors on jV Ã ts V tb j and jV Ã td V tb j, and used the Wolfenstein parameters fλ ¼ 0.22510ð28Þ;ρ ¼ 0.1600ð74Þ;η ¼ 0.3500ð62Þg rather than the simple ratio to obtain jV td =V ts j with a reduced uncertainty. m t;pole ¼ 173.1ð6Þ GeV [66] α s ðm Z Þ ¼ 0.1186ð4Þ τ B d ¼ 1.518ð4Þ ps τ H s ¼ 1.619ð9Þ ps jV Ã ts V tb j ¼ 40.9ð4Þ × 10 −3 jV Ã td V tb j ¼ 8.56ð9Þ × 10 −3 jV td =V ts j ¼ 0.2048ð16Þ m b with controlled heavy-quark discretization errors, thereby eliminating the need to extrapolate to the bottomquark mass from lighter heavy valence-quark masses or to use an effective action such as NRQCD with its uncertainties from omitted higher-order corrections in α s or 1=m Q .
Our results for the charged D þ -and D s -meson decay constants can be combined with the experimental leptonic decay rates for D þ ðsÞ → l þ ν l [67] to yield the CKM matrix elements We note, however, that the uncertainties due to unknown hadronic structure-dependent electromagnetic corrections are only rough estimates based on the analogous contributions for pion and kaon decay constants (see Sec. VII C), and need to be calculated directly for the D system. The determinations of jV cd j and jV cs j from leptonic D decays in Eq. (8.1) enable us to test the unitarity of the second row of the CKM matrix at the few-percent level, and are compatible with three-generation CKM unitarity within 1.5σ. The significance of this test of the standard model is presently limited by the experimental errors on the corresponding leptonic decay widths [67]. Recently the BES-III Experiment published its first measurements of BðD þ s → μ þ ν μ Þ and BðD þ s → τ þ ν τ Þ [138], and presented a preliminary measurement of BðD þ → τ þ ν τ Þ [139]; these results are statistics-limited, and will improve with additional running. The forthcoming Belle II Experiment will also measure the leptonic D þ ðsÞ -meson decay rates, and anticipates obtaining sufficient precision to determine the CKM element jV cd j with an error below about 2% [140].
The neutral B s -and B 0 -meson decay constants are parametric inputs to the standard model rates for the rare decays B s → μ þ μ − and B 0 → μ þ μ − , respectively. Using our results for f B s and f B 0 , we obtain the predictions ð8:2Þ where the largest contributions to the errors are from the CKM elements jV ts j and jV td j, respectively. The theoretical uncertainty onBðB s → μ þ μ − Þ in Eq. (8.2) is more than ten times smaller than recent experimental measurements [11][12][13], while the prediction forBðB 0 → μ þ μ − Þ in Eq. (8.3) is half an order of magnitude below present experimental limits [12,13]. The high-luminosity LHC combined with upgraded ATLAS, CMS, and LHCb detectors should make possible significant improvements on these measurements in the next decade. In particular, given standard model expectations, the LHCb experiment anticipates determininḡ BðB s → μ þ μ − Þ to about 5% and the ratioBðB 0 → μ þ μ − Þ= BðB s → μ þ μ − Þ to the order of 40% by the end of the HL-LHC era [15]. Our results for f B s and f B 0 can also be used to improve the standard model predictions for the B ðsÞmeson branching ratios to electron-positron or τ-lepton pairs, which are of Oð10 −6 Þ and Oð10 −13 Þ, respectively [130]. The LHCb experiment recently placed the first direct limit onBðB s → τ þ τ − Þ < 6.8 × 10 −3 [141], and will continue to improve this measurement with additional running. Further, the decay ratesBðB s → e þ e − Þ andBðB 0 → e þ e − Þ can be substantially enhanced in new-physics scenarios in which the Wilson coefficients of the relevant four-fermion operators are independent of the flavor of the decaying B q meson and the final-state leptons [142]. In this case, the latter process could be observable by the LHCb and Belle II Experiments, providing unambiguous evidence for new physics.
Our result for f B þ can be combined with the experimental average for BðB þ → τ þ ν τ Þ [7-10,67] to yield the CKM matrix element with an about 10% uncertainty stemming predominantly from the error on the measured decay width. Within this large uncertainty, Eq. (8.4) agrees with the determinations of jV ub j from both inclusive [143][144][145][146][147] and exclusive [148][149][150][151] semileptonic B-meson decays. The Belle II Experiment expects, however, to collect enough data by 2024 to measure BðB þ → τ þ ν τ Þ with a precision of 3-5% [14], which will make possible a competitive determination of jV ub j from leptonic decays. The decay B þ → τ þ ν τ also probes extensions of the standard model with particles that couple preferentially to heavy fermions. Using f B þ from this work and taking 10 3 jV ub j ¼ 3.72ð16Þ from our recent lattice-QCD calculation of the B → πlν form factor [152], we obtain for the standard model branching ratio in agreement with the experimental average 10 4 BðB þ → τ þ ν τ Þ ¼ 1.06ð20Þ [7][8][9][10]67]. Given the current and projected experimental uncertainties on the D ðsÞ -and B ðsÞ -meson leptonic decay rates, better lattice-QCD calculations of the decay constants are not needed in the near future. Nevertheless, there are still opportunities for improvement. So far, D-and B-decay constant calculations include neither isospin nor electromagnetic effects from first principles. Isospin effects can be addressed straightforwardly with 1 þ 1 þ 1 þ 1 ensembles being generated for problems such as the anomalous magnetic moment of the muon [153]. The inclusion of electromagnetism in lattice-QCD simulations is more challenging, but calculations of the light-hadron spectrum and light-quark masses within quenched QED are available [104,105,154], and ensembles with dynamical photons [155] to be generated for other quantities can again be employed to calculate heavy-light meson decay constants. In addition, higher-order electroweak effects are presently ignored when relating experimental measurements of charged leptonic decays to standard model calculations. Effective-field-theory techniques can be used to separate effects at the electroweak and QCD scales from long-range radiation from charged particles. Further lattice-QCD calculations are needed to fit in with this scale separation. For leptonic pion and kaon decays, these effects are relevant and being studied [156,157]. Even if not immediately crucial for leptonic D and B decays, they are relevant for semileptonic D and B (as well as K and π) decays; see, e.g., the comparison of QED and QCD uncertainties in Ref. [124].
The next step in our B-physics program is to extend the use of HISQ b quarks on the same gauge-field configurations employed in this work to target other hadronic matrix elements needed for phenomenology. The analysis of ensembles with physical-mass pions and very fine lattice spacings will address two of the most important sources of systematic uncertainty in our recent calculations of the B → πðKÞlν and B → πðKÞl þ l − semileptonic form factors [152,158,159] and of the neutral B-mixing matrix elements [160] by eliminating the chiral-extrapolation uncertainty and reducing continuum-extrapolation and heavy-quark discretization errors. When combined with anticipated future measurements, this will enable us to determine more precisely the CKM matrix elements jV ub j and jV tdðsÞ j, which are parametric inputs to standard model and newphysics predictions. These advances will also make possible more sensitive searches for b → dðsÞ flavor-changing neutral currents, charged Higgs particles, and other extensions of the standard model that would give rise to new sources of flavor and CP violation in the B-meson sector. where (suppressing the gauge field) aΔ μ ψðxÞ ¼ 1 2 ½ψðx þμaÞ − ψðx −μaÞ, m 0 is the bare mass, and N ¼ 1 þ ϵ is the coefficient of the Naik improvement term [91]. The correction ϵ is needed to improve the dispersion relation when m 0 a≪1 [29]. The notation ϵ is used in Ref. [29]; in Appendix B, however, 1 þ ϵ appears, so we use N for brevity. We are interested in heavy quarks with mass much larger than their typical momentum. Then, the energy can be expanded as where m 1 and m 2 are called the rest and kinetic masses, respectively. At nonzero lattice spacing, these two masses are no longer identical. The parameter ϵ in the HISQ action is supposed to be tuned such that the kinetic mass of a quark equals its rest mass, i.e., This condition yields With this exact expression for ϵ, we have am 2 ¼ am 1 to all orders in am 0 , at the tree level. The Taylor expansion of ϵ, in Eq. (A4), about the origin reads The radius of convergence of this series is π=2, which is set by the singularities in the complex plane from the inverse power of sinhð2am 1 Þ in the exact expression. Equation (A6) can be rewritten as where x h ¼ 2am 1 =π. (The coefficients have been rounded to two significant figures.) This expansion converges inside the unit disc in the complex x h -plane, centered at the origin. One sees that many of the first several coefficients of this power series are of order 1, and in this sense, x h can be considered to be a natural expansion parameter. The bare mass m 0 in the quark action is related to its treelevel pole mass by with X as in Eq. (A5). As with ϵ, the Taylor expansion of m 0 breaks down at am 1 ¼ π=2, and m 0 has a natural series expansion in powers of x h .
APPENDIX B: NORMALIZATION OF STAGGERED BILINEARS WHEN am 0 ≪1 From Ref. [89] for massive Wilson fermions, it follows that when am 0 ≪1 a bilinear can lose the conventional normalization. In this Appendix, we derive the factor needed to restore this normalization for the pseudoscalar density of (improved) staggered fermions. To this end, we also need to think of HQET as a theory of cutoff effects, applied directly to lattice gauge theory [90].
The starting point is the time evolution of the fermion propagator at zero momentum. Using the residue theorem (δ is real, small, and positive), where the upper (lower) sign in front of γ 4 is for x 4 > 0 (x 4 < 0), and and similarly for single-antiquark states. With naive or staggered fermions, the pseudoscalar density appearing in the Ward identity of the exact remnant of chiral symmetry is the local one: P hx ðxÞ ¼ψ h ðxÞiγ 5 ψ x ðxÞ ð B9Þ using the notation of the naive formulation. Let us consider two matrix elements of the pseudoscalar density, namely when the x quark is nonrelativistic or ultrarelativistic. To the order needed, one finds h0jP hx ð0Þjq x ðξ x ; 0Þq h ðξ h ; 0Þi h0jP hx ð0Þjq x ðξ x ; p x Þq h ðξ h ; p h Þi for the nonrelativistic and ultrarelativistic cases, respectively, where w † ξ h and w ξ x are two-component spinors, and p x ¼ p x =jp x j. Similar results hold for other local bilinear currents.
These tree-level calculations reveal two important features about the heavy-quark discretization effects. First, depending on whether the x quark is a nonrelativistic or ultrarelativistic, matrix elements should be multiplied by a factor to remove tree-level mass-dependent discretization effects at the leading order in jp h j=m 0h . 7 Second, the next order in the HQET expansion requires an additional correction (as is the case with Wilson fermions [89,90]) to ensure the correct normalization of this term. It is, however, proportional to The numerator's leading discretization errors are of order x 4 h and α s x 2 h , owing to the tree-level Naik improvement term, and the dimensions are balanced, in a heavy-light meson, by Λ HQET or m x . As in Appendix A, x h ¼ 2am 1h =π is the natural expansion parameter for organizing heavy-quark discretization errors.
To arrive at the decay constant, the pseudoscalar density must be multiplied by the sum of the quark masses. From the axial Ward identity, the combination m 0x þ m 0h is natural. This quantity would, however, introduce heavy-quark discretization effects that can be avoided by using m 1x þ m 1h instead. With this choice and Eq. (B13) for normalizing Φ H x , all heavy-quark discretization errors are suppressed by either α s or Λ HQET =M H x or both.

APPENDIX C: COVARIANCE MATRIX FOR DECAY CONSTANTS
Tables XV and XVI provide the correlation and covariance matrices for our decay-constant results, respectively, to enable future phenomenological studies. 7 For a light quark (m x ≲ 2Λ QCD ),Ch x deviates from 1 by effects as small or smaller than other discretization effects. In particularCh x ¼ 1 þ Oða 2 m 2 x Þ for the unimproved action with N ¼ 0 andCh x ¼ 1 þ Oða 4 m 4 x Þ for the improved actions with N ¼ 1 or N ¼ 1 þ ϵ. XV. Correlation matrix between the D-and B-meson decay constants in Eqs. (7.1)-(7.8); entries are symmetric across the diagonal.