Scale setting the Möbius domain wall fermion on gradient-flowed HISQ action using the omega baryon mass and the gradient-flow scales t 0 and w 0

We report on a subpercent scale determination using the omega baryon mass and gradient-flow methods. The calculations are performed on 22 ensembles of N f ¼ 2 þ 1 þ 1 highly improved, rooted staggered sea-quark configurations generated by the MILC and CalLat Collaborations. The valence quark action used is Möbius domain wall fermions solved on these configurations after a gradient-flow smearing is applied with a flowtime of t gf ¼ 1 in lattice units. The ensembles span four lattice spacings in the range 0 . 06 ≲ a ≲ 0 . 15 fm, six pion masses in the range 130 ≲ m π ≲ 400 MeV and multiple lattice volumes. On each ensemble, the gradient-flow scales t 0 =a 2 and w 0 =a and the omega baryon mass am Ω are computed. The dimensionless product of these quantities is then extrapolated to the continuum and infinite volume limits and interpolated to the physical light, strange and charm quark mass point in the isospin limit, resulting in the determination of ﬃﬃﬃﬃ t 0 p ¼ 0 . 1422 ð 14 Þ fm and w 0 ¼ 0 . 1709 ð 11 Þ fm with all sources of statistical and systematic uncertainty accounted for. The dominant uncertainty in both results is the stochastic uncertainty, though for ﬃﬃﬃﬃ t 0 p there are comparable continuum extrapolation uncertainties. For w 0 , there is a clear path for a few-per-mille uncertainty just through improved stochastic precision, as recently obtained by the Budapest-Marseille-Wuppertal Collaboration.


I. INTRODUCTION
Lattice QCD (LQCD) has become a prominent theoretical tool for calculations of hadronic quantities, and many calculations have reached a level of precision to be able to supplement and/or complement experimental determinations [1]. Precision calculations of Standard Model processes, for example, are crucial input for experimental tests of fundamental symmetries in searches for new physics.
Lattice calculations receive only dimensionless bare parameters as input, so the output is inherently dimensionless. In some cases, dimensionless quantities or ratios of quantities may be directly computed without the need to determine any dimensionful scale. Calculations of g A and F K =F π are examples for which a precise scale setting is not necessary to make a precise, final prediction. However, there are many quantities for which a precise scale setting is desirable, such as the hadron spectrum, the nucleon axial radius, the hadronic contribution to the muon g − 2 [2] and many others.
In these cases, a quantity which is dimensionful (after multiplying or dividing by an appropriate power of the lattice spacing) is calculated and compared to experiment, following extrapolations to the physical point in lattice spacing, volume, and pion mass. Because the precision of any calculations of further dimensionful quantities is limited by the statistical and systematic uncertainties of this scale setting, quantities which have low stochastic noise and mild light quark mass dependence, such as the omega baryon mass m Ω , are preferred. The lattice spacing on each ensemble may then be determined by comparing the quantity calculated on a given ensemble to the continuum value.
However, the most precise quantities one may calculate are not necessarily accessible experimentally. For example, the Sommer scale r 0 [3] has been one of the most commonly used scales. This scale requires a determination of the heavy-quark potential which is susceptible to fitting systematic uncertainties. More recently, the gradient flow scales t 0 [4] and w 0 [5] have been used for a more precise determination of the lattice spacing [6][7][8][9][10][11][12][13][14]. In this case, a well-controlled extrapolation of these quantities to the physical point is also necessary.
In this paper we present a precision scale setting for our mixed lattice action [15] which uses N f ¼ 2 þ 1 þ 1 highly improved, rooted staggered sea-quark (HISQ) configurations generated by the MILC [16] and CalLat Collaborations and Möbius domain wall fermions for the valence sector. We compute the dimensionless products ffiffiffi ffi t 0 p m Ω and m Ω w 0 on each ensemble and extrapolate them to the physical point resulting in the determinations, with the statistical (s), chiral (χ), continuum-limit (a), infinite volume (V), physical-point (phys), and model selection uncertainties (M). We then perform an interpolation of the values of t 0 =a 2 and w 0 =a to the physical quark-mass limit and extrapolation to infinite volume which allows us to provide a precise, quark mass independent scale setting for each lattice spacing, with our final results in Table V. In the following sections we provide details of our lattice setup, our methods for extrapolation, and our results with uncertainty breakdown. We conclude with a discussion in the final section.

II. DETAILS OF THE LATTICE CALCULATION
A. MDWF on gradient-flowed HISQ The lattice action we use is the mixed-action [17,18] with Möbius [19] domain wall fermions [20][21][22] solved on N f ¼ 2 þ 1 þ 1 highly improved staggered quarks [23] after they are gradient-flow smeared [24][25][26] (corresponding to an infinitesimal stout-smearing procedure [27]) to a flow-time of t gf =a 2 ¼ 1 [15]. The choice to hold the flowtime fixed in lattice units is important to ensure that as the continuum limit is taken, effects arising from finite flowtime also extrapolate to zero. This action has been used to compute the nucleon axial coupling, g A , with a 1% total uncertainty [28][29][30][31], the π − → π þ matrix elements relevant to neutrinoless double beta-decay [32] and most recently, F K =F π [33]. Our calculation of F K =F π was obtained with a total uncertainty of 0.4% which provides an important benchmark for our action, as the result is consistent with other determinations in the literature [8,11,[34][35][36][37][38][39][40][41] (and the FLAG average [1]), and also contributes to the test of the universality of lattice QCD results in the continuum limit.
Our plan to compute the axial and other elastic form factors of the nucleon with this mixed-action, as well as other quantities, leads to a desire to have a scale setting with sufficiently small uncertainty that it does not increase the final uncertainty of such quantities. It has been previously observed that both w 0 [5,12] and the omega baryon mass [14,[42][43][44][45][46][47][48] have mild quark mass dependence and that they can be determined with high statistical precision with relatively low computational cost. The input parameters of our action on all ensembles are provided in Table I.

B. Correlation function construction and analysis
For the scale setting computation, we have to determine four or five quantities on each ensemble, the pion, kaon and omega masses, the gradient-flow scale w 0 and the pion decay constant F π . For m π , m K and F π , we take the values from our F K =F π computation for the 18 ensembles in common. For the four new ensembles in this work (a15m310L, a12m310XL, a12m220ms, a12m180L), we follow the same analysis strategy described in Ref. [33].
The a12m220ms ensemble is identical to a12m220 except that the strange quark mass is roughly 60% of the physical value rather than being near the physical value. The a15m310L ensemble has identical input parameters as the a15m310 ensemble but L ¼ 24 (3.6 fm) instead of L ¼ 16 (2.4 fm), while the a12m310XL ensemble is identical to the a12m310 ensemble but with L ¼ 48 (5.8 fm) instead of L ¼ 24 (2.9 fm). The a12m180L and a12m310XL ensembles have a lattice volume that is the same size as a12m130 but pion masses of roughly m π ≃ 180 and 310 MeV. These new ensembles provide important lever arms for the various extrapolations. The a12m220ms provides a unique lever arm for varying the strange quark mass significantly from its physical value, the a15m310L and a12m310XL provide other pion masses where we can perform a volume study and the a12m180L ensemble provides an additional light pion mass ensemble to help with the physical pion mass extrapolation. The first of these is important for this scale setting while the latter three will be more important for future work.
The omega baryon correlation functions are constructed similarly to the pion and kaon. A source for the propagator is constructed with the gauge invariant Gaussian smearing routine in Chroma [52] (GAUGE_INV_GAUSSIAN). Then, correlation functions are constructed using both a point sink as well as the same gauge invariant Gaussian smearing routine with the same parameters as the source. The values of the "smearing width" (σ) and the number of iterations (N) used to approximate the exponential smearing profile are provided in Table I. The correlation functions constructed with the point sink are referred to as PS and those with the smeared sink as SS.
Local spin wave functions are constructed following Refs. [53,54]. Both positive-and negative-parity omegabaryon correlation functions are constructed with the upper and lower spin components of the quark propagators in the Dirac basis. The negative-parity correlation functions are time-reversed with an appropriate sign flip of the correlation function, effectively doubling the statistics with no extra inversions. The four different spin projections of the omega are averaged as well to produce the final spin and parity averaged two-point correlation functions.
The reader will notice that the values of σ and N do not follow an obvious pattern. This is because in our first  [49] indicates the approximate lattice spacing in fm and pion mass in MeV. The S, L, XL which come after an ensemble name denote a relatively small, large and extra-large volume with respect to m π L ¼ 4. Additional ensembles generated by CalLat using the MILC code. The m350 and m400 ensembles were made on the Vulcan supercomputer at LLNL while the a12m310XL, a12m180L, a15m135XL, a09m135, and a06m310L ensembles were made on the Sierra and Lassen supercomputers at LLNL and the Summit supercomputer at OLCF using QUDA [50,51]. These configurations are available to any interested party upon request, and will be available for easy anonymous downloading-hopefully soon.
computations of g A [28,29], we applied an "aggressive" smearing with a larger value of σ and correspondingly larger number of iterations, which led to a large suppression of excited states, but also showed evidence of "over smearing" such that the non-positive-definite PS correlation functions displayed symptoms of having a relatively large negative overlap factor for excited states (there were wiggles in the PS effective masses). In a subsequent paper studying the two-nucleon system on the a12m350 ensemble [55], where we utilized matrix Prony [56] to form linear combinations of PS and SS nucleons to construct a "calm" nucleon which is ground-state dominated earlier in time, we observed that using a milder smearing with smaller width and fewer iterations provided a much more stable extraction of the ground state and did not show signs of large negative overlap factors. Hence, many but not all of the ensembles have been rerun with our improved choices of σ and N. We have observed the choice σ ¼ 3.0 and N ¼ 30 works well for the a15 and a12 ensembles and that σ ¼ 3.5 with N ¼ 45 works well for the a09 and a06 ensembles.
In order to determine the omega baryon mass on each ensemble, we perform a stability analysis of the extracted ground state mass as a function of t min used in the fit as well as the number of states used in the analysis. The correlation functions are analyzed in a Bayesian framework with constraints [57]. We choose normally distributed priors for the ground-state energy and all overlap factors, and log-normal distributions for excited-state energy priors. The ground-state energy and overlap factors are motivated by the plateau values of the effective masses with the priors taken to be roughly 10 times larger than the stochastic uncertainty of the respective effective mass data in the plateau region. The excited-state energy splittings are set to the value of two pion masses with a width allowing for fluctuations down to one pion mass within one standard deviation.
In Fig. 1 we show sample extractions of the ground state mass on our three physical pion mass ensembles. In the left plot, we show the effective mass data from the two correlation functions. The weights are normalized on a given time slice by the largest Bayes factor at that t min value. We have not implemented a more thorough algorithm to weight fits against each other that utilize different amounts of data, as described for example in Ref. [58]. Rather, we have chosen a fit for a given ensemble (the filled black symbol in the right panels highlighted by the horizontal colored band) that has a good fit quality, the maximum or near maximum relative weight, and consistency with the late-time data. We tried to optimize this choice over all ensembles simultaneously, with t min held nearly fixed for a given lattice spacing, rather than handpicking the optimal fit on each ensemble separately, in order to minimize the possible bias introduced by analysis choices. Good fits are obtained on all ensembles with n s ¼ 2, simplifying the model function and reducing the chance of overfitting the correlation functions, which is most relevant on ensembles with the more aggressive choices of smearing parameters. In Appendix C, we show the corresponding stability plots for all remaining ensembles. In Table II we show the resulting values of am Ω on all ensembles used in this work.
C. Calculation of t 0 and w 0 In order to efficiently compute the value of t 0 =a 2 and w 0 =a on each ensemble, we have implemented the Symanzik flow in the QUDA library [50,51,60]. We used the tree-level improved action and the symmetric, cloverleaf definition of the field-strength tensor, following the MILC implementation [61]. We used a step size of ϵ ¼ 0.01 in the Runge-Kutta algorithm proposed by Lüscher [4], which leads to negligibly small integration errors. The scales t 0 and w 0 are defined by the equations, where hEðtÞi is the gluonic action density at flow time t. In Fig. 2, we show the determination of the w 0;orig =a on the two physical pion mass ensembles that we have generated. The uncertainties are determined by observing a saturation of the uncertainty as the bin-size is increased when binning the results from configurations close in Monte Carlo time. These uncertainties were cross-checked with an autocorrelation study using the Γ-method [62] implemented in the UNEW PYTHON package [63]. Reference [64] determined the tree-level in lattice perturbation theory improvement coefficients for the determination of these gradient flow scales through Oða 8 =t 4 Þ for various choices of the gauge action, the gradient flow action and the definition of the field-strength tensor. As defined in Ref. [64], we have implemented the Symanzik-Symanzik-Clover (SSC) scheme with the relevant improvement coefficients (see Table 1 of Ref. [64]), One can then determine improved scales, t 0;imp and w 0;imp in which one has perturbatively removed the leading discretization effects in these flow observables, In the present work, we explore using both the original and improved versions of t 0 and w 0 when performing our scale setting analysis. For the improved versions, we have implemented the fourth order improvement [up to and including the C 8 ða 2 =tÞ 4 correction]. This is the same implementation performed by MILC in Ref. [12]. The dark gray and colored band are displayed for the region of time used in the analysis, and an extrapolation beyond t max is shown after a short break in the fit band. The horizontal gray band is the prior used for the ground state mass. The right plots show the corresponding value of E 0 as a function of t min and the number of states n s used in the analysis, as well as the corresponding Q value and relative weight as a function of n s for a given t min , where the weight is set by the Bayes Factor. See Appendix A for more detail on the selection of the final fit. The chosen fit is denoted with a filled black symbol and the horizontal band is the value of E 0 from the chosen fit. The y-range of the upper panel of the stability plots is equal to the prior of the ground state energy (the horizontal gray band in the left plot). TABLE II. The omega baryon mass (m Ω ) and gradient flow scales (t 0;orig , t 0;imp , w 0;orig and w 0;imp ) determined on each ensemble are listed as well as their dimensionless products. Additionally, in the bottom panel, we list the parameters used to control the physical point extrapolation: Þ=ð4πF π Þ 2 , m π L, ϵ a ¼ a=ð2w 0;orig Þ and α S . The values of α S are taken from Table III of Ref. [12] which were determined with a heavy-quark potential method [59]. An HDF5 file is provided with this publication which includes the resulting bootstrap samples of all these quantities which can be used to construct the correlated uncertainties.
Ensemble In Table II, we list the values of t 0 =a 2 and w 0 =a for the original and improved definitions as well as the dimensionless products of ffiffiffi ffi t 0 p m Ω and w 0 m Ω .

III. EXTRAPOLATION FUNCTIONS
This work utilizes 22 different ensembles, each with O (1000) configurations (Table I), to control the systematic uncertainties in the LQCD calculation of the scale. This allows us to address: (1) The physical light and strange quark mass limit; (2) The physical charm quark mass limit; (3) The continuum limit; (4) The infinite volume limit.

A. Physical light and strange quark mass limit
The ensembles have a range of light quark masses which correspond roughly to 130 ≲ m π ≲ 400 MeV. We have three lattice spacings at m π ≃ m phys π such that the light quark mass extrapolation is really an interpolation. On all but one of the 22 ensembles, the strange quark mass is close to its physical value, allowing us to perform a simple interpolation to the physical strange quark mass point. One ensemble has a strange quark mass of roughly 2=3 its physical value (a12m220ms), allowing us to explore systematics in this strange quark mass interpolation.
To parametrize the light and strange quark mass dependence, we utilize two sets of small parameters, where we have defined Using the Gell-Mann-Oakes-Renner relation [65], the numerators in these parameters correspond roughly to the light and strange quark mass, m 2 π ¼ 2Bm and The first set of parameters, Eq. (3.1), is inspired by χPT and commonly used as a set of small expansion parameters in extrapolating LQCD results. The second set of small parameters, Eq. (3.2), is inspired by Ref. [44]. In Fig. 3, we plot the values of these parameters in comparison with the physical point. Since we are working in the isospin limit in this work, we define the physical point as with the first three values from the FLAG report [1] and the omega baryon mass from the PDG [66]. The values of l F;Ω and s F;Ω are given in Table II for all ensembles.

B. Physical charm quark mass limit
The FNAL and MILC Collaborations have provided a determination of the input value of the charm quark mass that reproduces the "physical" charm quark mass for each of the four lattice spacings used in this work. The mass of the D s meson was used to tune the input charm quark mass until the physical D s mass was reproduced (with the already tuned values of the input strange quark mass), defining the "physical" charm quark masses [38], Comparing to Table I, the simulated charm quark mass is mistuned by less than 2% of the physical charm quark mass for all ensembles used in this work except the a06m310L ensemble, whose simulated charm quark is almost 10% heavier than its physical value. In order to test the sensitivity of our results to this mistuning of the charm quark mass, we perform a reweighting [67] study of the a06m310L correlation functions and extracted pion, kaon and omega baryon masses. While the relative shift of the charm quark mass is small, this shift is approximately equal to the value of the physical strange quark mass, As the reweighting factor is provided by a ratio of the charm quark fermion determinant, it is an extensive quantity, and the relatively large volume we have used to generate the a06m310L ensemble causes some challenges in accurately determining the reweighting factor. The summary of our study is that our scale setting is not sensitive to this mistuning of the charm quark mass, in line with prior expectation. For example, we find where the splitting is determined under bootstrap. We provide extensive details in Appendix A.

C. Continuum limit
In order to control the continuum extrapolation, we utilize four lattice spacings ranging from 0.057 ≲ a ≲ 0.15 fm. For most of the pion masses, we have three values of a with four values at m π ∼ 310 MeV and one value at m π ∼ 180 MeV. The parameter space is depicted in Fig. 4. The small dimensionless parameter we utilize to extrapolate to the continuum limit is ð3:8Þ As noted in Ref. [33], this choice is convenient as the values of ϵ 2 a span a similar range as l 2 F . This allows us to test the ansatz of our assumed power counting that treats corrections of Oðl 2 F Þ ¼ Oðϵ 2 a Þ which we found to be the case for F K =F π [33].
An equally valid way to define the small parameter characterizing the discretization corrections is to utilize the gradient flow scale that is also used to define the observable y being extrapolated. The following normalizations are comparable to our standard choice, Eq. (3.8), While these choices do not exhaust the possibilities, they are used to explore possible systematic uncertainties arising from this choice. Unless otherwise noted, the fixed choice in Eq. (3.8) is used in subsequent results and plots.

D. Infinite volume limit
The leading sensitivity of m Ω , t 0 and w 0 to the size of the volume is exponentially suppressed for sufficiently large m π L [68]. We have ensembles with multiple volumes at a15m310, a12m310 and a12m220 to test the predicted finite volume corrections against the observed ones. We derive the predicted volume dependence of w 0 m Ω to the first two nontrivial orders in Sec. III F.

E. Light and strange quark mass dependence
The light and strange quark mass dependence of the omega baryon has been derived in SUð3Þ heavy baryon χPT (HBχPT) [69,70] to next-to-next-to-leading order (N 2 LO) which is Oðm 4 π;K;η Þ [71][72][73]. It has been shown that SUð3Þ HBχPT does not produce a converging expansion at the physical quark masses [43,[74][75][76][77], and so using these formulas to obtain a precise, let alone subpercent, determination, at the physical pion mass is not possible when incorporating systematic uncertainties associated with the truncation of SUð3Þ HBχPT.
However, many LQCD calculations, including this one, keep the strange quark mass fixed near its physical value. Therefore, a simple interpolation in the strange quark mass is possible. Further, as the omega is an isosinglet, it will have a simpler, and likely more rapidly converging chiral expansion of the light-quark mass dependence than baryons with one or more light valence quarks. This has motivated the construction of an SUð2Þ HBχPT for hyperons which considers only the pion as a light degree of freedom [78][79][80][81][82]. In particular, the chiral expansion for the omega baryon mass was determined to Oðm 6 π Þ [79], ð3:11Þ α n , β n and γ 6 are linear combinations of μ-dependent dimensionless low energy constants (LECs) of the theory, and m 0 is the mass of the omega baryon in the SUð2Þ chiral limit at the physical strange quark mass. The renormalization group [83] restricts the coefficient of the ln 2 term to be linearly dependent on α 2 and α 4 , with the relation provided in Ref. [79]. In standard HBχPT power counting, in which the expansion includes odd powers of the pion mass, this order would be called next-to-next-to-next-to-nextto-leading order (N 4 LO), where leading order (LO) is the Oðm 2 π =Λ χ Þ contribution, next-to-leading order (NLO) would be an Oðm 3 π =Λ 2 χ Þ contribution, which vanishes for m Ω , etc.
The light quark mass dependence for t 0 and w 0 has also been determined in χPT through Oðm 4 π Þ [84] which is N 2 LO in the meson chiral power counting. For example, where the LO term, w 0;ch , is the value in the chiral limit and the k i are linear combinations of dimensionless LECs. The expression for t 0 is identical in form and will have different numerical values of the LECs. From these expressions, we can see both m Ω and t 0 and w 0 depend only upon even powers of the pion mass through the order we are working: m Ω receives a chiral correction that scales as Oðm 7 π Þ from a double-sunset two-loop diagram [79] and the next correction to t 0 and w 0 will appear at Oðm 6 π Þ. We can multiply these expressions together, Eqs. (3.10) and (3.12), in order to form an expression describing the light-quark mass dependence of w 0 m Ω . As the characterization of the order of the expansion with respect to the order of m 2 π is not the same for w 0 and m Ω , we define the contributions to w 0 m Ω as with a similar expression for ffiffiffi ffi t 0 p m Ω . We add polynomial terms in s 2 Λ;Ω such that We will consider both Λ ¼ F and Λ ¼ Ω for the two choices of small parameters. For convenience, we set μ ¼ Λ χ and μ ¼ m Ω respectively for these choices. For a detailed discussion how one can track the consequence of such a quark mass dependent choice for the dim-reg scale, see Ref. [33].

F. Finite volume corrections
The finite-volume (FV) corrections for m Ω are determined at one loop through the modification to the tadpole integral [85,86] where k 1 ðxÞ is given by K 1 ðxÞ is a modified Bessel function of the second kind, and c n are multiplicity factors for the number of ways the integers ðn x ; n y ; n z Þ can form a vector of length jnj; see Table III for the first few. At N 3 LO, the finite volume corrections for m Ω are also trivially determined, as the only two-loop integral that contributes is a double-tadpole with un-nested momentum integrals; see Fig. 2 of Ref. [79]. The N 3 LO correction to w 0 is not known. However, the isoscalar nature of w 0 means that at the two-loop order, just like the correction to m Ω , it will only receive contributions from trivial two-loop integrals with factorizable momentum integrals. Therefore, the N 3 LO FV correction can also be determined from the square of the tadpole integral, ð3:17Þ resulting in δ N 2 LO L;F ðl F ;m π LÞ ¼ c ln ll l 4 F 4k 1 ðm π LÞ δ N 3 LO L;F ðl F ;m π LÞ ¼ c ln 2 lll l 6 F lnðl 2 F Þ8k 1 ðm π LÞþc ln lll l 6 F 16k 1 ðm π LÞ: ð3:18Þ The FV correction through N 3 LO arising from loop corrections to w 0 and m O is given by with a similar expression for δ L;Ω . We are neglecting a few FV corrections at δ N 3 LO L;F . The NLO l 2 F;Ω correction to the omega mass is from a quark mass operator, which has been converted to an m 2 π correction, 2Bm l ¼ m 2 π þ Oðm 4 π =Λ 2 χ Þ. This choice for organizing the perturbative expansion induces corrections in what we have called N 2 LO and N 3 LO. At N 2 LO, the corrections arise from single tadpole diagrams, and so the FV corrections are accounted for through Eq. (3.18). At the next order, the corrections to the pion self-energy involve more complicated two-loop diagrams [87] and so the FV corrections arising from these are not captured in our parametrization. Similar corrections arise from expressing 4πF 0 ¼ 4πF π þ Oðm 2 π =Λ 2 χ Þ when using l 2 F to track the light-quark mass corrections [F 0 is F π in the SUð2Þ chiral limit].
While we have neglected these contributions, the FV corrections to m Ω are suppressed by an extra power in the chiral power counting compared with many observables, beginning with an m 4 π =Λ 4 χ prefactor, Eq. (3.18). In Fig. 5, we show the predicted FV correction along with the results at three volumes on the a12m220 ensembles. As can be observed, the predicted FV corrections are very small and consistent with the numerical results.

G. Discretization corrections
A standard method of incorporating discretization effects into the extrapolation formula used for hadronic observables is to follow the strategy of Sharpe and Singleton [88]: (1) For a given lattice action, one first constructs the Symanzik effective theory (SET) by expanding the discretized action about the continuum limit. This results in a local effective action in terms of quark and gluon fields [89,90]; (2) With this continuum effective theory, one builds a chiral effective theory by using spurion analysis to construct not only operators with explicit quark mass dependence, but also operators with explicit lattice spacing dependence. Such an approach captures the leading discretization effects in a local hadronic effective theory. At the level of the SET, radiative corrections generate logarithmic dependence upon the lattice spacing. The leading corrections can be resummed such that, for an OðaÞ improved action, the leading discretization effects then scale as a 2 α nþγ 1 S [91,92] where n ¼ 0 for an otherwise unimproved action, n ¼ 1 for a tree-level improved action and n ¼ 2 for a one-loop improved action. The coefficientγ 1 is an anomalous dimension which has been determined for Yang-Mills and Wilson actions [93]. 1 For mixed-action setups [17,18] such as the one used in this work, a low-energy mixed-action effective field theory (MAEFT) [18,[94][95][96][97][98][99][100][101][102] can be constructed to capture the manifestation of infrared radiative corrections from the discretization 2 . Corrections come predominantly from a modification of the pseudoscalar meson spectrum as well as from "hairpin" interactions [104] that are proportional to the lattice spacing in rooted-staggered [105] and mixedaction theories [96]; in partially quenched theories, these hairpins are proportional to the difference in the valence and sea quark masses [106][107][108].
In our analysis of F K =F π , we observed that the use of continuum chiral perturbation theory with corrections polynomial in ϵ 2 a was highly favored over the use of the MAEFT expression, as measured by the Bayes-Factor, though the results from both were consistent within a fraction of 1 standard deviation [33]. Similar findings have been observed by other groups for various quantities; see for example Refs. [109][110][111]. Therefore, in this work, we restrict our analysis to a continuum-like expression enhanced by polynomial discretizaton terms.
The dynamical HISQ ensembles have a perturbatively improved action such that the leading discretization effects (before resumming the radiative corrections [91][92][93]) scale as Oðα S a 2 Þ [23]. The MDWF action, in the limit of infinite extent in the fifth dimension, has no chiral symmetry breaking other than that from the quark mass. Consequently, the leading discretization corrections begin at Oða 2 Þ [112,113]. For finite L 5 , the OðaÞ corrections are proportional to am res which is sufficiently small that these terms are numerically negligible. Therefore, we parametrize our discretizaton corrections with the following terms where we count ϵ 2 a ∼ l 2 Λ ∼ s 2 Λ :

IV. EXTRAPOLATION DETAILS AND UNCERTAINTY ANALYSIS
We perform our extrapolation analysis under a Bayesian model-averaging framework as described in detail in Refs. [30,33,114], which is more extensively discussed for lattice QFT analysis in Ref. [58]. We consider a variety of extrapolation functions by working to different orders in the power counting, using the l F , s F or l Ω ; s Ω small parameters, by including or excluding the chiral logarithms associated with pion loops, and by including or excluding discretization corrections scaling as α S a 2 . The resulting Bayes factors are then used to weight the fits with respect to each other and perform a model averaging. In this section, we discuss the selection of the priors for the various LECs and then present an uncertainty analysis of the results.

A. Prior widths of LECs
In our F K =F π analysis [33], we observed that using ϵ 2 π ¼ l 2 F , ϵ 2 K and ϵ 2 a as the small parameters in the expansion, 3 the LECs were naturally of O(1). We therefore have a prior expectation that this may hold for ffiffiffi ffi t 0 p m Ω and w 0 m Ω as well.
Let us use w 0;orig m Ω to guide the discussion. The a ∼ 0.12 fm ensembles were simulated with a fixed strange quark mass. Therefore, the entire change in w 0;orig m Ω between the a12m130 and a12m180L ensembles can be attributed to the change in l F . This allows us to "eyeball" the c l prior to be c l ≃ 1 if we assume the dominant contribution comes from the NLO c l l 2 F term. 4 Motivated by SUð3Þ flavor symmetry considerations, we can roughly expect c s ∼ c l . In order to be conservative, we set the prior for these LECs as where Nðμ; σÞ denotes a normal distribution with mean μ and width σ. A similar observation is made for w 0;imp m Ω and the original and improved values of ffiffiffi ffi t 0 p m Ω . We observe (with a full analysis) that the log-Bayes-Factor (logGBF) prefers even tighter priors, with logGBF continuing to increase as the width is taken down to 0.1 on these NLO LECs. 5 The observation that m Ω increases with increasing values of l F and s F (normalized by any and all gradient flow scales) allows us to conservatively estimate the LO prior, 1 See also the presentation by N. Husung at the MIT Virtual Lattice Field Theory Colloquium Series, http://ctp.lns.mit.edu/ latticecolloq/.
2 While this might seem counterintuitive, it is analogous to the infrared sensitivity of hadronic quantities to the Higgs vacuum expectation value (vev): hadronic quantities have infrared (logarithmic) sensitivity to the pion mass from radiative pion loops, and the squared pion mass is proportional to the light quark mass which is proportional to the Higgs vev [103]. 3 The small parameter This is analogous to using the effective mass and effective overlap factors to choose conservative priors for the ground state parameters in the correlation function analysis. 5 For fixed data, exp log GBFg provides a relative weight of the likelihood of one model versus another. c 0 ¼ Nð1; 1Þ:

ð4:2Þ
We then conservatively estimate the priors for all of the higher order l F and s F LECs to bẽ c i ¼ Nð0; 1Þ: ð4:3Þ We observe, with a full analysis, that this choice is near the optimal value as measured by the logGBF weighting. For the discretization corrections, see Fig. 7 in Sec. IV B, as we change the gradient flow scale from w 0;orig to the improved version to using the original and improved versions of ffiffiffi ffi t 0 p , the approach to the continuum limit can change sign. We also observe, the convexity of the approach to the continuum limit (the ϵ 4 a contributions) can change sign. Therefore, we perform a prior-optimization study for the discretization LECs in which we change the prior width of the NLO and N 2 LO LECs in concert, such that the priors are given bỹ Although we find using tighter priors increases the logGBF, the final results are unchanged. When we add the α S ϵ 2 a term in the analysis, this introduces a fourth class of discretization corrections. As we only have four lattice spacings in this work, we perform an independent prior-width optimization for this LEC. For all four choices of gradient-flow scales, we find the choice, to be near-optimal, with three of the analyses preferring an even tighter prior width. An empirical Bayes study [57] in which the widths of all the chiral and all the discretization priors are varied together at a given order leads to similar choices of all the priors.
In Table IV we list the values of all the priors used in the final analysis. The full analysis demonstrates that these choices also result in no tension between the priors and the final posterior values of the LECs, further indication that our choices are reasonable.
When we use l 2 Ω and s 2 Ω as the small parameters instead of l 2 F and s 2 F , we note that since ðm Ω =Λ χ Þ 2 ∼ 2, we can use the same prescription, except to double the mean and width of all the NLO priors (which scale linearly in l 2 Ω and s 2 Ω ), set the widths to be 4 times larger for the N 2 LO priors and 8 times larger for the N 3 LO LECs. The mixed contributions which scale with some power of ϵ 2 a and l 2 Ω and s 2 Ω are scaled accordingly; see also Table IV.

B. Extrapolation analysis
For each of the four quantities, w 0;orig m Ω , w 0;imp m Ω , ffiffiffiffiffiffiffiffiffiffi t 0;orig p m Ω and ffiffiffiffiffiffiffiffiffiffi t 0;imp p m Ω , we consider several reasonable choices of extrapolation functions to perform the continuum, infinite volume and physical quark-mass limits. The final result for each extrapolation is then determined through a model average in which the relative weight of each model is given by the exponential of the corresponding logGBF value. The various choices we consider in the extrapolations consists of Include the lnðm π Þ terms or counterterm only∶ ×2 Expand to N 2 LO; or N 3 LO∶ ×2 Include=exclude finite volume corrections∶ ×2 Include=exclude the α S a 2 term∶ ×2 We find that there is very little dependence upon the particular model chosen. In Fig. 6, we show the stability of the final result of ffiffiffiffiffiffiffiffiffiffi t 0;imp p m Ω and w 0;imp m Ω as various options from the above list are turned on and off. In addition, we show the impact of including or excluding the a12m220 ms ensemble, whose strange quark mass is m s ∼ 0.6 × m phys s , as well as the impact of including the a06m310L ensemble. We observe a small variation of the result when either of these ensembles is dropped, but the results are still consistent with our final result (top of the figure). Using the fixed definition of ϵ 2 a , Eq. (3.8), we find with the statistical (s), chiral interpolation (χ), continuum-limit (a), infinite-volume (V), physical-point (phys), and model selection uncertainties (M). The conversion to physical units is performed with Eq. (3.4).
As discussed in Sec. III C, we explore potential systematics in the use of one definition of the small parameter used to characterize the discretization corrections, Eq. (3.8), versus another equally valid choice, Eq. (3.9). For each choice of w 0 m Ω and ffiffiffi ffi t 0 p m Ω that is extrapolated to the physical point, we repeat the above model averaging procedure, but we also include the variation of using these two definitions of ϵ 2 a for which we find  For all choices of the gradient-flow scale besides ffiffiffiffiffiffiffiffiffiffi t 0;orig p , the average over the choice of how to define ϵ 2 α has minimal impact on the final result, as can be seen comparing Eq. (4.8) and (4.9). In the case of ffiffiffiffiffiffiffiffiffiffi t 0;orig p , the two choices for the cutoff-effect expansion parameter lead to a slight difference in the continuum-extrapolated value, which is reflected in the model-averaging uncertainty. 6 In all cases, the dominant uncertainty is statistical, suggesting a straightforward path to reducing the uncertainty to a few per-mille.
To arrive at our final determination of ffiffiffi ffi t 0 p , Eq. (1.1) and w 0 , Eq. (1.2), we perform an average of the results in Eq. (4.9). As the data between the two choices differ slightly, we can not perform this final averaging step under the Bayes model-averaging procedure; instead we treat each result with equal weight. We would add half the difference between the central values as an additional discretization uncertainty, but as is evident from Eq. (4.9), the central values are essentially the same.
In Fig. 6, we also compare our result with other values in the literature. All the results, except the most recent one from BMWc [14], have been determined in the isospin symmetric limit. Our results are in good agreement with the more recent and precise results, though one notes, there is some tension in the values of ffiffiffi ffi t 0 p and w 0 reported. In Fig. 7, we show the resulting extrapolation of ffiffiffiffiffiffiffiffiffiffi t 0;orig p m Ω and w 0;orig m Ω projected into the l 2 F plane using the N 3 LO analysis including the lnðm π Þ type corrections. The finite lattice spacing bands are plotted with a value of ϵ 2 a taken from the near-physical pion mass ensembles from from Table IVof Ref. [12] with m 0 l =m 0 s ¼ 1=27 and use this to construct ϵ 2 a . The data points are plotted after being shifted to the extrapolated values of all the parameters using the posterior values of the LECs from the N 3 LO fit.
The lower panel of Fig. 7 is similarly constructed by shifting all the data points to the infinite volume limit, l phys F , s phys F and the value of ϵ 2 a from the particular ensemble with the corresponding band in this same limit and only varying ϵ 2 a . We plot the continuum extrapolation of both the original and improved values to demonstrate the impact of the improvement at finite lattice spacing, noting the agreement in the continuum limit. For w 0 , there is very little difference between the original and improved values with very similar continuum extrapolations. In contrast, there is a striking difference between the original and improved values using ffiffiffi ffi t 0 p , though they agree in the continuum limit. We also observe that the use of ffiffiffiffiffiffiffiffiffiffi t 0;orig p is susceptible to larger model-extrapolation uncertainties arising from different choices of parametrizing the continuum extrapolation; see Eq. (4.10). Additional results at a ≲ 0.06 fm will be required to control the continuum extrapolation using ffiffiffi ffi t 0 p in order to obtain a few-per-mille level of precision.
C. Interpolation of t 0 and w 0 With our determination of t 0 and w 0 , Eqs. (1.1) and (1.2), we can determine the lattice spacing for each bare coupling. We could use the near-physical pion mass ensemble values of the gradient-flow scales, or alternatively, we could interpolate the results to the physical quark mass point using the predicted quark mass dependence [84]. The interpolation can be performed for each lattice spacing separately, or in a combined analysis of all lattice spacings simultaneously. The latter is preferable in order for us to determine the lattice spacing a 06 as we only have results at a single pion mass at this lattice spacing. To perform the global analysis, we use an N 2 LO extrapolation function (which has the same form for t 0 ), a;ch þ k ll l 4 F þ k lln l 4 F lnðl 2 F Þ þ k ls l 2 F s 2 F þ k ss s 4 F þ k aa ϵ 4 a;ch þ k al l 2 F ϵ 2 a;ch þ k as s 2 F ϵ 2 a;ch g; ð4:11Þ 6 Rather than performing a model average over the two definitions of ϵ 2 a as defined in Eqs. (3.8) and (3.9), one might instead consider a model average over the choices for fixed ϵ 2 a , i.e.,  for each lattice spacing as a separate unknown parameter, and then assumes that the remaining dimensionless LECs are shared between all lattice spacings. We use this LO parameter to also construct ϵ a;ch which controls the discretization corrections rather than using ϵ a , as ϵ a is half the inverse of the left-hand side of Eq. (4.11). It is tempting to think of this as a combined chiral and continuum extrapolation analysis of w 0 , but it is not as one normally thinks of them. Because we do not know the lattice spacings already, there remains an ambiguity in the interpretation of w 0;ch =a and the LECs k a , k aa , k la and k sa : we are not able to interpret w 0;ch =a as the chiral limit value of w 0 divided by the lattice spacing. We perform this analysis for all four gradient-flow scales with independent LECs for each scale as well as a similar parametrization of the ϵ a;ch parameter.
When we perform the interpolation for each lattice spacing separately, we utilize this same expression except that we set all parameters proportional to any power of ϵ a;ch to zero. When the individual interpolations are used, the resulting values of a 15 , a 12 and a 09 are compatible with those from the global analysis well within 1 standard deviation. In Fig. 8 A quark-mass independent determination of the lattice spacing can be made by using the determination of ffiffiffi ffi t 0 p , Eq. (1.1) or w 0 , Eq. (1.2) at the physical point, combined with the physical-quark mass interpolated values of t 0 =a 2 or w 0 =a from either the original or improved values of these gradient flow scales. Each choice represents a different scheme for setting the lattice spacing. The continuum extrapolated value of some observable quantity, using any of these schemes, should agree in the continuum limit, while at finite lattice spacing, the results can be substantially different, as is evident in Fig. 7.
In Table V, we provide the determination of the lattice spacing for each bare coupling, expressed in terms of the approximate lattice spacing. It is interesting to note that the determination of the lattice spacing with ffiffiffiffiffiffiffiffiffiffi t 0;orig p is substantially different than with the other gradient-flow scales, while the scale determined with from the three remaining auxiliary scales are very similar.

V. SUMMARY AND DISCUSSION
We have performed a precise scale setting with our MDWF on gradient-flowed HISQ action [15] achieving a total uncertainty of ∼0.6%-0.8% for each lattice spacing, Table V. The scale setting was performed by extrapolating the quantities ffiffiffi ffi t 0 p m Ω ðl F ; s F ; ϵ a ; m π LÞ and w 0 m Ω ðl F ; s F ; ϵ a ; m π LÞ to the continuum (ϵ a → 0), infinite volume (m π L → ∞) and physical quark mass limits (l F → l phys F and s F → s phys F ), and using the experimental determination of m Ω to determine the scales ffiffiffi ffi t 0 p and w 0 in fm. The values of ffiffiffiffiffiffiffiffiffiffi t 0;orig p =a, ffiffiffiffiffiffiffiffiffiffi t 0;imp p =a, w 0;orig =a and w 0;imp =a were interpolated to the infinite volume and physical quark mass limits for each lattice spacing, allowing for the quark-mass independent determination of a for each bare coupling β, expressed in terms of the approximate lattice spacing; see Table V.
Of note, the approach to the continuum limit of ffiffiffiffiffiffiffiffiffiffi t 0;orig p m Ω and ffiffiffiffiffiffiffiffiffiffi t 0;imp p m Ω are quite different, Fig. 7, with the use of ffiffiffiffiffiffiffiffiffiffi t 0;imp p leading to an almost flat continuum extrapolation. The two different extrapolations agree quite nicely in the continuum limit, as they must if all systematic uncertainties are under control. In contrast, the use of the original and improved values of w 0 leads to very similar continuum extrapolations of w 0;orig m Ω and w 0;imp m Ω , which also agree very nicely in the continuum limit.
We also observe that the use of l Ω and s Ω as small parameters to control the quark-mass interpolation are relatively heavily penalized as compared to the use of l F and s F ; see Table VII for an example. We observe the same qualitative weighting with all choices of the gradient-flow scale. Perhaps this is an indication that this parametrization is suboptimal.
Our final uncertainty using w 0 is dominated by the stochastic uncertainty, Eq. (1.2), providing a clear path to reducing the uncertainty by almost a factor of 3 before an improved understanding of the various systematic uncertainties becomes relevant. At such a level of precision, a systematic study of the effect of isospin breaking on the scale setting, as has been performed by BMWc [14], is likely required to retain full control of the uncertainty. For ffiffiffi ffi t 0 p , we observe the model-selection uncertainty is comparable to the stochastic uncertainty, Eq. (1.1), which arises from the different ways to parameterize the continuum extrapolation; see Eq. (4.10). Therefore, additional results at a ≲ 0.06 fm will be required to obtain a fer-per-mille precision with ffiffiffi ffi t 0 p . The pursuit of our physics program of determining the nucleon elastic structure functions and improving the precision of our g A result [30,31] will naturally lead to an improved scale setting precision. The current precision is already expected to be subdominant for most of the results we will obtain, but a further improved precision is welcome.
The analysis and data for this work can be found at this git repo: [115].

APPENDIX A: CHARM QUARK MASS REWEIGHTING
The use of reweighting [67] to estimate a correlation function with a slightly different sea-quark mass than the one simulated with is very common in LQCD; see for example Refs. [11,124,125]. A nice discussion of mass reweighting, including single flavor reweighting is found in Refs. [126,127].
In our case, we are interested in reweighting the computation from the hybrid Monte Carlo (HMC) simulated charm quark mass, m HMC c , to the physical quark mass, m phys c which requires an estimate of the ratio of the fermion determinant with the physical mass to the determinant with the HMC mass. If the mass shift is aδm c ¼ m phys c − m HMC c then up to Oðδm 2 Þ this ratio can be written (including the quarter-root arising from rooted-staggered fermions) for each configuration U i and observables may be computed using the weight w, We can use two methods to stochastically estimate w for each configuration. First, by rewriting the determinant as the exponential of a trace-log, one finds and we can use vectors of complex Gaussian noise η to estimate the trace, where V is the size of each η vector. Alternatively, we may estimate the determinant in the reweighting factor (A1) using the identity, which is often used to implement pseudofermions. Up to Oðδm 2 Þ, this becomes which tells us to draw η according to the same gaussian (A5) and estimate Both the trace method (A3) and the pseudofermion method (A8) are only valid to Oðδm 2 Þ; when they agree we assume those corrections are under control. In order to stabilize the numerical estimate of the reweighting factors, it is also common to split the reweighting factor into a product of reweighting factors where each is computed with a fraction of the full mass shift [128][129][130]. For example, with a simulated mass of m 1 and target mass of m 1 þ Δm, one could use two steps of Δm=2 and estimate the reweighting factor with the trace method, using independently sampled complex Gaussian noise η and θ. Of course, one may split the shift Δm into finer steps if needed, for increased computational cost. The reweighting factor accounts for a change in the action and is exponential in the spacetime volume. This can lead to numerical under-or overflow. As a cure, we factor out the average reweighting factor. Recognizing the trace of the inverse Dirac operator on a configuration U i as the scalar quark density times the lattice volume, we can rescale w, shifting by the ensemble average Vhcci computed via (A4). For example, rescaling the trace method (A3) gives such a rescaling cancels exactly in the reweighting procedure (A2). A similar rescaling cures the pseudofermion method (A8). If we split the mass shift as in (A9), each factor of the weight may be independently so stabilized. On the a06m310L ensemble the lattice volume and shift in the mass are V ¼ 72 3 × 96 and While the shift in the mass is only about 10% of the physical charm quark mass, it is of the order of the physical strange quark mass. In order to stabilize the numerical estimate of the reweighting factors, we split this mass shift into ten equal steps, and for each step, we used N η ¼ 128 independent Gaussian random noise vectors. For each step in the reweighting, we used the same Naik value of ϵ N ¼ −0.0533 as was used in the original HMC. This ensures that the Dirac operator only differs from one mass to the next by the quark mass itself. As the Naik term is used to improve the approach to the continuum limit, this is a valid choice to make as it results in a slightly different approach to the continuum than if one simulated at the physical charm quark mass with the optimized Naik value for that mass. In Fig. 9, we show the reweighting factors for each of the mass steps, with the bottom panel having a mass a 06 m c ¼ 0.28319 closest to the HMC mass and the second from top panel having a 06 m phys c ¼ 0.2579, scaled for numerical stability. The top panel is the resulting product reweighting factor normalized by the average reweighting factor. One observes that there are a few large reweighting factors of O (100). We have verified that the trace estimation method (A3) and the pseudofermion method (A8) produce comparable normalized reweighting factors. The large reweighting factors are likely due to the parent HMC distribution of configurations having a suboptimal overlap with the physical, target distribution.

Reweighted spectrum
The next task is to understand how this reweighting impacts the extracted spectrum. To aid in this discussion, we reiterate our strategy for fitting the correlation functions. Because the noise of the omega baryon correlation function FIG. 9. Distribution of reweighting factors, r, for the a06m310L ensemble. The reweighting was performed with ten equal steps of the mass difference from a 06 m HMC c ¼ 0.286 to a 06 m phys c ¼ 0.2579. The reweighting factors from each step are shown in the bottom ten subpanels with the product reweighting factor shown in the top panel. The a06m310L ensemble was generated with two different streams of equal length. For more information on the ensembles, see Ref. [33]. grows in Euclidean time, it is the most challenging to fit, and so we focus our discussion on the omega. Our strategy is to find a good quality fit to the correlation function for which the extracted ground state energy is stable against the number of states and the time-range used in the analysis. For a given t min , we opt to chose the simplest model which satisfies this criteria, which amounts to picking the minimum number of excited states possible.
The SS correlation functions are positive definite, therefore implying that the excited state contamination of the effective mass must come from above. When examining the SS omega-baryon effective mass on the a06m310L ensemble, one observes that around t ¼ 25, the effective mass stops decreasing, and even increases a little. Because this is not allowed for a positive definite correlation function, we can conclude this must be due to a correlated stochastic fluctuation; see Fig. 10. In the rewighted effective mass, one observes more dramatic behavior of the effective mass beginning around the same time. To be conservative, we set t max ¼ 30 in our analysis as this allows the analysis to be sensitive to these stochastic fluctuations, which fluctuate in the opposite direction between the unweighted and reweighted configurations. As a comparison, we also show the reweighting factors and reweighted omega baryon effective mass on the a12m130 ensemble (Fig. 11) where the charm quark was 2% different from its physical value. In this case, the reweighting factors are much easier to estimate, and we do not observe any large values.
The lower panels in Fig. 10 show the extracted ground state mass as a function of t min and n state . We observe that an n state ¼ 2 fit beginning at t min ¼ 19 and 15 for the unweighted and reweighted correlation functions satisfies our optimization criteria. These lead to the estimate value of m Ω given in Eq. (3.7). In Table VI, we also show the reweighted value of m π and m K on this a06m310L ensemble. While the pion and kaon have a statistically significant shift from the reweighting, when we use the reweighted values of m π , m K and m Ω from this ensemble, the final extrapolated value of w 0 m Ω is within 1 standard deviation of the completely unweighted analysis. As the a06m310L ensemble has the largest potential change from reweighting, we conclude that at the level of precision we currently have, our results are not sensitive to the slight mistuning of the charm quark mass from its physical value on each of the configurations.

APPENDIX B: MODELS INCLUDED IN AVERAGE
We use w 0;orig m Ω as an example to demonstrate the model averaging and naming conventions for the various models. The other gradient-flow scale studies have identical in form, extrapolation functions. Table VII provides the various models and their relative weights for w 0;orig m Ω . Of note, in all four gradient-flow scale extrapolations, the extrapolations which make use of l Ω and s Ω versus l F and s F , all have relatively smaller weights in the averaging, such that in all cases, these extrapolations could be dropped from the model averaging without effecting the final result. A reprodution of this model averaging as well as for the other three gradient-flow scales can be obtained by downloading and running the associated analysis software at https://github.com/callat-qcd/ project_scale_setting_mdwf_hisq. A few example extrapolation formula are given below to demonstrate the naming convention, where the chiral, discretization, and finite volume corrections are defined in Eqs. (3.14), (3.20) and (3.18) respectively.

APPENDIX C: STABILITY PLOTS OF THE OMEGA GROUND STATE MASS
Here we present the stability plots for the remaining Omega correlator fits used in our analysis, which are presented in Figs. 12-25.