Explorations beyond dilaton chiral perturbation theory in the eight-flavor SU(3) gauge theory

We continue our study of spectroscopy data for the SU(3) gauge theory with eight fundamental fermions, motivated by the effective field theory framework of dilaton chiral perturbation theory (dChPT). At leading order dChPT predicts a constant mass anomalous dimension $\gamma_m$, consistent with the assumed proximity of an infrared fixed point. For the relatively large fermion masses simulated by the LatKMI collaboration, the influence of the infrared fixed point diminishes, and our fits suggest that $\gamma_m$ starts running. Since a complete higher-order analysis is not feasible with presently available data, we adopt a more phenomenological approach. We propose a partial extension to higher orders, which incorporates the running of $\gamma_m$ into the tree-level lagrangian. We find that this extension successfully describes the full fermion-mass range of the LatKMI data, including the pion taste splittings which arise from using staggered fermions in the lattice simulations. We also investigate a more general class of dilaton potentials proposed in the literature, using both the LSD and LatKMI data sets, concluding that these data favor the form predicted by dChPT.

I. INTRODUCTION
Lattice simulations of the SU(3) gauge theory with eight Dirac fermions in the fundamental representation have revealed the existence of a flavor-singlet scalar particle, which, at the fermion masses explored in these simulations, is approximately degenerate with the pions, the Nambu-Goldstone bosons associated with chiral symmetry breaking [1][2][3][4]. A similar light scalar has also been found in the SU(3) gauge theory with two sextet fermions [5][6][7][8][9], or with four light and six [10] or eight [11] heavy fundamental fermions. The existence of a light flavor-singlet scalar particle roughly degenerate with the pions means that any effective field theory (EFT) description of the low-energy behavior has to include, besides the pions, a field that represents this scalar particle. Here, our starting point is dilaton chiral perturbation theory (dChPT), an EFT in which the lightness of the scalar particle is assumed to arise from approximate scale invariance of the underlying theory in the infrared [17][18][19][20][21]. Increasing the number of (massless) fermionic degrees of freedom will eventually take the theory into the conformal window, where the non-abelian gauge theory is still asymptotically free, but develops an infrared fixed point (IRFP). The idea is that, with eight flavors, the SU(3) gauge theory is still outside the conformal window, but close enough to the conformal sill (the number of flavors where the IRFP first develops) that the breaking of scale invariance in the infrared is governed by the proximity of the IRFP.
The key assumption is then that the distance to the conformal sill can be treated as a small parameter, in which a systematic power counting can be developed. The scalar particle, which we will refer to as the dilaton, is interpreted as a pseudo Nambu-Goldstone boson (pNGB) for the approximate scale symmetry [17]. The mass of the dilaton is controlled by this small parameter, just as the fermion mass leads to a parametrically small pion mass. Since the fermion mass breaks scale invariance too, the dilaton mass will also depend on the fermion mass.
In a previous paper [33] we applied leading-order (LO) dChPT to numerical data for the eight-flavor SU(3) gauge theory produced in lattice simulations by the LSD collaboration [3]. We showed that, over the fermion mass range in these simulations, LO dChPT successfully describes the pNGB sector of the theory, including the dilaton. In Ref. [3] staggered fermions were used, which exhibit taste splittings: a lattice-artifact mass splitting of the pion multiplet caused by a partial breaking of the flavor symmetry group in the staggered fermion formulation. We showed that dChPT explains the pattern of taste splittings in the pion sector observed in Ref. [3] as a function of the fermion mass. The vacuum expectation value of the dilaton field depends on the fermion mass already at LO, leading to a fermion-mass dependence of pNGB decay constants and masses that is qualitatively different from QCD. This includes the taste splittings, which are also qualitatively different from the pattern seen in QCD with staggered fermions.
Given this success, our goal in this paper is to investigate whether dChPT can also be applied to the other major lattice study of the eight-flavor SU(3) gauge theory, by the LatKMI collaboration [4]. This study also used staggered fermions, and presented extensive spectroscopy data for the pNGB sector, including taste splittings. The KMI simulations were done at larger fermion masses than those of LSD. Even if dChPT is the correct EFT, the question arises whether one can fit the KMI data using LO dChPT, or, alternatively, whether higher orders in the EFT expansion would be needed. Indeed, unlike for the LSD data [3], we found that LO dChPT does not quantitatively describe the KMI data over the full fermion mass range, as will be discussed in detail in this paper.
For the LSD data, we found that as the fermion mass is varied, hadron masses and decay constants respond with an approximate hyperscaling behavior [20]. As the fermion mass increases, the theory is drawn further away from the influence of the IRFP at the nearby conformal sill. Once the fermion mass becomes large enough, we expect that the running of the coupling will become noticeable, and thus also the running of the mass anomalous dimension $\gamma_m$. In dChPT, at leading order, the mass anomalous dimension is constant, $\gamma_m = \gamma^*$, where $\gamma^*$ is the mass anomalous dimension at the nearby IRFP. dChPT allows for a non-constant $\gamma_m$, but the power counting underlying dChPT accommodates corrections to a constant $\gamma_m$ only through higher orders. In order to systematically compare dChPT with the KMI data, we would thus have to consider dChPT to next-to-leading order (NLO) or beyond. However, the relatively large number of additional parameters that would be needed already at NLO, and limitations of the presently available lattice data, to be discussed below, prevent us from attempting a complete NLO fit.
Instead, we will take a more phenomenological approach, based on the following observation. The salient difference between the KMI and LSD data appears to be that a constant γ m cannot account for the full range of (larger) fermion masses explored in the KMI data. We will thus extend LO dChPT by only including higher-order effects that are directly related to γ m ; we will refer to this extension as γ-dChPT. This makes our approach not systematic, since most NLO and higher-order effects are left out. Strictly speaking, γ-dChPT should thus be viewed as a model approach.
In order for LO dChPT to accommodate a varying $\gamma_m$, we will modify the mass-dependent part of the potential, as described in detail in Sec. II. This raises the question of what happens if one also considers a generalization of the dilaton part of the potential. A class of potentials depending on a new parameter ∆, generalizing the dilaton potential of dChPT, has been proposed before [24,25,28,32], and we will refer to this different extension of LO dChPT as ∆-dChPT. One recovers LO dChPT, including its dilaton potential, by taking ∆ → 4. It is interesting to also confront ∆-dChPT with the data. We will revisit the analysis of the LSD data using ∆-dChPT by Ref. [32], and extend this investigation to the KMI data. Despite claims in the literature [32], ∆-dChPT takes us outside the systematic power counting of dChPT, and should thus be considered as a more phenomenological approach to the low-energy behavior of the $N_f = 8$ theory.
This paper is organized as follows. In Sec. II we introduce γ-dChPT, in which LO dChPT is extended to accommodate a varying γ m . In Sec. III we first present our evidence that γ m , as well as other LO parameters, are changing over the KMI mass range in a fit to LO dChPT. We then apply γ-dChPT to the pNGB sector of the KMI data. We find that a rather simple model for a varying γ m provides good fits of the KMI data, including taste splittings. In Sec. IV we consider the generalized class of dilaton potentials, reviewing the application of ∆-dChPT to the LSD data, and applying it to the KMI data. Combining these results provides some evidence that the dilaton potential of LO dChPT is preferred by the data, i.e., that the preferred value in ∆-dChPT is close to ∆ = 4. Finally, Sec. V contains our conclusions. In an appendix, we investigate the claim of Ref. [32] that ∆-dChPT admits a systematic power counting for any value of ∆, and show that this claim is incorrect.

II. DILATON ChPT AND $\gamma_m$
In Sec. II A, we begin with a summary of LO dChPT. This is the EFT that was applied to the LSD data in Ref. [33]. In Sec. II B we revisit the physics of hyperscaling, and its manifestation in LO dChPT. This leads us in Sec. II C to introduce γ-dChPT, where we generalize the low-energy lagrangian to accommodate a non-constant mass anomalous dimension. We emphasize that this extension takes us outside the strict EFT framework. In Sec. II D we present the hadronic quantities to be fit to the KMI data of Ref. [4] in the rest of this paper.

A. dChPT at lowest order
The euclidean LO lagrangian for dChPT is given by The potential terms are Here Σ is the usual non-linear field describing the pion multiplet, while τ is the dilaton effective field. L depends on the low-energy constants (LECs) f τ , f π , B π , B τ , γ * , c 0 and c 1 . We define the theory in the Veneziano limit [37], in which N ≡ N c ∝ N f is taken to infinity keeping the ratio n f = N f /N c fixed, with N f the number of fundamental-representation flavors and N c the number of colors. The power counting is [17] The relation p 2 ∼ m defines the power counting of ordinary ChPT. 6 The small parameter controlling the hard breaking of scale invariance is n f − n * f , where n * f is the limiting value of n f for the theory at the conformal sill: the boundary between the regime where the massless theory undergoes chiral symmetry breaking, and the regime where this theory is conformal in the infrared, i.e., where the gauge coupling g runs into an infrared fixed point g * .
Invoking the proximity of the sill of the conformal window, we assume that the β function is small at the chiral symmetry breaking scale, and that the corresponding value of g is close to g * . We can then expand the mass anomalous dimension γ(g) in powers of n f − n * f around γ * = γ(g * ), the mass anomalous dimension at the infrared fixed point at the conformal sill. For a detailed discussion of the construction of the LO lagrangian, and the underlying power counting, see Refs. [17,20].
In the dilaton potential (2.2a), $c_0$ is O(1), while $c_1$ is proportional to the small expansion parameter $n_f - n_f^*$. 7 For $m = 0$, we shift the $\tau$ field to $\tau + v_0$, with $v_0 = \langle\tau\rangle|_{m=0}$ (before the shift). After the shift, the dilaton expectation value $v(m) = \langle\tau\rangle$ vanishes in the massless theory. Defining the rescaled LECs $\hat f_{\pi,\tau}$ and $\hat B_{\pi,\tau}$ as in Eq. (2.4), the lagrangian after the shift is given by Eq. (2.5).

6 The dimensionful quantities, $p^2$ and $m$, are measured in units of the dynamically generated infrared scale of the massless theory.
7 For a few more details about the power counting, see App. A.

The shift sets c 0 = −c 1 /4, and now the whole LO lagrangian is O(p 2 ) in the power counting (2.3). We will assume c 1 > 0, so that the potential L d + L m is bounded from below. Assuming m ≥ 0, the potential is minimized by Σ = 1. The dilaton expectation value v = v(m) solves the saddle-point equation The solution is positive, and monotonically increasing with m. The spectroscopy data we considered in Ref. [33] can then be expressed as functions of m, Explicitly, where W 0 is the Lambert W -function. The parameters d 0,1,2,3 are defined in terms of the LECs of the tree-level lagrangian, (2.10) In Ref. [33] we applied LO dChPT, as summarized above, to the LSD data [3]. The key assumptions underlying this analysis were: (a) the N f = 8, N c = 3 theory undergoes chiral symmetry breaking; (b) for the LSD mass range, the β function is small enough that the dChPT power counting is applicable. The results of our analysis corroborated these assumptions.
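The closed-form solution in terms of the Lambert W-function mentioned above can be illustrated with a short numerical sketch. It assumes a saddle-point equation of the generic form $v\,e^{a v} = h$, where $a > 0$ stands in for $1 + \gamma^*$ and $h \ge 0$ for a rescaled fermion mass; the names `a`, `h` and both function names are hypothetical, and the normalization is chosen for illustration only. For this assumed form the solution is $v = W_0(a h)/a$, which is positive and increases monotonically with $h$.

```python
import math

def lambert_w0(x, tol=1e-14, max_iter=100):
    """Principal branch W_0(x) for x >= 0: solves w * exp(w) = x by Newton's method."""
    if x < 0:
        raise ValueError("this sketch only handles x >= 0")
    if x == 0:
        return 0.0
    w = math.log1p(x)  # starting guess; lies at or above the root for x > 0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w

def dilaton_vev(h, a):
    """Solve v * exp(a * v) = h for v (assumed generic form, see lead-in)."""
    return lambert_w0(a * h) / a

if __name__ == "__main__":
    a = 1.8  # illustrative stand-in for 1 + gamma^*, not a fitted value
    for h in (0.0, 0.1, 0.5, 2.0):
        v = dilaton_vev(h, a)
        print(f"h = {h:4.1f}   v = {v:.6f}   residual = {v * math.exp(a * v) - h:.2e}")
```

The Newton iteration is used only to keep the sketch free of external dependencies; any library implementation of $W_0$ would serve equally well.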

B. Hyperscaling
Consider momentarily a mass-deformed infrared conformal theory. We can probe the theory over a range of scales where $g$ is so close to the infrared fixed point $g^*$ that all effects of its running can be neglected. The breaking of scale invariance is then driven entirely by the input bare fermion mass $m_0$. Under these circumstances, any hadronic mass $M$ follows a simple hyperscaling law, Eq. (2.11). Here $\Lambda_{UV}$ is an ultraviolet scale for which the approximation $\gamma_m(\mu) = \gamma^*$ is valid for any $\mu \le \Lambda_{UV}$, and $m_0 = m(\Lambda_{UV})$, where $m(\mu)$ is the running renormalized mass. Hyperscaling is based on the following simple observations:
1. The renormalized mass, $m = m(\mu)$, runs as dictated by its anomalous dimension. By contrast, the renormalized coupling has attained its fixed-point value $g^*$ (up to negligible corrections), hence also the mass anomalous dimension has a fixed value $\gamma^* = \gamma_m(g^*)$.
2. No physical scale is generated dynamically in the massless theory. When the fermion mass is nonzero, the induced physical scale $M$ is set by the condition $M \sim m(M)$.
Indeed, starting from the solution for m(µ) for a constant mass anomalous dimension, Returning to dChPT, in Ref. [33] we found that the LSD data is in the "large-mass" regime [20], where for all (bare) masses. As follows from the previous subsection, 8 in LO dChPT, c 1 encodes the magnitude of the β function at the chiral symmetry breaking scale. The large-mass regime is thus an approximate hyperscaling regime, where the input fermion mass dominates the breaking of scale invariance. Indeed, in Ref. [20] we showed that the leading mass dependence predicted by LO dChPT in the large-mass regime is the hyperscaling relation (2.11), for all hadronic masses and decay constants. We also calculated corrections to this relation, which are present in dChPT already at LO, because the β function at the chiral symmetry breaking scale, hence c 1 , is (by assumption) parametrically small, but not vanishingly small as in a mass-deformed infrared conformal theory. Moreover, we showed that as long as dChPT provide a systematic expansion, even though m 0 /M can be large. By Eq. (2.7), M is constructed from LECs which can be defined in the chiral limit. It is a striking difference between ordinary ChPT and dChPT that, because of the nearby IRFP, in dChPT a systematic low-energy expansion exists even if the fermion mass is not small relative to the infrared scale of the massless theory, so long as inequality (2.14) holds. The fermion mass range explored in the KMI data is higher than in the LSD data. The comparison can be made, for example, in units of t 0 , see Fig. 5 of Ref. [2]. We will return to the comparison between the LSD and KMI data, and its limitations, in Sec. III E below. As mentioned in the introduction, when we increase the input fermion mass the influence of the IRFP diminishes. Eventually, we will reach energy scales where the running of the coupling picks up, 9 and, as a result, so does the running of the mass anomalous dimension. 
In the next subsection, guided by this consideration, we will develop a generalized notion of hyperscaling, which is founded on the same principles as above, except that the assumption of a constant mass anomalous dimension is relaxed. This will lead to the framework of γ-dChPT, where LO dChPT is extended to accommodate a varying mass anomalous dimension. We stress that the power counting of dChPT allows for corrections to a constant $\gamma_m$, but only via higher-order terms in the expansion in $n_f - n_f^*$. In seeking an extension of LO dChPT that accommodates a varying $\gamma_m$ we are thus asking for a partial resummation of these higher-order terms, under the assumption that these are the dominant higher-order corrections.
We conclude this subsection with a technical comment. The hyperscaling law (2.11) can be rewritten as Eq. (2.15). It follows that the fermion mass $m_0$ is always much smaller than any hadronic mass $M$ (as long as $m_0 \ll \Lambda_{UV}$), and the same is true for the decay constants $F_\pi$ and $F_\tau$. Moreover, in Ref. [20] we showed that this conclusion extends to $n_f < n_f^*$, below the conformal window, and that it applies also to the masses of the pNGBs, $M_\pi$ and $M_\tau$. We will assume that the ratio $m_0/M$ remains small also when the simple hyperscaling relations, Eqs. (2.11) and (2.15), are generalized to account for the running of $\gamma_m$. Indeed, for the LSD data, $m_0/M_\pi$ ranges between 0.015 and 0.04, while for the KMI data it ranges between 0.07 and 0.17. Since $m_0/M_\pi \ll 1$, this allows us to use a mass-independent renormalization scheme. As we will see below, this greatly simplifies our considerations.
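The two observations underlying hyperscaling can be checked with a small numerical sketch. It assumes the sign convention $\mu\, dm/d\mu = -\gamma_m m$, so that for constant $\gamma_m = \gamma^*$ the running mass is $m(\mu) = m_0 (\Lambda_{UV}/\mu)^{\gamma^*}$, and it sets the order-one proportionality constant in $M \sim m(M)$ to one. Under these assumptions the condition $M = m(M)$ gives $M = \Lambda_{UV} (m_0/\Lambda_{UV})^{1/(1+\gamma^*)}$, hence $m_0/M = (m_0/\Lambda_{UV})^{\gamma^*/(1+\gamma^*)}$, which is small for $m_0 \ll \Lambda_{UV}$. All function names and numerical values below are illustrative choices, not taken from the fits.

```python
import math

def running_mass(mu, m0, lam_uv, gamma_star):
    """m(mu) for constant anomalous dimension, with m(lam_uv) = m0 (assumed convention)."""
    return m0 * (lam_uv / mu) ** gamma_star

def induced_scale_numeric(m0, lam_uv, gamma_star, n_iter=200):
    """Solve M = m(M) by bisection in log(M); requires 0 < m0 < lam_uv, gamma_star > 0."""
    lo, hi = math.log(m0), math.log(lam_uv)
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        scale = math.exp(mid)
        if scale < running_mass(scale, m0, lam_uv, gamma_star):
            lo = mid  # the root lies at a larger scale
        else:
            hi = mid
    return math.exp(0.5 * (lo + hi))

def induced_scale_closed_form(m0, lam_uv, gamma_star):
    """Closed-form solution of M = m(M) for constant anomalous dimension."""
    return lam_uv * (m0 / lam_uv) ** (1.0 / (1.0 + gamma_star))

if __name__ == "__main__":
    lam_uv, gamma_star = 1.0, 1.0  # arbitrary illustrative values
    for m0 in (1e-4, 1e-3, 1e-2):
        m_num = induced_scale_numeric(m0, lam_uv, gamma_star)
        m_cf = induced_scale_closed_form(m0, lam_uv, gamma_star)
        print(f"m0 = {m0:.0e}   M(numeric) = {m_num:.6f}   M(closed) = {m_cf:.6f}   m0/M = {m0 / m_cf:.4f}")
```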

C. Varying γ m and γ-dChPT
We will now proceed to develop the extension of LO dChPT allowing for a scale-dependent γ m . The RG equation governing the dependence of the renormalized mass m on the renormalization scale µ is closely related to the behavior of the renormalized mass under scale transformations. In order to relate the two, we first review how a scale is introduced into the bare theory; we will do this using dimensional regularization. For more details, we refer to Ref. [19]. We regulate the action of the microscopic theory as where L is the bare lagrangian, and d is the number of dimensions. With the factor µ d−4 0 , the bare action S is invariant under scale transformations if we promote the bare parameters µ 0 and m 0 to spurions. The scale transformation rules are where A µ is the bare gauge field and ψ the bare fermion field. The function γ m , defined by the RG equation describes the response of the renormalized mass m to a change of the renormalization scale µ. In a mass-independent scheme, all renormalization factors depend on the scales µ and µ 0 only through their ratio, µ/µ 0 . Hence, where g = g(µ/µ 0 ) is the running coupling. From now on, we will write γ m (µ/µ 0 ) for γ m (g(µ/µ 0 )), with slight abuse of notation. We choose µ not to transform under scale transformations: the transformation (2.17) describes a rescaling of all the dimensionful bare quantities relative to a fixed renormalization scale. Once γ m is known we can express m(µ), the renormalized mass at an arbitrary renormalization scale µ, in terms of the bare mass, m 0 = m(µ 0 ), by integrating Eq. (2.18) between µ 0 and µ. Introducing the formal solutions (2.20) of the RG equations Using Eq. (2.17) for the dependence of the bare parameters m 0 and µ 0 on the scale transformation parameter λ, it follows that an infinitesimal scale transformation of the renormalized mass is governed by the differential equation [19] ∂m(λ; µ) For constant γ m = γ * , Eq. 
(2.20) simplifies to The second equation explains the origin of the factor λ 1+γ * . A factor λ comes from the transformation of m 0 , Eq. (2.17a), while the remaining factor λ γ * comes from the transformation of µ 0 , Eq. (2.17b). With the transformation rules of the effective fields , as required for the invariance of the action.
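The integration of the RG equation between $\mu_0$ and $\mu$ described above can be sketched numerically. The sketch assumes the sign convention $\partial \log m/\partial \log\mu = -\gamma_m(\mu/\mu_0)$, so that $m(\mu) = m_0 \exp\left(-\int_0^{\log(\mu/\mu_0)} \gamma_m(e^s)\, ds\right)$, and evaluates the integral with the trapezoidal rule. The function `renormalized_mass` and the sample form of $\gamma_m$ (linear in $\log(\mu/\mu_0)$) are hypothetical illustrations, not the parametrization used in the fits. For constant $\gamma_m = \gamma^*$ the result reduces to the power law $m_0 (\mu_0/\mu)^{\gamma^*}$.

```python
import math

def renormalized_mass(mu_over_mu0, m0, gamma_m, n_steps=2000):
    """m(mu) from m0 = m(mu0), integrating d log m / d log mu = -gamma_m(mu/mu0)."""
    s_end = math.log(mu_over_mu0)
    h = s_end / n_steps  # signed step in s = log(mu/mu0)
    integral = 0.0
    for i in range(n_steps + 1):
        weight = 0.5 if i in (0, n_steps) else 1.0  # trapezoidal end-point weights
        integral += weight * gamma_m(math.exp(i * h)) * h
    return m0 * math.exp(-integral)

if __name__ == "__main__":
    m0 = 0.05
    constant = lambda x: 1.0                       # constant anomalous dimension
    running = lambda x: 1.0 + 0.2 * math.log(x)    # hypothetical scale dependence
    for ratio in (0.25, 0.5, 1.0):
        print(ratio, renormalized_mass(ratio, m0, constant), renormalized_mass(ratio, m0, running))
```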
In order to accommodate a non-constant $\gamma_m$, we replace $L_m$ of Eq. (2.2b) by Eq. (2.28). Let us derive the transformation properties of this lagrangian. The combination $B_\pi(\mu/\mu_0)\, m(\mu/\mu_0)$ is by assumption RG invariant, and we can write $B_\pi(\mu/\mu_0)$ as in Eq. (2.29). The new LEC, $B_\pi^{RG}$, is both RG invariant and scale invariant, also by assumption. Hence $B_\pi(\mu/\mu_0)\, m(\mu/\mu_0) = B_\pi^{RG} m_0$, and using Eq. (2.17a) it follows that under a scale transformation
$\frac{\partial}{\partial \log\lambda} \left( B_\pi(\mu/(\lambda\mu_0))\, m(\lambda;\mu) \right) = + B_\pi(\mu/(\lambda\mu_0))\, m(\lambda;\mu)$ . (2.30)
The factor $E_-(e^\tau f_\pi/\mu_0)$ in Eq. (2.28) is invariant under a scale transformation by construction, because the combination $e^\tau f_\pi/\mu_0$ is. Noting that the scaling dimension of $\Sigma$ is zero, and taking the contribution from the factor $e^{3\tau}$ into account, we obtain Eq. (2.31). The transformation rule (2.32c) is needed to ensure the invariance of (the space-time integral of) $L_d$ in Eq. (2.2a). 13 As usual, once the spurions $m$, $\mu_0$ and $c_0$ are set equal to their fixed values, this breaks the scale symmetry explicitly.
We may again shift the $\tau$ field, as we did in Sec. II A, such that after the shift it has a vanishing expectation value for $m = 0$. The LECs $f_{\pi,\tau}$ and $B_\tau$ are redefined as in Eq. (2.4), but now $\hat B_\pi$ is defined as in Eq. (2.33). The lagrangian after the shift is again given by Eq. (2.5), but now with $L_m$ given by Eq. (2.34) instead of Eq. (2.6c). Note that, instead of being a function of $e^\tau f_\pi/\mu_0$, now $E_-$ is a function of $e^\tau \hat f_\pi/\mu_0$.
Let us now reconsider the trace anomaly. We first apply the scale transformation only to the effective fields, setting the spurions equal to their fixed values. In this case, 14 we have Eq. (2.35), and we obtain the contribution of $L_m$ to $\partial_\mu S_\mu$, the divergence of the dilatation current $S_\mu$ (see App. D of Ref. [17]), Eq. (2.36). In the last step we identified $L_m$ with the EFT representation of $m\bar\psi\psi$ in the underlying theory. This reproduces, in the EFT, the contributions from the fermions to the trace anomaly [38]. Recall that we have defined $\gamma_m$ to be a function of $\mu/\mu_0$, cf. Eq. (2.19).
Replacing $\tau$ by $v(m)$, its vacuum expectation value at non-vanishing $m$, we see that Eq. (2.36) effectively identifies the renormalization scale $\mu$ with $F_\pi = e^{v(m)} \hat f_\pi$, cf. Eq. (2.8b). This reveals a key feature of our construction of γ-dChPT: $\gamma_m$ is evaluated at a renormalization scale equal to the physical scale $F_\pi$, which, in turn, is a function of the input fermion mass. We comment that we chose the hadronic scale inside $E_-$ in Eq. (2.28) to be $f_\pi$, but, to achieve the desired scaling behavior, we could equivalently choose $f_\tau$, or, more generally, any other hadronic scale $m_h$ that enters the dChPT lagrangian (or generalization thereof) via the combination $e^\tau m_h$, such as, for example, the nucleon mass in the chiral limit.
We now specialize to specific choices for the function $\gamma_m$. First, for constant $\gamma_m = \gamma^*$, we obtain Eq. (2.37), and the lagrangian $L_m$ in Eq. (2.34) reduces to Eq. (2.6c). 15 This also implies $\hat B_\pi(\mu/f_\pi) = e^{(1-\gamma^*)v_0} B_\pi(\mu/f_\pi)$, consistent with Eq. (2.4c).
We next introduce a new choice for $\gamma_m$ that we will be using for the actual fits to the KMI data. With $t = \tau + \log(f_\pi/\mu_0)$ we define $\tilde F(t)$, a cubic polynomial in $t$. The variable $t$ is invariant under scale transformations, and, consistent with our general discussion, $\tilde\gamma_0$, $\tilde b$ and $\tilde c$ are LECs that do not depend on $\mu$ or $\mu_0$. Re-expressing $t$ in terms of $\tau$, we write Eq. (2.41), which defines the coefficients of the cubic polynomial $F(\tau)$ in terms of those of $\tilde F(t)$, and $\log(f_\pi/\mu_0)$. Substituting into Eq. (2.34), and absorbing $e^{-\tilde F(\log(f_\pi/\mu_0))}$ into $\hat B_\pi$, the final form of the lagrangian becomes Eq. (2.42). We will use the acronym γ-dChPT for the lagrangian defined by Eq. (2.1), with $L_d$ given by Eq. (2.2a), and $L_m$ by Eq. (2.42) for some general function $F(\tau)$. Of course, for the case of a linear $F(\tau)$, Eq. (2.42) reduces to Eq. (2.2b), and the lagrangian is just LO dChPT.

13 The transformation rules of $c_0$ and $c_1$ get modified at higher orders. For a detailed discussion of $L_d$, see Refs. [17,20].
14 We omit the contribution from the scale dependence of the space-time coordinates (compare Eq. (2.31)).
15 In this special case, the dependence on $\mu_0$ drops out.
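Re-expressing the cubic polynomial in $t$ as a cubic polynomial in $\tau$ amounts to shifting the argument by a constant and dropping the resulting constant term, which is absorbed into the overall normalization. A minimal sketch of this bookkeeping follows. It uses hypothetical coefficient names `a1`, `a2`, `a3` for a cubic without constant term, $\tilde F(t) = a_1 t + a_2 t^2 + a_3 t^3$, and `s` for the constant shift $\log(f_\pi/\mu_0)$; how these coefficients map onto $\tilde\gamma_0$, $\tilde b$ and $\tilde c$ is not assumed here.

```python
def cubic(a1, a2, a3, x):
    """Evaluate a1*x + a2*x^2 + a3*x^3 (no constant term)."""
    return x * (a1 + x * (a2 + x * a3))

def shift_cubic(a1, a2, a3, s):
    """Coefficients (c1, c2, c3) of F(tau) = Ft(tau + s) - Ft(s) for the cubic Ft above."""
    c1 = a1 + 2.0 * a2 * s + 3.0 * a3 * s * s
    c2 = a2 + 3.0 * a3 * s
    c3 = a3
    return c1, c2, c3

if __name__ == "__main__":
    a1, a2, a3, s = 1.0, -0.3, 0.05, 0.7  # arbitrary illustrative numbers
    c1, c2, c3 = shift_cubic(a1, a2, a3, s)
    for tau in (-0.5, 0.0, 1.2):
        lhs = cubic(c1, c2, c3, tau)
        rhs = cubic(a1, a2, a3, tau + s) - cubic(a1, a2, a3, s)
        print(f"tau = {tau:5.2f}   F(tau) = {lhs:.6f}   difference = {lhs - rhs:.1e}")
```

By construction the shifted polynomial vanishes at $\tau = 0$, which reflects the absorption of the constant into the prefactor.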
As an EFT, dChPT is based on the power counting established in Refs. [17,20] and reviewed above. As in ordinary ChPT, loop corrections in dChPT can be included systematically; the power counting (2.3) dictates which terms occur at the next-to-leading order (NLO) [17], at the next-to-next-to-leading order (NNLO), and so on. The same is true in the large-mass regime, where the power counting is controlled by Eq. (2.14). This raises the question of how much γ-dChPT deviates from the strict EFT framework of dChPT itself. If we rely on algebraic structure and symmetries only, this allows $E_-(e^\tau \hat f_\pi/\mu_0)$ in Eq. (2.34), or, equivalently, $F(\tau)$ in Eq. (2.42), to depend on an infinite number of parameters, reflecting the model nature of γ-dChPT. But if, on the other hand, we assume that $F(\tau)$ takes the form of Eq. (2.41), together with Eq. (2.43), then the factor $e^{-F(\tau)}$ may be obtained via partial resummation of terms from all orders in the expansion in powers of $n_f - n_f^*$. It thus reflects a fairly modest departure from dChPT, in that we will be taking into account some higher-order analytic terms, resummed into $e^{-F(\tau)}$, while omitting other higher-order terms. In addition, we will not calculate any non-analytic higher-order corrections when fitting γ-dChPT to data. We will re-examine the scenario of Eq. (2.43) after presenting our fits to the KMI data in Sec. III.
Equation (2.44) can be rewritten as For a general function F , Eq. (2.47) cannot be explicitly inverted analytically. We will, in effect, solve it numerically for m as a function of v, as described in Sec. III. In terms of v, F π is still given by Eq. (2.8b). The pion mass is now so that, using Eq. (2.47), the ratio M 2 π /F 2 π is given by The three equations (2.47), (2.8b) and (2.50) contain six parameters,d 1 , d 2 ,f π and the three parameters inside F : γ 0 , b and c. We will not fit M τ to the KMI data, as the errors found in Ref. [4] are too large for such a fit to have statistical relevance. We will, however, fit the staggered taste-splittings obtained in Ref. [4]. With M Γ i the masses of the taste-split pions corresponding to the tastes we will fit the differences 16 according to [39,40] (2.53e) 16 We note that M Γ5 = M π is the mass of the Nambu-Goldstone pion.
Here C 1,3,4,6 are LECs associated with the taste-breaking potential [40], and (2.54) Equation (2.54) assumes that γ i , the anomalous dimensions of the taste-breaking fourfermion operators, are constant (see Ref. [33] for more details). A global fit of the data including all the taste splittings has eight new parameters, coming from Eq. (2.53), in addition to the six parameters of the basic fit. This is a large number of parameters, and, as we will see, some of them are not sufficiently constrained by the available data. Thus, we will not venture into an exploration of any scale dependence of the γ i . We end this section with a comment. While in LO dChPT the potential is bounded from below, in γ-dChPT with general F (v) the potential can be unbounded from below. 17 Mathematically, this appears to be a problem, but we contend that it is physically irrelevant. Within the EFT framework, the potential can only be known for O(1) values of the fields. While the pion field is always O(1) because it is a compact field, this is not the case for τ . We thus need to restrict the EFT to O(1) values of τ "by hand." In practice, this means that after fits to the data, we need to check that indeed values of v predicted by the fits are O(1), and do not land in the large-field region. In all our fits with a varying γ m indeed unphysical regions of the potential occur at very large values of v, but they are separated from the physical region by an exponentially large potential barrier. Consistently, our fits never explore the unphysical region of the potential.

III. FITS TO THE LatKMI DATA
In this section, we will present our fits to data reported in Ref. [4], obtained by the LatKMI collaboration for the eight-flavor SU(3) gauge theory. We begin in Sec. III A with a discussion of these data and the policies we will follow when we use them. In Sec. III B, we present "window" fits. These are fits of M 2 π /F 2 π and F π to the predictions of LO dChPT, for successive quintets of fermion masses, from the five lightest masses to the five heaviest ones. Altogether, ten different fermion masses were simulated in Ref. [4], making six (overlapping) windows. The window fits test the constancy of the LO dChPT parameters. We find a systematic trend of change for all fit parameters, by much more than their errors allow, proving that the full KMI mass range cannot be fit to LO dChPT. Then, in Sec. III C we fit the data at all ten fermion masses simultaneously to γ-dChPT, the extension of LO dChPT with a varying γ m constructed in Sec. II C, with the special choice of γ m in Eq. (2.46). We find that this extension of dChPT successfully describes the KMI data set. Data for taste-split pion masses is available for a more limited set of fermion masses, and we present our fits including the taste splittings in Sec. III D. We end with a discussion of the scale dependence of γ m found in our fits in Sec. III E.
The simulations of Ref. [4] were all performed at the same bare coupling. Invoking a mass-independent scale setting prescription, this automatically implies that all ensembles have a common lattice spacing a.
We will be using lattice units in all our fits. This means taking $\mu = \mu_0 = 1/a$, and thus $m(\mu) = m(\mu_0) = m_0$.

17 For polynomial $F(v)$, a necessary and sufficient condition that the potential will be bounded from below is that the highest power of $v$ is even, and its coefficient is positive.

A. The LatKMI data
The pion mass $M_\pi$ and decay constant $F_\pi$ were measured in Ref. [4] at ten fermion masses, the set (3.1). In Ref. [4] a great effort was made to also determine the dilaton mass $M_\tau$. It was found that indeed a dilaton exists, roughly degenerate with the pions. $M_\tau$ was measured for only 6 fermion masses, leaving out $am_0 = 0.05$, 0.07, 0.08 and 0.1. More seriously, the statistical errors of $M_\tau$ turn out to be too large to have any real impact on our fits. In the window fits to LO dChPT (next subsection), we found that when we include a fit of $M_\tau^2/F_\pi^2$ to Eq. (2.8d) in our global fit, $d_3$ remains largely undetermined, while all other fit parameters do not change. The only noticeable change is a higher p-value, as might be expected. We thus omit the dilaton mass from the fits discussed in this paper.
Other hadron masses were also determined, notably the vector meson mass $aM_\rho$ and the nucleon mass $aM_N$. 18 For these hadrons, the prediction from LO dChPT is that the ratios $M_\rho/F_\pi$ and $M_N/F_\pi$ should be independent of $am_0$ [20]; this is also true if we extend LO dChPT to include a varying $\gamma_m$. Excluding the two largest fermion masses, $am_0 = 0.08$ and 0.1, we found that we can fit $M_\rho/F_\pi$ to a constant, with a p-value of 0.31. $M_N$ was measured only for a subset of the fermion masses, which leaves out $am_0 = 0.05$, 0.07 and 0.1. Keeping only the 5 lightest masses, we found that a fit of $M_N/F_\pi$ to a constant has a p-value of 0.07. This suggests that for larger fermion masses, higher-order corrections in dChPT (other than a varying $\gamma_m$) would be needed to fit these ratios. In addition, discretization effects could be playing a bigger role (see below). We will thus focus in this paper on the pion sector, considering $M_\pi^2/F_\pi^2$ and $aF_\pi$ in Secs. III B and III C, and adding taste splittings in Sec. III D.
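A fit of a ratio such as $M_\rho/F_\pi$ to a constant reduces to an error-weighted average, with $\chi^2$ measuring the scatter of the points about it. The sketch below illustrates this under the assumption of uncorrelated data points; the numbers are made up for illustration and are not the data of Ref. [4], and the function name is hypothetical. Converting $\chi^2$ into a p-value additionally requires the $\chi^2$ distribution with $N - 1$ degrees of freedom, which is not implemented here.

```python
def fit_constant(values, errors):
    """Weighted least-squares fit of uncorrelated data to a constant.

    Returns (best-fit constant, its error, chi^2, degrees of freedom).
    """
    weights = [1.0 / e**2 for e in errors]
    wsum = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / wsum
    mean_err = wsum ** -0.5
    chi2 = sum(w * (v - mean) ** 2 for w, v in zip(weights, values))
    dof = len(values) - 1
    return mean, mean_err, chi2, dof

if __name__ == "__main__":
    ratios = [7.9, 8.1, 8.0, 8.2, 7.8]       # made-up ratio values
    sigmas = [0.10, 0.10, 0.15, 0.20, 0.10]  # made-up one-sigma errors
    mean, err, chi2, dof = fit_constant(ratios, sigmas)
    print(f"constant = {mean:.3f} +/- {err:.3f}, chi2/dof = {chi2:.2f}/{dof}")
```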
Information on the systematic errors of $aM_\pi$ and $aF_\pi$ is incomplete. For most fermion masses, they were measured on at least two different volumes, and we estimate the finite-volume error by taking the difference between the results at the largest two volumes. For $am_0 = 0.012$ only one volume is available. In this case we took the finite-volume errors to be the same as for $am_0 = 0.015$. The latter was simulated on the same volume as $am_0 = 0.012$, as well as on a somewhat smaller volume. We note that, since $am_0 = 0.012$ is the lightest fermion mass, this procedure may underestimate its finite-volume errors. A single volume was reported also for $am_0 = 0.08$ and 0.1. For these fermion masses, the two largest ones, $M_\pi L$ is very large, and finite-volume corrections should be very small. We thus took the finite-volume errors for these two masses to vanish. We added the statistical error and the finite-volume error of $aM_\pi$ and $aF_\pi$ in quadrature. These errors were propagated to the ratio $M_\pi^2/F_\pi^2$, and correlations between this ratio and $aF_\pi$ were kept. 19
As the simulations of Ref. [4] were done at a single bare coupling, no direct information is available on the lattice spacing dependence, and it is not possible to take the continuum limit. We are thus forced to ignore scaling violations in our fits, but it should be kept in mind that these affect our results in an unknown way. Generally speaking, $M_\rho$ and $M_N$ are larger than $M_\pi$, and are thus prone to larger discretization effects. Also, as an example, for $am_0 = 0.08$ Ref. [4] finds the central values $aM_\pi = 0.51$, $aM_\rho = 0.68$ and $aM_N = 1.02$; hence, at the largest fermion masses discretization effects could be significant for the pions as well. We will briefly mention evidence for scaling violations in the determination of the gradient flow scale $t_0$ in Sec. III E. The only other information on lattice spacing effects comes from pion taste splittings. The masses of taste-split pions, which were measured only on the seven ensembles with bare masses (3.2), will be considered in Sec. III D.

18 The pions are too heavy for the ρ to decay.
19 Correlations between $aM_\pi$ and $aF_\pi$ on each ensemble are not available. We note that, in Ref. [33], we found that these correlations are small in the LSD data.

B. Window fits
We begin with fitting M 2 π /F 2 π and aF π to the predictions of LO dChPT, Eqs. (2.8a) and (2.8b). We consider sets of five successive fermion masses, taking first the lightest five masses from the set (3.1), then the second to the sixth masses, etc., for a total of six quintets. The results are shown in Table 1. 20 All the fits are good. However, the parameter values change with the partial mass range, more than allowed by their errors. In particular, the lowest mass range (fit 1A) and the highest mass range (fit 1F) do not overlap, hence their parameter errors are statistically independent. These fits are thus not consistent with each other. A simultaneous fit of LO dChPT to all ten masses has a p-value of order 10 −11 . Clearly, the whole KMI mass range cannot be fit to LO dChPT.
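The bookkeeping of the window fits can be made explicit with a short Python sketch. The ensembles are represented only by their position (1 to 10) in the mass-ordered set (3.1), and the helper name is hypothetical. The sketch enumerates the six quintets and identifies which pairs of windows share no ensemble.

```python
def sliding_windows(items, width):
    """All runs of `width` successive items, in order."""
    return [items[i:i + width] for i in range(len(items) - width + 1)]

# Positions of the ten ensembles, ordered from lightest to heaviest fermion mass.
ensembles = list(range(1, 11))
windows = sliding_windows(ensembles, 5)
labels = ["1" + chr(ord("A") + i) for i in range(len(windows))]

# Pairs of windows without a common ensemble have statistically independent fits.
disjoint_pairs = [
    (labels[i], labels[j])
    for i in range(len(windows))
    for j in range(i + 1, len(windows))
    if not set(windows[i]) & set(windows[j])
]

for label, window in zip(labels, windows):
    print(label, window)
print("disjoint pairs:", disjoint_pairs)
```

With ten ensembles and windows of five, the lowest and highest windows form the only disjoint pair, which is why fits 1A and 1F provide the cleanest consistency test.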
As dChPT admits a systematic expansion, the failure to describe a set of data at LO means that higher orders in the expansion are needed. However, already at LO, dChPT contains more parameters than ordinary ChPT. Depending on the observables being fitted, many more would be needed for an NLO fit. We believe that much better data is required for a meaningful NLO fit. As discussed in Sec. III A, the LSD and KMI data sets both contain only a single lattice spacing, leaving discretization errors as an uncontrolled source of systematic uncertainty. In addition, it may well be that more refined data, for additional bare masses and/or with smaller statistical errors, would be needed to determine all the parameters in the NLO fit.

C. Fits with a varying γ m
Being unable to carry out a full NLO fit at present, we are left with the option of partially extending LO dChPT by exploring different "directions" in "higher-order parameter space." By its very nature, no such extension is fully systematic, and each extension should thus be considered a model. Our assumption is that our model, γ-dChPT, captures the relevant physics better than other extensions of LO dChPT.
As we have discussed in Sec. II B, the physical mechanism that underlies the behavior of the LSD data is hyperscaling. The KMI mass range is higher than the LSD one, which motivates us to consider a minimal modification of this physical picture. We assume that the KMI mass range is still governed by the same principles that produce hyperscaling in the LSD mass range, except that, because of the diminishing influence of the IRFP, we now have to allow the mass anomalous dimension to vary. That consideration has led us to the framework of γ-dChPT, developed in Sec. II C.
In this subsection, we will thus consider fits of the KMI data to γ-dChPT. Specifically, we consider fits of M 2 π /F 2 π and aF π to Eqs. (2.50) and (2.8b), where γ m is quadratic in v, cf. Eq. (2.46). We begin with a technical issue. The independent variable in these equations is v, which, in turn, can be determined in terms of am 0 using Eq. (2.47). However, unlike in LO dChPT discussed in Sec. II A, Eq. (2.47) cannot be analytically inverted. 21 Instead, in addition to the parameters defining the γ-dChPT lagrangian, we introduce new parameters v i , one per ensemble. 22 We fit the corresponding bare mass am 0,i to Eq. (2.47), while simultaneously also fitting (M 2 π /F 2 π ) i and (aF π ) i , all as functions of the same parameter v i . Artificially introducing a tiny error for am 0,i , the fit in effect solves Eq. (2.47) numerically for v i in terms of am 0,i . Thus, for given values of the γ-dChPT parameters, v i is equal to v(am 0,i ) with numerical precision set by the "error" of the "data" am 0,i . We have varied the errors on am 0,i between 10 −6 and 10 −7 , finding no discernible differences in the results of our fits. χ 2 values remain equal to four decimal places, whether one includes the "am 0 part" in the computation of χ 2 or not.
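The role of the auxiliary parameters v i can be illustrated with a toy example. In the Python sketch below, `toy_mass_of_v` is a hypothetical monotonic stand-in for Eq. (2.47), not the actual expression, and the inversion is done by bisection rather than inside a χ 2 minimization. The point is only that, once am 0,i is assigned a tiny "error", the corresponding χ 2 term forces v i to the numerical solution of m(v i ) = am 0,i and then contributes negligibly to the total χ 2 .

```python
import math

def toy_mass_of_v(v, scale=1.0e-3, k=1.5):
    """Hypothetical monotonic stand-in for Eq. (2.47); not the paper's formula."""
    return scale * v * math.exp(k * v)

def solve_v(m_target, mass_of_v, lo=1.0e-9, hi=10.0, iterations=100):
    """Invert a monotonically increasing m(v) by bisection."""
    if not mass_of_v(lo) <= m_target <= mass_of_v(hi):
        raise ValueError("target mass is outside the bracketing interval")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mass_of_v(mid) < m_target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def mass_penalty(v, m_data, sigma):
    """The 'am_0 part' of chi^2 for one ensemble, with an artificial error sigma."""
    return ((m_data - toy_mass_of_v(v)) / sigma) ** 2

am0 = 0.05                       # one bare mass, used as "data"
v_i = solve_v(am0, toy_mass_of_v)
print(f"v_i = {v_i:.6f}")
print(f"penalty at the solution = {mass_penalty(v_i, am0, 1.0e-6):.3e}")
print(f"penalty at v_i + 0.001  = {mass_penalty(v_i + 1.0e-3, am0, 1.0e-6):.3e}")
```

Because a shift of v i by only 0.001 already costs a very large penalty in this toy setting, a minimizer has no freedom to move v i away from the solution, which is consistent with the observation that the fit results do not depend on the size of the artificial error.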
As in Ref. [33], we can calculate (aB π ) i on each ensemble using Eq. (2.49) and our fit result for v i . In all cases studied in this paper the so-obtained values of (aB π ) i are equal within error. This confirms the self-consistency of our assumption that the lattice spacing a is independent of the fermion mass.
The results of our fits are shown in Table 2. Fit 2A includes all ten ensembles, fit 2B leaves out the am 0 = 0.1 ensemble, and fit 2C leaves out both am 0 = 0.1 and 0.08. All the fits are good, but fits 2B and 2C are better than fit 2A. We also carried out fits setting c = 0, i.e., taking γ m in Eq. (2.46) to be a linear function of v. Fits with c = 0 including all ten ensembles, or omitting the am 0 = 0.1 ensemble, have very low p-values, 0.001 and 0.01, respectively. We do not show them in the table. However, if we omit both the am 0 = 0.1 and 0.08 ensembles, we obtain fit 2D, which is a good fit. The parameters f π , d 1 and log d 2 are relatively stable between the fits with c as a free parameter, and fit 2D, where c = 0. By contrast, the parameters defining the function γ m change substantially: Fit 2D yields much smaller values for both γ 0 and b than the other fits of Table 2.
The results of fits 2B and 2D are shown in Fig. 1. The black points are data that were included in the fits, whereas the magenta points were excluded. The lower left panel shows that if we simplify our ansatz for γ m to be linear in v, then the am 0 = 0.08 and 0.1 ensembles must be excluded.

21 In principle, the formal inverse function m = m(v) may not be single valued. In practice, we found that v is monotonically increasing with m over the entire KMI mass range.
22 The total number of parameters increases by the number of v i parameters, i.e., by the number of ensembles included in the fit. The number of data increases by the same amount (the am 0,i ), leaving the number of degrees of freedom unchanged.
We have proposed in Sec. II C that the exponential factor $e^{-F(v)}$ may originate from a resummation of the dominant contributions from all orders in the expansion in $n_f - n_f^*$. According to the hypothesis (2.43), b is an NLO parameter, while c is an NNLO parameter. One way to test this scenario is to examine the effect of truncating the Taylor expansion of the exponential factor. The range of values we find for v in the fits to the KMI data is 1.5 ≤ v ≤ 2.5. Considering first fit 2B, we can compare the numerical values of $\exp\left(\frac{1}{2} b v^2 - \frac{1}{3} c v^3\right)$ and its version truncated at NNLO, namely $1 + \frac{1}{2} b v^2 + \frac{1}{8} b^2 v^4 - \frac{1}{3} c v^3$. When we vary v from 1.5 to 2.5, the exponential ranges from 3.4 to 15, and its truncated version from 3.4 to 13. The differences at the two endpoints (taking the correlations into account) are −0.05(8) and 2(4), respectively, so that the exponential and truncated forms are consistent with each other. The situation is somewhat different for fit 2D, where the smallness of both b and its relative error allows for a more precise comparison. Varying v again from 1.5 to 2.5, $\exp\left(\frac{1}{2} b v^2\right)$ varies from 1.15 to 1.48, while the expansion to NLO, $1 + \frac{1}{2} b v^2$, varies from 1.14 to 1.39. The (correlated) differences are 0.010(4) and 0.09(3), respectively. Thus, while the behavior of both forms is qualitatively similar, the differences are statistically significant. Fits with the truncated version give results consistent with fits 2B and 2D, but with lower p-values.
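The comparison for fit 2D can be reproduced approximately with a few lines of Python. The value b = 0.125 used below is an assumption: it is not quoted from Table 2, but inferred from the range 1.15 to 1.48 of the exponential given in the text, and it carries no error or correlation information. The function names are likewise hypothetical.

```python
import math

def resummed(v, b, c=0.0):
    """Exponential factor exp(b v^2 / 2 - c v^3 / 3)."""
    return math.exp(0.5 * b * v ** 2 - c * v ** 3 / 3.0)

def truncated_nlo(v, b):
    """Expansion of the exponential to NLO (first order in b, with c = 0)."""
    return 1.0 + 0.5 * b * v ** 2

def truncated_nnlo(v, b, c):
    """Expansion to NNLO (second order in b, first order in c)."""
    return 1.0 + 0.5 * b * v ** 2 + b ** 2 * v ** 4 / 8.0 - c * v ** 3 / 3.0

b_assumed = 0.125  # illustrative value inferred from the quoted range, not a fit result
for v in (1.5, 2.5):
    full = resummed(v, b_assumed)
    nlo = truncated_nlo(v, b_assumed)
    print(f"v = {v}: exp = {full:.2f}, NLO = {nlo:.2f}, difference = {full - nlo:.3f}")
```

With this assumed value, the exponential takes the values 1.15 and 1.48 at the endpoints and the NLO truncation 1.14 and 1.39, matching the numbers quoted above. Whether the differences are significant depends on the correlated errors of the fit, which this sketch does not model.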
Without more data it is difficult to decide which fit in Table 2 is the preferred one. Clearly, unless the two heaviest masses are dropped, c must be kept in the fit. Given its (conjectural) role as an NNLO parameter, it is to be expected that eventually c will be needed to describe the data as the mass range is increased. Still, we cannot rule out that the main reason why fit 2D does not accommodate the two heaviest masses is large scaling violations at those mass values.
In all fits where the parameter c is present, it is always small compared to b, consistent with the conjectured hierarchy (2.43). However, in the same fits, one cannot say that b is small compared to γ 0 . By contrast, in fit 2D, where c = 0, b is also clearly small compared to γ 0 . The most appealing scenario thus appears to be the following. We exclude the two largest fermion mass values, because they require going to (at least) NNLO in the EFT expansion, and/or because they are afflicted by too large scaling violations. The remaining mass range may be amenable to an NLO dChPT fit, 23 for which fit 2D is our closest substitute.

D. Taste splittings
We now turn to fits which also include the taste splittings (2.52), i.e., fits of M 2 π /F 2 π , aF π and ∆ A,T,V,S to γ-dChPT, augmented by Eq. (2.53). Our fits are limited to the smaller ensemble set (3.2), where the taste-split pion masses were measured.
We show five different fits in Table 3. Fit 3A includes all the parameters: the basic γ-dChPT parameters of Sec. III C, namely f π , d 1 , log d 2 , γ 0 , b and c, as well as all eight taste-splitting parameters of Eq. (2.53). Data from all seven ensembles in the set (3.2) are included in the fit. The p-value is very high. The results for the six basic γ-dChPT parameters are consistent with fit 2B. 24 As for the taste-splitting parameters, most of them, namely, γ 1,3,6 and log C 1,3,6 , are not well determined by the fit. We conclude that fit 3A gives an excellent description of the data, but the data are not precise enough to determine all parameters in the fit.

Table 3. Fits of M 2 π /F 2 π , aF π and taste splittings to γ-dChPT. The "omitted" row shows bare mass values from the set (3.2) which are not included in the fit, if any. For description see text.
We next consider fits omitting poorly determined parameters. Among the taste-splitting parameters, only log C 4 and γ 4 were determined with good precision. As for C 1 , C 3 and C 6 , if we take their errors seriously, using them as 1σ bounds, these parameters are "allowed" to be very small relative to C 4 (by factors ∼ 2 × 10 3 , ∼ 10 and ∼ 10 5 , respectively). Setting C 1 = C 3 = C 6 = 0, we obtain fit 3B. This is a good fit, even though its p-value is much smaller than that of fit 3A, as one would expect. The results of fits 3A and 3B are in very good agreement. The dominance of the taste splittings generated by the C 4 E(γ 4 ) term is consistent with the results we obtained for the LSD data [33], as well as with the familiar taste splittings found in QCD.
In Sec. III C we saw that the parameter c can be omitted if the fermion masses am 0 = 0.1 and 0.08 are not included in the fit. While am 0 = 0.08 is present in the ensemble set (3.2), we also repeated fits 3A and 3B while setting c = 0, obtaining fits 3C and 3D, respectively. Finally, fit 3E is similar to fit 3D, except that the am 0 = 0.08 ensemble is not included. Fit 3C, where we set c = 0 but keep all the taste-splitting parameters, is very good. Setting both c = 0 and C 1 = C 3 = C 6 = 0 leads to a relatively low p-value in fit 3D. After dropping the am 0 = 0.08 ensemble, in fit 3E the p-value is again very high.
Our results forf π ,d 1 , log d 2 are fairly consistent in all the fits reported in Tables 2 and 3 determined) values of the remaining taste splitting parameters are consistent between fits 3A and 3C.
In fit 3C, LO dChPT has been minimally extended (within the framework of γ-dChPT) to include an NLO correction to the function γ m . This fit gives an excellent description of the ensemble set (3.2) with taste splittings included; the parameter c is not needed. We thus consider fit 3C to be the preferred fit from Table 3. We plot the taste splittings of this fit in Fig. 2. A caveat is that, even though all the taste-split pion masses were measured in Ref. [4], the data are not precise enough to determine all taste-splitting parameters. 25 We recall that the QCD taste splittings are essentially independent of the fermion mass [34,40]. 26 By contrast, as for the LSD data [33], also in the KMI mass range the taste splittings vary with the fermion mass. This behavior can be successfully described in dChPT, where the scale dependence of the taste-breaking operators gives rise to mass dependent tree-level taste splittings, through the factors E(γ i ) in Eq. (2.53).

E. Scale dependence of γ m
The anomalous dimension function γ m obtained from two of the fits of Table 2 is shown in Fig. 3. The blue band represents fit 2B, where γ m = F (v) is quadratic in v (Eq. (2.46)), while the magenta band represents fit 2D, where γ m is linear in v. With Eq. (2.8b), we take the argument of γ m to be v = log(aF π /af π ), and then plot γ m as a function of aF π . The two γ m functions agree well in most of the interval containing the fitted data, 0.045 ≲ aF π ≲ 0.12. The good agreement deteriorates towards the lower end of the interval, below which these functions diverge from each other. If we were to overlay the (constant) results of each window fit from Sec. III B as a set of horizontal bands (each stretching over its corresponding range of aF π ), these bands would be consistent with the blue and magenta bands in that interval. 27

Figure 3. The running mass anomalous dimension γ m , obtained from fit 2B (blue band) and 2D (magenta band), plotted as a function of aF π (see text). The gray horizontal band is γ * = 0.936 ± 0.019, from our fit to the LSD data [33]. The fitted KMI data have values of aF π between 0.045 and 0.12.

Figure 3 also shows the value γ * = 0.936(19) obtained from our fits of the LSD data to LO dChPT [33], as a gray horizontal band. The LSD mass range is lower than the KMI range, and the (generalized) hyperscaling behavior we have observed implies that the LSD range of F π should also be lower than the corresponding KMI range, in physical units. Equivalently, the LSD values of aF π , properly converted to KMI lattice units, should lie to the left of the KMI range of aF π in Fig. 3.
Since the LSD data is successfully described by a constant γ m = γ * , we expect that also in the chiral limit γ m will remain constant, at a value consistent with γ * . The continuity of γ m as a function of F π thus requires that, as F π is lowered from the KMI range into the LSD range, γ m will rise to a value consistent with γ * , and then stay roughly constant all the way to the chiral limit. It is intriguing that the strong dynamics of the N f = 8 system might induce this behavior of γ m . 28 Fig. 3 shows that, when extrapolated below the KMI range, the quadratic γ m of fit 2B overshoots γ * , while the linear γ m of fit 2D undershoots it. The desired behavior of γ m over the combined KMI and LSD ranges cannot be described by simple ansatzes such as the ones we have used. One cannot rule out, however, that the combined LSD and KMI mass ranges could be described by including higher orders in dChPT systematically.
Clearly, an investigation of the combined LSD and KMI mass ranges would be extremely interesting. However, this is just not possible with the existing data sets. We already pointed out that the LSD and KMI data sets were each produced at a single lattice spacing. Moreover, the lattice actions used by LSD and by KMI differ in their details, and scaling violations can potentially differ significantly between the two lattice actions and axial currents. This means that the only way to reliably compare these results is by first taking the continuum limit separately for the LSD lattice action and for the KMI lattice action. The minimal requirement to make this possible is a second set of data at a different lattice spacing, for each lattice action. 29 We have attempted a comparison of the LSD and KMI lattice scales, using t 0,ch , the chiral-limit value of the gradient-flow scale t 0 [42], which we have determined for the LSD data set in Ref. [33]. The comparison is deficient for several reasons. First, unlike in ordinary ChPT [43], dChPT does not predict the behavior of t 0 as a function of the fermion mass [33], so the best we can do is a phenomenological fit. Second, usually the gradient flow scale (or its chiral limit) is used to compare the lattice spacings of ensembles generated with different bare couplings, but with the same lattice action. By contrast, here we are comparing results obtained using two different lattice actions, hence the meaning of the comparison is less clear. Finally, there are also scaling violations in the lattice observables used to extract t 0 , as well as in the gradient-flow equation. KMI used two lattice definitions for t 0 which should agree in the continuum limit, but which consistently differ by some 15% over the entire KMI mass range; we do not have equivalent information about uncertainties associated with the LSD data. With all these caveats in mind, our findings suggest that the ratio r = a(KMI)/a(LSD) is smaller than one. 
Using Eq. (3.1) together with Eq. (4.5) below, it follows that the KMI mass range is indeed higher than the LSD mass range, in agreement with the physical picture reflected in Fig. 5 of Ref. [2]. But, we are unable to turn this conclusion into a more quantitative statement.
We close this section with a comment. As discussed above, our experimentation with t 0 (and its chiral extrapolation) suggests that r < 1. Now, an alternative way to estimate r would be to take advantage of the fact that f π , the chiral-limit value of the pion decay constant, is a physical observable. Expecting √2 f π (LSD) ≈ f π (KMI) in physical units, 30 it follows that af π (KMI)/(√2 af π (LSD)) ≈ r. The reason why we only expect an approximate equality between √2 f π (LSD) and f π (KMI) is the different scaling violations of the two lattice actions. In reality, using the value of af π (LSD) from Ref. [2], and taking af π (KMI) ∼ 0.01, we find af π (KMI)/(√2 af π (LSD)) ∼ 10, in stark conflict with the estimate r < 1 obtained from the gradient flow scale. It is unlikely that scaling violations per se can account for this inconsistency. The problem must be related to the long extrapolation to the chiral limit inherent in the extraction of af π . It does not necessarily imply that (γ-)dChPT cannot be trusted. The factor e v(m) = F π /f π is very sensitive to m, which makes a long extrapolation to the chiral limit much more difficult than in the case of QCD. For at least one of the data sets our fit result for af π is likely to contain a large, and unaccounted for, source of systematic error. A comparison of the values of d 2 obtained from the two data sets reveals a similar problem, which presumably has a similar source, given that d 2 = f 2 π /(2B π ). We comment that in order to compare aB π between the LSD and KMI lattice scales we have to apply an RG transformation, but once again, it is hard to see how such a transformation would suffice to match the values of d 2 found in the two simulations.

29 To make sure that the same physical mass range is covered, one can, for example, monitor the values of some observable, such as a hadron mass or a decay constant, in units of √t 0 .
30 The factor of √2 is due to different normalization conventions.

IV. THE ∆ CLASS OF DILATON POTENTIALS
So far, we have considered a model modification of the LO dChPT form of L m , based on the observation that the coupling of the underlying theory may start running at the physical scale determined by a growing fermion mass, thereby inducing a varying mass anomalous dimension as well. In this section we turn to a class of modifications to the dilaton-potential term L d . Alternate forms of the dilaton potential were first applied to the LSD data in Ref. [28]. In Ref. [32] a class of dilaton potentials L ∆ was proposed, defined by (compare Eq. (2.6)) where ∆ is a new free parameter. 31 In the limit ∆ → 4, the potential L d of Eq. (2.6a) is recovered. For ∆ = 2, L ∆ becomes the linear σ-model potential considered in Ref. [8]. We will refer to the low-energy lagrangian with L d replaced by L ∆ as ∆-dChPT. Applying ∆-dChPT to the LSD data, Ref. [32] concluded that these data appear to favor a value of ∆ around 3.5, with a large uncertainty. Correlations in these data were not taken into account [32]. Moreover, correlations which occur because of the appearance of F π in all three equations fitted in Ref. [32], as well as the appearance of M π in two of them, apparently were not taken into account either. In Sec. IV A we begin by collecting the expressions needed to fit ∆-dChPT. In Sec. IV B we revisit the determination of ∆ using the LSD data, taking all correlations into account. This analysis departs from the framework of LO dChPT (Sec. II A) only by replacing the dilaton potential L d by L ∆ . At this stage the mass anomalous dimension is held fixed, cf. Eq. (2.2b). Then, in Sec. IV C, we explore fits of the KMI data to the ∆ class of potentials. As in the previous section, we consider both fixed-γ m fits to subsets of the KMI data, as well as fits with a varying γ m to the entire KMI data set. We summarize our findings in Sec. IV D.

31 We have translated the notation of Ref. [32] to our notation.
Unlike the modification of L m to accommodate a running γ m , we are not aware of a concrete physical motivation to replace L d by the more general form L ∆ . A closely related question is whether or not ∆-dChPT is the leading order in a systematic low-energy expansion for an arbitrary value of ∆.
The potential L d , Eq. (2.6a), is based on the systematic power counting developed in Ref. [17]. Since L d corresponds to the limit ∆ → 4 in Eq. (4.1), it follows by continuity that there must exist a neighborhood of ∆ = 4 where the dChPT systematic expansion is still applicable. For arbitrary ∆, a power counting was proposed in Ref. [32]. We prove in App. A that the arguments given in Ref. [32] are not correct. ∆-dChPT, i.e., the low-energy lagrangian consisting of Eq. (II A) with L d replaced by L ∆ , should thus be considered to be a model.
It is then straightforward to derive the relations where γ m is given in Eq. (2.45), andd 1 is defined in Eq. (2.48).
We now turn to fits of the LSD and KMI data, in order to explore to what extent they constrain the value of ∆. We emphasize again that this investigation is empirical, as no systematic power counting is available for this model for arbitrary values of ∆.

B. The LSD data
Data reported in Ref. [3] include results at five different fermion masses,

$$a m_i \in \{0.00125,\ 0.00222,\ 0.005,\ 0.0075,\ 0.00889\} . \qquad (4.5)$$

All ensembles have the same bare coupling, and, in a mass-independent scheme, the same lattice spacing [33]. We fitted the LSD data to LO dChPT in Ref. [33]. Here, we repeat some of those fits replacing L d by L ∆ , keeping ∆ as a free parameter. Our results are shown in Table 4. These fits correspond to four fits presented in Ref. [33]: Fits 4A and 4B are to be compared to the fits shown in Table 1 of Ref. [33], while fits 4C and 4D are to be compared with the third column of Table 3 and the second column of Table 4 in Ref. [33].
As discussed in great detail in Ref. [33], it is not possible to fit all parameters in the taste-breaking sector with the available LSD data. Here we kept those taste-breaking parameters that gave rise to the best fits of Ref. [33]. Furthermore, in Ref. [33] we argued that four-ensemble fits, which exclude the ensemble with the largest fermion mass, are better behaved. While the five-ensemble fits reported in Table 4

11.6(9) 12.9(2.5) 11.3(7) 12.6(2.1) d 3 17(9) 9(9) 20(8)

Parameter values for γ * and log d 0 are in good agreement with the corresponding fits in Ref. [33]. The parameters d 1 and d 3 are very poorly determined by the fits, especially by those with four ensembles. This is no surprise, as d 1 and d 3 relate directly to the dilaton potential L ∆ , in which now a new parameter, ∆, has been introduced. The results for the taste-breaking parameters are in reasonable agreement with Ref. [33] for the five-ensemble fit, and in good agreement for the four-ensemble fit. By holding ∆ fixed in the fit, we verified that in the limit ∆ → 4 the results of Ref. [33] are reproduced.
The parameter ∆ itself is reasonably well determined by each fit. However, there is a visible difference between the four-ensemble and five-ensemble fits. From the four-ensemble fits, we conclude that ∆ = 3.5(7). This is consistent with the hypothesis that dChPT, which predicts ∆ → 4, is the correct low-energy EFT. The linear σ-model value, ∆ = 2, is disfavored. By contrast, the values found in the five-ensemble fits average to 2.8(7). This is 1.7σ away from ∆ → 4, and, in fact, between the two options, it slightly favors the linear σ-model value.
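The statements about how strongly each value of ∆ is favored follow from simple arithmetic, which the Python sketch below makes explicit. The helper name `pull` is our own; the inputs are the two results 3.5(7) and 2.8(7) quoted above, compared to the reference values ∆ = 4 and ∆ = 2.

```python
def pull(value, error, reference):
    """Distance between a fit result and a reference value, in units of the error."""
    return abs(value - reference) / error

results = {"four-ensemble": (3.5, 0.7), "five-ensemble": (2.8, 0.7)}
references = {"dChPT (Delta -> 4)": 4.0, "linear sigma model (Delta = 2)": 2.0}

for fit_name, (value, error) in results.items():
    for ref_name, ref in references.items():
        print(f"{fit_name}: {pull(value, error, ref):.1f} sigma from {ref_name}")
```

The four-ensemble result lies well within 1σ of ∆ = 4 and roughly 2σ from ∆ = 2, whereas the five-ensemble average is 1.7σ from ∆ = 4 and only slightly more than 1σ from ∆ = 2.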

C. The KMI data
We next turn to fits of the KMI data, with L ∆ replacing L d . We first consider again window fits similar to those of Table 1, but now with ∆ an additional free parameter. The results are reported in Table 5. The other parameters are generally consistent between Tables 5 and 1. As before, a constant γ m is not sufficient to describe the KMI data over the full mass range. However, while γ * varies with the mass range selected in the fit, ∆ does not. If we compare the values of ∆ between two of the fits in Table 5, these values are always consistent within the smaller of the two errors (with the exception of the second fit, for which ∆ has an anomalously small error). The first and last values, 3.8(5) and 4.0(6), coming from the lowest and highest mass ranges, are statistically independent, in agreement with ∆ = 4 and with each other.

As in Sec. III, our next step is to consider fits to all, or most, of the KMI data, with L m of Eq. (2.42), and a varying γ m as defined in Eq. (2.46). As before, this introduces two more parameters (b and c) into the fits, for a total of seven parameters. We will refer to this flavor of the low-energy lagrangian as γ∆-dChPT.
In Table 6 we show a scan in ∆: at each chosen value of ∆, we fit the other six parameters. The fit for ∆ = 3.9999 coincides with fit 2A, as one would expect. If we decrease ∆, we find that the p-value rapidly decreases, dipping below 0.01 for ∆ < 3.8. We verified that the p-value keeps decreasing down to ∆ = 2 (where the p-value is of order 10 −30 ). If we increase ∆ above 4, the p-value increases until ∆ reaches 4.5, where the p-value appears to start decreasing again. However, we found that fits with ∆ ≥ 4.5 become very difficult. This is reflected in the very large errors in the six fit parameters: for ∆ = 4.5, essentially all of them are not determined by the fit. We have repeated the fits of Table 6 omitting the am 0 = 0.1 ensemble, or the am 0 = 0.1 and 0.08 ensembles, and we have also redone such fits setting c = 0 (as in fit 2D). The conclusions are always the same as for the fits shown in Table 6. The fit at ∆ = 3.9999 is consistent with the corresponding fit in Table 2; values of ∆ below roughly 3.8 are strongly disfavored; and the fit starts to deteriorate at ∆ = 4.5. If we attempt to include ∆ as a parameter in the fit itself (instead of scanning over ∆) fits appear to be unstable.
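The logic of a scan of this kind, in which one parameter is held fixed on a grid while the remaining parameters are fitted at each grid point, can be sketched with a toy model in Python. Everything in the example is hypothetical: the model y = A x^∆ is a stand-in chosen because the amplitude A can be fitted analytically at fixed ∆, and the synthetic data are generated without noise at ∆ = 4. It does not implement the γ∆-dChPT fit functions.

```python
def best_amplitude(xs, ys, sigmas, delta):
    """Analytic least-squares amplitude A for the toy model y = A * x**delta."""
    num = sum(y * x ** delta / s ** 2 for x, y, s in zip(xs, ys, sigmas))
    den = sum(x ** (2.0 * delta) / s ** 2 for x, s in zip(xs, sigmas))
    return num / den

def profile_chi2(xs, ys, sigmas, delta):
    """chi^2 at fixed delta, minimized over the remaining parameter A."""
    amp = best_amplitude(xs, ys, sigmas, delta)
    return sum(((y - amp * x ** delta) / s) ** 2 for x, y, s in zip(xs, ys, sigmas))

# Synthetic, noise-free toy data generated with A = 2 and delta = 4.
xs = [1.5, 1.75, 2.0, 2.25, 2.5]
ys = [2.0 * x ** 4.0 for x in xs]
sigmas = [0.01 * y for y in ys]  # 1% errors, chosen arbitrarily

grid = [3.6, 3.8, 4.0, 4.2, 4.4]
scan = {delta: profile_chi2(xs, ys, sigmas, delta) for delta in grid}
best_delta = min(scan, key=scan.get)

for delta in grid:
    print(f"delta = {delta}: chi2 = {scan[delta]:.3f}")
print("minimum at delta =", best_delta)
```

Scanning has the practical advantage that each individual fit has one parameter fewer, which can help when a direct fit with the scanned parameter left free is unstable.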
Given the difficulty fitting the KMI data with the L ∆ potential, we have not attempted to include taste splittings in the KMI case.

D. Discussion
Taking the fits of the LSD and KMI data together, it is clear that no very precise statement about the value of ∆ can be made. The KMI data appear to exclude the σ-model value ∆ = 2. dChPT, which corresponds to ∆ → 4 with fixed γ m = γ * , is consistent with the fits shown in Tables 4 and 5. An exception is the second window fit, fit 5B, which yields a result with a rather small error, ∆ = 4.4(1). But clearly, this result does not account for the variation of ∆ across all fits shown in Tables 4, 5 and 6.
Our results are consistent with those of Ref. [32]. The main difference is that the KMI data, which were not considered in Ref. [32], present a much stronger lower bound on ∆.
As we show in the appendix, for values of ∆ not close to 4, no power counting exists for the low-energy theory with L d replaced by L ∆ of Eq. (4.1). However, we do not wish to imply that attempts to understand data in terms of models are not interesting. Fits to models, including ∆-dChPT (with ∆ not constrained to be close to 4), can provide a valuable "stress test" of dChPT. This is why we considered fits of the LSD and KMI data to ∆-dChPT; Ref. [32] can be seen as a similar exploration of only the LSD data.
Fits of the LSD data, comparing in particular the values ∆ = 2 and ∆ → 4, were considered also in Ref. [8]. 32 There, it was found that both dChPT and ∆-dChPT with ∆ = 2 provide good fits to data using all five of the LSD ensembles. This finding agrees with our fits in Table 4: fits 4A and 4C are consistent with ∆ = 2, but are less than ∼ 2σ away from ∆ = 4.
In summary, a precise determination of the favored value of ∆ is not possible with presently available data. Taking the results based on fits to both the LSD and KMI data together, we arrive at an estimated range for ∆,

$$3.5 < \Delta < 4.5 . \qquad (4.6)$$

Our lower bound is based on the four-ensemble fits to the LSD data, which favor a value around ∆ ∼ 3.5, combined with the γ∆-dChPT scan of Table 6, which strongly disfavors values below 3.8. Any fit of the KMI data set must somehow account for the running of γ m . Including higher orders systematically is not an option here, because, as we prove in the appendix, the claim of Ref. [32] that ∆-dChPT admits a systematic expansion is incorrect. The model alternatives are to use a fixed value of γ m while limiting the mass range as in the "window" fits, or else to use an explicitly varying γ m function. As for the window fits, Table 5 shows that ∆ is rather insensitive to the mass range in the fit. Also, while both the five-ensemble fits to the LSD data, and some of the window fits to the KMI data allow for ∆ < 3.5, the fits of Table 6 to the KMI data strongly disfavor ∆ < 3.8. Based on all fits together, the σ-model value ∆ = 2 appears to be excluded. Once again, the caveats discussed in the previous section regarding the LSD and KMI data sets, and, in particular, the lack of information about scaling violations, apply also to our conclusions in this section.

V. CONCLUSION
Our main goal in this paper was to confront the EFT framework provided by dChPT with the KMI data for the eight-flavor SU(3) gauge theory [4]. The KMI simulations were performed at larger fermion masses than the LSD ones [3], taking the theory further away from conformality. Hence, even with the successful application of LO dChPT to the LSD data, which we reported on in Ref. [33], there is no guarantee that LO dChPT can also be applied to the KMI data.
Indeed, we found that the full fermion-mass range of the KMI data cannot be fitted to LO dChPT. The natural next step would be to attempt an NLO fit in dChPT. However, as we explained in Sec. III, this is not feasible with presently available data. First, the large number of parameters involved in any NLO dChPT fit requires extensive precision data for a successful fit. Moreover, the KMI data set (and, likewise, the LSD data set) has only a single lattice spacing, making a continuum extrapolation impossible.
Instead, we introduced γ-dChPT, a model extension of LO dChPT with a scale-dependent mass anomalous dimension, which can be interpreted as arising from partially resumming higher orders in the EFT expansion. We found that γ-dChPT provides a successful description of the KMI data over the entire mass range.
Given the success in describing the LSD data using LO dChPT [33], and the KMI data using γ-dChPT with a relatively simple ansatz for the γ m function, the question arises whether γ-dChPT can be used to fit the LSD and KMI data simultaneously. Over the KMI mass range, γ m would then have to increase as the fermion mass is decreased, eventually saturating to a constant when reaching the lower LSD mass range (see Fig. 3). Once again, however, the inability to take the continuum limit makes it impossible to carry out this program at this time. The lack of information on the lattice spacing dependence is even more severe when trying to consider the LSD and KMI data sets together, because they were produced with different lattice actions, and thus, their scaling violations for any given physical observable are different functions of the corresponding lattice spacing.
We also considered ∆-dChPT-another generalization of LO dChPT in which the dilaton potential is replaced by a class of potentials depending on a new parameter ∆. We emphasize that ∆-dChPT does not allow for a systematic power counting, and should thus be considered a model, except in the limit ∆ → 4 where dChPT is recovered. ∆-dChPT was applied to the LSD data before [32], where it was found that it is difficult to determine the parameter ∆ from these data. We confirmed this result, but found that the KMI data allow us to better constrain the value of ∆. We used both the "window" fits in which ∆-dChPT is applied to subsets of the KMI ensembles, as well as a combination of the two extensions of LO dChPT, with the ∆ class of dilaton potentials together with a varying γ m . We concluded that the preferred range of our combined analysis of the LSD and KMI data is 3.5 < ∆ < 4.5. This is centered around ∆ = 4, where ∆-dChPT reduces to LO dChPT.
Recently, LO dChPT has also been successfully applied to the light sector of the SU(3) gauge theory with four light and six heavy flavors [10]. dChPT provides for a systematic treatment of the pNGBs, the pions and the dilaton, of a near-conformal gauge theory, but it does rest on certain assumptions [17,20]. These initial successes are thus encouraging. We hope that, in the future, more extensive and refined data will become available, allowing for further and more stringent tests of dChPT.
The first two terms in the expansion reproduce the LO potential V d (τ ). 36 It follows that, for any fixed value of ∆ such that |∆ − 4| ∼ |n f − n * f | η with any η > 0, the V ∆ (τ ) potential will inherit the power counting of dChPT. The same is not true for values of ∆ not close to 4, and thus, it is also not true for the low-energy lagrangian in which ∆ is treated as a free parameter.