Pion and Kaon Distribution Amplitudes in the Continuum Limit

We present a lattice-QCD calculation of the pion, kaon and $\eta_s$ distribution amplitudes using large-momentum effective theory (LaMET). Our calculation is carried out using three ensembles with 2+1+1 flavors of highly improved staggered quarks (HISQ), generated by MILC collaboration, at 310 MeV pion mass with 0.06, 0.09 and 0.12 fm lattice spacings. We use clover fermion action for the valence quarks and tune the quark mass to match the lightest light and strange masses in the sea. The resulting lattice matrix elements are nonperturbatively renormalized in regularization-independent momentum-subtraction (RI/MOM) scheme and extrapolated to the continuum. We use two approaches to extract the $x$-dependence of the meson distribution amplitudes: 1) we fit the renormalized matrix elements in coordinate space to an assumed distribution form through a one-loop matching kernel; 2) we use a machine-learning algorithm trained on pseudo lattice-QCD data to make predictions on the lattice data. We found the results are consistent between these methods with the latter method giving a less smooth shape. Both approaches suggest that as the quark mass increases, the distribution amplitude becomes narrower. Our pion distribution amplitude has broader distribution than predicted by light-front constituent-quark model, and the moments of our pion distributions agree with previous lattice-QCD results using the operator production expansion.


I. INTRODUCTION
Meson distribution amplitudes (DAs) φ M hold the key to understanding how light-quark hadron masses emerge from QCD, an important topic of study at a future electron-ion collider [1]. Meson DA are also important inputs in many hard exclusive processes at large momentum transfers Q 2 Λ 2 QCD [2,3]. In such processes, the cross section can be factorized into a short-distance hard-scattering part and long-distance universal quantities such as the lightcone DAs. Unlike the hard-scattering subprocess, which can be calculated perturbatively, the lightcone DAs need to be determined from fits to experimental data or to be calculated nonperturbatively from lattice QCD. Such direct computations have become possible recently, thanks to large-momentum effective theory (LaMET) [4][5][6]. The LaMET method calculates equaltime spatial correlators (whose Fourier transforms are called quasi-distributions) on the lattice and takes the infinite-momentum limit to extract the true lightcone distribution. For large momenta feasible in lattice simulations, LaMET can be used to relate Euclidean quasidistributions to physical ones through a factorization theorem, which involves a matching and power corrections that are suppressed by the hadron momentum [5]. The proof of factorization was developed in Refs. [7][8][9].
For meson DAs, the first lattice calculation of the leading-twist pion DA using LaMET was performed in Ref. [25]. The result favored a single-hump form for the pion DA. The first calculation of the kaon DA was performed in Ref. [84]. The expected skewness was seen in the asymmetry of the kaon DA around the quark momentum fraction x = 1/2. These results were improved by a Wilson-line renormalization that removes power divergences. Also, the momentum-smearing technique proposed in Ref. [92] was implemented to increase the overlap with the ground state of a moving hadron. Despite these improvements, the DAs did not vanish in the unphysical region outside x ∈ [0, 1].
In this paper, we further improve the meson-DA calculation by implementing nonperturbative renormaliza-arXiv:2005.13955v1 [hep-lat] 28 May 2020 tion (NPR) in regularization-independent momentumsubtraction (RI/MOM) scheme. Also, computations are performed with three different lattice spacings and two different pion masses, allowing the continuum extrapolation and chiral extrapolation. Despite these improvements, the contribution in the unphysical region remains. This is largely due to the omission of the long-range tail of the spatial correlator, which is cut off by the finite size of the lattice. To fix this problem, we would need larger hadron momentum instead of a larger lattice volume, because the long-range correlations of the matrix elements increase the undesired mixing with higher-twist operators. Alternatively, we explore the possibility of constraining the DA without the long-range correlation by fitting to a commonly used DA parametrization. The model dependence of the parametrization can be later reduced by using a general set of basis functions, using machine learning to determine the functional form or by combining with other lattice inputs.
The continuum extrapolation performed in this work is relevant to several important questions regarding the LaMET and related approaches. First, how does the quasi-distribution approach avoid the power-divergent mixing of a twist-2 operator with a twist-2 operator of lower dimension as seen in moment calculations? The answer is that this power-divergent mixing is due to the breaking of rotational symmetry on a lattice. When the continuum limit is taken after the correlator is renormalized, rotational symmetry can be recovered as an accidental symmetry. (This is possible because the nonlocal operators used for quasi-distributions are the lowestdimension ones with the same symmetry properties [32].) Hence, power-divergent mixing among twist-2 operators should no longer exist.
Second, the operator product expansion of the equaltime correlators gives rise to twist-2, twist-4 and highertwist contributions. Ref. [46] argued that the matrix element of the twist-4 operator is set by the scale a; hence, its suppression factor compared with twist-2 is O(1/(P z a) 2 ) instead of O(Λ 2 QCD /P 2 z ) with the hadron momentum P z . However, the twist-4 contribution that needs to be subtracted from the quasi-distribution operator can be written as equal-time correlators with two more mass dimensions than the original quasidistribution operator [24]. Hence, they do not cause power-divergent mixings that need to be subtracted before applying RI/MOM renormalization.
Proving the above statements requires a careful analysis of the mixing matrix, which is beyond the scope of this paper. In this work, we check whether the continuum extrapolation of the RI/MOM-renormalized matrix elements is consistent with the absence of power-divergent terms, which, by itself, is a necessary (but not sufficient) condition for the above statements to be true. If there were mixing with lower-dimension operators, the matrix element could still be renormalized, but one might get the undesired lower-dimension operator in the continuum limit instead of the one of interest. However, as discussed above, power-divergent mixing in quasi-distributions was not found in the studies of Refs. [24,32].
The article is organized in the following way. In Sec. II we present the lattice setup of this calculation, and the strategies used to extract the bare matrix elements from lattice DA correlators. Section III shows the NPR procedure, and the continuum and chiral extrapolation of the renormalized matrix elements. The x dependence of DAs are obtained from two approaches: the fit to a functional form and the prediction with a machine learning algorithm. Finally, we summarize the results and future prospects in Sec. IV.

II. LATTICE SETUP
In this work, we extend our previous work on the kaon distribution amplitude from a single a12m310 lattice [84] to 3 lattice ensembles with different lattice spacings and extrapolate the results to continuum. The three ensembles have lattice spacings a = 0.582(4) fm, a = 0.888(8) fm and a = 0.1207 (11) fm with N f = 2 + 1 + 1 flavors of highly improved staggered quarks (HISQ) [93] generated by MILC collaboration [94]. One-step hypercubic (HYP) smearing of the gauge links is applied to improve the discretization effects. We use clover action for the valence quarks with the clover parameters tuned to recover the lowest pion mass of the staggered quarks in the sea [95][96][97]: M π = 319.3(5) MeV, 312.7(6) MeV and 305.3(4) MeV on the three ensembles, respectively. On each lattice configuration, we use multiple sources uniformly distributed in the time direction and randomly distributed in the spatial directions to reach high statistics. We have 24 sources in total for the a06m310 and a09m310 ensembles and 32 sources for a12m310, corresponding to 2280, 5544 and 2912 measurements in total, respectively.
The hadron spectrum (HS) and distribution amplitude (DA) two-point correlators are calculated for different mesons: where M represents different mesons (π, K, η s ), {ψ 1 , ψ 2 } are {u, u} for π, {u, s} for K and {s, s} for η s (only connected diagrams are computed in this work), U ( y, y + zẑ) = z−1 x=0 U z (y + xẑ, t) is the Wilson line connecting lattice site y to y + zẑ, as defined in Ref. [25,84]. The light-quark u and strange-quark s mass parameters used here are from Ref. [98].
The DA matrix element (ME) and ground-state energies of the mesons can be extracted from the HS and DA two-point correlators by a two-state fit to the form: where A M,0 (P ) and E M,0 (P ) are the amplitude and energy, respectively, of ground state of a boosted meson with momentum P z = P while A M,1 (P ) and E M,1 (P ) are for the first excited state. E M,0 (P = 0) is the mass of the meson. We consider the energies to be the same for HS and DA. Therefore, we fit both the HS and DA correlators simultaneously to get the ground-state energy E M,0 (P ) and first excited-state energy E M,1 (P ) of the various momenta P z . The fit range [t min , t max ] is determined by scanning different t to get the smallest χ 2 /dof for all the Wilson-line lengths z and at different P z . When χ 2 /dof for different fit ranges are close to each other, we prefer the smaller-t region where the data is less noisy. Selected effective masses at the largest meson momentum P z ≡ n z 2π L with n z = 4 are shown in Fig. 1 for HS and DA correlators. The bands reconstructed from the fitted parameters agree with the data well. We check the dispersion relation, E M,0 (P ) 2 = E M,0 (P = 0) 2 + c 2 P 2 , where c is the dispersion coefficient (often called "the speed of light"). The dispersion relations for all three mesons on the three ensembles are shown in Fig. 14 of Appendix B. On the two coarser lattices, c is closer to 1 for lighter mesons, and it becomes closer to 1 for finer lattices. On the a06m310 lattice, the c values for all three mesons are consistent with 1.
Two fit strategies are used to extract the ground-state amplitude A M,0 for z = 0 using the ground-state energy E M,0 and excited-state energy E M,1 from the simultaneous fit of the HS and DA correlators at z = 0. One way of doing this is to fix E M,0 at fixed P by simultaneously fitting the HS and z = 0 DA correlators, and obtain fitting parameters of A DA M,0 and A DA M,1 for the real and imaginary corrector and with a common E M,1 . Another way is to fix both E M,0 and E M,1 from z = 0 correlator fit of the same boosted momentum, while fitting the imaginary and real parts of DA correlators simultaneously. To help visualize the resulting ground-state amplitude A M,0 from different fit strategies, we multiply the DA two-point correlators by e E M,0 t .
which should goes to A M,0 when t → ∞. The reconstructed bands of this quantity are shown in Fig. 2 from different fit strategies for the real part and the imaginary part of A M,0 at z = 7, for the largest momentum P z = 4 2π L on the a06m310 ensemble. The fit with fixed E M,0 , represented by the blue bands, and the fit with fixed E M,0 and E M,1 , represented by the red bands, are consistent with each other within uncertainties. However, the bands with fixed E M,0 and E M,1 are more stable in the large-t region. Thus, the fit with fixed E M,0 and E M,1 is used in further calculations.
We consider the effects of t min dependence on the extracted ground-state amplitude A M,0 for the three mesons and three ensembles. The ground-state amplitudes A M,0 as functions of z are shown in Fig. 15 of Appendix B with multiple t min choices on ensembles a06m310, a09m310 and a12m310. The fitted groundstate amplitudes A M,0 with smaller t min tend to have smaller errors. However, the χ 2 /dof becomes larger when too small a t min is chosen, because a two-state fit cannot describe the first few points of t well. Therefore, t min = {4, 4, 5} are chosen for the a06m310 π, K and η s fits, t min = {5, 4, 5} are chosen for the a09m310 π, K and η s fits, and t min = {2, 2, 3} are chosen for the a12m310 π, K and η s fits.

A. Nonperturbative Renormalization
The Wilson line z−1 i=0 U z (iẑ) introduces a divergence into the quasi-PDF operator, so the bare matrix elements (ME) cannot be matched directly to physical observables and need to be renormalized. In contrast to the previous work [84] where an effective mass counter-term is used to renormalize the matrix elements, we now follow a standard nonperturbative renormalization (NPR) in regularization-independent momentumsubtraction (RI/MOM) scheme [99]. The NPR factors Z(z, µ R , p R z , a) are calculated by implementing the condition that We calculate the NPR factors at µ R = 3.8 GeV, p R z = 0 for all three ensembles. Figure 3 shows the inverse renormalization factors for the DA on all three lattice ensembles. The relative errors L on ensembles a12m310, a09m310 and a06m310, respectively, from top to bottom. The bands are reconstructed from the fitted parameters of real part of HS correlators and the imaginary part of DA correlators, which are represented by blue triangles and red squares, respectively. The momentum Pz = 4 2π L is the largest momentum we used, and it is the noisiest data set. FIG. 2. The ground-state amplitudes AM,0 for the pion (left column), kaon (middle column) and ηs (right column) at z = 7 with boost momentum Pz = 4 2π L on the a06m310 ensemble. Two strategies of two-state fits are used here: fixed E0 (red band) and fixed E0 and E1 (blue band) obtained from the local correlators; both fits are consistent with each other within uncertainties The fits with fixed E0 and E1 are more stable in the large-t region; therefore, we use this fitted strategy for the rest of the analysis. The renormalized matrix elements are then obtained by where the bare matrix elements obtained from the ground-state meson amplitude A M,0 fit in the previous section via Figure 4 shows the n z = 4 renormalized matrix elements on the three ensembles, along with the quasi-DA matrix elements matched from two lightcone DA function forms, with α = 1 and α = 0.5, respectively. The former (α = 1) is the asymptotic form of the pion lightcone DA [100,101], and the latter (α = 0.5) has a second moment close to previous lattice computations of the pion DA moments [102][103][104][105][106]. We impose the symmetries to symmetrize the real parts and antisymmetrize the imaginary parts of the matrix elements, and enforce the normalization 1 0 dx φ(x) = 1 so that the central value h(z = 0) = 1. The matrix elements for lighter mesons are noisier. We see that the renormalized matrix elements at different lattice spacings are consistent with each other, suggesting that the higher-order discretization effects are small. We also note that when α increases, the peaks in h(z) shift toward larger z, and the magnitude of the first peak increases while the magnitude of the second peak decreases. In our data, the pion result is closer to the form with α = 0.5.
In Eq. (6), the operator that appears in h B M (h, a) might mix with other operators. If it mixes with lowerdimension operators, then subtractions of the lowerdimension operators should be performed first; otherwise, the Z factor in Eq. (6) will just renormalize the most singular (lowest-dimension) operator in the a → 0 limit rather than the desired operator. Fortunately, Ref. [32] shows that is not the case. The nonlocal operators used for quasi DAs in this work are the lowest-dimension ones with the same symmetry properties. This ensures that continuum limit can be taken for Eq. (6). Then, by going to the continuum limit rotational symmetry is restored, so mixing among twist-2 operators of different mass dimensions will not happen. Also, power-divergent mixing among twist-2 and twist-4 operators was suggested in Ref. [46]. However, the study in Ref. [24] shows that the twist-4 contribution is higher dimension. It can be written as equal-time correlators with two more mass dimensions than the original quasi-distribution operator. Hence, the twist-4 contribution does not cause powerdivergent mixing.
Checking these mixings requires a careful analysis of the mixing matrix, which is outside the scope of this work. In general, it is not enough to show that h R M of Eq. (6) has a continuum limit, since, as we argue above, if the operator associated with h B M (h, a) mixes with a lower-dimensional operator, then the Z factor can still renormalize this lower-dimensional operator and make h R M finite in the continuum limit. However, the information that the lower-dimension operator provides is different from what we want. Although the existence of the continuum limit for h R M is by itself a necessary but not sufficient condition for our quasi-DA program, the studies of Ref. [24,32] show that power divergent mixing does not appear in quasi-distributions.

B. Continuum Extrapolation
Now, we remove the remaining lattice discretization effects by extrapolating the renormalized matrix elements to the continuum by taking the continuum limit a → 0. Because the matrix elements with three different lattice spacings do not have data from the same physical z's, we first need to interpolate the points as functions of z for each lattice spacing, then do the extrapolations pointwise on these curves. For the continuum extrapolation, we use the following functional forms: where we use i = 1 for linear and 2 for quadratic latticespacing dependence. We find that the coefficient d M is consistent with zero within errors, except for kaon and η s at smallest momentum P z = 0.86 GeV. Since power divergence should be a short distance property of the Wilson coefficient, the dependence on the long distance properties of P z and meson flavor suggests that it is due to complications associated with small P z . Hence, we set d M = 0 from now on and focus on P z = 1.73 GeV results in the discussion below. Bootstrap resampling is applied to the three data sets to estimate the error of the continuum extrapolation, a06 a09 a12 x(1-x) a09 a12 x(1-x) a09 a12 x(1-x) since the number of measurements on three ensembles are different. The fitted functional forms are consistent with the data points and have average χ 2 /dof ≈ 1.2 for n z = 4. We observe that for the pion the slopes c π,1 and c π,2 are consistent with zero for zP z < 8. Figure 5 shows the extrapolated renormalized matrix elements for all mesons at n z = 4. We find that at small link lengths z < 0.5 fm, the lattice-spacing dependence of the matrix elements is consistent with zero, so the extrapolated results are consistent with the data on all ensembles. At moderate link lengths, 0.5 fm < z < 1 fm, near the peaks, the dependence is the most significant and we see |c M,1 | ≈ 2 fm −1 for K and η s . At large link lengths z > 1 fm, the lattice-spacing dependence is obscured by the large error, and the extrapolations are mainly constrained by the two cleaner data sets on a ≈ 0.09 fm and a ≈ 0.12 fm, where fewer Wilson links are needed at a given physical length of z.
To take into account the systematics of using different fitting functions, we used the Akaike information criterion (AIC) technique [107] to combine the linear and quadratic fits: where k 1 and k 2 are the number of free parameters, are both 1 in this case. The quadratic dependence on lattice spacing does not well describe the data; thus, the χ 2 is large in the quadratic extrapolation, and the combined extrapolation is dominated by the linear extrapolation. Overall, the extrapolations using the two functional forms are close to each other, so the combined extrapolation is consistent with both results, as shown in Fig. 5.
Future study using ensembles with different lattice spacing can help resolve any quadratic dependence.
The extrapolation formula obtained from one-loop chiral perturbation theory [108] is where the chiral logarithm has been proved to be absent for the DAs of pseudo-Goldstone bosons [108]. The chiral-extrapolated results are shown in Fig. 6, and they are very close to the ones from calculations at lighter pion mass but with slightly larger error bars due to the extrapolation. To test whether the higher-loop corrections are significant for M π = 690 MeV, data at another value of M π is needed. In this work, we will use valance pion mass, M val π , in Eq. 10 for a naive chiral extrapolation to estimate what the DA may be look like at physical pion mass point. Future work should include ensembles at lighter pion mass to improve the reliability of the chiral extrapolation and reduce the uncertainty due to such extrapolation.

C. Quasi-DA matrix elements to lightcone DA
The standard procedure to obtain the lightcone DA via quasi-DA is to first Fourier transform the chiral and continuum extrapolated matrix elements from the coordinate space to the momentum space (i.e. x space), then to apply the inverse matching kernel to obtain the lightcone DA. The quasi-DA is obtained through Because our matrix elements in coordinate space are discretized and bounded in the range |z| < 1.44 fm, we can only do a truncated Fourier transformation with |z| ≤ z max ≤ 1.44 fm after interpolating the data. This truncation will introduce a step function into the Fourier transformation and lead to oscillations in the quasi-DA in momentum space. This was first observed in the nucleon PDF studies [19,30], and multiple solutions have been proposed to help resolve or minimize the issue [33,109,110]. A similar problem is also observed in our meson-DA study; an example from the pion quasi-DA is shown in Fig. 7. Not only does the pion distribution have similar oscillations, but it is worse than those observed in the nucleon PDF distribution in Ref. [19,30]. In addition, the shape of the peak at x = 1 2 is sensitive to the choice of z max used in the Fourier transformation, causing large uncertainty in the DA determination.
To better constrain the DAs such that they vanish outside the physical region, x = [0, 1], we adopt the fitting approach by parametrizing the distribution amplitude using the commonly used meson PDF global-fitting form where B(m + 1, n + 1) is the beta function, which normalizes the lightcone DA such that the area under the curve is unity. We then obtain the parameters m and n for the meson lightcone DAs by fitting to the lattice where C is the matching kernel for the DA [111] with µ = 2 GeV (the MS renormalization scale), µ R = 3.8 GeV, and p R z = 0. This approach was originally proposed for the pion valence PDF [112]. Figure 8 shows the reconstructed matrix elements from Eqs. (12) and (14) using the fitted parameters, m and n, for all three mesons, along with the input chiralcontinuum-extrapolated ones at P z = 1.73 GeV. Results using different values of z max , ranging from 0.72 fm to 1.44 fm, as input data h(z, µ R , p R z , P z ) are also shown. The χ 2 /dof = 1.02(58) is small for the fit of full-range pion data z max = 1.44 fm, and it reproduces the peak locations. However, we can see from the plot that the fitted function cannot reproduce the large amplitude of the secondary peaks. This indicates that more complicated forms need to be used. The fit results at the two largest z max are consistent with each other, because z max = 1.08 fm already covers the secondary peaks, and the fit is trying to recover the large amplitude there, resulting in a small m, n. When we truncate the data at smaller z max , then the fit is trying to recover the large amplitude at the first peaks, resulting in a larger m, n. For the remaining part of the paper, we only show the fit results for the full-range data with z max = 1.44 fm.
We first study the pion-mass dependence of the pion distribution amplitude in the continuum limit.   shows pion DA results using pion masses of 690, 310 and extrapolated to 135 MeV. We remind the reader that our chiral extrapolation is dominated by 310-MeV results. Nevertheless, the DAs for the heavier mesons, at strange point, have a narrower distribution, showing a similar trend as suggested in Ref. [113]. Mapping out how the DA shapes change as a function of quark masses helps us understand the origin of mass [1], which is a priority research direction for a future EIC and other facilities. We will leave a more complete study of the quark-mass dependence of the DAs to the future.
Our pion distribution amplitude extrapolated to the physical pion mass is shown on the left-hand side of Fig. 10 with the fitted parameters m = 0.57 (27) and n = 0.60 (26). We also show results from the Dyson-Schwinger equation (DSE) prediction (DSE'13) with the form φ π (x) = 1. 81 [114], the data from Belle experiments [115], the prediction of the light-front constituent-quark model (LFCQM'15) [116], and the fit to the form Eq. (12), of the second moment [106] (labeled as "RQCD'19"). Our pion result is consistent with the DSE and "RQCD'19" moment reconstructed results, showing a broader distribution than the LFCQM result. Our pion amplitude obtained through the parametrization is constrained to physical region 0 < x < 1 by definition, and, therefore, has a higher peak compared with the results in our previous work [84]. RQCD also calculated the x-dependent pion distribution amplitude using multiple Euclidean correlation functions [85] on a N f = 2 295-MeV pion mass, a ≈ 0.071 fm lattice-spacing ensemble. They found a much broader distribution than our results. Using the parameters m = 1.04 (20) and n = 1.05 (20) obtained from fitting the kaon matrix elements, we obtain the FIG. 8. Fit of the matched function form to Pz = 4 2π L pion (upper) and kaon (lower) renormalized matrix elements at µ R = 3.8 GeV, p R z = 0 in range |z| < zmax. We note that the fit with zmax = 1.08 fm is the same as the fit with zmax = 1.44 fm. If we go to smaller zmax, the exponential a, b will become larger to recover the large amplitude of the first peaks. kaon lightcone DA, as shown on the right-hand side of Fig. 10. We compare the kaon result with DSE predictions [117] (labeled as "DSE'14-1" and "DSE'14-2"), the LFCQM result (LFCQM'15) [116], and the fit to the form Eq. (12) of the first and second moments [106] (labeled as "RQCD'19"). Again, the kaon DA has higher peak compared with the one in our previous work [84], but no observed asymmetric around x = 1/2. Our kaon distribution is narrower than the DSE and RQCD momentreconstructed results.
With the fitted DA, we can calculate their second moments by integration We find ξ 2 π = 0.244(30) for pion and ξ 2 K = 0.198(16) for kaon. A comparison with previous moment calculations on lattice is shown in Table I. The pion moments calculated from our x-dependent distributions suffer from larger error due to the usage of larger momentum in the hadron states, while the traditional moment calculations rely on hadrons at rest to obtain better signal. Our pion results are generally consistent with earlier lattice determinations using the moment approach. However, our kaon second moment is about 20% smaller; this is anticipated since our kaon m, n in Eq. (12) are larger. The kaon distribution is narrower than pion one and almost symmetric around x = 1/2; therefore, we have a smaller kaon moment.

D. Machine Learning Predictions for Lightcone DAs
Another approach to obtain lightcone DAs from the spatial matrix elements is to apply machine learning. The idea here is to train a supervised machine-learning model  [106] 2+1f clover clover 0.234(6)(6) 0.231(4)(6) RI'-SMOM 0.039-0.086 130-420 3.6-6.4 RQCD'17 [105] 2+1f clover clover 0.2077(43) N/A RI'-SMOM 0.086 222-420 3.9-5.8 RQCD'15 [104] 2f clover clover 0.236 (4) with randomly generated pseudo-data which have similar properties to the DAs and are constrained by the same physical requirements. The model is then applied to real lattice matrix elements in coordinate space to predict the lightcone DAs. A similar application to PDFs was studied in Ref. [110], where instead of real lattice data, a set of pseudo-data generated from global-fitting results was used to test the method. Note that Ref. [110] attempted to reconstruct nucleon PDFs using pseudo lattice data but did not finish by using actual lattice data to obtain PDFs.
In this work, we use the multilayer perceptron (MLP) regressor [118][119][120], a machine-learning algorithm implemented in the Python scikit-learn package [121]. Since this is a first attempt to use purely lattice data to reconstruct the distribution functions, we use the same parametrization formula as shown in Eq. (12), and their linear combinations with 100,000 randomly generated m, n pairs in Eq. (12), evaluated at 99 points x ∈ (0, 1) as outputs of the model. Random relative noise at each point is added to these samples. Then, we apply Eq. (14) at renormalization scale µ R = 3.8 GeV, p R z = 0 to obtain the corresponding matrix elements at z ∈ [0, 24] × 0.06 fm in coordinate space as inputs of the model. We train and test the MLP regressor on these labelled pseudo-data. The model optimizes the squared- where a large relative deviation near the boundary x ∈ {0, 1} will not contribute much to the loss because of its small amplitude. We tune the hyperparameters of the model, i.e., the geometry of the hidden layer and the activation function, with Grid-SearchCV in scikit-learn. The optimized model is a MLP regressor of three hidden layers with 100 perceptrons and the activation function f (x) = max(0, x) on each layer.
To make sure that the above procedure works, we test our procedure on a simpler formula. We generate a test set of data with the same constraints but from different form, f (x) = N sin β (πx), to check the stability of the model when extrapolating to unknown functions. We generate the test data for β ∈ {0.5, 1, 1.5, 2}. After transforming to coordinate space, we generate 1000 samples for each β, following a Gaussian distribution N (µ, σ 2 ) with µ = h(z), σ = h(z) × 0.1 exp[0.1z] to simulate the noise from data on lattice. We test the model on these sets of noisy data. It turns out that the the model works fairly well, as shown in Fig. 11, indicating that even if the lattice results do not follow the functional form we used to train the data, the model is able to give close predictions.
With the success of the simple sine-function tests, we apply our procedure to the chiral-continuum extrapolated pion, kaon and η s lattice data. However, simply applying our procedure to real lattice data gives a very noisy distribution. This is mainly due to the fact that the trained network knows nothing about the physics, especially around x = 0 and 1, which sometimes causes unstable distributions. To solve the problem, we divide the target lightcone DA pseudo data by a factor of x d (1−x) d to increase the weight near the boundary, which stabilizes the prediction while keeping these points finite. We found d = 0.3 gives the most stable results, as shown in the leftmost column of Fig. 12. For the less noisy data sets from η s and K mesons, the output distributions are more stable and have smaller uncertainty. However, for the noisier pion data, the prediction becomes much worse. This is not a surprise, as most ML training networks require high-statistics data to work well. We also show a comparison with the fit method described in the previous subsection for both DAs and how the ML reproduces the coordinate-space matrix elements (the left two columns of Fig. 12). Note that in this study, the machine-learning results are very close to the fit ones. This is likely due to the fact that we set up the training data with the same form as the fitting approach. In future work with higherstatistics data, more function forms should be included in the training process to remove the parametrization dependence.

IV. SUMMARY AND OUTLOOK
In this work, we presented an updated lattice calculation of the pion, kaon and η s distribution amplitudes using the LaMET/quasi-distribution approach. We not only improved our previous single-lattice-spacing calculations [84] with smaller statistical errors for all mesons, but also extended the calculations to two smaller lattice spacings, 0.09 and 0.06 fm. This allowed us to perform a continuum extrapolation using the lattice data and address issues relating to power-divergent mixing among twist-2 operators and among twist-4 operators [46]. Our analysis confirmed that the coefficient of the leading 1/a 2 power divergence is consistent with zero within errors for P z = 1.29 and 1.72 GeV. This power divergence is not seen in our extrapolation (keeping P z constant while taking a → 0), together with the absence of mixing to lower dimensional non-local operators [24,32], suggests the power divergent mixing problem does not happen.
We attempted a naive chiral extrapolation to the physical pion mass M π = 135 MeV using 690-MeV and 310-MeV renormalized matrix elements. We used two strategies to extract the lightcone DAs. First, we fit the continuum-chiral-extrapolated matrix elements in coordinate space using Eq. 14 with the distribution form used by global fit, Eq. 12. Our results in MS at 2-GeV show a pion distribution symmetric around x = 1/2 and having broader distribution than the asymptotic prediction, consistent with prior DSE results. The second moment, taking the integral of our pion DA, gives 0.244 (30), which is consistent with past direct lattice-QCD moment calcu-lations. Our kaon DA has a narrower distribution than the pion one, but we do not observe asymmetric behavior after the continuum-chiral extrapolation. This is likely due to the fact that our light-quark mass is not far enough away from the strange-quark mass, and thus the milder asymmetric distribution that washed out in the increase uncertainties of continuum-chiral extrapolation. As a result, our second moment of the kaon DA, 0.198 (16), is about 20% smaller than the previous direct calculation. Future calculations with improved statistics and lighter quark mass will be crucial to resolve this question.
Our second strategy used a machine-learning algorithm to make predictions of the meson DAs. Our procedure has been tested with a simpler sine function that mimics the lattice data statistical distribution, modified for stable outputs. The same setup is trained using pseudolattice data with Eq. 12, before being applied to the continuum-chiral-extrapolated lattice data to predict the meson DAs. Further tuning is needed to obtain a stable output from the network. We found that the ML can give stable predictions on the more precise dataset in the cases of K and η s with the predicted result to similar the fitting one. This is likely due to the fact that pseudo-data generated to train the model is limited to Eq. 12 so far, but getting nonzero results is quite exciting for a first result. Future work with even higher precision data would allow us to explore wider range of the training models, remove the model dependence, and see the impacts on the real lattice data. then the FT formula Eq. (11) will becomẽ φ(x, µ R , p R z , P z ) = dz e −i(1/2−x)zPzH R (zP z , p R z , µ R ). (A2) We can see from Eq. (A2) that ifH R is real,φ(x) = φ(1−x) will hold. The phase-rotated matrix elements for K at n z = 2/3/4 are shown in Fig. 13. From the data on a ≈ 0.12 fm lattice, we see a clear nonzero imaginary part for the kaon. Yet, when we extrapolate to the continuum, the imaginary part of n z = 4 becomes consistent with zero. Thus our kaon result in continuum at n z = 4 is close to a symmetric distribution.

Appendix B: Additional Figures
The dispersion relation for three particles on three lattices are in Fig. 14. We can see that the speed of light gets closer to one at finer lattice. On coarser lattices, heavier mesons show a larger deviation.
By varying the fit range for the two-point correlators, we obtain different sets of ground-state coefficients. These fit results on three lattices are shown in Fig. 15, Fig. 16 and Fig. 17. Fit results from different ranges are generally consistent with each other. Taking both fit stability and fit qualities on all operators into account, we choose t min = {4, 4, 5} for π, K and η s on a06m310 lattice, t min = {5, 4, 5} on a09m310 lattice, and t min = {2, 2, 3} on a06m310 lattice.
We show a comparison of our new data and the data from the previous work [84] in Fig. 18. We see that they are consistent at most points; however, these slight deviations can result in very different asymmetry behavior, because the asymmetry is only a few percent of the overall magnitude.
The continuum extrapolation for smaller momenta P z = 0.86 GeV and P z = 1.29 GeV are shown in Fig. 19. There is a large discretization effect at P z = 0.86 GeV, which may come from higher-twist effects and the 1 a 2 power divergent pole. FIG. 13. Imaginary part of rotated matrix elements for K at Pz = nz 2π L with nz = 2/3/4. From the a ≈ 0.12 fm data we see that there is an asymmetry in K. However, this asymmetry becomes consistent with zero when extrapolated to the continuum.     FIG. 19. Extrapolation of the kaon renormalized matrix elements at Pz = nz × 2π L with nz = 2 (left) and nz = 3 (right), µ R = 3.8 GeV, p R z = 0 to the continuum limit from two functional forms and their AIC combination. We observe larger discretization effect for small Pz due to the non-negligible higher twist effects.