Valence parton distribution function of pion from fine lattice

We present a lattice QCD study of valence parton distribution inside the pion within the framework of Large Momentum Effective Theory. We use a mixed action approach with 1-HYP smeared valence Wilson clover quarks on 2+1 flavor HISQ sea with the valence quark mass tuned to 300 MeV pion mass. We use $48^3 \times 64$ lattice at a fine lattice spacing $a=0.06$ fm for this computation. We renormalize the quasi-PDF matrix element in the non-perturbative RI-MOM scheme. As a byproduct, we test the validity of 1-loop matching procedure by comparing the RI-MOM renormalized quasi-PDF matrix element with off-shell quark external states as computed in the continuum 1-loop perturbation theory with the lattice results at $a=0.04$ and 0.06 fm. By applying the RI-MOM to ${\overline{\rm MS}}$ one-loop matching, implemented through a fit to phenomenologically motivated PDFs, we obtain the valence PDF of pion.


I. INTRODUCTION
QCD factorization allows us to calculate the cross-section of hard hadronic processes in terms of the convolution of partonic cross-section and parton distribution functions [1]. Parton distribution for a hadron can be defined using hadronic matrix elements of appropriately chosen gauge invariant operators separated along the light cone. For example, the quark parton distribution function (PDF) of a hadron H can be defined in terms of an operator bilocal in quark field ψ as [1,2] f (x) = 1 4π dξ − e ixP + ξ − H(P)| ψ(ξ − )γ + W(ξ − , 0)ψ(0) |H(P) , where W(ξ − , 0) = Pe ig ξ − 0 dξ − A + is the path-ordered straight Wilson Line on the light-cone, and the light-cone coordinates ξ ± = (t ± z)/ √ 2. A straight forward first principle calculation of PDF is not possible because lattice QCD is formulated in Euclidean space-time, and thus, it cannot access quantities defined on the light-cone. To circumvent this problem, it has recently been proposed to calculate quasi parton distribution function (qPDF),q(x, P z ), defined in terms of matrix elements of equal time, but spatially separated, quark bilinears [3] evaluated in a hadron state boosted to large momentum P z : where Γ is either γ z or γ t for the unpolarized parton distribution addressed in this paper. Here, W(z, 0) is a straight spatial Wilson line joining the quark and anti-quark. For sufficiently boosted hadrons, one can use the Large Momentum Effective Theory (LaMET) [4] to relate the qPDF to PDF through a convolution with a matching kernel C as q(x, µ L , P z ) = +1 −1 dy |y| C x y , yP z µ , µ L yP z f (y, µ).
Here µ L and µ are the renormalization scales of the schemes in which the qPDF and PDF are defined. For the latter, MS scheme is used and µ is referred to as the factorization scale. The matching kernel is perturbative and hence universal for all the hadrons. Therefore, it is calculated using quark external states in a chosen gauge. Such calculations at 1-loop order have been performed using the cutoff scheme [5] as well as in the MS scheme [6][7][8]. There are also related approaches to calculate the PDF from the lattice that use similar logic but differ in details, like the pseudo-PDF approach proposed in Ref. [9,10] and the use of good lattice cross sections [11,12].
The latter includes the current-current correlators [13].
Using LaMET and related approaches, various attempts have been made to calculate the unpolarized and polarized iso-vector quark distribution of nucleon [7,[14][15][16][17]. The first studies of the valence quark distribution for the pion have also been presented [13,18]. One important issue in the calculation of the PDF from the lattice is the renormalization and matching. As indicated above, the PDF and qPDF are usually defined in different renormalization schemes. The qPDF, which is calculated on the lattice, needs a non-perturbative renormalization scheme because of the self-energy divergence of the Wilson line [19], and this is usually implemented using the RI-MOM scheme [20] defined using external off-shell quark states accessible on the lattice. Then, one has to match the qPDF in this lattice renormalization scheme to the PDF in the MS scheme through Eq. (3). This is achieved through the convolution using the matching kernel between the RI-MOM and MS schemes that is perturbatively calculated in the continuum theory using dimensional regularization [8]. One could also define the qPDF operator in the MS scheme and then perform the matching between PDF and qPDF [6,21]. The current status of this field, including the comparison with the phenomenological PDF and the issue of renormalization, is reviewed in Refs [22][23][24].
In principle, Eq. (3) offers a way to calculate PDF from the lattice, but it is unclear as to what extent this is actually feasible given the various assumptions that go along with the equation implicitly. For example, at any finite hadron momentum P z , the Eq. (3) suffers from O Λ 2 QCD /(x 2 P 2 z ) higher twist corrections. This is closely related to the assumption that the perturbative calculation, currently truncated at 1-loop order, is able to capture the renormalization as well as the matching of the qPDF matrix element over a range of quark-antiquark separations, z -to be in the perturbative regime, one would expect z to be smaller than or about O(1) fm. It is also important to ensure that aP z < 1 to make sure we are not overcome with lattice artifacts [25]. Therefore, a closer look at this new methodology is warranted and is actively being studied [7,26]. The aim of this paper is to explore these issues further by using finer lattices than what is being used in the qPDF literature, and use pion as a case study. The smaller mass of the pion makes it easier to achieve a large boost, the numerical calculations are expected to be less expensive and it also helps suppress the target mass correction by ensuring m π P z . We focus on the valence PDF of pion since it can be accessed using the isotriplet u − d PDF, and thereby, avoid mixing with the gluon sector. In our study, we will use the renormalization and matching strategy outlined in [7,20]. The pion valence PDF has been determined through leading order and next to leading order analysis of experimental data [27][28][29][30][31][32][33][34][35], but it is much less constrained than the nucleon PDF and therefore, the lattice calculations may have more impact in this case, especially in constraining the x → 1 limit which is not yet well established.
The paper is organized as follows. In section II, we discuss our lattices setup. In section III, we present the calculations of the two point function of the boosted pion and check how reliable the extractions of the ground state and the first excited state are. In Section IV, we present our results for the pion three point function that defines the qPDF. Here, we also discuss the problem of excited state contamination. In section V, we discuss the non-perturbative renormalization as well as the validity of 1-loop matching. Our results on the renormalized pion qPDF and the matching to PDF are presented in section VI. Some technical aspects of the calculations are discussed in the Appendices. Preliminary results on this work have been reported in conference proceedings [36][37][38].

II. LATTICE SETUP
We performed the calculations of the pion two-point and three-point functions needed to obtain the qPDF using the Wilson-Clover action for valence quarks on 1-HYP smeared gauge configurations [39] and Highly Improved Staggered Quark (HISQ) action [40] in the sea. We used the 2+1 flavor gauge configurations corresponding to lattice size 48 3 ×64 and the lattice spacing of a = 0.06 fm generated by the HotQCD collaboration [41]. In addition to this ensemble, we also used 64 4 HISQ lattices [41] with the lattice spacing a = 0.04 fm for the study of the non-perturbative renormalization (NPR). In both the ensembles, the sea quark mass was tuned to a pion mass of 160 MeV. A similar setup was used by the PNDME collaboration albeit for 2+1+1 flavor MILC configurations (c.f., Ref. [7]). For the valence quark masses, we used the values am = −0.0388 (i.e., κ = 0.12623) for the a = 0.06 fm ensemble and am = −0.033 (i.e., κ = 0.12604) for the a = 0.04 fm ensemble, which are tuned such that the pion mass, m π , is 300 MeV. We did not see any exceptional configurations for these valence quark masses in our calculations.
We used higher statistics at smaller quark-antiquark separation z than at larger ones; to be exact, we used 214, 162 and 100 gauge configurations for |z|/a ∈ [0, 8], (8,16] and (16,24] respectively. We further improved the statistics by using All-Mode Averaging (AMA) [42] in the computations of the twoand three-point functions, with 32 sloppy calculations to one exact solve for each configuration. For the exact and sloppy inversions, we used the stopping criterion of 10 −10 and 10 −4 , respectively. In our study, we will consider the valence quark distribution, which in turn is related to the iso-vector u − d quark distribution in the pion, and thus we do not compute the quark line disconnected diagrams.
For a reliable extraction of qPDF, a good overlap of the source operator with the pion state is necessary so as to project out the ground state at as small source-sink separation as possible. The quark sources with Gaussian profile, typically implemented through a gauge covariant Wuppertal smearing [43], are used for this purpose when the hadron is at rest. However, for fast moving hadrons that are required in the qPDF framework, the use of the Gaussian sources is no longer sufficient and this necessitates the usage of the boosted Gaussian sources [44] instead. Since we are interested in the calculation of the pion two and three point function at several values of the pion momenta and several source-sink separations we found it more practical to implement the Gaussian sources by using Coulomb gauge instead of implementing the Wuppertal smearing. We found the optimal size of the Gaussian profile to be about 0.3 fm, which roughly corresponds to 90 steps of Wuppertal smearing. We checked that in terms of signalto-noise ratio, the Wuppertal and Coulomb-gauge Gaussian sources are similar (see Appendix E). In the next section, we discuss the boosted sources in detail and the energy levels of the boosted pion. In Appendix A, we have explained the construction of boosted sources in detail.

III. TWO POINT FUNCTION OF THE BOOSTED PION
We calculated the two point functions of the positively charged pion (π + = du), for a spatial pion momentum P = (0, 0, P z ) which is nonzero only along the z-direction, using the pion source and sink π + s (0, P) and π + s (t, P), respectively. The values of momenta in lattice units are aP z = ±2πn z /48 for n z ranging from 0 to 5, which in physical units correspond to P z = 0, 0.43, 0.86, 1.29, 1.72 and 2.15 GeV, respectively. We always used the Coulomb gauge Gaussian smeared-source (s = S ), and either smeared sink (s = S ) or point sink (s = P). In the rest of the paper, we will refer to the smeared-source and smeared-sink set-up to be SS, and we will refer to the smeared-source point-sink set-up as SP. For the lowest two momenta, we used the usual Gaussian sources. To improve the signal for higher momenta, we followed Ref. [44] and used boosted sources in which the valence quarks are boosted to momentum k z = ζP z , with ζ being a tunable parameter. Naively, one might expect that the optimal choice to be ζ = 0.5. However, we found that the optimal choice of ζ for the pion in terms of the signal-to-noise ratio is between 0.6 − 0.75. For P z = 0.86 GeV the signal-to-noise ratio is not very sensitive to the value of ζ. These findings are in agreement with Ref. [44]. We discuss the optimization of boosted sources further in Appendix E. Since, we need to create a source for each value of ζ, we used k z = 2(2π/48) for n z = 2, 3 and k z = 3(2π/48) for n z = 4, 5, corresponding to choices of parameter ζ = 1, 2/3, 3/4 and 3/5 for n z = 2, 3, 4 and 5 respectively. We have shown the corresponding effective masses for the SS two point functions in Fig. 1. By using boosted smeared sources, one can see that a reasonable signal for the two point correlation function can be obtained up to source-sink separations t = 12a for all momenta except for the highest momentum P z = 2.15 GeV. Simply from the data points in Fig. 1, we see that the effective masses approach a plateau corresponding to the continuum dispersion relation E π (P z ) = P 2 z + m 2 π , shown as the horizontal lines. The effective mass approach the plateau region at larger source-sink separations when the momentum is increased, as one would expect from the shrinking gap between the ground and excited states as the pion is boosted.
As we will discuss next, we used source-sink separations t = 8a, 10a and 12a for the computation of three-point func- The systematical dependence of the ground state E π and the first excited state E 1 on the fit range [t min , t max ] is shown. In the left panels, we show such a dependence for the pion SP correlator at two different P z . For each t min , data from t max = 24a and 32a are shown. The black solid line is the value of E π expected from continuum dispersion relation. The red patterned band is our best estimate of E 1 using the SP correlator. In the right panels, the fit systematics of E 1 for the SS correlator is shown. The red band is the prior used for E 1 from the SP correlator (same as the one in the left panels). The different symbols are various fit strategies.
tions. Therefore, we needed to analyze the excited state contribution to the SS and SP correlators to perform the infinite source-sink extrapolations. For this, we performed multiexponential fits on the SS and SP pion two point functions in the interval t ∈ [t min , t max ] in order to extract the energy levels. For fixed t min we varied t max and checked the sensitivity of the result to t max . Then, we repeated the procedure for different values of t min . We found that we were able to reliably extract the ground state E π (P z ) as well as the first excited state E 1 (P z ) using four-parameter two-state fits to the SP correlator instead of using the SS correlator. This could be due to the fact that the contributions from the high-lying energy levels are smaller in the SP correlator compared to the SS correlator stemming from the possible cancellations between the positive as well as negative amplitudes that are allowed in the SP correlator. In the top-left and bottom-left panels of Fig. 2, we have shown the systematics of the two-state fits to the SP correlator. In the top-left and the bottom-left figures, we have shown the dependence of the best fit values of E π (blue circles) and E 1 (black circles) as a function of t min used in the fits. For a given t min , the data points from two values of t max have been clubbed together for P z = 0 and 1.29 GeV respectively, and it demon-strates that there is no dependence on t max . The ground state is seen to compare well with the expectation from the dispersion relation shown by the black solid lines. The red band shows the values of E 1 chosen as the best estimate of the first excited state.
On the top-right and bottom-right panels of Fig. 2, we show similar plots for the first excited state E 1 as estimated using the SS correlator. The statistical errors of the excited state energy E 1 in the simple two state fits (magenta diamond) quickly grow large with increasing t min and thus, these fits turned out to be of limited use. Therefore, we performed constrained two-and three-state exponential fits for the SS correlator with the ground state energy fixed to E π = P 2 z + m 2 π with m π = 300 MeV, and imposing a prior on E 1 using its best estimate from the SP correlator -that is, we added the term (E 1 − E 1,prior ) 2 /σ 2 prior to the χ 2 with E 1,prior and σ prior being the mean and error of E 1 , respectively, as determined from SP correlator. The t min dependence of the resulting E 1 from constrained two-state fit (black circles) and constrained three-state fit (blue triangles) are shown in the top-right and bottom-right panels. The two-state fits of the SS two-point correlator largely overestimate the energy of the first excited state for small t min whether or not priors are used, and there is a significant dependence on t min . One should use t min ≥ 6a to obtain reliable results for the first excited state from SS correlator. The three-state fits with priors on E π and E 1 give energies of excited states that are the same within errors for the SS two point correlators and show almost no t min dependence. In summary, we determined the lowest three energy levels using the SP correlator and then determined the corresponding amplitudes |A n | = | 0| π + S (0, P) |E n , P z | of these excited states in the SS correlator through a constrained fit analysis.
In Fig. 3, we show the three energy levels obtained from different fits discussed above, as a function of P z . For P z = 0, we compare our result with the energy levels that would correspond to the pion resonances π(1300) and π(1800) from the PDG [45]. In order to account for the 300 MeV pion mass, we shifted the PDG values by 0.161 GeV as an approximation and these are shown as the two arrows in Fig. 3. Our estimate of the first excited state energy agrees with this shifted mass of π(1300). We also show the expected P z -dependence of E 1 (P z ) assuming a particle-like dispersion and this describes the actual data very well. The energy of the second excited state is much larger than expected, meaning that the third state effectively parametrizes several higher lying states. As one can see from the figure the energy gap between E π and E 1 shrinks with increasing P z as expected. The results on the excited state energies will be important for the analysis of the pion three-point function discussed in the next section.

IV. EXTRACTION OF THE BARE QUASI-PDF MATRIX ELEMENTS FROM THE THREE-POINT FUNCTIONS
The next step is the calculation of the bare qPDF matrix element h B Γ (z, P z ) = E π , P z |O Γ (z; τ)|E π , P z , The energy of the ground state and the first two excited states as function of P z . The red, blue and black symbols correspond to E π , E 1 and E 2 respectively. For each color, the different symbols correspond to different fitting methods (2-state and 3-state fit with or without prior on ground state) and types of source-sink (SP or SS). The lines show the expected dispersion relations for pion and its first excited state. The arrows are the PDG values of π(1300) and π(1800) which are shifted to account for m π = 300 MeV.
where the bilocal u − d qPDF operator in a time-slice τ involving quark and antiquark separated along the z-direction by L = (0, 0, 0, z) is given by which is made gauge-invariant by the Wilson line W x,x+L . The Dirac γ matrices in the qPDF operator are in the Minkowskian convention. The state |E π , P z denotes the onshell ground state pion with momentum P z . In addition to the natural choices of Γ = γ t and γ z that approach γ + in the light-cone limit, we also considered Γ = 1. This choice of Γ is needed because under renormalization, O γ z (z) mixes with O 1 (z) [6]. We applied one-level of HYP smearing to the links entering W x,x+L in order to reduce the noise. Since the qPDF calculation involves values of z ∼ O(a), we checked that there is no significant difference between the renormalized matrix elements using smeared and unsmeared Wilson line. To obtain the bare matrix element h B Γ (P, z), we computed the three-point function at different source-sink separations t and operator insertion point τ and constructed the ratio of the three-point function to twopoint function,  The reader can refer to Appendix B for a detailed description of the construction of three-point functions. The two-point function is always real when the source and sink are of the same type. The three-point function for the u − d qPDF operator O Γ (z) in a pion external state is real at all z for Γ = γ z , γ t and purely imaginary for Γ = 1 (refer Appendix C). Inserting a complete set of states in the above equation, with E n+1 ≥ E n , and E 0 = E π . It is easy to see that in the infinite t limit, R(t, τ; z, P z , Γ) is equal to h B Γ (z, P). The above equation holds for infinite time extent. For finite time extent, the effects of periodic boundary condition should be taken into account. This turns out to be important for P z = 0, while for non-zero P z the effect is negligible as discussed in Appendix G. In practice, one truncates the sums in Eq. (9) at some value n, and then obtain h B Γ (z, P z ) by fitting the t and the τ dependence of R(t, τ; z, P z , Γ) using E n , P| O Γ (z) |E n , P) as fit parameters. In the fits, the values of A n and E n were held fixed at values determined from the two-state fit analysis on the SS correlators. In what follows, we will refer to this method of fitting using n-state ansatz to the data between τ/a > τ o and τ/a < t/a − τ o as Fit(n,τ o ). In this method, the excited states are suppressed by exp(−(E n − E π )t/2). For z = 0, it is easy to see that R(t, τ; z, P z , Γ) is symmetric in τ around the mid-point τ − t/2. For z 0 and P z 0, we only have the following relation (see Appendix D), where φ Γ = 1 for Γ = γ t , 1 and Φ Γ = −1 for Γ = γ z . Thus, generically the matrix elements E n , P z | O Γ (z) |E n , P z ) and E n , P z | O Γ (z) |E n , P z ) are independent and the number of fit parameter is thereby increased. Based on the above relation, we constructed appropriate averages using both the positive and negative values of momenta to increase statistics. However, in practice the gain was marginal.
We demonstrate the extraction of the matrix element using the fit method in Fig. 4, where the ratio R(t, τ; z, P z , Γ) is shown for Γ = γ t and γ z qPDF for the P z = 1.29 GeV pion. The results on R(t, τ; z, P z , Γ) and the extraction of matrix elements for the other values of P z are given in Appendix G. In the figure, we show the data at t/a = 8, 10 and 12 along with the result of Fit(2,2). Using the fit, the results for the t → ∞ extrapolations are shown with the horizontal bands. We also performed three states fit of R(t, τ; z, P z , Γ) and picture looks similar. In this case, the data points at all τ − t/2 could be described by the fit. The t → ∞ extrapolations from three state fit gave results consistent with the two state ones, albeit with larger errors. A closer look at Fig. 4 (and also from Fig. 21) reveals that the excited state contribution is larger for Γ = γ z than for Γ = γ t . Furthermore, the excited state contribution grows with increasing z. The non-symmetric nature of R(t, τ; z, P z , Γ) for z 0 is also apparent in the figure. We expect that h Γ (z = 0, P) = 1 for Γ = γ t because of charge con- γz (z, P z )t + const to the data. The dashed lines are the expected curve for R sum (t; z = 0, γ z ) using the fit(2,2) best fit parameters. servation once proper renormalization is implemented and the continuum limit is taken. Our extrapolation procedure gives a result for the bare matrix element which is larger than one at all the values of momenta as can be seen in Fig. 4, as well as from Fig. 22 in Appendix G where, in addition, one can also see that h B γ t (z = 0, P z ) is independent of P z . Thus, any deviation of h B γ t (z = 0, P z ) away from unity should be taken care of by renormalization. We will see in Section V that this is indeed the case.
Alternatively we can use the summation method [46] to obtain h B Γ . Here one sums over all τ/a minus a certain number We will refer to this method as sum(τ o ). For large t, one would find a linear behavior in t of R sum as The advantage of this method is that the excited state contributions are suppressed as exp[−(E n − E π )t] instead of being suppressed as exp[−(E n − E π )t/2] in the fitting method. We show a sample result using sum(1) and sum(2) in Fig. 5 for Γ = γ z and z = 0. We see that R sum (t; z, γ z ) can be well fitted by a straight line in t, and the slope gives the value of the matrix element. As a cross-check, we also show the expected curve for R sum (t; z, Γ) using our best fit from Fit(2,2) as the dashed curves. It can be seen that the difference between a simple straight line fit and the curve from Fit(2,2) is small. One can also note that sum(1) and sum(2) are almost parallel, meaning that the extracted matrix element is independent of τ o confirming that the method works well.
To better understand the systematic effects due to excited state contaminations, one can look at the case z = 0 in detail, where the statistical errors are the smallest. The bare matrix element h B γ z (z = 0, P z ) after renormalization is expected to be proportional to the hadron velocity, P z /E π (P z ). One can take the ratio of matrix elements h B The ratio of the matrix elements for γ z to γ t as function of P z . The curve shows the expected result, P z /E π (P z ).
avoid issues of renormalization. The results for this ratio of matrix elements is shown in Fig. 6 along with the curve for We see that our lattice results follow the expectations reasonably with small 3 − 4% deviations from the expected result at small P z . A reason for this could be the large systematic uncertainty in γ z matrix element due to relatively larger excited state extrapolations required. We see that within errors, the two-state fit, three-state fit and the summation methods are consistent.
In Fig. 7, we show the results for h B Γ (z, P z ) as function of z for the two highest momenta P z = 1.29 and 1.72 GeV determined using the HYP smeared Wilson line. Since the real part is symmetric about z = 0, we have only shown the data for z ≥ 0. At each z/a, we have shown the resulting t → ∞ extrapolated results using Fit(3,1), Fit(2,2), sum(2) and sum(2) methods, and these points at a given z/a are slightly displaced for better visibility. We see that the results from all these methods agree with each other within the errors. For γ z , some tension between the summation method and two and three state fits is observed at larger |z|. At large value of z the matrix elements are suppressed partly because of the large value of P z and in part by the divergent self-energy contribution in the spatial Wilson line. The latter will be removed upon renormalization as we will see in the next section. Having demonstrated a robust determination of the matrix element using multiple t → ∞ strategies, we will use the matrix elements obtained using Fit (2,2) in the rest of the paper.
So far we discussed results on the three-point function obtained using 1-HYP smearing for the spatial link. We also performed calculations using unsmeared spatial Wilson line. In this case, the bare matrix element rapidly decreased with z due to the larger value of the Wilson line self-energy divergence. However, we found that the results, after non-perturbative renormalization (discussed next in Section V), were similar to those obtained with smeared Wilson lines within errors. The main difference between the renormalized three-point function obtained with smeared and unsmeared Wilson line is that, for the latter the statistical errors at large z are significantly larger.
as a function of quark-antiquark separation z. The panels in the top row show results for P z = 1.29 GeV, while the panels in the bottom row show the results for P z = 1.72 GeV. The different symbols are from various methods of t → ∞ extrapolation.

V. RI-MOM NON-PERTURBATIVE RENORMALIZATION AND ITS COMPARISON WITH 1-LOOP
In the last section, we discussed the extraction of the bare qPDF matrix element which has to be renormalized. The renormalizability of qPDF has been recently demonstrated to all orders of perturbation theory [47,48]. In addition to the quark wavefunction renormalization Z q and the composite operator renormalization required for z = 0, the qPDF operator at non-zero z requires additional renormalization due to the the UV divergence present in the Wilson line connecting the quark and anti-quark [49]. When a lattice fermion that breaks chiral symmetry at finite lattice spacings is used, as is the case in this paper, it has been shown that only the renormalization of γ t qPDF operator is purely multiplicative, while the γ z qPDF operator mixes with the scalar qPDF [6,20]. A renormalization scheme that is implementable on the Euclidean lattice is the RI-MOM scheme and it is now standard in the lattice QCD literature. The corresponding RI-MOM counterterm for qPDF operator in leading order perturbation theory has been worked out using off-shell quark external states [8], and it is one of the ingredient used in the perturbative matching of RI-MOM renormalized qPDF to the MS PDF. In this section, we discuss the renormalization procedure, and then compare the running of the renormalization constants as de-termined on the lattice with the corresponding perturbative expectations. This allows us to quantitate the validity of the leading order perturbation theory and matching.
For non-perturbative renormalization, we compute the expectation value of qPDF operator between offshell quark external states with momentum p. We refer to the momentum of quark in the direction of Wilson-line as p z and the magnitude of the component perpendicular to the Wilson-line as p ⊥ . For these computations, we use Landau gauge fixing. Let Λ Γ (z, p) be the quark-line amputated bare qPDF, where Q(p) is the quark propagator u(p)u(p) and u(p) = x u x e −ip.x . Let us define the bare qPDF after projection as consistent with the definition used in perturbative calculations. Here, P is the operator used to project onto one of the γmatrices Γ = γ α , and Tr(. . .) is a trace over both color and Dirac indices. Based on previous works [8,20], we will use / p-projection for which P = / p/ (12p α ). Alternatively, one can use P = Γ [21] or minimal projection [8]. In the case of Γ = γ t , since the renormalization is simply multiplicative, the renormalized quark qPDF is given by where the z-dependent RI-MOM renormalization constant Z is determined using the renormalization condition set at momentum p R as The right-hand-side of the above equation is the tree-level value of q γ t . The renormalization constant so obtained is in general a complex number. For Γ = γ z , we have to take care of mixing with the scalar Γ = 1. Hence, the renormalized qPDF is defined as The diagonal part Z γ z γ z and the mixing term Z γ z 1 are determined using the two RI-MOM conditions [20] Using the renormalization constants Z determined above using quark external states, the renormalized pion qPDF can also be determined by where Z q is the quark renormalization, that can be determined using the condition [50] Z q (p R ) −1 1 12 where Q(p) is the quark propagator determined using Landau gauge and Q tree is the free quark propagator for which we use the free massless Wilson-Dirac propagator. In Fig. 8, we show the renormalization factors using the above RI-MOM renormalization conditions on the 0.06 fm ensemble. On the top left panel of Fig. 8, we show the real and imaginary parts of Z γ t γ t determined at p R z = 1.29 GeV and p R ⊥ = 1.49 GeV. The rapid, almost exponential, increase in Z with z is due to the self-energy divergence present in the bare Wilson line that connects the quark and antiquark in the qPDF operator. This divergent piece, e c|z| , cannot be captured perturbatively and it needs to be determined nonperturbatively in a particular scheme. However, this might not be an issue for the one-loop matching if e c|z| cancels exactly between the renormalization factors and the bare qPDF operator. Therefore, we remove e c|z| from the renormalization constant that is shown in the top-left panel, and display the result in the top-right panel. The value of c for our a = 0.04 fm ensemble was determined in [51], and for 1-HYP Wilson line ca = 0.1586. This removal of Wilson line self-energy reduces the almost exponential dependence of Z(z) to a weak dependence on z. In fact, we see that both the real and imaginary parts of Z γ t γ t remain O(1) even up to z = 1 fm, thereby providing a qualitative justification for the usage of leading order perturbation theory to describe the lattice data at short z and at high quark momenta. We show similar data in the bottom left and right panels for the Z-factors for γ z qPDF. In this case, we have the diagonal factor Z γ z γ z as well as off-diagonal factor Z γ z 1 to take care of mixing with scalar on the lattice. We show Z γ z γ z and Z γ z 1 as the filled and unfilled symbols in the bottom panels. We observe that the imaginary part of Z γ z γ z is small compared to the real part. This is not the case for Z γ t γ t , which in turn will affect the asymmetry of the u − d qPDFq u−d (x) of pion about x = 0. We also note that mixing of γ z with scalar is a minor 5-10% effect, but we nevertheless take care of it in our calculation.
As we discussed in the last section, the matrix element at z = 0, h γ t is the local current operator which will be exactly conserved in the continuum limit. Hence, Z q Z γ t γ t (z = 0) is the vector current renormalization factor Z V and the dependence of Z V on p will give us an idea of the leading (pa) 2 perturbative lattice artifacts for values of p Λ QCD as well as the other higher order (or perhaps non-perturbative) contributions to this lattice correction to Z V at smaller renormalization scales [50]. In Fig. 9, we show Z q determined using Eq. (20), the value of Z γ t γ t at z = 0 as well as their product Z V as a function of (pa) 2 . One sees a reasonable plateau for Z V ≈ 0.97 only for (pa) 2 > 2. For comparison, the value of Z V as obtained from the bare pion isospin charge h B at the other non-zero P z also give consistent values. With the uncertainties of choosing the scaling region in (pa) 2 to take the (pa) → 0 limit of Z V , we expect the Z V to be in the range 0.97 to 0.99. For relatively smaller values of renormalization momenta (pa) ≈ 1 − 1.5, chosen such that the renormalization scales lie in the vicinity of pion momenta used in this paper, one sees noticeable, but small 5% dependence on pa. We used the value of Z q estimated at the same value of p as used in Z γ t γ t for renormalizing our pion qPDF.
A. Comparison with leading order perturbation theory for z < 0.3 fm We will now investigate in a quantitative way the agreement/disagreement of the lattice determination of the RI-MOM renormalized amputated quark qPDF at z < 0.3 fm which one can expect to be in the perturbative regime. For this, we construct a quantity ζ Γ (z, p, p R ) in the following way where q R Γ (z, p, p) = e ip z z by renormalization condition. In the case of Γ = γ t , the above definition is simply which is similar to a discrete scale-dependent anomalous dimension ∂ log Z γ t γ t (z, p) /∂p. Through the dependence of ζ  on p R slightly away from p, we can understand how well the leading order perturbation theory is able to describe the exact non-perturbative determination on the lattice. It is important to stress that apart from understanding non-perturbative renor-malization of qPDF in this way, we are also essentially comparing one of the steps in LaMET formalism that is also calculable on lattice. Hence, any agreement/disagreement we observe quantifies the limitations of leading order LaMET also. In perturbation theory, ζ is the ratio of one-loop perturbative correction to q(z, p) to its tree-level value. This expression for ζ has been calculated, and it is given by 1 where H(x, p) is the 1-loop correction term to the bare qPDF, and the two terms in the right hand side come from the bare and RI-MOM renormalization counter terms respectively 2 . GeV respectively, and they are chosen to be roughly equal for this comparison. The real part of ζ is shown in red while the imaginary part is shown in blue. The band enclosed by the solid red (blue) curves corresponds to the 1-loop result for real (imaginary) parts of ζ at (p z , p ⊥ ) for the 0.04 fm data, and similarly the band enclosed by the dashed curves corresponds to (p z , p ⊥ ) of the 0.06 fm data.
The functional form of H(x, p) for γ z and γ t isovector qPDF is given in [7,8] and therefore we do not provide them here. The asymptotic 3/(2|x|) behavior of the bare and the RI-MOM counter term, that contributes to the UV divergence when integrated over x, gets exactly canceled and leads to UV finite and renormalized result for ζ. In the discussions below, we will consider the cases with p z = p R z and p z p R z separately. In the above leading order formula, the scale at which α s has to be evaluated is unclear. Therefore, we vary α s by changing the scale from 0.5p R z to 2p R z through the 1-loop running, and quote this variation as an uncertainty in the perturbative results below. On the lattice side, we determine ζ(z, p, p R ) using non-perturbatively determined Z-factors. In order to estimate the lattice spacing effects, we determined ζ using two different lattice spacings; a = 0.04 fm is shown as filled symbols and a = 0.06 fm is shown as open symbols in various plots that follow.
In the left and right panels of Fig. 10, we show the typical dependence of ζ γ t (z) and ζ γ z (z) respectively, as a function of z when p R ⊥ differs slightly from the transverse quark momentum p ⊥ , while the longitudinal components p z and p R z are the same. Using Eq. (23), we calculated the prediction from leading order perturbation theory for ζ(z) at the same values of momenta. The uncertainty bands for the perturbative result are shown in Fig. 10 along with the actual lattice data at the two different lattice spacings that are shown using symbols. For the data shown in Fig. 10, the longitudinal components p z for the two lattice spacings are exactly 1.92 GeV, but the transverse components p ⊥ are only approximately the same between the two due to the constraints of allowed momenta on the two different lattice volumes i.e., p ⊥ = 1.49 GeV for a = 0.06 fm and p ⊥ = 1.67 GeV for a = 0.04 fm. To take care of this slight offset in p ⊥ between the two lattice spacings, we have distinguished the perturbative results corresponding to a = 0.04 fm as bands enclosed by solid lines, and similarly for a = 0.06 fm as bands enclosed by dashed lines. It can be seen that the two perturbative results are not very sensitive to this difference in p ⊥ assuring us that whatever change we observe between the data at two different a is mainly due to change in a. We observe from the plots that the leading order perturbation theory captures the qualitative z-dependence of both the real and imaginary parts of ζ when p R ⊥ is changed from p ⊥ . Surprisingly, the 1-loop result seems to work better for ζ γ z than for ζ γ t . In the case of γ t , one can certainly see a large lattice spacing effect with the movement of data towards the 1-loop result as lattice spacing is reduced, while in the case of γ z , one can already see a consistency with the one loop result at the lattice spacings that we use. Thus, it opens up a question on whether γ z qPDF fares worse compared to γ t qPDF simply due to the presence of mixing with scalar or whether γ z qPDF might eventually show better perturbative convergence and lesser lattice spacing dependence in spite of its other disadvantages.
In Fig. 11, we concentrate on the renormalization flow of ζ Γ at fixed small value of z = 0.12 fm. The two panels show the dependence of ζ γ t and ζ γ z on p R ⊥ which is changed around p ⊥ . As before, we keep p z = p R z = 1.92 GeV. The 1-loop result is able to capture the qualitative trend of the flow in both γ z and γ t . For both the cases, we can see that the reduction of lattice spacing leads to better agreement with 1-loop result. Having discussed the cases where p z = p R z , we now study the dependence of ζ on p R z p z , while keeping p ⊥ = p R ⊥ . We show the z-dependence of ζ γ z when p R z = 1.5p z in Fig. 12. We find the perturbative result to have the same qualitative behavior as the lattice data. Putting together the various observations in this section, we found only an overall qualitative agreement between the lattice results on ζ and the one-loop perturbative results. When the lattice spacing is reduced, we found the agreement to get better. It remains to be seen what the effect of including higher order corrections in the perturbative result for ζ.

B. A way to classify quark-antiquark separations as perturbative or nonperturbative
Physically, one would expect that for well-separated quarkantiquark with z > 1 fm, one would start seeing traces of nonperturbative physics in the qPDF. Quantifying the advent of non-perturbative physics for large enough z at finite quark/hadron momentum is important with regard to the extraction of PDF since the real-space qPDF at all z enter the computation of its Fourier transform. A simple first approximation to study this effect is the following. In free theory, the qPDF with external quark states is a pure wave e ip z z . We expect, to a first approximation, that the effect of nonperturba- FIG. 12. The dependence of ζ on the longitudinal momentum p R z . ζ γz is shown as a function of z for a specific choice of p z and p ⊥ = p R ⊥ . The uncertainty bands for the real and imaginary parts for the leading order expectation are shown using bands enclosed by solid lines. The symbols are the lattice data determined at lattice spacing a = 0.04 fm.
where we have removed the UV divergent piece e −c|z| from the qPDF and defined the left-over exponent m scr as a physical scale. We have also accounted for ω p z in the interacting theory since the quark can lose momentum by emitting gluons. There could be remnant non-trivial dependence of the amplitude A on z, which we assume to be sub-leading compared to the leading damped oscillatory behavior and ignore them in the discussion here. There is an ambiguity in m scr depending on the scheme used to determine the divergent piece c. Since the values of c determined from static quark potential method ensures that the renormalization factors after the removal of e −c|z| are O(1) at smaller z in Fig. 8  separation of the exponential suppression factor into a divergent and physical scales as defined in Eq. (24) is well motivated in this scheme. In Fig. 13, we show the bare quark qPDF q γ t (z, p) for p ⊥ = 1.49 GeV and p z = 1.29 GeV determined on the a = 0.06 fm ensemble. The short distance can simply be described by a pure oscillatory e iωz behavior which is shown using the dashed curves (with ω = 0.85p z for the case shown). The solid curves in the figure correspond to the ansatz in Eq. (24) which describes the data at larger |z| well. Without dwelling further on finding the best parametrization of the lattice data that asymptotically behaves like Eq. (24), we simply define an effective z-dependent ω and m scr through In Fig. 14, we show the behavior of m scr and ω as a function of z as extracted from q γ t (z, p). We have chosen different set of p z and p ⊥ to show the dependence on p z at fixed p ⊥ and vice versa. From the top panel, we see that ω/p z is below 1 for z < 0.4 fm and seems to approach a plateau closer to 1 for z > 0.4 fm. While the values of ω at short distances is dependent on p z and p ⊥ , the approach to ω ≈ p z is universal. We observed this behavior when we used q γ z (z, p) as well. A physical reasoning for this observation could be that at shorter z, the quark has the ability to radiate a gluon, and at distances z > 0.4 fm there is effectively a dressed quark carrying all the momentum. In the bottom panel of Fig. 14, we have shown the effective screening mass m scr . In the plots, we have only shown the data where 1-HYP smeared Wilson line was used. For this case, we subtracted ca = 0.1586 in Eq. (25) to get m scr . One can clearly see the emergence of non-zero m scr ≈ 300 MeV for |z| > 0.5 fm which is in the typical Λ QCD scale. When we repeated this using quark qPDF with unsmeared Wilson line, we found the results to be consistent with the data shown in Fig. 14 after we subtracted out ca = 0.3687 corresponding to unsmeared Wilson line. This assures us that the observed m scr ≈ 300 MeV is a real physical scale independent of the self-energy divergence of the Wilson line. This signals the significant presence of a confinement scale beyond z ≈ 0.5 fm. Also, the near plateauing of both ω and m scr for these larger z indicates that a simple physically motivated ansatz in Eq. (25) surprisingly offers a very good description of the actual non-perturbative data. One could have expected this simply from observing the large |z| part of Fig. 13. It remains to be seen if this observation can be used advantageously in improving the LaMET matching at finite moderately large p z .

VI. FROM RENORMALIZED QUASI-PDF TO PDF
A. On obtaining the valence PDF using isovector u − d qPDF of pion Having determined the renormalized qPDF we can now discuss the matching between qPDF and PDF as well as the determination of pion PDF from our lattice results. We computed the u − d qPDF matrix element of a pion which in practice we obtained from the real part of the connected piece of the u quark qPDF matrix element. Now, we discuss how the u − d qPDF and PDF are related to the valence PDF of pion.
The u and d quark distributions, f u (x) and f d (x), as determined using Eq. (1) has support from x = −1 to 1. One can make connection with the conventional, separately defined quark distributions Q u,d (x) and the anti-quark distributions Q¯u ,d (x) that are non-zero only between x = 0 and 1, through the relation Therefore, f u,d (x) contains information on both the quark as well as the antiquark distributions in the positive and negative regions of x respectively. Let us first focus on x ≥ 0. In the isospin symmetric case we are considering, Q u (x) = Qd(x) and Qū(x) = Q d (x). Therefore, for the positively charged pion is the valence u-quark distribution. Again, due to isospin symmetry the u andd valence distributions are the same as f π,u v (x) = f π,d v (x) = f π v (x). However, unlike the valence quark distribution, the isotriplet and it has support from −1 to 1. That is, Therefore, one can obtain the u − d quark distribution, and from it, one can obtain f π v (x) from x ∈ [0, 1], or equivalently, from [−1, 0].
By applying the matching formula on f u (x) and f d (x) separately and taking the difference to obtain the u − d RI-MOM qPDF, we now try to learn what is expected for this qPDF. Writing down only the x/y dependence for the sake of brevity and keeping the dependence on yP z /P R z , (P R /P R z ) 2 and factorization scale µ implicit, the one-loop contribution to matching kernel C(x/y) from RI-MOM to MS consists of two terms: F 1 (x/y) and F 2 (1 + η (x − y)) with η = P z /P R z . The expressions 3 for F 1,2 depend on the choice of Γ (γ z or γ t ) [7,8]. Furthermore, F 2 depends on the projection method of RI-MOM scheme. Using the matching formula [7,8] on f u (x) and f d (x) to obtain the qPDFsq u (x) andq d (x), The above equation includes both the sea and valence quarks, and there will be mixing with the gluon PDF which are included in the "· · · " part. In the above convolution, the vector current conservation is ensured by the the plus function defined as such that any extra variable that F 1,2 will depend on are held fixed in the above integral. Since the matching between qPDF and PDF is linear, with the terms in "· · · " in Eq. (28) exactly canceled between the u and d terms. This is the matching relation we use to obtain the u − d PDF from u − d qPDF. Using the u − d PDF, we obtained the valence PDF as discussed above. While in the RI-MOM scheme. One way to understand this is from the fact that the bare qPDF matrix element is purely real while 3 In [8], the terms F 1 and F 2 are referred to as f 1 and f 2 , respectively the RI-MOM renormalization factor is in general complex making the renormalized qPDF matrix element complex. One can see this by starting from the matching convolution above and find that, is non-zero due to an RI-MOM specific term F 2 , while the terms containing F 1 cancel due to its dependence only on |P z |. In other schemes such as the MS, this symmetry about x = 0 would be preserved by matching because the corresponding factorization formulas depend on renormalization/regularization scales through combinations such as µ 2 z 2 [52]. In RI-MOM scheme there are two renormalization scales, P R and P R z , and since the z-direction is special the above statement does not hold. Thus, it is important to capture this asymmetry in the qPDF, or equivalently to describe both the real and imaginary parts of the RI-MOM renormalized pion qPDF from matching. We use the matching kernel corresponding to the / p-projection in the results to be discussed next.

B. Numerical results on pion valence PDF from matching
The one-loop perturbative matching relates the Fourier transformq(x, P z , P R ) of the renormalized RI-MOM realspace qPDF matrix element h R (z, P z , P R ), and the MS PDF f (x, µ) at factorization scale µ. The relation is through the convolution in Eq. (3). There are two approaches to consider here: 1. One can parametrize the real space data h R (z, P z , P R ) over the range z where one has the lattice data and then model the dependence of h R (z, P z , P R ) over z extending to infinity where data does not exist (c.f., [53,54]). Using such a parametrization, one can obtain its Fourier transformq(x, P z , P R ). Since, the matching is only up to O(α s ), one can invert the relation Eq. (3) by replacing f ↔q and α s → −α s . Thereby, one can obtain f (x, µ).
In this method, one does not control what values of z enter the Fourier transform and one could question the validity of perturbation theory for z > 1 fm.
2. One can start from a phenomenologically motivated nparameter family of PDFs f (x, µ; a 1 , . . . a n ). Through Eq.
(3), one can obtain qPDFq(x, µ; a 1 , . . . a n ), and thereby, obtain a family of real space space qPDF matrix elements h R (z, P z , P R ; a 1 . . . a n ). Using this, one can fit the parameters (a 1 , . . . , a n ) so as to best describe the real space lattice data over a range z. This method was used in the case of lattice cross-section approach in [13]. Since the model PDFs are not predictions from QCD, the model dependence enters the analysis and one has to rely on the prior that experimentally determined PDFs are indeed very well described by such family of PDFs. However, the advantage of this method is that one can precisely control the range of z that enters the analysis, and one also does not have to invert the matching convolution.
From our observation on how the 1-loop perturbation theory fails to capture the quark qPDF quantitatively even at short distances and from the observation of significant nonperturbative screening effects beyond z = 1 fm, we think it is important to be in control of what values of z enter the convolution and hence, in this paper we take the second approach. Also, due to the loss of signal to noise ratio for z > 1 fm, we found Fourier transforming the noisy data to be challenging without introducing unwanted wiggles inq(x) at larger x.
To be on par with the experimental extraction of PDFs, one should use sophisticated methods such as the usage of neural networks to choose the set of model PDFs to start with (c.f., [55]). We defer such an analysis to a future work and instead, we use a simple two-parameter phenomenologically motivated functional form for the valence PDF: for x ∈ [0, 1] and zero elsewhere. As we will see below, such a form is enough to describe our lattice data. One can fix the coefficient A through a stringent condition 1 0 f π v (x)dx = 1. Instead, we use a more conservative constraint on A using 1 0 f π v (x)dx = h R (z = 0, P z , P R ) to allow for sample by sample fluctuations in h R (z = 0, P z , P R ) close to 1 and fold this into the error estimate. It should be noted that the valence PDF of pion determined from the experimental data by the JAM collaboration [56] can be well described by such a two parameter ansatz, for example with a = −0.407 and b = 1.12 at µ = 3.2 GeV.
Using the above valence PDF, we construct the u − d PDF as with x ∈ [−1, 1] and zero elsewhere. Through the convolution of f u−d (x) with the matching kernel, we obtainq u−d (x; a, b), which in turn we use to construct the real space qPDFs h(z; a, b) = ∞ −∞q (x; a, b)e ixP z z dx. We will refer to these functions h(z; a, b) as the two-parameter family of phenomenologically motivated qPDF matrix elements. With the set of h(z; a, b) from a range of a and b, we can fit the parameters a and b to the data by minimizing either χ 2 r or χ 2 ri below: In the above equations, [−z max , z max ] specifies the fit range. The statistical errors on the real and imaginary parts of the lattice data h R (z) is σ r (z) and σ i (z) respectively. To account for systematic errors coming from higher order corrections in α s in the matching kernel, we determine h(z; a, b) from f π v (x; a, b) by varying the value of α s in matching kernel from α s (µ/2) to α s (2µ) though the 1-loop running. The corresponding change in real and imaginary parts of h(z; a, b) is denoted as σ pert r (z) and σ pert i (z) respectively, and we include this uncertainty in the matched result in the χ 2 . If the matching was exact, then by fitting only the real part by minimizing χ 2 r would automatically guarantee that the imaginary part also agree with the The top and bottom panels show our estimated pion valence PDF at µ = 3.2 GeV using γ t qPDF at P z = 1.29 and 1.72 GeV respectively. The results using multiple RI-MOM scales (P R z , P R ⊥ ) are shown using different colored error bands. On the left panels, the results for f π v (x) are shown, while on the right panels the results for x f π v (x) are shown. For all the cases shown, the fit range was held fixed at z max = 0.98 fm. The solid line (with a small error band around) is the JAM result [56] for pion valence PDF at the same µ.
data. Therefore at any finite order matching, the fits obtained by minimizing χ 2 r and χ 2 ri will in general be different. For the results shown below, we used χ 2 ri in order to obtain the PDF that best describes both the real and imaginary parts of the real space qPDF, but we also used χ 2 r and found it to lead to consistent results, but with larger uncertainties. We did not include the correlations between the data at different z for the primary reason that it is difficult to keep these correlations intact in the process of excited state extrapolations. It also helps us to easily incorporate the effect of σ pert from non-statistical origin in the analysis, and in treating the real and imaginary parts of the renormalized matrix elements as two distinct pieces of data as is the case in the context of matching. We determined the errors on the fit parameters through bootstrap analysis.
In Fig. 15, we show the fitting procedure for γ t qPDF. In the top panels, we show the P z = 0.86, 1.29 and 1.72 GeV real-space RI-MOM pion qPDF matrix elements from left to right. The symbols are the actual lattice data. The solid and patterned red (blue) bands are 1-σ error-bands of the real (imaginary) parts of the fitted real space qPDF matrix element that best fits the data over the range [−z max , z max ] for z max = 1.44 and 0.72 fm respectively. The agreement with both real and imaginary parts of the lattice data is noteworthy. In fact, we find the qPDF matrix element as inferred from the JAM PDF [56] is able to explain the lattice data well for the entire range of z at the two largest momentum. In the bot-tom panels, we show the process leading from model PDF to the real space qPDF matrix elements shown in the top panels. In order to avoid cluttering the figure, we have shown only the mean value of f u−d (x) (shown as dashed lines) while we have shown the error bands for the qPDFq u−d (x) as obtained through 1-loop matching. The colors red and blue in the bottom panels correspond to the fits with z max = 0.72 and 1.44 in the top panels respectively. As it can be seen, we started from a symmetric u − d PDF by construction and matching introduces an x → −x asymmetry. After Fourier transformation, this asymmetry leads to the imaginary part in the real space data in the top panels which captures the lattice data to a good accuracy. For both the real-space as well as in x space, we find no significant difference between using z max = 1.44 fm and 0.72 fm in the fits. We could infer that within the precision of our numerical results, the non-perturbative effects at z ≈ 1 fm that we found using quark qPDFs is not important. Therefore, we show results for an intermediate z max = 0.98 fm in the results below. When we repeated this analysis by minimizing χ 2 r , we found estimates to be consistent with the above, but with larger uncertainties.
In Fig. 16, we show our results for f π v (x, µ) and x f π v (x, µ) at the factorization scale µ = 3.2 GeV using the procedure described above at our two largest pion momenta P z = 1.29 GeV and 1.72 GeV starting from γ t qPDF. For each case, we overlay results from two different RI-MOM scales P R in or- The 1-σ confidence region ellipse of the exponents a and b in the model PDF at µ = 3.2 GeV that best describes the real space RI-MOM qPDF are shown. The solid lines and dashed lines correspond to P z = 1.29 GeV and P z = 1.72 GeV. For each of these pion momenta, the different colored lines correspond to different RI-MOM scale P R . The black point is the JAM value [56] for valence pion PDF. The black straight line is the line of constant first moment of valence PDF, x = 0.215. der to show the scatter as a systematic error in our estimates. We find the P R dependence to be minor compared to the error bands (we repeated the analysis with multiple other values of P R that are not shown and only minor scatter with respect to P R was seen). We also show the result from the JAM collaboration [56] for the pion valence PDF at the same factorization scale as the black solid line, which lies within the statistical and systematic uncertainties of our estimates. In the left panels showing f π v (x), this overall agreement can be seen even up to smaller x, but one has to be cautious of this agreement for x Λ QCD /P z ≈ 0.2 for the two highest pion momenta we use. By construction, in our fitting procedure f π v (x) has support only from 0 to 1 without any necessity to recover this condition in the infinite P z limit. However, the values of exponent b closer to zero are also allowed thereby leading to a wider error band closer to x=1. This seems to be consistent with the observation in Ref [18] that the PDF obtained from qPDF through inverse one-loop matching (approach-1) vanishes at about x ≈ 1.2. We see our P z = 1.29 GeV and 1.72 GeV estimates to be consistent albeit with a significant increase in error at the largest momentum.
In Fig. 17, we summarize the information in Fig. 16 by showing the 1-σ ellipses (whose x and y projections give the marginal 68% confidence intervals of the exponents a and b respectively). In this figure, the dashed and continuous ellipses are for P z = 1.29 and 1.72 GeV respectively. The ellipses for different P R are distinguished by the colors, with the color code being the same as in Fig. 16. The P z = 1.29 GeV data offers a stronger constraint on allowed region of (a, b) than the noisier P z = 1.72 GeV. In this plot, the JAM estimate is the black point. The JAM data is well within the P z = 1.72 GeV ellipses while the P z = 1.29 GeV data seems to favor slightly smaller exponent b. However, these differences are In the top and bottom panels, the real (red) and imaginary (blue) parts of the renormalized real space γ z qPDF matrix element are shown for pion momenta P z = 1.29 GeV and 1.72 GeV respectively. The data points are the actual lattice data. The bands are the expected matched γ z qPDF matrix element starting from our best estimate for valence pion PDF obtained using γ t qPDF analysis. well within 2σ. Even though our lattice data has large errors on the exponents a and b individually, the data offers a tight constraint on the combined allowed region. In particular, the principal component of this correlation between a and b points directly at the JAM data implying that if one fixes the exponent a to be from the experiment, then the best value of b would also be closer to that from the experiment. To understand this better, we have also shown the line of constant value of first moment of the valence PDF, x = 1 0 xq π v (x)dx, set to 0.215 as inferred from the JAM data. It is clear that the 1σ ellipses are oriented along this line, which means that qPDF determines x robustly and this in turn provides a strong constraint in the allowed PDFs. Not surprisingly, we do find consistent values of x = 0.21(2) and 0.22(3) from the P z = 1.29 and 1.72 GeV estimates. It should be noted that the moments of pion PDF have also been directly determined without the usage of LaMET formalism [57][58][59][60][61] and similar values for the first moment for the pion were obtained, but at slightly different values of µ 2 than used here.
The exponents a and b were also recently obtained using the lattice cross-section approach [13] which used currentcurrent correlators, with matching implemented at tree-level. Here, the exponents were estimated as a = −0.34 (31) and b = 1.93 (68) which are consistent with region allowed at the largest momentum in Fig. 17. It is worth noting that there are indications from next-to-leading-logarithmic soft gluon resummation calculation [35] and from the Dyson-Schwinger equation [62][63][64] that the value of exponent b could be ap-proximately 2 as expected from perturbative counting rule (c.f., [65]). It will be interesting to see if a similar implementation of an improved matching kernel could lead to a softer large x behavior for the pion than what is observed using 1loop qPDF matching here and perhaps in [18]. In fact, a general consideration of power corrections to qPDF [66] revealed the presence of the form Λ 2 QCD / (1 − x)x 2 P 2 z implying higher values of P z might be required in order to correctly describe physics close to x = 1, and this might be the effect which we are finding. Similar conclusions were obtained in 2d QCD also [67].
Due to the larger errors in the γ z qPDF attributed mostly to the steep excited state extrapolations, we use the γ z qPDF to provide a consistency check of our calculations instead. For this, we use our above best estimates of the PDF obtained using the γ t qPDF to get the corresponding prediction for the real space γ z qPDF matrix element through a convolution with the appropriate matching kernel. In the top and bottom panels of Fig. 18, we show such a comparison between the actual real space data of γ z qPDF (data points) along with the prediction from our estimated PDF (bands) for pion momenta P z = 1.29 and 1.72 GeV. We find good descriptions of the real part of the RI-MOM γ z qPDF at both the pion momenta with a slight tension between the imaginary parts. From our discussion on excited state contamination, it is important to first gain better control of the larger excited state contamination in γ z qPDF before one can investigate the effect of one-loop matching on this rather small discrepancy.

VII. CONCLUSIONS
We studied pion PDF in the framework of LaMET, which relates the qPDF to PDF through the matching convolution in Eq. (3). For this, we used small lattice spacing a = 0.06 fm. We carefully examined the effects of excited states using 2 and 3 exponential fits of the relevant 2-point and 3-point functions as well as the summation method. For our final analysis we used two momenta P z = 1.29 GeV and P z = 1.72 GeV. We found the qPDF defined using Γ = γ t was better determined compared to the γ z qPDF in the lattice calculation for the following reasons: smaller statistical error, relatively smaller excited state extrapolation leading to more robust result for the matrix element as well as due to the absence of mixing. Therefore, we focused on the analysis of the γ t matrix element.
The pion qPDF was non-perturbatively renormalized using RI-MOM scheme by calculating the matrix elements of qPDF operator with off-shell quark states in Landau gauge for different separations z. For these calculations we also used finer lattices with lattice spacing a = 0.04fm. We performed the comparison of this matrix element in Landau gauge with 1loop perturbative calculations in RI-MOM scheme and found qualitative agreement for z < 0.3 fm. For the smaller lattice spacings, a = 0.04 fm we even found quantitative agreement with the 1-loop result for sufficiently small z. We also explored the role of non-perturbative effects in the calculation of the off-shell matrix element. The real part of the RI-MOM renormalization coefficient is close to one, while the imagi-nary part is close to zero once the divergent self energy part of the Wilson line is removed. We pointed out that the RI-MOM renormalization procedure leads to asymmetry in the iso-vector pion qPDFq(x, P z , p R z , µ R ) around x = 0, while other renormalization procedures lead to qPDF that is symmetric around x = 0.
From the renormalized qPDF, we determined the valence quark pion PDF using 1-loop perturbative matching of the γ t qPDF, which we implemented through a fit to phenomenologically motivated x a (1 − x) b functional form for the valence PDF. We found our results for the pion valence PDF using the two largest pion momenta were consistent with each other, though the statistical errors are rather large. An overall agreement with the results obtained recently by the JAM collaboration [56] was seen. We found our result for the PDF to capture the first moment x more robustly than the small-x and largex exponents, a and b themselves. We used the γ z qPDF matrix elements to provide an internal consistency check by comparing to the expectation from our estimates of the PDF and a satisfactory agreement was seen. From our analysis it is clear that the dominant source of errors in the PDF determination is the statistical error of the lattice calculations. It will be necessary to significantly increase the statistics in the future lattice calculations. Future high statistics lattice calculations will be important for an accurate determination of the pion PDF as well as testing of the LaMET approach around small x. In order to create hadron interpolating operators that have good overlap with the corresponding ground states, quark field smearing is typically required. The amount of applied smearing is tuned to produce spatial quark distributions of roughly of the same spatial size as the hadron. Gauge-covariant Wuppertal (Gaussian) smearing [68] is commonly used for this purpose. However, calculation of quasi-and pseudo-PDFs and high-momentum hadron structure and spectrum in general requires lattices with small lattice spacing. Keeping the physical size of smeared quark distributions the same becomes a numerical challenge on finer lattices because it requires larger numbers of smearing iterations. For this reason, we use Gaussian shape smearing in a fixed (Coulomb) gauge that can be performed efficiently through a convolution with a Gaussian profile kernel, In the free-field case, this kernel corresponds to the Wuppertal smearing operator (1 + w 2 4N ∆ sp ) N , where ∆ sp is the spatial Laplacian and w 2 = 2w 2 CG . The value for the width w CG is chosen to match the mean-squared radius r 2 = 3w 2 CG to that of the optimal Wuppertal-smeared quark sources. First, we fix the Coulomb gauge where Ω C x is the gauge transformation to the Coulomb gauge, which minimizes the functional (for the Coulomb gauge, µ t and the functional is minimized independently on each time slice). The numerical implementation is identical to the algorithm used for fixing the Landau gauge in NPR calculations. Application of the smearing kernel requires two 3D Fourier transformations which is accelerated with offloading matrix-matrix products to GPU. Incorporating momentum (boosted) into Coulomb-gauge Gaussian smearing amounts to translation of the kernel in the momentum space, x,y e −i k y ψ y = e i k x S e −i k y x,y ψ y (A5) In a periodic finite volume, care must be taken to avoid spatial discontinuities in the boosted smearing kernel (A5). Such discontinuities may arise because the optimal boosted smearing momentum k typically does not conform to finite-volume momentum quantization k = 2π n L and the phase factors e i k x , e i k y do not satisfy periodic boundary conditions. The solution is to define the smearing kernel in the momentum space as S ( k) x, y = p e i p( x− y) e − 1 2 w 2 CG ( p− k) 2 , where the momentum difference ( p − k) is understood as the shortest distance between p and k in the Brillouin zone. Such choice leads to a smooth distribution in the momentum space and respectively smooth and continuous smearing kernel in the coordinate space.
Finally, it is important to note that the smearing kernel A5 is Hermitian (as an operator acting in the [coordinate ⊗ color] space), which is similar to the (boosted) Wuppertal smearing operator and important for computing symmetric hadron correlation functions.

Appendix B: Meson correlation functions with boosting
We use the interpolating operator for the π + =du meson which is constructed from smeared quark fields where the spinor matrix Γ M = γ 5 . The Hermitian-conjugated (creation) meson operator is where Γ M = γ 4 Γ M γ 4 = (−γ 5 ). The meson two-point correlation function with boost-smeared source and sink and momentum projection at the sink is 4 C 2pt (y 4 , p; x) = y e −i p( y− x) π +,(2 k) y π +,(2 k) † where Q q x,y = q xqy andQ q,(± k) = S (± k) Q q S (± k) are unsmeared and smeared quark propagators, respectively. Note that the meson two-point function is constructed from the u-quark propagator y ← x and the d-quark propagator x ← y smeared with momenta ( k) and (− k), respectively. Therefore, separate propagators for u and d quarks are required to construct meson correlation functions "boosted" with the total momentum (2 k),Q u,( k) y,x = S ( k) y,y Q u y ,x S ( k) x ,x ∝ e i k( y− x) , where "∝" sign stands for additional coordinate dependence due to the boosting. The d-quark x ← y propagator, as usual, is computed using γ 5 -Hermiticity of the Dirac operator, where the sign of the boosting momentum is preserved due to the Hermiticity of the (boosted) smearing operator S (A6).
Repeating similar steps to for the meson three-point function with the insertion of the operator ū W Γ u z with arbitrary Γ-matrix and Wilson line W z,z+L = ( L U) z,z+L along path L, we get where the forward propagator F u = Q u S ( k) and the meson sink-sequential (backward) propagator Bd Γ M u(y 4 , p ) is defined as y,y Q d y ,z .
Appendix C: Explicit calculation to show that bare pion u − d three point function is purely real or imaginary In the previous appendix, we constructed the connected piece of the three point function of uWΓu operator in π + . If one repeats the computation using dWΓd operator, one will find the disconnected piece to be the same as the one in the full uWΓu three point function and hence such quark line disconnected terms will cancel in the uWΓu−dWΓd isospin nonsinglet operator that we are interested in. Below, we further explain as to why only the real part of the connected uWΓu three point function for Γ = γ t , γ z and imaginary part for Γ = 1 contributes to the total isospin nonsinglet three point function. For the sake of simplicity let us take the case of point source and point sink, and take Γ = γ t . The full expression for the u − d qPDF three point function is where we do not make distinctions between u and d quark propagators due to isospin symmetry. Let us call the trace in first term on the right hand side as T 1 and the second trace before being conjugated as T 2 . One can go from T 2 to T 1 by parity transformation x = ( x, x 4 ) → x p = (− x, x 4 ) , followed by a spatial translation x → x + L by making use of the transformation of the Dirac propagator to be Q x,y → γ t Q x p ,y p γ t and W x,x+L → W † x p −L,x p under parity. In this case, the γ t from parity transformation for Q commutes with Γ = γ t . In other cases, one should take care of the ± factor. Thus C 3pt becomes and therefore proportional to the connected piece of uΓWu, which is the first term in the above equation. We normalize the three point function such that the u − d isospin charge of the pion is 1. By going through the similar calculation, one can show that the three point function is real also for Γ = γ z while it is is purely imaginary for Γ = 1 u − d pion qPDF.  In this appendix we discuss some details of the calculations of the pion two point function. We tested several different sources for the pion. In these tests we used 50 gauge configurations. We used Gaussian sources with several steps of Wuppertal smearings as well as in Coulomb gauge (see main text). In Fig. 20 we show the effective mass for 40 and 90 steps of Wuppertal smearings as well as Coulomb gauge Gaussian sources of size 0.3 fm. We see that 90 steps of Wuppertal smearings and Coulomb gauge Gaussian sources give similar effective masses, while the excited state contamination is larger for 40 steps of Wuppertal smearings. We also studied the two point functions for different boosted Gaussian sources, with momentum boost k z . The corresponding effective masses are shown in Fig. 19 for P z = 0.86, 1.29 and 1.72 GeV. for different values of ζ = k z /P z . We clearly see that non-zero value of ζ improves the signal for all P z . We also see that ζ = 0.5 is too small, while ζ = 1.0 is too large for P z = 1.29, but works well for P z = 0.86 GeV.
Appendix F: Implementation of matching convolution Here, we describe the implementation of plus function in the matching formula such as to ensure current conservation. The matching kernel is of the form where the dependence on P R and µ are implicit. The first perturbative correction is a plus function that ensures the vector current conservation. The property we know of the plusfunction is that y , yP z = 0, since the second dependence of the function is independent of x. In order to implement the plus-function correctly, we can use the following procedure: The x-independent but momentum dependent coefficient N(yP z ) is with y held fixed as Λ → ∞ and → 0. The following is then true for any function f : dxdyC (1) + (x/y, yP z ) f (y) = dy dxC (1) + (x/y, yP z ) f (y) = 0, leading to the vector current conservation or equivalently to the total area preservation between the qPDF and PDF. With this prescription, the matching formula becomes ∞ −∞ dy |y| C (1) + (x/y, yP z )q(y) = reg dy |y| C (1) (x/y, yP z )q(y) − N(xP z )q(x).
It is convenient to write the above formula in an explicitly vector conservation preserving form as Appendix G: Results on two-state extrapolations to obtain the matrix elements at all P z In Fig. 4 in the main text, we showed some sample results for the t − τ/2 behavior and the t → ∞ extrapolations of the three-point function to two-point function ratio R(t, τ; z, P z , Γ) for Γ = γ t and γ z at a specific intermediate value of P z = 1.29 GeV. In Fig. 21 and Fig. 22 of this appendix, we show similar results at all P z for Γ = γ z and γ t respectively using Fit (2,2).
For the case of P z = 0, special care needs to be taken. For a finite temporal extent L t of the lattice, ignoring the effect of periodicity due to the presence of the terms e −E π (L t −t) in the denominator of Eq. (9) is justified when e −E π t e −E π (L t −t) . But, one should include the effect of boundary condition if the two terms become comparable. For the largest source-sink separation t/a = 12 we use, the contribution from the wrapping-around term, e −E π (L t −t) , relative to e −E π t for P z = 0 is 2.7%, whereas for higher P z it is negligible e.g., for the smallest non-zero momentum P z = 0.43 GeV, this effect is 0.2%. Hence, we included the term e −E π (L t −t) in the denominator of Eq. (9) for the extrapolation of R(t, τ, z; P z ) for P z = 0, and we also checked that the effect of the periodicity of lattice was indeed negligible for any of the non-zero P z we used.
For the case of P z = 0 displayed in the top-most panels of Fig. 22, we have shown the data in two ways to make the fits and the extrapolated value easier to understand. The unfilled symbols are the data for R(t, τ; z, P z , γ t ) defined as the ratio of C 3pt (t, τ; z, P z ) to C 2pt (t; P z ), and the solid curves are the fits including the e −E π (L t −t) term in the denominator of Eq. (9). While the fits describe the data well, the trend in the data with increasing t can be seen to be away from the extrapolated value. To make the reason clearer, we have shown the modified ratio of C 3pt (t, τ; z, P z ) to the two-point function without the wrap-around term, C 2pt (t; P z ) − A 0 e −E π (L t −t) , as the filled symbols. The dashed curves are now the fits using just Eq. (9). The values of the amplitude A 0 and the energy E π were obtained by the two-state fit as described in the main text. Now, the trend with increasing t is clearer.  The horizontal band corresponds to the extrapolated result for infinite source-sink separation. The case of P z = 0, in the top-most panels, is special due to the presence of the effect of lattice periodicity, and hence, the various symbols and curves for the top-most panels are explained in detail in the text of Appendix G.