Systematic uncertainties in parton distribution functions from lattice QCD simulations at the physical point

We present a detailed study of the helicity-dependent and helicity-independent collinear parton distribution functions (PDFs) of the nucleon, using the quasi-PDF approach. The lattice QCD computation is performed employing twisted mass fermions with a physical value of the light quark mass. We give a systematic and in-depth account of the salient features entering in the evaluation of quasi-PDFs and their relation to the light-cone PDFs. In particular, we give details for the computation of the matrix elements, including the study of the various sources of systematic uncertainties, such as excited states contamination. In addition, we discuss the non-perturbative renormalization scheme used here and its systematics, effects of truncating the Fourier transform and different matching prescriptions.


I. INTRODUCTION
Quantum chromodynamics (QCD), being the fundamental theory of the strong interactions, describes the interactions among quarks and gluons both in the perturbative regime, as well as in the nonperturbative regime with the emergence of complex structures like hadrons. As the latter requires a nonperturbative approach, the use of a discretized Euclidean spacetime to numerically solve QCD has been proven to be extremely successful in the computation of physical properties of hadrons, such as their masses and decay constants. The rich and diverse structure of hadrons is described in terms of observables like various kinds of form factors, revealing different structural aspects, such as the average distribution of electric charge or parton (quark and gluon) momentum and spin inside the hadron. Even more information is contained in distribution functions, such as parton distribution functions (PDFs), generalized parton distributions and transverse-momentum-dependent PDFs, which decompose the aggregate observables into probability densities for partons with specific longitudinal momentum, transverse momentum and momentum transfer within a scattering event. The simplest of these are the PDFs, which have been studied intensively and continuously in experimental facilities over the last few decades, most notably for the proton, and also provide input in collider experiments. It is well established that the main source of information on PDFs are global QCD analyses, providing accurate results due to theoretical advances and new data emerging from accelerators, covering different kinematical regions (see e.g., Refs. [1][2][3][4][5][6][7][8][9][10][11][12]). Such analyses rely on the factorization framework, in which scattering cross sections, generally obtaining contributions from all energy scales, are written as convolutions of perturbatively computable coefficient functions describing the highenergy scales, and the low-energy PDFs. Despite the tremendous progress in phenomenological parametrizations of PDFs, the procedures are not without ambiguities [7], as there are still kinematical regions that are not easily accessible experimentally, e.g., the large Bjorken-x region. Uncertainties on the PDFs in the large-x range propagate to smaller-x regions when QCD evolution is applied.
A calculation of PDFs from first principles is a valuable addition to the global fitting analyses and of crucial importance for the deeper understanding of the inner structure of hadrons. Apart from providing a fundamental framework for the study of these quantities, it can also serve as input for experimental analyses in collision experiments. The nonperturbative nature of PDFs makes lattice QCD an ideal ab initio formulation to determine them, utilizing large-scale simulations. A novel method to extract parton distribution functions from lattice QCD was proposed five years ago by Ji [13]. It is based on considering matrix elements probing purely spatial correlations, making them accessible in Euclidean lattice QCD.
The first studies on quasi-PDFs within lattice QCD [14][15][16] appeared soon after their proposal by Ji. These papers and the follow-up ones [17,18] focused mainly on the feasibility of the approach where only the bare nucleon matrix elements of boosted nucleons are calculated for nonsinglet operators. The first computations that utilized a proper procedure for renormalization and matching to MS light-cone PDFs and using simulations with physical pion masses were published in Ref. [19] for the unpolarized isovector PDFs, uðxÞ − dðxÞ, and helicity PDF, ΔuðxÞ − ΔdðxÞ and in Ref. [20] for the isovector transversity PDF, h u 1 ðxÞ − h d 1 ðxÞ. These works were published in shorter letters, where many of the technical and methodological details of the calculations needed to be left out.
It is the purpose of this paper to fill this gap and provide the above-mentioned details in a comprehensive way, give the present understanding and control of systematics effects appearing in the computations and to present the status of the calculations of PDFs for our twisted mass setup. In particular, we extend the work of Refs. [19,20] and give a systematic and in-depth account of the salient features entering in the evaluation of quasi-PDFs and their relation to the light-cone PDFs. We explain in detail the computation of the matrix elements including the study of the various sources of systematic uncertainties, such as excited-states contamination. An improved renormalization procedure is presented using three ensembles for the nonperturbative extraction of the renormalization functions, allowing to take the chiral limit, as well as an investigation of other sources of systematic uncertainties related to the renormalization, such as the finite volume and scale dependence. In addition, different prescriptions for applying the Fourier transform and the matching prescription are employed and critically compared.
On the physics side, in order to compare with existing phenomenological estimates, we convert our results in the RI 0 scheme into the MS scheme following the procedure given in Ref. [21]. We then apply appropriate matching to relate quasi-PDFs in the MS scheme to light-cone PDFs extracted from global fits. With the detailed analysis of systematic effects, as described in this work, having a physical value of the pion mass and the improved renormalization and matching prescriptions, we are finally in a position to indeed make contact with the phenomenological analyses of PDFs and we present final results of our lattice PDF calculations in Sec. VI below. In addition to our work, results on the quasidistribution approach were reported in Refs. [22][23][24][25][26][27][28][29][30][31][32] that include, besides the nucleon PDFs, exploratory studies of pion distribution amplitudes and PDFs as well as of gluon PDFs. For complete references and an overview of the theoretical and numerical developments, see Ref. [33].
On more general grounds, it is very important to realize that quasi-PDFs and light-cone PDFs have been shown to share the same infrared physics [34][35][36][37], which is the fundamental observation that allows one to relate both quantities using perturbation theory, provided that the hadron is moving with a large, although necessarily finite, momentum in a chosen spatial direction. It has also been proven that quasi-PDFs can be extracted from lattice QCD in Euclidean spacetime [36] and that they do not suffer from power-divergent mixings with lower-dimensional operators [38][39][40]. A factorization formula makes it possible to extract the PDFs from the quasi-PDFs, an operation called matching [19,20,29,31,34,35,37,[41][42][43][44][45]. In general, this procedure is based on a newly developed largemomentum effective theory [46], and it is renormalizable to all orders in perturbation theory [47][48][49][50]. Other approaches for a direct computation of the x dependence of PDFs include the hadronic tensor [51][52][53], fictitious heavy quark [54], auxiliary light quark [55], good lattice cross sections [35,37] (closely related to the auxiliary light quark method), "OPE without OPE" [56] and pseudo-PDFs [57][58][59][60], where the latter can be seen as a generalization of PDFs off the light cone. These alternative approaches have been explored in lattice QCD, and recent results can be found in Refs. [40,56,[61][62][63][64][65][66][67][68]. The new formulation and its successful implementation within lattice QCD has also led to a wider interest in phenomenological studies using models and toy theories of QCD [57,58,[69][70][71][72][73][74][75][76]. A detailed overview of the current status of lattice QCD calculations of PDFs and other partonic distributions can be found in the recent reviews of Refs. [33,77].
The remainder of the paper is organized as follows. In Sec. II, we provide the general theoretical aspects, lattice QCD action and parameters. We discuss the computation of the bare matrix elements, along with the lattice techniques necessary to attain high statistical accuracy. In Sec. III, we present our results on bare matrix elements, together with various analysis methods that are required to control excited-states contamination. Section IV describes in detail the nonperturbative renormalization program, which investigates pion mass dependence, volume effects and renormalization scale dependence. The matching procedure is presented in Sec. V and we demonstrate that a modified MS scheme is preferable. Different prescriptions are presented and we compare the effect on the final PDFs. Section V also includes an investigation of systematic uncertainties resulting from the Fourier transform. In Sec. VI, we present our final results with a discussion on the systematic uncertainties, while in Sec. VII we conclude.

A. From PDF to quasi-PDF
The original definition of the quark momentum distribution within a hadron can be derived from the operator product expansion (OPE) of hadronic deep-inelastic scattering and is given by where x is the momentum fraction of the quark, jNi is the hadron state at rest (assumed to be the nucleon), the momentum P þ ¼ ðP 0 þ P 3 Þ= ffiffi ffi , Wðξ − ; 0Þ is the Wilson line connecting ξ − and 0, and Γ is a Dirac matrix whose structure defines the momentum distributions of quarks having spin parallel or perpendicular to the nucleon momentum. The factorization scale is implicit. Equation (1) is light-cone dominated, as it receives contributions only in the region where ξ 2 ¼ t 2 − ⃗ r 2 ¼ 0. Consequently, a direct implementation of this equation is impossible on the lattice, where only a single point can satisfy the Euclidean light-cone condition ξ 2 ¼ t 2 þ ⃗ r 2 ¼ 0. The alternative approach proposed by Ji overcomes this difficulty by reconstructing light-cone distributions from correlation functions where the bilinear fermionic fields, ψ andψ, are separated by a purely spatial distance z and the nucleon is boosted with a finite momentum. Quasi-PDFs are defined as qðx;P 3 Þ ¼ Z þz max −z max dz 4π e −ixP 3 z hNjψð0;zÞΓWðz; 0Þψð0; 0ÞjNi; where jNi represents a nucleon, which is boosted in the z direction with momentum P ¼ ðP 0 ; 0; 0; P 3 Þ. On the lattice, quasi-PDFs are thus computed using matrix elements of nonlocal operators containing a straight Wilson line of finite length z, that varies from 0 to some maximum value, z max . This aspect will be further discussed in the following sections. The standard light-cone PDFs can be recovered from quasi-PDFs using the large momentum effective theory [46], developed specifically for these operators. Eventually, quasi-PDFs are matched to the physical PDFs through the factorization where qðx; μÞ is the light-cone PDF at the scale μ and we follow the usual choice of common factorization and renormalization scales, μ F ¼ μ R ≡ μ. C is the matching kernel, which can be computed perturbatively and has so far been evaluated to the one-loop level. For sufficiently large momenta, ideally much larger than the nucleon mass, m N , and Λ QCD , higher-order corrections in α s are very small and the extraction of quark distributions is fully robust. Presently, such a situation cannot be accomplished and it is part of this paper to investigate whether practically reachable momenta are sufficient.

B. Lattice gauge ensemble
The analysis is performed using a gauge ensemble of two dynamical degenerate light quarks (N f ¼ 2), generated by the ETM 1 Collaboration [78]. The light quark mass has been tuned to reproduce the physical value of the pion mass, henceforth referred to as the physical point. The lattice volume is 48 3 × 96, the lattice spacing a ¼ 0.0938ð2Þð3Þ fm [79], the spatial lattice extent L ≈ 4.5 fm and m π L ¼ 2. 98. Based on a model prediction of excitedstates effects in certain hadron structure quantities [80], the dependence on m π L is mild (see, e.g., Fig. 3 of Ref. [81]). The complete list of parameters of this ensemble is reported in Table I. The gauge configurations were generated with the Iwasaki-improved gauge action [82]. In the fermionic sector, the twisted mass fermion action [83,84] with a clover term [85] was employed. The fermion action is given by where D W is the Wilson-Dirac operator, μ l is the bare twisted mass for the light quarks, τ 3 ¼ diagð1; −1Þ is the third Pauli matrix in flavor space and the last term includes the field-strength tensor F μν ½U weighted by c SW , known as the Sheikoleslami-Wohlert (clover) coefficient. In the gauge ensemble used in this work, the clover parameter is taken to be c SW ¼ 1.57551 [86]. In Eq. (4) χðxÞ ¼ ðu; dÞ T denotes the light quark doublet in the "twisted basis" at maximal twist. The fermion fields in the "physical basis," denoted by ψðxÞ, are recovered by the following chiral rotation: ψðxÞ ≡ e i α 2 γ 5 τ 3 χðxÞ;ψðxÞ ≡χðxÞe i α 2 γ 5 τ 3 ; Effective as of this year, the European Twisted Mass Collaboration has officially changed its name to the Extended Twisted Mass Collaboration, as now it also comprises members from non-European institutions. Along with the name change, there is also a new logo.
where α ¼ π=2 at maximal twist. In the next sections of this paper, the interpolating fields and the nucleon matrix elements have to be understood with quark fields in the physical basis, unless otherwise specified.
The introduction of the twisted mass term in the lattice action has a series of advantages in hadron structure calculations, such as excluding zero eigenvalues from the spectrum of the Wilson-Dirac operator allowing a speed-up in the numerical simulations. Moreover, it also simplifies renormalization properties of operators [84,[87][88][89] and at maximal twist it provides an automatic OðaÞ improvement. However, the isospin symmetry breaking of the twisted mass formulation could lead to instabilities when simulations are carried out at light quark masses close to the physical value. The isospin breaking, which manifests itself in the neutral pion being lighter than the charged one, is reduced with the addition of the clover term in the fermion action [78].

C. Nucleon bare matrix elements
In this work, we evaluate the unpolarized, helicity and transversity quasi-PDFs using three values for the nucleon momentum, that is, P 3 ¼ 6π=L, P 3 ¼ 8π=L and P 3 ¼ 10π=L, corresponding in physical units to 0.83, 1.11 and 1.38 GeV, respectively. The matrix element of interest is given by for a straight Wilson line, W, with varying length from z ¼ 0, corresponding to the standard ultralocal operators, up to half of the spatial extension, L=2. Γ is the Dirac structure leading to different types of PDFs, as discussed below. The matrix elements of Eq. (6) are extracted from a ratio of two-and three-point functions, averaged over the gauge field ensemble. The two-point and three-point functions are given by where N α ðxÞ is the proton interpolating field, t (t s ) is the time separation of the sink relative to the source in the twopoint (three-point) function, 2 τ is the insertion time of the O operator and P αβ ,P αβ are the parity projectors for the twoand three-point functions, respectively. For the two-point functions, we use the plus and minus parity projectors and average over the forward and backward correlators. The projectorP αβ depends on the operator under study. The operator Oðy; τ; zÞ has the following form: Oðy; τ; zÞ ¼ψðy þ z; τÞΓWðy þ z; yÞψðy; τÞ; ð9Þ where the Wilson line is aligned in the direction of the nucleon momentum and ψ is the fermion doublet of flavors ðu; dÞ. We compute the isovector combination u − d, inserting a τ 3 in flavor space. With this choice, the disconnected diagrams cancel and the total contribution is obtained from the connected diagram shown in Fig. 1.
As mentioned in Sec. II A, the choice for the Dirac structure in the operator of Eq. (1) gives access to different quark distribution functions. In fact, to extract a given PDF, different Γ matrices may be used, such as γ 3 (parallel to the Wilson line direction) or γ 0 (temporal direction) for the unpolarized PDF. However, some choices are preferable (e.g., γ 0 ) as they avoid finite mixing under renormalization, which was found to be allowed in lattice regularization [21]. In the absence of mixing, the operators renormalize multiplicatively, and one avoids systematic uncertainties related to the elimination of mixing. In this work, we compute matrix elements of the operators with (1) Γ ¼ γ 3 , γ 0 for the unpolarized distribution, qðxÞ ¼qðxÞ ↑ þq ↓ ðxÞ, for the helicity distribution, ΔqðxÞ ¼q ↑ ðxÞ −q ↓ ðxÞ, and (3) Γ ¼ σ 3j for the transversity distribution, δqðxÞ ¼q ⊥ ðxÞ þq ⊤ ðxÞ.   ), the pion mass (m π ) and the lattice spacing (a) have been determined in Ref. [79].
We use different symbols to emphasize that the three-point function is computed only for selected values of the source-sink separation, t s =a ¼ 8, 9, 10, 12 in this work, while the two-point function is evaluated for the full range of time separations to constrain more precisely the gap between the ground state and the first excited state in two-state fits; see Sec. III C. q ↑ (q ↓ ) and q ⊥ (q ⊤ ) indicate quarks with helicity aligned (antialigned) with that of a longitudinally and transversely polarized proton, respectively. The index j entering the matrix element of the transversity distribution is purely spatial and denotes the direction of the quark spin, which is orthogonal to the proton momentum. Each choice of an operator requires an appropriate parity projectorP αβ , that is, 1þγ 0 2 for the unpolarized, iγ 3 γ 5 1þγ 0 2 for the helicity, and iγ k 1þγ 0 2 for the transversity PDF. The desired matrix elements of Eq. (6) are finally extracted from fitting the ratio of three-point and two-point functions to isolate the ground state. One choice is a constant fit as a function of the time insertion of the operator over which the ratio is time independent (plateau region). Other choices are twostate fits that take into account the first excited-states contributions, as discussed in Sec. III with a comparison among them. Thus, the desired matrix element in the plateau fit is given by where K is a kinematic factor equal to ) for the vector current γ 3 , whereas K ¼ 1 for all other choices of the Dirac structure used. A gauge average (hÁ Á Ái G ) is performed on the two-and three-point functions prior to taking the ratio.

D. Lattice techniques
To evaluate the correlation functions of Eqs. (7) and (8), a state with the same quantum numbers as the nucleon is created at the source and annihilated at the sink. In our calculation, we employ point sources generated with the proton interpolating field N α ðxÞ ¼ ϵ abc u a α ðxÞðd b T ðxÞCγ 5 u c ðxÞÞ, where C ¼ γ 0 γ 2 is the charge-conjugation matrix. In general, the overlap of the states generated with such interpolating fields with the desired ground state is improved by employing smearing. Gaussian smearing [90,91] of quark fields at the source and the sink, used in combination with APE smearing on the gauge field, is a very effective approach to improve ground-state dominance. For nucleon states at rest, previous studies [92] performed on the gauge ensemble employed in this work, have extracted the optimal parameters for both Gaussian and APE smearing, tuned to approximately reproduce a nucleon state with an rms radius of about 0.5 fm and maximize the overlap with the ground state. These parameters are ðN G ; α G Þ ¼ ð50; 4Þ for Gaussian smearing and ðN APE ; α APE ¼ 0.5Þ for APE smearing.
However, in the computation of PDFs, we need to optimize the overlap to a boosted nucleon state while keeping the statistical noise minimal. We, thus, modify the standard Gaussian smearing by using momentum-smeared interpolating fields [93]. This has proven extremely advantageous when the nucleon is boosted with a large momentum, as is the case for quasi-PDFs. The momentum smearing modifies the standard Gaussian smearing function by including a complex phase factor that affects only the gauge links along the direction of the boost, namely where U j denotes a gauge link in the j direction. The value of the free momentum smearing parameter ξ depends, in general, on the parameters of the gauge ensemble and on the nucleon momentum. The tuning of ξ is necessary in order to improve the overlap of our interpolator with the boosted proton. We, thus, have optimized the parameter ξ for each value of the momentum employed by minimizing the statistical errors of the nucleon two-point functions.
In Fig. 2, we demonstrate the effect of the momentum smearing, by plotting the scaling of the error of the twopoint correlator and the effective energy. The results shown have been extracted using 100 measurements for the nucleon boost P 3 ¼ 8π=L. For comparison, we include also the results using the standard Gaussian smearing (ξ ¼ 0). As can be seen, the errors in the correlation functions reduce dramatically as the value of ξ increases and convergence is observed in the range ξ ∈ ½0.6-0.75. Thus, any value of ξ in this window of values leads to a similar signal-to-noise ratio. The tuning procedure for the other two momenta, P 3 ¼ 6π=L and P 3 ¼ 10π=L, leads to the same conclusions and we fix ξ ¼ 0.6 throughout this work. The three-point functions are computed using the sequential method [94], which has the advantage of summing automatically the sink spatial volume to produce the sequential propagator for all insertion points. Compared to the stochastic method [95], the sequential method does not introduce stochastic noise, but has the drawback that new inversions of the Dirac matrix have to be carried out for different source-sink separations and nucleon momenta. For details on the comparison between the stochastic method (with pure Gaussian smearing) and the sequential method (with and without momentum smearing), we refer to Refs. [15,18]. In addition, the sequential method is preferable when the momentum smearing technique on the quark fields is employed.
To increase the number of measurements of the threepoint functions, we use multiple source positions (N src ) for each configuration. To confirm that the data extracted from different source positions on the same configuration are statistically independent, we study the scaling of the statistical error with the increase of N src . In Fig. 3 (left), we show the error on the matrix elements for z ¼ 0, obtained from the analysis of a sample of gauge configurations for a nucleon boost of 6π=L. For comparison purposes, the absolute error is normalized to the one obtained using only one source position per configuration. As can be seen, the errors for the three choices of Γ structure decrease approximately as 1= ffiffiffiffiffiffiffiffi N src p , which is the scaling expected if the measurements are not correlated.
To further decrease the statistical uncertainties, we boost the nucleon along different spatial directions and orientations, that is, AEx, AEy and AEz, with the Wilson line taken always along the axis in which the spatial component of the momentum is nonzero. The correlation functions obtained from these N dir ¼ 6 possible directions lead to the same physical results due to the spatial rotational symmetry on the lattice, and therefore can be averaged within each configuration. The statistical error reduction is close to the ideal behavior 1= ffiffiffiffiffiffiffiffi N dir p , as shown in the right panel of Fig. 3.
Despite the use of the momentum and APE smearing, the exponential decrease of the signal-to-noise ratio persists as the nucleon momentum and source-sink separation increase. To increase statistics at a reduced cost, we employ, together with the momentum and APE smearing, the covariant approximation averaging (CAA) [96] that belongs to the class of truncated solver methods with a bias correction. For each configuration, N LP low-precision (LP) inversions of the Dirac matrix are carried out from a set of random source positions fX src g and the bias from the measurements is removed using a small number N HP of high-precision (HP) inversions. Denoting with C LP and C HP the correlation functions produced with low and high precision of the solver, respectively, the improved  correlation functions C imp for each configuration are defined by where, in the second sum, C LP and C HP are computed at the same source position, as otherwise the bias cannot be corrected. The error for a given observable scales with the ratio N HP =N LP as where r c is the correlation coefficient among nucleon correlators computed at high and low precision. A compromise is needed to have r c ≃ 1, while keeping the inversions as fast as possible and N LP ≫ N HP . Thus, a tuning of the precision of the solver has to be carried out. To invert the Dirac operator, we use the adaptive multigrid solver with twisted mass fermion support [97] and require the residual to be r HP ¼ 10 −10 for HP inversions. After testing different values of the residual for LP inversions, we find that the stopping criterion guarantees a correlation coefficient r c ≥ 0.999 with a considerable speed-up in the inversion time. Moreover, taking 15 HP inversions as the reference setup, a comparison of the HP and CAA estimates for the two-and threepoint correlators verified that the bias introduced from LP inversions is negligible compared to the gauge noise of our measurements when one HP inversion for each configuration is performed. Thus, to extract the nucleon matrix elements for quasi-PDFs at momenta 8π=L and 10π=L, we use the CAA setup with ðN HP ; N LP Þ ¼ ð1; 16Þ.
The quasi-PDFs are computed at different source-sink time separations t s in order to investigate excited-states effects. In particular, we use t s =a ¼ 8, 9, 10, 12 for the unpolarized and t s =a ¼ 8, 10, 12 for the helicity and transversity cases. For a detailed discussion on the excited-states effects, we refer to Sec. III C. In the remaining part of this section and unless otherwise stated, we focus on the results extracted from t s ¼ 12a, which is the one where excited states are found to be suppressed for all three PDFs when the statistical precision is around 10%. The number of configurations and the total statistics for each operator and momentum are reported in Table II. In the number of total measurements, we include a factor of 6, coming from the average over correlators computed from the boost aligned along the AEx, AEy, AEz directions, and a factor of 16 (15) from the source positions for P 3 ¼ 6π=L (P 3 ¼ ½8; 10π=L). This translates into an additional factor of 96 for P 3 ¼ 6π=L and 90 for P 3 ¼ 8, 10π=L, where the CAA is employed. Moreover, we note that only for the transversity PDF at P 3 ¼ 6π=L, we have averaged over the matrix elements computed for the two possible choices of Dirac matrices (for example, σ 31 and σ 32 for a boost in the z direction). This contributes an additional factor of 2 to the total number of measurements at this momentum, as well as to the computational cost (due to the different parity projectors needed for σ 31 and σ 32 ).
Although large nucleon momenta are needed to approach light-cone PDFs, high values of the momentum on a Euclidean lattice may lead to substantial cutoff effects if the condition P 3 ≪ 1=a is not satisfied. One possible check of cutoff effects is via the computation of the dispersion relation. To this end, we compute the nucleon energies for the momenta f0; 2; 4; 6; 8; 10gπ=L using the momentum smearing method, and check whether the relativistic dispersion relation E 2 ¼ m 2 N c 4 þ P 2 3 c 2 is satisfied for our results. As can be seen in Fig. 4, no deviations from the continuum energy-momentum relation are observed for the values employed in this work (up to 1.38 GeV), which is well below the inverse lattice spacing (10π=L ≈ 0.65=a). Moreover, by performing a two-parameter fit to the lattice data, we obtain c 2 ¼ 1.00ð3Þ for the squared speed of light and a 2 m 2 N c 4 ¼ 0.207ð4Þ for the nucleon mass in lattice units, which is compatible with the value TABLE II. The statistics of our calculation at a source-sink separation of t s ¼ 12a, for each current insertion and each momentum. N conf is the number of gauge configurations, N HP (N LP ) is the number of high-(low-)precision measurements, and N meas is the total number of measurements. A factor of 6 is included in the total measurements due to the averaging of data with the Wilson line and momentum boost in the AEx, AEy, AEz directions. An additional factor of 2 compared to all other cases is included for the transversity at P 3 ¼ 6π L to take into account the averages over the two possible Dirac structures, as explained in the text. am N ¼ 0.4436ð11Þ extracted at zero momentum [79]. Whether this finding of only small cutoff effects in the dispersion relation also holds in the case of the PDFs we are interested in, can eventually only be answered when results at several lattice spacings are available.

III. TECHNIQUES FOR THE EVALUATION OF BARE QUASI-PDF NUCLEON MATRIX ELEMENTS IN LATTICE QCD
In this section, we discuss crucial aspects related to the lattice QCD computation of bare quasi-PDF nucleon matrix elements, such as dependence on the number of stout smearing iterations used in the operator, the choice of the Dirac structure and identification of excited-states contamination.

A. Stout smearing
We apply three-dimensional stout smearing to the gauge links of the Wilson line of the operator, following the prescription of Ref. [98]. This reduces the power divergence that is present in the matrix elements of fermion operators with Wilson lines (see, e.g., Refs. [99,100], as well as Ref. [33] for a review of recent investigations of the power divergence). The application of smearing was especially crucial in the first calculations of quasi-PDFs [14,[16][17][18] when the renormalization procedure was not yet developed. Even though the complete renormalization procedure was developed recently [101], a few iterations of stout smearing are still useful for noise reduction in the renormalized matrix elements. In Fig. 5, we show examples of the effect of stout smearing for the case of the bare matrix elements of the unpolarized (insertion γ 0 ), helicity and transversity quasi-PDFs for a nucleon momentum of P 3 ¼ 8π=L without stout smearing, using 0, 5, 10 and 15 stout steps. As expected, the stout smearing modifies the matrix elements, increasing the values of the real and imaginary parts at each z. We find convergence of the matrix elements after a few iterations of stout smearing and, thus, in what follows we discuss in detail results obtained with 5 stout steps. The renormalized matrix elements are expected to no longer show any dependence on the smearing. We discuss the details of the renormalization in Sec. IV.

B. Choice of the Dirac structure
Although the natural choice to extract the unpolarized PDF would be γ þ ¼ ðγ 0 þ γ 3 Þ= ffiffi ffi 2 p , a recent study [21] has revealed that the matrix element with γ 3 exhibits mixing with the twist-3 scalar operator (later confirmed by symmetry properties [23]) and the computation of a mixing renormalization matrix is required [101]. However, the twisted mass formulation has the advantage that the mixing is between the vector and pseudoscalar operators. The latter vanishes in the continuum limit and only contributes as a discretization effect that increases the gauge noise. Consequently, it may be neglected in the first approximation, since eventually one is interested in the results for a → 0. In this work, we compute both matrix elements, but use the data for γ 0 to extract our final results for the unpolarized quasi-PDF. Even though the mixing is a purely lattice artifact for twisted mass fermions, we expect increased noise contamination for the case of γ 3 , which is clearly visible in the lattice data, as shown in Fig. 6. This behavior is momentum independent and we compare the matrix elements for the two γ structures at a momentum of 8π=L. The statistical errors for γ 3 are twice as large compared to the ones for γ 0 with the same number of measurements. We, thus, conclude that the insertions γ 3 or γ þ are not optimal for extracting the unpolarized PDFs within a lattice QCD calculation.
We compute the matrix elements for the three values of the nucleon momentum given in Table II with the associated number of measurements. The momentum dependence of the resulting matrix elements is shown in Fig. 7 for the three operators considered in this work. We find that with increasing momentum, the matrix elements decay to zero faster and, for the highest momentum employed, both the real and imaginary parts are compatible with zero for z ≥ 11a ≃ 1.03 fm.

C. Excited states contamination
Identification of the nucleon state is crucial in order to extract the correct nucleon matrix elements from lattice QCD measurements. This requires a careful analysis of excited states. An additional challenge is the need to boost the nucleon with a relatively large momentum, something that it is not needed in e.g., studies of nucleon charges and to the lattice data (red squares), we find a value of the squared speed of light equal to c 2 ¼ 1.00ð3Þ, and the nucleon mass in lattice units a 2 m 2 N c 4 ¼ 0.207ð4Þ, which is also in agreement with the estimate given in Ref. [79].
form factors. Requiring a large nucleon momentum in combination with using simulations at the physical point leads to a more severe contamination due to the excited states, since the spectrum is denser. Therefore, a thorough assessment of the excited states is even more essential for the reliable extraction of PDFs. We extend the study of excited states first presented in Ref. [102] in order to eliminate, as much as possible, one of the major systematic uncertainties in the evaluation of nucleon matrix elements from which the PDFs are extracted within lattice QCD. In what follows, we review the analysis methods that we employ to isolate the ground state. As will be discussed in this subsection, the excited-states contributions are milder for the matrix elements of the unpolarized operator as compared to the matrix elements of the helicity and transversity operators. To extract the ground-state contribution in the correlation functions, we employ three analysis methods, briefly described below.
(1) Plateau method (single-state fit): In this method, one looks for a region where the ratio of Eq. (10) becomes independent of the insertion time, τ, which indicates a (possibly partial) suppression of excited states. The ratio in this region is fitted to a constant value, yielding the matrix element of the ground state. Indeed, inserting into the two-and three-point functions two sets of complete eigenstates of the QCD Hamiltonian (jnðP 3 Þi and jn 0 ðP 3 Þi, where we indicated that the states are momentum dependent, but we keep this dependence implicit below to simplify the notation) with quantum numbers of the nucleon, the ratio of Eq. (10) can be written as 3 where jNi is the nucleon state, E n;n 0 are energies of the eigenstates jni, jn 0 i (where j0i is the ground state, j1i is the first excited state, etc.) and the time slice of the source is set to zero. Isolating in the ratio the contribution of the ground state and expanding the sum up to the first excited state, Eq. (15) reads where we introduced the constant f 10 ¼ hNj1i=hNj0i and As can be seen, the excited-states contributions fall off exponentially and for ΔEt s ≫ 0, ΔEτ ≫ 0 and ΔEðt s − τÞ ≫ 0, the first time-independent term dominates, yielding the matrix element of the nucleon state, h0jOj0i. Thus, fitting the ratio within the plateau region to a constant value yields the desired nucleon matrix element. Since statistical errors grow exponentially with t s , the challenge is to identify the smallest value of t s that suppresses the contributions of excited states in the ratio to a level negligible as compared to the statistical accuracy.
(2) Summation method: This method was introduced in Ref. [103] and entails the sum of the ratio of Eq. (16) over the insertion time τ, excluding the time slices of the source and the sink. The sum is a geometric series in the terms involving the excited-states contributions and yields the expression where C is a constant and the desired matrix element M ¼ h0jOj0i can be extracted from a linear twoparameter fit. As a consequence, this procedure yields, in general, results with larger uncertainties as compared to the plateau method, but has the advantage of suppressing the excited-states contamination by a faster-decaying factor e −ΔEt s , as compared to the leading exponential factors in Eq. (16). (3) Two-state fits: In this method, one retains the first excited-states contributions in the two-point and three-point correlation functions. We use two types of fits: (a) a fit to the two-point correlator followed by a fit to the three-point function, and (b) a simultaneous fit to both the two-and three-point correlators. We refer to the type (a) fit as a sequential fit and parametrize the nucleon two-point function with momentum P 3 as with the fit parameters being the ground-state amplitude, jA 0 j 2 ¼ jhNj0ij 2 , the ground-state energy E 0 , jf 10 j 2 and ΔE. The three-point correlator can be written as The amplitudes jA 0 j 2 and jf 10 j 2 and energies E 0 and ΔE are determined from first fitting the two-point function in Eq. (18) and used as inputs into the three-point function, which is fitted separately for the real and imaginary parts. The fit parameters are Reh0jOj0i, h0jOj1i, Reh1jOj1i and Imh0jOj0i, Imf 10 , Imf 01 , Imh1jOj1i, respectively. Through this fitting procedure that entails four fit parameters for the two-point function fit and an additional four parameters for the three-point function, we extract the desired matrix element, h0jOj0i.
The call the type (b) fit simultaneous and it is a combined fit to Eqs. (18) and (19), using all eight parameters. We expect that both fit types will lead to very similar results. Proper analysis requires the evaluation of two-and threepoint correlation functions on exactly the same gauge field configurations and for the same set of source positions. Only in such a case can the correlations among the fit parameters be probed in a fully consistent manner. In practice, this is a severe problem for the observables we are interested in, which is due to the exponential increase of the noise with increasing source-sink time separation. When the time separation grows by one lattice spacing, the necessary statistics to suppress noise to the same level as before increases by a factor of 2-3 in our data. Thus, small source-sink time separations yield a precise signal already for Oð10 3 Þ measurements, while correlators at t s ¼ 12a require Oð10 5 Þ measurements to reach a similar precision. For a given amount of computational resources, one, therefore, has to make a choice between using the same statistics for all t s values and thus obtaining very precise data at small values of t s and considerably less precise ones for larger values of t s , or using different statistics and achieving similar precision for all values of t s . The severe drawback of the former choice is that the twostate fits are dominated by the precise results at small t s that can lead to biased results for the ground-state matrix element. This bias becomes more severe as the boost increases due to the denser spectrum prohibiting the robust identification of the effects of excited states.

Two-point correlator
Before we compare different analysis methods for the extraction of quasi-PDF matrix elements, we discuss the two-state fit of the two-point correlator at our largest momentum, using Eq. (18). We obtain the boosted nucleon ground-state energy of aE 2-state 0 ¼ 0.773ð18Þ, which is in line with a single-state fit, leading to a value aE 1-state 0 ¼ 0.788ð9Þ. These values are illustrated in the left panel of Fig. 8. As we showed in Fig. 4, the obtained ground-state energy satisfies the continuum dispersion relation. The energy difference between the ground state and the first excited state is aΔE ¼ 0.36ð7Þ, which in physical units is 0.76 (15) GeV. This can be compared with the expectation from the approximation of noninteracting stable hadrons in a box, which leads, at our volume and the physical pion mass (m π L ≈ 3), to the following splittings (for Nππ, Nπ, Nππππ, Nπππ) with respect to the ground state (N): ≈0.27, 0.36, 0.54, and 0.63 GeV, respectively. The above values hold for a nucleon at rest, whereas the spectrum becomes denser with increasing nucleon momentum. In the interacting case, the spectrum obviously changes, but its general features are unchanged; for a comprehensive discussion on the spectrum of excitations, we refer to the recent review of Ref. [81].
The energy extracted for the first excited state does not match the energy expected for the two-and three-particle states lying below. Furthermore, it is known that a two-state fit will model, in one exponential, all the states above the ground state that have an overlap with the interpolating field. Thus, one expects a bias in the determination of the first excited state. Consequently, the conclusions from twostate fits need to be checked against the plateau values or three-state fits, if the accuracy of the data allows for this. If consistency is found, this indicates ground-state dominance up to the achieved statistical precision.

Matrix elements of the unpolarized PDF
We first present the analysis of excited states in the nucleon matrix elements for the unpolarized operator using the γ 0 structure and the largest momentum, P 3 ¼ 10π=L, where excited-states effects are expected to be most severe. We use four values of the source-sink time separation, namely t s =a ¼ 8, 9, 10, 12 or in physical units t s ≃ 0.75, 0.84, 0.94, 1.13 fm. We opt to increase the number of measurements as we increase t s , to have statistical errors that are approximately the same for consequent values of t s . This enables us to perform a reliable analysis at each value of t s . We use the same CAA setup for all separations, as explained above. In Table III, we collect the statistics for each value of t s .
The plateau method is employed to analyze the data from each t s value and the results are shown in Fig. 9. For t s ¼ 8a, the real part of the matrix element shifts towards smaller values at each z=a compared to larger t s , while the effect in the imaginary part is less prominent. Within statistical uncertainties, a convergence between the results at t s ¼ 10a and t s ¼ 12a is observed in both the real and imaginary parts. Comparing the data for the four t s values, we observe signs of a nonmonotonic behavior that affects the real and imaginary parts differently, depending on the value of z. This can introduce a complicated effect in the determination of the PDF. Ideally, one must compute the matrix elements for several large enough values of t s with equally small statistical errors and demonstrate convergence to one value. However, the exponential increase of the noise-to-signal ratio seen in Fig. 2 and the need for new sets of sequential inversions for each t s , require computational resources beyond what is currently available, placing limitations on the maximum value of t s . Nevertheless, if consistency can be found between the converged plateau values of the single-state fits and other methods, groundstate dominance can be reliably established.
We apply the summation method for two sets of t s values, namely (a) t s ¼ 8a, 9a, 10a, 12a and (b) t s ¼ 9a, 10a, 12a, with the results shown in Fig. 10. The first set yields results that, although consistent with those from the second fit, are systematically higher. Furthermore, for large values of z=a, the slope obtained using the (b) set is significantly larger than that from the (a) set and χ 2 =d:o:f: > 2 for z=a ≈ 10-15, indicating that the (a) fits do not provide a good description of the data. This is a consequence of the ratio at t s ¼ 8a being considerably below the ones at other source-sink separations. Figure 11 shows examples of summation fits for our data. In the right panel, only the (b) fit yields an acceptable χ 2 =d:o:f: of around 0.16, while the full fit to all t s values has χ 2 =d:o:f: ≈ 2.4. Thus, the fitting ansatz of Eq. (17) does not provide a correct description of the data and the large-z values in Fig. 10 for t s ¼ ½8a-12a are not reliable. The effect is visible also for small z=a (see the left panel of Fig. 11 for z=a ¼ 0 fits) and may result in overestimating the real part of the matrix elements. At z=a ¼ 0, the extracted value of the matrix element is 1.21 (16) for the (a) set and 1.02 (24) for the (b) set, after applying in both cases the renormalization factor Z V ¼ 0.7565 [104]. We note that the matrix element should be equal to 1 upon renormalization, which is satisfied only by the set (b), while the value from the summation method that includes the smallest t s in the fit is considerably larger. All of the above points lead to the conclusion that using too small sourcesink separations gives incorrect results. We will thus omit t s ¼ 8a in our summation method analysis, and take fits (b) as our final estimates from the summation method. All of these fits, both in the real and imaginary parts, have χ 2 =d:o:f: ≲ 1, indicating that the exponential contributions TABLE III. Numbers of measurements of the matrix element for the unpolarized operator (γ 0 ) at each source-sink separation and P 3 ¼ 10π=L. from higher excited states in Eq. (17) are sufficiently suppressed, yielding a good estimate of the ground-state matrix element M. However, the precision of the summation method estimates is much worse than the one from the plateau or two-state fits (see below).
We present now the two-state fits and compare results from the sequential and simultaneous two-state fits. In addition, we check the robustness of these fits by using three or four values of t s . The fits using only two t s values lead to considerably larger errors (with compatible central values) and hence, we do not show them. The different choices of fits with three t s values lead to very similar results in terms of the central values and their errors. Thus, in Fig. 12, we present results from one choice of three source-sink separations, namely t s ¼ 8a, 9a, 10a, and from the fit to all t s ¼ 8a, 9a, 10a, 12a. We conclude that the simultaneous and sequential fits lead to statistically identical results for both sets of t s values. Including the results for the largest t s yields consistent fits, but with errors that are slightly smaller (up to 10-15%) for most z values. This is visible particularly for the simultaneous fits at all z and for sequential fits at small z values in the real part. Note that the central values are essentially the same, suggesting that excited states are sufficiently suppressed.
In Fig. 13, we collect the results of the analyses using the three aforementioned procedures, for all z=a values, and in Fig. 14 we present part of the same data for two selected values of the Wilson line length, z=a ¼ 5 and z=a ¼ 10, for better visibility. Excited states effects are clearly visible for t s ¼ 8a, especially in the real part when using the plateau method. For t s ¼ 9a and 10a, we observe a small tension in the imaginary part: the plateau values are consistently lower for t s ¼ 9a and consistently higher for t s ¼ 10a, with respect to the two-state fits. We note that, in particular, there  is tension between the two-state fit using t s ¼ 8a-10a and the plateau values for t s ¼ 10a, at z=a ≈ 10, which fails to satisfy our criterion of consistency between two-state fits and plateau fits. Hence, data at source-sink separations below 12a are likely still contaminated by excited states, although statistical fluctuation as the source of the trend cannot be excluded. We find that the results extracted from the plateau method using t s ¼ 10a and t s ¼ 12a are in agreement with each other and the ones from t s ¼ 12a are compatible with those extracted using the all-t s two-state fits for all values of z=a and for both the real and imaginary parts. The errors in the summation method are too large to draw any meaningful conclusions. Given that this investigation is carried out for the largest value of the momentum, we can take as our final value for the nucleon matrix elements of the unpolarized operator the data at t s ¼ 12a for all the momentum values. We note that increasing the nucleon boost would need a similarly thorough reanalysis of the effects of the excited states. Likewise, increased statistical precision could reveal excited-states contamination even at this nucleon momentum. Thus, we emphasize that the attained conclusion about ground-state dominance is valid only within the present statistical uncertainties of Oð10Þ%.

Matrix elements of the helicity and transversity PDFs
In this subsection, we discuss the excited-states effects for the two other Dirac structures used in our work-axial and tensor-associated with the helicity and transversity PDFs, respectively. We perform a similar analysis as for the Real (left) and imaginary (right) parts of the matrix element for the unpolarized PDF from the plateau method (points labeled with appropriate t s ), the summation method (using t s ≥ 9a) and sequential two-state fits (to t s ¼ 8a, 9a, 10a and to all t s ). The nucleon is boosted with 10π=L ≃ 1.38 GeV.
unpolarized case at the largest momentum, P 3 ¼ 10π=L, and again use four values of the source-sink separation, namely t s =a ¼ 8, 9, 10, 12, or in physical units t s ≃ f0.75; 0.84; 0.94; 1.13g fm. The numbers of measurements are listed in Table IV. The methodology of our investigation is the same as for the unpolarized case, i.e., for each t s value, we perform single-state fits within a plateau region, and we combine data at different t s using the summation method approach, as well as two-state sequential and simultaneous fits. In Figs. 15 and 16, we plot the full z dependence of bare matrix elements for helicity and transversity PDFs, respectively. Figure 17 displays a zoom-in for two selected values of z=a, i.e., z=a ¼ 5 and z=a ¼ 10. In this case, the fits using the summation method have good quality ( χ 2 =d:o:f: ≲ 1) even when including t s ¼ 8a and hence, we use all source-sink separations for this approach. For the real part, the two-state fits and the summation method yield results compatible with plateau fits for t s ¼ 12a. The only observed tensions in the real part are rather small, possibly statistical fluctuations, and are visible for the plateau fits using t s =a ¼ 8, 9 versus the summation method, for intermediate and large z=a in the helicity case. However, for the imaginary part, we consistently observe a rather strong dependence of the plateau values extracted when using t s ¼ 12a as compared to those from t s =a ¼ 8, 9, 10, with 2σ to 3σ tensions. The latter are also incompatible with the results extracted using two-state fits to all t s and the summation method, both of which are consistent with the plateau results for t s ¼ 12a. Moreover, the two-state fits using t s =a ¼ 8, 9, 10 are incompatible with plateau fits for t s ¼ 10a (see right panel of Fig. 15), which again violates our criterion for consistency, necessitating the use of t s ¼ 12a. This behavior reinforces the conclusion reached in the case of the bare matrix elements for the unpolarized quasi-PDF, namely that the tensions observed between results from source-sink separations below 1 fm are indeed manifestations of excited-states contamination and groundstate dominance is achieved, to around 10% statistical accuracy, only at t s ¼ 12a. Hence, to extract PDFs, we take the plateau results for t s ¼ 12a as our preferred ones, since they are more precisely determined as compared to both the results extracted using the two-state and the summation approaches, but show full consistency with them. The more severe excited-states effects observed in the cases of helicity and transversity are in accordance with observations connected to the extraction of the nucleon axial and tensor charges, where excited-states contamination is more significant as compared to the case of the vector operator. Thus, studies that aim at a higher precision in the determination of quasi-PDF at increasing values of the nucleon boost would need large statistics to account for the increased errors resulting from having a nucleon state with large boost, but also for investigating excited states.

Remarks and computational costs
This concludes our investigation of excited-states effects. We emphasize that the spectrum of nucleon excitations is TABLE IV. Statistics used in the study of excited states for the polarized cases (γ 5 γ 3 : helicity; σ 3j : transversity) at P 3 ¼ 10π=L. rich, particularly for a boosted nucleon with quarks with physical masses. Thus, one method of extracting bare matrix elements can be misleading, as the fitted energy gap between the ground state and the explicitly modeled first excited state suggests that there are tens of excited states. In such a situation, excited states need to be suppressed by going to large enough source-sink separations and robust statements can be made only when they are based on compatible results between different methods-in our case the plateau fits at the largest source-sink separation of around 1.1 fm, the two-state fits and the summation method. We note that the largest t s ¼ 12a is crucially needed to establish this compatibility. Having only t s ¼ 8a, 9a, 10a in the two-state fits and comparing to plateau fits at t s ¼ 10a, one would conclude that there is significant tension between the former and the latter. This is best illustrated in the imaginary part of the bare matrix elements for the helicity quasi-PDF; see the right panels of Fig. 15.
Finally, we would like to make some remarks on the computational resources that go well beyond what one needs for typical hadron structure calculations. The quasi-PDFs will reproduce the light-cone PDFs in the limit of large boosts. How large the boost should be, needs further investigation. Within the lattice QCD formulation, as already explained, one cannot increase the momentum of the nucleon to arbitrarily large values. The reasons are as follows: (1) As the momentum increases, the signal-to-noise ratio rapidly deteriorates, despite the utilization of special methods, such as smearing techniques that reduce the noise. We find that to increase the momentum from 6π=L ≈ 0.83 to 8π=L ≈ 1.11 GeV, we need to increase the statistics by a factor of around 3.7 (for unpolarized PDFs) or around 6 (for helicity and transversity PDFs) and to increase to 10π=L ≈ 1.38 GeV by an additional factor of 3.7 or 6, respectively, in order to keep the statistical error approximately the same at t s ¼ 12a. (2) A careful study of excited states must be carried out and becomes increasingly more difficult as the momentum increases, because of the denser spectrum. As the source-sink separation increases, the statistical errors grow exponentially, e.g., for t s ¼ 12a, we needed a factor of about 10-12 (similar for all Dirac structures) more statistics for the same error as for t s ¼ 10a (at the largest boost). It is imperative to have large enough source-sink separations for at least three values of t s with comparable errors to perform a reliable analysis of excited-states effects and extract the ground-state matrix element. Therefore, in order to reach a nucleon momentum of e.g., 2 GeV at t s ¼ 12a, for unpolarized PDFs, we estimate that one would need Oð100Þ million core hours (Mch) on a typical supercomputer, as compared to Oð5Þ Mch at P 3 ≈ 1.38 GeV studied in this work and for a momentum of 3 GeV, we would need Oð10 000Þ Mch. For the polarized PDFs, the projected estimate for P 3 ≈ 3 GeV is Oð10 6 Þ Mch. In addition, one may have to increase the source-sink separation to account for the increased excited-states contamination. If t s ¼ 14a is used, then the computational resources for a boost of 3 GeV would be Oð10 5 Þ=Oð10 7 Þ million core hours for the unpolarized/ polarized case, which is prohibitively expensive, given currently available computers. These requirements may be alleviated with the possible development of better algorithms, enhancing the signal-to-noise ratio at large nucleon boosts and large source-sink separations. Another way to milden the need for huge computational resources is the derivation of two-loop matching and conversion factors, which is foreseen in the near future. In this way, quasi-PDFs may be robustly connected to light-cone PDFs already at lower momenta. We emphasize that the appropriate strategy is to not increase the nucleon boost uncontrollably, and to rely on precise data only at low source-sink separations. Failing to keep the statistical errors approximately constant as one increases t s may introduce uncontrolled systematic errors, since the fits will be determined mostly by the more accurate data at small t s . As we have shown, it is essential that all source-sink separations have approximately equal errors for a reliable extraction of the matrix elements and this is the criterion we adopt. Unlike other studies [26,28,31], we do not rely solely on two-state fits of data from source-sink separations with widely varied errors, which are thus dominated by the precise data at the small values of t s . Such an approach can be uncontrolled and lead to a systematic bias in the final results.

IV. RENORMALIZATION
Renormalization is needed in order to relate the bare lattice QCD matrix elements to physical results. This is achieved by removing ultraviolet divergences, as well as finite dependence on the lattice action. 4 In the case of quasi-PDFs, one needs to also eliminate additional divergences arising due to the utilization of operators with a finite Wilson line. Renormalization of Wilson loops was addressed a long time ago using dimensional regularization (DR) for smooth contours [99], as well as for contours containing singular points [100]. Based on arguments valid to all orders in perturbation theory, it was demonstrated that smooth Wilson loops in DR are finite functions of the renormalized coupling, while the presence of cusps and self-intersections introduces logarithmically divergent multiplicative renormalization factors, referred to as Z factors. More importantly, it was shown that other regularization schemes are expected to lead to further Z factors, which are power-law divergent with respect to the dimensionful ultraviolet cutoff. This also appears in the lattice formulation, where a divergence arises as a function of the lattice spacing, that increases exponentially with the length of the Wilson line as ∼e z=a . Such a divergence must be removed prior to the extrapolation to the continuum limit. We describe in this section our renormalization program that includes the removal of the exponential divergence. A recent work on the perturbative renormalization of Wilsonline fermion operators of the type given in Eq. (9) has identified the mixing pattern among nonlocal straight Wilson-line operators, and led to the development of an appropriate renormalization prescription for both the multiplicative renormalization and the mixing coefficient [21]. The nonperturbative renormalization program that we developed is a generalization of the RI 0 scheme [105], which is appropriate for operators that contain a Wilson line [101]. 5 The Z factors are extracted by imposing the following conditions: Tr½Vðz;p;m π ÞðV Born ðz;pÞÞ −1 Tr½ðSðp; m π ÞÞ −1 S Born ðpÞ We use the general notation Z O and consider O ¼ V 0 , A, T corresponding to the unpolarized, helicity and transversity operators, respectively. Z O and Z q are the renormalization functions of the operator and the quark field, respectively. Both Z O and Z q are scheme and scale dependent, and are expected to have some dependence on the pion mass. Also, Z O is a function of the length of the Wilson line, z. V is the amputated vertex function of the operator and S is the fermion propagator, while V Born and S Born are the corresponding tree-level values. Note that this condition is applied independently for each value of z. The RI 0 renormalization scale, μ 0 , is chosen to be democratic in the spatial directions, that is aμ 0 ¼ 2π L s ðn t þ π 2 ; n; n; nÞ, which minimizes the ratio P4 ≡ P i μ 4 i =ð P i μ 2 i Þ 2 and suppresses discretization effects [104,107].
For the calculation of renormalization functions, we employ the momentum source method [104,108] that has the advantage of yielding results of high statistical accuracy. This method requires new inversions for each momentum used, but significant reduction in the gauge noise is observed, which by far outweighs the additional computational cost. Data are produced using the three N f ¼ 2 ensembles given in Table V that have a different value of the pion mass. The twisted mass parameters of the light quarks in the sea and valence sectors have been set equal (isospin limit and unitary setup). Even though the ensemble simulated with the smallest value of the pion mass has a larger volume, the Z factors are obtained at the same values of ðaμ 0 Þ 2 and P4. In the current work, we improve our previous analysis on the Z factors of onederivative operators [101] by performing the following: (1) A chiral extrapolation using the three ensembles at different values of the pion mass. (2) A fit on the chiral data to eliminate residual dependence on the RI 0 scale. We use several values of the RI 0 scale that cover the range ðaμ 0 Þ 2 ∈ ½1-4. An extensive study on the choice of the renormalization scale and the corresponding systematic uncertainties can be found in Ref. [101]. 4 Elimination of residual dependence on the lattice formulation requires continuum extrapolation. 5 For alternative approaches, using, e.g., an auxiliary field method [106], see the recent review of Ref. [33].

A. Pion mass dependence
The pion mass dependence for the Z factors for operators with a finite Wilson line has never been explored, and it is expected that small values of z will have a weak dependence on m π , as observed in the case of local operators computed within the same setup. However, it is not known how the Z factors will behave when the length of the Wilson line is large. To obtain the Z factors in the chiral limit, we fit the data from the three ensembles at each value of z. This fit is applied to the real and imaginary parts independently. The data for Z RI 0 O are expected to have a quadratic dependence on the pion mass (equivalently linear with respect to the twisted mass parameter) and are fitted using The chirally extrapolated value is given by the fit parameter Z RI 0 O;0 ðz; μ 0 Þ. Since the fitted data are obtained on different ensembles, we use the super-jackknife method (see, e.g., Ref. [109]) to correctly calculate the statistical error. This method is applicable to both correlated and uncorrelated data. In Fig. 18, we show the pion mass dependence of the real and imaginary parts of the Z factors for all three operators. For a clear presentation, we focus on z=a ¼ 1, 5 and we plot against m 2 π for the scale ðaμ 0 Þ 2 ¼ 2.5. We find that the dependence is almost constant in m 2 π and the chiral fit yields a slope consistent with zero for both the real and imaginary parts for small values of z (see, e.g., z=a ¼ 1). As z increases, a nonzero slope is observed, with the dependence being linear, as expected. The dashed line corresponds to the chiral fit of Eq. (22), and the open symbols are the extrapolated values Z RI 0 O;0 ðz; μ 0 Þ.

B. Volume effects
Another source of systematic uncertainty entering the determination of matrix elements is due to the finite lattice extent. Finite-volume effects are expected to be suppressed as expð−m π LÞ and based on empirical studies, m π L ≥ 4 is considered sufficient in most applications. However, most lattice QCD studies deal with matrix elements of local operators, and as discussed in Ref. [110], the length of the Wilson line entering the operator may enhance finitevolume effects. To date, this systematic uncertainty has not been investigated due to the absence of lattice QCD  Hence, we conclude that volume effects in the Z factors are small and have little effect on the renormalized matrix elements. This partly results from the fact that the bare matrix elements go to zero in the large-z region. In determining the final values for the Z factors, we do not include the 64 3 × 128 ensemble, because the ratio P4 is large for most of the scales ðaμ 0 Þ 2 ∈ ½1-4 compared to the smaller volumes, leading to contamination from finite-a effects, as seen in Ref. [104]. However, for ðaμ 0 Þ 2 ¼ 2.5 which was used in the comparison, P4 is the same as the one obtained from the smaller volumes, which allows one to isolate the volume effects. Finite-a effects are in fact under investigation for this class of nonlocal operators in lattice perturbation theory [111].

C. Conversion to the standard and modified MS schemes
In order to compare renormalized lattice QCD matrix elements with phenomenological results extracted from global analyses, the Z factors must be converted to the same scheme and evolved to the same scale as those used in the phenomenological analyses. Traditionally, the chosen scheme is the MS scheme and the scaleμ is typically set to 2 GeV. The appropriate conversion factors for the nonlocal operators with a straight Wilson line are taken from Ref. [21], where a calculation was carried out to one-loop level in perturbation theory, using dimensional regularization. Technical complications related to the nonlocality of the operators under study make it very hard to extend such a calculation to higher loops, as done for local operators, which are usually known to three and four loops. As a consequence, it is expected that the Z factors will have a residual dependence on the initial RI 0 scale μ 0 . Thus, one typically computes the Z factors at several values of the RI 0 scale, as set by Eq. (20), and then uses an appropriate conversion to bring each Z For Z MS O ðz;μ; μ 0 Þ, we use the chirally extrapolated Z factors converted to MS at the scaleμ ¼ 2 GeV. For simplicity, in the notation we dropped the subscript "0" appearing in the fit of Eq. (22). The desired quantity is the fit parameter Z MS O;0 ðz;μÞ. Our renormalization program entails a new element, namely the use of a modified MS scheme (MMS). The development of such a scheme was motivated by the fact that the existing matching formulas do not satisfy particle number conservation (e.g. Ref. [45]). The matching using the MMS scheme was already presented in our recent work [19,20]. Here we complement the previous analyses by giving the appropriate conversion of the Z factors to the MMS scheme, instead of MS. We find that the resulting modification is numerically very small, but moves the final values of the PDFs towards the phenomenological ones; see Sec. VA for details of the MMS scheme and Sec. V B for numerical effects when applied to our data. In a nutshell, an additional conversion factor is needed to bring Z is different for each operator under study and its general form is given by where μ F is the factorization scale that is taken equal to the MS scale, that is, μ F ¼μ ¼ 2 GeV. The expression also contains the special functions Ci (cosine integral), Si (sine integral) and Ei (exponential integral), as well as the sign function (sgn).
The coefficients e ðiÞ O depend on the operator and their numerical values are given in Table VI. In Fig. 20, we show Z MMS O;0 ðz;μÞ after multiplying Z MS O;0 ðz;μÞ with the conversion factor given in Eq. (25). The data are shown against ðaμ 0 Þ 2 to demonstrate the dependence of the Z factors on the initial RI 0 scale. As done for the previous figures, we choose representative values of the length of the Wilson line, namely z=a ¼ 1 and z=a ¼ 5. We find that the imaginary part has a stronger dependence on the μ 0 value. Note, however, that the imaginary part is an order of magnitude smaller than the real part of the matrix element and thus this dependence is still suppressed in the total matrix element, especially in the region ðaμ 0 Þ 2 ∈ ½1-2, which is an indication of nonperturbative effects. Such behavior is also observed in local operators for ðaμ 0 Þ 2 < 1. This tendency seems to affect the region ðaμ 0 Þ 2 < 2 for nonlocal operators with z=a ≥ 5. Since we are using perturbative expressions for the conversion to the MMS scheme, it is important to choose a region where perturbation theory is valid, and thus, we choose the values obtained from ðaμ 0 Þ 2 ∈ ½2-4 for both the real and imaginary parts. There is a systematic uncertainty attached to the Z factor due to the choice of the fit range, and here we used the ranges ðaμ 0 Þ 2 ∈ ½1−3, [1][2][3][4], [2][3], [2][3][4] to estimate the systematic uncertainty. Even though the real and imaginary parts of the Z factors do have mild dependence on the range, the final values are consistent. Therefore, we do not give any systematic uncertainty from the ðaμ 0 Þ 2 → 0 extrapolation. In the same figure, we also show the improvement of the nonperturbative results when subtracting Oðg 2 a ∞ Þ effects computed perturbatively in Ref. [104] on the same ensembles. They are obtained by replacing Eq. (20) with Tr½Vðz; p; m π ÞðV Born ðz; pÞÞ −1 where the denominator has been modified by subtracting the Oðg 2 a ∞ Þ artifacts in Z q , denoted by A ∞ q ðμ 0 Þ. As expected, the differences between the subtracted results of Eq. (27) and the unsubtracted results of Eq. (20) are small, and mostly affect the real part of the Z factors. In addition, subtraction of the artifacts in Z q is not sufficient to eliminate the dependence on aμ 0 . For the latter to be achieved, subtraction of the artifacts in Vðz; p; m π Þ is crucial, as has been discussed in Ref. [104] for the local operators. Such a computation for the nonlocal operators is not only more technically complicated, but also requires the addition of stout smearing in the links of the operator resulting in lengthy expressions. This calculation is under way and will be presented in a separate publication [111].
The final values for the Z factors are given in the MMS scheme upon a linear fit in ðaμ 0 Þ 2 → 0. In the region ðaμ 0 Þ 2 ∈ ½2-4 [Eq. (24)]. They are shown in Fig. 21 for the vector operator. As can be seen, the statistical errors of the Z factors are much smaller as compared to those of the   are also zero but with large statistical uncertainty originating from the large values of the real part of the Z factors. Both the matrix elements and the renormalization functions of a given operator share similar properties with respect to z (symmetric real part and antisymmetric imaginary part), with the difference that the Z factor decreases when a smearing technique is applied to the Wilson line, while the matrix element increases. Of course, the dependence in the smearing is nonmonotonic due to the complex nature of the matrix elements and Z factors, but the stout smearing dependence is expected to cancel out in the renormalized matrix elements. This is demonstrated in Fig. 22, where we compare the renormalized matrix elements for the helicity operator extracted using 0, 5, 10 and 15 stout smearing steps, in the MS scheme at 2 GeV and for a momentum of 6π L . This confirms that the elimination of the power divergences is correctly realized via the renormalization program, yielding compatible results for the renormalized matrix elements for different stout iterations. We find that this holds also for the renormalized matrix elements of the unpolarized and transversity operators and therefore any stout step may be used without changing the final physical result. It is worth mentioning that the agreement is more striking upon the ðaμ 0 Þ 2 → 0 extrapolation of the Z factors, as the central values of the data from different stout steps are very close to each other. Nevertheless, a similar comparison using a specific value of ðaμ 0 Þ 2 shows that the agreement still holds within statistical uncertainties.

A. Derivation of the matching formulas
In this section, we discuss, in detail, the matching procedure that relates the quasi-PDFs, renormalized in some scheme, to light-cone PDFs in the same or another renormalization scheme, typically chosen to be the MS scheme. We derive new matching formulas that relate MSrenormalized quasi-PDFs to MS-renormalized light-cone PDFs and conserve the particle number. To satisfy this condition, we introduce a modification of the MS scheme, i.e., the MMS scheme, which was already partially discussed in the previous section, as it is a procedure applied also on the conversion factors.
Quasi-PDFs can be obtained as a Fourier transform (FT) of renormalized matrix elements, h Γ ðP 3 ; zÞ, To relate the quasi-PDFsqðx; P 3 Þ to light-cone PDFs, one relies on a perturbative matching procedure [34,[43][44][45][46]. To one-loop order, and in the Feynman gauge, one needs to compute self-energy corrections, which include the usual self-energy plus the virtual "sail" and "tadpole" diagrams, and the vertex corrections, with the corresponding real "sail" and "tadpole" diagrams (see Fig. 23). We use operators with four Dirac structures, namely γ 0 and γ 3 for the unpolarized distribution, γ 3 γ 5 for the helicity, and γ 3 γ j , j ¼ 1, 2 for the transversity case.
Since the extraction of the matching formula follows a similar process for all four operators, we use the γ 0 structure as an example and we only present the final results for the other cases. As already mentioned, we take the nucleon momentum in the third direction, P ¼ ðP 0 ; 0; 0; P 3 Þ. It is assumed that, before gluon emission, the quark momentum p is collinear to the nucleon momentum, i.e. p ¼ ðp 0 ; 0; 0; p 3 Þ. It also obeys the Dirac equation puðpÞ ¼ 0, and carries a fraction of the momentum y of the parent hadron. After gluon emission, the quark has momentum ξp 3 ¼ ξyP 3 . When the bare quark distributions are dressed by taking the p 3 → ∞ limit, we have the usual quark distributions in the infinite-momentum frame or, equivalently, on the light cone. Defining x ≡ ξy, the nonsinglet quark distribution at one loop is given by where Π γ 0 denotes the self-energy corrections, Γ γ 0 represents the vertex corrections, and Λ stands for the IR and UV regulators in any given scheme. Using DR to regulate both the IR and UV divergences, the resulting one-loop correction for 0 < ξ < 1 in the MS scheme is where the pole from the UV divergence has already been subtracted, andμ is the corresponding renormalization scale in the MS scheme. The pole from the soft IR, 1=ϵ IR , can be absorbed into the bare distribution, at the factorization scale μ F , but it is explicitly written in Eq. (31), as it must cancel a similar pole arising in the one-loop correction to the quasi-PDF. The fact that the poles in 1=ϵ IR are the same for the light-cone PDF and the quasi-PDF, is a crucial observation that allows one to match them using perturbation theory. There are remaining IR divergences, which are located at ξ ¼ 1 and have their origin in the emission of soft gluons. However, they cancel between the vertex and self-energy one-loop corrections, which are related by Π γ 0 ðΛÞ ¼ − R 1 0 dξΓ γ 0 ðξ; ΛÞ. To obtain the quasi-PDF, the quark is dressed with finite p 3 . To simplify notation, we write the one-loop correction for positive x only, that is BecauseΓ γ 0 ≠ 0 for ξ > 1, the lower limit of integration in Eq. (32) goes to zero, and the quasi-PDF has support for x > 1. Using DR to perform the integrals, the one-loop correction to the vertex is given bỹ Note that the vertex corrections have no poles when DR is used, but they are nonzero outside the physical region 0 < ξ < 1, a behavior that is different from the one-loop correction to the light-cone PDF. Thus, in addition to the IR divergences at ξ ¼ 1, there are UV divergences associated with the ξ → AE∞ limits when Eq. (33) is integrated. So far, these UV divergences have not been subtracted, and this is reflected by the Λ dependence in the functional form ofΠ γ 0 ðΛ; p 3 =μ F Þ. DR is again used to isolate the poles, in which case dξ → dξðξp 3 Þ d−1 ðμ 2 e γ E =4πÞ 1=2−d=2 . For the region ξ > 1, the self-energy is FIG. 23. One-loop diagrams entering the perturbative calculation for the matching between quasi-PDFs and light-cone PDFs.
where d ¼ 1-2ϵ, with ϵ > 0. Using the plus prescription for the collinear divergence at ξ ¼ 1, Eq. (34) can be written as Similar computations can be done for the other regions, and the renormalized one-loop self-energy in the MS scheme for the quasi-PDF is given bỹ with the corresponding renormalization function, Upon integration of Eq. (33) and employing the MS scheme, the Ward identity of the resulting local current is naturally respected.
For the computation of the x dependence of the distributions, however, the convolution involving the vertex correction prevents the aforementioned cancellation from occurring, and one needs to impose a prescription to ensure conservation of the quark number for the nonsinglet distributions. From the equations for the quark distributions, Eq. (30), and quasidistributions, Eq. (32), and including the negative x regions we have to one loop The dependence on p 3 ,μ, and μ F is implicit in C MS ðξÞ.ΓðξÞ is the bare function, hence the absence of an MS superscript. The poles of 1=ϵ IR , on the other hand, automatically cancel in C MS γ 0 , which is explicitly written as For the rest of this paper, we will employ the usual choice of renormalization and factorization scales, μ F ¼μ. Equation (38) has an imbalance when integrated, becauseΓ γ 0 outside the physical region picks up a logarithmic divergence in the momentum fraction as ξ → AE∞, while this divergence has already been removed from the self-energy part in Eq. (36). This implies an anomalous nonconservation of the quark number, even if the classical current is conserved. In Ref. [45], C MS γ 0 has also been computed and a plus prescription at infinity was used to cancel the UV divergences between the vertex and self-energy corrections. This explains the source of the difference between our result in Eq. (38) and that in Eq. (68) of Ref. [45]. Namely, in our case, the term proportional to the Dirac δ function depends on the ratio of the factorization and renormalization scales, while in the case of Ref. [45], it depends on the quark momentum p 3 . Both prescriptions, however, do not conserve quark number.
To treat the unbalanced divergence, we introduce a modified MS scheme (MMS), which has already been discussed in the previous section and used in our previous work [19,20]. In this scheme, an extra subtraction is made outside the physical region of the unintegrated vertex corrections, which, in practice, renormalizes the ξ dependence for ξ > 1 and ξ < 0 and removes the potential divergences, One can write Eq. (39) in z space, noticing that ξ outside the physical region is now renormalized, at the scaleμ. The inverse Fourier transform is then from ξ to zμ, and the extra subtraction in z space is written as For consistency, Eq. (40) has been also applied to the renormalization functions to bring them to the MMS scheme, as described in Sec. IV C. Thus, these renormalization functions are obtained as follows: the Z factors calculated in the RI 0 scheme are converted to the MS scheme according to the perturbative formulas of Ref. [21] and then multiplied by the factor given in Eq. (26) to convert them to the MMS scheme. An important selfconsistency check is that the expression of Eq. (40) must cancel the z → 0 divergence in lnðz 2 Þ present in the MS scheme [21,47]. Indeed, in the limit z → 0, one has i.e., the Z factor of the vertex corrections at z ¼ 0 is the same in our scheme and in the "ratio" scheme introduced in Ref. [45] and both cancel the divergence. The latter scheme was proposed as an alternative to our solution of the current conservation problem when using the pure MS expression of Eq. (68) of Ref. [45]. The "ratio" scheme is another modification of the MS scheme that also needs an additional conversion factor in order to bring the renormalization functions in the MS scheme into the "ratio" scheme.
The difference with respect to our solution is that the form of the matching kernel [or of Z ratio Γ γ 0 ðzμÞ ¼ R dξ 2π e iξzμZMMS Γ γ 0 ðξÞ after doing the Fourier transform to z space] implies that the ξ dependence of the matching equation in the physical region is also renormalized, while our approach does not modify this region. Thus, the "ratio" scheme can induce potentially large modifications to the matched PDF, as we show below numerically.
We applyZ MMS Γ γ 0 ðξÞ to Eq. (38) to obtain a matching kernel that keeps the norm of the nonsinglet distributions unchanged. We also include the results for the γ 3 and γ 3 γ 5 Dirac structures. Because quarks are taken to be massless, the one-loop corrections are the same for the γ 3 and γ 3 γ 5 cases. Compared to the γ 0 case, a shift of þ2ð1 − ξÞ inside the plus prescription in the physical region of Eq. (38) is needed, and also, the factor of 5=2 in the last line of Eq. (38) becomes 7=2. The matching equation is then written as with the matching kernels, for the different Dirac structures, given by where ι ¼ 0 for γ 0 , and ι ¼ 1 for γ 3 and γ 3 γ 5 . We note that an alternative procedure to get MS-renormalized light-cone PDFs is to directly match to them from RI-renormalized quasi-PDFs. The relevant formulas for this were derived in Ref. [44] for the operators with Dirac structures γ 3 and γ 3 γ 5 and in Ref. [29] for γ 0 . Such a procedure is equivalent, to one-loop order, to the one described above, but with different higher-order effects. This choice is investigated numerically below, by comparing the direct matching from a variant of the RI scheme to the MS scheme with a two-step procedure that first brings the renormalization functions to the MMS scheme and then performs the MMS-to-MS matching. For the transversity case, the computation (albeit simpler) is similar because the usual one-loop vertex correction is zero in the Feynman gauge. The quark self-energy is not zero, but the unbalanced terms cancel each other in the matching equation, since they are the same for Π γ 3 γ j andΠ γ 3 γ j . Thus, Z MMS Γ γ 3 γ j ðzμÞ is given by the same functional form as Eq. (40), with the replacement of the numerical factors 3=2 and 5=2 by 2. C MMS γ 3 γ j is given by The matching kernel for the transversity has been calculated previously, with a hard cutoff and a quark mass to regularize the UV and the IR divergences, respectively, but without renormalization [34]. More recently, following the idea of Ref. [44] to match directly from RI-renormalized quasi-PDFs to light-cone PDFs in the MS scheme, the matching for the transversity was also derived in Ref. [31].

B. Comparison of matching in the MS, MMS and ratio schemes
Our choice for the matching to extract the light-cone PDFs from quasi-PDFs is to use a minimal modification of the MS scheme that ensures current conservation. Thus, the matching formula is applied on a quasi-PDF renormalized in a modified MS scheme, the MMS scheme, to yield the corresponding light-cone PDF in the pure MS scheme. The difference between the MS and MMS schemes was also taken into account at the level of the renormalization functions in Sec. IV. Note that in our previous work [19,20], we did not modify the conversion and, thus, our quasi-PDFs were renormalized in the standard MS scheme. The formulas for the conversion modification, derived in the current paper, were not available at the time of our previous work and we argued that the numerical effect of the modification is subleading with respect to other uncertainties (as the modification in the MS is minimal and only in the unphysical regions). Moreover, it disappears in the infinite-momentum limit. Having now derived the relevant formulas to convert renormalization functions from the MS scheme to MMS, we test this and validate that it is indeed a small effect.
In Fig. 24, we compare matched PDFs obtained from quasi-PDFs renormalized in the MS scheme, where the additional conversion of Eq. (26) is not included, and in the modified MMS scheme. As anticipated, the numerical effect is very small, particularly in the unpolarized case. Nevertheless, it is important to take the conversion modification into account to have a self-consistent procedure. We note that the modification brings the matched PDFs slightly towards phenomenological extractions, which is reassuring.
Another possibility to obtain a matching formula with current conservation is to use the procedure proposed in Ref. [45], the so-called "ratio" scheme, already mentioned in Sec. VA. It consists in a different way of changing the MS scheme to achieve current conservation, including a modification of the physical region. Thus, the effect on the matched PDFs is expected to be larger numerically than when using the MMS scheme. We show the comparison of the matching from our previous procedure and from the "ratio" scheme in Fig. 25. Indeed, the "ratio" scheme has a noticeable effect on the matched PDF, which is particularly large in the small-x region. Theoretically, both the "ratio" and MMS schemes are valid modifications of the MS scheme. Obviously, they are equivalent at the one-loop level, but they have significantly different higher-order effects. Hence, they can both be used to match quasi-PDFs to light-cone PDFs and the difference between them can serve as an estimate of truncation effects at one loop. For the remainder of the paper, we use the MMS scheme, as the one for which the higher-order effects are likely to be smaller, since this scheme is a milder modification of the MS scheme, i.e., one that does not affect the physical region in the matching.

C. Comparison of MS → MS and
RI → MS matching As we discussed above, our renormalization and matching procedure proceeds in two steps. First, the renormalization functions computed in the RI 0 scheme are converted to the MMS scheme and evolved to a chosen scale, μ ¼ 2 GeV, using one-loop perturbative formulas derived in Ref. [21], and including the additional conversion from the MS scheme to the MMS scheme, derived in this work. The resulting MMS Z factors are used to renormalize the bare matrix elements, which are then Fourier transformed from z space to x space, yielding the renormalized quasi-PDF in the MMS scheme. In the second step, a matching procedure is applied that brings the MMS-renormalized quasi-PDF to the light-cone PDF in the MS scheme, at the same renormalization scaleμ. We will refer to this procedure as a two-step procedure.
In Ref. [44], a one-step procedure was proposed. It consists in applying the renormalization functions computed in the RI scheme to the bare matrix elements and taking the Fourier transform of RI-renormalized matrix elements to obtain a quasi-PDF renormalized in the RI scheme. The matching procedure then serves to simultaneously bring the quasi-PDFs to the light-cone PDFs and to  convert from the RI scheme to the MS scheme, evolving them from the given RI scale to the reference scale of 2 GeV. For this procedure, we use the formulas derived in Ref. [29] and consider the operator with the γ 0 Dirac structure. We choose the variant with the p projection that differs from the projection that we used in our variant of the RI 0 scheme. We refer to this choice of the projection as RI p .
The comparison of the one-step and two-step procedures is presented in Fig. 26. In the left panel, we show our quasi-PDFs, renormalized in the RI p scheme and in the MMS scheme. The corresponding renormalization functions are applied to the same lattice data for bare matrix elements. The results of applying the one-step and two-step procedures are given in the right panel. We observe that the results of both procedures are fully compatible for small positive and negative x. For large positive x, the twostep procedure gives a PDF that goes to zero more slowly, while the PDF from the one-step procedure crosses zero at x ≈ 0.6 and remains negative until x ≈ 1 (reaching a minimum of around −0.16 at x ≈ 0.8). In the large negative-x region, the situation is analogous: the one-step (two-step) procedure leads to the PDF approaching zero from below (above).
Both the one-step and two-step procedures use a oneloop computation in continuum perturbation theory. Obviously, both contain higher-order contributions that cannot be quantified unless the two-loop formulas are available. Hence, neither of them can be considered to be the preferred one. It might happen that one of the procedures evinces smaller higher-loop effects without any theoretical arguments to support this, but it is not possible to say which one. The comparison between them is, thus, useful, as it reveals systematic effects due to higher-order terms, which may differ for different x regions. The present study suggests that the large-jxj regions suffer more from such higher-order effects. As Fig. 26 indicates, there may be significant two-loop effects in the conversion/evolution/matching procedures. Thus, a two-loop computation is indeed necessary. The two-step procedure allows for separating the conversion, the evolution and the matching and hence, a two-loop computation even for one of these, which is simpler than the full computation of the one-step procedure, can provide some insight.

D. Truncation of the Fourier transform
We turn now to the investigation of systematic effects related to the FT that is applied on the renormalized matrix elements in z space, giving the quasi-PDFs in x space. In principle, the FT integrates over all Wilson line lengths from zero to infinity, whereas on the lattice we have available for a finite number of discrete lengths z=a. Hence, one needs to decide about the maximum value of z=a taken in the discretized FT integral. Ideally, it should be a value for which both the real part and the imaginary part of the matrix elements have decayed to zero. For unrenormalized matrix elements, the choice of such a value poses no problem. As can be seen in Fig. 27, the bare matrix elements are zero for any jz=aj ≳ 15. However, renormalized matrix elements involve multiplication of the bare ones by complex Z factors that mix bare real and imaginary parts, according to Re½h ren ¼ Re½ZRe½h bare − Im½ZIm½h bare ; ð45Þ Im½h ren ¼ Re½ZIm½h bare þ Im½ZRe½h bare : ð46Þ . This is for the unpolarized case, with a nucleon momentum of P 3 ¼ 10π=L.
As a result, the real part of the renormalized matrix elements can be nonzero even if the real part of the bare matrix elements has already decayed to zero. This stems from an unphysical truncation effect in the perturbative conversion between the intermediate RI 0 scheme Z factors and the MS ones. Analogously, the imaginary part of the renormalized matrix elements gets contaminated by Im½Z which multiplies the real part of the bare matrix elements and leads to nonphysical contributions. Moreover, the large values of Z factors at large Wilson line lengths amplify the bare matrix elements and even if the latter are compatible with zero, this amplification introduces very large noise to the data, which propagates through the FT and matching to the final PDFs. These effects call for a truncation of the renormalized matrix elements at some justified value of z=a, which we call z max =a. In Fig. 28, we illustrate the z max =a dependence of the resulting final PDFs for all operators and for our largest momentum. The smallest (largest) value of z max =a is chosen according to where the real (imaginary) part of the renormalized matrix elements is compatible with zero. For the transversity case (lower plot of Fig. 28), we observe very little dependence on the choice of z max =a (apart from increased statistical noise for larger values of z max =a) and we choose z max =a ¼ 12 as the value for our final plots. In this case, both the real and imaginary parts of the renormalized matrix elements are compatible with zero. Hence, the unphysical effects of truncating the Fourier transform at finite z max =a are minimal. For the unpolarized case, a value of z max =a where Re½h ren and Im½h ren are both zero does not exist. Hence, we observe that the variation of z max =a leads to a visible effect in the upper left plot of Fig. 28. However, matched PDFs from all considered values of the cutoff are compatible with one another. In the end, we choose the middle value, z max =a ¼ 10, as the final one. In this way, the unphysical effect of truncation is split between the real and imaginary parts. For the former, the matched PDF gets a contribution from unphysical negative values of Re½h ren (resulting from the imaginary part of Z factors being nonzero due to the truncation of the perturbative conversion between renormalization schemes). In turn, for the latter, part of the contribution from the imaginary part of the renormalized matrix elements is missing. In the helicity case, Re½h ren decays to zero at a similar value of z=a as for the other cases and, contrary to the unpolarized matrix elements, does not go below zero. However, jIm½h ren j, reaches a minimum around z=a ≈ 12-13 and then increases again, to become compatible with zero at z=a ≈ 17-18, within huge errors. For our preferred value of the cutoff, we choose z max =a ¼ 12, where the real part is compatible with zero and the absolute value of the imaginary part is locally minimal. Varying z max =a (upper right plot of Fig. 28) leads to changes in the matched helicity PDF that are statistically insignificant.
Apart from the nonphysical effects appearing when the renormalized matrix elements are not compatible with zero at z max =a, further nonphysical effects are introduced by the FT if the decay of h ren is not fast enough. Namely, the periodicity of the Fourier transform leads to unphysical oscillations in the quasi-PDFs and finally in the matched PDFs. This effect is visible in all of our final distributions as a distorted approach of PDFs to zero for x between 0.5 and 1, as a mild oscillatory behavior for large negative x, and as an unphysical minimum in the antiquark part at −x ≈ 0.1-0.2. We note that these symptoms are interconnected, e.g., vector current conservation implies that a too large integral of the quark part, caused by the FT truncation effects, must lead to a too low value of the integral in the antiquark part, signaled by the negative value ofd −ū for low negative x. Decomposing the FT into cosine and sine transforms, one can observe that the quasi-PDF is bound to be negative if the matrix elements entering the transforms do not decay to zero fast enough. The rate of decay of these matrix elements depends on the nucleon boost, as was demonstrated in Fig. 7. For larger momenta, the decay becomes faster, i.e., zero is reached at a smaller value of z=a, for both the real and imaginary parts. Thus, the unphysical oscillations that we observe in the data clearly depend on the nucleon boost. However, as argued in a study performed in parallel to our work [112], the issue is more general and profound. They showed that the problem of reconstructing a distribution from a naively discretized and truncated Fourier transform, i.e., with only a few available values of the matrix elements (corresponding to integer values of z=a) extending to a finite Wilson line length (up to some z max =a), is mathematically ill defined. They advocated the use of advanced reconstruction techniques that can improve the situation considerably. Among such techniques that were tested in that paper are the Backus-Gilbert method, a neural-network reconstruction and Bayesian methods. Applying them to our current data is beyond the scope of this work, but will be pursued by us in the future. We also remark that the FT problem is aggravated in the process of renormalization. In Fig. 29, we display the results for the unpolarized quasi-PDF for z max ¼ 10a, comparing the bare, RI 0 and MMS renormalized cases. As can be seen, the oscillations do not emerge when Fourier transforming bare matrix elements that decay to zero faster. The amplification of the matrix elements by the Z factors is responsible for their relatively slow decay, which enhances the reconstruction problem. At this stage, one needs to remember that the Z factors, as calculated now, are subject to two kinds of nonphysical effects: lattice artifacts in their nonperturbative evaluation on a hypercubic lattice in the RI 0 scheme, and truncation effects in the perturbative conversion to the MMS scheme and evolution to the reference scale of 2 GeV. After curing these effects, by computing the higher-order conversion formula and by subtracting lattice artifacts computed in lattice perturbation theory [111], we expect that the oscillatory behavior may be considerably less prominent, but we do not expect that the final PDFs will be devoid of it. This is supported also by the behavior observed for the quasi-PDF renormalized using the RI 0 scheme, which has no conversion truncation effects, where we find that the oscillation is reduced with respect to the MMS-renormalized quasi-PDF, as seen in Fig. 29. However, even in the RI 0 scheme, the bare matrix elements are still amplified by the Z factors, whose real part is exponentially increasing due to the presence of the power divergence related to the Wilson line.
An approach that may address nonphysical oscillations was proposed in Ref. [24]. The idea is to rewrite the FT that yields the quasi-PDF, using integration by parts as follows: where h 0 ðzÞ is the derivative of the matrix elements with respect to the Wilson line length z. We note that the difference between the standard FT and the "derivative" method vanishes if hðjzj ≥ z max Þ ¼ 0. The integration-byparts step is exact; however, the proposed approximation is to neglect the surface term, even if it is nonvanishing. The latter happens when the matrix elements have not decayed to zero at z ¼ z max . The resulting FT, given by the second term of Eq. (47), thus, involves the derivatives of renormalized matrix elements, which decay to zero faster than the matrix elements themselves. As argued above, a fast enough decay of the transformed matrix elements is important to avoid oscillations. In this so-called "derivative" method, the oscillations are effectively transferred to the neglected surface term. We note, however, that this procedure is dangerous, due to the 1=x factor of the surface term, which leads to an uncontrolled effect for small values of x, where also the exponential of the surface term has a large real part. This aspect is illustrated in Fig. 30 for all three types of PDFs. The explicit 1=x factor amplifies the final PDF for small jxj, leading to its huge values, particularly in the small positive-x region. This is very visible in the case of the unpolarized PDF, where the results are statistically the most precise. In the polarized cases, in the small-jxj regions, one observes that the statistical errors are significantly enhanced by the "derivative" method. All three final PDFs evince divergences to −∞ as x → 0 in the antiquark part, in contradiction with the behavior observed in phenomenological PDF extractions. Furthermore, the oscillatory behavior is very mildly affected, undermining the very aim of using the "derivative" procedure. The "derivative" method was also analyzed by the authors of Ref. [112], who showed that this method does not alter the ill-conditioned behavior encountered for a naive truncated FT limited to Oð10Þ values of z=a. Thus, it does not offer a genuine solution to the FT issue. We are led to conclude that the problem does not disappear, since the Fourier transform that defines the quasi-PDF is still truncated and part of the truncation encoded in the surface term may be hiding physical information relevant for reconstructing the full PDF. In other words, for a fully reliable extraction of PDFs, the surface term has to go to zero when applying actual lattice QCD data, and not by setting it to zero by hand. Another disadvantage of using the "derivative" method is that it introduces additional discretization effects, which need to be controlled. For all the above reasons, we opt not to use the "derivative" method and treat the residual oscillations as an indication that advanced reconstruction methods need to be used instead of the naive approaches. Moreover, this aspect may be improved with a nucleon boost that is increased in a controlled manner (by bare matrix elements that decay to zero faster) and by refinements of the renormalization procedure. All of these are important directions for further study and we plan to explore these ideas in the future.

VI. FINAL RESULTS AND DISCUSSION
In this section, we present our final results for the collinear nonsinglet quark PDFs, whose determination was guided by the extensive investigation of several systematic effects presented in the previous sections. Although it is not possible to quantify all of the systematics, this investigation clearly points to the steps that one must follow in the future in order to arrive at precise extractions of PDFs from lattice QCD, with systematic uncertainties under control. We will be discussing these steps together with the presentation of the final results. In Fig. 31, we show the quasi-PDFs, the matched PDFs and the final PDFs that take into account the nucleon mass corrections (NMCs) for the largest value of the momentum. The nucleon mass corrections are performed according to the formulas of Ref. [17]. They are Oðm 2 N =P 2 3 Þ corrections to the PDF resulting from the finite mass of the nucleon and they can be derived in a closed form to all orders. We note that the one-loop matching that connects the quasi-PDFs to the light-cone PDFs is still relatively large for our largest nucleon boost. Increasing the nucleon boost while excitedstates contamination is fully under control is one aspect that has to be addressed in future computations. Another is the two-loop perturbative conversion between renormalization schemes and evolution to the chosen reference scale. This will further reduce the dependence on the initial RI 0 scale, and may be used to quantify truncation effects in the conversion factor. NMCs, on the other hand, are very small and negligible in most regions of x and thus not expected to contribute to large systematic effects.
In Fig. 32 we show the dependence of the final PDFs on the nucleon boost. For the unpolarized PDF, we observe a strong effect when increasing the momentum from 6π=L to 8π=L, which seems to converge after that yielding values that are compatible with those obtained for a nucleon boost of 10π=L over almost the whole range of x. In addition, the oscillatory behavior (at the level of both quasi-PDFs and matched PDFs) becomes milder as the momentum increases. In particular, there is an additional minimum of the final PDFs for the smallest considered momentum, at x ≈ 0.5 in the unpolarized case (see the upper left plot of Fig. 32), which disappears with momentum increase. As we discussed above, this behavior results from the fact that for larger boosts, both bare and renormalized matrix elements become consistent with zero at smaller Wilson line lengths z; see Fig. 7. For example, the real (imaginary) part of the bare matrix elements corresponding to the unpolarized case becomes compatible with zero for z=a ¼ 8 (13) at P 3 ¼ 10π=L and for z=a ¼ 13 (18) for P 3 ¼ 6π=L, the latter aggravating the truncation problem, as discussed in the previous section. Moreover, the reconstruction of the x-dependent distributions is more problematic with renormalized data, as again argued in Sec. V D. Also, the decay of the matched PDF to zero at x ¼ AE1 is best achieved at the largest momentum. For the helicity PDF, on the other hand, increasing the momentum from 6π=L to 8π=L is a minor effect, i.e., the final PDFs are compatible with each other for both of these momenta, over almost the whole range of x. However, when increasing the momentum from 8π=L to 10π=L, a significant shift is observed, especially for small values of x. Thus, it is likely that higher-twist effects are more pronounced for the helicity PDF. The oscillatory behavior is also milder in this case as the momentum increases, and the final PDF is compatible with zero at x ¼ 1, which is not the case for the two smaller momenta. For the transversity PDF, the momentum dependence is the mildest and results using momenta of 8π=L and 10π=L are compatible over a wide range of x values. Similarly to the unpolarized and helicity PDFs, the unphysical oscillations are significantly reduced and the transversity PDF vanishes at x ¼ 1 for a momentum of 10π=L.
In Fig. 33, we analyze the momentum dependence of the final matched PDFs more closely. In the unpolarized case, we observe compatibility between the data for the largest two momenta for all values of x illustrated in the upper panel. This is in accordance with theoretical expectations: the higher-twist effects (HTEs) are suppressed by OðΛ 2 QCD =P 2 3 Þ, which amounts to Oð5%Þ, Oð7%Þ and Oð13%Þ, for P 3 ¼ 10π=L, P 3 ¼ 8π=L and P 3 ¼ 6π=L, respectively. With our statistical precision, around 10% for most ranges of x, HTEs are expected to be hidden within statistical uncertainties when considering the two largest momenta. Obviously, the prefactor of OðΛ 2 QCD =P 2 3 Þ terms is unknown and can also depend on x, but the lattice data seem to favor Oð1Þ values (or smaller) in the unpolarized case. For the helicity PDF (middle panel of Fig. 33), we observe larger deviations between P 3 ¼ 10π=L and P 3 ¼ 8π=L, particularly for intermediate values of x, i.e., x ≈ 0.3. The central value for the largest momentum is approximately 25% lower than the one for P 3 ¼ 8π=L.
This can indicate enhanced HTEs [a prefactor of OðΛ 2 QCD =P 2 3 Þ larger than one] or a statistical fluctuation. The latter interpretation may be more favored, given the fact that the results are compatible for the other illustrated values of x. Moreover, HTEs are typically expected to be enhanced for low and high x, as concluded e.g., in Ref. [113]. Even though the results of that paper are not directly applicable to our setup, since the renormalization is performed by forming suitable ratios that have canceled divergences, this generic feature of HTE enhancement is very likely to hold. Nevertheless, the behavior that we obtained for the helicity PDF at intermediate x needs to be checked in future studies. In the transversity case (lower panel of Fig. 33   phenomenological results, we use the ones from CJ15 [114], ABMP16 [115] and NNPDF3.1 [116] for the unpolarized PDF, the ones from DSSV08 [117], NNPDF1.1pol [118] and JAM17 [119] for the helicity PDF and two parametrizations for the transversity PDF: one extracted from experimental semi-inclusive deep-inelastic scattering data and one where the tensor charge computed in lattice QCD was used as input in the phenomenological analysis [120]. We stress that the comparison with phenomenological PDFs is intended to be qualitative, since we only include statistical uncertainties. The unpolarized PDF has a similar slope to the phenomenological PDFs, but lies above them in the positivex region. Increasing the momentum should bring the PDF closer to the phenomenological values. In Fig. 32, we choose not to show the phenomenological curves for clarity of the plots, but looking at the position of these curves in Fig. 34, it is clear that a larger boost indeed moves the lattice-extracted PDFs in the direction of the phenomenological PDFs. Our results for the helicity PDF are compatible with phenomenological ones for x ≲ 0.4-0.5. While this may indicate faster convergence of the quasi-PDF, it can also be the result of different systematic effects, with some of them possibly canceling out. Likewise, our result for the transversity PDF is in agreement with both the phenomenologically extracted one, as well as the one using the lattice determination of the tensor charge as input, for x ≲ 0.4-0.5. An interesting feature is that the precision of our results is better than in both phenomenological determinations. Thus, lattice QCD holds the prospect of significantly impacting our knowledge of the transversity PDF, in particular because we expect that in the next generation of lattice QCD computations of PDFs a better control of the systematic effects will be achieved. The errors shown in the above figures are statistical, and significant effort is needed to properly quantify the systematic uncertainties present in the various steps of the analysis. Currently, these constitute the dominant source of uncertainty, and addressing them will allow one to draw final conclusions from the lattice results. Based on the current studies, it is not possible to quantify and hierarchize them. Typical systematics related to the lattice calculation include discretization effects, finite-volume effects and the role of the pion mass value. The pion mass of the ensemble used here is already at its physical value and hence, there is no need for a chiral extrapolation. The latter would be an important source of uncertainty, as the fit function is not known. We remark that, in practice, the effect of a nonphysical pion mass can be large, as we demonstrated in Ref. [19], comparing the physical pion mass PDF with one at m π ≈ 375 MeV.
The parameters of the ensembles are expected to satisfy certain criteria for the range of values of the pion mass, the volume and the lattice spacing, to study the following uncertainties.
(1) Cutoff effects: A reliable control of cutoff effects requires at least three values of the lattice spacing smaller than 0.1 fm. Normally, cutoff effects are found to be relatively small in lattice hadron structure calculations. In the quasi-PDF computation, the nucleon is boosted to momenta for which P 3 becomes significant in comparison to the inverse lattice spacing and this may lead to increased cutoff effects. We note that for our largest momentum, we have aP 3 ¼ 0.65 which is below the lattice cutoff (unlike Refs. [26,28,31] where the employed nucleon momenta are significantly above the lattice cutoff), and the continuum dispersion relation is still satisfied, as shown in Fig. 4. Still, it is unclear to what extent the good quality of the dispersion relation translates into discretization effects of the matrix elements considered here. (2) Finite-volume effects: Similarly to discretization effects, finite-volume effects are also usually found to be rather small in hadron structure observables. The situation with quasi-PDFs is likely to be somewhat more complicated, since we use operators with Wilson lines of significant length. The volume behavior of such extended operators was considered by Briceño et al. [110] within a model using current-current correlators in a scalar theory.
Despite the fact that the model is not directly applicable to our investigation, it does provide a warning that the suppression of finite-volume effects for matrix elements of spatially extended operators may change from the standard expð−m π LÞ to ðL m =jL − zj n Þ expð−m N ðL − zÞÞ, where m and n are undetermined exponents, which may become dominant for large z. Thus, finite-volume effects may turn out to be a significant source of systematics and their investigation will be crucial in the future. (3) Systematic uncertainties in the determination of the renormalization functions: Uncertainties also arise in the computation of renormalization functions due to the breaking of rotational invariance. We have partly improved our work by subtracting lattice artifacts in the renormalization factor of the quark field, computed in lattice perturbation theory (see Sec. IV). A similar subtraction for the Z factors of local operators was very successful [104,107,121], and we intend to perform such a subtraction in the future [111]. We note that the pion mass dependence and finite-volume effects in the renormalization functions studied here were found to be insignificant. (i) Uncertainties specific to the quasi-PDF approach: The most significant uncertainty of this type is the truncation of the Fourier transform and the finite number of evaluations of the matrix elements, when going from the z space of the latter to the x space of partonic distributions. This issue can be alleviated at larger nucleon boosts, which implies a faster decay of matrix elements to zero and, thus, mildened oscillations. However, according to the findings of the recent Ref. [112], the problem needs a different treatment, using advanced reconstruction techniques. Other ad hoc prescriptions, such as the "derivative" method, do not actually remove the source of the oscillations. In this technique, it is unclear how to handle the small-jxj region, since the ignored surface term contains an explicit 1=x factor. In addition, the need to discretize the derivative of the matrix elements introduces additional cutoff effects. The effects from the finite momentum are likely to be within the order of magnitude of their theoretical expectation, OðΛ 2 QCD =P 2 3 Þ, which amounts to around 5% at our largest momentum. Such higher-twist effects can be suppressed by going to larger boosts or alternatively, by explicitly computed and subtracted. However, the former is difficult, as the signal is exponentially decaying at larger momenta and the larger momentum also significantly increases excited-states contributions, particularly when using simulations at the physical pion mass. As we argued, good control over excited states is necessary for extracting reliable physical results. Thus, a detailed study is required to maintain ground-state dominance, following the approach explained in detail in this work. In this paper, we showed that excited states are suppressed below ∼10% if the source-sink separation is 1.1 fm at P 3 ≈ 1.4 GeV, as established with a detailed analysis based on three methods. We note that at the physical pion mass and at small source-sink separations, the contamination comes from tens of excited states, and thus, a one-state or two-state fit alone is not reliable. Compatibility between onestate and two-state fits provides a strong argument that ground-state dominance has been achieved. Another possibly significant source of systematic errors are the perturbative truncation effects in the conversion to the MS scheme, evolution to the reference scale and matching, all of which are currently known to the one-loop level. We observed that the effect of the latter is significant at our largest momentum and hence two-loop effects are likely to be substantial. Therefore, a two-loop computation would lead to better connection to light-cone PDFs with smaller values of the nucleon momentum.

VII. CONCLUSIONS
In this work we presented a detailed investigation of the formalism employed to extract x-dependent PDFs within lattice QCD. The analysis was carried out using one ensemble of N f ¼ 2 twisted mass fermions with the quark mass fixed to its physical value. Results on the collinear unpolarized, helicity and transversity PDFs were obtained for the isovector flavor combination. The quasi-PDF method used here relies on a computation of equal-time correlators involving boosted nucleons, with the momentum increased to large enough values, so that the large momentum effective theory is applicable. In this work, we investigated three values of the momentum, reaching a maximum of 1.38 GeV. All necessary components to obtain light-cone PDFs have been considered, namely, the calculation of the bare nucleon matrix elements, renormalization, matching to light-cone PDFs and subtraction of finite nucleon mass corrections. The work presented here is an extension of our earlier work [19,20], and includes details on technical and theoretical aspects, as well as improvements in various aspects of the calculation and examination of systematic effects. In particular, we provided the lattice techniques used in the evaluation of the matrix elements, such as momentum smearing and methods to eliminate (within statistical accuracy) excited-states contributions. Theoretical developments associated with the nonperturbative renormalization were presented, where a chiral extrapolation and a modification of the conversion to the MMS scheme have been employed. The latter is a variant of the MS scheme that ensures particle number conservation in the matching procedure. Issues related to the Fourier transform as well as different matching prescriptions were also explored.
Particular attention was given to discussing the role of systematic effects in the various steps of the analysis. Typical systematics related to the lattice calculation that are also common in other hadron structure calculations include discretization effects and finite-volume effects. These can be addressed and eliminated by simulations using additional gauge field configuration ensembles. At present, these are part of the unquantified uncertainties, that need to be addressed in future studies. Another typical systematic enters lattice data if the pion mass of the ensemble is at an unphysical value. In this work, we used simulations at the physical point and chiral extrapolation was not needed. Analyses using nonphysical values of the quark mass would introduce an important source of uncertainty, as the fit function to extrapolate the results to the physical point is not known.
Apart from the aforementioned systematics that are common in other hadron structure investigations, there are additional ones that are specific to the quasi-PDF approach. As discussed in the previous sections, significant effects are caused by the truncated and discretized Fourier transform. We will explore whether the reconstruction techniques advocated in Ref. [112] will indeed cure this problem. Also, increasing the nucleon boost can lead to further systematic uncertainties. High values of the momenta reduce the appearance of oscillations, but at the same time, they increase the number of contributing excited states. Systematics related to the truncation of the conversion factor and the matching formula to the one-loop level are non-negligible. In particular, having a matching formula to two loops may lead to better convergence to light-cone PDFs at smaller nucleon momenta. Hence, a two-loop computation is strongly desired and would help to establish a better connection to light-cone PDFs with smaller values of the nucleon momentum.
Despite these uncertainties, this work demonstrates tremendous progress in the determination of PDFs from the quasidistribution approach [33]. Lattice QCD results confirm the feasibility of extracting PDFs from firstprinciple calculations. The success in the quasi-PDF approach for the nucleon has also resulted in studies of other hadrons and alternative approaches, as summarized in Ref. [33]. The theoretical and technical aspects are now well understood for the nucleon studies and addressing the systematic uncertainties is the next step that will be enabled by using the methodology developed, in combination with the availability of increased computational resources. In fact, our preliminary results for an N f ¼ 2 þ 1 þ 1 twisted mass ensemble with physical values of the quark masses, a lattice spacing of 0.082 fm and a larger volume of 64 3 × 128 have been shown in Ref. [122] and demonstrate the future direction and progress of such computations. Furthermore, the production of another ensemble at a finer lattice spacing is already underway.