Transversity PDFs of the proton from lattice QCD with physical quark masses

We present a lattice QCD calculation of the transversity isovector- and isoscalar-quark parton distribution functions (PDFs) of the proton utilizing a perturbative matching at next-to-leading-order (NLO) accuracy. Additionally, we determine the isovector and isoscalar tensor charges for the proton. In both calculations, the disconnected contributions to the isoscalar matrix elements have been ignored. The calculations are performed using a single ensemble of $N_f = 2 +1$ highly-improved staggered quarks simulated with physical-mass quarks and a lattice spacing of $a = 0.076$ fm. The Wilson-clover action, with physical quark masses and smeared gauge links obtained from one iteration of hypercubic smearing, is used in the valence sector. Using the NLO operator product expansion, we extract the lowest four to six Mellin moments and the PDFs via a neural network from the matrix elements in the pseudo-PDF approach. In addition, we calculate the PDFs in the quasi-PDF approach with hybrid-scheme renormalization and the recently developed leading-renormalon resummation technique, at NLO with the resummation of leading small-$x$ logarithms.


I. INTRODUCTION
A significant goal of hadron physics is the determination of the full structure of nucleons.There has been much progress towards this end, both experimentally and theoretically.Experimentally, the leading-twist parton distributions functions (PDFs) for both the unpolarized and longitudinally polarized proton have been determined with high precision through global analyses [1][2][3][4] of experimental data collected for example at HERA, the Tevatron, the LHC, etc.In addition, to obtain the full collinear structure at leading twist requires the transversity PDF, which gives the difference in the probability to find a parton aligned and anti-aligned with the transversely polarized hadron.However, the transversity PDF is less well constrained experimentally, but measurements of the single transverse-spin asymmetries from semi-inclusive deep-inelastic scattering (SIDIS) processes by COMPASS [5] and HERMES [6], as well as dihadron production in SIDIS by COMPASS [7,8] and HERMES [9], and pp collisions by RHIC [10][11][12], have led to a series of extractions of the transversity PDF [13][14][15][16][17][18][19][20][21][22][23][24][25][26][27][28][29][30].However, the uncertainties can still be as large as 40% or more in the valence region [30].One of the major goals of the JLab 12 GeV upgrade and the upcoming Electron-Ion Collider (EIC) is to gain more information on the spin structure of the nucleon, including the transversity PDF [31,32].
On the theoretical side, there has been significant de- * Electronic address: gaox@anl.gov† Electronic address: ahanlon@bnl.govvelopment in the first-principles calculations of PDFs through lattice QCD (see reviews in Refs.[33][34][35][36][37][38][39]).Among them, the two most widely used approaches utilize either the quasi-PDF within the framework of large-momentum effective theory (LaMET) [36,40,41] or the pseudo-PDF [42,43].Both the quasi-PDF and pseudo-PDF are defined from the matrix elements of a gauge-invariant equal-time bilinear operator in a boosted hadron state [40], which can be directly simulated on the lattice.In the LaMET approach, the PDF can be calculated from the quasi-PDF through a power expansion and effective theory matching at large hadron momentum, with controlled precision for a range of moderate x.On the other hand, the pseudo-PDF method relies on a short-distance factorization in coordinate space [44][45][46], which allows for a model-independent extraction of the lowest Mellin moments [46] or a model-dependent extraction of the x-dependent PDF.Both methods require larger hadron momenta to extract more information on the PDF, and can complement each other in practical calculations [47,48].
Over the past decade, there have been a few calculations of the transversity PDF from lattice QCD using both the quasi- [49][50][51][52][53][54] and pseudo-PDF [55] approaches, which were all carried out with a next-to-leading-order (NLO) perturbative matching correction.Among them, the first physical pion mass calculations [50][51][52] were accomplished with the regularization-independent momentum subtraction (RI/MOM) scheme [56][57][58][59] for the lattice renormalization, which is flawed by the introduction of non-perturbative effects at large quark-bilinear separation.To overcome this problem, the hybrid scheme [60] was proposed to subtract the Wilson line mass with a matching to the MS scheme at large quark-bilinear separation, which was used in the recent calculation of Ref. [54] with continuum and physical pion mass extrapolations.More recently, a systematic way to remove the renormalon ambiguity in the Wilson-line mass matching, called leading-renormalon resummation (LRR), was proposed in Refs.[48,61].
In this work we carry out a lattice QCD calculation of the proton isovector and isoscalar quark transversity PDFs at physical quark masses, where the latter have been calculated without the inclusion of disconnected diagrams.This is an extension of our previous calculation of the proton isovector unpolarized PDF [62].Here we utilize both methods for calculation, which can help to understand the significance of the different systematics within them.In particular, for the quasi-PDF method, we adopt the hybrid scheme with LRR and work at NLO with leading-logarithmic (LL) resummation that accounts for PDF evolution, which gives us a reliable estimate of the sysmtematic uncertainty in the small-x region [61].
The rest of the paper is organized as follows.First, in Sec.II, we review the setup of our lattice calculation.Then in Sec.III we describe our analysis strategy to extract the ground-state matrix elements, which includes an estimate for the tensor charge.In Sec.IV we use the ground-state matrix elements to extract the lowest few Mellin moments from the leading-twist operator product expansion (OPE).We then move on to the determination of the transversity PDF with the pseudo-PDF method in Sec.V and the quasi-PDF method in Sec.VI.And, finally, we conclude in Sec.VII.

II. LATTICE DETAILS
Our setup is nearly identical to that used in our previous work on the unpolarized proton PDF [62], and is also similar to our work on the pion valence PDF [63,64].There are only two differences here: i) the specific correlators needed for the transversity PDF, which were, in fact, computed at the same time as the correlators needed for the unpolarized PDF; and ii) an increase in statistics for the P z = 2π6 L , t sep = 12a data.Therefore, we only repeat the most pertinent details here.
The calculations are performed on a 64 3 × 64 ensemble of N f = 2 + 1 highly-improved staggered quarks (HISQ) [65] with masses tuned to their physical values and a lattice spacing of a = 0.076 fm, which was generated by the HotQCD collaboration [66].For the valence quarks the tree-level tadpole-improved Wilson-clover action is used on the smeared gauge field background.We use one iteration of hyper-cubic (HYP) smearing [67] on the original gauge fields.The valence quark masses are set to their physical value.Because of the use of a HYP smeared background, we do not see any exceptional configurations.This feature of the Wilson-clover action with HYP smeared gauge background holds even on coarser lattices, as was pointed out some time ago [68].Note that, in principle, the mixed action setup leads to violations of unitarity at non-zero lattice spacing.However, the effects of this appear to be very mild (e.g.see Ref. [69] which makes consistent comparisons for nucleon structure observables between different mixed and unmixed setups).
In order to build a nucleon operator with good overlap onto a highly-boosted nucleon state, the quark fields are smeared using Coulomb-gauge momentum smearing [70] as described in App.A of Ref. [71].Within this method, for a given desired momentum P z ≡ 2πnz L of the nucleon, the momentum smearing assumes a quark boost of 2πkz L , where n z , k z ∈ Z.For an optimal signal, k z should be about half of n z .
We use the Qlua software suite [72] for calculating the quark propagators and subsequently constructing the final correlators.The needed inversions are performed using the multigrid solver in QUDA [73,74] and utilize all-mode averaging (AMA) [75] to reduce the total computational cost.The residual used in our solver is 10 −10 and 10 −4 for exact and sloppy solves, respectively.Some of the more important details, including the total statistics achieved, can be found in Tab.I.

A. Correlation functions
We use the standard interpolating operator for the nucleon, given by where C = γ t γ y is the charge-conjugation matrix.These are then used to construct the two-point correlation func-tions as follows where the superscripts on the nucleon operators specify whether the quarks are smeared (s = S) or not (s = P ), and P 2pt is a projection operator.As described in Ref. [62], we always use smeared quarks at the source time, but consider both smeared and unsmeared quarks at the sink time, which helps to more reliably extract the spectrum by looking for agreement between the independent analysis of both correlators.
The three-point correlators computed are given by where P 3pt is a projection operator, ⃗ p is the momentum of the sink nucleon, ⃗ q is the momentum transfer, and the inserted operator is where Γ is a product of gamma matrices, ψ f (⃗ z, t) is a quark field of flavor f , and W is a straight Wilson line of length z connecting the quark fields.For the threepoint functions, we only consider nucleon operators built from smeared quarks.The Wilson line is formed from products of the HYP-smeared gauge links and is needed to construct a gauge-invariant operator.In this work, we consider the light quark flavors f = u, d separately, allowing us to access the isovector (u − d) and isoscalar (u + d) combinations.Note, however, that all disconnected contributions are ignored, leading to uncontrolled errors due to their neglect in the isoscalar combination.This approximation is expected to be reasonable given that estimates from PNDME for the disconnected contributions to the tensor charge have indicated they are smaller than the statistical error on the connected contributions [68,76].In what follows, we only consider zero momentum transfer ⃗ q = 0 and the sink momenta are always in the z-direction ⃗ p = 2πnz L ẑ ≡ P z ẑ.We use four different values for the sink momenta n z ∈ {0, 1, 4, 6} which in physical units corresponds to P z = {0, 0.25, 1.02, 1.53} GeV.The statistics gathered and quark boosts used for each n z are given in Tab.I.
In this work, we are interested in the tensor charge and the transversity PDF, which can be accessed with Γ ∝ σ zj (with j being either x or y) and which projects the nucleon to positive parity and its spin to be aligned along the direction given by ŝ.Here we use Γ = −iσ zy = −iγ z γ y and ŝ = x.Throughout the remainder of the text, we use δ to denote the specific operator and polarization used, which is motivated by the standard usage of δq(x) in the literature for the transversity PDF.
In order to guarantee the cancellation of amplitudes that appear in the spectral decompositions of the threeand two-point functions, we set P 2pt = P 3pt ≡ P, and we will denote this in the two-point functions by C 2pt Sx .

III. GROUND-STATE MATRIX ELEMENTS
In this section, we extract the ground-state bare matrix elements from the three-point correlation functions.Our analysis strategy is nearly identical to that used in our previous work of Ref. [62], and we repeat the most important details here for convenience.The only difference in the strategy is the choice in our preferred fit ranges.In this work, the quality of our data has increased, giving us more confidence in our fits, and therefore, we end up excluding less time insertions from our final fits.
Our approach first extracts the spectrum and ratios of amplitudes from the two-point correlation functions in order to use these as priors on the parameters shared in our fits to the ratio of three-point to two-point functions.Although the two-point correlation functions differ slightly from those used in our previous work in Ref. [62], because P 2pt is different, we do not include any discussions here, as the analysis strategy is identical and the results only change by a slight increase in the error.The increase in error can be understood from the fact that the change in P 2pt amounts to only using a single spin polarization, as opposed to averaging over both spin polarizations as done previously.

A. Analysis strategy
We follow the standard approach for extracting the bare matrix elements, which begins by forming an appropriate ratio of the three-point to two-point correlation functions given by FIG. 1: The Wilson-line length dependence of the (upper) real and (lower) imaginary parts of the isovector ground-state bare matrix elements from two-state and summation fits with nexc = 2, 3 for the three nonzero values of momentum considered (one for each column).The results shown are averaged with the negative z fits.
where h f δ;0,0 (z, P z ) is the desired bare ground-state matrix element.Since the values of t sep considered here are not likely in the asymptotic region, we include the effects from the lowest N states by substituting the spectral decompositions of the three-and two-point functions truncated at the N th state.After some algebra, we find α (P z ) ≡ ⟨Ω| N β P βα |n, P z ⟩ (|Ω⟩ is the vacuum state and |n, P z ⟩ is the n-th nucleon state with momentum P z ), and The parameters h ′f δ;m,n depend on z and P z , but this dependence is suppressed to save space.For convenience, we typically suppress the indices on the matrix elements when referring to the ground state matrix element (i.e.h f δ (z, P z ) ≡ h f δ;0,0 (z, P z )).As the excitedstate matrix elements are never used, this should not cause any confusion.We fit the ratio of data in Eq. ( 6) to R f δ (P z , t sep , t ins , z; N ), where h ′f δ;m,n , ∆ i,j , and r i are the fit parameters.The parameters ∆ i,j and r i are priored using the fit results from the two-point functions (see Ref. [62] for details).In this work, we only consider N = 1, 2, as our limited data tend to lead to unreliable fits when N > 2.
In order to reduce the effects from unaccounted for excited-states as much as possible, we remove some of the data points nearest the sink and source times.We do this in a symmetric way, i.e. for each t ins not included in the fit, we also do not include t sep − t ins − 1.We define n exc to be the number of insertion times removed on each side of the middle point for each t sep .Therefore, for each t sep , the insertion times included in the fit are t ins ∈ [n exc + 1, t sep − n exc − 1].However, making n exc too large can leave too little data left, and therefore we only consider n exc ≤ 3.As described in our previous work of Ref. [62], the two-point function fits show contributions from three states for t sep ≤ n exc + 1 requiring the use of an effective value for the prior on the gap ∆ 1,0 that takes into account effects from higher states.The specific value used for the prior on the gap comes from the twostate fit to the two-point function with the lower fit range t min = n exc + 1.
As an additional consistency check on our fit results, we also make use of the summation method, which involves first summing R f δ (P z , t sep , t ins , z) over the subset FIG. 2: The same as Fig. 1, but for the isoscalar matrix elements.
be extracted from a linear fit to the sum as In Figs. 1 and 2, we show comparisons of the two-state and summation fit results for the isovector and isoscalar combinations, respectively, as a function of the Wilson line length for both n exc = 2 and 3. We see generally good agreement across these different fits, and, with the better data quality as compared to the unpolarized case, we choose our preferred fit as the two-state fit to the ratio Eq. ( 6) with n exc = 2.
Several representative fits, all using our preferred fit strategy, are shown in App.B. There we include the fits to the zero-momentum local matrix elements, relevant for the tensor charge, in Fig. 18 for the isovector and isoscalar combinations.We also include various fits to the non-local matrix elements, relevant for information on the PDFs, in Figs.19 and 20 for the isovector combination and in Figs.21 and 22 for the isoscalar combination.
which is then converted to MS at four-loop accuracy [77] and subsequently evolved to the scale µ = 2 GeV using the two loop evolution function [78] (see App.A for details).Then, using the ratio of bare charges g bare T /g bare V along with the expectation of Z V g bare V = 1, the renormalized tensor charge g T = Z T g bare T can be determined.Using our estimate for Z T /Z V , we find In Tab.II, we show a comparison of our results to the other N f = 2 + 1 results given in the FLAG review 2021 [79].

IV. MELLIN MOMENTS FROM THE LEADING-TWIST OPE
We now move on to the extraction of the lowest few Mellin moments using the leading-twist OPE approximation.Here we avoid the need for the renormalization factors, which depend on the Wilson-line length z and the lattice spacing a, by forming the renormalization-group invariant ratio [43,88] where λ ≡ zP z is known as the Ioffe time.In the literature, this ratio is referred to as the Ioffe time pseudodistribution (pseudo-ITD).In order to cancel the renormalization factors, the z = 0 matrix elements are not necessary, but this choice is favorable in that it enforces a normalization and cancels correlations.In this work, we only consider the case with P 0 z = 0, commonly referred to as the reduced pseudo-ITD [43,[89][90][91][92][93][94].Additionally, since there are no gluons involved in the case of the transversity distributions, the leading-twist OPE expansion of the pseudo-ITD does not depend on the flavor combination f ′ , even if f ̸ = f ′ , and we, therefore, opt to omit the f ′ from our notation in order to not be overly cumbersome.In what follows, when extracting the isovector flavor combination, f = f ′ = u − d, and for the isoscalar flavor combination, f = u + d and f ′ = u − d.
Then, using the leading-twist OPE approximation, we can write down the reduced pseudo-ITD as an expansion in Mellin moments [95] where C δ n (µ 2 z 2 ) are the Wilson coefficients for the transversity computed in the ratio scheme up to NLO in the strong coupling α s (µ), 1 which at fixed order are given by [49,55] C F = 4/3, and ⟨x n ⟩ f δ (µ) is the nth Mellin moment of the transversity PDF of flavor f defined at the factorization 1 We note that the anomalous dimension of the transversity matrix elements are known to N3LO [96].
where δq f (x, µ) is the transversity PDF of a quark with flavor f for x ≥ 0 and of its antiquark for x < 0. Estimates for the strong coupling itself are determined from Ref. [97], and we exclusively work at the scale µ = 2 GeV, resulting in α s (µ = 2 GeV) = 0.2930.Further, we also consider the effects from target mass corrections (TMCs), which can be incorporated with the following substitution As the Wilson coefficients are all real, it is clear from Eq. ( 14) that the real and imaginary parts of the reduced pseudo-ITD, M f δ (λ, z 2 , 0), can be written solely in terms of the even and odd moments, respectively.Therefore, we choose to separately fit the real and imaginary parts of the reduced pseudo-ITD to respectively, where the reduced moments with n > 0 are the fit parameters.The n = 0 reduced moment is identically one, which is enforced explicitly in the fit.Additionally, g f T ≡ ⟨x 0 ⟩ f δ , which implies that the reduced moments are the original moments in units of the tensor charge, and we express all results as such.
We start the analysis as before in Ref. [62] by first assessing the validity of the leading-twist approximation (i.e.how important are the O(Λ 2 QCD z 2 ) corrections which are ignored).To this end, we perform fits to Eq. ( 18) at only a single value for z 2 (referred to as a fixed-z 2 analysis) and look for any dependence of the extracted moments on the specific value of z 2 .Observing little or no dependence on z 2 would suggest that the higher-twist contributions are negligible within our statistics and that the leading-twist approximation is valid.
As z increases, the higher-moment terms in Eq. ( 14) begin to become important.We can determine when these higher-moment terms are expected to be nonnegligible by using the leading-twist OPE with the moments extracted from the global analysis of JAM3D-22 [28].We found that including two n ̸ = 0 moments .
FIG. 3: Results for the lowest four Mellin moments of the (upper) isovector and (lower) isoscalar PDF as a function of z from fits of the reduced pseudo-ITD at fixed z with nz ∈ [1,4,6] to the leading-twist OPE using LO, fixed-order NLO, and NLO+RGR Wilson coefficients evaluated at µ = 2 GeV, all of which include TMCs.Only the first two moments are extracted for z ≤ 4a.The horizontal dashed lines and bands correspond to the central values and errors, respectively, of the moments extracted from the global analysis of JAM3D-22 [28] defined at the scale Q = 2 GeV.
in the OPE for both the real and imaginary parts is necessary for z > 4a ∼ 0.304 fm, and that a third n ̸ = 0 moment becomes necessary for z > 8a ∼ 0.608 fm.However, using only one value of z 2 allows for up to two moments to be fit to both the real and imaginary data, as the number of non-zero P z considered is three.Therefore, for the fixed-z 2 analysis, the largest value of z used is z = 8a and we use two moments in the fits when z > 4a.
Our results for the fixed-z 2 analysis for both the isovector and isoscalar combinations are shown in Fig. 3.All results shown always include TMCs.Initially, we only considered the LO and fixed-order NLO Wilson coefficients in the analysis.However, the fixed-order NLO results show significant z dependence for ⟨x⟩ at small values .2 .4 .6 .8 FIG. 4: Results for the lowest four Mellin moments of the (upper) isovector and (lower) isoscalar PDF from uncorrelated fits of the reduced pseudo-ITD to the leading-twist OPE as a function of zmax, with z ∈ [zmin, zmax] and nz ∈ [1,4,6].The results use the fixed-order NLO Wilson coefficients evaluated at µ = 2 GeV and include TMCs.Only the first two moments are considered for zmax ≤ 4a.The next two moments, ⟨x 3 ⟩ and ⟨x 4 ⟩ are included for 4a < zmax ≤ 8a.And, two more moments, ⟨x 5 ⟩ and ⟨x 6 ⟩ are included for zmax > 8a.The horizontal dashed lines are the same as in Fig. 3.
of z.This is not completely unexpected, as discretization errors [98] and large logs can be significant for small values of z, see Appendix B in Ref. [64].Note, however, in that work, the analysis was done for the pion PDF, where the large logs become important at somewhat smaller z compared to the range of z where we see strong dependence here.To better understand the effects of large logs for the transversity PDF of the proton, we also use the NLO Wilson coefficients combined with renormalization group resummation (RGR) at next-to-leading logarithm  48) 39) 83) 73) FIG. 5: The (left) real and (right) imaginary parts of the (upper) isovector and (lower) isoscalar reduced pseudo-ITD for the three momentum used in this work.The data come from our preferred fit strategy described in Sec.III A. The fit ranges used are z ∈ [3a, 10a].The shaded bands correspond to the fits using the leading-twist OPE with fixed-order NLO Wilson coefficients and TMCs evaluated at µ = 2 GeV and including three moments in both the real and imaginary parts.
(NLL) accuracy, given by where a s = α s /(2π), β n is the nth order coefficient of the β function, and γ (1) n and γ (2) n are the anomalous dimensions of the nth moments [99,100].The RGR evolve the running coupling α s from the physical scale µ 0 = 2e −γ E /z to the factorization scale µ.As can be seen from Fig. 3, the use of the NLO+RGR Wilson coefficients produces results mostly consistent with the NLO case at small z.This suggests the significant z dependence at small z is mainly a discretization effect rather than due to large logs.
Next, we move on to including a range of values of z in our fits, considering various ranges z ∈ [z min , z max ].With the extra data, we can include an extra moment in the fits for z > 8a.The results for both the isovector and isoscalar moments are shown in Fig. 4. Given the small effect from the RGR which also becomes unstable when a s (µ 0 ) runs close to the Landau pole, we opt to use the fixed-order NLO Wilson coefficients, and also always include TMCs for the final results.There is some dependence on the choice of z min , as expected from the fixed-z 2 analysis results, however it is rather mild.Our preferred fit range is z ∈ [3a, 10a], which removes most of the effects from discretization errors and large logs at small z and keeps z max small enough to likely keep higher-twist contributions negligible.The results of these preferred fits are shown in Fig. 5. Finally, we show a summary of the results from different strategies and their comparison to JAM3D-22 [28] in Fig. 6.
It is interesting to note the rather good agreement with the global analysis from JAM3D-22, especially for the lowest two moments, whereas we found tension for the lowest non-trivial moment in the unpolarized case [62].However, comparing the matrix elements presented here 6: Summary plots of our extracted (left) isovector and (right) isoscalar moments from the leading-twist OPE approximation evaluated at µ = 2 GeV for various fitting strategies compared to the results from JAM3D-22 [28] defined at the scale Q = 2 GeV.Two zmax are considered, as well as Wilson coefficients at LO and fixed-order NLO.
versus those from the unpolarized ones, there is some hint of smaller excited-state contamination in the matrix elements of this work, which may be responsible for the better agreement.

V. PDF FROM LEADING-TWIST OPE: DNN RECONSTRUCTION A. Method
It has been shown that we can extract the Mellin moments of transversity PDFs by applying the OPE formula to the ratio-scheme renormalized matrix elements, model independently.Limited by the finite λ = zP z , the lattice data is only sensitive to the first few moments while the higher ones are factorially suppressed.As a result, to predict the x dependence of the PDFs, one needs to introduce additional prior knowledge or a reasonable choice of model.Commonly used models are usually of the form which is inspired by the end-point behavior of the PDFs.However, the sub-leading terms may play an important role, particularly in moderate regions of x, and one may find reasonable models for the sub-leading terms that give acceptable fits to the data.But, unless the data is precise, the model could introduce an uncontrolled bias.The use of a deep neural network (DNN) is a flexible way to maximally avoid any model bias (see e.g.Refs.[101,102] which also employed a neural network for this purpose) -but cannot remove the bias entirely, as a nueral network is still a model -which is capable of approximating any functional form given a complicated enough network structure.As proposed in Ref. [62], we parametrize the PDFs by, where f DNN (x, θ) is a DNN multistep iterative function, constructed layer by layer.The initial layer consists of a single node, denoted as a 1  1 , which represents the input variable x.Subsequently, in the hidden layers, a linear transformation is performed using the equation: Here, z .Following this linear transformation, a nonlinear activation function σ (l) (z (l) i ) is applied, and the resulting output serves as the input to the next layer, represented by a (l) i .We specifically employed the exponential linear unit activation function σ elu (z) = θ(−z)(e z − 1) + θ(z)z.Lastly, the final layer generates the output f DNN (x, θ), which is subsequently utilized to evaluate q(x; α, β, θ).The lower indices i = 1, ..., n (l) are used to identify specific nodes within the lth layer, where n (l) denotes the number of nodes in the lth layer.The upper indices, enclosed in parentheses, l = 1, ...
The first term in the loss function serves the purpose of preventing overfitting and ensuring that the function rep-0.00.2 0.4 0.6 0.8 resented by the DNN remains well-behaved and smooth.The details of the χ 2 function can be found in the appendix of Ref. [62].Due to the limited statistics, a simple network structure such as {1, 16, 16, 1} (indicating the number of nodes in each layer) is sufficient to provide a smooth approximation of the sub-leading contribution.In practice, we experimented with different values of η ranging from 10 0 to 10 −2 and considered network structures of sizes {1, 16, 16, 1}, {1, 16, 16, 16, 1}, and {1, 32, 32, 1}.However, the results remained consistent across these variations.Therefore, we opted for η = 0.1 and selected a DNN structure with four layers, including the input and output layers, specified as {1, 16, 16, 1}.
To balance the model bias and data precision, the contribution of the DNN is limited by |ϵ(x) sin(f DNN )| ≲ ϵ(x), which can be fully removed by setting ϵ(x) = 0.It is also possible to control the size of the DNN parametrized sub-leading contribution at each specific x.However, in this work, given the limited statistics, we simply fix ϵ(x) to be a small constant, e.g.0.1.

B. DNN represented PDF
To train the PDFs, we re-write the short distance factorization as, where the renormalized matrix elements hf δ (z, P z , µ) are directly connected to the x-dependent PDFs δq f (x, µ), and C δ (α, µ 2 z 2 ) can be determined from the Wilson coefficients C δ n (µ 2 z 2 ) [46,100].In this section we use the NLO fixed-order Wilson coefficients.In our case, the real and imaginary parts of the reduced pseudo-ITD M f f ′ δ (λ, z 2 , P 0 z = 0) are related to δq f,− (x) and δq f,+ (x), defined as in the region x ∈ [0, 1] and where δq f (x) and δq f (x) are the quark and anti-quark transversity distributions of flavor f , respectively.However, as observed in the literature [54,55], with the current lattice accuracy, the anti-quark distributions are mostly consistent with zero.We therefore ignore the anti-quark contribution and fit the real and imaginary parts together to δq f (x; α, β, θ) = δq f,− (x) = δq f,+ (x).We use the matrix elements in the range z ∈ [2a, z max ] for the parameter training, skipping z = a in order to avoid the most serious discretization effects.In Fig. 7, we show the fit results for z max = 10a with ϵ = 0 and ϵ = 0.1 which both lead to a good description of the data.The corresponding PDFs are shown in Fig. 8, and the results from ϵ = 0.1 exhibit slightly larger errors but mostly overlap with the ϵ = 0 case.It is evident that the effects of the DNN were minimal which is likely a result of the limited statistics.We anticipate the DNN playing a more significant role when more precise data becomes available.In what follows, we use the results with ϵ = 0.1.The short distance factorization could suffer from power corrections at large values of z 2 .To check this, we vary the z max used to train the PDFs to investigate such systematic errors.As shown in Fig. 9, by slightly increasing z max , the results do not change significantly within the large errors, suggesting that higher-twist effects are less important compared to the statistics of our data.For comparison, we also show the most recent global analysis results from JAM3D-22 [28], and overall agreement is observed.

VI. x-SPACE MATCHING
We now move on to our final method for extracting information on the transversity PDF.This method utilizes LaMET to match the quasi-PDF -determined from the Fourier transform of hybrid-renormalized matrix elements -to the light-cone PDF.

A. Hybrid renormalization
It is well known that the bare matrix elements can be multiplicatively renormalized by removing the linear divergence originating from the Wilson line self energy and the overall logarithmic divergence For comparison, we also show the global analysis results from JAM3D-22 [28].
where hf δ is the renormalized matrix element, δm(a) contains the Wilson-line self-energy linear UV divergences, Z T (a) contains the logarithmic UV divergences, and m0 is used to fix the scheme dependence present in δm(a).The Wilson-line self-energy divergence term δm(a) can be extracted from physical matrix elements, like those involving Wilson loops.Here we use the value aδm(a) = 0.1597 (16) determined from the static quark-antiquark potential taken from Refs [103][104][105][106][107]. The scheme dependence in δm(a) can be attributed to a renormalon ambiguity, but can be fixed to a particular scheme by appropriate determination of m0 [61,63], and here we choose the MS scheme.Our strategy for determining m0 is to compare the P z = 0 bare matrix elements h f δ (z, P z = 0) to the Wilson coefficient C δ 0 (µ 2 z 2 ) computed in the MS scheme In order to remove Z T (a) and hopefully cancel some of the discretization effects, we next divide ( 27) by itself with z shifted by one unit of the lattice spacing.Then, after rearranging, we arrive at .
(28) Before proceeding, we must first discuss the specifics of the Wilson coefficients used.
The renormalon ambiguity, by definition, is an artifact that arises from the summation prescription of the perturbative series in the QCD coupling α s .Therefore, we use the Wilson coefficients after leading renormalon resummation (LRR) given in Ref. [61] under the large-β 0 approximation by To be consistent with the known fixed-order Wilson coefficients at NLO, in practice, we use where the C δ 0,LRR,NLO is the NLO expansion of C δ 0,LRR and the fixed-order NLO Wilson coefficient is given by In addition, we can also resum the large logarithms ln µ 2 z 2 e 2γ E /4 by the renormalization group resummation (RGR) [108].Using these coefficients, the m0 determined using Eq. ( 28) are shown in Fig. 10 as a function of z.The bands of NLO+LRR come from the scale variation of µ in the Wilson coefficients by a factor of √ 2. When using RGR, the running coupling is evolved from the physical scale µ 0 = 2ke −γ E /z to the factorization scale µ [48,108].And we vary k ∈ [1/ √ 2, 1, √ 2] to estimate the scale uncertainty.It can be observed that the scale uncertainties in the RGR case are smaller at small z, benefiting from the resummation, while they become larger at large z as they become close to the Landau pole.In addition, plateaus can be observed after z ≥ 3a ∼ 0.228 fm when the discretization effects become negligible, though the uncertainty bands for the NLO+LRR+RGR case are larger with the running coupling when z > 0.25 fm.To avoid discretization effects at small z and the Landau pole at large z, we choose values at z = 3a which give m0 = 28(2) MeV and 129(2) MeV for NLO+LRR and NLO+LRR+RGR cases, respectively.In Fig. 11, we show the data points defined on the left-hand side of Eq. ( 28) using the computed matrix elements and δm(a), along with the ratios defined on the right-hand side of Eq. ( 28) using m0 chosen above and Wilson coefficients at NLO+LRR (orange bands) and NLO+LRR+RGR (red bands).
The hybrid scheme renormalized matrix elements are given by 0.9 3 , NLO+LRR FIG. 11: The ratio of Pz = 0 matrix elements (black points) defined in Eq. ( 28) are shown.The bands are inferred from the NLO+LRR and NLO+LRR+RGR Wilson coefficients respectively with scale variation.
with z s = 3a.In Fig. 12 we show the hybrid renormalized matrix elements for the isovector case (left panels) and isoscalar case (right panels) for momenta n z = 4, 6.
It can be seen that the large momentum matrix elements show a slow P z evolution and a good scaling in λ within the statistical errors, suggesting we have good convergence in momentum.

B. Extrapolation to large λ
Due to the finite extent of the lattice, one can only calculate the matrix elements up to some maximum λ max ≡ z max P max z .Further, the signal deteriorates as λ is increased.This poses a problem, as the matrix elements need to be Fourier transformed to obtain the quasi-PDF, and truncating the integral will lead to unphysical oscillations in the resulting quasi-PDF (see Ref. [101] for a discussion on this issue).Therefore, we choose to perform 1.0 0.5 0.0 1.0 0.5 0.0 FIG. 12: The (upper) real and (lower) imaginary parts of the renormalized matrix elements in the hybrid scheme for the (left) isovector and (right) isoscalar cobminations.
an extrapolation of the data to infinity before performing the Fourier transform.In practice, we estimate the Fourier transform with a discrete sum up to some value λ L = z L P z at which point an integral of the extrapolated function takes over.There are a few considerations when deciding upon an appropriate value for λ L .In this work, we choose a value in the region where either the signal is no longer reliable or the values of the matrix elements are nearly consistent with zero.As in our previous work in Ref. [62], the extrapolation itself is done by performing a fit in this region using the exponential decay model where the fit parameters are constrained by m eff > 0.1 GeV, A > 0, and d > 0. Using this constraint on m eff helps to ensure the extrapolation falls off at a reasonable rate and does not significantly change the results in the regions of x for which we trust the LaMET procedure.A detailed derivation which motivates the use of this model can be found in App.B of Ref. [63].Results of the extrapolation fits for the largest two momenta are shown in Fig. 13.

C. The quasi-PDF from a Fourier transform
The quasi-PDF is defined as the Fourier transform of the renormalized matrix elements Finally, we split the integrals up into two regions: i) 0 ≤ z ≤ z L where the integral is performed via a sum over the lattice data for hf δ (z, z s , P z , µ) and ii) z L < z < ∞ where the integral is performed using the resulting extrapolation for hf δ (z, z s , P z , µ) δ qf (y, z s , P z , µ) = where z re L and z im L are the values of z in which the extrapolation integral takes over for the real and imaginary parts of hf δ (z, z s , P z , µ), respectively, and N re/im ≡ z re/im L /a + 1.

D. Matching to the light-cone PDF
The final step in obtaining the light-cone PDF from the quasi-PDF is to match them perturbatively in α s (µ) as where C −1 δ ( x y , µ yPz , |y|λ s ) is the inverse of the perturbative matching kernel for the transversity distribution, and the notation ⊗ is used as short-hand for the integral.One caveat of this method is that the leading power corrections to the matching can be seen to be enhanced when x is near 0 or 1.Therefore, we must be careful to estimate the range in x in which these power corrections become significant and hence spoil the matching procedure.
To see how we obtain the the inverse matching kernel, we start with the perturbative expansion of the matching kernel itself where only the NLO Wilson coefficients are known, i.e. we have only C (1) δ ( x y , µ yPz , |y|λ s ) [50,51,100,109].Next, by imposing the definition of the inverse matching kernel we find As done in our previous work Ref. [62], we approximate the integration by defining the integral on a finite-length grid which can be represented via matrix multiplication with a matching matrix C δ xy to obtain the light-cone PDF at NLO as (41) where δy = 0.001 is the grid size used for the integration.
For the matching coefficients themselves, we also implement LRR and RGR, where the RGR involves running the coupling from the physical scale µ 0 with three choices of k ∈ [1/ √ 2, 1, √ 2] to µ = 2 GeV which allows for assessing the systematics due to scale variation.

E. Results
As a first check regarding the significance of the power corrections, we show the momentum dependence of the NLO light-cone PDFs for both the isovector and isoscalar flavor combinations using our largest two momenta in Fig. 14.There we see a relatively mild momentum dependence, which is expected given the observed momentum convergence in the renormalized matrix elements shown in Fig. 12.
Finally, using the largest available momentum of P z = 6 2π L , in Fig. 15 we show the quark and antiquark transversity distributions from the LaMET approach for both the isovector and isoscalar flavor combinations compared to the DNN results from the previous section and with the global analysis from JAM3D-22 [28] and Radici, Bacchetta [23] which are both performed at LO.
There are a few things to note about these results.First, recall that the global analysis and our DNN results both assume the anti-quark distribution to be zero.Our x-space matching results in this section favor this assumption, at least when using an NLO matching kernel.Further, recall that power corrections in the light-cone matching lead to a breakdown of the formalism when x ∼ 0, 1.However, the RGR results also breakdown at small x, as seen by the onset of oscillations near x ∼ 0.2, already giving a natural boundary for where we no longer trust the results.

VII. CONCLUSIONS
In this paper we have presented various extractions of the transversity isovector and isoscalar quark PDFs, and their lowest few moments, of the proton from lattice QCD using a physical pion mass.This work is a continuation towards the ultimate goal of uncovering the full structure of the proton from first principles.Additionally, the matrix elements needed in this work also allow an esti-mate of the tensor charge g T to be extracted, and our results show reasonable agreement with other lattice extractions, as shown in Tab.II.However, our calculations are performed at a single value of the lattice spacing, and for the isoscalar case we neglected the disconnected diagrams.
Regarding the transversity isovector and isoinglet PDFs, in our first method, we utilized the leading-twist OPE expansion of the reduced pseudo-ITD to extract the first few Mellin moments.We found excellent agreement with the global analysis from JAM3D-22 for the lowest two moments and minor tensions for the next two moments.Higher moments could not be reliably extracted.Next, we used the pseudo-PDF approach, based on shortdistance factorization, to extract an x-dependent PDF and utilized a deep neural network to overcome the inverse problem while remaining as unbiased as possible.We saw some mild tension with the results from JAM3D-22 for a few small ranges of x but otherwise mostly saw agreement.Finally, we used the quasi-PDF approach, L .We also include comparisons with the DNN method of the previous section and with the global analysis from JAM3D-22 [28] and Radici, Bacchetta [23] which are both performed at LO.The darker bands for the NLO+LRR+RGR results are the statistical errors when setting k = 1 and the lighter bands are the additional systematic errors associated with scale variations by additionally using k = 1/ √ 2 and k = √ 2.
based on LaMET, to calculate the x-dependence PDF from hybrid-scheme renormalized matrix elements.For this we found reasonably good agreement with JAM3D-22 in the moderate region of x, but there is significant tension with the results from Radici, Bacchetta.
A number of systematics are being ignored here and are left for future work.These include the use of a single lattice spacing, NNLO corrections in α s , power corrections from the use of finite momentum, and isoscalar disconnected diagrams.We can address the expected significance of these systematics, and we have good reason to expect their effects to be rather small.Regarding the NNLO corrections, we saw in Fig. 15 the NLO corrections were relatively mild in the middle x regions, suggesting that NNLO corrections will be quite small as seen for the unpolarized proton distribution in our previous work [62].And, for the disconnected diagrams, we discussed earlier the expectation that the effects of these diagrams for lo-cal operator matrix elements would be smaller than the statistical error based on the study done in Refs.[68,76].Further, in our study of the unpolarized proton distribution [62], we saw no evidence of convergence in the momentum used.However, as seen in Fig. 14 The computation of the correlators was carried out with the Qlua software suite [72], which utilized the multigrid solver in QUDA [73,74].The analysis of the correlation functions was done with the use of lsqfit [110] and gvar [111].Several of the plots were created with Matplotlib [112].
Appendix A: Renormalization constant ZT in RI-MOM scheme Here we discuss the extraction of the renormalization constant Z T in the RI-MOM scheme and its subsequent conversion to the MS scheme at the scale µ = 2 GeV.The method starts by calculating matrix elements between off-shell quark states with lattice momenta where L µ is the size of the lattice in the µth direction, n µ ∈ Z, and µ = 0 is the temporal direction.These matrix elements are computed in the Landau gauge.The renormalization point is given by (ap R ) 2 ≡ 3 µ=0 sin 2 (ap µ ), which is inspired by the lattice dispersion relation and helps to reduce discretization errors.Our results for Z T are shown in Fig. 16.There is a significant dependence on p 2 R caused by nonperturbative associated with condensates and discretization errors that can be clearly seen from the "fishbone" structure at large p 2 R .We had difficulty appropriately modeling these effects and instead chose to form ratios of Z T /Z V in an attempt to cancel them as much as possible, similar to what was done in Ref. [113].We then use the conversion factor from RI-MOM to MS for the tensor current at both three loop [114] and four loop [77] accuracy.The resulting renormalization factors are then in the MS scheme at the scale µ 2 = p 2 R , and thus we evolve them to the same scale using the evolution function computed at two loops in Ref. [78].We evolve Z T to the scale µ = 2 GeV, as this is a commonly used scale for reporting results of nucleon charges.The resulting ratio Z T /Z V after conversion to MS at µ = 2 GeV is then fit to where last two terms incorporate discretization effects.In order to remove bias from our choice of fit, we consider six different variations of this fit form, corresponding to setting various terms to zero.Specifically, we consider a linear form (i.e.D 2 = 0), a quadratic form (i.e.D 1 = 0), and a linear+quadratic form (i.e.D 1 ̸ = 0 and D 2 ̸ = 0).Then for each of these three, we also consider fits in which B is zero and non-zero.To further give variation to our fits, we use three ranges of the data.The first includes all but the smallest values of p 2 R , which is always left out.Then we consider removing more of the small p 2 R data, and finally removing the largest p 2 R data.This gives a total of 18 fits we consider.To give a final estimate, we simply take an AIC average over all fits, giving Z T /Z V = 1.018 (18), MS(µ = 2 GeV), 3-loop, Z T /Z V = 1.007 (20), MS(µ = 2 GeV), 4-loop, (A3) using the three-loop and four-loop conversion to MS, respectively.The results of all these fits, using the four-loop conversion to MS, are shown in Fig. 17.This, rather conservative, method for estimating the systematic error is justified for this observable which is likely affected by large systematics. - result obtained by adding the bias term b (l) i to the sum of the weighted inputs from the previous layer, represented by W , N , are employed to indicate the individual layers, where N corresponds to the number of layers, representing the depth of the DNN.The parameters of the DNN, namely the biases b (l) i and weights W (l) ij , represented by θ, are subject to optimization (training) by minimizing the loss function defined as

FIG. 7 :
FIG. 7: The DNN training results using the (left) isovector and (right) isoscalar reduced pseudo-ITD matrix elements in the range z ∈ [2a, 10a] for the (upper) real part and (lower) imaginary part (lower panel) are shown.The results using ϵ = 0 and ϵ = 0.1 are shown as the solid and dotted curves, respectively.

1 FIG. 8 :
FIG. 8: The DNN represented PDFs using the matrix elements in the range z ∈ [2a, 10a] for the (upper) isovector and (lower) isoscalar cases are shown.The results with ϵ = 0 and ϵ = 0.1 are shown as the red and blue bands, respectively.

1 z
FIG.9:The DNN represented PDFs using the matrix elements in the range z ∈ [2a, zmax] for the (upper) isovector and (lower) isoscalar cases are shown.For comparison, we also show the global analysis results from JAM3D-22[28].

FIG. 10 :
FIG. 10: The m0 determined using NLO+LRR and NLO+LRR+RGR Wilson coefficients are shown.The bands come from the scale variation.

FIG. 13 :
FIG.13:The (upper) isovector and (lower) isoscalar hybrid renormalized matrix elements with (left) Pz = 4 2πL and (right)Pz = 6 2πL .The hatches show the range of data used for the fit to the extrapolation model and the bands are the result of that fit starting from λL.The hybrid renormalized data makes use of the NLO+LRR+RG Wilson coefficients computed at µ0 = 2e −γ E /z (i.e.k = 1) and subsequently evolved to µ = 2 GeV.

LFIG. 14 :
FIG.14:The Pz dependence for the (upper) isovector and (lower) isoscalar (left) quark and (right) antiquark NLO+LRR+RGR light-cone transversity PDF δq f (x, µ).The darker bands are the statistical errors when setting k = 1 and the lighter bands are the additional systematic errors associated with scale variations by additionally using k = 1/ √ 2 and k = √ 2.
FIG.15:The (upper) isovector and (lower) isoscalar (left) quark and (right) antiquark transversity PDFs at NLO+LRR and NLO+LRR+RGR using the LaMET framework with the largest momentum Pz = 6 2π L .We also include comparisons with the DNN method of the previous section and with the global analysis from JAM3D-22[28] and Radici, Bacchetta[23] which are both performed at LO.The darker bands for the NLO+LRR+RGR results are the statistical errors when setting k = 1 and the lighter bands are the additional systematic errors associated with scale variations by additionally using k = 1/ √ 2 and k = √ 2.

FIG. 16 :
FIG.16:The tensor current renormalization factor ZT as a function of the RI-MOM momentum pR.

FIG. 17 :
FIG. 17: Ratio of the tensor to vector current renormalization factors ZT /ZV as a function of the RI-MOM momentum pR.The conversion from RI-MOM to MS is done at four loops.The bands show the 18 different fits considered, all overlaid on top of one another.The AIC-averaged result final result for the ratio is given in the bottom right corner.

TABLE I :
[62]more important details on the ensemble and the statistics gathered for our calculation.The integer momentum nz of the nucleon and the corresponding integer boost momentum kz of the quarks are given.The sink-source separations used are given by tsep.And, finally, the number of samples used for the exact and sloppy solves is given by #ex and #sl, respectively.The asterisk indicates where extra samples were generated as compared to our previous work in Ref.[62].
which reduces the leading contamination from excited states.The bare ground-state matrix element can then , the convergence in momentum is much more convincing for the transversity distribution.Office of Nuclear Physics through Contract No. DE-SC0012704, Contract No. DE-AC02-06CH11357, and within the frameworks of Scientific Discovery through Advanced Computing (SciDAC) award Fundamental Nuclear Physics at the Exascale and Beyond and the Topical Collaboration in Nuclear Theory 3D quark-gluon structure of hadrons: mass, spin, and tomography.SS is supported by the National Science Foundation under CA-REER Award PHY-1847893 and by the RHIC Physics Fellow Program of the RIKEN BNL Research Center.YZ is partially supported by the 2023 Physical Sciences and Engineering (PSE) Early Investigator Named Award program at Argonne National Laboratory.This research used awards of computer time provided by: The INCITE program at Argonne Leadership Computing Facility, a DOE Office of Science User Facility operated under Contract No. DE-AC02-06CH11357; the ALCC program at the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725; the Delta system at the National Center for Supercomputing Applications through allocation PHY210071 from the ACCESS program, which is supported by National Science Foundation grants #2138259, #2138286, #2138307, #2137603, and #2138296.Computations for this work were carried out in part on facilities of the USQCD Collaboration, which are funded by the Office of Science of the U.S. Department of Energy.Part of the data analysis are carried out on Swing, a high-performance computing cluster operated by the Laboratory Computing Resource Center at Argonne National Laboratory.
. 18: The real parts of the ratio of local (left) isovector and (right) isoscalar three-point to two-point functions for Pz = 0.The χ 2 /dof reported, estimate for the ground-state bare matrix element (also represented by a gray band), and fit bands come from the preferred fit strategy, i.e. the two-state fit to the ratio R δ with nexc = 2, where nexc is the number of data points nearest both the source and sink that are not included in the fit.The range in tins of the tsep bands covers the included data points in the fit.104032 (2023), 2211.15746. FIG