Complete flavor decomposition of the spin and momentum fraction of the proton using lattice QCD simulations at physical pion mass

We evaluate the gluon and quark contributions to the spin of the proton using an ensemble of gauge configuration generated at physical pion mass. We compute all valence and sea quark contributions to high accuracy. We perform a non-perturbative renormalization for both quark and gluon matrix elements. We find that the contribution of the up, down, strange and charm quarks to the proton intrinsic spin is $\frac{1}{2}\sum_{q=u,d,s,c}\Delta\Sigma^{q^+}=0.191(15)$ and to the total spin $\sum_{q=u,d,s,c}J^{q^+}=0.285(45)$. The gluon contribution to the spin is $J^g=0.187(46)$ yielding $J=J^q+J^g=0.473(71)$ confirming the spin sum. The momentum fraction carried by quarks in the proton is found to be $0.618(60)$ and by gluons $0.427(92)$, the sum of which gives $1.045(118)$ confirming the momentum sum rule. All scale and scheme dependent quantities are given in the $\mathrm{ \overline{MS}}$ scheme at 2 GeV.


I. INTRODUCTION
The spin decomposition of the proton reveals important information about its non-perturbative structure. Since the proton is composed of quarks and gluons, it is expected that its spin arises from the intrinsic spin and orbital angular momentum of its constituents. The first attempts to measure the proton spin were performed at SLAC in the E80 [1,2] and E130 [3,4] series of experiments. The successful quark model that describes well properties of the low-lying hadrons predicted that all the spin is carried by the three valence quarks. The first major surprise came from the measurements of the European Muon Collaboration (EMC) [5,6] that determine the proton spin-dependent structure function down to x = 0.01. Their conclusion was that only about half of the proton spin is carried by the valence quarks. This came to be known as the proton spin puzzle. It triggered a series of precise measurements by the Spin Muon Collaboration (SMC) in 1992-1996 [7] and by COMPASS [8] since 2002. For a review on these experiments and related ones see Ref. [9]. Recent experiments using polarized deep inelastic lepton-nucleon scattering (DIS) processes indeed confirmed that only about 25-30% [10][11][12][13][14][15] of the nucleon spin comes from the valence quarks. These experiments also suggest a strange quark contribution to the intrinsic spin, ∆Σ s + . Phenomenological analyses point to a negative value but the error is large, giving values of ∆Σ s + ranging from −0.120 (81) [10,14,16,17] to −0.026 (22) [18]. We use the shorthand notation q + = q +q to denote the sum from quark and antiquark contributions to the intrinsic spin and momentum fraction. Results from inclusive DIS experiments have, however, small sensitivity to the gluon helicity ∆g. In contrast, polarized protonproton collisions, in particular jet or hadron production at high transverse momentum available from the Relativistic Heavy Ion Collider (RHIC) [19][20][21] at BNL provide tighter constraints on 0.2 0.05 ∆g(x)dx = 0.005 +0.129 −0.164 . Despite the tremendous progress in the determination of the gluon helicity, large uncertainties remain mostly in the small-x range. Thanks to its large kinematic reach in x and Q 2 , the planned Electron-Ion Collider(EIC) [22] will provide significantly more input to constrain ∆g.
While experiments play a crucial role in the understanding of the sources of the proton spin, they need to be complemented by phenomenological analyses, which involve model dependence and parameterizations. Lattice QCD (LQCD), on the other hand, provides the initio non-perturbative framework that is suitable to address the key questions of how the nucleon spin and momen-tum is distributed among its constituents using directly the QCD Lagrangian. Tremendous progress has been made in simulating lattice QCD in recent years. State-ofthe-art simulations are being performed with dynamical up, down and strange quarks with mass tuned to their physical values (referred to as the physical point). A subset of simulations also include a dynamical charm quark with mass fixed to approximately its physical value. This progress was made possible using efficient algorithms and in particular multigrid solvers [23] that were developed for twisted mass fermions [24].
A number of recent lattice QCD studies were carried out to extract the intrinsic spin carried by each quark flavor. They include previous works by the Extended Twisted Mass Collaboration (ETMC) [25][26][27], by PNDME [28] and by χQCD [29]. First attempts to compute the gluon average momentum fraction were carried out by the pioneering work of the QCDSF collaboration [30,31] in the quenched approximations. Results on the gluon momentum fraction using dynamical gauge field configurations appeared only recently. They used mostly simulations with larger than physical pion mass relying on chiral extrapolations to obtain final results [32][33][34]. A first attempt to fully decompose the nucleon spin was carried out by χQCD [35] in the quenched approximation, followed by a study of the gluon spin [36] using 2 + 1 dynamical fermions on four lattice spacings and four volumes including an ensemble with physical values for the quark masses. ETMC was the first to compute the gluon momentum fraction directly at the physical point without the need of a chiral extrapolation [26]. The latter is a significant progress, since chiral extrapolations in the nucleon sector introduce uncontrolled systematic errors.
In this study we will provide the complete quark flavor decomposition of the proton spin. This requires the computation of both valence and sea quark contributions. It also includes the computation of the gluon contributions to the spin and momentum fraction of the proton. In order to evaluate the quark loop contributions that are computationally very demanding, we apply improved techniques that are developed and implemented on graphics cards (GPUs) [37], as well as noise reduction techniques [38,39]. This work updates our previous results on the proton spin presented in Ref. [26] in several respects: i) While Ref. [26] used an ensemble of twisted mass fermions generated with two degenerate light quarks (N f = 2) [40], we here use an ensemble of twisted mass fermions [41,42] that includes, besides the light quarks, the strange and the charm quarks all with masses fixed to their physical values (N f = 2 + 1 + 1); ii) we perform a more elaborated analysis of excited state contributions; iii) we use larger statistics; iv) we compute the gluon contribution to the proton spin taking into account the generalized form factor B 20 (0); and v) we use non-perturbative renormalization not only for the quark operators but also for the gluon operator.
The remainder of this paper is organized as follows: In Section II we provide the theoretical basis for the nucleon spin decomposition [43]. Sections III, IV describe the methodology to extract the nucleon bare matrix elements needed, while Section V explains the renormalization procedure and the conversion to the MS scheme.
Our final results are discussed in Section VI and compared with other studies in Section VII. Finally, in Section VIII we summarize our findings and conclude.

II. NUCLEON SPIN DECOMPOSITION
A key object for the study of the spin decomposition is the QCD energy-momentum tensor (EMT) T µν . The symmetric part of the EMT can be separated [44] into two terms, the traceless term, denoted byT µν , and the trace termT µν as Only the traceless part is relevant for this study. Keeping only the gauge-invariant parts ofT µν , this can be expressed in terms of the gluon partT µν;g and the quark partT µν;q asT µν =T µν;g +T µν;q . whereT where F µν is the gluon field-strength tensor and the notation {· · · } means symmetrization over the indices in the parenthesis and subtraction of the trace. The symmetrized covariant derivative ← → D is defined as ← → D = ( ← − D + − → D)/2. The angular momentum density M 0ij can be written in terms of the EMT as M αµν =T αν x µ −T αµ x ν (5) and the i-th component of the angular momentum operator as The first term in Eq. (8) is the quark intrinsic spin operator and the second term is the orbital angular momentum. Putting gluon and quark operators together we have that This is the so called Ji decomposition [43] which does not allow to decompose J g any further in a gauge invariant manner. Jaffe and Manohar suggested a non gaugeinvariant way to decompose further [46] the gluon angular momentum, with the issue of the gauge-invariance being addressed in Ref. [47]. In this work we use Ji's decomposition [43] and thus compute the total gluon angular momentum J g . In order to compute the nucleon spin, we need to evaluate the nucleon matrix elements of the EMT. They can be decomposed into three generalized form factors (GFFs) in Minkowski space as follows [45] N (p , s )|T µν;q,g |N (p, s) =ū N (p , s ) where u N the nucleon spinor with initial (final) momentum p(p ) and spin s(s ), P = (p + p)/2 is the total momentum and q = p − p the momentum transfer. A 20 (q 2 ), B 20 (q 2 ) and C 20 (q 2 ) are the three GFFs. In the forward limit, A q,g 20 (0) gives the quark and gluon average momentum fraction x q,g . Summing over all quark and gluon contributions gives the momentum sum x q + x g = 1.
As shown in Ref. [43] the nucleon spin can be written in terms of A 20 and B 20 in the forward limit as where we consider a reference spin axis. The spin sum J = 1 2 together with the momentum sum are satisfied if B q 20 (0) + B g 20 (0) = 0. Although A q,g 20 (0) and thus the average momentum fractions are extracted from the nucleon matrix element directly at zero momentum transfer, B q,g 20 (0) can only be computed at non-zero momentum transfer requiring an extrapolation to Q 2 = 0.
Since we have a direct way to compute the quark contribution J q and the intrinsic spin 1 2 ∆Σ q we can determine the quark orbital angular momentum by

A. Ensembles of gauge configurations
In Table I we give the parameters of the N f = 2 + 1 + 1 ensemble analyzed in this work denoted by cB211.072.64 [48]. For completeness we also list the parameters of the N f = 2 ensemble analyzed in our previous study [26], referred to as cA2.09. 48. In both cases the lattice spacing is determined using the mass of the nucleon [48][49][50][51].
The ensembles are produced using the Iwasaki [52] improved gauge action and the twisted mass fermion formulation [41,42]. A clover term [53] was added to stabilize the simulations. The twisted mass fermion formulation is very well suited for hadron structure providing an automatic O(a) improvement [42] with no need of improving the operators.  [48] and cA2.09.48 [40] ensembles. cSW is the value of the clover coefficient, β = 6/g where g is the coupling constant, N f is the number of dynamical quark flavors in the simulation, a is the lattice spacing, V the lattice volume in lattice units, mπ the pion mass, mN the nucleon mass, and L the spatial lattice length in physical units. For the parameters of the cA2.09.48 ensemble, the second error arises from the systematic error on the determination of the lattice spacing due to the extrapolation to the physical value of mπ [40]. For the cB211.072. 64

B. Construction of correlation functions
To compute the nucleon matrix elements one needs to evaluate two-and three-point functions in Euclidean space. To create states with the quantum numbers of the nucleon we use as interpolating field where u(x), d(x) are the up, down quark fields and C is the charge conjugation matrix. The interpolating field in Eq. (13) does not only create the nucleon state but also excited states with the quantum numbers of the nucleon, including multi-particle states. In order to increase the overlap of the interpolating field with the ground state we employ Gaussian smearing [54,55] on the quark fields as well as APE smearing [56] on the gauge links entering the hopping matrix of the smearing function. For more details about how we tune these smearing parameters are given in Refs. [49,51]. The nucleon two-point function is given by where x 0 is the initial lattice site at which states with the quantum numbers of the nucleon are created, referred to as source position and x s is the site where they are annihilated, referred to as sink. An appropriate operator O µν probes the quarks and gluons within the nucleon at a lattice site x ins referred to as insertion point. The resulting three-point function is given by The operator O µν may represent the EMT with two Lorentz indices or the helicity operator with one Lorentz index. The Euclidean momentum transfer squared is given Q 2 = −(p − p) 2 and Γ ρ is the projector acting on the spin indices. We consider Γ 0 = 1 2 (1 + γ 0 ) and Γ k = iΓ 0 γ 5 γ k taking the non-relativistic representation of γ µ .

C. Analysis of correlation functions to extract the nucleon matrix elements
The information about the desired nucleon matrix element is contained in the three-point correlation function of Eq. (15). In order to extract it, we construct appropriate combinations of three-to two-point functions, which in the large Euclidean time limit, cancel the time dependence arising from the time propagation and the overlap terms between the interpolating field and the nucleon state. An optimal choice that benefits from correlations is the ratio [57][58][59][60].
The sink and insertion time separations t s and t ins are taken relative to the source. In the ratio of Eq.(16), taking the limits (t s − t ins ) a and t ins a, with a the lattice spacing, the nucleon state dominates. When this happens, the ratio becomes independent of time and yields the desired nucleon matrix element. In practice, (t s − t ins ) and t ins cannot be taken arbitrarily large, since the signal-to-noise ratio decays exponentially with the sink-source time separation. Therefore, one needs to take (t s − t ins ) and t ins large enough so that the nucleon state dominates in the ratio. To identify when this happens is a delicate process. We use three methods to check for convergence to the nucleon state as summarized below.
Plateau method: The ratio of Eq. (16) can be written as where the first term is time-independent and contributions from excited states are exponentially suppressed. In Eq. (18) ∆E is the energy gap between the nucleon state and the first excited state. In order to extract the nucleon matrix element of the operator of interest, we seek to identify nucleon state dominance by looking for a range of values of t ins for which the ratio of Eq. (16) is time-independent (plateau region). We fit the ratio to a constant within the plateau region and seek to see convergence in the extracted fit values as we increase t s . If such a convergence can be demonstrated, then the desired nucleon matrix element can be extracted.
Summation method: One can sum over t ins the ratio of Eq. (16) [61,62] to obtain Assuming the nucleon state dominates, Π µν (Γ ρ ; p , p) is extracted from the slope of a linear fit with respect to t s . As in the case of the plateau method, we probe convergence by increasing the lower value of t s , denoted by t low s used in the linear fit, until the resulting value converges. While both plateau and summation methods assume that the ground state dominates, the exponential suppression of excited states in the summation is faster and approximately corresponds to using twice the sink-source time separation t s in the plateau method.
Two-state fit method: In this method we explicitly include the contributions from the first excited state. We thus expand the two-and three-point function correlators entering in the ratio of Eq.(16) to obtain C( p, t s ) = c 0 ( p)e −E0( p)ts + c 1 ( p)e −E1( p)ts + · · · , (20) and C µν (Γ ρ , p , p, t s , t ins ) = where c 0 ( p) and c 1 ( p) are the overlaps of the ground and first excited state with the interpolating field and E 0 ( p) and E 1 ( p) the corresponding energies. The parameters A µν i,j , are the matrix elements of the i, j states multiplied by the corresponding overlap terms. Note that A µν 0,1 = A µν 1,0 for non-zero momentum transfer. Our procedure to determine these parameters is as follows: we first fit the effective mass using the two-point function of Eq. (20) at p = 0 and finite momentum p to extract the nucleon mass m N , Fig. 1 we compare the energy E 0 (p) extracted directly from the finite momentum two-point function and the dispersion relation. As can be seen, the dispersion relation is well satisfied and holds for all the momenta that are used in this study. In- serting the expressions of Eqs. (20) and (21) in Eq. (16) and using E 0 , ∆E and c 1 /c 0 extracted from the twopoint correlators we fit the resulting ratio to extract the remaining four parameters, M µν 0,0 ≡ A µν 0,0 / c 0 ( p )c 0 ( p), A µν 0,1 /A µν 0,0 , A µν 1,0 /A µν 0,0 and A µν 1,1 /A µν 0,0 . The first parameter M µν 0,0 is the desired ground state matrix element and the rest are excited states contributions. In the case of zero momentum transfer, A µν 1,0 = A µν 0,1 and we only have three parameters to determine.
For all the three methods we minimize χ 2 defined as where C is the covariance matrix, r = y − f (p, x) the residual vector between our data y and the model function f (p, x) and p a vector holding the parameters of the fit.
After determining the nucleon matrix element we can extract the charges, moments and GFFs. For zero momentum transfer we have in the case of the helicity operator. The average momentum fraction can be obtained from the matrix element of the one-derivative vector operator at zero momentum transfer, where in the second expression the nucleon is boosted in the i th direction with momentum p i . As already discussed in connection to Eq. (10), B 20 (0) cannot be extracted directly. We thus compute B 20 (Q 2 ) as a function of Q 2 and extrapolate to zero Q 2 . More details about the procedure to extract B 20 (Q 2 ) will be given in Sec. IV B.

D. Connected and disconnected three-point functions and statistics
The three-point function, defined in Eq. (15), receives contribution from two types of diagrams: one when the operator couples directly to a valence quark, known as the connected contribution and one when the operator couples to a sea quark resulting into a quark loop, known as the disconnected contribution. The gluon operator as defined in Eq. (3), produces a closed gluon loop and thus a disconnected contribution. To evaluate the connected contributions we use standard techniques that involve the computation of the sequential propagator through the sink. In this approach the sink-source time separation, the projector and the momentum at the sink p are kept fixed. We perform the computation of the connected three-point functions fixing the sink momentum to zero, i.e we set p = 0. We then compute the sequential propagator for both the unpolarized and polarized projectors Γ 0 and Γ k , k = 1, 2, 3, respectively.
In total we analyze 750 configurations separated by 4 trajectories. We use seven values of the sink-source time separation t s ranging from 0.64 fm to 1.60 fm. In order to keep the signal-to-noise ratio approximately constant we increase the number of source positions as we increase t s .
The statistics used for each value of t s are given in Table II. A range of t s and the increasingly larger statistics for larger t s allow us to better check excited state effects and to thus reliably extract the nucleon matrix elements of interest. The disconnected contribution involves the disconnected quark loop correlated with the nucleon two-point correlator. The disconnected quark loop is given by where D −1 (x ins ; x ins ) is the quark propagator that starts and ends at the same point x ins and G is an appropriately chosen γ-structure. For the helicity operator we use γγ 5 and for the quark part of EMT γ µ ← → D ν . A direct computation of the quark loops would need inversions from all spatial points on the lattice, making the evaluation unfeasible for our lattice size. We, therefore, employ stochastic techniques combined with dilution schemes [63] that take into account the sparsity of the Dirac operator and its decay properties. Namely, we employ the hierarchical probing technique [38], which provides a partitioning scheme that eliminates contributions from neighboring points in the trace of Eq. (26) up to a certain coloring distance 2 k . Using Hadamard vectors as the basis vectors for the partitioning, one needs 2 d * (k−1)+1 vectors, where d=4 for a 4-dimensional partitioning. Note that the computational resources required are proportional to the number of Hadamard vectors, and therefore, in d=4 dimensions, increase 16-fold each time the probing distance 2 k doubles. Contributions entering from points beyond the probing distance are expected to be suppressed due to the exponential decay of the quark propagator and are treated with standard noise vectors that suppress all off-diagonal contributions by 1/ √ N r , i.e.
where N r is the size of the stochastic ensemble. Hierarchical probing has been employed with great success in previous studies for an ensemble with a pion mass of 317 MeV [64]. For simulations at the physical point, it is expected and confirmed [49] that a larger probing distance is required since the light quark propagator decays more slowly because of the smaller quark mass. We avoid the need of increasing the distance by combining hierarchical probing with deflation of the low modes [65]. Namely, for the light quarks we construct the low mode contribution to the quark loops by computing exactly the smallest eigenvalues and corresponding eigenvectors of the squared Dirac operator and combine them with the contribution from the remaining modes, which are estimated using hierarchical probing. Additionally, we employ the one-end trick [66], also used in our previous studies [26,27,37] and fully dilute in spin and color. For the calculation of the nucleon matrix element of the gluonic part of the EMT, we use the gluon field strength tensor with g the bare coupling constant. For the gauge links entering the field strength tensor we apply stout smearing [67] with parameter ρ = 0.129 [33]. As will be discussed in Sec. IV A, we investigate the signal-to-noise ratio as we increase the number of stout steps. While the evaluation of the gluon loop is computationally cheap because no inversions are needed, the calculation of the quark loops is very expensive. We use the same combination of methods for the calculation of the flavors of the quark loops except for deflation, which is only used for the light quark loops. The parameters used for the evaluation of the quark loops are collected in Table III. Two hundred low modes of the square Dirac operator are computed in order to reduce the stochastic noise in the computation of the light quark loops. For the charm quark we use a coloring distance 2 2 in hierarchical probing instead of 2 3 used for the light and the strange quark loop since charm quarks are relatively heavy. Instead we compute 12 stochastic vectors. We evaluate the nucleon two-point functions for two hundred randomly chosen source positions which sufficiently reduces the gauge noise for large enough sink-source time separations of the disconnected three-point functions. Since they are available we use the same number of two point functions for all source-sink time separations.
In summary, we perform in total 12,690,000 inversions for the connected and 16,272,000 for the disconnected contributions by employing the DD-αAMG solver and its QUDA version [24,68,69] to accelerate the inversions. Using GPUs and the DD-αAMG solver are essential to obtain the required statistics.

IV. BARE NUCLEON MATRIX ELEMENTS
As already discussed, for the decomposition of the nucleon spin we need the axial charges for each quark flavor, which give the quark helicities, the average momentum fractions and B 20 (0). The extraction of the axial charges or 1 2 ∆Σ q for the cB211.072.64 ensemble is presented in Ref. [25], while the evaluation of the isovector A 20 and B 20 in Ref. [51]. In this section we focus in the extraction of the remaining quantities needed for the full decomposition of the nucleon spin.

A. Average momentum fraction x
The average momentum fraction x is extracted directly from the nucleon matrix element at zero momentum transfer from the Eqs. (24), (25). In the case of the connected contribution, since we only have access to three-point functions with p = 0, we are restricted to use Eq. (24). In Fig. 2 we show the bare ratio which leads to the extraction of the connected contribution to the isoscalar x u + +d + B . The ratio for each t s has been constructed between three-and two-point functions with the same source positions to benefit from the correlations between numerator and denominator. One can easily observe a clear contamination from excited states at small time separations. In fact for t s < 1 fm no plateau is detected and the ratio clearly decreases as t s increases. We note that we exclude t ins /a = 0, 1, t s /a − 1, t s /a since they do not carry physical information. For t s 1.12 fm we fit within the plateau region discarding five points from left and right, thus t ins ∈ [2 + τ, t s − τ − 2] with τ /a = 5 for all t s values. This range is found to yield a good χ 2 /d.o.f..
In Fig. 3 we show the summed ratio of Eq. (19). As can be seen, a linear fit describes well the results.  The linear fits are shown by the blue bands as we increase, from top to bottom, the smallest value of ts used in the fit, t low s .
Our approach for the two-state fits has been discussed in Sec. III C. We extract m N , ∆E and the overlap ratio c 1 /c 0 using the full statistics of two-point function produced with 264 source positions. However, as discussed above, for the construction of the ratio we use the same source positions for three-and two-point functions. We fit the resulting ratios simultaneously for all values t s ≥ t low s . We vary t low s , to check convergence of the extracted nucleon matrix element. We show the resulting fits to the ratios in Fig. 4 for t low s /a = 12. Additionally, we plot in the middle panel the predicted time dependence of the ratio when fixing t ins = t s /2 for t low s /a = 12. In the same panel we include values extracted using the plateau method. For t s /a < 14 where no plateau could be identified we plot the midpoint t s /2. As can be seen, the two-state fit predicts well the residual time-dependence of values extracted using the plateau method. It also shows that the plateau values, even for t s = 1.6 fm, still have excited state contributions and convergence is not demonstrated. The two-state fit suggests that t s > 2 fm is needed to sufficiently suppress contributions from excited states. This would require t s ∼ 26a = 2.08 fm requiring an order of magnitude more statistics as compared to the statistics used for t s = 20a = 1.6 fm. However, the values extracted from the two state fits are consistent as we vary t low s . They also agree with the value extracted from the summation method when t low s = 16a = 1.28 fm is used in the fit, as shown in the right panel of Fig. 4. We thus take as our final determination of the connected bare x u + +d + B the value extracted from the two-state fit for t low s /a = 12 with The selected value is shown by the horizontal grey band spanning the whole range of Fig. 4. We find for the connected bare isoscalar momentum fraction  For completeness and easy reference we repeat the analysis following the same procedure as for the connected The results are shown in Fig. 5. As can been seen, the effect of excited states is similar to what is observed for the connected isoscalar case. The value determined from the two-state fit for t low s /a = 8 varies only very mildly as we increase t low s and it is in agreement with the value extracted from the summation method for t low s /a = 14. We thus select as our final value the one extracted from the two-state fit using t low s /a = 8, obtaining In Figs. 6 and 7 we present our results for the quark disconnected contributions to the isoscalar and strange average momentum fractions. Since for the disconnected contribution one can use a boosted nucleon without the need of additional inversions, one can extract it both from the diagonal part of the EMT as in Eq. (24) and from the non-diagonal as in Eq. (25). If we use the diagonal part of EMT, there is a large non-zero vacuum expectation value, which, although it cancels after the trace subtraction, it leads to large statistical fluctuations. In the case of Eq. (25), where the off-diagonal components enter, this problem does not arise. We thus boost the nucleon using the first non-zero momentum, namely p =n 2π/L witĥ n = (1, 0, 0) and all other permutations and we average over the three directions and two orientations to obtain a good signal-to-noise ratio as presented in Figs. 6 and 7.
Unlike the connected contributions, for both light disconnected and strange, the ratios show fast convergence, indicating that excited states are suppressed, within our statistical uncertainties. We thus perform a fit within the plateau range that includes t ins ∈ [3a, t s −3a]. The values extracted from the plateau fits converge to a constant and are consistent with the results extracted from the summation method. We take the weighted average of the converged plateau values, namely for the plateau values extracted for t s > 0.7 fm in both cases, to determine our final value. The summation method yields fully compatible results with the plateau method, which remain consistent as we increase the low fit point t low s in the range [6a, 9a] corroborating the fact that for these quantities excited states contamination is suppressed compared to the statistical error. We find for the disconnected contri-bution to the isoscalar average momentum fraction and x s + B = 0.038(10) (32) for the average momentum carried by strange quarks. We perform the same analysis for the charm quarks. We find which is compatible with zero. . In both cases we use stout smearing with nS = 10 steps. The notation is the same as that in Fig. 6. We show results for ts/a = 6, 8, 10, 12, 14, 16, 18 with blue circles, orange down triangles, up green triangles, left red triangles, right purple triangles, brown rhombus, and magenta crosses, respectively.
In Fig. 8 we present our analysis for the gluon average momentum fraction. For this case we employ stout smearing on the gauge links entering in the field strength tensor of Eq. (28) to improve the signal of the gluonic part of the EMT. We show the case where the number of stout steps n S is 10. We analyze both the diagonal and off-diagonal components of EMT given by Eqs. (24) and (25). When using the diagonal components, due to the subtraction of a large trace, large gauge fluctuations are observed. This is analogous to the quark disconnected contributions discussed above. Although the vacuum expectation value for the traceless part of the EMT, 0|T 44 g |0 , is compatible with zero, as expected by the subtraction of the trace, we find that subtracting it from the corresponding nucleon matrix element significantly improves the signal due to the correlation between the two terms. For the vacuum expectation value 0|T 4i g |0 we find that it is also compatible with zero but subtracting it from the nucleon matrix element does not improve the signal. Therefore, in this case, subtraction of the vacuum expectation value is not performed. The gluonic ratios using the diagonal and non-diagonal elements of EMT are shown in Fig. 8. For both cases the plateaus values, obtained by fitting in the range t ins /a ∈ [3, t s − 3] for each t s , show convergence and agreement with the results extracted using the summation method. Thus, we use the plateau values for t s 1 fm to perform a weighted average finding a value which is in agreement with the summation method. The values extracted when using the diagonal and off-diagonal EMT are in agreement. Given that the results using the off-diagonal elements EMT are more accurate our value for x g B is determined from the matrix element of the off-diagonal elements of EMT. We find for n S = 10 stout smearing steps. We note that, for disconnected quantities, where effects from excited sates are significantly milder, in combination with the larger statistical errors, two-state fits are not reliable and are therefore not presented. By performing the same analysis for different steps of stout smearing we can investigate the dependence on the smearing steps n S . In Fig. 9 we show the dependence of the extracted value of x g B on the number of stout smearing steps when using both the diagonal and offdiagonal elements of EMT. As can be seen, the errors decrease as n S increases and the values converge when n S 8. This means that the renormalization functions should also converge for n S 8, since the renormalized matrix element should be independent of the stout smearing. Details about the renormalization will be provided in Sec. V.
B. B20 at zero momentum transfer As already discussed in connection to Eq.(10), direct access to B 20 (0) is not possible due to the vanishing of the kinematical factor in front of B 20 (Q 2 ). Therefore, one needs to compute B 20 (Q 2 ) for finite Q 2 and extrapolate to Q 2 = 0 using a fit Ansatz. In order to accomplish this, one has to isolate from the other two GFFs appearing in the decomposition of Eq.(10). We thus need to compute the three-point function of the one-derivative vector operator for finite momentum transfer using both unpolarized and polarized projectors and for both diagonal and off-diagonal elements of the traceless EMT.
To isolate B 20 (Q 2 ) from A 20 (Q 2 ) and C 20 (Q 2 ) one has to first minimize where R is the ratio of Eq. (16) and w its statistical error. The kinematical coefficients G are defined in Appendix A.
The three form factors are the components of The time dependence t s , t ins appears due to contributions from excited states that will be analyzed using the methods discussed in Sec. III C. In the following discussion we suppress the time dependence for simplicity. As discussed, one can extract the form factors by minimizing the χ 2 in Eq. (35) or alternatively show that it is equivalent to wherẽ R µν (Γ ρ , p , p) ≡ [w µν (Γ ρ , p , p)] −1 R µν (Γ ρ , p , p), , p , p), and where we compute the Singular Value Decomposition (SVD) ofG in the last line. U is a hermitian N × N matrix with N being the number of combinations of µ, ν, ρ and components of p , p that contribute to the same Q 2 . V is a hermitian 3 × 3 matrix since we have three GFFs. Typically, N 3 for finite momenta. Σ is the pseudo-diagonal N × 3 matrix of the singular values of G. In the following we use the latter approach since no explicit minimization is needed.
After extracting B 20;B (Q 2 ; t s , t ins ) using the SVD method we investigate its dependence on t s and t ins following the same procedure as for the average momentum fraction. In Fig. 10 we present the analysis for the connected contribution to the bare isoscalar B u + +d +

20;B
for two representative values of the momentum transfer. As for the case of the connected contributions to the isoscalar average momentum fraction, we observe large effects from excited states. As t s increases the value changes from negative to positive. Two-state fits yield consistent results as we increase t low s . The value extracted from the two-state fit for t low s /a = 8 is in agreement with the value determined using the summation method for t low s ≥ 1.12 fm and we thus select it as our final value. In Fig. 11 we show results for the connected bare GFF B u + +d + 20;B (Q 2 ) as a function of Q 2 up to 1 GeV 2 . As can be seen, the Q 2 behavior is relatively flat for small values of Q 2 . In order to extrapolate to zero momentum we use a dipole form supported by the quark-soliton model in the large N climit for Q 2 ≤ 1 GeV 2 [70], where M the mass of the dipole. Since we are interested in fitting the small 1. In the zeroth approximation, Eq. (40) yields a constant and in the first approximation a linear function of Q 2 M 2 . In Fig. 11 we show the fits to both a constant and linear forms up to Q 2 = 0.4 GeV 2 . As can be seen, both constant and linear fits are in agreement, confirming that the Q 2 /M 2 is negligible. We take as our value the one extracted from the constant fit, to obtain for the connected isoscalar Following the same procedure we extract the disconnected light and strange quark contributions to B 20;B (Q 2 ). In Fig. 12 we show the results as a function of Q 2 . Although the results are rather noisy for both quark flavors we can fit the Q 2 -dependence to the form of Eq. (39). We find for disconnected isoscalar and for the strange B s + 20;B (0) = −0.017 (18).
The gluon GFF B g 20;B (Q 2 ) is shown in Fig. 13 as a function of Q 2 and the value extracted using a constant In Table IV we tabulate our values on the bare results for x and B 20 (0). Bare results for the momentum fraction x = A20(0) and B20(0) for the isovector, isoscalar connected, isoscalar disconnected, strange, charm, and gluon contributions. For details on the extraction of the isovector B u + −d + 20;B see Ref. [51].

V. RENORMALIZATION
One of the main ingredients of this work is the renormalization of the quark and gluon parts of the EMT of Eqs. (4) and (3). The renormalization of each part requires a dedicated calculation, and in this section we classify them in multiplicative renormalization functions (Z qq , Z gg ) and mixing coefficients (Z qg , Z gq ). The latter are needed to disentangle the quark and gluon momentum fractions from the bare matrix element of the operators of Eqs. (3) and (4). Therefore, a 2 × 2 mixing matrix needs to be constructed for the proper renor-malization procedure, that renormalize the momentum fractions given by In the above equations, x q + is understood to be the flavor singlet combination that sums the up, down, strange and charm quark contributions. The subscript R (B) represents the renormalized (bare) matrix elements. We note that a complete calculation of the 2 × 2 mixing matrix would require the solution of a system of four coupled renormalization conditions that involve vertex functions of both gluon and quark EMT operators. In our analysis we imposed decoupled renormalization conditions for the nonperturbative calculation of the diagonal elements, namely Z gg and Z qq , since the mixing coefficients Z gq and Z qg are small as shown in Ref. [33] using the oneloop lattice perturbation theory. Furthermore, note that gluon and quark EMT operators also mix with gaugevariant operators (BRS-variations and operators vanishing by the equations of motion), which have zero nucleon matrix elements and are not considered.
In the following subsections we present the nonperturbative renormalization of the quark EMT, the nonperturbative renormalization of the gluon EMT and our estimates for the mixing coefficients as extracted from a calculation within one-loop lattice perturbation theory [33].

A. Quark EMT renormalization
The quark EMT is renormalized non-perturbatively using an analysis within the Rome-Southampton scheme (RI scheme) [71]. This is a very convenient prescription for non-perturbative calculations, and is obtained by applying the conditions The trace in the above conditions is taken over spin and color indices. The momentum p of the vertex functions is set to the RI scale, µ 0 . Note that the vertex function Γ L qEMT is amputated in the above condition. Also, S Born (Γ Born qEMT ) is the tree-level value of the quark propagator (quark operator). We employ the momentum source method introduced in Ref. [72], which offers high statistical accuracy using a small number of gauge configurations as demonstrated for twisted mass fermions in Refs. [73][74][75]. Discretization effects and other systematic uncertainties in the renormalization functions (Z-factors) can be amplified based on the choice of the momentum. A way around this problem is the use of momenta with equal spatial components. The temporal component is then chosen such that the ratio P 4 ≡ i p 4 i /( i p 2 i ) 2 is less than 0.3 [76]. Such a ratio is relevant to Lorentz non-invariant contributions present in perturbative calculations of Green's functions beyond leading order in a [77]. Therefore, a large value of P 4 would indicate large finite-a effects in the non-perturbative estimates too. The momenta employed in this work for the quark EMT are of the form ap ≡ 2π n t T /a , n x L/a , n x L/a , n x L/a , n t ∈ [2,10], n x ∈[2, 5] , taking all combinations of n t and n x that satisfy P 4 < 0.3 and 1 ≤ (a p) 2 ≤ 7. T and L are the temporal and spatial extent of the lattice, and correspond to T /a = 48, L/a = 24 for the N f = 4 ensembles that are generated specifically for the renormalization program at the same coupling constant as the cB211.072.64 ensemble. An important aspect of our renormalization program is the improvement of the non-perturbative estimates by subtracting finite-a effects [75,78], calculated to one-loop in lattice perturbation theory and to all orders in the lattice spacing, O(g 2 a ∞ ). Note that the dimensionless quantity appearing in the perturbative expressions is ap (for massless fermions).
Z qq has two components depending on the indices of the operator defined in Eq. (4). Z qq1 corresponds to the quark EMT operator with µ = ν, while Z qq2 to µ = ν. Z qq1 and Z qq2 renormalize the bare matrix elements of the quark EMT operator obtained with the same constraints on the external indices. Thus, in our work we use Z qq1 for the connected contribution and Z qq2 for the disconnected ones, as described in Sec. IV A.
For the proper extraction of Z qq we use five N f = 4 ensembles at different pion masses reproducing a β value of 1.778 to match the N f = 2 + 1 + 1 ensemble on which the bare matrix elements have been calculated. The N f = 4 ensembles correspond to a pion mass that ranges between 350 and 520 MeV, allowing one to take the chiral limit. More details on the N f = 4 ensembles can be found in Ref. [51]. The chiral extrapolation is performed using a quadratic fit with respect to the pion mass of the form where Z RI andz RI depend on the scheme and the scale.
The pion mass dependence of Z RI qq1 is found to be very mild, as demonstrated in Fig. 14 where we show the data from the five ensembles for a representative renormalization scale (a µ 0 ) 2 = 2. Same conclusions hold for Z qq2 . Once the chiral extrapolation is performed, we apply the subtraction of artifacts calculated in one-loop lattice perturbation theory. This subtraction procedure leads to improved estimates, as it significantly reduces discretization effects. The next step is the conversion to the MSscheme, which is commonly used to compare to experimental and phenomenological values. The conversion procedure is applied on the Z-factors obtained on each initial RI scale (a µ 0 ), with a simultaneous evolution to a MS scale, chosen to be µ=2 GeV. Assuming absense of mixing, the conversion and evolution uses the intermediate Renormalization Group Invariant (RGI) scheme, which is scale independent and relates the Z-factors between the two schemes: Therefore, the appropriate factor to multiply Z RI qq is [96] .
(52) The quantity ∆Z S qq (µ 0 ) is expressed in terms of the βfunction and the anomalous dimension γ S qq ≡ γ S of the operator and may be expanded to all orders of the coupling constant. The expression for the quark EMT operator is known to three-loops in perturbation theory and can be found in Ref. [75] and references therein. The conversion and evolution is followed by a fit to eliminate the residual dependence on aµ 0 using the Ansatz For both Z qq1 and Z qq2 we distinguish between the singlet and the non-singlet case, which is necessary for the proper renormalization and the flavor decomposition presented in Sec. VI. Their difference is known to be very small as it first appears in two-loop perturbation theory [79]. Here, we calculate also the singlet renormalization function non-perturbatively, which requires both connected and disconnected contributions to the vertex functions.
In the upper panel of Fig. 15 we plot the nonsinglet values of Z MS qq1 (at 2 GeV) with and without the subtraction of the corresponding O(g 2 a ∞ ) contributions. We also include the singlet Z (s),MS qq1 , after subtraction of the O(g 2 a ∞ ) terms. We find that the singlet and nonsinglet renormalization functions are compatible within uncertainties, with the singlet being more noisy, due to the inclusion of the disconnected contributions. In the lower panel of Fig. 15 we show the corresponding quantities for Z MS qq2 . While the singlet one has large uncertainties in this case too, it is smaller than the statistical errors of Z (s),MS qq1 . In both cases we have subtracted the vacuum expectation value.
We find that the subtraction procedure improves significantly the data, leading to smaller dependence on the initial scale (a µ 0 ) 2 . As can be seen from the plot, the O(g 2 a ∞ )-terms capture a large part of the discretization effects. The subtraction of finite-a terms from the non-perturbative estimates of Z MS qq1 (as well as Z MS qq2 ) reduces the slope with respect to (a µ 0 ) 2 , between momenta with the same n x value and different n t . As an example, let us consider the class of momenta with n x = 3 ((a µ 0 ) 2 ∈ [2 − 3.1]). The fit of Eq. (54) as applied on the unimproved and improved data leads to z unsub qq1 = 0.0133(4) and z sub qq1 = 0.0017(4), respectively. As can be seen, the slope in the improved data reduces by an order of magnitude, making it negligible for the values of (a µ 0 ) 2 considered in this work.  The final estimates for Z qq1 and Z qq2 are determined using the fit interval (a µ 0 ) 2 [2 − 7], and we obtain the following values employing the subtracted data The numbers in the first and second parenthesis correspond to the statistical and systematic uncertainties, respectively. The source of systematic error is related to the (a µ 0 ) 2 →0 extrapolation and is obtained by varying the lower ((aµ 0 ) 2 = 2) and higher ((aµ 0 ) 2 = 7) fit ranges and taking the largest deviation as the systematic error. We emphasize that the procedure of improving the Zfactors utilizing lattice perturbation theory, has important implication on the spin and momentum decomposition: use of the unimproved Z-factors would underestimate both the intrinsic spin and the quark momentum fraction by 5%.

B. Gluon EMT renormalization
Similarly to the case of the quark EMT, we renormalize the gluon EMT non-perturbatively. This is a crucial improvement compared to our previous work [26,33] in which we used Z gg from one-loop perturbation theory. The renormalization condition for Z gg involves the gluon-field renormalization function Z g , which in the RI scheme reads In the above equations N c is the number of colors and ap j = 2 sin (a p j /2). The color factor (N 2 c − 1)/2 comes from the trace over color indices in the tree-level expressions. The gluon fields on the lattice are computed as and the gluon propagator in momentum space is given as A ρ (p)A ρ (−p) in the ρ direction. The numerator 3/p 2 of Eq. (57) is the tree-level expressions for the gluon propagator in the Landau gauge, in which the Lorentz indices have been set equal to each other and are summed over. Similarly, (2p 4pi )/(p 2 ) 2 in Eq. (58) is the non-amputated tree-level value corresponding to the gluon EMT in the Landau gauge. In this study we focus on theT 4i g case, since as shown in Fig. 9 it is significantly more precise.
This setup justifies the presence of Z −1 g in Eq. (58), instead of Z +1 g . For simplicity in the notation, the dependence of Z g and Z gg on the RI scale µ 0 is implied. Unlike the case of Z qq , here we use non-amputated vertex functions for the gluon EMT. Such a choice is desirable, as the 4×4 matrix of the gluon propagator in the Landau gauge is not invertible.
The definition of Z g given in Eq. (57) is convenient, as there is a sum over the Lorentz indices of the gluon fields. While a similar condition could be imposed on Z gg , we do not sum over ρ in Eq. (58). Instead we choose the index ρ to be different from the Lorentz indices of the operator (4 and i). This has the advantage that any mixing with other gluon operators [80] vanishes automatically, at least to one-loop level.
We also explore an alternative definition of Z gg as proposed in Ref. [34]. In such condition for Z g there is no summation over ρ, which is set equal to the ρ index of the external gluon fields in Eq. (58), and has the constraint p ρ = 0. Therefore, it is convenient to eliminate Z g from Eq. (58), obtaining One major difference in the calculation of Z gg as compared to Z qq is the need to reduce the high noiseto-signal ratio appearing in the calculation of gluonic quantities. To this end, some equivalent renormalization prescriptions have been proposed to reduce the statistical uncertainties. In the discussion that follows we will investigate three methods for the extraction of Z gg . Note that for zero stout steps, n S = 0, all these methods reduce to the same equation.
Method 1: Application of stout smearing only on the operatorT 4i g in Eq. (60), while the external gluon fields remain unsmeared.
Method 2: Application of stout smearing in both the operator and the external gluon fields of Eq. (60) as suggested in Ref. [32]. Since our action is not smeared one would need to apply reweighting in the calculation of both the Z-factors and matrix elements. We follow Ref. [32] and assume that its effect is negligible on the renormalization function.

Method 3:
A generalization of Method 1 as suggested in Ref. [81], in which we multiply Eq. (60) by the ratio where Presence of an index s implies stout smearing with n S steps. The multiplication of Eq. (60) by Eq. (61) leads to the same Z gg in the (a µ 0 ) 2 → 0 limit, as R((aµ 0 ) 2 ) → 1 when the above limit is taken. The same number of smearing steps are applied on the links entering the operator T 4i g and the gluon fields.
The vertex functions entering Eq. (60) are calculated on one N f = 4 ensemble with β = 1.778 and volume 12 3 ×24 with a pion mass of 350 MeV. While the Z-factors are defined in the massless limit, the use of a single ensemble is sufficient given the negligible pion mass dependence observed for the quark case shown in Fig. 14, where the results are very precise. Focusing on a single ensemble allows one to reach higher statistics. As a purely gluonic quantity it is susceptible to large gauge fluctuations and therefore about 31,000 configurations are analyzed to reduce the noise.
In contrast to the quark case, the momenta for the vertex functions cannot have the same spatial components, due to the constraint p ρ = 0 in the renormalization condition. Consequently, the value of P 4 is larger than 0.35 for such momenta. We calculate Z gg for momenta satisfying P 4 < 0.4, and within the range 1 ≤ (a µ 0 ) 2 ≤ 4. The conversion factor is calculated to one loop in Dimensional Regularization using the main results of Ref. [33]. Note that the conversion factor must be obtained for nonamputated Green's functions, to match the scheme of Eq. (60). In the Landau gauge we find which must multiply Z RI gg . In the expression above,μ is the renormalization scale in the MS scheme and µ 0 in the RI scheme, as defined previously. This conversion factor is consistent with the expression of Ref. [81], with the only difference a global minus sign in the one-loop expression, due to a different definition.
We obtain Z gg in the MS scheme evolving the values from every scale (a µ 0 ), and therefore, we must eliminate any residual dependence on (a µ 0 ) by taking the limit (a µ 0 ) 2 → 0. While such a fit is linear in Z qq where democratic momenta can be used, here the residual dependence on (a µ 0 ) 2 may be polynomial. We identify two sources for this behavior: i. Truncation effects in the conversion factor C gg , which is only known to one-loop order; ii. finite-a effects due to p ρ = 0, making it unreliable to go to high (a µ 0 ) 2 values. The latter effect can be reduced by a similar subtraction of finite-a effects as in the case of Z qq . Since these have not yet been calculated we cannot make the subtraction in the current data.

Method 1
The most straightforward and conceptually concrete method to extract Z gg is to use Eq. (60) with stout smearing applied only on the gauge links of the oper-atorT 4i g . In Fig. 16 we show (Z MS gg2 ) −1 as a function of the initial scale squared for various smearing steps. As we increase the number of stout steps, the (a µ 0 ) 2 → 0 fit requires a higher degree polynomial to capture the proper (a µ 0 ) 2 dependence since more smearing alters the discretization effects between the numerator and denominator in Eq. (60). For the examples shown in Fig. 16 we use a polynomial of second degree with respect to (a µ 0 ) 2 for three and five steps, and third degree for 10 steps of stout smearing. We observe that the value of (Z MS gg2 ) −1 increases (Z MS gg2 decreases) with the stout steps, which is expected, as the value of the bare matrix elements increases with the stout steps. As will be discussed later, we find that the renormalized matrix element is independent of the number of stout steps (see Fig. 21).

Method 2
The difference between method 1 and method 2 is the use of stout smearing on the gauge links used to construct the gluon fields entering Eq. (60). As already mentioned, this would need reweighting. This was assumed to be negligible in Ref. [32] and we also neglect it here. In Fig. 17 we show Z MS gg from method 2 for selected number of stout steps including zero steps, the same for all three methods. Without smearing there is no noticeable dependence on the scale (a µ 0 ) 2 allowing us to fit to a constant while as increasing the number of steps the dependence becomes linear. We note that smearing also the gluon field, provides a better correlation with the operator for higher momenta allowing us to investigate up to (a µ 0 ) 2 = 7. It is worth mentioning that while there is a big jump on the extrapolated value between n S = 0 and n S = 5, between n S = 5 and n S = 10 the difference is relatively small. FIG. 17: Z MS gg2 using method 2 with the same notation as in Fig. 16. From top to bottom we show results for nS = (0, 5, 10).

Method 3
This method is more involved since one has to compute first Eq. (62) and extrapolate to (aµ 0 ) 2 → 0. In Fig. 18 we show the ratio of smeared to unsmeared gluon propagators. Note that this ratio does not alter the conversion factor for Z gg . As we increase the number of smearing steps the discretization effects between the numerator and the denominator change, not canceling in the ratio leading to a more curved behavior. We fit the results up to a forth order polynomial since the gluon propagators alone are very precise.
In Fig. 19 we show the Z MS gg when the Eq. (61) multiplies Eq. (60). The resulting behavior is fitted to a constant since most of the systematics are cancelled when multiplying with Eq. (61).
It is interesting to compare the final extrapolated values of Z MS gg2 among the three methods. The results from each method should agree since they renormalize the same bare matrix elements. Such a comparison will give an indication of additional discretization effects, which might remain after the (a µ 0 ) 2 → 0 extrapolation, as well as on the assumption that reweighting can be neglected in method 2. The final estimates are plotted in Fig. 20 as a function of the stout steps, n S . We find that all three methods are overall compatible as a function of number of stout smearing steps. It is worth mentioning that the Z MS gg2 has a strong dependence on n S going from zero steps up to five steps, whereas increasing further the steps the Z-factor does not change significantly. In Fig. 9  we demonstrated that the bare matrix element shows an increase from zero steps up to 12 steps, albeit large errors for small number of steps, tending to converge after 10 steps.
One important consistency check for the calculation of Z gg is the comparison of the renormalized matrix elements between different methods, which is demonstrated in Fig. 21. For simplicity, we neglect the mixing for this discussion. Such mixing is found to be very small (see Subsec. V C), and thus, does not alter the main conclusions of Fig. 21. The multiplication of the bare matrix gg2 is shown in Fig. 21 for the three methods investigated. As can be seen, the three methods yield compatible results for all stout steps. While the stout smearing does not alter the values of the renormalized matrix element, it has the advantage that it reduces the gauge fluctuations. The chosen value for Z gg in the MS scheme at 2 GeV is obtained at 10 stout steps using method 1 as shown in Fig. 21.
which is a conservative choice as it has the largest statistical uncertainty compared to the other methods (see Fig. 20). A systematic has been added by varying the highest point in the polynomial fit from 4 to 3.5 of the initial RI scale (aµ 0 ) 2 . The value above will be used in the 2 × 2 renormalization of the bare values given in Table IV.

C. Mixing between fermion and gluon operators
The renormalization of the quark and gluon EMT is more complicated as compared to other operators stud-ied within hadron structure (e.g., intrinsic spin). This is due to their mixing, resulting into a 2 × 2 matrix necessary for the appropriate renormalization, as given in Eqs. (45) - (46). In fact, the mixing pattern of the gluon operator of Eq. (3) is more complicated, as it includes other operators such as Becchi-Rouet-Stora (BRS) variations or operators that vanish by the gluon equations of motion [82]. However, such operators do not contribute to matrix elements between physical states.
All coefficients Z qq , Z qg , Z gg , Z gq of the mixing matrix can be obtained within lattice perturbation theory, following the procedure of our previous work on the gluon EMT [33]. In particular, to one loop level, one needs to calculate the diagrams of Fig. 22. Here we are interested in the extraction of Z gq and Z qg from our perturbative calculation, as Z gg and Z qq are computed non-perturbatively. In the calculation within perturbation theory we use up to two steps of stout smearing for the gluon EMT. This limitation is posed by the fast increase of algebraic expressions (millions of terms), for higher number of stout steps. We find that the polynomial nature of the perturbative renormalization functions with respect to the stout parameter, leads to a convergence at a small number of stout steps. This has been confirmed in other calculations with stout smearing [33,83]. Using the lattice spacing and coupling constant of the ensemble under study we extract the mixing coefficients:

VI. RESULTS
In this section we give the renormalized matrix elements, by combining the bare matrix elements extracted in Sec. IV and the renormalization factors in Sec. V yielding our physical results. The renormalized results are obtained from the expressions where X = x , J, and δZ qq the difference between singlet and non-singlet Z qq and N f = 4 since we have four flavors in the sea. In order to fully decompose the quark flavors we use the corresponding isovector results from Refs. [25,51], which are also given in Table VI for completeness.
In Fig. 23 we show our results for the proton average momentum fraction for the up, down, strange and charm quarks, for the gluons as well as their sum. The up quark is the largest quark contribution, namely about 35%, twice bigger than the down quark. The strange quark contributes significantly smaller, namely about 5% and the charm is restricted to about 2%. The gluon has a significant contribution of about 45%. Summing all the contributions results to q=u,d,s,c x q + R + x g R = 104.5(11.8)%, confirming the expected momentum sum. Fig. 23 also demonstrates that disconnected contributions are crucial and if excluded would result to a significant underestimation of the momentum sum.
The individual contributions to the proton spin are presented in Fig. 24 as extracted from Eq. (11). The major contribution comes from the up quark amounting to about 40% of the proton spin. The down, strange and charm quarks have relatively smaller contributions. All quark flavors together constitute to about 60% of the proton spin. The gluon contribution is significant, namely about 40% of the proton spin, providing the missing piece to obtain in total 94.6(14.2)% of the proton spin.
which is indeed compatible with zero within its statistical uncertainty.
Since the quark contribution to the proton spin is computed, it is interesting to see how much comes from the intrinsic quark spin. In Fig. 25 we show our results for 1 2 ∆Σ q + = 1 2 g q + A . These are taken from Ref. [25] and included in Table V, for easy reference. The up quark has a large contribution, up to about 85% of the proton intrinsic spin. The down quark contributes about half compared to the up and with opposite sign. The strange and charm quarks also contribute negatively with the latter being about five times smaller than the former giving a 1% contribution. The total 1 2 ∆Σ q + is in agreement with the upper bound from COMPASS [84].
Having both the quark angular momentum and the quark intrinsic spin allows us to extract the orbital angular momentum using Eq. (12). For a direct calculation using TMDs see Ref. [85]. Our results are shown in Fig. 26 The total contribution of the four flavor is also shown (grey bar) [25]. The dashed horizontal line is the observed proton spin and the percent numbers are given relative to it. Results in MS scheme at 2 GeV. quark is negative reducing the total angular momentum contribution of the up quark to the proton spin. The contribution of the down quark to the orbital angular momentum is positive almost canceling the negative intrinsic spin contribution resulting to a relatively small positive contribution to the spin of the proton.
Our final values for each quark flavor and gluon contribution to the intrinsic spin, angular momentum, orbital angular momentum and momentum fraction of the proton are summarized in Table V Orbital angular momentum L contributions to the proton spin. The color notation is as in Fig. 25. The dashed horizontal line denotes the observed proton spin and the percentage is given relative to the it. Results are given in MS at 2 GeV. , the total angular momentum J and the orbital angular momentum L in the MS scheme at 2 GeV. Results are given separately for the up (u + ), down (d + ), strange (s + ), charm (c + ) and for gluons (g) where for the quarks, results include the antiquarks contribution. The sum over quarks and gluons is also given as Tot.
x J  The results given in Table V and VI are obtained using one ensemble of twisted mass fermions. Therefore, it is not possible to quantitatively determine finite lattice spacing and volume systematics. However, in Ref. [86] several N f = 2 twisted mass fermion ensembles were analyzed with pion masses in the range of 260 MeV to 470 MeV and lattice spacings a =0.089, 0.070 and 0.056 fm as well as for two different volumes. A continuum extrapolation at a given value of the pion mass was performed. We found negligible O(a 2 )-terms yielding a flat continuum extrapolation. Therefore, we expect that cut-off effects will be small also for our current ensemble.

VII. COMPARISON WITH OTHER STUDIES
In order to evaluate the contribution of each quark flavor to the proton spin and momentum one needs to compute the quark-disconnected diagrams as done here. The evaluation of these contributions is much more challenging as compared to the connected ones, in particular at the physical point. This is the main reason that most lattice QCD studies to date have mostly computed isovector quantities for which the aforementioned diagrams cancel. For instance, in the case of the axial charge, which is an isovector quantity, there are numerous studies [87], whereas for the individual quark flavor axial charges g q + A ≡ ∆Σ q + results computed directly at the physical point are still scarce. In order to make a comparison with other lattice QCD studies, we include results obtained using a chiral extrapolation. We limit ourselves to comparing results that were obtained having at least one ensemble with close to physical pion mass, meaning below 180 MeV. Although such a chiral extrapolation may introduce uncontrolled systematics that are absent from the results reported here, it allows for a comparison with published lattice QCD results on these quantities.
We begin with 1 2 ∆Σ q + and consider the following lattice QCD studies: i) The χQCD collaboration analyzed three N f = 2+1 gauge ensembles of domain-wall fermion (DWF) generated by the RBC/UKQCD collaboration with pion masses 171, 302 and 337 MeV and lattice spacings of 0.143, 0.111 and 0.083 fm. They used overlap fermions in the valence sector. They performed a combined fit in order to extrapolate to the physical pion mass, the continuum and infinite volume limits [29].
ii) The PNDME collaboration analyzed several N f = 2 + 1 + 1 gauge ensembles of highly Improved Staggered Quarks (HISQ) generated by the MILC Collaboration. They used Wilson clover fermions in the valence sector. For the connected contributions they analyzed eleven ensembles with m π 315, 220, 135 MeV and lattice spacings a 0.15, 0.12, 0.09, 0.06 fm. The disconnected contributions were computed on a subset of these ensembles. The strange quark contributions were computed on seven ensembles using all lattice spacings but only one physical point ensemble was analyzed; the light disconnected were computed on six ensembles for two values of m π = (315, 220) MeV, which are not close to the physical pion mass and thus excluded from the comparison. They performed a combined chiral and continuum limit extrapolation to extract results at the physical point [28].
iii) The ETM collaboration analyzed an N f = 2 ensemble of twisted mass fermion with m π = 130 MeV, a = 0.094 fm and Lm π = 3 [27]. . From left to right: for u + , d + , s + and c + quarks. Red, green, blue, and orange denote lattice QCD results, with filled symbols being results that are computed directly at the physical point and open symbols results that were obtained after a chiral extrapolation. The inner error bar is the statistical error and the outer one the total that includes systematic errors. In particular, red circles show the results using the cB211.072.64 ensemble of this work and reported in Ref. [25] with the associated error band. Green squares show ETM Collaboration results from Ref. [27]; blue upwards pointing triangles show results from χQCD [29]; and orange left-pointing triangles from PNDME [28]. Results from global fits of polarized PDFs are shown with black symbols and right triangles, pentagons, diamonds, and rhombus being from NNPDFpol.1 [17], DSSV08 [11], JAM15 [88], JAM17 [89], respectively.
In Fig. 27 we show a comparison of our results on the intrinsic spin 1 2 ∆Σ q + to the aforementioned lattice QCD studies. As can be seen, there is an agreement among different lattice QCD analyses. In addition, we compare to the results extracted from global-fit analysis of polarized parton distribution. The values from the analysis of the cB211.072.64 ensemble of this work for the up and down quarks agree very well with the phenomenological extractions. We note that the precision reached is comparable to that of the phenomenological values. For the strange quark contribution 1 2 ∆Σ s + lattice QCD results achieve a better accuracy than the results extracted from global fits and point to a smaller value as compared to those from DSSV08 [11] and JAM15 [88]. Our results for 1 2 ∆Σ c + predict a non-zero value, showing small but non-zero charm quark effects in the nucleon.
For the comparison of the average momentum fraction of each quark flavor and the gluon we consider lattice QCD results from the following groups: i) The χQCD collaboration using the same gauge en-sembles as described for the case of the intrinsic quark spin. In addition, they included in the analysis an ensemble with m π = 139 MeV and a = 0.114 fm [81]. Despite the fact that a physical point ensemble is included, a chiral extrapolation is still performed in order to extract their value at the physical point. In using more precise results for heavier than physical pion mass ensembles their result at the physical point is weighted less in the fit. This procedure may yield better precision at the physical point but it can also potentially introduce an unknown systematic error due to the chiral extrapolation.
ii) The ETM collaboration using the same setup as for the intrinsic quark spin. In Fig. 28 we compare the results for the average momentum fraction for each quark flavor. The results highlight the improvement achieved in the current analysis as compared to the two previous direct determinations using physical point ensembles by the ETM [26] and χQCD [81] Collaborations. This is mostly due to the precise determination of the quark loop (disconnected) contributions using our improved techniques. Additionally, our current determination is in remarkable agreement with the phenomenological extractions resolving a long standing dis-crepancy between lattice QCD results and experimental determinations. In Fig. 28 we also include a comparison of the gluon momentum fraction where we only show lattice results with non-perturbative renormalization, thus excluding the previous result from the ETM Collaboration [26]. There is agreement between the result of this study and the one from the χQCD Collaboration as well as with the phenomenological determinations, which are very precise compared to the current lattice QCD values.
For the angular momentum and orbital angular momen-tum, the quark decomposition at the physical point has  29: Results for the angular momentum J for each quark flavor and the gluon. With red circles are results from this work and with green squares results from our previous study [26]. The notation is as in Fig. 28. only been computed by the ETM Collaboration [26]. We show the comparison between these two studies in Figs. 29 and 30, respectively. The results of this work have improved accuracy for all quark flavors for this class of observables. Both J u + and L u + have smaller values while the rest are in agreement with our previous study. This is due the fact that more sink-source time separations are used reaching larger separations with more accuracy. This leads to a better extraction of the nucleon matrix element. For J g the result of this study is the only one available with a non-perturbative renormalization at the physical point.

VIII. CONCLUSIONS
This work updates the ETM Collaboration results of Ref. [26] by making six major improvements: i) an analysis of an ensemble of N f = 2 + 1 + 1 of twisted mass fermions adding dynamical strange and charm quarks as compared to an N f = 2 ensemble; ii) a more accurate evaluation of the disconnected contributions yielding the most accurate lattice QCD determination of these quantities directly at the physical point to date; iii) the analysis of larger sink-source time separations at higher accuracy eliminating more reliably excited states contributions in dominant connected contributions; iv) the computation of the GFF B 20 (Q 2 ) needed for J q,g ; v) the nonperturbative evaluation of the difference between the singlet and the non-singlet renormalization functions for all the relevant operators; and vi) non-perturbative renormalization of the gluon momentum fraction and angular momentum J g .
iv) The computation of the quark orbital angular momentum obtaining q=u,d,s,c L q + = 0.094 (51).
A next step of this study is to compute the mixing coefficients discussed in Sec. V C non-perturbatively and analyze N f =2+1+1 physical ensembles with finer lattice spacings and bigger volumes to perform the continuum and infinite volume extrapolations.
The following expressions are provided in Euclidean space. We suppress the Q 2 = −q 2 argument of the generalized form factors, E N is the nucleon energy for three-momentum q, for the case p = 0, the kinematic factor K = 2m 2 N /[E N (E N + m N )] and Latin indices (k, n, and j) take values 1, 2, and 3 with k = j while ρ takes values 1, 2, 3, and 4.