Lattice evaluation of the Compton amplitude employing the Feynman-Hellmann theorem

The forward Compton amplitude describes the process of virtual photon scattering from a hadron and provides an essential ingredient for the understanding of hadron structure. As a physical amplitude, the Compton tensor naturally includes all target mass corrections and higher twist effects at a fixed virtuality, $Q^2$. By making use of the second-order Feynman-Hellmann theorem, the nucleon Compton tensor is calculated in lattice QCD at an unphysical quark mass across a range of photon momenta $3 \lesssim Q^2 \lesssim 7$ GeV$^2$. This allows for the $Q^2$ dependence of the low moments of the nucleon structure functions to be studied in a lattice calculation for the first time. The results demonstrate that a systematic investigation of power corrections and the approach to parton asymptotics is now within reach.


I. INTRODUCTION
Understanding the internal structure of hadrons from first principles remains one of the foremost tasks in particle and nuclear physics. It is an active field of research with important phenomenological implications in highenergy, nuclear and astroparticle physics. The static properties of hadrons, from the hybrid structure of quark and meson degrees of freedom at low energies down to the partonic structure at short distances, are encoded in structure functions. The tool for computing hadron structure functions from first principles is lattice QCD.
The connection between nucleon structure functions and the quark structure of the nucleon is commonly rendered by the parton model. Although providing an intuitive language in which to interpret the deep-inelastic scattering data, the parton model represents an ideal case, valid if the partons are scattered elastically and incoherently by the incoming lepton. In the operator product expansion (OPE) of the Compton amplitude the operators are classified according to twist. The parton model accounts for twist-two contributions only, and does not accommodate power corrections arising from operators of higher twist. It has been known for a long time though that contributions from operators of higher twist are inseparably connected with the contributions of leading twist, as a result of operator mixing and renormalization [1,2].
So far lattice calculations of nucleon structure functions have largely been limited to matrix elements of leading twist. That includes the calculation of a few lower moments of parton distribution functions (PDFs), see Refs. [3,4] for pioneering work and Ref. [5] for a summary of the latest activity. More recently, there has been significant activity in the calculation of so-called quasi-PDFs [6][7][8] and its extensions, pseudo-PDFs [9] -see Ref. [10] for a review. Various other strategies have also been proposed to overcome shortcomings of the study of structure functions on the lattice [11][12][13][14][15][16] and other related inclusive processes [17][18][19][20][21].
In the present work, we build upon a recent Letter [22] outlining a procedure to determine nucleon structure functions from a lattice calculation of the forward Compton amplitude. By working with the physical amplitude, this approach overcomes issues of operator mixing and renormalization, and the restriction to light-cone operators [23,24]. Here we establish the theoretical foundation of the approach and present results for the Compton amplitude across a range of kinematics. The calculations are performed at the SU(3) flavor symmetric point [25] at an unphysical pion mass. Results are reported on the lowest four moments of the unpolarized structure functions of the nucleon for photon momenta Q 2 ranging from approximately 3-7 GeV 2 . The variation of Q 2 demonstrates the potential to provide a quantitative test of the twist expansion on the lattice for the first time.
In terms of the practical computation, the determination of the Compton amplitude takes advantage of the Feynman-Hellmann [26][27][28][29][30] approach to hadron structure -see also Refs. [31][32][33][34][35]. The use of Feynman-Hellmann allows one to avoid the need to compute 3-or 4-point functions and, thereby, obtain statistically cleaner signals as well as maintain better control over excited-state contamination. Here we also present a derivation of the second-order Feynman-Hellmann theorem necessary for the present work -a related derivation has been presented in Ref. [36].
This paper is organized as follows: formal definitions of the Compton amplitude and the structure functions, along with the connection between the OPE and the dispersion relation are given in Section II. We explicitly derive the second order Feynman-Hellmann theorem in Section III. Our lattice setup and the implementation details are given in Section IV. Results for the Compton amplitude and the moments of the structure functions are presented in Section V. We summarize our findings in Section VI.

A. Notation
At leading order in the electromagnetic interaction, the general description for the inclusive scattering of a charged lepton from a hadronic target, e.g. eN → e ′ X, is encoded in the hadron tensor. Conventionally, the hadron tensor is expressed as a matrix element of the commutator of electromagnetic current operators [37][38][39], 1 W µν (p, q) = 1 4π d 4 z e iq⋅z ρ ss ′ ⟨p, s ′ [J µ (z), J ν (0)] p, s⟩, (1) for a hadron of momentum p and (virtual) photon momentum q. For the present discussion, we will only consider spin-averaged observables by taking ρ ss ′ = 1 2 δ ss ′ . The current operator takes the familiar form as the charge-weighted sum of the quark vector currents, J µ = ∑ f Q f J f µ , with Q f being the charge of quark flavor f . The flavor decomposition will be discussed in further detail in a later section.
The spin-averaged nucleon tensor can be decomposed as which is defined such that Lorentz-invariant structure functions, F 1,2 , match onto their conventional partonic interpretation in the deep inelastic scaling region. These 1 In this section, we work in Minkowski space. structure functions are expressed as functions of the Bjorken scaling variable (x = Q 2 (2p ⋅ q)) and Q 2 = −q 2 . While the inelastic structure functions are not directly accessible within a conventional Euclidean lattice formulation, we highlight that the spacelike component of the Compton tensor can be studied within a Euclidean framework-as also discussed in Refs. [11,14,17].
The (spin-averaged) Compton tensor is defined similarly to Equation (1), where T is the time-ordering operator. This tensor can be decomposed in precisely the same way as W µν in Equation (2), which defines the analogous scalar functions F 1,2 (ω, Q 2 ). For our purposes, it is convenient to express these in terms of the inverse Bjorken variable ω = 2p ⋅ q Q 2 . These Compton structure functions are related to the corresponding ordinary structure functions via the optical theorem, which states: Analyticity and crossing symmetry means we can write a dispersion relation for F [40] To accommodate the subtraction necessary in F 1 , we will make use of the bar notation to denote the dispersive part, F 1 (ω, Q 2 ) = F 1 (ω, Q 2 ) − F 1 (0, Q 2 ). The dispersion integrals can be directly connected to the hadron tensor by the optical theorem, giving: The nature of the dispersion integral makes it clear that whenever ω < 1 the singularities are never encountered and the time-ordering i becomes irrelevant. Hence the current-current correlation remains spacelike and there is no distinction between the Euclidean and Minkowski amplitudes. Physically, the condition ω < 1 is simply the statement that the eigenstates which propagate between the current insertions in Equation (3) cannot go on-shell.

B. Operators displaced in time
In the Feynman-Hellmann approach employed in this work, the matrix elements calculated involve currentcurrent correlations which are displaced in Euclidean time. It is therefore instructive to further clarify the relationship between the Compton tensor, as defined in Minkowski space, and the corresponding calculation within a Euclidean framework.
We start by separating Equation (3) into two distinct time orderings by first defining the amplitude at fixed temporal separation between the currents: From this definition it is straightforward to recover the full Compton amplitude by integrating over t: To isolate the explicit t dependence in Equation (10), we insert a complete set of states and exploit translational invariance in the usual way: The completeness integral, I = ⨋ X X⟩⟨X , describes a full integral over the entire state space, implicitly including all possible momenta over all possible configurations of particles.
Similarly to Equation (10), one can write down an expression where the current insertions are separated in Euclidean time [11,39]: Within the Feynman-Hellmann approach, by construction, there is no external energy transfer, q 0 = 0 -however, we retain this variable explicitly in our presentation for completeness. Inserting a complete set of states and using translational invariance, under Euclidean evolution, we have: It is evident that Equations (12) and (14) differ nontrivially in their dependence on the temporal coordinate -a similar point has been made in Ref. [41]. However, upon integrating with respect to time, the Minkowski and Euclidean expressions are easily equated -subject to the caveat that we are below the elastic threshold.
In particular, provided that E X(p±q) > E p ± q 0 for all non-vanishing contributions to the completeness sum ⨋ X , then the i prescription becomes irrelevant and we have: By summing the two terms of Equation (11), this makes it clear that the Euclidean and Minkowski Compton amplitudes are identical in the unphysical region, even though the current insertions are allowed to be separated in time. We of course note that if the intermediate states can go on shell, E X(p±q) = E p ± q 0 , then the Euclidean integral in Equation (15) is not well defined [42]. The i factor in Equation (10) then becomes essential in order to define the analytic continuation required to render the integral finite. However, the fixed-t Euclidean matrix elements are perfectly well defined, and there is no restriction on the kinematic thresholds-as has been studied in [11,39,42].
Although we have just been through this careful consideration of the t dependence, we note that the Feynman-Hellmann technique relies on resolving the spectrum of a perturbed Hamiltonian-which itself does not make any reference to the nature of the temporal correlations. In the present work, there is hence no need to explicitly resolve the τ dependence of Equation (13). The connection to the Minkowski amplitude lies in Equation (15), as the current insertions act at all times.

C. Moments and the OPE
As will become clear in the next section, within the Feynman-Hellmann formalism, each external current momentum vector q of interest requires a unique propagator inversion. However, for each inversion, the variation of hadron Fourier momenta p allows access to multiple distinct ω = 2p.q Q 2 values. An ensemble of q and p values therefore provides a wealth of kinematic points to resolve the Compton amplitude. It is therefore convenient to present a summary of the kinematic coverage in terms of the moments of the structure functions. We consider the Taylor series expansion of Equation (8) at fixed Q 2 : with the moments being defined by From the perspective of the present lattice calculation, one can proceed by calculating the Compton tensor for a number of ω values and extract the moments of the structure functions. By treating Q 2 as an external scale, the method connects directly to the physical amplitudes of interest, and therefore circumvents the operator mixing issues discussed above. Although the described lattice calculations relate directly to the physical moments, we note that at asymptotically large Q 2 the moments become dominated by their leading-twist contributions, namely the moments of the familiar parton distribution functions, v 2n , where the sum runs over partonic flavors f . The short distance structure of the operator product in Equation (3) is encoded in the Wilson coefficients, C, and the long-distance hadronic features are encoded in the matrix elements of local operators, v, renormalized at some scale µ. For completeness, our notation is summarized in Appendix A.

III. SECOND ORDER FEYNMAN-HELLMANN THEOREM
The purpose of applying the Feynman-Hellmann theorem to lattice QCD is to relate matrix elements of interest to energy shifts in weak external fields. In the case of a generalized Compton amplitude, described by a matrix element of two (non-local) current insertions, the conventional approach would require the evaluation of lattice 4point functions. The application of Feynman-Hellmann then reduces the problem to a more straightforward analysis of 2-point correlation functions using spectroscopic techniques.
In order to compute the forward Compton amplitude via the Feynman-Hellmann relation, we introduce the following perturbation to the fermion action, where λ is the strength of the coupling between the quarks and the external field, J µ (x) = Z Vq (x)γ µ q(x) is the electromagnetic current coupling to the quarks along the µ direction, q is the external momentum inserted by the current and Z V is the renormalization constant for the local electromagnetic current. The general strategy for deriving Feynman-Hellmann in a lattice QCD context is to consider the general spectral decomposition of a correlator in the presence of the background field. The differentiation of this correlation function with respect to the external field reveals a distinct temporal signature for the energy shift. By explicit evaluation of the perturbed correlator, one is able to identify this signature and hence resolve the desired relationship between the energy shift and matrix element. Our principle theoretical result here is that for the perturbed action described in Equation (22), the second-order energy shift of the nucleon is found to be: where T is the Compton amplitude defined in Equation (3), q = (q, 0) is the external momentum encoded by Equation (22), and E N λ (p) is the nucleon energy at momentum p in the presence of a background field of strength λ. In the following we sketch the main steps of the derivation, and refer the interested reader to Appendix B for further details.
In the presence of the external field introduced in Equation (22), we define the two-point correlation function projected to definite momentum as, where here and in the following, a trace over Dirac indices with the spin-parity projection matrix Γ is understood, and Ω λ ⟩ is the vacuum in the presence of the external field. The asymptotic behavior of the correlator at large Euclidean times takes the familiar form, where E N λ (p) is the energy of the ground state nucleon in the external field and A λ (p) the corresponding overlap factor.
For the purpose of current presentation, a nucleon interpolating operator is assumed for χ. However, the derivation applies to any ground-state hadron, provided the ground state in the presence of the external field is perturbatively close to the free-field state. A simple counter example could be a Σ baryon in the presence of a strangeness-changing current, where at λ = 0 the correlator behaves as e −E Σ t but at any finite λ this will eventually be dominated by e −E N t (kinematics permitting).
It is for a similar physical reason that one must work with nucleon states that have the least possible kinetic energy among all states connected to any number of current insertions. This same condition guarantees the connection between the Euclidean and Minkowski Compton amplitudes described in the previous section. In the presence of the background field, the Hamiltonian of the system will mix momentum states p ± nq. We hence choose the Fourier projection of our correlation function, Equation (24), such that p corresponds to the lowest kinetic energy, p < p + nq ∀n ∈ Z. For notational purposes, at non-zero λ, the label p can be understood to describe the space of all momentum states connected by an integer multiple of q.
Assuming that first-order perturbations of the energy vanish, as ensured by the chosen kinematics, the secondorder derivative of Equation (24), evaluated at λ = 0, reduces to The derivatives of A λ (p) and E N λ (p) are assumed to be evaluated at λ = 0. The first term corresponds to the shift in the overlap factor and the second order energy shift is identified in the t-enhanced (or time-enhanced) term. It is this t enhancement that leads to a relationship between the energy shift and matrix element. Hence to complete the derivation, we differentiate the path integral representation directly to identify the time-enhanced contributions to the correlator. The path integral expression for the 2-point correlator, Equation (24), in the background field is given by where λ ⟨⋯⟩ λ denotes the full path integral over all fields, using the perturbed action S(λ) given in Equation (22)an absence of λ subscript is taken to imply λ → 0. By differentiating twice with respect to λ and evaluated at λ → 0, one finds (see Appendix B for details) To arrive at this form, it is assumed that the vacuum expectation value of a single current insertion vanishes, ⟨∂S(λ) ∂λ⟩ = 0, such as is the case for the electromagnetic current. It is clear that the second term in Equation (28) only acts to modify the unperturbed correlator, and hence cannot generate the temporal enhancement associated with the energy shift. Focusing purely on the first term, and inserting an explicit form for the electromagnetic external field, the corresponding second derivative of the correlator becomes The correlator defined here involves a four-point correlation function with nucleon interpolating operators held at fixed temporal separation t, with the currents inserted across the entire four-volume. Importantly, this expression is evaluated in the absence of the external field, and hence momentum conservation is exact. It is then possible to perform a spectral decomposition of this correlator in terms of a transfer matrix that is diagonal in the momenta. It is a rather straightforward calculation to perform the standard procedure of inserting a complete sets of states, and then exploit translational invariance to complete the spatial integrals. Since the temporal integrals over the currents extend over all time, each distinct time ordering of the 4-point function must be treated separately. However the contribution to the energy shift can only come from the contribution where the two current operators both appear between the nucleon creation and annihilation operators. Isolating the contributions that give rise to the dominant te −E N (p)t behavior at asymptotic times gives: where the subleading terms are suppressed by the ellipsis. Note that the spin indices have been suppressed here, however a detailed presentation is provided in Appendix B. Finally, a comparison of this form with Equation (26) and the Compton amplitude, Equation (3), leads to the result quoted in Equation (23).
In principle the derivation presented in this section (and Appendix B) can be generalized to mixed currents by adding an additional perturbation (or current) to Equation (22) with a different coupling strength, λ ′ , and current momentum, q ′ , which can, in general, be taken to be different from λ and q. This would allow access to interference terms, allowing one to study u-d flavor interference effects, spin-dependent amplitudes or the off-forward Compton amplitude and generalized parton distributions with q ≠ q ′ . Details of a prescription for the off-forward Compton tensor will be presented in a forthcoming paper [43].  We use a single gauge ensemble generated by the QCDSF/UKQCD Collaborations employing a stoutsmeared non-perturbatively O(a)-improved Wilson action for the dynamical up/down and strange quarks and a tree-level Symanzik improved gauge action [44]. We work on a volume of L 3 × T = 32 3 × 64, the bare coupling parameter is β = 5.5, and the lattice spacing, a = 0.074(2) fm, is set using a number of flavor-singlet quantities [45][46][47][48]. We are working on the SU(3)-flavor symmetric point where the masses of all three quark flavors are set to approximately the physical flavor-singlet mass, m = (2m s + m l ) 3 and corresponds to a pion mass of ≃ 470 MeV and m π L = 5.6. Further details and advantages of this choice are discussed in [25,46]. The renormalization constant for the local vector current is determined to be Z V = 0.8611(84) by imposing the charge conservation on the Sachs electric form factor calculated at Q 2 = 0. This value is in agreement within statistical precision with the value determined in the chiral limit using the RI ′ -MOM scheme [49]. We tabulate the details in Table I for the Reader's convenience.

B. Feynman-Hellmann implementation
We implement the second-order Feynman-Hellmann theorem through the valance quarks. It can clearly be implemented at the hybrid Monte Carlo level, which would relay the effects of the perturbations to the seaquarks [29], however this would lead to a significant increase in required computing resources, so in this work we focus on the quark-line connected contributions to the Compton amplitude. To this end, we add the perturbation given in Equation (22), to the valence quark action only, where the renormalized local vector current, , is chosen to be along the z-direction, µ = 3. The second exponential term symmetrizes the Fourier transform and ensures the Hermiticity of the action. In order to evaluate the second-order energy shift with respect to λ at λ = 0, one has to compute additional quark propagators at several choices of λ. This added cost of computation is countered by optimizing the inversion of the perturbed Dirac Figure 1. The six possible ways of inserting two currents to a nucleon. Upper three correspond to the uu flavor contributions while the lower leftmost is for dd. Remaining two are for the ud, which we omit in this work.
matrix. We adopt an approach where we feed the unperturbed propagator as an initial guess to the inversion of the perturbed one, which results in roughly a factor of 10 gain in inversion time.
In order to improve the stability of estimating the energy-shifts at λ = 0, one aims to introduce the smallest possible perturbations by choosing a suitable λ. The objective is to keep λ sufficiently small to minimize the contamination from λ 4 effects, yet large enough to ensure that the perturbation is not lost within the numerical precision of the calculation. Our tests indicate that any choice in the range 10 −1 > λ ≥ 10 −5 leads to meaningful results. Note that the upper bound is sensitive to the quark mass, where a too large λ might lead to increased instabilities in the Dirac matrix inversion, particularly as one approaches the physical point.

C. Flavor decomposition
The implementation of the Feynman-Hellmann theorem described above effectively inserts an external current on to a quark line by computing its propagator with the perturbed quark action Equation (31). When both currents are inserted onto the u-quarks or the d-quark, we evaluate the "uu" or "dd" contributions to the Compton structure functions, respectively. By employing positive and negative pairs of λ's (see Section V), one can form u + d and u − d type insertions leading to the possibility for isolating a "ud" insertion where one current hits a uquark and the other the d quark. The six different ways of inserting the currents are shown in Figure 1. The ud contribution is particularly interesting since it directly corresponds to a higher-twist contribution [50], i.e. the twist-4 cat's ears diagram. An investigation of these contributions is left for future work.

D. Isolating the energy shift
The energy of the ground state, N λ , in a weakly coupled external field can be expanded as a Taylor series in λ, Collecting terms that are even and odd in λ to all orders, we may rewrite the expansion as, where E N (p) in the above two expressions corresponds to the unperturbed (λ = 0) energy. In order to extract the second order energy shift from the lattice correlation functions, we construct a ratio which isolates the even-λ energy shift, ∆E e N λ (p), where the perturbed two-point functions, G ±λ (p, t), are defined in Equations (24) and (25) and G (2) (p, t) is the unperturbed one. The large t behavior given in Equation (35) is arrived at by combining Equations (25) and (33). While not necessary for our discussion, for completeness we note that the overlap factor is Extraction of the even-λ energy shift ∆E e N λ then follows standard spectroscopy methods by fitting R e λ (p, t) defined in Equation (34) with a single exponential at sufficiently large times. Details follow in the next section.

V. RESULTS AND DISCUSSION
A. Extracting the structure functions and the moments In order to illustrate the feasibility and the versatility of the method, we carry out simulations with several values of current momentum, Q 2 , in the range 3 ≲ Q 2 ≲ 7 GeV 2 . Utilizing up to six randomly placed quark sources per configuration, we perform up to O(10 4 ) measurements for each pair of λ and q. Quark fields are smeared in a gauge-invariant manner by Jacobi smearing [51], Multiple ω values that we can access with several combinations of p = (px, py, pz) and q = (qx, qy, qz) in lattice units. Note that the ω ≥ 1 values are omitted in the analysisvalues of ω outside the allowed range are indicated by italics. where the smearing parameters are tuned to produce a rms radius of ≃ 0.5 fm. We bin the measurements to account for the autocorrelations. In order to estimate the statistical errors, we pull a set of bootstrap samples from the binned dataset and perform all steps of the analysis on each sample. We access multiple ω values at each simulated value of q by varying the nucleon momentum p as shown in Table II. Setting p z = 0 and q z = 0, along with our choice of µ = 3, simplifies the Compton tensor such that the second order energy shift in Equation (23) corresponds to the F 1 Compton structure function directly, where the energy of the nucleon is calculated via the continuum dispersion relation, E N (p) = m 2 N + p 2 . We compute the perturbed two-point correlation functions (Equation (24)) with four values of λ = [±0.0125, ±0.025]. Even-λ energy shifts are extracted from the ratio of correlation functions given in Equation (34) following a covariance-matrix based χ 2 analysis to pick the best available range for each ratio. We show a representative case for q = (4, 2, 0) 2π L and p = (1, 0, 0) 2π L in Figure 2. Having even-λ energy shifts for two λ values, we perform polynomial fits of the form, to determine the second order energy shift. Given the smallness of our λ values, higher order O(λ 4 ) terms are heavily suppressed, hence the fit form reduces to a simple    Figure 3. In Figure 4 we show the subtracted Compton structure function for uu and dd insertions as obtained from the energy shifts via Equation (37) for each value of ω for q = (4, 1, 0) 2π L . With our particular choice of Lorentz indices and momenta, we can connect the lattice Compton amplitude (Equation (37)) to the moments of the structure functions (Equation (16)) as, Since the Compton amplitude is directly related to the experimental cross-section, it must be positive definitive for the entire kinematic region. Consequently, this holds for the uu and dd moments as well. Hence, the moments M (1) 2n are constrained to be monotonically decreasing, More generally, the sequence of moments satisfy the Hausdorff moment criteria [52], yet the simple monotonic decreasing form of Equation (40) allows an assessment of the constraint on the moments provided by the data.  Table III.
This series is rapidly converging and stable with respect to the truncation order. However, imposing the above condition in a least-squares analysis is not so straightforward, but it can be easily implemented in a Bayesian approach. The particular Bayesian inference implementation we are employing [53] has the advantage of using adaptive algorithms to optimize the hybrid Monte Carlo parameters [54], which removes the extra effort of fine-tuning such parameters and returns the sampling results within mere minutes with convergence checks implemented.
In the present analysis, we sample the moments from uniform distributions with bounds M is the χ 2 function with the covariance matrix C ij , ensuring the correlations between the data points are taken into account. Fits depicting the extraction of the moments are shown in Figure 4. Note that, Equation (40) is not necessarily true for the isovector, uu − dd, moments. Therefore, the Bayesian priors for the uu and dd are treated independently. However, by sampling the uu and the dd datasets within the same trajectory, we ensure underlying correlations between those datasets are accounted for. Hence, the indices i, j in Equation (41) run through all the ω values and both flavors. The first few moments extracted are given in Table III and the isovector moments are plotted in Figure 5 for    Table III. Q 2 are given in GeV 2 .
each choice of Q 2 . Error margins correspond to the highest posterior density interval with a 68% credible region and the asymmetric intervals reflect the shape of the posterior distributions. We find that the lower moments have a negligible dependence on the truncation order of the series in Equation (39) for n ≥ 3. We note that although the fall-off of the moments is quite evident, the second moments do not decrease as rapidly as one would expect from DIS data. Combined with the interplay between the u and the d moments, this leads to rather large second moments for the isovector u − d combination, which are even comparable to that of the first moment in some cases. While this is likely due to the limited statistics of the current simulations, it may also be a signal of significant power corrections which we discuss in the next section.

B. Power corrections and scaling
The moments M (1) 2n (Q 2 ) in Table III appear to indicate that power corrections are present throughout our range of photon momenta, 3 ≲ Q 2 ≲ 7 GeV 2 . The leading power corrections to moments of structure functions have essentially two sources, target mass (together with possible threshold effects) and mixing with operators of higher twist. Target mass and threshold effects can be accounted for, to a certain extent, by replacing the Bjorken x scaling variable by a generalized scaling variable, ξ (e.g. [55][56][57]). For example, a commonly used form proposed by Nachtmann [56] is where m N is the nucleon mass. Besides being power corrected, the various moments defined in terms of these new ξ scaling variables are mixtures of the (Cornwall-Norton [58]) moments defined in terms of x, with the mixings suppressed by powers of 1 Q 2 . For a generalized scaling variable incorporating the analytic structure of the forward Compton amplitude see Ref. [55]. The second source of power corrections in the structure function moments are due to contributions from operators with twist-4 and above, represented by the O(1 Q 2 ) terms in Equation (20). We note that while in principle it is possible to compute the Wilson coefficient for the twist-4 contribution to M (1) 2 (Q 2 ) non-perturbatively on the lattice [13,59,60], the hyper-cubic nature of the lattice can only accommodate operators of spin four or less, which thwarts any direct prediction of the Wilson coefficients for higher moments.
Since the moments computed in this work are determined from a fit to the full Compton amplitude, they naturally include all possible power corrections. While the present data do not have the precision or range of Q 2 to isolate individual power corrections, we are able to account for the observed Q 2 -dependence of each moment by fitting with the functional form Of particular interest is a fit to the lowest isovector moment M 2,uu−dd (Q 2 ), which we show in Figure 6 as a function of Q 2 . Here we clearly see that the current data is well described by Equation (43), with the fit form suggesting large power corrections may be present at low Q 2 . However we add a cautionary note that the behavior at small Q 2 is heavily influenced by the presence of the large value for the moment at Q 2 = 2.74 GeV 2 and neglecting this point would result in a much softer, although still non-trivial, Q 2 -dependence in the small-Q 2 region.
2,uu−dd (Q 2 ) moment. Black data points (tabulated in Table III) are obtained from independent fits to the ω-dependence of F1(ω, Q 2 ) at fixed Q 2 . Curve shows the fit to Equation (43).
The phenomenological values of the moments in the language of the parton model, namely v 2n (µ) in Equation (20), are commonly quoted at the scale µ = 2 GeV. The results presented in Figure 6 indicate that in our approach this number should be obtained from first taking the asymptotic value of M (1) 2,uu−dd (Q 2 ), say at Q 2 ≳ 16 GeV 2 , then performing a perturbative rescaling down to µ = 2 GeV. In practice, to achieve a reliable prediction would require an extension of the current simulations to larger values of Q 2 and a further increase in statistics.
At this stage we refrain from extending the present analysis to the higher moments, however an immediate avenue of study will be to investigate if the observed enhancement of the M (1) 4 isovector moment persists with higher statistics over a larger Q 2 range. Should this be the case, it will be interesting to compare with the enhancement observed in the empirical results at small-Q 2 (see e.g. [61]).
In addition to studying the Q 2 -dependence of the uu, dd and uu − dd moments of F 1 (ω, Q 2 ), the techniques described in this work can be easily extended to investigations of additional quantities such as the higher-twist ud moments (see the final two diagrams in Figure 1) or a test of the Callen-Gross relation at small Q 2 . For a first attempt see [50]. We emphasize that this work clearly demonstrates that a study of the Q 2 -dependence of such observable on the lattice is now possible.

VI. SUMMARY AND CONCLUSIONS
We have presented a derivation of the second-order Feynman-Hellmann theorem and its relationship to the forward Compton amplitude. In particular, the Compton amplitude can be computed directly on the lattice with a simple extension of the already established lattice implementations of the Feynman-Hellmann theorem, devoid of operator mixing and related complications of the conventional approach. In order to illustrate the feasibility and the versatility of this method in directly probing nucleon structure functions, we have performed highstatistics simulations for several photon momenta, Q 2 , on the 2 + 1-flavor, 32 3 × 64 QCDSF/UKQCD lattices at the SU (3) flavor symmetric point corresponding to a pion mass of ≃ 470 MeV. By studying the Compton amplitude across a range of kinematics, we have presented nontrivial signals for the first few moments of the nucleon structure functions. By revealing the Q 2 dependence of the low moments, there is a clear opportunity to directly study the evolution to the partonic regime -more detailed investigations of the power corrections will be pursued in future work. Beyond studying the approach to the partonic regime, the method could also be applied at smaller Q 2 and probe the dynamics of low-energy Compton scattering processes [62]. While moments of structure functions are relatively straightforward, there is also a prospect to invert the Compton amplitude to extract the x-dependence of the structure functions directly -see Ref. [63] for some first attempts.

ACKNOWLEDGMENTS
The numerical configuration generation (using the BQCD lattice QCD program [64])) and data analysis (using the Chroma software library [65]) was carried out on the IBM BlueGene/Q and HP Tesseract using DIRAC 2 resources (EPCC, Edinburgh, UK), the IBM Blue-Gene/Q (NIC, Jülich, Germany) and the Cray XC40 at HLRN (The North-German Supercomputer Alliance), the NCI National Facility in Canberra, Australia (supported by the Australian Commonwealth Government) and Phoenix (University of Adelaide For completeness, we summarize the relevant notation to complement the OPE discussion in Section II C. The (leading-order) Wilson coefficients are given by: and similarly for the gluons where

Appendix B: Second order Feynman-Hellmann Theorem
For the interested reader, we provide some of the key intermediate steps to produce the principal derivation presented in the main text, Equation (23).
The 2-point nucleon correlator in an external field, Equation (22), is given by: where Γ the spin-parity projection matrix, with trace implied, and Ω λ ⟩ is the vacuum in the presence of the ex-ternal field. A nucleon interpolating operator is assumed for χ. The strategy to derive the second-order Feynman-Hellmann relation is to consider the general form of the spectral decomposition of the Euclidean correlator, and match the energy shift against the explicit decomposition of the correlator in the presence of a weak external field. Following the usual procedure of inserting a complete set of states in between the operators, , and carrying out the momentum integral, the spectral decomposition of Equation (B1) in the large (Euclidean) time limit, where the ground state dominance is realized, is given as, where E N λ (p) and A λ (p) are the energy of the ground state nucleon and overlap factor, respectively, in the background field. We note that in the presence of the background field, the Hamiltonian of the system will mix momentum states p ± nq-with that p chosen to correspond to the lowest kinetic energy, p < p + nq ∀n ∈ Z.
The second order derivative of Equation (B2) with respect to λ, evaluated at λ = 0, is given by: The derivatives of A λ (p) and E N λ (p) are understood to be evaluated at λ = 0. The first-order energy shifts vanish, ∂E N ∂λ = 0, provided we restrict ourselves to the non-Breit-frame kinematics, i.e. p ≠ p ± q [27,30]. In this case, the above equation thus reduces to where the first term corresponds to the shift in the overlap factor and the second order energy shift is identified in the t-enhanced or the time-enhanced term. The familiar overlap factor is given by: We now directly evaluate the second-order derivative within the path integral formalism. The 2-point correlation function takes the form: where S(λ) is the perturbed action given in Equation (22), and Z(λ) is the corresponding partition function. Projecting the 2-point function to definite momenta and spin gives the standard correlator, To simplify the following expressions, we use the shorthand notation to describe the product of interpolating operators, Figure 7. Distinct time orderings of the current insertions, with increasing time assumed from left to right.
The first-order derivative of the correlator is then given by The first term corresponds to a vacuum shift and the second term encodes a the three-point correlation function that is related to the first-order energy shift. This term has been discussed in detail and applied to the calculation of forward matrix elements [27] and form factors [30]. For the Compton amplitude, the second order derivative is required, which is straightforward to evaluate, The first two terms vanish when the external perturbation is purely linear in λ. In the limit λ → 0, vacuum matrix elements of the external fields vanish, ⟨∂S(λ) ∂λ⟩ = 0, assuming the operator doesn't carry vacuum quantum numbers, such as the electromagnetic current-the scalar current would be an obvious counter example. The term involving ⟨(∂S(λ) ∂λ) 2 ⟩ will not in general vanish, however this can only act as a multiplicative factor on the free-field correlator and hence cannot contribute to the time-enhanced term in Equation (B4). The second-order energy shift can therefore only arise from the final term in Equation (B10), where the ellipsis denotes terms that are not time-enhanced. By restoring the explicit form for G, we have Using our explicit form for the electromagnetic external field, the corresponding second derivative of the correlator is given by The correlator defined here involves a four-point correlation function with nucleon interpolating operators held at fixed temporal separation t, with the currents inserted across the entire four-volume. Importantly, this expression is evaluated in the absence of the external field, and hence momentum conservation is exact. It is then possible to perform a spectral decomposition of this correlator in terms of a transfer matrix that is diagonal in the momenta. Given that the Fourier projection of the nucleon sink is at definite momentum p, and p < p ± q (as discussed above Equation (B4)), the leading asymptotic behavior of the correlator must have an exponential behavior given by e −E N (p)t . By resolving the corresponding t-enhanced coefficient of this exponential, we can identify the second-order energy shift, as given in Equation (B4). Assuming that the temporal length is sufficiently large that we can neglect the temporal boundary conditions, there are six distinct time orderings of where the current insertions can act relative to the nucleon interpolating fields. They are shown in Figure 7. Configuration A is the obvious ordering that contains the desired Compton amplitude. This corresponds to ground-state saturation of the nucleon on either side of the current insertions. The t dependence of this particular contribution, including explicit integrals over the current insertion times, will take the form: It is convenient to isolate the current separation time by transforming the coordinates to: and hence The term linear in t corresponds to the anticipated time enhancement of Equation (B4)-details of the connection to the Compton amplitude are given below. Given the condition that E X > E N , the damping ensures that the term proportional to ∆ is independent of t for large times. It is this damping which ensures the current separation remains localized in time, and allows the nucleon to saturate to the ground state on either side of the current.
Having selected the term of interest, it is necessary to confirm that none of the other possible configurations can scale as te −E N (p)t at large times. One potential example would be to consider the nucleon at the source to carry momentum p + 2q. This case gives a temporal behavior according to The second term clearly contains a damped exponential and hence the integral over ∆ converges for large t. In the first term, the ordering of the levels E X (p + q) and E N (p + 2q) will govern which contribution dominates at large t. However in either case, this term is exponentially-suppressed relative to e −E N (p)t . This example, that does not exhibit the desired te −E N (p)t behavior, makes it clear that in order to generate the coefficient linear in t, one must have 2 intermediate propagators of the lowest energy nucleon, such as in Equation (B14). With three available time windows and the momentum transfer through the current insertion, it is only possible to achieve this with the lowest-energy nucleons separated by an intermediate, energetic state. It is then straightforward to conclude that it is not possible for any of the temporal configurations B to F to generate a contribution te −E N (p)t . To highlight how these other terms contribute, we consider the behavior of the B-type ordering. One of the contributions would take the form Although a "light" vector meson propagates outside the nucleon interpolators, the τ ′ integral is convergent and no remnant of this mass scale can appear in the t-dependent exponent. And even though the momentum states were chosen to highlight a e −E N (p)t contribution, there cannot be a temporal enhancement since the kinematics are chosen to ensure E X (p + q) > E N (p).
Given that the contribution to the second-order energy shift must come from the temporal orientation of type A, we demonstrate how this relates to the Compton amplitude. Explicitly written out, configuration A gives rise to the 4-point function: We insert complete sets of states next to the nucleon interpolating operators, and translate the operator expressions according to the standard form, χ(x) = e −iP .x χ(0)e iP .x and J µ (z)J µ (y) = e −iP .y J µ (z − y)J µ (0)e iP .y , which leads to × e i(k−p)⋅y (e iq⋅y + e −iq⋅y )(e iq⋅z + e −iq⋅z ) × Γ⟨Ω χ(0) X(p)⟩⟨X(p) J µ (z − y, τ ′ − τ )J µ (0, 0) Y (k)⟩⟨Y (k) χ(0) Ω⟩. (B23) By adopting the transformation, z ′ = z − y, y ′ = y, the Fourier integral over y ′ can be eliminated, and hence eliminate the k integral: As described above in Equation (B20), the term involving the momentum transfer between in and out states cannot contribute to the energy shift, it is only the term involving a p → p matrix element that is of interest. By applying the result of Equation (B17), and noting that at large t, the correlator must be dominated by the state E X = E Y = E N : where the ellipsis represents terms that are suppressed at large t, relative to te −E N (p)t . Here, in identifying the groundstate nucleon, the spin sums implied by the ∑ X,Y have been restored. Because the matrix element ⟨N J(∆)J(0) N ⟩ is exponentially damped at large ∆, the term in the integrand proportional to ∆ also cannot generate a contribution to the second-order energy shift. Hence the only remaining term contributing to the energy shift is: where a spin-density overlap is used: A comparison of the form presented in Equation (B26) with Equation (B4), together with the Compton amplitude in Equations. (15) and (11) with q 0 = 0, yields our result quoted in Equation (23).