Off-shell effects in bound nucleons and parton distributions from $^1$H, $^2$H, $^3$H and $^3$He data

We report the results of a new global QCD analysis including deep-inelastic scattering data off $^1$H, $^2$H, $^3$H, and $^3$He targets. Nuclear corrections are treated in terms of a nuclear convolution approach with off-shell bound nucleons. The off-shell (OS) corrections responsible for the modification of the structure functions (SFs) of bound nucleons are constrained in a global fit along with the proton parton distribution functions (PDFs) and the higher-twist (HT) terms. We investigate the proton-neutron difference for the OS correction and discuss our predictions for the SF ratio $F_2^n/F_2^p$ and the corresponding PDF ratio $d/u$ in the proton, as well as their correlations with the underlying treatment of the HT terms and of the OS corrections. In particular, we find that the recent MARATHON data are consistent with equal relative OS corrections for both the proton and the neutron.


I. INTRODUCTION
The parton distribution functions (PDFs) are universal process-independent characteristics of hadrons driving the cross sections of various leptonic and hadronic processes at high momentum transfer. The PDFs are usually extracted from global QCD analyses of experimental data at high values of momentum transfer (for a review see, e.g., [1,2]). Precise determinations of PDFs are increasingly important in a variety of tests of the Standard Model and new physics searches. Nuclear deep-inelastic scattering (DIS) data can be helpful in this context for various reasons. For instance, the use of nuclear targets with a different proton-neutron content allows one to better constrain the $d$-quark distribution in the proton. Furthermore, including nuclear DIS data in a QCD analysis improves the statistical significance of the fit. However, employing those data requires an understanding of nuclear effects.
Nuclear effects are usually treated empirically in PDF analyses, employing simple parametrizations of the $x$ and $A$ dependencies (for a review see, e.g., [3,4]). Alternatively, we can follow a different approach and employ a microscopic model accounting for a number of nuclear effects caused by the energy-momentum distribution of bound nucleons, the off-shell (OS) corrections to the nucleon structure functions (SFs), and meson-exchange currents, as well as the nuclear propagation of quark-gluon states resulting in the nuclear shadowing effect [5]. A number of dedicated studies [5-9] indicate that this approach describes with good accuracy the observed nuclear effects in charged-lepton and neutrino DIS and in the Drell-Yan process, as well as in $W/Z$ boson production in proton-lead collisions.
In this Letter we report the results of a global QCD analysis, in which we simultaneously constrain the proton PDFs together with the higher-twist (HT) terms and the OS functions of the nucleon SFs. To this end, we use the deuterium DIS cross section data from various experiments, together with recent precision data on the $^3$He to $^3$H ratio of the DIS cross sections from the MARATHON experiment [10]. We study the interplay between the $d/u$ PDF ratio (and the related SF ratio $F_2^n/F_2^p$), the underlying model of HT terms, and the OS corrections. In particular, in this way, we constrain the proton-neutron asymmetry in the OS corrections to the SFs.

II. THEORY FRAMEWORK
For the spin-independent charged-lepton inelastic scattering, the cross sections are fully described in terms of two SFs, $F_T = 2xF_1$ and $F_2$. In the DIS region of high invariant momentum transfer squared $Q^2$, in the massless limit, SFs can be treated in terms of a power series in $Q^{-2}$ (twist expansion) within the operator product expansion (OPE). The leading twist (LT) SFs are given by a convolution of PDFs with the functions describing the quark-gluon interaction at the scale $Q$, which can be computed perturbatively as a series in the strong coupling constant $\alpha_s$ (see, e.g., [1,2]). A finite target mass produces a correction that can be treated within the OPE [11]. We can then write
$$F_i(x,Q^2) = F_i^{\rm TMC}(x,Q^2) + \frac{H_i(x,Q^2)}{Q^2} + \cdots, \tag{1}$$
where $i = T, 2$ and $F_i^{\rm TMC}$ are the corresponding LT SFs with the account of the target mass correction (TMC) [11], and $H_i$ describe the twist-4 contribution. In this study, we consider two commonly used HT models: (1) the additive HT model (aHT) motivated by the OPE, in which we assume $H_i = H_i(x)$, and (2) the multiplicative HT model (mHT) [12], in which $H_i$ is assumed to be proportional to the corresponding LT SF, $H_i = F_i^{\rm LT}(x,Q^2)\,h_i(x)$.
To address the nuclear corrections in DIS, we consider this process in the target rest frame and treat it as incoherent scattering off bound nucleons. The nuclear SFs can then be calculated in terms of the bound proton and neutron SFs integrated with the corresponding spectral functions $\mathcal{P}^{p/A}$ and $\mathcal{P}^{n/A}$ [5,6,13-15],
$$F_i^A(x,Q^2) = \int d^4p\; K_{ij}\left(\mathcal{P}^{p/A} F_j^p + \mathcal{P}^{n/A} F_j^n\right), \tag{2}$$
where $i,j = T, 2$, we assume a summation over the repeated index $j$, and $K_{ij}$ are the kinematic factors [5,16]. The integration is performed over the bound nucleon four-momentum $p$. The OS nucleon SFs depend on the scaling variable $x' = Q^2/(2p\cdot q)$, the DIS scale $Q^2$, and the nucleon invariant mass squared $p^2 = p_0^2 - \mathbf{p}^2$. This latter dependence originates from both the power TMC terms of the order $p^2/Q^2$ and the OS dependence of the LT SFs.
arXiv:2211.09514v2 [hep-ph] 13 Jun 2023
Following Refs. [5,15], we treat the OS correction in the vicinity of the mass shell $p^2 = M^2$, with $M$ the nucleon mass, by expanding SFs in a power series in $v = (p^2 - M^2)/M^2$. To the leading order in $v$ we have
$$F_i^{\rm LT}(x,Q^2,p^2) = F_i^{\rm LT}(x,Q^2,M^2)\left[1 + \delta f_i(x)\,v\right], \tag{3}$$
$$\delta f_i = \partial \ln F_i^{\rm LT}(x,Q^2,p^2)/\partial \ln p^2, \tag{4}$$
where the derivative in Eq. (4) is taken on the mass shell $p^2 = M^2$. We assume identical functions $\delta f_i$ in Eq. (4) for $F_T$ and $F_2$ based on the observation that $F_T \approx F_2$ at large $x$ values, for which the OS effect is numerically important [5,6,16,17]. We thus suppress the index $i = T, 2$ for the function $\delta f$.
The proton (neutron) spectral function $\mathcal{P}^{p(n)/A}(\varepsilon, \mathbf{p})$ describes the corresponding energy ($\varepsilon = p_0 - M$) and momentum ($\mathbf{p}$) distribution in the considered nucleus at rest. This function is normalized to the proton (neutron) number. For the deuteron, the function $\mathcal{P}^{p/D} = \mathcal{P}^{n/D}$ is fully determined by the deuteron wave function as discussed in detail in Refs. [5,16]. For the proton spectral function of $^3$He, $\mathcal{P}^{p/^3\mathrm{He}}$, the relevant contributions come from two-body intermediate states. They can be divided into two terms: the bound state, i.e., the deuteron, and the states in the continuum. The neutron spectral function of $^3$He, $\mathcal{P}^{n/^3\mathrm{He}}$, involves only the continuum states. We then have [6,18]:
$$\mathcal{P}^{p/^3\mathrm{He}} = f^{p/^3\mathrm{He}}(\mathbf{p})\,\delta(E + \varepsilon_{32} - \varepsilon_D) + F^{p/^3\mathrm{He}}(E,\mathbf{p}), \tag{5}$$
$$\mathcal{P}^{n/^3\mathrm{He}} = F^{n/^3\mathrm{He}}(E,\mathbf{p}), \tag{6}$$
where $f$ and $F$ denote the bound-state and continuum contributions, respectively. We consider the spectral function as a function of the separation energy $E > 0$, which is related to $\varepsilon$ as $\varepsilon = -E - \mathbf{p}^2/(4M)$ with $\mathbf{p}^2/(4M)$ as the recoil energy of the residual two-nucleon system, and $\varepsilon_{32} \approx -7.72$ and $\varepsilon_D \approx -2.22$ MeV are the binding energies of $^3$He and the deuteron, respectively. Similarly, for the $^3$H nucleus, the neutron spectral function involves contributions from the bound state and from the continuum states, while the proton spectral function includes only the continuum states:
$$\mathcal{P}^{n/^3\mathrm{H}} = f^{n/^3\mathrm{H}}(\mathbf{p})\,\delta(E + \varepsilon_{31} - \varepsilon_D) + F^{n/^3\mathrm{H}}(E,\mathbf{p}), \tag{7}$$
$$\mathcal{P}^{p/^3\mathrm{H}} = F^{p/^3\mathrm{H}}(E,\mathbf{p}), \tag{8}$$
where $\varepsilon_{31}$ is the binding energy of $^3$H.
We constrain the proton PDFs, the HT corrections, and the proton and the neutron OS functions, $\delta f^p$ and $\delta f^n$, in a global QCD analysis including the charged-lepton DIS data off $^1$H, $^2$H, $^3$H, and $^3$He targets, combined with the data on $W^\pm/Z$ boson production from the D0 and LHC experiments. The main datasets used in our analysis are described in Refs. [16,17].
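The two HT models and the leading-order OS expansion described above can be illustrated with a short numerical sketch. It assumes the usual twist-expansion form with the twist-4 term suppressed by $1/Q^2$, omits the TMC for brevity, and uses a toy LT SF; the function `f_lt_toy` and all numerical inputs are hypothetical and are not results of the fit.

```python
import math

M_NUCLEON = 0.938  # GeV, approximate nucleon mass

def f_lt_toy(x, q2):
    """Hypothetical LT structure function; NOT a real PDF convolution."""
    return (1.0 - x) ** 3 * (1.0 + 0.05 * math.log(q2))

def sf_additive_ht(x, q2, big_h):
    """aHT: F = F_LT + H(x)/Q^2, with H independent of Q^2."""
    return f_lt_toy(x, q2) + big_h(x) / q2

def sf_multiplicative_ht(x, q2, small_h):
    """mHT: H = F_LT(x, Q^2) * h(x), so F = F_LT * (1 + h(x)/Q^2)."""
    f_lt = f_lt_toy(x, q2)
    return f_lt + f_lt * small_h(x) / q2

def off_shell_sf(f_on_shell, delta_f, p2):
    """Leading-order OS expansion: F(p^2) = F(M^2) * (1 + delta_f * v)."""
    v = (p2 - M_NUCLEON ** 2) / M_NUCLEON ** 2
    return f_on_shell * (1.0 + delta_f * v)

if __name__ == "__main__":
    x, q2 = 0.6, 5.0
    f_a = sf_additive_ht(x, q2, lambda x: 0.02)      # toy H(x)
    f_m = sf_multiplicative_ht(x, q2, lambda x: 0.3)  # toy h(x)
    # A bound nucleon sits below the mass shell, so v < 0 here.
    print(f_a, f_m, off_shell_sf(f_a, 1.0, 0.95 * M_NUCLEON ** 2))
```

In the mHT branch the HT term inherits the $Q^2$ and isospin dependence of the LT SF, whereas in the aHT branch it depends on $x$ only.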
In addition, we employ the recent data on the ratio of the DIS cross sections of the three-body nuclei, $\sigma^{^3\mathrm{He}}/\sigma^{^3\mathrm{H}}$, from the MARATHON experiment [10]. This allows us to study the isospin dependence of nuclear corrections, and, in particular, the neutron-proton asymmetry $\delta f_a = \delta f^n - \delta f^p$. To ensure a perturbative QCD description and for consistency with the previous studies [16,20], we apply the cuts $Q^2 > 2.5$ and $W^2 > 3$ GeV$^2$.
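A data selection of this kind can be sketched as follows, reading the cuts as $Q^2 > 2.5$ GeV$^2$ and $W^2 > 3$ GeV$^2$ and using the standard relation $W^2 = M^2 + Q^2(1-x)/x$ for scattering off a nucleon at rest. The kinematic points are hypothetical and the function names are illustrative only.

```python
M_PROTON = 0.938272  # GeV

def w2(x, q2):
    """Invariant mass squared of the hadronic final state, in GeV^2."""
    return M_PROTON ** 2 + q2 * (1.0 - x) / x

def passes_cuts(x, q2, q2_min=2.5, w2_min=3.0):
    """True if the point survives both the Q^2 and the W^2 cut."""
    return q2 > q2_min and w2(x, q2) > w2_min

# Hypothetical (x, Q^2 in GeV^2) points, for illustration only.
points = [(0.5, 4.0), (0.8, 4.0), (0.5, 2.0)]
selected = [p for p in points if passes_cuts(*p)]
print(selected)
```

At fixed $Q^2$ the $W^2$ cut removes the highest-$x$ points, which is where the resonance region begins.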
The point-to-point correlations in the data are accounted for in the fit whenever available. For the MARATHON data, we combine in quadrature the published point-to-point systematic uncertainties with the statistical (uncorrelated) ones. We keep fixed the normalization of the most precise datasets, including the MARATHON $\sigma^{^3\mathrm{He}}/\sigma^{^3\mathrm{H}}$ one, and use them for the calibration of the other datasets (see Table 1 of Ref. [16]).
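As a minimal sketch of the quadrature combination, assuming fully uncorrelated point-to-point errors, the total uncertainty and the corresponding $\chi^2$ contribution can be computed as below. The helper names and the numbers in the usage lines are hypothetical; a real fit would also include correlated uncertainties through a covariance matrix.

```python
import math

def total_uncorrelated_error(stat, syst_p2p):
    """Combine statistical and point-to-point systematic errors in quadrature."""
    return math.hypot(stat, syst_p2p)

def chi2_uncorrelated(data, theory, errors):
    """Chi-square for uncorrelated points: sum of squared pulls."""
    return sum(((d - t) / e) ** 2 for d, t, e in zip(data, theory, errors))

# Hypothetical inputs, for illustration only.
errors = [total_uncorrelated_error(0.003, 0.004), total_uncorrelated_error(0.006, 0.008)]
print(errors, chi2_uncorrelated([1.10, 1.15], [1.10, 1.14], errors))
```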
The PDFs are parametrized following Ref. [20]. The $Q^2$ dependence of the LT SFs is computed to the next-to-next-to-leading order (NNLO) in perturbative QCD. The functions $H_i(x)$ in the aHT model are treated independently for $i = T, 2$ and are parametrized in terms of spline polynomials interpolating between the points $x = (0, 0.1, 0.3, 0.5, 0.7, 0.9, 1)$. A similar procedure is applied to the functions $h_i$ in the mHT model. To reduce the number of unknown quantities in our fit, we assume $H_i^p = H_i^n$ in the aHT model. We also test the assumption $h_i^p = h_i^n$ in the mHT model.
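A spline parametrization on these knots can be sketched as below. The text does not specify the spline type, so a natural cubic spline is assumed here, and the knot values in `H_AT_KNOTS` are hypothetical placeholders for the fitted parameters.

```python
from bisect import bisect_right

KNOTS = [0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0]
H_AT_KNOTS = [0.0, 0.02, 0.05, 0.03, -0.01, -0.04, 0.0]  # hypothetical values

def natural_spline_second_derivs(xs, ys):
    """Second derivatives at the knots for a natural cubic spline."""
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    # Tridiagonal system for the interior knots 1..n-1.
    sub = [h[i - 1] for i in range(1, n)]
    diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]
    sup = [h[i] for i in range(1, n)]
    rhs = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
           for i in range(1, n)]
    for k in range(1, n - 1):  # forward elimination (Thomas algorithm)
        w = sub[k] / diag[k - 1]
        diag[k] -= w * sup[k - 1]
        rhs[k] -= w * rhs[k - 1]
    m = [0.0] * (n + 1)  # natural boundary: m[0] = m[n] = 0
    for k in range(n - 2, -1, -1):  # back substitution
        m[k + 1] = (rhs[k] - sup[k] * m[k + 2]) / diag[k]
    return m

def spline_eval(xs, ys, m, x):
    """Evaluate the spline at x (clamped to the first/last interval)."""
    i = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
    h = xs[i + 1] - xs[i]
    a, b = xs[i + 1] - x, x - xs[i]
    return ((m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * h)
            + (ys[i] / h - m[i] * h / 6.0) * a
            + (ys[i + 1] / h - m[i + 1] * h / 6.0) * b)

M2 = natural_spline_second_derivs(KNOTS, H_AT_KNOTS)
print(spline_eval(KNOTS, H_AT_KNOTS, M2, 0.4))
```

With this choice the fit parameters are simply the values of $H_i$ at the seven knots, so the number of free HT parameters per SF is fixed by the knot list.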
The nuclear effects are treated using Eq. (2). In this approach, the nuclear corrections are driven by the momentum distribution, the nuclear binding, and the OS effect. It was verified [17] that other nuclear effects, such as the meson-exchange currents and the nuclear shadowing, are within experimental uncertainties; they are therefore neglected in the present analysis. We use the deuteron wave function computed with the Argonne (AV18) potential [21,22], and the $^3$He and $^3$H spectral functions of Ref. [23] computed with the AV18 nucleon-nucleon force and accounting for the Urbana three-nucleon interaction as well as the Coulomb effect in $^3$He. It was also verified that the use of the $^3$He spectral function of Ref. [24] and the $^3$H spectral function obtained from isospin symmetry, i.e., $\mathcal{P}^{p/^3\mathrm{H}} = \mathcal{P}^{n/^3\mathrm{He}}$ and $\mathcal{P}^{n/^3\mathrm{H}} = \mathcal{P}^{p/^3\mathrm{He}}$, [...]
Computing the nuclear SFs requires both an energy-momentum integration and light-cone momentum integrations inside the TMC and NNLO SFs. Such integrations significantly slow down the fitting procedure. To optimize the computing performance we treat the TMC on the NNLO SFs as $F_i^{\rm TMC} = (F_i^{\rm TMC}/F_i^{\rm LT})_{\rm LO}\, F_i^{\rm LT(NNLO)}$, i.e., the TMC is effectively applied to the leading order (LO) SFs. We verified that such an approximation has little impact on the predictions of Ref. [16] for the MARATHON data. In particular, the calculations including terms up to N$^3$LO in the QCD coupling constant [25,26] are in good agreement with such an approximation (Fig. 1). The corresponding predictions are within $1\sigma$ of the results of Ref. [16], thus allowing us to safely include these data into the present fit.
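The factorized TMC treatment amounts to rescaling the NNLO LT SF by a TMC-to-LT ratio evaluated at LO, which avoids the costly integrations at NNLO. A minimal sketch follows; the function name is illustrative and the numerical inputs are hypothetical.

```python
def tmc_approx(f_tmc_lo, f_lt_lo, f_lt_nnlo):
    """Approximate F^TMC at NNLO as (F^TMC / F^LT)_LO * F^LT(NNLO)."""
    return (f_tmc_lo / f_lt_lo) * f_lt_nnlo

# Hypothetical SF values at one (x, Q^2) point, for illustration only.
print(tmc_approx(f_tmc_lo=0.110, f_lt_lo=0.100, f_lt_nnlo=0.090))
```

The LO ratio can be tabulated once on a grid in $x$ and $Q^2$, so that only the NNLO LT SF has to be re-evaluated when the PDF parameters change during the fit.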
The function $\delta f(x)$ is determined phenomenologically from a global fit and its $x$ dependence is parametrized as [16,17]
$$\delta f(x) = a + bx + cx^2, \tag{9}$$
where the parameters $a$, $b$, and $c$ are determined simultaneously with those of the proton PDFs and HTs.$^2$ We perform a number of fits with different setups. In our default setup, we assume equal OS functions for the proton and neutron, $\delta f^p = \delta f^n = \delta f$, and the aHT model for the HT terms. With such settings, we obtain a good agreement with the MARATHON data on the ratio $\sigma^{^3\mathrm{He}}/\sigma^{^3\mathrm{H}}$ [10] with $\chi^2$ per number of data points (NDP) of 20/22, as shown in Fig. 1. Considering all data points included in our fit, we have $\chi^2$/NDP = 4861/4065. We verified that the MARATHON nuclear data do not deteriorate the description of the other datasets. In particular, we have $\chi^2$/NDP = 42/31 and 45/32 for, respectively, the 7 and 8 TeV LHCb data [27-29], to be compared with the values 45/31 and 40/32 of the analysis with no nuclear data [20]. The small difference between the present result and Ref. [20] is within the statistical fluctuations of the data.
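For orientation, the $\chi^2$/NDP values quoted above can be converted into ratios with a few lines of arithmetic. The dictionary labels are descriptive only; the numbers are those given in the text.

```python
# (chi2, NDP) pairs quoted in the text.
quoted = {
    "MARATHON 3He/3H, default fit": (20, 22),
    "all data, default fit": (4861, 4065),
    "LHCb 7 TeV, this fit": (42, 31),
    "LHCb 8 TeV, this fit": (45, 32),
    "LHCb 7 TeV, no nuclear data": (45, 31),
    "LHCb 8 TeV, no nuclear data": (40, 32),
}

ratios = {label: chi2 / ndp for label, (chi2, ndp) in quoted.items()}
for label, value in ratios.items():
    print(f"{label}: chi2/NDP = {value:.2f}")
```

The LHCb values shift in opposite directions for the 7 and 8 TeV sets (by 3 and 5 units of $\chi^2$), consistent with the statement that the change is within statistical fluctuations.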
Our results on the function $\delta f(x)$ are shown in Fig. 2 (left panel), together with the ones from Refs. [5,16]. The present results are in good agreement with the analysis of Ref. [5], in which the function $\delta f(x)$ was determined from a fit to the data on the ratios $\sigma^A/\sigma^D$ for the DIS cross sections off nuclear targets with a mass number $A \geq 4$ using the proton and the neutron SFs of Ref. [30]. Our results are also in accord with the analysis of Ref. [16], which does not include the MARATHON data from $A = 3$ nuclei. The addition of the MARATHON data on the ratio $\sigma^{^3\mathrm{He}}/\sigma^{^3\mathrm{H}}$ in the fit allows a reduction of the $\delta f(x)$ uncertainty at large $x$.
In order to study the sensitivity of our results to the functional form of $\delta f(x)$, we performed a fit with the term $x^3$ included in Eq. (9) and verified that this does not improve the fit accuracy. The KP error band in Fig. 2 includes systematic uncertainties from the functional form as well as from the nuclear spectral function.
$^2$ The correlation matrix is available upon request.
[Fig. 1 caption, beginning missing] ... [10]. Also shown are the $1\sigma$ uncertainty band of the analysis [16] (AKP21) performed without the MARATHON data (shaded area) and a variant of the AKP21 predictions with the terms up to N$^3$LO in the QCD coupling constant [25] accounted for in the TMC (dashed line).
The results presented in Figs. 1 and 2 (left panel) are obtained assuming an isospin-symmetric function $\delta f^p = \delta f^n$ and the aHT model for the HT terms. The validity of this approximation was verified in the analysis of the nuclear SFs (EMC effect) in Ref. [5]. The same approximation was also used in Refs. [16,17]. In this study we use the MARATHON data on the $^3$He and $^3$H nuclei to constrain the asymmetry $\delta f_a = \delta f^n - \delta f^p$. To this end, we perform a fit in which $\delta f^p$ is parametrized by Eq. (9), while for the neutron-proton asymmetry we assume a linear function, $\delta f_a(x) = a_1 + b_1 x$. For $\delta f^p$ we obtain a result similar to that of the isospin-symmetric fit shown in Fig. 2 (left panel). The corresponding asymmetry $\delta f_a$ is in broad agreement with zero, see Fig. 2 (right panel).
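The asymmetric setup can be sketched as follows, assuming that Eq. (9) is a quadratic polynomial in $x$ for the proton function and that the neutron function is obtained by adding the linear asymmetry. All parameter values are hypothetical and are not the fitted ones.

```python
def delta_f_proton(x, a, b, c):
    """Proton OS function, assumed quadratic in x: a + b*x + c*x^2."""
    return a + b * x + c * x * x

def delta_f_asym(x, a1, b1):
    """Neutron-proton asymmetry, linear in x: a1 + b1*x."""
    return a1 + b1 * x

def delta_f_neutron(x, proton_params, asym_params):
    """Neutron OS function: proton function plus the asymmetry."""
    return delta_f_proton(x, *proton_params) + delta_f_asym(x, *asym_params)

# Hypothetical parameters, for illustration only.
proton_params = (-0.1, 1.0, 2.0)
symmetric = delta_f_neutron(0.5, proton_params, (0.0, 0.0))
asymmetric = delta_f_neutron(0.5, proton_params, (0.1, -0.4))
print(symmetric, asymmetric)
```

Setting $a_1 = b_1 = 0$ recovers the isospin-symmetric default fit, so the asymmetric fit adds only two free parameters.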
We also studied the sensitivity of the functions $\delta f(x)$ and $\delta f_a(x)$ obtained in the fit to the underlying model for the HT terms. In the default fit (i.e., fixed $\delta f_a = 0$), we found identical $\delta f(x)$ within uncertainties for both the aHT and the mHT models. For this reason, we only show the results of the aHT model in the left panel of Fig. 2. However, the results on the function $\delta f_a$ differ substantially in the aHT and mHT models, as shown in the right panel of Fig. 2.
It should be emphasized that in the mHT model the HT terms are directly correlated with the LT SFs. As a result, the functions $H_i = F_i^{\rm LT}(x,Q^2)\,h_i(x)$ depend on the scale $Q$, making comparisons between the aHT and mHT models sensitive to the data selection and the kinematic cuts. The HT terms obtained in the present study are similar to those of Ref. [16] (see Fig. 5 in [16]). Note that the factor $F_i^{\rm LT}$ introduces a nucleon isospin dependence in the HT terms even if $h_i^p = h_i^n$. Therefore, the nonzero asymmetry $\delta f_a$ in this model (right panel of Fig. 2) may partially compensate the isospin dependence of the HT terms from the factor $F_i^{\rm LT}$.
A good description of the MARATHON $F_2^n/F_2^p$ data for $x \leq 0.7$ is obtained for both the aHT and the mHT models (see Fig. 3). However, for larger $x$ the aHT model provides a better description of the data. The total value of $\chi^2$/NDP = 20/22 for our default fit with the aHT model is to be compared with the corresponding value 34/22 of the mHT model.
It is instructive to compare the PDF ratio $d/u$ obtained with different HT models. This comparison, shown in Fig. 4 for the kinematics of the MARATHON experiment, indicates that the ratio $d/u$ at large $x$ is significantly higher in the mHT model. Figure 4 also shows the ratio $d/u$ from the analysis of Ref. [31] (ABMP16), which was performed with the aHT model but without any nuclear data. In this case, the ratio $d/u$ is mostly constrained by forward $W$-boson production data from the LHCb [27-29] and D0 [32] experiments. The ABMP16 result is in good agreement with the present one obtained with the aHT model. Instead, for the mHT model we have a significant enhancement of the ratio $d/u$ at large $x$, which appears to be correlated with the nonzero values of the asymmetry $\delta f_a$ (cf. Figs. 2 and 4). This observation demonstrates a tension in obtaining a simultaneous description of the DIS and Drell-Yan data in the mHT model.
[Figure caption, beginning missing] ... Also shown are the results of the ABMP16 analysis [20] (left-tilted hashed area), which does not include any nuclear data.

IV. DISCUSSION AND OUTLOOK
In summary, we obtain a good description of data with the simple assumption of isoscalar HT contributions in the aHT model. From a QCD analysis of the precise MARATHON data on the $^3$He and $^3$H mirror nuclei, we obtain the same OS function for both protons and neutrons within the uncertainties. This nucleon OS function is consistent with our former observations from the global QCD analyses including $^2$H DIS data [16,17], as well as with the analysis of the nuclear DIS data with $A \geq 3$ [5,6]. Furthermore, the resulting $d/u$ ratio for the proton is similar to the one obtained in Ref. [20] without the use of any nuclear data. The addition of DIS data from $^2$H, $^3$He, and $^3$H targets in the present QCD analysis allows a significant reduction of the uncertainty on the proton $d/u$ ratio at large $x$.
In contrast with the aHT model, in the mHT model the HT terms are different for protons and neutrons, due to a correlation with the LT terms. In the mHT model we find a nonzero neutron-proton asymmetry in the OS function. The ratio $d/u$ at large $x$ is correspondingly enhanced in the mHT model as compared to that in the aHT model. These results are driven by the MARATHON $^3$He/$^3$H data and originate from the interplay between the LT and HT terms in SFs, which is inherent to the mHT model. We therefore conclude that this feature of the mHT model can lead to potential biases and inconsistencies. Furthermore, the recent MARATHON data clearly prefer the aHT model over the mHT one with $\chi^2$/NDP = 20/22 vs 34/22. The interplay of the OS function with the $d/u$ ratio and the HT terms that we observe in the context of the mHT model can shed some light on the recent claim about isovector nuclear EMC effects from a global QCD analysis [33]. These results also appear to be driven by the MARATHON data on $^3$He and $^3$H within the mHT model, as discussed earlier [16]. In the absence of an explicit isospin dependence of the $h_i(x)$ terms, the HT contributions to the $^3$He/$^3$H ratio cancel out in the mHT model. We therefore expect similar biases in analyses of the MARATHON $^3$He/$^3$H ratio based on the LT approximation to SFs [34].
Future precision cross section measurements with $^2$H, $^3$H, and $^3$He targets in a wide kinematical region would further allow us to address the HT model and to constrain the isospin dependence of nuclear effects at the parton level. These would include future flavor-sensitive DIS data at the electron-ion collider [35] and from both neutrino and antineutrino charged-current interactions with hydrogen and various nuclear targets [36,37] at the long-baseline neutrino facility [38].