Testing momentum dependence of the nonperturbative hadron structure in a global QCD analysis

We discuss strategies for comparisons of nonperturbative QCD predictions for parton distribution functions (PDFs) with high-energy experiments in the region of large partonic momentum fractions $x$. Analytic functional forms for PDFs cannot be uniquely determined solely based on discrete experimental measurements because of a mathematical property of mimicry of PDF parametrizations that we prove using a representation based on B\'ezier curves. Predictions of nonperturbative QCD approaches for the $x$ dependence of PDFs instead should be cast in a form that enables decisive comparisons against experimental measurements. Predictions for effective power laws of $(1-x)$ dependence of PDFs may play this role. Expectations for PDFs in a proton based on quark counting rules are compared against the effective power laws of $(1-x)$ dependence satisfied by CT18 next-to-next-to-leading order parton distributions. We comment on implications for studies of PDFs in a pion, in particular on the comparison of nonperturbative approaches with phenomenological PDFs.


I. INTRODUCTION
Quantum Chromodynamics (QCD) governs interactions of strongly interacting particles and predicts existence of bound states of hadronic matter. The scale dependence of the QCD coupling constant leads to the existence of two regimes of the strong interaction -the nonperturbative and perturbative -that result in formation of hadronic bound states at low energies and in quasi-free interactions of QCD partonic degrees of freedom at high energies. While both regimes require complex theoretical treatments, the perturbative regime of QCD has the advantage of rendering calculable predictions using the expansions in small parameters such as the inverse hard energy scale of the process and small QCD coupling constant. Experiments at the Large Hadron Collider and other facilities test the perturbative phase of QCD to high accuracy.
As for the nonperturbative regime, powerful approaches characterize internal dynamics of hadrons based on the models of the hadronic wave function (or the bound-state amplitude) and the effective Lagrangian approaches incorporating emergent low-energy symmetries. The pioneering studies of the hadron structure in nonperturbative models (see, e.g., [1,2]) have paved the way for recent rapid advancements, driven particularly by discretized (lattice) QCD [3,4] as well as by other approaches such as analytical representations (e.g. [5]) or Schwinger-Dyson formalism, e.g. [6]. Of particular interest to these studies are parton distribution functions (PDFs) -universal functions quantifying probabilities for finding partons in a fast-moving hadron probed at a factorization scale µ 1 GeV. The highly challenging computation of a PDF for an arbitrary parton becomes more amenable when the parton carries a significant fraction x of the hadron's momentum, of order 0.1 or more. Recent nonperturbative/lattice computations provide many predictions for PDFs at x → 0.1, as well as of the Mellin moments dominated by large-x PDFs [4]. At x > ∼ 0.5, the flavor dependence of the proton wave function further simplifies, with only the up and down quarks having the appreciable PDFs. Furthermore, under specific conditions outlined in Sec. II, notably requiring that the parton entering the hard scattering carries nearly all of the hadron's momentum (i.e., when x → 1), the proton wave function right before the hard scattering may reduce to a small number of simplest quasi-free partonic Fock states, which in turn may allow one to predict the x → 1 asymptotics of the PDFs probed at sufficiently high µ. This physical picture gives rise to the famous quark counting rules [7][8][9]. The x dependence of the PDFs at x > 0.5, the topic of interest for this paper, may open avenues for appraising the predictiveness of nonperturbative approaches.
On the phenomenological side, the PDFs are used to predict long-distance contributions to hadronic cross sections, when combined with perturbative parton scattering cross sections. Precise phenomenological parametrizations of PDFs [10][11][12][13][14][15][16][17] for unpolarized protons are determined by performing the global QCD analysis of experimental measurements. A global QCD analysis is a large-scale study involving fits of parametrized PDFs to various experimental data sets in the framework of perturbative QCD (PQCD). Flexible functional forms for these PDFs are fitted to measured cross sections in diverse high-energy processes, such as deeply inelastic scattering and production of vector bosons and jets. We will focus primarily on unpolarized proton PDFs, as they are best constrained experimentally, although baryons are not the simplest particles from the nonperturbative point of view.
Phenomenological PDFs can provide (and have provided) useful guidance for models of nonperturbative dynamics, e.g. by identifying the energy scale where the model is applicable [18,19]. However, comparisons of model and phenomenological PDFs must not be done uncritically. On the one hand, the most common M S PDFs enter a factorized approximation for the hadronic scattering cross section that is valid up to process-dependent power-suppressed terms. While the nonperturbative computations predict the PDFs in a free hadron, the initial-state hadrons in high-energy scattering processes are not truly free. They interact with other participating particles through soft QCD interactions. Scale-dependent PDFs provide a logarithmic approximation to the process of collinear parton showering in the initial state. The full radiation pattern also depends on particle masses and kinematic constraints. The QCD factorization formulae approximate the hadronic cross sections in simple inclusive processes in a way that accounts for soft and collinear contributions, and neglects numerically small mass terms. The relation of these formulae to the nonperturbative PDFs includes power-suppressed terms that are not controlled to the necessary extent.
On the other hand, the PDFs enter the fitted cross sections through elaborate, flavor-dependent convolution integrals and are determined from complex experimental measurements. The global analysis of PDFs relies on the critical assumption of universality, that the PDFs do not depend on the hard-scattering process. Multiple factors contribute to the final PDF uncertainty, as detailed, for instance, in the recent reviews [20,21]. The question then arises, to which extent the phenomenological PDF analyses can genuinely reproduce the features reflecting nonperturbative dynamics.
To illustrate these issues, we will revisit a classical problem in the PDF analysis, determination of the power laws that govern the falloff of PDFs as x approaches one. Quark counting rules (QCRs), reviewed in Sec. II, are one of the earliest predictions that the valence PDFs in the proton fall off roughly like (1 − x) 3 both for the up and down quarks. Remarkably, the power law predicted by the QCRs, as well as the asymptotic behavior of the d(x)/u(x) at x → 1 -a consequence of the spin-flavor extension of the original QCRs [22] -are consistent with the behavior of the actual phenomenological PDFs, although alternative behaviors are not ruled out.
We will examine this (1 − x) falloff in the recent CT18 next-to-next-to-leading order global analysis [23]. (For an informative study of the empirical small-x and large-x power laws in the NNPDF fit, which follows a different methodology, see Ref. [24].) We start in Sec. II A by briefly reviewing the rationale for the QCRs and by stating the conditions under which the QCRs are expected to hold. Next, we revisit the connection between the PDFs and factorized formulas for fitted hadronic cross sections in Sec. II B.
Can the power laws predicted by the QCRs be tested by experimental hadron measurements? We address this question by presenting a mathematical argument in Sec. III A to show that the polynomial functional form of the structure functions or PDFs cannot be uniquely determined from experimental observations. This conclusion follows from basic properties of polynomial interpolation. As an alternative to the reconstruction of the analytic form, we introduce an empirical power law exponent for the proton PDFs, as defined in Eq. (27). The empirical exponent allows a phenomenologist to reliably confront theoretical models with the observed behavior of hadronic cross sections. We address the dependence of the empirical power law exponent on the functional form of the PDFs, factorization scale, and the type of the scattering process in the remainder of Sec. III.
Functional forms of phenomenological PDFs incorporate various assumptions about the asymptotic behaviors of the PDFs at x → 0 and 1, including flavor dependence. It is important to understand compatibility of these assumptions with the experimental data. This study also serves as a sandbox problem illustrating broad aspects of comparisons of phenomenological PDFs with the large-x predictions. In particular, investigations of the x dependence of the PDFs in pions and kaons have been proposed as a powerful test to understand mechanisms for the emergence of the hadronic mass [25]. Yet modern pion PDF analyses [26,27] arrive at varied conclusions about the validity of QCRs for the pion, or they may even appear to be at odds with expectations from nonperturbative approaches [28]. In Sec. IV we comment on the reconciliation of the physical pictures arising from the perturbative and nonperturbative descriptions of QCD and on implications for pion PDF studies. The QCRs for a structure function F (x B , Q 2 ) in lepton-hadron deeply inelastic scattering arise from the parton model in gauge theories with small quark-boson coupling constants, such as QED or asymptotically free QCD. Consider a Feynman diagram in Fig. 1(a) in such a weakly coupled theory with massless quarks. The diagram corresponds to scattering of a virtual photon γ * (q) on a highly boosted "proton" p(P ) whose lowest Fock state entering the hard scattering (at momentum resolution scales somewhat below Q 2 ≡ −q 2 ) consists of three weakly bound quarks.
[Alternatively, we could consider scattering on a "meson" consisting of a quark and an antiquark.] φ is the low-energy (long-distance) part of the hadronic wave function, describing the binding of quarks into the hadron at virtualities much less than Q 2 . H, the hard-scattering subgraph of the diagram, can be approximated by the quark-photon bag diagram (the squared tree-level amplitude of the γ * q scattering) if all couplings are small. The diagram in Fig. 1(a) dominates the cross section when the γ * p scattering energy W 2 = Q 2 (1/x B − 1) + m 2 p barely exceeds the mass m 2 p of the initial proton. This regime corresponds to the maximal Bjorken variable, x B ≡ Q 2 /(2P · q) → 1. The contribution of this diagram to the structure function at x B → 1 behaves as with n s being the number of spectator partons (two for a baryon and one for a meson); and λ A and λ q denoting helicities of the parent hadron and active (struck) quark. For spin-averaged proton and pion structure functions, we obtain the limits The (1 − x B ) power law for F 2 (x B , Q 2 ) thus arises when the (n s + 1)-quark Fock state dominates in the x B → 1 limit. In this picture, the (1 − x) falloff is driven primarily by semi-hard gluon propagators binding the (n s + 1) quarks before the hard scattering, on the top of long-distance binding effects included in the nonperturbative wave function φ. The QCRs were initially demonstrated based on the examination of leading perturbative diagrams [22,29,30] as well as analyticity of partial-wave amplitudes [31], and including helicity dependence as in Eq. (1) [22]. They are also expected to apply in various nonperturbative approaches, see examples in Sec. II B. Adding even more gluon propagators to the graph in Fig. 1(a) suppresses the rate both by additional powers of (1 − x B ) and by additional factors of the coupling constant. The term of order (1 − x B ) in curly brackets in Eq. (1) arises from these higherorder radiative contributions. The extra suppression power can be found by counting the added propagators. Such contributions also introduce anomalous dimensions that make the (1−x B ) exponent dependent on the renormalization scale Q [31,32].
We thus see that the QCRs are mostly directly formulated for the structure functions; QCRs for the PDFs are discussed below. Sec. III A argues on general grounds that the power law introduced by the QCRs in Eq. (2) cannot be directly tested. Instead, we construct an effective power-law exponent that can be compared against experimental measurements, as discussed in Sec. III B. In the case of proton structure functions, Sec. III D demonstrates that the effective power-law exponent predicted by the QCRs is compatible with the global QCD analysis of hadronic scattering data. See, in particular, the discussion of Fig. 3.

QCRs for electromagnetic form factors
The described physics picture also applies to exclusive processes [7][8][9]. Drell-Yan-West duality [33,34] relates deep inelastic structure functions near the threshold, x B → 1, to elastic electromagnetic form factors at large momentum transfer, Q 2 1 GeV 2 . The scattering contributions that dominate the large-x B limit of the DIS structure function are thus expected to drive the 1/Q 2 falloff of the elastic form factor [29,31]. For example, in the parton model, the inelastic structure function νW 2 (x B ) and electromagnetic form factor F 1 (Q 2 ), when both represented in terms of the target wave function, are related through position-dependent parton distributions which have an explicit parton density interpretation [31], implying the interdependent power laws for νW 2 (x B ) and F 1 (Q 2 ). More generally the structure functions and form factors are related via impact-parameter generalized parton distributions for the x region for which DGLAP evolution applies (see, e.g., Ref. [35]).

QCRs for parton distribution functions: DIS scheme
In real-life QCD, computation of high-energy hadronic cross sections involves factorization of long-distance and short-distance QCD radiative contributions. The power law fall-off of factorized structure functions translates into the fall-off of PDFs f a (x, Q 2 ) at large light-cone momentum fractions x (which differ from the Bjorken variable x B starting at the next-to-leading order).
The connection is most transparent in the DIS factorization scheme [36], where the neutral-current DIS structure function is given by the charge-weighted sum of quark parton distributions to all orders in the QCD coupling constant α s : For valence-quark DIS PDFs in the proton, this implies the same fall-off power as in Eqs. (1) and (2), where A 2i = 3 for i = u and d, if the diagram in Fig. 1(a) dominates. Note that the predicted falloff power is the same for valence up and down quarks.
Starting at the next order in α s , the cross sections receive significant contributions from QCD radiation. DGLAP evolution equations [37][38][39][40] implement a collinear approximation for initial-state QCD radiation, valid when Q 2 is much larger than 1 GeV 2 . The gluon and sea (anti)quark PDFs are generated by collinear radiation off the valencequark lines, with the respective lowest-order diagrams shown in Figs. 1(b) and (c). The splitting functions for the q → g and g → q splittings in these diagrams are By computing the convolutions of P g←q (x) and P q←g (x) with the leading term of the valence PDF, Eqs. (3), 4), one determines the lowest-order estimates for the falloff exponents for sea (anti)quark and gluon PDFs in the proton: Fig. 1(b); (7) A 2i = 5 forī =ū,d, ... , Fig. 1(c).
If, instead, we solve the DGLAP differential equations for the scale evolution, we exponentiate the cumulative effect of collinear splittings from all α s orders. The solution introduces anomalous dimensions for the leading asymptotic powers A 2i (Q 2 ). In QCD, the respective anomalous dimensions are positive [24]; A 2i (Q 2 ) grow with Q 2 . For an arbitrary Q 2 , the QCRs thus predict A 2i (Q 2 ) ≥ 3, 4, and 5 (9) for the valence, gluon, and sea quark PDFs in the proton, respectively. The errors in these estimates critically depend on the size of radiative corrections to the lowest Feynman diagrams in Figs At x → 1, the main channel of QCD radiation is due to emission of gluons off valence quarks, as described by the nonsinglet DGLAP equation for the valence quark PDFs. Contributions with radiation off initial-state sea (anti)quarks and gluons are strongly suppressed by smallness of their respective PDFs. The Q 2 dependence of A 2i (Q 2 ) computed based on the DGLAP equations thus reflects the magnitude of higher-order corrections beyond the simplest QCR picture. Figure 1(d) presents an example of a contribution that is normally not discussed in the derivations of the quark counting rules, but may be an important part of the inclusive F (x B , Q 2 ) at relatively low Q 2 . In this diagram, the photon splits into a qq dipole that interacts with the hadronic state. For highly virtual photons, this contribution can be estimated perturbatively. At Q 2 of order 1 GeV 2 or less, the resolved photon contribution is independent from Fig. 1(a) and is not small. It requires an independent "resolved photon" PDF. 4. QCRs for parton distribution functions: M S scheme Modern phenomenological PDFs are provided predominantly in the M S factorization scheme, which offers a number of advantages compared to the DIS factorization scheme. The M S PDFs are defined in a process-independent way as summarized in the next subsection. In the M S scheme, the leading-power Feynman integrals for DIS inclusive cross sections are given by convolutions of perturbative Wilson coefficients H a and nonperturbative PDFs f a/A (where A = p or π) according to Eq. (10). In Fig. 1, factorization of Feynman subgraphs is indicated by the horizontal double dotted lines separating the hard and PDF parts. We expect roughly the same falloff powers for the M S PDFs as in Eq. (9). Differences between the DIS and M S factorization schemes start at the next-to-leading order in α s .

B. Large-x behavior of QCD processes
We see that the QCRs reflect a simplified picture of hadron scattering, in which cross sections near the elastic limit are dominated by the lowest-order diagrams like those in Fig. 1. Is this simple depiction tethered to realistic measurements?
Our view is that an arbitrary hadron scattering process is likely to include substantial violations of the QCRs. Thus, the QCRs need not be precisely obeyed by all processes included in global PDF fits. However, there may be processes where the kinematics favors the dominance of the lowest diagrams, and the QCRs are more closely followed. Specifically, when the initial hadron is "minimally perturbed" by the hard scattering, the higher-order Fock states may be better suppressed in the elastic limit. Several factors may indicate the "minimally perturbed" regime, including the smallness of the QCD coupling constant and vanishing Q 2 dependence of the effective power laws preferred by the experimental measurement.
The key assumption of the QCRs, that the (1 − x) dependence is determined mostly by scattering off a few quarks knocked out of the parent hadron, suggests two possible conditions under which the QCRs may hold. First, a weakly bound incoming state may be required, so that only the diagram with the minimal number of semi-hard propagators gives an appreciable rate in the elastic limit. In that case, each additional perturbative vertex introduces a large suppression factor into the scattering rate.
Second, the hadron-parton vertex described by a bound-state amplitude φ reflects the low-energy dynamics, i.e. the long-distance interaction that cannot be approximated by a few (semi-)hard gluons. The respective part of the hadronic correlator function can be evaluated consistently in a fully nonperturbative approach to hadron binding, such as MIT bag (e.g. [1]), Isgur-Karl (e.g. [41]) or chiral quark soliton models (e.g. [42]) for the proton, Nambu-Jona-Lasinio or chiral quark models (e.g. [2]) as well as Schwinger-Dyson equations (e.g. [43]) for the pion. In such approaches, the proton correlator is computed starting with the lowest-energy bound states consisting of three quark fields. The coupling here is large, but the analyticity of partial-wave amplitudes [31] indicates that the power-law falloff may be realized in a variety of theories that lead to asymptotic freedom at short distances. The dominance of final Fock states with lowest parton multiplicities is essential for realizing the QCRs in both cases. In the latter case, the QCRs may be more evident in a subsample of DIS events in which the final-state hadron multiplicity is low.
Exclusive processes like the deeply virtual Compton scattering with a photon that minimally perturbs the proton, and with the QCD coupling constant and mass terms tuned down, should satisfy the QCRs, as shown in the original derivation of Brodsky and Farrar. In particular, we must assume that the QCD radiation is weak enough so that the excited intermediate Fock states with five or more partons (or equivalently, the correlator contributions with sea partons, or with disconnected topologies) are negligible. The QCRs do not directly hold if there are excited Fock states.
Neither picture -a weak coupling or a minimally perturbed hadron-applies automatically in typical experimental measurements used in the proton PDF fits. Indeed, a typical inclusive hadronic observable used to determine the phenomenological PDFs includes high-multiplicity events. But we may spot the trace of the QCRs in some kinematic regime, when final-state multiplicities are small, and the impact of other corrections is minimal.
We will further argue in Sec. IV B that the conditions supporting QCRs may be easier to achieve in pion scattering than in nucleon scattering.
We take neutral-current DIS on a proton as an example. In this process, two scales control the QCD radiation, the photon-proton center-of-mass energy W 2 (equal to the invariant mass squared of the hadronic final state) and the photon virtuality Q 2 . For any reasonable choice of W 2 and Q 2 -with the Bjorken regime limited by There is no region of W 2 and Q 2 where both the QCD coupling α s (Q 2 ) is small, and initial-state radiation into final states with more than three partons can be neglected. Now consider three relevant kinematic regions of DIS: • In the elastic limit, i.e., when W 2 → m 2 p ∼ 1 GeV 2 , the proton mass m 2 p is not negligible. The relevant three quark degrees of freedom are not massless and free, and some modification of the original Brodsky-Farrar motivation is necessary.
• When W 2 increases up to about 4 GeV 2 , the proton mass terms eventually become negligible, but the behavior of the DIS cross section is initially complex in this region because of the resonant contributions. Globally, one may expect that the picture based on scattering of quasi-free partons approximates the DIS cross section on average because of the Bloom-Gilman parton-hadron duality [44]. However, over small intervals of W 2 , the cross section can exhibit very complex resonant behavior that does not satisfy the QCRs [45,46].
• At even higher W 2 , the power-suppressed terms become small. The leading-power contribution dominates a DIS structure function F (x B , Q 2 ) and can be factorized in terms of the PDFs f a/p and coefficient functions H a as where H a consists of a delta function for quark a at the zeroth order of α s and of respective higher-order radiative contributions for a = q, g at higher orders. The quark PDFs f a/p , with their dependence on the partonic momentum fraction x and factorization scale µ, are defined in the MS scheme as where is the Wilson eikonal line, and we have used the light-cone coordinates, see e.g. Ref. [21]. At these W 2 and Q 2 , we can finally talk about scattering on nearly independent initial-state partons, which nevertheless feel some long-distance interaction with other particles mediated by long-wavelength gluon fields. The factorization formula captures this interaction in two places, through the insertion of the eikonal line W (y − , 0) in f a/A (x, µ 2 ) to approximate the interaction of the initial-state quark field with the soft gluon field A(y) connecting to the other particles, and through non-factorizable terms in the power-suppressed correction O(M/Q).
The inelastic cross section grows quickly in this region of W 2 , indicating that final states with multiple partons are now easily produced. This effect is captured in the leading-power logarithmic approximation by the scale dependence of f a/A (x, µ 2 ). These multi-parton final states violate the naive prediction of the QCRs. One indication of this violation is significant Q 2 dependence of the effective power law.
Threshold resummation. The collinear factorization formula (10) is based on a highly non-trivial proof [47,48] that separates the leading-power convolution integral from power-suppressed terms O(M/Q) such as target-mass corrections. The collinear formula is perturbatively stable when W 2 is of order Q 2 . When x → 1, the inclusive DIS cross section becomes sensitive to soft interactions among various particles that are not necessarily associated with the PDF(s). A different factorization formula, including a soft exponential factor, replaces the collinear factorization (10) in this limit. Soft radiation can be reliably approximated by a resummed all-order series of large logarithms if Q is much larger than 1 GeV. At Q of a few GeV, when the perturbative logarithms are not large, the threshold behavior is most sensitive to the nonperturbative part of the soft factor that should be fitted together with the PDFs. In either case, radiation of multiple soft partons modifies the x dependence of the DIS and DY cross sections at x → 1 as compared to the QCR-based estimates.
QCD factorization for other processes. To determine phenomenological functional forms for M S PDFs of various flavors, a global QCD analysis includes a comprehensive combination of experimental measurements in DIS, production of lepton pairs, jets, tt pairs, and other processes. As in the case of DIS, the connection between PDFs and inclusive cross sections relies on factorization theorems, and those are known with less confidence for more complex measurements. The cross sections used in the PDF fits are usually evaluated at a fixed order in α s and often neglecting power-suppressed terms.
The QCRs demonstrated for inclusive DIS cross sections do not translate automatically to the other processes. For example, while the leading-power collinear factorization for the Drell-Yan (DY) pair production cross section, is structurally similar to that in inclusive DIS, as given in Eq. (10), the underlying scattering processes and factorization proofs are drastically different between two processes. In the Drell-Yan process, the underlying hadronic activity from multiperipheral scattering of two parent hadron remnants plays a far more prominent role and creates difficulties in proving the factorization [47]. The parent hadrons are more perturbed by soft interactions in hadron-hadron scattering than in DIS. Any factorization formula holds up to power-suppressed terms which are different in the collinear and threshold factorization formalisms, and which are different at some level in DIS and DY, or between different hadron and heavy nuclei parent species. 1 To summarize, in realistic QCD processes that determine the PDFs, the large-x behavior is modified compared to the predictions of the massless parton model. The scope of modifications in the large-x power laws introduced by higher Fock states, mass terms, resonant contributions, and nuclear effects varies by the scattering process. It is reasonable to expect large modifications of parton-model predictions in events with high final-state parton multiplicities. Still, there can be situations that are close to realizing the assumptions that underlie the QCR's, such as when one tests the internal structure of a meson or looks at a subsample of DIS events with low final-state hadronic multiplicities.

III. TESTING LARGE-x PDFS IN EXPERIMENTAL MEASUREMENTS
A. Bézier curves as polynomial interpolations of discrete data Models of the hadron structure make concrete predictions for the x dependence of the structure functions and PDFs. One can straightforwardly check the agreement of a given model with an experimental observation within the uncertainties. A stronger assertion, that the experiment demands the 1 − x dependence of the PDFs to follow a specific power law, is difficult to demonstrate since the functional forms of the PDFs are not known exactly. This is clearly not possible in the presence of local or resonant structures that disagree with the global trend. Even when the PDF functional forms are restricted to be polynomial, the discrete experimental data can be compatible with multiple functional forms.
To illustrate why, consider an idealized example, in which we seek a polynomial function f (n) (x) of degree n to interpolate k + 1 data points {x 0 , p 0 }, {x 1 , p 1 },..., {x k , p k } that have no uncertainty. Our points satisfy 0 ≤ x i ≤ 1. From mathematics, we know that the existence and number of the interpolating solutions depend on the degree n of the polynomial.
If n = k, the unisolvence theorem guarantees that there exists a unique interpolating polynomial going through all points: f (n) (x i ) = p i . Two equivalent closed-form solutions for the interpolating polynomial are given by the Lagrange polynomial, and by a Bézier curve of degree n, constructed from Bernstein basis polynomials Denote the vector B (n) (x i ) as B. This vector can be written in a matrix form [50,51], where C ≡ c l ; and T ≡ t ip with t ip = x p i . Here i runs from 0 to k, and l, p run from 0 to n. Given the matrix P ≡ p i of data values, the matrix C for the Bézier curve B (n) (x) going through all points satisfies [51] This equation shows that k+1 data points uniquely determine the polynomial of order n = k, assuming no experimental errors. If n < k, an interpolating solution that goes through all points may not exist. Rather, there is a Bézier curve that minimizes the total squared distance to p i , The matrix of the coefficients of this Bézier curve is The total squared distance from this special curve to p i is min χ 2 (P, B) = P T · K T · K · P for n < k, where If we set n = k in Eq. (21), it reduces to Eq. (19). We also get K = 0 and min χ 2 (P, B) = 0. Finally, if n > k, an infinite number of polynomial solutions have min χ 2 (P, B) = 0. They can be constructed by adding n − k arbitrary points to the Lagrange polynomial (14) found for n = k.
Equations (19) and (21) for the coefficients of the Bézier curve are readily solvable and can be used to explore strategies for experimental determination of the x dependence of the PDFs. The numerical solution for the interpolating polynomial is generally unstable for large n. The Bézier form is more stable compared to the other forms, sometimes allowing us to get stable interpolation for n as large as 10 or 15. In our equations, numerical instabilities may arise from the inversion of matrix T if T is ill-conditioned when n is large (especially if k > n), or when some points are spaced too closely in x.
Once found from the data, the Bézier curve (15) can be expanded into the monomial (Taylor) power series of (1−x), which in turn can be compared against the predictions of the quark counting rules. The QCRs discussed in Sec. II might predict that the empirically found coefficientsc l vanish when l ≤ A 2 (Q). Or the whole set of c l orc l can be compared against predictions of a given model. This comparison is impeded, however, by large cancellations between the terms with highest powers l in the monomial expansion (24) when p i values are sampled from a realistic PDF shape. The high-l monomial terms tend to have alternating signs when interpolating such p i . The monomial components with low l, signifying the cutoff by the QCRs and sensitive to the high-l cancellations, vary significantly depending on the range and spacing of x i . Figure 2 illustrates this feature by comparing 9 points sampled from the functions f (x) specified in the plots to the interpolation using the Bézier curve B (n) (x) constructed according to Eqs. (19) and (21). We also show truncated monomial approximations for this curve,  For a more elaborate input function assumed in Figs. 2(b,c,d), we must resort to B (8) (x) with n = k = 8 to obtain interpolation that goes through all points p i according to Eq. (19). One can see from the figures that linear terms (N = 1) are present in the respective monomial expansions, even though the input function does not contain linear terms. The slopes of the N = 1 terms are different in the three panels, the monomial expansion converges too slowly for x < 0.8. The coefficients of the linear and higher-l terms depend on the ranges and spacings of x i , which are varied in Figs. 2(b,c,d). The ambiguity in the N = 1 term is introduced by the correlation of the coefficientc 1 with the coefficientsc l with high l. To pin down the linear term, we must restrict the fit to the highest portion of the x range, such as x > 0.9. It appears that finding the low-l monomial terms with good accuracy requires one to fit precise data in the interval that extends closely to the end point x = 1, where higher-twist effects are likely important. With typical input PDF shapes and fewer than about 10 input points, we find that interpolation by Bézier curves is numerically stable over the range x 0 ≤ x ≤ x k : the Bézier curve goes through all input points and may differ from the input function in high-order terms between the points. Outside of the x range covered by the data points, extrapolation can be unstable.

B (4) [x]
N=0 and 1 Fits by B (8 )  Fits by B (8 )  (c,d) Same as (b), for different ranges and spacings of x covered by the sampled points.

B. Effective large-x exponent
The example in Fig. 2 demonstrates that mimicry of the fitted functional forms impedes determination of the lowest powers in the monomial (1 − x) expansion even in an idealized fit to a few "data" points without uncertainties. An interpolation or fit by a high-degree polynomial may render terms with low powers of (1−x) that are not present in the fitted function, and which depend on how the data are sampled. In QCD, there is no reason to expect that coefficients c l for high powers of (1 − x) are suppressed in the PDFs. Statistical and systematic errors in the measurements also get in the way of the determination of the analytic (1 − x) dependence.
The remainder of the article will follow a less pretentious path. Predictions of QCR's and various nonperturbative models suggest that, in the x (B) → 1 limit, the structure functions or PDFs, denoted collectively as F(x (B) , Q 2 ), behave as where Φ(1 − x (B) ) is a slowly varied function. Based on this observation, it is natural to define with the expectation that A eff 2 ≈ A 2 when the logarithmic derivative of Φ(1 − x (B) ) is small. 2 We will compare theoretical predictions for A 2 presented in Sec. II with the A eff 2 F(x (B ), Q 2 ) values obtained from a phenomenological PDF ensemble.
C. The CT18 PDF ensemble and estimation of PDF uncertainty In the present analysis, we focus on the results obtained with the CT18 global QCD analysis [23]. The CT18 PDFs are determined by fitting NNLO theoretical cross sections to 40 experimental data sets with a total of 3681 data points. The fitted scattering processes -DIS, production of lepton pairs, jets, and tt pairs -cover a large kinematic region that extends up to x = 0.75 in the typical momentum fraction and down to Q = 2 GeV in the factorization scale.
The CT18 functional form is given by Functions Φ a (x) are parametrized by Bézier curves B (n) (y) with n = 4 − 5 and y ≡ √ x or a similar scaling function; see Appendix C in [23]. At the initial scale of the fit, Q 0 = 1.3 GeV, the functional forms provide the initial condition for DGLAP equations that predict PDFs at an arbitrary Q > Q 0 . Each interpolating polynomial Φ a (x) reduces to a non-zero constant at x → 0 or 1, so that the exponents A 1,a and A 2,a control the behavior of the individual PDFs when approaching these limits. While in principle all A 2,a parameters can be determined from the data, in practice not all their combinations result in non-negative PDFs or physically acceptable values of flavor-dependent observables. The CT18 fit imposes a requirement A 2,uV = A 2,dV to guarantee a finite value of d(x, Q 2 )/u(x, Q 2 ) at x → 1 or, equivalently, a nontrivial asymptotic value of F p 2 (x, Q 2 )/F n 2 (x, Q 2 ). We thus expect where the limit is reached only at very high x values that are outside of the x region covered by the experimental measurements.
On the other hand, at x values below 0.8, where sufficient amount of experimental data exists, the polynomials Φ a (x) with a = u or d are flexible enough to allow a variety of functional behaviors of the u and d PDFs. Therefore, at x < 0.8, A eff 2,dV (x) can be quite different from A eff 2,uV (x).
The CT18 ensemble consists of the central PDF set and 58 error PDF sets that can be used to estimate the PDF uncertainty in A eff 2 according to the master formulas of Refs. [52,53]. A variety of sources contribute to this PDF uncertainty, including experimental, theoretical, parametrization, and methodological uncertainties [21]. Among these, the uncertainty introduced by the choice of the PDF functional forms has been examined in the CT18 study by examining the spread of the PDFs in 250 candidate fits with alternative functional forms or methodological settings, as explained in Sec. III.C.3 of [23]. The tolerance on the nominal Hessian error sets has been selected so as to cover, on average, the spread of the best-fit values in the alternative fits. Thus, the CT18 Hessian uncertainty covers the spread of results with the alternative parametrization choices, with the exception of the extrapolated x regions, x > 0.7, where some of the explored best fits fall outside of the nominal CT18 error band. We therefore quote the uncertainty on A eff 2 as the envelope constructed from the CT18 Hessian uncertainty at the 68% probability level and the extreme variations of A eff 2 obtained with the extended set of 363 alternative functional forms.

D. An effective exponent for a DIS structure function
In deep inelastic scattering, a structure function F (x B , Q 2 ) of the proton can be predicted in terms of phenomenological PDFs f a/p (x, µ 2 ) as in the QCD factorization formula in Eq. (10).  [52,53] at the 68% probability level. The extreme curves correspond to the envelope of the Hessian and parametrization uncertainties estimated as in Sec. III C. The transparent part of the Q = 2 GeV band corresponds to the region with W 2 < 2m 2 p , approximately corresponding to the resonance region in DIS. The reference prediction from the QCRs is shown by a line at A eff 2 (F 2 ) = 3.
A eff 2 for F 2 (x B , Q 2 ) computed according to Eq. (27) with the CT18 NNLO set [23] for three values of Q 2 . These are predictions for the leading-twist contribution to F 2 (x B , Q 2 ) that reflect a combination of constraints from 40 diverse experiments, and which may not be directly comparable to the actual DIS data because of the limitations discussed in Sec. II B. At Q = 2 GeV, the interval x B > ∼ 0.8 corresponds to DIS in the resonant region, see Sec. II B, where the smooth behavior of F 2 (x B , Q 2 ) that we predict is modified by complex local features that do not obey the QCR's. We indicate the x B values lying in the resonant DIS region, approximately corresponding to W 2 < 2m 2 p , by using the semi-transparent fill for a part of the error band for Q = 2 GeV.
Each error band in Fig. 3 consists of an inner (darker) part, indicating the 68% probability level Hessian uncertainty, and the outer (lighter) part, representing the envelope formed by the Hessian uncertainty and the exponents (27) for the alternative functional forms, as explained in Sec. III C. We see that the magnitudes of the Hessian and envelope uncertainties are similar for the region where data are available, i.e. x < 0.75. On the other hand, in the region in which the PDFs are unconstrained by the data, the error reflecting the choice of parametrization dominates. The envelope indicated by the outer error band represents a more conservative estimate of the uncertainty at each Q 2 .
We now compare the A eff 2 F(x B , Q 2 ) from the CT18 fit to the prediction A 2 = 3 from the QCR's for F 2 (x B , Q 2 ) in a proton given by Eq. (2). The A eff 2 values for Q = 2 GeV in Fig. 3 clearly fall below the QCR prediction. On the other hand, starting from a scale of about 4 GeV, the expected exponent of three is attained at the highest x within the error bands.
The A eff 2 values in Fig. 3 substantially depend on x B and Q 2 . The true limit of the QCR's would imply weak dependence on either x B or Q 2 , indicating that neither multiparticle final Fock states nor anomalous dimensions are important. In accordance with the discussion in Sec. II, the QCR's would be realized when the Feynman diagrams in Fig. 1 dominate.
As we alluded in Sec. II B, we do not expect the conditions for the QCR's to be fully met in a global PDF fit. Various simplifications in the fitted processes limit the anticipated accuracy, for example, due to the neglect of process-dependent power-suppressed terms. Nevertheless, Fig. 3 shows that the CT18 values for A eff 2 F(x B , Q 2 ) are consistent with the QCR prediction of three within about one unit. Keep in mind that the bound-state wave function for the hadronic target predicts parton distributions in a free hadron. On the other hand, the nucleons -and a fortiori the pions-probed in the global fits are not truly free at some level: their partons feel the other hadrons present in the scattering event before or after the hard scattering. At the leading power, the effect of the soft QCD background field created by spectator hadrons on the propagation of partons entering the scattering is given by the Wilson line in the M S PDF in Eq. (11). Additional radiation of this kind and power-suppressed terms may modify the x dependence compared to the QCR prediction.

E. Effective exponents for PDFs
We will now investigate effective exponents A eff 2 for individual PDFs. In the CT18 global fit, universal PDFs enter theoretical cross sections for the fitted processes as shown in Eqs. (10) and (13) for DIS and DY. The hadronic cross sections are evaluated up to NNLO in α s . Equation (9) states the QCR predictions for the A 2 exponents for PDFs of various flavors. These predictions can be compared with A eff 2 computed for the respective phenomenological PDFs. First, we plot, in the left subfigure of Fig. 4, the nominal parameters A 2,uV and A 2,dV in the parametrizations for u and d valence quarks, introduced as in Eq. (28). As summarized in Sec. III C, the CT18 fit assumes these parameters to be the same. The figure shows the red line that corresponds to the best-fit CT18 value for A 2,uV = A 2,dV and its 68% c.l. Hessian uncertainty.
In the same subfigure, the green ellipse shows the 68% c.l. region for the effective exponents A eff 2 computed using the CT18 Hessian error PDF set at Q = 1.3 GeV and x = 0.875 according to Eq. (27). Finally, blue scattered points in the left subfigure are for the A eff 2 combinations obtained with 363 alternative parametrizations of CT18 PDFs. The lines indicate the QCR prediction of three for each exponent.
We see in the left Fig. 4 that, at this high value of x and the initial scale Q 0 , the nominal parameters A 2,uV and A 2,dV are consistent with the QCR predictions within the PDF uncertainty. While the nominal A 2,uV and A 2,dV are set to be equal in the initial functional forms, the effective coefficients for u V and d V at x = 0.875 turn out to be slightly different. The distribution of the scatter points is narrower for the up valence than for the down valence, implying that the PDF for d V is less constrained by the data at large x.   The lines A eff 2 (u V , d V ) = 3 are shown for reference.
Similar plots for g andū +d PDFs are shown in the right Fig. 4 for the A eff 2 values at Q 0 = 1.3 GeV and x = 0.75, and in Fig. 6 for the x and Q dependence. These PDFs quickly vanish at very large x, thus we limit the respective comparisons to the region x ≤ 0.75 to avoid numerical issues.
Since the nominal A 2 parameters are not constrained to be the same for g andū+d, the 68% c.l. Hessian uncertainty region in the right Fig. 4 is given by an ellipse and not   is truly compatible with the respective QCR predictions: the gluon A eff 2 of 2-4 is systematically lower than the QCR prediction of 4, while the sea-quark A eff 2 of 6-9 tends to be higher than 5. The Q dependence due to the singlet DGLAP evolution is very pronounced for both types of PDFs. The gluon effective exponent grows above four starting from about 4 GeV, now in compliance with the QCRs. The A eff 2 for the sea combinationū +d has a large uncertainty at Q 0 but quickly correlates with the gluon once the DGLAP evolution is turned on.
We can compare the effective exponents for CT18 NNLO PDFs with those computed for the MMHT14 and NNPDF3.0 sets. We find reasonable agreement between the effective exponents obtained based on these three PDF ensembles. We also broadly agree with the observations presented in Ref. [24] for a low Q scale. 3 We learn several things from these comparisons. The phenomenological effective exponents and QCR predictions better agree for the valence u and d quarks. The agreement is not so good for the gluon and especially the sea quark PDFs.
The assumptions justifying the quark counting rules hold when the scale dependence is small. In reality, Figs. 5 and 6 demonstrate pronounced scale dependence, suggesting that higher-order QCD radiation, producing multi-parton  final states, is not entirely negligible in the fitted processes.
Focusing on DIS for a moment, the numerical effect of the multi-parton final states at various Q 2 and W 2 is twofold. When Q 2 is increased at a fixed x value, QCD radiation, evaluated in the logarithmic approximation by DGLAP equations, increases the effective power A eff 2 for valence quarks and gluons -see [24] and references therein. The growth of the corresponding A eff 2 in our figures is consistent with this expectation, while the Q 2 dependence for antiquark A eff 2 is less clear. When W 2 is increased (and x B is decreased) at a fixed Q 2 , new final states may be produced and introduce terms with high powers of 1 − x in the PDF expressions. As discussed in Sec. III A, such terms can increase or reduce the effective leading power A eff 2 , as compared to the QCR prediction, and introduce dependence of A eff 2 on x (B) .

F. Process dependence of effective exponents
In Secs. III D and III E, we presented the effective exponents A eff 2 and their uncertainties for leading-power DIS structure functions and PDFs determined based on the totality of 40 experiments fitted in the CT18 global analysis. We will now examine the agreement of individual experimental data sets in their preferences for the large-x behavior of PDFs quantified by A eff 2 . Toward this goal, we will employ a statistical indicator called the L 2 sensitivity [54], following its applications in the CT18 analysis [23] to examine agreement between the experimental data sets. The L 2 sensitivity is constructed from the values A eff 2 and goodness-of-fit function χ 2 E for each fitted experiment E, computed for each Hessian eigenvector set of the CT18 NNLO ensemble. See the relevant equations in Ref. [54]. Figure 7 plots the L 2 sensitivity of several fitted experiments to the effective A eff 2 for the proton structure function, F p 2 (x B , Q 2 ), evaluated at the x B values shown on the horizontal axis, for Q = 1.3 (top), 4 (lower left) and 10 GeV (lower right subfigure). The L 2 sensitivity is approximately equal to the variation in is increased by the 68% c.l. Hessian uncertainty above its best-fit value at the specified x B . In other words, we increase A eff 2 [F p 2 (x B , Q 2 )] to the upper boundary of the dark error band for the respective Q in Fig. 3 and ask how χ 2 E changes under this variation. The curves in Fig. 7 are for several experiments in the CT18 NNLO fit that show the highest sensitivity to 3 GeV, these are BCDMS ep and ed DIS cross sections [55,56], the NMC ratio of ep and ed DIS cross sections [57], CDHSW F 2 and F 3 measurements for charged-current DIS on a heavy nucleus [58], E866/NuSea pp Drell-Yan cross sections [59], and CMS jet production cross sections at 7 and 8 TeV [60,61]. Although the coverage by these data in the CT18 fit ends roughly at x ≈ 0.75, they predict At Q = 4 and 10 GeV, the CDHSW data sets play less prominent role, while the combined HERA I+II DIS data set [62] imposes some constraints.
Next, we turn to the sensitivities to A eff 2 of u V and d V distributions at Q = 1.3 GeV in Fig. 8. The pattern of sensitivities to A eff 2 of u V in the left panel is visually similar to that for F p 2 (x B , Q 2 ) at Q = 4 GeV in the second Fig. 7. Once again, competing pulls of the BCDMS ep cross sections and E866/NuSea pp Drell-Yan cross sections against the BCDMS ed DIS cross sections stand out at x > 0.8. At x = 0.6, the E866/NuSea data and to some extent the HERA DIS data prefer higher A eff 2 than the BCDMS ed and NMC p/d DIS measurements. The pattern on the pulls on A eff 2 is more elaborate for d V in the right panel of Fig. 8. The d V is less constrained at large x than u V because the relevant data in the global fit are dominated by neutral-current DIS measurements that are four times more sensitive to up-type quark PDFs than to down-type ones. In Figs. 4 and 5, we observed a moderate PDF uncertainty on A eff 2 for u V and a much larger uncertainty on d V , especially at x > 0.75 where the parametrization uncertainty dominates.
Since the published CT18 parametrization sets A 2,dV = A 2,uV , in the region x > 0.9 with no data, A eff 2 for d V essentially follows that for u V . Namely, its value reflects a tradeoff between the opposing pulls of the BCDMS ep and ed DIS data sets. At x < 0.9, we see a different pattern, whereby a fairly strong upward pull on A eff 2 by the NMC p/d ratio, complemented by LHCb W/Z production at 8 TeV [63] and both BCDMS ep and ed DIS, is opposed by the combined HERA DIS, E866/NuSea pp, as well as by CDHSW and CCFR [64] inclusive charged-current DIS data on heavy nuclei.  In these comparisons, we see some differences between the preferences of scattering experiments on the proton versus deuteron and heavy-nuclei scattering. While these differences do not rise to clear disagreements, they nevertheless suggest importance of the treatment of nuclear effects in future PDF fits.

IV. IMPLICATIONS FOR LOW-ENERGY DYNAMICS
We can now address the question of whether phenomenological PDFs reflect manifestations of low-energy dynamics, based on the discussion of the physical meaning of PDFs in nonperturbative approaches and phenomenological fits in Sec. II B, the mathematical arguments of Sec. III A, and numerical results in Secs. III D-III F. We will focus on two points relevant for further analyses: the relation between the nonperturbative approaches and phenomenological PDFs, and studies of PDFs in the pion.
A. On the relation to nonperturbative approaches It seems appropriate to assume that the quark counting rules are realized when the QCD coupling in semi-hard scattering is reasonably small, so that the final Fock states are dominated by contributions with a small number of perturbative partons interacting through nearly perturbative interactions. The reality is more complex: radiation in the PQCD regime results in logarithmic evolution of the partonic probability from large x at low Q 2 towards smaller x values at higher Q 2 , which in turn introduces Q 2 dependence of the (1 − x) exponents quantified by their anomalous dimensions [24,31,32]. The ideal window in {x, Q 2 } for QCR studies overlaps with the resonance region in DIS, from which the structure functions can still be extracted, but the global fits of PDFs in this domain must include target-mass corrections -a kinematic effect dependent on a specific scattering process -and other higher-twist terms [14]. Threshold resummation [65,66] and related nonperturbative effects, such as the modified running of the QCD coupling constant [67], are important at the highest x.
Reconciliation of the nonperturbative and phenomenological definitions of PDFs runs into important differences between the degrees of freedom adopted in various theoretical approaches. In the PQCD collinear factorization framework exemplified by Eq. (10) for deep inelastic scattering, universal PDFs correspond to the long-distance part of the hadronic cross section that is perturbatively expanded as a series of the small QCD coupling and powersuppressed (twist) terms. Here, both expansions are made possible by the presence of a scale Q > 1 GeV in the hard cross section. Phenomenological PDFs are defined in an M S or another factorization scheme introduced to separate long-and short-distance radiative contributions.
Nonperturbative predictions for the hadron structure do not have an inherent large energy scale that sets the small expansion parameters. They describe the internal structure at a low hadronic scale µ 0 < 1 GeV and must be matched to the factorized PQCD predictions at an intermediate scale Q 0 > µ 0 . The bridge between these two scales, namely the one-to-one connection between the low-scale dynamic degrees of freedom to the PQCD quarks and gluons, remains an unsolved problem, with hints available in Dyson-Schwinger approaches and lattice QCD, e.g. in [6,[68][69][70].
In the absence of such a clear connection at present, one might resort to a model of the spin and flavor dependence of the whole operator matrix element in the M S definition (11) of f a/p (x, Q 2 ). Results obtained with an SU (6)symmetric wave function or an SU (6)-broken [71], quark-diquark configuration [72][73][74][75], to name a few, reveal useful hints about the x dependence of PDFs at large x. We relied on these considerations when equating A 2 , the (1 − x) exponents of the valence up and down quarks, in the PDF parametrizations adopted in the CT18 analysis.
A similar spin-flavor consideration for the pion, discussed in the next subsection, seems less relevant at mild energies due to the pion's pseudo-Nambu-Goldstone origin.

B.
On the pion case In Refs. [22,[29][30][31], the counting rules were also formulated for the structure function of the pion, predicting a (1 − x) 2 falloff near the threshold. This behavior is often predicted based on the expression of the pion PDF, or distribution amplitude, in terms of the long-distance pion wave function φ and semi-hard scattering contribution dominated by the qq state, as reviewed in Sec. II A 2 and Fig. 1.
The pion structure provides a fascinating window on QCD dynamics. Kinematics of the target meson in neutral-current DIS (Sec. II B) takes a new meaning in light of the pion mass generation, a key emerging feature for pion-related observables. Chiral symmetry and its breaking govern the pion structure at low-to mid-energies. The nonperturbative quark-quark interaction cannot be replaced by a hard-gluon exchange at energies at which manifestations of chiral symmetry are substantial compared to PQCD interactions. This point is also highlighted for a related case of hard exclusive processes in Ref. [76]. The latter are best understood by comparing a fully nonperturbative approach for predicting the pion electromagnetic form factor to a large-Q 2 description in terms of distribution amplitudes and a hard-scattering part [9,77]. The addition of nonperturbative effects to the hard-gluon exchange in [76] improves the description of the form factor at low/moderate Q 2 and provides a better transition to the asymptotic "perturbative" behavior [77] associated with the QCRs for Q 2 up to a least 10 GeV 2 . The role played by the large-x distribution amplitude (or the structure function) in the behavior of the pion electromagnetic form factor at large Q 2 has been emphasized numerous times, see, e.g. [78][79][80]. In other words, for the pion, the concepts of weak coupling and a loosely-bound initial state assumed in the QCR picture cannot be approached without also considering the long-distance effects induced by chiral symmetry.
Present and future experiments -at JLab, EIC, or AMBER/COMPASS++ -aim to unveil the pion structure in DIS and Drell-Yan pair production. The questions examined throughout this manuscript apply to the fits of pion PDFs. Given the simpler valence structure of the pion and the pion's low mass, we anticipate considerable simplifications with respect to the case of the proton. The main experimental constraints on the pion PDFs at large x for now come from the E615 Drell-Yan pair production in pion-nucleus scattering [81]. In this process, a large momentum fraction x 1 for the pion corresponds to a small x 2 for a nucleus, except in the true threshold limit when s ≈ Q 2 , where no measurements currently exist. In such kinematic regime, when the nuclear beam remnant creates high hadronic multiplicities in the final state, one must carefully revisit the justifications for PQCD factorization for the Drell-Yan process presented at the end of Sec. II B. Nuclear shadowing in the initial state and interactions with the nuclear remnant in the final state may elevate the power-suppressed contributions as compared to the nucleon scattering. A concern about having a genuinely free pion target arises in prompt photon production in pion-proton scattering, as well as in leading neutron electroproduction. On the positive side, the finely binned E615 data points extend to x 1 = 0.99, very close to the end point. Due to the smallness of the pion mass, target-mass corrections for pion DIS are almost negligible [78].
Modern global analyses for the pion PDFs [26,27,82] now apply advanced theoretical frameworks, e.g. threshold resummation [65,83]. Depending on the theoretical framework, the recent analyses find that the pion data are compatible with a nominal (1 − x) or (1 − x) 2 behavior of the PDF parametrization at Q 0 . When the large-x resummation is included for the DY data, the authors [65,83] find a fast falloff of the valence pion PDFs in (1 − x), consistent with A 2 = 2. While threshold resummation must come into play at large x, it is not a sufficient condition for testing that the extracted PDFs fulfill the QCR predictions. None of these analyses addresses the functional mimicry of high-degree polynomial fits discussed in Sec. III A.
As we emphasized in that section, without knowing the exact functional form of the PDFs, one must include the end-point region, ideally x > 0.9, to pin down the low powers of (1 − x) in the monomial expansion. Physical uncertainties grow in the end-point region. For example, if threshold resummation is necessary, one must account for its uncertainties due to the choice of factorization scales, matching on the fixed-order prediction, and power-suppressed terms.
On the other hand, a fit that is restricted to smaller values of x introduces a spurious correlation between the coefficients with low and high powers of (1 − x). This correlation depends on the fitted data sample and strongly modifies the lowest-power monomial terms, as illustrated in Figs. 2 (b-d).
An alternative approach in Sec. III B computes the effective exponent A eff 2 that can be compared against theoretical predictions without reconstructing the analytic form of the PDFs. The value of A eff 2 depends on the range of x. Its global trend, reflecting slow variations over x, must be distinguished from the local one, existing in a small neighborhood of the examined data bins. For complex PDF parametrizations, like the ones used by NNPDF, the effective exponent may have large local variations, due to mimicry, and require averaging over PDF replicas and/or a range of x in order to determine its x → 1 limit. For smooth PDF parametrizations, like the CT18 or JAM ones, averaging is not necessary: the A eff 2 values computed based on such PDFs follow smooth trends and can be determined for comparisons against the QCR's at the momentum fractions of about 0.8, as has been done in Fig. 3.
Analogous considerations apply to PDFs predicted by low-energy models. Ideally, these models should provide uncertainty bands for the cross sections or PDFs that can be verified (or falsified) by the experimental data over the full x range. Such uncertainties are difficult to estimate faithfully. In the absence of the uncertainty bands, one must focus on the aspects of the low-energy predictions that are preserved in hard scattering. For example, a low-energy dynamic effect like the broadening of the parton distributions due to the emergence of dynamical mass [2,[84][85][86] may favor a particular (1 − x) falloff power, yet consistency of this power with the experimental data in some kinematic region is not sufficient for validating such prediction as the only viable one. The local trend quantified by the empirical A eff 2 values may differ from the global trend in complex nonperturbative models. A comparison against the QCRs requires experimental access to the end point x = 1, where additional dynamical effects are most pronounced.
Secs. III D and III E demonstrate that A eff 2 depends on the factorization scale Q. The pion valence PDFs obey the same non-singlet evolution equation as the proton ones, thus A eff 2 for the pion PDFs may change by 0.5-1 units within the typical Q range, like in the proton case. The evolution of a PDF that fulfils the QCRs at a low, pre-factorization scale may either increase or decrease A eff 2 at a higher Q, reflecting the functional mimicry.

V. CONCLUSIONS
Complementary to the constraints based on first principles, the quark counting rules offer predictions for DIS structure functions and PDFs near the elastic threshold. In this picture, when a hadron target is almost unperturbed, and the strong coupling constant is small, the structure functions in a nucleon exhibit a (1 − x) p fall-off in the limit x → 1, reflecting exchanges of semihard gluons between the quarks in the incoming bound state, as discussed in Sec II A.
We examined two possible strategies for testing the quark counting rules with experimental data. From a purely mathematical point of view, the concept of polynomial mimicry, demonstrated in Sec. III A by employing the Bézier curve technique, reveals a limitation in reconstructing the exact functional forms of PDFs from discrete data, whether based on an interpolation or a fit. It is not possible to uniquely determine the powers of a (1 − x) monomial expansion except very closely to the end point. Associated uncertainties can be very large.
As an alternative, Sec. III B showed that it is possible to define an effective exponent to examine the large-x behavior of any functional form for the PDFs. The leading-power structure function F p 2 (x, Q 2 ) reconstructed within the CT18 NNLO global analysis [23] agrees, within error bands at moderate scales Q, with the predicted power law. However, a non-negligible shift in the effective (1 − x)-exponent with increasing Q 2 is observed in Fig. 3, as expected from DGLAP evolution. Moreover, the resonance region in DIS forbids a reliable analysis at large-x values and photon virtuality of a few GeVs.
This has led us to our bottom line: exploration of the relevance of quark counting rules for high-energy processes must address various factors arising from both theory and statistics. We have investigated the pertinent issues in the case of PDFs for the nucleon, for which the situation is best understood. These include differences between the quark counting rules for hadronic observables, such as F 2 (x, Q 2 ), as opposed to M S PDFs; the universality of the PDFs and the role of power-suppressed and soft corrections; the phenomenological PDF uncertainty reflecting the scarcity of the data at large x as well as the choice of the PDF parametrization. The latter point has been developed by studying N = 363 replicas of the CT18NNLO analysis, see Sec. III C.
Global analyses are based on experimental data from several processes, such as DIS and DY. When considering the effective power laws at the PDF level in Sec. III E, we have found that not all flavors behave on the same footing. The valence up distribution is consistent with the running exponent of three, the valence down and gluon distributions on average tend to have lower-than-expected exponents, the sea quark exponent is too high, see Fig. 4. Theū +d exponent only slightly decreases with Q 2 . In the same spirit, the preferred effective exponents depend at some level on the fitted experimental process, as highlighted in Sec. III F.
In conclusion, we emphasize that the quark counting rules emerge in the limit of weak coupling in processes with little underlying hadronic activity. Violations of these conditions in at least some high-energy processes put in question the universality of the rules, which is especially relevant for uses in global analyses. On the other hand, there may be experimental measurements that favor the QCRs, such as a subclass of DIS events with low final-state hadronic multiplicities. The other possibility is offered by pion scattering discussed in Sec. IV B, which is less affected by collateral factors present in the nucleon or nuclear cases.
An experimental observation of the x dependence predicted by a nonperturbative calculation constitutes an insufficient, but non-redundant condition for validating the calculation. Functional forms of fitted PDFs are unnecessary but sufficient for describing the data. Only by measuring the structure functions/PDFs near the end point x = 1 one may reveal evidence of the primordial power law, as we have demonstrated in Sec. III A based on the comparison of the monomial and Bézier expansions. Reconciliation of the predictive phenomenological fits and the interpretative nonperturbative approaches within uncertainties will require more data at large x as well as efforts to address theoretical issues that hinder our understanding of cross sections at the end point x = 1.