Gluon Shadowing in Heavy-Flavor Production at the LHC

We study the relevance of experimental data on heavy-flavor [$D^0$, $J/\psi$, $B\rightarrow J/\psi$ and $\Upsilon(1S)$ mesons] production in proton-lead collisions at the LHC to improve our knowledge of the gluon-momentum distribution inside heavy nuclei. We observe that the nuclear effects encoded in both most recent global fits of nuclear parton densities at next-to-leading order (nCTEQ15 and EPPS16) provide a good overall description of the LHC data. We interpret this as a hint that these are the dominant ones. In turn, we perform a Bayesian-reweighting analysis for each particle data sample which shows that each of the existing heavy-quark(onium) data set clearly points --with a minimal statistical significance of 7 $\sigma$-- to a shadowed gluon distribution at small $x$ in the lead. Moreover, our analysis corroborates the existence of gluon antishadowing. Overall, the inclusion of such heavy-flavor data in a global fit would significantly reduce the uncertainty on the gluon density down to $x\simeq 7\times 10^{-6}$ --where no other data exist-- while keeping an agreement with the other data of the global fits. Our study accounts for the factorization-scale uncertainties which dominate for the charm(onium) sector.

Introduction -Parton-distribution functions (PDFs), describing the longitudinal-momentum distributions of quarks and gluons inside hadrons, provide the essential link between the measurable hadronic cross sections and the perturbativelycalculable cross sections of high-energy processes induced by quarks and gluons. The precise determination of PDFs of protons, f p i , is an extremely active area of research where several groups perform global analyses of a wide variety of experimental hard-process data. The modern global analyses [1][2][3][4][5][6] have evolved into impressive ventures with state-of-the-art perturbative calculations and sophisticated statistical methods to extract optimum PDFs along with their uncertainties.
The situation is more challenging -but not less interestingfor PDFs of nucleons inside nuclei, f [p,n]/A i , with nuclear data significantly more complex to collect and with two additional degrees of freedom, the number of protons (Z) and neutrons (N = A − Z) in the studied nuclei. Nuclear PDFs (nPDFs) are key ingredients to use perturbative probes of the quark-gluon plasma produced in ultrarelativistic nucleus-nucleus collisions at RHIC and the LHC [7]. As such, their determination goes even beyond the understanding of the nucleus content in terms of quarks and gluons. Since the early 1980s, we know that the nuclei are not a simple collection of free nucleons, and nPDFs are not equal to a sum of nucleon PDFs. In fact, the corresponding analyses rather bear on nuclearmodification factors (NMF), like in lepton-nucleus ( A) collisions R[F A 2 ] = F A 2 /(ZF p 2 + (A − Z)F n 2 ) for the deep-inelastic scattering (DIS) structure function F 2 and parton-level NMFs i + N f n/A i (i = g, q,q), instead of the absolute nPDFs.
Based on earlier studies of F 2 [8][9][10][11][12][13][14], one knows that, for the quarks, (i) R A q > 1 for x 0.8 (Fermi-motion region), (ii) R A q < 1 for 0.25 x 0.8 (EMC region), (iii) R A q > 1 for 0.1 x 0.25 (antishadowing region), and (iv) R A q < 1 for x 0.1 (shadowing region) where different physics mecha-nisms were proposed to explain this behavior. At medium and large longitudinal-momentum fractions, x, R A q is usually explained by nuclear-binding and medium effects and the Fermi motion of the nucleons [15] but a fully conclusive picture has not yet emerged after the discovery of the EMC effect [16]. At small x, coherent scatterings inside the nucleus explain the observed suppression of F 2 , referred to as shadowing. Antishadowing is even less understood. Therefore, just like in the nucleon case, nPDFs are determined by performing global analyses of experimental data [17][18][19][20][21].
Compared to the quark content -directly probed by data on A DIS and the proton-nucleus (pA) Drell-Yan process-, the gluon content of the nuclei is even less known. To compensate this lack of constraints, both most recent global next-toleading order (NLO) analyses of nPDFs, nCTEQ15 [18] and EPPS16 [17], used RHIC pion and LHC jet data (in case of EPPS16) to constrain the gluon densities down to x ∼ 10 −3 . However, there is no data at x 10 −3 . Hence, we do not know anything about the gluon at small x; the gluon nPDFs in this region are obtained by extrapolating nPDFs from larger x region. As such, they essentially depend on the parametrizations of the x-dependence of nPDFs at the initial scale µ F,0 ∼ 1 GeV.
As discussed in Refs. [22,23], this lack of knowledge of the gluon nPDF is thus a priori not reflected by the set of error PDFs provided together with the best fit PDFs. Accordingly, increasing the flexibility of the initial nPDF parametrizations leads to much larger uncertainties in this region as evidenced by the EPPS16 set as opposed to the EPS09 [24] and nCTEQ15 ones. Clearly, a determination of the small-x gluon nPDFs and the reduction of their uncertainties is necessary for the heavy-ion phenomenology.
Recently, using heavy-flavor (HF) production at the LHC was proposed for an improved determination of the small-x gluons in the proton [25][26][27][28][29]. We also noticed an earlier proposal [30]. Motivated by the results of these studies, we performed the first analysis of the impact of heavy-quark(onium) data in LHC proton-lead (pPb) collisions on the determination of nPDFs (nCTEQ15, EPPS16) as a way to constrain the small-x gluon density in lead down to x 7 × 10 −6 .
The interpretation of our results depends on the reliability of nPDF factorization in the nuclear environment, which is a question of considerable theoretical and practical importance. In this context, we note that other cold-nuclear matter (CNM) effects  could become relevant in some specific conditions, in particular for the quarkonium case. In our study, they can be seen as higher-twist (HT) contributions and the use of leading-twist (LT) factorization becomes a working assumption to be tested. Once validated by data, as we will show, this assumption of LT factorization can be employed to learn about the internal structure of the nucleus.
Methodology -The cross sections measured in pA collisions at colliders are nearly always normalized to the pp ones [7,52,53] since one is primarily interested in deviations from the free nucleon case, up to isospin effects. For DIS off a nucleus A, and thus F A 2 , the NMF R[F 2 ] is directly related to the modification R A q of the (anti)quark nPDF compared to its PDF. For the gluons, one similarly defines R A g entering theoretical evaluations of the NMF R pA ≡ dσ pA /(A × dσ pp ), which can be differential in the transverse momentum (P T,H ) or the center-of-momentum (c.m.s.) rapidity y c.m.s.,H of the hadron H. The nPDF sets provide parametrization of R A g at any x and scale. In the absence of nuclear effects, R A g = 1 and we observe R pA (O H ) 1. Unlike the simple case of F 2 at leading order, R A g enters R pA via a convolution which requires a control of the parton-scattering kinematics.
The focus on R pA has several advantages. It allows to leave aside, in the theory evaluation, the proton PDF uncertainty at very small x which may not always be negligible. Second, R pA is in general less sensitive to QCD corrections which may affect the normalization of the cross-section predictions. Third, some experimental uncertainties cancel in R pA and, at the LHC, R pPb is usually more precise than the corresponding pPb cross sections.
There are several advantages in using this approach: (i) the uncertainty in the pp cross section is controlled by the measured data, (ii) it can be applied to any single-inclusiveparticle spectrum as long as the relative weights of the different channels (parton luminosities times |A| 2 ) are known, and (iii) the event generation is much faster than with QCDbased codes, allowing us to study several nPDFs with several scale choices in an acceptable amount of computing time. Indeed, to quantify the intrinsic theoretical uncertainty from the factorization scale µ F , we have varied it about a default scale µ 0 as µ F = ξµ 0 with ξ = 0.5, 1.0, 2.0. µ 2 0 was taken to be M 2 [54], the pp baseline study was improved. For the first time, we considered the B → J/ψ data. For D 0 , J/ψ and Υ(1S ), we advanced the scale study with a variation in the pp baseline itself and not only in R Pb g (x, µ F ), where pp fits were done with each scale choice. As what concerns the R pPb results, we checked that, for the cases of D 0 and B → J/ψ production, the scale uncertainty is nearly identical to that with the "fixed-order-plus-next-to-leading log" (FONLL) [64][65][66] calculation (see a comparison in the Supplemental Material). As expected, FONLL gives much larger scale uncertainties on the yields.
As announced, to study the impact of HF experimental data on the gluon nPDF determination without performing a full fit, we employed the Bayesian-reweighting method [77][78][79][80][81][82]. This method is a direct application of Bayes theorem allowing one to include new data into a given PDF analysis without a fit. For the present study, we followed the same approach as in Ref. [82]. Since both nCTEQ15 and EPPS16 are Hessian nPDFs, we converted the Hessian error PDFs into 10 4 Monte Carlo replicas, representing the underlying probability distribution [83]. For each PDF replica, one computes the χ 2 of the considered data which is used to reweight them. Replicas describing better the data get larger weights than those unfavored by them. Hence, one obtains a modified probability distribution of the nPDFs like a fit would do.
Data selection -Like for all global PDF fits, a data selection is in order to avoid HT corrections. In our case, it is also important to select a kinematical region where gluon fusion dominates and other effects are negligible. As such, we considered the HF production in pA collisions at LHC energies. In the quarkonium case, due to the large Lorentz boost at these energies, the heavy-quark pair remains almost pointlike all along its way through the nuclear matter. Therefore, breakup [84,85], thought to be important at lower energies, is negligible at the LHC. We focused on J/ψ and Υ(1S ) to limit the contamination by possible comover effects [33][34][35][36]86], on the more fragile excited states Overall, this gives the ALICE [87] and LHCb [88] D 0 data; the ALICE [89,90] and LHCb [91,92] J/ψ data; the LHCb [92] B → J/ψ data; the ALICE [93], ATLAS [94] and LHCb [95] Υ(1S ) data. We could also add the dAu J/ψ RHIC data. Instead, we preferred to focus on the LHC data at 5 and 8 TeV and to use the RHIC [96,97] and the new LHC [98,99] ones as cross checks.
Results - Figs. (1a-1d) show a representative comparison of our theoretical calculations with the data for D 0 , J/ψ, B → J/ψ and Υ(1S ). The NMF obtained with nCTEQ15 and EPPS16 have significantly different central values and uncertainties but both agree with the data. This observation is striking as the used gluon nPDFs were derived from totally dif-  ferent observables like DIS and Drell-Yan processes, and yet they allow us to reproduce the most important feature of the data [54] which makes our reweighting analysis meaningful. We see this as a confirmation of the LT factorization (see also Refs. [100][101][102][103]). As for the reweighting results (gray-blue hatched bands in Figs. [1a-1d]), if we could simply fix the scale to a single value for each particle, the LHC R pPb data for prompt D 0 and J/ψ would reduce the uncertainties of the gluon density by a factor 3 for EPPS16 and 2 for nCTEQ15 down to x 7 × 10 −6 [compare the gray-blue and red hatched bands in Figs. 1a and  1d). The current B → J/ψ and Υ(1S ) data do not constrain the gluon nPDFs due to their large uncertainties and relatively large scales. Yet, the larger samples collected at 8 TeV should improve the situation.
We now discuss the scale uncertainties and recall that dσ pPb ∼ f p g ( f p g R Pb g )⊗|A| 2 . Because of QCD evolution, a larger µ F implies a R Pb g closer to unity together with a smaller PDF uncertainty. Indeed, the bands in Figs. (1a-1d) are closer to unity and shrink from µ F = 0.5µ 0 to µ F = 2µ 0 . For nCTEQ15, such variations for D 0 and J/ψ are even similar to the nPDF uncertainty itself.
Clearly, such a scale ambiguity should impact the reweighting results even though the (gray-blue) reweighted bands seems not to show such a sensitivity. It is perfectly normal since the replicas are to match the data. The key point is that they match it at different scales. Consequently, when the reweighted bands are evolved to a common scale µ F = 2 GeV, the reweighted nPDF uncertainties obtained with different scales do not superimpose (compare the black, blue and green bands in Figs. (1e-1f)).
The envelope of these scale-induced variations is about twice as large as their width for the D 0 and J/ψ cases, confirming that the scale uncertainty must be accounted for to obtain reliable uncertainties from these precise data. For the heavier bottom(onium) states, the scale uncertainty is not only much smaller than the nPDF uncertainties but also very small in absolute value, which implies that more precise data could play a major role for a precision determination of the gluon nPDF at small x.
Despite these uncertainties, our results are striking: the D 0 and J/ψ data point to the same magnitude of R Pb g and their inclusion in the EPPS16 fit would likely result in a considerable reduction of its gluon uncertainty by a factor as large as 1.7, see Fig. 1f. For nCTEQ15, the effect seems less spectacular but we should recall that the original nCTEQ15 values at x below 10 −3 are pure extrapolations. The dashed red lines in Fig. 1e illustrates this by showing two equally good fits [22], which are now excluded by the LHC HF pPb data. Overall, the nCTEQ15 extrapolation to small x is unexpectedly well confirmed by the charm(onium) data.
Beside the mere observations of the nPDF-uncertainty reduction, our results have two important physics interpretations. First, the LHC pPb HF data give us the first real observation of gluon shadowing at small x with R Pb g smaller than unity -the no-shadowing null-hypothesis-by more than 11.7 (10.9) and 7.3 (7.1) σ at x = 10 −5 and µ F = 2 GeV for nCTEQ15 and EPPS16 using D 0 (J/ψ) data [see Figs. Fig. 1e-1f, left panels]. Our results thus quantitatively confirm the qualitative observations of [102][103][104] indirectly made from J/ψ photoproduction on lead, which strictly speaking is sensitive to generalized parton distributions -not nPDFs-and suffers from significant scale uncertainties [105,106]. Second, our analysis corroborates the existence of a gluon antishadowing [107]: R Pb g > 1 for x 0.1. This can be seen in Figs. 1e and 1f where the error band after reweighting is smaller and more clearly separated from unity. The analyzed LHC heavy quark(onium) data cover the x region 7 × 10 −6 x 0.1. It is an interesting question how much of the antishadowing can be explained by direct data constraints in the region x 0.1 and how much of the effect is indirectly driven by the momen-tum sum rule correlating a strong suppression at small x with an enhancement in the antishadowing region. We leave this question open for a future publication.
Finally, we consider the global coherence of the HF constraints with other data (to be) included in nPDF global fits. We do it with nCTEQ15 of which 2 of us are authors. We thus have all the data at hand. First, let us observe that the agreement with the DIS NMC data [108], the only DIS set with a mild sensitivity to the gluon distribution, is not degraded in a statistically significant way. The original χ 2 /N data , 0.58, becomes (0.81, 0.58, 0.57) for D 0 with ξ = (0.5, 1, 2), and is similar for other hadrons. Clearly, the inclusion of HF data does not create any tension with the DIS data. One can also make a similar comparison for the W/Z pPb LHC data whose impact on nCTEQ15 was recently studied [82]. The χ 2 /N data of these data was found to be 2.43, and after our HF reweighting it becomes (2.14, 2.49, 3.11) for D 0 . With the same caveats as above, our reweighted nPDFs do not change the theory-data compatibility with the LHC W/Z data. The χ 2 /N data of the J/ψ PHENIX R dAu results [96,97] with nCTEQ15 is (3.58, 2.55, 3.12) and after our J/ψ reweighting becomes (1.81, 2.38, 2.77). This confirms the global coherence of the HF constraints. Tables of these χ 2 values can be found as Supplemental Material.
Conclusion -In this Letter, we used, for the first time, experimental data for the inclusive HF [D 0 , J/ψ, B → J/ψ, Υ(1S )] production in pPb collisions at the LHC to improve our knowledge of the gluon density inside heavy nuclei. We compared the data with computations obtained in the standard LT factorization framework endowed with the two most recent globally fit nPDFs (nCTEQ15, EPPS16). No other nuclear effects were included which are supposed to be of HT origin and hence suppressed as inverse powers of the hard scale. We found a good description of the LHC data with both nCTEQ15 and EPPS16 nPDFs validating our theoretical framework.
By performing a Bayesian-reweighting analysis and studying the scale uncertainties, we demonstrated that the existing heavy quark(onium) data can significantly -and coherentlyreduce the uncertainty of the gluon density down to x 7 × 10 −6 . For charm(onium), the gluons are shadowed with a statistical significance beyond 7 σ at µ F = 2 GeV and x = 10 −5 . These data should thus be included in the next generation of global nPDF analyses. While our results cannot rule out that other HT CNM effects were effectively "absorbed" into seemingly universal LT nPDFs, the observed consistent description of both the D 0 and J/ψ data is far nontrivial since they may interact differently with the nuclear matter.

ACKNOWLEDGMENTS
We are grateful to H. Paukkunen for the comments on using EPPS16 grids, to M. Cacciari for the private FONLL code, and to F. Arleo, E.G. Ferreiro, and P. Zurita for useful discussions. The work is supported by the ILP Labex (ANR-11-IDEX-0004-02, ANR-10-LABX-63) and the COPIN-IN2P3 Agreement.

Validation with FONLL
We have explicitly checked the reweighted results from our data-driven approach with those from the FONLL perturbative calculation [64,66] for open heavy-flavour production. The reweighted nPDFs from both approaches are shown in Fig. 2 for the case of LHC D 0 data. In the data-driven approach, the only theoretical uncertainty to be considered is from the factorisation scale variation (shown in the top-left insets of Figs. 2a and 2b). On the other hand, in the case of FONLL, besides the factorization scale uncertainty (shown in the top-right insets of Figs. 2a and 2b), we also provide other theoretical uncertainties from the renormalisation scale (shown in the bottom-left insets of Figs. 2a and 2b) and charm quark mass variations (shown in the bottom-right insets of Figs. 2a and 2b). It confirms that for observables like R pPb , the dominant theoretical uncertainty is from the factorisation scale variation, while the renormalisation scale and the charm mass uncertainties are largely cancelled. One also clearly sees that the results obtained using our data-driven method and FONLL are very similar confirming the validity of the data-driven method. The same situation holds also for the B → J/ψ case. Comparison of the reweighted nPDFs with D 0 data between our data-driven approach and FONLL with various theoretical uncertainties. The error bands due to nPDF uncertainty are given at 68% CL level.

χ 2 numbers
The χ 2 values before and after reweighting are displayed in Tab. II for {D 0 , J/ψ, B → J/ψ, Υ(1S )} production in pPb collisions at the LHC, together with total number of data points N data . As it is customary, these χ 2 values do not account for any theoretical uncertainties. Regardless of the scale choice (i.e. ξ), χ 2 /N data are around 1 after reweighting, while χ 2 /N data varies significantly with the original nPDFs. It is normal as our replicas are matching the R pPb vs P T , y data. We take the inclusive J/ψ PHENIX R dAu results as a postdiction, where the χ 2 numbers before and after reweighting are shown in Tab. III. The compatibility between the theoretical calculations and the PHENIX data is further improved with the reweighted nPDFs. We also have checked the global coherence of the HF constraints with the LHC W/Z and DIS NMC data. The corresponding χ 2 values are also shown in Tab. III. No degradation is observed as the χ 2 /N data values similar before and after the reweighting.

Effective number of replicas
The reliability of the reweighting procedure can be estimated by the effective number of replicas N eff after reweighting (see Eq. (14) in Ref. [82]). It provides an estimation of the number of replicas effectively contributing to the reweighting procedure. If N eff /N rep 1 with N rep the number of original replicas, the reweighting procedure becomes inefficient and a new global fit is necessary. We provide the values of N eff in Tab. IV based on our original N rep = 10 4 replicas. Among the 24 reweighting results (2 nPDFs, 4 data sets and 3 factorisation scale choices µ F = ξµ 0 ), we conclude that we always have N eff > 3000, which confirms the reliability of our reweighting results.