Exploring the impact of high-precision top-quark pair production data on the structure of the proton at the LHC

The impact of recent LHC top-quark pair production single differential cross section measurements at 13 TeV collision energy on the structure of the proton is explored. In particular, the impact of these high-precision data on the gluon and other parton distribution functions (PDFs) of the proton at intermediate and large partonic momentum fraction $x$ is analyzed. This study extends the CT18 global analysis framework to include these new data. The interplay between top-quark pair and inclusive jet production, as well as other processes at the LHC, is studied. In addition, a study of the impact of scale choice on the theory description of the new 13 TeV $t\bar t$ measurements is performed.

The physics of the top quark, discovered by the D0 and CDF collaborations at the Tevatron proton-antiproton collider [1,2] in 1995, is central to a large number of precision programs for theory and experiments at the Large Hadron Collider (LHC) and its future upgrades.
In proton-proton collisions at the LHC, properties of the top quark are being thoroughly investigated because they are critical to many endeavors in high-energy physics, e.g., testing QCD and the structure of the proton with unprecedented accuracy, unraveling details of the electroweak (EW) sector of the Standard Model (SM) at high energies, and searching for new physics interactions.
The mass of the top quark ($m_t = 172.69 \pm 0.30$ GeV from direct measurements [3]) is very close to that of the recently discovered [4,5] Higgs boson ($m_H = 125.25 \pm 0.17$ GeV), and quantum corrections to these masses are deeply related to one another. In particular, the mass of the top quark is an important ingredient in the determination of absolute stability conditions for the EW vacuum, as top-quark mass radiative corrections drive the couplings of the Higgs boson [6-12]. In addition, the top quark provides us with unique opportunities to search for signatures of new physics interactions at the TeV scale and beyond (see for instance refs. [13,14] and references therein).
The top quark mainly decays into a real $W$ boson and a $b$ quark before hadronization occurs. Therefore, precision measurements of top-quark pair production cross sections allow us to set stringent tests on perturbative QCD and to explore the structure of the proton with higher precision. In fact, more than 90% of the production rate is due to the gluon-gluon fusion channel at the LHC with $\sqrt{S} = 13$ TeV of center-of-mass collision energy. This makes top-quark pair production a unique probe of the proton's parton content, especially the gluon, at intermediate and large partonic momentum fraction $x$, which motivates this work.
The higher-precision LHC data samples obtained at 13 TeV are at the core of this analysis, which aims at studying their impact on parton distribution functions (PDFs) of the proton in global QCD analyses using the CTEQ framework. High-precision theoretical calculations to predict top-quark pair production cross sections for a variety of kinematic distributions obtained with different bin resolutions are therefore crucial for this task.
Top-quark pair production in global PDF analyses. Top-quark pair production cross section measurements at the LHC and Tevatron are now a staple part of the data set baseline in modern global analyses at NNLO and beyond in QCD to determine proton PDFs. Examples of these analyses are ABMP [125], CT18 [126], MSHT20 [127,128], and NNPDF [129,130]. In particular, measurements of $t\bar t$ differential cross sections at the ATLAS and CMS experiments have been included in the most recent analyses [126,127,129,130] at NNLO where, together with inclusive jet production measurements, they play an important role in constraining the gluon PDF in the intermediate to large-$x$ region, as they complement each other. Though $t\bar t$ and inclusive high-$p_T$ jet production largely overlap in the $Q$-$x$ plane, their matrix elements and phase-space suppression are different, so that their constraints on the gluon are placed at different values of $x$. However, the presence of tensions between experiments, which can potentially result in different pulls on the gluon at intermediate/large $x$, and the strong correlation between the top-quark mass $m_t$, the strong coupling $\alpha_s$, and the gluon itself make PDF extraction particularly challenging [125,131,132].
Another complication arises from the amount of information on the statistical and systematic uncertainties published by the experimental collaborations. These uncertainties can be expressed in terms of either the covariance matrix or the nuisance parameter representation, and the conversion from covariance matrix to nuisance parameters is not unique. Complete information on the statistical, uncorrelated, and correlated systematic uncertainties (and their sources) is critical to maximize constraints from the data in PDF determinations.
A large number of studies that use total and differential $t\bar t$ cross section measurements at the LHC to constrain the gluon and other PDFs have appeared in the literature. Recent and less recent analyses can be found in refs. [131-139].
Main goals of this analysis. In this work, we shall study the impact of particular selections of 13 TeV $t\bar t$ single differential distributions at ATLAS (in the all-hadronic and lepton+jets channels) [21,22] and CMS (in the dilepton and lepton+jets channels) [24,26] on NNLO PDFs obtained by using the same framework (i.e., strategy, and tolerance criteria definitions for the uncertainties) adopted in the CT18 analysis [126]. In particular, PDFs are extracted by using optimal combinations of $t\bar t$ absolute differential cross section measurements with different integrated luminosities (IL) added on top of the CT18 baseline. The impact on the global fit from individual and combined kinematic distributions is analyzed by selecting different renormalization $\mu_R$ and factorization $\mu_F$ scale choices in the theory predictions. EW corrections are also considered; however, their impact is found to be negligible.
In addition, we analyze the impact of 13 TeV $t\bar t$ double differential distributions at ATLAS and CMS using the ePump (error PDF Updating Method Package) framework [140,141]. In Sec. II E we discuss the observed impact from double differential distributions on PDFs, and find that this is comparable to that from single differential ones. However, the treatment and interpretation of correlated systematic uncertainties in the analysis with double differential distributions is more challenging and complicates the data vs theory description. This work is mainly concerned with the study of the impact of 13 TeV $t\bar t$ single differential distributions on PDFs extracted in CTEQ global QCD analyses. A thorough and more extensive investigation of $t\bar t$ double differential distributions in PDF determinations will be presented in the future, in a separate work. A previous study investigating $t\bar t$ double differential cross sections at the LHC at 8 TeV and their impact on PDFs with ePump is discussed in ref. [137].
Single top production. Single (anti)top production cross sections in the $t$- and $s$-channel have also been measured at ATLAS and CMS at 7, 8, and 13 TeV collision energies. The impact of $t$-channel single (anti)top production has been explored in ref. [142], where it is found that an optimal combination of single-top data constrains the light quark and gluon PDFs with a reduction of their relative uncertainty by a fraction of a percent in the region $10^{-3} \le x \le 0.5$, with an even more pronounced reduction on the ratio $u/d$ around $x \approx 0.1$. Part of these measurements are also included in the NNPDF4.0 global analysis [130]. An investigation of the impact of single top-quark production cross section measurements at the LHC on CTEQ PDFs will also be presented in a separate work.
The single and double differential cross sections of top-quark pair production as well as single (anti)top production cross sections will be ingredients of high importance in the next generation of PDF determinations which are going to use higher IL data.
The rest of this paper is organized as follows: in Sec. II A we summarize the findings from recent global QCD analyses using 7 and 8 TeV top-quark pair production measurements, while in Sec. II B we describe the LHC measurements from ATLAS and CMS at 13 TeV considered in this study. In Sec. II C, we describe the theoretical framework, where details of the calculations such as NNLO QCD corrections, EW corrections, and scale dependence are discussed. In Sec. II D, we discuss the impact on PDFs from the bin-by-bin statistical correlations in the ATLAS 13 TeV lepton+jets channel measurements, while in Sec. II E we discuss single vs double distributions at the LHC. In Sec. III, we describe the impact on PDFs from individual 13 TeV $t\bar t$ data sets considered in separate fits, and in Sec. IV we present the main results of this analysis obtained from two optimal combinations of 13 TeV $t\bar t$ measurements. We conclude in Sec. V, while the details of theoretical calculations are presented in App. A, and the treatment of the correlated systematics is summarized in App. B.

II. TOP-QUARK PAIR PRODUCTION AT THE LHC RUN I AND II
We start with a brief overview of top-quark pair production measurements in recent global QCD analyses of PDFs which include LHC 7 and 8 TeV $t\bar t$ data from ATLAS and CMS, and summarize their findings. These measurements with respective analyses are also reported in Tab. I. Next, we discuss the LHC 13 TeV data that are at the core of this work and will play an important role in all post-CT18 PDF determinations.

A. Top-quark data in the CT18 era
The CT18NNLO analysis [126] includes two ATLAS absolute single-differential cross sections [17], $d\sigma/dp_T^t$ and $d\sigma/dm_{t\bar t}$, for the top-quark transverse momentum and the $t\bar t$ invariant mass, respectively, with 20.3 fb$^{-1}$ of IL, and the CMS normalized double-differential cross section [20] $d^2\sigma/dp_{T,t}\,dy_t$, with 19.7 fb$^{-1}$. The two ATLAS measurements are combined into one single data set which includes the combination of the $e$+jets and $\mu$+jets channels for the $p_{T,t}$ and $m_{t\bar t}$ distributions with statistical correlations. These distributions are chosen according to their best compatibility within the global fit. In fact, the impact from the single-differential $y_t$ and $y_{t\bar t}$ rapidity distributions (absolute or normalized) at ATLAS [17] is also explored, and tension is found with some other data sets. For example, the $y_t$ and $y_{t\bar t}$ rapidity distributions show agreement with HERA DIS data, but have the opposite trend as compared to the CMS $d^2\sigma/dp_{T,t}\,dy_t$ and ATLAS $p_{T,t}$ and $m_{t\bar t}$ combined distributions. Their inclusion, either in the single-differential or double-differential form, does not reduce PDF errors. The resulting $t\bar t$ impact on the CT18 PDFs is found to be modest, with a preference for a softer gluon at large $x$ in the $0.1 \lesssim x \lesssim 0.3$ range, and with changes that are not statistically significant.
The MSHT20 analysis [127], in addition to the same CMS double-differential distribution considered in the CT18 study, includes ATLAS single-differential cross section measurements for the $m_{t\bar t}$, $y_{t\bar t}$, $p_{T,t}$, and $y_t$ distributions [17] in the lepton+jets channel combined with statistical correlations, ATLAS measurements of the $y_{t\bar t}$ distribution [143] in the dilepton channel, as well as the CMS normalized $y_{t\bar t}$ distribution in the lepton+jets channel [144]. In addition, four total cross section measurements from ATLAS [145] and CMS [19,146,147] are included. For the ATLAS single-differential measurements in the lepton+jets channel, the correlated systematic parton-shower (PS) error across the four distributions has been decorrelated according to the procedure described in ref. [138]. The impact on the MSHT20 gluon PDF from top-quark pair production data results in a suppressed high-$x$ gluon ($x \gtrsim 0.1$) with a complicated interplay/tension with the $Z$-$p_T$ and LHC jet data. PDF uncertainties at large $x$ are slightly reduced. Most of the impact is from the ATLAS combination of single differential distributions [17,143] in the lepton+jets channel, and the treatment of systematic uncertainties plays a significant role in the description of data in terms of $\chi^2$.
The NNPDF4.0 analysis [130] includes the ATLAS normalized statistically combined $m_{t\bar t}$, $y_{t\bar t}$, $p_{T,t}$, and $y_t$ distributions [17] in the lepton+jets channel, as well as the ATLAS normalized $y_{t\bar t}$ distribution in the dilepton channel [143]. From CMS, it includes the normalized $y_{t\bar t}$ distribution in the lepton+jets channel [144] as well as the normalized $(1/\sigma)\,d^2\sigma/dy_t\,dm_{t\bar t}$ distribution in the dilepton channel [20]. Two measurements of the $t\bar t$ total cross section from ATLAS [148] and CMS [149] are also considered. In addition to these measurements at $\sqrt{S} = 8$ TeV, the NNPDF4.0 analysis includes two CMS rapidity $y_t$ distributions at $\sqrt{S} = 13$ TeV in the lepton+jets [25] and dilepton [24] channels, respectively, and other total cross section measurements at different center-of-mass energies: $\sigma_{t\bar t}$ at CMS [150] with $\sqrt{S} = 5.02$ TeV, at ATLAS [148] and CMS [149] with $\sqrt{S} = 7$ TeV, and at ATLAS [151] and CMS [152] with $\sqrt{S} = 13$ TeV. Total and differential $t$-channel single-top production cross section measurements are also considered in the NNPDF4.0 global fit. The resulting gluon is more suppressed at $x \gtrsim 0.1$ as compared to CT18 and MSHT20. However, top-quark pair production data have overall a moderate impact on the NNPDF4.0 global fit. In a fit with no top-quark data, the gluon is slightly enhanced at $x \gtrsim 0.1$, but well within the NNPDF4.0 uncertainty.
The ABMP16 analysis [125] considers selected measurements of top-quark pair production total cross section only at the LHC and  [164], and four at CMS [152,165-167] at $\sqrt{S} = 13$ TeV. The ABMP16 PDF parameters are fitted simultaneously with $\alpha_s$ and $m_t$. These total cross section measurements lead to an increase in the gluon central value at large $x$ in the $0.05 \le x \le 0.35$ range, with a 10-20% increase at x 0.1 in both the $n_f = 4$ and $n_f = 5$ calculations, depending on the factorization scale choice. These variations are well within the ABMP PDF uncertainty. However, the impact on the gluon PDF uncertainty is found to be small.

B. The 13 TeV top-quark data in the post-CT18 era
In this section we describe the top-quark pair production differential cross section measurements at ATLAS and CMS that are considered in this study.
In 2018, the CMS collaboration published differential cross section measurements at 13 TeV in the dilepton [24] and lepton+jets [25] channels. The lepton+jets channel measurements of ref. [25] have recently been superseded by new measurements with a higher IL of 137 fb$^{-1}$ [26], which we use in this analysis. In parallel, the ATLAS collaboration published two 13 TeV measurements based on the lepton+jets [21] and the all-hadronic [22] channels, respectively. These measurements are described below.
ATLAS lepton+jets channel (ATL13lj). We explore the impact of absolute top-quark pair production single differential distributions based on lepton+jets events measured at ATLAS 13 TeV with 36 fb$^{-1}$ of IL [21], which we label "ATL13lj". We use full phase-space results at parton level, and consider reconstructed measurements at parton level in the resolved topology, which are expected to provide direct constraints on PDFs. We do not consider the transverse momentum $p_{T,t\bar t}$ distribution of the $t\bar t$ pair because final-state interactions (FSI) between hard partons and beam remnants from the initial state may lead to substantial corrections to pair invariant mass (PIM) kinematic distributions when recoiling radiation is suppressed (see for instance the discussion in ref. [168]). These contributions may manifest as higher-order perturbative corrections to the factorized cross section, and as nonperturbative corrections that are suppressed by powers of perturbative scales.
We adopt the nuisance parameter representation of the correlated systematic uncertainties because this is the default treatment in all CTEQ PDF analyses.
We study the impact of bin-by-bin statistical correlations between these distributions, which are released for various combinations. The results are shown in Sec. II D, where we find that the statistical correlations for these measurements have some impact on the $\chi^2$ description, but their overall impact on the PDFs and their errors is negligible.
In addition, we study the impact of combinations of double-differential distributions, such as $d^2\sigma/dm_{t\bar t}\,dy_{t\bar t}$, and compare the results with those obtained by including the corresponding statistically combined single differential $d\sigma/dm_{t\bar t}$ and $d\sigma/dy_{t\bar t}$ distributions.
To facilitate comparisons with measurements at CMS, the ATL13lj measurements are released by using two different bin resolutions: 1) the original ATLAS bin resolution, and 2) the CMS bin resolution, which shares the size of its bins and the number of points with the CMS 13 TeV $t\bar t$ measurements in the dilepton channel [24]. These two bin resolutions differ in number of bins and bin size. Moreover, bin-by-bin statistical correlations are made available only for bin choice 1). As we shall see in the next sections, this has a non-negligible impact on the data-vs-theory description for single distributions using these two bin resolutions.
ATLAS all-hadronic channel (ATL13had). For the ATLAS 13 TeV all-hadronic channel with an integrated luminosity of 36.1 fb$^{-1}$ [22], we consider absolute single differential cross sections for the reconstructed top quark in terms of $p_{T,t_1(t_2)}$, $|y_{t\bar t}|$, $m_{t\bar t}$, and $H_T^{t\bar t} = p_{T,t} + p_{T,\bar t}$, where $p_{T,t_1(t_2)}$ is the transverse momentum of the leading (trailing) top quark. We label this data set as "ATL13had". Combinations of double-differential cross sections for this channel are not available. For these measurements, correlated systematic uncertainties are made available in terms of nuisance parameters, which are included in our global analysis.
CMS dilepton channel (CMS13ll). We study the impact of differential cross sections of top-quark pair production in the dilepton channel measured at CMS 13 TeV with 35.9 fb$^{-1}$ of IL [24]. These are labeled as "CMS13ll". They are published in terms of absolute and normalized single-differential distributions for the reconstructed top quarks. We consider only absolute cross section measurements in the full phase space at the top-quark level. This allows us to simplify the calculation of the theoretical predictions. Measurements relative to the decayed particle in the fiducial phase space will be analyzed in a future work, as they require an additional effort to obtain complete predictions at NNLO accuracy. In this study, we consider single-differential cross sections in terms of the $p_{T,t}$ ($p_{T,\bar t}$), $y_t$ ($y_{\bar t}$), $y_{t\bar t}$, and $m_{t\bar t}$ distributions.
Correlated systematic uncertainties are presented in terms of the covariance matrix. In accordance with the default treatment of systematic correlations in the CTEQ framework, we convert the covariance matrix into the nuisance parameter representation by using a version of the iterative $\Sigma + K$ decomposition method adopted in the CT18 analysis [126]. A slightly extended discussion is in Appendix B.
Bin-by-bin statistical correlations between measurements are not available to date, to the best of our knowledge. Therefore, we cannot exploit combinations of single differential distributions for these measurements in the global fit.
CMS lepton+jets channel (CMS13lj). In 2018, the CMS collaboration published 13 TeV measurements with 35.8 fb$^{-1}$ of IL in the lepton+jets channel [25]. These measurements are now superseded by new, higher-precision measurements with 137 fb$^{-1}$ of IL [26], which we use in this analysis and label as "CMS13lj". We examine the impact from the $m_{t\bar t}$ and $y_{t\bar t}$ single differential distributions in the full phase space. Bin-by-bin statistical correlations are not provided for these measurements either. Statistical and correlated systematic uncertainties are given in terms of the covariance matrix, which we convert to the nuisance parameter representation as discussed above.
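The non-uniqueness of the covariance-to-nuisance-parameter conversion can be illustrated with a minimal sketch. The example below is not the iterative decomposition used in CT18; it assumes a simple Cholesky factorization, whose columns act as one possible set of fully correlated shifts that reproduce the input covariance matrix. The function names and the 2x2 matrix are hypothetical and chosen only for illustration.

```python
import math

def cholesky_nuisance(cov):
    """Return beta[i][a] such that cov[i][j] = sum_a beta[i][a] * beta[j][a].

    Illustration only: a plain Cholesky factorization, NOT the iterative
    decomposition used in the CT18 analysis.
    """
    n = len(cov)
    beta = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(beta[i][k] * beta[j][k] for k in range(j))
            if i == j:
                d = cov[i][i] - s
                if d <= 0.0:
                    raise ValueError("covariance matrix is not positive definite")
                beta[i][j] = math.sqrt(d)
            else:
                beta[i][j] = (cov[i][j] - s) / beta[j][j]
    return beta

def rebuild_covariance(beta):
    """Rebuild the covariance matrix from the nuisance-parameter shifts."""
    n = len(beta)
    return [[sum(beta[i][a] * beta[j][a] for a in range(len(beta[i])))
             for j in range(n)] for i in range(n)]

# Hypothetical 2-bin covariance matrix (made-up numbers).
cov_example = [[4.0, 2.0], [2.0, 3.0]]
beta_example = cholesky_nuisance(cov_example)
cov_check = rebuild_covariance(beta_example)
```

Any orthogonal rotation of the columns of `beta_example` reproduces the same covariance matrix, which is one way to see why the conversion is not unique.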

C. Theoretical framework
Details of the theoretical framework used in this analysis are given below. Additional details and comparisons are given in Appendix A.
Global PDF analyses necessitate fast, precise, and accurate theory predictions that are compared to experimental data in the $\chi^2$-minimization procedure. To reduce the CPU turn-around time, fast theory predictions are obtained as interpolating tables through the fastNLO [169-172] and APPLGrid [173] frameworks.
NNLO QCD and NLO EW corrections. The theory predictions at NNLO in QCD used in this work are based on two independent calculations. One is the numerical calculation described in refs. [53,54], based on the STRIPPER subtraction method [55-57] and implemented in fastNLO tables [174,175]; the other one is described in refs. [58-60], based on the $q_T$-subtraction method [176] and implemented in the publicly available computer program MATRIX [61,177].
The theory predictions utilized for the CMS13ll [24] differential distributions, which share the same bin resolution as some of the ATL13lj [21] distributions, are generated at NNLO with fastNLO [175]. The theory predictions for the ATL13lj distributions resolved in terms of the ATLAS bins are instead obtained with MATRIX. To optimize the calculation and minimize the CPU turn-around time in the global fit, the MATRIX NNLO theory is constructed through bin-by-bin NNLO/NLO $K$-factors defined as $K = (\sigma^{(\mathrm{NNLO})} \otimes L^{(\mathrm{NNLO})}) / (\sigma^{(\mathrm{NLO})} \otimes L^{(\mathrm{NNLO})})$, where $L^{(\mathrm{NNLO})}$ represents the corresponding NNLO PDF luminosity. The NLO theory calculation is obtained from fast lookup APPLGrid tables [173] generated with MCFM [33,178]. A detailed comparison between the NNLO theory obtained with fastNLO tables [174] and that obtained with MATRIX, as well as a consistency check, are in Appendix A 1.
The theory predictions relative to the ATL13had [22] and CMS13lj [25,26] measurements in the global fit are obtained with MATRIX in a similar manner.
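The bin-by-bin $K$-factor construction can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the cross sections are made-up numbers, and the fast NLO prediction is represented by a plain list rather than an actual APPLGrid table.

```python
def k_factors(sigma_nnlo, sigma_nlo):
    """Bin-by-bin K = sigma_NNLO / sigma_NLO.

    Both inputs are assumed to be computed with the same NNLO PDF set,
    as in the definition of K in the text.
    """
    if len(sigma_nnlo) != len(sigma_nlo):
        raise ValueError("NNLO and NLO predictions must share the same binning")
    return [nnlo / nlo for nnlo, nlo in zip(sigma_nnlo, sigma_nlo)]

def apply_k_factors(sigma_nlo_fast, k):
    """Approximate NNLO prediction: fast NLO result rescaled bin by bin."""
    if len(sigma_nlo_fast) != len(k):
        raise ValueError("prediction and K-factors must share the same binning")
    return [nlo * kk for nlo, kk in zip(sigma_nlo_fast, k)]

# Hypothetical reference predictions in three bins (arbitrary units).
nnlo_ref = [110.0, 52.5, 10.4]
nlo_ref = [100.0, 50.0, 10.0]
k_example = k_factors(nnlo_ref, nlo_ref)

# Hypothetical fast NLO prediction for a trial PDF during the fit.
nlo_trial = [98.0, 51.0, 9.5]
nnlo_trial = apply_k_factors(nlo_trial, k_example)
```

The design choice sketched here is that the expensive NNLO calculation enters only through `k_example`, which is computed once, while the fast NLO tables are re-evaluated for each trial PDF.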
The EW corrections and their implementation are discussed in Appendix A 2. Overall, the impact of the EW corrections on PDF determination is found to be negligible given the current size of the experimental errors.
Impact of different central scales in the theory prediction. It is interesting to explore the impact of different choices for the central scale in the theory prediction for the $t\bar t$ differential cross sections at 13 TeV and make a comparison with the experimental uncertainties. For this purpose, we consider the CMS13lj measurements [26] with 137 fb$^{-1}$ of IL, as they are the most precise data in this analysis. In Fig. 1, we compare the experimental data to theory predictions for the $m_{t\bar t}$ (left) and $|y_{t\bar t}|$ (right) distributions calculated with the CT18NNLO PDFs and with different scale choices. The theory predictions are computed with central scales $\mu_F = \mu_R = H_T/2$ and $H_T/4$, respectively, represented by solid lines of different color in Fig. 1. For both distributions, the agreement with data deteriorates at large $m_{t\bar t}$ ($m_{t\bar t} \gtrsim 1.5$ TeV) and large $|y_{t\bar t}|$ ($|y_{t\bar t}| \gtrsim 1.5$), although it is better for $H_T/2$. These two CMS13lj measurements are very sensitive to the gluon and play a major role in the global fit, which is discussed later.
In this work, independent fits with central-scale choices $H_T/4$, $H_T/2$, and $H_T$ are examined. Differences in the predictions obtained with these scale choices can be used to quantify part of the theoretical uncertainty in the $t\bar t$ theory predictions in the global analysis.
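As a simple illustration of how differences between scale choices can be turned into a bin-by-bin estimate, the sketch below takes the envelope of the predictions relative to a reference scale. The choice of $H_T/2$ as reference, the function name, and all numbers are assumptions made for the example; the envelope is one common convention and not necessarily the prescription adopted in this analysis.

```python
def scale_envelope(predictions, reference="HT/2"):
    """Bin-by-bin envelope of scale choices relative to a reference scale.

    predictions: dict mapping a scale label to a list of bin values.
    Returns (lower, upper) shifts with respect to the reference prediction.
    """
    ref = predictions[reference]
    lower, upper = [], []
    for i, central in enumerate(ref):
        values = [pred[i] for pred in predictions.values()]
        lower.append(min(values) - central)
        upper.append(max(values) - central)
    return lower, upper

# Hypothetical predictions in two bins for the three central scales.
preds_example = {
    "HT/4": [105.0, 48.0],
    "HT/2": [100.0, 50.0],
    "HT": [96.0, 51.0],
}
lower_example, upper_example = scale_envelope(preds_example)
```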
Hessian profiling with the ePump framework. A preliminary assessment of the impact of new measurements on existing PDF sets before performing the global fit can be done by using different tools relying on a variety of statistical procedures based on the Monte Carlo or the Hessian method. Examples of these are: ePump (error PDF Updating Method Package) [140,141], Hessian profiling [179], Bayesian reweighting techniques [180-182], and the PDFSense method [183].
We perform a preliminary investigation to assess the impact of the 13 TeV $t\bar t$ data on PDFs using the ePump package. ePump allows one to obtain the updated best-fit PDF and relative PDF errors using the Hessian method. Previous studies using ePump are presented in refs. [137,139,141,184-186].
In Figs. 2 and 3, we present plots for the correlation cosine between experimental data and PDFs for various differential distributions as a function of the parton momentum fraction $x$ for the ATL13had [22] and CMS13ll [24] measurements, added on top of the CT18 baseline. PDFs in the various insets are represented by different colors, and each solid line corresponds to a different bin of the distribution under scrutiny. In both experiments, the gluon PDF appears to be strongly correlated at large $x$ in the interval $0.05 \lesssim x \lesssim 0.4$ for all distributions.
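For orientation, the correlation cosine between two quantities $X$ and $Y$ in the Hessian method is commonly evaluated from the eigenvector PDF sets as $\cos\phi = \sum_i (X_i^+ - X_i^-)(Y_i^+ - Y_i^-) / (4\,\Delta X\,\Delta Y)$, with $\Delta X = \frac{1}{2}\sqrt{\sum_i (X_i^+ - X_i^-)^2}$. The sketch below implements this symmetric formula; the function names and the input values are hypothetical, and no claim is made that this reproduces the exact setup behind Figs. 2 and 3.

```python
import math

def hessian_delta(plus, minus):
    """Symmetric Hessian uncertainty from pairs of eigenvector predictions."""
    return 0.5 * math.sqrt(sum((p - m) ** 2 for p, m in zip(plus, minus)))

def correlation_cosine(x_plus, x_minus, y_plus, y_minus):
    """Correlation cosine between X (e.g. a cross section in one bin)
    and Y (e.g. the gluon PDF at a given x and Q)."""
    dx = hessian_delta(x_plus, x_minus)
    dy = hessian_delta(y_plus, y_minus)
    if dx == 0.0 or dy == 0.0:
        raise ValueError("correlation undefined for vanishing uncertainty")
    num = sum((xp - xm) * (yp - ym)
              for xp, xm, yp, ym in zip(x_plus, x_minus, y_plus, y_minus))
    return num / (4.0 * dx * dy)

# Hypothetical values for two eigenvector directions (made-up numbers).
x_plus, x_minus = [11.0, 10.0], [9.0, 10.0]   # X varies only along direction 1
y_plus, y_minus = [6.0, 6.0], [4.0, 4.0]      # Y varies along both directions
cos_xy = correlation_cosine(x_plus, x_minus, y_plus, y_minus)
```

Values of `cos_xy` close to +1 or -1 indicate that the data point and the PDF are strongly correlated or anticorrelated, while values near zero indicate little sensitivity.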

The correlation plots relative to the other experimental data considered in this work are very similar.
The results from ePump for the preliminary assessment of the impact of individual experiments are reported in Tab. III, where they are shown for different values of the central scale in the theory predictions, and can be compared to those from the global fit, which will be discussed later in Sec. III. We study the quality of fit in terms of $\chi^2/N_{pt}$ and use it as the criterion for an initial investigation of the data.
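To make the $\chi^2/N_{pt}$ criterion concrete, the sketch below evaluates a $\chi^2$ with correlated systematic uncertainties in the nuisance parameter representation, with the nuisance parameters minimized analytically. It assumes additive (absolute) shifts and a unit Gaussian penalty per nuisance parameter, which is the standard textbook form; the function names and numbers are hypothetical, and the sketch is not the ePump or CTEQ fitting code.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def chi2_nuisance(data, theory, unc, beta):
    """Minimum over nuisance parameters lam of
    sum_i (D_i - T_i - sum_a beta[i][a] lam_a)^2 / s_i^2 + sum_a lam_a^2.

    unc[i] is the uncorrelated uncertainty s_i; beta[i][a] is the absolute
    shift of point i for a one-sigma change of nuisance parameter a.
    Returns (chi2, best-fit nuisance parameters).
    """
    n = len(data)
    k = len(beta[0]) if beta and beta[0] else 0
    r = [d - t for d, t in zip(data, theory)]
    w = [1.0 / s ** 2 for s in unc]
    chi2 = sum(ri * ri * wi for ri, wi in zip(r, w))
    if k == 0:
        return chi2, []
    b = [sum(beta[i][a] * r[i] * w[i] for i in range(n)) for a in range(k)]
    amat = [[(1.0 if a == c else 0.0)
             + sum(beta[i][a] * beta[i][c] * w[i] for i in range(n))
             for c in range(k)] for a in range(k)]
    lam = solve_linear(amat, b)
    chi2 -= sum(ba * la for ba, la in zip(b, lam))
    return chi2, lam

# Hypothetical two-point data set with one fully correlated systematic.
chi2_example, lam_example = chi2_nuisance(
    data=[1.0, 2.0], theory=[0.0, 0.0], unc=[1.0, 1.0], beta=[[1.0], [1.0]])
chi2_per_point = chi2_example / 2
```

For this toy input the result agrees with the covariance-matrix form, $r^T C^{-1} r$ with $C_{ij} = s_i^2 \delta_{ij} + \sum_a \beta_{ia}\beta_{ja}$, which is the equivalence between the two representations mentioned earlier in the text.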
Looking at the $\chi^2/N_{pt}$ values from ePump in Tab. III, we note that the $y_{t\bar t}$ and $p_{T,t_{1,2}}$ distributions produce an acceptable fit quality (with $\mu_F = \mu_R = H_T/2$ or $H_T/4$) among the ATL13had measurements, while for the $H_T^{t\bar t}$ and $m_{t\bar t}$ distributions $\chi^2/N_{pt} > 1.5$ regardless of the scale choice.
The $y_{t\bar t}$ and $y_t$ distributions are the only two that can be well described (for $\mu_F = \mu_R = H_T/2$ or $H_T/4$) among the CMS13ll measurements, and the $m_{t\bar t}$ distribution is the only acceptable candidate in the CMS13lj measurements, as $y_{t\bar t}$ cannot produce a good fit, regardless of the scale choice.
The ATL13lj case requires more care because of the two bin resolutions and the bin-by-bin statistical uncertainties. In the ATL13lj measurements with the CMS bin resolution, we observe that the $y_{t\bar t}$, $m_{t\bar t}$, and $y_t$ distributions produce acceptable $\chi^2/N_{pt}$ values in Tab. III, while the $p_{T,t}$ distribution produces $\chi^2/N_{pt} > 2$, independently of the scale choice.
We discuss the ATL13lj measurements with the ATLAS binning resolution separately in Sec. II D, where we analyze the results from ePump with and without inclusion of bin-by-bin statistical correlations.

D. Statistical correlations in the ATLAS lepton+jets data
The impact of bin-by-bin statistical correlations in the ATL13lj measurements [21] available on the HEPData [188] repository is first analyzed with ePump. The theoretical predictions are obtained with the MATRIX computer program as described in Sec. II C, with the top-quark mass set to $m_t = 172.5$ GeV in the pole mass approximation. The impact of central-scale dependence is analyzed for the three different scales $\mu_R = \mu_F = \{H_T, H_T/2, H_T/4\}$.
In Tab. II we report the $\chi^2/N_{pt}$ values from ePump obtained with and without inclusion of statistical correlations. The implementation using bin-by-bin statistical correlations is labeled as "WSC", while the one without statistical correlations is "NSC". In addition, we use the notation adopted in the ATLAS publication, where $y^B_{t\bar t}$ denotes the boosted topology for the rapidity of the $t\bar t$ system, while $H_T^{t\bar t}$ denotes the scalar sum of the transverse momenta of the hadronic and leptonic top quarks, $H_T^{t\bar t} = p_{T,t}^{\mathrm{had}} + p_{T,t}^{\mathrm{lep}}$. To consistently account for the statistical correlations, the WSC implementation adopts the ePump "error type 2" option, where one has nuisance parameters and (statistical) correlation matrices as inputs. In this case, the correlated systematic uncertainties are given in terms of their absolute (rather than percent) contribution to each data point, and are therefore treated as additive errors. For more details about the ePump implementation we refer the reader to the online manual [189]. The NSC implementation adopts the ePump "error type 4" option, where the nuisance parameter representation is used and correlated systematic uncertainties are normalized to data.
On the one hand, from the values in Tab. II we note that regardless of the inclusion of statistical correlations and scale choice, there is some tension between various distributions, e.g., $m_{t\bar t}$ and $H_T^{t\bar t}$, $m_{t\bar t}$ and $y_{t\bar t}$, and $m_{t\bar t}$ and $y^B_{t\bar t}$. Taken individually, all of these single differential distributions produce acceptable $\chi^2/N_{pt}$. In the ePump environment, tensions are amplified when statistical correlations are included. In addition, we note that when all distributions are statistically combined, $y_{t\bar t}$ and $m_{t\bar t}$ produce a net effect ($\chi^2/N_{pt}$ values) that is similar to that of the CT18+$y^B_{t\bar t}$+$H_T^{t\bar t}$ case for the scale choice $H_T/4$. On the other hand, as illustrated in Fig. 4, the impact of the bin-by-bin statistical correlations on PDFs is negligible. In addition, even if the $m_{t\bar t}$, $y_{t\bar t}$, $y^B_{t\bar t}$, and $H_T^{t\bar t}$ distributions are simultaneously included in ePump, the impact of these distributions on the gluon is very small. As we shall see in Sec. III, a similar behavior is observed in the true global QCD analysis, where the overall impact from statistical correlations in the ATL13lj data is very small.

E. Single vs double distributions at the LHC 13 TeV
The ATLAS and CMS collaborations published several measurements of top-quark pair production double differential distributions at 13 TeV at both particle and parton level. A question then arises about the optimal strategy to be used to best exploit these new high-precision measurements, that is, whether to use statistically combined single differential distributions or double differential distributions directly. While a more exhaustive analysis is deferred to a future work, here we consider an illustrative example. In Fig. 5 we illustrate the ePump results for the CT18 gluon in four different cases: 1) the $m_{t\bar t} + y_{t\bar t}$ 1d distributions are added together on top of the CT18 baseline (orange dashed curve), 2) only the $y_{t\bar t}$ distribution is added (green curve), 3) only the $m_{t\bar t}$ distribution is added (red curve), 4) the $d^2\sigma/dm_{t\bar t}\,dy_{t\bar t}$ 2d distribution is added (purple curve). The blue band represents the CT18NNLO PDF uncertainty at 90% C.L. In this example, when the $m_{t\bar t}$ and $y_{t\bar t}$ 1d distributions are individually added, or added together on top of the CT18 baseline, they generate opposite pulls on the large-$x$ gluon as compared to the 2d distribution. At $Q = 1.3$ GeV, the 2d distribution has a preference for a softer gluon in the $x \gtrsim 0.4$ region, while at $Q = 100$ GeV this preference is in the $x \gtrsim 0.3$ region. In both cases, the 1d distributions have the opposite trend at large $x$ with a much milder effect.
Because of these opposite trends at large $x$, and because the gluon PDF uncertainty is large in this region, where there is essentially no data, we conclude that there is no obvious preference between the 1d $m_{t\bar t} + y_{t\bar t}$ combination and the 2d $d^2\sigma/dm_{t\bar t}\,dy_{t\bar t}$ distribution. Ultimately, the impact from 2d and other multiple differential distributions must be assessed in the more general environment of a global PDF fit and explored against that of 1d distributions and their combinations. This will be analyzed in a future work.

III. GLOBAL QCD ANALYSIS: IMPACT FROM INDIVIDUAL SINGLE DIFFERENTIAL DISTRIBUTIONS AT THE LHC 13 TEV
In this section, we extend the analysis of the 13 TeV $t\bar t$ data to the global fit and extract post-CT18 PDFs at NNLO. In particular, we illustrate the results of multiple global QCD analyses at NNLO using the 1d absolute differential cross sections from the ATL13had, ATL13lj, CMS13ll, and CMS13lj measurements individually added on top of the CT18 baseline. We analyze the impact of these high-precision measurements on the gluon PDF and other relevant PDF combinations, and compare the results to the CT18 fit.
The setup of the global fits is the default one used in the CT18 study, where we use the same tolerance criterion to estimate the post-CT18 PDF uncertainties, and the same PDF parametrizations. An extended analysis using more flexible PDF parametrizations will be presented in a forthcoming study. The value of the top-quark mass is set to $m_t^{(\mathrm{pole})} = 172.5$ GeV, and the strong coupling constant at the $Z$-boson mass is set to $\alpha_s(M_Z) = 0.118$.
To find optimal combinations of measurements that allow us to include as much information as possible and, at the same time, minimize tension between observables, we first perform analyses including the 13 TeV $t\bar t$ data sets from every single channel one at a time, in individual global fits. Distributions are included in a combined manner only for the ATL13lj measurements, where bin-by-bin statistical correlations are studied in advance in ePump preliminary assessments.
In addition, we consider different choices for the central scale in the 13 TeV $t\bar t$ theory predictions and investigate their impact. The quality of fit in terms of $\chi^2/N_{pt}$, together with the corresponding values previously obtained with ePump, is reported in Tab. III.
The results of individual global fits using the 13 TeV ATLAS and CMS t t measurements in the various channels are discussed below.

A. Impact from the ATLAS lepton+jets channel
We start by analyzing the impact from the ATL13lj measurements [21] using the two bin resolutions, i.e., the ATLAS and CMS binning, published by the ATLAS collaboration and available on the HEPData repository [188].
ATL13lj resolved with CMS bins. The ATLAS measurements resolved in terms of CMS bins share the bin size with the CMS13ll data that have equal IL [24]. Bin-by-bin statistical correlations are not available for this specific binning resolution. Looking at the individual χ 2 /N pt values in Tab. III, the 1d distributions that are better described by the theory are m t t and y t t, while the description of the p T,t and y t distributions is more challenging due, in part, to the fact that they refer to reconstructed hadronically-decayed top quarks. We note that both the m t t and y t t distributions have χ 2 /N pt 1 independently of the choice of the central scale in their theory predictions. The y t t spectrum appears to be more stable when the central scale is varied, with χ 2 /N pt ≈ 0.75 in both cases. The description of the y t distribution is largely affected by the central-scale choice.
The m t t and y t t distributions provide us with most of the information and introduce the least tension when they are independently added on top of the CT18 baseline.
In Fig. 6 we compare the individual impact on the large-x gluon obtained using the two bin resolutions in a global fit at NNLO.Error bands with different hatching represent PDF uncertainties at 90% CL, and the central-scale choice in the 13 TeV t t theory predictions is set to H T /4 (results are very similar for H T /2).We observe that the constraints placed by the m t t and y t t distributions resolved with the CMS13ll bins have pulls in opposite directions above x 0.3.In particular, the m t t distribution prefers a harder gluon while the y t t one prefers a softer gluon as compared to CT18.Overall, the impact on the gluon PDF error is small.As reflected in Tab.III, both distributions exhibit χ 2 /N pt 1 with small variations when the central-scale choice in their theory prediction is varied.
ATL13lj resolved with ATLAS bins. The impact from both the individual and cumulative (in terms of bin-by-bin statistically combined) 1d ATL13lj distributions, resolved with ATLAS bins, is analyzed together with the impact of statistical correlations. The χ 2 /N pt in Tab. III is of order 1 or less for all distributions, regardless of the central-scale choice in their theory predictions, with some deterioration when the four spectra are statistically combined and fitted together. This is expected as statistical correlations impose further constraints in the fit. Correlated systematic uncertainties are given in terms of the nuisance parameter representation.
By looking at the m t t and y t t distributions in Fig. 6, we observe that the pulls on the gluon at large x are in the same direction, i.e., both m t t and y t t prefer a harder gluon at x 0.4 and distortions are more pronounced for m t t.However, these distortions are milder as compared to those obtained from the same distributions resolved with the CMS13ll bins.The behavior of the gluon at large-x constrained by the inclusion of the y B t t and H t t T distributions in the fit, is very similar to that of the m t t and y t t distributions, respectively, and it is not shown here.The individual distribution impact on the PDF uncertainty is negligible also in this case.
Impact of bin-by-bin statistical correlations in the global fit.In Fig. 7 we illustrate the impact on the gluon PDF from the m t t, y t t, H t t T , and y B t t distributions of the ATL13lj data added together on top of the CT18 baseline, and added together including statistical correlations.The gluon PDF is shown at a scale of Q = 100 GeV to emphasize the impact, while error bands with different hatching represent PDF uncertainties at 90% CL.Also here, the central-scale choice in the 13 TeV t t theory predictions is set to H T /4.Considerations are similar for the scale choice H T /2.The left panel in Fig. 7 shows that statistical correlations have a negligible impact on the gluon PDF errors.In the right panel, we show distortions in the gluon central value at large x in fits with and without statistical correlations.The impact of statistical correlations is very small and mostly localized at x 0.6 where there is weak or essentially no constraint from data.In addition, we observe that when the four distributions are added together, their effect on the gluon central value is diluted in the fit, and the quality-of-fit improves (χ 2 /N pt ≈ 1.3 in Tab.III) when the central scale is set to H T /4 as compared to H T /2, in presence of statistical correlations.Without statistical correlations, a good description (χ 2 /N pt ≈ 1.06) of the combined spectra is obtained by using the H T /2 scale choice.
As discussed in Sec.II D, the small impact of statistical correlations from the ATL13lj data is confirmed by ePump.However, the observed trend of pulls on the gluon from ePump in Fig. 4 is in the opposite direction as compared to the global fit in Fig. 7.This is plausible, because the statistical procedure to update PDFs and their errors in ePump poses more restrictions as compared to a global PDF fit.In addition, χ 2 /N pt may differ depending on the treatment of the errors (i.e., the "error type" option) in ePump (see for instance the χ 2 /N pt values in Tab.II and Tab.III).In Tab.III, the ePump χ 2 /N pt values are obtained by using the "error type 1" option, where uncorrelated statistical and systematic errors are given, and correlated systematic errors (with percent correlation to each data point) are also given.The correlated systematic errors are treated multiplicatively, i.e., by multiplying the percentages by the original best-fit theory predictions for each data point.
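The multiplicative treatment described above can be made concrete with a short sketch. It is only an illustration of the stated rule (percent correlated errors multiplied by the best-fit theory prediction of each data point); the function name and the toy numbers are hypothetical and are not taken from ePump or from the measurements.

```python
def multiplicative_shifts(percent_errors, theory):
    """Turn percent correlated systematics into absolute shifts.

    percent_errors[i][a]: correlated error of source a on data point i (percent).
    theory[i]: best-fit theory prediction for data point i.
    Returns beta[i][a] = (percent / 100) * theory[i].
    """
    return [[p / 100.0 * t for p in row] for row, t in zip(percent_errors, theory)]


# Hypothetical three-bin spectrum with two correlated sources (toy numbers).
theory = [200.0, 100.0, 50.0]
percent = [[2.0, -1.0], [3.0, 0.5], [5.0, 1.0]]
betas = multiplicative_shifts(percent, theory)
```

In an additive treatment the same percentages would instead be multiplied by the measured values, which is why the two options can return different χ 2 /N pt for the same data set.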
The interplay between the two bin resolutions in the ATL13lj measurements is further discussed in Sec.IV where optimal data combinations are selected.The two bin resolutions differ in number of bins and bin size.We have cross checked the theory predictions in both cases, with independent calculations from MATRIX [59,60] and the NNLO version of fastNLO tables [174], and we find consistency.Overall, we find that the gluon impact of the ATL13lj measurements in the global analysis is negligible.

B. Impact from the ATLAS all-hadronic channel
The impact on the gluon when the ATL13had measurements [22] are added on top of the CT18 baseline is illustrated in Fig. 8. The most relevant information in the global fit is obtained from the y t t, m t t, H t t T , p T,t 1 , and p T,t 2 distributions which we study here. The y t t, m t t, and H t t T distributions produce visible impact on the large-x gluon at x 0.5. Pulls are all in the same direction and there is a preference for a softer gluon at large x, with a more pronounced effect from H t t T at x 0.5. A milder impact is observed for y t t and m t t at x 0.6. The p T,t 1 and p T,t 2 distributions produce an almost identical behavior, with most of the impact located at x 0.5. The χ 2 /N pt for individual fits using p T,t 1 and p T,t 2 depends on the central-scale choice. As before, correlated systematic uncertainties are given in terms of nuisance parameters. As for the ATL13lj data, the overall individual impact of the ATL13had measurements is negligible.

C. Impact from the CMS dilepton channel
In Fig. 9 we illustrate the individual PDF impact of the CMS13ll measurements [24].The subset from which we gather the most relevant information to constrain the gluon is obtained by considering the y t t, m t t, y t , and p T,t distributions.As discussed in Sec.II B, the correlated systematic uncertainties for these measurements are published in terms of the covariance matrix representation.Therefore, we perform conversion to nuisance parameters as discussed in Appendix B.
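To illustrate why a covariance matrix can be traded for nuisance parameters without changing χ 2 , we add a toy sketch. It is not the procedure of Appendix B: it assumes that the total covariance splits into a diagonal uncorrelated part and a positive-definite correlated part, and it uses a Cholesky factorization (our choice for illustration) to build the correlated shifts. All names and numbers are hypothetical.

```python
import math


def cholesky(a):
    """Lower-triangular L with L L^T = a (a symmetric, positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def chi2_covariance(resid, cov):
    """chi2 = r^T C^{-1} r with the full covariance matrix C."""
    return sum(r * v for r, v in zip(resid, solve(cov, resid)))


def chi2_nuisance(resid, sigma, beta):
    """chi2 = sum_i (r_i - sum_a beta_ia lam_a)^2 / sigma_i^2 + sum_a lam_a^2,
    minimized analytically over the nuisance parameters lam_a."""
    n, k = len(resid), len(beta[0])
    amat = [[(1.0 if a == b else 0.0)
             + sum(beta[i][a] * beta[i][b] / sigma[i] ** 2 for i in range(n))
             for b in range(k)] for a in range(k)]
    rhs = [sum(beta[i][a] * resid[i] / sigma[i] ** 2 for i in range(n))
           for a in range(k)]
    lam = solve(amat, rhs)
    shifted = [resid[i] - sum(beta[i][a] * lam[a] for a in range(k))
               for i in range(n)]
    return sum((s / sg) ** 2 for s, sg in zip(shifted, sigma)) + sum(l * l for l in lam)


# Toy inputs: data-minus-theory residuals, uncorrelated errors, correlated covariance.
resid = [1.2, -0.7, 2.1]
sigma = [1.0, 1.5, 2.0]
corr = [[4.0, 1.0, 0.5], [1.0, 3.0, 0.8], [0.5, 0.8, 2.0]]

beta = cholesky(corr)  # columns act as nuisance-parameter shift vectors
total = [[corr[i][j] + (sigma[i] ** 2 if i == j else 0.0) for j in range(3)]
         for i in range(3)]
chi2_cov = chi2_covariance(resid, total)
chi2_nuis = chi2_nuisance(resid, sigma, beta)
```

The two χ 2 values agree up to rounding. Any orthogonal rotation of the columns of `beta` reproduces the same covariance, which is one way to see that such a conversion is not unique.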
From the χ 2 /N pt values in Tab.III, the m t t and p T,t distributions do not produce a good fit, regardless of the central-scale choice in their theory predictions.This is also confirmed by the results from ePump.The y t t and y t spectra are the distributions for which we obtain the best description with χ 2 /N pt ≈ 1 using the central-scale choice H T /2, and with χ 2 /N pt ≈ 0.7 with central-scale set to H T /4.
The gluon central value with central-scale choice H T /4 in Fig. 9 is affected by the CMS13ll data in the x 0.25 region with a strong preference for a much softer large-x gluon as compared to the ATL13lj and ATL13had data.However, the impact on the gluon PDF uncertainty is very modest and comparable to that of the ATL13lj data.

D. Impact from the CMS lepton+jets channel
The CMS collaboration released two sets of measurements in the lepton+jets channel, obtained with 35.8 fb −1 of IL [25] and with 137 fb −1 of IL [26], respectively. We focus on the 137 fb −1 measurements as they extend the previous ones [25] and their precision is significantly improved. In addition, we observe that they place stronger constraints on the gluon in the global analysis, due to their improved systematic uncertainties, which result in higher precision. Bin-by-bin statistical correlations are not available for these measurements.
We study the individual impact of the m t t and y t t distributions, which provide us with the most relevant information among the CMS13lj measurements in the global fit.In Fig. 10, we illustrate the impact from these two measurements on the gluon PDF.
From the χ 2 /N pt values in Tab.III, we observe that the y t t distribution cannot be well described in either the global QCD analysis or with ePump, with χ 2 /N pt ≈ 6 or more, regardless of the central-scale choice.The m t t distribution instead produces a better fit with a preference for a slightly harder gluon in the 0.2 x 0.65 range as compared to the y t t spectrum.Moreover, we note that the description of m t t improves when the central-scale choice in the theory prediction is set to H T /2.
We note that this dynamical scale choice is different from the default H T /4 discussed in ref. [192].Throughout this work, the criterion according to which the optimal central scale for the theory prediction is chosen, is based on improvements in the quality-of-fit.
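For orientation, the two dynamical scales can be evaluated with a short sketch. It assumes the definition commonly used in NNLO t t calculations, H T = sqrt(m t 2 + p T,t 2 ) + sqrt(m t 2 + p T,t̄ 2 ), which is not spelled out in this section; the function names are hypothetical, and only m t = 172.5 GeV is taken from the fit setup.

```python
import math

M_TOP = 172.5  # GeV, pole mass used in the global fits


def transverse_mass(pt, mass=M_TOP):
    """Transverse mass sqrt(m^2 + pT^2) of a top or antitop quark (GeV)."""
    return math.sqrt(mass * mass + pt * pt)


def h_t(pt_top, pt_antitop, mass=M_TOP):
    """Assumed definition: sum of the top and antitop transverse masses."""
    return transverse_mass(pt_top, mass) + transverse_mass(pt_antitop, mass)


def central_scale(pt_top, pt_antitop, divisor):
    """Central scale mu_F = mu_R = H_T / divisor, with divisor 2 or 4."""
    return h_t(pt_top, pt_antitop) / divisor


# At threshold (pT = 0) the two choices reduce to m_t and m_t / 2.
mu_half_threshold = central_scale(0.0, 0.0, 2)
mu_quarter_threshold = central_scale(0.0, 0.0, 4)
# Back-to-back tops with pT = 300 GeV each (toy kinematics).
mu_half_boosted = central_scale(300.0, 300.0, 2)
mu_quarter_boosted = central_scale(300.0, 300.0, 4)
```

Under this assumption the H T /4 scale is always half of the H T /2 scale, so switching between them is a fixed factor-of-two change of the central scale in every bin.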
In Fig. 10, we observe that these measurements have stronger impact on the gluon uncertainty as compared to those previously analyzed.This is mainly ascribed to a better control of experimental uncertainties which in turn, enhances the constraining power of these data.It would be useful to examine the impact of bin-by-bin statistical correlations, should they become available.

IV. OPTIMAL COMBINATIONS OF 13 TEV TOP-QUARK PAIR PRODUCTION MEASUREMENTS IN THE GLOBAL ANALYSIS
The analysis of impact from individual 13 TeV t t differential cross section measurements in various channels at ATLAS and CMS, previously discussed in Sec.III, allows us to select optimal combinations of measurements from which we can extract maximum information to constrain the gluon PDF, and minimize tensions among data in the extended baseline.The guiding principle for this selection is essentially based on the quality of impact and quality of description in the global fit, in terms of χ 2 .That is, we select those measurements that place effective constraints and produce a good quality fit, and that do not deteriorate the description of data in the pre-existing baseline.
Looking at the χ 2 /N pt values and central-scale choice for the 13 TeV t t theory predictions in Tab.III, we observe that the y t t distribution is well described (i.e., χ 2 /N pt ≈ 1) in both the ATL13had and CMS13ll cases.We therefore select the y t t distribution from these measurements.
We select the m t t distribution from the CMS13lj measurements, because the y t t one does not produce a good fit (χ 2 /N pt ≥ 5 regardless of the central-scale choice).
Because of the differences observed in the two bin resolutions of the ATL13lj measurements, we consider two separate cases: in one case we include the y t t distribution with the CMS bin resolution, whose χ 2 appears to be more stable against central-scale changes in the theory prediction as compared to m t t (see Tab. III); in the other, we include the m t t, y t t, y B t t , and H t t T distributions added together without statistical correlations as the latter produce negligible impact on PDF uncertainties. In addition, this greatly simplifies the global analysis. Therefore, we identify two optimal combinations which we refer to as CT18+nTT1 and CT18+nTT2, respectively. The CT18+nTT1 combination includes the ATL13had-y t t, CMS13ll-y t t, CMS13lj-m t t, and ATL13lj-y t t distributions (the latter resolved in terms of the CMS bins), while the CT18+nTT2 combination includes the same distributions from the ATL13had, CMS13ll, and CMS13lj measurements, and the y t t + y B t t + m t t + H t t T combination without statistical correlations from ATL13lj, resolved with ATLAS bins.
The results of the NNLO global analysis obtained with the nTT1 and nTT2 combinations are summarized in Tab.IV, and illustrated in Figs. 13 and 14 where we show the gluon PDF ratio to CT18, and the R s = (s + s)/(ū + d) ratio.g(x, Q) and R s (x, Q) are selected as representative cases as they show the most visible impact from the inclusion of the new data in the global fit.
In Tab. IV we compare the χ 2 /N pt values obtained from the CT18+nTT1 and CT18+nTT2 fits to those of CT18. There, we only report the CT18 data sets that exhibit a noticeable change in χ 2 /N pt . The remaining CT18 data sets have the same χ 2 /N pt as in Tabs. I and II in ref. [126].
The CT18 NNLO fit [126] has N pt = 3681 and χ 2 = 4293 with χ 2 /N dof = 1.16.In the CT18+nTT1 fit, the total number of points is N pt = 3728 and the resulting total χ 2 from the global fit is χ 2 = 4341 when the central scale of the 13 TeV t t theory is set to H T /2, while we obtain χ 2 = 4346 when the same scale is set to H T /4.In the CT18+nTT2 fit, the total number of points is N pt = 3752, χ 2 = 4366 for the central-scale choice H T /2, and χ 2 = 4376 for H T /4.
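The totals quoted above can be turned into a quick bookkeeping check. The sketch below only uses the N pt and χ 2 values stated in the text; the dictionary layout is our own, and the differences are changes of the global χ 2 , which include shifts in the baseline data sets and not only the contribution of the new t t points.

```python
# (N_pt, chi2) for each fit, as quoted in the text.
fits = {
    "CT18": (3681, 4293),
    "CT18+nTT1, HT/2": (3728, 4341),
    "CT18+nTT1, HT/4": (3728, 4346),
    "CT18+nTT2, HT/2": (3752, 4366),
    "CT18+nTT2, HT/4": (3752, 4376),
}

base_npt, base_chi2 = fits["CT18"]

# For each fit: added points, change in total chi2, and chi2 per point.
summary = {}
for name, (npt, chi2) in fits.items():
    summary[name] = (npt - base_npt, chi2 - base_chi2, round(chi2 / npt, 3))

for name, (d_npt, d_chi2, ratio) in summary.items():
    print(f"{name:18s} dN_pt = {d_npt:3d}  dchi2 = {d_chi2:3d}  chi2/N_pt = {ratio}")
```

The nTT1 combination adds 47 points for a χ 2 increase of 48 (H T /2) or 53 (H T /4), and nTT2 adds 71 points for an increase of 73 or 83, so the global χ 2 per point stays between 1.16 and 1.17 in all cases, with H T /2 giving the smaller increase.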
Comparing the χ 2 values of the individual experiments to those of CT18 in Tab. IV, we observe that the most noticeable quality-of-fit deterioration happens for both nTT1 and nTT2, regardless of the scale choice, in the case of the LHCb 8 TeV Z → e − e + forward rapidity cross section [193]. Another case of deterioration, for both the CT18+nTT1 and CT18+nTT2 fits, is the CMS13lj m t t distribution when the central scale in the theory is set to H T /4.
Another χ 2 /N pt increase for both nTT1 and nTT2 is observed for the LHCb 8 TeV W/Z production cross section [194], and for the CMS 8 TeV single inclusive jet cross section [195] measurements, respectively.The LHCb 8 TeV measurements have an impact on the strange PDF, anti-quarks (e.g., ū, d), and their errors at both small and large x.This interplay can be further understood in terms of the L 2 sensitivity [196], that is a statistical tool to explore the pulls from individual measurements on the best-fit PDFs, and to identify tensions between competing data sets.
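As a rough illustration of how such a sensitivity can be evaluated with Hessian error sets, we add a minimal sketch. It assumes the form S = Δχ 2 E cos φ(f, χ 2 E ), built from the standard symmetric Hessian error and correlation cosine; the precise definition and normalization should be taken from ref. [196]. The function name and toy inputs are hypothetical.

```python
import math


def l2_sensitivity(chi2_plus, chi2_minus, f_plus, f_minus):
    """Assumed Hessian estimate S = Delta(chi2_E) * cos(phi), where chi2_E is the
    chi2 of one experiment and f is a PDF value at fixed (x, Q).

    Each argument lists values on the + or - error set of every eigenvector.
    """
    d_chi2 = [p - m for p, m in zip(chi2_plus, chi2_minus)]
    d_f = [p - m for p, m in zip(f_plus, f_minus)]
    # Symmetric Hessian errors: Delta X = 0.5 * sqrt(sum_i (X_i+ - X_i-)^2).
    delta_chi2 = 0.5 * math.sqrt(sum(d * d for d in d_chi2))
    delta_f = 0.5 * math.sqrt(sum(d * d for d in d_f))
    # Correlation cosine between chi2_E and f over the eigenvector directions.
    cos_phi = sum(a * b for a, b in zip(d_chi2, d_f)) / (4.0 * delta_chi2 * delta_f)
    return delta_chi2 * cos_phi


# Toy example with two eigenvector pairs; only the first one moves f.
chi2_plus, chi2_minus = [52.0, 49.0], [48.0, 51.0]
f_plus, f_minus = [1.03, 1.00], [0.97, 1.00]
sens = l2_sensitivity(chi2_plus, chi2_minus, f_plus, f_minus)
```

With this sign convention a positive value means that the χ 2 of the experiment rises when f increases within its uncertainty, so two data sets with opposite signs at the same x pull the PDF in opposite directions.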
In Figs.11 and 12, we illustrate the L 2 sensitivity for the CMS13lj m t t distribution and the LHCb 8 TeV 2.0 fb −1 Z → e − e + forward rapidity cross section measurements in the CT18+nTT1 global fit.The same figures for the CT18+nTT2 global fit are very similar and we do not show them here.The T 2 = def.in the inset label refers to the default tier-2 penalty term in the χ 2 definition adopted in the CT18 analysis.
We observe that the m t t distribution in CMS13lj has a strong preference for a softer gluon in the 0.2 x 0.7 range, while the LHCb 8 TeV Z → e − e + large rapidity data prefer a harder gluon in the same range. In addition, we observe a different suppression preference for the u, ū, d, and s quarks across the entire x range, where in particular we note a stronger preference for softer ū, d, and s quarks in the 0.3 x 0.7 range by CMS13lj as compared to the LHCb 8 TeV measurements. These different suppression preferences have an impact on quark ratios, in particular on R s . The L 2 sensitivity for the R s ratio is illustrated in Fig. 12 where u V , d V and other PDF ratios are also shown.
For these reasons, the χ 2 increase for the LHCb 8 TeV Z → e − e + large rapidity data in Tab.IV can be ascribed to the presence of tension with the 13 TeV t t data.Moreover, the χ 2 increase can in part be related to an increase in the uncertainty of the R s ratio, as compared to CT18.The R s uncertainty is shown in the right plot of Figs. 13 and 14 for CT18+nTT1 and CT18+nTT2, respectively.We note that the uncertainty increase in the 0.2 x 0.8 region is more pronounced for CT18+nTT1 than for CT18+nTT2 where the strange-quark s(x, Q) PDF is less suppressed in that x range.
The left plots in Figs. 13 and 14 illustrate the resulting changes in the gluon PDF and its uncertainty, as well as (small) variations induced by selecting the two central scales H T /2, and H T /4, in the t t theory predictions at 13 TeV.For both scale choices we obtain a reduction in the gluon uncertainty in the 0.2 x 0.5 range and above in the extrapolation region, and a much less pronounced reduction in the 2 × 10 −3 x 5 × 10 −2 range.
We observe a preference for a softer gluon as compared to CT18 in the 10 −1 x 0.6 region.This constraint is mainly driven by the m t t distribution from the CMS13lj data, with an improved description when the central-scale is set to H T /2.
While the quality-of-fit of the two CT18+nTT1 and CT18+nTT2 combinations is essentially the same as that of CT18 with χ 2 /N dof ≈ 1.16, small differences between the CT18+nTT1 and CT18+nTT2 global fits are noticeable in Figs. 13 and 14 as compared to CT18.These are discussed in more detail in Sec.IV B.

B. Interplay between top-quark and jet production data
Top-quark pair production and inclusive jet production at the LHC place complementary constraints on the gluon and other PDFs as these two processes overlap in the Q − x kinematic plane. It is therefore interesting to examine the interplay between top-quark pair and jet production data in NNLO global analyses with the CT18+nTT1 and CT18+nTT2 combinations. In particular, we analyze the individual constraints placed on the gluon from these processes separately.
To this purpose, we perform three main fits in each of which we study the impact of different central scales in the 13 TeV t t theory predictions. These global fits are performed by using the CT18 baseline as the starting point, where we remove either the jet or the 8 TeV t t data, or both, and add the nTT1 combination with scale choice H T /2 or H T /4. The same fits performed with the nTT2 combination produce similar results.
Fig. 12. Same as in Fig. 11, but for different PDF ratios.
Global fits without inclusive jet data.In a first global fit, which we label CT18mJet, PDFs are obtained from the CT18 fit by removing the inclusive jet production data.To explore the impact of the new 13 TeV t t combinations we consider two variants of this fit including the nTT1 data sets and where the theory predictions for the latter are included with different central scales.These fit variants are: i) the CT18 baseline without QCD jets and with nTT1 data sets, with central scale H T /2, labeled as CT18mJet+nTT1-H T /2; ii) the CT18 baseline without QCD jets and with nTT1 data sets, with central scale H T /4, labeled as CT18mJet+nTT1-H T /4.
We focus our attention on the gluon which is the PDF receiving most of the impact.The results for the gluon PDF and its uncertainties resulting from these global analyses are illustrated in Fig. 15 where they are compared to the CT18 gluon at Q = 100 GeV.
Looking at the PDF ratio plot in Fig. 15 (left), we observe that removing the QCD jet data from the CT18 baseline (CT18mJet) results in an increase of the PDF uncertainty at large x in the 0.1 x 0.5 range where there is a preference for a harder gluon with a bump in the 0.2 x 0.6 region.Upon inclusion of the nTT1 combination in the CT18mJet+nTT1-H T /2 and CT18mJet+nTT1-H T /4 global fits, we observe a softer gluon in the x 0.1 range, regardless of the central-scale choice.
Comparing the gluon error bands in Fig. 15 (right), we note that uncertainties are only marginally reduced when nTT1 is included, with a small increase in the 0.06 x 0.15 region for scale choice H T /4. This is due to small tensions with several data sets in the baseline, in particular the D0 run II 9.7 fb −1 electron charge asymmetry A ch measurements with p T l > 25 GeV [198], and the structure function measurements F p 2 [199] and F d 2 [200] from the BCDMS collaboration. From an L 2 sensitivity study, one can see that the latter have a preference for a harder gluon PDF at large x as compared to the nTT1 data, while the D0 run II electron charge asymmetry A ch data are sensitive to the u- and ū-quark PDFs with opposite trend as compared to BCDMS F d 2 .
Global fits without t t data. In a second global fit, labeled as CT18mTop8, PDFs are obtained from the CT18 fit by removing top-quark pair production data (at √ S = 8 TeV only in CT18). As before, we consider two fit variants to assess the impact of the 13 TeV t t data, where the nTT1 combination is included with scale choices H T /2 and H T /4 in the theoretical predictions for the nTT1 data subset. These fit variants are labeled as CT18mTop8+nTT1-H T /2 and CT18mTop8+nTT1-H T /4, respectively. The results for the gluon PDF are shown in Fig. 16 where we illustrate the impact. This is complementary to that in Fig. 15. We note that when the 8 TeV t t measurements are removed from the CT18 baseline, the impact is negligible on both the central value and uncertainty of the gluon. When the nTT1 combination is included, there is a preference for a softer gluon at large x in the 0.15 x 0.9 region. However, the impact on the gluon uncertainty is negligible, as shown in Fig. 16 (right), regardless of the scale choice. Moreover, we observe that the central value of the gluon in the nTT1 fits is harder as compared to the CT18mJet+nTT1-H T /2 and CT18mJet+nTT1-H T /4 fits. This is due to the impact of inclusive jet production data which prefer a softer gluon PDF in the large-x region.
Global fits without QCD jets and t t production.Finally, in a third fit labeled as CT18mJet&Top8, PDFs are obtained by removing both the inclusive jet and t t data from the CT18 baseline.The results of this fit are then compared to the fit variants obtained by including the nTT1 data subset with central scale choice H T /2 for the nTT1 theory predictions labeled as CT18mJet&Top8+nTT1-H T /2, and to that obtained with H T /4, labeled as CT18mJet&Top8+nTT1-H T /4.
The results are shown in Fig. 17 where we observe that the general trend is similar to that in Fig. 15.However, we note that the 8 TeV t t data lead to an approximately 8% reduction in the gluon central value in the 0.2 x 0.6 region (compare the red solid line in Fig. 15(left) to that in Fig. 17(left)).
Looking at the PDF uncertainties in Fig. 17(right), we note only a small increase in the 0.15 x 0.4 range as compared to the CT18mJet fit, which reflects the small impact of the 8 TeV t t data in the CT18 fit.When the nTT1 combination is included, the gluon central value becomes softer in the 0.2 x 0.6 range similar to that of the CT18mJet+nTT1-H T /2 and CT18mJet+nTT1-H T /4 fits with a reduction of the uncertainty in the 0.2 x 0.5 range as compared to the CT18mJet&Top8 fit (see Fig. 17(right)).
The conclusion we draw by comparing the results of the three main fits discussed above is that inclusive jet data place stronger constraints on the gluon as compared to top-quark pair production. This is mainly ascribed to the much larger number of data points in the inclusive jet measurements which tend to dilute the impact from t t production. However, the impact from the 13 TeV t t production data results in a softer gluon at large x, similar to that of the LHC jet data, but with a very different degree of suppression. In fact, the phase-space suppression in the hard scattering contributions for these two processes is different and generates different shapes and suppression in Fig. 15(left), and Fig. 16(left). Moreover, we note that most of the gluon suppression at large x is driven by the CMS13lj data with 137 fb −1 of IL.
Finally, we observe that all of the changes in shape and magnitude in the gluon central value from the impact of the new measurements are within the 90% uncertainty band obtained from the CT18 NNLO fit.
Differences between the nTT1 and nTT2 data subsets. To conclude this section, we wish to discuss differences between the nTT1 and nTT2 data subsets and their interplay with the 8 TeV top-quark data in the CT18 baseline. To facilitate this analysis and better emphasize the differences, we study the impact of the nTT1 and nTT2 data subsets on the gluon PDF uncertainty in the CT18mJet global fit, where inclusive jet data are removed. The results of this study are illustrated in Fig. 19.

In Fig. 19(left) we observe an increase in the gluon uncertainty in the 0.06 x 0.15 region when the central scale in the nTT1 theory predictions is set to H T /4.The same increase is not present for the nTT2 combination in Fig. 19(right).This is one of the main differences which emerges in the nTT1 combination containing the y t t distribution from the ATL13lj measurements resolved in terms of CMS bins that uses H T /4 as the central scale in the nTT1 theory predictions.This tension disappears when the H T /2 central scale choice is used.
By looking at the χ 2 /N pt values in Tab.IV, we argue that this mild tension is generated in part by the CMS 8 TeV 19.7 fb −1 single inclusive jet cross section, whose χ 2 /N pt increases from 1.1 to 1.2 (regardless of scale choice); in part by the ATLAS 8 TeV 20.3 fb −1 top-quark p T,t and m t t absolute distributions, with a χ 2 /N pt increase from 0.6 to 0.7 (regardless of scale choice), and finally, by the ATL13lj data in nTT2 with a χ 2 /N pt increase from 0.7 to 1.1 when the central scale goes from H T /2 to H T /4, and by the CMS13lj data in nTT1 and nTT2 with a χ 2 /N pt increase from 1.1 to 1.6/1.7 when the central scale is reduced to H T /4.As already pointed out in Sec.III D, this indicates a preference for the H T /2 scale choice in the 13 TeV t t theory predictions in contrast to the suggested H T /4 scale choice discussed in ref. [192].In our global fits, the optimal central scale for the 13 TeV t t theory predictions is chosen according to the improvements it produces in the quality-of-fit.
It is also interesting to look at the L 2 sensitivity plots for the ATL13lj measurements in the full CT18+nTT1 and CT18+nTT2 global fits, which we illustrate in Fig. 20. There, in the left panel we show the ATL13lj y t t distribution resolved in terms of CMS bins. In the right panel, we show the y t t distribution resolved with ATLAS bins, combined with the y B t t , H t t T , and m t t distributions without bin-by-bin statistical correlations. As already argued in previous discussions, the ATL13lj bin-by-bin statistical correlations have negligible impact and can essentially be ignored. We note that the two different treatments of the ATL13lj measurements lead to different preferences for the gluon and strange PDFs in the 0.2 x 0.7 range: the ATL13lj y t t distribution resolved with CMS bins prefers softer gluon and softer strange PDFs, while the ATL13lj combination has the opposite behavior.
ePump Optimization.To further investigate differences in the CT18+nTT1 and CT18+nTT2 combinations, it is interesting to identify eigenvector (EV) directions that have maximal PDF sensitivity.The ePump framework allows us to identify a reduced set of error PDFs that contain the majority of the PDF dependence of the observables under consideration.The procedure is entirely based on the Hessian method and is documented in ref. [140].The new eigenvectors contain exactly the same information as the original eigenvectors, but are optimized so that a smaller set of error PDFs can be chosen for use with the set of observables to any required PDF-sensitivity.In turn, this allows us to assess and validate the data that place the strongest constraints on PDF errors.
In Fig. 18, we illustrate the behavior of the optimized eigenvector directions for the CT18+nTT1 and CT18+nTT2 combinations, while fractional contributions to the PDF error from the leading eigenvectors for each individual data set are reported in Tabs.V and VI.With ePump, we find six optimized PDF error sets for both CT18+nTT1 and CT18+nTT2.
In the case of the CT18+nTT1 optimized directions, we observe that the three leading eigenvectors approximately account for 99% of the PDF error band.Among them, the largest contribution is from CMS13lj m t t, with fractional uncertainty of approximately 31%.The second largest contribution is from ATL13had y t t, with roughly 25%, and the smallest contributions are from CMS13ll y t t and ATL13lj y t t resolved in terms of CMS bins, both with fractional uncertainty of approximately 22%.
In the case of CT18+nTT2, the data with the largest contribution is the ATL13lj combination of m t t, y t t, y B t t , and H t t T , which accounts for approximately 47% of fractional uncertainty. This is due to the larger number of data points. The second largest contribution is from CMS13lj m t t with approximately 21%, followed by ATL13had y t t with 16%. The smallest contribution is from CMS13ll y t t, with fractional uncertainty of 13%.
These differences in the treatment of the ATL13lj measurements prompted us to identify the two optimal combinations CT18+nTT1 and CT18+nTT2, but overall they have mild impact in the global fits we have analyzed.

V. CONCLUSIONS
We presented a comprehensive study of the impact of recent high-precision LHC top-quark pair production measurements at a collision energy of √ S = 13 TeV on PDFs of the proton, in particular the gluon, in global analyses at NNLO in QCD. This extensive analysis of post-CT18 PDFs is relevant for the next release of CTEQ-TEA PDFs which will be challenged by the inclusion of high-precision forthcoming measurements at the LHC for a multitude of standard candle processes like top-quark pair production.
Besides the PDF impact of the 13 TeV t t differential cross section measurements from ATLAS and CMS, we studied their interplay with inclusive jet production measurements in global PDF fits.
Due to differences in the binning resolution of the t t 13 TeV ATLAS lepton+jet data, we identified two optimal combinations of measurements that maximize the information to constrain the gluon, and minimize conflict with the other data sets in the extended baseline.
Overall, the impact of these measurements in reducing the uncertainty in the gluon PDF is found to be mild.However, their role is important as they constrain the behavior of the gluon PDF at large momentum fraction x in a way that complements that of inclusive jet production data.In fact, the t t and inclusive jet production processes overlap in the Q-x plane, but their matrix elements and phase-space suppression are different, and constraints on the gluon and other PDFs are placed at different values of x.
We analyzed the impact of bin-by-bin statistical correlations whenever possible, as well as that of central-scale variations in the theory prediction for the 13 TeV top-quark data. The criterion according to which a particular scale is chosen is based on the improvement it produces in the quality of fit ($\chi^2$) in the global analysis. We observed that the $\mu_F = \mu_R = H_T/2$ choice of the central scale improves the description of the 13 TeV lepton+jets data at CMS with 137 fb$^{-1}$ of integrated luminosity. These are the most precise data included in this work and place stronger constraints on PDFs as compared to the other measurements. Future analyses and extensions of this work will include measurements from ATLAS with similar or higher integrated luminosity, as well as implications due to novel PDF parametrizations.
The global analyses performed in this work have been challenged by the interpretation of the correlated systematic uncertainties published by the ATLAS and CMS collaborations. The default treatment of correlated systematic errors in the CTEQ framework is in terms of nuisance parameters. The top-quark pair production measurements from the CMS collaboration have recently been published in terms of the covariance matrix representation. We performed the conversion between the covariance matrix and nuisance parameter representations using a strategy similar to that used in the CT18 study. This allowed us to obtain identical $\chi^2$ values in both representations. However, this conversion is not unique. Detailed information on both the covariance and nuisance parameter representations for experimental errors is critical to fully exploit constraints from the data in global QCD analyses for PDF determinations, and to perform a simultaneous determination of $m_t$, $\alpha_s$, and the PDFs, as well as their correlations, in future analyses.

NNLO QCD prediction benchmarks
As discussed in Sec. II C, two independent theory calculations for the top-quark pair production differential cross section at NNLO in QCD have been used in this work. One is obtained with fast tables [174,175] based on the STRIPPER [55,56] subtraction method, which are produced at NNLO with fastNLO. The other one is obtained by using NNLO/NLO K-factors, where fast tables for the NLO cross section in the denominator are generated with APPLgrid [173] using the MCFM [33,178] program, while the NNLO cross section in the numerator is computed with the MATRIX program [58][59][60], based on the $q_T$-subtraction method [176]. Currently, there are no fast tables available for MATRIX.
Comparisons between these two calculations are shown in Figs. 21 and 22, where the theory is compared to the $m_{t\bar t}$, $y_{t\bar t}$, $y_t$, and $p_{T,t}$ distributions measured at CMS at 13 TeV in the dilepton channel [24], and to the $m_{t\bar t}$, $y_{t\bar t}$, $p_{T,t_1}$, and $H_T^{t\bar t}$ distributions measured at ATLAS at 13 TeV in the all-hadronic channel [22], respectively.
For this case study, we use CT18 NNLO PDFs [126], the central scale is set to $\mu_F = \mu_R = H_T/4$, and the top-quark mass is $m_t^{(\mathrm{pole})} = 172.5$ GeV. These are our default parameters. In general, we find agreement between the two calculations within 1% accuracy, which is sufficient for all of the analyses in this work.
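The K-factor construction and the percent-level comparison described above can be illustrated with a minimal sketch. It assumes the predictions are already available as per-bin arrays; the function names and all numbers are hypothetical placeholders and are not taken from fastNLO, APPLgrid, MCFM, or MATRIX.

```python
import numpy as np

def kfactor_prediction(nlo_fast, nnlo_ref, nlo_ref):
    """Bin-by-bin NNLO estimate from an NNLO/NLO K-factor.

    nlo_fast : NLO prediction from a fast table (e.g. re-evaluated in a fit)
    nnlo_ref, nlo_ref : reference NNLO and NLO predictions computed once
                        with the same settings (PDF, scale, top-quark mass)
    """
    k = np.asarray(nnlo_ref, dtype=float) / np.asarray(nlo_ref, dtype=float)
    return k * np.asarray(nlo_fast, dtype=float), k

def max_relative_difference(a, b):
    """Largest bin-wise relative deviation |a/b - 1| between two predictions."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.max(np.abs(a / b - 1.0)))

# Hypothetical per-bin cross sections (arbitrary units), for illustration only.
nlo_ref = [100.0, 80.0, 40.0, 10.0]
nnlo_ref = [107.0, 86.0, 43.6, 11.2]
nlo_fast = [98.0, 79.0, 40.5, 10.3]

nnlo_est, k = kfactor_prediction(nlo_fast, nnlo_ref, nlo_ref)
print("K-factors:", k)
print("NNLO estimate:", nnlo_est)

# A 1% tolerance check between two independent calculations of the same bins.
agree = max_relative_difference(nnlo_est, [105.0, 85.0, 44.0, 11.5]) < 0.01
print("Agreement within 1%:", agree)
```

The K-factor is computed once for a reference PDF set, so this approach implicitly assumes that its PDF dependence is small compared to the target accuracy.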
In the upper insets of each panel of Fig. 21, theory predictions obtained with fastNLO at LO, NLO, and NNLO for the absolute differential distributions are compared to the CMS measurements. Statistical and systematic uncertainties are shown separately using error bars with different colors. In the lower insets, we show NNLO/NLO K-factors for the two calculations. We note that for the $y_t$, $y_{t\bar t}$, and $p_{T,t}$ distributions the overall NNLO correction is about 7%, whereas for $m_{t\bar t}$ the K-factor has a larger variation, ranging from 5% to 12%. We observe agreement between MATRIX and fastNLO at the percent level for all distributions, though they agree better in the $y_{t\bar t}$ than in the $m_{t\bar t}$ distribution.
In the upper insets of each panel in Fig. 22, the theory predictions that are compared to the ATLAS measurements are obtained with MATRIX using default parameters. Statistical and systematic uncertainties are displayed as in Fig. 21. In the lower insets, we show the MATRIX NNLO/NLO K-factor as well as the ratio between the APPLgrid theory prediction from MCFM and MATRIX, both computed at NLO. These NLO calculations agree within 1% accuracy. In addition, we note that the leading transverse momentum $p_{T,t_1}$ and $H_T^{t\bar t}$ distributions are affected by large QCD perturbative corrections, as their K-factors exhibit large variations.

NLO electroweak corrections
We explore the impact of the electroweak (EW) corrections on the $t\bar t$ differential distributions at 13 TeV and discuss their effect on our global PDF analyses.
EW corrections are computed as K-factors using the multiplicative scheme described in Ref. [111]; these K-factors are available in the repository [175]. These EW corrections include contributions of order $\mathcal{O}(\alpha_s^2\alpha)$ as well as subleading ones, of order $\mathcal{O}(\alpha_s\alpha^2)$ and $\mathcal{O}(\alpha^3)$. EW corrections are also incorporated in MadGraph5_aMC@NLO [109,201], which performs differential cross section calculations in an automated fashion, up to NLO in both couplings. Recently, MadGraph5_aMC@NLO has been interfaced with the PineAPPL library [112] to obtain fast interpolation grids which include EW corrections up to $\mathcal{O}(\alpha_s^2\alpha)$. In addition, EW corrections for top-quark pair production at hadron colliders are also included in MCFM [113].

FIG. 21. Theoretical predictions for the $m_{t\bar t}$, $y_{t\bar t}$, $y_t$, and $p_{T,t}$ distributions compared to the 13 TeV CMS measurements in the dilepton channel [24]. The CT18 NNLO PDFs [126], scale choice $\mu_F = \mu_R = H_T/4$, and pole mass $m_t^{(\mathrm{pole})} = 172.5$ GeV are selected here as default parameters.
In Fig. 23, we illustrate EW corrections from Czakon et al. [111,175], PineAPPL [109,112], and MCFM [113], which are defined as multiplicative K-factors $K_{\rm EW} = (\mathrm{QCD} \times \mathrm{EW})/\mathrm{QCD}$. The 13 TeV $t\bar t$ theory predictions for the differential distributions are calculated using the CT18NNLO PDFs, and use the same bins shared by the ATLAS distributions in the lepton+jets [21] channel and the CMS distributions in the dilepton channel [24].
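Written per bin, the K-factor definition above can be sketched as follows. The bin index $i$ and the symbol $\sigma_{\rm th}$ are notation introduced here for illustration, and the second relation assumes that the K-factor is applied multiplicatively to the NNLO QCD prediction used in the fit.

```latex
% Assumed notation: i labels a bin of a given differential distribution.
K_{\rm EW}^{(i)} = \frac{\sigma^{(i)}_{\rm QCD \times EW}}{\sigma^{(i)}_{\rm QCD}},
\qquad
\sigma^{(i)}_{\rm th} = K_{\rm EW}^{(i)} \, \sigma^{(i)}_{\rm NNLO\ QCD}.
```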
While the calculations produce almost identical shapes in the various distributions, differences of 1% or smaller are found, mainly driven by the subleading EW corrections. In addition, we observe that the rapidity distributions $y_t$ and $y_{t\bar t}$ appear to be more stable against EW corrections than $m_{t\bar t}$ and $p_{T,t}$, with most of the impact affecting bins at large rapidity.
EW corrections calculated using multiplicative EW K-factors are explored in our global PDF analyses. We observe essentially no sizable impact on either the central-value PDFs or their uncertainties. The impact of EW corrections in our PDF analyses is negligible given the current size of the experimental errors and other theoretical uncertainties affecting the calculation in the global fit.

[...] which we refer to as $N(0,1)$; this can be used to test the goodness of fit (see ref. [202] and references therein for more details). The experimental collaborations at the LHC often present experimental uncertainties using the covariance matrix representation (see the discussion in Sec. II B). Therefore, to examine the normal distribution of shifted residuals, a conversion to the nuisance parameter representation is required. This conversion is generally not unique, especially when the statistical and uncorrelated systematic errors are not fully specified, and when correlated systematic uncertainties and their sources are not explicitly known. In this case, a question arises about finding the optimal strategy to factor the covariance matrix and perform the conversion to nuisance parameters, and to select the optimal number $N_{\rm corr}$ of correlated sources that captures most of the features of the true correlated sources of uncertainty and does not introduce artificial fluctuations in the $\chi^2$ calculation (see the discussion in ref. [203]). In this work, when the information on the experimental uncertainties is not fully provided with the measurements used in our analysis (e.g., the CMS13ll and CMS13lj measurements), we express the covariance matrix in terms of the nuisance parameter representation by using a version of the $\Sigma + K$ decomposition, which is an iterative procedure introduced in the CT18 study [126] to obtain the nuisance parameter representation from the covariance matrix. This allows us to numerically match the $\chi^2$ in the two representations and obtain identical values.
An extended discussion of this problem, in which we consider alternative methods of performing the conversion from the covariance matrix to the nuisance parameter representation, and study the problem of finding an optimal representation for the independent experimental errors that limits artificial fluctuations in the calculation of the $\chi^2$ function, will be addressed in a separate work. Independent discussions on the treatment of the experimental uncertainty correlations in global PDF analyses can also be found in refs. [202,203].
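The statement that the $\chi^2$ takes identical values in the two representations can be checked numerically in a simplified setting. The sketch below does not implement the $\Sigma + K$ decomposition; it assumes the covariance matrix is already of the form $\mathrm{diag}(s_i^2) + \beta\beta^T$, with uncorrelated errors $s_i$ and a correlated-error matrix $\beta$ with $N_{\rm corr}$ columns. All names and numbers are hypothetical.

```python
import numpy as np

def chi2_covariance(residual, s_uncorr, beta):
    """chi^2 = r^T C^{-1} r with C = diag(s^2) + beta beta^T."""
    cov = np.diag(s_uncorr**2) + beta @ beta.T
    return float(residual @ np.linalg.solve(cov, residual))

def chi2_nuisance(residual, s_uncorr, beta):
    """chi^2 minimized analytically over the nuisance parameters lambda.

    chi^2(lambda) = sum_i [(r_i - sum_a beta_ia lambda_a) / s_i]^2
                    + sum_a lambda_a^2
    Returns the minimum, the best-fit lambdas, and the shifted residuals.
    """
    w = 1.0 / s_uncorr**2
    a = np.eye(beta.shape[1]) + beta.T @ (w[:, None] * beta)
    b = beta.T @ (w * residual)
    lam = np.linalg.solve(a, b)
    shifted = (residual - beta @ lam) / s_uncorr
    return float(shifted @ shifted + lam @ lam), lam, shifted

# Hypothetical toy inputs: 8 data points, 3 correlated sources.
rng = np.random.default_rng(0)
n_pt, n_corr = 8, 3
s_uncorr = rng.uniform(0.5, 1.5, n_pt)
beta = rng.normal(0.0, 0.3, (n_pt, n_corr))
residual = rng.normal(0.0, 1.0, n_pt)  # data minus theory

chi2_cov = chi2_covariance(residual, s_uncorr, beta)
chi2_np, lam, shifted = chi2_nuisance(residual, s_uncorr, beta)
print(chi2_cov, chi2_np)
```

In this toy setup the two values coincide up to numerical precision, and the `shifted` array contains the shifted residuals whose distribution can be compared with a standard normal. The non-uniqueness discussed above arises because a published covariance matrix does not by itself determine how it should be split into uncorrelated and correlated parts.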

FIG. 1. Central scale choices for the $m_{t\bar t}$ (left) and $|y_{t\bar t}|$ (right) theory predictions compared to the CMS13lj measurements. The two central scales are represented by solid lines of different colors: $\mu_R = \mu_F = H_T/2$ is red, while $H_T/4$ is green.

FIG. 2. PDF correlation-cosine plots [187] as a function of the momentum fraction $x$ for the ATL13had [22] measurements added on top of the CT18NNLO baseline at $Q = 100$ GeV.
FIG. 17. Global fit without $t\bar t$ and jet data vs CT18NNLO. Left: PDF ratio to CT18. Right: Error bands comparison.
FIG. 20. $L_2$ sensitivity of the various PDFs for the ATL13lj measurements in the CT18+nTT1 (left) and CT18+nTT2 (right) global fits. Here, the central scale in the 13 TeV $t\bar t$ theory predictions is set to $H_T/2$.

FIG. 22. Theoretical predictions for the $m_{t\bar t}$, $y_{t\bar t}$, $p_{T,t_1}$, and $H_T^{t\bar t}$ distributions compared to the 13 TeV ATLAS measurements in the all-hadronic channel [22]. Here, default parameters are used as in Fig. 21.

FIG. 23. Comparison of the EW corrections from Czakon et al. [111], PineAPPL [112], and MCFM [113] for $t\bar t$ production at the LHC at 13 TeV, as measured by the ATLAS collaboration in the lepton+jets channel [21] and by CMS in the dilepton channel [24].
The most relevant information is obtained from the $m_{t\bar t}$, $y_{t\bar t}$, $H_T^{t\bar t}$, and $y^B_{t\bar t}$ distributions, for which bin-by-bin statistical correlations are available. Their $\chi^2/N_{pt}$ values from the NNLO global fit in

TABLE III. Results of the NNLO global QCD analysis for all the 13 TeV measurements from ATLAS and CMS included one at a time, and in a combined manner with statistical correlations when these are available. Also included are the $\chi^2/N_{pt}$ values from ePump. The analysis is performed by adopting different central-scale choices for the 13 TeV $t\bar t$ measurements.
ATLAS lepton+jets channel. Impact of the $m_{t\bar t}$ and $y_{t\bar t}$ distributions on the gluon PDF in an NNLO global fit. Distributions are shown for two different binning resolutions, labeled "ATLASBin" and "CMSBin", respectively. Error bands with different hatching represent PDF uncertainties at 90% CL. The central scale in their theory predictions is set to $H_T/4$.

Impact of statistical correlations in the ATL13lj data. Impact of statistical correlations on the gluon PDF in an NNLO global QCD analysis. Left: Impact on the error bands. Right: PDF ratios to CT18 PDFs.

The $y_{t\bar t}$, $y^B_{t\bar t}$, $m_{t\bar t}$, and $H$ [...] $p_{T,t_1}$ and $p_{T,t_2}$ distributions, added one at a time in the NNLO global QCD analysis. Error bands with different hatching represent PDF uncertainties at 90% CL. The central-scale choice in the theory predictions is set to $H_T/4$.
constraining power of these data in this kinematic region is limited by large experimental uncertainties that affect these distributions. For all distributions, the impact on PDF uncertainties is negligible. The $\chi^2/N_{pt}$ in Tab. III is of order 1 for the $y_{t\bar t}$ distribution regardless of the central scale choice in the theory predictions. Moreover, the $m_{t\bar t}$ and $H_T^{t\bar t}$ distributions are in general not well described, while [...]

FIG. 8. ATLAS all-hadronic channel. Impact of the $y_{t\bar t}$, $m_{t\bar t}$, $H_T^{t\bar t}$, p [...]

CMS lepton+jets channel. Impact of the $y_{t\bar t}$ and $m_{t\bar t}$ distributions added one at a time in the NNLO global fit. Error bands with different hatching represent PDF uncertainties at 90% CL.

TABLE IV. Data sets of the extended NNLO global QCD analysis including the optimal combinations CT18+nTT1 and CT18+nTT2. Here we directly compare the quality of fit found for CT18+nTT1 and CT18+nTT2 vs. CT18 NNLO on the basis of $\chi^2/N_{pt}$ and scale choices $\{H_T/2, H_T/4\}$ for the central theory predictions of the $t\bar t$ data at 13 TeV. We only report the CT18 data sets for which the $\chi^2/N_{pt}$ exhibits a noticeable change.

Error bands with different hatching represent different choices for the central scale in the 13 TeV $t\bar t$ theory predictions: $H_T/4$ (green) and $H_T/2$ (red). PDF uncertainties are evaluated at the 90% CL.

TABLE V. Fractional contribution of leading eigenvectors (EV) from ePump optimization for each data set in the CT18+nTT1 combination. The second column shows the sum of fractional contributions from individual data sets.

TABLE VI. Same as in Tab. V, but for the CT18+nTT2 combination.