Measurement of the integrated and differential $t\bar{t}$ production cross sections for high-$p_T$ top quarks in pp collisions at $\sqrt{s} = 8$ TeV

The cross section for pair production of top quarks ($t\bar{t}$) with high transverse momenta is measured in pp collisions, collected with the CMS detector at the LHC with $\sqrt{s} = 8$ TeV in data corresponding to an integrated luminosity of 19.7 fb$^{-1}$. The measurement is performed using lepton+jets events, where one top quark decays semileptonically, while the second top quark decays to a hadronic final state. The hadronic decay is reconstructed as a single, large-radius jet, and identified as a top quark candidate using jet substructure techniques. The integrated cross section and the differential cross sections as a function of top quark $p_T$ and rapidity are measured at particle level within a fiducial region related to the detector-level requirements and at parton level. The particle-level integrated cross section is found to be $\sigma_{t\bar{t}} = 0.499 \pm 0.035\,(\text{stat+syst}) \pm 0.095\,(\text{theo}) \pm 0.013\,(\text{lumi})$ pb for top quark $p_T > 400$ GeV. The parton-level measurement is $\sigma_{t\bar{t}} = 1.44 \pm 0.10\,(\text{stat+syst}) \pm 0.29\,(\text{theo}) \pm 0.04\,(\text{lumi})$ pb. The integrated and differential cross section results are compared to predictions from several event generators.


I. INTRODUCTION
Measurements of top quark pair ($t\bar{t}$) production cross sections provide crucial information for testing the standard model (SM) and the accuracy of predictions from Monte Carlo (MC) generators. The CMS [1] and ATLAS [2] Collaborations at the CERN LHC have previously measured the differential $t\bar{t}$ cross sections at $\sqrt{s} = 7$ and 8 TeV as a function of transverse momentum ($p_T$) and other kinematic properties of the top quarks and the overall $t\bar{t}$ events [3-9]. These measurements use events where each parton from the top quark decay is associated with a distinct jet. However, when top quarks are produced with large Lorentz boosts, their decays are often collimated and the final decay products may be merged. For a top quark with a Lorentz boost of $\gamma = E/m$, where $E$ is the energy and $m$ the mass of the top quark, the angle $\Delta R$ in radians between the W boson and the b quark from the top quark decay is approximately $\Delta R = 2/\gamma$. In this paper, a measurement of the $t\bar{t}$ production cross section is presented utilizing jet substructure techniques to enhance sensitivity in the kinematic region with high-$p_T$ top quarks. Accurate modeling of the boosted top quark regime is important as it is sensitive to many physics processes beyond the SM, as discussed, for example, in Ref. [10]. This paper presents the first CMS measurement of the $t\bar{t}$ production cross section in the boosted regime. The cross section is measured as a function of the top quark transverse momentum ($p_T^t$) and rapidity ($y^t$) for $p_T^t > 400$ GeV, corresponding to the upper $p_T$ range covered by the CMS measurement in Ref. [4]. A dedicated measurement of $t\bar{t}$ production in the boosted regime has recently been reported by the ATLAS Collaboration [11].
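As a rough numerical illustration of the relation $\Delta R \approx 2/\gamma$, the sketch below evaluates the approximate W-b opening angle for a few top quark $p_T$ values. The top quark mass of 172.5 GeV and the choice of central rapidity are assumptions made for this example only; neither value is quoted in the text.

```python
import math

TOP_MASS_GEV = 172.5  # assumed value for illustration; not taken from the text


def approx_decay_opening_angle(pt_gev, rapidity=0.0, mass_gev=TOP_MASS_GEV):
    """Approximate W-b separation, Delta R ~ 2/gamma, with gamma = E/m."""
    transverse_mass = math.hypot(pt_gev, mass_gev)
    energy = transverse_mass * math.cosh(rapidity)
    gamma = energy / mass_gev
    return 2.0 / gamma


for pt in (200.0, 400.0, 600.0):
    print(f"pT = {pt:5.0f} GeV -> Delta R ~ {approx_decay_opening_angle(pt):.2f}")
```

Under these assumptions the opening angle at $p_T = 400$ GeV comes out close to 0.8, which is comparable to the size of a large-radius jet.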
The analysis is performed for events in lepton+jets final states where one top quark decays according to $t \to Wb \to \ell\nu b$, with $\ell$ denoting an electron or a muon, and the second top quark decays to quarks ($t \to Wb \to q\bar{q}'b$). Lepton+jets final states originating from W boson decays to τ leptons ($t \to Wb \to \tau\nu b \to \ell\nu\nu b$) are treated as background. The boosted top quark that decays to a hadronic final state is reconstructed as a single, large-radius (large-R) jet. Jet substructure techniques similar to those used in Refs. [12,13] are applied to identify those large-R jets originating from top quarks (t-tagged jets). A maximum-likelihood fit is performed to extract the background normalizations, the t tagging efficiency, and the integrated $t\bar{t}$ production cross section for $p_T^t > 400$ GeV. The results are presented at the particle level in a fiducial region similar to the event selection criteria to minimize the dependence on theoretical input, and fully corrected to the parton level. Differential $t\bar{t}$ cross sections are also measured at the particle (parton) level as a function of the t-tagged jet (top quark) $p_T$ and $y$ after subtracting the background contributions and correcting for inefficiencies and bin migrations.

II. THE CMS DETECTOR, EVENT RECONSTRUCTION, AND EVENT SAMPLES
The CMS detector [1] is a general-purpose detector that uses a silicon tracker, a finely segmented lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL). These subdetectors have full azimuthal coverage and are contained within the bore of a superconducting solenoid that provides a 3.8 T axial magnetic field. Charged particles are reconstructed in the tracker, covering a pseudorapidity [1] range of $|\eta| < 2.5$. The surrounding ECAL and HCAL provide coverage for photon, electron, and jet reconstruction for $|\eta| < 3$. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. Events are reconstructed using the particle-flow algorithm [14,15], which identifies each particle with an optimized combination of all subdetector information. The missing transverse momentum vector $\vec{p}_T^{\,\text{miss}}$ is defined as the projection on the plane perpendicular to the beams of the negative vector sum of the momenta of all reconstructed particles in an event. Its magnitude is referred to as $E_T^{\text{miss}}$. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [1].
The measurement is performed using the CMS data recorded at $\sqrt{s} = 8$ TeV, corresponding to an integrated luminosity of $19.7 \pm 0.5$ fb$^{-1}$ [16]. For the e+jets channel, data are collected with a trigger requiring an electron with $p_T > 30$ GeV and $|\eta| < 2.5$, at least one jet with $p_T > 100$ GeV, and at least one additional jet with $p_T > 25$ GeV. For the μ+jets channel, the trigger demands a muon with $p_T > 40$ GeV and $|\eta| < 2.1$, with no jet requirements. At the trigger level, the leptons are not required to be isolated.
Simulated events are used to estimate the efficiency to reconstruct the $t\bar{t}$ signal, evaluate the systematic uncertainties, and model most of the background contributions. Samples of $t\bar{t}$ and electroweak single top quark events are generated using the next-to-leading-order (NLO) MC generator POWHEG (v. 1.0) [17-21], while W boson production in association with jets is generated with the leading-order (LO) generator MADGRAPH (v. 5.1.3.30) [22]. Additional $t\bar{t}$ samples, generated using MADGRAPH and the NLO generator MC@NLO (v. 3.41) [23], are used for comparison with POWHEG. The MC@NLO production is interfaced to HERWIG (v. 6.520, referred to as HERWIG6 in the following) [24] for parton showering, while all other generators are interfaced to PYTHIA (v. 6.426, referred to as PYTHIA6) [25]. For the samples produced with MADGRAPH, the MLM prescription [26] is applied for matching of matrix-element jets to parton showers. The most recent PYTHIA Z2* tune is used. It is derived from the Z1 tune [27], which uses the CTEQ5L parton distribution function (PDF) set, whereas Z2* adopts CTEQ6L [28]. The POWHEG $t\bar{t}$ and single top quark samples are generated using the CT10 next-to-next-to-leading-order (NNLO) [29] PDFs, while the MC@NLO $t\bar{t}$ sample uses the NLO CTEQ6M [28] PDF set. The LO CTEQ6L1 [28] PDF set is used for the MADGRAPH $t\bar{t}$ and W+jets samples. All generated events are propagated through a simulation of the CMS detector based on GEANT4 (v. 9.4) [30].
The simulated events are corrected to match the conditions observed in data. All simulated events are reweighted to reproduce the distribution of the number of primary vertices that arises from additional pp interactions within the same or neighboring bunch crossings (pileup), as measured in data. The jet energy resolution is corrected by scaling the difference between the generated and the reconstructed jet momentum so that the resolution matches that observed in data [31]. Lepton trigger and identification efficiencies are also corrected for differences between data and simulation. Jet energy corrections are obtained from the simulation and further corrections are applied to data from in situ measurements using the energy balance in dijet and photon+jet events [31]. The contribution to the jet energy in data from pileup is removed using the area-based subtraction technique outlined in Ref. [32], augmented by corrections from data as a function of the jet η, as described in Ref. [31].
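Two of the corrections described above lend themselves to a short sketch: the resolution scaling of simulated jets and the pileup reweighting. The function names, the histogram representation as plain lists, and the example scale factor are assumptions introduced for illustration; the actual correction factors are those of Ref. [31] and are not reproduced here.

```python
def smear_jet_pt(reco_pt, gen_pt, scale):
    """Scale the reco-gen difference so the simulated resolution widens by `scale`.

    `scale` is a hypothetical data-to-simulation resolution factor (> 1 widens).
    """
    return max(0.0, gen_pt + scale * (reco_pt - gen_pt))


def pileup_weights(data_counts, mc_counts):
    """Per-bin weights that map the simulated vertex-multiplicity shape onto data.

    Both inputs are histograms of the number of primary vertices with identical
    binning; each is normalized to unit area before taking the ratio.
    """
    data_total = float(sum(data_counts))
    mc_total = float(sum(mc_counts))
    weights = []
    for n_data, n_mc in zip(data_counts, mc_counts):
        if n_mc > 0:
            weights.append((n_data / data_total) / (n_mc / mc_total))
        else:
            weights.append(0.0)  # no simulated events to reweight in this bin
    return weights
```

A simulated event with a given number of primary vertices would then be weighted by the entry of `pileup_weights` for the corresponding bin.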

III. EVENT SELECTION
Jet clustering is performed with the FASTJET package (v. 3.1) [33]. Two jet clustering algorithms are used in the measurement. The anti-$k_T$ algorithm [34] with a distance parameter $R = 0.5$ is used to reconstruct jets that are hereafter referred to as small-R jets. Lepton candidates that are found within $\Delta R < 0.5$ of a jet, where $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$ and $\Delta\eta$ and $\Delta\phi$ are the pseudorapidity and azimuthal angle (in radians) differences between the direction of the lepton and the jet, are subtracted from the jet four-vector to avoid including such leptons within jets. The small-R jets are required to have $p_T > 30$ GeV and $|\eta| < 2.4$. Small-R jets that are identified as originating from a bottom (b) quark through the use of an algorithm that combines secondary-vertex and track-based lifetime information [35,36] are classified as being b tagged. The algorithm working point used has an efficiency for tagging a b jet of ≈65%, while the probability to misidentify light-flavor jets as b jets is ≈1.5%. The secondary-vertex mass of the b-tagged jet ($m_{\text{vtx}}$) is defined as the invariant mass of the tracks associated with the secondary vertex, assuming that each particle has the pion mass. Jets that are b tagged are also required to have a secondary vertex (resulting in a small change in the efficiency). Differences in b tagging efficiency and misidentification rates between data and simulated events are accounted for through scale factors applied to the simulation.
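The angular separation $\Delta R$ defined above can be computed as in the following minimal sketch. The only subtlety is that the azimuthal difference must be wrapped into $[-\pi, \pi]$; the function names are chosen here for illustration.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval [-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Delta R = sqrt((delta eta)^2 + (delta phi)^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))
```

Without the wrapping, a lepton at $\phi = 3.1$ and a jet at $\phi = -3.1$ would appear widely separated although they are nearly collinear in azimuth.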
The second jet clustering algorithm is the Cambridge-Aachen (CA) algorithm [37,38], used to reconstruct large-R jets with a distance parameter $R = 0.8$. These jets are required to have $p_T > 400$ GeV, where this lower $p_T$ bound is set such that the top quark decay products are typically fully merged for $R = 0.8$. The kinematics of the large-R jet is used for the $p_T^t$ and $y^t$ measurements. The CMS top quark tagging algorithm [39], using large-R jets as input, is employed in this measurement to identify top quark candidates decaying hadronically. The algorithm begins by identifying subjets through recursive declustering of the original large-R jet, reversing the clustering sequence of the CA algorithm. First, the last clustering step is reversed, splitting the large-R jet $j$, with transverse momentum denoted as $p_T^j$, into two subjets $j_1$ and $j_2$, with transverse momenta $p_T^{j_1}$ and $p_T^{j_2}$. If the two subjets satisfy $\Delta R(j_1, j_2) > 0.4 - 0.0004\,p_T^j$, with $p_T^j$ in GeV, they are passed to the next step of the algorithm; if not, they are reclustered and the parent is labeled as a hard subjet. Each subjet is required to satisfy $p_T^{j_i} > 0.05\,p_T^j$; otherwise, the subjet is discarded. A secondary decomposition is next applied to the subjet(s), identifying up to a maximum of four hard subjets.
The large-R jet that is identified as a t jet candidate is required to contain at least three subjets, corresponding to the presumed b, q, and $\bar{q}'$ fragmentation products. In addition, the minimum pairwise invariant mass of the three subjets of highest $p_T$ is required to be greater than 50 GeV, as expected for the $t \to Wb$ decay, and the total jet invariant mass $m_j$ is required to be consistent with the top quark mass by demanding $140 < m_j < 250$ GeV. Large-R jets which fulfill these requirements are labeled as t-tagged jets. The cumulative efficiency for these t tagging requirements is about 25% for $|\eta| < 1.0$ and 13% for $1.0 < |\eta| < 2.4$ [39]. The difference in the t tagging efficiency between data and simulation is accounted for through a scale factor applied to the simulation that is derived using a maximum-likelihood fit.
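The substructure requirements described above can be summarized in a short sketch. It assumes that the subjets have already been found by the declustering procedure and are given as `(px, py, pz, E)` tuples in GeV; the function names and this data layout are illustrative and do not correspond to the CMS implementation of Ref. [39].

```python
import math
from itertools import combinations


def invariant_mass(vectors):
    """Invariant mass of the summed (px, py, pz, E) four-vectors."""
    px, py, pz, e = (sum(v[i] for v in vectors) for i in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def transverse_momentum(vector):
    return math.hypot(vector[0], vector[1])


def passes_declustering(delta_r_subjets, parent_pt_gev):
    """Subjet separation criterion: Delta R(j1, j2) > 0.4 - 0.0004 * pT(parent)."""
    return delta_r_subjets > 0.4 - 0.0004 * parent_pt_gev


def is_t_tagged(subjets, jet_mass_gev):
    """At least 3 subjets, min pairwise mass > 50 GeV, 140 < m_j < 250 GeV."""
    if len(subjets) < 3:
        return False
    leading = sorted(subjets, key=transverse_momentum, reverse=True)[:3]
    min_pair_mass = min(invariant_mass(pair) for pair in combinations(leading, 2))
    return min_pair_mass > 50.0 and 140.0 < jet_mass_gev < 250.0
```

The minimum pairwise mass is evaluated only on the three subjets of highest $p_T$, as stated in the text, even when a fourth subjet is present.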
Electrons [40] and muons [41] must have, respectively, $p_T > 35$ GeV and 45 GeV, and $|\eta| < 2.5$ and 2.1, where the differences are a consequence of the requirements on the respective lepton triggers. Since leptons from high-$p_T$ top quark decays are often emitted close to their accompanying b jets, they may not be well-isolated. To reject background contributions from jets misidentified as leptons, the leptons must pass a two-dimensional (2D) selection, requiring either $\Delta R(\ell, \text{closest small-}R\text{ jet}) > 0.5$ or $p_T^{\text{rel}} > 25$ GeV, where $p_T^{\text{rel}}$ is the component of the lepton $p_T$ perpendicular to the axis of the closest small-R jet. An additional criterion is applied in the electron channel to further reduce the multijet background contribution from mismeasured jets. The requirement ensures that $\vec{p}_T^{\,\text{miss}}$ does not point parallel to the direction of either the electron (e) or the highest-$p_T$ jet (j) for low-$E_T^{\text{miss}}$ events: $|\Delta\phi(\{e \text{ or } j\}, \vec{p}_T^{\,\text{miss}}) - 1.5| < E_T^{\text{miss}}/50\,\text{GeV}$. Events that contain more than one lepton with $p_T > 20$ GeV and $|\eta| < 2.5$ (2.1) for electrons (muons) are rejected.
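The 2D lepton selection and the additional electron-channel requirement can be sketched as follows. Two points are assumptions of this example rather than statements from the text: $p_T^{\text{rel}}$ is computed from the three-momenta as $|\vec{p}_\ell \times \vec{p}_{\text{jet}}| / |\vec{p}_{\text{jet}}|$, and the absolute value of $\Delta\phi$ is used in the second requirement.

```python
import math


def pt_rel(lepton_p3, jet_p3):
    """Lepton momentum component perpendicular to the jet axis (assumed 3D form)."""
    cx = lepton_p3[1] * jet_p3[2] - lepton_p3[2] * jet_p3[1]
    cy = lepton_p3[2] * jet_p3[0] - lepton_p3[0] * jet_p3[2]
    cz = lepton_p3[0] * jet_p3[1] - lepton_p3[1] * jet_p3[0]
    jet_norm = math.sqrt(sum(c * c for c in jet_p3))
    return math.sqrt(cx * cx + cy * cy + cz * cz) / jet_norm


def passes_2d_selection(delta_r_closest_jet, pt_rel_gev):
    """Pass if the lepton is far from the closest jet OR has large pT,rel."""
    return delta_r_closest_jet > 0.5 or pt_rel_gev > 25.0


def passes_met_direction_cut(dphi_object_met, met_gev):
    """Electron-channel requirement |dphi - 1.5| < MET / (50 GeV)."""
    return abs(abs(dphi_object_met) - 1.5) < met_gev / 50.0
```

In the electron channel, `passes_met_direction_cut` would be applied twice, once with the electron and once with the highest-$p_T$ jet as the reference object.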
Events selected for the analysis must contain exactly one electron or muon, at least one small-R jet near the lepton ($\Delta R(\ell, \text{jet}) < \pi/2$, referred to as the leptonic side), and one large-R jet away from the lepton ($\Delta R(\ell, \text{jet}) > \pi/2$, referred to as the hadronic side). These events are next separated into three exclusive event categories with different signal and background admixtures: "0t", "1t+0b", and "1t+1b". The 0t events are defined by requiring that no hadronic-side jet pass the t tagging selection. For the 1t+0b events, the hadronic-side jet must pass the t tagging selection, and no leptonic-side jets can be b tagged. The third category of 1t+1b events must contain both a hadronic-side t-tagged jet and a leptonic-side b-tagged jet. The 0t sample is dominated by background events, primarily from W+jets production, while the signal and background yields for the 1t+0b sample are expected to be of comparable size. The 1t+1b sample is dominated by signal events.
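The assignment of a selected event to one of the three exclusive categories can be written compactly. The sketch below assumes the t tagging decision for the hadronic-side jet and the number of b-tagged leptonic-side jets have already been determined; the function and argument names are illustrative.

```python
def event_category(hadronic_jet_is_t_tagged, n_leptonic_side_b_tags):
    """Return "0t", "1t+0b", or "1t+1b" for an event passing the preselection."""
    if not hadronic_jet_is_t_tagged:
        return "0t"
    if n_leptonic_side_b_tags > 0:
        return "1t+1b"
    return "1t+0b"
```

Because the b tagging status is only consulted after the t tagging decision, every preselected event falls into exactly one category.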

IV. BACKGROUND ESTIMATION
The dominant sources of background are single top quark production (primarily from the Wt channel), W+jets production, and multijet production. In addition, $t\bar{t}$ events with decays to τ+jets (resulting in either hadronic or leptonic final states) or any other than e/μ+jets final states are treated as background in the measurement, and hereafter referred to as "$t\bar{t}$ other". Other sources of background, including diboson, Z+jets, WH, and $t\bar{t}W/Z$ production, were found to be negligible. All background normalizations are extracted through a maximum-likelihood fit discussed in Section VI, while the signal and all background distributions are modeled using simulation, except multijet production, which is obtained from data. The $t\bar{t}$ other contribution is constrained to have the same relative normalization as the $t\bar{t}$ signal in the likelihood fit.
The background from multijet production is estimated using control samples in data. Multijet templates for each event category (0t, 1t+0b, 1t+1b) are extracted using control samples, defined by inverting the 2D lepton-jet separation requirement and subtracting residual contributions (corresponding to 3-15% of events in the control samples) from $t\bar{t}$, single top quark, and W+jets events. An initial multijet background normalization is obtained for each event category from a fit of multijet and other signal and background templates to the $E_T^{\text{miss}}$ distribution in data.

V. SYSTEMATIC UNCERTAINTIES
Systematic uncertainties in the measurement arise from reconstruction and detector resolution effects, background estimation, and theoretical uncertainty in the modeling of signal. The dominant experimental uncertainty is the uncertainty in the t tagging efficiency. The different sources of systematic uncertainty are described in detail below.
The uncertainty in the t tagging efficiency and the corresponding data-to-simulation correction factor are evaluated in Ref. [39]. Since there is a large overlap between those events and events in the signal region in this measurement, and since the t tagging efficiency is strongly anticorrelated with the $t\bar{t}$ cross section measurement, the t tagging efficiency and its uncertainty are determined simultaneously with the cross section (see Sec. VI A). The resulting efficiency is in agreement with the previous measurement [39].
The uncertainties in jet energy scale are estimated by changing the jet energy as a function of jet $p_T$ and η by ±1 standard deviation [31]. These uncertainties, which include differences in jet response between light- and heavy-flavor jets, have been measured for anti-$k_T$ jets with distance parameters of $R = 0.5$ and 0.7, but not for $R = 0.8$ CA jets. The response of the $R = 0.8$ CA jets is estimated in simulation to be within 1% of the response of $R = 0.7$ anti-$k_T$ jets. This is checked by comparing the reconstructed W boson mass in data and simulation in moderately boosted $t\bar{t}$ events (outside of the signal region). An additional 1% uncertainty is used to account for the small differences observed in these studies. The jet energy scale uncertainties for $R = 0.5$ and $R = 0.8$ jets are treated as fully correlated.
The jet energy resolution is known to be about 10% worse in data than in simulation, and the resolution is therefore adjusted in simulation, using smearing factors in bins of jet η [31]. An associated systematic uncertainty is obtained by rescaling the resolution smearing in simulation by ±1 standard deviation. This corresponds to changes in the smearing of about ±(2.4-5.0)%, depending on η. The effects of the jet mass scale and jet mass resolution were found to be very small compared to those from the jet energy. These are accounted for with the data-to-simulation correction factor.
The uncertainties associated with the jet energy scale and resolution are propagated to the estimation of the $E_T^{\text{miss}}$. The uncertainty in the modeling of the large-R jet mass, which was measured in Ref. [42], is also accounted for through propagating the jet energy uncertainties to the full jet four-vector.
In addition to uncertainties in the distributions, we also consider several normalization uncertainties affecting the signal yield. The uncertainties in background yields are taken into account in the combined signal-and-background maximum-likelihood fit by changing the W+jets, single top quark, and multijet normalizations, assuming conservative log-normal prior uncertainties of ±50%, ±50%, and ±100%, respectively. The background normalizations are constrained in the maximum-likelihood fit, and the corresponding background uncertainties are extracted as the ±1 standard deviation uncertainties in the fit. In addition, the statistical uncertainty resulting from the finite sizes of the simulated samples is included. The uncertainty in the measurement of the integrated luminosity of ±2.6% [16] is also included.
The uncertainty in the pileup modeling is evaluated by varying the total inelastic pp cross section used in the simulation within its uncertainty of ±5% [43]. The resulting uncertainty in the cross section measurements is less than 1%.
Systematic uncertainties from the lepton trigger and corrections to the lepton identification efficiencies that are applied to all simulated events contribute negligibly to the uncertainty in the cross section measurement. This includes the lepton η dependence of these uncertainties. The uncertainty in the b tagging efficiency [35,36] is also considered, but has a negligible impact on the final result since the measurements are performed by combining events in the 1t+0b and 1t+1b event categories. Uncertainties pertaining to the modeling of the secondary-vertex mass, which is one of the variables used in the maximum-likelihood fit, are negligible compared to the statistical uncertainty in the sample.
Theoretical uncertainties in the modeling of the $t\bar{t}$ events originate from the choice of PDF and renormalization and factorization ($\mu_R$ and $\mu_F$) scales, whose nominal values are chosen to be equal to the momentum transfer $Q$ in the hard scattering, given by $Q^2 = m_{\text{top}}^2 + \sum p_T^2$, where the summation runs over all final-state partons in the event. The uncertainty in the modeling of the hard-scattering process is evaluated using samples where the renormalization and factorization scales are simultaneously changed up ($2Q$) or down ($Q/2$). The uncertainty from the PDF is evaluated using the up and down eigenvector outputs from the NNLO PDF sets CT10 [29], MSTW 2008 [44], and NNPDF2.3 [45], following the PDF4LHC prescription [46,47]. An additional theoretical uncertainty is assigned to account for the choice of event generator and parton shower algorithm in extracting the integrated and differential cross sections, evaluated using MC@NLO+HERWIG6 (see Secs. VI A and VI C).

VI. CROSS SECTION MEASUREMENTS
The $t\bar{t}$ signal yield, background normalizations, and t tagging efficiency are extracted simultaneously using a binned, extended maximum-likelihood fit to different templates of several kinematic variables described below. First, the fit is used to determine the integrated $t\bar{t}$ cross section for $p_T^t > 400$ GeV, providing a simultaneous measurement of the cross section with nuisance parameters and constraints on the background yields in the data. The results are then used to obtain the differential $t\bar{t}$ cross section as a function of $p_T^t$ and $y^t$. The cross sections are presented at both the particle and parton levels.

A. Maximum-likelihood fit
Three exclusive event categories are used in the maximum-likelihood fit (0t, 1t+0b, 1t+1b), as defined in Section III. The lepton $|\eta|$ is used as the discriminant for events in the 0t and 1t+0b categories, while $m_{\text{vtx}}$ is used to discriminate $t\bar{t}$ events ($t\bar{t}$ signal and $t\bar{t}$ other are constrained to the same relative normalization in the fit) from non-$t\bar{t}$ background in the 1t+1b event category. The electron and muon channels are fitted separately, yielding a total of six categories. The maximum-likelihood fit is performed within the THETA framework [48].
Background normalizations and experimental systematic uncertainties are treated as nuisance parameters in the fit. Three of these (the jet energy scale, jet energy resolution, and t tagging efficiency) are built into the model as uncertainties in the input distributions. The event categories for the fit are designed such that the t tagging efficiency is constrained by the relative populations of events in the different categories. The $t\bar{t}$ cross section and the background normalizations are therefore correlated with these variables. The strongest correlation of the $t\bar{t}$ cross section is with the t tagging efficiency. A log-normal prior constraint is used for each nuisance parameter that corresponds to a normalization uncertainty, while uncertainties based on the form of the distributions are modeled with a Gaussian prior for the nuisance parameter, which is used to interpolate between the nominal and shifted templates. The e+jets and μ+jets events use common nuisance parameters for all systematic uncertainties and background normalizations, except for multijet backgrounds, which are taken as independent of each other. The total fitted uncertainties in the background yields are 46% for single top quark, 7.5% for $t\bar{t}$ other, 6.8% for W+jets production, and 47% and 17%, respectively, for the muon and electron multijet backgrounds.
A correction factor to account for small differences in the t tagging efficiency between data and simulation is also determined through the maximum-likelihood fit. While the dependence of this efficiency correction on the t jet η is taken from Ref. [39], an additional uncertainty to account for a potential dependence on $p_T^t$ is evaluated by performing separate fits for events with $p_T^t < 600$ GeV and $p_T^t > 600$ GeV. All other nuisance parameters are required to be the same in both $p_T^t$ regions for this check. An additional uncertainty of 17% is assigned for $p_T^t > 600$ GeV to account for the $p_T$ dependence, resulting in a total uncertainty in the t tagging efficiency of ±5% (±18%) for $p_T^t < 600$ ($> 600$) GeV. The measured normalizations in the signal and background yields, as determined from the maximum-likelihood fit, are given, together with the number of observed events in data, in Table I. The electron and muon channels are shown separately. The quoted uncertainties are from the total fit, and include the statistical components, but not the theoretical uncertainties in the $t\bar{t}$ signal. The total signal and background yields are consistent with the observed number of events in the data within about one standard deviation.
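The quoted total t tagging efficiency uncertainty for $p_T^t > 600$ GeV is numerically consistent with adding the 5% baseline and the additional 17% in quadrature. The text does not state the combination rule explicitly, so the quadrature sum in this short check is an assumption.

```python
import math


def add_in_quadrature(*relative_uncertainties):
    """Combine independent relative uncertainties (in percent) in quadrature."""
    return math.sqrt(sum(u * u for u in relative_uncertainties))


total_high_pt = add_in_quadrature(5.0, 17.0)
print(f"total t tagging uncertainty for pT > 600 GeV: {total_high_pt:.1f}%")
```

The result rounds to the ±18% given in the text.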
The distributions in $|\eta|$ and $m_{\text{vtx}}$ after the combined maximum-likelihood fit to e+jets and μ+jets events are shown in Fig. 1, comparing the fitted values of the model to the data from each of the fitted categories (0t, 1t+0b, 1t+1b). The uncertainty bands show the combined fitted statistical and experimental systematic uncertainties in the signal and backgrounds, added in quadrature neglecting correlations for presentational purposes, although the full likelihood with correlations is used to compute the uncertainties in the measurements of the cross section. The $p_T$ and $y$ distributions of the hadronic-side, large-R jet are shown for each category in Fig. 2. These figures show the data, together with the signal and background yields from simulation (or, for multijet background, from data enhanced with multijet events), using the normalizations from the fit, as well as the ratio of the data to the total fit. Since the $p_T^t$ and $y^t$ variables are not used in the fit, the signal and background distributions in Fig. 2 are taken from simulation (or the data sideband for the multijet background). In extracting the differential cross sections, these distributions are used for the backgrounds, while the signal is taken from the data after subtracting the background contributions.

B. Integrated $t\bar{t}$ cross section measurement
The measurement at the particle level is defined within a fiducial region designed to closely match the event selections in the detector and minimize the dependence on theoretical input. The measurement at the parton level is defined relative to the top and antitop quarks before they decay, but after they radiate any gluons. The POWHEG+PYTHIA6 simulation is used to determine the acceptance for the particle-level and parton-level selections and to obtain the predicted cross section values. The following particle-level selections are used to define the fiducial region in the simulation:
(i) One electron or muon with $p_T > 45$ GeV (computed prior to any potential photon radiation) and $|\eta| < 2.1$.
(ii) At least one anti-$k_T$ ($R = 0.5$) jet with $0.1 < \Delta R(\ell, \text{jet}) < \pi/2$, $p_T > 30$ GeV, and $|\eta| < 2.4$.
(iii) At least one CA ($R = 0.8$) jet with $\Delta R(\ell, \text{jet}) > \pi/2$, $p_T > 400$ GeV, $140 < m_j < 250$ GeV, and $|\eta| < 2.4$.
TABLE I. Predicted numbers of signal and background events, as well as the total yield, together with the observed number of events in data, are shown after the combined maximum-likelihood fit for the e+jets (top) and μ+jets (bottom) categories. The uncertainties include the statistical component from the fit, but not the theoretical uncertainties in the $t\bar{t}$ signal. The uncertainties in the sum of backgrounds and the total yield are determined neglecting correlations for presentational purposes, although the full likelihood with correlations is used to compute the uncertainties in the measurements of the cross section.
Jets at the particle level in the simulation are formed from stable particles, excluding electrons, muons, and neutrinos. The cross section at parton level is measured for the region where the top or antitop quark that decays to quarks has $p_T > 400$ GeV. No other kinematic requirements are imposed.
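The particle-level fiducial selections (i)-(iii) can be expressed as a single predicate. In this sketch each object is a dictionary with keys `pt`, `eta`, `phi` (and `mass` for large-R jets), in GeV and radians; this layout and the function names are assumptions for illustration, and the event is taken to contain exactly one candidate lepton.

```python
import math


def delta_r(obj_a, obj_b):
    dphi = (obj_a["phi"] - obj_b["phi"]) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return math.hypot(obj_a["eta"] - obj_b["eta"], dphi)


def in_fiducial_region(lepton, small_r_jets, large_r_jets):
    """Apply particle-level requirements (i), (ii), and (iii)."""
    # (i) lepton kinematics
    if not (lepton["pt"] > 45.0 and abs(lepton["eta"]) < 2.1):
        return False
    # (ii) at least one anti-kT R = 0.5 jet on the leptonic side
    has_leptonic_jet = any(
        0.1 < delta_r(lepton, jet) < math.pi / 2
        and jet["pt"] > 30.0 and abs(jet["eta"]) < 2.4
        for jet in small_r_jets
    )
    # (iii) at least one CA R = 0.8 jet on the hadronic side
    has_hadronic_jet = any(
        delta_r(lepton, jet) > math.pi / 2
        and jet["pt"] > 400.0 and 140.0 < jet["mass"] < 250.0
        and abs(jet["eta"]) < 2.4
        for jet in large_r_jets
    )
    return has_leptonic_jet and has_hadronic_jet
```

The fiducial acceptance would then be the fraction of generated signal events for which this predicate is true.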
The measurements at both the particle and parton levels are corrected for the branching fraction of $t\bar{t} \to$ e/μ+jets, determined from the $t\bar{t}$ simulation.
The integrated $t\bar{t}$ cross section is obtained from the $t\bar{t}$ signal yield in the maximum-likelihood fit. Uncertainties associated with the signal modeling are not included as nuisance parameters in the fit. These are instead evaluated through the difference in the signal acceptance from changes made in the $\mu_R$ and $\mu_F$ scales and PDF variations. The uncertainties from the choice of event generator and parton shower algorithm are also evaluated independently of the fit through the difference in the $t\bar{t}$ signal acceptance between the POWHEG+PYTHIA6 and MC@NLO+HERWIG6 predictions at the particle and parton levels.
The measurements of the integrated cross sections for $p_T^t > 400$ GeV are
particle level: $\sigma_{t\bar{t}} = 0.499 \pm 0.035\,(\text{stat+syst}) \pm 0.095\,(\text{theo}) \pm 0.013\,(\text{lumi})$ pb;
parton level: $\sigma_{t\bar{t}} = 1.44 \pm 0.10\,(\text{stat+syst}) \pm 0.29\,(\text{theo}) \pm 0.04\,(\text{lumi})$ pb.
The theoretical uncertainties from the PDF, $\mu_R$ and $\mu_F$ scales, and choice of event generator and parton shower algorithm are, respectively, 9%, 9%, and 14% at the particle level, and 9%, 10%, and 15% at the parton level.
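As a consistency check on the quoted numbers, the sketch below combines the three theoretical components in quadrature and applies the 2.6% luminosity uncertainty to the central values. The quadrature combination is an assumption of this example; small differences from the published values reflect rounding of the quoted percentages.

```python
import math


def quadrature(components):
    return math.sqrt(sum(c * c for c in components))


measurements = {
    # level: (central value in pb, [PDF %, scale %, generator/shower %])
    "particle": (0.499, [9.0, 9.0, 14.0]),
    "parton": (1.44, [9.0, 10.0, 15.0]),
}
LUMI_REL_UNC = 0.026  # integrated luminosity uncertainty quoted in the text

results = {}
for level, (sigma_pb, theo_percent) in measurements.items():
    theo_abs = sigma_pb * quadrature(theo_percent) / 100.0
    lumi_abs = sigma_pb * LUMI_REL_UNC
    results[level] = (theo_abs, lumi_abs)
    print(f"{level}: theo ~ {theo_abs:.3f} pb, lumi ~ {lumi_abs:.3f} pb")
```

Under this assumption the absolute theoretical and luminosity uncertainties agree with the quoted values to within rounding.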
The measurements are compared to predictions from different $t\bar{t}$ simulations. Assuming the NNLO cross section of 252.9 pb [49] for the full phase space, the resulting POWHEG+PYTHIA6 cross section is 0.580 (1.67) pb at particle (parton) level. The ratio of the measured integrated $t\bar{t}$ cross section for the high-$p_T$ region to the value predicted by the POWHEG+PYTHIA6 simulation is $0.86 \pm 0.16$ ($0.86 \pm 0.19$) for the particle (parton) level. Thus, the measurements and predictions are consistent within the total uncertainty, which is dominated by the theoretical uncertainty in the cross section extraction. The integrated cross sections are also extracted from the MADGRAPH+PYTHIA6 and MC@NLO+HERWIG6 simulations, again assuming the NNLO cross section for the full phase space, and are 0.675 (1.85) pb and 0.499 (1.42) pb at the particle (parton) level, respectively. The prediction from the MC@NLO+HERWIG6 simulation agrees well with the measured values, while the MADGRAPH+PYTHIA6 simulation overestimates the cross sections at both particle and parton levels.
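The central values of the data-to-prediction ratios follow directly from the numbers quoted above. The short sketch below tabulates them for the three generator setups; only central values are computed, since the propagation of the uncertainties to the ratio is not detailed in the text.

```python
measured_pb = {"particle": 0.499, "parton": 1.44}
predicted_pb = {
    "POWHEG+PYTHIA6": {"particle": 0.580, "parton": 1.67},
    "MADGRAPH+PYTHIA6": {"particle": 0.675, "parton": 1.85},
    "MC@NLO+HERWIG6": {"particle": 0.499, "parton": 1.42},
}

ratios = {
    generator: {level: measured_pb[level] / value for level, value in levels.items()}
    for generator, levels in predicted_pb.items()
}

for generator, levels in ratios.items():
    print(f"{generator}: particle {levels['particle']:.2f}, parton {levels['parton']:.2f}")
```

The POWHEG+PYTHIA6 ratios reproduce the quoted central value of 0.86 at both levels, and the MADGRAPH+PYTHIA6 ratios lie further below unity, in line with the statement that this simulation overestimates the cross sections.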

C. Differential $t\bar{t}$ cross section measurements
The differential $t\bar{t}$ cross section is measured as a function of the $p_T$ and $y$ of the top quark that decays to a hadronic final state. The event sample from which the $p_T$ and $y$ distributions of the t jet candidates are extracted is defined by combining the signal-dominated 1t+0b and 1t+1b event categories. The observed number of $t\bar{t}$ events at detector level is first extracted from data by subtracting the SM background contributions using the normalizations from the maximum-likelihood fit (shown in Table I). As a cross-check, it is verified that a small $t\bar{t}$ contribution added to the maximum-likelihood fit from a beyond-the-SM process, such as a 1%-2% contribution from $Z' \to t\bar{t}$ (corresponding to a signal cross section already excluded in Ref. [13]), has a negligible impact on the extracted SM backgrounds. We also verify that a small potential modification of the top quark rapidity has a minimal impact on the background normalizations that is well within the quoted background normalization uncertainties.
An unfolding procedure translates the observed number of $t\bar{t}$ events in bins of reconstructed $p_T$ and $y$ of the t jet candidate to a cross section in bins of particle- and parton-level top quark $p_T^t$ and $y^t$. If more than one large-R jet fulfills the particle-level selection in Sec. VI B, which occurs for < 1% of events, the one with highest $p_T$ is chosen as the particle-level t jet. The unfolding accounts for all reconstruction and detector efficiencies, detector resolution effects, and migrations of $t\bar{t}$ events across bins. The unfolding is performed using response matrices, determined with simulated POWHEG+PYTHIA6 $t\bar{t}$ events, using the singular-value-decomposition (SVD) method [50] in the ROOUNFOLD package [51].
The background-subtracted data are unfolded in two steps, first from detector level to particle level, and in a second step from particle level to parton level. Response matrices are created between the p T and y of the reconstructed t jet candidate and the particle-level t jet, and between the particle-level t jet and the parton-level top quark. These response matrices are used to unfold the data and obtain the differential cross sections, after dividing by the bin width and correcting for the branching fraction of tt → e/μ + jets. The unfolding is performed multiple times, repeating the procedure for each systematic change that affects the p t T or y t distributions. The electron and muon channels are unfolded separately, and are then combined through the statistically weighted mean in each bin. Specifically, the combined cross section in a bin ($\sigma$) is given by $\sigma = \sum_i (\sigma_i / \delta\sigma_i^2) / \sum_i (1 / \delta\sigma_i^2)$, where $\sigma_i$ is the cross section in a bin for each channel ($i = e, \mu$) and $\delta\sigma_i$ is the corresponding uncertainty. The statistical uncertainty in the combined cross section ($\delta\sigma$) is given by $\delta\sigma = 1 / \left( \sum_i 1 / \delta\sigma_i^2 \right)^{1/2}$. The combination is repeated for each systematic variation, and the difference with respect to the combined nominal value is taken as the uncertainty for that source of systematic bias.
The uncertainty in the normalization of the background is extracted by rescaling the subtracted background by ±1 standard deviation, as derived from the maximum-likelihood fit in Sec. VI A, and taking the difference in the unfolded result relative to the nominal yield as the uncertainty at particle and parton level, respectively. Similarly, the t tagging efficiency uncertainty as measured at detector level is translated into an uncertainty in the differential measurement at particle and parton levels by unfolding, assuming systematically varied t tagging efficiencies. The uncertainties from the choice of event generator and parton shower algorithm are evaluated by unfolding the nominal POWHEG+PYTHIA6 simulated events using the response matrix from MC@NLO+HERWIG6. The differences between the unfolded simulation and the predictions at the particle and parton levels are taken as the uncertainties. At particle (parton) level, these are 1%-18% (2%-21%) and 3%-8% (2%-6%) for the p t T and y t measurements, respectively. The unfolded results at the particle and parton levels, including all experimental and theoretical uncertainties, are shown as a function of p t T and y t as the data points in Fig. 3, and the relative uncertainties are displayed in Fig. 4. As a consequence of bin migrations, the uncertainties at particle and parton level differ from the corresponding bin-by-bin uncertainties at detector level.
TABLE II. Differential tt cross section in bins of p T and y for the t jet at the particle level (top) and the top quark at parton level (bottom). The measurements are compared to predictions from the POWHEG+PYTHIA6, MADGRAPH+PYTHIA6, and MC@NLO+HERWIG6 simulations. The total relative uncertainty (Tot) in the measurements is separated into relative statistical (Stat), experimental (Exp), and theoretical (Th) components, all in percent.
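The statistically weighted mean used to combine the electron and muon channels, and the way a systematic shift is taken from a repeated combination, can be written out as a short sketch. The function names and the example numbers are illustrative assumptions, not values from the analysis; in particular, which per-channel uncertainties enter the weights of a varied combination is left to the caller here.

```python
import math


def combine_channels(values, errors):
    """Inverse-variance weighted mean of per-channel cross sections in one bin.

    Returns (combined value, combined statistical uncertainty), following
    sigma = sum(sigma_i / dsigma_i^2) / sum(1 / dsigma_i^2) and
    dsigma = 1 / sqrt(sum(1 / dsigma_i^2)).
    """
    weights = [1.0 / err**2 for err in errors]
    total_weight = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total_weight
    return mean, 1.0 / math.sqrt(total_weight)


def systematic_shift(nominal_values, nominal_errors, varied_values, varied_errors):
    """Difference between the varied and the nominal combined value in one bin."""
    nominal, _ = combine_channels(nominal_values, nominal_errors)
    varied, _ = combine_channels(varied_values, varied_errors)
    return varied - nominal


# Hypothetical electron and muon channel results for a single bin.
combined, stat_unc = combine_channels([10.0, 20.0], [1.0, 2.0])
print(combined, stat_unc)
print(systematic_shift([10.0, 20.0], [1.0, 2.0], [11.0, 21.0], [1.0, 2.0]))
```

The channel with the smaller uncertainty dominates the combination, and the combined statistical uncertainty is never larger than the smallest single-channel uncertainty.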
The measured tt cross sections are listed in bins of p t T and y t at the particle and parton levels in Table II. The measured cross sections are compared to the theoretical predictions from the POWHEG+PYTHIA6, MADGRAPH+PYTHIA6, and MC@NLO+HERWIG6 tt simulations, all normalized to the NNLO cross section [49]. Their values are also displayed in Fig. 3 and given in Table II. Also listed in Table II are the different relative uncertainties in the measurements, separated into the statistical uncertainty (Stat), the combined experimental uncertainty (Exp), the theoretical uncertainty (Th), and the total measurement uncertainty (Tot), all in percent. The measured cross sections are lower than the predictions from POWHEG+PYTHIA6 and MADGRAPH+PYTHIA6, in particular for the high-p t T region, while MC@NLO+HERWIG6 gives a better modeling of the data across the full p t T range. The differential cross sections are significantly overestimated for |y t | < 1.2 by MADGRAPH+PYTHIA6 as compared to the data. The predictions of the y t distributions by MC@NLO+HERWIG6 and POWHEG+PYTHIA6 agree with the data within the measurement uncertainties.
The differential tt cross section measurement in bins of parton-level top quark p T is compared to different theoretical cross section calculations in Fig. 5. Calculations of NNLO differential cross sections are extracted from Ref. [52] for three different PDF sets (NNPDF3.0 [53], CT14 [54], and MMHT2014 [55]). Approximate next-to-next-to-next-to-leading-order (aNNNLO) predictions corresponding to the results presented in Ref. [56] were provided by the author. The NNLO calculations are in good agreement with the measurement across the full top quark p T range studied. Predictions for different PDF sets cannot be distinguished given the current measurement uncertainty but are all observed to be consistent with the data. The aNNNLO calculation significantly overestimates the cross section, with an increasing disagreement with higher top quark p T . An additional check of the unfolding procedure is performed to confirm that the unfolding itself would support such a different p T spectrum. The POWHEG+PYTHIA6 simulation is unfolded using response matrices derived from the same sample, but reweighting the distribution at detector level by a factor that corresponds to that required to match the aNNNLO prediction at parton level. The scaled and then unfolded simulation reproduces the aNNNLO prediction within the measurement uncertainty.
FIG. 5. The measured cross section is compared to theoretical calculations at NNLO for three different PDF sets [52] and at aNNNLO [56]. The lower plot shows the ratio of these theoretical predictions to the data. The statistical uncertainties are represented by the inner vertical bars with ticks and the light bands in the ratios. The combined uncertainties are shown as full vertical bars and the dark solid bands in the ratios.

VII. SUMMARY
The first CMS measurement of the tt production cross section in the boosted regime has been presented. The integrated cross section, as well as differential cross sections as a function of the top quark p T and y, have been measured for p t T > 400 GeV. The measurements use lepton + jets events, identified through an electron or a muon, a b jet candidate from the semileptonic top quark decay, and a t jet candidate from the top quark decaying to a hadronic final state. Backgrounds are modeled using simulations for the distributions, or a data sideband for multijet production. Background normalizations are extracted jointly with the signal yield and the t tagging efficiency using a maximum-likelihood fit.
The integrated cross section measured for p t T > 400 GeV is σ tt = 0.499 ± 0.035 (stat + syst) ± 0.095 (theo) ± 0.013 (lumi) pb at particle level, and σ tt = 1.44 ± 0.10 (stat + syst) ± 0.29 (theo) ± 0.04 (lumi) pb at parton level, both corrected for the branching fraction of tt → e/μ + jets. The measurements are compared to the predicted cross section for this p T range from the POWHEG+PYTHIA6 tt simulation assuming σ tot = 252.9 pb, which provides a value of 0.580 pb at particle level and 1.67 pb at parton level. The measured cross section for this high-p T region is therefore about 14% lower than the POWHEG+PYTHIA6 prediction, but the two are consistent within the uncertainties.
Differential cross sections are also measured at both particle and parton levels. Background contributions are subtracted from the t-tagged jet distributions to obtain the distribution for signal. This is unfolded first to the particle level to correct for signal efficiency, acceptance, and bin migrations to yield the cross section in bins of t jet p T and y at particle level. The data are further unfolded to the parton level to extract the cross section in bins of top quark p T and y. The measurements are compared to predictions from different tt simulations. The POWHEG+PYTHIA6 and MADGRAPH+PYTHIA6 simulations are observed to overestimate the cross section, in particular at high p t T , while MC@NLO+HERWIG6 results in a good modeling of the p t T spectrum. The POWHEG+PYTHIA6 and MC@NLO+HERWIG6 simulations model the y t distributions well, while MADGRAPH+PYTHIA6 significantly overestimates the cross section for |y t | < 1.2. The results are compatible with those from the nonboosted CMS measurement [4] in the p T range where the two analyses overlap (400-500 GeV). The nonboosted measurement also observes an overestimate of the cross section for different MC generators in this p T range, most prominent for MADGRAPH+PYTHIA6, and an improved modeling of the p T spectrum using HERWIG6 for the parton showering. The measurement as a function of parton-level top quark p T is also compared to theoretical aNNNLO and NNLO calculations. While the aNNNLO prediction significantly overestimates the measurement, especially for high top quark p T , the NNLO calculations are in good agreement across the full p T range studied.
The analysis presented in this paper extends the differential tt cross section measurement into the p T > 1 TeV range. These measurements will help improve the modeling of event generators in this high-p T range, an important regime for many new physics searches.

ACKNOWLEDGMENTS
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies: