Measurements of $t\bar{t}$ differential cross-sections of highly boosted top quarks decaying to all-hadronic final states in $pp$ collisions at $\sqrt{s}=13\,$ TeV using the ATLAS detector

Measurements are made of differential cross-sections of highly boosted pair-produced top quarks as a function of top-quark and $t\bar{t}$ system kinematic observables using proton--proton collisions at a center-of-mass energy of $\sqrt{s} = 13$ TeV. The data set corresponds to an integrated luminosity of $36.1$ fb$^{-1}$, recorded in 2015 and 2016 with the ATLAS detector at the CERN Large Hadron Collider. Events with two large-radius jets in the final state, one with transverse momentum $p_{\rm T}>500$ GeV and a second with $p_{\rm T}>350$ GeV, are used for the measurement. The top-quark candidates are separated from the multijet background using jet substructure information and association with a $b$-tagged jet. The measured spectra are corrected for detector effects to a particle-level fiducial phase space and a parton-level limited phase space, and are compared to several Monte Carlo simulations by means of calculated $\chi^2$ values. The cross-section for $t\bar{t}$ production in the fiducial phase-space region is $292 \pm 7 \ \rm{(stat)} \pm 76 \rm{(syst)}$ fb, to be compared to the theoretical prediction of $384 \pm 36$ fb.


Introduction
The large top-quark pair-production cross-section at the Large Hadron Collider (LHC) allows detailed studies of the characteristics of the production of top-antitop (tt) quark pairs, providing an opportunity to further test the Standard Model (SM). Focusing on highly boosted final states probes the QCD tt production processes in the TeV scale range, a kinematic region where theoretical calculations based on the SM still present large uncertainties [1][2][3]. High-precision measurements, especially in kinematic regions that have not been explored extensively, are necessary to better constrain the models currently in use. Furthermore, effects beyond the SM can appear as modifications of tt differential distributions with respect to the SM predictions [4][5][6] that may not be detected with an inclusive cross-section measurement.
In the SM, the top quark decays almost exclusively to a W boson and a b-quark. The signature of a $t\bar{t}$ final state is therefore determined by the W boson decay modes. The ATLAS [7][8][9][10][11][12][13][14] and CMS [15][16][17][18][19] Collaborations have published measurements of the $t\bar{t}$ differential cross-sections at center-of-mass energies of $\sqrt{s} = 7$ TeV, $\sqrt{s} = 8$ TeV and $\sqrt{s} = 13$ TeV in pp collisions using final states containing leptons. The analysis presented here makes use of the all-hadronic $t\bar{t}$ decay mode, where only top-quark candidates with high transverse momentum ($p_{\rm T}$) are selected. This highly boosted topology is easier to reconstruct than other final-state configurations as the top-quark decay products are collimated into a large-radius jet by the Lorentz boost of the top quarks. This analysis is performed on events with the leading top-quark jet having $p_{\rm T}^{t,1} > 500$ GeV and the second-leading top-quark jet having $p_{\rm T}^{t,2} > 350$ GeV. These jets are reconstructed from calorimeter energy deposits and tagged as top-quark candidates to separate the $t\bar{t}$ final state from background sources. The event selection and background estimation follow the approach used in Ref. [20], but with updated tagging methods and data-driven multijet background estimates.
These measurements are based on data collected by the ATLAS detector in 2015 and 2016 from pp collisions at √ s = 13 TeV, corresponding to an integrated luminosity of 36.1 fb −1 . Measurements are made of the tt differential cross-sections by unfolding the detector-level distributions to a particle-level fiducial phase-space region. The goal of unfolding to a particle-level fiducial phase space and of using variables directly related to detector observables is to allow precision tests of QCD by avoiding model-dependent extrapolation of the measurements to a phase-space region outside the detector acceptance. Measurements of parton-level differential cross-sections are also presented, where the detector-level distributions are unfolded to the top quark at the parton-level in a limited phase-space region. These allow comparisons to the higher-order calculations that are currently restricted to stable top quarks [1][2][3].
These differential cross-sections are similar to those studied in dijet measurements at large jet transverse momentum [21,22] and are sensitive to effects of initial- and final-state radiation (ISR and FSR), to different parton distribution functions (PDFs) and to different schemes for matching matrix-element calculations to parton shower models.
Measurements are made of the differential cross-sections for the leading and second-leading top quarks as a function of $p_{\rm T}^{t,1}$ and $p_{\rm T}^{t,2}$, as well as the rapidities of the top quarks. The rapidities of the leading and second-leading top quarks in the laboratory frame are denoted by $y^{t,1}$ and $y^{t,2}$, respectively, while their rapidities in the $t\bar{t}$ center-of-mass frame are $y^{\star} = \frac{1}{2}\left(y^{t,1} - y^{t,2}\right)$ and $-y^{\star}$. These allow the construction of the variable $\chi^{t\bar{t}} = \exp\left(2|y^{\star}|\right)$, which is of particular interest as many processes not included in the Standard Model are predicted to peak at low values of $\chi^{t\bar{t}}$ [23]. The longitudinal motion of the $t\bar{t}$ system in the laboratory frame is described by the rapidity boost $y_{B}^{t\bar{t}} = \frac{1}{2}\left(y^{t,1} + y^{t,2}\right)$ and is sensitive to PDFs. Measurements are also made of the differential cross-sections as a function of the invariant mass, $p_{\rm T}$ and rapidity of the $t\bar{t}$ system; the absolute value of the azimuthal angle between the two top quarks, $\Delta\phi^{t\bar{t}}$; the absolute value of the out-of-plane momentum, $p_{\rm out}^{t\bar{t}}$ (i.e., the projection of the three-momentum of one of the top-quark jets onto the direction perpendicular to a plane defined by the other top quark and the beam axis (z) in the laboratory frame [22]); the cosine of the production angle in the Collins-Soper reference frame, $\cos\theta^{\star}$; and the scalar sum of the transverse momenta of the two top quarks, $H_{\rm T}^{t\bar{t}}$ [24,25]. Some of the variables (e.g. $\Delta\phi^{t\bar{t}}$ and $p_{\rm out}^{t\bar{t}}$) are more sensitive to additional radiation in the main scattering process, and thus are more sensitive to effects beyond leading order (LO) in the matrix elements. All of these variables are sensitive to the kinematics of the $t\bar{t}$ production process.
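As an illustration of how the rapidity-based observables and the out-of-plane momentum follow from the two top-quark candidates, a minimal Python sketch is given here. The function names and the (px, py, pz) tuple layout are assumptions made for this example and are not part of the analysis software.

```python
import math

def rapidity_observables(y_t1, y_t2):
    """Return (y_star, chi_tt, y_boost) from the laboratory-frame rapidities."""
    y_star = 0.5 * (y_t1 - y_t2)          # rapidity in the ttbar rest frame
    chi_tt = math.exp(2.0 * abs(y_star))  # chi = exp(2|y*|)
    y_boost = 0.5 * (y_t1 + y_t2)         # longitudinal boost of the system
    return y_star, chi_tt, y_boost

def abs_p_out(p1, p2):
    """|p_out|: projection of three-momentum p1 onto the unit normal of the
    plane spanned by p2 and the beam axis z. Momenta are (px, py, pz)."""
    # p2 x z = (p2y, -p2x, 0); its length is the transverse momentum of p2.
    nx, ny = p2[1], -p2[0]
    norm = math.hypot(nx, ny)
    return abs(p1[0] * nx + p1[1] * ny) / norm

def scalar_ht(pt_t1, pt_t2):
    """H_T of the ttbar system: scalar sum of the two transverse momenta."""
    return pt_t1 + pt_t2
```

For two candidates with rapidities 1.0 and -0.5, the sketch gives a center-of-mass rapidity of 0.75 and a boost of 0.25. Perfectly back-to-back candidates in the transverse plane give an out-of-plane momentum of zero, so non-zero values reflect additional radiation.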
The paper is organized as follows. Section 2 briefly describes the ATLAS detector, while Sec. 3 describes the data and simulation samples used in the measurements. The reconstruction of physics objects and the event selection are explained in Sec. 4 and the background estimates are discussed in Sec. 5. The procedures for unfolding to particle level and parton level are described in Sec. 6. The systematic uncertainties affecting the measurements are summarized in Sec. 7. The results of the measurements are presented in Sec. 8 and comparisons of these results with theoretical predictions are made in Sec. 9. A summary is presented in Sec. 10.

ATLAS detector
The ATLAS experiment [26] at the LHC uses a multi-purpose detector with a forward-backward symmetric cylindrical geometry and near 4π coverage in solid angle. It consists of an inner tracking detector surrounded by a superconducting solenoid magnet creating a 2 T axial magnetic field, electromagnetic and hadronic calorimeters, and a muon spectrometer.
The inner tracking detector covers the pseudorapidity range |η| < 2.5. Consisting of silicon pixel, silicon microstrip and transition radiation tracking detectors, the inner tracking detector allows highly efficient reconstruction of the trajectories of the charged particles produced in the pp interactions. An additional silicon pixel layer, the insertable B-layer, was added between 3 and 4 cm from the beam line to improve b-hadron tagging [27]. Lead/liquid-argon (LAr) sampling calorimeters provide electromagnetic (EM) energy measurements with high granularity and shower-depth segmentation. A hadronic (steel/scintillator-tile) calorimeter covers the central pseudorapidity range (|η| < 1.7). The endcap and forward regions are instrumented with LAr calorimeters for EM and hadronic energy measurements up to |η| = 4.9. The muon spectrometer is located outside of the calorimeter systems and is based on three large air-core toroid superconducting magnets with eight coils each. It includes a system of precision tracking chambers and detectors with sufficient timing resolution to enable triggering of events.
A two-level trigger system is used to select events [28]. The first-level hardware-based trigger uses a subset of the detector information to reduce the rate of accepted events to a design maximum of 100 kHz. This is followed by a software-based trigger with a maximum average accepted event rate of 1 kHz.

Data sets and Monte Carlo event generation
The data used for this analysis were recorded with the ATLAS detector at a center-of-mass energy of 13 TeV in 2015 and 2016 and correspond to an integrated luminosity of 36.1 fb −1 . Only the data-taking periods in which all the subdetectors were operational are considered.
The events for this analysis were collected using an inclusive anti-k t jet trigger with radius parameter R = 1.0 and nominal p T thresholds of 360 GeV and 420 GeV for the 2015 and 2016 data-taking periods, respectively. These triggers were fully efficient for jets with p T > 480 GeV [28].
The signal and several background processes are modeled using Monte Carlo (MC) event generators. Multiple overlaid proton-proton collisions (pileup) are simulated with the soft QCD processes of Pythia 8.186 [29] using a set of tuned parameters called the A2 tune [30] and the MSTW2008LO [31] PDF set. The detector response is simulated using the Geant4 framework [32,33]. The data and MC events are reconstructed with the same software algorithms.
Several next-to-leading-order (NLO) MC calculations of the tt process are used in the analysis, and to compare with the measured differential cross-sections. The Powheg-Box v2 [34], MadGraph5_aMC@NLO [35] and Sherpa [36] Monte Carlo event generators encode different approaches to the matrix element calculation and different matching schemes between the NLO QCD matrix-element calculation and the parton shower algorithm. A more detailed explanation of the differences among these event generators can be found in Ref. [37].
The nominal sample uses the Powheg-Box v2 [34] event generator employing the NNPDF30 PDF set interfaced with the Pythia8 parton shower and hadronization model (hereafter also referred to as PWG+PY8). The Powheg h damp parameter, which controls the p T of the first additional emission beyond the Born configuration, is set to 1.5 times the top-quark mass [38]. The main effect of this is to regulate the high-p T emission against which the tt system recoils. To enhance the production of top quarks in the high-p T region, the Powheg parameter bornsuppfact is set to p T,supp = 500 GeV [34,39]. The Pythia8 parameters are chosen for good agreement with ATLAS Run-1 data by employing the A14 tune [40] with the NNPDF23LO PDF set [41].
Two alternative Powheg+Pythia8 samples with systematic variations of the Powheg and Pythia8 parameters probe the effects of the experimental tuning of the MC event generators. One sample, which primarily increases the amount of initial- and final-state radiation, uses h damp = 3m top , the factorization and renormalization scale reduced by a factor of 2 and the A14 Var3c Up tune variation [40]. The second sample, which decreases the amount of initial- and final-state radiation, uses h damp = 1.5m top , the factorization and renormalization scale increased by a factor of 2 and the A14 Var3c Down tune variation [40].
An alternative matrix element calculation and matching with the parton shower is realized with the MadGraph5_aMC@NLO event generator (hereafter referred to as MG5_aMC@NLO) [35] interfaced with the Pythia8 parton shower and hadronization model using the same tune as the nominal sample. This sample requires the leading top quark in each event to have $p_{\rm T} > 300$ GeV to ensure that the high-$p_{\rm T}$ region is adequately populated. The effect of using alternative parton shower and hadronization models is probed by interfacing the nominal Powheg setup with the Herwig7 parton shower and hadronization model [42] employing the H7UE tune (hereafter also referred to as PWG+H7). Another calculation using the Sherpa v2.2.1 event generator [36] with the default Sherpa parton shower and hadronization model merges the NLO $t\bar{t}$ matrix element with matrix element calculations including up to four additional jets using the MEPS@NLO setup [43].
The Wt single-top-quark processes are modeled using the Powheg-Box v2 event generator with the CT10 PDF set [44]. For the single-top-quark process, the top quarks are decayed using MadSpin [45]. The parton shower, fragmentation and the underlying event for these processes are simulated using the Pythia 6.428 event generator [46] with the CTEQ6L1 PDF sets and the corresponding Perugia 2012 tune (P2012) [47]. Electroweak t-and s-channel single-top-quark events are not explicitly modeled because of the small cross-section of these processes and the low jet multiplicity in the final state. Their contribution is accounted for in the data-driven background estimate.
The associated production of tt pairs with W, Z and Higgs bosons is modeled using the MG5_aMC@NLO event generator [35] coupled to the Pythia8 parton shower and hadronization model using the same PDF sets and tunes as the tt sample.
The top-quark mass is set to $m_{\rm top} = 172.5$ GeV for all samples and the renormalization and factorization scales are set to $\mu_{R/F} = \sqrt{m_{\rm top}^{2} + \frac{1}{2}\left(p_{\rm T}(t)^{2} + p_{\rm T}(\bar{t})^{2}\right)}$ for all $t\bar{t}$ samples except where explicitly noted above. The EvtGen v1.2.0 program [48] is used for modeling the properties of the bottom and charm hadron decays for all event generator setups other than for the Sherpa sample.
The $t\bar{t}$ samples are normalized using the next-to-next-to-leading-order cross-section plus next-to-next-to-leading-logarithm corrections (NNLO+NNLL) $\sigma_{t\bar{t}} = 832^{+46}_{-51}$ pb [49], where the uncertainties reflect the effect of scale and PDF variations. The single-top-quark cross-section is normalized to the NLO predictions [50]. The cross-sections for the associated production of $t\bar{t}$ pairs with W, Z and Higgs bosons are normalized to 0.603 pb, 0.586 pb and 0.231 pb, respectively, as predicted by the MG5_aMC@NLO event generator.

Event reconstruction and selection
This analysis makes use of jets, electrons and muons as well as event-based measures formed from their combinations. The event reconstruction and selection are summarized in the following subsections.

Event reconstruction
Electron candidates are identified from high-quality inner detector tracks matched to calorimeter deposits consistent with an electromagnetic shower. The calorimeter deposits have to form a cluster with $E_{\rm T} > 25$ GeV, |η| < 2.47 and be outside the transition region 1.37 ≤ |η| ≤ 1.52 between the barrel and endcap calorimeters. A likelihood-based requirement is used to suppress misidentified jets (hereafter referred to as fakes), and calorimeter- and track-based isolation requirements are imposed [51,52]. Overall, these criteria result in electron identification efficiencies of ∼90% for electrons with $p_{\rm T} > 25$ GeV and 96% for electrons with $p_{\rm T} > 60$ GeV.
Muon candidates are reconstructed using high-quality inner detector tracks combined with tracks reconstructed in the muon spectrometer. Only muon candidates with $p_{\rm T} > 25$ GeV and |η| < 2.5 are considered. Isolation criteria similar to those used for electrons are used [53]. To reduce the impact of non-prompt leptons, muons within $\Delta R = \sqrt{(\Delta\eta)^{2} + (\Delta\phi)^{2}} = 0.4$ of a jet are removed.
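The angular distance used for the muon-jet overlap removal can be sketched as follows. This is a minimal illustration, assuming that objects are represented as dictionaries with "eta" and "phi" keys; that representation and the function names are hypothetical.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2), with d_phi wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def remove_muons_near_jets(muons, jets, max_dr=0.4):
    """Keep only muons that are not within max_dr of any jet."""
    return [
        mu for mu in muons
        if all(delta_r(mu["eta"], mu["phi"], j["eta"], j["phi"]) >= max_dr
               for j in jets)
    ]
```

The wrapping of the azimuthal difference matters for objects on either side of φ = ±π, which would otherwise appear to be almost 2π apart.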
The anti-k t algorithm implemented in the FastJet package [54, 55] is used to define two types of jets for this analysis: small-R jets with a radius parameter of R = 0.4 and large-R jets with R = 1.0. These are reconstructed independently of each other from topological clusters in the calorimeter. The clusters used as input to the large-R jet reconstruction are calibrated using the local calibration method described in Ref. [56]. The small-R jet energy scale is obtained by using an energy-and η-dependent calibration scheme resulting from simulation and in situ corrections based on data [57-60]. Only small-R jets that have |η| < 2.5 and p T > 25 GeV are considered. To reduce pileup effects, an algorithm that determines whether the primary vertex is the origin of the charged-particle tracks associated with a jet candidate is used to reject jets coming from other interactions [61]. This is done only for jet candidates with p T < 50 GeV and |η| < 2.4. The small-R jet closest to an electron candidate is removed if they are separated by no more than ∆R = 0.2. Small-R jets containing b-hadrons are identified (b-tagged) using a multivariate discriminant that combines information about secondary vertices and impact parameters. The small-R jets are considered b-tagged if the value of the discriminant is larger than a threshold that provides 70% efficiency. The corresponding rejection factors for gluon/light-quark jets and charm-quark jets are approximately 125 and 4.5, respectively [62,63].
The large-R jet energy scale is derived by using energy-and η-dependent calibration factors derived from simulation and in situ measurements [57,58,64]. The large-R jet candidates are required to have |η| < 2.0 and p T > 300 GeV. A trimming algorithm [65] with parameters R sub = 0.2 and f cut = 0.05 is applied to suppress gluon radiation and further mitigate pileup effects. A top-tagging algorithm [66] is applied that consists of p T -dependent requirements on two variables: the jet mass m J , measured from clusters in the calorimeter, and the N-subjettiness ratio τ 32 [67,68]. The N-subjettiness variable τ N expresses how well a jet can be described as containing N or fewer subjets. The ratio τ 32 = τ 3 /τ 2 allows discrimination between jets containing a three-prong structure and jets containing a two-prong structure. The p T -dependent requirements provide a 50% top-quark tagging efficiency independent of p T , with a light-quark and gluon jet rejection factor of ∼ 17 at p T = 500 GeV and decreasing with increasing p T to ∼ 10 at p T = 1 TeV. This combination of variables used with trimmed large-R jets provides the necessary rejection for this analysis, and is insensitive to the effects of pileup.

Event selection
The event selection identifies fully hadronic tt events where both top quarks have high p T . Each event is required to have a primary vertex with five or more associated tracks with p T > 0.4 GeV. In order to reject top-quark events where a top quark has decayed semileptonically, the events are required to contain no reconstructed electron or muon candidate. To identify the fully hadronic decay topology, events must have at least two large-R jets with p T > 350 GeV, |η| < 2.0 and |m J − m top | < 50 GeV, where the top-quark mass m top is set to 172.5 GeV. The leading jet is required to have p T > 500 GeV and the event must contain at least two small-R jets with p T > 25 GeV and |η| < 2.5. This preselection results in an event sample of 22.7 million events.
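The preselection requirements listed above can be written as a single predicate. The sketch below is illustrative only: the dictionary-based jet representation, the argument names and the choice to apply the 500 GeV requirement to the hardest jet among those passing the mass window are assumptions of this example.

```python
M_TOP = 172.5  # GeV, top-quark mass used in the mass-window requirement

def passes_preselection(n_vertex_tracks, leptons, large_r_jets, small_r_jets):
    """Return True if the event satisfies the all-hadronic preselection.

    n_vertex_tracks: tracks with pT > 0.4 GeV associated with the primary vertex
    leptons: reconstructed electron and muon candidates (any non-empty list vetoes)
    large_r_jets, small_r_jets: lists of dicts with "pt" (GeV), "eta", "mass" (GeV)
    """
    if n_vertex_tracks < 5:
        return False
    if leptons:
        return False
    top_like = [
        j for j in large_r_jets
        if j["pt"] > 350.0 and abs(j["eta"]) < 2.0
        and abs(j["mass"] - M_TOP) < 50.0
    ]
    if len(top_like) < 2:
        return False
    if max(j["pt"] for j in top_like) <= 500.0:
        return False
    n_small = sum(1 for j in small_r_jets
                  if j["pt"] > 25.0 and abs(j["eta"]) < 2.5)
    return n_small >= 2
```

The top-tagging and b-matching requirements that define the signal region are applied after this step and are not part of the sketch.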
To reject multijet background events, the two highest p T large-R jets must satisfy the top-tagging criteria described in Sec. 4.1. Furthermore, both top-tagged large-R jets are required to have an associated small-

Background estimation
There are two categories of background sources: those involving one or more top quarks in the final state and those sources where no top quark is involved. The background processes involving top quarks are estimated using MC calculations. The largest background source is events where the two leading jets both arise from gluons or u, d, s, c, or b quarks (which are referred to as "multijet" events). Monte Carlo predictions of multijet events have large uncertainties coming from the relatively poorly understood higher-order contributions that produce a pair of massive jets [69,70]. To avoid these large uncertainties the multijet background is determined using a data-driven technique. A similar method was used in previous work [20].
A Powheg+Pythia8 $t\bar{t}$ sample is used to estimate the number of $t\bar{t}$ events in the sample that arise from at least one top quark decaying semileptonically. This includes contributions from decays resulting in τ leptons, as no attempt is made to identify τ lepton candidates and reject them. The rate is estimated to be only ∼4% in the signal region, primarily due to the top-tagging requirements. However, this category of $t\bar{t}$ events contributes to control and validation regions where the top-tagging and/or b-tagging requirements are relaxed. Thus, this MC prediction is used to estimate this contamination. Single-top-quark production in the Wt-channel makes a small contribution to the signal sample, which is estimated using the MC predictions described earlier. The t-channel single-top-quark process is not included, but is partially accounted for in the multijet background estimate.
The data-driven multijet background estimate is performed using a set of control regions. Sixteen separate regions are defined by classifying each event in the preselection sample according to whether the leading and second-leading jets are top-tagged or b-tagged. Table 1 shows the 16 regions that are defined in this way, and illustrates the proportion of expected tt events in each region relative to the observed rate. Region S is the signal region, while the regions with no b-tags (A, C, E and F) and the regions with one b-tag and no top-tags (B and I) are dominated by multijet backgrounds.
After subtracting the estimated contributions of the $t\bar{t}$ signal and of the other background sources to each of the control regions, the number of events in region J divided by the number of events in region A gives the ratio on which a simple estimate of the multijet background in the signal region is based. This "ABCD" estimate assumes that the mistagging rate of the leading jet does not depend on how the second-leading jet is tagged. This assumption is avoided by measuring the correlations in background-dominated regions, e.g. comparing the ratio of the numbers of events in regions F and E (giving the leading jet top-tagging rate when the second-leading jet is top-tagged) with the ratio of the numbers of events in regions C and A (giving the leading jet top-tagging rate when the second-leading jet is not top-tagged). This results in a refined data-driven estimate of the size of the multijet background, given by Eq. (1), where each region name denotes the number of observed events in that region. The measured correlations in the tagging of background jets result in an increase of (12 ± 3)% in the background estimate compared with the estimate assuming that the tagging rates are independent.
Table 1: Region labels and expected proportion of $t\bar{t}$ events used for the data-driven background prediction of multijet events. A top-quark tagged jet is defined by the tagging algorithm described in the text, and denoted "1t" in the table, while a jet that is not top-tagged is labeled "0t". A b-match is defined as ∆R(J, b) < 1.0, where J represents a large-R jet and "b" represents a b-tagged jet. The labels "1b" and "0b" represent large-R jets that either have or do not have a b-match. Regions K, L, N and M have an expected contribution from sources involving one or more top quarks of at least 15% of the observed yield. In other regions, the expected contribution from signal and backgrounds involving top quarks is less than 15% of the observed event rate.
This estimate is also valid when a variable characterizing the kinematics of the events in all the regions is further restricted to range between specific values. This provides a bin-by-bin data-driven background estimate with uncertainties that come from the number of events in the regions used in Eq. (1).
Regions L and N are estimated to consist of approximately equal numbers of tt signal events and multijet background events. They are used as validation regions to verify that the signal and background estimates are robust. In these cases, the multijet background is estimated using different combinations of control regions, namely N = H × D/B and L = H × G/I.
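The validation-region estimates N = H × D/B and L = H × G/I are simple products and ratios of control-region yields. The sketch below shows how such an estimate and a rough statistical uncertainty could be computed. The yields used in the usage example are invented for illustration, and the uncertainty assumes independent Poisson counts, which neglects the effect of subtracting the top-quark contributions.

```python
import math

def region_estimate(n_scale, n_num, n_den):
    """Estimate n_scale * n_num / n_den with a Poisson-only uncertainty.

    All inputs are multijet yields, i.e. observed counts after subtracting
    the expected contributions from processes involving top quarks.
    """
    estimate = n_scale * n_num / n_den
    rel_unc = math.sqrt(1.0 / n_scale + 1.0 / n_num + 1.0 / n_den)
    return estimate, estimate * rel_unc

# Hypothetical yields, for illustration only (not taken from the analysis).
H, D, B, G, I = 100.0, 200.0, 400.0, 300.0, 600.0
n_region_N = region_estimate(H, D, B)  # N = H x D / B
n_region_L = region_estimate(H, G, I)  # L = H x G / I
```

Applying the same function in each bin of a kinematic variable gives a bin-by-bin estimate, provided every region is restricted to the same bin.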
The number of multijet events in the signal region is calculated by applying Eq. (1) to the number of events in the control regions. This results in an estimate of 810 ± 50 multijet events in the signal region, where the uncertainty takes into account the statistical uncertainties as well as the systematic uncertainties in the tt signal subtraction.
There is good agreement in the validation regions between the predicted and observed event yields, as well as in the shape of distributions that are sensitive to the proportion of tt signal and multijet background. This is illustrated in Fig. 1, which compares the large-R jet mass distributions and the highest-p T subjet mass distribution of the leading jets. A shift between the measured and predicted jet mass distributions, shown in Figs. 1(a) and 1(b), is consistent with the uncertainties arising from the calibration for large-R jets [71]. The distributions for the leading and second-leading jet p T and rapidity in regions N and L are shown in Fig. 2, and can be compared with the signal region distributions in Fig. 3.
The level of agreement between the observed and predicted distributions in the signal region can be seen in Fig. 3, which shows the distributions of the leading top-quark p T and absolute value of rapidity, as well as the same distributions for the second-leading jet.
The event yields are summarized in Table 2 for the simulated signal, the background sources and the data sample.

Unfolding procedure
The differential cross-sections are obtained from the data using an unfolding technique that corrects for detector effects such as efficiency, acceptance and resolution. This correction is made to the particle level using a fiducial phase space that is defined to match the experimental acceptance and hence avoid large MC extrapolations. The parton-level differential cross-sections are obtained using a similar procedure, but in this case the correction is made to the top-quark parton after final-state radiation effects have been included in the generation process, using a limited phase-space region matched to the kinematic acceptance of the analysis.
In the following subsections, the particle-level fiducial phase space and the parton-level phase space are defined and the algorithm used for the unfolding is described.

Particle-level fiducial phase-space and parton-level phase-space regions
The particle-level fiducial phase-space definition models the kinematic requirements used to select the tt process.
In the MC signal sample, electrons and muons that do not originate from hadron decays are combined or "dressed" with any photons found in a cone of size ∆R = 0.1 around the lepton direction. The four-momentum of each photon in the cone is added to the four-momentum of the lepton to produce the dressed lepton.
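The dressing step amounts to a four-vector sum over photons inside the cone. A minimal sketch follows, assuming four-momenta stored as (E, px, py, pz) tuples in GeV and a cone defined around the undressed lepton direction; both choices are assumptions of this example.

```python
import math

def eta_phi(p4):
    """Pseudorapidity and azimuth of a four-momentum (E, px, py, pz)."""
    _, px, py, pz = p4
    pt = math.hypot(px, py)
    return math.asinh(pz / pt), math.atan2(py, px)

def dress_lepton(lepton, photons, cone=0.1):
    """Add the four-momenta of all photons within `cone` of the lepton."""
    lep_eta, lep_phi = eta_phi(lepton)
    dressed = list(lepton)
    for photon in photons:
        eta, phi = eta_phi(photon)
        dphi = (phi - lep_phi + math.pi) % (2.0 * math.pi) - math.pi
        if math.hypot(eta - lep_eta, dphi) < cone:
            dressed = [a + b for a, b in zip(dressed, photon)]
    return tuple(dressed)
```

A photon that is nearly collinear with the lepton is absorbed, while a photon at wide angle is left to be clustered into the particle-level jets.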
Jets are clustered using all stable particles except those used in the definition of dressed electrons and muons and neutrinos not from hadron decays, using the anti-k t algorithm with a radius parameter R = 0.4 and R = 1.0 for small-R and large-R jets, respectively. The decay products of hadronically decaying τ leptons are included. These jets do not include particles from pileup events but do include those from the underlying event. Large-R jets are required to have p T > 350 GeV and a mass within 50 GeV of the top-quark mass.
The following requirements on particle-level electrons, muons and jets in the all-hadronic $t\bar{t}$ MC events define the particle-level fiducial phase space:
• no dressed electrons or muons with $p_{\rm T} > 25$ GeV and |η| < 2.5 be in the event,
• at least two anti-$k_t$ R = 1.0 jets with $p_{\rm T} > 350$ GeV and |η| < 2.0,
• at least one anti-$k_t$ R = 1.0 jet with $p_{\rm T} > 500$ GeV and |η| < 2.0,
• the masses of the two large-R jets be within 50 GeV of the top-quark mass of 172.5 GeV,
• at least two anti-$k_t$ R = 0.4 jets with $p_{\rm T} > 25$ GeV and |η| < 2.5 and
• the two leading R = 1.0 jets be matched to a b-hadron in the final state using a ghost-matching technique as described in Ref.
The parton-level phase space is defined by requiring that the leading top quark have p T > 500 GeV and the second-leading top quark have p T > 350 GeV. No rapidity or other kinematic requirements are made. This definition avoids a large extrapolation in the unfolding procedure that results in large systematic uncertainties.

Unfolding algorithm
The iterative Bayesian method [73] as implemented in RooUnfold [74] is used to correct the detector-level event distributions to their corresponding particle- and parton-level differential cross-sections. The unfolding starts from the detector-level event distributions after subtraction of the estimated backgrounds.
An acceptance correction f acc is applied that accounts for events that are generated outside the fiducial or parton phase space but pass the detector-level selection.
In order to properly account for resolution and any combinatorial effects, the detector-level and particle-level (parton-level) objects in MC events are required to be well-matched using the angular difference ∆R. At particle (parton) level, each top-quark particle-level jet (top quark) is matched to the closest detector-level jet within ∆R < 1.0, a requirement that ensures high matching efficiency. The resulting acceptance corrections $f_{\rm acc}^{j}$ are illustrated in Fig. 4. The unfolding step uses a migration matrix ($M$) derived from simulated $t\bar{t}$ events with matching detector-level jets by binning these events in the particle-level and parton-level phase spaces. The probability for particle-level (parton-level) events to remain in the same bin is therefore represented by the elements on the diagonal, and the off-diagonal elements describe the fraction of particle-level (parton-level) events that migrate into other bins. Therefore, the elements of each row add up to unity (within rounding) as shown in Fig. 5. The efficiency corrections $\epsilon_{\rm eff}^{i}$ correct for events that are in the fiducial particle-level (parton-level) phase space but are not reconstructed at the detector level, and are illustrated in Fig. 4. The overall efficiency is largely determined by the working points of the b-tagging (70%) and top-tagging (50%) algorithms. The reduction in efficiency at higher top-quark candidate $p_{\rm T}$ arises primarily from the b-tagging requirements. Examples of the migration matrices for several variables are shown in Fig. 5.
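To make the role of the migration matrix concrete, the following is a minimal pure-Python sketch of an iterative Bayesian update of the kind used in this class of unfolding methods. It is not the RooUnfold implementation: the function name, the flat prior in the usage example and the default number of iterations are assumptions, and the matrix rows are taken to sum to unity as described above, so that efficiency is handled separately.

```python
def iterative_bayes_unfold(reco, migration, prior, n_iter=4):
    """Unfold detector-level yields with repeated Bayesian updates.

    reco:      yields per detector-level bin j (after background subtraction)
    migration: migration[i][j] = P(detector bin j | truth bin i), rows sum to 1
    prior:     starting guess for the truth-level yields per bin i
    n_iter:    number of iterations (an assumed value, acts as regularization)
    """
    n_truth, n_reco = len(migration), len(reco)
    truth = [float(x) for x in prior]
    for _ in range(n_iter):
        # Detector-level yields expected from the current truth estimate.
        folded = [sum(truth[i] * migration[i][j] for i in range(n_truth))
                  for j in range(n_reco)]
        updated = []
        for i in range(n_truth):
            total = 0.0
            for j in range(n_reco):
                if folded[j] > 0.0:
                    # Bayes' theorem: P(truth i | detector j) times observed yield.
                    total += truth[i] * migration[i][j] / folded[j] * reco[j]
            updated.append(total)
        truth = updated
    return truth
```

With a diagonal migration matrix the input is returned unchanged, and in general the total yield is preserved because each detector-level bin is redistributed with probabilities that sum to one.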
The unfolding procedure for an observable X at both particle and parton level is summarized by the expression
$$\frac{d\sigma^{\rm fid}}{dX^{i}} \equiv \frac{1}{\int \mathcal{L}\,dt \cdot \Delta X^{i}} \cdot \frac{1}{\epsilon_{\rm eff}^{i}} \cdot \sum_{j} M^{-1}_{ij} \cdot f_{\rm acc}^{j} \cdot \left(N_{\rm reco}^{j} - N_{\rm bg}^{j}\right),$$
where $N_{\rm reco}$ and $N_{\rm bg}$ refer to the number of reconstructed signal and background events, respectively; the index j runs over bins of X at detector level while the index i labels bins at particle and parton level; $\Delta X^{i}$ is the bin width while $\int \mathcal{L}\,dt$ is the integrated luminosity. The Bayesian unfolding is symbolized by $M^{-1}_{ij}$. The inclusive cross-section for $t\bar{t}$ pairs in the fiducial (parton) phase space, obtained by integrating the absolute differential cross-section, is used to determine the normalized differential cross-section $1/\sigma^{\rm fid} \cdot d\sigma^{\rm fid}/dX^{i}$. This cross-section is not corrected for the all-hadronic $t\bar{t}$ branching fraction of 0.457 [75].
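The sequence of corrections defined above (background subtraction, acceptance correction, unfolding, then efficiency, bin-width and luminosity normalization) can be sketched as follows. The function names are hypothetical, and the unfolding step is passed in as a callable so that any implementation can be substituted; the identity function in the usage example stands in for a real unfolding and is for illustration only.

```python
def differential_cross_section(n_reco, n_bg, f_acc, unfold, eff, widths, lumi):
    """Absolute differential cross-section per truth-level bin.

    n_reco, n_bg, f_acc: per detector-level bin j
    unfold: callable mapping corrected detector-level yields to truth-level yields
    eff, widths: per truth-level bin i
    lumi: integrated luminosity (e.g. in fb^-1, giving results in fb per unit of X)
    """
    corrected = [f * (n - b) for n, b, f in zip(n_reco, n_bg, f_acc)]
    unfolded = unfold(corrected)
    return [u / (e * w * lumi) for u, e, w in zip(unfolded, eff, widths)]

def normalize(dsigma, widths):
    """Return (normalized differential cross-section, inclusive cross-section)."""
    total = sum(d * w for d, w in zip(dsigma, widths))
    return [d / total for d in dsigma], total

# Hypothetical inputs; the identity "unfolding" is a placeholder.
dsig = differential_cross_section(
    n_reco=[110.0, 60.0], n_bg=[10.0, 10.0], f_acc=[0.5, 0.5],
    unfold=lambda x: list(x), eff=[0.1, 0.1], widths=[100.0, 200.0], lumi=10.0)
norm, sigma_fid = normalize(dsig, [100.0, 200.0])
```

By construction, the normalized spectrum integrates to one over the measured range, so normalization uncertainties such as the luminosity cancel in it.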
Tests are performed at both particle and parton level to verify that the unfolding procedure is able to recover the generator-level distributions for input distributions that vary from the observed distributions or nominal predictions. These closure tests show that the unfolding procedure results are unbiased so long as the features of the input distributions are consistent with the measurement resolution of the variable.
Figure 5: Migration matrices for $p_{\rm T}$ and |y| of the leading top-quark jet in the particle-level fiducial phase space in (a) and (b) and parton-level phase space in (c) and (d). Each row is normalized to 100. The Powheg+Pythia8 event generator is used as the nominal prediction.

Systematic uncertainties
Systematic uncertainties resulting from electron, muon and jet reconstruction and calibration, MC event generator modeling and background estimation are described below. The propagation of systematic uncertainties through the unfolding procedure is described in Sec. 7.2.

Estimation of systematic uncertainties
The systematic uncertainties in the measured distributions are estimated using MC data sets and the data satisfying the final selection requirements.
Estimates of large-R jet uncertainties [71] are derived by studying tracking and calorimeter-based measurements and comparing these in data and MC simulations. These uncertainties also include the energy, mass and substructure response. The uncertainty in the large-R jet mass resolution is incorporated by measuring the effect that an additional resolution degradation of 20% has on the observables [64,76]. The total uncertainty affecting the cross-section arising from jet calibration and reconstruction ranges from 11% to 30% for jet $p_{\rm T}$ over the range 350 to 900 GeV.
The small-R jet energy scale uncertainty is derived using a combination of simulations, test-beam data and in situ measurements [57-59, 77]. Additional uncertainty contributions from the jet flavor composition, calorimeter response to different jet flavors and pileup are taken into account. Uncertainties in the jet energy resolution are obtained with an in situ measurement of the jet response asymmetry in dijet events [78]. These small-R jet uncertainties are typically below 1% for all distributions.
Uncertainties associated with pileup, the effect of additional interactions and the selection requirements used to mitigate them are estimated using comparisons of data and MC samples and are approximately 1%. The efficiency to tag jets containing $b$-hadrons is corrected in simulated events by applying $b$-tagging scale factors, extracted in $t\bar{t}$ and dijet samples, in order to account for the residual difference between data and simulation. The associated systematic uncertainties, computed by varying the $b$-tagging scale factors within their uncertainties [62,63], are found to range from ±8% to ±17% for large-R jet $p_{\rm T}$ increasing from 500 to 900 GeV. The uncertainties arising from lepton energy scale and resolution [52,53,79] are < 1%.
Systematic uncertainties affecting the multijet background estimates come from the subtraction of other background processes in the control regions and from the uncertainties in the measured tagging correlations (which are statistical in nature). The uncertainty in the subtraction of the all-hadronic $t\bar{t}$ events in the control regions arises from the uncertainties in the $t\bar{t}$ cross-section and $b$-matching algorithm. Together, these result in background uncertainties ranging from ±2% at a large-R jet $p_{\rm T}$ of 350 GeV to ±5% at 900 GeV. The uncertainty in the single-top-quark background rates comes from the uncertainties in the Wt production cross-section, the integrated luminosity, detection efficiency and the relative contribution of t-channel and Wt production, which is assigned an uncertainty of ±50%.
Other MC event generators are employed to assess modeling systematic uncertainties. In these cases, the difference between the unfolded distribution of an alternative model and its own particle-level or parton-level distribution is used as the estimate of the corresponding systematic uncertainty in the unfolded differential cross-section.
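The estimate described above is a per-bin non-closure. A minimal sketch, assuming the difference is expressed relative to the alternative model's own generator-level spectrum and applied symmetrically (the function name and the numbers in the usage note are illustrative only):

```python
import numpy as np

def modeling_uncertainty(unfolded_alt, truth_alt):
    """Relative non-closure per bin for an alternative model.

    unfolded_alt: alternative-model events unfolded with the nominal corrections.
    truth_alt:    the same model's own particle-level or parton-level spectrum.
    """
    unfolded_alt = np.asarray(unfolded_alt, dtype=float)
    truth_alt = np.asarray(truth_alt, dtype=float)
    return np.abs(unfolded_alt - truth_alt) / truth_alt
```

For instance, an unfolded spectrum of `[110, 45]` against a generator-level spectrum of `[100, 50]` gives a symmetric 10% uncertainty in both bins.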
To assess the uncertainty related to the matrix element calculation and matching to the parton shower, MG5_aMC@NLO+Pythia8 events are unfolded using the migration matrix and correction factors derived from the Powheg+Pythia8 sample. This uncertainty is found to be in the range ±10-15%, depending on the variable, increasing to ±20-30% at large $p_{\rm T}^{t}$, $m^{t\bar{t}}$, $p_{\rm T}^{t\bar{t}}$ and $y^{t\bar{t}}$ where there are fewer data events. To assess the uncertainty associated with the choice of parton shower and hadronization model, a comparison is made of the unfolded and particle-level distributions of simulated events created with Powheg interfaced to the Herwig7 parton shower and hadronization model using the nominal Powheg+Pythia8 corrections and unfolding matrices. The resulting systematic uncertainties, taken as the symmetrized difference, are found to be ±5-15%. The uncertainty related to the modeling of initial- and final-state radiation is determined using two alternative Powheg+Pythia8 $t\bar{t}$ MC samples described in Sec. 3. This uncertainty is found to be in the range ±10-15%, depending on the variable considered. The uncertainty arising from the size of the nominal MC sample is approximately 1%, scaling with the statistical uncertainty of the data as a function of the measured variables.
The uncertainty arising from parton distribution functions is assessed using the Powheg+Pythia8 $t\bar{t}$ sample. An envelope of spectra is determined by reweighting the central prediction of the PDF4LHC PDF set [80] and applying the relative variation to the nominal distributions. This uncertainty is found to be less than 1%.
The uncertainty in the integrated luminosity is ±2.1%. It is derived, following a methodology similar to that detailed in Ref. [81], from a calibration of the luminosity scale using x-y beam-separation scans performed in August 2015 and May 2016.
Other sources of systematic uncertainty (e.g., the top-quark mass) are less than 1%.

Propagation of systematic uncertainties and treatment of correlations
The statistical and systematic uncertainties are propagated and combined in the same way for both the particle-level and parton-level results using pseudoexperiments created from the nominal and alternative MC samples.
The effect of the data statistical uncertainty is incorporated by creating pseudoexperiments in which independent Poisson fluctuations in each data bin are made. The statistical uncertainty due to the size of the signal MC samples used to correct the data is incorporated into the pseudoexperiments by adding independent Poisson fluctuations to each bin, based on the MC population of that bin.
To evaluate the impact of each uncertainty after the unfolding, the simulated distribution is varied, then unfolded using corrections obtained with the nominal Powheg+Pythia8 sample. The unfolded varied distribution is compared to the corresponding particle- or parton-level distribution. For each systematic uncertainty, the correlation between the signal and background distributions is taken into account. All detector- and background-related systematic uncertainties are estimated using the nominal Powheg+Pythia8 sample. Alternative hard-scattering, parton shower and hadronization, ISR/FSR and PDF uncertainties are estimated by a comparison between the unfolded cross-section and the corresponding particle- or parton-level distribution produced using the corresponding alternative Monte Carlo event generator.
The systematic uncertainties for the particle-level fiducial phase-space total cross-section measurement described below are listed in Table 3.
Table 3: Summary of the largest systematic and statistical relative uncertainties for the absolute particle-level fiducial phase-space cross-section measurement in percent. Most of the uncertainties that are less than 1% are not listed.
Source                      Percentage
Large-R jet energy scale    5
Figure 6 shows a summary of the relative size of the systematic uncertainties for the leading top-quark jet transverse momentum and rapidity at particle level and parton level.
Figure 6: Relative uncertainties in the normalized differential cross-sections as a function of the leading top-quark jet transverse momentum and rapidity at particle level and parton level. The light and dark blue areas represent the total and statistical uncertainty, respectively. The Powheg+Pythia8 event generator is used as the nominal prediction to correct for detector effects.
A covariance matrix is constructed for each differential cross-section to include the effect of all uncertainties to allow quantitative comparisons with theoretical predictions. This covariance matrix is derived by summing two covariance matrices following the same approach used in Refs. [10, 14].
The first covariance matrix incorporates statistical uncertainties and systematic uncertainties from detector effects and background estimation by using pseudoexperiments to convolve the sources. In each pseudoexperiment, the detector-level data distribution is varied following a Poisson distribution. For each systematic uncertainty effect, Gaussian-distributed shifts are coherently included by scaling each Poisson-fluctuated bin content with its expected relative variation from the associated systematic uncertainty. Differential cross-sections are obtained by unfolding the varied distribution with the nominal corrections, and the distribution of the resulting changes in the unfolded distributions is used to compute this first covariance matrix.
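A pseudoexperiment loop of this kind can be sketched as below. The sketch assumes one unit-Gaussian pull per systematic source, applied coherently to all bins through that source's relative variation; the `unfold` callable, the number of pseudoexperiments and the seed are placeholders.

```python
import numpy as np

def pseudoexperiment_covariance(data, rel_systs, unfold, n_toys=5000, seed=1):
    """Covariance of unfolded spectra over Poisson- and Gaussian-varied pseudoexperiments.

    data:      detector-level yields per bin.
    rel_systs: array of shape (n_sources, n_bins) with relative 1-sigma variations.
    unfold:    callable mapping a detector-level spectrum to an unfolded spectrum.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    rel_systs = np.atleast_2d(np.asarray(rel_systs, dtype=float))
    toys = []
    for _ in range(n_toys):
        varied = rng.poisson(data).astype(float)           # statistical fluctuation per bin
        pulls = rng.standard_normal(rel_systs.shape[0])    # one Gaussian pull per source
        varied *= 1.0 + pulls @ rel_systs                  # coherent shift across bins
        toys.append(unfold(varied))
    return np.cov(np.array(toys), rowvar=False)
```

With an identity `unfold` and a single 10% source acting on all bins, the resulting matrix shows the expected strong positive bin-to-bin correlation on top of the Poisson variances.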
The second covariance matrix is obtained by summing four separate covariance matrices corresponding to the effects of the $t\bar{t}$ event generator, parton shower and hadronization, ISR/FSR and PDF uncertainties. The bin-to-bin correlation values are set to unity for all these matrices.
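Setting the bin-to-bin correlations to unity means each source contributes a rank-one matrix built from its per-bin uncertainties. A minimal sketch, assuming the inputs are absolute per-bin shifts whose magnitudes are used (the function name is hypothetical):

```python
import numpy as np

def fully_correlated_covariance(shifts):
    """Sum over sources of sigma_i * sigma_j, i.e. 100% correlation between bins.

    shifts: array of shape (n_sources, n_bins) with the absolute uncertainty per bin.
    """
    sigma = np.abs(np.atleast_2d(np.asarray(shifts, dtype=float)))
    return np.einsum("si,sj->ij", sigma, sigma)
```

The total covariance used in the comparisons would then be the sum of this matrix and the pseudoexperiment-based one.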
The comparison between the measured differential cross-sections and a variety of MC predictions is quantified by calculating $\chi^2$ values employing the covariance matrix and by calculating the corresponding p-values (probabilities that the $\chi^2$ is larger than or equal to the observed value assuming that the measured and predicted distributions are statistically equivalent) from the $\chi^2$ and the number of degrees of freedom (NDF). The $\chi^2$ values are obtained using
$$\chi^2 = V^{\rm T}_{N_{\rm b}} \cdot {\rm Cov}^{-1}_{N_{\rm b}} \cdot V_{N_{\rm b}},$$
where $V_{N_{\rm b}}$ is the vector of differences between measured differential cross-section values and predictions, and ${\rm Cov}^{-1}_{N_{\rm b}}$ is the inverse of the covariance matrix. The normalization constraint used to derive the normalized differential cross-sections lowers the NDF to one less than the rank of the $N_{\rm b} \times N_{\rm b}$ covariance matrix, where $N_{\rm b}$ is the number of bins in the unfolded distribution. The $\chi^2$ for the normalized differential cross-sections is
$$\chi^2 = V^{\rm T}_{N_{\rm b}-1} \cdot {\rm Cov}^{-1}_{N_{\rm b}-1} \cdot V_{N_{\rm b}-1},$$
where $V_{N_{\rm b}-1}$ is the vector of differences between measurement and prediction obtained by discarding one of the $N_{\rm b}$ elements and ${\rm Cov}_{N_{\rm b}-1}$ is the $(N_{\rm b}-1) \times (N_{\rm b}-1)$ sub-matrix derived from the covariance matrix by discarding the corresponding row and column.
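The comparison can be sketched as follows. The survival function is written out with a series for the regularized incomplete gamma function so that the example depends only on NumPy and the standard library; in practice `scipy.stats.chi2.sf` would be the usual choice. Function names and the default choice of discarding the last bin are assumptions for the example.

```python
import math
import numpy as np

def chi2_sf(x, ndf, terms=500):
    """P(chi2 >= x) for ndf degrees of freedom (series form, adequate for moderate x)."""
    a, z = 0.5 * ndf, 0.5 * x
    if z <= 0.0:
        return 1.0
    term = 1.0 / a
    total = term
    for n in range(1, terms):
        term *= z / (a + n)
        total += term
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def chi2_test(measured, predicted, cov, normalized=False, drop=-1):
    """Return (chi2, NDF, p-value) using the full covariance matrix.

    For normalized spectra one bin (index `drop`) and the matching row and column
    of the covariance matrix are discarded, which reduces the NDF by one.
    """
    diff = np.asarray(measured, dtype=float) - np.asarray(predicted, dtype=float)
    cov = np.asarray(cov, dtype=float)
    if normalized:
        keep = np.delete(np.arange(diff.size), drop)
        diff = diff[keep]
        cov = cov[np.ix_(keep, keep)]
    chi2 = float(diff @ np.linalg.solve(cov, diff))
    ndf = diff.size
    return chi2, ndf, chi2_sf(chi2, ndf)
```

Solving the linear system instead of forming the explicit inverse gives the same $\chi^2$ and is numerically better behaved when the covariance matrix is close to singular.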
Measurement of the differential cross-sections
The differential cross-sections are obtained from the data using the unfolding technique described above. In the following subsections, the resulting particle-level and parton-level differential cross-sections are presented.

Particle-level fiducial phase-space differential cross-section
The unfolded differential cross-sections, normalized to the total cross-section for the fiducial phase space, are shown in Fig. 7 for the $p_{\rm T}$ and rapidity of the leading and second-leading top-quark jets, and in Fig. 8 for the $p_{\rm T}$, mass and rapidity of the $t\bar{t}$ system. The unfolded differential cross-sections are shown in Figs. 9-11 for the $t\bar{t}$ production angle in the Collins-Soper reference frame; the scalar sum of the transverse momenta of the two top quarks, $H_{\rm T}^{t\bar{t}}$; the longitudinal boost, $y_{\rm B}^{t\bar{t}}$; the azimuthal angle between the two top-quark jets, $\Delta\phi^{t\bar{t}}$; the variable related to the rapidity difference between the two top-quark jets, $\chi^{t\bar{t}}$; and the absolute value of the out-of-plane momentum, $p_{\rm out}^{t\bar{t}}$. These are compared with SM predictions obtained using the NLO MC event generators described in Sec. 3. This analysis is sensitive to top-quark jets produced with $p_{\rm T}$ up to approximately 1 TeV and to a rapidity $|y^{t}| < 2.0$. The differential cross-section falls by two orders of magnitude as a function of top-quark jet transverse momentum over a $p_{\rm T}$ range from 500 GeV to 1 TeV. The production cross-section decreases as a function of top-quark jet rapidity by approximately 30% from $y^{t} = 0$ to $y^{t} = \pm 1$. The differential cross-section as a function of $p_{\rm T}$ for the second-leading top-quark jet reflects the effect of the $p_{\rm T}$ requirement on the leading top-quark jet and the strong correlation in $p_{\rm T}$ of the two top-quark jets arising from the pair-production process.
The $t\bar{t}$ system is centrally produced with a transverse momentum typically below 200 GeV, an invariant mass below 1.5 TeV and a rapidity $|y^{t\bar{t}}| < 1.5$. In particular, the $m^{t\bar{t}}$ distribution falls smoothly, with a sensitivity that extends up to ∼ 2 TeV.
Figure 10: Normalized particle-level fiducial phase-space differential cross-sections as a function of (a) the azimuthal angle between the two top-quark jets $\Delta\phi^{t\bar{t}}$ and (b) the absolute value of the out-of-plane momentum $p_{\rm out}^{t\bar{t}}$. The gray bands indicate the total uncertainty in the data in each bin. The vertical bars indicate the statistical uncertainties in the theoretical models. The Powheg+Pythia8 event generator is used as the nominal prediction. Data points are placed at the center of each bin.
Figure 11: Normalized particle-level fiducial phase-space differential cross-sections as a function of (a) the production angle in the Collins-Soper reference frame and (b) the variable $\chi^{t\bar{t}}$. The gray bands indicate the total uncertainty in the data in each bin. The vertical bars indicate the statistical uncertainties in the theoretical models. The Powheg+Pythia8 event generator is used as the nominal prediction. Data points are placed at the center of each bin.

Parton-level phase-space differential cross-sections
The unfolded parton-level phase-space differential cross-sections are shown in Figs. 12-16 for the kinematical variables describing the top quark, leading top quark, second-leading top quark and the $t\bar{t}$ system.
To measure the average top-quark $p_{\rm T}$ distribution that can be compared with NNLO+NNLL calculations [1][2][3], the data are unfolded by randomly selecting one of the two top-quark candidates at the detector level for each event. The normalized average top-quark $p_{\rm T}$ and rapidity differential cross-sections are shown in Fig. 12(a) and Fig. 12(b), respectively.
Figure 16: The normalized parton-level differential cross-sections as a function of (a) $\Delta\phi(t_1, t_2)$ and (b) $p_{\rm out}^{t\bar{t}}$. The orange bands indicate the total uncertainty in the data in each bin. The vertical bars indicate the statistical uncertainties in the theoretical models. The Powheg+Pythia8 event generator is used as the nominal prediction to correct for detector effects, parton showering and hadronization. Data points are placed at the center of each bin. The unfolding has required the leading top-quark $p_{\rm T} > 500$ GeV and the second-leading top-quark $p_{\rm T} > 350$ GeV.

Fiducial phase-space inclusive cross-section
The cross-section of $t\bar{t}$ production in the fiducial phase space defined in this analysis is determined using the same methodology employed to obtain the unfolded differential cross-sections at particle level, with the exception that all events are grouped into a single bin. The inclusive fiducial cross-section is
$$\sigma^{\rm fid} = 292 \pm 7\ {\rm (stat)} \pm 76\ {\rm (syst)}\ {\rm fb}.$$
The systematic uncertainties in this measurement, which are dominated by tagging and modeling uncertainties, are summarized in Table 3.
The resulting inclusive fiducial cross-section measurement is shown in Fig. 18 and compared with various MC predictions. The measured value is below all of the predictions, and in particular is below the Powheg+Pythia8 prediction of 384 ± 36 fb. The uncertainty in this MC prediction is the sum in quadrature of statistical, scale and PDF uncertainties, including the uncertainty in the NNLO+NNLL total cross-section prediction. The scale uncertainty is estimated by determining the envelope of predictions when the factorization $\mu_{\rm F}$ and renormalization $\mu_{\rm R}$ scales are varied by factors of 0.5 and 2.0. The PDF uncertainty is obtained using the PDF4LHC prescription with 30 eigenvectors. All of the predictions are normalized to the NNLO+NNLL total $t\bar{t}$ cross-section.
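The two ingredients of the prediction uncertainty, an envelope over scale variations and a sum in quadrature, can be sketched as below. The function names are hypothetical and the example does not reproduce the individual components behind the quoted ±36 fb, which are not given here.

```python
import math

def scale_envelope(nominal, variations):
    """Largest upward and downward deviations among the scale-varied predictions."""
    up = max(max(v - nominal for v in variations), 0.0)
    down = max(max(nominal - v for v in variations), 0.0)
    return up, down

def quadrature_sum(*components):
    """Combine independent uncertainty components in quadrature."""
    return math.sqrt(sum(c * c for c in components))
```

For example, `scale_envelope(100.0, [90.0, 105.0, 112.0, 95.0])` returns `(12.0, 10.0)`; one of these, or a symmetrized value, would then enter `quadrature_sum` together with the statistical and PDF components.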

Comparisons with Standard Model predictions
The particle-level fiducial phase-space differential cross-sections and the parton-level differential cross-sections are compared with several Standard Model calculations. The predicted total particle-level cross-section for top-quark pair production in the fiducial phase-space region is larger than the one observed. However, the effect is not statistically significant due to the large systematic uncertainties. A better agreement is found for Powheg+Herwig7 and to a lesser extent for the predictions of Powheg+Pythia8 with more initial- and final-state radiation.
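A rough cross-check of the size of the difference in the inclusive cross-section can be made from the quoted numbers alone. The sketch assumes that the statistical, systematic and prediction uncertainties are independent and Gaussian, which is a simplification introduced here and not a statement about how the analysis treats them.

```python
import math

def pull(measured, predicted, *uncertainties):
    """Difference between prediction and measurement in units of the combined uncertainty."""
    total = math.sqrt(sum(u * u for u in uncertainties))
    return (predicted - measured) / total

# Measured 292 +- 7 (stat) +- 76 (syst) fb versus the predicted 384 +- 36 fb.
significance = pull(292.0, 384.0, 7.0, 76.0, 36.0)
print(f"{significance:.2f} standard deviations")
```

Under these assumptions the difference is about 1.1 standard deviations, consistent with the statement that the effect is not statistically significant.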
The information provided by the shapes of the observed differential cross-section measurements is compared to the predictions using the $\chi^2$ test described in Sec. 7.2, which takes into account the correlations between the measured quantities. The largest correlations at the detector level arise from sources of uncertainty that affect all bins equally, so that the most effective comparison is made using the normalized differential cross-sections where many of the common detector-level uncertainties largely cancel. The $\chi^2$ values and associated p-values that quantify the level of agreement between the measurements and the predictions are shown in Table 4 for the normalized particle-level fiducial phase-space differential cross-sections and in Table 5 for the normalized parton-level differential cross-sections.
The particle-level differential cross-sections are generally well-described by the Powheg+Pythia8, Powheg+Herwig7, MG5_aMC@NLO+Pythia8 and Sherpa event generator predictions. The $t\bar{t}$ differential cross-section as a function of the absolute value of the leading top-quark rapidity (Fig. 7(c)) is broader in the data than the predictions of all Monte Carlo event generators. A similar effect is observed in the $t\bar{t}$ system rapidity differential cross-section (Fig. 8(c)). However, the p-values arising from the $\chi^2$ comparisons are mostly within 0.15 to 0.55, reflecting the overall reasonable agreement of the predictions with the measured differential cross-sections. There are modest differences in the distributions of the production angle $\cos\theta^{*}$ (Fig. 11(a)) and the variable $\chi^{t\bar{t}}$ (Fig. 11(b)), both showing p-values that are generally below 0.2.
The most significant deviations are in the MG5_aMC@NLO particle-level fiducial phase-space differential cross-sections as a function of $p_{\rm T}^{t\bar{t}}$ (Fig. 8(a)), $\Delta\phi^{t\bar{t}}$ (Fig. 10(a)) and $p_{\rm out}^{t\bar{t}}$ (Fig. 10(b)), for which the MG5_aMC@NLO+Pythia8 MC event generator predicts a harder $p_{\rm T}^{t\bar{t}}$ spectrum, a broader azimuthal opening angle differential cross-section than what is measured and a slower decline than observed as a function of $p_{\rm out}^{t\bar{t}}$. There is similarly good agreement between the parton-level differential cross-sections and the Powheg+Pythia8, Powheg+Herwig7, MG5_aMC@NLO+Pythia8 and Sherpa predictions, confirming the results of the fiducial phase-space measurements, but with larger uncertainties. As shown in Fig. 14(a), the Powheg+Pythia8 and Powheg+Herwig7 event generators predict a softer $p_{\rm T}$ spectrum of the $t\bar{t}$ system, while the MG5_aMC@NLO+Pythia8 event generator predicts a harder spectrum. The Sherpa event generator offers a good description of the differential cross-section behavior for $p_{\rm T}^{t\bar{t}}$ in the range 100 to 500 GeV but predicts a steeper distribution for lower momenta and a higher rate for $p_{\rm T}^{t\bar{t}} > 500$ GeV than observed. The modeling uncertainties generally play a dominant role in determining the significance of the difference between the measurements and the nominal Powheg+Pythia8 prediction. This suggests that future work should seek the sources of this potential discrepancy, considering variations in parton shower and hadronization models as well as the matching of higher-order matrix elements with the parton shower model.
Table 4: Comparison between the measured normalized particle-level fiducial phase-space differential cross-sections and the predictions from several SM event generators. For each variable and prediction, a $\chi^2$ and a p-value are calculated using the covariance matrix described in the text, which includes all sources of uncertainty. The number of degrees of freedom (NDF) is equal to $N_{\rm b} - 1$, where $N_{\rm b}$ is the number of bins in the distribution.
These results are in agreement with earlier differential cross-section measurements in the $t\bar{t}$ final states involving at least one lepton [7-14, 16-19]. Those studies observed a "softer" $p_{\rm T}$ spectrum for the top-quark final states, although their statistical and systematic uncertainties for top quarks with $p_{\rm T} > 500$ GeV are larger than those of the measurements reported here. Together, the previous measurements and these results provide a coherent picture that the current NLO Monte Carlo models for $t\bar{t}$ production and decay overestimate the production of highly boosted top quarks.

Conclusion
Measurements of differential cross-sections of highly boosted pair-produced top quarks in 13 TeV $pp$ collisions are presented in a data sample of 36.1 fb$^{-1}$ collected by the ATLAS detector at the LHC. The top-quark pairs are observed in their all-hadronic decay modes. With a combination of top-tagging and $b$-tagging techniques, an event sample with a $t\bar{t}$ signal-to-background ratio of approximately 3-to-1 is selected. Because most of the decay products of the top quarks are observed in a large-R jet, the kinematics of the top quarks and the $t\bar{t}$ system are well-measured compared with final states involving energetic neutrinos. The measurements are corrected to a fiducial phase space and normalized to the total cross-section for events with leading top quarks with $p_{\rm T} > 500$ GeV and second-leading top quarks with $p_{\rm T} > 350$ GeV. Parton-level differential cross-sections are also determined.
The leading and second-leading top-quark $p_{\rm T}$ differential cross-sections fall by two orders of magnitude over the $p_{\rm T}$ range from 500 GeV to 1 TeV. The top-quark rapidity distributions show a plateau out to $|y^{t}| \sim 0.6$ and then fall rapidly, reflecting the central production of these top-quark pairs. The measurements show that the $t\bar{t}$ system is produced centrally with limited transverse momentum, though events are observed up to a $p_{\rm T}^{t\bar{t}}$ of 500 GeV. The normalized differential cross-sections are compared with several Standard Model predictions for highly boosted pair-produced top quarks, and there is generally good agreement of the predictions with the particle-level and parton-level differential results. In particular, the Powheg+Pythia8, Powheg+Herwig7 and Sherpa predictions are consistent with the observed differential cross-sections at particle level and parton level. The most significant discrepancy is in the MG5_aMC@NLO+Pythia8 predictions for the kinematics of the $t\bar{t}$ system. Qualitatively, both particle- and parton-level rapidity distributions of the leading top quark and of the $t\bar{t}$ system are broader in the data compared with the Monte Carlo generator predictions. Also, there are more modest differences between predicted and observed differential cross-sections as a function of the production angle $\cos\theta^{*}$ and the variable $\chi^{t\bar{t}}$.
The cross-section for $t\bar{t}$ production in the particle-level fiducial phase space is 292 ± 7 (stat) ± 76 (syst) fb, which can be compared with the Powheg+Pythia8 prediction of 384 ± 36 fb, where the total cross-section has been calculated up to NNLO+NNLL corrections. Improvements in this measurement will come from a better understanding of the models of $t\bar{t}$ production that are the source of the modeling uncertainties.
This analysis shows that studies of boosted top-quark jets can be done with good efficiency and signal-to-background ratios in the all-hadronic channel. This creates opportunities for more detailed studies of high-$p_{\rm T}$ Standard Model processes, and provides data to test and improve models of $t\bar{t}$ production.
[8] ATLAS Collaboration, Measurements of normalized differential cross-sections for $t\bar{t}$ production in $pp$ collisions at $\sqrt{s} = 7$ TeV using the ATLAS detector, Phys. Rev. D 90 (2014) 072004, arXiv: 1407.0371 [hep-ex].
[10] ATLAS Collaboration, Measurement of the differential cross-section of highly boosted top quarks as a function of their transverse momentum in $\sqrt{s} = 8$ TeV proton-proton collisions using the ATLAS detector, Phys. Rev. D 93 (2016) 032009, arXiv: 1510.03818 [hep-ex].