Measurement of the Jet Mass Distribution and Top Quark Mass in Hadronic Decays of Boosted Top Quarks in pp Collisions at $\sqrt{s} = 13$ TeV

A measurement is reported of the jet mass distribution in hadronic decays of boosted top quarks produced in pp collisions at $\sqrt{s} = 13$ TeV. The data were collected with the CMS detector at the LHC and correspond to an integrated luminosity of 35.9 fb$^{-1}$. The measurement is performed in the lepton+jets channel of $t\bar{t}$ events, where the lepton is an electron or muon. The products of the hadronic top quark decay $t \to bW \to bq\bar{q}'$ are reconstructed as a single jet with transverse momentum larger than 400 GeV. The $t\bar{t}$ cross section as a function of the jet mass is unfolded at the particle level and used to extract a value of the top quark mass of $172.6 \pm 2.5$ GeV. A novel jet reconstruction technique is used for the first time at the LHC, which improves the precision by a factor of 3 relative to an earlier measurement. This highlights the potential of measurements using boosted top quarks, where the new technique will enable future precision measurements.

The top quark is the most massive known elementary particle. Its large mass $m_t$ leads to significant contributions from quantum corrections to the mass of the Higgs boson and precision observables in the electroweak sector. As a consequence, the top quark plays an important role in the mechanism of electroweak symmetry breaking. Precision measurements of $m_t$ provide a crucial input for consistency checks of the standard model [1,2]. Direct measurements of $m_t$ at the CERN LHC reach a precision of around 0.5 GeV [3–9]. However, an ambiguity in the interpretation of the results originates from the modeling of parton-shower dynamics and nonperturbative effects in quantum chromodynamics (QCD). The result can depend on the Monte Carlo (MC) event generator, the tuning of its free parameters, and the observables used [10]. Precisely relating the experimentally obtained value of $m_t$ to the pole mass or a mass in another well-defined renormalization scheme is therefore difficult from first principles [11].
As an alternative, a value of the pole mass can be extracted through measurements of the total [12–15] and differential [16,17] $t\bar{t}$ production cross sections, with a precision of approximately 1 GeV. These measurements are dominated by $t\bar{t}$ threshold production, where uncertainties due to parton distribution functions (PDFs) and higher-order QCD corrections are important [18–20]. Another way to determine $m_t$ involves measuring top quarks produced with large Lorentz boosts, where the decay products $t \to bW \to bq\bar{q}'$ are contained in a single jet. The jet mass ($m_{\text{jet}}$) peak location is sensitive to $m_t$ and can be calculated from first principles [21–27] in soft-collinear effective theory [28–31].
A past measurement reporting the $t\bar{t}$ cross section as a function of $m_{\text{jet}}$ in the $\ell$+jets final state, where $\ell$ is an electron or muon, was carried out in proton-proton (pp) collisions at $\sqrt{s} = 8$ TeV [32]. This Letter reports a new measurement of the $m_{\text{jet}}$ distribution in pp collisions at 13 TeV using several important improvements, including jet clustering with the XCone algorithm [33], used for the first time in an LHC analysis, and an improved unfolding procedure using sideband regions with high granularity.
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. A silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter, each composed of a central barrel and two end sections, reside within the solenoid volume. Forward calorimeters extend the pseudorapidity ($\eta$) coverage provided by the barrel and end detectors. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid. A more detailed description of the CMS detector, together with a definition of the coordinate system, can be found in Ref. [34].
The particle-flow (PF) algorithm [35] aims to reconstruct and identify each individual particle in an event, using an optimized combination of information from the various elements of the CMS detector. The candidate vertex with the largest sum of the squared transverse momenta $p_T^2$ of the physics objects is taken to be the primary pp interaction vertex; more details are given in Sec. 9.4.1 in Ref. [36]. From PF candidates, jets are reconstructed using the anti-$k_T$ [37] or the XCone [33] algorithm as implemented in the FASTJET software package [38]. The anti-$k_T$ jets are obtained using a distance parameter of 0.4.
In the jet-clustering procedure, charged PF candidates are excluded if they are associated to vertices from additional inelastic pp interactions within the same bunch crossing (pileup).
The POWHEG [39–44] v2 generator is used for simulating $t\bar{t}$ production at next-to-leading order (NLO). Alternatively, $t\bar{t}$ production is simulated with MadGraph5_aMC@NLO v2.2.2 [45,46] at NLO to check a potential generator dependence of the measured cross sections. Background events resulting from the production of single top quarks are also generated in POWHEG at NLO, where spin correlations are taken into account [47]. The production of a W boson with additional jets is simulated using MadGraph5_aMC@NLO at NLO. Events from Drell-Yan (DY) production with additional jets are simulated in MadGraph5_aMC@NLO at leading order (LO) and are normalized to the next-to-next-to-leading-order cross section [48]. The simulation of the production of two heavy gauge bosons with additional jets is performed at LO with PYTHIA v8.212 [49]. Events in which jets are produced only through QCD interactions are also simulated with PYTHIA at LO.
In simulated MadGraph5_aMC@NLO events, the matrix element (ME) calculations at NLO and LO accuracy are matched to parton showers with the FxFx [50] and MLM [51] algorithms, respectively. The parton shower, hadronization process, and multiple-parton interactions are simulated using PYTHIA. The NNPDF3.0 [52] PDFs at LO and NLO are used for the respective processes simulated at LO and NLO. The underlying event (UE) tune CUETP8M2T4 [53] is used to simulate $t\bar{t}$ and single top quark production in the t channel; all other processes are simulated using CUETP8M1 [54,55]. The detector response is simulated with the GEANT4 package [56,57]. Simulated events are processed through the software chain used for collision data and are reweighted to match the observed distribution in the number of pileup interactions in the data.
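The pileup reweighting mentioned above can be illustrated with a minimal sketch. It assumes the common approach of taking, for each number of pileup interactions, the ratio of the normalized data and simulation histograms; the function name `pileup_weights` and the toy histograms are hypothetical and are not taken from the analysis.

```python
def pileup_weights(data_counts, mc_counts):
    """Per-bin weights that reshape the simulated pileup distribution to the data.

    Both inputs are histograms indexed by the number of pileup interactions.
    Each is normalized to unit area first, so the weights change the shape of
    the simulated distribution without changing its overall normalization.
    """
    data_total = float(sum(data_counts))
    mc_total = float(sum(mc_counts))
    weights = []
    for n_data, n_mc in zip(data_counts, mc_counts):
        if n_mc == 0:
            weights.append(0.0)  # no simulated events available in this bin
        else:
            weights.append((n_data / data_total) / (n_mc / mc_total))
    return weights


# Toy histograms (hypothetical numbers, not CMS data).
data_hist = [10, 30, 40, 20]
mc_hist = [25, 25, 25, 25]
weights = pileup_weights(data_hist, mc_hist)
reweighted_mc = [w * n for w, n in zip(weights, mc_hist)]
```

In this toy case the reweighted simulation reproduces the shape of the data histogram while keeping the same total number of simulated events.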
This analysis uses data recorded with the CMS detector that correspond to an integrated luminosity of 35.9 fb$^{-1}$ [58]. Events containing the decay of a top quark to a final state including a muon are selected using a single-muon trigger [59] that requires the presence of at least one muon candidate with a transverse momentum $p_T > 50$ GeV and $|\eta| < 2.4$. For events containing a final-state electron, the trigger requires the presence of at least one isolated electron candidate with $p_T > 27$ GeV, or an electron candidate without an isolation requirement but with $p_T > 115$ GeV and $|\eta| < 2.5$, or at least one photon candidate with $p_T > 175$ GeV and $|\eta| < 2.5$. The latter requirement ensures that events containing electrons with high $p_T$ are selected with high efficiency. Lepton candidates (electrons or muons) must have $p_T > 55$ GeV and $|\eta| < 2.4$. Following the requirement at the trigger level, electrons with $p_T < 120$ GeV must pass an isolation requirement [60], where the isolation is defined as the $p_T$ sum of charged hadrons and neutral particles in a cone with radius $\Delta R = 0.3$ around the electron. The angular distance between two objects is defined as $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$, where $\phi$ is the azimuthal angle in radians. Electrons with $p_T > 120$ GeV and muons with $p_T > 55$ GeV are required to pass a two-dimensional selection of either $\Delta R(\ell, j) > 0.4$ or $p_{T,\text{rel}}(\ell, j) > 40$ GeV, where $j$ is the anti-$k_T$ jet with minimal angular separation $\Delta R$ from the lepton $\ell$ and $p_{T,\text{rel}}(\ell, j)$ is the component of the lepton momentum orthogonal to the anti-$k_T$-jet axis [61,62]. Each selected event must contain a single lepton.
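The two-dimensional lepton selection can be sketched as follows. This is an illustrative implementation under stated assumptions, not the analysis code: objects are represented as hypothetical `(pt, eta, phi)` tuples, the function names are invented for this example, and a lepton in an event without any jet is assumed to pass.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(deta^2 + dphi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def momentum_vector(pt, eta, phi):
    """Cartesian three-momentum from (pt, eta, phi)."""
    return (pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta))


def pt_rel(lepton, jet):
    """Lepton momentum component orthogonal to the jet axis: |p_l x p_j| / |p_j|."""
    lx, ly, lz = momentum_vector(*lepton)
    jx, jy, jz = momentum_vector(*jet)
    cross = (ly * jz - lz * jy, lz * jx - lx * jz, lx * jy - ly * jx)
    cross_norm = math.sqrt(sum(c * c for c in cross))
    return cross_norm / math.sqrt(jx * jx + jy * jy + jz * jz)


def passes_2d_selection(lepton, jets, min_dr=0.4, min_pt_rel=40.0):
    """True if dR(l, j) > min_dr or pT,rel(l, j) > min_pt_rel for the nearest jet j."""
    if not jets:
        return True  # assumption: nothing to be close to, so the lepton passes
    nearest = min(jets, key=lambda j: delta_r(lepton[1], lepton[2], j[1], j[2]))
    dr = delta_r(lepton[1], lepton[2], nearest[1], nearest[2])
    return dr > min_dr or pt_rel(lepton, nearest) > min_pt_rel


# Toy objects as (pt [GeV], eta, phi); the values are hypothetical.
jet = (200.0, 0.0, 0.0)
soft_close_lepton = (60.0, 0.0, 0.3)   # close to the jet and small pT,rel
far_lepton = (60.0, 0.0, 0.5)          # separated from the jet by dR = 0.5
hard_close_lepton = (200.0, 0.0, 0.3)  # close to the jet but large pT,rel
```

The "or" in the selection is what keeps leptons from boosted top quark decays, which are often close to the b jet but carry a large momentum component transverse to its axis.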
The XCone jets are obtained through a two-step jet clustering [63]. First, the exclusive XCone algorithm is applied with a distance parameter of $R_{\text{jet}} = 1.2$ and configured to return exactly two jets, corresponding to the two boosted top quarks in the event. Using the constituents of these two large jets as input, XCone is run again with the distance parameter $R_{\text{sub}} = 0.4$ and the number of subjets in each jet set to $N_{\text{sub}} = 3$. Subjets are considered only if they are within $|\eta| < 2.4$. This procedure results in exactly two large-radius XCone jets with three XCone subjets each. The final result is not influenced by the number of subjets within the large XCone jet including the lepton, where $N_{\text{sub}} = 2$ would be the natural choice for clustering the visible products of the decay $t \to bW \to b\ell\nu$. The four-momentum of the lepton candidate is subtracted from the four-momentum of the anti-$k_T$ jet or XCone subjet if $\Delta R(\ell, j) < 0.4$. Jet energy corrections [64] derived for anti-$k_T$ jets are applied to anti-$k_T$ jets and XCone subjets. The jet energy resolution in simulated events is smeared to match the resolution in data. An additional correction applied to the XCone-subjet momenta is obtained from simulated $t\bar{t}$ events in the all-jets channel to account for differences between the XCone-subjet momenta and the momenta of anti-$k_T$ jets. This correction is parametrized as a function of XCone subjet $p_T$ and $|\eta|$ and has an average size of 2%, with an average uncertainty of 0.3%.
The four-momenta of the three XCone subjets are combined to form the final XCone jet. The XCone jet used to perform the measurement is the one with the largest distance $\Delta R$ to the selected lepton. Each of the three XCone subjets in this jet must have $p_T > 30$ GeV. The XCone-jet mass $m_{\text{jet}}$ is the invariant mass of all PF candidates clustered into the three XCone subjets.
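As an illustration of how a jet mass follows from summed four-momenta, the sketch below adds three subjet four-vectors and computes the invariant mass of the sum. It assumes that each subjet four-momentum is itself the sum of its constituents, so that the result equals the mass of all clustered candidates; the function names and the subjet values are hypothetical.

```python
import math


def four_vector(pt, eta, phi, mass):
    """Return (E, px, py, pz) for an object given as (pt, eta, phi, mass)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)


def add_four_vectors(vectors):
    """Component-wise sum of a list of (E, px, py, pz) tuples."""
    return tuple(sum(components) for components in zip(*vectors))


def invariant_mass(vector):
    """sqrt(E^2 - |p|^2), with small negative rounding errors clipped to zero."""
    energy, px, py, pz = vector
    mass_squared = energy * energy - (px * px + py * py + pz * pz)
    return math.sqrt(max(mass_squared, 0.0))


# Three toy subjets as (pt [GeV], eta, phi, mass [GeV]); hypothetical values.
subjets = [
    four_vector(220.0, 0.10, 0.00, 12.0),
    four_vector(150.0, -0.20, 0.45, 9.0),
    four_vector(60.0, 0.35, -0.30, 6.0),
]
xcone_jet = add_four_vectors(subjets)
m_jet = invariant_mass(xcone_jet)
jet_pt = math.hypot(xcone_jet[1], xcone_jet[2])
```

Because only the constituents of the three subjets enter the sum, radiation clustered into the large jet but outside the subjets does not contribute to `m_jet`.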
To identify jets originating from the hadronization of b quarks, the combined secondary vertex v2 (CSVv2) [65] algorithm is applied to the anti-$k_T$ jets. These candidate b jets are required to have $p_T > 30$ GeV and $|\eta| < 2.4$ and must pass the tight working point of the CSVv2 algorithm.
The fiducial region chosen for this measurement is studied using simulations at the particle level, defined by all particles with average lifetimes longer than $10^{-8}$ s. The kinematic phase space of this region is defined through $t\bar{t}$ events containing one lepton with $p_T^{\ell} > 60$ GeV, which originates from the decay of a W boson; the τ lepton decays are not considered part of the signal. Particle-level jets are obtained with a clustering identical to the one in the data. The particle-level XCone jet with largest distance $\Delta R$ to the lepton is required to have $p_T > 400$ GeV, and each of its XCone subjets must have $p_T > 30$ GeV. Its mass has to be greater than the mass obtained by summing the four-momenta of the XCone jet with the second-highest $p_T$ and the lepton. The resulting distribution in $m_{\text{jet}}$ at the particle level has a width half as large as for Cambridge-Aachen (CA) jets [66,67] with $R_{\text{jet}} = 1.2$, as used in a previous measurement [32]. The improvement is due to the two-step XCone jet clustering procedure, which acts as a grooming algorithm [68–70], similar to trimming [71], on the large jet. The advantage of XCone over other grooming algorithms in this measurement is its dynamical interpolation between the resolved and boosted regime, i.e., between three well-separated subjets and three subjets close together, which would not be resolved by other reconstruction methods.
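The kinematic part of the fiducial definition can be summarized as a simple predicate. The sketch below is only an illustration: the function name and argument layout are hypothetical, and it assumes that the lepton origin (W boson decay, no τ decays) has already been checked and that all inputs are particle-level quantities in GeV.

```python
def in_fiducial_region(lepton_pt, jet_pt, subjet_pts, jet_mass, second_jet_lepton_mass):
    """Kinematic fiducial requirements quoted in the text.

    lepton_pt              -- pT of the lepton from the W boson decay
    jet_pt                 -- pT of the XCone jet farthest in dR from the lepton
    subjet_pts             -- pT values of the three subjets of that jet
    jet_mass               -- mass of that jet
    second_jet_lepton_mass -- mass of the other XCone jet combined with the lepton
    """
    return (
        lepton_pt > 60.0
        and jet_pt > 400.0
        and len(subjet_pts) == 3
        and all(pt > 30.0 for pt in subjet_pts)
        and jet_mass > second_jet_lepton_mass
    )


# Toy events with hypothetical values.
selected = in_fiducial_region(85.0, 450.0, [250.0, 140.0, 60.0], 172.0, 120.0)
soft_subjet = in_fiducial_region(85.0, 450.0, [300.0, 130.0, 20.0], 165.0, 120.0)
low_jet_pt = in_fiducial_region(85.0, 380.0, [200.0, 120.0, 60.0], 172.0, 120.0)
```

Each failed requirement in such a predicate corresponds to a region outside the measurement phase space, from which events can still migrate in at the reconstruction level.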
At the reconstruction level, the same criteria are used as in the definition of the fiducial phase space at the particle level. In addition, at the reconstruction level, an event must have at least one b-tagged anti-$k_T$ jet and $p_T^{\text{miss}} > 50$ GeV, which suppresses non-$t\bar{t}$ backgrounds. Here, $p_T^{\text{miss}}$ is the magnitude of the negative vector sum of the transverse momenta of the PF candidates in an event [72]. The resulting $m_{\text{jet}}$ distribution for XCone jets with $p_T^{\text{jet}} > 400$ GeV is displayed in Fig. 1. Backgrounds originate from singly produced top quarks and from W+jets events. Contributions from DY+jets, diboson, and QCD multijet production are found to be negligible. The $t\bar{t}$ simulation is scaled such that the number of simulated events matches the number of background-subtracted events in the data. The distribution shows a pronounced and narrow peak close to the value of $m_t$. The XCone-jet reconstruction results in a large improvement of the experimental resolution in $m_{\text{jet}}$. With XCone, a resolution of 6% is achieved, compared to a resolution of approximately 14% for CA jets with $R_{\text{jet}} = 1.2$.
The measurement at the particle level uses a regularized unfolding procedure based on a least-squares fit, implemented in the TUnfold [73] framework. The optimal regularization strength is determined through a minimization of the average global correlation coefficient in the output bins [74]. The response matrix is evaluated by using $t\bar{t}$ events simulated with POWHEG that pass the particle- or reconstruction-level requirements. Prior to the unfolding, contributions from background processes are subtracted from data. Sideband regions are included in the unfolding process to constrain migrations into and out of the measurement phase space.
Five sideband regions are defined by the requirements: $55 < p_T^{\ell} < 60$ GeV, $350 < p_T^{\text{jet}} < 400$ GeV, at least one XCone subjet with $p_T < 30$ GeV, $m_{\text{jet}}$ less than the mass of the second XCone jet and lepton system, and at least one anti-$k_T$ jet passing a looser b-tagging requirement with no anti-$k_T$ jet passing the tight b-tagging requirement. In addition, the measurement region is divided into three bins in $p_T^{\text{jet}}$. Except for the sideband with a looser b tag, all sideband selections have corresponding selections at the particle level in the evaluation of the migration matrix. In this matrix, the number of bins in $m_{\text{jet}}$ at the particle level is larger than the number of bins in which the final measurement is presented. This helps to reduce the dependence on variations in signal modeling through a more precise determination of migration effects. The electron and muon channels are combined before the unfolding to increase the statistical precision but are also unfolded separately to verify their consistency.
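The idea of a regularized least-squares unfolding can be illustrated with a small toy. The sketch below is not the TUnfold implementation: it assumes a diagonal covariance for the measured bins, regularizes on the size of the deviation from a bias vector, and takes the regularization strength `tau` as a given number rather than choosing it from the global correlation coefficients. All names and numbers are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        known = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - known) / aug[row][row]
    return x


def unfold(response, measured, variances, tau, bias=None):
    """Minimize sum_i (y_i - (A x)_i)^2 / var_i + tau^2 * |x - bias|^2.

    response[i][j] is the probability that an event generated in
    particle-level bin j is reconstructed in detector-level bin i.
    """
    n_reco, n_truth = len(response), len(response[0])
    if bias is None:
        bias = [0.0] * n_truth
    lhs = [[0.0] * n_truth for _ in range(n_truth)]
    rhs = [0.0] * n_truth
    for j in range(n_truth):
        for k in range(n_truth):
            lhs[j][k] = sum(
                response[i][j] * response[i][k] / variances[i] for i in range(n_reco)
            )
        lhs[j][j] += tau ** 2
        rhs[j] = sum(
            response[i][j] * measured[i] / variances[i] for i in range(n_reco)
        ) + tau ** 2 * bias[j]
    return solve_linear_system(lhs, rhs)


# Toy setup: three bins with migrations between neighbors (hypothetical numbers).
response = [
    [0.8, 0.1, 0.0],
    [0.2, 0.8, 0.2],
    [0.0, 0.1, 0.8],
]
truth = [100.0, 200.0, 100.0]
measured = [sum(response[i][j] * truth[j] for j in range(3)) for i in range(3)]
variances = list(measured)  # Poisson-like variances for the toy
unregularized = unfold(response, measured, variances, tau=0.0)
regularized = unfold(response, measured, variances, tau=0.05)
```

Without regularization and without statistical fluctuations the toy recovers the particle-level input; a nonzero `tau` pulls the result toward the bias vector, which is the trade-off the choice of regularization strength has to balance.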
Experimental uncertainties are estimated using simulation and propagated through the unfolding process. We consider uncertainties in the pileup reweighting [75], trigger, lepton identification and b-tagging [65] efficiencies, and also those related to the jet energy scale [64] and jet energy resolution for anti-$k_T$ jets and XCone subjets, and additional XCone-subjet corrections. Uncertainties related to the integrated luminosity [58] and the production cross sections of all significant background processes [76–81] are also included. Uncertainties arising from choices in modeling the signal include changes made in renormalization and factorization scales $\mu_R$ and $\mu_F$, changes in $m_t$ by $\pm 3$ GeV, changes in PDFs, and choices in modeling of parton showers (PS) and their matching to the ME calculation and the underlying event (UE). Uncertainties in the modeling of PS include changes in scales of initial- and final-state radiation (ISR and FSR, respectively) and changes in the ME matching parameter $h_{\text{damp}}$ [53]. The uncertainty related to modeling the UE is estimated by changing the model of color reconnection in PYTHIA [82] and using two other schemes [83,84]. Uncertainties from modeling b quark fragmentation and the semileptonic branching fractions of b hadrons are found to be negligible.
The measured differential cross section in the data is shown in Fig. 2 (top) and compared to the predictions from POWHEG and MadGraph5_aMC@NLO with $m_t = 172.5$ GeV. In the peak region, the total relative uncertainty is between 16% and 36%, of which the dominant contribution is 12%-31% from the jet energy scale uncertainty. The largest model uncertainty is from FSR modeling, with an uncertainty of 4%-18%. The statistical uncertainty is 6%-7%. The total measured $t\bar{t}$ cross section in the fiducial region of $112 < m_{\text{jet}} < 232$ GeV is $\sigma = 527 \pm 15\,(\text{stat}) \pm 39\,(\text{exp}) \pm 29\,(\text{model})$ fb.
The cross section predicted by POWHEG is $680 \pm 109$ fb, where the theoretical uncertainty is obtained by changing the scales $\mu_R$ and $\mu_F$, the ISR and FSR PS scales, the parameter $h_{\text{damp}}$, and the UE modeling in the simulation. A smaller cross section is observed in the data relative to the simulation, in agreement with previous high-$p_T$ top quark measurements [32,85–88].
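For orientation, the quoted numbers can be combined in a back-of-the-envelope comparison. The sketch below adds the uncertainty components in quadrature, which assumes they are uncorrelated; it also ignores correlations between the modeling uncertainty of the measurement and the theoretical uncertainty of the prediction, so the resulting significance is only a rough illustration and not a result of the analysis.

```python
import math


def quadrature_sum(*components):
    """Combine uncorrelated uncertainty components in quadrature."""
    return math.sqrt(sum(c * c for c in components))


# Values in fb, taken from the text.
measured = 527.0
measured_unc = quadrature_sum(15.0, 39.0, 29.0)  # stat, exp, model
predicted = 680.0
predicted_unc = 109.0

ratio = measured / predicted
difference_in_sigma = (predicted - measured) / quadrature_sum(measured_unc, predicted_unc)
```

Under these assumptions the total uncertainty in the measured cross section is about 51 fb, the data-to-prediction ratio is about 0.78, and the difference corresponds to roughly 1.3 standard deviations.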
Figure 2 (bottom) shows the normalized differential cross section as a function of $m_{\text{jet}}$, which is obtained by dividing the differential cross section by the total cross section in the fiducial region. The normalized differential cross section benefits from a partial cancellation of systematic uncertainties and shows good agreement with the prediction from POWHEG for a value of $m_t = 172.5$ GeV.
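The normalization step, and the reason why normalization-type uncertainties cancel, can be sketched as follows. The bin edges and per-bin cross sections are hypothetical toy values; only the range 112 to 232 GeV is taken from the text.

```python
def normalized_differential(bin_cross_sections, bin_edges):
    """Return (1/sigma) dsigma/dm per bin from bin-integrated cross sections."""
    total = sum(bin_cross_sections)
    widths = [high - low for low, high in zip(bin_edges[:-1], bin_edges[1:])]
    return [s / (total * w) for s, w in zip(bin_cross_sections, widths)]


# Toy binning and cross sections in fb (hypothetical, not the measured values).
edges = [112.0, 152.0, 192.0, 232.0]
cross_sections = [100.0, 300.0, 100.0]
normalized = normalized_differential(cross_sections, edges)

# A common scale factor, such as a shift in the integrated luminosity,
# drops out of the normalized distribution.
shifted = normalized_differential([1.025 * s for s in cross_sections], edges)
```

Uncertainties that change the shape of the distribution, such as the jet energy scale, do not cancel in this way, which is why the cancellation is only partial.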
FIG. 2. The particle-level $t\bar{t}$ differential cross section in the fiducial region as a function of the XCone-jet mass (top). The measurement is compared to predictions from POWHEG and MadGraph5_aMC@NLO with $m_t = 172.5$ GeV. Theoretical uncertainties are shown as bands for the predictions from POWHEG. The normalized differential cross section (bottom) is compared to predictions from POWHEG for different values of $m_t$. The vertical bars represent the statistical (inner) and the total (outer) uncertainties. The horizontal bars reflect the bin widths. The lower panels show the ratios of theoretical predictions to data.
The normalized differential cross section can be used to extract a value of $m_t$. A fit is performed based on the $\chi^2$ evaluated as $\chi^2 = d^T V^{-1} d$, where $d$ is the vector of differences between the measured normalized cross sections and the predictions obtained from POWHEG for different values of $m_t$. The symbol $V$ represents the covariance matrix that contains statistical, experimental systematic, signal modeling in the unfolding, and theoretical uncertainties. The result is $m_t = 172.6 \pm 0.4\,(\text{stat}) \pm 1.6\,(\text{exp}) \pm 1.5\,(\text{model}) \pm 1.0\,(\text{theo})$ GeV. This result is a determination of $m_t$ from decays of boosted top quarks, with an average energy scale of approximately 480 GeV, much larger than the scale in $m_t$ measurements from threshold production. The improvement in precision by a factor of 3.6 relative to the measurement at 8 TeV [32] is attributed primarily to the novel jet reconstruction using XCone. The improvement by a factor of 2 in both the $m_{\text{jet}}$ width at the particle level and experimental resolution, together with more integrated luminosity and an increased value of $\sqrt{s}$, provides a reduction by a factor of about 14 in the statistical uncertainty.
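A $\chi^2$ fit of the form $d^T V^{-1} d$ can be illustrated with a toy template scan. The sketch below is an assumption-laden example, not the procedure of the analysis: the templates depend linearly on the mass, the scan uses three hypothetical mass points, the covariance matrix is invented, and the minimum and its uncertainty are read off a parabola through the three $\chi^2$ values.

```python
import math


def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        known = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - known) / aug[row][row]
    return x


def chi_square(measured, predicted, covariance):
    """Evaluate d^T V^{-1} d with d = measured - predicted."""
    d = [m - p for m, p in zip(measured, predicted)]
    v_inv_d = solve_linear_system(covariance, d)
    return sum(di * si for di, si in zip(d, v_inv_d))


def toy_template(mass):
    """Hypothetical normalized template that shifts linearly with the mass."""
    shift = 0.02 * (mass - 172.5)
    return [0.25 - shift, 0.50, 0.25 + shift]


def parabola_minimum(center, step, chi2_low, chi2_center, chi2_high):
    """Minimum and delta(chi2) = 1 half-width of a parabola through three points."""
    curvature_term = chi2_low - 2.0 * chi2_center + chi2_high
    best = center + step * (chi2_low - chi2_high) / (2.0 * curvature_term)
    sigma = step * math.sqrt(2.0 / curvature_term)
    return best, sigma


# Toy inputs: pseudo-data generated at 173.0 GeV and an invented covariance.
pseudo_data = toy_template(173.0)
covariance = [
    [4e-4, -1e-4, 0.0],
    [-1e-4, 4e-4, -1e-4],
    [0.0, -1e-4, 4e-4],
]
scan_points = [169.5, 172.5, 175.5]
chi2_values = [chi_square(pseudo_data, toy_template(m), covariance) for m in scan_points]
best_mass, mass_sigma = parabola_minimum(172.5, 3.0, *chi2_values)
```

Because the toy templates are linear in the mass, the $\chi^2$ is exactly quadratic and the parabola recovers the mass used to generate the pseudo-data; with real templates the parabola would only be an approximation near the minimum.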
The systematic uncertainties are also reduced through the XCone-jet reconstruction, which enables a more precise calibration of the XCone-subjet energies and a better stability against contributions from pileup and the UE. Uncertainties from modeling are reduced through the use of additional sideband regions with higher granularity in the unfolding.
In summary, a measurement has been presented of the $t\bar{t}$ differential cross section for $t \to bW \to bq\bar{q}'$ decays of boosted top quarks as a function of the jet mass $m_{\text{jet}}$. A determination of $m_t$ from the normalized $m_{\text{jet}}$ distribution provides a value of $172.6 \pm 2.5$ GeV, with an uncertainty close to that of measurements using events at the $t\bar{t}$ production threshold. This measurement shows for the first time the importance of boosted top quarks for extracting standard model parameters such as $m_t$. The differential cross section as a function of $m_{\text{jet}}$ will enable a determination of $m_t$ using precise analytical calculations, feasible only in the boosted regime [26]. This is an important step in understanding the ambiguities arising between the top quark pole mass and $m_t$ measurements at hadron colliders. The novel reconstruction technique using the XCone jet algorithm results in the accuracy necessary for precision measurements at large top quark momenta, which will become increasingly important in future work at the LHC.
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies: BMBWF and FWF
[15] ATLAS Collaboration, Measurement of the $t\bar{t}$ production cross-section and lepton differential distributions in eμ dilepton events from pp collisions at $\sqrt{s} = 13$
[87] CMS Collaboration, Measurement of the integrated and differential $t\bar{t}$ production cross sections for high-$p_T$ top quarks in pp collisions at $\sqrt{s} = 8$ TeV, Phys. Rev. D 94, 072002 (2016).
[88] CMS Collaboration, Measurement of differential cross sections for the production of top quark pairs and of additional jets in lepton+jets events from pp collisions at $\sqrt{s} = 13$ TeV, Phys. Rev. D 97, 112003 (2018).