Measurement of the jet mass distribution and top quark mass in hadronic decays of boosted top quarks in pp collisions at $\sqrt{s} =$ 13 TeV

A measurement is reported of the jet mass distribution in hadronic decays of boosted top quarks produced in pp collisions at $\sqrt{s}=$ 13 TeV. The data were collected with the CMS detector at the LHC and correspond to an integrated luminosity of 35.9 fb$^{-1}$. The measurement is performed in the lepton+jets channel of $\mathrm{t\bar{t}}$ events, where the lepton is an electron or muon. The products of the hadronic top quark decay t$\to$bW$\to$bq$\mathrm{\bar{q}}'$ are reconstructed as a single jet with transverse momentum larger than 400 GeV. The $\mathrm{t\bar{t}}$ cross section as a function of the jet mass is unfolded at the particle level and used to extract a value of the top quark mass of 172.6 $\pm$ 2.5 GeV. A novel jet reconstruction technique is used for the first time at the LHC, which improves the precision by a factor of three relative to an earlier measurement.


The top quark is the most massive known elementary particle. Its large mass $m_\mathrm{t}$ leads to significant contributions from quantum corrections to the mass of the Higgs boson and precision observables in the electroweak sector. As a consequence, the top quark plays an important role in the mechanism of electroweak symmetry breaking. Precision measurements of $m_\mathrm{t}$ provide a crucial input for consistency checks of the standard model [1,2]. Direct measurements of $m_\mathrm{t}$ at the CERN LHC reach a precision of around 0.5 GeV [3-9]. However, an ambiguity in the interpretation of the results originates from the modeling of parton-shower dynamics and nonperturbative effects in quantum chromodynamics (QCD) of such measurements. The result can depend on the Monte Carlo (MC) event generator used, the tuning of its free parameters, and in part also on the observables used in the analyses [10]. Precisely relating the experimentally obtained value of $m_\mathrm{t}$ to the pole mass or a mass in another well-defined renormalization scheme is therefore difficult from first principles [11].
As an alternative, a value of the pole mass can be extracted through measurements of the total [12,13] and differential [14,15] $\mathrm{t\bar{t}}$ production cross sections, with a precision of approximately 1 GeV. These measurements are dominated by $\mathrm{t\bar{t}}$ threshold production, where uncertainties due to parton distribution functions (PDFs) and higher-order QCD corrections are important [16-18]. Another way to determine $m_\mathrm{t}$ involves measuring top quarks produced with large Lorentz boosts, where the decay products t $\to$ bW $\to$ bq$\mathrm{\bar{q}}'$ are contained in a single jet. The jet mass ($m_\mathrm{jet}$) peak location is sensitive to $m_\mathrm{t}$, and can be calculated from first principles [19-25] in soft-collinear effective theory [26-29].
A past measurement reporting the $\mathrm{t\bar{t}}$ cross section as a function of $m_\mathrm{jet}$ in the $\ell$+jets final state, where $\ell$ is an electron or muon, was carried out in proton-proton (pp) collisions at $\sqrt{s} =$ 8 TeV [30]. This Letter reports a new measurement of the $m_\mathrm{jet}$ distribution in pp collisions at 13 TeV using several important improvements. The improvements include jet clustering with the exclusive XCone algorithm [31], used for the first time in an LHC analysis, and an improved unfolding procedure using sideband regions with high granularity. The precise measurement of the $m_\mathrm{jet}$ distribution provides cross-checks of the analytic calculations [24], the modeling of decays of boosted top quarks in MC event generators, and a determination of $m_\mathrm{t}$ at energy scales much larger than those reached using other methods.
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. A silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter, each composed of a central barrel and two end sections, reside within the solenoid volume. Forward calorimeters extend the pseudorapidity ($\eta$) coverage provided by the barrel and end detectors. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid. A more detailed description of the CMS detector, together with a definition of the coordinate system, can be found in Ref. [32]. The particle-flow (PF) algorithm [33] aims to reconstruct and identify each individual particle in an event, using an optimized combination of information from the various elements of the CMS detector. Each PF candidate is identified as a photon, electron, muon, charged hadron, or neutral hadron. The candidate vertex with the largest value of summed physics-object $p_\mathrm{T}^2$ is taken to be the primary pp interaction vertex; more details are given in Section 9.4.1 of Ref. [34]. From PF candidates, jets are reconstructed using the anti-$k_\mathrm{T}$ [35] or the XCone [31] algorithm as implemented in the FASTJET software package [36]. The anti-$k_\mathrm{T}$ jets are obtained using a distance parameter of 0.4. In the jet clustering procedure, charged PF candidates are excluded if they are associated with vertices from additional inelastic pp interactions within the same bunch crossing (pileup).
The POWHEG [37-42] v2 generator is used for simulating $\mathrm{t\bar{t}}$ production at next-to-leading order (NLO). Alternatively, $\mathrm{t\bar{t}}$ production is simulated with MADGRAPH5_aMC@NLO v2.2.2 [43,44] at NLO to check a potential generator dependence of the measured cross sections. Background events resulting from the production of single top quarks in the s, t, and tW channels are also generated in POWHEG at NLO, where spin correlations are taken into account [45]. The production of a W boson with additional jets is simulated using MADGRAPH5_aMC@NLO at NLO. Events from Drell-Yan (DY) production with additional jets are simulated in MADGRAPH5_aMC@NLO at leading order (LO) and are normalized to the next-to-next-to-leading-order cross section [46]. The simulation of the production of two heavy gauge bosons with additional jets is performed at LO with PYTHIA v8.212 [47]. Events in which jets are produced only through QCD interactions are also simulated with PYTHIA at LO.
In simulated MADGRAPH5_aMC@NLO events, the matrix element (ME) calculations at NLO and LO accuracy are matched to parton showers with the FxFx [48] and MLM [49] algorithms, respectively. The parton shower, hadronization process, and multiple-parton interactions (MPI) are simulated using PYTHIA. The NNPDF3.0 [50] PDFs at LO and NLO are used for the respective processes simulated at LO and NLO. The underlying event (UE) tune CUETP8M2T4 [51] is used to simulate $\mathrm{t\bar{t}}$ and single top quark production in the t channel; all other processes are simulated using CUETP8M1 [52,53]. The detector response is simulated with the GEANT4 package [54,55]. Simulated events are processed through the software chain used for collision data and are reweighted to match the observed distribution in the number of pileup interactions in data.
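The pileup reweighting mentioned above can be illustrated with a minimal sketch. It assumes that the weight for each pileup multiplicity is the ratio of the normalized data and simulation distributions; the function name and the binning are hypothetical and are not taken from the CMS software.

```python
def pileup_weights(data_counts, mc_counts):
    """Per-bin event weights that map the simulated pileup distribution
    onto the one observed in data (illustrative sketch only).

    Both inputs are histograms over the number of pileup interactions
    with identical binning.
    """
    if len(data_counts) != len(mc_counts):
        raise ValueError("histograms must share the same binning")
    data_total = float(sum(data_counts))
    mc_total = float(sum(mc_counts))
    weights = []
    for n_data, n_mc in zip(data_counts, mc_counts):
        if n_mc == 0:
            # Assumption: bins never populated in simulation get weight 0.
            weights.append(0.0)
        else:
            weights.append((n_data / data_total) / (n_mc / mc_total))
    return weights


# Toy histograms, not CMS data.
toy_weights = pileup_weights([10, 30, 60], [20, 20, 10])
```

With this normalization the weighted simulation keeps its original total yield, so only the shape of the pileup distribution changes.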
This analysis uses data recorded with the CMS detector that correspond to an integrated luminosity of 35.9 fb$^{-1}$ [56]. Events containing the decay of a top quark to a final state including a muon are selected using a single-muon trigger [57] that requires the presence of at least one muon candidate with a transverse momentum $p_\mathrm{T} > 50$ GeV and $|\eta| < 2.4$. For events containing a final-state electron, the trigger requires the presence of at least one isolated electron candidate with $p_\mathrm{T} > 27$ GeV, or an electron candidate without an isolation requirement but with $p_\mathrm{T} > 115$ GeV and $|\eta| < 2.5$, or at least one photon candidate with $p_\mathrm{T} > 175$ GeV and $|\eta| < 2.5$. The latter requirement ensures that events containing electrons with high $p_\mathrm{T}$ are selected with high efficiency, as the criteria based on shower distributions in the ECAL are less stringent for photon than for electron candidates.
Lepton candidates (electrons or muons) must have $p_\mathrm{T} > 55$ GeV and $|\eta| < 2.4$. Following the requirement at the trigger level, electrons with $p_\mathrm{T} < 120$ GeV must pass an isolation requirement [58], where the isolation is defined as the $p_\mathrm{T}$ sum of charged hadrons and neutral particles in a cone with radius $\Delta R = 0.3$ around the electron. The angular distance between two objects in $\eta$ and $\phi$ is defined as $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$, where $\phi$ is the azimuthal angle in radians. Electrons with $p_\mathrm{T} > 120$ GeV and muons with $p_\mathrm{T} > 55$ GeV are required to pass a two-dimensional selection of either $\Delta R(\ell, \mathrm{j}) > 0.4$ or $p_\mathrm{T,rel}(\ell, \mathrm{j}) > 40$ GeV, where j is the anti-$k_\mathrm{T}$ jet with minimal angular separation $\Delta R$ from the lepton $\ell$, and $p_\mathrm{T,rel}(\ell, \mathrm{j})$ is the component of the lepton momentum orthogonal to the anti-$k_\mathrm{T}$-jet axis [59,60]. Each selected event must contain a single lepton. The trigger and lepton reconstruction efficiencies in simulation are corrected to match those measured in data.
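The angular distance and the two-dimensional lepton selection described above can be sketched as follows. This is an illustration only: objects are represented as hypothetical `(pt, eta, phi)` tuples, the momentum component orthogonal to the jet axis is computed as $|\vec{p}_\ell \times \vec{p}_\mathrm{j}| / |\vec{p}_\mathrm{j}|$, and the treatment of events without any jet is an assumption.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def momentum(pt, eta, phi):
    """Cartesian three-momentum from (pt, eta, phi)."""
    return (pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta))


def pt_rel(lepton, jet):
    """Lepton momentum component orthogonal to the jet axis."""
    lx, ly, lz = momentum(*lepton)
    jx, jy, jz = momentum(*jet)
    cx = ly * jz - lz * jy
    cy = lz * jx - lx * jz
    cz = lx * jy - ly * jx
    return math.sqrt(cx * cx + cy * cy + cz * cz) / math.sqrt(
        jx * jx + jy * jy + jz * jz
    )


def passes_2d_selection(lepton, jets, min_dr=0.4, min_pt_rel=40.0):
    """True if Delta R(l, j) > min_dr or pT,rel(l, j) > min_pt_rel,
    with j the jet closest to the lepton in Delta R."""
    if not jets:
        return True  # assumption: no nearby jet means the lepton is isolated
    nearest = min(
        jets, key=lambda j: delta_r(lepton[1], lepton[2], j[1], j[2])
    )
    dr = delta_r(lepton[1], lepton[2], nearest[1], nearest[2])
    return dr > min_dr or pt_rel(lepton, nearest) > min_pt_rel
```

A lepton collinear with its nearest jet fails both conditions, while a lepton inside the jet cone but with a large orthogonal momentum component is retained.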
The XCone jets are obtained through a two-step jet clustering [61]. First, the exclusive XCone algorithm is applied with a distance parameter of $R_\mathrm{jet} = 1.2$ and the specification of returning two jets, corresponding to the two boosted top quarks in the event. Using the constituents of these two large jets as input, XCone is run again with the distance parameter $R_\mathrm{sub} = 0.4$ and the parameter of the number of subjets in each jet $N_\mathrm{sub} = 3$. Subjets are only considered if they are within $|\eta| < 2.4$. This procedure results in exactly two large-radius XCone jets with three XCone subjets each. The final result is not influenced by the number of subjets within the large XCone jet including the lepton, where $N_\mathrm{sub} = 2$ would be the natural choice for clustering the visible products of the decay t $\to$ bW $\to$ b$\ell\nu$.
Figure 1: Simulated $m_\mathrm{jet}$ distribution after the particle-level selection from a $\mathrm{t\bar{t}}$ simulation with $m_\mathrm{t} = 172.5$ GeV. Also shown are the distributions separately for fully merged and not merged events, as defined in the text.
The four-momentum of the lepton candidate is subtracted from the four-momentum of the anti-$k_\mathrm{T}$ jet or XCone subjet if $\Delta R(\ell, \mathrm{j}) < 0.4$. Jet energy corrections [62] derived for anti-$k_\mathrm{T}$ jets are applied to anti-$k_\mathrm{T}$ jets and XCone subjets to account for additional energy depositions from pileup, nonlinearities in $\eta$, and a $p_\mathrm{T}$ dependence in the detector response. The jet energy resolution in simulated events is smeared to match the resolution in data. An additional correction applied to the XCone-subjet momenta is obtained from simulated $\mathrm{t\bar{t}}$ events in the all-jets channel to account for differences between the XCone-subjet momenta and the momenta of anti-$k_\mathrm{T}$ jets. This correction is parametrized as a function of XCone-subjet $p_\mathrm{T}$ and $|\eta|$, and has an average size of 2%, with an average uncertainty of 0.3%. This is verified in the $\ell$+jets channel and in single top quark production in the tW channel.
The mean values of the generated and reconstructed XCone-subjet momenta agree within the uncertainty of 0.3%.
The four-momenta of the three XCone subjets are combined to form the final XCone jet. The XCone jet used to perform the measurement is the one with the largest distance $\Delta R$ to the selected lepton. Each of the three XCone subjets in this jet must have $p_\mathrm{T} > 30$ GeV. The XCone-jet mass $m_\mathrm{jet}$ is the invariant mass of all PF candidates clustered into the three XCone subjets.
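As an illustration of how a jet mass follows from summed four-momenta, the sketch below builds $(E, p_x, p_y, p_z)$ vectors from $(p_\mathrm{T}, \eta, \phi, m)$ and evaluates $m^2 = E^2 - |\vec{p}|^2$ for their sum. The function names are hypothetical, and the inputs can be either the subjets or their constituents, since the sum is the same.

```python
import math


def four_vector(pt, eta, phi, mass):
    """(E, px, py, pz) from transverse momentum, pseudorapidity,
    azimuthal angle, and mass (all energies in GeV)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)


def invariant_mass(vectors):
    """Invariant mass of the summed four-vectors."""
    energy = sum(v[0] for v in vectors)
    px = sum(v[1] for v in vectors)
    py = sum(v[2] for v in vectors)
    pz = sum(v[3] for v in vectors)
    m2 = energy * energy - px * px - py * py - pz * pz
    # Guard against tiny negative values from floating-point rounding.
    return math.sqrt(max(m2, 0.0))


# Toy subjets, not taken from data: three massless subjets of one jet.
toy_subjets = [
    four_vector(220.0, 0.10, 0.00, 0.0),
    four_vector(150.0, 0.35, 0.40, 0.0),
    four_vector(60.0, -0.20, -0.30, 0.0),
]
toy_jet_mass = invariant_mass(toy_subjets)
```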
In order to identify jets originating from the hadronization of b quarks, the combined secondary vertex v2 (CSVv2) [63] algorithm is applied to the anti-$k_\mathrm{T}$ jets. These candidate b jets are required to have $p_\mathrm{T} > 30$ GeV and $|\eta| < 2.4$, and must pass the tight working point of the CSVv2 algorithm. This choice results in an efficiency of 41% to correctly identify anti-$k_\mathrm{T}$ jets originating from b quarks, with a misidentification rate of 0.1% for jets from light quarks and gluons. The b tagging efficiencies and misidentification rates are adjusted in simulation to match those in data.
The fiducial region chosen for this measurement is studied using simulations at the particle level, defined by all particles with average lifetimes longer than $10^{-8}$ s. The kinematic phase space of this region is defined through $\mathrm{t\bar{t}}$ events containing one lepton with $p_\mathrm{T} > 60$ GeV, which originates from the decay of a W boson; the $\tau$ lepton decays are not considered part of the signal. Particle-level jets are obtained with a clustering identical to the one in data. The particle-level XCone jet with largest distance $\Delta R$ to the lepton is required to have $p_\mathrm{T} > 400$ GeV, and each of its XCone subjets must have $p_\mathrm{T} > 30$ GeV. Its mass has to be greater than the mass obtained by summing the four-momenta of the second-highest XCone jet in $p_\mathrm{T}$ and the lepton. Figure 1 shows the simulated distribution in $m_\mathrm{jet}$ at the particle level, normalized to the number of events expected for the integrated luminosity in data. A sharp peak is visible near the value of $m_\mathrm{t}$. This distribution has a width half as large as for Cambridge-Aachen (CA) jets [64,65] with $R_\mathrm{jet} = 1.2$, as used in a previous measurement [30]. The improvement is due to the two-step XCone jet clustering procedure, which acts as a grooming algorithm [66,67], similar to trimming [68], on the large jet. The advantage of XCone over other grooming algorithms in this measurement is its dynamical interpolation between the resolved and boosted regime, i.e., between three well-separated subjets and three subjets close together, which would not be resolved by other reconstruction methods. The fraction of fully merged t $\to$ Wb $\to$ q$\mathrm{\bar{q}}'$b decays in the region of the top quark peak with $140 < m_\mathrm{jet} < 200$ GeV is approximately 75%, where an event is called fully merged if each individual parton from the fully hadronic top quark decay is within $\Delta R = 0.4$ of one of the three XCone subjets at the particle level.
At the reconstruction level the same criteria are used as in the definition of the fiducial phase space at the particle level. In addition, at the reconstruction level an event has to have at least one b-tagged anti-$k_\mathrm{T}$ jet present and pass the requirement of having $p_\mathrm{T}^\mathrm{miss} > 50$ GeV, which suppresses non-$\mathrm{t\bar{t}}$ backgrounds. Here, $p_\mathrm{T}^\mathrm{miss}$ is the magnitude of the negative vector sum of the transverse momenta of all the PF candidates in an event [69]. The resulting $m_\mathrm{jet}$ distribution for XCone jets with $p_\mathrm{T}^\mathrm{jet} > 400$ GeV is displayed in Fig. 2. Backgrounds originate from singly produced top quarks and from W+jets events. Contributions from DY+jets, diboson, and QCD multijet production are found to be negligible. The $\mathrm{t\bar{t}}$ simulation is scaled such that the number of simulated events matches the number of background-subtracted events in data. Just as in Fig. 1, the distribution shows a pronounced and narrow peak close to the value of $m_\mathrm{t}$. The XCone-jet reconstruction results in a large improvement of the experimental resolution in $m_\mathrm{jet}$. With XCone a resolution of 6% is achieved, compared to a resolution of approximately 14% for CA jets with $R_\mathrm{jet} = 1.2$. Another advantage is that the peak position is stable as a function of the number of pileup vertices, a fact which has been verified using $m_\mathrm{jet}$ from fully merged W decays, as calculated from the two XCone subjets with the smallest pairwise mass.
The measurement at the particle level uses a regularized unfolding procedure based on a least-squares fit, implemented in the TUNFOLD [70] framework. The optimal regularization strength is determined through a minimization of the average global correlation coefficient in the output bins [71]. The response matrix is evaluated by using $\mathrm{t\bar{t}}$ events simulated with POWHEG that pass the particle- or reconstruction-level selections. Prior to the unfolding, contributions from background processes are subtracted from data.
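The idea of a regularized least-squares unfolding can be illustrated with a minimal sketch. It is not the TUNFOLD implementation: it assumes uncorrelated uncertainties on the reconstructed bins, a regularization on the size of the result (identity regularization matrix), and a fixed regularization strength `tau` rather than one chosen from the global correlation coefficients. It minimizes $\sum_i (y_i - \sum_j A_{ij} x_j)^2/\sigma_i^2 + \tau^2 \sum_j (x_j - x_{0,j})^2$ by solving the normal equations; all names are hypothetical.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def unfold(response, data, sigma, tau, bias=None):
    """Regularized least-squares estimate of the particle-level spectrum.

    response[i][j]: probability that an event from particle-level bin j
                    is reconstructed in bin i.
    data, sigma:    background-subtracted yields and their uncertainties.
    tau:            regularization strength (tau = 0 means no regularization).
    bias:           reference spectrum x0 (defaults to zero).
    """
    n_reco, n_true = len(response), len(response[0])
    if bias is None:
        bias = [0.0] * n_true
    lhs = [[0.0] * n_true for _ in range(n_true)]
    rhs = [0.0] * n_true
    for j in range(n_true):
        for k in range(n_true):
            lhs[j][k] = sum(
                response[i][j] * response[i][k] / sigma[i] ** 2
                for i in range(n_reco)
            )
        rhs[j] = sum(
            response[i][j] * data[i] / sigma[i] ** 2 for i in range(n_reco)
        )
        lhs[j][j] += tau ** 2
        rhs[j] += tau ** 2 * bias[j]
    return solve_linear(lhs, rhs)
```

In this sketch the sideband regions of an analysis would simply enter as additional reconstructed bins (rows of `response`), which is how they constrain migrations into and out of the measurement phase space.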
Sideband regions are constructed to constrain migrations into and out of the measurement phase space. For every selection criterion leading to significant migrations, a sideband region is included in the unfolding process. The resulting five sideband regions are defined by the following individual requirements: $55 < p_\mathrm{T}^{\ell} < 60$ GeV, $350 < p_\mathrm{T}^\mathrm{jet} < 400$ GeV, at least one XCone subjet with $p_\mathrm{T} < 30$ GeV, $m_\mathrm{jet}$ must be less than the mass of the second XCone jet and lepton system, and at least one anti-$k_\mathrm{T}$ jet passing a looser b tagging requirement with no anti-$k_\mathrm{T}$ jet passing the tight b tagging requirement. In addition, the measurement region is divided into three bins in $p_\mathrm{T}^\mathrm{jet}$. Except for the sideband with a looser b tag, all sideband selections have corresponding selections at the particle level in the evaluation of the migration matrix. In this matrix, the number of bins in $m_\mathrm{jet}$ at the particle level is larger than the number of bins in which the final measurement is presented. This helps to reduce the dependence on variations in signal modeling through a more precise determination of migration effects. The electron and muon channels are combined before the unfolding to increase the statistical precision, but are also unfolded separately to verify their consistency.
Experimental uncertainties are estimated using simulation and propagated through the unfolding process. We consider uncertainties in the pileup reweighting [72], trigger, lepton identification and b tagging [63] efficiencies, and also those related to the jet energy scale [62] and jet energy resolution for anti-$k_\mathrm{T}$ jets and XCone subjets, and additional XCone-subjet corrections. Uncertainties related to the integrated luminosity [56] and the production cross sections of all significant background processes [73-78] are also taken into account. Uncertainties arising from choices in modeling the signal include variations of the renormalization and factorization scales $\mu_\mathrm{R}$ and $\mu_\mathrm{F}$, the choice of PDFs and $m_\mathrm{t}$, differences in modeling of parton showers (PS), their matching to the ME calculation, and the underlying event (UE). The scales $\mu_\mathrm{R}$ and $\mu_\mathrm{F}$ are changed by factors of 0.5 and 2, either coherently or individually, and the variation giving the maximum uncertainty is used. The uncertainty from the choice of PDFs is assessed by reweighting the signal simulation to 100 replicas from the NNPDF set [50]. The value of $m_\mathrm{t}$ in the simulation is changed by $\pm$3 GeV. Uncertainties in the modeling of PS include changes in the initial- and final-state radiation (ISR and FSR) scales by factors of 2 and $\sqrt{2}$ [51], respectively. The matching of the ME calculation to the PS is controlled by the model parameter $h_\mathrm{damp} = 1.58^{+0.66}_{-0.59}$ [51], which is changed by its uncertainties. The uncertainty related to modeling the UE is estimated by changing the parameters used to determine the CUETP8M2T4 tune in signal events. The model of color reconnection in PYTHIA is based on MPIs with early resonance decays switched off [79], and is changed to three other models: the MPI-based scheme with early resonance decays switched on, a gluon-move scheme [80], and a QCD-inspired scheme [81].
Uncertainties from modeling b quark fragmentation and the semileptonic branching fractions of b hadrons are found to be negligible. All the corresponding uncertainties from the modeling of signal described above are estimated by unfolding alternative simulations and comparing the results to the true particle-level distributions.
Figure 3: The particle-level $\mathrm{t\bar{t}}$ differential cross section in the fiducial region as a function of the XCone-jet mass (left). The measurement is compared to predictions from POWHEG and MADGRAPH5_aMC@NLO with $m_\mathrm{t} = 172.5$ GeV. Theoretical uncertainties are shown as colored bands for the predictions from POWHEG. The normalized differential cross section (right) is compared to predictions from POWHEG for different values of $m_\mathrm{t}$. The vertical bars represent the statistical (inner) and the total (outer) uncertainties. The horizontal bars reflect the bin widths. The panels below show the ratios of theoretical predictions to data.
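The treatment of the $\mu_\mathrm{R}$ and $\mu_\mathrm{F}$ variations can be illustrated with a short sketch. It assumes that the six combinations obtained by changing the scales by factors of 0.5 and 2, coherently or individually, are compared to the nominal result and that the largest deviation is taken bin by bin; whether the maximum is evaluated per bin or per variation is an assumption here, and all names and numbers are hypothetical.

```python
# (mu_R factor, mu_F factor): coherent and individual variations.
SCALE_VARIATIONS = [
    (0.5, 0.5),
    (2.0, 2.0),
    (0.5, 1.0),
    (2.0, 1.0),
    (1.0, 0.5),
    (1.0, 2.0),
]


def scale_uncertainty(nominal, varied):
    """Largest absolute deviation from the nominal result in each bin.

    nominal: list of bin contents for the nominal scales.
    varied:  dict mapping each (mu_R, mu_F) factor pair to bin contents.
    """
    return [
        max(abs(varied[key][i] - nominal[i]) for key in SCALE_VARIATIONS)
        for i in range(len(nominal))
    ]


# Toy numbers, not results of the analysis.
toy_nominal = [100.0, 200.0]
toy_varied = {
    (0.5, 0.5): [110.0, 195.0],
    (2.0, 2.0): [92.0, 204.0],
    (0.5, 1.0): [104.0, 212.0],
    (2.0, 1.0): [97.0, 198.0],
    (1.0, 0.5): [101.0, 199.0],
    (1.0, 2.0): [99.0, 201.0],
}
toy_uncertainty = scale_uncertainty(toy_nominal, toy_varied)
```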
The measured differential cross section in data is shown in Fig. 3 (left) and compared to the predictions from POWHEG and MADGRAPH5_aMC@NLO with $m_\mathrm{t} = 172.5$ GeV. The bin width is chosen so that the purity and stability are above 40%. The purity is the fraction of events that were generated and reconstructed in the same bin with respect to the total number of reconstructed events in the bin, and the stability is this number with respect to all generated events in the bin. In the peak region, the total relative uncertainty is between 16 and 36%, of which the dominant contribution is 12-31% from the jet energy scale uncertainty. The largest model uncertainty is from FSR modeling, with an uncertainty of 4-18%. The statistical uncertainty is 6-7%. The total measured $\mathrm{t\bar{t}}$ cross section in the fiducial region of $112 < m_\mathrm{jet} < 232$ GeV is $\sigma$ = 527 $\pm$ 15 (stat) $\pm$ 39 (exp) $\pm$ 29 (model) fb = 527 $\pm$ 51 fb.
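The purity and stability definitions, and the combination of the quoted uncertainty components, can be made concrete with a small sketch. It assumes a square table of simulated event counts in which every event is both generated and reconstructed inside the listed bins, and it assumes that the components are added in quadrature; the function names and the toy counts are hypothetical.

```python
import math


def purity_and_stability(migration):
    """Return a list of (purity, stability) pairs, one per bin.

    migration[g][r] is the number of simulated events generated in
    bin g and reconstructed in bin r.
    """
    n_bins = len(migration)
    result = []
    for i in range(n_bins):
        same_bin = migration[i][i]
        n_reconstructed = sum(migration[g][i] for g in range(n_bins))
        n_generated = sum(migration[i])
        result.append((same_bin / n_reconstructed, same_bin / n_generated))
    return result


def quadrature_sum(*components):
    """Total uncertainty from independent components."""
    return math.sqrt(sum(c * c for c in components))


# Toy migration counts, not taken from the analysis.
toy_values = purity_and_stability([[80, 20], [10, 90]])

# Components quoted in the text for the fiducial cross section, in fb.
total_fb = quadrature_sum(15.0, 39.0, 29.0)
```

Under the quadrature assumption, `total_fb` evaluates to about 50.9 fb, consistent with the quoted total of 51 fb.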
The cross section predicted by POWHEG is 680 $\pm$ 109 fb, where the theoretical uncertainty is obtained by changing the scales $\mu_\mathrm{R}$ and $\mu_\mathrm{F}$, the ISR and FSR PS scales, the parameter $h_\mathrm{damp}$, and the UE modeling in simulation. A smaller cross section is observed in data relative to simulation, in agreement with previous high-$p_\mathrm{T}$ top quark measurements [30,82-85].
Figure 3 (right) shows the normalized differential cross section as a function of $m_\mathrm{jet}$, which is obtained by dividing the differential cross section by the total cross section in the fiducial region. The normalized differential cross section benefits from a partial cancellation of systematic uncertainties and shows good agreement with the prediction from POWHEG for a value of $m_\mathrm{t} = 172.5$ GeV.
As illustrated in Fig. 3 (right), the $m_\mathrm{jet}$ distribution shows sensitivity to $m_\mathrm{t}$, and the normalized differential cross section can be used to extract a value of $m_\mathrm{t}$. A fit is performed based on the $\chi^2$ evaluated as $\chi^2 = d^\mathrm{T} V^{-1} d$, where $d$ is the vector of differences between the measured normalized cross sections and the predictions obtained from POWHEG for different values of $m_\mathrm{t}$. The symbol $V$ represents the covariance matrix that contains statistical, experimental systematic, signal modeling in the unfolding, and theoretical uncertainties. The result is $m_\mathrm{t}$ = 172.6 $\pm$ 0.4 (stat) $\pm$ 1.6 (exp) $\pm$ 1.5 (model) $\pm$ 1.0 (theo) GeV = 172.6 $\pm$ 2.5 GeV.
The fit converges to a minimum at $\chi^2 = 0.59$ with three degrees of freedom. This result is a determination of $m_\mathrm{t}$ from decays of boosted top quarks, with an average energy scale of approximately 480 GeV, much larger than the scale in $m_\mathrm{t}$ measurements from threshold production. The improvement in precision by a factor of 3.6 relative to the measurement at 8 TeV [30] can be attributed primarily to the novel jet reconstruction using XCone. The improvement by a factor of two in both the $m_\mathrm{jet}$ width at the particle level and the experimental resolution, together with more integrated luminosity and an increased value of $\sqrt{s}$, provides a reduction by a factor of about 14 in the statistical uncertainty.
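A minimal sketch of a fit of this type is given below. It evaluates $\chi^2 = d^\mathrm{T} V^{-1} d$ by solving $V z = d$ instead of inverting $V$, and then locates the minimum with a parabola through three $\chi^2$ values. The parabolic interpolation between three mass points and the use of $\Delta\chi^2 = 1$ for the uncertainty are assumptions made for this illustration, not a description of the procedure used in the analysis; all names are hypothetical.

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-300:
            raise ValueError("singular covariance matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def chi2(measured, predicted, covariance):
    """chi^2 = d^T V^-1 d with d = measured - predicted."""
    d = [m - p for m, p in zip(measured, predicted)]
    z = solve_linear(covariance, d)
    return sum(di * zi for di, zi in zip(d, z))


def parabola_minimum(masses, chi2_values):
    """Minimum of the parabola through three (mass, chi^2) points.

    Returns (mass at minimum, chi^2 at minimum, uncertainty), where the
    uncertainty is the mass shift that raises chi^2 by one unit.
    """
    x0, x1, x2 = masses
    y0, y1, y2 = chi2_values
    u0, u2 = x0 - x1, x2 - x1      # work relative to the central point
    d0, d2 = y0 - y1, y2 - y1
    det = u0 * u2 * (u0 - u2)
    a = (d0 * u2 - d2 * u0) / det
    b = (d2 * u0 * u0 - d0 * u2 * u2) / det
    if a <= 0.0:
        raise ValueError("chi^2 scan has no minimum")
    u_min = -b / (2.0 * a)
    return (x1 + u_min, y1 - b * b / (4.0 * a), 1.0 / math.sqrt(a))
```

In use, `chi2` would be evaluated once per simulated value of $m_\mathrm{t}$, and the resulting scan passed to `parabola_minimum`.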
The systematic uncertainties are also reduced through the XCone-jet reconstruction which enables a more precise calibration of the XCone-subjet energies and a better stability against contributions from pileup and the UE. Uncertainties from modeling are reduced in the unfolding through the use of additional sideband regions with higher granularity in the migration matrix.
In summary, a measurement has been presented of the $\mathrm{t\bar{t}}$ differential cross section for t $\to$ bW $\to$ bq$\mathrm{\bar{q}}'$ decays of boosted top quarks as a function of the jet mass $m_\mathrm{jet}$. The result relies on a novel method to reconstruct the decay of a boosted top quark using the XCone jet algorithm, which provides an improvement by a factor of two in both the width of the $m_\mathrm{jet}$ distribution at the particle level and the $m_\mathrm{jet}$ resolution, as well as reduced systematic uncertainties. The unfolded distribution is well described by simulation of $\mathrm{t\bar{t}}$ production and shows high sensitivity to the top quark mass $m_\mathrm{t}$. A determination of $m_\mathrm{t}$ from the normalized $m_\mathrm{jet}$ distribution provides a value of 172.6 $\pm$ 2.5 GeV, which has an uncertainty close to that of events at the $\mathrm{t\bar{t}}$ production threshold. This measurement shows for the first time the importance of boosted top quarks for extracting standard model parameters such as $m_\mathrm{t}$. The differential cross section as a function of $m_\mathrm{jet}$ will enable a determination of $m_\mathrm{t}$ using precise analytical calculations, feasible only in the boosted regime [24]. This is an important step in understanding the ambiguities arising between the top quark pole mass and $m_\mathrm{t}$ measurements at hadron colliders.

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies: BMBWF and FWF (Austria);

[12] CMS Collaboration, "Measurement of the $\mathrm{t\bar{t}}$ production cross section in the e$\mu$ channel in proton-proton collisions at $\sqrt{s} =$ 7 and 8 TeV", JHEP 08 (2016) 029, doi:10.1007/JHEP08(2016)029, arXiv:1603.02303.