A measurement of the soft-drop jet mass in pp collisions at $\sqrt{s} = 13$ TeV with the ATLAS detector

Jet substructure observables have significantly extended the search program for physics beyond the Standard Model at the Large Hadron Collider. The state-of-the-art tools have been motivated by theoretical calculations, but there has never been a direct comparison between data and calculations of jet substructure observables that are accurate beyond leading-logarithm approximation. Such observables are significant not only for probing the collinear regime of QCD that is largely unexplored at a hadron collider, but also for improving the understanding of jet substructure properties that are used in many studies at the Large Hadron Collider. This Letter documents a measurement of the first jet substructure quantity at a hadron collider to be calculated at next-to-next-to-leading-logarithm accuracy. The normalized, differential cross-section is measured as a function of log$_{10}\rho^2$, where $\rho$ is the ratio of the soft-drop mass to the ungroomed jet transverse momentum. This quantity is measured in dijet events from 32.9 fb$^{-1}$ of $\sqrt{s} = 13$ TeV proton-proton collisions recorded by the ATLAS detector. The data are unfolded to correct for detector effects and compared to precise QCD calculations and leading-logarithm particle-level Monte Carlo simulations.

The dynamics of strong interactions, described by quantum chromodynamics (QCD), are responsible for most of the physical processes occurring in proton-proton (pp) scattering at the Large Hadron Collider (LHC). The fundamental particles of QCD, quarks and gluons, cannot be observed directly and instead form collimated sprays of particles called jets when produced at high energy. The radiation pattern inside jets has been used extensively for identifying highly Lorentz-boosted hadronically decaying massive particles [1]. Many of these techniques were motivated by recent advances in analytical calculations of jet substructure [8]. However, prior to this work, there has never been a direct comparison between collision data and calculations beyond the leading-logarithm (LL) accuracy of Parton Shower (PS) Monte Carlo (MC) programs [9]. The comparisons presented here begin the field of precision jet substructure, wherein data and calculations in the collinear regime of QCD can be used to test the modeling of final-state radiation and perhaps even to extract fundamental parameters of the Standard Model (SM) such as the strong coupling constant or the top quark mass [10]. Such precision understanding will also be essential to maximize the quantitative sensitivity of the LHC and future colliders to physics beyond the SM.
Of particular importance is the jet mass, defined as the norm of the four-momentum sum of constituents inside a jet. The jet mass is a key jet substructure observable and is the most powerful tool for identifying Lorentz-boosted hadronically decaying massive particles. Unlike Lorentz-boosted bosons or top quarks, the mass of generic quark and gluon jets is set by the fragmentation of highly virtual partons [11]. A complete prediction for mass or other variables beyond LL has not been possible due to the presence of non-global logarithms (NGLs) [12]: resummation terms associated with particles that radiate out of, and then radiate back into, a jet. These terms are formally present at next-to-leading-logarithm (NLL) accuracy and have prevented full comparisons of observables beyond LL. However, using insights from modern analytical methods, the authors of Ref. [13] introduced a new procedure to systematically remove soft and wide-angle radiation from the jet (grooming) that is formally insensitive to NGLs. This procedure was extended in Ref. [14] to form the soft-drop grooming algorithm. The calculation of the masses of jets that have the soft-drop procedure applied is insensitive to NGLs. The distribution of the soft-drop mass has now been calculated at both next-to-leading order (NLO) with NLL [15,16] and leading order (LO) with next-to-next-to-leading-logarithm (NNLL) accuracy [17,18]. These are the most precise calculations for jet substructure at a hadron collider.
The soft-drop procedure acts on the clustering history of a sequential recombination jet algorithm [19]. In these algorithms, all inputs to jet-finding start as a proto-jet and are combined pairwise using a distance metric in y-φ space$^1$. When the smallest distance is above some threshold R (called the jet radius), the algorithm terminates and the remaining proto-jets are the final jets. The clustering history is the sequence of pairwise combinations that lead to a particular jet. Jets at the LHC experiments are usually clustered using the anti-$k_t$ algorithm [20], which has the benefit of producing regularly shaped jets in y-φ space. Even though anti-$k_t$ jets are useful experimentally, their clustering history does not mimic the angular-ordered PS [21] used in the related $k_t$ [19,22] and Cambridge-Aachen [23,24] (C/A) algorithms. The soft-drop algorithm starts by re-clustering an anti-$k_t$ jet's constituents with the C/A algorithm. Next, the clustering tree is traversed from the latest branch to the earliest and at each node the following criterion is applied to proto-jets $j_1$ and $j_2$:

$$\frac{\min(p_{\mathrm{T},j_1},\, p_{\mathrm{T},j_2})}{p_{\mathrm{T},j_1} + p_{\mathrm{T},j_2}} > z_{\mathrm{cut}} \left(\frac{\Delta R_{12}}{R}\right)^{\beta}, \qquad (1)$$

where $p_{\mathrm{T}}$ is the momentum of a jet transverse to the beam pipe, $z_{\mathrm{cut}}$ and β are algorithm parameters, and $\Delta R_{12} = \sqrt{(\Delta y)^2 + (\Delta\phi)^2}$ is the distance in y-φ between the proto-jets. The parameter $z_{\mathrm{cut}}$ sets the scale of the energy removed by the algorithm; β tunes the sensitivity of the algorithm to wide-angle radiation. If the soft-drop condition in Eq. (1) is not satisfied, then the branch with the smaller $p_{\mathrm{T}}$ is removed. The procedure is then iterated on the remaining branch. If the condition is satisfied at any node, the algorithm terminates. As β increases, the fraction of branches where the condition is satisfied increases, reducing the amount of radiation removed from the jet. In the limit β → ∞, the original jet is untouched. The mass of the resulting jet is referred to as the soft-drop jet mass, $m^{\text{soft drop}}$.

$^1$ ATLAS uses a right-handed coordinate system with its origin at the interaction point in the center of the detector. Cylindrical coordinates (r, φ) are used in the transverse plane, φ being the azimuthal angle around the beam axis. The pseudorapidity is defined in terms of the polar angle θ as η = − ln tan(θ/2). The rapidity, differences of which are invariant under longitudinal boosts also for massive particles, is defined as $y = \frac{1}{2} \ln\left[(E + p_z)/(E - p_z)\right]$.
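The grooming steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the FastJet implementation used in the analysis: the `ProtoJet` class, the helpers `massless`, `merge`, `delta_r`, and `soft_drop`, and the three-particle toy jet are all hypothetical, and the C/A clustering tree is assumed to have been built already.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProtoJet:
    """Node of an (assumed pre-built) C/A clustering tree; leaves have no children."""
    E: float
    px: float
    py: float
    pz: float
    child1: Optional["ProtoJet"] = None
    child2: Optional["ProtoJet"] = None

    @property
    def pt(self) -> float:
        return math.hypot(self.px, self.py)

    @property
    def rapidity(self) -> float:
        return 0.5 * math.log((self.E + self.pz) / (self.E - self.pz))

    @property
    def phi(self) -> float:
        return math.atan2(self.py, self.px)

    @property
    def mass(self) -> float:
        m2 = self.E**2 - self.px**2 - self.py**2 - self.pz**2
        return math.sqrt(max(m2, 0.0))


def massless(pt: float, y: float, phi: float) -> ProtoJet:
    """Hypothetical helper: massless particle from (pT, rapidity, phi)."""
    return ProtoJet(pt * math.cosh(y), pt * math.cos(phi),
                    pt * math.sin(phi), pt * math.sinh(y))


def merge(a: ProtoJet, b: ProtoJet) -> ProtoJet:
    """Pairwise recombination by four-momentum addition."""
    return ProtoJet(a.E + b.E, a.px + b.px, a.py + b.py, a.pz + b.pz, a, b)


def delta_r(a: ProtoJet, b: ProtoJet) -> float:
    """Distance in y-phi space, with the azimuthal difference wrapped to [0, pi]."""
    dphi = abs(a.phi - b.phi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(a.rapidity - b.rapidity, dphi)


def soft_drop(jet: ProtoJet, z_cut: float = 0.1, beta: float = 0.0,
              radius: float = 0.8) -> ProtoJet:
    """Walk from the last clustering step backwards, applying Eq. (1) at each node."""
    node = jet
    while node.child1 is not None and node.child2 is not None:
        j1, j2 = node.child1, node.child2
        z = min(j1.pt, j2.pt) / (j1.pt + j2.pt)
        if z > z_cut * (delta_r(j1, j2) / radius) ** beta:
            return node  # condition satisfied: grooming stops here
        node = j1 if j1.pt >= j2.pt else j2  # drop the softer branch
    return node


# Toy jet: two hard, nearby prongs plus one soft wide-angle emission.
hard_pair = merge(massless(300.0, 0.0, 0.0), massless(200.0, 0.1, 0.1))
toy_jet = merge(hard_pair, massless(5.0, 0.5, -0.4))
groomed = soft_drop(toy_jet, z_cut=0.1, beta=0.0, radius=0.8)
```

In this toy configuration the soft emission fails the condition for β = 0 and is removed, so `groomed` is the hard two-prong node and its mass is smaller than that of the ungroomed jet. Calling `soft_drop` with a very large β leaves the jet untouched, mirroring the β → ∞ limit.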
This Letter presents a measurement of the soft-drop jet mass using 32.9 fb$^{-1}$ of $\sqrt{s} = 13$ TeV pp data collected in 2016 by the ATLAS detector, and the first comparison to predictions of jet substructure that are formally more accurate than the LL PS approximation.
ATLAS is a particle detector designed to achieve nearly a full 4π coverage in solid angle [25]. The inner tracking detector (ID) is inside a 2 T magnetic field and is designed to measure charged-particle trajectories up to |η| = 2.5. Surrounding the ID are electromagnetic and hadronic calorimeters, which use liquid argon and lead, copper, or tungsten absorber for the electromagnetic and forward (|η| > 1.7) hadronic detectors, and scintillator-tile active material with steel absorber for the central (|η| < 1.7) hadronic calorimeter.
For this study, jets are clustered using the anti-$k_t$ jet algorithm with radius parameter R = 0.8 implemented in FastJet [26]. The inputs are topological calorimeter-cell clusters calibrated using the local cluster weighting algorithm [27]. In order to improve the rapidity resolution, cluster four-vectors are corrected to point toward the reconstructed primary collision vertex [28]. An overall jet energy calibration, derived for R = 0.8 jets, accounts for residual detector effects as well as contributions from pileup (i.e., simultaneous additional pp collisions) in order to make the reconstructed jet energy unbiased (up through 'Absolute MC-based calibration' in Ref. [29]). Jets are required to have |η| < 1.5 so that their calorimeter-cell clusters are within the coverage of the ID.
Events were selected online using a two-level trigger system [30] that is hardware-based at the first level and software-based at the second level. In this analysis, the full-luminosity jet trigger with the lowest $p_{\mathrm{T}}$ threshold is nearly 100% efficient for jets with $p_{\mathrm{T}}$ > 600 GeV. Events are required to have a minimum of two jets, at least one of which has $p_{\mathrm{T}}$ > 600 GeV. In addition, a dijet topology is imposed by requiring that the leading two $p_{\mathrm{T}}$-ordered jets satisfy $p_{\mathrm{T},1}/p_{\mathrm{T},2} < 1.5$: as the leading two jets are required to have similar $p_{\mathrm{T}}$, this removes events with additional energetic jets.
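The offline selection just described can be expressed as a short predicate. The following is only a sketch: the function name, the representation of jets as `(pt, eta)` tuples in GeV, and the choice to apply the |η| < 1.5 jet requirement before ordering in $p_{\mathrm{T}}$ are assumptions made for illustration, and trigger and calibration steps are not modeled.

```python
from typing import Sequence, Tuple


def passes_dijet_selection(jets: Sequence[Tuple[float, float]],
                           lead_pt_min: float = 600.0,
                           max_pt_ratio: float = 1.5,
                           max_abs_eta: float = 1.5) -> bool:
    """Return True if the event has a dijet topology as described in the text.

    `jets` holds hypothetical (pt [GeV], eta) pairs for calibrated R = 0.8 jets.
    """
    # Keep only jets inside the assumed pseudorapidity acceptance.
    pts = sorted((pt for pt, eta in jets if abs(eta) < max_abs_eta), reverse=True)
    if len(pts) < 2:
        return False
    lead, sublead = pts[0], pts[1]
    if sublead <= 0.0:
        return False
    # Leading jet above threshold and the two leading jets balanced in pT.
    return lead > lead_pt_min and lead / sublead < max_pt_ratio
```

For example, an event with jets of 700 GeV and 500 GeV passes (ratio 1.4), while one with 900 GeV and 500 GeV fails the balance requirement (ratio 1.8).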
The soft-drop algorithm is then run on the leading two jets in the selected events. Both of these jets are used for the measurement. Three different values of β ∈ {0, 1, 2} are considered. The value of $z_{\mathrm{cut}}$ is fixed at 0.1 so that log($z_{\mathrm{cut}}$) resummation is negligible [15]. The dimensionless mass $\rho = m^{\text{soft drop}} / p_{\mathrm{T}}^{\text{ungroomed}}$ is the observable of interest: as the soft-drop mass is correlated with $p_{\mathrm{T}}$, ρ is a dimensionless quantity that only weakly depends on $p_{\mathrm{T}}$. For each β value, log$_{10}(\rho^2)$ is constructed from the jet's mass after the soft-drop algorithm and its $p_{\mathrm{T}}$ before (referred to as $p_{\mathrm{T}}^{\text{ungroomed}}$). [...] resummation dominates over non-perturbative or fixed-order parts of the recent precision calculations; studying the distribution in log-scale allows this region to be studied more closely.

After the event selection, the data are unfolded to correct for detector effects. MC simulations are used to perform the unfolding and for comparisons with the corrected data. The unfolding procedure corrects detector-level observables to particle-level. The particle-level selection is defined to be as close as possible to the detector-level selection in order to minimize the size of simulation-based corrections when unfolding. Particle-level jets are clustered from simulated particles with a mean lifetime τ > 30 ps, excluding muons and neutrinos. These jets are built using the same algorithm as for detector-level jets, and particle-level events must pass the same dijet requirement. The experimental resolution of the log$_{10}(\rho^2)$ distribution depends on the jet $p_{\mathrm{T}}$, so the log$_{10}(\rho^2)$ and $p_{\mathrm{T}}$ distributions are simultaneously unfolded. After correcting for the acceptance of the event selection, the full two-dimensional distribution is unfolded using an iterative Bayesian (IB) technique [31] with four iterations as implemented in the RooUnfold framework [32]. The acceptance corrections are largely independent of log$_{10}(\rho^2)$, with a small effect below −3 due to the ρ > 0 requirement.
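To make the iterative Bayesian step concrete, the update rule can be written out in plain Python. This is a one-dimensional sketch of the standard iterative Bayesian update, not the RooUnfold code used in the analysis; the function name and the toy 2×2 response matrix are hypothetical. A two-dimensional distribution such as (log$_{10}(\rho^2)$, $p_{\mathrm{T}}$) could be handled the same way by flattening its bins into a single index, which is an assumption of this illustration.

```python
from typing import List, Sequence


def iterative_bayes_unfold(response: Sequence[Sequence[float]],
                           data: Sequence[float],
                           prior: Sequence[float],
                           n_iter: int = 4) -> List[float]:
    """Iterative Bayesian unfolding.

    response[j][i] is the probability that an event in particle-level bin i
    is reconstructed in detector-level bin j (taken from simulation).
    """
    n_reco, n_true = len(response), len(prior)
    # Efficiency of each particle-level bin: chance of being reconstructed at all.
    eff = [sum(response[j][i] for j in range(n_reco)) for i in range(n_true)]
    truth = [float(t) for t in prior]
    for _ in range(n_iter):
        # Fold the current particle-level estimate to detector level.
        folded = [sum(response[j][i] * truth[i] for i in range(n_true))
                  for j in range(n_reco)]
        # Bayes update: redistribute the observed counts to particle-level bins.
        truth = [
            (truth[i] / eff[i]) * sum(response[j][i] * data[j] / folded[j]
                                      for j in range(n_reco) if folded[j] > 0.0)
            if eff[i] > 0.0 else 0.0
            for i in range(n_true)
        ]
    return truth


# Toy closure check with a hypothetical response matrix.
toy_response = [[0.8, 0.2],
                [0.2, 0.8]]
toy_truth = [100.0, 50.0]
toy_data = [sum(toy_response[j][i] * toy_truth[i] for i in range(2)) for j in range(2)]
unfolded = iterative_bayes_unfold(toy_response, toy_data, prior=[75.0, 75.0], n_iter=4)
```

Starting from a flat prior, each iteration moves the estimate toward the particle-level spectrum that reproduces the data when folded. The number of iterations acts as a regularization, which is why the dependence on the prior is treated as a systematic uncertainty.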
Several MC simulations are used to unfold and compare to the data. Dijet events were generated at LO using Pythia [33] 8.186, with the 2 → 2 matrix element (ME) convolved with the NNPDF2.3LO parton distribution function (PDF) set [34], and using the A14 [35] set of tuned PS and underlying-event model parameters. Additional radiation beyond the ME was simulated in Pythia 8 using the LL approximation for the $p_{\mathrm{T}}$-ordered PS [36]. To provide several comparisons to data, additional dijet samples were simulated using different generators. Sherpa 2.1.1 [37] generates events using multi-leg 2 → 3 matrix elements, which are matched to the PS following the CKKW prescription [38]. These Sherpa events were simulated using the CT10 LO PDF set [39] and the default Sherpa event tune. Herwig++ 2.7.1 [40,41] events were generated with the 2 → 2 matrix element, convolved with the CTEQ6L1 PDF set [42] and configured with the UE-EE-5 tune [43]. Both Sherpa and Herwig++ use angular ordering in the PS and a cluster model for hadronization [44]. All MC samples use Pythia 8 minimum bias events (MSTW2008LO PDF set [45] and A2 tune [46]) to simulate pileup. They were processed using the full ATLAS detector simulation [47] based on Geant4 [48].
Figure 1 shows the uncorrected data compared with detector-level simulation for Pythia, Sherpa, and Herwig++ as well as particle-level simulation for Pythia. There are substantial migrations between the detector- and particle-level distributions, which cause large off-diagonal terms in the unfolding matrix, especially at low values of log$_{10}(\rho^2)$.
Various systematic uncertainties impact the soft-drop mass distribution. The sources of uncertainty can be classified into two categories: experimental and theoretical modeling. Experimental uncertainties are due to limitations in the accuracy of the modeling of calorimeter-cell cluster energies and positions as well as their reconstruction efficiency, and are evaluated as follows. Isolated calorimeter-cell clusters are matched to tracks; the mean and standard deviation of the energy-to-momentum ratio (E/p) are used for the cluster energy scale and resolution uncertainties, and the standard deviation of the relative position is used for the cluster angular resolution. In the track-momentum range 30 GeV < p < 350 GeV, E/p is augmented with information from test-beam studies [49]. For |η| > 0.6 in that p range or for p > 350 GeV (and any |η|), a flat 10% uncertainty is estimated for both the energy scale and resolution, motivated by earlier studies [50]. The reconstruction efficiency is studied using the fraction of tracks without a matched calorimeter-cell cluster. A series of validation studies is performed to ensure that these uncertainties are valid also for non-isolated clusters. Jets clustered from tracks are geometrically matched to calorimeter jets, and the ratios of their $p_{\mathrm{T}}$ and mass are sensitive to the jet energy scale (JES) and jet mass scale. Furthermore, the decomposition method [50-52] is used to propagate the cluster-based uncertainties to an effective JES, which agrees well with the observed in-situ shift for R = 0.4 ungroomed jets [29]. Finally, the jet mass scale and resolution are tested using the observed W mass peak in $t\bar{t}$ events. The event selection is the same as in Ref. [53], and the same level of agreement is observed. These additional studies confirm that the cluster-based uncertainties are valid for log$_{10}(\rho^2)$.
One of the dominant uncertainties is due to the theoretical modeling of jet fragmentation (QCD Modeling). In particular, as dijet simulation is used to unfold the data, the results of the analysis are sensitive to the choice of MC generator used for this procedure. The Pythia generator is used for the nominal sample, and comparisons are made with Sherpa and Herwig++. The Sherpa and Herwig++ generators give compatible results, so only the variation with Sherpa is used as a systematic uncertainty. The impact of this uncertainty is assessed by unfolding the data with the alternative response matrix. In addition to directly varying the model used to derive the response matrix, a data-driven nonclosure technique is used to estimate the potential bias from a given choice of prior and the number of iterations in the IB method [54]. The inverse of the response matrix is applied to the particle-level spectrum, which is reweighted until the folded spectrum agrees with data. This modified detector-level distribution is unfolded with the nominal response matrix and the difference between this and the reweighted particle-level spectrum is taken as an uncertainty. Finally, the sensitivity of the unfolding procedure to pileup is assessed by reweighting events to vary the distribution of the number of interactions in the MC simulation by 10%: the impact on the measurement is small. This is expected, since the soft-drop algorithm is designed to remove the soft, wide-angle radiation that pileup contributes.
The uncertainties are dominated by QCD modeling and the cluster energy scale. The former are largest (about 20%) at low log$_{10}(\rho^2)$, where non-perturbative effects introduce a sensitivity to the log$_{10}(\rho^2)$ distribution prior, and are about 10% for the rest of the distribution. Cluster energy uncertainties are large (about 5%) at low log$_{10}(\rho^2)$, where the cluster multiplicity is low, and also at high log$_{10}(\rho^2)$, where the energy of the hard prongs, rather than their opening angle, dominates the mass resolution. Other sources of uncertainty are typically below 5% across the entire distribution. A summary of the relative sizes of the various systematic uncertainties for β = 0 is shown in Fig. 2. The relative sizes of the different sources of systematic uncertainty are similar for β = 1 and β = 2, except that the large uncertainty at low log$_{10}(\rho^2)$ values spans a larger range.
The unfolded data are shown in Fig. 3. They are compared to the predictions of the Pythia, Sherpa, and Herwig++ generators, as well as the NLO+NLL prediction from Refs. [15,16] and the LO+NNLL prediction from Refs. [17,18]. The NLO+NLL (LO+NNLL) calculation uses NLOJet++ [55,56] (MG5_aMC [57]) with the CT14nlo [58] (MSTW2008LO) PDF set for matrix element calculations. The distributions are normalized to the integrated cross-section, $\sigma_{\text{resum}}$, measured in the resummation region, −3.7 < log$_{10}(\rho^2)$ < −1.7. The uncertainties due to the analytical calculation come from independently varying each of the renormalization, factorization, and resummation scales by factors of 2 and 1/2. The NLO+NLL calculation is also given with non-perturbative (NP) corrections based on the average of various MC models with NP effects turned on and off; the envelope of predictions is added as an uncertainty [15]. The LO+NNLL predictions do not contain NP effects, but the open markers in Fig. 3 indicate where NP effects are expected to be large ('large NP effects').
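The observable and its normalization can be illustrated with a short sketch. The helper names, the toy binning, and the choice to divide by the bin width (so that the result approximates $(1/\sigma_{\text{resum}})\, d\sigma / d\log_{10}(\rho^2)$) are assumptions made for illustration; only bins lying fully inside −3.7 < log$_{10}(\rho^2)$ < −1.7 are counted in $\sigma_{\text{resum}}$.

```python
import math
from typing import List, Sequence


def log10_rho2(m_soft_drop: float, pt_ungroomed: float) -> float:
    """log10(rho^2) with rho = m_soft_drop / pT_ungroomed (requires rho > 0)."""
    return math.log10((m_soft_drop / pt_ungroomed) ** 2)


def normalize_to_resummation_region(bin_edges: Sequence[float],
                                    counts: Sequence[float],
                                    lo: float = -3.7,
                                    hi: float = -1.7,
                                    tol: float = 1e-9) -> List[float]:
    """Divide each bin by its width and by the yield inside [lo, hi]."""
    bins = list(zip(bin_edges[:-1], bin_edges[1:], counts))
    # Integrated yield over bins fully contained in the resummation region.
    sigma_resum = sum(c for left, right, c in bins
                      if left >= lo - tol and right <= hi + tol)
    if sigma_resum <= 0.0:
        raise ValueError("no events in the resummation region")
    return [c / (sigma_resum * (right - left)) for left, right, c in bins]


# Hypothetical binning and yields, for illustration only.
toy_edges = [-4.5, -3.7, -2.7, -1.7, -0.5]
toy_counts = [10.0, 30.0, 50.0, 10.0]
toy_normalized = normalize_to_resummation_region(toy_edges, toy_counts)
```

With this convention the normalized distribution integrates to one over the resummation region, so data and predictions are compared in shape there, while bins outside the region are scaled by the same factor.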
The MC predictions and the analytical calculations are expected to be accurate in different regions of log$_{10}(\rho^2)$ [15,17,18]. In general, non-perturbative effects are large for log$_{10}(\rho^2)$ < −3.7 (where small-angle or soft gluon emissions dominate) and small for −3.7 < log$_{10}(\rho^2)$ < −1.7, where resummation dominates. Fixed, higher-order corrections are expected to be important for log$_{10}(\rho^2)$ > −1.7, where large-angle gluon emission can play an important role. This implies that the region −3.7 < log$_{10}(\rho^2)$ < −1.7 (the resummation region) should have the most reliable predictions for both the MC generators and the LO+NNLL analytical calculation, while the NLO+NLL calculation should also be accurate for log$_{10}(\rho^2)$ > −1.7. For all values of β, the measured and predicted shapes agree well in the resummation region, and the data and NLO+NLL prediction continue to agree well at higher values of log$_{10}(\rho^2)$. At more negative values of log$_{10}(\rho^2)$, non-perturbative effects lead to distinctly different predictions between the MC generators and the calculations without NP corrections; the data fall below the predictions for all β values. Interestingly, the NNLL calculation is not everywhere a better model of the data than the NLL calculation in the resummation regime, and NP effects can also be comparable to the higher-order resummation corrections in this regime. Therefore, improved precision in the future will require a careful comparative analysis of the different perturbative calculations as well as a deeper and possibly analytic understanding of NP effects.
As β increases, the fraction of radiation removed by soft-drop grooming decreases and the impact of non-perturbative effects grows larger [17,18], so the range over which the analytical calculations are accurate also decreases. The degree of agreement between data and all the calculations for log$_{10}(\rho^2)$ < −3 worsens substantially for β ∈ {1, 2}, especially when NP corrections are not included. Agreement between the data and the MC generators remains generally within uncertainties for all values of β. Digitized versions of the results, along with versions binned in jet $p_{\mathrm{T}}$, can be found at Ref. [...].
In summary, a measurement of the soft-drop jet mass is reported. The measurement provides a comparison of the internal properties of jets between 32.9 fb$^{-1}$ of 13 TeV pp collision data collected by the ATLAS detector at the LHC and precision QCD calculations accurate beyond leading logarithm. Where the calculations are well defined perturbatively, they agree well with the data; in regions where non-perturbative effects are expected to be significant, the calculations disagree with the data and the predictions from MC simulation are better able to reproduce the data. The dijet cross section is presented as a normalized fiducial dijet differential cross section as a function of log$_{10}(\rho^2)$ for each jet, allowing the results to be used to constrain future calculations and MC generator predictions.

Figure 1: Distributions of log$_{10}(\rho^2)$ in data compared to reconstructed detector-level (Reco.) Pythia, Sherpa, and Herwig++, and particle-level (Truth) Pythia simulations for β = 0 (left), β = 1 (right), and β = 2 (bottom). The ratio of the three detector-level MC predictions to the data is shown in the middle panel, and the size of the detector-to-particle-level corrections for Pythia is shown as the ratio in the bottom panel. The error bars on the data points and in the first ratio include the experimental systematic uncertainties in the cluster energy, angular resolution, and efficiency. The distributions are normalized to the integrated cross-section, $\sigma_{\text{resum}}$, measured in the resummation region, −3.7 < log$_{10}(\rho^2)$ < −1.7.

Figure 3: The unfolded log$_{10}(\rho^2)$ distribution for anti-$k_t$ R = 0.8 jets with $p_{\mathrm{T}}^{\text{lead}}$ > 600 GeV, after the soft-drop algorithm is applied for β ∈ {0, 1, 2}, in data compared to Pythia, Sherpa, and Herwig++ particle-level (left), and NLO+NLL(+NP) [15] and LO+NNLL [17,18] theory predictions (right). The LO+NNLL calculation does not have non-perturbative (NP) corrections; the region where these are expected to be large is shown with an open marker (but no correction is applied), while regions where they are expected to be small are shown with a filled marker. All uncertainties described in the text are shown on the data; the uncertainties from the calculations are shown on each one. The distributions are normalized to the integrated cross section, $\sigma_{\text{resum}}$, measured in the resummation region, −3.7 < log$_{10}(\rho^2)$ < −1.7. The NLO+NLL+NP cross-section in this resummation regime is 0.14, 0.19, and 0.21 nb for β = 0, 1, 2, respectively [15].