Search for Supersymmetry in pp Collisions at $\sqrt{s} = 13$ TeV in the Single-Lepton Final State Using the Sum of Masses of Large-Radius Jets

Results are reported from a search for supersymmetric particles in proton-proton collisions in the final state with a single lepton, multiple jets, including at least one b-tagged jet, and large missing transverse momentum. The search uses a sample of proton-proton collision data at $\sqrt{s} = 13$ TeV recorded by the CMS experiment at the LHC, corresponding to an integrated luminosity of 35.9 fb$^{-1}$. The observed event yields in the signal regions are consistent with those expected from standard model backgrounds. The results are interpreted in the context of simplified models of supersymmetry involving gluino pair production, with gluino decay into either on- or off-mass-shell top squarks. Assuming that the top squarks decay into a top quark plus a stable, weakly interacting neutralino, scenarios with gluino masses up to about 1.9 TeV are excluded at 95% confidence level for neutralino masses up to about 1 TeV.

A central goal of the physics program of the CMS experiment at the CERN LHC [1] is the search for new particles and phenomena beyond the standard model (SM), in particular, for supersymmetry (SUSY) [2][3][4][5][6][7][8][9]. During 2016, CMS recorded a data sample of proton-proton collisions at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 35.9 fb$^{-1}$, significantly extending the sensitivity to the production of new heavy particles. The search described here focuses on a generically important experimental signature that is also strongly motivated by SUSY phenomenology. This signature includes a single lepton (an electron or a muon), several jets, arising from the hadronization of energetic quarks and gluons, at least one b-tagged jet, indicative of processes involving third-generation quarks, and, finally, $\vec{p}_T^{\,\text{miss}}$, the missing momentum in the direction transverse to the beam. A large value of $p_T^{\text{miss}} \equiv |\vec{p}_T^{\,\text{miss}}|$ can arise from the production of high-momentum, weakly interacting particles that escape detection. Searches for SUSY in the single-lepton final state have been performed by both ATLAS and CMS at $\sqrt{s} = 7$ and 8 TeV [10][11][12][13] and at $\sqrt{s} = 13$ TeV [14][15][16][17]. The present analysis, which introduces extended binning and other improvements, is based largely on methodologies described in detail in Ref. [16], which include the use of large-radius jets and related kinematic variables.
In models based on SUSY, new particles are introduced such that all fermionic (bosonic) degrees of freedom in the SM are paired with corresponding bosonic (fermionic) degrees of freedom in the extended theory. The discovery of a Higgs boson with low mass [18][19][20][21][22][23] provides a key motivation for SUSY. Stabilizing the Higgs boson mass at a low value, without invoking extreme fine-tuning of parameters, is a major theoretical challenge, referred to as the gauge hierarchy problem [24][25][26][27][28][29]. This stabilization can be achieved in so-called natural SUSY models [30][31][32][33][34], in which several of the SUSY partners are constrained to be light [33]: the top squarks $\tilde{t}_L$ and $\tilde{t}_R$, which have the same electroweak couplings as the left- (L-) and right- (R-) handed top quarks, respectively; the bottom squark with L-handed couplings, $\tilde{b}_L$; the gluino $\tilde{g}$; and the Higgsinos $\tilde{H}$. This search targets gluino pair production, which has a relatively large cross section for a given mass, with gluino decay $\tilde{g} \to t\bar{t}\tilde{\chi}_1^0$. This process can arise from $\tilde{g} \to \tilde{t}_1\bar{t}$, where the lighter top squark mass eigenstate $\tilde{t}_1$ is produced either on or off mass shell. The symbol $\tilde{\chi}_1^0$ denotes the lightest neutralino, an electrically neutral mass eigenstate that is in general a mixture of the Higgsinos and electroweak gauginos. In R-parity conserving SUSY models [35,36] in which the $\tilde{\chi}_1^0$ is the lightest supersymmetric particle, the $\tilde{\chi}_1^0$ is stable and can, in principle, account for some or all of the astrophysical dark matter [37][38][39]. The scenario with off-mass-shell top squarks is denoted as T1tttt [40] in simplified model scenarios [41][42][43]. In natural SUSY models, the top squark is typically lighter than the gluino, so we also search for scenarios with on-shell top squarks, denoted as T5tttt.
Simulated event samples for SM background processes are used to determine correction factors, typically near unity, that are used in conjunction with observed event yields in control regions to determine the SM background contribution in the signal regions. The production of $t\bar{t}$+jets, W+jets, Z+jets, and QCD multijet events is simulated with the MC generator MADGRAPH5_aMC@NLO 2.2.2 [44], with parton distribution functions taken from NNPDF 3.0 [45]. Details on the simulated SM background samples, including other processes with smaller contributions (single top quark, $t\bar{t}$+bosons, diboson, and $t\bar{t}t\bar{t}$ production) are given in Ref. [16]. The detector simulation is performed with GEANT4 [46]. Simulated event samples for SUSY signal models, used to determine the selection efficiency for signal events, are generated with MADGRAPH5_aMC@NLO 2.2.2 with up to two additional partons at leading order accuracy and are normalized to cross sections based on Ref. [47]. Because of the large number of mass hypotheses examined in this analysis, the detector simulation in this case is performed with the CMS fast simulation package [48].
Two T1tttt benchmark models are used to illustrate typical signal behavior. The T1tttt(1800,100) model, which we refer to as a noncompressed-spectrum model (NC), has $m(\tilde{g}) = 1800$ GeV, $m(\tilde{\chi}_1^0) = 100$ GeV, and a cross section of 2.8 fb, and corresponds to a scenario with a large gluino-neutralino mass splitting. The T1tttt(1400,1000) model, with $m(\tilde{g}) = 1400$ GeV, $m(\tilde{\chi}_1^0) = 1000$ GeV, and a cross section of 25 fb, corresponds to a scenario with a small gluino-neutralino mass splitting and is referred to as a compressed-spectrum model (C).
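For orientation, the number of gluino pair events produced for each benchmark follows from the quoted cross section and the integrated luminosity, N = σL. The short sketch below only does this arithmetic; it counts produced events before any selection or efficiency, and the function and variable names are illustrative rather than taken from the analysis.

```python
# Produced signal events N = sigma * L, before any selection efficiency.
LUMI_FB_INV = 35.9  # integrated luminosity in fb^-1, as quoted in the text


def produced_events(xsec_fb: float, lumi_fb_inv: float = LUMI_FB_INV) -> float:
    """Number of produced events for a cross section given in fb."""
    return xsec_fb * lumi_fb_inv


# Cross sections (fb) of the two benchmark models quoted in the text.
benchmarks = {"T1tttt(1800,100) NC": 2.8, "T1tttt(1400,1000) C": 25.0}

for name, xsec in benchmarks.items():
    print(f"{name}: roughly {produced_events(xsec):.1f} events produced")
```

The compressed model is produced about nine times more often, but its small mass splitting leaves less energy for the visible decay products, so the two benchmarks probe different parts of the binning.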
The data were recorded with the CMS detector [49], which is constructed around a superconducting solenoid of 6 m diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are the charged particle tracking systems, composed of silicon-pixel and silicon-strip detectors, and the calorimeter systems, consisting of a lead tungstate crystal electromagnetic calorimeter and a brass and scintillator hadron calorimeter. Muons are identified and measured by gas-ionization detectors embedded in the magnetic flux-return yoke outside the solenoid. Events were selected using several triggers [50] that require either large $p_T^{\text{miss}}$ or a single lepton (an electron or a muon), with and without significant hadronic activity. The trigger efficiency is measured in data for our analysis requirements to be nearly 100%.
Event reconstruction proceeds from particles identified by the particle-flow (PF) algorithm [51], which uses information from the tracker, calorimeters, and muon systems to identify PF candidates as electrons, muons, charged or neutral hadrons, or photons. Electrons are reconstructed by associating a charged-particle track with electromagnetic calorimeter superclusters [52]. The resulting candidate electrons are required to have transverse momentum $p_T > 20$ GeV and pseudorapidity $|\eta| < 2.5$, and to satisfy identification criteria designed to reject light-parton jets and photon conversions. Muons are reconstructed by associating tracks in the muon system with those found in the silicon tracker [53]. Muon candidates are required to satisfy $p_T > 20$ GeV and $|\eta| < 2.4$. To select leptons from W boson decays, leptons are required to be isolated from other PF candidates. Isolation is quantified using an optimized version [16] of the mini-isolation variable originally suggested in Ref. [54], in which the transverse energy of the particles within a cone around the lepton momentum vector is computed using a cone size that decreases as $1/p_T^{\ell}$, where $p_T^{\ell}$ is the transverse momentum of the lepton.
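The idea of a cone that shrinks as $1/p_T^{\ell}$ can be sketched as follows. This is a minimal illustration, not the optimized definition of Ref. [16]: the scale `k_t_scale` and the clamping values `r_min` and `r_max` are assumed placeholder numbers, the candidates are plain `(pt, eta, phi)` tuples, and the scalar $p_T$ sum stands in for the transverse energy sum.

```python
import math


def mini_iso_cone_radius(lepton_pt, k_t_scale=10.0, r_min=0.05, r_max=0.2):
    """Cone radius that falls as k_t_scale / pT, clamped to [r_min, r_max].

    ASSUMPTION: k_t_scale (GeV), r_min and r_max are illustrative values,
    not the ones used in the analysis.
    """
    if lepton_pt <= 0:
        raise ValueError("lepton pT must be positive")
    return max(r_min, min(r_max, k_t_scale / lepton_pt))


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the azimuthal difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def mini_isolation_sum(lepton, candidates):
    """Scalar pT sum of candidates inside the pT-dependent cone.

    `lepton` and each candidate are (pt, eta, phi) tuples; the lepton itself
    is assumed not to be in `candidates`.
    """
    lep_pt, lep_eta, lep_phi = lepton
    radius = mini_iso_cone_radius(lep_pt)
    return sum(
        pt for pt, eta, phi in candidates
        if delta_r(lep_eta, lep_phi, eta, phi) < radius
    )
```

A shrinking cone keeps high-$p_T$ leptons from boosted top quark decays isolated even when a nearby b jet would fall inside a fixed-size cone.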
To suppress dilepton backgrounds, we veto events that contain a broader category of candidates for the second lepton, referred to as veto tracks. These include two categories of charged-particle tracks: isolated leptons satisfying looser identification criteria than lepton candidates, including a relaxed momentum requirement, $p_T > 10$ GeV, and isolated charged-hadron PF candidates, which must satisfy $p_T > 15$ GeV and $|\eta| < 2.5$. In either case, the charge of the veto track must be opposite to that of the lepton candidate in the event. To maintain a high selection efficiency for signal events, lepton veto tracks must satisfy a requirement on the quantity [55,56] $M_{T2}(\vec{p}^{\,\ell}, \vec{p}_T^{\,v}, \vec{p}_T^{\,\text{miss}}) < 80$ GeV and hadronic veto tracks must satisfy $M_{T2}(\vec{p}^{\,\ell}, \vec{p}_T^{\,v}, \vec{p}_T^{\,\text{miss}}) < 60$ GeV, where $v$ refers to the veto track.
Charged and neutral PF candidates are clustered into jets using the anti-$k_T$ algorithm [57] with radius parameter $R = 0.4$, as implemented in the FASTJET package [58]. Jets are required to satisfy $p_T > 30$ GeV and $|\eta| \le 2.4$. Additional details and references are given in Ref. [16] on the $p_T$- and $\eta$-dependent jet energy calibration [59], the jet identification requirements, and the subtraction of the energy contribution to the jet $p_T$ from multiple proton-proton interactions from the same or neighboring beam crossings (pileup) [60]. A subset of the jets are tagged as originating from b quarks using the combined secondary vertex algorithm [61,62].
We further cluster the jets with $R = 0.4$ (small-R jets), including those associated with isolated leptons, into $R = 1.4$ (large-R) jets using the anti-$k_T$ algorithm. The masses $m(J_i)$ of the large-R jets reflect the $p_T$ spectrum and multiplicity of the clustered objects, as well as their angular spread. The variable $M_J$ is defined as the sum of all large-R jet masses: $M_J = \sum_{J_i = \text{large-}R\text{ jets}} m(J_i)$. For $t\bar{t}$ events with a small contribution from initial-state radiation (ISR), the $M_J$ distribution has an approximate cutoff at $2m_t$. In contrast, the $M_J$ distribution for signal events extends to larger values because of the presence of multiple top quarks in the decay chain. The presence of a significant amount of ISR generates a high-$M_J$ tail in the $t\bar{t}$ background, producing the main source of background in the analysis.
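The definition of $M_J$ can be illustrated with a short sketch. It assumes the anti-$k_T$ reclustering has already been done elsewhere, so each large-R jet is given as a list of its small-R constituents, each a hypothetical `(E, px, py, pz)` tuple in GeV; no jet clustering is performed here.

```python
import math


def invariant_mass(four_vectors):
    """Mass of the summed four-vector of (E, px, py, pz) tuples, in GeV."""
    e = sum(v[0] for v in four_vectors)
    px = sum(v[1] for v in four_vectors)
    py = sum(v[2] for v in four_vectors)
    pz = sum(v[3] for v in four_vectors)
    m2 = e * e - (px * px + py * py + pz * pz)
    # Guard against tiny negative values from floating-point rounding.
    return math.sqrt(max(m2, 0.0))


def sum_of_large_r_jet_masses(large_r_jets):
    """M_J: sum over large-R jets of the mass of each jet.

    `large_r_jets` is a list of large-R jets, each a list of the
    four-vectors of the small-R jets clustered into it.
    """
    return sum(invariant_mass(constituents) for constituents in large_r_jets)
```

A large-R jet built from a single massless constituent contributes nothing, whereas one that collects several well-separated energetic constituents contributes a large mass, which is why $M_J$ grows with both multiplicity and angular spread.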
The missing transverse momentum $\vec{p}_T^{\,\text{miss}}$ is defined as the negative vector sum of the transverse momenta of all PF candidates. To separate backgrounds characterized by the presence of a single W boson decaying leptonically, but without any other source of $p_T^{\text{miss}}$, we use the transverse mass $m_T = \sqrt{2 p_T^{\ell} p_T^{\text{miss}} [1 - \cos(\Delta\phi)]}$, where $\Delta\phi$ is the difference between the azimuthal angles of $\vec{p}_T^{\,\ell}$ and $\vec{p}_T^{\,\text{miss}}$. The quantity $H_T$ is defined as the scalar sum of the transverse momenta of all the small-R jets passing the selection, while $S_T$ is the scalar sum of $H_T$ and the transverse momenta of the selected leptons.
We select events with exactly one isolated charged lepton (an electron or a muon), no veto tracks, $S_T > 500$ GeV, $p_T^{\text{miss}} > 200$ GeV, and at least six small-R jets, at least one of which is b tagged. After this set of requirements, referred to as the baseline selection, about 80% of the SM background arises from $t\bar{t}$ production. The contributions from events with a single top quark or a W boson in association with jets are each about 6%-8%; much of the remainder arises from events with a $t\bar{t}$ pair produced in association with a vector boson. After applying the baseline selection, the background from QCD multijet events is negligible.
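The kinematic variables and the baseline selection can be sketched as below. The transverse mass uses the standard definition $m_T = \sqrt{2 p_T^{\ell} p_T^{\text{miss}} (1 - \cos\Delta\phi)}$. Taking $S_T$ as $H_T$ plus the lepton $p_T$ is an assumption made for this illustration, and the function signatures are hypothetical; all momenta are in GeV.

```python
import math


def transverse_mass(lep_pt, lep_phi, met, met_phi):
    """Standard transverse mass of the lepton + missing-momentum system."""
    dphi = lep_phi - met_phi
    return math.sqrt(max(0.0, 2.0 * lep_pt * met * (1.0 - math.cos(dphi))))


def passes_baseline(n_leptons, n_veto_tracks, lep_pt, jet_pts, n_btags, met):
    """Baseline selection with the thresholds quoted in the text.

    `jet_pts` holds the pT of the small-R jets that pass the jet selection.
    """
    h_t = sum(jet_pts)
    s_t = h_t + lep_pt  # ASSUMPTION: S_T taken as H_T plus the lepton pT
    return (
        n_leptons == 1
        and n_veto_tracks == 0
        and s_t > 500.0
        and met > 200.0
        and len(jet_pts) >= 6
        and n_btags >= 1
    )
```

For a lepton and a neutrino from a single on-shell W boson, $m_T$ cannot exceed the W mass up to resolution effects, which is what makes a high-$m_T$ requirement effective against single-lepton backgrounds.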
The analysis is performed using four regions in the $M_J$-$m_T$ plane, three control regions (CR) and one signal region: R1 (low $m_T$, low $M_J$), R2 (low $m_T$, high $M_J$), and R3 (high $m_T$, low $M_J$) are the control regions, and R4 (high $m_T$, high $M_J$) is the signal region, where high $m_T$ corresponds to $m_T > 140$ GeV. All four regions are divided into bins of $p_T^{\text{miss}}$, forming three largely independent $M_J$-$m_T$ planes. Regions R2 and R4, which have high $M_J$, are further divided into bins according to the number of small-R jets ($N_{\text{jets}}$) and the number of b-tagged jets ($N_b$), with two $N_{\text{jets}}$ bins and three $N_b$ bins, giving a total of 18 bins each. Backgrounds with a single W boson decaying leptonically are strongly suppressed by the requirement $m_T > 140$ GeV, so the background in R3 and R4 is dominated by dilepton $t\bar{t}$ events. Approximately half of the dilepton background events in R4 contain a missed electron or muon, and the other half contain a hadronically decaying τ lepton. Given that the main background processes have two or fewer b quarks, the total SM contribution to the $N_b \ge 3$ bins is very small and is driven by the b tag misidentification rate. Signal events in the T1tttt and T5tttt models populate primarily the bins with $N_b \ge 2$.
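The four-region layout can be written as a small classifier. This is a sketch under stated assumptions: only the $m_T$ boundary of 140 GeV is taken from the text, while the boundary between low and high $M_J$ is left as a required argument, `m_j_boundary`, because its value is not given here; any lower edge on $M_J$ is ignored.

```python
M_T_BOUNDARY_GEV = 140.0  # from the text: high m_T means m_T > 140 GeV


def abcd_region(m_t, m_j, m_j_boundary):
    """Return 'R1'..'R4' for an event passing the baseline selection.

    `m_j_boundary` (GeV) separates low from high M_J; it is a hypothetical
    parameter that the caller must supply.
    """
    high_mt = m_t > M_T_BOUNDARY_GEV
    high_mj = m_j > m_j_boundary
    if not high_mt:
        return "R2" if high_mj else "R1"  # low m_T: control regions
    return "R4" if high_mj else "R3"      # high m_T: R3 control, R4 signal
```

Only R4 is a signal region; the other three regions constrain the background that is extrapolated into it.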
The method for predicting the background yields takes advantage of the near absence of correlation between the $M_J$ and $m_T$ variables in R1-R4, which is a consequence of the high jet multiplicity, $p_T^{\text{miss}}$, and $S_T$ requirements applied in the baseline selection [16]. To satisfy these requirements, background events must typically contain additional jets from ISR. Even though the background at low $m_T$ arises largely from single-lepton $t\bar{t}$ events, while the background at high $m_T$ is dominated by dilepton $t\bar{t}$ events, the shapes of the $M_J$ distributions at low and high $m_T$ become very similar in the presence of multiple ISR jets. We therefore measure this shape at low $m_T$ (R1, R2) and extrapolate it to high $m_T$ to obtain the background prediction in R4. The fitted mean background yields in R1-R4 are thus related by the constraint
$$\mu^{\text{bkg}}_{R4} = \kappa\, \mu^{\text{bkg}}_{R2}\, \mu^{\text{bkg}}_{R3} / \mu^{\text{bkg}}_{R1}.$$
Here, κ is a near-unity correction factor obtained from MC simulation of the total background that accounts for a residual $m_T$-$M_J$ correlation:
$$\kappa = \frac{\mu^{\text{MC,bkg}}_{R4} / \mu^{\text{MC,bkg}}_{R3}}{\mu^{\text{MC,bkg}}_{R2} / \mu^{\text{MC,bkg}}_{R1}}.$$
This constraint is imposed by relating the expected yields in R1-R4 to three parameters: an overall background normalization λ and two ratios $R(m_T)$ and $R(M_J)$, where the expected background yields are given by
$$\mu^{\text{bkg}}_{R1} = \lambda, \quad \mu^{\text{bkg}}_{R2} = \lambda R(M_J), \quad \mu^{\text{bkg}}_{R3} = \lambda R(m_T), \quad \mu^{\text{bkg}}_{R4} = \kappa \lambda R(M_J) R(m_T).$$
These quantities are defined such that there is one value of $R(M_J)$ and κ for each bin of $p_T^{\text{miss}}$, $N_{\text{jets}}$, and $N_b$. Because regions R1 and R3 are integrated in $N_{\text{jets}}$ and $N_b$, the fit parameters λ and $R(m_T)$ are defined such that there is only one value of these quantities for each bin in $p_T^{\text{miss}}$. We perform two types of maximum likelihood fits, which are described in detail in Ref. [16]. The predictive fit uses the observed yields in R1-R3, assuming no signal contribution, to propagate the uncertainties to λ, $R(M_J)$, and $R(m_T)$. The global fit uses the observed yields in all four regions R1-R4 and allows a signal contribution with a single normalization parameter.
The global fit accounts for signal contamination in R1-R3, which is typically less than 10%, and is used to compute signal limits and significances. The results from the predictive fit simplify theoretical reinterpretation in terms of other models by only requiring comparison of observed and predicted yields in R4 rather than all four regions. In both cases, the likelihood function is written as a product of Poisson distributions for the relevant contributions in bins of $p_T^{\text{miss}}$, $N_{\text{jets}}$, and $N_b$ within R2 and R4, taking into account the correlated yields between the unbinned regions R1 and R3.
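The structure of the predictive fit can be illustrated for a single bin. The sketch assumes the parametrization $\mu_{R1} = \lambda$, $\mu_{R2} = \lambda R(M_J)$, $\mu_{R3} = \lambda R(m_T)$, $\mu_{R4} = \kappa \lambda R(M_J) R(m_T)$, which is one reading consistent with the description above. It ignores the binning in $p_T^{\text{miss}}$, $N_{\text{jets}}$, and $N_b$, the shared R1 and R3 yields, and all uncertainties in κ, so it is a simplified stand-in for the fit of Ref. [16], not a reproduction of it.

```python
import math


def expected_yields(lam, r_mt, r_mj, kappa):
    """Mean background yields in R1-R4 for one bin (assumed parametrization)."""
    return {
        "R1": lam,
        "R2": lam * r_mj,
        "R3": lam * r_mt,
        "R4": kappa * lam * r_mj * r_mt,
    }


def poisson_log_pmf(n, mu):
    """log of the Poisson probability of observing n given mean mu > 0."""
    return n * math.log(mu) - mu - math.lgamma(n + 1)


def predictive_nll(lam, r_mt, r_mj, observed, kappa=1.0):
    """Negative log-likelihood using only the control regions R1-R3."""
    mu = expected_yields(lam, r_mt, r_mj, kappa)
    return -sum(poisson_log_pmf(observed[r], mu[r]) for r in ("R1", "R2", "R3"))


def predictive_fit(observed, kappa=1.0):
    """Closed-form maximum of predictive_nll for a single bin.

    With three parameters and three Poisson yields the fit is saturated, so
    the fitted means equal the observed R1-R3 yields and the R4 prediction
    reduces to kappa * N_R2 * N_R3 / N_R1.
    """
    if observed["R1"] <= 0 or observed["R2"] <= 0 or observed["R3"] <= 0:
        raise ValueError("this sketch needs nonzero yields in R1-R3")
    lam = float(observed["R1"])
    r_mj = observed["R2"] / lam
    r_mt = observed["R3"] / lam
    mu_r4 = expected_yields(lam, r_mt, r_mj, kappa)["R4"]
    return lam, r_mt, r_mj, mu_r4
```

For example, hypothetical yields of 100, 20, and 10 events in R1, R2, and R3 with κ = 1 give a predicted R4 background of 20 × 10 / 100 = 2 events. In the full analysis the likelihood spans all bins simultaneously, which is why a numerical fit is needed there.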
Systematic uncertainties in the background prediction are incorporated in the uncertainty in the double ratio correction factor κ. Discrepancies between the value of κ predicted by simulation and the true value of κ in the data can in principle arise from mismodeling of the background composition or its properties, including detector effects.
To assess the potential impact of such effects on κ, two control samples in data are used: a five-jet control sample and a dilepton control sample. The five-jet control sample is completely dominated by background processes and has a SM composition very similar to that of the analysis regions. In particular, this sample probes the rate at which $p_T^{\text{miss}}$ is mismeasured in single-lepton events, which could increase the tail of the $m_T$ distribution. Such events account for about 7% of the background in the signal region at high $p_T^{\text{miss}}$. This small event category can have a κ value that departs significantly from unity, and it is important to validate the modeling of such effects. Using the analogous R1-R4 regions in the $N_{\text{jets}} = 5$ control sample, κ values are measured in data and are found to be consistent with those obtained from simulation. Because of this consistency, the statistical uncertainty obtained from the comparison in the $N_{\text{jets}} = 5$ control sample is assigned as an uncertainty in κ for each $p_T^{\text{miss}}$ bin. These uncertainties are taken to be fully correlated over the $N_{\text{jets}}$ and $N_b$ bins.
The dilepton control sample is used to test the degree of similarity between the $M_J$ shapes of single-lepton and dilepton $t\bar{t}$ events in the presence of ISR. This sample includes not only events with two identified isolated leptons, but also events with one lepton and an oppositely charged veto track. The usual R3 and R4 regions are replaced by dilepton events, and the quantity κ is measured in bins of $N_{\text{jets}}$. As in the five-jet control sample, the values of κ measured in data are found to be consistent with those observed in simulation, and uncertainties are assigned in a similar way. The uncertainties are treated as independent across $N_{\text{jets}}$ bins but fully correlated across $N_b$ and $p_T^{\text{miss}}$ bins. The uncertainties from the dilepton and five-jet control samples are treated as uncorrelated. Studies of a broad range of potential mismodeling effects in simulation show that all such effects would be evident in these control samples.
Systematic uncertainties in the expected signal yields account for uncertainties in the trigger, lepton identification, jet identification, and b tagging efficiencies in simulated data, uncertainties in the distributions of $p_T^{\text{miss}}$, the number of pileup vertices, and ISR jet multiplicity, and uncertainties in the jet energy corrections, QCD scales, and integrated luminosity [63]. The combined effect of all signal-related uncertainties is typically about 25%.
Table I lists the observed event yields in region R4 in data, together with the mean background yields from the predictive fit and the expected signal yields from two benchmark model points. The uncertainties in the predicted background yields include the statistical uncertainties on the event yields in R1-R4 in data, the statistical uncertainties in the κ values arising from the finite size of simulated event samples, and the systematic uncertainties in κ as assessed from the data control samples. The observed yields are consistent with the background predictions in all of the 18 signal bins within 2 standard deviations (s.d.), with most of the 18 bins consistent within 1 s.d. The R4 bins with $p_T^{\text{miss}} > 500$ GeV show an underprediction of the background with respect to the observed yields. However, accounting for the correlations arising from the use of a single, integrated yield in R3 across bins in $N_{\text{jets}}$ and $N_b$, the significance of the discrepancy in these six bins in R4 is only 1.9 s.d., mostly due to the bins with $N_b = 1$.
To simplify the reinterpretation of the results in terms of other theoretical models, we provide predicted mean background yields for four aggregated search bins, shown in Table II. The aggregate bins are defined such that at least one bin will provide sensitivity to most of the models for which the finely binned analysis has sensitivity. Since the aggregate bins overlap, they are intended to be used one at a time, unlike the 18 nonoverlapping signal bins, which are considered simultaneously in the fit. Each prediction includes all sources of uncertainty. The choice of the best aggregate bin will depend on the model under study. For the T1tttt benchmark models considered in this Letter, using the aggregate bins results in expected upper limits on the cross sections that are 20%-50% higher than those resulting from the full analysis.
TABLE I. Observed event yields and mean background yields from the predictive fit in the 18 bins of the signal region R4. Each bin is specified by the values of $p_T^{\text{miss}}$, $N_{\text{jets}}$, and $N_b$. The uncertainties in κ include both a statistical component from the size of the MC samples and a systematic component assessed from the data control samples. The uncertainty in the predicted event yield includes both of these and the statistical uncertainties associated with the data control regions. Yields for the two T1tttt benchmark models NC and C are also given.
[Figure caption fragment: ... Table I in $M_J$ ranges larger than the binning shown in the figure.]
The lower-$p_T^{\text{miss}}$ region shows the background behavior with higher statistics, while the higher-$p_T^{\text{miss}}$ region has higher sensitivity to the signal. Figure 2 shows an interpretation of the results as exclusion limits at 95% C.L. for T1tttt and T5tttt. The limits are obtained using the CL$_s$ method with a profile-likelihood ratio as the test statistic, using asymptotic approximations for the distribution of the test statistic [64][65][66].
The color map shows the cross section upper limits as a function of $m(\tilde{g})$ and $m(\tilde{\chi}_1^0)$ for T1tttt, assuming a 100% branching fraction for the decay $\tilde{g} \to t\bar{t}\tilde{\chi}_1^0$. The T1tttt model points below the dark solid curve, which extend up to gluino masses of about 1.9 TeV for neutralino masses up to 1 TeV, have a theoretical cross section above the observed cross section upper limit and are thus excluded by this analysis. The dotted black lines around the observed mass limits show the impact of the theoretical uncertainties in the overall signal cross sections arising from uncertainties in the parton distribution functions and the renormalization and factorization scales.
Model points below the light solid curve are excluded at 95% C.L. for the T5tttt model, where it is assumed that the top squark mass is 175 GeV above the neutralino mass, a limiting case in terms of sensitivity to the decay kinematics. The T5tttt simulation does not explicitly include direct top squark pair production. Studies presented in Ref. [16] demonstrate that the effect of this contribution is very small for most of the space of T5tttt model points considered here. For most of the excluded region, the boundaries for T1tttt and T5tttt are very similar, indicating only a weak overall sensitivity to the value of the top squark mass. At low values of $m(\tilde{\chi}_1^0)$ in T5tttt, the sensitivity is reduced because the neutralino carries very little momentum; however, some sensitivity is still provided by dilepton events that escape the lepton veto [16]. For both the T1tttt and T5tttt models, expected limits are computed using the background-only hypothesis, with nuisance parameters assuming their best fit values from the observed data. All limits are computed using results from the global fit.
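The CL$_s$ construction behind these limits can be illustrated with a much simpler stand-in. The sketch below treats a single-bin counting experiment with a known background, uses the observed event count as the test statistic, and evaluates exact Poisson sums. It therefore does not use the profile-likelihood ratio or the asymptotic approximations of the analysis, and it has no nuisance parameters. All function names are illustrative.

```python
import math


def poisson_cdf(n_obs, mu):
    """P(N <= n_obs) for a Poisson distribution with mean mu > 0."""
    return sum(
        math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))
        for k in range(n_obs + 1)
    )


def cls(n_obs, signal, background):
    """CL_s = CL_{s+b} / CL_b for a one-bin counting experiment."""
    if background <= 0:
        raise ValueError("this sketch needs a positive background")
    cl_sb = poisson_cdf(n_obs, signal + background)
    cl_b = poisson_cdf(n_obs, background)
    return cl_sb / cl_b


def signal_upper_limit(n_obs, background, cl=0.95, tol=1e-6):
    """Signal yield at which CL_s drops to 1 - cl, found by bisection."""
    alpha = 1.0 - cl
    lo, hi = 0.0, 1.0
    while cls(n_obs, hi, background) > alpha:  # bracket the crossing
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cls(n_obs, mid, background) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Dividing such a yield limit by the product of signal efficiency and integrated luminosity would give a cross section limit, and a model point is excluded when its theoretical cross section lies above that limit. Normalizing by CL$_b$ protects against excluding signals to which the experiment has little sensitivity when the observed count fluctuates below the expected background.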
In summary, we have performed a search for an excess event yield above that expected for SM processes using a data sample of proton-proton collision events with an integrated luminosity of 35.9 fb$^{-1}$ at $\sqrt{s} = 13$ TeV. The signature is characterized by large missing transverse momentum, a single isolated lepton, multiple jets, and at least one b-tagged jet. No significant excesses above the SM backgrounds are observed. The results are interpreted in the framework of simplified models that describe natural SUSY scenarios. For gluino pair production followed by the three-body decay $\tilde{g} \to t\bar{t}\tilde{\chi}_1^0$ (T1tttt model), gluinos with masses below 1.9 TeV are excluded at 95% confidence level for neutralino masses up to about 1 TeV. For the two-body gluino decay $\tilde{g} \to \tilde{t}_1\bar{t}$ with $\tilde{t}_1 \to t\tilde{\chi}_1^0$ (T5tttt model), the results are generally similar, except at low neutralino masses, where the excluded gluino mass is somewhat lower. These results extend previous gluino mass limits by about 300 GeV and are among the most stringent constraints on these simplified models of SUSY to date.
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies: BMWFW and FWF
[5] D. V. Volkov and V. P. Akulov, Possible universal neutrino interaction, JETP Lett. 16, 438 (1972).
... G 28, 2693 (2002).