Search for supersymmetry in pp collisions at sqrt(s)=13 TeV with 137 fb-1 in final states with a single lepton using the sum of masses of large-radius jets

Results are reported from a search for new physics beyond the standard model in proton-proton collisions in final states with a single lepton; multiple jets, including at least one jet tagged as originating from the hadronization of a bottom quark; and large missing transverse momentum. The search uses a sample of proton-proton collision data at √ s = 13 TeV, corresponding to 137 fb−1, recorded by the CMS experiment at the LHC. The signal region is divided into categories characterized by the total number of jets, the number of bottom quark jets, the missing transverse momentum, and the sum of masses of large-radius jets. The observed event yields in the signal regions are consistent with estimates of standard model backgrounds based on event yields in the control regions. The results are interpreted in the context of simplified models of supersymmetry involving gluino pair production in which each gluino decays into a top quark-antiquark pair and a stable, unobserved neutralino, which generates missing transverse momentum in the event. Scenarios with gluino masses up to about 2150 GeV are excluded at 95% confidence level (or more) for neutralino masses up to 700 GeV. The highest excluded neutralino mass is about 1250 GeV, which holds for gluino masses around 1850 GeV. Submitted to Physical Review D c © 2019 CERN for the benefit of the CMS Collaboration. CC-BY-4.0 license ∗See Appendix A for the list of collaboration members ar X iv :s ub m it/ 29 33 36 2 [ he pex ] 1 8 N ov 2 01 9


Introduction
The physics program of the CMS experiment at the CERN LHC [1] is designed to explore the TeV energy scale and to search for new particles and phenomena beyond the standard model (SM), for example, those predicted by supersymmetry (SUSY) [2][3][4][5][6][7][8][9]. The search described here focuses on an important experimental signature that is also strongly motivated by SUSY phenomenology. This signature includes a single lepton (an electron or a muon); several jets, arising from the hadronization of energetic quarks and gluons; at least one b-tagged jet, indicative of processes involving third-generation quarks; and p miss T , the missing momentum in the direction transverse to the beam. A large value of p miss T ≡ | p miss T | can arise from the production of high-momentum, weakly interacting particles that escape detection. Searches for SUSY in the single-lepton final state have been performed by both ATLAS and CMS in proton-proton (pp) collisions at √ s = 7, 8, and 13 TeV [10][11][12][13][14][15][16][17][18][19].
This paper describes the single-lepton SUSY search based on the mass variable M J , defined as the sum of the masses of large-radius jets in the event, as well as on several other kinematic variables. The search uses the combined CMS Run 2 data sample from 2016, 2017, and 2018, corresponding to a total integrated luminosity of approximately 137 fb −1 at √ s = 13 TeV. The analysis is based on well-tested methods described in detail in two published studies [17,19], which used Run 2 data samples of 2.3 fb −1 (2015) and 35.9 fb −1 (2016). In this version of the analysis, the signal and control region definitions have been reoptimized to take advantage of the significant increase in the size of the data sample and to maximize the analysis sensitivity to the SUSY models considered. Other improvements, such as a more advanced b tagging algorithm, have also been incorporated into the analysis.
In models based on SUSY, new particles are introduced such that all fermionic (bosonic) degrees of freedom in the SM are paired with corresponding bosonic (fermionic) degrees of freedom in the extended, supersymmetric theory. The discovery of a Higgs boson at a low mass [20-25] highlighted a key motivation for SUSY, referred to as the gauge hierarchy problem [26][27][28][29][30][31]. Assuming that the Higgs boson is a fundamental (noncomposite) spin-0 particle, its mass is subject to large quantum loop corrections, which would drive the mass up to the cutoff scale of validity of the theory. If the SM is valid up to the Planck scale associated with quantum gravity, these corrections would be enormous. Stabilizing the Higgs boson mass at a low value, without invoking extreme fine tuning of parameters to cancel the corrections, is a major theoretical challenge, which can be addressed in so-called natural SUSY models [32][33][34][35][36]. In such models, several of the SUSY particles are constrained to be light [35]: the top squarks, t L and t R , which have the same electroweak gauge couplings as the left-(L) and right-(R) handed top quarks, respectively; the bottom squark with L-handed couplings ( b L ); the gluino ( g); and the Higgsinos ( H). In SUSY models that conserve R-parity-a multiplicative quantum number equal to +1 for SM particles and −1 for their SUSY partners [37,38]-SUSY particles must be produced in pairs and each SUSY particle decay chain must terminate in the production of the lightest supersymmetric particle (LSP). The LSP is therefore stable and, if weakly interacting, can in principle account for some or all of the astrophysical dark matter [39][40][41].
Motivated by the naturalness-based expectations that both the gluino and the top squark should be relatively light, we search for gluino pair production with decays to either off-or on-massshell top squarks. Furthermore, gluino pair production has a large cross section relative to most other SUSY pair-production processes, for a fixed SUSY particle mass. Each gluino is assumed to decay via the process g → t 1 t (or the conjugate final state), where the top squark mass eigenstate, t 1 , is the lighter of the two physical superpositions of t L and t R . Depending on the mass spectrum of the model, the top squark can be produced either on or off mass shell, and it is as-  Figure 1: Gluino pair production and decay for the simplified models T1tttt (left) and T5tttt (right). In T1tttt, the gluino undergoes a three-body decay g → tt χ 0 1 via a virtual intermediate top squark. In T5tttt, the gluino decays via the sequential two-body process g → t 1 t, t 1 → t χ 0 1 . Because gluinos are Majorana fermions, each one can decay to t 1 t and to the charge conjugate final state t 1 t. The filled circle represents the sum of processes that can lead to gluino pair production.
sumed to decay with 100% branching fraction via t 1 → t χ 0 1 , where χ 0 1 is a neutralino LSP. The neutralino is an electrically neutral mixture of the neutral Higgsinos and electroweak gauginos. Because the χ 0 1 is weakly interacting, it would traverse the detector without depositing energy, much like a neutrino. As a consequence, neutralino production typically generates an apparent imbalance in the total transverse momentum of the event, p miss T , which is a priori known to be essentially zero, apart from detector resolution effects and missing momentum carried by weakly interacting particles (e.g., neutrinos) or particles outside the detector acceptance.
Diagrams showing gluino pair production with decays to off-mass-shell and on-mass-shell top squarks are shown in Fig. 1 and are denoted as T1tttt and T5tttt, respectively, in the context of simplified models [42][43][44][45]. Such models, which include only a small subset of the full SUSY particle spectrum, are often used to quantify the results of searches, in spite of limitations in describing the potential complexities associated with a complete spectrum. The diagram for the T1tttt model does not explicitly show the off-mass-shell top squark, but the fundamental gluino decay vertex for both T1tttt and T5tttt models is the same. Thus, regardless of whether the top squark is produced on or off mass shell, each gluino ultimately decays via the process g → tt χ 0 1 , so signal events would contain a total of four top quarks and two neutralinos. The final states for both T1tttt and T5tttt are characterized by a large number of jets, four of which are b jets from top quark decays. Depending on the decay modes of the accompanying W bosons, a range of lepton multiplicities is possible. We focus here on the single-lepton final state, where the lepton is either an electron or a muon, and a background estimation method specifically designed for this final state is a critical part of the analysis. Events from the extreme tails of the kinematic distributions for tt events can have properties that closely resemble those of signal events, including the presence of large p miss T generated by the neutrino from a leptonic W boson decay. Initial-state radiation (ISR) from strong interaction processes can enhance the jet multiplicity, producing another characteristic feature of signal events. Quantifying the effects of ISR is a critical element of the analysis.
The signature used here to search for the processes shown in Fig. 1 is characterized not only by the presence of an isolated high transverse momentum (p T ) lepton, multiple jets, at least one b-tagged jet, and large p miss T , but also by additional kinematic variables. The first of these is m T , defined as the transverse mass of the system consisting of the lepton and the p miss T in the event. Apart from resolution and small effects from off-mass-shell W boson production, m T is bounded above by m W for events with a single leptonically decaying W boson, and this variable is effective in suppressing the otherwise dominant single-lepton tt background, as well as background from W+jets events.
Largely because of the effectiveness of the m T variable in helping to suppress the single-lepton tt background, the residual background in the signal regions arises predominantly from a single SM process, dilepton tt production. In such background events, both W bosons from the t → bW decays produce leptons, but only one of the two leptons satisfies the leptonidentification criteria, as well as the requirements on the p T , pseudorapidity (η), and isolation from other energetic particles in the event. This background includes tt events in which one or both of the W bosons decay into a τ lepton and its neutrino, provided that the subsequent τ decays produce a final state containing exactly one electron or muon satisfying the lepton selection requirements.
A second key kinematic variable, M J , is defined as the scalar sum of the masses of large-radius jets in the event. This quantity is used both to characterize the mass scale of the event, providing discrimination between signal and background, and as a key part of the background estimation. A property of M J exploited in this analysis is that, for tt events with large jet multiplicity, this variable is nearly uncorrelated with m T . As a consequence, the M J background shape at high m T , which includes the signal region, can be measured to a very good approximation using the corresponding M J shape in a low-m T control sample. The quantity M J was first discussed in phenomenological studies, for example, in Refs. [46][47][48]. Similar variables have been used by ATLAS for SUSY searches in all-hadronic final states using 8 TeV data [49,50]. Studies of basic M J properties and performance in CMS have been presented using early 13 TeV data [51].
Because the signal processes would populate regions of extreme tails of the SM distributions, it is important to determine the background in a manner that accounts for features of the detector behavior and of the SM backgrounds that may not be perfectly modeled in the simulated (Monte Carlo, MC) event samples. The background estimation method is constructed such that corrections derived from MC samples enter only at the level of double ratios of event yields, rather than as single ratios. This approach helps to control the impact of potential mismodeling on the background prediction because of the cancellation of many mismodeling effects. Systematic uncertainties in these predicted double ratios are obtained by performing tests using data control samples in regions that are kinematically similar but have only a very small potential contribution from signal events. This paper is organized as follows. Sections 2 and 3 describe the simulated event samples and the CMS detector, respectively. Section 4 discusses the triggers used to collect the data, the event reconstruction methods, and the definitions of key quantities used in the analysis. The event selection and analysis regions are presented in Section 5, and the methodology used to predict the SM background is presented in Section 6. Section 7 summarizes the systematic uncertainties in the background predictions. Section 8 presents the event yields observed in the signal regions, the corresponding background predictions, the uncertainties associated with the signal efficiencies, and the resulting exclusion regions for the gluino pair-production models considered. Finally, the main results are summarized in Section 9.
Throughout this paper, two T1tttt benchmark models are used to illustrate typical signal behavior. The T1tttt(2100,100) model, with masses m( g ) = 2100 GeV and m( χ 0 1 ) = 100 GeV, corresponds to a scenario with a large mass splitting between the gluino and the neutralino. This mass combination probes the sensitivity of the analysis to a low cross section (0.59 fb) process that has a hard p miss T distribution, which results in a relatively high signal efficiency. The T1tttt(1900,1250) model, with masses m( g ) = 1900 GeV and m( χ 0 1 ) = 1250 GeV, corresponds to a scenario with a relatively small mass splitting (referred to as a compressed spectrum) between the gluino and the neutralino. Here the cross section is higher (1.6 fb) because the gluino mass is lower than for the T1tttt(2100,100) model, but the sensitivity suffers from a low signal efficiency due to the soft p miss T distribution. Finally, to model the presence of additional pp collisions from the same or adjacent bunch crossing as the primary hard scattering process (pileup interactions), the simulated events are overlaid with multiple minimum bias events (generated with the PYTHIA 8.2 generator), such that the minimum bias event multiplicity in simulation matches that observed in data.

3 CMS detector
The central feature of the CMS detector is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter, and a brass and scintillator hadron calorimeter. Each of these systems is composed of a barrel and two endcap sections. The tracking detectors cover the pseudorapidity range |η| < 2.5. For the electromagnetic and hadronic calorimeters, the barrel and endcap detectors together cover the range |η| < 3.0. Forward calorimeters extend the coverage to 3.0 < |η| < 5.0. Muons are measured and identified in both barrel and endcap systems, which together cover the pseudorapidity range |η| < 2.4. The detection planes are based on three technologies: drift tubes, cathode strip chambers, and resistive plate chambers, which are embedded in the steel flux-return yoke outside the solenoid. The detector is nearly hermetic, permitting accurate measurements of p miss T . Events of interest are selected using a two-tiered trigger system [73]. The first level (L1), composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100 kHz within a time interval of less than 4 µs. The second level, known as the high-level trigger, consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to around 1 kHz before data storage. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [74].

Trigger requirements and event reconstruction
The data sample used in this analysis was obtained with the logical OR of event triggers that require either missing transverse momentum larger than 100-120 GeV, or a single lepton with p T greater than 24-32 GeV, or a single lepton with p T > 15 GeV accompanied by transverse hadronic energy greater than 350-400 GeV, where the exact thresholds depended on the instantaneous luminosity. The triggers based on missing transverse momentum quantities alone, without a lepton requirement, have high asymptotic efficiency (about 99%), but they only reach the efficiency plateau for p miss T larger than 250-300 GeV. The single-lepton triggers are therefore included to ensure high efficiency at lower values of p miss T , and they bring the analysis trigger efficiency up to nearly 100% for p miss T > 200 GeV.
The total trigger efficiency has been studied as a function of the analysis variables N jets , N b , M J , and m T , defined later in this section, in the region with p miss T > 200 GeV. The efficiency is close to 100% and is found to be uniform with respect to these analysis variables over the three years of data taking. The systematic uncertainty in the trigger efficiency is estimated to be 0.5%. Event reconstruction proceeds from particles identified by the particle-flow (PF) algorithm [75], which uses information from the tracker, calorimeters, and muon systems to identify PF candidates as electrons, muons, charged or neutral hadrons, or photons. Charged-particle tracks are required to originate from the event primary pp interaction vertex, defined as the candidate vertex with the largest value of summed physics-object p 2 T . The physics objects used in this calculation are the jets, clustered using the anti-k T jet finding algorithm [76,77] with the tracks assigned to candidate vertices as inputs, and the associated missing transverse momentum, taken as the negative vector sum of the p T of those jets.
Electrons are reconstructed by associating a charged-particle track with electromagnetic calorimeter superclusters [78]. The resulting electron candidates are required to have p T > 20 GeV and |η| < 2.5, and to satisfy identification criteria designed to reject light-parton jets, photon conversions, and electrons produced in the decays of heavy-flavor hadrons. Muons are reconstructed by associating tracks in the muon system with those found in the silicon tracker [79]. Muon candidates are required to satisfy p T > 20 GeV and |η| < 2.4.
To preferentially select leptons that originate in the decay of W bosons, and to suppress backgrounds in which the leptons are produced in the decays of hadrons containing heavy quarks, leptons are required to be isolated from other PF candidates. Isolation is quantified using an optimized version of the "mini-isolation" variable originally suggested in Ref. [80]. The isolation I mini is calculated by summing the transverse momentum of the charged hadrons, neutral hadrons, and photons within ∆R ≡ √ (∆φ) 2 + (∆η) 2 < R 0 of the lepton momentum vector p T , where φ is the azimuthal angle in radians and R 0 is given by 0.2 for p T ≤ 50 GeV, (10 GeV)/p T for 50 < p T < 200 GeV, and 0.05 for p T ≥ 200 GeV. Electrons (muons) are then required to satisfy I mini /p T < 0.1 (0.2).
Jets are reconstructed by clustering charged and neutral PF candidates using the anti-k T algorithm [76] with a distance parameter of R = 0.4, as implemented in the FASTJET package [77]. Jets are corrected using a p T -and η-dependent jet energy calibration [81], and the estimated energy contribution to the jet p T from pileup [82] is subtracted. Jets are then required to satisfy p T > 30 GeV and |η| < 2.4, as well as jet identification criteria [81]. Finally, jets that have PF constituents matched to an isolated lepton are removed from the jet collection. The number of jets satisfying these requirements is a key quantity in the analysis and is denoted N jets .
Jets are "tagged" as originating from the hadronization of b quarks using the deep combined secondary vertex (DeepCSV) algorithm [83]. For the medium working point chosen here, the signal efficiency for identifying b jets with p T > 30 GeV in tt events is about 68%. The probability to misidentify jets in tt events arising from c quarks is approximately 12%, while the probability to misidentify jets associated with light-flavor quarks or gluons as b jets is approximately 1%. The number of b-tagged jets is another key quantity in the analysis and is denoted N b .
The analysis also makes use of large-radius (large-R) jets, denoted generically with the symbol J. These jets are constructed by clustering the standard small-R (R = 0.4) jets described above, as well as isolated leptons, into large-R (R = 1.4) jets using the anti-k T algorithm. Starting the clustering from small-R jets takes advantage of the corrections that are applied to these jets. The masses, m(J i ), of the individual large-R jets reflect the p T spectrum and multiplicity of the clustered objects, as well as their angular spread. By summing the masses of all large-R jets in an event, we obtain the variable M J , which is central to the analysis method: For tt events with a small contribution from ISR, the distribution of M J has an approximate cutoff at 2m t [17]. Thus, in the absence of ISR, the requirement M J > 2m t is expected to remove most of the tt background. In contrast, the M J distribution for signal events typically extends to larger values of M J because of the presence of more than two top quarks in the decay chain. However, as discussed in Refs. [17,19], the presence of a significant amount of ISR in a subset of tt background events generates a tail at large values of M J , and understanding this effect is critical for estimating the remaining background in the analysis.
The missing transverse momentum, p miss T , is defined as the negative vector sum of the transverse momenta of all PF candidates and is calibrated taking into account the jet energy corrections. Dedicated event filters designed to reject instrumental noise are applied to further improve the correspondence between the reconstructed and the genuine p miss T [84,85].
To suppress backgrounds characterized by the presence of a single W boson decaying leptonically, and without any other significant source of p miss T apart from the neutrino from this process, we use the quantity m T , defined as the transverse mass of the system consisting of the lepton and the missing transverse momentum vector, where ∆φ , p miss T is the difference between the azimuthal angles of p T and p miss T . For both tt events with a single leptonic W decay, and for W+jets events with leptonic W boson decay, the m T distribution peaks strongly below the W boson mass.
Although the event selection requires exactly one identified isolated lepton, backgrounds can still arise from processes in which two leptons are produced but only one satisfies the identification and isolation criteria. The dominant contribution to this type of background arises from tt events with two leptonic W boson decays, including W decays involving τ leptons, which can themselves decay into hadrons, electrons, or muons. To help suppress such dilepton backgrounds, events are vetoed that contain a broader category of candidates for the second lepton, referred to as veto tracks, which do not satisfy the stringent lepton identification requirements. These include two categories of charged-particle tracks: isolated leptons satisfying looser identification criteria than lepton candidates, as well as a relaxed momentum requirement, p T > 10 GeV, and isolated charged-hadron PF candidates, which must satisfy p T > 15 GeV. For example, isolated charged hadrons can arise in τ decays. For either category, the charge of the veto track must be opposite to that of the identified lepton candidate in the event. To maintain a high selection efficiency for signal events, lepton veto tracks must also satisfy a requirement on the quantity m T2 [86,87], where v refers to the veto track. The minimization is taken over all possible pairs of momenta p 1 and p 2 that sum to the p miss T . For the dominant background, tt, if the lepton, the veto track, and the missing transverse momentum all result from a pair of leptonically decaying W bosons, m T2 is bounded above by the W boson mass. We improve the signal efficiency by requiring m T2 < 80 GeV for loosely identified leptonic tracks and m T2 < 60 GeV for hadronic tracks.
Finally, we define S T as the scalar sum of the transverse momenta of all the small-R jets and all leptons passing the selection.

Event selection and analysis regions
Using the quantities and criteria defined in Section 4, events are selected that have exactly one isolated charged lepton (an electron or a muon), no veto tracks, M J > 250 GeV, S T > 500 GeV, p miss T > 200 GeV, and at least seven (six) small-R jets if p miss T ≤ 500 GeV (p miss T > 500 GeV). At least one of the jets must be tagged as originating from a bottom quark. After this set of requirements, referred to in the following as the baseline selection, more than 85% of the remaining SM background arises from tt production. The contributions from events with a single top quark or a W boson in association with jets are each about 4-5%, while the combined contribution from ttW and ttZ events is about 2%. The background from QCD multijet events after the baseline selection is very small. Approximately 40% of signal T1tttt events are selected with the single-lepton requirement only.

8
To improve the sensitivity to the signal and to provide a method for the background estimation, the events satisfying the baseline selection are divided into a set of signal and control regions in the M J -m T plane and in bins of p miss T , N jets , and N b . In each of the three p miss T regions, 200 < p miss T ≤ 350 GeV, 350 < p miss T ≤ 500 GeV, and p miss T > 500 GeV, the M J -m T plane is divided into six regions, referred to as R1, R2A, R2B, R3, R4A, and R4B, as shown in Fig. 2. The signal regions are R4A and R4B, while R1, R2A, R2B, and R3 serve as control regions. Potential signal contamination in the control regions is taken into account using a fit method described in Section 6. Regions denoted with the letter A are referred to as low M J , while regions denoted with the letter B are referred to as high M J . The control regions R1, R2A, and R3 are used to estimate the background in signal region R4A, while the control regions R1, R2B, and R3 are used to estimate the background in signal region R4B. (In discussions where the distinction between R2A and R2B, or between R4A and R4B, is irrelevant, we refer to these regions generically as R2 and R4.) As seen in Fig. 2, for each of the three regions in p miss T , the M J ranges for R2A and R4A (low M J ) and for R2B and R4B (high M J ) are The use of six regions in the M J -m T plane (in each bin of p miss T ) is an improvement over the original method used in Refs. [17,19], where only four regions were used: R1, R2 (combining R2A and R2B), R3, and R4 (combining R4A and R4B). The larger event yields in the full Run 2 data sample allow for this additional division of the M J -m T plane. By separating each of the original "high" M J regions into two bins, we are able to obtain additional sensitivity to SUSY models with large mass splittings, which tend to populate the highest M J regions with a significant number of events. In addition, the values of M J corresponding to the boundaries between these regions increase with p miss T , improving the expected precision in the background prediction.
Regions R2A, R2B, R4A, and R4B are further subdivided into bins of N jets and N b to increase sensitivity to the signal: • two N jets bins: N jets = 7 (6 ≤ N jets ≤ 7) for p miss T ≤ 500 GeV (p miss T > 500 GeV) and The total number of signal regions is therefore 3(p miss Given that the main background processes have two or fewer b quarks, the total SM contribution to the N b ≥ 3 bins is very small and is driven by the b jet mistag rate. Signal events in the T1tttt model are expected to populate primarily the bins with N b ≥ 2, while bins with N b = 1 mainly serve to test the method in a background dominated region. Because of the common use of R1 and R3 in the background estimations for R4A and R4B, as well as the integration over N jets and N b in the R1 and R3 regions, there are statistical correlations between the background predictions, which are taken into account in the fitting methodology (Section 6).

Background estimation method
The method for estimating the background yields in each of the signal bins takes advantage of the fact that the M J and m T distributions of background events with a significant amount  of ISR are largely uncorrelated and that there are background-dominated control samples that can be used to test the method and establish systematic uncertainties. Figure 3 shows the twodimensional distribution of simulated tt events in the variables M J and m T , with single-lepton and dilepton events shown with separate symbols. The three background-dominated regions (R1, R2, and R3) and the signal region (R4) are indicated. (For simplicity, the separate A and B regions for R2 and R4 are not shown in this figure.) The low-m T region, m T ≤ 140 GeV, is dominated by tt single-lepton events, and the rapid falloff in the number of such events as m T increases is apparent. In contrast, the high-m T region, m T > 140 GeV, is dominated by tt dilepton events. As discussed in Refs. [17,19], the M J distributions for the events in these two regions become nearly identical in the presence of significant ISR, which is enforced by the jet multiplicity requirements. This behavior allows us to measure the shape of the M J distribution at low m T with good statistical precision and then use it to obtain a background prediction in the high-m T region by normalizing it to the event yield in R3.
To estimate the background contribution in each of the signal bins, a modified version of an "ABCD" method is used. Here, the symbols A, B, C, and D refer to four regions in a two-dimensional space in the data, where one of the regions is dominated by signal and the other three are dominated by backgrounds. In a standard ABCD method, the background rate (µ bkg region ) in the signal region (in this case, either R4A or R4B) is estimated from the yields (N region ) in three control regions using where the labels of the regions correspond to those shown in Fig. 2. The background prediction is unbiased in the limit that the two variables that define the plane (in this case, M J and m T ) are uncorrelated. The effect of any residual correlation can be taken into account by multiplying these background predictions by correction factors κ A and κ B , R4 Figure 3: Distribution of simulated single-lepton tt events (dark-blue inverted triangles) and dilepton tt events (light-blue triangles) in the M J -m T plane after applying the baseline selection and requiring at least two b jets. A representative random sample of T1tttt events with m( g ) = 2100 GeV and m( χ 0 1 ) = 100 GeV is also shown for comparison (red squares). Each marker represents one expected event in the full data sample. Overflow events are placed on the edges of the plot. The values of the correlation coefficients ρ for each of the background processes are given in the legend. Region R4, which is further split into smaller bins R4A and R4B, is the nominal signal region, while R1, R2, and R3 serve as control regions. This plot is only illustrative, because the boundary between R1 and R2, as well as between R3 and R4, is p miss Tdependent. The line shown at 400 GeV corresponds to the value used for the lowest p miss T bin. Additional sensitivity is obtained by binning the events in p miss T , N jets , and N b .
which are double ratios obtained from simulated event samples: When the two ABCD variables are uncorrelated, or nearly so, the κ factors are close to unity. This procedure ignores potential signal contamination in the control regions, which is accounted for by incorporating the methods described above into a fit that includes both signal and background components.
In principle, this calculation to estimate the background can be performed for each of the 36 signal bins by applying this procedure in 36 independent ABCD planes. However, such an approach would incur large statistical uncertainties in some bins due to the small number of events in R3 regions. This problem is especially important in bins with a large number of jets, where the M J distribution shifts to higher values and the number of background events expected in R4 can even exceed the background in R3.
To alleviate this problem, we exploit the fact that, after the baseline selection, the background is dominated by a single source (tt events), and the shapes of the N jets distributions are nearly identical for the single-lepton and dilepton components, because of the large amount of ISR. As a result, the m T distribution is approximately independent of N jets and N b . More specifically, we find that for M J values corresponding to the R1 and R3 regions, the ratios of high-m T to low- m T event yields do not vary substantially between events with seven or more jets, and across N b within these N jets bins. We exploit this result by integrating the event yields in the low-M J regions (R1 and R3) over the N jets and N b bins for each p miss T bin. This procedure increases the statistical power of the ABCD method but also introduces a correlation among the predictions from Eq. (5) for the N jets and N b bins associated with a given p miss T bin. Figure 4 shows the values of the κ factors obtained from simulation (computed after integrating over N jets and N b in R1 and R3 only) for the 18 signal bins of the low-M J ABCD planes, i.e., R1-R2A-R3-R4A (left plot), and the 18 signal bins of the high-M J ABCD planes, i.e., R1-R2B-R3-R4B (right plot). These values are close to unity for the low-M J regions and are slightly above unity for the high-M J regions. The deviation from unity is due to the presence of mismeasured jets in single-lepton tt events, which produces a correlation between m T and M J . The additional p miss T arising from the jet mismeasurement allows these events to migrate from the low-m T to the high-m T region. Since the mismeasured p miss T is correlated with hadronic activity, these events typically also have larger M J values relative to well-reconstructed events. Consequently, their presence at high m T results in a difference between the shapes of the M J distributions for low-m T and high-m T events and thus results in a κ value larger than unity. In addition to the statistical uncertainties shown in Fig. 4, systematic uncertainties are obtained from studies of the modeling of the κ values in dedicated data control samples, including both a sample with high purity of dilepton tt events as well as a sample enriched in mismeasured single-lepton tt events, as discussed in Section 7.
The method described above is implemented with a maximum likelihood fit to the event yields observed in data using a likelihood function that incorporates both the statistical and systematic uncertainties in κ A and κ B . The fit also takes into account the correlations associated with the common R1 and R3 yields that arise from the integration over N jets and N b , and it accounts for the signal contamination in the control regions.
The signal strength is the only parameter that enters the likelihood in a way that extends across p miss T bins. We can therefore define the correlation model within each p miss T bin and then take the product over p miss T bins to construct the full likelihood function. Let µ bkg i,j,k be the estimated (Poisson) mean background in each region, where i indicates the p miss T bin, j ∈ S with S ≡ {R1, R2A, R2B, R3, R4A, R4B}, and k runs over the six N jets -N b bins. Then, in a given p miss T bin, the 26 background rates (one each for R1 and R3 and six each for R2A, R2B, R4A, and R4B) can be expressed in terms of 14 floating fit parameters θ (one each for R1 and R3 and six each for R2A and R2B) and the 12 correction factors κ (κ A and κ B for each of the six N jets and N b bins for a fixed p miss T bin) as Here, the i index for the three p miss T bins is suppressed, because it applies identically to all parameters in the equations. In addition, the k index over the N jets and N b bins is omitted for terms that are integrated over these quantities, i.e., for the parameters for the R1 and R3 regions. The quantity θ R3 /θ R1 is simply the ratio between the background event rates in regions R3 and R1, summed over N jets and N b . To obtain the prediction for the mean background, this ratio is then multiplied by the appropriate rate θ R2A,k or θ R2B,k and then corrected with the appropriate value κ A or κ B for the given bin in N jets and N b .
In Eq. (8), the L data ABCD,i terms account for the statistical uncertainty in the observed data yield in each bin, and the L MC sig,i terms account for the uncertainty in the signal shape, due to the finite size of the MC samples. The statistical uncertainties in the κ factors due to the finite size of the simulated background event samples are implemented as Gaussian constraints. The signal systematic uncertainties are incorporated in the likelihood function as log-normal constraints with a nuisance parameter for each uncorrelated source of uncertainty. These terms are not explicitly shown in the likelihood function above for simplicity.
The likelihood function defined in Eq. (8) is employed in two separate types of fits that provide complementary but compatible background estimates based on an ABCD model. The "R1-R3 fit" is used to test the agreement between the observed event yields (R4) and the predictions (based on R1, R2, and R3 event yields) under the null (i.e., the background-only) hypothesis. In this approach, we exclude the observations in the signal regions in the likelihood and fix the signal strength r to 0. This procedure involves as many unknowns as constraints. As a result, the estimated background rates in regions R1, R2, and R3 become simply the observed values in those bins, and we obtain predictions for the signal regions that do not depend on the observed N data R4 . The R1-R3 fit thus corresponds to the standard ABCD method with κ corrections, and the likelihood machinery becomes just a convenient way to solve the system of equations and propagate the various uncertainties.
In contrast, the "R1-R4 fit" also makes use of the observations in the signal regions, and it can therefore provide an estimate of the signal strength r, while also allowing for signal events to populate the control regions. We also use the R1-R4 fit with the constraint r = 0 to assess the agreement between the data and the background predictions in the null hypothesis.

Background systematic uncertainties
The background estimation procedure described in Section 6 relies on the approximate independence of the kinematic variables M J and m T , as well as on the ability of the simulation to correctly model any residual correlation, which would manifest as a departure of κ from unity. The approximate independence of M J and m T is a consequence of two key features of the data, namely, that the high-m T sample is composed primarily of dilepton tt events and that the M J spectra of tt events with one and two leptons become highly similar in the presence of ISR jets. A residual correlation of M J and m T can arise either from (i) contributions to the overall M J shape from backgrounds other than single-lepton tt at low m T and dilepton tt at high m T or from (ii) subleading kinematic effects that result in the gradual divergence of the single-lepton and dilepton M J shapes as a function of the analysis binning variables. As an example of (i), simulation studies show that the deviation of κ from unity for the high-M J ABCD planes, most pronounced at low p miss T , can arise from mismeasured single-lepton tt events that populate the high-m T region. A classification and study of such mechanisms was presented in Ref. [17]. Based on this understanding, the systematic uncertainties in the background estimate are obtained by quantifying the ability of the simulation to predict the behavior of κ in control samples in data with varying background composition and as a function of the analysis binning variables.

Control sample strategy
Two control samples are used to assess the ability of the simulation to reproduce the behavior of κ in the data: a 2 sample composed of events with two reconstructed leptons and a 1 , 5-6 jet sample composed of events with a single reconstructed lepton and either five or six jets.
Because it is composed primarily of tt dilepton events, the 2 control sample allows us to assess the validity of the main assertion of the analysis, namely that the shapes of the M J distributions for 1 and 2 tt events approximately converge at high jet multiplicities. The MC predictions for κ are tested independently as a function of N jets and p miss T using this control sample, because simulation studies show no significant correlation in the κ behavior as a function of these two variables. The dilepton control sample is not used to probe the modeling of κ as a function of N b , which is instead studied in the 5-6 jet control sample described below. Events in the dilepton control sample with N b ≥ 2 are excluded to avoid potential signal contamination.
Except for the case of the dilepton tt process, it is not possible to find useful control samples where a particular background category dominates. As a consequence, we cannot completely factorize the uncertainty in κ arising from mismodeling of the background composition and from mismodeling the m T -M J correlation for a particular background. However, we are able to define a control sample in which the background composition and kinematic characteristics are very similar to those in the signal regions, but in which the expected signal contribution is too small to significantly affect the data vs. simulation comparison. The single-lepton, 5-6 jet sample satisfies these requirements. Both the κ values for individual background categories and the composition of background processes are very similar to those for events with N jets ≥ 7. We therefore use this control sample to quantify mismodeling of κ arising either from detector mismeasurement effects (which can result in a larger fraction of single-lepton tt events at high m T ), or from mismodeling of the background composition. An N b -dependent uncertainty is derived from the lowest p miss T region (which is binned in N b ). Based on studies in simulation, any N b dependence is not correlated with p miss T within the statistical precision of the sample, and therefore the uncertainties derived in the low-p miss T region can be used for all p miss T bins. Since the low-p miss T bin has the highest contribution from events with p miss T mismeasurement, this uncertainty also provides an estimate of the uncertainty in the modeling of κ in the presence of mismeasurement that is valid over the full p miss T range. We have verified in simulation that artificially increasing the fraction of mismeasured events has a consistent effect across the bins in the single-lepton, 5-6 jet control sample and the corresponding signal bins, so this effect would be detected in a study of this control sample.

Dilepton control sample results
We construct an alternate ABCD plane in which the high-m T regions R3 and R4A/B are replaced with regions D3 and D4A/B, which are defined as having either two reconstructed leptons or one lepton and one isolated track. The new regions D3 and D4A/B are constructed to mimic the selection for R3 and R4A/B, respectively. For the events with two reconstructed leptons in D3 and D4A/B, the selection is modified as follows: the N jets bin boundaries are lowered by 1 to keep the number of large-R jet constituents the same as in the single-lepton samples; the m T requirement is not applied; and events with both N b = 0, 1 are included to increase the size of the event sample. The lepton-plus-track events in D3 and D4A/B are required to pass the same selection as those in R3 and R4 except for the track veto. With these requirements, the sample is estimated from simulation to consist of between 75-85% tt dilepton events, depending on the p miss T and N jets bin.
Using the dilepton control sample, we compute the values of κ in both simulation and data in the two N jets bins at low p miss T , and integrated over N jets in the intermediate-and high-p miss T bins. Figure 5 compares the κ values obtained from simulation with those observed in data in the dilepton control sample. We observe that these values are consistent within the total statistical uncertainties, and we therefore assign the statistical uncertainty in this comparison as the systematic uncertainty in κ as follows. We take the uncertainty associated with the N jets dependence of κ from the lowest p miss

Single-lepton, 5-6 jet control sample results
The single-lepton, 5-6 jet control sample (referred to simply as the 5-6 jet control sample) is constructed in a manner identical to the signal regions, except for the N jets requirement. The  Figure 6 compares the κ values obtained from simulation with those measured in the data. We find consistency between the simulation and the data except for a 3σ deviation in the 2 b-jet bin. Closer examination of distributions contributing to this κ value shows a higher yield in the region equivalent to R4A in the 5-6 jet control sample. Additional checks at 100 < p miss T ≤ 200 GeV for both 5-6 jet events and 7-jet events yield consistent κ values between the simulation and the data. These results, as well as studies of the shape of the N b distribution, suggest that this discrepancy observed in the 2 b jet bin at low M J is the result of a fluctuation. Nevertheless, we assign systematic uncertainties to cover potential mismodeling of κ as a function of N b , taking 10, 20, and 25% as the uncertainties for events with N b = 1, N b = 2, and N b ≥ 3, respectively. Table 1 shows the total symmetrized systematic uncertainties in the κ values used to compute the background yields for each signal bin, based on the uncertainties derived in the control samples. These uncertainties are obtained by combining the uncertainties under the assumption of no correlation between any N jets , N b , and p miss T dependence as follows,

Summary of systematic uncertainties in the background estimate
bin of the dilepton control sample; and finally, σ 2 mid p miss T and σ 2 high p miss T refer to the uncertainty as a function of p miss T , integrated in N jets and N b , derived in the dilepton control sample. Since the uncertainty as a function of N jets is derived in the low-p miss T bin of the dilepton sample, it already accounts for any mismodeling of the p miss T distribution at low p miss T , and thus no additional term is needed to account for such mismodeling in the first equation. Similarly, any mismodeling of the contribution of single-lepton mismeasured events at high m T is already folded into the σ 5-6j low p miss T , b term, and thus no additional uncertainty is needed to account for this. In practice, the three sources of uncertainty listed above are implemented in the likelihood as six log-normal constraints. A separate low-M J and high-M J nuisance parameter is assigned for each of the quantities σ SR low p miss T , j, b , σ SR mid p miss T , j, b , and σ SR high p miss T , j, b . The low-M J and high-M J nuisance parameters are decoupled, based on the observation that the background contributions for which κ > 1 have a p miss T dependence that is different at low M J and high M J . The total uncertainties with the full Run 2 data set are in the range 13 to 39%, increasing with p miss T . Figure 7 shows two-dimensional distributions of the data in the M J -m T plane after applying the baseline selection described in Section 5, with separate plots for the intermediate-and highp miss T bins. Both plots in the figure are integrated over N jets and N b ≥ 2 and hence do not represent the full sensitivity of the analysis. Each event in data is represented by a single filled circle. For comparison, the plots also show the expected total SM background based on simulation, as well as an illustrative sample of the simulated signal distribution for the T1tttt model with m( g ) = 2100 GeV and m( χ 0 1 ) = 100 GeV, plotted with one square per event, normalized to the same integrated luminosity as the data. This model has a large mass splitting between the gluino and the neutralino, and signal events typically have large values of p miss T . Qualitatively, the two-dimensional distribution of the data corresponds well to the expected distribution for the SM background events. The highest M J , highest p miss T region shows several simulated signal events for the T1tttt(2100, 100) model. However, only two observed events populate this region in the data.  Figure 7: Two-dimensional distributions in M J and m T for both data and simulated event samples, integrated over the N jets and N b ≥ 2 and shown separately for the 350 < p miss T ≤ 500 GeV bin (left) and the p miss T > 500 GeV bin (right). The black dots represent events in data, the colored histogram shows the total expected background yield per bin from simulation (not the actual predicted background), and the red squares correspond to a representative random sample of signal events drawn from the simulated distribution for the T1tttt model with m( g ) = 2100 GeV and m( χ 0 1 ) = 100 GeV for 137 fb −1 . Overflow events are shown on the edges of the plot.

Results and interpretation
The basic principle of the analysis is illustrated in Fig. 8, which compares, in three separate p miss T regions, the M J distributions for low-m T and high-m T data. The low-m T data correspond to regions R1, R2A, and R2B. Here, each event in R2A or R2B is weighted with the relevant κ factor, and then the total low-m T yield is normalized to the total high-m T yield in data. In the absence of signal, the shapes of these distributions should be approximately consistent, as observed. The low-and intermediate-p miss T regions (upper plots) show the background behavior with better statistical precision, while the high-p miss T region (bottom) has a higher sensitivity to the signal. For all three plots, integrals are performed over N jets and N b , as indicated in the legends. Figure 9 shows the observed event yields in all of the signal regions and bins of the analysis, the predicted backgrounds with their uncertainties obtained from the R1-R3 and R1-R4 fits, and the pulls associated with the fits. Both the R1-R3 and the R1-R4 fits are based on the null hypothesis, i.e., no signal contribution. We observe a broad pattern of consistency between the observed data and predicted backgrounds in the search regions and bins. Most of the pulls are less than one standard deviation (s.d.). The largest pull is −2.0 s.d. and occurs in the bin with M J > 650 GeV, 350 < p miss T ≤ 500 GeV, N jets = 7, and N b = 1.
Tables 2 (low M J ) and 3 (high M J ) present the same information as in Fig. 9, but in detailed numerical form, including the observed and fitted yields in regions R1-R3, as well as the expected signal yields for the two T1tttt benchmark model points. Again, the observed event yields in data are consistent with the background predictions.     Table 2: Observed and predicted event yields for the signal regions (R4A) and background regions (R1-R3) in the low-M J ABCD planes. For the R1-R3 fit, the values given for R1, R2A and R3 correspond to the observed yields in those regions. Expected yields for the two SUSY benchmark scenarios, T1tttt(2100, 100) and T1tttt (1900, 1250), are also given. The uncertainties in the prediction account for the amount of data in the control samples, the precision of κ from MC, and the systematic uncertainties in κ assessed from control samples in data.   Table 3: Observed and predicted event yields for the signal regions (R4B) and background regions (R1-R3) in the high-M J ABCD planes. For the R1-R3 fit, the values given for R1, R2B and R3 correspond to the observed yields in those regions. Expected yields for the two SUSY benchmark scenarios, T1tttt(2100, 100) and T1tttt(1900, 1250), are also given. The uncertainties in the prediction account for the amount of data in the control samples, the precision of κ from MC, and the systematic uncertainties in κ assessed from control samples in data.  Figure 9: Observed and predicted event yields in each signal region. The open rectangles represent the prediction and uncertainty obtained using event yields from regions R1, R2, and R3 only (R1-R3 fit), while the hashed rectangles represent the prediction obtained when all regions are included in the fit (R1-R4 fit). The labels 1b, 2b, and ≥3b refer to N b = 1, N b = 2, and N b ≥ 3 bins, respectively. In both cases, all statistical and systematic uncertainties are included. The bottom panel shows the pulls for both fits, defined as (N obs − N pred )/ The results are first interpreted in terms of the simplified model T1tttt of SUSY particle production. This model is characterized by just two mass parameters, m( g ) and m( χ 0 1 ). To determine which sets of masses, or mass points, are excluded by the data, we generate a set of simulated signal samples in which the mass parameters are varied across the range to which the analysis is potentially sensitive. These samples are used to determine the number of events that would be expected at each mass point, given the theoretical production cross section for this point. To assess which model points can be excluded by the data, it is necessary to evaluate the systematic uncertainties associated with the expected number of observed signal events.
Systematic uncertainties in the expected signal yields account for uncertainties in the trigger, lepton identification, jet identification, and b tagging efficiencies in simulated events; uncertainties in the distributions of p miss T , number of pileup vertices, and ISR jet multiplicity; and uncertainties in the jet energy corrections, renormalization and factorization scales, and integrated luminosity [88][89][90]. Each systematic uncertainty is evaluated in each of the analysis bins separately, and the uncertainties are treated as symmetric log-normal distributions. In the case that the sizes of up and down variations are not the same, the variation having larger absolute value is taken. If the sign of variations changes bin-by-bin, the correlation between bins is preserved, while the value that has the larger absolute variation is taken. A summary of the magnitude of the uncertainty due to each systematic source across sensitive signal bins for each of the two signal benchmark points is shown in Table 4.
An upper limit on the production cross section at 95% confidence level (CL) is estimated using the modified frequentist CL s method [91][92][93], with a one-sided profile likelihood ratio test statistic, using an asymptotic approximation [94]. The statistical uncertainties from data counts in the control regions are modeled by Poisson terms. All systematic uncertainties are multiplicative and are treated as log-normal distributions. Exclusion limits are also estimated for ±1 s.d. variations on the production cross section based on the approximate NNLO+NNLL calculation. Figures 10 and 11 show the corresponding excluded cross section regions at 95% CL for the T1tttt and T5tttt models, respectively, in the m( g )-m( χ 0 1 ) plane. These regions correspond to excluded cross sections under the assumption that the branching fraction for the given process is 100%. For T1tttt, gluinos with masses of up to approximately 2150 GeV are excluded for χ 0 1 masses up to about 700 GeV. The highest limit on the χ 0 1 mass is approximately 1250 GeV, attained for m( g ) of about 1700-1900 GeV. The observed limits for T1tttt are within the 1σ uncertainty of the expected limits across the full scan range.
The T5tttt model allows us to extend the interpretation of the results to scenarios in which the top squark is lighter than the gluino. Rather than considering a large set of models with independently varying top squark masses, we consider the extreme case in which the top squark has approximately the smallest mass consistent with two-body decay, m( t ) ≈ m(t) + m( χ 0 1 ), for a range of gluino and neutralino masses. The decay kinematics for such extreme, compressed mass spectrum models correspond to the lowest signal efficiency for given values of m( g ) and m( χ 0 1 ), because the top quark and the χ 0 1 are produced at rest in the top squark frame. As a consequence, the excluded signal cross section for fixed values of m( g ) and m( χ 0 1 ) and with m( g ) > m( t 1 ) ≥ m(t) + m( χ 0 1 ) is minimized for this extreme model point. In particular, at low m( χ 0 1 ) the neutralino carries very little momentum, thus reducing the value of m T , and resulting in significantly lower sensitivity for T5tttt than T1tttt. In this kinematic region, the sensitivity to the signal is in fact dominated by the events that have at least two leptonic W boson decays, which produce additional p miss T , as well as a tail in the m T distribution. Although such dilepton events are nominally excluded in the analysis, a significant number of these signal events escape the dilepton veto.
For physical consistency, the signal model used in the T5tttt study should include not only gluino pair production, but also direct top squark pair production, t t, referred to as T2tt. For m( χ 0 1 ) < 33 GeV and 100 < m( χ 0 1 ) < 550 GeV, with m( t ) − m( χ 0 1 ) = 175 GeV, the T2tt model 23 is excluded in direct searches for t t production [72,95]. For 33 < m( χ 0 1 ) < 100 GeV, the T2tt model is not excluded due to the difficulty in assessing the rapidly changing acceptance with the finite event count available in simulation. We have verified that for m( χ 0 1 ) > 550 GeV, where the T2tt model remains unexcluded, adding the contribution from the T2tt process to our analysis regions does not have a significant effect on the sensitivity. For simplicity, in Fig. 11, we have based the exclusion curve on T5tttt only, without including the additional T2tt process.
As with all SUSY particle mass limits obtained in the context of simplified models, it is important to recognize that the results can be significantly weakened if the assumptions of the model fail to hold. In particular, the presence of alternative decay modes could reduce the number of expected events for the given selection. However, cross section limits remain valid if they are interpreted as limits on cross section multiplied by the branching fraction for the assumed decay mode.
Observed ( Figure 10: Interpretation of the results in the T1tttt model. The colored regions show the upper limits (95% CL) on the production cross section for pp → g g, g → tt χ 0 1 in the m( g )-m( χ 0 1 ) plane. The curves show the expected and observed limits on the corresponding SUSY particle masses obtained by comparing the excluded cross section with theoretical cross sections.

Summary
A search is performed for an excess event yield above that expected for standard model processes using a data sample of proton-proton collision events with an integrated luminosity of 137 fb −1 at √ s = 13 TeV. The experimental signature is characterized by a single isolated lepton, multiple jets, at least one b-tagged jet, and large missing transverse momentum. No significant excesses above the standard model backgrounds are observed. The results are interpreted in the framework of simplified models that describe natural supersymmetry scenarios. For gluino pair production followed by the three-body decay g → tt χ 0 1 (T1tttt model), gluinos with masses below about 2150 GeV are excluded at 95% confidence level for neutralino masses up to 700 GeV. The highest excluded neutralino mass is about 1250 GeV. For the two-body gluino decay g → t 1 t with t 1 → t χ 0 1 (T5tttt model), the results are generally similar, except at low neutralino masses, where the excluded gluino mass is somewhat lower. These results extend previous gluino mass limits [19] from this search by about 250 GeV, due to both the data sample increase and the analysis reoptimization enabled by it. These mass limits are among the most stringent constraints on this supersymmetry model to date. [11] ATLAS Collaboration, "Search for squarks and gluinos in events with isolated leptons, jets and missing transverse momentum at √ s = 8 TeV with the ATLAS detector", JHEP 04 (2015) 116, doi:10.1007/JHEP04(2015)116, arXiv:1501.03555.
[19] CMS Collaboration, "Search for supersymmetry in pp collisions at √ s = 13 TeV in the single-lepton final state using the sum of masses of large-radius jets", Phys. Rev. Lett