Search for top squark production in fully hadronic final states in proton-proton collisions at ﬃﬃ s p = 13 TeV

A search for production of the supersymmetric partners of the top quark, top squarks, is presented. The search is based on proton-proton collision events containing multiple jets, no leptons, and large transverse momentum imbalance. The data were collected with the CMS detector at the CERN LHC at a center-of-mass energy of 13 TeV, and correspond to an integrated luminosity of 137 fb − 1 . The targeted signal production scenarios are direct and gluino-mediated top squark production, including scenarios in which the top squark and neutralino masses are nearly degenerate. The search utilizes novel algorithms based on deep neural networks that identify hadronically decaying top quarks and W bosons, which are expected in many of the targeted signal models. No statistically significant excess of events is observed relative to the expectation from the standard model, and limits on the top squark production cross section are obtained in the context of simplified supersymmetric models for various production and decay modes. Exclusion limits as high as 1310 GeVare established at the 95% confidence level on the mass of the top squark for direct top squark production models, and as high as 2260 GeV on the mass of the gluino for gluino-mediated top squark production models. These results represent a significant improvement over the results of previous searches for supersymmetry by CMS in the same final state.


I. INTRODUCTION
The standard model (SM) of particle physics correctly predicts a wide range of phenomena. Nonetheless, the SM has well-known shortcomings, such as an instability in the calculation of higher-order corrections to the Higgs boson mass, known as the fine-tuning (or hierarchy) problem [1]. There is also an abundance of experimental observations, including the existence of dark matter, that cannot be explained within the context of the SM alone [2]. Supersymmetry (SUSY) [3][4][5][6][7][8] is an extension of the SM that could help explain some of these shortcomings by introducing an additional symmetry between the fermions and the bosons. As a result, it predicts a supersymmetric partner particle (superpartner) for each SM particle. The quantum numbers for each superpartner are the same as the quantum numbers for the corresponding SM particle with the exception of the spin, which differs by a half-integer unit. The superpartners of quarks, gluons, and Higgs bosons are squarksq, gluinosg, and Higgsinos, respectively. The neutral and charged Higgsinos mix with the superpartners of the neutral and charged electroweak gauge bosons to form neutralinosχ 0 and charginosχ AE .
Divergent quantum loop corrections to the Higgs boson mass due to virtual SM particles can be canceled by corresponding contributions from virtual SUSY particles [9,10], which may resolve the fine-tuning problem. The symmetry proposed by SUSY is not exact, as no SUSY particles have been observed yet and they must therefore be more massive than their SM counterparts. The stabilizing features of SUSY can still survive if SUSY particles are not much heavier than their SM counterparts. Superpartners of third-generation quarks play particularly important roles in this consideration, as the third-generation quarks and squarks have large couplings to the Higgs boson, and therefore produce the largest corrections. In so-called natural models of SUSY [11][12][13], the third-generation squarks (top squarks and bottom squarks), gluino, and Higgsinos are expected to have masses no larger than a few TeV [14][15][16]. At the same time, null results from SUSY searches at the CERN LHC so far also suggest that the first two generation squarks have much larger masses [17] and are expected to be decoupled at the LHC energy. These considerations provide strong motivations to search for top squark production at the LHC.
In SUSY models with R-parity conservation [18], SUSY particles are produced in pairs, and the lightest supersymmetric particle (LSP) is stable. The lightest neutralinõ χ 0 1 is assumed to be the LSP, which would be a good candidate for weakly interacting massive particle dark matter. Theχ 0 1 interacts only weakly, so it does not leave a signal in the CMS detector. Because theχ 0 1 is present in the decay chain of top squarks, it provides a powerful experimental probe for signal events: a large momentum imbalance in the plane perpendicular to the beam axis.
The top quark and W boson tagging algorithms identify hadronically decaying top quarks and W bosons produced in SUSY particle decay chains. At high momentum, the decay products of a hadronically decaying top quark or W boson tend to merge into a single large-radius jet. At lower momentum, hadronic top quark decays can be resolved as three smaller-size jets. Separate algorithms are employed to benefit from these two different classes of decay product kinematic properties. Previous top squark searches already used hadronic top quark and W boson tagging algorithms, which were based on jet properties and decision trees [40,41]; however, the ones utilized in the present search benefit from the use of deep neural networks [47,49,50]. These tagging algorithms are critical for improving the sensitivity of the search to models with on-shell top quarks and W bosons in the final state. Separate search bins with different top quark and W boson multiplicities help to maintain high sensitivities for both direct and gluinomediated top squark production scenarios with different decay modes.
For SUSY models with compressed mass spectra, i.e., models with a small mass difference between the top squark and the LSP, the search benefits from dedicated search bins that require a high transverse momentum (p T ) jet originating from initial-state radiation (ISR). Signal events from models with compressed mass spectra generally leave little visible energy in the detector, making such events difficult to identify. These events can still be detected if the top squark pair recoils against a high-p T jet arising from ISR. Some of the models with compressed mass spectra will yield low-p T bottom quarks in the final states. The search utilizes an algorithm to identify jets originating from the hadronization of b quarks (b jets) [49]. To improve the sensitivity to models with compressed mass spectra, we also employ another algorithm that is specifically optimized for the identification of low-p T b quarks (soft b quarks) [40].
The paper is organized as follows. Section II describes the signal models considered in this search. Sections III and IV discuss the CMS detector and the simulated data samples used in this analysis. The event reconstruction and event selection procedures are presented in Secs. V and VI, respectively. The background prediction methods are described in Sec. VII. Results and interpretations are detailed in Sec. VIII and a summary is presented in Sec. IX.

II. SIGNAL MODELS
Top squark pairs may be produced in many different SUSY models. In any given SUSY model, the mechanisms by which top squark pairs are produced depend on the parameters of the model. In this search, we target production scenarios that are motivated by natural SUSY, in which R parity is conserved and the top squark is produced directly in pairs or in cascade decays of the gluino. The signal topologies are characterized by the so-called simplified model spectra [51][52][53][54] with a small number of parameters describing the masses of the SUSY particles. In the following paragraphs we describe the specific models, as well as our choices for the parameters of those models, that we use for the interpretations presented in Sec. VIII.
For direct top squark pair production, the models shown in Fig. 1 are considered. Depending on the specific details of the SUSY model and the mass hierarchy of the SUSY particles, the top squark decays in a variety of modes. In particular, the mass difference Δm between the top squarkt and the LSP has a large impact on the decay modes of the top squark. When Δm ¼ m˜t − m˜χ0 1 is larger than the W boson mass m W , the two decays considered in this search aret → t ðÃÞχ0 1 with t ðÃÞ → bW þ andt → bχ þ 1 with χ þ 1 → W þχ0 1 , as well as their charge-conjugate modes. The former decay mode corresponds to the model denoted by "T2tt," and the latter to the model denoted by "T2bW." In the mixed decay model "T2tb," the top squark decays in either of the above decay modes with a branching fraction of 50%. For T2tb the compressed mass spectrum of m˜χAE 1 − m˜χ0 1 ¼ 5 GeV is assumed, which is a likely scenario wheñ χ AE 1 andχ 0 1 belong to the same gauge eigenstate. For T2bW, theχ AE 1 mass is assumed to be the arithmetic mean of the top squark and LSP masses, as in earlier searches [40,47]. With this assumption of moderateχ AE 1 mass, theχ AE 1 decays to an on-shell W boson andχ 0 1 , and the W boson produces highmomentum objects in the final state. Sensitivity to the final states expected from these models is enhanced by the application of top quark and W boson taggers [50].
When Δm is smaller than m W , the decay of the top squark to an on-shell top quark or an on-shell W boson is kinematically forbidden. In such scenarios, the top squark may decay, as in the above T2tt and T2bW models, but via off-shell top quarks or W bosons. The models with these decay modes are denoted by "T2ttC" and "T2bWC," respectively, where "C" represents the compressed mass spectrum between the top squark and the LSP. Another possible decay mode is the loop-induced flavor-changing neutral-current processt → cχ 0 1 . The model with this decay mode is referred to as "T2cc." These models with small Δm are phenomenologically well motivated because of the compatibility between their prediction of the relic density of dark matter [55][56][57] and cosmological observations; however, signatures expected from these models are challenging to search for experimentally because of the lack of high-p T particles from such top squark decays. This search gains sensitivity to these models by requiring a high-p T jet expected to be from ISR, which gives higher p T to particles from top squark decays, and also by using a soft b quark identification algorithm.
For gluino pair production, the models shown in Fig. 2 are considered. In the model denoted "T1tttt," each of the pair-produced gluinos decays to an off-shell top squark and an on-shell top quark. The off-shell top squark decays to a top quark and the LSP. The gluino decay is thusg → ttχ 0 1 .
In the "T5ttcc" model, pair-produced gluinos each decay to a top quark and an on-shell top squark, and subsequently the top squark decays to a charm quark and the LSP. For this model, Δm ¼ 20 GeV is assumed, so the decay of the top squark to a top quark and the LSP is kinematically forbidden. In such cases, the top squark decayt → cχ 0 1 is expected to be one of the dominant decay modes as discussed above. The value of Δm has little effect on the final results for the T5ttcc model when it remains below m W . The T5ttcc model provides sensitivity to scenarios in which the top squark is kinematically unable to decay to an on-shell top quark and can cover the scenarios of high top squark masses beyond the reach of T2cc through the cascade decays of gluinos.

III. THE CMS DETECTOR
The CMS detector is a general-purpose particle detector surrounding the luminous region where protons from the LHC beams interact. A 3.8 T magnetic field is produced by a solenoid of 6 m internal diameter, within which are a silicon pixel and silicon strip tracking detector, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadronic calorimeter (HCAL). Each of these parts of the detector is composed of a cylindrical barrel section and two end cap sections. The pseudorapidity η coverage of the barrel and end cap detectors is extended by forward calorimeters which lie very close to the LHC beam line. Outside the solenoid, returning magnetic flux is guided through a steel return yoke. Gas-ionization detectors are sandwiched in between the layers of the return yoke and are used to detect muons. The events used in the search were collected in 2016-2018 using a two-tier trigger system: a hardware-based level-1 trigger and a softwarebased high-level trigger [58]. The integrated luminosities recorded in 2016, 2017, and 2018 are measured with uncertainties in the range of 2.3-2.5% [59][60][61]. The uncertainties in these measurements are mostly uncorrelated from year to year, resulting in a smaller uncertainty, 1.8%, in the total integrated luminosity, 137 fb −1 . The CMS detector is described in more detail, along with the coordinate system and basic kinematic variables, in Ref. [62].

IV. SIMULATED EVENT SAMPLES
Samples of simulated events produced via the Monte Carlo (MC) method are used to optimize selection criteria, estimate signal acceptances, and develop background estimation techniques.
Events arising from SM processes are simulated using a number of MC event generators. Samples of tt events, W þ jets events, Z þ jets events with Z → νν, Z=γ Ã ð→ l þ l − Þ þ jets events (DY þ jets), γ þ jets events, and quantum chromodynamics (QCD) multijet events containing solely jets produced through the strong interaction are simulated using the MadGraph5_aMC@NLO event generator at LO (versions 2.2.2 and 2.4.2). The tt events are generated with up to three additional partons in the matrix element calculations, while the W þ jets, Z þ jets, DY þ jets, and γ þ jets events are generated with up to four additional partons. Events containing a single top quark produced through the s channel, events containing a tt pair produced in association with a Z boson, a W boson, or a photon, and rare events such as those containing multiple electroweak or Higgs bosons (W, Z, γ, and H) are generated with MadGraph5_aMC@NLO (versions 2.2.2 and 2.4.2) at next-toleading order (NLO) [75]. The POWHEG v1.0 (v2.0) [76][77][78][79][80][81][82][83] program is used to simulate events at NLO from 2016 (2017 and 2018) containing a single top quark produced through the t and tW channels, as well as WW and ttH events. Events containing ZZ are generated at NLO with either POWHEG or MadGraph5_aMC@NLO depending on the decay mode, and WZ production is simulated with PYTHIA 8.226 (8.230) [84] for 2016 (2017 and 2018) at LO. Normalization of the simulated background samples is performed using the most accurate cross section calculations available [63,79,80,[85][86][87][88][89][90][91][92][93][94][95][96], which typically correspond to NLO or NNLO accuracy.
All simulated samples make use of the PYTHIA 8.226 (8.230) program for 2016 (2017 and 2018) to describe parton showering and hadronization. Samples that are simulated at NLO with MadGraph5_aMC@NLO adopt the FxFx [75] scheme for matching partons from the matrix-element calculation to those from parton showers. Samples simulated at LO adopt the MLM [97] scheme for the same purpose. The CUETP8M1 [98] PYTHIA 8.226 tune is used to produce the SM background and signal samples for the analysis of the 2016 data. For the analysis of the 2017 and 2018 data, the CP5 and CP2 [99] tunes are used for the SM background samples and signal samples, respectively. Simulated samples generated at LO or NLO with the CUETP8M1 tune use the NNPDF2.3LO or NNPDF2.3NLO [100] parton distribution functions (PDFs), respectively. The samples using the CP2 or CP5 tune use the NNPDF3.1LO or NNPDF3.1NNLO [101] PDFs, respectively.
Simulated SM events are processed through a GEANT4based [102] simulation of the CMS detector. In order to keep the computational processing time manageable, simulated signal events are processed through the CMS fast simulation program [103,104], which yields results that are generally consistent with the GEANT4-based simulation. The simulated events are generated with nominal distributions of additional pp interactions per bunch crossing, referred to as pileup. They are reweighted to match the corresponding pileup distribution measured in data.
In order to improve the modeling of additional jet multiplicities originating from radiation in events containing tt, the MadGraph5_aMC@NLO prediction is compared to data in a tt-enriched data set. The events in this data set are required to contain two reconstructed charged leptons (ee, μμ, or eμ) and two jets that are identified as originating from a bottom quark. A correction factor is derived from this comparison. This correction factor is applied to the simulated tt events for 2016, which use the CUETP8M1 tune, and to all the simulated signal events, which use the CUETP8M1 and CP2 tunes. The correction factor is not applied to the simulated tt events for 2017 and 2018, which use the CP5 tune, because these simulated event samples already show a reasonable agreement with the data before the correction. In addition, simulated tt events for all three years are corrected for the observed mismatch in the top quark p T spectrum between data and simulation according to the results presented in Ref. [105].

V. EVENT RECONSTRUCTION
Events are reconstructed using the particle-flow (PF) algorithm [106], which uses information from all of the subdetectors to reconstruct candidates (PF candidates) of charged hadrons, neutral hadrons, photons, electrons, and muons. Combinations of these PF candidates are used to reconstruct higher-level objects such as the missing transverse momentum ( ⃗p miss T ). The ⃗p miss T is defined as the negative vector sum of the transverse momentum ⃗p T of all PF candidates in the event, and its magnitude is denoted as p miss p [107]. We use only events with at least one reconstructed vertex. The primary pp interaction vertex (PV) is taken to be the one with the largest value of the summed p 2 T , summing over jets and the associated p miss T . Jets are reconstructed from tracks assigned to the vertex using the anti-k T jet finding algorithm [108,109] and the associated p miss T used for the PV identification is defined based on these track jets.
The primary set of jets used to define the data set for this search is reconstructed by clustering charged and neutral PF candidates using the anti-k T algorithm [108,109] with a distance parameter of 0.4 (AK4 jets). Only those charged PF candidates identified as originating from the PV are considered; any charged PF candidates originating from pileup vertices are ignored. Jet quality criteria [110] are imposed to eliminate jets from spurious sources such as electronics noise. The energies of jets are corrected for the presence of particles from pileup interactions [111] as well as for the response of the detector as a function of p T and η [112]. We count jets (N j ) with p T > 30 GeV and jηj < 2.4.
Because the final states of the signal processes generally include at least one bottom quark, the identification of jets originating from a bottom quark plays an important role in this search. Bottom quark jets (b jets) are identified by applying a version of the combined secondary vertex algorithm based on a deep neural network (DeepCSV) [49]. The "medium working point" of this algorithm is used, which provides a tagging efficiency for b jets (in the p T range typical of b quarks from top quark decay) of 68% [49]. The corresponding misidentification probability for light-flavor jets originating from gluons and up, down, and strange quarks is 1%, while that for charm quark jets is 12% [49]. We count b jets (N b ) with p T > 20 GeV and jηj < 2.4. The minimum p T threshold for counting N b is set to 20 instead of 30 GeV, as used for N j , in order to improve the sensitivity to top squark signal models, particularly those with compressed mass spectra, which yield low-p T b quarks. Even without the use of a dedicated charm quark tagger, we find that we have adequate sensitivity to models containing charm quarks in the final state.
A large fraction of signal events from models with compressed mass spectra, e.g., events expected from the T2ttC and T2bWC models, contain b quarks with p T below the 20 GeV b jet p T threshold, which would fail to be reconstructed as b jets. Identification of these soft b quarks improves our ability to separate potential signal events from the SM background. We therefore identify soft b quarks based on the presence of a secondary vertex (SV) reconstructed using the inclusive vertex finder algorithm [113]. Additional requirements on SV observables and the distance between the SV and PV are applied to suppress the background originating from light-flavor jets, as done in the previous top squark search [40]. These requirements result in an efficiency of 40-55% for correctly identifying soft b quarks, and a misidentification rate of ≈2-5% for objects originating from light-flavor hadrons [47]. To maintain orthogonality to b tagging, SVs are further required to be separated by ΔR > 0.4 from any jets with p T > 20 GeV. The selected SVs are counted (N SV ).
As discussed in Sec. II, signal events with small Δm generally leave little visible energy in the detector, making such events difficult to identify. They can still be detected if the top squark pair recoils against a high-p T jet arising from ISR. The ISR jet gives a transverse boost to the top squark pair and its decay products, including two LSPs, which can result in greater p miss T than in comparable events without a high-p T ISR jet. Jets clustered using the anti-k T algorithm with a size parameter of 0.8 (AK8 jets), instead of 0.4 as used for N j counting, are used to identify ISR jet candidates as well as boosted high-p T top quark and W boson candidates. The identification of Lorentz-boosted top quarks and W bosons is discussed in detail in Sec. VA.
Pileup contributions to AK8 jets are statistically subtracted using the "pileup per particle identification" [110,114] method, by which each charged and neutral particle is weighted by a factor representing its probability to originate from the PV before the clustering is performed. The use of AK8 jets improves the ISR jet identification by capturing ISR gluons which often radiate additional gluons and result in large size jets. The AK8 jet with the largest p T among the AK8 jets with p T > 200 GeV in the event is considered an ISR jet candidate. The ISR jet is required to fail the DeepCSV b -tag identification defined at the "loose working point" [49], which is characterized by a tagging efficiency of about 80% and a misidentification rate of about 10% for light-flavor jets. The ISR jet may not be tagged as a top or W jet as defined in Sec. VA.
In order to obtain a sample of fully hadronic events, all events with charged leptons, including electron and muon candidates, hadronically decaying tau lepton candidates, and isolated tracks, are removed from the search data set. Electron and muon candidates are also used to define control samples of events with one or two isolated leptons, which are used for background estimation as discussed in Secs. VII A and VII B. Electron candidates are reconstructed starting from clusters of energy deposited in the ECAL that are then matched to a track in the silicon tracker [115]. Electron candidates are required to have jηj < 2.5. Muon candidates are reconstructed by matching tracks in the muon detectors to compatible track segments in the silicon tracker [116] and are required to be within the muon detector fiducial region of jηj < 2.4. Electron and muon candidates are further required to be isolated.
The isolation criterion for electron and muon candidates is based on the "mini-isolation" variable I mini , which is the scalar p T sum of all charged-hadron, neutral-hadron, and photon PF candidates within a cone around the lepton candidate direction, where the radius ΔR of the cone depends on the lepton candidate p T . For p T < 50 GeV, ΔR ¼ 0.2; for p T > 200 GeV, ΔR ¼ 0.05; and for 50 < p T < 200 GeV, ΔR ¼ 10 GeV=p T . The decrease in cone size with increasing p T is motivated by the concomitant increase in collimation of the lepton decay products. It reduces the rate of accidental overlap between the lepton and jets in high multiplicity or highly boosted events, particularly overlap between b jets and leptons originating from a boosted top quark. The mini-isolation variable is corrected for contributions from pileup using an estimate of the pileup energy inside the cone [115,116]. The isolation requirement is I mini =p T < 0.1 for electron candidates and I mini =p T < 0.2 for muon candidates.
Hadronically decaying tau lepton candidates τ h are reconstructed by the hadron-plus-strips algorithm [117,118]. The τ h candidates are required to have p T > 20 GeV, jηj < 2.4, and transverse mass m T ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2p T p miss T ð1 − cos ΔϕÞ p < 100 GeV, where Δϕ is the azimuthal separation between the candidate ⃗p T and ⃗p miss T . The goal of the transverse mass requirement is to suppress the background with W → τ h ν decays and no other source of p miss T . Some electrons, muons, and tau leptons that do not satisfy the above criteria are still reconstructed as electron, muon, or charged-hadron PF candidates. Electron, muon, and charged-hadron PF candidates are identified as electron tracks, muon tracks, and charged-hadron tracks, respectively, provided they satisfy criteria on the PF candidate p T and η, the transverse mass m T , and the isolation, and are collectively referred to as isolated tracks. Isolation is defined based on the scalar p T sum p sum T of all other charged PF candidates lying within a cone ΔR < 0.3 around the PF candidate. Isolated electron and muon tracks are required to have p T > 5 GeV and p sum T < 0.2p T , while charged-hadron tracks are required to have p T > 10 GeV as well as p sum T < 0.1p T . Isolated tracks are all required to satisfy jηj < 2.5 and m T < 100 GeV.
Photon candidates, which are used in the estimation of some backgrounds, are reconstructed from clusters of energy deposited in the ECAL. They must satisfy requirements on the cluster shape, the relative fraction of energy deposited in the HCAL behind the cluster in the ECAL, and the photon isolation [119].
Events expected from direct top squark production models with Δm larger than the top quark mass m t and from all the gluino-mediated top squark production models considered in this search produce on-shell top quarks and/or on-shell W bosons in the decay chain. Thus, identification of hadronically decaying top quarks and W bosons plays a central role in this analysis. Previous searches [40,41] already employed top quark and W boson tagging algorithms, but this search benefits from the improved tagging algorithms discussed below. Because the top quarks and W bosons may have a wide range of p T , we employ a combination of two tagging algorithms, which are optimized for different p T ranges.

A. Merged top quark and W boson tagging algorithm
When a top quark or W boson is produced with high p T , its decay products are often merged into a single AK8 jet. Top quark and W boson candidates are selected from AK8 jets based on the jet p T and soft-drop mass. The soft-drop mass is a groomed jet mass calculated using the soft-drop algorithm [120,121] with an angular exponent β ¼ 0 and soft cutoff threshold z cut < 0.1. The soft-drop algorithm recursively removes soft wide-angle radiation from a jet. Top quark candidates are defined as those AK8 jets that have p T > 400 GeV and soft-drop mass above 105 GeV, while W boson candidates are required to have p T > 200 GeV and soft-drop mass between 65 and 105 GeV.
Final identification of top quark and W boson candidates is performed using the "DeepAK8" algorithm [50]. DeepAK8 is a multiclass classifier that identifies hadronically decaying particles as one of five main categories: W, Z, H, t, and "other." These categories are further subdivided into minor categories corresponding to the decay modes of each particle. DeepAK8 uses a customized deep neural network architecture tailored to the jet classification task, which exploits PF information directly. The neural network uses information about all of the PF candidates and all of the SVs associated with each AK8 jet. A detailed description of the algorithm can be found in Ref. [50]. In the context of this analysis the output classes are combined to achieve "top quark vs. QCD multijet" and "W boson vs. QCD multijet" discrimination. The other discriminators are not used.
Top quark candidates that satisfy a requirement on the value of the top quark vs. QCD multijet discriminator are considered tagged and are counted (N t ). The requirement used in this analysis yields a misidentification rate in QCD multijet events of 0.5%, and a top quark tagging efficiency shown in Fig. 3. Similarly, W boson candidates that satisfy a requirement on the value of the W boson vs. QCD multijet discriminator and are not already tagged as top quarks are considered tagged and are counted (N W ). The requirement used in this analysis yields the W boson tagging efficiency shown in Fig. 3 and an average misidentification rate in QCD multijet events of 1%.

B. Resolved top quark tagging algorithm
The DeepResolved algorithm [47] identifies top quarks with small boost, in the p T range of roughly 100 to 500 GeV, whose decay products are too spread out to be contained inside a single AK8 jet. Top quark candidates are formed from the combination of three AK4 jets with p T of at least 40, 30, and 20 GeV, respectively. The three jets of each top quark candidate must have an invariant mass between 100 and 250 GeV, no more than one of the jets can be identified as a b jet using the DeepCSV medium working point, and the three jets must all lie within a cone of size ΔR < 3.1 around the trijet centroid, the vector sum of the momenta of the three jets.
After this loose preselection, a feed-forward neural network with a single hidden layer is used to distinguish between trijet combinations whose three jets all match to a decay product of a top quark versus those that do not. More complex neural network architectures did not result in improved discrimination power in our study. The network uses high-level information such as the invariant mass of the trijet and individual dijet pairs, as well as information from each jet including the relativistic energy-momentum four-vector describing the jet, the DeepCSV heavy-flavor discriminator value, jet shape variables [122], the number of PF candidates associated with the jet, and variables describing the fraction of the jet energy carried by each of several categories of PF candidates. The network is trained using simulated tt events, simulated QCD multijet events, and events from a collision data set that is dominated by QCD multijet production.
The simulation is used to define which trijets are considered "signal" and "background" during neural network training. Signal is defined as any trijet passing the preselection in which each jet is matched to a simulated decay product of a top quark within a cone of size ΔR < 0.4 and the overall trijet system is matched to the simulated top quark within a cone of size ΔR < 0.6. Background is defined as any trijet combination that is not categorized as signal. Background includes trijet combinations where some, but not all, of the jets are matched to top quark decay products, as well as trijet combinations in which the three top quark decay products to which the three jets are matched originate from two or more simulated top quarks. Collision data that are highly enriched in QCD multijet events are included in the training. These data are included using domain adaptation via gradient reversal [123,124] to discourage the network from learning features of the simulation that are not present in data. With this method an additional output is added to the neural network, connected to the hidden layer in the same manner as the primary neural network output. This additional output is tasked with distinguishing between trijet candidates from QCD multijet simulation and trijet candidates from collision data. The primary neural network output is trained to minimize the ability to discriminate based on observables that are not well modeled in simulation. This yields a network that is able to discriminate between signal and background almost as well as a network trained without FIG. 3. Top quark and W boson tagging efficiencies are shown as a function of the generator-level top quark p T and the generator-level W boson p T , respectively, for the merged tagging algorithm described in Sec. VA and the resolved tagging algorithm described in Sec. V B. The left plot shows the efficiencies as calculated in a sample of simulated tt events in which one top quark decays leptonically, while the other decays hadronically. The right plot shows the W boson tagging efficiency when calculated in a sample of simulated WW events. In addition to the individual algorithms shown as orange squares (boosted top quarks), green inverted triangles (resolved top quarks), and red triangles (boosted W bosons), the total top quark tagging efficiency (blue dots) is also shown. domain adaptation but with minimal reliance on features that exist only in simulation.
Before the final selection of trijets as tagged top quarks can be made, any overlap between trijet candidates that share jets with another candidate must be resolved. When considering any pair of overlapping trijets, the trijet that is more background-like according to the neural network is removed from further consideration. Additionally, trijet candidates that overlap with top quark and W boson candidates identified by the DeepAK8 algorithm are removed. A trijet overlaps a DeepAK8-tagged jet if any of the trijet constituents lies within a cone of size ΔR < 0.4 around one of the subjets (as identified by the soft-drop algorithm [50]) of the AK8 jet. Any remaining trijets with a neural network output greater than a threshold are considered tagged and are counted (N res ). This threshold is chosen to yield a misidentification rate in QCD multijet events of 2%.
The overall efficiencies of the top quark and W boson tagging algorithms are shown in Fig. 3. The efficiency for each object is defined as the fraction of simulated hadronically decaying top quarks or W bosons that are identified by the appropriate algorithm. The simulated top quark or W boson is considered to have been identified by the DeepAK8 algorithm if all of its primary decay products lie within a cone of size ΔR < 0.6 around the AK8 jet. Similarly, a simulated top quark is considered to have been identified by the DeepResolved algorithm if at least two of its three primary decay products lie within a cone of size ΔR < 0.4 around distinct constituents of the trijet and the simulated top quark lies within a cone of size ΔR < 0.6 around the trijet centroid. The sum of the DeepAK8 and DeepResolved efficiencies for tagging top quarks is also shown in Fig. 3 and demonstrates the complementary nature of these two algorithms over the full range of relevant p T .
The top quark and W boson taggers exhibit slightly different performance in data and in simulation. We derive scale factors to correct the performance of the taggers in simulation to match their performance in data. For both the DeepAK8 and DeepResolved algorithms, the tagging efficiency is estimated in both data and simulation using a dedicated sample of events containing a single charged lepton, selected to be enriched in top quark and W boson production. For the DeepAK8 algorithm, the misidentification rate is estimated in a sample of events containing a single photon, which is selected to be depleted of top quark and W boson production. For the DeepResolved algorithm, the misidentification rate is estimated in a sample of events containing jets but no charged leptons and small p miss T . This sample is similarly depleted of top quark and W boson production.
For each category of tagged top quark or W boson, datato-simulation scale factors are defined as the ratio of the performance (either tagging efficiency or misidentification rate) in data to the performance in simulation. These scale factors are parametrized as a function of the p T of the tagged top quark or W boson candidate and are used to reweight simulated events to more accurately describe the data. The efficiency scale factors for the DeepResolved algorithm are within 6% of unity while the misidentification scale factors are within 8% of unity. The DeepAK8 efficiency scale factors are within 8% of unity for the top quark and W boson categories while the misidentification scale factors vary up to 20 (30)% from unity for the top quark (W boson) categories. The DeepAK8 scale factors are discussed in detail in Ref. [50]. The most important sources of uncertainty in the scale factors arise from the jet energy scale and resolution, parton shower modeling, choice of factorization and renormalization scales, and statistical uncertainties in the data and the simulation.

VI. EVENT SELECTION AND SEARCH REGIONS
Events used for the search regions in this analysis were collected with a trigger that requires both p miss T and H miss T larger than a threshold that varied between 100 and 140 GeV depending on the LHC instantaneous luminosity and data taking period, where H miss T is the magnitude of the vector p T sum of jets reconstructed at the trigger level. The trigger efficiency is greater than 95% after application of the event selection criteria described below, including the requirement of p miss T > 250 GeV. All events are required to pass filters designed to remove detector-and beam-related noise and events that suffered from event reconstruction failures [107]. The data set used in this analysis is defined broadly by the exclusive presence of multiple jets of strongly interacting particles along with large p miss T . Large p miss T in SM events generally arises from leptonic decays of W bosons, Z boson decays to neutrinoantineutrino pairs, or jet energy mismeasurements in QCD multijet events. Events with isolated electrons or muons, as defined in Sec. V, with p T > 5 GeV are removed from the search data set in order to suppress SM backgrounds with large p miss T from leptonic W boson decays. This requirement provides a search data set that is orthogonal to the data set used for top squark searches performed using final states with a single lepton [47] or with two oppositely charged leptons [48], which will enable the statistical combination of the results from these other searches.
In order to further suppress events with charged leptons from W boson decays, we remove events from the search data set that contain τ h candidates, isolated electron tracks, isolated muon tracks, or isolated charged-hadron tracks, as defined in Sec. V.
Large p miss T in QCD multijet events typically arises from undermeasured jet energies, which result in a small azimuthal separation between the undermeasured jet and ⃗p miss T . This background is suppressed by removing events with small azimuthal separation between a high-p T jet and ⃗p miss T .
The detailed "baseline" requirements that define the search data set are given in Table I. The data set is further divided into two regions. The low-Δm region is designed to be sensitive to low-Δm signal models while the high-Δm region is designed to be sensitive to high-Δm signal models.

A. Low-Δm search region
In order to enhance sensitivity to low-Δm signal models, as discussed in Sec. II, the low-Δm region requires an ISR jet candidate with p T > 200 GeV, jηj < 2.4, and Δϕð ⃗p miss T ; j ISR Þ > 2. In the low-Δm signal models, the ISR jet recoils against the top squark pair, including the two LSPs from the top squark decay chains, and provides p miss T in the targeted signal events. Because the low-Δm signal models involve neither on-shell top quarks nor on-shell W bosons, events with tagged top quarks or tagged W bosons, as described in Secs. VA and V B, are vetoed in the low-Δm region. We also require p miss T = ffiffiffiffiffiffi ffi H T p > 10 ffiffiffiffiffiffiffiffiffiffiffi GeV p , which suppresses events with a large p miss T arising from jet energy mismeasurements.
For events from signal models with compressed mass spectra, such as T2ttC and T2bWC or T2tt and T2bW with small Δm, we expect low-p T bottom quarks in the final state. For events with N b ≥ 1, the minimum transverse mass of all b jets with respect to the ⃗p miss T (m b T ) is required to be less than 175 GeV because events from low-Δm signal models tend to lie in this region. In events with N b ≥ 3, only the two jets with the highest DeepCSV discriminator value are considered in the calculation of m b T . The dominant source of SM events that satisfy these requirements is Z þ jets production in which the Z boson decays to a neutrino-antineutrino pair.
Events passing the low-Δm baseline selection are further required to have p ISR T > 300 GeV (while events with 200 < p ISR T < 300 GeV are still used for the validation of background estimation, as discussed below) and are further divided into 53 disjoint search bins as shown in Table II. Eight search bins require N b ¼ N SV ¼ 0 and are divided based on N j and p miss T . These bins are designed to provide sensitivity primarily to the T2cc model. The remaining 45 search bins target other low-Δm signal models with b quarks in the final state, i.e., the T2ttC and T2bWC models, require N b ≥ 1 and/or N SV ≥ 1, and are divided based on N j , N b , N SV , p ISR T , p b T , and p miss T . The variable p b T is defined as the p T of the b jet for events with N b ¼ 1 and as the scalar p T sum of the leading two b jets for events with N b ≥ 2.

B. High-Δm search region
The high-Δm search region is optimized for those direct top squark production signal models with Δm > m W and for the gluino pair production models considered in this search, all of which often produce events with a large number of jets, some of which originate from b quarks. Therefore, the high-Δm region selection requires N j ≥ 5 and N b ≥ 1. The additional requirement of Δϕðj 1;2;3;4 ; ⃗p miss T Þ > 0.5 further suppresses QCD multijet events with severe jet energy mismeasurements. Events passing the high-Δm region selection are further divided among 130 disjoint search bins, which are described in detail in Table III.
The dominant source of SM background events that satisfy these requirements is tt production in which one of the W bosons decays leptonically. In such events, the b quark from the same top quark decay as the leptonically decaying W boson is expected to yield a low m b T value, typically below the top quark mass. The high-Δm region is divided into two subcategories with m b T < 175 GeV and m b T > 175 GeV. Here R is the distance parameter of the anti-k T algorithm. Electron and muon candidates as well as τ h candidates and isolated tracks are as defined in Sec. V. The ith highest-p T jet is denoted by j i .    The subcategory with m b T < 175 GeV suffers from large tt background yields but provides sensitivity to signal models with moderate Δm. Events from these signal models are likely to produce relatively low-p T top quarks, which leads to large N j , so events in this subcategory are required to have N j ≥ 7 and N res ≥ 1, and are further divided based on N b and p miss T . The subcategory with m b T > 175 GeV is further divided into search bins based on N b , N t , N res , N W , p miss T , and H T . These search bins help to provide sensitivity to signal models with a wide range of top squark, gluino, and LSP masses. The search bins with N b ¼ 1 and 2 primarily provide sensitivity to the direct top squark pair production models T2tt, T2bW, and T2tb, as well as the gluinomediated top squark production model T5ttcc. The sensitivity to the T1tttt and T1ttbb models is driven primarily by search bins with N b ≥ 3, and additional requirements on the top quark candidate multiplicity enhance the sensitivity further for T1tttt in particular. The requirement of one or more merged top quark candidates enhances the sensitivity to signal events from models with large m˜t or m˜g and low m˜χ0 1 , in which the top quarks from top squark or gluino decays have a high transverse boost, while the requirement of one or more resolved top quark candidates plays a more important role for signal events from models with higher m˜χ0 1 , in which top quarks are expected to be less boosted. The requirement of one or more W boson candidates enhances the sensitivity to the T2bW model.

C. Validation regions
In addition to the search region, two validation regions are defined and used to validate the background estimation methods. These validation regions are kinematically similar to but disjoint from the search region, and are depleted in expected signal relative to the search region. This allows the estimated background yields to be compared to data, as is done in Sec. VII E, while maintaining a blind search.
The low-Δm validation region is divided into 19 validation bins, analogous to the search bins, as shown in Table IV. Bins 0-14 have identical baseline requirements to the low-Δm search region, but require lower p miss T than any of the low-Δm search bins. Bins 15-18 are used to validate the background model in a higher p miss T range, and are made disjoint from the low-Δm search region by altering the requirement on the alignment between jets and ⃗p miss T . The low-Δm search region requires Δϕð ⃗p miss T ; j 1 Þ > 0.5, while these low-Δm validation bins instead require 0.15 < Δϕð ⃗p miss T ; j 1 Þ < 0.5 (medium Δϕ). The high-Δm validation region is divided into 24 validation bins as shown in Table V. These bins have identical requirements to the high-Δm search region except for the requirement on the alignment between jets and ⃗p miss T . The high-Δm search region requires Δϕð ⃗p miss T ; j 1;2;3;4 Þ > 0.5, while the high-Δm validation bins instead require Δϕð ⃗p miss T ; j 1 Þ > 0.5, Δϕð ⃗p miss T ; j 2;3 Þ > 0.15, and at least one of Δϕð ⃗p miss T ; j 2;3;4 Þ < 0.5.

VII. BACKGROUND ESTIMATION
The data set is expected to be dominated by events that do not contain top squarks (backgrounds), which arise from SM processes. The contributions of the major backgrounds are estimated through measurements in dedicated "control regions." This approach produces background estimates that are more precise and less affected by potential mismodeling than estimates taken purely from simulation. The control regions are each disjoint from the search region and from each other, are enriched in background events from a particular source, and are expected to be depleted of signal events. With the aid of simulation, the observations in these control regions are translated to predictions in the search region. This strategy makes use of methods described in previous searches in similar final states [37,40,41].
Events with large p miss T and a charged lepton mainly arise from tt production, electroweak production of a single top quark, and production of a W boson with additional jets in the final state. These events enter the search region when the charged lepton is not identified in the detector. The estimation of this background is described in Sec. VII A. Events containing a Z boson that decays to a neutrinoantineutrino pair contain large p miss T . These events enter the search region when jets are produced along with the Z boson. This background is described in Sec. VII B.
QCD multijet events have nearly zero p miss T ; however, mismeasurement of the p T of one or more of the jets in the final state can result in large p miss T . These mismeasured events enter the search region and constitute another important source of background, which is described in Sec. VII C.
Last, a variety of rare processes contribute to the search region, including production of multiple electroweak bosons and production of a top quark-antiquark pair in association with one or more electroweak or Higgs bosons. The estimation of these backgrounds is described in Sec. VII D.
A. Background from tt, single top quark, and W + jets events The background from events containing at least one top quark, top antiquark, or W boson, along with additional jets in the final state is dominated by events in which the W boson (either prompt or from the decay of a top quark) decays to a charged lepton and a neutrino. The event selection for this search requires the number of reconstructed charged leptons and isolated tracks (which can be the result of a partially reconstructed charged lepton) to be zero, which substantially reduces this background. These events still pass the event selection when the charged lepton lies outside the lepton acceptance or is not reconstructed, or is not isolated, and thus is not counted even as an isolated track. Therefore, this source of SM background is referred to as the lost lepton (LL) background.
This background is estimated from a l þ jets control region with l ¼ e or μ, selected with the same high-Δm and low-Δm baseline selection criteria as discussed above, except that we require exactly one rather than zero isolated leptons and we do not remove events containing isolated tracks. The m T of the lepton is required to be less than 100 GeV in order to select events containing a W → lν decay and to suppress possible SUSY signal contamination, i.e., signal events that satisfy the requirements of the l þ jets control region.
The LL background yield in each search bin N LL pred is estimated based on the event count in data in a corresponding bin in the l þ jets control region N 1l data . This count is extrapolated to the search region to obtain a prediction by means of a transfer factor TF LL obtained from simulation: where N 1l MC is the yield expected from simulation in the corresponding control region bin and N 0l MC is the yield expected from simulation in the search bin. These event yields include contributions from tt, W þ jets, and single top quark production, as well as smaller contributions from events with two or three electroweak gauge bosons, denoted by "multiboson," and from events with a tt pair produced in association with a γ, a H boson, a W boson, or a Z boson, denoted by ttX. A unique TF LL is defined for each of the 183 search bins. The simulated events used to estimate these yields include at least one simulated charged lepton. The definition of the corresponding bin in the control region is identical to the definition of the search bin except for the requirements on N t , N W , and N res .
For the control region bins corresponding to the high-Δm region, no requirement on N t , N W , or N res is made. This improves the statistical uncertainty in the background estimation in the high-Δm region. Thus, in the high-Δm region, the transfer factor TF LL is The event yields expected from simulation include the application of data-to-simulation scale factors for the efficiency of the DeepAK8 and DeepResolved top quark and W boson taggers. Figure 4 shows that the background model describes the data in the high-Δm region of the l þ jets control region well as a function of N t , N W , and N res . The total yield of the background model is scaled to exactly match the total yield observed in collision data for the purpose of this comparison only, while the prediction for the LL background in the search region does not include this scaling. This demonstrates that the transfer factor method described above will correctly describe the data as a function of N t , N W , and N res . The background model includes the top quark p T reweighting mentioned in Sec. IV.
One of the most important sources of systematic uncertainty in the LL background estimation arises from the data-to-simulation scale factor measurements for the merged top quark and W boson identification and resolved top quark identification. This leads to an uncertainty in the estimated LL background yield of up to 8% from the DeepAK8 top quark scale factor, 17% from the DeepAK8 W boson scale factor, and up to 5% from the DeepResolved top quark scale factor for some high-Δm search bins that require one or more of these tagged objects. Another significant source of systematic uncertainty arises from the top quark p T reweighting, which improves the agreement between data and simulation. The uncertainty associated with this reweighting is up to 15%, depending on the search bin. Other sources of systematic uncertainty include the statistical uncertainties due to the control region data (up to 80%) and the simulated event samples (up to 43%), the electron and muon identification and isolation efficiencies (2-6%), the τ h veto efficiency (1-7%), the jet energy scale (1-12%), the p miss T energy resolution (1-8%), the b tagging efficiency (2-5%), the PDF uncertainty (2-17%), the pileup uncertainty (1-10%), and the tt and W þ jets production cross section uncertainties (4-6%), depending on the search bin.

B. Background from ZðννÞ + jets events
In previous searches [37,40,41], two different methods have been used to estimate the background from Z þ jets events with Z → νν decay [ZðννÞ þ jets events]. One method uses Z þ jets events in which the Z boson decays to l þ l − (e þ e − or μ þ μ − ). In these events, the Z bosons have very similar kinematic properties to those of the ZðννÞ þ jets events in the search region, but this method is statistically limited because of the small Z → l þ l − branching fraction. Another method uses γ þ jets events, which feature a cross section that is larger than the ZðννÞ þ jets cross section by roughly a factor of 5 in the range of Z boson p T that is relevant for this search. The LO Feynman diagrams involved in γ þ jets production are similar to those for Z þ jets production, but differ in the quark-boson couplings and in the fact that the Z boson is massive. These , and N res (lower right) after scaling the simulation to match the total yield in data. The hatched region indicates the total shape uncertainty in the simulation. The lower panels display the ratios between the observed data and the simulation.
differences are generally described well by simulation and are accounted for. Taking into account these considerations we use a hybrid method to estimate the ZðννÞ þ jets background that makes use of both procedures. The method is discussed in more detail in Refs. [37,40].
Two control regions are used. The Zðl þ l − Þ control region requires two same-flavor, opposite-charge reconstructed leptons (e þ e − or μ þ μ − ) and is enriched in Z þ jets events in which the Z boson decays to l þ l − . The γ þ jets control region requires a single reconstructed photon and is therefore enriched in γ þ jets events.
The predicted yield of ZðννÞ þ jets is where N ZðννÞþjets pred and N ZðννÞþjets MC are respectively the predicted number of ZðννÞ þ jets events and the number of simulated ZðννÞ þ jets events in each search bin, R Z is a flavor-dependent Z þ jets normalization factor measured in the Zðl þ l − Þ control region, and S γ is a shape correction factor measured in the γ þ jets control region.
The normalization factor R Z is derived from a fit of the simulation to the data in the Zðl þ l − Þ control region as a function of the measured dilepton mass m l þ l − . The Zðl þ l − Þ control region is selected from single-electron and single-muon triggers, and further requires offline p T > 40ð50Þ GeV for the leading electron (muon) candidates and p T > 20 GeV for the subleading electron and muon candidates. The quality, isolation, and η selection criteria discussed in Sec. V are also required. Jets matched to these selected leptons are removed from the calculation of search variables, and the dilepton ⃗p T is added to ⃗p miss T to emulate the p miss T expected in ZðννÞ þ jets events. Events in the Zðl þ l − Þ control region are required to pass the same low-Δm and high-Δm baseline selections shown in Table I, with the exception of the lepton and isolated track vetoes, and with the additional requirement that p T ðl þ l − Þ > 200 GeV. The R Z factor is measured from events in the dilepton mass window 81 < m l þ l − < 101 GeV, while events in the ranges 50 < m l þ l − < 81 GeV and m l þ l − > 101 GeV are used to measure the rate of nonresonant background contributions, which are primarily tt events in which both W bosons decay leptonically. Approximately 97% of the events in the low-Δm Zðl þ l − Þ control region and 79% of the events in the high-Δm Zðl þ l − Þ control region are expected to be DY þ jets events. Minor contributions from, e.g., ZZ production are counted with Z þ jets events in the extraction of R Z , and minor contributions from, e.g., single top processes are counted with tt events when measuring the rate of nonresonant backgrounds. The R Z factor is measured separately in different ranges of N b and N SV as well as separately in the low-Δm and high-Δm regions, allowing it to account for dependence on heavy-flavor production.
This results in five distinct R Z values in the low-Δm region, as can be seen from Table II, which includes five unique combinations of N b and N SV requirements. In the high-Δm region, this results in two distinct R Z values: one with a requirement of N b ¼ 1 and one with a requirement of N b ≥ 2. The R Z factor ranges from 0.71 to 1.05 for low-Δm search bins and from 1.20 to 1.27 for high-Δm search bins and has uncertainties of 4-14%, which are propagated to the ZðννÞ þ jets predictions in the search regions.
The shape correction factor S γ is derived from the γ þ jets control region and corrects for any mismodeling of the search variable distributions by the simulation. The γ þ jets control region is selected from single-photon triggers that require a photon candidate with a p T threshold of 175-200 GeV, depending on the data collection period. Offline photons are required to have p T > 220 GeV and jηj < 1.44 or 1.57 < jηj < 2.5, avoiding the gap between the ECAL barrel and end cap detectors. Similarly to what is done for dilepton events, jets matched to selected photons are removed from the calculation of search variables, and the photon four-vector is added to ⃗p miss T to emulate the p miss T expected in ZðννÞ þ jets events. The p miss T prior to this addition is required to be less than 250 GeV to make the γ þ jets control region orthogonal to the search region. Approximately 87% of the events in the low-Δm γ þ jets control region and 76% of the events in the high-Δm control region are expected to be γ þ jets events.
The S γ factor is not used to correct the estimated overall rate of ZðννÞ þ jets events, but only to correct the distribution of those events. The yield of simulated events in the γ þ jets control region is scaled to the corresponding event yields from collision data separately in the low-Δm and high-Δm regions and in different ranges of N b and N j , and then simulated events are compared to collision data as a function of all search bin variables except N t , N W , and N res to extract S γ . The high-Δm γ þ jets control region bins make no requirements on N t , N W , or N res , yielding 112 control region bins, with a distinct S γ value for each control region bin. The ZðννÞ þ jets simulation provides an improved description of the distributions of N t , N W , and N res in the search region after the R Z and S γ correction factors have been applied. The S γ factor measured in the γ þ jets control region is validated in the Zðl þ l − Þ control region. The observed differences between S γ calculated in the γ þ jets control region and S γ calculated in the Zðl þ l − Þ control region as a function of p miss T (up to 16%) are treated as systematic uncertainties.
In addition to the uncertainties in the R Z normalization factor obtained from the Zðl þ l − Þ control region and in the S γ shape correction factors discussed above, we consider several sources of uncertainty in the estimation of the ZðννÞ þ jets background, including the statistical uncertainties in the photon control region data (up to 100%) and simulated event samples (up to 110%), the photon identification efficiencies (5-13%), the photon trigger efficiency (up to 2%), the pileup reweighting (up to 40%), the jet energy scale corrections (up to 41%), the p miss T energy resolution (up to 35%), the PDF uncertainty (up to 59%), the b tagging efficiencies for heavy-flavor jets (up to 5%) and misidentification rates for light-flavor jets (up to 16%), the soft b tagging efficiencies (up to 1%), and the top quark and W boson misidentification rates (up to 34%).

C. Background from QCD multijet events
The QCD multijet background originates from mismeasurement of the energy of one or more jets in a QCD multijet event. When that happens, large amounts of spurious p miss T can be present in the reconstructed event, causing it to satisfy the selection requirements. The probability to produce such an event, including misidentified b jets and top quarks, is very low, but the high QCD multijet production cross section makes them very numerous and therefore their contribution to the search bins must be estimated.
The QCD multijet control region requires that at least one of the three leading jets is close to the ⃗p miss T , that is, Δϕð ⃗p miss T ; j 1;2;3 Þ < 0.1. This control region definition otherwise requires the baseline selection described in Sec. VI, including the low-Δm and high-Δm regions (with the exception of the Δϕ requirements). These requirements produce a control region in which QCD multijet events are expected to make up 84% of the total yield.
The QCD multijet control region is divided into bins based on p miss T , H T , N j , m b T , p b T , p ISR T , N b , and N SV , similarly to the search bins described in Sec. VI and Tables II and III, with the exception that the QCD multijet control region is not binned in N t , N W , or N res . This allows us to maintain adequately small statistical uncertainties in each bin of the control region.
The yield of QCD multijet events in a search bin is extrapolated from the corresponding bin in the QCD multijet control region. The ratio of the QCD multijet yield predicted by simulation in a search bin to the QCD multijet yield predicted in the corresponding control region bin, TF QCD , is taken from simulation, and then the QCD multijet background yield N QCD pred is estimated as where N data is the number of events in the QCD multijet control region bin and N non−QCD MC is the number of events from all other backgrounds in the same bin as predicted by simulation.
To improve the statistical power of the QCD multijet simulation, we employ a "smearing" procedure, which involves resampling the p T of the leading jets from the expected jet response distribution. This simulates the effects of jet p T mismeasurement, and allows simulated events with low p miss T to be used. A correction scale factor is applied to each simulated event to correct for any mismodeling of the jet response distribution. This scale factor varies as a function of the ratio of the reconstructed jet p T to the simulated jet p T , and is derived by fitting the simulated events to the data in the QCD multijet control region as a function of a proxy variable, p reco T =ðp reco T þ p miss T Þ, where p T reco is the reconstructed jet p T . We account for the effects of a number of sources of systematic uncertainty in the estimate of the QCD multijet background yield, including the statistical uncertainties due to the control region data (1-260%) and simulated event samples (4-130%), the b tagging efficiencies for heavyflavor jets (up to 18%) and misidentification rates for lightflavor jets (up to 9%), the soft b tagging efficiency (up to 8%), the trigger efficiency (up to 40%), the pileup reweighting (up to 50%), the jet energy scale corrections (up to 63%), the p miss T energy resolution (up to 64%), the top quark and W boson misidentification rates (up to 36%), the top quark p T reweighting (up to 39%), the PDF uncertainty (up to 67%), the smearing procedure (up to 41%), the jet response correction (up to 42%), and residual bias in the p miss T distribution (up to 20%).

D. Background from rare processes
Besides the backgrounds discussed above, other SM processes with small production cross sections are also considered in this analysis. These include the diboson (WW, WZ, and ZZ) processes, multiboson (WWW, WWZ, WZZ, and ZZZ) processes, associated production with a top quark-antiquark pair (ttH, ttγ, ttW, and ttZ), and other processes (tWZ, WZγ, and WWγ). Of these, the most important is the ttZ background because, in the case where the Z boson decays to νν, this background is irreducible.
Simulated events are used to estimate the background, and the total yield is given by the product of the luminosity and the theoretical cross section, with the exception of the ttZ background. The ttZ cross section is taken from a recent measurement using CMS data [125]. These backgrounds, other than ttZ, are not estimated from data because they are sufficiently rare that the estimate based on the theoretical cross sections is more precise than an estimate based on data. The LL background estimation procedure already accounts for the portion of these backgrounds that include one or more charged leptons, and therefore simulated events that include generated charged leptons are not part of this prediction.
The uncertainties for the rare backgrounds are determined individually for each search bin and arise from the statistical uncertainty in the simulated event samples (up to 100%), the integrated luminosity (1.8%), the b tagging efficiency for heavy-flavor jets (up to 7%) and misidentification rates for light-flavor jets (up to 14%), the soft b tagging efficiency (up to 5%), the trigger efficiency (less than 1%), the renormalization and factorization scales (up to 35%), the pileup reweighting (up to 48%), the jet energy scale corrections (up to 39%), the p miss T energy resolution (up to 23%), the PDF uncertainty (up to 15%), the merged top quark and W boson reconstruction efficiencies (up to 19%), the resolved top quark reconstruction efficiencies (up to 17%), and the ttZ scale factor derived from comparison to data in the three-and four-lepton channels (8%).

E. Validation of the background estimation
To validate our background model, we compare the model to data in the validation regions described in Sec. VI C. The validation regions are kinematically very similar to the search region, but do not overlap with it and are not expected to contain any appreciable yield from any of the signal models. The low-Δm validation region is described in Table IV and the high-Δm validation region is described in Table V. The background predictions in these validation bins are calculated as described in the preceding sections and compared to data, as shown in Fig. 5. The background prediction is compatible within uncertainties with the observed data. This compatibility demonstrates that the background model adequately describes the backgrounds present in the data and can be used to describe the backgrounds in the search region.

VIII. RESULTS AND INTERPRETATION
The predicted and observed yields in the 183 search bins defined in Sec. VI are summarized in Figs. 6 and 7, and numerical values are presented in Tables VI-XII of the Appendix. No statistically significant excess of events is observed relative to the expectation from the SM. All but six out of 183 search bins have agreement within two standard deviations, and all search bins have agreement within three standard deviations. A goodness-of-fit test under the background-only hypothesis yields a p-value of 0.66, indicating good agreement with the SM expectation. The observations are interpreted in the context of the models described in Sec. II as upper limits on the cross section for production of top squarks as a function of the masses of the top squark and LSP or the gluino and LSP.
Upper limits on the direct top squark pair production cross section or gluino-mediated top squark production cross section are derived via a modified frequentist method using the CL s criterion in an asymptotic formulation [126][127][128]. The observed and predicted yields in the search bins as well as all of the control region bins are included in the limit calculation. To implement the background estimation procedures based on data in control regions described in Sec. VII, the yields of the relevant backgrounds in the search region bins and the corresponding control region bins are taken directly from the simulation, but are scaled by nuisance parameters with no a priori constraint. These a priori unconstrained nuisance parameters are constrained in the fitting procedure by the observed yield in the data in the search region as well as the data in the control regions. Systematic uncertainties are also implemented using nuisance parameters with log-normal a priori constraints. When computing the limits, the signal yields are corrected to account for the expected signal contamination of the data control regions used to estimate the SM background. These corrections are typically below 20%.
The uncertainties in the signal modeling are determined individually for each search bin and arise from the statistical uncertainty in the simulated event samples (up to 100%), the integrated luminosity (1.8%), the charged lepton veto efficiencies (up to 10%), the b tagging efficiency for heavy-flavor jets (up to 11%) and misidentification rates for light-flavor jets (up to 14%), the soft b tagging efficiency (up to 5%), the trigger efficiency (less than 1%), the pileup reweighting (up to 15%), the renormalization and factorization scales (up to 7%), the ISR modeling (up to 37%), the jet energy scale corrections (up to 26%), the p miss T energy resolution (up to 12%), the merged top quark and W boson reconstruction efficiencies (up to 17%), and the resolved top quark reconstruction efficiencies (up to 20%), where the systematic uncertainty upper range is defined as the 95% upper quantile to indicate the typical ranges. Because SUSY signal events are simulated using the CMS fast simulation program, additional uncertainties are assigned to the correction of the b tagging, soft b, merged top quark and W boson, and FIG. 6. Observed event yields in data (black points) and predicted SM background (filled histograms) for the low-Δm search bins 0- 52 (upper), and for the high-Δm search bins 53-104 (lower). The bracketed numbers in the lower plot represent the respective N t , N W , and N res requirements used in that region. The signal models are denoted in the legend with the masses in GeV of the SUSY particles in parentheses: ðmt; m˜χ0 1 Þ or ðmg; m˜χ0 1 Þ for the T2 or T1 signal models, respectively. For both plots, the lower panel shows the ratio of the data to the total background prediction. The hatched bands correspond to the total uncertainty in the background prediction. The (unstacked) distributions for two example signal models are also shown in both plots. resolved top quark reconstruction efficiencies, as well as to cover differences in p miss T between the fast simulation and the full GEANT4-based model of the CMS detector, which lead to uncertainties of up to about 40%, depending on the search bin. All uncertainties except those from the statistical precision of the simulation are treated as fully correlated among search bins. The statistical uncertainties from the simulated signal events as well as those from the simulated SM background events are incorporated into the limit calculation via the approach described in Ref. [129]. Figure 8 shows the 95% confidence level (C.L.) upper limits on the direct top squark pair production cross section in the context of the T2tt, T2bW, and T2tb models. The upper limits are used to derive an exclusion region in the  2 (upper), and for the high-Δm search bins 153-182 with N b ≥ 3 (lower). The bracketed numbers in each plot represent the respective N t , N W , and N res requirements used in that region. The signal models are denoted in the legend with the masses in GeV of the SUSY particles in parentheses: ðm˜t; m˜χ0 1 Þ or ðm˜g; m˜χ0 1 Þ for the T2 or T1 signal models, respectively. For both plots, the lower panel shows the ratio of the data to the total background prediction. The hatched bands correspond to the total uncertainty in the background prediction. The (unstacked) distributions for two example signal models are also shown in both plots. mt − m˜χ0 1 plane by comparison to the theoretical cross sections calculated at approximate NNLO þ NNLL in Refs. [51][52][53][54]. For the T2tt model, top squark masses up to 1310 GeV and LSP masses up to 640 GeV are excluded. For the T2tt model we do not present cross section upper limits in the region of jmt − m t − m˜χ0 1 j < 25 GeV and mt < 275 GeV as shown in Fig. 8 (upper left). In this region, signal events become similar to SM tt events and the signal acceptance changes rapidly and is very sensitive to the details of the simulation, so no interpretation is presented [37]. To constrain the top squark pair production cross section in this region, a dedicated search could be performed [46], or a measurement of spin correlations in the tt dileptonic decay system [130] could provide some constraint. For the T2bW and T2tb models, top squark masses

95% CL upper limit on cross section [pb]
FIG. 8. The 95% C.L. upper limit on the production cross section of the T2tt (upper left), T2bW (upper right), and T2tb (lower) simplified models as a function of the top squark and LSP masses. The solid black curves represent the observed exclusion contour with respect to approximate NNLO þ NNLL signal cross sections and the change in this contour due to variation of these cross sections within their theoretical uncertainties (σ theory ) [64][65][66][67][68][69][70][71][72][73][74]. The dashed red curves indicate the mean expected exclusion contour and the region containing 68 and 95% (AE1 and 2σ experiment ) of the distribution of expected exclusion limits under the background-only hypothesis. For T2tt, no interpretation is provided for signal models for which jm˜t − m˜χ0 1 − m t j < 25 GeV and m˜t < 275 GeV as described in the text. up to 1170 and 1150 GeV and LSP masses up to 550 and 500 GeVare excluded, respectively. For the T2bW and T2tb models, there are regions of low mt and m˜χ0 1 where signals are not excluded by the observed 95% C.L. limits. The sensitivity is reduced in this region because the top squark decay products have low p T and the signal acceptance becomes smaller. Figure 9 shows the 95% C.L. upper limits on the production cross section in the plane of mt versus Δm for the T2ttC, T2bWC, and T2cc models. Signal events with Δm below m W in the range of 10-80 GeV are considered. Top squark masses up to 640 GeV are excluded at the 95% C.L. in the context of the T2ttC model, 740 GeV for the T2bWC model, and 630 GeV for the T2cc model.
Exclusion limits for the models of gluino pair production, T1tttt, T1ttbb, and T5ttcc, are shown in the mg − m˜χ0

95% CL upper limit on cross section [pb]
FIG. 9. The 95% C.L. upper limit on the production cross section of the T2ttC (upper left), T2bWC (upper right), and T2cc (lower) simplified models as a function of the top squark mass and the difference between the top squark and LSP masses. The solid black curves represent the observed exclusion contour with respect to approximate NNLO þ NNLL signal cross sections and the change in this contour due to variation of these cross sections within their theoretical uncertainties (σ theory ) [64][65][66][67][68][69][70][71][72][73][74]. The dashed red curves indicate the mean expected exclusion contour and the region containing 68% (AE1σ experiment ) of the distribution of expected exclusion limits under the background-only hypothesis. model, up to 2250 and 1400 GeV for the T1ttbb model, and up to 2150 and 1380 GeV for the T5ttcc model. In the case of the T5ttcc model there is a reduction in sensitivity as m˜χ0 1 approaches zero. This is due to the kinematic properties of the top squark decayt → cχ 0 1 . The LSP in this situation carries only a small fraction of the top squark momentum, and this results in reduced p miss T and reduced signal acceptance. With the SUSY particle spectrum assumed in the T5ttcc model, direct top squark production should also occur as in the T2cc model. For m˜χ0 1 < 600 GeV, the T2cc model is excluded by this search and by earlier searches by the ATLAS [27] and CMS [37,40,42] experiments as well as by the LEP experiments [131][132][133][134]. For m˜χ0 1 > 600 GeV, where the T2cc model is not excluded, adding the T2cc direct top squark production contributions to the gluino pair production contributions already present in T5ttcc does not have a significant effect on the sensitivity. For simplicity, Fig. 11 shows the exclusion based on the T5ttcc model without contributions from direct top squark production.

IX. SUMMARY
Results are presented from a search for direct and gluinomediated top squark production in proton-proton collisions

95% CL upper limit on cross section [pb]
FIG. 10. The 95% C.L. upper limit on the production cross section of the T1tttt (left) and T1ttbb (right) simplified models as a function of the gluino and LSP masses. The solid black curves represent the observed exclusion contour with respect to approximate NNLO þ NNLL signal cross sections and the change in this contour due to variation of these cross sections within their theoretical uncertainties (σ theory ) [64][65][66][67][68][69][70][71][72][73][74]. The dashed red curves indicate the mean expected exclusion contour and the region containing 68 and 95% (AE1 and 2σ experiment ) of the distribution of expected exclusion limits under the background-only hypothesis. FIG. 11. The 95% C.L. upper limit on the production cross section of the T5ttcc simplified model as a function of the gluino and LSP masses. The solid black curves represent the observed exclusion contour with respect to approximate NNLO þ NNLL signal cross sections and the change in this contour due to variation of these cross sections within their theoretical uncertainties (σ theory ) [64][65][66][67][68][69][70][71][72][73][74]. The dashed red curves indicate the mean expected exclusion contour and the region containing 68% and 95% (AE1 and 2σ experiment ) of the distribution of expected exclusion limits under the background-only hypothesis. The expected and observed upper limits do not take into account contributions from direct top squark pair production; however, its effect is small for m˜χ0 1 > 600 GeV, which corresponds to the phase space beyond the exclusions based on direct top squark pair production. The excluded regions based on direct top squark pair production from this search and earlier searches by the ATLAS [27] and CMS [37,40,42] experiments, as well as by the LEP experiments [131][132][133][134] are indicated by the hatched areas.
at a center-of-mass energy of 13 TeV. The analysis includes deep neural network based tagging algorithms for top quarks and W bosons both at low and high transverse momentum. The search is based on events with at least two jets and large imbalance in transverse momentum p miss T . The data set corresponds to an integrated luminosity of 137 fb −1 collected with the CMS detector at the LHC in 2016-2018. A set of 183 search bins is defined based on several kinematic variables and the number of reconstructed top quarks, bottom quarks, and W bosons. No statistically significant excess of events is observed with respect to the expectation from the standard model.
Upper limits at the 95% confidence level are established on the cross section for several simplified models of direct and gluino-mediated top squark pair production as a function of the masses of the supersymmetric particles. Using the predicted cross sections, which are calculated with approximate next-to-next-to-leading order plus next-to-next-to-leading logarithmic accuracy, lower limits at the 95% confidence level are established on the top squark, LSP, and gluino masses. In the case of the direct top squark production models, top squark masses are excluded below a limit ranging from 1150 to 1310 GeV in the region of parameter space where the mass difference between the top squark and the LSP is larger than the W boson mass, depending on the top squark decay scenario. In the region of parameter space where the mass difference between the top squark and the LSP is smaller than the mass of the W boson, top squark masses are excluded below a limit ranging from 630 to 740 GeV, depending on the top squark decay scenario. In the case of the gluino-mediated top squark production models, gluino masses are excluded below a limit ranging from 2150 to 2260 GeV, depending on the signal model. These results significantly extend the mass exclusions of the previous top squark searches in the fully hadronic final state from CMS [40,41] by about 100-300 GeV, benefiting not only from the larger data set, but also from improved analysis methods. For models of direct top squark production, the results obtained in this analysis are the most stringent constraints to date, regardless of the final state.

ACKNOWLEDGMENTS
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid and other centers for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC, the CMS detector, and the supporting computing infrastructure provided by the following funding agencies:  No. 123842, 123959, 124845, 124850, 125105, 128713, 128786, and 129058 (Hungary)
[46] CMS Collaboration, Search for the pair production of light top squarks in the e AE μ ∓ final state in proton-proton collisions at ffiffi ffi s p ¼ 13 TeV, J. High Energy Phys. 03 (2019) 101. [47] CMS Collaboration, Search for direct top squark pair production in events with one lepton, jets, and missing transverse momentum at 13 TeV with the CMS experiment, J. High Energy Phys. 05 (2020) 032.