Search for pair-produced resonances each decaying into at least four quarks in proton-proton collisions at $\sqrt{s}=$ 13 TeV

This letter presents the results of a search for pair-produced particles of masses above 100 GeV that each decay into at least four quarks. Using data collected by the CMS experiment at the LHC in 2015-2016, corresponding to an integrated luminosity of 38.2 fb$^{-1}$, reconstructed particles are clustered into two large jets of similar mass, each consistent with four-parton substructure. No statistically significant excess of data over the background prediction is observed in the distribution of average jet mass. Pair-produced squarks with dominant hadronic $R$-parity-violating decays into four quarks and with masses between 0.10 and 0.72 TeV are excluded at 95% confidence level. Similarly, pair-produced gluinos that decay into five quarks are also excluded with masses between 0.10 and 1.41 TeV at 95% confidence level. These are the first constraints that have been placed on pair-produced particles with masses below 400 GeV that decay into four or five quarks, bridging a significant gap in the coverage of $R$-parity-violating supersymmetry parameter space.

This analysis probes a previously unexamined region of the SUSY model space, where pairproduced squarks with masses as light as 100 GeV each decay into four non-top quarks through hadronic RPV couplings [16]. Such processes proceed through a two-body decay of the squark into a higgsino and a quark, and the higgsino subsequently undergoes an RPV three-body decay via an off-shell squark ( q * ) into quarks, as shown in Fig. 1 (left). This decay mode is expected to dominate over direct squark decays into two quarks if the higgsino mass is less than the squark mass (as expected in natural scenarios) and if the RPV coupling is small. A related topology with five-quark resonances occurs when the gluino undergoes a three-body decay into two quarks and a higgsino, and the higgsino decays into three quarks, as shown in Fig. 1 (right). Searches by the experiments at LEP strongly disfavor squark and higgsino masses below 100 GeV, while existing LHC searches are only sensitive to multijet scenarios above at least 400 GeV [17][18][19][20]. Light pair-produced squarks and gluinos that decay through RPV couplings into two and three quarks, respectively, have also been directly constrained by Tevatron and LHC searches [21][22][23][24].
H ±/0q ⇤ q q q q qq ⇤H ±/0q ⇤ g q q q q q Figure 1: Leading order Feynman diagrams for a squark decaying into four quarks via an intermediate higgsino and a three-body decay involving an off-shell squark with an RPV coupling (left) and for a gluino decaying into five quarks via a three-body decay into two quarks and a higgsino, which decays into three quarks as in the squark case (right). This analysis assumes that the initial squarks (or gluinos) are pair-produced and that they each decay as presented.
We take advantage of the large cross section predicted for squark pair production, focusing on the small fraction of phase space in which the squarks are produced back-to-back and the decay products of each squark reside entirely within a single large jet. This requirement reduces the combinatorial background and recovers the resonant structure of the squark decay. Furthermore, the masses of the two jets are required to be approximately equal, and the jets must evince substructure consistent with at least four non-top quarks. These conditions help to suppress the immense background of light-quark jets produced by the strong interaction, referred to as quantum chromodynamics (QCD) multijet events, which is estimated from data with a novel technique. The background from top quark-antiquark (tt) events is also significant; its contribution is estimated from simulation. The tt events in control regions are used in the estimation of the selection efficiency, jet mass scale, and jet mass resolution.
The motivation for this search comes from natural SUSY, but the analysis aims more broadly for new phenomena in general. We make no assumptions about the helicity, the intermediate decay topology, the nature of the final-state particles, or the presence of secondary vertices (characteristic of heavy-flavor quarks) in the final state, so as to be more inclusive and to focus on the more experimentally challenging light-quark mode. The inclusive procedure of the search additionally allows us to constrain pair-produced gluinos that decay to five non-top quarks.
The CMS detector consists of a silicon tracker, a lead-tungstate electromagnetic calorimeter, an interleaved brass and plastic scintillator hadron calorimeter, and gas-ionization muon detectors. A superconducting solenoid provides a uniform magnetic field within the detector. Potentially interesting observed events are selected by a two-tiered trigger system [25]. A more detailed description of the CMS detector can be found in Ref. [26].
Observations from each of the CMS detector components are reconstructed and combined by particle flow (PF) algorithms [27] into particle objects. These objects are clustered into jets using the FASTJET software package [28] with either the anti-k T (AK) or Cambridge-Aachen (CA) algorithm [29] with a size parameter R. A collection of jets is referenced by the acronym for the clustering algorithm used, followed by ten times the associated value of R: for example, "CA12" refers to jets clustered with the CA algorithm with R = 1.2, the jet collection primarily used in this analysis.
We analyze data collected by the CMS detector in 2015-2016 from proton-proton collisions at √ s = 13 TeV and corresponding to an integrated luminosity of 38.2 fb −1 . Events considered for analysis were selected by a trigger that imposes a requirement on the event H T ≡ ∑ i p T,i , where i iterates over the AK8 jets in the event with transverse momentum p T > 150 GeV and |η| < 2.5, where η is the pseudorapidity.
Signal events are generated at leading order in perturbative QCD using MAD-GRAPH5 aMC@NLO 2.5.5 [30] with the NNPDF2.3 parton distribution function [31] and PYTHIA 8.212 [32] with the underlying event tune CUETP8M1 [33] to simulate the subsequent parton showering and decays. Events for 11 squark mass points between 0.1 and 0.8 TeV are produced. In each case, the higgsino mass is set to 75% of the squark mass in order to distribute the squark energy evenly among the daughter quarks. Simulated gluino signal events are produced for 17 gluino mass points between 0.1 and 1.5 TeV. In this case, the higgsino mass is set to 60% of the gluino mass, and the squarks are assumed to be too massive to participate in the process. Squark and gluino events are normalized to the theoretical cross sections calculated up to next-to-leading-order plus next-to-leading-logarithm precision [34]. The squark cross sections are chosen to be those computed for top squarks.
Background from tt production is generated with POWHEG v2 [35][36][37], and the p T distribution of the top quarks is reweighted to agree with measurement [38]. Background from QCD multijet events is simulated at leading order using MADGRAPH5 aMC@NLO 2.2.2 with up to four partons included in the matrix-element calculation. Backgrounds such as hadronic W+jets and Z+jets make a negligible contribution and are absorbed into the QCD multijet background estimate. All background samples are generated with NNPDF2.3 or NNPDF3.0 parton distribution functions and are interfaced with PYTHIA for the computation of fragmentation and hadronization. Additional proton-proton interactions within the same or a nearby bunch crossing (pileup) are added to the simulated events, with a frequency distribution chosen to match that observed in data.
In order to capture as many of the final-state constituents of the squark decay products as possible in a single jet, we cluster the PF particle objects of each event into CA12 jets. To reduce the effect of pileup interactions, we ignore any charged hadrons not associated with the primary vertex in this clustering procedure. The primary vertex is chosen to be the vertex with the largest quadratic sum of the p T values of AK4 jets clustered from tracks assigned to the vertex and the p T corresponding to the negative vector-p T sum of these AK4 jets. The CA12 jet energies are corrected to compensate for the combined response function of the CMS calorimeters and the presence of neutral hadrons from pileup, with calibrations designed for AK8 jets [39]. The data are corrected with an additional scale factor to account for the residual difference in the detector response between simulation and data. Miscorrections that result from the application of AK8 calibrations to the CA12 jets are measured and are accounted for in the uncertainties of the final results. Finally, to reject misreconstructed CA12 jets or CA12 jets derived from calorimeter noise [40], we impose restrictions on the neutral and charged CA12 jet constituents by applying loose jet identification criteria [41].
We analyze the substructure of the CA12 jets using the N-subjettiness variables, denoted τ N [42]. These quantities are determined by reclustering the constituents of the jet with the k T algorithm [29] into N subjets and then calculating where i iterates over the jet constituents, d 0 ≡ ∑ i p T,i R with R representing the size parameter of the jet, and ∆R j,i is the distance in η-φ space between the j th subjet and the i th jet constituent, where φ is the azimuthal angle. The lower the value of τ N for a jet, the more the jet is consistent with containing N partons. The relative N-subjettiness τ kl ≡ τ k /τ l , for specific integers k and l, characterizes the substructure of a jet. In particular, τ 42 is used to discriminate against QCD multijet events and τ 43 to discriminate against tt events.
Finally, we apply a pruning algorithm [43] that reclusters the original constituents of a jet using a CA algorithm modified to ignore any objects with small relative p T (z cut = 0.1) and large kinematically weighted displacement (D cut = 0.5). This refinement serves to suppress perturbative radiation characteristic of QCD multijet events as well as to reduce the contribution to jet masses from detector noise, pileup, and the underlying event. In this Letter, the term "fat jet" is used to refer to a CA12 jet with τ N requirements, and the mass of a jet m always refers to the mass after pruning.
We consider events with H T > 0.9 TeV in order to guarantee that we analyze a kinematic region with a fully efficient trigger. In addition, we require that the two p T -leading CA12 jets of the events must each satisfy the following three conditions: p T > 0.4 TeV to ensure that the jets are more likely to contain all signal products, |η| < 2 to require that the jets are dominated by objects in the central detector region, and τ 21 < 0.75 to select CA12 jets with potential for substructure.
From these selected events, we define a signal region (SR) and two control regions (CRs). The regions are formed by selecting events in which the two p T -leading CA12 jets (indexed 1 and 2) each satisfy certain requirements on τ 42 and τ 43 , called the region's "fat-jet tag," and also pass relational criteria of A m ≡ |m 1 − m 2 |/ (m 1 + m 2 ) < 0.1 and ∆η ≡ |η 1 − η 2 | < 1.0, defining a "fat-jet pair." The SR fat-jet tag requires τ 42 < 0.50 and τ 43 < 0.80, choices that were optimized for the squark signal significance. We define an inclusive CR and a b-tagged CR, which we use to verify the closure of the background prediction in data and to estimate the systematic uncertainties that are further constrained in the SR. Both CR fat-jet tags impose the same N-subjettiness requirements on each fat jet: τ 42 < 0.55, τ 43 < 0.90, and either τ 42 > 0.50 or τ 43 > 0.80. The b-tagged CR fat-jet tag additionally requires a loose b tag [44] (with approximately 80% tagging efficiency and 9% mistag rate) in order to select a tt-enriched sample. The CA12 jets are b tagged by matching them to b-tagged AK4 jets, assigning to each CA12 jet the smallest b-tagging discriminator from among the matched AK4 jets. The final discriminating variable of this analysis is the average pruned mass of the two p T -leading fat jets in an event: The fraction of squark events in the SR ranges from < 0.001% for the 0.1 TeV squark mass point to 1% for the 0.8 TeV mass point. For very light squarks, the small selection rate is predominantly driven by the low kinematic acceptance, but it is compensated for by a large production cross section (1520 pb for 0.1 TeV top squarks). Similar values for the selection efficiency are found for gluino events, although the m distribution for gluinos is broader than for squarks in part because of the extra quark in the decay chain. These properties are illustrated in Appendix A: We estimate the contribution of a particular signal to the observed events by performing a maximum-likelihood fit of the combined m probability density functions (PDFs) of the signal and the two backgrounds to the m distribution of the data. The signal and tt background m PDFs are taken from simulation; the QCD multijet background m PDF is constructed using a data-driven method detailed below. Each m PDF is assigned three fit parameters: a normalization factor, a shift of the median of the PDF from the nominal value, and a width factor (stretch) about the median of the PDF. These fit parameters allow us to account for observed shape differences between simulation and data.
This procedure is applied to the CRs in the data, and we use the post-fit values of the m PDF parameters as estimates of the background systematic uncertainties. Figure 2 shows the post-fit background m PDFs and the associated fit parameters, assuming no signal contribution. Signal contamination in the CRs is generally small, and we repeat the same procedure including the m PDF for each signal mass point to establish that the presented null hypothesis results in the most reliable fit. The tt m PDF fit parameters have physical interpretations: the normalization corresponds to the data-to-simulation fat-jet tagging scale factor, the shift corresponds to the tagged fat-jet mass scale, and the stretch corresponds to the tagged fat-jet mass resolution. As presented in Fig. 2, the shifts in the CRs are generally consistent with zero, but the stretches imply broadening, consistent with the known ∼10% underestimation of the jet energy resolution in simulation [39]. We study a variety of alternative choices of CR fat-jet tags, which similarly demonstrate good closure. The background estimation method reproduces the data in the inclusive CR, which has greater statistical precision than the SR, lending confidence that it provides reliable results in the SR as well.
We derive the QCD multijet background m PDF from a large sample of data events in order to avoid the mismodeling of simulated QCD multijet events. For this purpose, events are selected in which at least the p T -leading jet satisfies the fat-jet tag; the number of such events is more than an order of magnitude greater than the number required to have a fat-jet pair. We form a collection of fat jets by taking the p T -leading fat jet from each of these events. From this collection of fat jets from different events, we form all possible fat-jet pairs. The m distribution of these artificial fat-jet pairs provides an m PDF for QCD multijet events. This construction treats the m distribution of p T -leading fat jets as an m PDF, denoted P(m), from which two fat-jet masses are sampled to form an m PDF P avg (m): where θ is the Heaviside step function that imposes the A m fat-jet pair requirement. Although the ∆η fat-jet pair requirement does not appear explicitly in Eq. (2), it is imposed when we implement this construction. Fat jets from QCD multijet events do not contain the products of partons with well-defined masses, and so the P(m) derived from them has a significant p T dependence that must be taken into account. We accommodate this effect by constructing a different P avg (m) for consecutive H T windows, which we then combine with weights chosen to reproduce to the observed H T distribution in data. Figure A.4 of Appendix A shows P(m) and the derived P avg (m) with and without this H T reweighting, demonstrating the significant effect of the reweighting on the high-m tail of the distribution. Corrections for the contamination of the QCD multijet m PDF by signal or other backgrounds are unnecessary, given the dominance of QCD multijet events over all other processes in the events used to construct P(m).
The accuracy of this QCD multijet m PDF construction is demonstrated by a closure test. We construct QCD multijet m PDFs from simulated QCD multijet events in the SR and CRs and compare them to the m distributions of the regions' measured tagged fat-jet pairs; the result for the SR is shown in Fig. 3 (left). The SR m PDF only needs to be shifted downward by approximately 11 GeV and compressed by about 9% to match the simulated fat-jet pair distribution. Similar tests with alternative choices for the N-subjettiness variables and different simulation procedures for the QCD multijet events also demonstrate good closure.
Using the data in the SR, we apply a Bayesian construction with a flat signal-strength prior [45] to compute 95% confidence level (CL) upper limits on squark (gluino) pair production and decay to four (five) quarks. Markov chain Monte Carlo techniques [46] are used to integrate out the nuisance parameters. We analyze the data in 30 GeV bins, as shown in Fig. 3 (right).
We assign nuisance parameters corresponding to the shift, stretch, and normalization of the QCD multijet, tt, and signal m PDFs. The nuisance parameters have Gaussian PDFs. The QCD multijet m PDF shape nuisance parameter uncertainties are taken from the CRs, but no initial shift or stretch is applied, since these parameters in the SR might be different from those in the CRs. We do not have a robust prediction for the total rate of QCD multijet events in the SR, so the QCD multijet m PDF normalization is chosen to be a free parameter. This procedure is reasonable because the QCD multijet shape is much broader than that of the signal, allowing the data in the SR to directly constrain this background. The tt rate and shape nuisance parameter uncertainties are determined by their measured values in the CRs; we start the minimization procedure with the initial value of these parameters reflecting no shift or stretch, as that is consistent with measurement. Interpreting the tt shift and stretch as jet-mass scale and resolution uncertainties, respectively, we apply the same uncertainties to the signal shape by constraining them to match the tt background nuisance parameters. For the signal rate, we assign the same uncertainty value that we measured for tt events, since the signal and tt production and decay topologies are similar. The uncertainty in the signal rate includes contributions from the integrated luminosity and the fat-jet tagging scale factor. Uncertainties in the signal acceptance due to the parton distribution functions of the proton contribute subdominantly. The central values and standard deviations of all of the nuisance parameters are shown in Table 1. Table 1: The nuisance parameters corresponding to each rate and shape parameter of the background and signal distributions, before and after the maximum-likelihood fit. Except for the QCD multijet m PDF normalization, which is floating (and the value of which is simply the event yield with its statistical uncertainty), each nuisance parameter has a Gaussian prior PDF and is reported with the given mean and standard deviation.

Parameter
Pre-fit value Post-fit value QCD normalization [events] floating 1222 ± 35 QCD shift [GeV] 0 ± 17 −8 ± 4 QCD stretch (0 ± 18)% (−1 ± 3)% tt normalization 1.00 ± 0.24 1.08 ± 0.14 tt and signal shift [GeV] 0 ± 16 −10 ± 6 tt and signal stretch (0 ± 20)% (15 ± 9)% Signal normalization 1.00 ± 0.24 - The resulting limits on the production cross section and the predicted background components in the SR are shown in Fig. 4. Assuming the top squark production cross section, squark masses between 0.10 and 0.72 TeV are excluded at 95% CL. Gluinos decaying to five quarks with masses between 0.10 and 1.41 TeV are also excluded at 95% CL. The post-fit total background estimate agrees with the data. The posterior distributions of the nuisance parameters confirm that the background component predictions are not significantly different from the estimates.
[GeV]  Figure 4: The expected and observed limits on the product of the pair-production cross section and branching fraction squared for a quark decaying to four quarks (left) and for a gluino decaying to five quarks (right).
In summary, a search has been conducted at the LHC for light pair-produced resonances that each decay into at least four quarks. No statistically significant excess over the expectation is observed. The data impose limits on R-parity-violating supersymmetry pair production [15], excluding squark masses between 0.10 and 0.72 TeV and gluino masses between 0.10 and 1.41 TeV. These are the first constraints that have been placed on pair-produced particles with masses below 400 GeV that decay into four or five quarks, bridging a significant gap in the coverage of R-parity-violating supersymmetry parameter space. This analysis is sufficiently general that it can be applied to other models describing pair-produced particles decaying into four or more detectable objects.

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies: BMWFW and FWF (Aus    [25] CMS Collaboration, "The CMS trigger system", JINST 12 (2017) P01020, doi:10.1088/1748-0221/12/01/P01020, arXiv:1609.02366.
[30] J. Alwall et al., "The automated computation of tree-level and next-to-leading order differential cross sections, and their matching to parton shower simulations", JHEP 07