Inclusive search for squarks and gluinos in pp collisions at sqrt(s) = 7 TeV

A search is performed for heavy particle pairs produced in sqrt(s) = 7 TeV proton-proton collisions with 35 inverse picobarns of data collected by the CMS experiment at the LHC. The search is sensitive to squarks and gluinos of generic supersymmetry models, provided they are kinematically accessible, with minimal assumptions on properties of the lightest superpartner particle. The kinematic consistency of the selected events is tested against the hypothesis of heavy particle pair production using the dimensionless razor variable R, related to the missing transverse energy. The new physics signal is characterized by a broad peak in the distribution of MR, an event-by-event indicator of the heavy particle mass scale. This new approach is complementary to missing transverse energy-based searches. After background modeling based on data, and background rejection based on R and MR, no significant excess of events is found beyond the standard model expectations. The results are interpreted in the context of the constrained minimal supersymmetric standard model as well as two simplified supersymmetry models.


Introduction
Models with softly broken supersymmetry (SUSY) [1][2][3][4][5] predict superpartners of the standard model (SM) particles. Experimental limits from the Tevatron and LEP showed that superpartner particles, if they exist, are significantly heavier than their SM counterparts. Proposed experimental searches for R-parity conserving SUSY at the Large Hadron Collider (LHC) have therefore focused on a combination of two SUSY signatures: multiple energetic jets and/or leptons from the decays of pair-produced squarks and gluinos, and large missing transverse energy (E miss T ) from the two weakly interacting lightest superpartners (LSP) produced in separate decay chains.
In this article a new approach is presented that is inclusive not only for SUSY but also in the larger context of physics beyond the standard model. The focal point for this novel razor analysis [6] is the production of pairs of heavy particles (of which squarks and gluinos are examples), whose masses are significantly larger than those of any SM particle. The analysis is designed to kinematically discriminate the pair production of heavy particles from SM backgrounds, without making strong assumptions about the E miss T spectrum or details of the decay chains of these particles. The baseline selection requires two or more reconstructed objects, which can be calorimetric jets, isolated electrons or isolated muons. These objects are grouped into two megajets. The razor analysis tests the consistency, event by event, of the hypothesis that the two megajets represent the visible portion of the decays of two heavy particles. This strategy is complementary to traditional searches for signals in the tails of the E miss T distribution [7][8][9][10][11][12][13][14][15][16] and is applied to data collected with the Compact Muon Solenoid (CMS) detector from pp collisions at √ s = 7 TeV corresponding to an integrated luminosity of 35 pb −1 .
The event-by-event estimator of M ∆ is
\[ M_R \equiv 2\,|\vec{p}^{\,R}_{j_1}| = 2\,|\vec{p}^{\,R}_{j_2}| , \]
where p R j 1 and p R j 2 are the 3-momenta of the megajets in the R frame. For signal events in the limit where the R frame and the true CM frame coincide, M R equals M ∆ , and more generally M R is expected to peak around M ∆ for signal events. For QCD dijet and multijet events the only relevant scale is √ŝ, the CM energy of the partonic subprocess. The search for an excess of signal events in a tail of a distribution is thus recast as a search for a peak on top of a steeply falling SM residual tail in the M R distribution. To extract the peaking signal, the QCD multijet background needs to be reduced to manageable levels. This is achieved using the razor variable, defined as
\[ R \equiv \frac{M^R_T}{M_R} . \]
Since for signal events M R T has a maximum value of M ∆ (i.e., a kinematic edge), R has a maximum value of approximately 1 and the distribution of R for signal events peaks around 0.5. These properties motivate the appropriate kinematic requirements for the signal selection and background reduction. It is noted that, while M R T and M R measure the same scale (one as an end-point, the other as a peak), they are largely uncorrelated for signal events, as shown in Fig. 1. In this figure, the W+jets and tt̄+jets backgrounds peak at M R values partially determined by the W and top quark masses, respectively.
In this analysis the SM background shapes and normalizations are obtained from data. The backgrounds are extracted from control regions in the R and M R distributions dominated by SM processes. Initial estimates of the background distributions in these regions are obtained from simulated samples and then corrected using data.
Events with QCD multijets, top quarks, and electroweak bosons were generated with MADGRAPH interfaced with PYTHIA for parton showering, hadronization, and underlying event description. To generate Monte Carlo samples for SUSY, the mass spectrum was first calculated with SOFTSUSY [22] and the decays with SUSYHIT [23]. The PYTHIA program was used with the SLHA interface [24] to generate the events. The generator level cross section and the K factors for the next-to-leading order (NLO) cross section calculation were computed using PROSPINO [25].
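As an illustration of the razor variables introduced above, the following minimal sketch computes M R, M R T, and R from two megajet four-momenta and the missing transverse momentum. The closed-form lab-frame expressions used for M R and M R T are not given in this excerpt; they are assumed here from the standard razor definitions for massless megajets, and the function name and argument layout are hypothetical.

```python
import math

def razor_variables(j1, j2, met):
    """Return (M_R, M_T^R, R) for two megajets and the missing transverse momentum.

    j1, j2: megajet four-momenta as (E, px, py, pz), in GeV.
    met:    missing transverse momentum as (mex, mey), in GeV.
    """
    e1, px1, py1, pz1 = j1
    e2, px2, py2, pz2 = j2
    mex, mey = met

    # Assumed closed form of M_R = 2|p^R_j| after the longitudinal boost
    # to the R frame; it requires |beta_R| < 1, i.e. a positive denominator.
    denom = (pz1 - pz2) ** 2 - (e1 - e2) ** 2
    if denom <= 0.0:
        raise ValueError("R frame is not well defined for this megajet pair")
    m_r = 2.0 * math.sqrt((e1 * pz2 - e2 * pz1) ** 2 / denom)

    # Assumed form of M_T^R, built from MET and the megajet transverse momenta.
    pt1 = math.hypot(px1, py1)
    pt2 = math.hypot(px2, py2)
    met_mag = math.hypot(mex, mey)
    dot = mex * (px1 + px2) + mey * (py1 + py2)
    m_t_r = math.sqrt(max(0.0, 0.5 * (met_mag * (pt1 + pt2) - dot)))

    # Razor variable R = M_T^R / M_R.
    r = m_t_r / m_r if m_r > 0.0 else float("nan")
    return m_r, m_t_r, r
```

For example, two 100 GeV megajets with equal transverse momenta recoiling against 120 GeV of missing transverse momentum can be passed as `razor_variables((100, 60, 0, 80), (100, 60, 0, -80), (-120, 0))`.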
Events are required to have at least one good reconstructed interaction vertex [26]. When multiple vertices are found, the one with the highest associated ∑ track p T is used. Jets are reconstructed offline from calorimeter energy deposits using the infrared-safe anti-k T [27] algorithm with radius parameter 0.5. Jets are corrected for the nonuniformity of the calorimeter response in energy and η using corrections derived with the simulation and are required to have p T > 30 GeV and |η| < 3.0. The jet energy scale uncertainty for these corrected jets is 5% [28]. The E miss T is reconstructed using the particle flow algorithm [29].
The electron and muon reconstruction and identification criteria are described in [30]. Isolated electrons and muons are required to have p T > 20 GeV and |η| < 2.5 and 2.1, respectively, and to satisfy the selection requirements from [30]. The typical lepton trigger and reconstruction efficiencies are 98% and 99%, respectively, for electrons and 95% and 98% for muons.
The reconstructed hadronic jets, isolated electrons, and isolated muons are grouped into two megajets, when at least two such objects are present in the event. The megajets are constructed as a sum of the four-momenta of their constituent objects. After considering all possible partitions of the objects into two megajets, the combination minimizing the invariant masses summed in quadrature of the resulting megajets is selected among all combinations for which the R frame is well defined.
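The partitioning step described above can be sketched as a brute-force scan over all two-group assignments. This is a minimal illustration, not the analysis code: the function names are hypothetical, objects are assumed to be (E, px, py, pz) tuples, and the additional condition that the R frame be well defined is omitted for brevity. Minimizing the sum of squared invariant masses is equivalent to minimizing the masses summed in quadrature.

```python
def add_p4(a, b):
    """Component-wise sum of two four-momenta (E, px, py, pz)."""
    return tuple(x + y for x, y in zip(a, b))

def mass2(p4):
    """Invariant mass squared of a four-momentum."""
    e, px, py, pz = p4
    return e * e - (px * px + py * py + pz * pz)

def build_megajets(objects):
    """Group objects into two megajets minimizing m1^2 + m2^2."""
    n = len(objects)
    if n < 2:
        raise ValueError("need at least two reconstructed objects")
    best = None
    # Object 0 always goes to group A, so each partition is visited once.
    # The all-ones mask is excluded so that group B is never empty.
    for mask in range(2 ** (n - 1) - 1):
        jet_a = objects[0]
        jet_b = None
        for i in range(1, n):
            if (mask >> (i - 1)) & 1:
                jet_a = add_p4(jet_a, objects[i])
            else:
                jet_b = objects[i] if jet_b is None else add_p4(jet_b, objects[i])
        metric = mass2(jet_a) + mass2(jet_b)
        if best is None or metric < best[0]:
            best = (metric, jet_a, jet_b)
    return best[1], best[2]
```

The scan visits 2^(n-1) - 1 partitions, which is affordable for the small object multiplicities typical of a single event.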
After the construction of the two megajets the boost variable |β R | is computed; due to the approximations mentioned above, |β R | can fall in an unphysical region (≥1) for signal or background events; these events are removed. The additional requirement |β R | ≤ 0.99 is imposed to remove events for which the razor variables become singular. This requirement is typically 85% efficient for simulated SUSY events. The azimuthal angular difference between the megajets is required to be less than 2.8 radians; this requirement suppresses nearly back-to-back QCD dijet events. These requirements define the inclusive baseline selection. After this selection, the signal efficiency in the constrained minimal supersymmetric standard model (CMSSM) [31][32][33][34] parameter space for a gluino mass of ∼600 GeV is over 50%.
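The two baseline requirements above can be sketched as a single predicate. The thresholds (0.99 and 2.8 radians) are those quoted in the text; the expression used for the boost, β R = (E1 - E2)/(pz1 - pz2), is not given in this excerpt and is assumed here from the standard razor construction. The function name is hypothetical.

```python
import math

def passes_baseline(j1, j2, beta_max=0.99, dphi_max=2.8):
    """Apply the |beta_R| and delta-phi requirements to two megajets.

    j1, j2: megajet four-momenta as (E, px, py, pz).
    """
    e1, px1, py1, pz1 = j1
    e2, px2, py2, pz2 = j2

    # Assumed definition of the longitudinal boost to the R frame.
    if pz1 == pz2:
        return False  # boost undefined, treat as failing the selection
    beta_r = (e1 - e2) / (pz1 - pz2)
    if abs(beta_r) > beta_max:
        return False  # unphysical or near-singular razor variables

    # Azimuthal separation folded into [0, pi].
    dphi = abs(math.atan2(py1, px1) - math.atan2(py2, px2))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return dphi < dphi_max
```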

Background Estimation
In traditional searches for SUSY based on missing transverse energy, it is difficult to model the tails of the E miss T distribution and the contribution from events with spurious instrumental effects. The QCD multijet production is an especially daunting background because of its very high cross section and complicated modeling of its high-p T and E miss T tails. In this analysis a cut on R makes it possible to isolate the QCD multijet background in the low-M R region.
Apart from QCD multijet backgrounds, the remaining backgrounds in the lepton and hadronic boxes are processes with genuine E miss T due to energetic neutrinos and leptons from massive vector boson decays (including W bosons from top quark decays). After applying an R threshold, the M R distributions in the lepton and hadronic boxes are very similar for these backgrounds; this similarity is exploited in the modeling and normalization of these backgrounds.

QCD multijet background
The QCD multijet control sample for the hadronic box is defined from event samples recorded with prescaled jet triggers and passing the baseline analysis selection for events without a well-identified isolated electron or muon. The trigger requires at least two jets with an average uncorrected p T > 15 GeV. Because of the low jet threshold, the QCD multijet background dominates this sample for low M R , thus allowing the extraction of the M R shapes with different R thresholds for QCD multijet events. These shapes are corrected for the H T trigger turn-on efficiency.
The M R distributions for events satisfying the QCD control box selection, for different values of the R threshold, are shown in Fig. 2 (left). The M R distribution is exponentially falling, after a turn-on at low M R resulting from the p T threshold requirement on the jets entering the megajet calculation. After the turn-on, which is fitted with an asymmetric Gaussian, the exponential region of these distributions is fitted for each value of R to extract the exponential slope, denoted by S. The value of S that maximizes the likelihood in the exponential fit is found to be a linear function of R 2 , as shown in Fig. 2 (right); fitting S to the form S = a + bR 2 determines the values of a and b.
When measuring the exponential slopes of the M R distributions as a function of the R threshold, the correlations due to events satisfying multiple R threshold requirements are neglected. The effect of these correlations on the measurement of the slopes is studied by using pseudoexperiments and is found to be negligible.
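The last step above, extracting a and b from the measured slopes, can be sketched with an ordinary least-squares fit of S against R 2. This is a simplified illustration: the fit is unweighted, whereas a real analysis would presumably weight each point by its uncertainty, and the function name is hypothetical.

```python
def fit_slope_vs_r2(r_thresholds, slopes):
    """Fit S = a + b * R^2 by unweighted least squares and return (a, b)."""
    if len(r_thresholds) != len(slopes) or len(slopes) < 2:
        raise ValueError("need at least two (R, S) pairs of equal length")
    x = [r * r for r in r_thresholds]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(slopes) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0.0:
        raise ValueError("all R thresholds are identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, slopes))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b
```

Once a and b are known, the slope for any other R threshold follows from S = a + bR 2 without refitting the M R distribution.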
To measure the shape of the QCD background component in the lepton boxes, the corresponding lepton trigger data sets are used with the baseline selection and reversed lepton isolation criteria. The QCD background component in the lepton boxes is found to be negligible.
The R threshold shapes the M R distribution in a simple and therefore predictable way. Event selections with combined R and M R thresholds are found to suppress jet mismeasurements, including severe mismeasurements of the electromagnetic or hadronic component of the jet energy, or other anomalous calorimetric noise signals such as the ones described in [35,36].

W+jets, Z+jets, and t+X backgrounds
Using the muon (MU) and electron (ELE) control boxes defined in Section 3, M R intervals dominated by W(ℓν)+jets events are identified for different R thresholds. In both simulated and data events, the M R distribution is well described by two independent exponential components. The first component of W(ℓν)+jets corresponds to events where the highest p T object in one of the megajets is the isolated electron or muon; the second component consists of events where the leading object in both megajets is a jet, as is typical also for the t+X background events. The first component of W(ℓν)+jets can be measured directly in data, because it dominates over all other backgrounds in a control region of lower M R set by the R threshold. At higher values of M R , the first component of W(ℓν)+jets falls off rapidly, and the remaining background is instead dominated by the sum of t+X and the second component of W(ℓν)+jets; this defines a second control region of intermediate M R set by the R threshold.
Using these two control regions in a given box, a simultaneous fit determines both exponential slopes along with the absolute normalization of the first component of W(ℓν)+jets and the relative normalization of the sum of the second component of W(ℓν)+jets with the other backgrounds. The M R distributions as a function of R are shown in Fig. 3 (left). The slope parameters characterizing the exponential behavior of the first W(ℓν)+jets component are shown in Fig. 3 (right); they are consistent within uncertainties between the electron and muon channels. The values of the parameters a and b that describe the R 2 dependence of the slope are in good agreement with the values extracted from simulated W(ℓν)+jets events.
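The two-component description above can be illustrated with a simple shape function. This is only a sketch under stated assumptions: the parametrization (one absolute normalization, one relative normalization, and each slope written as a + bR 2) follows the text, but the sign convention exp(-S M R) with positive S, the function names, and all numerical values in the usage note are hypothetical.

```python
import math

def slope(a, b, r_cut):
    """Exponential slope for a given R threshold, S = a + b * R^2."""
    return a + b * r_cut ** 2

def two_component_shape(m_r, r_cut, norm1, f_rel, ab1, ab2):
    """Sum of two falling exponentials in M_R.

    norm1: absolute normalization of the first component.
    f_rel: normalization of the second component relative to the first.
    ab1, ab2: (a, b) pairs describing the slope of each component.
    Assumes the convention dN/dM_R ~ exp(-S * M_R) with S > 0.
    """
    s1 = slope(ab1[0], ab1[1], r_cut)
    s2 = slope(ab2[0], ab2[1], r_cut)
    first = math.exp(-s1 * m_r)
    second = f_rel * math.exp(-s2 * m_r)
    return norm1 * (first + second)
```

With a steeper first component, for instance `ab1=(0.02, 0.1)` and `ab2=(0.005, 0.02)`, the first term dominates at low M R and the second takes over at higher M R, which is the behavior that defines the two control regions.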
The data/MC ratios ρ(a) data/MC 1 , ρ(b) data/MC 1 of the first component slope parameters a, b measured in the MU and ELE boxes are thus combined yielding where the quoted uncertainties are determined from the fits.
The ratios ρ data/MC are taken as correction factors for the shapes of the Z+jets and t+X backgrounds as extracted from simulated samples for the MU and ELE boxes; the same corrections are used for the shape of the first component of W(ℓν)+jets as extracted from simulated samples for the hadronic (HAD) box.
The data/MC correction factors for the Z(νν)+jets and t+X backgrounds in the HAD box, as well as the second component of W(ℓν)+jets in the MU, ELE, and HAD boxes, are measured in the MU and ELE boxes using a lepton-as-neutrino treatment of leptonic events. Here the electron or muon is excluded from the megajet reconstruction, kinematically mimicking the presence of an additional neutrino. With the lepton-as-neutrino treatment in the MU and ELE boxes only one exponential component is observed both in data and in W(ℓν)+jets simulated events. In the simulation, the value of this single exponential component slope is found to agree with the value for the second component of W(ℓν)+jets obtained in the default treatment.
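A minimal sketch of the lepton-as-neutrino treatment is given below. Excluding the lepton from the megajet inputs is what the text describes; adding the lepton transverse momentum to the missing transverse momentum is an assumption made here to mimic an extra neutrino, and the function name and tuple layout are hypothetical.

```python
def lepton_as_neutrino(jets, lepton, met):
    """Treat an identified lepton as if it were an undetected neutrino.

    jets:   list of jet four-momenta (E, px, py, pz) used to build megajets.
    lepton: lepton four-momentum (E, px, py, pz), excluded from the megajets.
    met:    missing transverse momentum (mex, mey).
    Returns the megajet inputs and the modified missing transverse momentum.
    """
    _, lep_px, lep_py, _ = lepton
    # Assumption: the lepton pT is counted as additional invisible momentum.
    new_met = (met[0] + lep_px, met[1] + lep_py)
    # Only the jets enter the megajet reconstruction.
    return list(jets), new_met
```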
The combined data/MC correction factors measured using this lepton-as-neutrino treatment are
For the final background prediction the magnitude of the relative normalization between the two W(ℓν)+jets components, denoted f W , is determined from a binned maximum likelihood fit in the region 200 < M R < 400 GeV.

Lepton box background predictions
Having extracted the M R shape of the W+jets and Z+jets backgrounds, their relative normalization is set from the W and Z cross sections measured by CMS in electron and muon final states [30]. Similarly, the normalization of the t+X background relative to W+jets is taken from the tt̄ cross section measured by CMS in the dilepton channel [37]. The measured values of these cross sections are summarized below:
For an R > 0.45 threshold the QCD background is virtually eliminated. The region 125 < M R < 175 GeV, where the QCD contribution is negligible and the W(ℓν)+jets component is dominant, is used to fix the overall normalization of the total background prediction. The final background prediction in the ELE and MU boxes for R > 0.45 is shown in Fig. 4. The number of events with M R > 500 GeV observed in data and the corresponding number of predicted background events are given in Table 1 for the ELE and MU boxes. Agreement between the predicted and observed yields is found. The p-value of the measurement in the MU box is 0.1, given the predicted background (with its statistical and systematic uncertainties) and the observed number of events. A summary of the uncertainties entering the background measurements is presented in Table 2.

Hadronic box background predictions
The procedure for estimating the total background predictions in the hadronic box is summarized as follows:
• Construct the non-QCD background shapes in M R using measured values of a and b from simulated events, applying correction factors derived from data control samples, and taking into account the H T trigger turn-on efficiency.
• Set the relative normalizations of the W+jets, Z+jets, and t+X backgrounds using the relevant inclusive cross section measurements from CMS (Eq. 10).
• Set the overall normalization by measuring the event yields in the lepton boxes, corrected for lepton reconstruction and identification efficiencies. The shapes and normalizations of all the non-QCD backgrounds are now fixed.
• The shape of the QCD background is extracted, as described in Section 5.1, and its normalization in the HAD box is determined from a fit to the low-M R region, as described below.
The final hadronic box background prediction is calculated from a binned likelihood fit of the total background shape to the data in the interval 80 < M R < 400 GeV with all background normalizations and shapes fixed, except for the following free parameters: i) the H T trigger turn-on shapes, ii) f W as introduced in Section 5.2, and iii) the overall normalization of the QCD background. A set of pseudo-experiments is used to test the overall fit for coverage of the various floated parameters and for systematic biases. A 2% systematic uncertainty is assigned to the high-M R background prediction that encapsulates systematic effects related to the fitting procedure. Figure 5 shows the final hadronic box background predictions with all uncertainties on this prediction included for R > 0.5. The observed M R distribution is consistent with the predicted one over the entire M R range. The predicted and observed background yields in the high-M R region are summarized in Table 3. A summary of the uncertainties entering these background predictions is listed in Table 4. A larger R requirement is used in the HAD box analysis due to the larger background.

Limits in the CMSSM Parameter Space
Having observed no significant excess of events beyond the SM expectations, we extract a model-independent 95% confidence level (CL) limit on the number of signal events. This limit is then interpreted in the parameter spaces of SUSY models.
The likelihood for the number of observed events n is modeled as a Poisson function, given the sum of the number of signal events (s) and the number of background events. A posterior probability density function P(s) for the signal yield is derived using Bayes' theorem, assuming a flat prior for the signal and a log-normal prior for the background.
The model-independent upper limit is derived by integrating the posterior probability density function between 0 and s* so that \int_0^{s^*} P(s)\,ds = 0.95. The observed upper limit in the hadronic box is s* = 8.4 (expected limit 7.2 ± 2.7); in the muon box s* = 6.3 (expected limit 3.5 ± 1.1); and in the electron box s* = 2.9 (expected limit 3.6 ± 1.1). For 10% of the pseudo-experiments in the muon box the expected limit is higher than the observed. The stability of the result was studied with different choices of the signal prior. In particular, using the reference priors derived with the methods described in Ref. [38], the observed upper limits in the hadronic, muon, and electron boxes are 8.0, 5.3, and 2.9, respectively.
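The limit-setting procedure described above (Poisson likelihood, flat signal prior, log-normal background prior, 95% posterior integral) can be sketched numerically as follows. This is an illustrative implementation, not the one used in the analysis: the function names, the integration grids, and the choice of ln(1 + relative uncertainty) as the log-normal width are assumptions, and no attempt is made to reproduce the quoted limits, since the observed counts and background predictions are not part of this excerpt.

```python
import math

def poisson(n, mu):
    """Poisson probability of observing n events given mean mu."""
    if mu <= 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))

def likelihood(n, s, b0, rel_unc, nodes=100):
    """Poisson likelihood marginalized over a log-normal background prior."""
    if rel_unc <= 0.0:
        return poisson(n, s + b0)
    sigma = math.log(1.0 + rel_unc)  # assumed width of ln(b)
    total = 0.0
    weight_sum = 0.0
    # Grid in z = ln(b / b0) / sigma with standard normal weights.
    for i in range(nodes + 1):
        z = -5.0 + 10.0 * i / nodes
        w = math.exp(-0.5 * z * z)
        total += w * poisson(n, s + b0 * math.exp(sigma * z))
        weight_sum += w
    return total / weight_sum

def upper_limit(n, b0, rel_unc=0.0, cl=0.95, steps=2000):
    """Bayesian upper limit s* on the signal yield with a flat prior on s."""
    s_max = n + 10.0 * math.sqrt(n + 1.0) + 20.0  # generous integration range
    ds = s_max / steps
    post = [likelihood(n, i * ds, b0, rel_unc) for i in range(steps + 1)]
    # Cumulative trapezoidal integral of the unnormalized posterior.
    cum = [0.0]
    for i in range(steps):
        cum.append(cum[-1] + 0.5 * (post[i] + post[i + 1]) * ds)
    target = cl * cum[-1]
    for i in range(1, steps + 1):
        if cum[i] >= target:
            frac = (target - cum[i - 1]) / (cum[i] - cum[i - 1])
            return (i - 1 + frac) * ds
    return s_max
```

A useful sanity check is the zero-observation case: for n = 0 the posterior is proportional to exp(-s) regardless of the background, so the 95% limit is -ln(0.05), about 3.0 events.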
The results can be interpreted in the context of the CMSSM, which is a truncation of the full SUSY parameter space motivated by the minimal supergravity framework for spontaneous soft breaking of supersymmetry. In the CMSSM the soft breaking parameters are reduced to five: three mass parameters m 0 , m 1/2 , and A 0 being, respectively, a universal scalar mass, a universal gaugino mass, and a universal trilinear scalar coupling, as well as tan β, the ratio of the up-type and down-type Higgs vacuum expectation values, and the sign of the supersymmetric Higgs mass parameter µ. Scanning over these parameters yields models which, while not entirely representative of the complete SUSY parameter space, vary widely in their superpartner spectra and thus in the dominant production channels and decay chains.
The upper limits are projected onto the (m 0 , m 1/2 ) plane by comparing them with the predicted yields, and excluding any model if s(m 0 , m 1/2 ) > s*. The systematic uncertainty on the signal yield (coming from the uncertainty on the luminosity, the selection efficiency, and the theoretical uncertainty associated with the cross section calculation) is modeled according to a log-normal prior. The uncertainty on the selection efficiency includes the effect of jet energy scale (JES) corrections, parton distribution function (PDF) uncertainties [39], and the description of initial-state radiation (ISR). All the effects are summed in quadrature as shown in Table 5. The JES, ISR, and PDF uncertainties are relatively small owing to the insensitivity of the signal R and M R distributions to these effects.
The observed limits from the ELE, MU, and HAD boxes are shown in Figs. 6, 7, and 8, respectively, in the CMSSM (m 0 , m 1/2 ) plane for the values tan β = 3, A 0 = 0, sgn(µ) = +1, together with the 68% probability band for the expected limits, obtained by applying the same procedure to an ensemble of background-only pseudo-experiments. The band is computed around the median of the limit distribution. Observed limits are also shown in Figs. 9-11 in the CMSSM (m 0 , m 1/2 ) plane for the values tan β = 10, A 0 = 0, sgn(µ) = +1, and in Figs. 12-13 for the values tan β = 50, A 0 = 0, sgn(µ) = +1.
Figure 14 shows the same result in terms of 95% CL upper limits on the cross section as a function of the physical masses for two benchmark simplified models [13][41][42][43]: four-flavor squark pair production and gluino pair production. In the former, each squark decays to one quark and the LSP, resulting in final states with two jets and missing transverse energy, while in the latter each gluino decays directly to two light quarks and the LSP, giving events with four jets and missing transverse energy.

Summary
We performed a search for squarks and gluinos using a data sample of 35 pb −1 integrated luminosity from pp collisions at √ s = 7 TeV, recorded by the CMS detector at the LHC. The kinematic consistency of the selected events was tested against the hypothesis of heavy particle pair production using the dimensionless razor variable R related to the missing transverse energy E miss T , and M R , an event-by-event indicator of the heavy particle mass scale. We used events with large R and high M R in inclusive topologies.
The search relied on predictions of the SM backgrounds determined from data samples dominated by SM processes. No significant excess over the background expectations was observed, and model-independent upper limits on the numbers of signal events were calculated. The results were presented in the (m 0 , m 1/2 ) CMSSM parameter space. For simplified models the results were given as limits on the production cross sections as a function of the squark, gluino, and LSP masses.
These results demonstrate the strengths of the razor analysis approach; the simple exponential behavior of the various SM backgrounds when described in terms of the razor variables is useful in suppressing these backgrounds and in making reliable estimates from data of the background residuals in the signal regions. Hence, the razor method provides an additional powerful probe in searching for physics beyond the SM at the LHC.

[7] D0 Collaboration, "Search for squarks and gluinos in events with jets and missing transverse energy using 2.
[9] ATLAS Collaboration, "Search for an excess of events with an identical flavour lepton pair and significant missing transverse momentum in √s = 7 TeV proton-proton collisions with the ATLAS detector", (2011). arXiv:1103.6208. Accepted by EPJC.