Search for supersymmetry in events with opposite-sign dileptons and missing transverse energy using an artificial neural network

In this paper, a search for supersymmetry (SUSY) is presented in events with two opposite-sign isolated leptons in the final state, accompanied by hadronic jets and missing transverse energy. An artificial neural network is employed to discriminate possible SUSY signals from a standard model background. The analysis uses a data sample collected with the CMS detector during the 2011 LHC run, corresponding to an integrated luminosity of 4.98 fb$^{-1}$ of proton-proton collisions at the center-of-mass energy of 7 TeV. Compared to other CMS analyses, this one uses relaxed criteria on missing transverse energy ($\not\!\!E_T > 40$ GeV) and total hadronic transverse energy ($H_T > 120$ GeV), thus probing different regions of parameter space. Agreement is found between standard model expectation and observations, yielding limits in the context of the constrained minimal supersymmetric standard model and on a set of simplified models.


Introduction
One of the most natural extensions of the standard model (SM) of particle physics is supersymmetry (SUSY) [1][2][3][4][5][6][7][8]. Supersymmetry allows for gauge coupling unification at the energy of $10^{16}$ GeV, provides a good dark matter candidate (lightest supersymmetric particle, LSP) [9], is a necessary component to explain quantum gravity in the framework of string theory, and automatically cancels the quadratic divergences in radiative corrections to the Higgs boson mass. For every particle in the standard model, SUSY introduces a super-partner, the "sparticle", with spin differing by 1/2 unit from the SM particle. There are theoretical arguments that suggest sparticle masses could be less than ∼1 TeV [7,8], making the experiments at the Large Hadron Collider (LHC) an ideal place for their discovery.
With the successful 2011 LHC run, an integrated luminosity of 4.98 fb$^{-1}$ in pp collisions at 7 TeV center-of-mass energy has been collected with the Compact Muon Solenoid (CMS) experiment. This dataset is used to search for the presence of SUSY particles in events with two opposite-sign leptons (electrons and muons) in the final state, utilizing an artificial neural network (ANN). Two opposite-sign leptons can be produced in a SUSY cascade through the decay of neutralinos and charginos. Assuming that R-parity is conserved [10], a stable, weakly interacting LSP exists, resulting in a missing transverse energy ($\not\!\!E_T$) signature. The amount of missing transverse energy depends on the mass splittings among the heavier sparticles. So far, typical dilepton SUSY searches in CMS have required several jets with large transverse momentum, which correspond to large values of $H_T$, the scalar sum over the transverse momenta of all jets satisfying the jet selection, and large missing transverse energy to discriminate a SUSY signal from the very large SM backgrounds. Compared with previous CMS searches [11,12], this analysis uses relaxed criteria on missing transverse energy ($\not\!\!E_T > 40$ GeV) and $H_T$ ($H_T > 120$ GeV). For SUSY models that yield events with large $\not\!\!E_T$, the ANN's performance is comparable to that of the analyses using large $\not\!\!E_T$ and $H_T$ requirements. Hence, for such models the additional power of a multivariate technique is not required to discriminate between new physics and the SM backgrounds. However, for SUSY models that yield low-$\not\!\!E_T$ or low-$H_T$ signatures, the discriminating power of the ANN helps to suppress the large SM backgrounds.
The results are interpreted in the context of the constrained minimal supersymmetric standard model (CMSSM [13,14]), and a class of simplified model scenarios (SMS) [15,16]. For illustration purposes, the benchmark CMSSM point LM6 ($m_0$ = 85 GeV, $m_{1/2}$ = 400 GeV, $\tan\beta$ = 10, $A_0$ = 0 GeV) is used throughout the paper. In the class of SMS considered, gluinos are pair-produced, with one of them decaying as $\tilde{g} \to \tilde{\chi}^0_2\, jj \to \tilde{\chi}^0_1\, \ell^+ \ell^- jj$, and the other as $\tilde{g} \to \tilde{\chi}^0_1\, jj$. Here $\tilde{\chi}^0_2$ is the second-lightest neutralino, $\tilde{\chi}^0_1$ is the lightest neutralino and the LSP, and $\ell$ = e, µ, or τ with equal probability. This SMS thus always leads to a pair of opposite-sign leptons in the final state, in addition to the jets and $\not\!\!E_T$. The SMS is fully described by the following parameters: the masses of the gluino ($m_{\tilde{g}}$) and the LSP ($m_{\mathrm{LSP}}$), along with the neutralino mass in the gluino decay, which is set to $m_{\tilde{\chi}^0_2} = (m_{\tilde{g}} + m_{\mathrm{LSP}})/2$.

CMS Detector
A detailed description of the CMS detector can be found elsewhere [17]. A right-handed coordinate system is used with the origin at the nominal interaction point. The x axis points to the center of the LHC ring, the y axis is vertical and points upward, and the z axis points in the direction of the counterclockwise proton beam. The azimuthal angle φ is measured with respect to the x axis in the x-y plane and the polar angle θ is defined with respect to the z axis, while the pseudorapidity is defined as η = − ln[tan(θ/2)]. The central feature of the CMS apparatus is a superconducting solenoid, of 6 m internal diameter, that produces a magnetic field of 3.8 T. Located within the field volume are the silicon pixel and strip tracker, and the barrel and endcap calorimeters (|η| < 3), composed of a crystal electromagnetic calorimeter (ECAL) and a brass and scintillator hadron calorimeter (HCAL). Calorimetry provides energy and direction measurements of electrons and hadronic jets. The detector is nearly hermetic, allowing for energy balance measurements in the plane transverse to the beam directions. Outside the field volume, in the forward region (3 < |η| < 5), there is an iron and quartz-fiber hadron calorimeter. The steel return yoke outside the solenoid is instrumented with gas-ionization detectors used to identify muons. The CMS experiment collects data using a two-level trigger system, the Level-1 (L1) hardware trigger [18] and a high-level software trigger (HLT) [19].

Data Samples, Trigger, and Event Selection
Data events are selected using a set of dilepton triggers, which require the presence of at least two leptons, either two muons or two electrons or a muon-electron pair. In the case of the double-muon trigger, the selection is asymmetric with a transverse momentum (p T ) threshold of 13 GeV for the leading (higher-p T ) muon and 8 GeV for the subleading one. In the case of the double-electron trigger, the selection is asymmetric with a threshold applied to the transverse energy of a cluster in the ECAL. The thresholds are fixed to 17 GeV (8 GeV) for the leading (subleading) electron energy. For the muon-electron trigger, the threshold on the transverse momentum, p T (transverse energy, E T ) is 8 GeV (17 GeV) for the muon (electron). For all triggers, additional identification and isolation criteria are also applied.
Muon candidates are reconstructed [20] by combining the information from the inner tracking system, the calorimeters, and the muon system. Electron candidates are reconstructed [21] by combining the information from the ECAL with the silicon tracker, using shower shape and track-ECAL-cluster matching variables in order to increase the sample purity. Jets are reconstructed using the anti-$k_T$ clustering algorithm [22] with a distance parameter $\Delta R = \sqrt{(\Delta\phi)^2 + (\Delta\eta)^2} = 0.5$. The inputs to the jet clustering algorithm are the four-momentum vectors of reconstructed particles. Each such particle is reconstructed with the particle-flow technique [23], which combines information from several subdetectors. The measured jet transverse momenta are corrected with scale factors derived from simulation; to correct for any differences in the energy response between simulation and data, a residual correction factor derived from the latter is applied to jets in the data [24]. In general, $\not\!\!E_T \equiv |-\sum \vec{p}_T|$, where the sum is taken over all final-state particles reconstructed in the CMS detector. The total transverse energy ($\sum E_T$) of the event is calculated as the scalar sum of the transverse energies of leptons and jets. The total hadronic transverse energy ($H_T \equiv \sum |\vec{p}_T|$) is computed as the scalar sum of the transverse energies of all reconstructed jets in the event satisfying the jet selection criteria described below.
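The kinematic quantities defined above can be illustrated with a minimal Python sketch. It is not the CMS reconstruction code: the function names and the tuple layout of the inputs are assumptions made for illustration, and the default jet thresholds are taken from the jet selection quoted in this section ($p_T$ > 30 GeV, |η| < 2.4).

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal separation wrapped into [-pi, pi)."""
    dphi = phi1 - phi2
    return (dphi + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the eta-phi plane: sqrt(deta^2 + dphi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def missing_et(particles):
    """Magnitude of the negative vector sum of transverse momenta.

    particles: iterable of (pt, phi) pairs (hypothetical input layout).
    """
    px = sum(pt * math.cos(phi) for pt, phi in particles)
    py = sum(pt * math.sin(phi) for pt, phi in particles)
    return math.hypot(-px, -py)


def h_t(jets, pt_min=30.0, eta_max=2.4):
    """Scalar sum of jet pT over jets passing the jet selection.

    jets: iterable of (pt, eta, phi) triples (hypothetical input layout).
    """
    return sum(pt for pt, eta, _phi in jets if pt > pt_min and abs(eta) < eta_max)
```

The azimuthal difference is wrapped before it enters the distance, so two objects on either side of the φ = ±π boundary are treated as close rather than far apart.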
Simulated pp collision events are produced with the PYTHIA 6.4.22 [25] generator (using underlying event tune Z2 which is identical to the Z1 tune [26] except that Z2 uses the CTEQ6L parton distribution functions (PDF) while Z1 uses CTEQ5L) for QCD, WW, ZZ and WZ samples. For tt, Drell-Yan, and W + jets samples the MADGRAPH 4.4.24 [27] generator is used. Events are then processed with a simulation of the CMS detector response based on GEANT4 [28]. Multiple proton-proton interactions are superimposed on the hard collision, and all simulated event samples are reweighted according to the distribution of the number of reconstructed primary vertices in data. Simulated events are reconstructed and analyzed in the same way as data events. Simulated event samples are used to train the ANN, to extrapolate background estimates from a background-enriched control region in data to the expected signal-enriched region, and to estimate systematic uncertainties.
Non-collision backgrounds are removed by applying quality requirements ensuring the presence of at least one reconstructed primary vertex [29]. Events are required to have at least two opposite-sign leptons, both electrons or muons, or an electron-muon pair, with $p_T$ > 20 GeV and |η| < 2.4, and at least two jets with $p_T$ > 30 GeV and |η| < 2.4. Jets are required to satisfy the quality criteria described in Ref. [30]. Leptons are required to be isolated from significant energy deposits and tracks in a cone of radius $\Delta R$ = 0.3 around the direction of the lepton. The relative combined isolation, defined as $I^{\mathrm{comb}}_{\mathrm{rel}} = (\sum_{\mathrm{tracks}} p_T + \sum_{\mathrm{ECAL}} E_T + \sum_{\mathrm{HCAL}} E_T)/p_T$, is required to be <0.2 for muons and <0.08 for electrons, with the latter criterion being more strict in order to reject jets misidentified as electrons.
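As an illustration of the lepton requirements above, the following Python sketch applies the kinematic and isolation thresholds quoted in the text. The function names, the argument layout, and the `ISO_MAX` dictionary are hypothetical, and the cone sums are assumed to have been computed upstream within $\Delta R$ = 0.3 of the lepton.

```python
# Isolation thresholds quoted in the text; the dictionary itself is illustrative.
ISO_MAX = {"muon": 0.2, "electron": 0.08}


def relative_combined_isolation(sum_track_pt, sum_ecal_et, sum_hcal_et, lepton_pt):
    """(sum of track pT + ECAL ET + HCAL ET in the cone) divided by the lepton pT."""
    return (sum_track_pt + sum_ecal_et + sum_hcal_et) / lepton_pt


def passes_lepton_selection(flavor, pt, eta, sum_track_pt, sum_ecal_et, sum_hcal_et):
    """Apply pT > 20 GeV, |eta| < 2.4 and the flavor-dependent isolation cut."""
    if pt <= 20.0 or abs(eta) >= 2.4:
        return False
    iso = relative_combined_isolation(sum_track_pt, sum_ecal_et, sum_hcal_et, pt)
    return iso < ISO_MAX[flavor]
```

With the same cone activity, a candidate can pass as a muon and fail as an electron, which reflects the stricter electron criterion described above.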

Signal to Background Discrimination
The ANN in this analysis is used to separate SUSY signals from SM background events, exploiting correlations among the discriminating variables, and thus providing improved results with respect to the use of sequential selections. Due to the presence of isolated leptons, the main SM background contributions to this analysis involve the production of tt, and Z + jets. The QCD multijet processes with two misidentified (fake) leptons, and W + jets events with one misidentified lepton can also be part of the background, but are significantly reduced by applying additional candidate event selection criteria described below. Finally, two leptons in the final state could be produced by WW, WZ or ZZ decays but their contributions are found in simulation to be negligible compared to the main backgrounds.
The candidate event selection criteria, which are imposed before the ANN training, are the following: events are required to have $\not\!\!E_T > 30$ GeV, the distance $\Delta R$ between either of the two leading opposite-sign leptons and the closest jet is required to be >0.2, and the dilepton mass $M_{\ell\ell}$, formed from the two leading opposite-sign leptons, is required to be larger than 10 GeV. These criteria reject the vast majority of the background while retaining most of the signal, as shown in Table 1 for CMSSM benchmark point LM6. This greatly facilitates the ANN training and optimization by excluding a region heavily dominated by background in which few if any signal events are present. The signal region is defined by the candidate event selection criteria with the additional requirement that the ratio of the dilepton transverse energy $\sum E_T^{\mathrm{lepton}}$ to the total transverse energy (as defined in Section 3) be less than 0.4.
Table 1: Expected number of signal and background (bkg.) events after the event selection criteria, and after the candidate event selection criteria for events in the signal region are applied. The next-to-leading-order (NLO) cross section is used for the CMSSM benchmark point LM6 yield determination. The dataset resulting from the candidate event selection is used as input to the ANN. The uncertainties quoted are statistical only.
The ANN training samples are based on simulated events. A mixture of $t\bar{t}$, Z + jets, W + jets, and QCD simulated samples is used as the SM background. For the signal, a class of SMS scenarios [15] is used. For the ANN training, grid points close to the diagonal ($m_{\tilde{g}} = m_{\mathrm{LSP}}$) are used, with $|m_{\tilde{g}} - m_{\mathrm{LSP}}| < 400$ GeV. These points are chosen since they exhibit low $\not\!\!E_T$ or $H_T$ values: more than 90% of the events have $\not\!\!E_T < 200$ GeV or $H_T < 600$ GeV.
Several topological and kinematical variables are considered according to their potential to discriminate SM backgrounds from possible SUSY signals, taking into account the correlations among them. The variables studied are based on the general production and decay characteristics of many supersymmetric processes and are not tuned to a specific model.
Using different combinations of candidate input variables, several ANNs are constructed and compared in order to select the optimal configuration. The differences in performance are studied and quantified in terms of the signal selection efficiency as a function of background rejection. A network with seven input variables, those with the smallest degree of correlation among themselves and with the highest discriminating power, shows the best performance. The ANN variable importance is defined as the sum of the squared weights of the connections between the variable's neuron in the input layer and the neurons in the first hidden layer. Table 2 lists the seven input ANN variables along with their description, and their relative importance after the ANN training.
In Table 2, $\sum E_T$ and $\sum \vec{p}_T$ represent the scalar and vector sums, respectively, over the transverse momenta of all reconstructed jets and leptons.
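The variable importance defined above can be sketched in a few lines of Python with NumPy. The weight-matrix layout (one row per input variable, one column per neuron of the first hidden layer) is an assumption, as is the normalization to unit sum used for the relative importance; the text does not specify either.

```python
import numpy as np


def variable_importance(w_input_hidden):
    """Sum of squared weights from each input neuron to the first hidden layer.

    w_input_hidden: array of shape (n_inputs, n_hidden), an assumed layout.
    """
    w = np.asarray(w_input_hidden, dtype=float)
    return np.sum(w ** 2, axis=1)


def relative_importance(w_input_hidden):
    """Importance normalized to unit sum (normalization is an assumption)."""
    importance = variable_importance(w_input_hidden)
    return importance / importance.sum()
```

For a toy network with two inputs and weights `[[1, 2], [0, 1]]`, the importances are 5 and 1, so the first variable carries 5/6 of the total.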

ANN Output for SM Background
In order to quantify the level of agreement and the significance of a possible excess between data and SM expectation, it is important to provide a robust estimate of the ANN output distribution in the signal region under the SM-only hypothesis along with its systematic uncertainty.
The approach used to estimate the ANN prediction for the SM-only hypothesis from data is as follows. A signal region (SR) is defined by the set of the candidate event selection requirements and the additional criterion on the fraction of transverse energy carried by the dilepton system as described in Section 3. A primary control region (CR) is defined by inverting two of the signal event selection criteria, the total missing transverse energy and the selection cut on the fraction of transverse energy carried by the dilepton system. This region is chosen so that it is dominated by SM processes. Signal contamination in the primary control region is small: for the LM6 benchmark point it is less than 0.03%, and less than 0.4% for SMS points close to the diagonal ($m_{\tilde{g}} = m_{\mathrm{LSP}}$). The ANN output distribution in the primary control region, ANN(SM)$^{\mathrm{data}}_{\mathrm{CR}}$, is then obtained using data.
Next, an extrapolation ratio, $R_{\mathrm{Ext.}}$, between the signal and control regions is obtained from simulated events and is used to extrapolate the ANN output measured in the data control region to the signal region. The primary control region is further subdivided into a $t\bar{t}$ enriched one with $\not\!\!E_T > 30$ GeV and $M_{\ell\ell} \notin [75, 105]$ GeV, denoted as "control region A", and separately into a Z + jets enriched one with $\not\!\!E_T < 30$ GeV or 75 GeV < $M_{\ell\ell}$ < 105 GeV, denoted as "control region B". These are not used in the analysis. However, they provide quality control cross-checks (level of agreement between data and simulation) for the two main backgrounds that affect the analysis. Figure 2 compares the ANN output distributions of data and simulated events in the control regions as defined above. Agreement between data and simulation is observed both in the primary control region used to define the ANN output, as well as in the $t\bar{t}$ and Z + jets dominated control regions "A" and "B".
Similar agreement between data and simulation for the ANN input variables in the control region is observed as well. This helps to confirm that the simulation is appropriate to train the ANN and adequate to be used for the estimation of systematic uncertainties.
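The data-driven extrapolation described in this section can be sketched as follows. This is a hedged illustration rather than the analysis code: it assumes that the extrapolation ratio is taken bin by bin as the simulated SM yield in the signal region divided by the simulated SM yield in the control region, and that the prediction is this ratio multiplied by the control-region data. The function names and the handling of empty bins are illustrative choices.

```python
import numpy as np


def extrapolation_ratio(sim_sr, sim_cr):
    """Assumed bin-by-bin ratio of simulated SM yields, SR over CR.

    Bins with no simulated CR events are assigned a ratio of zero.
    """
    sim_sr = np.asarray(sim_sr, dtype=float)
    sim_cr = np.asarray(sim_cr, dtype=float)
    ratio = np.zeros_like(sim_sr)
    np.divide(sim_sr, sim_cr, out=ratio, where=sim_cr > 0)
    return ratio


def predict_signal_region(data_cr, sim_sr, sim_cr):
    """SM-only ANN output prediction in the SR: ratio times CR data, per bin."""
    return extrapolation_ratio(sim_sr, sim_cr) * np.asarray(data_cr, dtype=float)
```

Because the ratio is taken from simulation while the normalization comes from data, a mismodelling that affects both regions in the same way largely cancels in the prediction.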

Systematic Uncertainties
Systematic uncertainties of the ANN output prediction for the SM-only hypothesis, obtained as described in Section 5, are estimated with simulated data using the following procedure. A systematic effect is introduced into the simulated data for all events in the sample before  any preselection is applied. The nominal SM extrapolation factor R ext is then used to obtain a new ANN output prediction for the signal region corresponding to the systematic effect under study. Next, the ANN output prediction, corresponding to the systematic alteration, is compared against the ANN output for the original sample, without any systematic effects introduced. A binned ANN output distribution is studied for this analysis. The relative difference in ANN outputs for each bin, is assigned as a bin-by-bin systematic uncertainty. Similarly, the relative difference in the integrated number of events above a certain ANN output is assigned as a systematic uncertainty to the number of signal-like events. Finally, for each bin, the relative differences for all systematic effects studied are added in quadrature. This results in a bin-by-bin total systematic uncertainty in the ANN output prediction. In a similar manner the relative differences in the integrated number of events above some ANN value are added in quadrature yielding the total systematic uncertainty on the number of signal-like events.
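The combination step of this procedure can be sketched in Python with NumPy. The sketch assumes that each systematic source has already been propagated to a shifted binned ANN output prediction; the source names and function names are hypothetical, and bins with an empty nominal prediction are given zero relative shift as an illustrative convention.

```python
import numpy as np


def relative_shift(nominal, shifted):
    """Per-bin relative difference between a shifted and the nominal prediction."""
    nominal = np.asarray(nominal, dtype=float)
    shifted = np.asarray(shifted, dtype=float)
    rel = np.zeros_like(nominal)
    np.divide(shifted - nominal, nominal, out=rel, where=nominal > 0)
    return rel


def total_systematic(nominal, shifted_by_source):
    """Add the relative shifts of all sources in quadrature, bin by bin.

    shifted_by_source: dict mapping a source label to its shifted prediction.
    """
    rels = [relative_shift(nominal, s) for s in shifted_by_source.values()]
    return np.sqrt(np.sum(np.square(rels), axis=0))
```

The same two functions apply to the integrated yield above an ANN output requirement by passing one-element arrays instead of binned distributions.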
The overall systematic uncertainties corresponding to the seven input variables used for the ANN construction, as well as the uncertainties in the cross sections of the SM backgrounds, are shown in Table 3 for the ANN optimal selection. The magnitude of the systematic alterations for the jet energy scale is taken from dedicated CMS measurements [31]. While the clustered energy scale of $\not\!\!E_T$ is known to the 3% level in CMS and the unclustered energy scale for $\not\!\!E_T$ is known to within 10% [32], this analysis uses a conservative 10% for the overall $\not\!\!E_T$ systematic uncertainty.
For the input ANN variables for which there is no dedicated CMS measurement, the level of agreement between data and simulation in the control region is used to obtain an estimate of the systematic uncertainty. Therefore, the control region is used to constrain the systematic uncertainties in these cases. Given the above, the difference between data and simulation for the migration of events from the one-jet to the two-jet bin is estimated to be 0.5%. Similarly, the systematic uncertainty on the ratio of the lepton to the total transverse energy is estimated to be 2%, and the M T uncertainty is estimated to be 5%. The dilepton mass scale uncertainty of 1% is taken from the CMS measurements of the Z peak [33].
The relative fraction of $t\bar{t}$ and Z + jets backgrounds is observed to vary as a function of the ANN output, as well as across the signal and control regions. In order to account for any remaining differences, the cross sections of all background components are allowed to vary within their uncertainties, taken from the recent CMS measurements for the $t\bar{t}$ [34] cross section, and using a conservative 50% uncertainty on the QCD cross section. The Z + jets cross section uncertainty (<3%) [33] and the W + jets cross section uncertainty (<3%) [33] produce a negligible systematic effect on the ANN output.

The systematic uncertainties associated with the signal acceptance and efficiency (ANN selection), along with their magnitude, are summarized in Table 4. The uncertainties on the lepton triggers and the lepton isolation are the same as the ones estimated in Ref. [35]. The relative ANN uncertainty for the signal is lower than the corresponding uncertainty for the background, due mainly to the different ANN shapes for these two populations (signal and background).

Performance of the ANN
The ANN output after the training is shown in Fig. 3 for the signal (blue) and SM background (red) samples; the efficiency and purity of the selected samples are also shown as a function of the ANN output requirement. When statistical and systematic uncertainties are taken into account, the ANN output requirement yielding the best expected exclusion limit in the SMS plane is ANN > 0.95. The expected numbers of SM and signal events for the CMSSM benchmark point LM6 after imposing the ANN output requirement of >0.95 are shown in Table 5. The remaining backgrounds are dominated by $t\bar{t}$ events in the dilepton final state, followed by Z + jets production at a much smaller level.
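The efficiency and purity as a function of the ANN output requirement can be illustrated with a short Python sketch. The definitions used here are common conventions and are assumptions, since the text does not spell them out: efficiency is the fraction of signal events passing the requirement, and purity is the signal fraction among all passing events. Events are unweighted for simplicity.

```python
import numpy as np


def efficiency_and_purity(signal_scores, background_scores, threshold):
    """Signal efficiency and sample purity for the requirement score > threshold."""
    sig = np.asarray(signal_scores, dtype=float)
    bkg = np.asarray(background_scores, dtype=float)
    n_sig_pass = np.count_nonzero(sig > threshold)
    n_bkg_pass = np.count_nonzero(bkg > threshold)
    # Guard against empty samples or an empty selection.
    efficiency = n_sig_pass / sig.size if sig.size else 0.0
    n_pass = n_sig_pass + n_bkg_pass
    purity = n_sig_pass / n_pass if n_pass else 0.0
    return efficiency, purity
```

Scanning `threshold` over the ANN output range traces out curves of the kind described for Fig. 3; in a real analysis the events would carry cross section and luminosity weights.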

Results
The seven input ANN variables are shown in Fig. 4 for simulated and data events, after the candidate event selection criteria are applied and for signal events. Data and simulation are consistent with each other, within the statistical and systematic uncertainties. Figure 5 shows the comparison between the SM ANN prediction and the data in the signal region including statistical and systematic uncertainties.
In the signal-like region there are 171 events observed and $140^{+73}_{-46}$ (stat.) ± 42 (syst.) expected. The statistical error on the expectation comes from the number of data events in the control region. The 95% confidence level (CL) upper limit (UL) on the number of signal events is estimated to be 95. There is agreement between expectation and observation at a 68% CL. Figure 6 shows the $\not\!\!E_T$ and $H_T$ distributions for data and simulated events in the signal-like region. These figures illustrate that this analysis accepts signal-like events with $\not\!\!E_T$ as low as 40 GeV or $H_T$ as low as 120 GeV, regions not yet explored by other CMS analyses.
Finally, the observed and expected numbers of events are translated into limits on SUSY parameter space. The 95% CL upper limits are computed using a hybrid CL$_s$ method with profile likelihood test statistics, and lognormal distributions for the background expectation [36,37]. The uncertainties in the NLO+NLL cross sections from the parton distribution functions [38][39][40][41][42], the choice of the factorization and renormalization scale, and $\alpha_S$, are taken into account for each point, and are evaluated according to the PDF4LHC recommendation [43]. A constant signal acceptance systematic uncertainty of 18% is assumed for each point. As described previously, the contamination of the signal in the control region is negligible and hence not taken into account in the limit setting.
The exclusion limits on SMS models are depicted in Fig. 7, and in the (m 0 , m 1/2 ) CMSSM plane are shown in Fig. 8 [44].
As discussed earlier, for SUSY models that yield events with large $\not\!\!E_T$ (CMSSM with $m_0$ < 1000 GeV), the ANN's performance is comparable to that of the analyses using large $\not\!\!E_T$ and $H_T$ requirements, and in some cases worse, given that the ANN has been trained with models characterized by low $\not\!\!E_T$ and $H_T$. For SUSY models that yield events with low $\not\!\!E_T$ and/or $H_T$ (CMSSM with $m_0$ > 1000 GeV, and SMS models close to the diagonal), the ANN performs better than the analyses using large $\not\!\!E_T$ and $H_T$ selection criteria.
In the case of the CMSSM limits and for a specific choice of parameter values, squark masses below ∼700 GeV are excluded at 95% CL; similarly, gluino masses below ∼700 GeV are excluded for the region $m_0$ < 700 GeV. In the region 1000 < $m_0$ < 3000 GeV, gluino masses below ∼300 GeV are excluded, while the squark mass in the excluded models varies in the range from 1000 GeV to 2500 GeV, depending on the value of $m_0$. In the case of the SMS limits, for gluino masses below ∼800 GeV, LSP masses below ∼400 GeV are excluded. For gluino masses above ∼800 GeV, no limits on the mass of the LSP can be set.

Conclusions
A search for supersymmetry in events with two opposite-sign leptons in the final state and with the use of an artificial neural network has been presented, using the 2011 dataset collected with the CMS experiment. This search is complementary to the ones already published by the CMS collaboration and yields comparable exclusion limits for high-$\not\!\!E_T$, high-$H_T$ SUSY models. In addition, the significantly relaxed criteria on $\not\!\!E_T$ and $H_T$ with respect to the previously published analyses allow for the study of events not addressed by previous searches, and provide an independent and complementary probe of this particularly challenging region of phase space. Agreement is observed between the expectation from the SM and the data, with no significant excess, which results in limits in the CMSSM ($m_0$, $m_{1/2}$) and SMS ($m_{\tilde{g}}$, $m_{\mathrm{LSP}}$) planes.