Search for supersymmetry in pp collisions at sqrt(s) = 8 TeV in final states with boosted W bosons and b jets using razor variables

A search for supersymmetry in hadronic final states with highly boosted W bosons and b jets is presented, focusing on compressed scenarios. The search is performed using proton-proton collision data at a center-of-mass energy of 8 TeV, collected by the CMS experiment at the LHC, corresponding to an integrated luminosity of 19.7 inverse femtobarns. Events containing candidates for hadronic decays of boosted W bosons are identified using jet substructure techniques, and are analyzed using the razor variables M[R] and R^2, which characterize a possible signal as a peak on a smoothly falling background. The observed event yields in the signal regions are found to be consistent with the expected contributions from standard model processes, which are predicted using control samples in the data. The results are interpreted in terms of gluino-pair production followed by their exclusive decay into top squarks and top quarks. The analysis excludes gluino masses up to 1.1 TeV for light top squarks decaying solely to a charm quark and a neutralino, and up to 700 GeV for heavier top squarks decaying solely to a top quark and a neutralino.


Introduction
The CERN LHC has provided sufficient data to probe a large variety of theories beyond the standard model (SM). Among these, theories based on supersymmetry (SUSY) [1][2][3][4][5][6][7][8][9], which predict the existence of a spectrum of supersymmetric partners to the SM particles, are strongly motivated. Scenarios with nondegenerate supersymmetric particle spectra, with cross sections as low as ≈1 fb, have been explored in many final states; however, as yet no evidence for SUSY has been found.
The focus of many current searches is so-called natural SUSY [10,11], in which the Higgs boson mass can be stabilized without excessive fine-tuning. In natural SUSY scenarios, the Higgsino mass parameter µ is required to be of the order of 100 GeV, and the lightest top squark $\tilde{t}_1$, the gluino $\tilde{g}$, and the lightest bottom squark $\tilde{b}_1$ are constrained to have masses around the TeV scale, while the masses of the other superpartners are unconstrained and can be much heavier and beyond the LHC reach. The possibility that the top squark could be light has motivated several searches by the ATLAS and CMS collaborations [12][13][14][15][16][17][18][19][20][21][22][23] for this sparticle. In general, the sensitivity of these searches diminishes for direct top squark production when the mass of the top squark approaches that of the lightest supersymmetric particle (LSP), which is assumed to be the lightest neutralino $\tilde{\chi}^0_1$. For searches that specifically target the decay $\tilde{t}_1 \to t\tilde{\chi}^0_1$, the sensitivity is reduced when the mass difference ∆m between the top squark and the LSP is comparable to the top quark mass $m_t$.
Here, we focus on two types of scenarios: the so-called compressed spectrum in which ∆m is very small, of the order of a few GeV to tens of GeV (e.g., [24][25][26]), and scenarios where ∆m ≈ $m_t$. In the compressed case, the top squark decays to the LSP and soft decay products, which are difficult to detect. When ∆m ≈ $m_t$, the signature of top squark production is very similar to that of $t\bar{t}$ production, which has a much higher cross section. Therefore, to be sensitive to such processes, we cannot rely solely on the top squark decay products. Possible ways to discriminate the signal are to tag top squark events using a jet from initial-state radiation (ISR), as in the monojet signature [27,28], or to search for top squarks produced in cascade decays of heavier particles, such as the heavy top squark decays $\tilde{t}_2 \to \tilde{t}_1 + H/Z$ [21], or in gluino decays.
In this paper, we search for the challenging top squark final states described above in gluino decays. Specifically, we consider gluino-pair production where each gluino decays to a top squark and a top quark. We consider the scenarios in which the gluino has a mass of around 1 TeV and the lighter top squark has a mass of a few hundred GeV. Because of the significant mass gap between the gluino and the top squark, the top quark from the gluino decay will receive a large boost. The top squark decays to $c\tilde{\chi}^0_1$ for a small ∆m, or to $t\tilde{\chi}^0_1$ for ∆m ≈ $m_t$, as in the targeted searches for $\tilde{t}_1 \to t\tilde{\chi}^0_1$ mentioned above. The analysis described in this paper is especially sensitive to the decay $\tilde{t}_1 \to c\tilde{\chi}^0_1$. Consequently, this analysis provides new information about the viability of natural SUSY.
The gluino-pair production processes described above, with $\tilde{t}_1 \to c\tilde{\chi}^0_1$ or $\tilde{t}_1 \to t\tilde{\chi}^0_1$, can be described using simplified model spectra [29][30][31][32][33][34]. Specifically, the models T1ttcc and T1t1t, shown in Fig. 1, are used in the design of the analysis and in the interpretation of the results.
In light of the discussion above, it is expected that boosted top quarks are a promising signature of new physics involving a massive gluino decaying to a relatively light top squark. Boosted objects with high transverse momentum, $p_T$, are characterized by merged decay products separated by $\Delta R \approx 2m/p_T$, where m denotes the mass of the decaying particle. Merging the top quark decay products within the typical jet size of ∆R = 0.5 requires a top quark momentum of ≈700 GeV, a value difficult to reach with proton-proton collisions at 8 TeV. Therefore, in order to increase the signal efficiency by entering the boosted regime, we focus on W bosons from top quark decays, which require a more accessible $p_T$ of around 300 GeV. The targeted final state therefore contains boosted W bosons and jets originating from b quarks (b jets) from top quark decays, light quark jets from unmerged hadronic W boson decay products or charm quarks, and missing energy from the neutralinos. Hadronically decaying boosted W boson candidates are identified using the pruned jet mass [35][36][37] and a jet substructure observable called N-subjettiness [38]. The razor kinematic variables $M_R$ and $R^2$ [39] are used to discriminate the processes with new heavy particles from SM processes in final states with jets and missing transverse energy. To increase the sensitivity to new physics, we perform the analysis by partitioning the ($M_R$, $R^2$) plane into multiple bins.
[Figure 1 caption (fragment): Here, an asterisk (*) denotes an antiparticle of a supersymmetric partner.]
This paper is organized as follows. The razor variables are introduced in Section 2. Section 3 gives a brief overview of the CMS detector, while Section 4 covers the triggers, data sets, and Monte Carlo (MC) simulated samples used in this analysis. Details of the object definitions and event selection are given in Sections 5 and 6, respectively. Section 7 describes the data/simulation scale factors that are needed to correct the modeling of the boosted W boson tagger. The statistical analysis is explained in Section 8, and Section 9 covers the systematic uncertainties. Finally, our results and their interpretation are presented in Section 10, followed by a summary in Section 11.
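The merging criterion ∆R ≈ 2m/p T quoted in the Introduction can be turned into a quick numerical check of the ≈700 GeV and ≈300 GeV thresholds. The sketch below is only illustrative: the function name and the mass values (173 GeV for the top quark, 80.4 GeV for the W boson) are assumptions introduced here, not inputs taken from the analysis.

```python
def min_pt_for_merging(mass_gev, jet_radius):
    """Return the pT (GeV) at which the decay products of a particle of the
    given mass are separated by roughly jet_radius, using dR ~ 2m/pT."""
    return 2.0 * mass_gev / jet_radius


# Approximate particle masses in GeV (assumed values, not taken from the text).
TOP_MASS = 173.0
W_MASS = 80.4

pt_top = min_pt_for_merging(TOP_MASS, 0.5)  # close to the quoted ~700 GeV
pt_w = min_pt_for_merging(W_MASS, 0.5)      # close to the quoted ~300 GeV
print(f"top: {pt_top:.0f} GeV, W: {pt_w:.0f} GeV")
```

The same relation evaluated with a distance parameter of 0.8 gives roughly 200 GeV for the W boson, which illustrates why a larger jet size makes the boosted regime easier to reach.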

Razor variables
The razor variables $M_R$ and $R^2$ [39] are useful for describing a signal arising from the pair production of heavy particles, each of which decays to a massless visible particle and a massive invisible particle. In the two-dimensional razor plane, a signal with heavy particles is expected to appear as a peak on top of smoothly falling SM backgrounds, which can be empirically described using exponential functions. For this reason, the razor variables are robust discriminators for SUSY signals in which supersymmetric particles are pair produced and decay to SM particles and the LSP. For the simple case in which the final state comprises two visible particles, e.g., jets, the razor variables are defined using the momenta $\vec{p}^{\,j_1}$ and $\vec{p}^{\,j_2}$ of the two jets as
$$M_R \equiv \sqrt{\left(|\vec{p}^{\,j_1}| + |\vec{p}^{\,j_2}|\right)^2 - \left(p_z^{j_1} + p_z^{j_2}\right)^2}, \qquad (1)$$
$$M_T^R \equiv \sqrt{\frac{E_T^{\text{miss}}\left(p_T^{j_1} + p_T^{j_2}\right) - \vec{p}_T^{\,\text{miss}} \cdot \left(\vec{p}_T^{\,j_1} + \vec{p}_T^{\,j_2}\right)}{2}}, \qquad (2)$$
where $p_z^{j_{1,2}}$ are the z components of the $j_{1,2}$ momenta, $\vec{p}_T^{\,\text{miss}}$ is the missing transverse momentum, computed as the negative vector sum of the transverse momenta of all observed particles in the event, and $E_T^{\text{miss}}$ is its magnitude (see Section 5 for a more precise definition). Given $M_R$ and the transverse quantity $M_T^R$, the razor dimensionless ratio is defined as
$$R \equiv \frac{M_T^R}{M_R}. \qquad (3)$$
If the heavy mother particle is denoted by G and the heavy invisible daughter particle is denoted by χ, the peak of the $M_R$ distribution and the end point of the $M_T^R$ distribution are both estimates of the quantity $(m_G^2 - m_\chi^2)/m_G$.
When the decay chains are complicated, producing multiple particles in the final state, the razor variables can still be meaningfully calculated by reducing the final state to a two-"megajet" structure. The megajet algorithm aims to cluster visible particles coming from the decays of the same heavy supersymmetric particle. The razor variables $M_R$ and $R^2$ are computed using the four-momenta of the two megajets, where the megajet four-momentum is the sum of the four-momenta of the particles comprising the megajet. Studies show that, of all the possible clusterings, the one that minimizes the sum of the squared invariant masses of the megajets maximizes the efficiency with which particles are matched to their heavy supersymmetric particle ancestor [40].
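The definitions in this section can be illustrated with a short sketch. It assumes the standard razor definitions, in which M R is built from the magnitudes and z components of the two megajet momenta and M R T from the transverse momenta and the missing transverse momentum, and it implements the megajet clustering as a brute-force scan over all two-group partitions. The function names and input conventions (tuples of momentum components in GeV) are hypothetical choices for illustration, not the analysis code.

```python
import math


def razor_variables(p1, p2, met):
    """Compute (M_R, M_T^R, R^2) from two megajet 3-momenta (px, py, pz)
    and the missing transverse momentum 2-vector (px, py), all in GeV."""
    mag1 = math.sqrt(sum(c * c for c in p1))
    mag2 = math.sqrt(sum(c * c for c in p2))
    m_r = math.sqrt((mag1 + mag2) ** 2 - (p1[2] + p2[2]) ** 2)

    pt1 = math.hypot(p1[0], p1[1])
    pt2 = math.hypot(p2[0], p2[1])
    met_mag = math.hypot(met[0], met[1])
    dot = met[0] * (p1[0] + p2[0]) + met[1] * (p1[1] + p2[1])
    # Guard against tiny negative values caused by floating-point rounding.
    m_t_r = math.sqrt(max(0.0, 0.5 * (met_mag * (pt1 + pt2) - dot)))
    return m_r, m_t_r, (m_t_r / m_r) ** 2


def mass_squared(p4):
    """Squared invariant mass of a four-vector (E, px, py, pz)."""
    energy, px, py, pz = p4
    return energy ** 2 - px ** 2 - py ** 2 - pz ** 2


def megajets(jets):
    """Split jets, given as (E, px, py, pz), into two groups that minimize
    the sum of the squared invariant masses. Brute force over all
    partitions; needs at least two jets. Returns two lists of indices."""
    n = len(jets)
    best = None
    # Odd masks keep jet 0 in the first group, so each partition appears once;
    # stopping before 2**n - 1 keeps the second group non-empty.
    for mask in range(1, 2 ** n - 1, 2):
        first = [i for i in range(n) if (mask >> i) & 1]
        second = [i for i in range(n) if not (mask >> i) & 1]
        total = sum(
            mass_squared(tuple(map(sum, zip(*(jets[i] for i in group)))))
            for group in (first, second)
        )
        if best is None or total < best[0]:
            best = (total, first, second)
    return best[1], best[2]


# Two collinear jets recoiling against a third one: the collinear pair merges.
jets = [(100.0, 100.0, 0.0, 0.0), (50.0, 50.0, 0.0, 0.0), (120.0, -120.0, 0.0, 0.0)]
first, second = megajets(jets)
```

The exhaustive scan grows as 2^(n-1) with the number of jets n, which is acceptable for the modest jet multiplicities of a single event but would need a smarter search for very busy final states.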
Figure 2 shows the simulated distributions of the overall SM background and a T1ttcc signal with $m_{\tilde{g}}$ = 1 TeV, $m_{\tilde{t}}$ = 325 GeV, and $m_{\tilde{\chi}^0_1}$ = 300 GeV in the ($M_R$, $R^2$) plane. The binning is chosen in accordance with the exponentially falling behavior of the razor variables, to optimize the statistical precision in each bin. The numerical values of the bin boundaries, which are used throughout the analysis, are given in Table 5. The SM background, which mainly arises from multijet production, is dominant at low values of $R^2$, while the SUSY-like signal peaks higher in the ($M_R$, $R^2$) plane ($M_R$ peaks at around 900 GeV, which is the expected value).
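As a rough check of the quoted peak position, one can evaluate the estimator introduced earlier in this section for the example signal point. This is a simplified estimate that assumes the gluino plays the role of the heavy mother particle G and the neutralino that of the invisible daughter χ, ignoring the intermediate top squark and the masses of the visible particles:

```latex
\frac{m_{\tilde{g}}^2 - m_{\tilde{\chi}^0_1}^2}{m_{\tilde{g}}}
  = \frac{(1000\ \text{GeV})^2 - (300\ \text{GeV})^2}{1000\ \text{GeV}}
  = 910\ \text{GeV}
```

Under these assumptions the estimate is consistent with a peak at around 900 GeV.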
In order to be sensitive to low-$E_T^{\text{miss}}$ scenarios (small ∆m), we use a lower $R^2$ threshold than that used in previous razor analyses [40][41][42][43]. To exploit the boosted phase space in which the expected signal significance is greater than in the nonboosted phase space, we work at large $(m_G^2 - m_\chi^2)/m_G$ and thus at high $M_R$, allowing us to raise the $M_R$ threshold. This has the added virtue of keeping the SM backgrounds at a manageable level.

The CMS detector
A detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found elsewhere [44]. A characteristic feature of the CMS detector is its superconducting solenoid magnet, of 6 m internal diameter, which provides a field of 3.8 T. Within the field volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter, and a brass and scintillator hadron calorimeter. Muon detectors based on gas-ionization chambers are embedded in a steel flux-return yoke located outside the solenoid. Events are collected by a two-layer trigger system, where the first level is composed of custom hardware processors, and is followed by a software-based high-level trigger.
[Figure 2 caption (fragment): ... = 300 GeV, both obtained from simulation. A very loose selection is used: a good primary vertex and at least three jets, one of which is required to have $p_T$ > 200 GeV.]
The tracking system covers the pseudorapidity region |η| < 2.5, the muon detector |η| < 2.4, and the calorimeters |η| < 3.0. Additionally, the forward region at 3 < |η| < 5 is covered by steel and quartz fiber forward calorimeters. The near hermeticity of the detector permits an accurate measurement of the momentum balance in the transverse plane.

Trigger and event samples
This analysis is based on a sample of proton-proton collision data at $\sqrt{s}$ = 8 TeV collected by the CMS experiment in 2012 and corresponding to an integrated luminosity of 19.7 fb$^{-1}$. Events are selected using two triggers, requiring either the highest jet $p_T$ or the scalar sum $H_T$ of jet transverse momenta to be above given thresholds. The jet $p_T$ threshold was 320 GeV (and 400 GeV for a brief data taking period corresponding to 1.8 fb$^{-1}$), while the $H_T$ threshold was 650 GeV. The two trigger algorithms were based on a fast implementation of the particle-flow (PF) reconstruction method [45,46], which is described in Section 5.
To measure the efficiency of these triggers, samples with unbiased jet $p_T$ and $H_T$ distributions are obtained using an independent set of triggers that require at least one electron or muon. Figure 3 shows, on the left-hand side, the efficiency of the requirement that events satisfy at least one of the two trigger conditions as well as the baseline selection described in Section 6, in the ($H_T$, leading jet $p_T$) plane. The trigger is fully efficient for events with $H_T$ > 800 GeV. In order to account for the lower efficiency of the regions with $H_T$ < 800 GeV, the measured trigger efficiency over the ($H_T$, leading jet $p_T$) plane is applied as an event-by-event weight to the simulated samples. The right-hand side of Fig. 3 shows the trigger efficiency across the ($M_R$, $R^2$) plane for the total simulated background.
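Applying a measured two-dimensional efficiency as an event-by-event weight amounts to a binned lookup in the (H T , leading jet p T ) plane. The sketch below shows one way this could be done. The bin edges and efficiency values are made-up placeholders, and the choice to clamp out-of-range values to the nearest bin is an assumption of this example; only the fact that the efficiency reaches 1 above H T = 800 GeV is taken from the text.

```python
import bisect

# Hypothetical bin edges (GeV) and efficiencies; the real map is measured in data.
HT_EDGES = [400.0, 600.0, 800.0, 2000.0]
PT_EDGES = [200.0, 320.0, 1000.0]
EFFICIENCY = [
    [0.20, 0.95],  # 400 <= HT < 600
    [0.70, 0.98],  # 600 <= HT < 800
    [1.00, 1.00],  # HT >= 800: fully efficient
]


def find_bin(edges, value):
    """Index of the bin containing value, clamped to the first or last bin."""
    index = bisect.bisect_right(edges, value) - 1
    return min(max(index, 0), len(edges) - 2)


def trigger_weight(ht, leading_jet_pt):
    """Weight to apply to a simulated event with the given HT and jet pT."""
    return EFFICIENCY[find_bin(HT_EDGES, ht)][find_bin(PT_EDGES, leading_jet_pt)]
```

A simulated event with H T = 500 GeV and a 250 GeV leading jet would then enter every histogram with the weight stored in the first bin of the placeholder map, rather than with unit weight.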

Event reconstruction
We select events that have at least one interaction vertex associated with at least four charged-particle tracks. The vertex position is required to lie within 24 cm of the center of the CMS detector along the beam direction and within 2 cm from the center in the plane transverse to the beam. Because of the high instantaneous luminosity of the LHC, hard scattering events are typically accompanied by overlapping events from multiple proton-proton interactions (pileup), and therefore contain multiple vertices. We identify the primary vertex, i.e., the vertex of the hard scatter, as the one with the highest value of the $\sum p_T^2$ of the associated tracks. Detector- and beam-related filters are used to discard events with anomalous noise that mimic events with high energy and a large imbalance in transverse momentum [65,66].
CMS reconstructs events using the PF algorithm, in which candidate particles (PF candidates) are formed by combining information from the inner tracker, the calorimeters, and the muon system. Each PF candidate is assigned to one of five object categories: muons, electrons, photons, charged hadrons, and neutral hadrons. Contamination from pileup events is reduced by discarding charged PF candidates that are incompatible with having originated from the primary vertex [67]. The average pileup energy associated with neutral hadrons is computed event by event and subtracted from the jet energy and from the energy used when computing lepton isolation, i.e., a measure of the activity around the lepton. The energy subtracted is the average pileup energy per unit area (in ∆η × ∆φ) times the jet or isolation cone area [68,69].
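The primary vertex choice and the area-based pileup subtraction described above are both simple enough to sketch in a few lines. The data layout (a vertex represented as a list of track p T values) and the decision to clamp the corrected energy at zero are assumptions made for this example only.

```python
def primary_vertex(vertices):
    """Return the vertex with the largest sum of squared track pT.
    Each vertex is represented here as a list of track pT values (GeV)."""
    return max(vertices, key=lambda track_pts: sum(pt * pt for pt in track_pts))


def pileup_corrected_energy(raw_energy, rho, area):
    """Subtract the average pileup energy density rho (GeV per unit area in
    eta-phi) times the jet or isolation-cone area from the raw energy.
    Clamping at zero is a choice made for this sketch."""
    return max(0.0, raw_energy - rho * area)


# A vertex with one hard track wins over a vertex with many soft tracks.
vertices = [[1.0] * 6, [10.0, 2.0, 2.0, 2.0]]
hard_scatter = primary_vertex(vertices)
```

Squaring the track p T before summing is what favors the hard scatter: a pileup vertex typically has many tracks, but each carries little transverse momentum.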
Jets are clustered with FASTJET 3.0.1 [70] using the anti-$k_T$ algorithm [71] with distance parameter ∆R = 0.5. These jets are referred to as AK5 jets. Corrections are applied as a function of jet $p_T$ and η to account for the residual effects of a nonuniform detector response. The jet energies are corrected so that, on average, they match those of simulated particle-level jets [72]. After correction, jets are required to have $p_T$ > 30 GeV and |η| < 2.4. We use the combined secondary vertex algorithm [73,74] to identify jets arising from b quarks. The medium tagging criterion, which yields a misidentification rate for light quark and gluon jets of ≈1% and a typical efficiency of ≈70%, is used to select b jets. The loose tagging criterion, with a misidentification rate of ≈10% and an efficiency of ≈85%, is used to reject events containing b jets.
To identify boosted W bosons, we follow a procedure similar to that outlined in Ref. [75]. Jets are clustered with FASTJET using the Cambridge-Aachen algorithm [76] and a distance parameter of 0.8, yielding CA8 jets. Jet energy corrections for these jets are derived from the anti-$k_T$ jets with distance parameter ∆R = 0.7. Simulations show that the corrections are valid for CA8 jets and have an additional uncertainty ≤ 2%.
The jet mass is calculated from the constituents of the jet after jet pruning, which removes the softest constituents of the jet. During jet pruning, the jet constituents are reclustered, and at each step the softer and larger-angle "protojet" of the two protojets to be merged is removed should it fail certain criteria [35,36]. A CMS study has shown that jet pruning reduces pileup effects and provides good discrimination between boosted W jets and quark/gluon (q/g) jets [37]. We define mass-tagged jets (mW) as CA8 jets with $p_T$ > 200 GeV and jet mass within the range 70 < $m_{\text{jet}}$ < 100 GeV around the W boson mass.
In addition to the jet mass, we also consider the N-subjettiness [38] variables, which are obtained by first finding N candidate axes for subjets in a given CA8 jet, and then computing the quantity
$$\tau_N = \frac{1}{d_0} \sum_k p_{T,k}\, \min\left(\Delta R_{1,k}, \Delta R_{2,k}, \ldots, \Delta R_{N,k}\right), \qquad d_0 = \sum_k p_{T,k} R_0, \qquad (4)$$
where $\Delta R_{n,k}$ is the distance between constituent k and subjet axis n, $R_0$ is the original jet distance parameter, and k runs over all constituent particles. The subjet axes are obtained with FASTJET via exclusive $k_T$ clustering, followed by a one-pass optimization to minimize the N-subjettiness value. The quantity $\tau_N$ is small if the original jet is consistent with having N or fewer subjets. Therefore, to discriminate boosted W bosons, which have two subjets, from q/g jets characterized by a single subjet, we require that a W boson mass-tagged jet satisfy $\tau_2/\tau_1$ < 0.5 for it to be classified as a W boson tagged jet (labeled W in the following). The W boson tagging efficiency is dependent on the CA8 jet $p_T$, and is 50-55% according to simulation. The corresponding misidentification rate is 3-5%. We also define W boson antitagged jets (aW) as W boson mass-tagged jets that satisfy the complement of the $\tau_2/\tau_1$ criterion, and use these jets to define control regions for data-driven background modeling.
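To make the tagging logic concrete, the sketch below evaluates the N-subjettiness for a given set of subjet axes and then classifies a CA8 jet as tagged (W), antitagged (aW), or neither. It assumes the standard definition of τ N as the p T -weighted minimum distance of each constituent to the nearest axis, normalized by the summed constituent p T times R 0 . Finding the axes (exclusive k T clustering plus one-pass minimization) is not implemented here: the axes are simply passed in. All function names are hypothetical.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the eta-phi plane, with the phi difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def tau_n(constituents, axes, r0=0.8):
    """N-subjettiness for constituents (pt, eta, phi) and N axes (eta, phi)."""
    d0 = r0 * sum(pt for pt, _, _ in constituents)
    weighted = sum(
        pt * min(delta_r(eta, phi, a_eta, a_phi) for a_eta, a_phi in axes)
        for pt, eta, phi in constituents
    )
    return weighted / d0


def classify_w_candidate(jet_pt, pruned_mass, tau21):
    """Return 'W' (tagged), 'aW' (antitagged), or None if not mass-tagged."""
    if jet_pt > 200.0 and 70.0 < pruned_mass < 100.0:
        return "W" if tau21 < 0.5 else "aW"
    return None


# Idealized two-prong jet: two equal constituents at eta = +0.2 and -0.2.
two_prong = [(100.0, 0.2, 0.0), (100.0, -0.2, 0.0)]
tau1 = tau_n(two_prong, [(0.0, 0.0)])
tau2 = tau_n(two_prong, [(0.2, 0.0), (-0.2, 0.0)])
```

For this idealized jet the two-axis value vanishes while the one-axis value does not, so the ratio τ 2 /τ 1 is small, as expected for a genuine two-prong W decay.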
To calculate $\vec{p}_T^{\,\text{miss}}$, which is used in the calculation of the razor variable $R^2$ defined in Eqs. (2) and (3), we take the negative of the vector sum of the transverse momenta of all the PF candidates in an event.
Loosely identified and isolated electrons [77] (and muons [78]) with $p_T$ > 5 GeV and |η| < 2.5 (2.4) are used both to suppress backgrounds in the signal region and in the definition of the control regions. Tightly identified isolated leptons, electrons (muons) with $p_T$ > 10 GeV and |η| < 2.5 (2.4), define a control region enriched in $Z \to \ell\bar{\ell}$ events, from which we estimate the systematic uncertainty in the predicted number of $Z \to \nu\bar{\nu}$ events in the signal region. Electron candidates that lie in the less well-instrumented transition region between the barrel and end cap calorimeters, 1.44 < |η| < 1.57, are rejected. We suppress the background from events that are likely to contain τ and other leptons that fail the loose selection by discarding events with isolated tracks with $p_T$ > 10 GeV and a track-primary vertex distance along the beam direction $|d_z|$ < 0.05 cm.
Known differences between the properties of data and MC simulated data are corrected by weighting simulated events with data/simulation scale factors for the jet energy scale, b tag, W mass-tag, W tag, and W antitag efficiency. The W tagging-related scale factors are described in Section 7. In addition, event-by-event weights are used to correct the simulated data so that their pileup, trigger, top quark $p_T$, and ISR characteristics match those of the data.

Analysis strategy and event selection
We search for deviations from the SM in the (high-$M_R$, high-$R^2$) region using events with at least one boosted W boson, at least one b-tagged jet, and no isolated leptons or tracks. SM backgrounds in the signal region S are estimated using observations in control regions and scale factors, calculated from MC simulation, that relate the number of events in one region to that in another. Three control regions, Q, T, and W, select high-purity samples of multijet, $t\bar{t}$, and $W(\to\ell\nu)$+jets events, respectively. Details of the background estimation method are given in Section 8.
Events must satisfy the following baseline selection:
1. have at least one good primary vertex (see Section 5);
2. pass all detector- and beam-related filters (see Section 5);
3. have at least three selected AK5 jets of which at least one has $p_T$ > 200 GeV, thereby defining the boosted phase space; and
4. satisfy $M_R$ > 800 GeV and $R^2$ > 0.08, where the megajets are constructed from the selected AK5 jets.
The details of the event selection in addition to the baseline selection are given in Table 1. The signal and control regions are defined using different requirements on the multiplicities of leptons, b-tagged jets, and W-tagged jets, and on kinematic variables that discriminate between different processes. The multijet-enriched control sample Q is used for estimating the multijet background in the S and T regions. To characterize Q, we use the fact that $E_T^{\text{miss}}$ in multijet events is largely due to jet mismeasurements rather than the escape of particles that interact weakly with the detector; consequently, $\vec{p}_T^{\,\text{miss}}$ will often be aligned with one of the jets. Therefore, a good discriminant between multijet events and events with genuine $E_T^{\text{miss}}$ is
$$\Delta\phi_{\min} = \min_i \Delta\phi\left(\vec{p}_T^{\,j_i}, \vec{p}_T^{\,\text{miss}}\right), \qquad (5)$$
that is, the minimum of the angles between $\vec{p}_T^{\,\text{miss}}$ and the transverse momentum of each jet, where i runs over the three leading AK5 jets. Since detector inaccuracies mostly cause undermeasurements of the jet energy and momentum, the variable $\Delta\phi_{\min}$ provides a reliable discrimination of fake $E_T^{\text{miss}}$ in multijet events.
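The discriminant described above, the smallest azimuthal angle between the missing transverse momentum and any of the three leading jets, can be computed as in the following sketch. The function names and the convention that jets are supplied already ordered by decreasing p T are assumptions of this example.

```python
import math


def delta_phi(phi1, phi2):
    """Absolute azimuthal separation, wrapped into [0, pi]."""
    return abs((phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi)


def delta_phi_min(met_phi, jet_phis):
    """Minimum delta-phi between the missing transverse momentum and the
    three leading jets; jet_phis must be ordered by decreasing jet pT."""
    return min(delta_phi(met_phi, phi) for phi in jet_phis[:3])


# The fourth jet is aligned with the missing momentum but is not considered.
example = delta_phi_min(0.1, [0.0, 3.0, -2.0, 0.1])
```

The wrapping step matters: two directions at φ = 3.1 and φ = -3.1 are only about 0.08 rad apart, not 6.2 rad, and a naive subtraction would misclassify such multijet-like events as having well-separated missing momentum.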
The T and W control regions are used to characterize the $t\bar{t}$ and W+jets backgrounds, respectively, in the S region. The contamination in the S region from fully hadronic decays of $t\bar{t}$ pairs is negligible because they do not produce sufficient genuine $E_T^{\text{miss}}$ to satisfy our event selection. The $t\bar{t}$ contamination thus consists of the semileptonic decays of $t\bar{t}$ pairs in which one W boson is boosted and the other W boson decays to a charged lepton that is not identified. Therefore, the T region is required to have a lepton from the decay of a W boson, at least one b-tagged jet, and a W-tagged jet. Similarly, the W+jets contribution in the S region comes from leptonic W boson decays in which the charged lepton is not identified and a jet is misidentified as a W jet. Therefore, we require the W region to have events with a lepton from the W boson and a mass-tagged boosted W jet, which is a quark- or gluon-initiated jet misidentified as a boosted W boson. The N-subjettiness criterion is not imposed in order to maintain high event yields in these control regions and therefore higher statistical precision.
Table 1: Summary of the selections used, in addition to the baseline selection, to define the signal region (S), the three control regions (Q, T, W), and the two regions (S′, Q′) used for the cross-checks described later in the text.
In the T and W regions, we suppress potential signals using the transverse mass,
$$m_T = \sqrt{2\, p_T^{\ell}\, E_T^{\text{miss}} \left(1 - \cos\Delta\phi\right)}, \qquad (6)$$
where ∆φ is the difference in azimuthal angle between the lepton $\vec{p}_T^{\,\ell}$ and $\vec{p}_T^{\,\text{miss}}$, and $p_T^{\ell}$ is the magnitude of the lepton $\vec{p}_T^{\,\ell}$. The $m_T$ distribution exhibits a kinematic edge at the mass of the W boson for $t\bar{t}$ and $W(\to\ell\nu)$+jets processes. However, such an edge is not present for signal events because of the extra contribution to $E_T^{\text{miss}}$ from neutralinos, which escape direct detection. Therefore, potential signals are suppressed in the T and W regions by requiring $m_T$ < 100 GeV. For the W region, we additionally require $m_T$ > 30 GeV in order to reduce residual contamination from multijet events, which are expected to have small $E_T^{\text{miss}}$ and therefore small $m_T$. Table 1 lists two additional control regions, S′ and Q′, which are used in the cross-checks described later in this section.
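The transverse-mass requirements for the T and W regions can be sketched as follows, assuming the standard definition of m T in terms of the lepton p T , the missing transverse energy, and the azimuthal angle between them. The function names are illustrative only.

```python
import math


def transverse_mass(lepton_pt, lepton_phi, met, met_phi):
    """Transverse mass (GeV) of the lepton plus missing-momentum system."""
    return math.sqrt(2.0 * lepton_pt * met * (1.0 - math.cos(lepton_phi - met_phi)))


def passes_t_region_mt(m_t):
    """T region: suppress potential signal with m_T < 100 GeV."""
    return m_t < 100.0


def passes_w_region_mt(m_t):
    """W region: additionally require m_T > 30 GeV against multijet events."""
    return 30.0 < m_t < 100.0


# A 40 GeV lepton back to back with 40 GeV of missing transverse energy.
m_t_example = transverse_mass(40.0, 0.0, 40.0, math.pi)
```

In this back-to-back configuration the transverse mass equals 80 GeV, close to the W boson mass, which is where the kinematic edge for genuine W decays sits.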
Figure 4 shows the simulated distributions in the signal region for the $M_R$ and $R^2$ variables, where the smoothly falling nature of the backgrounds, as well as their relative contributions, can be observed. The $m_T$ distribution in the T and W regions prior to the $m_T$ and $\Delta\phi_{\min}$ selection is shown in Fig. 5, while Fig. 6 shows the $\Delta\phi_{\min}$ distribution in the Q region, for both data and simulated backgrounds. Overall, there is reasonable agreement between the observed and simulated yields. The discrepancies are accommodated by the systematic uncertainties we assign to the simulated yields.
In Table 2, we show the expected number of events obtained from simulation for the different background processes and for the example T1ttcc model with $m_{\tilde{g}}$ = 1 TeV, $m_{\tilde{t}}$ = 325 GeV, and $m_{\tilde{\chi}^0_1}$ = 300 GeV. The observed event counts after different levels of selection, beyond the trigger requirement, are also reported. The background composition in percent after the baseline, S, Q, T, and W region selections is reported in Table 3. The signal region is $t\bar{t}$ dominated, with additional contributions from $W(\to\ell\nu)$+jets and multijet processes. Each control region, Q, T, and W, has high purity for the background process it targets: 90% multijet, 83% $t\bar{t}$ and single top quark processes, and 85% $W(\to\ell\nu)$+jets, respectively. The discrepancies between the observations and the simulation are due to uncertainties in the MC modeling, especially for the multijet processes.
We do not explicitly estimate the background in the signal region. Rather, from the observations in the control regions, we create a prior distribution (described in Section 8) for the four background components of the signal region that incorporates all statistical and systematic uncertainties. However, in order to verify that the control regions in data provide adequate models for backgrounds in the signal region and that the translations between different regions behave as expected, we perform two cross-checks, taking into account statistical uncertainties only.
Table 2: Event yields in simulated event samples and in data as event selection requirements are applied. The simulated event counts are normalized to an integrated luminosity of 19.7 fb$^{-1}$. "Other" refers to the sum of the small background components $Z/\gamma^* \to \ell\bar{\ell}$+jets, triboson, and $t\bar{t}V$. The signal is the T1ttcc model with $m_{\tilde{g}}$ = 1000 GeV, $m_{\tilde{t}_1}$ = 325 GeV, $m_{\tilde{\chi}^0_1}$ = 300 GeV. The row corresponding to "$n_{\text{PV}}$ > 0" gives the event counts after applying the noise filters, pileup reweighting, top $p_T$ reweighting for $t\bar{t}$, ISR reweighting for the signal, and the requirement of at least one primary vertex. The column listing the total number of background events also includes some processes that only contribute at the early stages of the event selection. The cross sections used for each sample are listed in the second line of the header. Several of the simulated background samples were produced with generator-level selections applied, which are not fully covered by the first selection levels listed in this table.
In the first cross-check, we predict the background in a signal-like control region, and compare these predictions with the observations in that region. This control region, denoted by S′, is defined by inverting the $\Delta\phi_{\min}$ requirement while preserving the rest of the signal selection. The estimated number of events in the S′ region for the multijet, $W(\to\ell\nu)$+jets, and top quark processes is computed as follows:
[Eqs. (7)-(9)]
while the estimated number of multijet events in the control region T is given by
[Eq. (10)]
In Eqs. (7)-(10), the superscripts denote one of the control regions, while the subscripts "other", $W(\to\ell\nu)$, TTJ + T, and multijet denote the sum of the small backgrounds, $W(\to\ell\nu)$+jets, $t\bar{t}$ plus single top quark, and multijet, respectively, while "obs" labels observed counts. These equations are used only in this cross-check. However, they incorporate the same relations between signal and control regions as will be used in the likelihood procedure described in Section 8. As can be seen from Table 3, the nominal choice of the parameters associated with systematic uncertainties leads to $N^{T}_{\text{multijet, MC}} = 0$. The total estimated background in S′ is
$$N^{S'} = \sum_i N^{S'}_i, \qquad (11)$$
where i runs over all background processes. For smaller backgrounds, $N^{S'}_i$ is determined by simulation. Backgrounds are estimated bin by bin in the ($M_R$, $R^2$) space, where the bin boundaries are numerically defined in Table 5. [...] statistical precision is not sufficient to yield reliable bin-by-bin estimates. The expected global scale factors, which we denote by κ, are defined in Section 8, which also describes how they are calculated.
Figure 7 shows the projection on the $M_R$ and $R^2$ axes of the predicted and observed distributions in the S′ region. The prediction agrees with observation within ≈20%. This cross-check of the background modeling shows that it is feasible to estimate a multicomponent background in a signal-like region using the control regions we have defined.
In the second cross-check, we use the Q region to estimate the background in a signal-like Q region, denoted by Q′, for which $\Delta\phi_{\min}$ > 0.5, from the relationship
$$N^{Q'}_{\text{pred}} = N^{Q}_{\text{obs}}\, \frac{N^{Q'}_{\text{MC}}}{N^{Q}_{\text{MC}}}. \qquad (12)$$
Here, $N_{\text{MC}}$ includes all contributing background processes, and $N^{Q}_{\text{obs}}$ is the observed count in the Q region. This test assesses the degree to which the simulated distribution of $\Delta\phi_{\min}$ as well as its extrapolation from the Q region to the S region are reliable. As observed from Table 3, the multijet process is only a small contribution in the Q region. Therefore, this cross-check assesses how well the reduction of the multijet process, via the $\Delta\phi_{\min}$ > 0.5 requirement, is modeled. The comparison between prediction and observation can be made from the data shown in Fig. 8. The level of discrepancy between the prediction and the observation in this cross-check is incorporated as a systematic uncertainty of 42% in the global scale factor for the multijet component, as described in Section 8.

The W boson tagging scale factors
The W boson tagger used in this analysis is the same as that defined and used in previous CMS analyses [75,79]. Since the W boson tagging efficiency does not depend significantly on the event topology, we use the same scale factor [75],
$$\text{SF}_{\text{Wtag}} = 0.86 \pm 0.07, \qquad (13)$$
as used in these previous analyses, to correct for the modeling differences between FULLSIM and data in the W boson tagging efficiency. We apply the scale factor to processes with genuine hadronically decaying W bosons (mainly $t\bar{t}$ and signal) in the S and T regions.
On the other hand, the data/FULLSIM scale factors for the misidentification (mistag) efficiency for mass-tagged, antitagged, and tagged W bosons are derived specifically for this analysis. The mistag efficiency is defined as the probability to tag, with one of the W taggers, a jet not originating from the hadronic decay of a W boson. Scale factors are necessary to correct the mistag efficiencies for W boson mass tagging and antitagging in the MC simulation of the Q and W control regions, respectively, whereas the mistag efficiency scale factor for W boson tagging is used to correct simulated events with misidentified W bosons, e.g., multijet or $W(\to\ell\nu)$+jets events, in the S and T regions. All three mistag efficiency scale factors are derived using the same multijet-enriched control region, defined as region Q with the exception of all selections related to razor variables and W tagging. To obtain the mistag efficiencies f for W boson tagging, mass tagging, and antitagging, we use the leading CA8 jet in each event and measure the fraction of these jets passing the given tagger. After obtaining f in both data and FULLSIM, we compute the scale factor,
$$\text{SF} = \frac{f_{\text{data}}}{f_{\text{FULLSIM}}}. \qquad (14)$$
The scale factors for the W boson tagging, mass tagging, and antitagging mistag efficiency vary between 1.0-1.2, 1.1-1.4, and 1.2-1.5, respectively, depending on the CA8 jet $p_T$. The uncertainties in the scale factor include the statistical uncertainty as well as the trigger efficiency and jet energy scale uncertainties, and vary between 2-7% depending on the CA8 jet $p_T$.
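The mistag scale factor described above is the ratio of two measured fractions, which the following sketch computes together with a statistical uncertainty. The uncertainty treatment is an assumption of this example: it uses a simple binomial approximation on unweighted counts and ignores the trigger-efficiency and jet-energy-scale contributions mentioned in the text. The function name and the event counts are hypothetical.

```python
import math


def mistag_scale_factor(pass_data, all_data, pass_mc, all_mc):
    """Return (scale factor, statistical uncertainty) for a mistag efficiency,
    given the number of leading CA8 jets passing the tagger and the total
    number of such jets, in data and in simulation (unweighted counts)."""
    f_data = pass_data / all_data
    f_mc = pass_mc / all_mc
    scale_factor = f_data / f_mc
    # Relative binomial uncertainty on a fraction f = k/N is sqrt((1 - f) / k).
    rel_data = math.sqrt((1.0 - f_data) / pass_data)
    rel_mc = math.sqrt((1.0 - f_mc) / pass_mc)
    return scale_factor, scale_factor * math.hypot(rel_data, rel_mc)


# Made-up counts for one CA8 jet pT bin.
sf, sf_unc = mistag_scale_factor(120, 1000, 100, 1000)
```

In practice the calculation would be repeated in each CA8 jet p T bin and for each of the three taggers, since the text reports a p T -dependent scale factor for tagging, mass tagging, and antitagging.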
Because the signal processes are simulated with FASTSIM, the resulting tagging efficiencies must be corrected for modeling differences between the programs FASTSIM and FULLSIM. To compute the W boson tagging efficiency FULLSIM/FASTSIM scale factor we use a sample of $t\bar{t}$ events simulated with FULLSIM and FASTSIM. We first determine the W boson tagging efficiency for both samples, considering only events with exactly one hadronically decaying W boson at the generator level for which the closest reconstructed CA8 jet lies within ∆R = 0.8 of the W boson. Since we wish to select boosted W bosons, and not boosted top quarks, we require that there be no (generator-level) b quark from the top quark decay within the cone of the closest CA8 jet. The W boson tagging efficiency as a function of $p_T$ for a given sample is then obtained by dividing the $p_T$ distribution of the closest CA8 jets that also satisfy the tagging condition (70 < $m_{\text{jet}}$ < 100 GeV and $\tau_2/\tau_1$ < 0.5) by the $p_T$ distribution of all of the closest CA8 jets. To determine the FULLSIM/FASTSIM scale factor for the W boson tagging efficiency, we divide the efficiencies obtained from the FULLSIM and FASTSIM samples, $\text{SF}_{\text{Full/Fast}}(p_T) = \epsilon_{\text{FULLSIM}}(p_T)/\epsilon_{\text{FASTSIM}}(p_T)$. This scale factor is applied to all signal samples and varies between 0.89-0.95, depending on the $p_T$ of the given CA8 jet, with an uncertainty of less than 3%.

Statistical analysis
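The statistical analysis of the observations in the signal region is based on a likelihood function, $\mathcal{L}(\sigma)$, given by

$$\mathcal{L}(\sigma) = \int \prod_{i=1}^{M} p(N_i^{S} \mid \sigma, L, \theta_i)\, \pi(L)\, \pi(\theta_1, \cdots, \theta_M)\, dL\, d\theta_1 \cdots d\theta_M, \qquad (15)$$

where $\sigma$ is the total signal cross section, $M = 25$ is the number of bins in the $(M_R, R^2)$ plane, $N_i^{S}$ is the observed count in bin $i$ of the signal region, $L$ is the integrated luminosity, and $\theta_i$ denotes the signal efficiency and background parameters of bin $i$. The priors $\pi(L)$ and $\pi(\theta_1, \cdots, \theta_M)$ describe the integrated luminosity and the bin-by-bin parameters, respectively. The latter is constructed from observations in the control regions and the four global scale factors

$$\kappa^{A/B}_{\text{process}} = \sum_i b^{A}_{\text{process,MC},i} \Big/ \sum_i b^{B}_{\text{process,MC},i},$$

where the sum is over all bins of the simulated data; A and B denote any of the S, Q, T, or W regions.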
The association of the global scale factors with the control regions is shown in Fig. 9, which also shows which control regions provide constraints on the background parameters, $b^{S}_{\text{process}}$. Although we use the same global scale factors in each bin, shape uncertainties in the simulated distributions are accounted for by allowing the uncertainty in the scale factors to be bin dependent. The 25 signal bins in the $(M_R, R^2)$ plane are divided into three sets for which different uncertainties are applied: the four bins nearest the origin (set 1), the five surrounding bins (set 2), and the remaining bins (set 3). The likelihood per bin is taken to be

$$p(N_i^{S} \mid \sigma, L, \theta_i) = \text{Poisson}\Big(N_i^{S} \,\Big|\, \epsilon_i\, \sigma\, L + \sum_{\text{process}} b^{S}_{\text{process},i}\Big).$$

The integral in Eq. (15) is approximated using MC integration by sampling the priors $\pi(L)$ and $\pi(\theta_1, \cdots, \theta_M)$ and averaging the multibin likelihood with respect to the sampled points $\{(L, \theta_1, \cdots, \theta_M)\}$. The priors for the expected integrated luminosity $L$, signal efficiencies $\epsilon$, and simulated background counts $b^{\text{region}}_{\text{process,MC}}$ are modeled with gamma function densities,

$$\text{gamma}(x; \gamma, \beta) = \beta^{-1} (x/\beta)^{\gamma - 1} \exp(-x/\beta) / \Gamma(\gamma),$$

in which the mode is set to $c$ and the variance to $\delta c^2$, where $c \pm \delta c$ denotes either the measured integrated luminosity or, for a given bin of a given region and process, the simulated signal efficiency, or the simulated background count. From $c \pm \delta c$, we calculate the gamma density parameters,

$$\gamma = \left[(k + 2) + \sqrt{(k + 2)^2 - 4}\,\right] / 2, \qquad \beta = \left[\sqrt{c^2 + 4\,\delta c^2} - c\right] / 2,$$

where $k = (c/\delta c)^2$. For empty bins, we set $\gamma = 1$ and the bin value is constrained to zero by setting the $\beta$ parameter to $10^{-4}$.
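The gamma density parameters follow from the two stated conditions. For a gamma density with shape $\gamma$ and scale $\beta$, the mode is $(\gamma - 1)\beta$ and the variance is $\gamma\beta^2$; setting these to $c$ and $\delta c^2$ gives two quadratic equations, one for each parameter. The sketch below solves them and draws samples with the Python standard library, whose `random.gammavariate(alpha, beta)` takes a shape and a scale. The numerical inputs are hypothetical, and the scale parameterization is an assumption consistent with the empty-bin convention above.

```python
import math
import random

def gamma_params(c, dc):
    """Shape and scale of a gamma density with mode c and variance dc**2."""
    k = (c / dc) ** 2
    # Shape solves g**2 - (k + 2) * g + 1 = 0 (larger root, so that g > 1).
    shape = 0.5 * ((k + 2.0) + math.sqrt((k + 2.0) ** 2 - 4.0))
    # Scale solves b**2 + c * b - dc**2 = 0 (positive root).
    scale = 0.5 * (math.sqrt(c * c + 4.0 * dc * dc) - c)
    return shape, scale

def sample_prior(c, dc, rng):
    """Draw one value from the gamma prior; empty bins are pinned near zero."""
    if c == 0.0:
        return rng.gammavariate(1.0, 1e-4)
    shape, scale = gamma_params(c, dc)
    return rng.gammavariate(shape, scale)

# Hypothetical simulated background count of 10 +/- 2 events in one bin.
shape, scale = gamma_params(10.0, 2.0)
rng = random.Random(0)
draws = [sample_prior(10.0, 2.0, rng) for _ in range(5)]
print(shape, scale, draws)
```

A quick check of the result is that `(shape - 1) * scale` returns the input mode and `shape * scale**2` returns the input variance.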

For the signal efficiencies and backgrounds, the prior is modeled hierarchically,

$$\pi(\theta_1, \cdots, \theta_M) = \int \prod_{i=1}^{M} \left[ \int \pi(\theta_i \mid c_i)\, \pi(c_i \mid \phi)\, dc_i \right] \pi(\phi)\, d\phi, \qquad (19)$$

where $\phi$ represents parameters that characterize the independent sources of systematic uncertainty, described in Section 9. The integral in Eq. (19) is evaluated as follows: $\phi$ values are sampled from $\pi(\phi)$ following the procedure described in Section 9, then $c_i$ values from $\pi(c_i \mid \phi)$, and finally $\theta_i$ values from $\pi(\theta_i \mid c_i)$. The sampling from $\pi(\phi)$ and $\pi(\theta_i \mid c_i)$ is straightforward because the functional forms are known. However, the sampling of $c_i$ requires running the analysis multiple times, yielding an ensemble of histograms in the $(M_R, R^2)$ plane, which is the output of the procedure described in Section 9. Thereafter, the sampling, which yields the points $\{(L, \theta_1, \cdots, \theta_M)\}$, proceeds as follows:
1. sample the integrated luminosity parameter;
2. sample the efficiency parameters, $\epsilon$, for every bin and every signal model;
3. sample the background parameters $b^{\text{region}}_{\text{process,MC}}$ for every bin and every background;
4. scale $b^{Q}_{\text{multijet,MC}}$ by a random number sampled from a gamma density of unit mode and standard deviation 0.36 in order to induce the 42% uncertainty in the multijet global scale factor $\kappa^{Q/S}_{\text{multijet}}$ that accounts for deficiencies in the modeling of multijet production, as derived from the second cross-check mentioned in Section 6;
5. compute the $\kappa$ parameters from the appropriate background sums, for example, $\kappa^{Q/S}_{\text{multijet}} = \sum_i b^{Q}_{\text{multijet,MC},i} / \sum_i b^{S}_{\text{multijet,MC},i}$;
6. scale each $\kappa$ value by a random number sampled from a gamma density with unit mode and standard deviation of either 0.5 or 1.0 for the bins in set 2 or set 3, respectively, to account for the larger uncertainties in the tails of the simulated distributions; and
7. sample the background parameters $b^{S}_{\text{multijet}}$, $b^{S}_{\text{TTJ}}$, and $b^{S}_{W(\to\ell\nu)}$ from the Poisson models of the control regions; for example, for region Q, the posterior density of $b^{S}_{\text{multijet}}$ is computed from the Poisson model of the observed count using a flat prior in $b^{S}_{\text{multijet}}$, and $b^{S}_{\text{multijet}}$ is sampled from the posterior density.
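Steps 5 and 7 of the sampling procedure above can be sketched in a simplified setting. The sketch assumes that a single process populates the control region, so that the observed count follows Poisson$(N^{Q} \mid \kappa\, b^{S})$; with a flat prior in $b^{S}$ the posterior is then a gamma density with shape $N^{Q} + 1$ and scale $1/\kappa$. The analysis itself includes several processes per region, and all numbers below are hypothetical.

```python
import random

def global_scale_factor(b_control_mc, b_signal_mc):
    """kappa = sum of simulated counts in the control region over the sum in the signal region."""
    return sum(b_control_mc) / sum(b_signal_mc)

def sample_b_signal(n_obs, kappa, rng):
    """Draw b^S from its posterior given Poisson(n_obs | kappa * b^S) and a flat prior in b^S."""
    return rng.gammavariate(n_obs + 1, 1.0 / kappa)

# Hypothetical simulated multijet counts per bin in regions Q and S.
b_q_mc = [30.0, 15.0, 5.0]
b_s_mc = [6.0, 3.0, 1.0]
kappa = global_scale_factor(b_q_mc, b_s_mc)

# Hypothetical observed count in the control-region bin.
rng = random.Random(1)
draws = [sample_b_signal(n_obs=50, kappa=kappa, rng=rng) for _ in range(20000)]
mean = sum(draws) / len(draws)
print(kappa, mean)
```

The posterior mean in this toy setup is $(N^{Q} + 1)/\kappa$, which gives a simple cross-check of the sampled values.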

Systematic uncertainties
The input to the statistical analysis is an ensemble of histograms in the $(M_R, R^2)$ plane that incorporate systematic uncertainties in the simulated signal and background samples. The independent systematic effects, described below, are sampled simultaneously. For each sampled systematic effect, a Gaussian variate with zero mean and unit variance is used in the calculation of the random shift due to the systematic effect for all the signal and background models. Likewise, the same randomly sampled PDFs are used for all signal and background models. In this way, the statistical dependencies among all bins of the signal and background models are correctly, and automatically, modeled. The sampling of the systematic effects is repeated several hundred times.
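The key point of this sampling scheme is that one Gaussian variate per systematic effect is shared by all models and all bins, which is what induces the correlations. A minimal sketch follows; the dictionary layout, the effect name, and the multiplicative combination of relative shifts are assumptions made for illustration and are not taken from the analysis code.

```python
import random

def sample_shifted_yields(nominal, rel_unc, rng):
    """Return one systematically shifted copy of the yields.

    nominal: {model: [yield per bin]}
    rel_unc: {effect: {model: [relative 1-sigma shift per bin]}}
    """
    # One unit Gaussian variate per effect, shared by every model and bin.
    z = {effect: rng.gauss(0.0, 1.0) for effect in rel_unc}
    shifted = {}
    for model, yields in nominal.items():
        out = []
        for i, y in enumerate(yields):
            factor = 1.0
            for effect, per_model in rel_unc.items():
                factor *= 1.0 + z[effect] * per_model[model][i]
            out.append(max(y * factor, 0.0))  # yields cannot be negative
        shifted[model] = out
    return shifted

# Hypothetical yields and a single 10% effect acting on both models.
nominal = {"signal": [10.0, 20.0], "background": [100.0, 200.0]}
rel_unc = {"jet_energy_scale": {"signal": [0.1, 0.1], "background": [0.1, 0.1]}}
rng = random.Random(42)
ensemble = [sample_shifted_yields(nominal, rel_unc, rng) for _ in range(300)]
```

Because the variate is shared, signal and background move up or down together in every member of the ensemble.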
In all cases, except for those associated with PDFs, the systematic uncertainties are in the scale factors (SF) applied to the simulated samples to correct them for modeling deficiencies. We consider the systematic uncertainties in the following quantities:
• Jet energy scale: The uncertainties are dependent on jet $p_T$ and $\eta$ [72].
• Parton distribution functions: We use 100 randomly sampled sets of PDFs from NNPDF23_lo_as_0130_qed [83], MSTW2008lo68cl [84], and CT10 [53]. The samples for the latter two are generated using the program HESSIAN2REPLICAS, recently released with LHAPDF6 [85]. Given a sampled set $i$, for PDF set $K$ and the PDF set $O$ with which the events were simulated, events are reweighted using the scale factors, $\mathrm{SF}_{K,i} = w_{K,i}/w_{O}$, where the weights $w$ are products of the event-by-event PDFs for the colliding partons.
• Trigger efficiency: We take the uncertainty in each bin, as a function of $H_T$ and leading jet $p_T$, to be the maximum of the statistical uncertainty in the efficiency after the baseline selection and the difference between the efficiencies before and after the baseline selection.
• b tagging scale factors: The b tagging performance differs between data and simulation, and differs between FULLSIM and FASTSIM, which is used to model signal processes. The simulated events are therefore corrected by applying jet flavor-, $p_T$-, and $\eta$-dependent data/FULLSIM and FULLSIM/FASTSIM scale factors on the b tagging or mistagging efficiency. The uncertainties in these scale factors are also jet flavor, $p_T$, and $\eta$ dependent, and are of the order of a few percent [74].
• W tagging scale factors: The W boson tag efficiency, and the mistag efficiency for W boson tagging, W boson mass tagging, and W boson antitagging differ between data and simulation, as well as between FULLSIM and FASTSIM. Data/FULLSIM and FULLSIM/FASTSIM scale factors, whose uncertainties are functions of jet $p_T$, are applied to the simulated samples.
• Lepton identification: For electrons, we use $p_T$- and $\eta$-dependent scale factors for the identification efficiency. The uncertainties are also $p_T$ and $\eta$ dependent [77]. The scale factor for the muon identification efficiency equals one and the corresponding uncertainties are negligible [78].
• Initial-state radiation: Deficiencies in the modeling of ISR are corrected by reweighting [19] the signal samples using an event weight that depends on the $p_T$ of the recoiling system. The associated systematic uncertainty is equal to the difference $1 - w_{\text{ISR}}$, where $w_{\text{ISR}}$ is the ISR event weight.
• Top quark transverse momentum: Differential top quark pair production cross section analyses have shown that the shape of the $p_T$ spectrum of top quarks in data is softer than predicted [86]. To account for this, we reweight events based on the $p_T$ of the generator-level $t$ and $\bar{t}$ quarks in the $t\bar{t}$ simulation. The uncertainty associated with this reweighting is taken to be equal to the full amount of the reweighting.
• Pileup: Simulated events are reweighted so that their vertex multiplicity distribution matches that observed in data. The minimum-bias cross section is varied by ±5%, thereby changing the shape of the vertex multiplicity distribution and therefore the weights.
• Multijet spectrum: The cross-checks described in Section 6 showed that there is a 42% uncertainty in the multijet scale factor $\kappa$ between the S and Q regions. This uncertainty is incorporated by increasing the uncertainty in the $\kappa$ parameter, as described in Section 8.
• $Z(\to\nu\bar{\nu})$+jets prediction: About 8% of the background in the signal region is composed of $Z(\to\nu\bar{\nu})$+jets events. Since we require the presence of at least one b-tagged jet, and given the known deficiency in modeling Z production in association with heavy flavor quarks [87], we include an extra systematic uncertainty in the $Z(\to\nu\bar{\nu})$+jets contribution. This uncertainty is estimated using a data control region enriched in $Z(\to\ell\bar{\ell})$+jets, required to have exactly two tight leptons with the same flavor (e or µ) and opposite charge, $60 < m_{\ell\bar{\ell}} < 120$ GeV, at least one b-tagged jet, and at least one W mass-tagged jet. We estimate the uncertainty by first computing bin-by-bin data/simulation ratios in this control region. Then, we take the uncertainty in the ratio in each bin as the standard deviation of a Gaussian density, normalized to the number of events in that bin. Finally, the Gaussian densities from all bins are superposed, and the uncertainty is taken to be the magnitude of the 68% band around a ratio of unity.
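The PDF reweighting in the list above amounts to a ratio of products of parton densities evaluated at the momentum fractions of the two colliding partons. The sketch below shows that arithmetic with toy density functions; the function signatures, the toy functional form, and the event kinematics are hypothetical stand-ins, and a real implementation would evaluate the densities through a PDF library instead.

```python
def pdf_weight(pdf, id1, x1, id2, x2, q):
    """Product of the parton densities of the two colliding partons."""
    return pdf(id1, x1, q) * pdf(id2, x2, q)

def pdf_scale_factor(pdf_new, pdf_orig, id1, x1, id2, x2, q):
    """Event weight SF = w_new / w_orig used to reweight a simulated event."""
    w_new = pdf_weight(pdf_new, id1, x1, id2, x2, q)
    w_orig = pdf_weight(pdf_orig, id1, x1, id2, x2, q)
    return w_new / w_orig

# Toy densities (assumption for illustration only; not real PDF sets).
def toy_orig(pid, x, q):
    return (1.0 - x) ** 3 / x

def toy_new(pid, x, q):
    return 1.1 * (1.0 - x) ** 3 / x

# Hypothetical gluon-gluon event (PDG id 21) at a scale of 1000 GeV.
sf = pdf_scale_factor(toy_new, toy_orig, 21, 0.1, 21, 0.2, 1000.0)
print(sf)
```

Since the toy replica is 10% higher than the original density for both partons, the event weight is the square of 1.1.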
As noted above, all systematic effects are varied simultaneously across $(M_R, R^2)$ bins. However, to assess the effect of each systematic uncertainty individually, each one is varied by one standard deviation up and down. The effect on the background count and signal efficiency in the signal region is shown in Table 4. The signal values are obtained from averaging over all mass points in the T1ttcc model ($\Delta m = 25$ GeV) plane. The PDF systematic uncertainties are obtained by running over 100 different members from the three PDF sets and fitting a Gaussian function to the efficiency distribution. The last line in the table corresponds to the full sampling of the systematic uncertainties. To obtain this value, we again fit a Gaussian function to the efficiency distribution obtained from the full systematic sampling including 500 variations. Although the effects of some of these systematic uncertainties on the backgrounds are large, they do not influence our results greatly because only the ratios of simulated background counts enter the statistical analysis, not the absolute values. Therefore, most of the systematic effects cancel. The statistical precision on the number of events in the control regions is the leading uncertainty in the background prediction for the search bins at large $M_R$ or $R^2$. The dominant systematic uncertainty in the signal efficiency arises from the PDFs.

Results and interpretation
Our background predictions for each bin in the $(M_R, R^2)$ plane are presented in Fig. 10 and in Table 5, which also lists the observed event yield in each bin. The background predictions are presented as the mean and standard deviation as determined from the background prior $\pi(\theta)$ described in Section 8. The observed event yields are found to be in agreement with the predicted backgrounds from SM processes. Consequently, no evidence of a signal is observed.
We interpret our results in terms of the simplified model spectra T1ttcc and T1t1t, whose diagrams are shown in Fig. 1.These models each have three mass parameters: the gluino, top squark, and LSP masses.The mass of the gluino is varied between 600 and 1300 GeV and that of the LSP between 1 and 500 GeV, while the mass difference between the top squark and the LSP, ∆m, is fixed at 10, 25, or 80 GeV for the T1ttcc model, and at 175 GeV for the T1t1t model.In both models the gluino is assumed to decay 100% of the time into a top squark and a top quark.
To illustrate the expected signal sensitivity, we show in Fig. 11 the signal efficiencies as a function of the gluino and neutralino masses, for the T1ttcc model, to which this analysis is particularly sensitive, and for the T1t1t model. Efficiencies of up to 6% in the most boosted regimes are reached. For the T1ttcc model a drop in efficiency is observed for the region of model parameter space with the lowest neutralino mass ($m_{\tilde{\chi}^0_1} = 1$ GeV), which can be explained by Lorentz boosts. For LSP masses higher than the mass of the charm quark, the LSP will assume most of the momentum. For the bins with the lowest LSP mass, however, the LSP and the charm quark have about equal mass, so that after the boost they will share the momentum about equally. This results in a softer $E_T^{\text{miss}}$ spectrum and therefore a lower $R^2$ value, which reduces the efficiency substantially.
Figure 12 shows the observed 95% confidence level (CL) upper limit on the signal cross section as a function of the gluino and neutralino masses, obtained using the CLs method described briefly in Section 8, for the T1t1t model and for the T1ttcc model with $\Delta m = 10$, 25, and 80 GeV. The figure also shows contours corresponding to the observed and expected lower limits, including their uncertainties, on the gluino and neutralino masses. This analysis has made significant inroads into the parameter space of the T1ttcc model. Gluinos with mass up to about 1.1 TeV have been excluded for neutralinos with a mass less than about 400 GeV when the top squark decays to a charm quark and a neutralino and $\Delta m < 80$ GeV. This also means that top squarks with masses up to about 400 GeV have been excluded for small mass differences with the LSP, given the existence of a gluino with a mass less than about 1.1 TeV. Similarly, for the T1t1t model, top squarks with a mass of up to about 300 GeV have been excluded for the scenarios with $\Delta m = 175$ GeV and gluino mass less than 700 GeV. The observed limit for this model is lower than the expected limit because of the small excess in the low $M_R$ bins for $0.12 \leq R^2 < 0.16$, which are among the most sensitive bins for the T1t1t model.

Summary
We have presented a search for new physics in hadronic final states with at least one boosted W boson and a b-tagged jet using data binned at high values of the razor kinematic variables, $M_R$ and $R^2$. The analysis uses 19.7 fb$^{-1}$ of 8 TeV proton-proton collision data collected by the CMS experiment. The SM backgrounds are estimated using control regions in data. Scale factors, derived from simulations, connect these control regions to the signal region. The observations are found to be consistent with the SM expectation, as shown in Fig. 10 and Table 5. The results, which are encapsulated in a binned likelihood, are interpreted in terms of supersymmetric models describing pair production of heavy gluinos decaying to boosted top quarks. Limits are set on the gluino and neutralino masses using the CLs criterion on the gluino-neutralino mass plane, as shown in Fig. 12. Assuming that the gluino always decays into a top squark and a top quark, this analysis excludes gluino masses up to 1.1 TeV for top squarks with a mass of up to about 450 GeV that decay exclusively to a charm quark and a neutralino. In this scenario, the mass difference considered between the top squark and the neutralino is less than 80 GeV. This analysis also excludes gluino masses of up to 700 GeV when the top squark decays solely to a top quark and a neutralino, and the mass difference between the top squark and the neutralino is around the top quark mass.

Figure 2: Distributions in the $(M_R, R^2)$ space of the overall SM backgrounds and a T1ttcc signal with $m_{\tilde{g}} = 1$ TeV, $m_{\tilde{t}} = 325$ GeV and $m_{\tilde{\chi}^0_1}$

Figure 3: (Left panel) The trigger efficiency, obtained from data, as a function of $H_T$ and leading jet $p_T$ after the baseline selection discussed in Section 6. (Right panel) The trigger efficiency as a function of $M_R$ and $R^2$ after the same baseline selection, obtained by applying the trigger efficiency as a function of $H_T$ and leading jet $p_T$ to the simulated background.

Figure 4: Simulated $M_R$ (left panel) and $R^2$ (right panel) distributions in the signal region, S. Stacked on top of the background distributions is the predicted signal contribution from an example T1ttcc model, with parameters $m_{\tilde{g}} = 1$ TeV, $m_{\tilde{t}} = 325$ GeV, and $m_{\tilde{\chi}^0_1} = 300$ GeV. The bin entries are normalized proportionally to the bin width.

Figure 5: Distributions of $m_T$ for data and simulated backgrounds, in the T (left panel) and W (right panel) control regions, without applying any selection on $m_T$ and $\Delta\phi_{\text{min}}$. The contribution from an example signal corresponding to the T1ttcc model with $m_{\tilde{g}} = 1$ TeV, $m_{\tilde{t}} = 325$ GeV, and $m_{\tilde{\chi}^0_1} = 300$ GeV, is stacked on top of the background processes. Only statistical uncertainties are shown.

Figure 6: Distributions of $\Delta\phi_{\text{min}}$ for data and simulated backgrounds in the Q region without applying a selection on $\Delta\phi_{\text{min}}$. Only statistical uncertainties are shown. Signal contamination in this control region is negligible.

Figure 7: One-dimensional projection of $M_R$ (left panel) and $R^2$ (right panel) for the cross-check predicting the $\Delta\phi_{\text{min}}$ sideband region S′. The estimates for the three different background processes are stacked on top of each other. The uncertainties shown are statistical only. The horizontal error bars indicate the bin width.

Figure 8: One-dimensional projection of $M_R$ (left panel) and $R^2$ (right panel) for the cross-check predicting the background in region Q′ defined by $\Delta\phi_{\text{min}} > 0.5$. The uncertainties shown are statistical only. The horizontal error bars indicate the bin width.
the observed count in bin $i$ of the signal region, and the bin-by-bin parameters $\epsilon$, $b^{S}_{\text{multijet}}$, $b^{S}_{\text{TTJ}}$, $b^{S}_{W(\to\ell\nu)}$, and $b^{S}_{\text{other}}$ are denoted collectively by $\theta$. The parameter $\epsilon$ represents the $M$ signal efficiencies (including acceptance) for a given signal model, while the bin-by-bin background parameters for a given background process in the S region are denoted by $b^{S}_{\text{process}}$. The function $\pi(L)$ is the integrated luminosity prior and $\pi(\theta_1, \cdots, \theta_M)$ is an evidence-based prior constructed from observations in the control regions and the four global scale factors $\kappa$

Figure 9: Graphical representation of the analysis method. The circles represent the signal (S) and control (Q, T, W) regions, with their definition summarized in the associated boxes. Listed inside each circle are the likelihood parameters relevant to that region: the bin-by-bin background parameters $b^{\text{region}}_{\text{process}}$ for the given region and background process, as well as the global scale factors $\kappa^{A/B}_{\text{process}} = \sum_i b^{A}_{\text{process,MC},i} / \sum_i b^{B}_{\text{process,MC},i}$, where the sum is over all bins of the simulated data. A connection between two regions indicates that one or more parameters are shared. The total expected background, per $(M_R, R^2)$ bin, is the sum of the terms shown for each region. Furthermore, associated with each bin of each region is an observed count, $N^{\text{region}}$, a simulated count, $N^{\text{region}}_{\text{process,MC}}$, and a count $N^{\text{region}}_{\text{other,MC}}$ equal to the sum of the smaller backgrounds, $Z/\gamma^{*} \to \ell\bar{\ell}$+jets, diboson, triboson, and $t\bar{t}V$, with an associated parameter in the likelihood $b^{\text{region}}_{\text{other}}$.

Figure 10: Background predictions and observations. The results are shown in bins of $M_R$ for each $R^2$ bin. The hatched band represents the total uncertainty in the background prediction. Overlaid are two signal distributions corresponding to the T1ttcc model with $m_{\tilde{g}} = 1$ TeV, $m_{\tilde{t}} = 325$ GeV, and $m_{\tilde{\chi}^0_1} = 300$ GeV, and the T1t1t model with $m_{\tilde{g}} = 800$ GeV, $m_{\tilde{t}} = 275$ GeV, and $m_{\tilde{\chi}^0_1} = 100$ GeV.

Figure 11: Signal efficiency for the T1ttcc and T1t1t simplified model spectra, as a function of the gluino and neutralino masses. Three mass splittings between top squark and LSP are considered for the T1ttcc model: 10, 25, and 80 GeV, shown in the top left, top right, and bottom left panels, respectively. The efficiency for the T1t1t model with a mass splitting of 175 GeV is shown in the bottom right panel.

Figure 12: Observed upper limit (CLs method, 95% CL) on the signal cross section as a function of the gluino and neutralino masses for the T1ttcc model with $\Delta m = 10$, 25, and 80 GeV (top left, top right, bottom left panels) and for the T1t1t model with $\Delta m = 175$ GeV (bottom right panel). Also shown are the contours corresponding to the observed and expected lower limits, including their uncertainties, on the gluino and neutralino masses.

Table 3: Background composition according to simulation after the baseline, S, Q, T, W, Q′ and S′ region selections. "Other" refers to the sum of the small background components $Z/\gamma^{*} \to \ell\bar{\ell}$, triboson, and $t\bar{t}V$.

Table 5 .
However, the estimated scale factors are global as the

Table 4: Summary of ±1 standard deviation systematic uncertainties for the average signal efficiency over all mass assumptions in the T1ttcc model ($\Delta m = 25$ GeV), and for the total background count in the signal region, unless indicated otherwise, as determined from simulation.

Table 5: Event yields for the predicted backgrounds and for the data in each of the signal bins in $R^2$ and $M_R$. The uncertainties in the predictions are the combined statistical and systematic uncertainties obtained using the sampling procedure described in the text.