Search for vector-like leptons in multilepton final states in proton-proton collisions at $\sqrt{s}$ = 13 TeV

A search for vector-like leptons in multilepton final states is presented. The data sample corresponds to an integrated luminosity of 77.4 fb$^{-1}$ of proton-proton collisions at a center-of-mass energy of 13 TeV collected by the CMS experiment at the LHC in 2016 and 2017. Events are categorized by the multiplicity of electrons, muons, and hadronically decaying $\tau$ leptons. The missing transverse momentum and the scalar sum of the lepton transverse momenta are used to distinguish the signal from background. The observed results are consistent with the expectations from the standard model hypothesis. The existence of a vector-like lepton doublet, coupling to the third generation standard model leptons in the mass range of 120-790 GeV, is excluded at 95% confidence level. These are the most stringent limits yet on the production of a vector-like lepton doublet, coupling to the third generation standard model leptons.


I. INTRODUCTION
The standard model (SM) of particle physics is a quantum field theory that describes the known fundamental particles and their interactions. The predictions of the SM have been experimentally tested with great precision [1]. However, the SM does not explain several observations, such as the existence of dark matter and the baryon asymmetry in the Universe. In addition, there exist theoretical issues, such as the hierarchy problem, that suggest that an extension of the SM, predicting new particles, is needed to provide a more complete description of nature.
In one class of new particles there are nonchiral color singlet fermions that couple to the SM leptons. The term nonchiral implies that the left- and right-handed components of these particles transform identically under gauge symmetries. These particles are thus referred to as vectorlike leptons (VLLs). They arise in a wide variety of models invoking, for example, supersymmetry or extra dimensions [2-5]. The VLLs are often classified by the SM lepton generation with which they are associated. VLLs and their associated SM leptons have identical lepton numbers. This paper presents a search for an SU(2) doublet VLL extension [6] of the SM with couplings to the third-generation SM leptons. The search is carried out in final states with multiple charged leptons (e, μ, τ), using proton-proton (pp) collision data collected by the CMS detector at the LHC in 2016 and 2017. The model that we consider introduces a vectorlike τ lepton ($\tau'^{-}$), its antiparticle ($\tau'^{+}$), and the corresponding neutrinos ($\nu'_{\tau}$ and $\bar{\nu}'_{\tau}$). At the LHC, they can be produced in $\tau'^{\pm}\nu'_{\tau}$, $\tau'^{+}\tau'^{-}$, and $\nu'_{\tau}\bar{\nu}'_{\tau}$ channels, with subsequent decays of $\tau'$ to Zτ or Hτ and of $\nu'_{\tau}$ to Wτ, where W, Z, and H are the SM W, Z, and Higgs bosons, respectively. At tree level, the $\tau'$ and $\nu'_{\tau}$ are mass degenerate, whereas higher-order radiative corrections predict < 0.3% relative mass splitting between these two states, for VLL masses greater than 100 GeV. In this paper, $\tau'$ and $\nu'_{\tau}$ are assumed to be mass degenerate. The mass of the VLL is the only free parameter both in the production cross section and in the branching fraction calculations. The tree-level Feynman diagrams for associated and pair production of the doublet model VLLs are shown in Fig. 1 along with possible subsequent decay chains that would result in a multilepton final state.
The ATLAS Collaboration performed a search for heavy lepton resonances decaying into a Z boson and a lepton in a multilepton final state at a center-of-mass energy of 8 TeV [7], constraining a singlet VLL model and excluding VLLs in the mass range of 114-176 GeV. However, to date, there are no such constraints on the doublet VLL model from any of the LHC experiments. The L3 Collaboration at LEP placed a lower bound of ≈100 GeV on additional heavy leptons [8]. Given these existing constraints, this analysis focuses on VLL masses greater than 100 GeV.

II. THE CMS DETECTOR
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter, each composed of a barrel and two end cap sections. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. The inner tracker measures charged particles with $|\eta| < 2.5$ and provides an impact parameter resolution of ≈15 μm and a transverse momentum ($p_\mathrm{T}$) resolution of about 1.5% for 100 GeV charged particles. Extensive forward calorimetry complements the barrel and end cap detectors by covering the pseudorapidity range $3.0 < |\eta| < 5.2$. Collision events of interest are selected using a two-tiered trigger system [9]. The first level, composed of custom hardware processors, selects events at a rate of around 100 kHz. The second level, based on an array of microprocessors running a version of the full event reconstruction software optimized for fast processing, reduces the event rate to around 1 kHz before data storage. A detailed description of the CMS detector, along with a definition of the coordinate system and relevant kinematic variables, can be found in Ref. [10].

III. EVENT RECONSTRUCTION AND PARTICLE IDENTIFICATION
Events collected for this search are recorded using a combination of triggers requiring a single electron or a single muon. For events collected in 2016 (2017), the electron trigger requires an electron with $p_\mathrm{T} > 27\,(35)$ GeV, while the muon trigger requires a muon with $p_\mathrm{T} > 24\,(27)$ GeV. Information from all subdetectors is combined using the CMS particle-flow (PF) algorithm [11] to reconstruct and identify individual particles (charged hadrons, neutral hadrons, photons, electrons, and muons). Collectively these are referred to as PF objects.
For each event, PF objects originating from the same interaction vertex are clustered into jets using the infrared- and collinear-safe anti-$k_\mathrm{T}$ algorithm [12,13], with a radius parameter of 0.4. The momenta of all PF objects in each jet are summed vectorially to determine the jet momentum. The reconstructed vertex with the largest value of summed physics-object $p_\mathrm{T}^{2}$ is taken to be the primary pp interaction vertex. The physics objects are the jets, clustered using the jet finding algorithm [12,13] with the tracks assigned to the vertex as inputs, and the associated missing transverse momentum, taken as the negative vector sum of the $p_\mathrm{T}$ of those jets. Additional interactions within the same or nearby bunch crossings (pileup) can contribute spurious extra tracks and calorimetric energy depositions to the jet momentum. Hence, charged particles identified as originating from pileup vertices are discarded and an offset correction [14] is applied to account for the remaining neutral pileup particle contributions. Additional jet energy corrections are applied to account for the nonlinear response of the detectors [15].
The missing transverse momentum vector ($\vec{p}_\mathrm{T}^{\,\mathrm{miss}}$) is calculated as the negative vectorial $p_\mathrm{T}$ sum of all the PF objects belonging to the primary vertex. The quantity $p_\mathrm{T}^\mathrm{miss}$ is defined as the magnitude of this vector. For calculating $p_\mathrm{T}^\mathrm{miss}$ in 2016, we use PF objects located in the full fiducial volume of the detector, whereas for 2017, PF objects within $2.5 < |\eta| < 3.0$ and with $p_\mathrm{T} < 50$ GeV are excluded to mitigate noise effects related to the aging of the CMS ECAL.
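The definition above can be illustrated with a minimal Python sketch. The function names and the `(pt, phi)` tuple layout are assumptions made for illustration and are not taken from the CMS software; the 2017 exclusion is written as a separate hypothetical helper.

```python
import math


def missing_pt(pf_objects):
    """Return (magnitude, phi) of the negative vector pT sum.

    Each PF object is a (pt, phi) tuple, with pt in GeV and phi in radians.
    """
    px = -sum(pt * math.cos(phi) for pt, phi in pf_objects)
    py = -sum(pt * math.sin(phi) for pt, phi in pf_objects)
    return math.hypot(px, py), math.atan2(py, px)


def keep_for_2017(pt, eta):
    """Hypothetical helper: reject objects with 2.5 < |eta| < 3.0 and pt < 50 GeV."""
    return not (2.5 < abs(eta) < 3.0 and pt < 50.0)
```

For two objects with $p_\mathrm{T}$ of 30 and 40 GeV at right angles in azimuth, the sketch returns a magnitude of 50 GeV.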
Electron candidates are reconstructed by combining ECAL superclusters and Gaussian sum filter [16] tracks from the silicon tracker [17]. Muon candidates are reconstructed by combining the information from both the silicon tracker and the muon spectrometer [18]. Hadronically decaying τ lepton candidates ($\tau_\mathrm{h}$) are selected using the hadron-plus-strips algorithm [19]. This algorithm has been designed to optimize the performance of $\tau_\mathrm{h}$ reconstruction by considering specific $\tau_\mathrm{h}$ decay modes. It starts with hadronic jets and reconstructs $\tau_\mathrm{h}$ candidates from the tracks ("prongs") and energy deposits in strips of the ECAL, in the one-prong, one-prong + $\pi^{0}$, and three-prong decay modes. We require the reconstructed leptons to lie within the region of pseudorapidity $|\eta| < 2.5$, 2.4, and 2.3 for the electron, muon, and $\tau_\mathrm{h}$ candidates, respectively.
Lepton candidates arising from pp collisions can be broadly categorized into prompt, nonprompt, and conversion leptons. A prompt lepton can be produced in the decay of a W, Z or Higgs boson. Events from background

FIG. 1. Two illustrative leading-order Feynman diagrams for associated production of $\tau'$ with a $\nu'_{\tau}$ (left) and for pair production of $\tau'$ (right) and possible subsequent decay chains that result in a multilepton final state.
All 2016 samples are generated with the same order of the NNPDF3.0 parton distribution function (PDF) [36] as the order of the MC generator. All 2017 samples are generated with the NNPDF3.1 next-to-next-to-leading-order (NNLO) PDF [37], irrespective of the order of the MC generator. The response of the CMS detector is simulated using dedicated software based on the GEANT4 toolkit [38]. Additional weights are applied to all simulated events to account for differences in the trigger and lepton identification efficiencies between data and simulation. For the simulated events, additional minimum bias interactions are superimposed on the primary collision, reweighted in such a way that the frequency distribution of the extra interactions matches that observed in data.
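The pileup reweighting described above can be sketched as a per-bin ratio of normalized histograms. This is only an illustration under stated assumptions: the function name and the binned inputs (counts per number of extra interactions) are hypothetical, and the procedure actually used in the analysis may differ in detail.

```python
def pileup_weights(data_counts, mc_counts):
    """Per-bin weights that map the simulated pileup distribution onto data.

    Both inputs are histograms over the number of extra interactions, with
    identical binning. Each weight is (data fraction) / (simulation fraction).
    """
    n_data = sum(data_counts)
    n_mc = sum(mc_counts)
    weights = []
    for d, m in zip(data_counts, mc_counts):
        # Bins that are empty in simulation cannot be reweighted; use 0.
        weights.append(0.0 if m == 0 else (d / n_data) / (m / n_mc))
    return weights
```

When every simulated bin is populated, applying the weights leaves the total simulated yield unchanged and changes only the shape of the distribution.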

V. EVENT SELECTION CRITERIA
We collectively refer to electrons and muons as light leptons to distinguish them from $\tau_\mathrm{h}$ leptons. Events are then categorized as those with four or more light leptons (4L), exactly three light leptons (3L), and exactly two light leptons along with at least one $\tau_\mathrm{h}$ lepton (2L1T). In the 2L1T channel, we have a further division based on whether the two light leptons are of opposite sign (OS) or same sign (SS). In all categories, the leptons are ordered by decreasing transverse momenta and those with the largest $p_\mathrm{T}$ are labeled as the leading leptons. The leading light lepton is required to satisfy $p_\mathrm{T} > 38\,(28)$ GeV if it is an electron (muon). These thresholds are imposed so that the corresponding single lepton triggers are fully efficient for events that would subsequently satisfy the offline selection. All of the other leptons are required to satisfy $p_\mathrm{T} > 20$ GeV.
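The categorization above can be summarized in a short Python sketch. The dictionary keys, the category labels, and the choice to drop leptons at or below 20 GeV before counting are assumptions made for illustration; the text does not specify how such leptons are handled.

```python
def categorize(leptons):
    """Assign an event to '4L', '3L', '2L1T_OS', '2L1T_SS', or None.

    `leptons` is a list of dicts with hypothetical keys:
    'flavor' in {'e', 'mu', 'tau_h'}, 'pt' in GeV, 'charge' in {+1, -1}.
    """
    # Assumption: leptons with pT <= 20 GeV are not counted.
    selected = [l for l in leptons if l["pt"] > 20.0]
    light = sorted(
        (l for l in selected if l["flavor"] in ("e", "mu")),
        key=lambda l: l["pt"],
        reverse=True,
    )
    n_tau = sum(1 for l in selected if l["flavor"] == "tau_h")
    if not light:
        return None
    # Leading light lepton: pT > 38 GeV (electron) or > 28 GeV (muon).
    lead = light[0]
    if lead["pt"] <= (38.0 if lead["flavor"] == "e" else 28.0):
        return None
    if len(light) >= 4:
        return "4L"
    if len(light) == 3:
        return "3L"
    if len(light) == 2 and n_tau >= 1:
        same_sign = light[0]["charge"] == light[1]["charge"]
        return "2L1T_SS" if same_sign else "2L1T_OS"
    return None
```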
We use the scalar $p_\mathrm{T}$ sum of the leptons (denoted as $L_\mathrm{T}$) to discriminate signal from SM backgrounds in all channels. The $L_\mathrm{T}$ distribution is divided into 150 GeV bins, each of which is treated as a separate experiment. In the 2L1T and 4L categories that contain more than one $\tau_\mathrm{h}$ and more than four light-lepton candidates, respectively, only the leading $\tau_\mathrm{h}$ and the leading four light leptons are used in the calculation of $L_\mathrm{T}$.
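As a concrete illustration of the $L_\mathrm{T}$ definition and binning, the following minimal Python sketch may help. The function names are hypothetical, and the optional `n_bins` argument, which folds overflow into the last bin, is an assumption, since the number of bins is not stated in the text.

```python
def lepton_lt(light_pts, tau_pts=()):
    """Scalar pT sum (GeV) of the leading four light leptons and leading tau_h."""
    lead_light = sorted(light_pts, reverse=True)[:4]
    lead_tau = sorted(tau_pts, reverse=True)[:1]
    return sum(lead_light) + sum(lead_tau)


def lt_bin(lt, bin_width=150.0, n_bins=None):
    """Index of the 150 GeV-wide bin containing lt; the last bin takes overflow."""
    index = int(lt // bin_width)
    if n_bins is not None:
        index = min(index, n_bins - 1)
    return index
```

For an event with light-lepton $p_\mathrm{T}$ values of 50, 40, 30, 25, and 22 GeV and $\tau_\mathrm{h}$ candidates of 60 and 35 GeV, only the first four light leptons and the 60 GeV $\tau_\mathrm{h}$ enter, giving $L_\mathrm{T} = 205$ GeV, which falls in the second bin.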
In order to improve sensitivity for the signal, in each of the 4L, 3L, and 2L1T (OS, SS) categories, the events are divided into low- and high-$p_\mathrm{T}^\mathrm{miss}$ regions. While the 4L category is divided into $p_\mathrm{T}^\mathrm{miss} < 50$ GeV and $> 50$ GeV regions, the 3L and 2L1T (OS, SS) categories are divided into $p_\mathrm{T}^\mathrm{miss} < 150$ GeV and $> 150$ GeV regions. These categories form the bases of signal regions (SRs) that would be sensitive to the presence of a VLL signal. They are complemented by orthogonal control regions (CRs) that are expected to be dominantly populated by backgrounds. Additionally, all events with a light-lepton pair invariant mass below 12 GeV are vetoed regardless of the flavor and sign of the pair, in order to suppress low mass quarkonia resonances. The SRs are described in Table I, where OSSF refers to an opposite-sign, same-flavor lepton pair. A detailed description of the CRs is given in Sec. VI.
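The $p_\mathrm{T}^\mathrm{miss}$ split and the low-mass veto can be sketched as follows. The names are hypothetical, leptons are treated as massless (an approximation introduced here), and the handling of events exactly at a threshold is an arbitrary choice, since the text does not specify it.

```python
import math
from itertools import combinations

# pTmiss boundary (GeV) separating the low and high regions in each category.
PTMISS_SPLIT = {"4L": 50.0, "3L": 150.0, "2L1T_OS": 150.0, "2L1T_SS": 150.0}


def pair_mass(lep1, lep2):
    """Invariant mass (GeV) of two massless leptons given as (pt, eta, phi)."""
    pt1, eta1, phi1 = lep1
    pt2, eta2, phi2 = lep2
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(m2, 0.0))


def passes_low_mass_veto(light_leptons, threshold=12.0):
    """True if no light-lepton pair, of any flavor or sign, is below threshold."""
    return all(pair_mass(a, b) >= threshold
               for a, b in combinations(light_leptons, 2))


def ptmiss_region(category, ptmiss):
    """Label an event 'low' or 'high' according to its category boundary."""
    return "high" if ptmiss > PTMISS_SPLIT[category] else "low"
```

With these boundaries, an event with $p_\mathrm{T}^\mathrm{miss} = 80$ GeV is in the high region of the 4L category but in the low region of the 3L category.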

VI. BACKGROUND ESTIMATION
The WZ and ZZ background yields are normalized to data using dedicated CRs. For the WZ CR, we select events with exactly three light leptons, one OSSF pair invariant mass satisfying the $91 \pm 15$ GeV window ("on-Z"), and $50 < p_\mathrm{T}^\mathrm{miss} < 100$ GeV. The ratio of the expected WZ yield to data (after correcting for non-WZ events) is found to be $1.14 \pm 0.06$ ($1.07 \pm 0.05$) for the 2016 (2017) data analysis, where the uncertainty includes both statistical and systematic contributions. Similarly, for the ZZ background, we select events with exactly four leptons, two distinct OSSF pairs both satisfying the on-Z requirement, and $p_\mathrm{T}^\mathrm{miss} < 50$ GeV. The ratio of the expected ZZ yield to data is found to be $1.01 \pm 0.05$ ($0.98 \pm 0.05$) for the 2016 (2017) search.
The conversion background consists of events with photons from final-state radiation, where the photon converts asymmetrically to two additional leptons, only one of which is reconstructed in the detector. A selection of events with three light leptons with an OSSF pair below the Z boson mass (<76 GeV), $m_{3\ell}$ satisfying the on-Z window, and with $p_\mathrm{T}^\mathrm{miss} < 50$ GeV is used to calculate the ratio of the conversion background prediction in simulation to data. The quantity $m_{3\ell}$ is defined as the invariant mass of the three light leptons. The ratio is measured to be $0.95 \pm 0.11$ ($0.87 \pm 0.10$) for the 2016 (2017) data analysis. For the 2017 analysis, the $Z/\gamma^{*}+\gamma$ and $t\bar{t}+\gamma$ simulation samples are used, while for the 2016 analysis, the $Z/\gamma^{*}$ and $t\bar{t}$ simulation samples are used because of the unavailability of enhanced samples.
The measured ratios are then applied to the WZ, ZZ, and conversion background estimates to correct for any residual differences in the efficiency and acceptance between data and simulation. The CRs are also used to verify the performance of the simulation in modeling the kinematic distributions of interest. Figure 2 shows the transverse mass $m_\mathrm{T}$ and the $L_\mathrm{T}$ distributions in the WZ CR and the $m_{4\ell}$ and $L_\mathrm{T}$ distributions in the ZZ CR for data and simulation, in the combined 2016 and 2017 datasets. The quantity $m_{4\ell}$ is defined as the invariant mass of the leading four light leptons. The quantity $m_\mathrm{T}$ is defined as
$$m_\mathrm{T} = \sqrt{2\, p_\mathrm{T}^\mathrm{miss}\, p_\mathrm{T}^{\ell}\, [1 - \cos(\Delta\phi)]},$$
where $p_\mathrm{T}^{\ell}$ refers to the $p_\mathrm{T}$ of the lepton that is not part of the OSSF pair closest to the Z boson mass and $\Delta\phi$ is the difference in azimuth between $\vec{p}_\mathrm{T}^{\,\mathrm{miss}}$ and $\vec{p}_\mathrm{T}^{\,\ell}$. The prompt backgrounds from triboson and associated Higgs boson production are estimated from simulation using the calculated cross sections at NLO and are henceforth referred to as the VVV and the H + X backgrounds, respectively. Similarly, the background from $t\bar{t}W$ and $t\bar{t}Z$ is estimated from simulation and is referred to as the $t\bar{t}V$ background.
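For reference, the transverse mass used in the WZ CR can be computed with the short sketch below, which assumes the conventional definition $m_\mathrm{T} = \sqrt{2\, p_\mathrm{T}^\mathrm{miss}\, p_\mathrm{T}^{\ell}\, [1 - \cos(\Delta\phi)]}$. The function name and argument order are illustrative only.

```python
import math


def transverse_mass(pt_lep, phi_lep, ptmiss, phi_miss):
    """Transverse mass (GeV) of a lepton and the missing transverse momentum.

    pt_lep and ptmiss are in GeV; phi_lep and phi_miss are in radians.
    """
    dphi = phi_lep - phi_miss
    # cos is even and periodic, so dphi needs no wrapping into (-pi, pi].
    mt2 = 2.0 * pt_lep * ptmiss * (1.0 - math.cos(dphi))
    return math.sqrt(max(mt2, 0.0))
```

A 40 GeV lepton back to back with 40 GeV of $p_\mathrm{T}^\mathrm{miss}$ gives $m_\mathrm{T} = 80$ GeV, while a collinear configuration gives zero.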
The MisID background arises from processes such as Z + jets and $t\bar{t}$ + jets. This background is estimated using a three-dimensional implementation of a matrix method [39]. In this method, rates are measured in data CRs for leptons to pass the analysis lepton selections, given that these leptons pass looser offline selections. It is assumed that these rates for prompt and misidentified leptons behave similarly across the different CRs and SRs. We measure these rates in dedicated CRs: one with a dilepton selection for prompt rates and another with a trilepton signal-depleted selection with one OSSF on-Z pair and $p_\mathrm{T}^\mathrm{miss} < 50$ GeV for misidentification rates. The rates are parameterized as functions of lepton $p_\mathrm{T}$ and $\eta$. An additional correction factor is applied as a function of the number of charged particles, to account for rate variations due to the hadronic activity in the event. For $\tau_\mathrm{h}$ misidentification rates, an additional parameterization is needed, based on the $p_\mathrm{T}$ of the jet matched to the $\tau_\mathrm{h}$. This is required to account correctly for rate variations due to the boost of the lepton system. The rate measurements are dominated by Z + jets events and are corrected using simulation to an average of the Z + jets and $t\bar{t}$ + jets events. Figure 3 demonstrates the agreement between the expected background and the observed data yields, as a function of the dilepton mass and $L_\mathrm{T}$, in a signal-depleted 2L1T (OS) selection.
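The idea behind the matrix method can be illustrated with a deliberately simplified one-lepton version; the analysis itself uses a three-dimensional implementation [39], which is not reproduced here. In this sketch, the function name is hypothetical, the loose sample is assumed to include the tight one, and the two rates are single numbers rather than functions of $p_\mathrm{T}$ and $\eta$. With prompt rate $p$ and misidentification rate $f$, the loose and tight yields satisfy $N_\mathrm{loose} = N_p + N_f$ and $N_\mathrm{tight} = p N_p + f N_f$, which can be solved for the misidentified contribution.

```python
def misid_yield_tight(n_loose, n_tight, prompt_rate, misid_rate):
    """Estimated misidentified-lepton yield in the tight selection.

    n_loose counts all loose leptons (tight ones included); n_tight counts
    those that also pass the tight selection. The rates are the probabilities
    for prompt and misidentified loose leptons to pass the tight selection.
    """
    if prompt_rate <= misid_rate:
        raise ValueError("the method requires prompt_rate > misid_rate")
    # Solve the two linear equations for the misidentified loose yield N_f.
    n_misid_loose = (prompt_rate * n_loose - n_tight) / (prompt_rate - misid_rate)
    # Only a fraction misid_rate of those enter the tight selection.
    return misid_rate * n_misid_loose
```

As a check with invented numbers, 100 prompt and 50 misidentified loose leptons with $p = 0.9$ and $f = 0.2$ give 150 loose and 100 tight leptons, and the method returns the expected 10 misidentified tight leptons.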

VII. SYSTEMATIC UNCERTAINTIES
FIG. 2. [...] GeV. The lower row shows the $m_{4\ell}$ (left) and the $L_\mathrm{T}$ (right) distributions in the ZZ control region. The ZZ control region contains events with two OSSF lepton pairs, both of which are on-Z, and $p_\mathrm{T}^\mathrm{miss} < 50$ GeV. The total SM background is shown as a stack of all contributing processes. The hatched gray bands in the upper panels represent the total uncertainty in the expected background. The lower panels show the ratios of observed data to the total expected background. In the lower panels, the light gray band represents the combined statistical and systematic uncertainty in the expected background, while the dark gray band represents the statistical uncertainty only. The rightmost bins include the overflow events.

The primary sources of systematic uncertainty in the SM background arise from those in the MisID background and from those in the WZ and ZZ backgrounds. The systematic uncertainty in the MisID background contribution arises primarily via the uncertainties in the measurement of prompt and misidentified rates in the matrix method. In addition, the uncertainties in the Z + jets and $t\bar{t}$ + jets rates contribute to the systematic uncertainty in this background.
We vary the rates within their respective uncertainties and observe the change in the background yield in all SRs. The final estimates vary by 20%-35% depending upon the year the data were collected and the SR. The WZ and ZZ background estimates have systematic uncertainties of 4%-5% arising from the normalization factor measurements in the dedicated CRs. The conversion background estimate has a systematic uncertainty of 11%.
To account for differences between the data and simulation, a number of different sources of systematic uncertainty are considered. Lepton energy (or momentum) scale uncertainties, as well as jet and lepton resolution uncertainties, are applied at the per-object level, where the corresponding object momenta are varied up and down by their corresponding uncertainties. This results in a 2%-10% impact on the background prediction, depending on $L_\mathrm{T}$ and the SR. The uncertainty in the trigger efficiency results in a 2%-3% uncertainty in the background prediction. Additionally, an integrated luminosity measurement uncertainty of 2.5% (2.3%) is applied to the simulated rare background estimates for the 2016 [40] (2017 [41]) analysis. For the subdominant, rare background processes such as $t\bar{t}V$, triboson, or associated Higgs boson production, a 50% systematic uncertainty is applied to the theoretical cross sections to cover the PDF and the renormalization and factorization scale uncertainties. The pileup modeling uncertainty is evaluated by varying the cross section used in the reweighting procedure up and down by 5%, which results in a 4% impact on background yields according to simulation. The typical variations for various sources of systematic uncertainty are provided in Table II.

FIG. [...] $< 50$ GeV. The total SM background is shown as a stack of all contributing processes. The hatched gray bands in the upper panels represent the total uncertainty in the expected background. The lower panels show the ratios of observed data to the total expected background. In the lower panels, the light gray band represents the combined statistical and systematic uncertainty in the expected background, while the dark gray band represents the statistical uncertainty only. The rightmost bins include the overflow events.

TABLE II. The sources of systematic uncertainty and the typical variations (percent) observed in the affected background and signal yields in the analysis. All sources of uncertainty are considered as correlated between the 2016 and 2017 data analyses except for the lepton identification and isolation, the single lepton trigger, and the integrated luminosity. The label ALL is defined as WZ, ZZ, rare ($t\bar{t}V$, VVV, Higgs boson), and signal processes.

VIII. RESULTS
The $L_\mathrm{T}$ distributions for the 4L and 3L SRs are shown in Fig. 4, while those for various 2L1T SRs are shown in Fig. 5. We do not observe any significant discrepancies between the background predictions and the observed data. Limits are set on the combined cross section for associated ($\tau'\nu'_{\tau}$) and pair ($\tau'\tau'$/$\nu'_{\tau}\nu'_{\tau}$) production of VLLs. To obtain upper limits on the signal cross section at 95% confidence level (C.L.), we use a modified frequentist approach with a test statistic based on the profile likelihood in the asymptotic approximation and the CL$_s$ criterion [42-44].
FIG. [...] $> 150$ GeV (lower right). The total SM background is shown as a stack of all contributing processes. The predictions for VLL signal models (sum of all production and decay modes) with $m_{\tau'/\nu'} = 200$ and 500 GeV are also shown as dashed lines. The hatched gray bands in the upper panels represent the total uncertainty in the expected background. The lower panels show the ratios of observed data to the total expected background. In the lower panels, the light gray band represents the combined statistical and systematic uncertainty in the expected background, while the dark gray band represents the statistical uncertainty only. The rightmost bins include the overflow events.

The upper limits are shown in Fig. 6. We use a linear interpolation of the expected event yields between the simulated signal samples in the limit calculations. Systematic uncertainties are incorporated into the likelihood as nuisance parameters with log-normal probability distributions, while statistical uncertainties are modeled with gamma functions. The observed limits are within 2 standard deviations of the expected limits from the background-only hypothesis. Because of the preferential coupling of VLLs to τ leptons, the major contribution to these results comes from the 2L1T SRs. The analysis sensitivity benefits from the large signal-to-background ratio in the 2L1T (SS) SRs, despite the small production rate for this channel. The measurements in the 2L1T channels alone exclude VLLs in the mass range 120-740 GeV. On combining all the 4L, 3L, and 2L1T SRs, with the hypothesis of an SU(2) mass degenerate VLL doublet with couplings to the third generation SM leptons, we exclude VLLs with mass in the range of 120-790 GeV at 95% C.L.
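The linear interpolation of expected signal yields between simulated mass points, mentioned above, can be sketched as follows. The function name is hypothetical, the numbers in the usage note are invented for illustration, and the refusal to extrapolate outside the simulated range is an assumption of this sketch.

```python
def interpolate_yield(mass, mass_points, yields):
    """Linearly interpolate an expected yield at `mass` (GeV).

    mass_points and yields list the simulated VLL masses and the expected
    event yields at those masses. No extrapolation is performed.
    """
    points = sorted(zip(mass_points, yields))
    if len(points) < 2:
        raise ValueError("need at least two simulated mass points")
    if not points[0][0] <= mass <= points[-1][0]:
        raise ValueError("mass outside the simulated range")
    for (m_lo, y_lo), (m_hi, y_hi) in zip(points, points[1:]):
        if m_lo <= mass <= m_hi:
            frac = (mass - m_lo) / (m_hi - m_lo)
            return y_lo + frac * (y_hi - y_lo)
```

For example, with invented yields of 10.0 and 4.0 events at 200 and 300 GeV, the interpolated yield at 250 GeV is 7.0 events.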

IX. SUMMARY
A search for vectorlike leptons coupled to the third-generation standard model leptons has been performed in several multilepton final states using 77.4 fb$^{-1}$ of proton-proton collision data at a center-of-mass energy of 13 TeV, collected by the CMS experiment in 2016 and 2017. No significant deviations of the data from the standard model predictions are observed. These results exclude a vectorlike lepton doublet with a common mass in the range 120-790 GeV at 95% confidence level. These are the most stringent limits yet on the production of a vectorlike lepton doublet, coupling to the third-generation standard model leptons.

ACKNOWLEDGMENTS
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies: BMBWF and FWF ( Research Council and Horizon 2020 Grant, Contracts No. 675440, No. 752730, and No. 765710 NKFIA research Grants No. 123842, No. 123959, No. 124845, No. 124850, No. 125105, No. 128713, No. 128786, and No. 129058 (Hungary); the Council of Science and Industrial Research, India; the HOMING PLUS program of the Foundation for Polish

[17] CMS Collaboration, Performance of electron reconstruction and selection with the CMS detector in proton-proton collisions at $\sqrt{s} = 8$ TeV, J. Instrum. 10, P06005 (2015).