Search for heavy resonances decaying to WW, WZ, or WH boson pairs in a final state consisting of a lepton and a large-radius jet in proton-proton collisions at $\sqrt{s}$ = 13 TeV

A search for new heavy resonances decaying to pairs of bosons (WW, WZ, or WH) is presented. The analysis uses data from proton-proton collisions collected with the CMS detector at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 137 fb$^{-1}$. One of the bosons is required to be a W boson decaying to an electron or muon and a neutrino, while the other boson is required to be reconstructed as a single jet with mass and substructure compatible with a quark pair from a W, Z, or Higgs boson decay. The search is performed in the resonance mass range between 1.0 and 4.5 TeV and includes a specific search for resonances produced via vector boson fusion. The signal is extracted using a two-dimensional maximum likelihood fit to the jet mass and the diboson invariant mass distributions. No significant excess is observed above the estimated background. Model-independent upper limits on the production cross sections of spin-0, spin-1, and spin-2 heavy resonances are derived as functions of the resonance mass and are interpreted in the context of bulk radion, heavy vector triplet, and bulk graviton models. The reported bounds are the most stringent to date.


Introduction
The standard model (SM) of particle physics [1][2][3] has successfully accommodated a multitude of experimental observations, culminating in the discovery of a Higgs boson (H) [4][5][6]. Yet, the SM falls short of addressing several outstanding issues, such as the hierarchy problem, i.e., explaining the large difference between the Higgs boson mass and the largest scale in the SM, that are necessary components of a consistent theory of nature up to the Planck scale. These shortcomings are addressed by a variety of theoretical extensions to the SM, several of which predict the existence of new heavy particles with masses near the TeV scale that couple to W, Z, or Higgs bosons and could be produced in proton-proton (pp) collisions at the CERN Large Hadron Collider (LHC). Models studied in the relevant literature include the bulk scenario of the Randall-Sundrum (RS) model with warped extra dimensions [7,8] and examples of the heavy vector triplet (HVT) framework [9], which generically represents a number of models that predict additional gauge bosons, such as composite Higgs [10][11][12][13][14] and little Higgs [15,16] models.
In this paper, a search is presented for a heavy resonance X with mass between 1.0 and 4.5 TeV decaying to a pair of bosons, using pp collision data at a center of mass energy of 13 TeV, collected with the CMS detector from 2016 to 2018. The final state considered targets the scenario where one of the two bosons is required to be a W boson decaying to an electron or muon and a neutrino, while the other boson is detected as a large-radius jet formed from the merged hadronic decay products of the boson, either a quark pair (qq ( ) ) from a W or Z boson (collectively referred to as V) or a bottom quark pair (bb) from a Higgs boson.
A boosted V or Higgs boson with transverse momentum p T ≈ 250 GeV and mass m ≈ 100 GeV decaying to quarks is expected to have its decay products within a cone defined by an angular separation of ∆R = √ (∆η) 2 + (∆φ) 2 ≈ 2m/p T ≈ 0.8, where η is the pseudorapidity and φ is the azimuthal angle. Therefore, the lower bound of 1 TeV for the resonance mass is appropriate for the requirement that the hadronically decaying boson appears as a single broad massive jet. The search targets resonance production via gluon-gluon fusion (ggF) and Drell-Yan-like quark-antiquark annihilation (DY) processes, where no other decay products are expected, as well as production via vector boson fusion (VBF), where the final state contains two additional quark-induced jets in the forward and backward regions of the detector. Example Feynman diagrams for three representative combinations of production mechanisms and final states studied in this paper are shown in Fig. 1. Feynman diagrams for three of the processes studied in this paper: (left) ggFproduced, spin-2 resonance decaying to WW → νqq ; (center) DY-like, charged spin-1 resonance decaying to WH → νbb; (right) VBF-produced, charged spin-1 resonance decaying to WZ → νqq.
Previous searches for heavy WW and WZ resonances in semileptonic final states by the AT-LAS [17][18][19] and CMS [20][21][22]  The analysis is performed by initially selecting events with a well reconstructed W boson that decays to an electron or muon and a neutrino, and a large radius jet. The preselected events are then split into 24 categories based on lepton flavor, compatibility of the large radius jet substructure with a vector boson or a Higgs boson, VBF tagging, and compatibility of the event kinematic variables with the spin of the hypothetical resonance. Sensitivity to the spin of the new resonance is introduced by categorizing the events based on the rapidity separation between the W boson and the large-radius jet. The distribution of the rapidity separation is different for signals of different spins produced with different production mechanisms and for background events, improving further the sensitivity of the search.
A two-dimensional (2D) fit is then applied in the 24 categories to extract the signal production cross section. The fit is performed in the plane whose coordinates are defined by the invariant mass of the reconstructed diboson system and by the mass of its V → qq ( ) or H → bb jet component. The fit includes one signal template and two background classes based on the compatibility of the jet mass with a vector boson or with a quark-or gluon-initiated jet. Systematic uncertainties are encoded as nuisance parameters in the fit and are measured in disjoint regions of phase space (control samples). Systematic uncertainties affecting background shapes and yields are constrained further using the data during the likelihood minimization process. Systematic uncertainties affecting the signal are measured precisely in control regions using events with similar particle content and kinematic configuration.
The 2D fit strategy improves the search sensitivity by constraining the backgrounds in regions in the 2D space that are dominated by one of the two background classes. In addition, the choice of the signal shape in both jet mass and resonance mass in the 2D likelihood enables a simultaneous search for WW, WZ, and WH resonances using the same background estimation procedure for all final states and production mechanisms.
Compared to the previous CMS search for semileptonic WW and WZ resonances with 2016 data [22], this analysis still employs the 2D fit, but extends the search sensitivity to WH decays and VBF production modes by employing bb tagging and VBF tagging. In addition, sensitivity to the spin of the resonance through rapidity separation is introduced for the first time, boosting the reach of the search beyond the improvement expected with the larger data sample.
The paper is organized as follows: Section 2 describes the CMS detector and the event reconstruction, while Section 3 provides information on the simulation and data samples used. Section 4 describes the event selection and the categorization of the data in different classes. Section 5 provides details on the 2D signal extraction and Section 6 describes the systematic uncertainties. Section 7 presents the results of the search. Finally, the analysis is summarized in Section 8.

The CMS detector and event reconstruction
The central feature of the CMS apparatus is a superconducting solenoid of an internal diameter of 6 m that provides a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [30].
Event reconstruction relies on the particle-flow (PF) algorithm [31], which aims to identify each individual particle with an optimized combination of information from the various elements of the CMS detector. The energy of photons is obtained from the ECAL measurement. The energy of electrons is determined from a combination of the electron momentum at the primary interaction vertex as determined by the tracker, the energy of the corresponding ECAL cluster, and the energy sum of all bremsstrahlung photons spatially compatible with originating from the electron track. The momentum of muons is obtained from the combined curvature of the corresponding track in both silicon tracker and the muon system. The energy of charged hadrons is determined from a combination of their momentum measured in the tracker and the matching ECAL and HCAL energy deposits, corrected for the response function of the calorimeters to hadronic showers. Finally, the energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energies. The missing transverse momentum vector p miss T is computed as the negative vector sum of the transverse momenta of all the PF candidates in an event, and its magnitude is denoted as p miss For each event, hadronic jets are clustered from these reconstructed particles using the infraredand collinear-safe anti-k T algorithm [33,34]. The jet momentum is determined as the vector sum of all particle momenta in the jet and is found from simulation to be, on average, within 5 to 10% of the true momentum over the entire p T spectrum and detector acceptance. Jet energy corrections are derived from simulation studies so that the average measured response of jets becomes identical to that of particle-level jets [35].
Additional pp interactions within the same or nearby bunch crossings (pileup) can contribute additional tracks and calorimetric energy depositions, increasing the apparent jet momentum. To mitigate this effect, tracks identified to be originating from pileup vertices are discarded and an offset correction is applied to correct for remaining contributions [31,34]. In the computation of jet substructure variables, a different Pileup-Per Particle Identification (PUPPI) algorithm [36,37], which uses local shape information of charged pileup to rescale the momentum of each particle based on its compatibility with the primary interaction vertex, is employed.
Events of interest are initially selected using a two-tiered trigger system. The first level, composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100 kHz within a fixed latency of about 4 µs [38]. The second level, known as the high-level trigger (HLT), consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing and reduces the event rate to around 1 kHz before data storage [39].

Data and simulated samples
This search uses data samples of pp collisions collected by the CMS experiment at the LHC at a center-of-mass energy of 13 TeV in 2016, 2017, and 2018. The performance of the detector on the variables of interest was very similar in the different periods of data taking, therefore they are treated as one single data set, with a total integrated luminosity of 137 fb −1 . Collision events are selected mainly by HLT algorithms that either require the reconstruction of an electron within |η| < 2.5 or a muon within |η| < 2.4.
Several electron triggers are combined. In 2016, p T thresholds of 27, 55, and 115 GeV are used in association with tight, loose, or no isolation criteria, respectively, while in 2017 and 2018, p T thresholds of 32 and 115 GeV are used with tight or no isolation criteria, respectively. Muon triggers have a p T threshold of 50 GeV. To further increase the trigger efficiency, another algorithm selects events with p miss T > 120 GeV, exploiting the presence of the high-p T neutrino in the W → ν decay. The overall HLT efficiency is larger than 99.7% for signal events passing the offline selection described in Section 4.
Several signal benchmark scenarios are used to interpret the results of the search, focusing on relevant models probed in earlier searches by the ATLAS and CMS Collaborations. Spin-0 radions [40][41][42] and spin-2 gravitons [43][44][45] decaying to WW are generated for the bulk scenario of the RS model of warped extra dimensions [7,8]. For bulk gravitons, denoted as G bulk , the ratiok of the unknown curvature scale of the extra dimension k and the reduced Planck mass M Pl is set tok = 0.5, which ensures that the natural width of the graviton is negligible with respect to the experimental resolution [46]. For bulk radions, we consider a scenario with kr c π = 35 and Λ R = 3 TeV, where r c is the compactification radius and Λ R is the ultraviolet cutoff of the theory [46]. Spin-1 resonances decaying to WW, WZ, or WH are studied within the HVT framework using benchmark models from Ref.  [47]. For each model, resonance masses in the range 1.0-4.5 TeV are considered, and the resonance width is set to 0.1% of the resonance mass, ensuring that the width of the signal distribution is dominated by the detector resolution.
Simulated samples for SM background processes are used to optimize the search and to build background templates, as described in Section 5.2. The W(→ ν) + jets process is produced with MADGRAPH5 aMC@NLO at LO in QCD. The background from top quark pair events (tt) is generated with POWHEG v2 [48][49][50][51] at next-to-LO (NLO). Single top quark events are generated in the t channel and associated tW channel at NLO with POWHEG v2 [52,53], while SM diboson processes are generated at NLO with MADGRAPH5 aMC@NLO using the FxFx merging scheme [54] for WZ and ZZ, and with POWHEG v2 for WW [55].  [60] PDFs and the CP5 [61] tune are used for 2017 and 2018 conditions. To simulate the effect of pileup, additional minimum bias interactions are superimposed on the hard-scattering process, and the events are then weighted to match the distributions of the number of pileup interactions observed in 2016, 2017, and 2018 data separately. All samples are processed through a simulation of the CMS detector based on GEANT4 [62] and are reconstructed using the same algorithms used for collision data. Simulated events are also reweighted to correct for differences between data and simulation in the efficiencies of the trigger, lepton identification, and b tagging algorithms described in Section 4.

Event selection and categorization
The event selection is designed to isolate events containing a boosted topology consistent with the semileptonic decay of a WW, WZ, or WH pair, involving one energetic electron or muon, large p miss T , and a so-called large-radius jet corresponding to a W, Z, or Higgs boson candidate.
Events / 5 GeV Electron and muon candidates are considered if they satisfy p T > 55 GeV, the same η acceptance requirements as at the HLT, and a set of lepton reconstruction quality and lepton identification requirements optimized to maintain a large efficiency for high-momentum leptons [63,64]. The electrons must satisfy requirements on the ratio of the energies deposited in the HCAL and ECAL, the distribution of the ECAL deposits, their geometrical matching with reconstructed tracks, and the number of reconstructed hits in the silicon tracker. For muons, a track is required that is reconstructed in both the silicon tracker and the muon system. Identification criteria are imposed on the track quality and the number of matched muon hits. To reject backgrounds from bottom and charm decays, decays in flight, and misidentified leptons inside jets, muons and electrons are required to be isolated in the detector in a region defined by a cone of ∆R < 0.3 around the respective lepton. The muon isolation variable, defined as the p T sum of all particles within ∆R = 0.3 of the muon direction, subtracting the muon itself and pileup contributions, is required to be less than 5% of the muon p T , while electron isolation is defined by applying separate requirements suitable for high energy electrons using the tracker and the calorimeter deposits [63]. Additional requirements on the impact parameters of electron and muon tracks with respect to the primary interaction vertex are applied to suppress the contributions from secondary decays and pileup interactions. The lepton selection requirements used establish an identification efficiency of about 90% while ensuring negligible contributions from SM events composed uniquely of jets produced through the strong interaction, referred to as QCD multijet events.
Large-radius jets are clustered using a distance parameter R = 0.8 in the anti-k T algorithm and are required to have p T > 200 GeV. Since the signal is expected to be produced centrally, large-radius jets are required to be in the tracker acceptance (|η| < 2.5) so as to exploit the maximum granularity of the CMS detector. Another collection of jets is clustered using R = 0.4 and referred to as standard jets, with p T > 30 GeV and |η| < 4.7. Both sets of jets are required to pass tight identification requirements [65] to remove jets originating from calorimetric noise and track misreconstruction in the silicon tracker. Large-radius jets located within ∆R < 1.0 of a selected lepton are discarded, as are standard jets located within ∆R < 0.8 of a large-radius jet or within ∆R < 0.4 of a selected lepton.
To identify large-radius jets as hadronic decays of boosted W, Z, and Higgs bosons, jet substructure techniques are employed. First, to perform jet grooming, i.e. to remove soft, wideangle radiation from the jet, the modified mass-drop algorithm [66,67] known as "soft drop" is used with angular exponent β = 0, soft cutoff threshold z cut = 0.1, and characteristic radius R 0 = 0.8 [68]. The invariant mass of the remaining jet constituents is called the soft-drop jet mass, denoted as m jet , and is one of the two observables of the final 2D fit. Second, a V tagging algorithm is defined based on the N-subjettiness ratio between 2-subjettiness and 1subjettiness [69], τ 21 = τ 2 /τ 1 , which takes lower values for jets coming from two-prong W, Z, or H decays than for one-prong jets from dominant SM backgrounds. However, the selection on τ 21 is found to sculpt the distribution of m jet , affecting the monotonically falling behavior of the W + jets background distributions. Therefore, to decorrelate τ 21 from the jet mass, the "designing decorrelated taggers" (DDT) procedure [70] is followed, leading to the definition of the mass-decorrelated N-subjettiness ratio τ DDT 21 ≡ τ 21 − M log (m 2 jet /(p T µ)), with M = −0.08 and µ = 1 GeV and using the p T of the original jet. In the computation of both m jet and τ DDT 21 , pileup mitigation relies on the PUPPI algorithm. Third, to discriminate H → bb and Z → bb decays from other signals and backgrounds that involve light-flavor jets, the so-called double-b tagger is used [71], which is a multivariate discriminant that combines information from displaced tracks, secondary vertices (SV), and the two-SV system within the Higgs or Z boson jet candidate.
The collection of standard jets serves two purposes. First, the b jet identification algorithm known as DeepCSV [71], which relies on a deep neural network that uses track and SV information is applied to standard jets found within |η| < 2.5. The medium operating point of this algorithm has an efficiency of about 68% for correctly identifying b jets in simulated tt events, and a misidentification probability of about 1% for light-flavor jets. Events where at | are the invariant mass and pseudorapidity separation of the two highest p T standard jets. The selection requirements were chosen to reject backgrounds from additional jets present in energetic top events and W production in association with multiple jets. This criterion is used to define additional categories.
The event selection requires the presence of exactly one identified electron or muon, and events that contain additional electrons (muons) passing p T > 35 (20) GeV and otherwise identical requirements are discarded. The p miss T in the event is required to be greater than 80 GeV if the selected lepton is an electron and greater than 40 GeV if the selected lepton is a muon. Muons have lower QCD background and are not included in the p miss T calculation in the trigger, resulting in the reconstruction of the whole boosted leptonic W decay as p miss T , achieving higher efficiency at lower offline thresholds. To reconstruct a W → ν boson candidate, the p miss T is taken as an estimate of the p T of the neutrino, and the longitudinal component p z of the neutrino momentum is estimated by imposing a W boson mass constraint to the lepton+neutrino system. This leads to a quadratic equation, of which the solution with smallest magnitude of the neutrino p z is chosen. When no real solution is found, only the real part of the two complex solutions is considered. Besides this leptonically decaying W boson candidate, hereafter referred to as W lep , the hadronically decaying W, Z, or Higgs boson candidate is defined as the most energetic large-radius jet in the event and referred to as V had . The V had is required to pass τ DDT 21 ≤ 0.8, and the W lep and V had are both required to have p T > 200 GeV. They are then combined to form a WW, WZ, or WH diboson candidate, whose invariant mass is denoted as m WV and is the second observable used in the 2D fit.
Angular criteria are applied in order to select a diboson-like topology: the angular distance between the selected lepton and the V had is required to be ∆R > π/2, while the difference in azimuthal angle between the V had and both the p miss T and the W lep directions is required to be |∆φ| > 2. The difference in rapidity between the W lep and the V had is denoted as |∆y| and is used later for event categorization to exploit the fact that signal models investigated in this search tend to have lower values of |∆y| compared to backgrounds, except for spin-1 and spin-2 VBF-produced resonances, which significantly populate the |∆y| > 1 region.
The signal region for the 2D fit is defined by two final requirements on the diboson reconstructed mass and soft-drop jet mass, namely 0.7 < m WV < 6 TeV and 20 < m jet < 210 GeV. The lower bound on m WV ensures that the backgrounds have a falling spectrum while allowing a search for resonances with masses greater than 1 TeV, and the 6 TeV upper bound ensures that all observed events are included. The use of a large window for m jet allows the selection of background events containing V jets as well as top quark jet candidates, while retaining sizeable low-and high-mass sidebands to constrain shapes and normalizations. The overall signal selection efficiency times acceptance ranges from 22 to 79%, depending on the bench-mark model and increasing with resonance mass.
To enhance the analysis sensitivity to all signals under consideration, each event of the signal region is eventually assigned to one of 24 mutually exclusive search categories, based on a combination of four criteria. First, the event sample is split according to lepton flavor, distinguishing the electron and muons channels, which helps account for the differences in lepton reconstruction and selection. Second, the V tagging information is used to split each channel into a high-purity (HP) and a low-purity (LP) subchannel, which correspond to values of the mass-decorrelated N-subjettiness ratio in the ranges τ DDT 21 ≤ 0.5 and 0.5 < τ DDT 21 ≤ 0.8, respectively. Third, each subchannel is further divided into three regions, referred to as VBFtagged for events that satisfy the aforementioned VBF tagging criterion, double-b -tagged (bb) for non-VBF-tagged events for which the double-b tagger output is larger than 0.8, and nondouble-b -tagged (no-bb) for the remaining events. Fourth, each of the twelve resulting regions is split into two event categories: LDy, corresponding to a difference in rapidity between the reconstructed bosons of |∆y| ≤ 1; and HDy, which corresponds to |∆y| > 1. The categorization requirements are summarized in Table 1.
Besides the signal region, a disjoint event sample enriched in tt events with similar kinematic distributions is defined by requiring the presence of a b-tagged standard jet instead of vetoing it. This control region is used to compare data and simulation for the main selection variables, to correct the top background yields and m jet shapes in the signal region, and to compute efficiency scale factors for the τ DDT 21 selection. Figure 2 shows the distributions of m jet , τ DDT 21 , the double-b tagger, and |∆y| in this top quark-enriched sample before the aforementioned corrections and scale factors are applied.

Two-dimensional templates
A similar signal extraction strategy is followed as in the previous CMS search for semileptonic WV resonances with 2016 data [22], using a simultaneous maximum likelihood fit to the (m WV , m jet ) data distributions in the 24 search categories. Signal and background templates are constructed using simulated events, after applying corrections to the simulation. Analytical shapes are used to model the signal, while binned templates are used for background processes. The binning was optimized to maximize the number of bins while ensuring smooth templates in all categories. This process resulted in two binning schemes, one for higher statistics and one for lower statistics categories, respectively. Particular care is devoted to constructing smooth background templates, modifying the strategy to accommodate the larger 2D signal region and the fact that new categorization criteria such as VBF tagging and double-b tagging cause some categories to be sparsely populated by simulated events.

Signal modeling
The 2D probability density function (pdf) for signal events in the (m WV , m jet ) plane is described as the product of two one-dimensional (1D) resonant pdfs: where θ represents sets of nuisance parameters affecting the shape, which we describe in Section 6.2. Each factor in this formula is constructed by fitting a double-sided Crystal Ball (dCB) function [72], composed of a Gaussian core and asymmetric power-law tails, to the corresponding distribution of simulated signal events for eleven different values of the resonance mass m X from 1.0 to 4.5 TeV. Such a model neglects the mild correlation between m WV and m jet , which results in a small rotation in the 2D m WV and m jet plane that is negligible compared to the experimental resolution uncertainties. Since the modeling of the lepton momentum scale and resolution has negligible impact in the shape of the invariant mass of the system compared to the impact of jet and p miss T reconstruction, the electron and muon channels are merged to gain statistics for the fit in the simulation. In LP categories, an exponential function is added to the fit model of the m jet dimension over its entire range, to model properly the low-mass tail. Each function parameter is interpolated for other values of m X using a polynomial function. Separate shape models are built for each studied signal benchmark scenario.  Figure 3 shows the projections of the 2D likelihood along the m WV and m jet dimensions, respectively. The m WV projections are shown for the G bulk → WW signal for mass hypotheses of 1.5, 2.5, and 4.5 TeV. The distributions are very similar for other spin hypotheses, production mechanisms, and decay modes. The m jet projections are shown for G bulk → WW, W → WZ, and W → WH signals for a mass of m X = 2.5 TeV, demonstrating the sensitivity obtained by CMS reconstruction and jet substructure techniques to W, Z, and Higgs jet hypotheses. While the W and Z invariant mass distributions peak near the expected masses, the Higgs mass peak is slightly shifted, since the presence of neutrinos in b quark decays is not accounted for in the calibration of the jet mass. The experimental resolution for m jet is found to be of the order of 10%, whereas that of m WV ranges from 6% at 1 TeV to 4% at 4.5 TeV. In addition, the expected signal yield in every search category is also parameterized as a function of the collected integrated luminosity so that the resonance production cross section can later be extracted from the fit to data.

Background modeling
Background events are classified into two classes in the fit, each of which is described by a different pdf: 1. A background called W+V/t, for which the m jet shape has two peaks, one near the W/Z boson masses and the other near the top quark mass, while m WV has a falling spectrum.
This resonant background is dominated by tt production, with subdominant contributions from SM diboson and single top quark production, and is defined in the simulation by requiring that both generated quarks from a hadronic V decay be located within ∆R = 0.8 of the selected large-radius jet. The m jet shape structure is thus due to the selection of a partially or fully merged top quark jet or a V jet.

2.
A background called W +jets, for which m jet does not have a peak structure, and m WV again has a falling spectrum. This nonresonant background is dominated by W(→ ν) + jets events, where the selected jet is produced by the hadronization of one or more partons not originating from a vector boson but is mistagged as a V jet. In addition, this background also includes tt events where the selected large-radius jet corresponds to a random combination of jets in the event, instead of a W boson or top quark hadronic decay.
The W+jets and W+V/t background shapes are both described as the product of a conditional pdf of m WV as a function of m jet , and a m jet pdf: where θ again represents sets of nuisance parameters, described in Section 6.1. The conditional m WV pdf is constructed with a similar strategy for both classes of backgrounds that employs a robust kernel density estimation technique that is designed to provide smooth 2D templates in all subcategories. For each event in the simulated background sample, particle-level jets are clustered from stable particles using the same algorithms employed in event reconstruction. A diboson mass m part.
WV is then defined by combining the reconstructed leptonically decaying W boson and the generated large-radius jet. A detector response model is derived for the scale and resolution of the diboson mass as a function of the transverse momentum p gen.
T,jet of the generated jet, by comparing the reconstructed and generated variables m WV and m part.
WV . The signal region is divided into slices of the soft-drop jet mass m jet : 16 slices for W+jets, and one or two slices for W +V/t in the HP and LP categories, respectively. Each simulated event i in a given m jet slice then contributes to the template m WV distribution in that slice by adding the following 1D Gaussian distribution: Here, m runs over the allowed values of m WV , w i is the event weight of the simulation, corrected for differences in the m WV spectrum of W+jets observed between data and simulation in the control region, and s(p gen.
T,jet ) and σ(p gen. T,jet ) are the scale and resolution parameters from the detector response model. This modified kernel procedure, seeded by the resolution instead of the event density, is robust in describing rapidly changing shapes, however it has limited performance at very low statistics. Therefore, since simulated samples do not have enough events to provide regular shapes up to the largest values of m WV , these high-mass tails have to be smoothed. This smoothing is applied in the region where the event yield is dominated by Poisson statistics, namely for events with m WV larger than a threshold that varies between 1.1 and 1.6 TeV depending on the background class and category. The high-m WV distribution is fitted to a power-law function, which is used as the shape in this region. To improve the robustness of the templates, the electron and muon channels are here merged for the purpose of constructing the template, and in the case of W+jets, the VBF-tagged, bb, and no-bb regions are merged as well. The residual differences between the 24 categories are later corrected for by the final 2D fit, deploying uncorrelated shape uncertainties that are described in Section 6.1.
The m jet pdf is built separately for each of the 24 search categories, in order to account for the specific kinematic configuration of each category. For the nonresonant W +jets background, the m jet pdf is obtained by smoothing the 1D histogram of selected simulated events in each category using cubic spline interpolation between bins. A kernel method, similar to the one followed for m WV has limited performance in this variable since the resolution model is complex for the soft-drop mass variable. In the case of the W +V/t background, the m jet distribution features a peak around the W boson mass, which is dominated by top quark jets where only the W → qq products were reconstructed inside the large-radius jet, and a peak around the top quark mass, where the W boson and the b quark decays are merged. To define the nominal background shapes, an analytical function is fitted to the distribution of simulated events, where the two-peak structure is modeled as the sum of two dCB functions and an exponential function that is used only in the LP categories. The inclusion of the top peak in the search region improves the fit precision, because constraining the relative fraction of the two peaks helps capture the convoluted effects of the top quark p T spectrum and jet grooming. The m jet shape of the W+V/t background in the region of the W mass peak is found to differ from that of the X → WW signals even if, in both cases, a W boson is present. This difference is attributed to the fact that the background consists of high-momentum tt events, in which a part of the b jet from the t → Wb decay overlaps with the V jet from the W → qq decay, while in the case of the signal, the V jet is isolated.

Systematic uncertainties
Systematic uncertainties affecting the normalization and shape of the signal and backgrounds are modeled by nuisance parameters, each of which is profiled in the likelihood maximization. All sources of systematic uncertainties are listed in Table 2. When specified, the magnitude of the uncertainty is the width of the function used to constrain the nuisance parameter, which is a log-normal distribution for uncertainties related to normalization, and a Gaussian distribution for parameters that control shape uncertainties. The following sections describe these uncertainties in more detail.

Systematic uncertainties in the background estimation
The 2D fit is designed to predict correctly the normalization and shapes of background contributions directly from the data by introducing nuisance parameters that vary the shapes and the yields of each contribution during the likelihood minimization process. To implement nuisance parameters that affect the shapes, the template-building procedure described in Section 5 is repeated with additional event weights or modified shape parameters. For each parameter, this produces two alternative 2D templates that represent an upward and a downward shift, between which the 2D fit performs an interpolation based on the value of the nuisance parameter. The magnitude of the shape variations are chosen to cover the differences between data and simulation observed in the control region.
First, both classes of backgrounds are assigned shape uncertainties that modify the conditional m WV factor in the 2D likelihood. These have to account not only for differences between data and simulation but also for the use of common conditional likelihoods in different categories. The main difference of the shapes between the data and the templates arises from differences in the p T spectrum after categorization and jet substructure requirements are applied. Therefore, we define alternative shapes by performing a linear reweighting of the jet p T spectrum in each Table 2: Summary of the systematic uncertainties considered in the 2D fit, the quantities they affect, and their magnitude, when applicable. When ranges are given, the magnitude of the uncertainty depends on the signal model or mass. The three parts of the table concern shape uncertainties only affecting backgrounds, shape uncertainties in the scales and resolutions, and normalization uncertainties. category. This variation is motivated by the imperfect modeling of the parton distribution functions and initial-state radiation, and is conservative given that the fit has the statistical power to constrain these uncertainties in regions where no signal is expected for a given model. The W+jets background has another shape variation related to the correlation between the jet mass and the jet p T , which simultaneously modifies both dimensions of the conditional likelihood, while the W+V/t background is assigned an uncertainty that modifies the power-law function used to populate the high-m WV tails. Since each category involves jets with different p T spectra as a result of the selection requirements imposed on different particles in the event, these sets of nuisance parameters are left uncorrelated between categories.
Uncertainties in the m jet shape of the W +jets background mostly arise from hadronizationrelated effects and their interplay with the soft-drop algorithm. Two m jet shape variations are defined. The first one is chosen to be a simple shift of the m jet scale. The second one is motivated by the study of the scaling variable log(m 2 jet /p T ), which reveals a difference in hadronization behavior between data and simulation. The discrepancy in the distribution of this variable is measured in the region of m jet < 50 GeV that is dominated by W+jets events. Consequently, an uncertainty is introduced that generates alternative shapes after reweighting the simulation to match the data. Since the jet substructure variables are very sensitive to the apparent difference of the jet p T spectrum in different categories, these uncertainties are treated as uncorrelated.
The m jet shape of the W +V/t background is affected by uncertainties in the scale and resolution of the soft-drop jet mass, which are estimated by the top-enriched sample and are encoded in nuisance parameters that modify the width and the peak of the fitted dCB functions, respectively. Since the scale and resolution effects of the softdrop mass are different for two-prong and three-prong objects, these uncertainties are left uncorrelated between the W boson and top quark mass peaks and between the HP and LP categories but are considered fully correlated across other categorization criteria.
Additionally, an uncertainty of 13% in the ratio of the W boson mass peak normalization to the sum of the W boson and top quark mass peaks, derived by fitting the m jet spectrum in the topenriched control region, is introduced with one parameter per category, effectively measuring the p T spectrum of the top quark. The shape uncertainties introduced are constrained in the fit by the shapes and relative normalizations of the W and top peaks in data.
Both the W+jets and the W+V/t backgrounds are assigned a large normalization uncertainty of 25% based on agreement between data and simulation in the low m jet sideband and the topenriched control region respectively. While the cross section measurements of these processes at the LHC are known to better precision, the effects of jet substructure, the requirement of jet mass windows, and the categorization, all introduce larger discrepancies. The motivation behind the 2D-fit signal extraction procedure is to constrain those differences from the data in each category without being sensitive to the initial value of the uncertainty. The corresponding parameters are fully correlated between lepton channels and uncorrelated across other categorization criteria and between the two background classes. Two other uncorrelated 5% background normalization parameters are assigned to the electron and muon channels in order to account for lepton triggering, reconstruction, and isolation efficiencies based on measurements from the data.

Systematic uncertainties in the signal prediction
The two-dimensional signal shapes in the (m jet , m WV ) plane are affected by several uncertainties. Relative scale factors on the mean and width of the dCB function modeling the m WV shape encode uncertainties in the scale and resolution of the jet energy and the p miss T , and electron and muon energy scales.
In the m jet dimension, two parameters account for the impact of grooming on the scale and resolution of the soft-drop jet mass, and are fully correlated with the analogous shape uncertainties of the W mass peak of the W+V/t background.
The dominant uncertainties in the signal normalizations arise from uncertainties in the efficiency of the V tagging, double-b tagging, and |∆y|-based selections. The corresponding nuisance parameters are anticorrelated between HP and LP, between bb and no-bb, and between LDy and HDy categories, respectively, therefore inducing a migration of events between the categories in which they apply. Two nuisance parameters are associated with the τ DDT 21 -based categorization: one for its efficiency and another for its dependence on the jet p T . The values of these nuisance parameters correspond to the statistical and systematic uncertainties of the measurement of V tagging in data for different ranges of the jet p T . These measurements are performed in the disjoint region enriched in top quark events with hadronic W boson decays, by splitting the sample into events that pass or fail V tagging and then simultaneously fitting the W jet mass distributions in data and simulation.
The uncertainty caused by the |∆y| requirement is evaluated by studying the distributions of data and simulated events in the top quark-enriched control region.
Other uncertainties that apply to the normalization of signal events are associated with the integrated luminosity [73][74][75], the pileup reweighting, the efficiency of the b tagging veto on standard jets, and the lepton triggering, reconstruction, and isolation. Finally, uncertainties in the signal yield due to the choice of PDFs, and the factorization and renormalization scales are also taken into account: the scale uncertainties are evaluated following the proposal in Refs. [76,77], while the PDF uncertainties are evaluated using the NNPDF 3.0 [57] PDF set. The resulting uncertainties in acceptance are found to be negligible for the scale variation and range from 0.1 to 2% for the PDF evaluation. On the other hand, the uncertainties in the signal cross section due to PDFs and scales are not taken into account in the statistical interpretation but are instead considered as uncertainties in the theoretical cross section.

Results
The 2D maximum likelihood fit is performed simultaneously in all 24 search categories. To assess the fit quality, the fit is first performed without signal contributions. The results of the background-only fit are illustrated for six representative categories in Figs. 4 and 5, where projections of the 2D post-fit distributions are shown in the m jet and m WV dimensions, respectively. The distributions for the remaining 18 categories show very similar levels of agreement. The jet mass distributions demonstrate good modeling of both the resonant peaks and the continuum for all categories. In the LP categories, the W +V/t background has significant contributions from the W boson peak and top quark peak, while only the W peak is visible in the HP categories.
The post-fit pull distribution of the nuisance parameters is consistent with a Gaussian distribution centered around zero with a standard deviation of unity, while the best-fit values of most nuisance parameters are found to lie within the ±1σ range initially associated with each uncertainty. The quality of the fit is also assessed with a goodness-of-fit estimator that uses the saturated model [78], and the observed value of the estimator falls well within the central 68% interval defined from pseudo-experiments.
Then, the fit is repeated for each benchmark model, including a signal contribution, and ex- tracting the signal production cross section . No significant excess is observed over the estimated background. The largest deviation from the background hypothesis is observed for a VBF-produced charged spin-1 resonance decaying to WZ with mass around 1 TeV, with a local significance of 3.0 standard deviations.
The results are interpreted in terms of exclusion limits at 95% confidence level (CL). While the interpretation is performed for a well-defined set of benchmark signal models, the results are generally relevant for narrow resonances of a given spin and production mechanism. The limits are evaluated using the asymptotic approximation [79] of the CL s method [80,81].      Figure 8: Exclusion limits on the product of the production cross section and the branching fraction for a new neutral spin-1 resonance produced via qq annihilation (upper left) or vector boson fusion (upper right) and decaying to WW, for a new charged spin-1 resonance produced via qq annihilation (center left) or vector boson fusion (center right) and decaying to WZ, and for a new charged spin-1 resonance produced via qq annihilation and decaying to WH (lower), as functions of the resonance mass hypothesis, compared with the predicted cross sections for a W or Z from HVT model B (for DY) or HVT model C with c H = 3 (for VBF). Signal cross section uncertainties are shown as red cross-hatched bands.

Summary
A search for new narrow heavy resonances with mass larger than 1 TeV and decaying to WW, WZ, or WH boson pairs is performed using proton-proton collision events at √ s = 13 TeV containing one high-p T electron or muon, large missing transverse momentum, and a massive large-radius jet. The data were collected with the CMS detector at the LHC in 2016-2018 and correspond to an integrated luminosity of 137 fb −1 . The signal extraction strategy is structured around a two-dimensional maximum-likelihood fit to the distributions of the diboson reconstructed mass and the soft-drop jet mass. The sensitivity to different final states and production mechanisms is enhanced by the use of event categories that exploit the mass-decorrelated N-subjettiness ratio, the double-b tagger, the presence of a pair of forward jets compatible with vector boson fusion production, and the difference in rapidity between the reconstructed bosons. No significant excess is found, and the results are interpreted in terms of upper limits on the production cross section of new narrow resonances in several benchmark models. Spin-2 ggF-produced bulk gravitons with masses below 1.8 TeV and decaying to WW are excluded at 95% CL. Spin-1 DY-produced Z → WW resonances lighter than 4.0 TeV, W → WZ resonances lighter than 3.9 TeV, and W → WH resonances lighter than 4.0 TeV in the context of HVT model B are excluded at 95% CL. Spin-0 ggF-produced bulk radions with masses below 3.1 TeV, decaying to WW, are excluded at 95% CL. Finally, for particles produced exclusively by vector boson fusion, the present data do not yet have sensitivity to exclude the benchmark scenarios under study. The reported limits, also provided in tabulated form in the HEPData record [82] for this analysis, are generally relevant for any narrow heavy resonance with a given spin produced by gluon fusion, qq annihilation, or vector boson fusion. The excluded cross section values set the most stringent experimental bounds to date.

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid and other centers for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC, the CMS detector, and the supporting computing infrastructure provided by the follow-