Sensitivity to longitudinal vector boson scattering in semileptonic final states at the HL-LHC

Longitudinal vector boson scattering provides an important probe of electroweak symmetry breaking, bringing sensitivity to physics beyond the Standard Model as well as constraining properties of the Higgs boson. It is a difficult process to study due to the small production cross section and challenging separation of the different polarization states. We study the sensitivity to longitudinal $WV$ vector boson scattering at the high-luminosity Large Hadron Collider in semileptonic final states. While these are characterized by larger background contributions compared to fully leptonic final states, they benefit from a higher signal cross section due to the enhanced branching fraction. We determine the polarization through full reconstruction of the event kinematics using the $W$ boson mass constraint and through the use of jet substructure. We show that with these techniques sensitivities around three standard deviations at the HL-LHC are achievable, which makes this channel competitive with its fully leptonic counterparts.

measurement is challenging due to small cross sections. Both ATLAS and CMS have by now established VBS processes involving massive vector bosons ($V = W, Z$) in fully leptonic decay modes, successfully separating the desired purely electroweak production from strong (QCD-induced) $VVjj$ production and other background processes [20][21][22][23][24][25]. Studies of semileptonic VBS final states, where one vector boson decays hadronically, benefit from the large hadronic $V$ branching fraction compared to the leptonic decays. They have thus far been unable to firmly establish the SM signal due to increased background levels, but have proven to provide excellent sensitivity to anomalous couplings [26][27][28]. Observation of the semileptonic $WVjj$ VBS process is expected to be achievable at the LHC with an modes, but none of the $W^\pm W^\pm jj$, $WZjj$ and $ZZjj$ processes studied is predicted to reach a significance of 3 standard deviations at a single experiment [32][33][34][35][36][37]. Using more sophisticated analysis techniques such as deep machine learning, the sensitivity to longitudinal VBS can be significantly increased, as demonstrated for $W^\pm W^\pm jj$ in the leptonic decay channel [38].
The sensitivity to longitudinal VBS in semileptonic final states has been explored much less. While some studies exist for $\sqrt{s} = 13$ TeV [39] or $\sqrt{s} = 27$ TeV [40], the sensitivity to longitudinal VBS in semileptonic final states at the HL-LHC has not been assessed, a gap that this paper addresses. We focus on the $WVjj$ channel where one $W$ boson decays into a charged lepton (an electron or muon, denoted by $\ell$) and an (anti-)neutrino $\nu$, while the other massive vector boson $V$ decays into a pair of quarks which we require to be reconstructed as a merged, large-radius jet ($J$), leading to an $\ell\nu Jjj$ final state.
To enable full reconstruction of the event kinematics, the neutrino four-vector is reconstructed by imposing a $W$ boson mass constraint on the lepton-neutrino system. The expected significant impact of pileup at the HL-LHC is mitigated by using track-based observables, and jet substructure techniques are deployed to improve $V$ boson reconstruction. As both the resolved $WVjj$ channel (where the hadronic $V$ decay is reconstructed via two separate small-radius jets) and $ZVjj$ semileptonic final states will contribute to establishing longitudinal VBS in semileptonic final states, our results can be seen as a lower limit on the expected sensitivity at the HL-LHC.

II. SIMULATION SAMPLES
Electroweak $WVjj$ production includes contributions from the $W^\pm W^\mp jj$, $W^\pm W^\pm jj$ and $WZjj$ VBS processes, which are modeled with MadGraph5_aMC@NLO 2.7.3 [41], interfaced to Pythia 8.243 [42] for parton showering and hadronization. These samples are generated with two on-shell vector bosons, with one $W$ boson decaying leptonically ($W \to \ell\nu$), and the other massive vector boson decaying hadronically. The contribution from triboson processes is also included, but is negligible in the phase space studied (see Sec. III). Four different polarization states are produced at leading order in QCD: both bosons longitudinally polarized ($W_L V_L$), both transversely polarized ($W_T V_T$), or a mixture ($W_T V_L$ and $W_L V_T$). These polarized samples are simulated with the helicity eigenstates defined in the $WV$ center-of-mass reference frame [43]. For this analysis focused on $W_L V_L$ production, the signal is referred to as VBS $W_L V_L$, while the other polarization states ($W_T V_T$, $W_T V_L$, and $W_L V_T$) are referred to as the VBS $WV$ background.
The main background contributions for this analysis are the production of a $W$ boson in association with jets and top-quark pair production. The $W$+jets samples are simulated using CKKW-L merging [44,45] with up to four partons at leading order in QCD using MadGraph5_aMC@NLO 2.7.3. The top-quark pair production sample is generated using MadGraph5_aMC@NLO 2.7.3 at next-to-leading order in QCD, and the top quarks are decayed using MadSpin [46] in order to preserve the spin correlations between top-quark production and decay. Pythia 8.243 is used for parton showering and hadronization for all background samples. The contribution of QCD-induced $VWjj$ processes in our signal region is found to be a factor of 100 smaller than the $W$+jets background, and a factor of two smaller than the VBS $VW$ EW-induced backgrounds, and is not considered further in our studies.
A parton-level event filter of $H_T > 200$ GeV is used to enhance the statistical power of the Monte Carlo (MC) samples in the phase space studied, where $H_T$ is the sum of the transverse momenta ($p_T$) of all partons. Leptons and partons are also required to satisfy $p_T > 10$ GeV at the generator level. This filter is found to be fully efficient for the phase space considered in this study (see Sec. III). Table I summarizes the simulated MC samples. The number of events generated, in particular for the background processes, is driven by the requirement that there be no empty bins in the discriminant used to determine the analysis sensitivity (see Sec. V), hence avoiding any extrapolations across empty bins. All signal and background processes are reconstructed using a generic detector in the Delphes simulation framework [47], modeled after the ATLAS detector at the HL-LHC [48].

III. EVENT SELECTION
Events from VBS $WV$ production exhibit several distinct characteristics which may be used in the event selection. In the semileptonic decay, the event contains one lepton and missing transverse momentum $E_T^{\rm miss}$ from the leptonic $W$ boson decay, and either two jets or a large-radius jet from the hadronic $V$ decay. In addition to the $V$ boson decay products, there are two forward jets from the VBS production, which are referred to as the "tagging" jets.
As detailed in Table II, a loose selection is applied to the events, based on the expected reconstruction capabilities at the HL-LHC [49]. In order to select the leptonically decaying $W$ boson, each event is required to have an electron or muon with $p_T > 20$ GeV and pseudorapidity $|\eta| < 4.0$, and to contain no other leptons with $p_T > 7$ GeV.
Our study focuses on the case where the hadronically decaying $V$ boson candidate can be reconstructed as a single large-radius jet $J$. The inclusion of the resolved case, where the decay products are reconstructed as separate jets, would improve the significance of these results, but is not considered due to the combinatorial challenges in assigning jets as tagging or $V$-decay jets. Jets are clustered with FastJet [50] using the anti-$k_t$ algorithm [51] with a radius parameter of $R = 1.0$, using "particle flow objects" as inputs to the jet reconstruction algorithm. These particle flow inputs combine information from the tracker and calorimeter in order to provide better resolution for object reconstruction. The large-radius jets are groomed using the soft-drop grooming algorithm with $\beta = 1.0$ and $z_{\rm cut} = 0.1$ [52] in order to reduce effects due to multiple simultaneous $pp$ collisions (pileup) and the underlying event, and to improve the sensitivity of the $V$ boson reconstruction. The large-radius jet is required to have $p_T > 200$ GeV in order to reconstruct both decay products within a single jet, and $|\eta| < 4.0$, and it is required to be isolated from the lepton by $\Delta R_{\ell,J} > 1.0$. If multiple large-radius jets are reconstructed, the highest-$p_T$ jet is selected. After the jet is selected, its mass is required to satisfy $40 < m_J < 180$ GeV.
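The soft-drop criterion applied in the grooming step can be sketched as follows. This is a minimal illustration of the condition for a single declustering step, not the FastJet implementation; the function name is hypothetical, and taking the reference radius `r0` equal to the jet radius $R = 1.0$ is an assumption.

```python
def passes_soft_drop(pt1, pt2, delta_r, z_cut=0.1, beta=1.0, r0=1.0):
    """Soft-drop condition for two subjets from one declustering step.

    pt1, pt2 : transverse momenta of the two subjets (GeV)
    delta_r  : angular distance between the subjets
    r0       : reference radius (assumed equal to the jet radius R = 1.0)
    """
    z = min(pt1, pt2) / (pt1 + pt2)  # momentum sharing of the softer subjet
    return z > z_cut * (delta_r / r0) ** beta


# A balanced splitting passes; a soft wide-angle emission is dropped.
balanced = passes_soft_drop(100.0, 100.0, 0.5)
soft_wide = passes_soft_drop(195.0, 5.0, 0.8)
```

With $\beta > 0$ the threshold grows with the opening angle, so soft wide-angle radiation is removed while soft collinear radiation is retained.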
Missing transverse momentum is reconstructed as the negative sum of the transverse momentum of all particle-flow objects within |η| < 5.0, and is required to be greater than 80 GeV to reduce QCD background contributions.
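The definition above can be illustrated with a short sketch. It assumes each particle-flow object is given as a hypothetical `(pt, eta, phi)` tuple; the function name and input layout are illustrative and are not taken from the Delphes framework.

```python
import math


def missing_transverse_momentum(objects, eta_max=5.0):
    """Return (magnitude, phi) of the negative vector sum of object pT.

    objects : iterable of (pt, eta, phi) tuples (assumed input layout)
    eta_max : only objects with |eta| < eta_max enter the sum
    """
    px = -sum(pt * math.cos(phi) for pt, eta, phi in objects if abs(eta) < eta_max)
    py = -sum(pt * math.sin(phi) for pt, eta, phi in objects if abs(eta) < eta_max)
    return math.hypot(px, py), math.atan2(py, px)


# Two back-to-back objects; the third lies outside the acceptance and is ignored.
objects = [(100.0, 0.0, 0.0), (20.0, 0.5, math.pi), (50.0, 6.0, 1.0)]
met, met_phi = missing_transverse_momentum(objects)
passes_met_cut = met > 80.0  # selection threshold quoted in the text
```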
The two quarks produced in the VBS production are reconstructed using small-radius ($R = 0.4$) jets. These jets are required to have $p_T > 30$ GeV and $|\eta| < 4.0$, and they must be isolated from the selected large-radius jet by $\Delta R > 1.4$, and from the lepton by $\Delta R > 0.4$.
The two jets that maximize the dijet invariant mass ($m_{j_1 j_2}$) and are in opposite hemispheres ($\eta_{j_1} \cdot \eta_{j_2} < 0$) are identified as the tagging jets, and events are required to have $m_{j_1 j_2} > 800$ GeV to reduce the background contributions. To lower contributions from top-quark pair production, the event is required to have no $b$-tagged jets outside of the selected large-$R$ jet.
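The tagging-jet choice can be sketched as below. The sketch assumes massless jets given as hypothetical `(pt, eta, phi)` tuples that already pass the $p_T$, $|\eta|$ and isolation requirements, and it assumes that the mass maximization is restricted to opposite-hemisphere pairs; the text does not state the order in which the two conditions are applied.

```python
import itertools
import math


def dijet_mass(j1, j2):
    """Invariant mass of two massless jets given as (pt, eta, phi)."""
    (pt1, eta1, phi1), (pt2, eta2, phi2) = j1, j2
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(m2, 0.0))


def select_tagging_jets(jets, mjj_min=800.0):
    """Return (mjj, j1, j2) for the best opposite-hemisphere pair, or None."""
    best = None
    for j1, j2 in itertools.combinations(jets, 2):
        if j1[1] * j2[1] >= 0.0:  # require eta_j1 * eta_j2 < 0
            continue
        mjj = dijet_mass(j1, j2)
        if best is None or mjj > best[0]:
            best = (mjj, j1, j2)
    if best is None or best[0] <= mjj_min:
        return None  # event fails the tagging-jet requirement
    return best


jets = [(100.0, 2.5, 0.0), (80.0, -2.5, 3.0), (50.0, 0.5, 1.0)]
tagged = select_tagging_jets(jets)
```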

A. V boson reconstruction
After the event selection described above, both $V$ bosons are reconstructed at detector level.
The large-radius jet serves as a proxy for the hadronically decaying V boson. Since the jet has been groomed with the soft-drop algorithm, the two associated subjets which pass the soft-drop condition are natural proxies for the decay products of the V boson.

The leptonically decaying $W$ boson is fully reconstructed from the lepton and $E_T^{\rm miss}$, using the $W$ boson mass to fully constrain the kinematics, with the assumption that the $E_T^{\rm miss}$ arises solely from the neutrino. The neutrino transverse momentum is taken to be the $E_T^{\rm miss}$, and the longitudinal component is solved for by assuming that the $W$ boson is on-shell and that the charged lepton is massless. This yields a second-order polynomial equation with two solutions. In cases where there are no real solutions, the longitudinal momentum is taken to be the real component of the solution. In cases where there are two real solutions, the solution with the smaller longitudinal momentum is taken, which produces the correct result in around 65% of generated events.
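The mass-constraint solution can be sketched as follows. With a massless lepton and neutrino, the on-shell condition $m_W^2 = 2(E_\ell E_\nu - \vec{p}_\ell \cdot \vec{p}_\nu)$ gives a quadratic equation in the neutrino $p_z$. The function name, the value $m_W = 80.4$ GeV, and the reading of "smaller longitudinal momentum" as smaller absolute value are assumptions made for this illustration.

```python
import math

M_W = 80.4  # GeV; assumed value of the W boson mass


def neutrino_pz(lep_pt, lep_eta, lep_phi, met, met_phi, m_w=M_W):
    """Solve the W mass constraint for the neutrino longitudinal momentum."""
    lep_pz = lep_pt * math.sinh(lep_eta)
    lep_e = lep_pt * math.cosh(lep_eta)  # massless lepton
    mu = 0.5 * m_w**2 + lep_pt * met * math.cos(lep_phi - met_phi)
    center = mu * lep_pz / lep_pt**2     # real part of both solutions
    disc = mu**2 - (lep_pt * met) ** 2   # sign decides real vs. complex
    if disc < 0.0:
        return center                    # no real solution: keep the real part
    half_width = lep_e * math.sqrt(disc) / lep_pt**2
    # Two real solutions: keep the one with the smaller |pz| (assumption).
    return min(center + half_width, center - half_width, key=abs)


pz_nu = neutrino_pz(40.0, 0.5, 0.3, 45.0, 2.0)
```

When the discriminant is non-negative, the lepton-neutrino system built from the returned value reproduces the input $W$ mass, which makes a convenient closure check.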

B. Polarization
In the $V$ boson rest frame, the decay products of the $V$ boson are back-to-back, and can be characterized by the angle $\theta^*$ between the $V$ boson direction and the decay product direction. The $V$-boson differential cross section in $\cos\theta^*$ depends on the polarization fractions $f_-$, $f_+$, and $f_L$, which are the fractions of events where the $V$ boson polarization is $-1$, $+1$, and $0$, respectively. Similarly, in the laboratory frame, the decay products of longitudinally polarized $V$ bosons tend to be more balanced in $p_T$, and those of transversely polarized $V$ bosons less balanced. Consequently, the momentum balance of the leptonic decay products, $z_{g,\ell} = p_{T,\ell}/p_{T,W}$, and $\cos\theta^*$ are sensitive to the $W$ boson polarization.
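For orientation, the textbook form of this dependence for a leptonically decaying $W^\pm$ boson is sketched below. It is quoted from the general helicity formalism rather than recovered from this text, and the sign convention (upper signs for $W^+$, lower signs for $W^-$, with $\theta^*$ measured from the charged lepton) is an assumption.

```latex
\frac{1}{\sigma}\frac{d\sigma}{d\cos\theta^*}
  = \frac{3}{8}\, f_- \left(1 \mp \cos\theta^*\right)^2
  + \frac{3}{8}\, f_+ \left(1 \pm \cos\theta^*\right)^2
  + \frac{3}{4}\, f_L \left(1 - \cos^2\theta^*\right),
\qquad f_- + f_+ + f_L = 1 .
```

The longitudinal term peaks at $\cos\theta^* = 0$, where the two decay products share the boson momentum evenly, which is why balanced decays indicate longitudinal polarization.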
Similar variables may be defined for the hadronic case as well, using the large-R jet and its two subjets as proxies for the hadronically decaying V boson and its decay products.
Using MC generator truth information, the decay products are distinguishable as quark ($q$) and anti-quark ($\bar{q}$), and we can use for example $\cos(\theta^*)_q$ (defined using the angle between the $V$ boson direction and the quark $q$ from the $V$ boson decay) and $z_{g,q} = p_{T,q}/p_{T,V}$ as polarization-sensitive observables without introducing a kinematic bias, although they cannot be reconstructed in data. At detector level, the two subjets are only distinguishable by their kinematics. Denoting the leading-$p_T$ subjet $q_1$ and the subleading-$p_T$ subjet $q_2$, we can define e.g. $\cos(\theta^*)_{q_1}$ and $z_{g,q_1} = p_{T,q_1}/p_{T,V}$ accordingly. While this biases the kinematics (as illustrated in Fig. 1), these observables are accessible with the detector.
To validate the assumption that the subjets are good proxies for the $V$ boson decay products, the observables above, both using $q$ and using $q_1$, are studied with a few additional requirements. To reduce the contributions of events where the $V$ boson decay products are not contained within a single large-radius jet, the $V$ boson is required to be matched to the selected large-radius jet with $\Delta R_{V,J} < 0.4$. The subjets are ordered with the same $\eta$-ordering as the generator-level decay products, to avoid any bias from a direct matching of the subjets to the generator-level decay products. A comparison of the generator-level and detector-level distributions of $\cos\theta^*$ and $z_g$ is shown for the hadronically decaying $V$ boson in Fig. 1, which demonstrates that the subjets are indeed good proxies for the $V$ boson decay products, and can be used to distinguish between the different polarization states of the $V$ boson.
At the HL-LHC, reconstruction is complicated by the impact of radiation from pileup on these observables. In particular, jet substructure is sensitive to wide-angle, low-$p_T$ radiation, where pileup is difficult to separate from the hard-scatter collision, while for tracks, pileup may be removed based on the primary vertex association. In order to mitigate their pileup sensitivity, jet substructure observables can be calculated using tracks as inputs rather than particle-flow objects. To reconstruct these track-based observables, tracks are associated to a large-$R$ jet using a $\Delta R < 1.0$ matching. These tracks are then clustered and groomed using the same algorithms as the particle-flow jets. Consequently, each substructure observable may be calculated using either the particle-flow constituents of the jet or the groomed tracks associated to the jet. As illustrated in Fig. 2, track-based observables capture similar information to the particle-flow observables. They are therefore used in our analysis from here on for substructure observables whose pileup sensitivity has not been studied in detail, namely any substructure observables other than the jet mass or ratios of energy correlation functions such as $d_2$ [53].
Since the leptonically decaying $W$ boson is fully reconstructed, it is also possible to define similar observables using the lepton and the reconstructed $W$ boson. The corresponding results are shown for events with particle-level $E_T^{\rm miss} > 80$ GeV in Fig. 3, illustrating that the reconstructed $W$ boson decay behaves similarly to the generator-level $W$ boson. In the transversely polarized case, the lepton tends to have a smaller $p_T$ than the neutrino. This is a result of the $E_T^{\rm miss}$ cut, which biases the relative momenta of the $W$ decay products.
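Both observables can be computed from four-vectors as sketched below. The sketch assumes four-vectors stored as hypothetical `(E, px, py, pz)` arrays and takes the parent boson to be the sum of the two decay products (or subjets); the function names are illustrative.

```python
import numpy as np


def boost_to_rest_frame(p, parent):
    """Boost four-vector p = (E, px, py, pz) into the rest frame of parent."""
    p = np.asarray(p, dtype=float)
    parent = np.asarray(parent, dtype=float)
    beta = parent[1:] / parent[0]  # velocity of the parent in the lab frame
    b2 = beta @ beta
    if b2 == 0.0:
        return p.copy()            # parent already at rest
    gamma = 1.0 / np.sqrt(1.0 - b2)
    bp = beta @ p[1:]
    e_new = gamma * (p[0] - bp)
    p_new = p[1:] + ((gamma - 1.0) * bp / b2 - gamma * p[0]) * beta
    return np.concatenate(([e_new], p_new))


def cos_theta_star(daughter, parent):
    """Cosine of the angle between the parent lab direction and the
    daughter direction in the parent rest frame."""
    d_rest = boost_to_rest_frame(daughter, parent)[1:]
    axis = np.asarray(parent, dtype=float)[1:]
    return float(d_rest @ axis / (np.linalg.norm(d_rest) * np.linalg.norm(axis)))


def momentum_balance(daughter, parent):
    """z_g = pT(daughter) / pT(parent)."""
    return float(np.hypot(daughter[1], daughter[2]) / np.hypot(parent[1], parent[2]))


# Toy decay of an 80 GeV boson moving along x, decaying perpendicular to its flight.
q1 = np.array([50.0, 30.0, 40.0, 0.0])
q2 = np.array([50.0, 30.0, -40.0, 0.0])
v = q1 + q2
cos_q1 = cos_theta_star(q1, v)
zg_q1 = momentum_balance(q1, v)
```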

IV. SIGNAL EXTRACTION
Three main background processes need to be considered to extract the longitudinal VBS signal: $W$+jets and top-quark pair production, as well as the VBS non-$W_L V_L$ polarization states. To illustrate the initial signal-to-background ratio, the event yields of signal and background for 3000 fb$^{-1}$ of data after applying the event selection are shown in Fig. 4 for several observables. No single observable offers sufficient background reduction on its own, but by combining multiple observables in a neural network, the background reduction can be significantly improved.
Each background has unique characteristics which may be used to distinguish it from the $W_L V_L$ signal process:
• The background VBS $WV$ events have a similar topology, but differ in variables sensitive to the polarization states.
• The $W$+jets background does not contain a hadronically decaying $V$ boson, and the tagging jets tend to be more central.
• The top-quark pair production background contains a hadronically decaying $W$ boson, and tends to have more (heavy-flavor) jets in the event.
Because these characteristics differ from one background to another, it is difficult to train a single tagger to effectively distinguish between the $W_L V_L$ events and all background processes at once. In order to improve the analysis sensitivity, a multiclass tagger is trained to identify four different classes of events: the signal (VBS $W_L V_L$), the other (background) polarization states of VBS $WV$, $W$+jets, and top-quark pair production.
The multiclass tagger is trained using the TMVA [54] implementation of a multiclass deep neural network (DNN) based on a multilayer perceptron with one hidden layer of 17 neurons. Twelve variables, listed in Table III, are used as inputs to the multiclass DNN tagger. The distributions of these input variables are shown in Fig. 5 for both signal and background, with the pseudorapidity difference between the two tagging jets, $\Delta\eta(j_1, j_2)$, yielding the best single-variable signal discrimination. For reference purposes, we compare signal extraction based on $\Delta\eta(j_1, j_2)$ alone (while additionally requiring the jet mass to satisfy $60 < m_J < 100$ GeV and $d_{2,J} < 1.5$ to further reduce background contributions) with our DNN performance. The event yield as a function of the DNN tagger score is compared to that for $\Delta\eta(j_1, j_2)$ for the signal and background events in Fig. 6, illustrating that the discrimination power of the DNN score for the signal class is significantly better than that of the most important input variable to the tagger, $\Delta\eta(j_1, j_2)$. This is expected, as the DNN is able to better separate the signal events from the background contributions by making full use of the kinematic information available in the event. The output of this DNN tagger for the signal class is used as an input to the template fit utilized to estimate the signal sensitivity, as described in the next section.
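The shape of such a network can be illustrated with a small NumPy forward pass. This is only a sketch of the architecture described above (12 inputs, one hidden layer of 17 neurons, 4 output classes) and does not use the TMVA API; the tanh activation, the softmax output, the random untrained weights, and all names are assumptions.

```python
import numpy as np

N_INPUTS, N_HIDDEN, N_CLASSES = 12, 17, 4  # sizes quoted in the text


def init_params(rng):
    """Random, untrained weights for a one-hidden-layer perceptron."""
    return {
        "w1": rng.normal(scale=0.1, size=(N_INPUTS, N_HIDDEN)),
        "b1": np.zeros(N_HIDDEN),
        "w2": rng.normal(scale=0.1, size=(N_HIDDEN, N_CLASSES)),
        "b2": np.zeros(N_CLASSES),
    }


def forward(x, params):
    """Return per-class scores; each row sums to one (softmax)."""
    hidden = np.tanh(x @ params["w1"] + params["b1"])  # assumed activation
    logits = hidden @ params["w2"] + params["b2"]
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
params = init_params(rng)
events = rng.normal(size=(5, N_INPUTS))  # placeholder input variables
scores = forward(events, params)
signal_score = scores[:, 0]  # column 0 taken as the signal class here
```

Only the score of the signal class would enter the template fit; the other three outputs describe how background-like an event is for each background class.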

V. ANALYSIS SENSITIVITY
The analysis sensitivity to the VBS $W_L V_L$ signal is extracted by performing a simultaneous binned maximum-likelihood fit to the signal and background distributions of the DNN
[Table III fragment: $p_{T,j_1}$, the $p_T$ of the leading tagging jet]
The expected significance is shown in Fig. 7 as a function of the total integrated luminosity, with and without the inclusion of the systematic uncertainties. The total integrated luminosity at the HL-LHC is expected to be 3000 fb$^{-1}$, and our results are shown for up to double this integrated luminosity, giving a simple extrapolation to the expected sensitivity from the combination of measurements from ATLAS and CMS. The sensitivity using the multiclass tagger is compared to that using $\Delta\eta(j_1, j_2)$ alone, which shows the best single-variable separation between signal and background. The tagger provides significant gains over the single-variable input, demonstrating the importance of a multivariate tagger to improve the