Reconstruction and identification of $H\to WW^*$ with high transverse momentum in the full hadronic final state

This Letter presents a study of the reconstruction and identification of $H\to WW^*$ with high transverse momentum, where both $W^{(*)}$ bosons decay hadronically. We show that the boosted $H\to WW^*$ can be effectively reconstructed as a single jet and identified using jet substructures in the center-of-mass frame of the jet. Such a reconstruction and identification approach can discriminate the boosted $H\to WW^*$ in the full hadronic final state from QCD jets. This result will significantly improve experimental sensitivities of searches for potential new physics phenomena beyond the standard model in final states containing highly boosted Higgs bosons.


I. INTRODUCTION
Many new physics (NP) extensions to the standard model (SM) predict new particles with masses at the TeV scale. Some of these heavy resonances [1][2][3][4][5] can decay into final states containing Higgs bosons. The most effective way to search for such particles is to reconstruct the Higgs boson in its dominant decay mode, into a bottom quark-antiquark pair ($b\bar{b}$). Because Higgs bosons produced in the decays of heavy resonances have very large momenta (boosted), the hadronic decay products of $H\to b\bar{b}$ are so collimated that they are often reconstructed as a single jet in the experiments. Searches for new heavy resonances using the signature of boosted $H\to b\bar{b}$ have been very actively pursued by both the ATLAS [6][7][8][9][10][11][12][13][14][15][16][17][18][19][20] and CMS [21][22][23][24][25][26][27][28][29][30][31][32][33] experiments at the Large Hadron Collider (LHC). While the current searches start to probe heavy resonances with masses above 2-3 TeV, the experimental sensitivities become limited by the signal reconstruction efficiency, as the production cross-sections of both heavy resonances and SM background processes drop significantly at very high energy scales.
In this paper we present a study of the reconstruction and identification of boosted $H\to WW^*$, where both the on-shell $W$ boson and the off-shell $W^*$ boson decay hadronically. The $H\to WW^*$ decay mode has the second-largest branching fraction, 21 %, and its boosted signature has never been studied before. We show that a boosted $H\to WW^*$ in the hadronic final state can be effectively reconstructed as a single jet, hereafter referred to as an $H\to WW^*$ jet. By using jet substructure in the center-of-mass frame of the jet, $H\to WW^*$ jets can be distinguished from QCD jets, defined here as jets initiated by a non-top quark or a gluon. This result can significantly improve experimental sensitivities of searches for potential NP phenomena in final states containing highly boosted Higgs bosons.

II. EVENT SAMPLE
We use boosted $H\to WW^*$ jets from the SM $ZH$ production process as a benchmark to study the signal reconstruction and identification. The SM multijet production process is used to model the QCD jets. The sample of $H\to WW^*$ jets is further reweighted on a jet-by-jet basis such that its jet kinematic distributions in transverse momentum ($p_T$) and pseudorapidity ($\eta$) match those of the QCD jet sample.
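The jet-by-jet reweighting described above can be illustrated with a minimal sketch. It assumes a simple two-dimensional histogram ratio in ($p_T$, $\eta$); the function name, the binning, and the use of NumPy are illustrative choices and are not taken from the analysis itself.

```python
import numpy as np

def kinematic_weights(sig_pt, sig_eta, bkg_pt, bkg_eta, pt_edges, eta_edges):
    """Per-jet signal weights that make the signal (pT, eta) shape match the background.

    Hypothetical helper: assumes every jet lies inside the binning range.
    """
    sig_pt = np.asarray(sig_pt, dtype=float)
    sig_eta = np.asarray(sig_eta, dtype=float)
    sig_hist, _, _ = np.histogram2d(sig_pt, sig_eta, bins=[pt_edges, eta_edges])
    bkg_hist, _, _ = np.histogram2d(bkg_pt, bkg_eta, bins=[pt_edges, eta_edges])
    # Normalize both histograms to unit area so that only shapes are compared.
    sig_frac = sig_hist / sig_hist.sum()
    bkg_frac = bkg_hist / bkg_hist.sum()
    # Weight = background fraction / signal fraction; empty signal bins get weight 0.
    ratio = np.divide(bkg_frac, sig_frac,
                      out=np.zeros_like(bkg_frac), where=sig_frac > 0)
    # Find the (pT, eta) bin of each signal jet and look up its weight.
    i = np.clip(np.digitize(sig_pt, pt_edges) - 1, 0, len(pt_edges) - 2)
    j = np.clip(np.digitize(sig_eta, eta_edges) - 1, 0, len(eta_edges) - 2)
    return ratio[i, j]
```

The weights returned here would multiply each signal jet when filling any later distribution, so that differences between signal and QCD jets are not driven by their different $p_T$ and $\eta$ spectra.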
All events used in this analysis are produced with the Pythia 8.244 Monte Carlo (MC) event generator [34,35] for $pp$ collisions at a center-of-mass energy of 13 TeV. The spread of the beam interaction point is assumed to follow a Gaussian distribution with a width of 0.015 mm in the transverse direction and 45 mm in the longitudinal direction [36]. We divide the $(\eta, \phi)$ plane into $0.1\times0.1$ cells to simulate the finite resolution of the calorimeters in the LHC experiments. The energies of all particles entering a given cell in an event, except neutrinos, are summed and replaced with a massless pseudoparticle of the same energy, also referred to as an energy cluster, pointing to the center of the cell. These pseudoparticles are fed into the FastJet 3.0.1 [37] package for jet reconstruction. The momenta and vertex positions of charged tracks are smeared according to the expected resolutions of the ATLAS detector [38]. The effect of multiple $pp$ interactions in the same event (pileup) is included by overlaying minimum-bias events simulated with Pythia 8.244 on each event of interest in all samples. The number of pileup events is assumed to follow a Poisson distribution with a mean of 35, which is the observed average number of pileup events at ATLAS during data taking in 13 TeV $pp$ collisions.
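The cell-based calorimeter emulation can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and array layout are hypothetical, and the $\phi$ range is simply wrapped to $[0, 2\pi)$, so the last $\phi$ cell is slightly narrower than 0.1.

```python
import numpy as np

def cluster_into_cells(energy, eta, phi, cell=0.1):
    """Sum particle energies in (eta, phi) cells of size cell x cell.

    Returns one massless pseudoparticle (E, px, py, pz) per non-empty cell,
    pointing to the cell center. Neutrinos are assumed to be removed upstream.
    """
    energy, eta, phi = (np.asarray(a, dtype=float) for a in (energy, eta, phi))
    ieta = np.floor(eta / cell).astype(int)
    iphi = np.floor(np.mod(phi, 2.0 * np.pi) / cell).astype(int)
    cells = {}
    for e, a, b in zip(energy, ieta, iphi):
        key = (int(a), int(b))
        cells[key] = cells.get(key, 0.0) + e
    pseudo = []
    for (a, b), e in cells.items():
        eta_c = (a + 0.5) * cell
        phi_c = (b + 0.5) * cell
        # Massless pseudoparticle: |p| = E, hence pT = E / cosh(eta).
        pt = e / np.cosh(eta_c)
        pseudo.append((e, pt * np.cos(phi_c), pt * np.sin(phi_c), pt * np.sinh(eta_c)))
    return np.array(pseudo)
```

The resulting array of four-vectors plays the role of the energy clusters that are passed to the jet reconstruction.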

A. Event selection
Jets are reconstructed with the anti-$k_T$ algorithm [39] with a distance parameter of $\Delta R = 1.0$, which is the default jet reconstruction algorithm used at the ATLAS and CMS experiments. To correct for additional energy depositions from the underlying event and pileup, we employ a jet area correction technique [40] that accounts for these effects on an event-by-event basis. For each event, the distribution of transverse energy densities is calculated for all jets with $|\eta| < 2.1$, and its median is taken as an estimate of the energy density of the pileup and underlying event. We subsequently correct each jet by subtracting the product of the transverse energy density and the jet area, which is determined with the "active" area calculation technique [40]. This method results in a modified jet four-momentum $p_\mu = (\vec{p}_{\rm jet}, m_{\rm jet})$ that is used throughout the paper unless explicitly stated otherwise.
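The median-density correction can be illustrated with a simplified scalar sketch. It assumes the density of each jet is approximated by $p_T$ divided by a scalar jet area; the full technique of Ref. [40] subtracts a four-vector area, which is not reproduced here, and the function names are hypothetical.

```python
import numpy as np

def median_density(jet_pt, jet_area, jet_eta, eta_max=2.1):
    """Event-level density rho: median of pT / area over jets with |eta| < eta_max."""
    jet_pt, jet_area, jet_eta = (np.asarray(a, dtype=float)
                                 for a in (jet_pt, jet_area, jet_eta))
    mask = (np.abs(jet_eta) < eta_max) & (jet_area > 0)
    return float(np.median(jet_pt[mask] / jet_area[mask]))

def subtract_pileup(jet_pt, jet_area, rho):
    """Scalar approximation of the area correction: pT - rho * area, floored at 0."""
    jet_pt = np.asarray(jet_pt, dtype=float)
    jet_area = np.asarray(jet_area, dtype=float)
    return np.maximum(jet_pt - rho * jet_area, 0.0)
```

Because the median is insensitive to the few hard jets in an event, the estimate of the density is driven by the many soft jets that originate mostly from pileup and the underlying event.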
We select jets with $p_T > 350$ GeV and $|\eta| < 2.0$ as $H\to WW^*$ jet candidates. For a Higgs boson with $p_T = 350$ GeV, MC studies show that the average separation between the Higgs boson and its decay products is about 0.8, and approximately 55 % of the $H\to WW^*$ decays in the hadronic final state can be reconstructed as a single jet. Here the separation between two objects is defined as $\Delta R = \sqrt{\Delta\eta^2 + \Delta\phi^2}$, where $\Delta\eta$ and $\Delta\phi$ are the differences in pseudorapidity and azimuthal angle between the momenta of the two objects, respectively. The average separation gradually decreases with increasing $p_T$, to less than 0.5 for Higgs bosons with $p_T > 1$ TeV, where more than 90 % of the $H\to WW^*$ decays in the hadronic final state can be reconstructed as a single jet.
We further require the track-assisted mass [41] of selected $H\to WW^*$ jets to satisfy $40 < m^{\rm TA}_{\rm jet} < 240$ GeV. The track-assisted jet mass is defined as $m^{\rm TA}_{\rm jet} = m^{\rm trk}_{\rm jet} \times (p_T / p^{\rm trk}_{T,{\rm jet}})$, where $m^{\rm trk}_{\rm jet}$ and $p^{\rm trk}_{T,{\rm jet}}$ are the invariant mass and the total transverse momentum of the charged tracks associated with the jet, respectively. Only charged tracks with $p_T > 1$ GeV and $|\eta| < 2.5$ are considered. They are also required to satisfy $|d_0| < 1$ mm and $|z_0 - z_{\rm pv}| \sin\theta < 1.5$ mm, where $d_0$ and $z_0$ are the transverse and longitudinal impact parameters of the charged track, $z_{\rm pv}$ is the longitudinal position of the primary vertex, and $\theta$ is the polar angle of the charged track. A charged track is considered to be associated with a jet only if the separation $\Delta R$ between the track and the jet is less than $R_{\max}$, where, with the jet $p_T$ in units of GeV, $R_{\max} = 1.0 - 0.4 \times (p_T - 350)/650$ for jets with $p_T < 1000$ GeV and $R_{\max} = 0.6$ for jets with $p_T > 1000$ GeV. The track-assisted jet mass distributions of signal $H\to WW^*$ jets and QCD jets in different $p_T$ ranges are shown in Fig. 1. The $m^{\rm TA}_{\rm jet}$ distribution of the signal jets peaks around the Higgs boson mass and its shape shows no significant variation as a function of jet $p_T$.
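The track association and the track-assisted mass can be sketched as below. This is an illustration under stated assumptions: tracks are treated as massless by default, the total track transverse momentum is taken as the $p_T$ of the summed track four-vector, and the impact-parameter requirements are assumed to be applied upstream. The function names are hypothetical.

```python
import numpy as np

def r_max(jet_pt):
    """pT-dependent track association radius; jet_pt in GeV."""
    if jet_pt >= 1000.0:
        return 0.6
    return 1.0 - 0.4 * (jet_pt - 350.0) / 650.0

def delta_r(eta1, phi1, eta2, phi2):
    """Separation sqrt(d_eta^2 + d_phi^2), with d_phi wrapped to [-pi, pi)."""
    dphi = np.mod(phi1 - phi2 + np.pi, 2.0 * np.pi) - np.pi
    return np.hypot(eta1 - eta2, dphi)

def track_assisted_mass(jet_pt, jet_eta, jet_phi, trk_pt, trk_eta, trk_phi, trk_mass=0.0):
    """m_TA = m_trk * (jet pT / track-system pT) from tracks associated with the jet."""
    trk_pt, trk_eta, trk_phi = (np.asarray(a, dtype=float)
                                for a in (trk_pt, trk_eta, trk_phi))
    # Kinematic track selection and association within R_max of the jet axis.
    keep = (trk_pt > 1.0) & (np.abs(trk_eta) < 2.5)
    keep &= delta_r(trk_eta, trk_phi, jet_eta, jet_phi) < r_max(jet_pt)
    pt, eta, phi = trk_pt[keep], trk_eta[keep], trk_phi[keep]
    if pt.size == 0:
        return 0.0
    px, py, pz = pt * np.cos(phi), pt * np.sin(phi), pt * np.sinh(eta)
    energy = np.sqrt(px**2 + py**2 + pz**2 + trk_mass**2)
    sx, sy, sz, se = px.sum(), py.sum(), pz.sum(), energy.sum()
    m_trk = np.sqrt(max(se**2 - sx**2 - sy**2 - sz**2, 0.0))
    pt_trk = np.hypot(sx, sy)
    if pt_trk == 0.0:
        return 0.0
    return float(m_trk * jet_pt / pt_trk)
```

The ratio $p_T / p^{\rm trk}_{T,{\rm jet}}$ rescales the track-only mass to account for the neutral component of the jet, which the tracker does not measure.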

B. Jet substructure
Signal $H\to WW^*$ jets can be further distinguished from QCD jets using jet substructure variables. Many jet substructure variables have been proposed [42], and some of them have been successfully used by the ATLAS [43,44] and CMS [45,46] experiments to identify boosted hadronically decaying $W/Z$ bosons, top quarks, and $H\to b\bar{b}$ decays. As a simple illustration, this paper uses the substructure variables defined in the center-of-mass frame of the jet, introduced in Ref. [47].
They are the thrust ($T$), thrust-minor ($T_{\min}$), sphericity ($S$), and the ratio between the second-order and zeroth-order Fox-Wolfram moments ($R_2$). These variables have been successfully used by the ATLAS experiment to make the first observation of boosted hadronically decaying vector bosons reconstructed as single jets in SM $W/Z$+jets production [48]. We define the center-of-mass frame (rest frame) of a jet as the frame where the four-momentum of the jet is equal to $p^{\rm rest}_\mu \equiv (m_{\rm jet}, 0, 0, 0)$. All the jet substructure variables are calculated in the center-of-mass frame of the jet, using either the energy clusters of the jet or the charged tracks associated with it. Their distributions are shown in Fig. 2.
A jet consists of its constituent particles. While $H\to WW^*$ is a two-body decay, the momenta of the $W^{(*)}$ bosons in the Higgs rest frame are very small due to the mass suppression. As a result, the distribution of the constituent particles of a boosted $H\to WW^*$ in its center-of-mass frame has a four-body decay topology, with each subjet corresponding to one quark from the decays of the two $W^{(*)}$ bosons. The hadronic decay products of a boosted $H\to WW^*$ therefore have a relatively isotropic distribution in the jet rest frame. It is worth noting that the tracks associated with a Higgs boson with a larger $p_T$ are more isotropically distributed than those associated with a Higgs boson with a smaller $p_T$, as shown in Fig. 2. Such an effect is also seen in the distribution of the jet sphericity calculated using the energy clusters, but not for the thrust, thrust-minor, and $R_2$, as their calculations based on the energy clusters are degraded by the finite resolution of the calorimeter when the jet $p_T$ becomes very large. On the other hand, the constituent particle distribution of a QCD jet in the jet rest frame is less isotropic and becomes even more directional as the jet $p_T$ increases. As a result, the jet substructure variables defined in the jet rest frame have better discriminating power to separate boosted $H\to WW^*$ decays from the background for jets with larger $p_T$.
We further recluster the energy clusters of a jet to reconstruct subjets in the jet rest frame, using a generalized EEkT algorithm [49] in the FastJet 3.0.1 [37] package with the parameters $p = 0$ and $R = 0.8$. The reconstructed subjets are required to have energy $E_{\rm subjet} > 10$ GeV in the jet rest frame. A charged track is considered to be associated with a subjet only if their angular separation is less than 0.8 in the jet rest frame.
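The rest-frame construction can be illustrated with a short sketch that boosts the jet constituents into the frame where the jet four-momentum is $(m_{\rm jet}, 0, 0, 0)$ and then evaluates two of the shape variables. The sphericity and Fox-Wolfram expressions below are the standard textbook definitions, which are assumed here to match the conventions of Ref. [47]; thrust and thrust-minor are omitted because they require a maximization over axes. All function names are hypothetical.

```python
import numpy as np

def boost_to_rest_frame(constituents):
    """Boost rows of (E, px, py, pz) into the rest frame of their summed four-vector.

    Assumes the summed four-vector has a positive invariant mass.
    """
    p4 = np.asarray(constituents, dtype=float)
    jet = p4.sum(axis=0)
    beta = jet[1:] / jet[0]
    b2 = beta @ beta
    if b2 == 0.0:
        return p4.copy()
    gamma = 1.0 / np.sqrt(1.0 - b2)
    bp = p4[:, 1:] @ beta
    coeff = (gamma - 1.0) * bp / b2 - gamma * p4[:, 0]
    mom = p4[:, 1:] + coeff[:, None] * beta
    ene = gamma * (p4[:, 0] - bp)
    return np.column_stack([ene, mom])

def sphericity(p4):
    """S = 3/2 (lambda_2 + lambda_3) from the normalized momentum tensor."""
    p = np.asarray(p4, dtype=float)[:, 1:]
    tensor = p.T @ p / (p**2).sum()
    eig = np.sort(np.linalg.eigvalsh(tensor))
    return float(1.5 * (eig[0] + eig[1]))

def fox_wolfram_r2(p4):
    """R2 = H2 / H0 with H_l = sum_ij |p_i||p_j| P_l(cos theta_ij)."""
    p = np.asarray(p4, dtype=float)[:, 1:]
    mag = np.linalg.norm(p, axis=1)
    p, mag = p[mag > 0], mag[mag > 0]
    w = np.outer(mag, mag)
    cos = np.clip((p @ p.T) / w, -1.0, 1.0)
    h0 = w.sum()
    h2 = (w * 0.5 * (3.0 * cos**2 - 1.0)).sum()
    return float(h2 / h0)
```

In this convention a two-body, back-to-back configuration gives $S = 0$ and $R_2 = 1$, whereas an isotropic configuration gives $S$ close to 1 and $R_2$ close to 0, which is the qualitative behavior used to separate the four-body topology of $H\to WW^*$ jets from QCD jets.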

C. Identification of $H\to WW^*$ jets
The final variable to identify boosted $H\to WW^*$ jets is constructed using a boosted decision tree (BDT) algorithm. The input variables used in the BDT include $m^{\rm TA}_{\rm jet}$, $T$, $T_{\min}$, $S$, $R_2$, the number of charged tracks associated with the jet, and the number of subjets reconstructed in the jet rest frame. The BDT is optimized separately in different jet $p_T$ ranges. The signal identification efficiency of $H\to WW^*$ jets versus the background rejection of QCD jets for the BDT variable is shown in Fig. 3. The rejection of QCD jets for a given $H\to WW^*$ jet identification efficiency is comparable to that of the $H\to b\bar{b}$ taggers at ATLAS [50] and CMS [46]. The performance of the $H\to WW^*$ identification improves for jets with larger $p_T$, unlike the $H\to b\bar{b}$ taggers, whose background rejection degrades as the jet $p_T$ increases [46,50]. The performance of the $H\to WW^*$ identification can be further improved by including additional BDT input variables, such as the subjet energies, the number of charged tracks associated with each subjet, and other jet substructure variables [42] defined in the lab frame. Such a dedicated study is beyond the scope of this paper.
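The figure of merit used above, background rejection at a fixed signal efficiency, can be made concrete with a minimal sketch. It assumes unweighted jets and defines the rejection as the inverse of the background efficiency; the function name is hypothetical and the per-jet kinematic weights of the signal sample are ignored for simplicity.

```python
import numpy as np

def rejection_at_efficiency(sig_scores, bkg_scores, sig_eff=0.5):
    """Background rejection (1 / background efficiency) at a given signal efficiency.

    The cut is placed at the (1 - sig_eff) quantile of the signal score distribution,
    and jets with score >= cut are counted as passing.
    """
    sig_scores = np.asarray(sig_scores, dtype=float)
    bkg_scores = np.asarray(bkg_scores, dtype=float)
    cut = np.quantile(sig_scores, 1.0 - sig_eff)
    bkg_eff = np.mean(bkg_scores >= cut)
    return float("inf") if bkg_eff == 0 else float(1.0 / bkg_eff)
```

Scanning `sig_eff` from 0 to 1 would trace out an efficiency-versus-rejection curve of the kind described in the text.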

IV. APPLICATION
Reconstructed $H\to WW^*$ jets can be used to improve searches for NP with specific final state signatures. Here we demonstrate such an application by considering a search for a heavy resonance ($X$) with a narrow width that decays into two Higgs bosons in the final states $pp \to X \to HH$, $H \to b\bar{b}, WW^*$, where both $W^{(*)}$ bosons decay hadronically. Note that for such high-mass ($> 1.5$ TeV) resonances, the decay particles from the Higgs boson decay are contained within a cone of $R < 1.0$ in more than 90 % of the events. As a result, the two jets with the highest and second-highest energies that satisfy $p_T > 350$ GeV and $|\eta| < 2.0$ in an event are selected as the two Higgs jet candidates. They are subsequently combined to form an $X \to HH$ resonance candidate. The $H\to b\bar{b}$ jet identification and the $H\to WW^*$ jet identification are then applied to reduce the background, which is dominated by QCD jets, with working points chosen such that each identification efficiency is 50 %. The background rejection of the $H\to b\bar{b}$ tagger is assumed to be the same as that of the CoM Higgs tagger [19,50] at ATLAS.
We estimate the expected 95 % C.L. upper limit (UL) on the product of the production cross-section of a heavy resonance $X$ and the branching fraction for its decay into a Higgs boson pair. The expected limit is plotted as a function of the assumed $X$ mass, as shown in Fig. 4, for 400 fb$^{-1}$ of LHC data at 13 TeV, equivalent to the total luminosity accumulated after the upcoming Run III data taking at the LHC. For heavy resonances with masses below 3 TeV, the search sensitivity in the final state where both Higgs bosons decay into a $b\bar{b}$ pair ($X \to HH \to b\bar{b}b\bar{b}$) is significantly better than in the final state where one of the Higgs bosons decays into two $W^{(*)}$ bosons ($X \to HH \to b\bar{b}WW^*$), because the branching fraction of $H\to b\bar{b}$ is much larger than that of $H\to WW^*$ in the hadronic final state. However, the expected UL in the $X \to HH \to b\bar{b}WW^*$ final state becomes comparable to the one in the $X \to HH \to b\bar{b}b\bar{b}$ channel for resonances with masses above 3.5 TeV, due to the degradation of the background rejection of $H\to b\bar{b}$ taggers for jets with very large $p_T$ [19,50]. Adding $H\to WW^*$ jets as an experimental signature alongside $H\to b\bar{b}$ jets can lower the expected UL of searches for $X \to HH$ by 10-50 %, depending on the resonance mass.
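The branching-fraction argument above can be checked with simple arithmetic. In the sketch below only the 21 % value for $H\to WW^*$ comes from this paper; the values used for the $H\to b\bar{b}$ branching fraction (0.58) and the hadronic $W$ branching fraction (0.674) are approximate SM values assumed for illustration, the off-shell $W^*$ is assumed to decay hadronically at the same rate as an on-shell $W$, and the function name is hypothetical.

```python
def hh_branching_fractions(br_h_bb=0.58, br_h_ww=0.21, br_w_had=0.674):
    """Approximate HH branching fractions for the two final states compared in the text.

    br_h_bb and br_w_had are assumed approximate SM values, not taken from this paper.
    """
    # Both Higgs bosons decay to b b-bar.
    bbbb = br_h_bb**2
    # One Higgs to b b-bar, the other to WW* with both W bosons decaying hadronically.
    # The factor 2 counts which of the two Higgs bosons decays to WW*.
    bbww_had = 2.0 * br_h_bb * br_h_ww * br_w_had**2
    return bbbb, bbww_had
```

Under these assumptions the fully hadronic $b\bar{b}WW^*$ rate is roughly one third of the $b\bar{b}b\bar{b}$ rate, which is consistent with the better sensitivity of the $b\bar{b}b\bar{b}$ channel at lower resonance masses.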
We compare the search sensitivities in our study to those expected for 400 fb$^{-1}$ of data from the ATLAS and CMS experiments, obtained by scaling down their published results [14,17,31,33] according to the square root of the luminosity increase, as shown in Fig. 4. Our estimated UL in the $X \to HH \to b\bar{b}b\bar{b}$ final state is comparable to the ones from the ATLAS [14] and CMS [31] publications for resonances with masses less than 2 TeV, but better in the higher mass region. This is because our study uses a more recent $H\to b\bar{b}$ tagger at ATLAS [19,50] that has significantly better performance than the Higgs taggers used in the previous ATLAS and CMS publications. Both the ATLAS [17] and CMS [33] experiments have also carried out searches for $X \to HH \to b\bar{b}WW^*$ in the final state where one $W^{(*)}$ boson decays hadronically and the other $W^{(*)}$ boson decays leptonically. Depending on the resonance mass, the expected UL from the CMS (ATLAS) publication is approximately 1.2-2 ($> 10$) times the expected UL in our study, where both hadronically decaying $W^{(*)}$ bosons are reconstructed as a single jet.
FIG. 4. The expected 95 % C.L. UL on the product of the production cross-section of a heavy resonance $X$ and its branching fraction into a Higgs boson pair, as a function of the assumed $X$ mass, reconstructed in different Higgs boson decay final states.
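The luminosity scaling applied to the published limits can be written as a one-line relation, sketched below. It assumes a background-dominated search, where the expected UL improves as the inverse square root of the integrated luminosity; the function name and the numbers in the usage note are purely illustrative and do not correspond to any published result.

```python
import math

def scale_limit(ul_published, lumi_published, lumi_target):
    """Scale an expected upper limit from lumi_published to lumi_target (same units).

    Assumes UL is proportional to 1 / sqrt(luminosity), as for background-dominated searches.
    """
    if lumi_published <= 0 or lumi_target <= 0:
        raise ValueError("luminosities must be positive")
    return ul_published * math.sqrt(lumi_published / lumi_target)
```

For example, a hypothetical limit of 1.0 fb obtained with 100 fb$^{-1}$ would scale to `scale_limit(1.0, 100.0, 400.0)`, that is, half of its original value, at 400 fb$^{-1}$. In a nearly background-free regime the improvement with luminosity would be faster than this estimate.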

V. CONCLUSION
In this paper we study the reconstruction and identification of $H\to WW^*$ with high transverse momentum, where both $W^{(*)}$ bosons decay hadronically. We show that the boosted $H\to WW^*$ can be effectively reconstructed as a single jet and identified using jet substructure in the center-of-mass frame of the jet. Such a reconstruction and identification approach can discriminate the boosted $H\to WW^*$ in the full hadronic final state from QCD jets. Our result will significantly improve experimental sensitivities of searches for potential NP beyond the SM in final states containing highly boosted Higgs bosons. The technique we propose can also be directly applied to boosted $H\to ZZ^*$ in the full hadronic final state.