Measurements of Higgs boson production by gluon-gluon fusion and vector-boson fusion using $H\rightarrow W W^* \rightarrow e\nu \mu\nu$ decays in $pp$ collisions at $\sqrt{s}=13$ TeV with the ATLAS detector

Higgs boson production via gluon-gluon fusion and vector-boson fusion in proton-proton collisions is measured in the $H\rightarrow W W^* \rightarrow e\nu \mu\nu$ decay channel. The Large Hadron Collider delivered proton-proton collisions at a center-of-mass energy of 13 TeV between 2015 and 2018, which were recorded by the ATLAS detector, corresponding to an integrated luminosity of 139 fb$^{-1}$. The total cross sections for Higgs boson production by gluon-gluon fusion and vector-boson fusion times the $H\rightarrow W W^*$ branching ratio are measured to be $12.0\pm1.4$ and $0.75\;^{+0.19}_{-0.16}$ pb, respectively, in agreement with the Standard Model predictions of $10.4\pm 0.6$ and $0.81\pm 0.02$ pb. Higgs boson production is further characterized through measurements of Simplified Template Cross Sections in a total of 11 kinematic fiducial regions.


Introduction
The Higgs boson is a neutral scalar particle associated with a field whose nonzero vacuum expectation value results in the breaking of electroweak (EW) symmetry in the Standard Model (SM) and gives mass to the $W$ and $Z$ bosons [1-4]. Observation of a new particle consistent with being the Higgs boson was reported by the ATLAS and CMS collaborations in 2012 [5,6]. The Higgs boson has a rich set of properties that can be verified experimentally. Measurements of these properties are a powerful test of the SM and can be used to constrain theories of physics beyond the SM (BSM). BSM physics can also alter the kinematics of the Higgs boson production and decay. The large data sample delivered by the Large Hadron Collider (LHC) [7] at CERN makes it possible to measure the Higgs boson cross section in different kinematic regions in order to probe for these effects.
This paper describes measurements of Higgs boson production by gluon-gluon fusion (ggF) and vector-boson fusion (VBF) using $H\rightarrow WW^* \rightarrow e\nu\mu\nu$ decays in proton-proton ($pp$) collisions at a center-of-mass energy of 13 TeV. The data were recorded by the ATLAS detector [8] during Run 2 (2015-2018) of the LHC and correspond to an integrated luminosity of 139 fb$^{-1}$. The chosen decay channel takes advantage of the large branching ratio for the $H\rightarrow WW^*$ decay and the relatively low background from other SM processes due to having two charged leptons of different flavors in the final state. The measured cross section in the ggF production mode probes the Higgs boson couplings to heavy quarks, while the VBF production mode directly probes the couplings to $W$ and $Z$ bosons. Previous studies of the $H\rightarrow WW^* \rightarrow e\nu\mu\nu$ decay channel have been reported by the CMS Collaboration using its 137 fb$^{-1}$ full Run 2 dataset [9] and by the ATLAS Collaboration using a partial Run 2 dataset corresponding to an integrated luminosity of approximately 36 fb$^{-1}$ [10]. Compared to the previous ATLAS Run 2 analysis, several improvements have been made in addition to using the larger dataset: most notably, a measurement of the ggF production mode in the final state with two or more reconstructed jets and measurements of cross sections in kinematic fiducial regions defined in the Simplified Template Cross Section (STXS) framework [11,12].
The outline of this paper is as follows. Section 2 provides an overview of the signal characteristics and the analysis strategy. Section 3 describes the data and the simulated samples. Section 4 describes the event reconstruction. Section 5 details the various selections used to define the signal and control regions in the analysis. Section 6 discusses how the backgrounds are estimated. Section 7 provides commentary on the systematic uncertainties. Section 8 defines the likelihood fit procedure. Finally, the results of the analysis are presented in Sec. 9 and summarized in Sec. 10.

Analysis overview
The $H\rightarrow WW^* \rightarrow e\nu\mu\nu$ decay is characterized by two charged leptons and two undetected neutrinos in the final state. The opening angle between the two charged leptons tends to be small due to the spin-0 nature of the Higgs boson and the chiral structure of the weak force in the decay of the two $W$ bosons [13]. This feature of the decay is exploited to separate the Higgs boson signal from the main backgrounds such as continuum production of $WW$, where the charged leptons are more likely to have a large opening angle.
In addition to the decay products of the Higgs boson, the final state can be populated by jets either from the quarks participating in the VBF production mode or from initial-state radiation from quarks or gluons (in both the ggF and VBF production modes). The composition of the background processes changes significantly depending on the number of jets ($N_{\text{jet}}$) in the final state. Therefore, the analysis is performed separately in the $N_{\text{jet}} = 0$, $N_{\text{jet}} = 1$, and $N_{\text{jet}} \geq 2$ channels. The analysis is divided into four categories: one each for the $N_{\text{jet}} = 0$ and $N_{\text{jet}} = 1$ channels, which solely target the ggF signal production mode, and two for the $N_{\text{jet}} \geq 2$ channel, which separately target the VBF and ggF production modes.
For each analysis category, a set of selections is applied in order to enhance the signal contribution in a sample of events referred to as a signal region (SR), and a final fit to these events is performed. For the categories targeting the ggF production mode, the fit variable discriminating between signal and SM background processes is the dilepton transverse mass, defined as $m_T = \sqrt{\left(E_T^{\ell\ell} + E_T^{\text{miss}}\right)^2 - \left|\vec{p}_T^{\,\ell\ell} + \vec{E}_T^{\text{miss}}\right|^2}$ with $E_T^{\ell\ell} = \sqrt{|\vec{p}_T^{\,\ell\ell}|^2 + m_{\ell\ell}^2}$, where $m_{\ell\ell}$ is the dilepton invariant mass, $\vec{p}_T^{\,\ell\ell}$ is the vector sum of the lepton transverse momenta, and $\vec{E}_T^{\text{miss}}$ (with magnitude $E_T^{\text{miss}}$) is the missing transverse momentum. For the $N_{\text{jet}} \geq 2$ category targeting the VBF production mode, the output of a deep neural network (DNN) trained to identify the VBF topology is used as the discriminating variable in the fit. The analysis likelihood function combines all SRs and determines the best-fit values for a set of parameters of interest (POIs). Cross sections times branching ratios are measured for the ggF and VBF production modes and their combination, inclusively in jet multiplicity.
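The transverse-mass definition above can be illustrated with a short numerical sketch. The function name and the Cartesian-component interface are illustrative choices, not part of the analysis software; all inputs are assumed to be in GeV.

```python
import math


def dilepton_transverse_mass(ptll_x, ptll_y, mll, met_x, met_y):
    """Dilepton transverse mass m_T (GeV), following the definition in the text.

    ptll_x, ptll_y: components of the vector sum of the lepton transverse momenta
    mll:            dilepton invariant mass
    met_x, met_y:   components of the missing transverse momentum
    """
    ptll = math.hypot(ptll_x, ptll_y)
    et_ll = math.sqrt(ptll ** 2 + mll ** 2)      # E_T^ll
    met = math.hypot(met_x, met_y)               # magnitude of the missing pT
    sum_x = ptll_x + met_x
    sum_y = ptll_y + met_y
    mt_squared = (et_ll + met) ** 2 - (sum_x ** 2 + sum_y ** 2)
    # Guard against tiny negative values from floating-point rounding.
    return math.sqrt(max(mt_squared, 0.0))
```

For a dilepton system that recoils exactly against the missing transverse momentum, the vector sum vanishes and $m_T$ reduces to $E_T^{\ell\ell} + E_T^{\text{miss}}$, which is the configuration that gives the largest $m_T$ for fixed magnitudes.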
Cross-section measurements are also conducted in the Stage-1.2 STXS category scheme, which, relative to the 1.1 scheme [14], refines the granularity of bins for ggF events with a Higgs boson produced with large transverse momentum. Selected events are categorized according to the requirements placed on the transverse momentum of the reconstructed Higgs boson candidate ($p_T^H$) and on potential additional hadronic jets. The ggF STXS template process (referred to subsequently as $ggH$) is defined to be the Born-level $gg\rightarrow H$ process plus higher-order QCD and EW corrections. This includes real EW radiation, in particular the $gg\rightarrow Z(\rightarrow q\bar{q})H$ process. The VBF STXS template process (referred to subsequently as EW $qqH$) is defined to include the $V(\rightarrow q\bar{q})H$ topology in addition to the usual VBF topology. After merging certain STXS bins to ensure sensitivity for all the measured POIs, a total of 11 fiducial cross sections corresponding to different STXS kinematic regions are measured: 6 for $ggH$ production and 5 for EW $qqH$ production.

Detector and data samples
The ATLAS detector at the LHC covers nearly the entire solid angle around the collision point.$^{1}$ It consists of an inner tracking detector surrounded by a thin superconducting solenoid, electromagnetic and hadronic calorimeters, and a muon spectrometer incorporating three large superconducting toroidal magnets. The inner-detector system (ID) is immersed in a 2 T axial magnetic field and provides charged-particle tracking in the range $|\eta| < 2.5$.
The high-granularity silicon pixel detector covers the vertex region and typically provides four measurements per track, the first hit normally being in the insertable B-layer installed before Run 2 [15,16]. It is followed by the silicon microstrip tracker, which usually provides eight measurements per track. These silicon detectors are complemented by the transition radiation tracker (TRT), which enables radially extended track reconstruction up to $|\eta| = 2.0$. The TRT also provides electron identification information based on the fraction of hits (typically 30 in total) above a higher energy-deposit threshold corresponding to transition radiation.
$^{1}$ ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the center of the detector and the $z$ axis along the beam pipe. The $x$ axis points from the IP to the center of the LHC ring, and the $y$ axis points upward. Cylindrical coordinates $(r, \phi)$ are used in the transverse plane, $\phi$ being the azimuthal angle around the $z$ axis. The pseudorapidity is defined in terms of the polar angle $\theta$ as $\eta = -\ln\tan(\theta/2)$. Angular distance is measured in units of $\Delta R \equiv \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$.
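The pseudorapidity and angular-distance definitions used throughout the paper can be sketched as follows. The function names are illustrative, and the wrapping of the azimuthal difference into $(-\pi, \pi]$ is a standard convention assumed here rather than something stated in the text.

```python
import math


def pseudorapidity(theta):
    """eta = -ln tan(theta / 2), with theta the polar angle in radians."""
    return -math.log(math.tan(theta / 2.0))


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi] (assumed convention)."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)   # now in [0, 2*pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance Delta R = sqrt(Delta eta^2 + Delta phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))
```

A particle emitted perpendicular to the beam ($\theta = \pi/2$) has $\eta = 0$, and two directions at $\phi = 3.0$ and $\phi = -3.0$ are separated by $2\pi - 6.0 \approx 0.28$ rather than by 6.0.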
The calorimeter system covers the pseudorapidity range $|\eta| < 4.9$. Within the region $|\eta| < 3.2$, electromagnetic calorimetry is provided by barrel and end cap high-granularity lead/liquid-argon (LAr) calorimeters, with an additional thin LAr presampler covering $|\eta| < 1.8$ to correct for energy loss in material upstream of the calorimeters. Hadronic calorimetry is provided by the steel/scintillator-tile calorimeter, segmented into three barrel structures within $|\eta| < 1.7$, and two copper/LAr hadronic end cap calorimeters. The solid angle coverage is completed with forward copper/LAr and tungsten/LAr calorimeter modules optimized for electromagnetic and hadronic measurements, respectively.
The muon spectrometer comprises separate trigger and high-precision tracking chambers measuring the deflection of muons in a magnetic field generated by the superconducting air-core toroids. The field integral of the toroids ranges between 2.0 and 6.0 T m (Tesla*meter) across most of the detector. A set of precision chambers covers the region $|\eta| < 2.7$ with three layers of monitored drift tubes, complemented by cathode-strip chambers in the forward region, where the background is highest. The muon trigger system covers the range $|\eta| < 2.4$ with resistive-plate chambers in the barrel and thin-gap chambers in the end caps.
Interesting events are selected to be recorded by the first-level trigger system implemented in custom hardware, followed by selections made by algorithms implemented in software in the high-level trigger [17]. The first-level trigger accepts events from the 40 MHz bunch crossings at a rate below 100 kHz, which the high-level trigger reduces in order to record events to disk at about 1 kHz. A combination of unprescaled single-lepton triggers and one $e$-$\mu$ dilepton trigger [18,19] is employed in this analysis so as to maximize the total trigger efficiency. The transverse momentum ($p_T$) threshold for single-electron (single-muon) triggers was 24 (20) GeV for the first year of data taking and increased to 26 GeV for both lepton flavors during the remainder of Run 2. The $e$-$\mu$ trigger had a $p_T$ threshold of 17 GeV for electrons and 14 GeV for muons. The full ATLAS Run 2 dataset is used for this analysis, consisting of $pp$ collision data produced at $\sqrt{s} = 13$ TeV and recorded between 2015 and 2018. The data are subjected to quality requirements [20], including the removal of events recorded when relevant detector components were not operating correctly. The total integrated luminosity after this cleaning of the data corresponds to 139 fb$^{-1}$ [21].
An extensive software suite [22] is used in the reconstruction and analysis of real and simulated data, in detector operations, and in the trigger and data acquisition systems of the experiment.

Simulated event samples
Higgs boson production and decay into pairs of $W$ bosons or leptonically decaying $\tau$-leptons were simulated for each of the four main production modes: ggF and VBF, as well as $WH$ and $ZH$ (production in association with a vector boson, collectively referred to as $VH$).
VBF events were generated with Powheg Box [23-25, 43], interfaced with Pythia 8.230 [28] with the dipole recoil option enabled to model the parton shower, hadronization and underlying event. The Powheg Box prediction is accurate to NLO in QCD corrections and tuned to match calculations with effects due to finite heavy-quark masses and soft-gluon resummations up to next-to-next-to-leading-logarithm (NNLL) accuracy. The MC prediction is normalized to an approximate-NNLO QCD cross section with NLO electroweak corrections [44-46].
The uncertainties due to the parton shower and hadronization model for the ggF and VBF Higgs boson signal samples are evaluated using the events in the nominal sample generated with Powheg Box but interfaced to an alternative showering program Herwig 7 [47,48] instead of Pythia 8. To estimate the uncertainty related to the matching between the matrix element and the parton shower for ggF and VBF production, MC events produced with the MG5_aMC@NLO [49] generator (where MG5 denotes MadGraph5) and interfaced to Herwig 7 are used. They are accurate to NLO in QCD and utilize the NNPDF30_nlo_as_0118 [50] parton distribution function (PDF) set. In both cases, the H7UE set of tuned parameters (tune) [48] and the MMHT2014lo PDF set [51] were used for the underlying event.
The $VH$ production was simulated using Powheg Box v2 [23-25, 43] and interfaced with Pythia 8.212 for parton showering and nonperturbative effects. The Powheg Box prediction is accurate to NLO for the production of $VH$ plus one jet. Samples for the loop-induced process $gg\rightarrow ZH$ were generated with Powheg Box v2 interfaced to Pythia 8.235. The MC prediction is normalized to cross sections calculated at NNLO in QCD (including the $gg\rightarrow ZH$ contribution) and at NLO in electroweak corrections [52-56].
The ggF, VBF, and $VH$ Higgs boson samples use the PDF4LHC15 [57] PDF set and the AZNLO tune [58] of Pythia 8. The sample normalizations account for the decay branching ratios calculated with HDECAY [59-61] and Prophecy4f [62-64] assuming a Higgs boson mass of 125.09 GeV [65]. An uncertainty of 2.16% [11] is assigned to the $H\rightarrow WW^*$ branching ratio, which includes the uncertainty in the Higgs boson mass. All Higgs boson samples are generated with a Higgs boson mass of 125 GeV, with the uncertainty in the Higgs boson mass being negligible for kinematic distributions.
The production of $t\bar{t}H$ events was modeled using the Powheg Box v2 [23-25, 66, 67] generator at NLO with the NNPDF3.0nlo [50] PDF set. The events were interfaced to Pythia 8.230 using the A14 tune [68]. The prediction is normalized to the cross section computed at NLO QCD and NLO EW accuracy [11]. The sample is inclusive in Higgs decay modes and the cross section is computed assuming a Higgs boson mass of 125.09 GeV.
To model the SM background, quark-initiated production of $WW$, $WZ$, $V\gamma^*$, and $ZZ$ involving the strong interaction was simulated with the Sherpa 2.2.2 [69] generator. Fully leptonic final states were generated using matrix elements at NLO accuracy in QCD for up to one additional parton and at leading-order (LO) accuracy for up to three additional parton emissions. For $V\gamma^*$, the $\gamma^*$ mass was generated with a lower bound of 4 GeV. Samples for the loop-induced processes $gg\rightarrow WW$ and $gg\rightarrow ZZ$ were generated using LO-accurate matrix elements for up to one additional parton emission.
For the quark-initiated $WW$ background, systematic uncertainties are evaluated via samples simulated with alternative settings of the Sherpa 2.2.2 generator. The uncertainty in the matching procedure is assessed by varying the parameter $Q_{\text{cut}}$, which determines the transition between the matrix-element and parton-shower domain [70]. Specifically, alternative samples with $Q_{\text{cut}} = 15$ and 30 GeV instead of the nominal value of $Q_{\text{cut}} = 20$ GeV are considered. The uncertainties in the shower model are estimated via samples in which either the resummation scale, $\mu_q$, is increased or decreased by a factor of 2, or the alternative recoil scheme described in Refs. [71,72] is used.
$^{3}$ The rapidity is defined in terms of a particle's energy $E$ and momentum $p_z$ in the direction of the beam pipe as $y = \frac{1}{2}\ln\frac{E + p_z}{E - p_z}$.
The production of $V\gamma$ final states was simulated with the Sherpa 2.2.8 [69] generator. Matrix elements are at NLO QCD accuracy for up to one additional parton and at LO accuracy for up to three additional parton emissions.
Triboson production ($VVV$) was simulated with the Sherpa 2.2.2 generator using factorized gauge-boson decays. The matrix elements are accurate to NLO for the inclusive process and to LO for up to two additional parton emissions.
For all nominal multiboson samples generated with Sherpa, the matrix-element calculations were matched and merged with the Sherpa parton shower based on Catani-Seymour dipole factorization [71,73] using the MEPS@NLO prescription [70, 74-76]. The virtual QCD corrections were provided by the OpenLoops library [77,78]. The NNPDF3.0nnlo [50] set of PDFs was used, along with the dedicated set of tuned parton-shower parameters developed by the Sherpa authors.
Electroweak $WW$ production in association with two jets ($WWjj$) was generated by MG5_aMC@NLO with LO matrix elements using the NNPDF3.0nlo PDF set. For the nominal sample, MG5_aMC@NLO was interfaced with Pythia 8.244, using the A14 tune to model the parton shower, hadronization, and underlying event. An alternative sample is utilized to evaluate the shower model uncertainty, for which MG5_aMC@NLO was instead interfaced with Herwig 7.
The associated production of top quarks and $W$ bosons (mainly $Wt$) was modeled using the Powheg Box v2 generator at NLO in QCD using the five-flavor scheme and the NNPDF3.0nlo set of PDFs. The diagram removal scheme [84] was used to remove interference and overlap with $t\bar{t}$ production. The events were interfaced to Pythia 8.230 using the A14 tune and the NNPDF2.3lo set of PDFs. In all samples generated with Powheg Box v2, the decays of bottom and charm hadrons were performed by EvtGen 1.6.0 [85].
The $W$+jets and multijet backgrounds are estimated from data. Generated samples of $W$+jets and $Z$+jets events are used to validate the estimate and to determine the flavor composition uncertainties. These MC samples were generated using Powheg Box interfaced with Pythia 8.186, with Sherpa 2.2.1, and with MG5_aMC@NLO [49,86] interfaced with Pythia 8.186.
The MC generators, PDFs, and programs used for the underlying event and parton shower (UEPS) are summarized in Table 1. The alternative generators or UEPS models used to estimate systematic uncertainties are also listed in parentheses. Finally, the orders of the perturbative prediction for each sample are reported.
For all MC samples, the events were processed through the ATLAS detector simulation [87] based on Geant4 [88]. The effect of pileup was modeled by overlaying the hard-scattering event with simulated inelastic events generated with Pythia 8.186 using the NNPDF2.3lo set of PDFs and the A3 tune [89].

Event reconstruction
Primary vertices in the event are reconstructed from tracks in the ID with $p_T > 500$ MeV. Events are required to have at least one primary vertex with at least two associated tracks. The hard-scatter vertex is selected as the vertex with the highest $\sum p_T^2$ of its associated tracks.
Electron candidates are reconstructed by matching energy clusters in the electromagnetic calorimeter to well-reconstructed tracks that are extrapolated to the calorimeter [104]. All candidate electron tracks are fitted using a Gaussian sum filter [105] to account for bremsstrahlung energy losses. Electron candidates are required to satisfy $|\eta| < 2.47$, excluding the transition region $1.37 < |\eta| < 1.52$ between the barrel and end caps of the LAr calorimeter. Muon candidates are reconstructed from a global fit of matching tracks from the inner detector and muon spectrometer [106]. They are required to satisfy $|\eta| < 2.5$.
In order to reject particles misidentified as prompt leptons, several identification requirements as well as isolation and impact parameter criteria [104, 106] are applied. For electrons, a likelihood-based identification method [104] is employed, which takes into account a number of discriminating variables such as electromagnetic shower shapes, track properties, transition radiation response, and the quality of the cluster-to-track matching. Electron candidates with 15 GeV $< p_T <$ 25 GeV must satisfy the "tight" likelihood working point, which has an efficiency of approximately 70% for these electrons. For $p_T > 25$ GeV, where misidentification backgrounds (discussed further in Sec. 6.4) are less important, electron candidates must satisfy the "medium" likelihood working point, which has an efficiency of approximately 85% for an electron with $p_T \sim 40$ GeV. For muons, a cut-based identification method [106] is employed, using the "tight" working point with an efficiency of $\sim$95% so as to maximize the sample purity. The impact parameter requirements are $|z_0 \sin\theta| < 0.5$ mm and $|d_0|/\sigma_{d_0} < 5$ (3) for electrons (muons).$^{5}$ Leptons are required to be isolated from other activity in the event by placing upper bounds on both the additional transverse energy (using topological clusters in the calorimeter) within a cone of size $\Delta R = 0.2$ around the lepton and the additional $p_T$ (using tracks) within a cone of variable size no larger than $\Delta R = 0.2$ (0.3) for electrons (muons).
At least one of the offline reconstructed leptons must be matched to an online object that triggered the recording of the event. In the case where the $e$-$\mu$ trigger is solely responsible for the event being recorded, each lepton must correspond to one of the trigger objects. This trigger matching scheme also requires the $p_T$ of the lepton to be at least 1 GeV above the trigger-level threshold.
Jets are reconstructed using the anti-$k_t$ algorithm with a radius parameter of $R = 0.4$ and particle-flow objects as input [107-109]. The four-momentum of the jets is corrected for the response of the noncompensating calorimeter, signal losses due to noise threshold effects, energy loss in inactive material, and contamination from pileup (defined as additional $pp$ interactions in the same and neighboring bunch crossings) [110]. For jets entering the analysis, a kinematic selection of $p_T > 20$ GeV and $|\eta| < 4.5$ is applied. In the context of event categorization, only jets with $p_T > 30$ GeV are considered for jet counting. Furthermore, a jet-vertex-tagger multivariate discriminant selection that reduces contamination from pileup [111] is applied to jets with $20 < p_T < 60$ GeV and $|\eta| < 2.4$, utilizing calorimeter and tracking information to separate hard-scatter jets from pileup jets. Jets with $p_T > 20$ GeV and $|\eta| < 2.5$ containing $b$-hadrons ($b$-jets) are identified using a neural-network discriminant based on a number of lower-level taggers which utilize relevant quantities such as the associated track impact parameters and information from secondary vertices. The working point that is adopted has an average 85% $b$-jet tagging efficiency, as estimated from simulated $t\bar{t}$ events [112,113].
The following procedure is adopted in the case of overlapping objects. If two electrons share an ID track, the lower-$p_T$ electron is removed. If a muon shares an ID track with an electron, the electron is removed. For electrons and jets, the jet is removed if $\Delta R(\text{jet}, e) < 0.2$ and the jet is not tagged as a $b$-jet. For any surviving jets, the electron is removed if $\Delta R(\text{jet}, e) < 0.4$. For muons and jets, the jet is removed if $\Delta R(\text{jet}, \mu) < 0.2$, the jet has fewer than three associated tracks with $p_T > 500$ MeV, and the jet is not tagged as a $b$-jet. For any surviving jets, the muon is removed if $\Delta R(\text{jet}, \mu) < 0.4$.
$^{5}$ $z_0$ and $d_0$ are the longitudinal and transverse impact parameters, respectively. $d_0$ is defined by the point of closest approach of the track to the beamline in the $r$-$\phi$ plane, while $z_0$ is the longitudinal distance to the hard-scatter primary vertex from this point.
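The overlap-removal sequence described above can be sketched in a few lines. This is a minimal illustration, not the ATLAS implementation: the dictionary keys (`pt`, `eta`, `phi`, `track_id`, `btag`, `n_tracks`) and the function names are hypothetical, and `n_tracks` is assumed to count only tracks with $p_T > 500$ MeV.

```python
import math


def delta_r(a, b):
    """Angular distance between two objects with 'eta' and 'phi' keys."""
    dphi = (a["phi"] - b["phi"] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a["eta"] - b["eta"], dphi)


def remove_overlaps(electrons, muons, jets):
    """Apply the overlap-removal steps in the order given in the text."""
    # Step 1: among electrons sharing an ID track, keep the highest-pT one.
    best = {}
    for el in electrons:
        tid = el["track_id"]
        if tid not in best or el["pt"] > best[tid]["pt"]:
            best[tid] = el
    electrons = list(best.values())

    # Step 2: an electron sharing an ID track with a muon is removed.
    muon_tracks = {mu["track_id"] for mu in muons}
    electrons = [el for el in electrons if el["track_id"] not in muon_tracks]

    # Step 3: remove non-b-tagged jets within dR < 0.2 of an electron,
    # then remove electrons within dR < 0.4 of any surviving jet.
    jets = [j for j in jets
            if j["btag"] or all(delta_r(j, el) >= 0.2 for el in electrons)]
    electrons = [el for el in electrons
                 if all(delta_r(j, el) >= 0.4 for j in jets)]

    # Step 4: remove non-b-tagged, low-track-multiplicity jets within
    # dR < 0.2 of a muon, then muons within dR < 0.4 of a surviving jet.
    jets = [j for j in jets
            if j["btag"] or j["n_tracks"] >= 3
            or all(delta_r(j, mu) >= 0.2 for mu in muons)]
    muons = [mu for mu in muons
             if all(delta_r(j, mu) >= 0.4 for j in jets)]
    return electrons, muons, jets
```

The ordering matters: a $b$-tagged jet close to an electron survives the first jet-removal pass and then causes the electron itself to be dropped.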
The quantity $\vec{E}_T^{\text{miss}}$ is calculated as the negative vector sum of the $p_T$ of all the selected leptons and jets, together with reconstructed tracks that are not associated with these objects but are consistent with originating from the primary $pp$ collision [114]. A second definition of missing transverse momentum (in this case denoted by $\vec{p}_T^{\text{miss}}$) uses tracks for the hadronic hard term as well, replacing the calorimeter-measured jets with their associated tracks instead. The $p_T^{\text{miss}}$ observable is used directly in the selection of events because of its ability to discriminate better against the $Z/\gamma^* \rightarrow \tau\tau$ background, while the $E_T^{\text{miss}}$ observable is used to build signal-sensitive variables such as $m_T$ due to its superior resolution.

Event selection and categorization
The initial sample of events is required to satisfy the data quality and trigger criteria, as well as to contain exactly two leptons identified as discussed in the previous section, with different flavor and opposite charge. In addition, the higher-$p_T$ (leading) lepton is required to have $p_T > 22$ GeV and the subleading lepton is required to have $p_T > 15$ GeV. Di-$\tau$-lepton backgrounds from low-mass Drell-Yan (DY) production and meson resonances are removed by requiring a dilepton invariant mass $m_{\ell\ell} > 10$ GeV. In the analysis categories targeting the ggF production mode, a $p_T^{\text{miss}} > 20$ GeV selection is applied, which significantly reduces both the $Z/\gamma^* \rightarrow \tau\tau$ background and the multijet backgrounds with misidentified leptons. The above criteria define the event preselection. Figure 1 shows the jet multiplicity distribution at the preselection level. All histograms in this paper include underflow and overflow events unless otherwise stated. The different background compositions as a function of jet multiplicity motivate the division of the data sample into separate $N_{\text{jet}}$ categories. Four main analysis categories are defined: the $N_{\text{jet}} = 0$ category targeting the ggF production mode as described in Sec. 5.1, the $N_{\text{jet}} = 1$ category targeting the ggF production mode as described in Sec. 5.2, the $N_{\text{jet}} \geq 2$ category targeting the VBF production mode as described in Sec. 5.3, and the $N_{\text{jet}} \geq 2$ category targeting the ggF production mode as described in Sec. 5.4. To reject background from top-quark production, events containing $b$-jets with $p_T > 20$ GeV are vetoed in all analysis categories. The remaining selections used to define the analysis SRs are described separately for each category of events below, while Table 2 provides a summary of the full set of SR selections.
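The preselection just described amounts to a short sequence of cuts, sketched below. The event representation (dictionaries with `pt`, `flavor`, `charge`, `btag` keys) and the function names are hypothetical; trigger and data-quality requirements are assumed to have been applied upstream, and all momenta are in GeV.

```python
def passes_preselection(leptons, mll, met, targets_ggf=True):
    """Lepton, m_ll and missing-pT requirements of the preselection."""
    if len(leptons) != 2:
        return False
    lead, sub = sorted(leptons, key=lambda lep: lep["pt"], reverse=True)
    if lead["flavor"] == sub["flavor"]:          # different flavor (e, mu)
        return False
    if lead["charge"] * sub["charge"] >= 0:      # opposite charge
        return False
    if lead["pt"] <= 22.0 or sub["pt"] <= 15.0:
        return False
    if mll <= 10.0:                              # low-mass DY, meson resonances
        return False
    if targets_ggf and met <= 20.0:              # applied in ggF categories only
        return False
    return True


def count_jets(jets, pt_min=30.0):
    """Jet multiplicity used for categorization (jets with pT > 30 GeV)."""
    return sum(1 for jet in jets if jet["pt"] > pt_min)


def has_bjet(jets, pt_min=20.0):
    """True if the event contains a b-tagged jet with pT > 20 GeV (vetoed)."""
    return any(jet["btag"] and jet["pt"] > pt_min for jet in jets)
```

Note that the jet-counting threshold (30 GeV) is higher than the $b$-jet veto threshold (20 GeV), so an event can be vetoed by a $b$-jet that does not contribute to its jet multiplicity.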

$N_{\text{jet}} = 0$ category
Events with a significant mismeasurement of the missing transverse momentum are suppressed by requiring $\vec{E}_T^{\text{miss}}$ to point away from the dilepton transverse momentum ($\Delta\phi_{\ell\ell, E_T^{\text{miss}}} > \pi/2$). In the absence of a jet to balance the dilepton system, the magnitude of the dilepton momentum $p_T^{\ell\ell}$ is expected to be small in DY events. A requirement of $p_T^{\ell\ell} > 30$ GeV further reduces the DY contribution while retaining the majority of the signal events.

Continuum $WW$ production and resonant Higgs boson production can be separated by exploiting the spin-0 property of the Higgs boson, which, when combined with the $V-A$ nature of the $W$ boson decay, leads to a small opening angle between the charged leptons. A requirement of $m_{\ell\ell} < 55$ GeV, which combines the small lepton opening angle with the kinematics of a low-mass Higgs boson ($m_H = 125$ GeV), significantly reduces both the $WW$ and DY backgrounds. A requirement of $\Delta\phi_{\ell\ell} < 1.8$ significantly reduces the remaining DY background while retaining most of the signal. The $m_{\ell\ell}$ and $\Delta\phi_{\ell\ell}$ selections in the $N_{\text{jet}} = 0$ category are indicated by dashed lines in Figs. 2(a) and 2(b), with an arrow at the top pointing to the region retained. The $N_{\text{jet}} = 0$ SR is further split into four subregions with boundaries in $m_{\ell\ell}$ at [...]

Figure 2: Distributions of (a) $m_{\ell\ell}$ and (b) $\Delta\phi_{\ell\ell}$ in the $N_{\text{jet}} = 0$ category as well as (c) $m_{\ell\ell}$ and (d) $\Delta\phi_{\ell\ell}$ in the $N_{\text{jet}} = 1$ category, after the preselection and background rejection steps, and also after the selection on $m_{\ell\ell}$ for the $\Delta\phi_{\ell\ell}$ plots. The dashed lines indicate where the selection on the observable is made. The distributions are normalized to their nominal yields, before the final fit to all SRs and CRs (prefit normalizations). The hatched band shows the normalization component of the total prefit uncertainty, assuming SM Higgs boson production. The bottom panels show the normalized distributions for the signal and backgrounds, from which it can be inferred which background processes are primarily removed by the indicated selections.

$N_{\text{jet}} = 1$ category
A requirement is applied to the maximum transverse mass, $\max(m_T^{\ell})$, defined as the maximum value of $m_T^{\ell} = \sqrt{2\, p_T^{\ell}\, E_T^{\text{miss}} \left(1 - \cos\Delta\phi(\ell, E_T^{\text{miss}})\right)}$, where $\ell$ can be either the leading or the subleading lepton. This quantity tends to have small values for the DY background and large values for the signal process. It also has small values for multijet production, where misidentified leptons are frequently measured with energy lower than the jets from which they originate. Therefore, these backgrounds are substantially reduced with a requirement of $\max(m_T^{\ell}) > 50$ GeV. The one-jet requirement improves the rejection of the $Z/\gamma^* \rightarrow \tau\tau$ background. Using the direction and magnitude of the measured missing transverse momentum and projecting it along the directions defined by the two reconstructed charged leptons, the mass of the leptonically decaying $\tau$-lepton pair, $m_{\tau\tau}$, can be reconstructed using the so-called collinear approximation [115]. A requirement of $m_{\tau\tau} < m_Z - 25$ GeV$^{6}$ significantly reduces the remaining DY contribution and is applied in all categories with $N_{\text{jet}} \geq 1$. The same $m_{\ell\ell}$ and $\Delta\phi_{\ell\ell}$ selections as described in Sec. 5.1 are also applied in the $N_{\text{jet}} = 1$ category and are illustrated in Figs. 2(c) and 2(d), respectively. The $N_{\text{jet}} = 1$ SR is further split into four subregions with the same boundaries as defined for the $N_{\text{jet}} = 0$ category.
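The two discriminating quantities of this category can be sketched numerically. Both functions are illustrative and rest on stated assumptions: the single-lepton transverse mass is taken in its standard form, $m_T^{\ell} = \sqrt{2 p_T^{\ell} E_T^{\text{miss}} (1 - \cos\Delta\phi)}$, and the collinear approximation is implemented in its textbook version, in which each neutrino system is assumed to be collinear with its visible lepton and the $\tau$ and lepton masses are neglected. The treatment of events without a physical solution is not specified in the text, so the sketch simply returns `None`.

```python
import math


def max_lepton_mt(leptons, met, met_phi):
    """Largest single-lepton transverse mass; leptons are (pt, phi) pairs."""
    return max(
        math.sqrt(2.0 * pt * met * (1.0 - math.cos(phi - met_phi)))
        for pt, phi in leptons
    )


def collinear_mtautau(lep1, lep2, met, mll):
    """Collinear-approximation di-tau mass.

    lep1, lep2, met: (px, py) transverse-momentum components in GeV.
    Solves met = a1 * lep1 + a2 * lep2; the visible momentum fractions are
    x_i = 1 / (1 + a_i) and m_tautau = m_ll / sqrt(x1 * x2).
    Returns None if the leptons are collinear or the solution is unphysical.
    """
    det = lep1[0] * lep2[1] - lep1[1] * lep2[0]
    if abs(det) < 1e-9:
        return None
    a1 = (met[0] * lep2[1] - met[1] * lep2[0]) / det
    a2 = (lep1[0] * met[1] - lep1[1] * met[0]) / det
    if 1.0 + a1 <= 0.0 or 1.0 + a2 <= 0.0:
        return None
    x1 = 1.0 / (1.0 + a1)
    x2 = 1.0 / (1.0 + a2)
    return mll / math.sqrt(x1 * x2)
```

For example, if the missing transverse momentum equals the vector sum of the two lepton transverse momenta, each lepton carries half of its parent's momentum and the reconstructed $m_{\tau\tau}$ is twice $m_{\ell\ell}$.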

VBF-enriched $N_{\text{jet}} \geq 2$ category
The VBF process is characterized by the kinematics of the two leading jets in the event, which are predominantly emitted in the forward region, and by the relatively low levels of hadronic activity between them due to the mediating weak bosons that do not exchange color. In order to construct a SR enriched in this VBF topology, events are rejected if they contain additional jets with $p_T > 30$ GeV that lie in the pseudorapidity gap between the two leading jets (central jet veto) or if either lepton lies outside the pseudorapidity gap between the two leading jets (outside lepton veto). Furthermore, the invariant mass of the two leading jets, $m_{jj}$, is required to be above 120 GeV to ensure orthogonality to analyses targeting the $V(\rightarrow jj)H$ production mode.
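The two topological vetoes can be expressed compactly. The sketch below assumes jets and leptons are dictionaries with `pt` and `eta` keys (hypothetical names) and that the event has at least two jets; each function returns `True` when the event survives the corresponding veto.

```python
def eta_gap(jets):
    """Pseudorapidity interval between the two highest-pT jets, plus the rest."""
    ordered = sorted(jets, key=lambda jet: jet["pt"], reverse=True)
    low, high = sorted((ordered[0]["eta"], ordered[1]["eta"]))
    return low, high, ordered[2:]


def passes_central_jet_veto(jets, pt_min=30.0):
    """No additional jet with pT > 30 GeV inside the leading-jet gap."""
    low, high, extra = eta_gap(jets)
    return not any(jet["pt"] > pt_min and low < jet["eta"] < high
                   for jet in extra)


def passes_outside_lepton_veto(leptons, jets):
    """Both leptons lie inside the leading-jet pseudorapidity gap."""
    low, high, _ = eta_gap(jets)
    return all(low < lep["eta"] < high for lep in leptons)
```

An event enters the VBF-enriched selection only if both functions return `True`, in addition to the $m_{jj}$ requirement quoted in the text.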
The events in this category are analyzed using a DNN that is implemented through Keras [116] and TensorFlow [117], considering VBF Higgs boson production as signal and the other processes as background (including ggF Higgs boson production). The hyperparameters are optimized to find the best-performing set. The architecture of the DNN comprises seven dense hidden layers, with the first hidden layer consisting of 256 nodes and each successive layer decreasing in size. Nodes in the hidden layers use rectified linear units as activation functions, while the output node utilizes a sigmoid activation function. Cross-entropy is used to calculate the loss, and dropout is used as a regularization method to prevent overtraining. A total of 15 kinematic variables built from the leptons ($\ell$), jets ($j$), and $\vec{E}_T^{\text{miss}}$ in the event are used as inputs to the DNN. The following variables are chosen to provide discrimination based on the VBF topology: $m_{jj}$; the difference between the two jet rapidities ($\Delta y_{jj}$); the lepton $\eta$-centrality ($\sum_{\ell} C_{\ell}$, where $C_{\ell} = |2\eta_{\ell} - \sum \eta_j|/\Delta\eta_{jj}$), which quantifies the positions of the leptons relative to the leading jets in pseudorapidity [118]; the $p_T$ of the three leading jets ($p_T^{j_1}$, $p_T^{j_2}$, $p_T^{j_3}$, where $p_T^{j_3}$ is set to 0 GeV if there is no third jet in the event); and the invariant masses of all four possible lepton-jet pairs between the leptons and the two leading jets ($m_{\ell_1 j_1}$, $m_{\ell_1 j_2}$, $m_{\ell_2 j_1}$, $m_{\ell_2 j_2}$). The variables $m_{\ell\ell}$, $\Delta\phi_{\ell\ell}$, and $m_T$ are also utilized, targeting the features of the $H\rightarrow WW^*$ decay. Two additional variables are also included: the total transverse momentum ($p_T^{\text{tot}}$), defined as the magnitude of the vectorial sum of the $p_T$ of all selected objects, and the $E_T^{\text{miss}}$ significance, which provides separation between events with real undetected high-$p_T$ particles and events where the $E_T^{\text{miss}}$ is the result of resolution effects [119].
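Of the DNN inputs, the lepton $\eta$-centrality is the least familiar, so a small numerical sketch may help. It assumes the definition $C_{\ell} = |2\eta_{\ell} - (\eta_{j_1} + \eta_{j_2})| / |\eta_{j_1} - \eta_{j_2}|$ summed over the two leptons, with the sums running over the two leading jets; the function names are illustrative.

```python
def lepton_centrality(eta_lepton, eta_jet1, eta_jet2):
    """C_l = |2*eta_l - (eta_j1 + eta_j2)| / |eta_j1 - eta_j2|.

    C_l is 0 for a lepton midway between the two leading jets, 1 for a
    lepton aligned with either jet, and above 1 outside the jet gap.
    """
    return abs(2.0 * eta_lepton - (eta_jet1 + eta_jet2)) / abs(eta_jet1 - eta_jet2)


def summed_lepton_centrality(lepton_etas, eta_jet1, eta_jet2):
    """Sum of C_l over the leptons, as used for the DNN input variable."""
    return sum(lepton_centrality(eta, eta_jet1, eta_jet2) for eta in lepton_etas)
```

With leading jets at $\eta = \pm 2$, leptons at $\eta = 0$ and $\eta = 1$ give $C_{\ell} = 0$ and $0.5$, so the summed centrality is 0.5; VBF-like events, whose leptons sit between the tagging jets, populate small values.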
The observables providing the best discrimination between signal and background are $m_{jj}$ and $\Delta y_{jj}$, and their distributions in the $N_{\mathrm{jet}} \geq 2$ VBF SR are shown in Fig. 3. The DNN output reflects how "VBF-like" the event's kinematics are, and thus is used as a classifier, with the signal purity improving as the output value increases. The DNN bin boundaries in the VBF-sensitive range are chosen with an algorithm that aims to make the bins as narrow as possible, while also requiring at least ten expected signal events and at least ten expected background events per bin, as well as at most 20% statistical uncertainty in the background. This method helps to mitigate the risk of strong statistical fluctuations in the fit, and yields narrower bins on average for larger values of the DNN output, giving a total of seven bins. In the bin with the highest DNN output, the expected VBF signal-to-background ratio is approximately 2 to 1.
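The text does not spell out the binning algorithm. One possible greedy reading of the stated criteria is sketched below: fine bins are merged starting from the highest DNN output, and a bin is closed as soon as it holds at least ten expected signal events, at least ten expected background events, and a background statistical uncertainty of at most 20%. The function name, the fine-bin inputs, and the treatment of leftover low-score bins are all assumptions of this sketch.

```python
import math


def choose_bin_edges(sig, bkg, bkg_w2, fine_edges,
                     min_events=10.0, max_rel_unc=0.20):
    """Greedily merge fine bins from high to low classifier output.

    sig, bkg   : expected yields per fine bin (ascending in output)
    bkg_w2     : sum of squared MC weights of the background per fine bin
    fine_edges : len(sig) + 1 ascending fine-bin edges
    Returns the ascending list of merged bin edges.
    """
    edges = [fine_edges[-1]]
    s = b = w2 = 0.0
    for i in range(len(sig) - 1, -1, -1):
        s += sig[i]
        b += bkg[i]
        w2 += bkg_w2[i]
        enough_events = s >= min_events and b >= min_events
        small_stat_unc = math.sqrt(w2) <= max_rel_unc * b
        if enough_events and small_stat_unc:
            edges.append(fine_edges[i])  # close the current merged bin
            s = b = w2 = 0.0
    if edges[-1] != fine_edges[0]:
        if len(edges) > 1:
            edges[-1] = fine_edges[0]    # absorb leftovers into the lowest bin
        else:
            edges.append(fine_edges[0])  # nothing passed: one single bin
    return edges[::-1]
```

Because the signal purity rises with the DNN output, a procedure of this kind naturally closes bins quickly at high output values only if enough expected events are available there, which is the trade-off the criteria are meant to control.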

ggF-enriched $N_{\mathrm{jet}} \geq 2$ category
The ggF-enriched $N_{\mathrm{jet}} \geq 2$ category is forced to be mutually exclusive to the VBF-enriched $N_{\mathrm{jet}} \geq 2$ category by requiring events to fail either the central jet veto or outside lepton veto. Furthermore, $V(\rightarrow jj)H$ production is suppressed by rejecting events in a region defined by $|m_{jj} - 85| \leq 15$ GeV and $\Delta y_{jj} \leq 1.2$. The same $\Delta\phi_{\ell\ell}$ and $m_{\ell\ell}$ selections as described in Sec. 5.1 are also applied in the ggF-enriched $N_{\mathrm{jet}} \geq 2$ category and are shown in Figs. 4(a) and 4(b), respectively, together with the selection in Fig. 4(c). The ggF-enriched $N_{\mathrm{jet}} \geq 2$ SR is further split into two subregions with a boundary at $m_{\ell\ell} = 30$ GeV.
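The rejected window, $|m_{jj} - 85| \leq 15$ GeV together with $\Delta y_{jj} \leq 1.2$, and the $m_{\ell\ell}$ split at 30 GeV can be written as two small predicates. The function names are hypothetical, and assigning an event with $m_{\ell\ell}$ exactly at the boundary to the upper subregion is an assumption of this sketch.

```python
def in_vh_veto_region(m_jj, dy_jj):
    """True if the event falls in the rejected V(->jj)H-like window.

    m_jj is the dijet invariant mass in GeV, dy_jj the rapidity
    separation of the two leading jets.
    """
    return abs(m_jj - 85.0) <= 15.0 and dy_jj <= 1.2


def ggf_2jet_subregion(m_ll, boundary=30.0):
    """Label the m_ll subregion of the ggF-enriched SR (m_ll in GeV).

    Placing m_ll == boundary in the upper subregion is an assumption.
    """
    return "low_mll" if m_ll < boundary else "high_mll"
```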

STXS categorization
In order to optimize the measurements in bins aligned with those of the Stage-1.2 STXS framework, several STXS kinematic fiducial regions are merged and the separation of the selected events into SRs differs slightly from the description above. The STXS bin merging strategy, referred to as Reduced Stage-1.2, and the reconstructed SRs are illustrated in Fig. 5. For the exclusive $N_{\mathrm{jet}} = 0$ category, only a single STXS cross section is measured, with the same SR splitting as defined in Sec. 5.1 and applying a $p_{\mathrm{T}}^{H} < 200$ GeV selection. For the exclusive $N_{\mathrm{jet}} = 1$ category, all three Stage-1.2 measurements are retained, with the SR split along the same $p_{\mathrm{T}}^{H}$ bin boundaries. For the exclusive $N_{\mathrm{jet}} \geq 2$ category with $p_{\mathrm{T}}^{H} < 200$ GeV, only a single STXS cross section is measured, with the same SR splitting as defined in Sec. 5.4 but also including a $p_{\mathrm{T}}^{H} < 200$ GeV selection. For the $N_{\mathrm{jet}}$-inclusive category with $p_{\mathrm{T}}^{H} \geq 200$ GeV, only a single STXS cross section is measured, using both the $N_{\mathrm{jet}} = 1$ SR with an added $p_{\mathrm{T}}^{H} \geq 200$ GeV selection and the same ggF-enriched $N_{\mathrm{jet}} \geq 2$ SR with splitting as defined in Sec. 5.4 but also including a $p_{\mathrm{T}}^{H} \geq 200$ GeV selection. For the EW categories, the bins separated by $p_{\mathrm{T}}^{Hjj}$ value$^{7}$ are merged, and so are the bins separated by $m_{jj}$ for $p_{\mathrm{T}}^{H} \geq 200$ GeV, resulting in a total of five measured cross sections. The same SR described in Sec. 5.3 is used, but split into five subregions with the same boundaries that define the measured EW STXS cross sections. In addition, the same DNN and training are used for the final fit's discriminating variable. The exclusive $N_{\mathrm{jet}} \geq 2$ EW category with $m_{jj} < 350$ GeV that targets $V(\rightarrow q\bar{q})H$ production, and also the $V(\rightarrow \mathrm{leptons})H$ and $t\bar{t}H$ categories to which this analysis is not sensitive, are fixed to their expected yields in the fit. Figure 6 shows the relative contributions of the different merged STXS bins in all reconstructed SRs. In each case, the target categories provide the largest contribution in the SRs that aim to select them.

Figure 6: Relative SM signal composition in terms of the measured STXS bin for each reconstructed signal region ($\sqrt{s} = 13$ TeV, 139 fb$^{-1}$).

Background estimation
The background contamination in the SRs originates from various processes: nonresonant $WW$, top-quark pair ($t\bar{t}$) and single-top-quark ($Wt$), diboson ($WZ$, $ZZ$, $W\gamma$, $W\gamma^*$, and $Z\gamma$), and Drell-Yan (mainly $Z/\gamma^* \rightarrow \tau\tau$, hereafter denoted by $Z/\gamma^*$) production. Other background contributions arise from $W$+jets and multijet production with misidentified leptons, which are either nonprompt leptons from decays of heavy-flavor hadrons or jets misidentified as prompt leptons. The backgrounds with misidentified leptons are estimated using a data-driven technique. Dedicated data regions with low expected signal, hereafter called control regions (CRs), are used to normalize the predictions of the $WW$, top-quark, and $Z/\gamma^* \rightarrow \tau\tau$ backgrounds. Table 3 summarizes the selections used to define the CRs, which start from the preselection defined in Table 2. The background estimates for the remaining background processes, most notably the diboson processes other than $WW$, are obtained from simulated samples normalized to the theoretical cross sections for these processes.

$^{7}$ $p_{\mathrm{T}}^{Hjj}$ is defined at the reconstruction level as the transverse momentum of the system composed of the two leptons + $E_{\mathrm{T}}^{\mathrm{miss}}$ + two leading jets in the event.

Table 3: Event selection criteria used to define the control regions in the $H\rightarrow WW^* \rightarrow e\nu\mu\nu$ analysis. Every control region selection starts from the selection labeled "Preselection" in Table 2, and $N_{b\text{-jet},\,(20 < p_{\mathrm{T}} < 30\ \mathrm{GeV})}$ represents the number of $b$-jets with $20 < p_{\mathrm{T}} < 30$ GeV.

$WW$ background
The nonresonant $WW$ background mainly originates from the quark-initiated process (labeled $q\bar{q} \rightarrow WW$) with a small additional contribution from the gluon-initiated process proceeding via a box diagram ($gg \rightarrow WW$). The $gg \rightarrow WW$ process produces approximately 10% of the total background contribution and is estimated from simulated samples normalized to the theoretical cross sections. The $q\bar{q} \rightarrow WW$ process is normalized to the observed yields in dedicated CRs, defined separately for each analysis category. The $WW$ CRs are orthogonal to the SRs and enriched in the $WW$ process. For the $N_{\mathrm{jet}} = 0$ category, the selected $m_{\ell\ell}$ region is modified to $55 < m_{\ell\ell} < 110$ GeV and the $\Delta\phi_{\ell\ell}$ selection is relaxed to $\Delta\phi_{\ell\ell} < 2.6$ (from $\Delta\phi_{\ell\ell} < 1.8$ in the SRs). The upper bound on the $m_{\ell\ell}$ selection reduces the contamination from top-quark processes in the $WW$ CR, whereas the $\Delta\phi_{\ell\ell}$ selection removes most of the $Z/\gamma^* \rightarrow \tau\tau$ contamination. The $WW$ CR in the $N_{\mathrm{jet}} = 1$ category differs from the SR by requiring $m_{\ell\ell} > 80$ GeV and $|m_{\tau\tau} - m_{Z}| > 25$ GeV. For the $N_{\mathrm{jet}} \geq 2$ categories, it is difficult to find a region with high $WW$ process purity because of the overwhelming background from top-quark processes. For the ggF-enriched $N_{\mathrm{jet}} \geq 2$ category, the $WW$ process is normalized to the yield in the $WW$ CR, whereas the $WW$ background in the VBF-enriched $N_{\mathrm{jet}} \geq 2$ category is estimated from simulated samples normalized to the theoretical cross section. The $WW$ CR for the ggF-enriched $N_{\mathrm{jet}} \geq 2$ category is defined by requiring $m_{\ell\ell} > 80$ GeV and $m_{\mathrm{T2}} > 165$ GeV. The $m_{\mathrm{T2}}$ variable [120] is defined as
$$m_{\mathrm{T2}}^{2} = \min_{\slashed{p}_{1} + \slashed{p}_{2} = \slashed{p}_{\mathrm{T}}} \left[ \max\left( m_{\mathrm{T}}^{2}\!\left(p_{\mathrm{T}}^{a}, \slashed{p}_{1}\right),\; m_{\mathrm{T}}^{2}\!\left(p_{\mathrm{T}}^{b}, \slashed{p}_{2}\right) \right) \right],$$
where the minimization is over all possible two-momenta, $\slashed{p}_{1,2}$, such that their sum gives the observed missing transverse momentum $\slashed{p}_{\mathrm{T}}$, and where each of $p_{\mathrm{T}}^{a}$ and $p_{\mathrm{T}}^{b}$ is the combined transverse momentum of a charged lepton and a jet. The $m_{\mathrm{T2}}$ selection is indicated by a dashed line in Fig. 7, with an arrow at the top pointing to the region retained. For all $WW$ CRs, a $b$-jet veto is maintained.
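The idea of normalizing a simulated background to the observed yield in a CR can be illustrated with a deliberately simplified, single-region sketch. In the analysis the normalization factors are determined in the simultaneous likelihood fit rather than by this arithmetic; the function names and the example yields in the usage note are hypothetical.

```python
def normalization_factor(n_data_cr, n_target_mc_cr, n_other_cr):
    """Scale factor for the target background derived from one CR.

    n_data_cr      : observed events in the control region
    n_target_mc_cr : simulated yield of the target process in the CR
    n_other_cr     : expected yield of all other processes in the CR
    """
    return (n_data_cr - n_other_cr) / n_target_mc_cr


def sr_estimate(norm_factor, n_target_mc_sr):
    """Target-background estimate in the SR after applying the CR scale factor."""
    return norm_factor * n_target_mc_sr
```

For example, with purely illustrative numbers, 1200 observed CR events, 200 expected from other processes, and 800 simulated events of the target process give a normalization factor of 1.25, so a simulated SR yield of 400 would be scaled to 500.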

Top-quark backgrounds
The top-quark backgrounds affecting this analysis are associated with the $t\bar{t}$ and $Wt$ processes. They are normalized to the observed combined top yields in CRs, defined separately for each analysis category. The uncertainties in the relative contributions of $t\bar{t}$ and $Wt$ are accounted for by considering their relevant uncertainties separately, while their ratios are similar between respective CRs and SRs. The CRs are orthogonal to the SRs, normally by inverting the $b$-jet veto. The exception is in the ggF-enriched $N_{\mathrm{jet}} \geq 2$ category, where the top-quark CR is defined with a $b$-jet veto and orthogonality to the SR and $WW$ CR is achieved by requiring $m_{\ell\ell} > 80$ GeV and $m_{\mathrm{T2}} < 165$ GeV, respectively. This is possible due to the $N_{\mathrm{jet}} \geq 2$ categories having high top-quark event purity even with a $b$-jet veto, and this definition reduces the uncertainties from the $b$-jet selection in this category. For the $N_{\mathrm{jet}} = 0$ category, the top-quark CR requires the presence of a reconstructed jet with $20 < p_{\mathrm{T}} < 30$ GeV which is identified as coming from a $b$-quark.

Backgrounds with misidentified leptons
The backgrounds originating from either one or two misidentified (Mis-Id) leptons are primarily due to $W$+jets and multijet processes, respectively. They are estimated using a data-driven technique$^{8}$ where control samples are established in which all nominal selections are applied with the exception that one of the two lepton candidates fails to meet all of the identification criteria defined in Sec. 4, but satisfies a looser set of identification criteria (referred to as an anti-identified lepton). The expected Mis-Id background yields in the signal and control regions are extrapolated from the observed number of events in the corresponding samples with anti-identified leptons, after subtracting the expected contribution from processes with two prompt leptons. The method appropriately accounts for all processes with misidentified leptons, as long as they are represented in the sample with one anti-identified lepton and the extrapolation factor is similar to the nominal value. The small contribution from multijet processes with two misidentified leptons is accounted for in the extrapolation by applying a correction term evaluated in a sample where both lepton candidates are anti-identified. The correction is largest in the VBF-enriched $N_{\mathrm{jet}} \geq 2$ category, for which a direct $E_{\mathrm{T}}^{\mathrm{miss}}$ selection is not applied. In this case, the multijet processes constitute approximately 25% of the total misidentified lepton yield in the SR.

[Figure: ... CRs with signal (normalized to postfit measurement) and background modeled contributions. The red arrow in the lower panel of (a) indicates that the central value of the data lies above the window. The last bin of each distribution is inclusive (includes the overflow). The hatched band shows the total uncertainty, assuming SM Higgs boson production. Some contributions are too small to be visible. Vertical axes: Events / 5 GeV and Events / bin.]
The extrapolation factor that is used to extrapolate the expected Mis-Id background yields in the control samples to the SRs is determined in a sample of $Z$+jets-enriched events, where a three-lepton selection is applied to target events with a leptonically decaying $Z$ boson plus a misidentified lepton candidate recoiling against the $Z$ boson. It is defined as the ratio of the number of events in which the misidentified lepton candidate is identified to the number of events in which it is anti-identified and is measured in bins of lepton $p_{\mathrm{T}}$ (and $|\eta|$) in the case of electrons (muons). A correction factor is used to account for the fact that the sources of misidentified leptons, such as hadrons, nonprompt leptons from heavy-flavor decays, and photons, contribute in different ratios to the $Z$+jets-enriched sample in which the extrapolation factor is derived and the largest source of Mis-Id background in the SR ($W$+jets events). This sample composition correction factor is determined from the ratio of extrapolation factors measured in $W$+jets and $Z$+jets MC simulation.
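The structure of the extrapolation can be summarized in a minimal single-bin sketch. It omits the binning in lepton kinematics and the correction term for events with two misidentified leptons; the function names and the numbers in the usage note are hypothetical and not taken from the analysis.

```python
def extrapolation_factor(n_identified, n_anti_identified):
    """Ratio of identified to anti-identified misidentified-lepton
    candidates, as measured in the Z+jets-enriched sample."""
    return n_identified / n_anti_identified


def misid_yield(n_data_anti_id, n_prompt_mc_anti_id, factor,
                composition_correction=1.0):
    """Mis-Id estimate for one region.

    The expected contribution of processes with two prompt leptons is
    subtracted from the anti-identified control sample before the
    extrapolation factor and composition correction are applied.
    """
    return factor * composition_correction * (n_data_anti_id - n_prompt_mc_anti_id)
```

With illustrative inputs, a factor of 50/1000 = 0.05, a composition correction of 1.2, and a control sample of 400 events of which 100 are expected from prompt-lepton processes yield an estimate of about 18 events.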

Control regions for the STXS measurements
For the cross-section measurements in the STXS framework, the CRs defined above for the $WW$, top-quark, and $Z/\gamma^* \rightarrow \tau\tau$ processes are further split into smaller CRs corresponding to the various STXS SRs. For the $N_{\mathrm{jet}} = 0$ CRs, no further splitting is needed and identical CRs are used. For the $N_{\mathrm{jet}} = 1$ category, the CRs for the $WW$, top-quark, and $Z/\gamma^* \rightarrow \tau\tau$ processes are each divided into four regions defined by $p_{\mathrm{T}}^{H} < 60$ GeV, $60 \leq p_{\mathrm{T}}^{H} < 120$ GeV, $120 \leq p_{\mathrm{T}}^{H} < 200$ GeV, and $p_{\mathrm{T}}^{H} \geq 200$ GeV. The CRs for the $WW$, top-quark, and $Z/\gamma^* \rightarrow \tau\tau$ processes in the ggF-enriched $N_{\mathrm{jet}} \geq 2$ category are further divided into regions with $p_{\mathrm{T}}^{H} < 200$ GeV and $p_{\mathrm{T}}^{H} \geq 200$ GeV. The $p_{\mathrm{T}}^{H} \geq 200$ GeV STXS category targeting the ggF production mode is common to all jet multiplicities. For the VBF-enriched $N_{\mathrm{jet}} \geq 2$ category, the top-quark and $Z/\gamma^* \rightarrow \tau\tau$ CRs are split into three regions each. Two regions are defined for $p_{\mathrm{T}}^{H} < 200$ GeV by $350 \leq m_{jj} < 700$ GeV and $m_{jj} \geq 700$ GeV, while one region is defined for $p_{\mathrm{T}}^{H} \geq 200$ GeV.

Systematic uncertainties
Uncertainties from both experimental and theoretical sources affect the results of the analysis. This section describes the estimation of their effects on the signal and background normalizations as well as, where applicable, their effects on the shape of the final discriminant. The relative impacts that various sources of systematic uncertainties have on the measured ggF and VBF cross sections are obtained from the likelihood fit described in Sec. 8 and are listed in Table 6.

Experimental uncertainties
The uncertainties related to the reconstruction of the objects used in the analysis are determined using data-driven methods on high-statistics samples of processes such as $Z \rightarrow \ell\ell$. Uncertainties associated with the selected leptons originate from the reconstruction and identification efficiency, the energy (or momentum) scale and resolution, and the isolation efficiency [104, 106]. For jets, uncertainties arise from the jet energy scale and resolution [110], the jet-vertex tagger's performance, and the $b$-jet identification [113]. Furthermore, uncertainties due to the trigger selection [18,19] and the soft term in the reconstruction of $E_{\mathrm{T}}^{\mathrm{miss}}$ [114] are estimated. The uncertainty in the modeling of pileup for simulated samples is estimated by varying the reweighting to the profile in data within its uncertainties. The uncertainty in the combined 2015-2018 integrated luminosity is 1.7% [21], obtained using the LUCID-2 detector [123] for the primary luminosity measurements. The integrated luminosity uncertainty is only applied to the Higgs boson signal and to background processes that are normalized to theoretical predictions.
Three sources of uncertainty related to the extrapolation factor used in the data-driven Mis-Id background estimate are considered: the statistical uncertainty of the extrapolation factor itself, an uncertainty related to the subtraction of processes with two prompt leptons from the $Z$+jets-enriched sample used to derive the extrapolation factor, and an uncertainty in the sample composition correction factor. Together, they amount to a total uncertainty on the electron (muon) extrapolation factor ranging from 10% (12%) at low $p_{\mathrm{T}}$ to 35% (75%) at high $p_{\mathrm{T}}$, where there is a small number of Mis-Id leptons.
The largest experimental uncertainties affecting the ggF measurement come from the $b$-jet identification, the pileup modeling, the jet energy resolution, and the Mis-Id background estimate. For the VBF measurement, the largest experimental uncertainty comes from the $E_{\mathrm{T}}^{\mathrm{miss}}$ reconstruction.

Theoretical uncertainties
Uncertainties from the renormalization and factorization scale choices, underlying-event modeling, and choice of PDF are estimated for all processes. For signal, top, and $Z/\gamma^*$ processes, the uncertainties from the parton shower and the matrix-element matching are estimated by comparing predictions from the nominal and alternative generators that are described in Sec. 3.2. For the prediction of […] production, variations of the matching scale and nonperturbative effects are considered instead of an alternative program for estimating the matrix-element matching uncertainties. The uncertainty from the resummation scale is estimated for the Sherpa samples.
For signal processes, the approach described in Refs. [11,124] is used to estimate the variations due to the impact of higher-order contributions not included in the calculations and of migration effects on the $N_{\mathrm{jet}}$ ggF cross sections. In particular, the uncertainties from the choice of factorization and renormalization scales, the choice of resummation scales, and the ggF migrations between the 0-jet and 1-jet phase-space bins or between the 1-jet and $\geq 2$-jet bins are considered [11,[125][126][127][128].
The $gg \rightarrow WW$ process is simulated at LO precision for up to one additional parton emission. Therefore, a conservative $-50\%/+100\%$ normalization uncertainty is assigned to this process for the $N_{\mathrm{jet}} \geq 2$ categories and to the STXS measurement in the region with $p_{\mathrm{T}}^{H} \geq 200$ GeV targeting the ggF production mode. A similar $-50\%/+100\%$ normalization uncertainty is also assigned to the $V\gamma$ sample, due to a mismodeling of the $\gamma \rightarrow e$ misidentification rate which primarily affects events with $m_{\mathrm{T}} \lesssim 80$ GeV and can be seen in Fig. 11(a). The EW $WW$ process, which contributes most significantly in the highest VBF DNN bin, is assigned an additional normalization uncertainty of 15% due to NLO EW corrections, as calculated using the leading-log approximation [129,130]. For $Wt$, an additional uncertainty estimated by comparing samples with different diagram removal schemes [84] is applied. A normalization uncertainty of 12%, as estimated in the sample with a three-lepton selection described in Sec. 6.4, is also applied to the […] backgrounds. For backgrounds which are normalized to CR yields, uncertainties are estimated for the CR-to-SR extrapolation factors. Only uncertainties that change the ratios of SR yields to CR yields affect the extrapolation. The uncertainties in the extrapolation factors are treated as uncorrelated between different jet multiplicities and between the CRs for the STXS measurement.
The uncertainties in the STXS measurements are estimated for each SR separately, where the 11 STXS bins are treated as different processes, and cover the migration of events between STXS SRs. Merged SRs are used to determine the uncertainties when only a small number of events are available in the simulated samples in a particular SR.
The largest theoretical uncertainties affecting the ggF signal come from the measurement of exclusive jet multiplicities and from the parton shower. For the VBF signal, the comparisons of different event generators for the matrix-element matching and for the parton shower result in the largest uncertainties in the measurement. For background processes, the theoretical uncertainties in the $WW$ and top-quark backgrounds result in the largest contributions to the overall uncertainty.

Fit procedure
Results are obtained from a profile likelihood fit [131] to data in the signal and control regions. Uncertainties enter the fit as nuisance parameters in the likelihood function. Theoretical uncertainties affecting the signal and the experimental uncertainties affecting both signal and background are in general correlated between signal and control regions in all analysis categories. Theoretical uncertainties in the backgrounds and the background normalization factors are uncorrelated between different analysis categories.
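For orientation, a binned profile likelihood of this kind is often written schematically as below. This is a generic textbook form and not the collaboration's exact likelihood: the symbols $\mu_k$ (parameters of interest), $s^{k}_{i}$ and $b^{p}_{i}$ (expected signal and background yields in bin $i$), $\beta_p$ (background normalization factors), and the constraint terms $G(\theta_j)$ are notational assumptions introduced here.

```latex
% Schematic binned likelihood: the product over i runs over all bins of
% all signal and control regions; theta denotes the nuisance parameters.
\begin{align}
  L(\boldsymbol{\mu}, \boldsymbol{\theta})
    &= \prod_{i} \mathrm{Pois}\!\left( n_i \,\middle|\,
         \sum_{k} \mu_k\, s^{k}_{i}(\boldsymbol{\theta})
         + \sum_{p} \beta_p\, b^{p}_{i}(\boldsymbol{\theta}) \right)
       \prod_{j} G(\theta_j), \\
  \Lambda(\boldsymbol{\mu})
    &= \frac{L\bigl(\boldsymbol{\mu},
             \hat{\hat{\boldsymbol{\theta}}}(\boldsymbol{\mu})\bigr)}
            {L\bigl(\hat{\boldsymbol{\mu}}, \hat{\boldsymbol{\theta}}\bigr)} .
\end{align}
```

Here $\hat{\hat{\boldsymbol{\theta}}}(\boldsymbol{\mu})$ denotes the nuisance-parameter values that maximize $L$ for fixed $\boldsymbol{\mu}$, while $\hat{\boldsymbol{\mu}}$ and $\hat{\boldsymbol{\theta}}$ are the unconditional maximum-likelihood estimates.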
The $m_{\mathrm{T}}$ distribution is used as the final fit discriminant in each of four regions defined by $m_{\ell\ell}$ and subleading lepton $p_{\mathrm{T}}$ in both the $N_{\mathrm{jet}} = 0$ and $N_{\mathrm{jet}} = 1$ categories, as described in Secs. 5.1 and 5.2. The same binning of the $m_{\mathrm{T}}$ distribution is used in all regions: [0-90, 90-100, 100-110, 110-120, 120-130, 130-∞] GeV. The SR in the ggF-enriched $N_{\mathrm{jet}} \geq 2$ category is split into two bins of $m_{\ell\ell}$, but there is no split in subleading lepton $p_{\mathrm{T}}$. In both regions, the $m_{\mathrm{T}}$ distribution is divided into six bins with the same boundaries as for the $N_{\mathrm{jet}} = 0$ and $N_{\mathrm{jet}} = 1$ categories. For the VBF-enriched $N_{\mathrm{jet}} \geq 2$ category, the DNN output is used as the discriminating fit variable. The distribution is divided into seven bins with the boundaries defined in Sec. 5.3.
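The six-bin $m_{\mathrm{T}}$ scheme quoted above, with boundaries at 0, 90, 100, 110, 120, and 130 GeV and an open-ended last bin, can be encoded as in this short sketch. The function and constant names are hypothetical, and assigning a value exactly on a boundary to the higher bin is an assumption.

```python
import math
from bisect import bisect_right

# Bin edges in GeV; the last bin is open-ended.
MT_EDGES = [0.0, 90.0, 100.0, 110.0, 120.0, 130.0, math.inf]


def mt_bin_index(m_t):
    """Return the zero-based index (0..5) of the m_T bin for m_t in GeV."""
    if m_t < 0.0:
        raise ValueError("transverse mass must be non-negative")
    # bisect_right places a value on an edge into the higher bin;
    # the min() keeps m_t = inf inside the last bin.
    return min(bisect_right(MT_EDGES, m_t) - 1, len(MT_EDGES) - 2)
```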
For the STXS measurements, two modifications are made: the four STXS regions in the $N_{\mathrm{jet}} = 1$ category are no longer split into bins defined by $m_{\ell\ell}$ and subleading lepton $p_{\mathrm{T}}$, while the STXS measurements targeting the VBF production mode define four bins for the DNN output with boundaries [0-0.5, 0.5-0.7, 0.7-0.84, 0.84-1.00].
The cross sections for the ggF and VBF production modes are determined in a simultaneous fit to all nominal SRs and CRs in the $N_{\mathrm{jet}} = 0$, $N_{\mathrm{jet}} = 1$, and $N_{\mathrm{jet}} \geq 2$ categories. The ggF and VBF cross sections are the two unconstrained POIs in this fit. A second fit is performed using these same regions, but measuring a single POI for the combined ggF and VBF yield. In both fits, the other Higgs boson production modes are fixed to their expected yields. A third fit is made to all the STXS regions, where the 11 cross sections measured are POIs. No nuisance parameters are significantly pulled or constrained in any of the fits. Table 5 shows the postfit SR yields for all four analysis categories defined in Sec. 5. The uncertainty in the total expected yield reflects incomplete knowledge of the observed yield in each analysis category and is not indicative of the precision of the analysis. Furthermore, the relative error in the yields of the background processes for which dedicated CRs are defined is in many cases less than the relative error in the corresponding normalization factor displayed in Table 4 due to effects of anticorrelation with some nuisance parameters modeling theory uncertainties.

Table 5: Postfit MC and data yields in the ggF and VBF SRs. Yields in the bin with the highest VBF DNN output are also presented. The quoted uncertainties correspond to the statistical uncertainties, together with the experimental and theory modeling systematic uncertainties. The sum of all the contributions may differ from the total value due to rounding. Moreover, the uncertainty in the total yield differs from the sum in quadrature of the single-process uncertainties due to the effect of anticorrelations between the sources of their systematic uncertainties, which are larger than their MC statistical uncertainties.

The $m_{\mathrm{T}}$ distributions for the separate $N_{\mathrm{jet}} = 0$, $N_{\mathrm{jet}} = 1$, and ggF-enriched $N_{\mathrm{jet}} \geq 2$ SRs, as well as the combination of ggF SRs, are shown in Fig. 11.
The bottom panels of Fig. 11 display the difference between the data and the total estimated background compared to the $m_{\mathrm{T}}$ distribution of a SM Higgs boson with $m_{H} = 125.09$ GeV. The total signal observed in all categories (see Table 5) is about 4000 events and agrees, in both shape and rate, with the expected SM signal. The observed (expected) signal yield in the ggF-enriched $N_{\mathrm{jet}} \geq 2$ category, with the VBF contribution fixed to the Standard Model prediction, reaches a significance of $2.2\sigma$ ($1.6\sigma$) above the background expectation.

Signal region yields and results
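The VBF DNN output distribution in the final signal region is presented in Fig. 12. The observed (expected) VBF signal reaches a significance of $5.8\sigma$ ($6.2\sigma$) above the background expectation. Figure 13 shows the best-fit values and uncertainties of the $H\rightarrow WW^*$ cross section for the ggF and VBF processes and their combination, normalized to the corresponding SM prediction. The cross sections times branching ratio for the ggF and VBF production modes for a Higgs boson with mass $m_{H} = 125.09$ GeV in the $H\rightarrow WW^*$ decay channel, $\sigma_{\mathrm{ggF}} \cdot \mathcal{B}_{H\rightarrow WW^*}$ and $\sigma_{\mathrm{VBF}} \cdot \mathcal{B}_{H\rightarrow WW^*}$, are simultaneously measured to be
$$\sigma_{\mathrm{ggF}} \cdot \mathcal{B}_{H\rightarrow WW^*} = 12.0 \pm 1.4\ \mathrm{pb} = 12.0 \pm 0.6\,(\mathrm{stat.})\,^{+0.9}_{-0.8}\,(\mathrm{exp.\ syst.})\,^{+0.6}_{-0.5}\,(\mathrm{sig.\ theo.}) \pm 0.8\,(\mathrm{bkg.\ theo.})\ \mathrm{pb},$$
$$\sigma_{\mathrm{VBF}} \cdot \mathcal{B}_{H\rightarrow WW^*} = 0.75\,^{+0.19}_{-0.16}\ \mathrm{pb} = 0.75 \pm 0.11\,(\mathrm{stat.})\,^{+0.07}_{-0.06}\,(\mathrm{exp.\ syst.})\,^{+0.12}_{-0.08}\,(\mathrm{sig.\ theo.})\,^{+0.07}_{-0.06}\,(\mathrm{bkg.\ theo.})\ \mathrm{pb},$$
compared to the SM predicted values of $10.4 \pm 0.6$ pb and $0.81 \pm 0.02$ pb, respectively. The combined cross section times branching ratio, $\sigma_{\mathrm{ggF+VBF}} \cdot \mathcal{B}_{H\rightarrow WW^*}$, obtained from fitting a single POI, is measured to be
$$\sigma_{\mathrm{ggF+VBF}} \cdot \mathcal{B}_{H\rightarrow WW^*} = 12.3 \pm 1.3\ \mathrm{pb} = 12.3 \pm 0.6\,(\mathrm{stat.})\,^{+0.8}_{-0.7}\,(\mathrm{exp.\ syst.}) \pm 0.6\,(\mathrm{sig.\ theo.}) \pm 0.7\,(\mathrm{bkg.\ theo.})\ \mathrm{pb},$$
compared to the SM predicted value of $11.3 \pm 0.5$ pb. Table 6 shows the relative impact of the main uncertainties on the measured values for $\sigma_{\mathrm{ggF+VBF}} \cdot \mathcal{B}_{H\rightarrow WW^*}$, $\sigma_{\mathrm{ggF}} \cdot \mathcal{B}_{H\rightarrow WW^*}$, and $\sigma_{\mathrm{VBF}} \cdot \mathcal{B}_{H\rightarrow WW^*}$. The measurements are dominated by systematic uncertainties. For the ggF measurement, uncertainties from experimental and theoretical sources are comparable. For the VBF measurement, signal theory uncertainties make up the largest contribution and the dominant ones are those related to the modeling of potential jets in addition to the tagging jets. The 68% and 95% confidence level two-dimensional contours of $\sigma_{\mathrm{ggF}} \cdot \mathcal{B}_{H\rightarrow WW^*}$ and $\sigma_{\mathrm{VBF}} \cdot \mathcal{B}_{H\rightarrow WW^*}$ are shown in Fig. 14 and are consistent with the SM predictions. Figure 15 shows a summary of the $H\rightarrow WW^*$ cross sections measured in each of the 11 STXS bins, normalized to the corresponding SM prediction. The correlation matrix for the measured cross sections is shown in Fig. 16. The largest correlations between the measured cross sections are around 30% and are primarily caused by the migration of signal events between STXS bins and reconstructed signal regions.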
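As a rough cross-check of the combined result, the quoted uncertainty components can be added in quadrature and the central value compared with the SM prediction. This naive sum ignores correlations between components and the rounding of the individual terms, so it is only expected to approximate the quoted total of 1.3 pb; the helper name is hypothetical.

```python
import math


def quadrature_sum(*components):
    """Naive combination of independent uncertainty components."""
    return math.sqrt(sum(c * c for c in components))


# Components of the combined ggF+VBF result in pb:
# stat., exp. syst., sig. theo., bkg. theo.
total_up = quadrature_sum(0.6, 0.8, 0.6, 0.7)
total_down = quadrature_sum(0.6, 0.7, 0.6, 0.7)

# Ratio of the measured central value to the SM prediction.
ratio_to_sm = 12.3 / 11.3
```

The naive sums come out slightly above the quoted total, and the measured value lies about 9% above the SM prediction, well within the quoted uncertainty.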
The measured cross sections for the five STXS bins targeting EW production are on average lower than the VBF cross section measured in the two-POI fit. Events with high $m_{jj}$ or high $p_{\mathrm{T}}^{H}$ carry a larger statistical weight than events at low $m_{jj}$ in the two-POI fit, and in these STXS bins, the measured value is close to 1. Table 7 provides the central value and uncertainties of each of the measured STXS cross sections, together with the SM predictions. The results are compatible with the Standard Model predictions, with a $p$-value of 53%.

Table 6: Breakdown of the main contributions to the total uncertainty in $\sigma_{\mathrm{ggF+VBF}} \cdot \mathcal{B}_{H\rightarrow WW^*}$, $\sigma_{\mathrm{ggF}} \cdot \mathcal{B}_{H\rightarrow WW^*}$, and $\sigma_{\mathrm{VBF}} \cdot \mathcal{B}_{H\rightarrow WW^*}$, relative to the measured value. The individual sources of systematic uncertainties are grouped together. The sum in quadrature of the individual components differs from the total uncertainty due to correlations between the components.

Figure 14: 68% and 95% confidence level (C.L.) two-dimensional contours of $\sigma_{\mathrm{ggF}} \cdot \mathcal{B}_{H\rightarrow WW^*}$ vs $\sigma_{\mathrm{VBF}} \cdot \mathcal{B}_{H\rightarrow WW^*}$, compared to the SM prediction shown by the red marker. The 68% C.L. contour for the SM predictions of the ggF and VBF cross sections times branching ratio [11] is indicated by the red ellipse.

Conclusions
The $H\rightarrow WW^* \rightarrow e\nu\mu\nu$ decay channel was used to measure Higgs boson production by gluon-gluon fusion and vector-boson fusion. The measurements are based on a dataset of proton-proton collisions with an integrated luminosity of 139 fb$^{-1}$ recorded with the ATLAS detector at the LHC in 2015-2018 at a center-of-mass energy of 13 TeV. The ggF and VBF cross sections times the $H\rightarrow WW^*$ branching ratio are simultaneously measured to be $12.0 \pm 0.6$ (stat.) $^{+0.9}_{-0.8}$ (exp. syst.) $^{+0.6}_{-0.5}$ (sig. theo.) $\pm 0.8$ (bkg. theo.) and $0.75 \pm 0.11$ (stat.) $^{+0.07}_{-0.06}$ (exp. syst.) $^{+0.12}_{-0.08}$ (sig. theo.) $^{+0.07}_{-0.06}$ (bkg. theo.) pb, in agreement with the Standard Model predictions of $10.4 \pm 0.6$ and $0.81 \pm 0.02$ pb, respectively. These measurements are significantly more precise than the previous Higgs boson cross sections times $H\rightarrow WW^*$ branching ratio results from ATLAS because of several improvements to the analysis in addition to the larger dataset, most notably the inclusion of a dedicated signal region for the ggF production mode in conjunction with two or more reconstructed jets. Higgs boson production in the $H\rightarrow WW^*$ decay channel is further characterized through STXS measurements in a total of 11 categories. The STXS results are compatible with the Standard Model predictions, with a $p$-value of 53%.

The ATLAS Collaboration