Search for invisible decays of the Higgs boson produced via vector boson fusion in proton-proton collisions at $\sqrt{s} =$ 13 TeV

A search for invisible decays of the Higgs boson produced via vector boson fusion (VBF) has been performed with 101 fb$^{-1}$ of proton-proton collisions delivered by the LHC at $\sqrt{s} =$ 13 TeV and collected by the CMS detector in 2017 and 2018. The sensitivity to the VBF production mechanism is enhanced by constructing two analysis categories, one based on missing transverse momentum, and a second based on the properties of jets. In addition to control regions with Z and W boson candidate events, a highly populated control region, based on the production of a photon in association with jets, is used to constrain the dominant irreducible background from the invisible decay of a Z boson produced in association with jets. The results of this search are combined with all previous measurements in the VBF topology, based on data collected in 2012 (at $\sqrt{s} =$ 8 TeV), 2015, and 2016, corresponding to integrated luminosities of 19.7, 2.3, and 36.3 fb$^{-1}$, respectively. The observed (expected) upper limit on the invisible branching fraction of the Higgs boson is found to be 0.18 (0.10) at the 95% confidence level, assuming the standard model production cross section. The results are also interpreted in the context of Higgs-portal models.

Direct searches for H → inv decays have already been performed by the ATLAS [20, 21] and CMS [22][23][24] Collaborations using data collected at √ s = 7, 8, and 13 TeV, and combining the three main Higgs boson production modes, namely gluon-gluon fusion (ggH), production of a Higgs boson in association with vector bosons (VH, with V = W ± or Z), and vector boson fusion (VBF). Assuming SM production of the Higgs boson, the best observed (expected) 95% confidence level (CL) upper limits on B(H → inv) are set at 0.19 (0.15) by CMS, using data collected at √ s = 7, 8, and 13 TeV, and at 0.26 (0.17) by ATLAS using data collected at 13 TeV. In both cases, the data at 13 TeV were collected in 2016. Combining the latest CMS constraints on both visible and invisible decays within the κ framework [25], the upper bound on B(H → inv) is 0.22 at the 95% CL, using only the data set collected at 13 TeV in 2016.
Thanks to its large production cross section [26] and distinctive event topology, the VBF production mechanism drives the overall sensitivity in the direct search for invisible decays of the Higgs boson. This paper focuses exclusively on the search for H → inv in the VBF production mode using the LHC proton-proton (pp) collision data set collected during 2017-2018, corresponding to an integrated luminosity of up to 101 fb −1 , and on the combination of this search with analyses performed on previous data sets [22,27].
Employing a strategy similar to the one used in the previously published analysis [22], the invariant mass of the jet pair produced by VBF, m jj , is used as a discriminating variable to separate the signal and the dominant backgrounds arising from vector boson production in association with two jets (V+jets). Representative Feynman diagrams for the signal and main background processes are shown in Fig. 1.  Figure 1: Leading-order Feynman diagrams for the production of the Higgs boson in association with two jets from VBF (left), and representative leading-order Feynman diagrams for the production of a Z boson in association with two jets either through VBF production (middle) or strong production (right). Diagrams for the production of a W boson in association with two jets are similar.
Control regions enriched in V+jets processes are used to constrain the associated background contributions in the signal region. Additional sensitivity is obtained by using γ+jets events to further constrain the Z(νν ) background. In the previous CMS publication, the trigger strategy was based exclusively on the invisible Higgs boson decay products, requiring a high threshold on the missing transverse momentum. With the availability of a trigger based on the jet properties from VBF production, in this analysis, additional sensitivity is achieved by including events with lower missing transverse momentum.
This article is organized as follows. Section 2 introduces the CMS detector. Section 3 summarizes the data and simulated samples. The event reconstruction is detailed in Section 4, followed by the analysis strategy in Section 5. Section 6 describes the systematic uncertainties. Finally, the results are presented in Section 7, with tabulated versions provided in HEPData [28], followed by a summary in Section 8.

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter, each composed of a barrel and two endcap sections. Hadron forward (HF) steel and quartz fiber calorimeters extend the pseudorapidity η coverage provided by the barrel and endcap detectors. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [29].
Events of interest are selected using a two-tiered trigger system [30]. The first level (L1) is composed of custom hardware processors, which use information from the calorimeters and muon detectors to select events at a rate of about 100 kHz [31]. The second level, known as the high-level trigger (HLT), is a software-based system that runs a version of the full event reconstruction optimized for fast processing, reducing the event rate to about 1 kHz.
At the end of 2016, the first part of the CMS detector upgrade program (Phase 1) was undertaken, with the replacement of the inner tracking pixel detector and the L1 trigger system. During the 2016 and 2017 data-taking periods, partial mistiming of signals in the forward region of the ECAL endcaps (2.5 < |η| < 3.0) led to a large reduction in the L1 trigger efficiency [31]. A separate correction was determined using an unbiased data sample and applied to simulated events to reproduce the loss of efficiency. This problem was resolved before the 2018 data-taking period.

Data and simulated samples
Data were recorded by several triggers, as detailed in Section 5.1, during 2017 and 2018, for maximum integrated luminosities corresponding to 41.5 and 59.8 fb −1 , respectively.
The signal and background processes are simulated using similar Monte Carlo (MC) generator configurations as described in detail in Ref. [22], and summarized below. Separate independent samples were produced for each data-taking year. The same generator settings were used for the 2017 and 2018 samples.
The Higgs boson signal events, produced through ggH, VBF, VH, and in association with top quarks (ttH), are generated with POWHEG v2.0 [32][33][34][35][36] at next-to-leading order (NLO) approximation in perturbative quantum chromodynamics (pQCD). The signal yields are normalized to the inclusive Higgs boson production cross sections, calculated in Ref. [26] at approximate Jets are reconstructed by clustering all PF candidates associated with the primary interaction vertex using the anti-k T clustering algorithm [55], with a distance parameter of 0.4, as implemented in the FASTJET package [56]. The candidate vertex with the largest value of summed physics-object p 2 T is taken to be the primary pp interaction vertex. The physics objects used for this determination are the jets, clustered using the jet finding algorithm [55,56] with the tracks assigned to candidate vertices as inputs, and the associated missing transverse momentum, taken as the negative vector sum of the p T of those jets. Pileup mitigation techniques [57] are used to correct the objects for energy deposits belonging to pileup vertices, as well as to remove objects not associated with the primary interaction vertex. Loose identification criteria are applied on the jet composition to remove contributions from calorimeter noise. To correct the average measured energy of the jets to that of particle-level jets, jet energy corrections are derived using simulated events, as a function of the reconstructed jet p T and η. In situ measurements of the momentum balance in dijet, γ+jet, Z+jet, and multijet events are used to determine any residual differences between the jet energy scale in data and in simulation, and appropriate corrections are made [58]. In simulated events, the jet energy is also smeared to reproduce the jet energy resolution measured in the data [58]. For jets with p T < 50 GeV, a multivariate discriminant against pileup jets is applied, using a loose working point [59]. Jets are selected in the range |η| < 4.7 and with p T > 30 GeV. Jets with an identified electron, muon, or photon within ∆R < 0.4 are rejected, where ∆R = √ (∆η) 2 + (∆φ) 2 .
The missing transverse momentum vector ( p miss T ) is computed as the negative vector p T sum of all the PF candidates in an event, and its magnitude is denoted as p miss T . Any correction applied to individual objects is propagated correspondingly to the p miss T [60]. Specific event fil-ters have been designed to reduce the contamination arising from large misreconstructed p miss T from noncollision backgrounds [60]. For the analysis of the 2018 data, the missing HE section affects the PF reconstruction. The inability to distinguish electrons and photons from jets leads to spurious p miss T in the corresponding φ region as a result of the suboptimal reconstruction of charged and neutral hadrons. Consequently, events with −1.8 < φ(p miss T ) < −0.6 are rejected in data for the part of this data set, around 65% of the total, that is affected. Simulated events in this region are reweighted accordingly to model the efficiency loss.
It has been observed during data taking that the HF detector can, on rare occasions, give rise to unphysical high-energy signals. This occurs in particular when a muon or a charged particle coming from a late showering hadron directly hits one of the photomultiplier tubes that are used to read out the quartz fibers. The photomultiplier tubes are located behind the HF detector, in readout boxes gathering quartz fibers of a given φ region. The resulting energy is therefore typically spread across several channels of constant φ. Spurious jets can also arise when a high energy muon from machine-induced backgrounds [61] undergoes bremsstrahlung in the HF detector. The associated energy deposit is then narrow in both η and φ. Although these two effects are uncommon, they lead to large p miss T , and a dedicated mitigation technique is therefore applied to reject events with such calorimeter noise. For jets reconstructed in the HF detector, with |η| > 2.99, shower shape variables are constructed based on the associated PF candidates found within ∆R < 0.4 of the jet. A central η strip size is defined by counting the number of associated PF candidates, N cent PFCand , with transverse momentum p T > 10 GeV within ∆φ < 0.05 of the jet direction. This corresponds to the number of candidates within the same φ HF tower. The shower widths in both directions are defined as σ ηη and σ φφ , using the pileup-corrected energy-weighted sums of the separations in η and φ between the associated PF candidates with p T > 3 GeV and the jet axis directions.
As stated above, jets stemming from calorimeter noise, called HF noise jets in the following, tend to be either more spread in η than in φ, or narrow in both directions. They lead to spurious p miss T in the opposite direction in φ. Events are hence rejected if they contain any jet with |η| > 2.99, p T > 80 GeV, and ∆φ( p miss T , p jet T ) > 2.5 that does not satisfy the criteria summarized in Table 1. The requirements of this selection are chosen to have a mistagging rate smaller than 10% for signal-like jets, while being more than 60-90% efficient at rejecting noise-like jets, depending on their p T and η. To correct for mismodelling of these selections in simulation, the selection efficiency on signal-like jets is measured in both data and simulated Z( )+jet and γ+jet events, and scale factors are applied to correct the simulation. The scale factors are measured as functions of jet p T and η, and are consistent with unity to within 10%.

<3
Hadronically-decaying τ leptons (τ h ) are identified from reconstructed jets through the multivariate DEEPTAU algorithm [62], using a working point that has an average efficiency above 96% for a jet misidentification rate of less than 10%. The τ h candidates are reconstructed with p T > 20 GeV and |η| < 2.3. Jets with an identified loose electron or muon within ∆R < 0.4 are rejected before applying the DEEPTAU algorithm.
The specific features of heavy-flavored jets, in particular the presence of displaced vertices, are used in a multivariate jet tagging method. The "medium" working point of the DEEPCSV algorithm from Ref.
[63] is used to tag b quark jets with p T > 20 GeV and |η| < 2.4 with 68% efficiency, and 1.1 (12)% probability of misidentifying a light-flavor or gluon (c quark) jet as a bottom quark jet.

Analysis strategy
The distinctive feature of VBF production is a pair of jets originating from light-flavor quarks, with a large separation in η (∆η jj ) and therefore a large m jj . The signal region (SR) in this analysis uses selection requirements on the jet pair together with the presence of a significant amount of p miss T . The shape of the m jj distribution is used to disentangle jet pairs produced in VBF production from other SM processes. When fitting the shape of this distribution, the strong production of the V+jets processes together with the ggH signal dominate at low m jj , whereas the VBFproduced V+jets processes populate the high-m jj tail, together with the VBF H signal. The shapes of m jj , |∆η jj |, and the dijet separation in azimuthal angle (|∆φ jj |) predicted by the simulation are compared between strong and VBF production of both V+jets and signal processes in Fig. 2. Whereas similarities are seen between the VBF production of vector bosons and Higgs bosons, there are some differences. First, the respective bosons have different coupling structures, identified as the main reason for the different behaviour in the ∆φ jj distribution [64]. Second, the V+jets (VBF) samples include additional diagrams compared with the VBF H production, in which the vector boson is produced through a coupling to quarks, and jets are produced from additional EW vertices. For the VBF H production, such diagrams are strongly suppressed through the Yukawa mechanism, also leading to differences in the kinematic behaviour.
The dominant V+jets backgrounds are measured using control regions (W, Z, and γ CRs), in which one or more charged leptons (electrons and muons), or a photon, are required, but the selection on the jets and p miss T is kept identical to that in the SR. The MC simulation is used to define transfer factors. These make it possible to predict the V+jets yields in the SR from both the yields measured and predicted in the CRs. The full procedure is described in Section 7.
The Z CR suffers from a lack of statistical precision, particularly in the high-m jj region, due to the low branching fraction of the Z boson to a pair of leptons. Because of their similarities, the ratios of the yields of W ± or γ to Z production in the SR are constrained, within theoretical uncertainties, to those predicted by the simulation.
In the following subsections, the SR selection is first presented. Then, the data-driven methods used to estimate the V+jets and QCD multijet backgrounds are described. The remaining small contributions expected from diboson and top quark processes are estimated using simulation.

Signal region selection
Two complementary trigger strategies were used to select events online. The first category only uses the p miss T information at L1 and in the HLT, and is referred to as the "missing momentum triggered region" (MTR) category in the following. During 2017 and 2018, the HF region was included in the definition of p miss T,L1 , leading to an increase in trigger acceptance when the VBF jets are reconstructed with 3 ≤ |η| ≤ 5, compared with 2016. The p miss T,L1 thresholds varied between 65 and 90 GeV, with a p miss T,HLT threshold of 120 GeV. After correcting for the L1 mistiming   Figure 2: Comparison of shapes of quantities related to the dijet pair. Shown are m jj (upper), ∆η jj (lower left), and ∆φ jj (lower right), as predicted by simulation, separating strong and VBF production for V+jets and signal processes.
inefficiency described in Section 2, this trigger is more than 90% efficient for p miss T > 250 GeV. Loose muon candidates are ignored at L1 and in the HLT when calculating the p miss T,L1 and p miss T,HLT variables, ensuring that the same trigger can be used in the muon CRs. The data collected by these triggers in 2017 and 2018 correspond to integrated luminosities of 41.5 and 59.8 fb −1 , respectively.
The second category uses a combination of p miss T and jet information, and is referred to as the "VBF jets triggered region" (VTR) category in the following. After the upgrade of the L1 trigger system [65] it was possible to develop an algorithm targeting jets originating from VBF production. This trigger was added after the start of data taking in 2017, and collected a data set corresponding to an integrated luminosity of 36.7 fb −1 during that year. The L1 VBF algorithm requires the presence of at least two jets passing p L1 T thresholds of 115 (110) GeV for the leading jet and 40 (35) GeV for the subleading jet in 2017 (2018). Pairs are formed from the selected jets, and the pair with the highest invariant mass m L1 jj must satisfy m L1 jj > 620 GeV. The pair is allowed to be formed by two jets with p L1 T > 40 (35) GeV if there is a third jet passing the leading-p L1 T threshold requirement, in 2017 (2018). A corresponding HLT algorithm was designed specifically for the H → inv analysis, adding a requirement on p miss T,HLT with a minimum threshold of 110 GeV. At the analysis level, after correcting for the L1 mistiming inefficiency, this trigger is more than 85% efficient for p miss T > 160 GeV and the leading-m jj jet pair passing the following requirements: m jj > 900 GeV, with p T thresholds on the two jets forming the leading-m jj pair of p 1,2 T > 140, 70 GeV. The trigger efficiencies for the two trigger algorithms are available in Appendix A. Again, loose muon candidates are ignored when calculating the p miss T variables at all stages.
We ensure the MTR and VTR categories are orthogonal by requiring 160 < p miss T ≤ 250 GeV in the VTR category, and p miss T > 250 GeV in the MTR category.
To enhance the selection of jets with VBF properties at the analysis level, and to reduce the contamination arising from jet pairs in QCD multijet events, the two leading-p T jets (or the jets forming the highest-m jj pair in the VTR category) are required to be in opposite hemispheres of the detector (η j1 η j2 < 0). The two selected jets are also required to have |∆η jj | > 1 and |∆φ jj | < 1.5 (|∆φ jj | < 1.8 in the VTR category). In the MTR category, the m jj threshold is set to 200 GeV to use the full shape of the spectrum to better separate the signal from the background formed by strong V+jets production.
In QCD multijet events, large p miss T may arise from mismeasurements of the jet momenta, in which case some jets in the event could be aligned in φ with the p miss T . To reduce the contamination from such events, the minimum value of the azimuthal angle between the p miss T vector and any of the first four leading jets (p T > 30 GeV), min(∆φ( p miss T , p jet T )), is required to be above 0.5 (1.8) in the MTR (VTR) category. Events with possible mismeasurements due to calorimeter noise, which would lead to jets with anomalously large (small) energy fractions coming from neutral (charged) particles, are rejected. This is done by rejecting the event if either of the selected VBF jets has |η| < 2.5 and a neutral (charged) hadron energy fraction NHEF > 0.8 (CHEF < 0.1). This selection rejects at most 2% of events, independent of the process and uniformly in m jj . A criterion is also applied on the difference between the p miss T measured using the PF algorithm and that using only the calorimeters. This difference is required to be less than 50% of the p miss T . This selection rejects at most 2 (1)% of all events, mostly at low m jj , for the 2017 (2018) data-taking conditions.
A second source of QCD multijet background is due to the remaining impact of jets originating from HF noise, where the p miss T is balanced in φ with such a jet, and the jet still passes the selection criteria from Table 1. Combined with genuine jets from QCD multijet production, such events can pass the SR selection. These large energy deposits are generally close to the outer HF boundary (|η| < 3.25), where the readout boxes are located, though can extend up to |η| = 5.
Finally, a veto on all other types of loosely identified objects (electrons, muons, photons, τ h candidates, and b-tagged jets), as described in Section 4, is applied.
The criteria for the SR selections are summarized in Table 2, for the MTR and VTR categories. After these selections, the dominant backgrounds come from the V+jets processes. Due to the large branching fraction of the Z boson decay to neutrinos, the Z(νν )+jets process accounts for about two-thirds of the total background. After the lepton vetoes and the p miss T requirement, the contributions from other decay modes are negligible. The next largest background arises from W( ν)+jets production in which the charged lepton from the W ± boson decay is outside of the acceptance of the tracking detector, leading to additional p miss T . In the case of muons, which deposit very little energy in the calorimeters, the p miss T is significant. The hadronic decay modes of the W ± boson are rejected by the large p miss T requirement. The VBF production of V+jets contributes about 2% of the total V+jets background for m jj around 200 GeV. This increases to about 11% at m jj ≈ 1.5 TeV, and to more than 48% for m jj > 3.5 TeV.

Lepton-based control regions
As the boson recoil properties are driven by the production mode and are independent of the boson decay mode, the dominant Z(νν )+jets background is modelled using CRs with leptonic decays of the Z boson (Z(ee) and Z(µµ)). To reduce the contribution from Drell-Yan γ * decays to leptons, the invariant mass of the selected leptons is required to lie in the range 60-120 GeV. The lepton selection is chosen to maximize the event yield while still ensuring leptonic Z boson decays are selected with high purity.
To stay as close as possible to the SR selection, the same trigger as in the SR is used for the Z(µµ) CR. As a result, systematic uncertainties in the trigger efficiencies largely cancel when estimating the corresponding transfer factors. Instead of the muon veto, a pair of oppositely charged muons, consisting of a tight muon with p T > 20 GeV and a loose muon with p T > 10 GeV, is required. All other criteria from Table 2 are applied. The p miss T variable is recalculated ignoring the muons, to mimic the boson recoil.
For the Z(ee) CR the triggers used in the SR are inefficient, as electrons deposit their energy in the calorimeter. Single-electron triggers are therefore used. The lowest-threshold trigger requires a minimum p T of 35 GeV and imposes isolation requirements. It is supplemented by an electron trigger with a p T threshold of 115 GeV, but no isolation requirements, as well as by a photon trigger requiring p T > 200 GeV. For the last, no isolation criteria are applied, and it does not rely on track reconstruction. Taken together, this set of triggers optimizes the efficiency over the full p T range. Instead of the electron veto, a pair of oppositely charged electrons, consisting of a tight electron with p T > 40 GeV and a loose electron with p T > 10 GeV, is required. The p miss T variable is recalculated ignoring the electrons, to mimic the boson recoil.
For the W( ν)+jets background, single-lepton CRs are used. It should be noted that in this case, the W(eν) and W(µν) CRs favour W ± boson decays with a high-p T central lepton (|η| < 2.5 (electrons) or 2.4 (muons)), whereas the background expected in the SR consists of W ± boson decays in which the leptons (including τ leptons) are outside of the acceptance. This has an impact on the p miss T distribution. As explained in greater detail in Section 7, the impact of the lepton acceptance is accounted for in the definition of the transfer factors between the CRs and the SR, using the simulation. For the W(eν) and W(µν) CRs, the lepton veto is replaced by the selection of a tight electron (muon) with p T > 40 (20) GeV, and a veto on any other identified loose electron or loose muon. To reduce the contribution from misidentified electrons stemming from QCD multijet production, the p miss T associated with the W(eν) decay (i.e. not ignoring the electron contribution) is required to be above 80 GeV. In the 2018 analysis, due to the missing HE sector in the data, jets in the corresponding η-φ area are often identified as electrons or photons, and hence events with an electron in this region are rejected.

The γ+jets control region
To further constrain the Z(νν ) background in the SR, a photon CR is used. At large p T , the kinematic properties of photon production become similar to those of the Z(νν ) process [41], and can therefore be used to estimate the latter. Events are selected using a trigger requiring an online photon p T of at least 200 GeV. In the offline analysis, photons are required to be located in the central part of the detector (|η| < 1.4442), have p T > 230 GeV to ensure full trigger efficiency, and pass additional identification criteria based on the properties of the associated energy deposit (supercluster) in the ECAL, as well as the isolation of the photon relative to nearby objects. Exactly one such photon is required, and all other criteria from Table 2 are applied. The p miss T variable is recalculated ignoring the photon, to mimic the boson recoil.
In addition to the desired γ+jets events with a genuine well-identified and isolated ("prompt") photon, small contributions from QCD multijet events with hadronic jets misidentified as photons are present in this region ("nonprompt"). To estimate this background contribution, a purity measurement is performed. The measurement is based on the distribution of the lateral width, σ iηiη [53], of the ECAL supercluster associated with the photon. For prompt photons, the distribution of σ iηiη peaks sharply around values of 0.01 and below, while nonprompt photons show a much smaller peak and a shoulder towards values larger than 0.01. To extract the contamination, a template fit to the σ iηiη distribution is performed in data collected with a looser version of the CR selection. In this looser region, instead of the usual selection that is applied to the VBF jets, we require the presence of a single jet with p T > 100 GeV to enhance the available number of events. Additionally, the photon identification criteria are modified by removing a requirement on σ iηiη that is otherwise included. A template for prompt photons is obtained from simulated γ+jets events, while a nonprompt template is derived from a data sample that is enriched in nonprompt events by inverting the isolation requirements that are part of the photon identification criteria. The nonprompt fraction is defined as the fraction of nonprompt photons present below the σ iηiη threshold set by the identification criteria. The template fit is performed separately in bins of the photon p T and yields a nonprompt fraction between around 4% at p T = 200 GeV and 2-3% at p T = 800 GeV, depending on the data-taking period. The final QCD multijet contribution is then determined by weighting the events observed in the data by the nonprompt fraction. A 25% uncertainty is assigned to the normalization of the QCD multijet background to account for any mismodelling of σ iηiη in the simulation. The uncertainty is estimated by repeating the measurement while varying the binning of the σ iηiη distribution used for fitting, capturing the effect of the mismodelling. The statistical uncertainty in the determination of the differential m jj shape is negligible.

Multijet background
The QCD multijet background is estimated using events in data in which the p miss T arises from mismeasured jets. Depending on the source of the mismeasurement, the jet that was mismeasured is either balanced in the case of additional HF noise, or aligned with the p miss T in φ.
For the multijet background stemming from HF noise, an m jj template is extracted by inverting the requirements on the HF jet shape variables, hence requiring at least one jet in the event to fail the selection criteria given in Table 1. The probability for an HF noise jet candidate to pass or fail the criteria is parameterized as a function of the jet p T and η, using events selected to have large p miss T and to contain only one HF jet balanced in φ with the p miss T . An event-by-event weight is applied to estimate the contribution in the SR from the "failing" events. The estimated contamination from other SM processes is then removed bin-by-bin for the distribution under study. A closure test is performed by selecting events with the leading-p T jet within 3 < |η| < 3.25. With this selection, the signal contamination is <2% assuming B(H → inv) = 0.19, as previously excluded in Ref. [22]. For events with spurious p miss T from noise, one expects a full decorrelation between the p miss T measured from the tracker acceptance only, p miss T,trk , and from the full event, whereas for events with true p miss T a correlation exists. Noise events are therefore expected to dominate in the region with large ∆φ(p miss T,trk ,p miss T ). Data are compared with the estimated template for the ∆φ(p miss T,trk ,p miss T ) distribution in Fig. 3. From the agreement observed in the closure test in both years, a 20% systematic uncertainty is assigned to the template shape in the SR.  For the second category of multijet events, the requirement on min(∆φ( p miss T , p jet T )) is inverted to define a control region enriched in QCD multijet events (QCD CR). The m jj shape for the contribution of this background in the SR is derived from the yields in data in each m jj bin in the QCD CR, after subtracting the contributions from V+jets, diboson, and top quark processes estimated from simulation, as well as HF noise contributions estimated in the data. The template is normalized as follows. The distribution of min(∆φ( p miss T , p jet T )) in data is fit with the sum of templates derived from the simulated V+jets, diboson, and top quark events, an HF noise template derived in data; and a functional form f QCD representing the QCD multijet contribution. The functional form is for the MTR region, and for the VTR region. The parameters p i are allowed to float, and x = min(∆φ( p miss T , p jet T )). The choices of these functions are validated by fitting this model to simulated QCD multijet events, and they are found to describe the distributions in the MTR and VTR categories well.
The normalizations of the W( ν)+jets, Z(νν )+jets, and HF noise contributions are allowed to vary independently in the fit. They are constrained within 20% of the prediction from simulation to account for systematic uncertainties related to jet energy calibrations, missing higher orders in the V+jets cross section calculations, and the closure of the HF noise contribution between data and simulation. The fitted values of the normalizations are used when subtracting the W( ν)+jets, Z(νν )+jets, and HF noise contributions from the data to obtain the m jj template. Their fitted uncertainties are included in the final systematic uncertainty in the QCD multijet estimate.
In both the MTR and VTR categories, the fit range is 0 < min(∆φ( p miss T , p jet T )) < 1.8, chosen to minimize the overlap of events in data with the SR. The fits are performed separately for the 2017 and 2018 data sets. Figure 4 shows the min(∆φ( p miss T , p jet T )) distribution in data used in the fit, and the contributions from V+jets, diboson plus top quark processes, HF noise, and QCD multijet events resulting from the fit. The sums of the 2017 and 2018 data sets are shown for the MTR and VTR categories.
The function f QCD (x) provides an estimate of the QCD multijet template normalization, N, in the SR via where X = 0.5 for the MTR categories and X = 1.8 for the VTR categories. An uncertainty in N is derived by generating, and then refitting, pseudodata to extract the standard deviation of log N. This uncertainty includes the statistical uncertainty in the fit for the QCD multijet normalization and the uncertainty in the templates from simulation used to subtract the backgrounds due to the limited number of simulated events.

Sources of uncertainty
Several sources of systematic uncertainty affect the predictions of the signal and background components. They are separated into two categories, experimental and theoretical sources. Their effect is propagated either directly to the yields expected in the SR (for signal and backgrounds estimated directly from simulation), or to the transfer factors (for the V+jets processes estimated from the CRs).

Experimental uncertainties
The reconstruction and identification efficiencies of electrons, photons, muons, τ h candidates, and b-tagged jets have been measured in both data and simulation [53,54,63,67]. The simulated events are corrected by scale factors, which are usually dependent on the p T and η of the object, and have associated systematic uncertainties. The scale factors, and their uncertainties, are also propagated when vetoing events with identified objects. For the electrons and muons, when different working points are chosen between leptons selected in the CRs and vetoed in the SR, the corresponding uncertainties are kept uncorrelated. All these uncertainties are correlated between the MTR and VTR categories. Except for the muons, where the dominant source of systematic uncertainty comes from the experimental method employed, these uncertainties are considered uncorrelated between the 2017 and 2018 data sets.
The effects of the jet energy scale (JES) and jet energy resolution (JER) uncertainties are studied explicitly by varying them by one standard deviation [58], propagating the changes to the p miss T accordingly, and checking the impact on the transfer factors for the V+jets processes as functions of m jj . For the signal and the minor backgrounds, the impact is studied on the simulated yields in the SR as a function of m jj , with a fit procedure to remove the statistical contribution in the less-populated high-m jj bins. The JER uncertainties impact the signal yields by 4-9 (2-15)%, increasing with m jj , for VBF (ggH) production. The impacts of the JES uncertainties are 5-25 (7-35)%, increasing with m jj , for the VBF (ggH) process. Eleven independent JES sources are considered, with partial correlations between the 2017 and 2018 data sets. The dominant source is the η dependence of the corrections. The corrections become particularly large for forward jets, which explains the large increase at high m jj . The JER uncertainties are uncorrelated between the two data-taking years. Both the JER and JES uncertainties are correlated between the MTR and VTR categories.
Simulated events are weighted to match the distribution of reconstructed vertices to the dis-tribution observed in the data. An uncertainty associated with this procedure is obtained by varying the total inelastic pp cross section by ±5% [68], and repeating the background estimation procedure. This uncertainty is correlated across categories and data sets.
The trigger efficiencies are measured in both data and simulation, and a scale factor is extracted. For the signal triggers, the scale factor is parameterized as a function of the p miss T . The associated systematic uncertainty partially cancels in the transfer factors between the muon CRs and the SR. The associated uncertainty is uncorrelated across categories as different triggers are used, but partially correlated between years. For the electron CRs, the scale factor is parameterized as a function of the lepton p T and η, and the impact of the uncertainties is propagated to the corresponding expectations in the SR. The uncertainty in the electron trigger scale factor is uncorrelated between years but correlated across categories.
The uncertainty in the integrated luminosity of the 2017 (2018) data set is 2.3% [69] (2.5% [70]). When combined together with the 2016 data set [71], the uncertainty is reduced to 1.6%. The improvement in the precision reflects the (uncorrelated) time evolution of some systematic effects. Eight independent sources are identified to take into account the correlations across data sets. These uncertainties are considered correlated between categories.

Theoretical uncertainties
The uncertainties in the ggH and VBF predictions due to PDFs and renormalization and factorization scale variations are taken from Ref. [26]. For the ggH process, an additional uncertainty of 40% is assigned to take into account the limited knowledge of the ggH production cross section in association with two or more jets, as well as the uncertainty in the prediction of the ggH differential cross section for large Higgs boson p T , p H T > 250 GeV, following the recipe described in Ref. [22]. The uncertainties in the signal acceptance due to the choice of the PDF set, and the renormalization and factorization scales, are evaluated independently for the different signal processes [22], and are treated as independent nuisance parameters in the fit.
Some of the theoretical uncertainties in the V+jets and γ+jets processes are expected to mostly cancel in the ratio of the W ± (γ) to Z processes. The uncertainties are estimated using the strong production, and applied to both the VBF and strong production processes. As a conservative choice, for the pQCD NLO corrections, the renormalization and factorization scales are varied independently by a factor of two. They are also varied independently for the W( ν)+jets and Z(νν )+jets processes. The maximum variation in the ratio of the W( ν)+jets to Z(νν )+jets yields, per m jj bin, is taken as the uncertainty. The maximum variation is generally given by that of the W process. Uncertainties in the PDFs are directly propagated to the ratio of W( ν)+jets to Z(νν )+jets in a correlated way, and are also applied per m jj bin. The full EW correction is taken as an additional uncertainty in the ratio of W( ν)+jets (γ+jets) to Z(νν )+jets for the strong production of those processes, and is assumed to be uncorrelated between m jj bins. The theoretical uncertainties are assumed uncorrelated between the VBF-and strong-produced V+jets processes, as well as between the MTR and VTR categories.
All theoretical uncertainty sources are fully correlated between years.

Results
A binned maximum likelihood fit is performed simultaneously across the SR and all CRs in both categories and for both data sets. In the fit, one parameter per bin i of the m jj distribution, for each category and year, is left freely floating. This parameter represents the expected rate of the background from the strong production of Z(νν )+jets events, and it is labelled κ i νν . The m jj distribution is binned in the same way as in Ref. [22]. This binning has been found to be optimal for the 2017 and 2018 data sets as well. Similar to the method described in Ref. [22], the likelihood function is defined as: where P(x|y) = y x e −y /x!, and d CR i and d i are the observed number of events in each bin i of the m jj distribution in each of the CRs and in the SR, respectively. The index i runs over the m jj bins in the two years and all categories. The symbol θ refers to constrained nuisance parameters used for the modelling of the systematic uncertainties. The signal term S i represents the expected signal prediction from the sum of the main Higgs boson production mechanisms (ggH, VBF, VH, ttH) assuming the cross sections predicted in the SM. The parameter µ = (σ H /σ SM H )B(H → inv) denotes the signal strength parameter, and is also left freely floating in the fit.
In a given bin, the V+jets background yields expected in the SR are obtained from transfer factors relating the yields in the different CRs to the yields in the SR, separately for the VBF and strong production processes. These transfer factors are denoted R CR,proc i (θ), where "proc" can be strong or VBF, and are obtained from the simulation. For the single-lepton (dilepton) CRs, the factors R CR,proc i (θ) refer to the ratio of W+jets (Z+jets) yields from the corresponding CR to the SR. In the photon CR, which is only available in the MTR category, R CR,proc i (θ) = 1.
In addition, transfer factors are defined between the W (γ) and the Z processes, separately for the VBF and strong production processes. These transfer factors are denoted as f where "proc" can be strong or VBF. Finally, a transfer factor, denoted as Z VBF strong i , relates the VBF production to the strong production of Z(νν )+jets. The factors C CR,strong i (θ) and C CR,VBF i (θ) are dependent on the nature of the CR, with C (ee,µµ),proc i The contributions from subleading backgrounds in each region are estimated directly from simulation and they are denoted by B CR i (θ) in the CRs, and B i (θ) in the SR. Systematic uncertainties are modelled as constrained nuisance parameters (θ), with a lognormal distribution for those that affect the overall normalization of a given process, and Gaussian priors for those that directly affect the transfer factors, indicated by P(θ j ) in Eq. (4). The impact on the transfer factors of each of the uncertainties described in Section 6 is summarized in Table 3. Table 3: Experimental and theoretical sources of systematic uncertainty in the V+jets transfer factors. The second column indicates which ratio a given source of uncertainty is applied to. The impact on m jj is given in the 3rd column, as a single value if no dependence on m jj is observed, or as a range. When a range is shown, the uncertainty increases with the value of m jj . The quoted uncertainty values represent the absolute value of the relative change in the transfer factor corresponding to a variation of ±1 standard deviation in the systematic uncertainty, or equivalently to a variation of ±1 in the corresponding nuisance parameter θ in Eq. (4).

Source of uncertainty
In the following, the expected background yields used as input to the fit procedure are denoted as "prefit", while the yields after a fit to the CRs, or the CRs and SR, are denoted as "CR-postfit", or "postfit", respectively.
The results are presented for the MTR and VTR categories separately. They are shown separately for the 2017 and 2018 data sets when presenting the transfer factors, and for the data sets added together otherwise. The corresponding figures and tables split into individual control regions and data-taking periods are available in Appendix A.
All results are obtained from a combined fit across all categories and data sets.
For the maximum likelihood fit, the different data sets and categories are treated separately. The final results are obtained from a fit using the combined likelihood function including all categories and data sets, which takes into account nuisance correlations.

Control regions for the MTR category
The ratio of dilepton to single-lepton events is studied to validate the predictions from simulation and uncertainties used to model the ratio of Z+jets to W+jets events in the SR (parameters f W/Z,proc i (θ) in Eq. (4)). The prefit ratio between the number of Z+jets and W+jets events in the CRs in bins of m jj is shown in Fig. 5 (upper row) for the 2017 (left) and 2018 (right) data sets. The prediction is compared to the ratio of observed data yields, from which the estimates of minor background processes have been subtracted. The CR-postfit results are shown together with the prefit ratio. A reasonable agreement is observed between data and simulation, with differences in most bins covered by the systematic uncertainties listed in Table 3. In 2017, the simulation predicts a lower ratio than observed in data. The compatibility of the data with the prefit prediction is measured, in this particular category and year only, using a χ 2 test accounting for correlations between the 2017 MTR CRs and each m jj bin. This test indicates that there is a local discrepancy of approximately two standard deviations. The disagreement is attributed to the Z+jets regions with low event yields, and is partially compensated in the fit through the movement of nuisance parameters representing uncertainties such as the renormalization and factorization scales. The significance of the discrepancy is low and none of the nuisance parameters move by more than one standard deviation from their prefit value in the combined fit across all categories and years. The p-value [72] for the 2017 data set in the MTR category after the combined fit is 38.4%, and it is 37.0% for all categories when combining the 2017 and 2018 data sets. Closure tests have also been performed comparing the decays to electrons or muons separately for the W and Z CRs, again showing reasonable agreement between data and simulation.
The ratio of photon to dilepton events is also studied to validate the predictions from simulation and uncertainties used to model the ratio of γ+jets to Z+jets events in the SR (parameters   Table 3, as well as the statistical uncertainty in the simulation.
postfit predictions are in good agreement with the data within one standard deviation for most of the bins, with discrepancies of just over one standard deviation in the dielectron MTR CR for two bins at m jj ≈ 2 TeV. As discussed previously in the context of the validation of the Z+jets over W+jets and the γ+jets over Z+jets ratios in the CRs, the prefit disagreement observed in the 2017 MTR is partially compensated by the nuisance parameters in the fit, resulting in an overall p-value of 37.0%.

Control regions for the VTR category
The prefit and CR-postfit ratios between the number of Z+jets and W+jets events in the CRs in bins of m jj are shown in Fig. 8. The predictions from simulated events are found to model the The m jj distributions in the dilepton and single-lepton CRs are shown in Fig. 9, along with the CR-postfit and postfit estimates. Again, a good agreement between the data and the predictions is shown, within the uncertainties.

Signal region fits
The background estimates in the SR are reported for each m jj bin of the MTR category in Table 4, and for each m jj bin of the VTR category in Table 5. The observed and expected m jj distributions

Combination of results
No significant deviations from the SM expectations are observed. The results of this search are interpreted in terms of an upper limit on the product of the Higgs boson production cross section and its branching fraction to invisible particles, σ H B(H → inv), relative to the predicted cross section assuming SM interactions, σ SM H . Observed and expected 95% CL upper limits are computed using an asymptotic approximation of the CL s method detailed in Refs. [73,74], with a profile likelihood ratio test statistic [75] in which systematic uncertainties are modelled as nuisance parameters following a frequentist approach [76].
Both VBF and non-VBF signal production modes are included, with their relative contributions fixed to the SM prediction within their uncertainties.
Between the 2017 and 2018 data sets, and the two analysis categories (MTR and VTR), the uncertainties are correlated according to the description given in Section 6. To combine with the data taken in 2016, the same correlation scheme as between 2017 and 2018 is used, except for the jet energy calibration uncertainties (JES and JER), which are kept fully uncorrelated. The integrated luminosity of the 2016 data set was updated to 36.3 fb −1 to reflect the latest improvements in the luminosity measurement [71]. To be consistent with the treatment of the       Table 3, as well as the statistical uncertainty in the simulation.          center-of-mass energy. Partial correlations between data sets exist for the uncertainty in the luminosity measurements. All other experimental uncertainties are decorrelated between the run periods before and after 2015. The results of the fit to the data across all data-taking periods are available in Appendix A.

Constraints on an SM-like Higgs boson
Observed and expected upper limits on (σ H /σ SM H )B(H → inv) at 95% CL are presented in Fig. 11 and Table 6. The limits are computed for the combination of all data sets, as well as for individual categories and data-taking periods. By itself, the addition of the γ+jets CR (the addition of the VTR category) improves the expected limits by about 11 (8)   ratio, followed by the statistical uncertainties in the simulated samples, the trigger uncertainties, jet calibration effects, and the uncertainties in the QCD multijet modelling.
The upper limit on B(H → inv), obtained from the combination of 2012-2018 data, is interpreted in the context of Higgs-portal models of DM interactions, in which a stable DM particle couples to the SM Higgs boson. The interaction between a DM particle and an atomic nucleus may be mediated by the exchange of a Higgs boson, producing nuclear recoil signatures, such as those investigated by direct detection experiments. The sensitivity of these experiments depends mainly on the DM particle mass (m DM ). If m DM is smaller than half of the Higgs boson where Γ SM is set to 4.07 MeV [79]. We do not perform the translation under the assumption of a vector DM candidate in this paper, since it requires an extended dark Higgs sector, which may lead to modifications of kinematic distributions assumed for the invisibly decaying Higgs boson signal. Figure 13 shows the 90% CL upper limits on the spin-independent DM-nucleon scattering cross section as a function of m DM , for both the scalar and the fermion DM scenarios. The corresponding 90% CL upper limit on B(H → inv) is 0.16. These limits are computed at the 90% CL so that they can be compared with those from direct detection experiments such as XENON1T [80], CRESST-II [81], CDMSlite [82], LUX [83], Panda-X 4T [84], and DarkSide-50 [85], which provide the strongest constraints in the m DM range probed by this search. The collider-based results complement the direct-detection experiments in the range m DM smaller than 12 (6) GeV, assuming a fermion (scalar) DM candidate.  Figure 13: The 90% CL upper limits on the spin-independent DM-nucleon scattering cross section in Higgs-portal models, assuming a scalar (dashed orange) or fermion (dashed red) DM candidate. Limits are computed as functions of m DM and are compared to those from the XENON1T [80], CRESST-II [81], CDMSlite [82], LUX [83], Panda-X 4T [84], and DarkSide-50 [85] experiments, which are shown as solid lines.

Summary
A search for the Higgs boson (H) decaying invisibly, produced in the vector boson fusion mode, is performed with 101 fb −1 of proton-proton collisions delivered by the LHC at √ s = 13 TeV and collected by the CMS detector during 2017-2018. Building upon the previously published results, an additional category targeting events at lower Higgs boson transverse momentum is added. An additional highly populated control region, based on production of a photon associated with jets, is used to constrain the dominant irreducible background from invisible decays of a Z boson produced in association with jets. Compared with the strategy of the previously published analysis, these additions improve the expected limits by approximately 17%. The observed (expected) upper limit on the invisible branching fraction of the Higgs boson, B(H → inv), is found to be 0.18 (0.12) at the 95% confidence level (CL), assuming the standard model production cross section. The results are combined with previous measurements in the vector boson fusion topology, for total integrated luminosities of 19.7 fb −1 at √ s = 8 TeV and 140 fb −1 at √ s = 13 TeV, yielding an observed (expected) upper limit of 0.18 (0.10) at the 95% CL. This is currently the most stringent limit on B(H → inv). Finally, the results are interpreted in the context of Higgs-portal models. The 90% CL upper limits on the spin-independent darkmatter-nucleon scattering cross section obtained from the observed LHC data collected during 2012-2018 complement the direct detection experiments in the range of dark matter particle masses smaller than 12 (6) GeV, assuming a fermion (scalar) dark matter candidate.

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid and other centers for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC, the CMS detector, and the supporting computing infrastructure provided by the follow-    [21] ATLAS Collaboration, "Search for associated production of a Z boson with an invisibly decaying Higgs boson or dark matter candidates at √ s = 13 TeV with the ATLAS detector", 2021. arXiv:2111.08372. Submitted to Phys. Lett. B.    [72] L. Demortier, "P-values and nuisance parameters", in Statistical issues for LHC physics. Proceedings

A Additional figures and tables Trigger efficiencies
Trigger efficiencies are shown in Fig. A.1.

Combination for MTR adding 2016 data
The observed m jj distribution in the MTR SR for the sum of the 2016-2018 samples is shown in Fig. A.3.    Tables A.1 to A.6.                                                       Table 3, as well as the statistical uncertainty in the simulation. Comparison between data and simulation for the Z(ee)+jets/W(eν)+jets (left) and Z(µµ)+jets/W(µν)+jets (right) prefit and CR-postfit ratios, as functions of m jj , for the VTR category 2017 samples. The minor backgrounds in each CR are subtracted from the data using estimates from simulation. The grey bands include the theoretical and experimental systematic uncertainties listed in Table 3, as well as the statistical uncertainty in the simulation.  Table 3, as well as the statistical uncertainty in the simulation. Comparison between data and simulation for the Z(ee)+jets/W(eν)+jets (left) and Z(µµ)+jets/W(µν)+jets (right) prefit and CR-postfit ratios, as functions of m jj , for the VTR category 2018 samples. The minor backgrounds in each CR are subtracted from the data using estimates from simulation. The grey bands include the theoretical and experimental systematic uncertainties listed in Table 3, as well as the statistical uncertainty in the simulation.