Search for the standard model Higgs boson produced through vector boson fusion and decaying to b bbar

A first search is reported for a standard model Higgs boson (H) that is produced through vector boson fusion and decays to a bottom-quark pair. Two data samples, corresponding to integrated luminosities of 19.8 inverse femtobarns and 18.3 inverse femtobarns of proton-proton collisions at sqrt(s) = 8 TeV, were selected for this channel at the CERN LHC. The observed significance in these data samples for a H to b bbar signal at a mass of 125 GeV is 2.2 standard deviations, while the expected significance is 0.8 standard deviations. The fitted signal strength is mu = sigma/sigma[SM] = 2.8 +1.6/-1.4. The combination of this result with other CMS searches for the Higgs boson decaying to a b-quark pair yields a signal strength of 1.0 +/- 0.4, corresponding to a signal significance of 2.6 standard deviations for a Higgs boson mass of 125 GeV.


Introduction
In the standard model (SM) [1][2][3], the electroweak symmetry breaking is achieved by a mechanism [4][5][6] that provides mass to the electroweak gauge bosons, while leaving the photon massless. The mechanism predicts the existence of a scalar Higgs boson (H), and its observation was one of the main goals of the CERN LHC program. A boson with mass near 125 GeV was recently discovered by both the ATLAS [7] and CMS [8,9] collaborations, with properties that are compatible with those of a SM Higgs boson [10,11].
At the LHC, a SM Higgs boson can be produced through a variety of mechanisms. The expected production cross sections [12] as a function of the Higgs boson mass are such that, in the mass range considered in this study, the vector boson fusion (VBF) process pp → qqH has the second largest production cross section following gluon fusion (GF). Furthermore, for a SM Higgs boson with a mass m H ≲ 135 GeV, the expected dominant decay mode is to a b-quark pair (bb).
Thus far, the search for H → bb has been carried out in the associated production process involving a W or a Z boson (VH production) at the Tevatron [13] and at the LHC [14,15], as well as in association with a top quark pair at the LHC [16][17][18], without reaching the necessary sensitivity to observe the Higgs boson in this decay channel. It is therefore important to exploit other production modes, such as VBF, to provide further information in the bb decay channel on the nature and properties of the Higgs boson.
The prominent feature of the VBF process qqH → qqbb is the presence of four energetic jets in the final state. Two jets are expected to originate from a light-quark pair (u or d), which are typically two valence quarks from each of the colliding protons scattered away from the beam line in the VBF process. These "VBF-tagging" jets are expected to be roughly in the forward and backward directions relative to the beam direction. Two additional jets are expected from the Higgs boson decay to a bb pair in more central regions of the detector. Another important property of the signal events is that, being produced through an electroweak process, no quantum chromodynamics (QCD) color is exchanged at leading order in the production. As a result, in the most probable color evolution of these events, the VBF-tagging jets connect to the proton remnants in the forward and backward beam line directions, while the two b-quark jets connect to each other as decay products of the color-neutral Higgs boson. Consequently, very little additional QCD radiation and hadronic activity is expected in the space outside the color-connected regions, in particular in the whole rapidity interval (rapidity gap) between the two VBF-tagging jets, with the exception of the Higgs boson decay products.
The dominant background to this search is from QCD production of multijet events. Other backgrounds arise from: (i) hadronic decays of Z or W bosons produced in association with additional jets, (ii) hadronic decays of top quark pairs, and (iii) hadronic decays of singly produced top quarks. The contribution of the Higgs boson in GF processes with two or more associated jets is included in the expected signal yield.
The search is performed on selected four-jet events that are characterized by the response of a multivariate discriminant trained to separate signal events from background without making use of kinematic information on the two b-jet candidates. Subsequently, the invariant mass distribution of two b jets is analyzed in each category in the search for a signal "bump" above the smooth contribution from the SM background. This is the first search of this kind, and the only search for the SM Higgs boson in all-jet final states at the LHC. A search for a SM Higgs boson in the all-hadronic final state has been previously reported by the CDF experiment [19].
the signal and main backgrounds, and Section 4 presents the employed triggers. Event reconstruction and selection are described in Sections 5 and 6, respectively. The unique features of the analysis are discussed in Section 7, which includes the improvement of the resolution in jet transverse momentum (p T ) by regression techniques, discrimination between quark- and gluon-originated jets, and soft QCD activity. An important validation of the analysis strategy is the observation of the Z → bb decay, which is presented in Section 8. The search for a SM Higgs boson is discussed in Section 9 and the associated systematic uncertainties are presented in Section 10. The final results are discussed in Section 11 and combined with previous searches in the same channel in Section 12. We summarize in Section 13.

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. A silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter, and a brass and scintillator hadron calorimeter are located within the solenoidal field. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke of the solenoid. Forward calorimetry (pseudorapidity 3 < |η| < 5) complements the coverage provided by the barrel (|η| < 1.3) and end cap (1.3 < |η| < 3) detectors. The first level (L1) of the CMS trigger system, composed of specialized processors, uses information from the calorimeters and muon detectors to select the most interesting events in a time interval of less than 4 µs. The high-level trigger (HLT) processor farm decreases the event rate from about 100 kHz to less than 1 kHz, before data storage. A more detailed description of the CMS apparatus and the main kinematic variables used in the analysis can be found in Ref. [20].

Simulated samples
Samples of simulated Monte Carlo (MC) signal and background events are used to guide the analysis optimization and to estimate signal yields. Several event generators are used to produce the MC events. The samples of VBF and GF signal processes are generated using the next-to-leading-order perturbative QCD program POWHEG 1.0 [21], interfaced to TAUOLA 2.7 [22] and PYTHIA 6.4.26 [23] for the hadronization process and modeling of the underlying event (UE). The most recent PYTHIA 6 Z2* tune is derived from the Z1 tune [24], which uses the CTEQ5L parton distribution functions (PDF), whereas Z2* adopts CTEQ6L [25]. The signal samples are generated using only H → bb decays, for five mass hypotheses: m H = 115, 120, 125, 130, and 135 GeV.
Background samples of QCD multijet, Z+jets, W+jets, and tt events are simulated using leading-order (LO) MADGRAPH 5.1.3.2 [26] interfaced with PYTHIA. The single top quark background samples are produced using POWHEG, interfaced with TAUOLA and PYTHIA. The default set of PDF used with POWHEG samples is CT10 [27], while the LO CTEQ6L1 set [25] is used for other samples. The production cross sections for W+jets and Z+jets are rescaled to next-to-next-to-leading-order (NNLO) cross sections calculated using the FEWZ 3.1 program [28][29][30]. The tt and single top quark samples are also rescaled to their cross sections based on NNLO calculations [31,32].
To accurately simulate the LHC luminosity conditions during data taking, additional simulated pp interactions overlapping in the same or neighboring bunch crossings of the main interaction, denoted as pileup, are added to the simulated events with a multiplicity distribution that matches the one in the data.

Triggers
The data used for this analysis were collected using two different trigger strategies that result in two different data samples for analysis. First, a set of dedicated trigger event selections (paths) was specifically designed and deployed for the VBF qqH → qqbb signal search, both for the L1 trigger and the HLT, and operated during the full 2012 data taking. Then, a more general trigger was employed for the larger part of the 2012 data taking that targeted VBF signatures in general. The first (nominal) set of triggers collected the larger fraction of the signal events, while the second trigger supplemented the search with events that failed the stringent nominal-trigger requirements. The integrated luminosity collected with the first set of triggers was 19.8 fb −1 , while for the second trigger it was 18.3 fb −1 .
While the first dedicated trigger paths collected data within the standard CMS streams, the second general-purpose VBF trigger path ran in parallel with data streams that were reconstructed later, in 2013, during the LHC upgrade.

Dedicated signal trigger
The L1 paths require the presence of at least three jets with p T above decreasing thresholds $p_T^{(1)}$, $p_T^{(2)}$, $p_T^{(3)}$ that were adjusted according to the instantaneous luminosity [p T = 24-32 GeV]. Among the three jets, one and only one of the two leading jets [with p T > $p_T^{(1)}$, $p_T^{(2)}$] can be in the forward region with pseudorapidity 2.6 < |η| ≤ 5.2, while the other two jets are required to be central (|η| ≤ 2.6).
The HLT paths are seeded by the L1 paths described above, and require the presence of four jets with p T above thresholds that were again adjusted according to the instantaneous luminosity, p T > 75-82, 55-65, 35-48, and 20-35 GeV, respectively. Two complementary HLT paths have been employed that make use, respectively, of (i) only calorimeter-based jets (CaloJets) and (ii) particle-flow jets (PFJets, see Section 5). At least one of the four selected jets must further fulfill minimum b-tagging requirements, evaluated using HLT regional tracking around the jets, and using the "track counting high-efficiency" (TCHE) or the "combined secondary vertex" (CSV) algorithms [33], alternatively for the first and second paths. Events are accepted if they satisfy either of the two paths.
Among the four leading jets, the light-quark (qq) VBF-tagging jet pair is assigned in one of two ways: (i) the pair with the smallest HLT b-tagging output values (b-tag-sorted qq) or (ii) the pair with the maximum pseudorapidity difference (η-sorted qq).Both pairs are required to exceed variable minimum thresholds on |∆η qq | of 2.2-2.5, and of 200-240 GeV on the dijet invariant mass m qq , depending on the instantaneous luminosity.
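The two pair-assignment rules and the topology requirement above can be illustrated with a short Python sketch. It is an illustration only, not the trigger implementation: the dictionary-based jet representation, the function names, and the massless-jet approximation for the invariant mass are assumptions, and the default thresholds are simply the upper ends of the quoted ranges.

```python
import math
from itertools import combinations


def dijet_mass(j1, j2):
    """Invariant mass (GeV) of two jets, treating both as massless (assumption)."""
    m2 = 2.0 * j1["pt"] * j2["pt"] * (
        math.cosh(j1["eta"] - j2["eta"]) - math.cos(j1["phi"] - j2["phi"])
    )
    return math.sqrt(max(m2, 0.0))


def btag_sorted_qq(jets):
    """Rule (i): the pair with the smallest b-tag output values."""
    return tuple(sorted(jets, key=lambda j: j["btag"])[:2])


def eta_sorted_qq(jets):
    """Rule (ii): the pair with the maximum pseudorapidity difference."""
    return max(combinations(jets, 2),
               key=lambda p: abs(p[0]["eta"] - p[1]["eta"]))


def passes_vbf_topology(pair, min_deta=2.5, min_mqq=240.0):
    """Check |delta eta_qq| and m_qq against (illustrative) thresholds."""
    j1, j2 = pair
    return (abs(j1["eta"] - j2["eta"]) > min_deta
            and dijet_mass(j1, j2) > min_mqq)


# Hypothetical four leading jets (pt in GeV); values are made up.
jets = [
    {"pt": 90.0, "eta": 2.8, "phi": 0.1, "btag": 0.05},
    {"pt": 75.0, "eta": -1.9, "phi": 2.9, "btag": 0.10},
    {"pt": 60.0, "eta": 0.3, "phi": -1.0, "btag": 0.90},
    {"pt": 45.0, "eta": -0.4, "phi": 1.5, "btag": 0.80},
]

# Both candidate pairs must satisfy the topology requirement.
pairs_ok = all(passes_vbf_topology(p)
               for p in (btag_sorted_qq(jets), eta_sorted_qq(jets)))
```

In this toy event both rules pick the same two jets, which is the typical situation for signal-like events where the least b-tagged jets are also the most widely separated in η.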
To evaluate trigger efficiencies, a prescaled control path is used, requiring one PFJet with p T > 80 GeV. To match the efficiency in data, the simulated trigger efficiency must be corrected with a scale factor of order 0.75 that is parametrized as a function of the highest jet b-tag output in the event and the invariant mass of the quark-jet candidates. With this procedure the weak dependence of the trigger efficiencies on the invariant mass of the two b jets is also taken into account.

General-purpose VBF trigger
The L1 paths for the general-purpose VBF trigger require that the scalar p T sum of the hadronic activity in the event exceeds 175 or 200 GeV, depending on the instantaneous luminosity.
The HLT path is seeded by the L1 path described above, and requires the presence of at least two CaloJets with p T > 35 GeV. Out of all the possible jet pairs in the event, with one jet lying at positive and the other at negative η, the pair with the highest invariant mass is selected as the most probable VBF-tagging jet pair. The corresponding invariant mass m trig jj and absolute pseudorapidity difference |∆η trig jj | are required to be larger than 700 GeV and 3.5, respectively. The efficiency of the general-purpose VBF trigger is measured in a similar way as for the dedicated triggers, using a prescaled path (requiring two PFJets with average p T > 80 GeV). To match the efficiency in data, the simulated trigger efficiency must be corrected with a scale factor of order 0.8 that is expressed as a function of the invariant mass and the pseudorapidity difference of the two offline quark-jet candidates.
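The pair-selection logic of the general-purpose trigger can be sketched in Python as follows. This is a simplified illustration under stated assumptions: jets are plain (p T, η, φ) tuples treated as massless, the function names are hypothetical, and the L1 requirement on the scalar p T sum is not modeled.

```python
import math
from itertools import combinations


def dijet_mass(j1, j2):
    """Invariant mass (GeV) of two massless jets given as (pt, eta, phi)."""
    m2 = 2.0 * j1[0] * j2[0] * (math.cosh(j1[1] - j2[1]) - math.cos(j1[2] - j2[2]))
    return math.sqrt(max(m2, 0.0))


def trigger_vbf_pair(jets, min_pt=35.0):
    """Highest-mass pair with one jet at positive and one at negative eta."""
    candidates = [j for j in jets if j[0] > min_pt]
    pairs = [(a, b) for a, b in combinations(candidates, 2) if a[1] * b[1] < 0]
    if not pairs:
        return None
    return max(pairs, key=lambda p: dijet_mass(*p))


def passes_general_vbf_trigger(jets, min_mjj=700.0, min_deta=3.5):
    """Apply the m_jj and |delta eta_jj| requirements to the selected pair."""
    pair = trigger_vbf_pair(jets)
    if pair is None:
        return False
    return (dijet_mass(*pair) > min_mjj
            and abs(pair[0][1] - pair[1][1]) > min_deta)


# Hypothetical jets (pt in GeV, eta, phi); values are made up.
jets = [(100.0, 2.5, 0.0), (80.0, -2.0, 3.0), (50.0, 0.5, 1.0)]
selected = trigger_vbf_pair(jets)
accepted = passes_general_vbf_trigger(jets)
```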

Event reconstruction
The offline analysis uses reconstructed charged-particle tracks and candidates from the particle-flow (PF) algorithm [34][35][36]. In the PF event reconstruction all stable particles in the event, i.e. electrons, muons, photons, and charged and neutral hadrons, are reconstructed as PF candidates using information from all CMS subdetectors to obtain an optimal determination of their direction, energy, and type. The PF candidates are then used to reconstruct the jets and missing transverse energy.
Jets are reconstructed by clustering PF candidates with the anti-k T algorithm [37,38] with a distance parameter of 0.5. Reconstructed jets require a small additional energy correction, mostly due to thresholds on reconstructed tracks and clusters in the PF algorithm and various reconstruction inefficiencies [39]. Jet identification criteria are also applied to reject misreconstructed jets resulting from detector noise, as well as jets heavily contaminated with pileup energy (clustering of energy deposits not associated with a parton from the primary pp interaction) [40]. The efficiency of the jet identification criteria is greater than 99%, with the rejection of 90% of background pileup jets with p T 50 GeV.
The identification of jets that originate from the hadronization of b quarks is done with the CSV b tagger [33], also implemented for the HLT paths, as described in Section 4. The CSV algorithm combines the information from track impact parameters and secondary vertices identified within a given jet, and provides a continuous discriminator output.
Events are required to have at least four reconstructed jets. All the jets found in an event are ordered according to their p T , and the four leading ones are considered as the most probable b jet and VBF-tagging jet candidates. The distinction between the two jet types is made by means of a multivariate discriminant that, in addition to the b-tag values and the b-tag ordering, takes into account the η values and the η ordering. In the VBF H → bb signal simulation it is found that the b jets have higher b-tag values and are more central in η than the VBF-tagging jets. A boosted decision tree (BDT), implemented with the TMVA package [41], is trained on simulated signal events using the discriminating variables previously described and its output is used as a b-jet likelihood score; out of the four leading jets the two with the highest score are identified as the b jets, while the other two are identified as the VBF-tagging jets. With the use of the multivariate b-jet assignment the signal efficiency is increased by ≈10% compared to the interpretation based on CSV output only.
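The final assignment step can be written compactly. The Python sketch below assumes that a b-jet likelihood score has already been computed for each of the four leading jets (the BDT itself is not reproduced here); the function name and the example scores are hypothetical.

```python
def assign_jets(jets, scores):
    """Split the four leading jets into (b_jets, vbf_jets).

    The two jets with the highest score are taken as b jets and the
    remaining two as VBF-tagging jets.
    """
    if len(jets) != 4 or len(scores) != 4:
        raise ValueError("expected exactly four leading jets and four scores")
    order = sorted(range(4), key=lambda i: scores[i], reverse=True)
    b_jets = [jets[i] for i in order[:2]]
    vbf_jets = [jets[i] for i in order[2:]]
    return b_jets, vbf_jets


# Hypothetical scores for four pT-ordered jets.
b_jets, vbf_jets = assign_jets(["j1", "j2", "j3", "j4"], [0.1, 0.9, 0.7, -0.3])
```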

Event selection
The offline event selection is based upon the b-jet and VBF-tagging jet assignment described in Section 5, and is adjusted to the two different trigger sets presented in Section 4. In what follows the selected events are divided into two sets referred to as set A and set B. These selections are summarized in Table 1.
Events selected in set A are required to have been selected by the dedicated VBF qqH → qqbb trigger and to have at least four PF jets with $p_T^{(1,2,3,4)} > 80, 70, 50, 40$ GeV and |η| < 4.5. Moreover, at least two of these jets must satisfy the loose CSV working point requirement (CSVL) [33]. The VBF topology is ensured by requiring m qq > 250 GeV and |∆η qq | > 2.5, where qq denotes the pair of the most probable VBF-tagging jets. Finally, in order to suppress further the background, the azimuthal angle difference ∆φ bb between the two b-jet candidates must be less than 2.0 radians. Figure 1 shows the normalized distributions of |∆η qq | (left) and ∆φ bb (right) for the sum of all simulated backgrounds, and the VBF and GF Higgs boson production.
Events in set B are first required to not belong to set A (to avoid double counting). Then, they must have passed the generic VBF topological trigger and have at least four PF jets with p T > 30 GeV and |η| < 4.5. In addition, the scalar p T sum of the two leading jets must be greater than 160 GeV. In order to enrich the sample in b jets, there must be at least one jet satisfying the medium CSV working point requirement (CSVM) [33] and one jet satisfying the CSVL. The VBF topology is ensured by requiring m qq , m trig jj > 700 GeV, and |∆η qq |, |∆η trig jj | > 3.5, where qq denotes the pair of the most probable VBF jets and jj denotes the jet pair with the highest invariant mass (as in the trigger logic described in Section 4). Finally, the azimuthal angle ∆φ bb between the two b-jet candidates must be less than 2.0 radians. After all the selection requirements, 2.3% of the simulated VBF signal events (for m H = 125 GeV) end up in set A, and 0.8% end up in set B. The fraction of events in set A that also satisfy the requirements of set B (except for the set A veto) amounts to 39%. The set B selection recovers signal events presenting pronounced VBF jets, with lower p T but larger pseudorapidity opening and invariant mass.
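The two offline selections can be summarized as a pair of boolean functions. The Python sketch below assumes a hypothetical event dictionary with precomputed quantities (all key names are illustrative), and it interprets the set B b-tagging requirement as two distinct jets, one passing CSVM and another passing at least CSVL, which is an assumption about the wording above.

```python
def in_set_a(ev):
    """Set A: dedicated trigger plus the offline requirements quoted above."""
    jets = ev["jets"]  # list of (pt, eta), ordered by decreasing pt
    if not ev["dedicated_trigger"] or len(jets) < 4:
        return False
    pt_cuts = (80.0, 70.0, 50.0, 40.0)
    jets_ok = all(pt > cut and abs(eta) < 4.5
                  for (pt, eta), cut in zip(jets, pt_cuts))
    return (jets_ok and ev["n_csvl"] >= 2
            and ev["mqq"] > 250.0 and ev["deta_qq"] > 2.5
            and ev["dphi_bb"] < 2.0)


def in_set_b(ev):
    """Set B: general VBF trigger, set A veto, and tighter VBF topology."""
    jets = ev["jets"]
    if in_set_a(ev) or not ev["vbf_trigger"] or len(jets) < 4:
        return False
    jets_ok = all(pt > 30.0 and abs(eta) < 4.5 for pt, eta in jets[:4])
    lead_sum_ok = jets[0][0] + jets[1][0] > 160.0
    # Assumption: CSVM jets also count as CSVL, so n_csvl >= 2 means a second jet.
    btag_ok = ev["n_csvm"] >= 1 and ev["n_csvl"] >= 2
    topo_ok = (ev["mqq"] > 700.0 and ev["mjj_trig"] > 700.0
               and ev["deta_qq"] > 3.5 and ev["deta_jj_trig"] > 3.5)
    return jets_ok and lead_sum_ok and btag_ok and topo_ok and ev["dphi_bb"] < 2.0


def classify(ev):
    """Return 'A', 'B', or None for an event."""
    if in_set_a(ev):
        return "A"
    if in_set_b(ev):
        return "B"
    return None
```

Because `in_set_b` starts with the set A veto, the two sets are disjoint by construction, which is what prevents double counting in the combined fit.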

Signal properties
The analysis described in this paper relies on certain characteristic properties of the studied final state, which provide a significant improvement of the overall sensitivity. First, the resolution of the invariant mass of the two b jets is improved by applying multivariate regression techniques. Second, the jet composition properties are used to form a discriminant that separates jets originating from light quarks from those originating from gluons. Third, soft QCD activity outside the jets is quantified and used as a discriminant between QCD processes with strong color flow and the VBF signal without color flow.

Jet transverse-momentum regression
The bb mass resolution is improved by using a regression technique similar to those used in the search for a Higgs boson produced in association with a weak vector boson and decaying to bb [14]. A refined calibration is carried out for individual b jets, beyond the default jet energy corrections, that takes into account the jet composition properties and targets semileptonic b decays that lead to a substantial mismeasurement of the jet p T due to the presence of an escaping neutrino. For this purpose a regression BDT is trained on simulated signal events with inputs including information about the jet properties and structure. The target of the regression is the p T of the associated particle-level jet, clustered from all stable particles (with lifetime cτ > 1 cm). The inputs include: (i) the jet p T , η, and mass; (ii) the jet energy fractions carried by neutral hadrons and photons [34][35][36]; (iii) the mass and the uncertainty in the decay length associated with the secondary vertex, when present; (iv) the event missing transverse energy and its azimuthal direction relative to the jet; (v) the total number of jet constituents; (vi) the p T of the soft-lepton candidate inside the jet, when present, and its p T component perpendicular to the jet axis; (vii) the p T of the leading track in the jet; and (viii) the average p T density of the event in y-φ space (FASTJET ρ method [42]).
The additional energy correction of b jets leads to an improvement of the jet p T resolution, which in turn improves the dijet invariant mass resolution by approximately 17% in the phase space of the offline event selections. Figure 2 shows the reconstructed dijet invariant mass of the b-jet candidates (m bb ) before and after the regression for simulated events passing set A selections. The measured distribution of the regressed m bb in set A is shown in Fig. 3.
The validation of the regression technique in data is done with samples of Z → ℓℓ events with one or two b-tagged jets. When the jets are corrected by the regression procedure, the p T balance distribution between the Z boson, reconstructed from the leptons, and the b-tagged jet or dijet system is better centered at zero and narrower than when the regression correction is not applied. In both cases the distributions for data and the simulated samples are in good agreement after the regression correction is applied.

Discrimination between quark-and gluon-originated jets
To further identify whether the jet pair with the smallest b-tag values among the four leading jets is likely to originate from hadronization of a light (u,d,s-type) quark, as expected for signal VBF jets, or from gluons, as is more probable for jets produced in QCD processes, a quark-gluon discriminant [43][44][45] is applied to the VBF candidate jets.
The discriminant exploits differences in the showering and fragmentation of gluons and quarks, making use of the following internal jet composition observables based on the PF jet constituents: (i) the major root-mean square (RMS) of the distribution of jet constituents in the η-φ plane [45], (ii) the minor RMS of the distribution of jet constituents in the η-φ plane [45], (iii) the jet asymmetry pull [46], (iv) the jet particle multiplicity, and (v) the maximum energy fraction carried by a jet constituent.The pull and RMS variables are calculated by weighting each jet constituent by its p T squared [45].
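As an illustration of the first two observables, the Python sketch below computes major and minor RMS widths from p T squared weighted second moments of the constituents in the η-φ plane. It assumes that the two widths are defined through the eigenvalues of the weighted second-moment matrix taken about the jet axis; this is a common construction, but the exact definition used in the analysis is the one given in Ref. [45], and the function name and input format here are hypothetical.

```python
import math


def jet_axes_rms(constituents):
    """Return (major, minor) pT^2-weighted RMS widths in the eta-phi plane.

    constituents: iterable of (pt, deta, dphi), where deta and dphi are
    measured relative to the jet axis.
    """
    sum_w = m11 = m22 = m12 = 0.0
    for pt, deta, dphi in constituents:
        w = pt * pt  # each constituent is weighted by its pT squared
        sum_w += w
        m11 += w * deta * deta
        m22 += w * dphi * dphi
        m12 += w * deta * dphi
    if sum_w == 0.0:
        raise ValueError("no constituents with nonzero pT")
    # Eigenvalues of the symmetric 2x2 matrix [[m11, m12], [m12, m22]].
    half_trace = 0.5 * (m11 + m22)
    half_gap = 0.5 * math.sqrt((m11 - m22) ** 2 + 4.0 * m12 * m12)
    lam_major = half_trace + half_gap
    lam_minor = max(half_trace - half_gap, 0.0)  # guard against rounding
    return math.sqrt(lam_major / sum_w), math.sqrt(lam_minor / sum_w)
```

A jet whose constituents are spread along a single direction gives a vanishing minor width, while an isotropic jet gives equal major and minor widths.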
The five variables above are used as inputs to a likelihood estimated with gluon and quark jets from simulated QCD dijet events using the TMVA package. To improve the separation power, all variables are corrected for pileup effects as a function of the FASTJET ρ density. Figure 4 shows the normalized distribution of the quark-gluon likelihood (QGL) [45] for the first VBF-jet candidate (the jet that is ranked third in the b-tag score; see Section 5), for background and signal events. As expected, VBF signal events, dominated by quark jets, have a pronounced peak at likelihood ∼0, while the background and GF events are enriched in gluon jets, and have a very different QGL distribution. The QGL distribution of all four jets is used as input to the signal vs background discriminant (Section 9.1).
Figure caption (y axis: Events / 10 GeV): The LO QCD cross section is multiplied by a factor 1.68 so that the total number of background events equals the number of events in the data, while the VBF and GF Higgs boson signal cross sections are multiplied by a factor 10 for better visibility. The last bin is the sum of all the events beyond the range of the x axis (overflow). The panel at the bottom shows the fractional difference between the data and the background simulation, with the shaded band representing the statistical uncertainties in the MC samples.

Soft QCD activity
To measure the additional hadronic activity between the VBF-tagging jets, excluding the more centrally produced Higgs boson decay products, only reconstructed charged tracks are used.This is done to measure the hadronic activity associated with the primary vertex (PV), defined as the reconstructed vertex with the largest sum of squared transverse momenta of tracks used to reconstruct it.
A collection of "additional tracks" is assembled using reconstructed tracks that (i) satisfy the high purity quality requirements defined in Ref. [47] and p T > 300 MeV; (ii) are not associated with any of the four leading PF jets in the event; (iii) have a minimum longitudinal impact parameter, |d z (PV)|, with respect to the main PV, rather than to other pileup interaction vertices; (iv) satisfy |d z (PV)| < 2 mm and |d z (PV)| < 3σ z (PV) with respect to the PV, where σ z (PV) is the uncertainty in d z (PV); and (v) are not in the region between the two best b-tagged jets. This region is defined as an ellipse in the η-φ plane, centered on the midpoint between the two jets, with major axis of length ∆R(bb) + 1, where $\Delta R(bb) = \sqrt{(\Delta\eta_{bb})^2 + (\Delta\phi_{bb})^2}$, oriented along the direction connecting the two b jets, and with minor axis of length 1.
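The geometric veto of requirement (v) can be made concrete with a short Python sketch. It assumes that the quoted axis lengths are full lengths, so that the semi-axes are (∆R(bb) + 1)/2 and 1/2, and that azimuthal differences are wrapped into [−π, π); the function names and the (η, φ) tuple format are hypothetical.

```python
import math


def delta_phi(a, b):
    """Azimuthal difference a - b wrapped into [-pi, pi)."""
    return (a - b + math.pi) % (2.0 * math.pi) - math.pi


def in_bb_ellipse(trk_eta, trk_phi, b1, b2):
    """True if a track lies inside the ellipse between the two b jets.

    b1, b2: (eta, phi) of the two best b-tagged jets.
    """
    deta = b2[0] - b1[0]
    dphi = delta_phi(b2[1], b1[1])
    dr = math.hypot(deta, dphi)  # Delta R(bb)
    # Ellipse center: midpoint between the two jets.
    c_eta = b1[0] + 0.5 * deta
    c_phi = b1[1] + 0.5 * dphi
    x = trk_eta - c_eta
    y = delta_phi(trk_phi, c_phi)
    # Unit vector along the line connecting the two b jets.
    if dr > 0.0:
        ux, uy = deta / dr, dphi / dr
    else:
        ux, uy = 1.0, 0.0
    along = x * ux + y * uy
    perp = -x * uy + y * ux
    semi_major = 0.5 * (dr + 1.0)
    semi_minor = 0.5
    return (along / semi_major) ** 2 + (perp / semi_minor) ** 2 <= 1.0
```

Tracks for which `in_bb_ellipse` returns True would be excluded from the additional-track collection, so that the Higgs boson decay products do not contribute to the soft-activity measurement.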
The additional tracks are then clustered into "soft TrackJets" using the anti-k T clustering algorithm with a distance parameter of 0. method [48] to reconstruct the hadronization of partons with very low energies down to a few GeV [49]; an extensive study of the soft TrackJet activity can be found in Refs. [43,44].
The discriminating variable, H soft T , that encapsulates the differences between the signal and the QCD background, is the scalar p T sum of the soft TrackJets with p T > 1 GeV, and is shown in Fig. 5.
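For concreteness, the definition of H soft T amounts to the following one-line computation, shown here as a Python sketch; the function name and the list-of-p T input are assumptions.

```python
def soft_ht(trackjet_pts, min_pt=1.0):
    """Scalar pT sum (GeV) of the soft TrackJets with pT above min_pt."""
    return sum(pt for pt in trackjet_pts if pt > min_pt)


# Hypothetical soft TrackJet transverse momenta in GeV.
example_ht = soft_ht([0.5, 1.5, 3.0])
```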

Extraction of the Z boson signal
The Z+jets background process, with the Z boson decaying to a b-quark pair, provides a validation of the analysis strategy used for the Higgs boson search. The extraction of the Z boson signal demonstrates the ability to observe a relatively wide hadronic resonance on top of a smooth QCD background. Also, if such a signal can be seen, it can serve for in situ confirmation of the scale and resolution of the invariant mass of the two b jets. Recently, the observation of a Z → bb signal was reported by the ATLAS Collaboration [50] in the Z+1 jet final state, and similar techniques are applied here. The overall strategy has two parts. First, events are selected from set A, with the additional requirement to have at least one CSVM jet. It should be noted that it is important to extract the Z boson signal in the same four-jet phase space in which the Higgs boson search is performed. Then, a multivariate discriminant is trained to separate the Z+jets process from the QCD multijet production, using variables that are only weakly correlated to m bb . According to the value of the discriminant the events are divided into three categories, ranging from a signal-depleted control category, to a signal-enriched one. Finally, a simultaneous fit of the signal and the QCD background m bb shape is performed in all three categories. The subsequent sections give details of the outlined procedure.

Z boson signal vs. background discrimination
As discussed above, the selection of events is based on set A, with the additional requirement of having at least one CSVM jet; the tightening of the b-tagging condition was found to improve the expected sensitivity. A Fisher discriminant (FD) [41] is implemented with the TMVA package and trained to discriminate between the Z+jets signal and the background. For this purpose, seven variables are used: (i) the absolute η difference |∆η qq | of the VBF jets; (ii) the absolute η of the b-jet system |η bb |; (iii) the CSV value of the jet with highest CSV value (with best b tag); and (iv)-(vii) the QGL values of the four leading jets. Due to the very small correlations between the variables, the FD performs almost as well as more advanced, nonlinear discriminators. Figure 6 shows the normalized distribution of the discriminant, where the output of the Z+jets signal is compared to the background.

Fit of the dijet invariant mass spectrum
The selected events are divided into three categories, based on the FD output. Table 2 summarizes the event categories and corresponding yields. Besides its discrimination power, the FD has minimal correlation with the invariant mass of the two b jets. This means that the m bb spectrum from QCD processes is independent of the category. In practice, however, there is a small residual dependence, of up to 3%, which is corrected with a linear transfer function of m bb that is taken from data. The extraction of the Z boson signal is done with a simultaneous fit in all three categories. Eq. (1) describes the fit model:
$$f_i(m_{bb}) = \mu_Z N_{i,Z} Z_i(m_{bb}; k_{\mathrm{JES}}, k_{\mathrm{JER}}) + N_{i,t} T_i(m_{bb}) + N_{i,\mathrm{QCD}} K_i(m_{bb}) B_8(m_{bb}; \vec{p}), \qquad (1)$$
where the subscript i denotes the category, N i,Z is the expected yield for the Z boson signal; and µ Z , N i,QCD are free parameters for the signal strength and the QCD event yield. The shape of the top quark background T i (m bb ) is taken from the simulation (sum of the tt and single top quark contributions), and the expected yield N i,t is allowed to vary in the fit by 20%. The Z+jets signal shape Z i (m bb ; k JES , k JER ) is taken from the simulation and is parametrized as a Crystal Ball function [51] (Gaussian core with power-law tail) on top of a combinatorial background modeled by a polynomial. The position and the width of the Gaussian core are allowed to vary by the factors k JES and k JER , respectively, which quantify any mismatch of the jet energy scale and resolution between data and simulation and are constrained by the dedicated validation measurements of the regressed jet energy scale and resolution.
Finally, the QCD background shape in each category is described by a common, eighth-order polynomial B8(m bb ; p), whose parameters p are determined by the fit, and a multiplicative linear transfer function K i (m bb ) that accounts for the small background shape differences between the categories. The choice of the polynomial is based on an extensive bias study described in Section 9. Allowing for 20% uncertainty on the Z boson signal efficiency, the simultaneous binned maximum-likelihood fit yields a signal strength of $\mu_Z = 1.10^{+0.44}_{-0.33}$, which corresponds to an observed (expected) significance of 3.6σ (3.3σ). The fitted values of k JES and k JER are 1.01 ± 0.02 and 1.02 ± 0.10, respectively. Figure 7 shows the fitted distributions and the background-subtracted ones.
The extraction of the Z boson signal in this way validates the Higgs boson search method used in this paper by finding a known dijet resonance in a similar mass range.In addition, the simulated m bb scale and resolution are consistent with the data, based on the best-fit values of the k JES and k JER nuisance parameters, which serve to constrain the corresponding uncertainties in the Higgs boson signal extraction.
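The structure of the per-category fit model can be illustrated with a small Python sketch. It assumes the additive form suggested by the term definitions in Section 8.2 (signal strength times Z yield and shape, plus top quark yield and shape, plus QCD yield times transfer function times polynomial). The shapes are left unnormalized, the combinatorial polynomial under the Z peak is omitted, and the peak parameters (mean 91 GeV, width 10 GeV, tail parameters) are placeholder values, not fitted ones.

```python
import math


def crystal_ball(x, mean, sigma, alpha, n):
    """Unnormalized Crystal Ball: Gaussian core with a low-side power-law tail.

    Assumes alpha > 0 and n > 0.
    """
    t = (x - mean) / sigma
    if t > -alpha:
        return math.exp(-0.5 * t * t)
    a = (n / alpha) ** n * math.exp(-0.5 * alpha * alpha)
    b = n / alpha - alpha
    return a * (b - t) ** (-n)


def z_shape(m, k_jes, k_jer, mean=91.0, sigma=10.0, alpha=1.5, n=3.0):
    """Z peak whose position scales with k_JES and width with k_JER."""
    return crystal_ball(m, k_jes * mean, k_jer * sigma, alpha, n)


def polynomial(m, coeffs):
    """Evaluate sum_k coeffs[k] * m**k with Horner's scheme."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * m + c
    return result


def fit_model(m, mu_z, n_z, n_top, n_qcd, k_jes, k_jer,
              top_shape, transfer, qcd_coeffs):
    """Expected density at mass m for one category (illustrative form)."""
    return (mu_z * n_z * z_shape(m, k_jes, k_jer)
            + n_top * top_shape(m)
            + n_qcd * transfer(m) * polynomial(m, qcd_coeffs))
```

In a real fit these pieces would be normalized probability densities and the parameters would be shared across categories as described above; the sketch only shows how the signal strength, the yields, the transfer function, and the common polynomial enter.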

Search for a Higgs boson
The search for a Higgs boson follows closely the methodology applied for the extraction of the Z boson signal (Section 8.2). Namely, a multivariate discriminant is employed (Section 9.1) to divide the events into seven categories that are subsequently fit simultaneously with m bb templates (Section 9.2).

Higgs boson signal vs. background discrimination
In order to separate the overwhelmingly large QCD background from the Higgs boson signal, all discriminating features have to be used in an optimal way. This is best achieved by using a multivariate discriminant, which in this case is a BDT implemented with the TMVA package. The variables used as an input to the BDT are chosen such that they are very weakly correlated with the dynamics of the bb system, in particular with m bb , and are grouped into five distinct groups: (i) the dynamics of the VBF-jet system, expressed by ∆η qq , ∆φ qq , and m qq ; (ii) the b-jet content of the event, expressed by the CSV output for the two best b-tagged jets; (iii) the jet flavor of the event, expressed by the QGL for all four jets; (iv) the soft activity, quantified by the scalar p T sum H soft T of the additional soft TrackJets with p T > 1 GeV, and the number N soft of soft TrackJets with p T > 2 GeV; and (v) the angular dynamics of the production mechanism, expressed by the cosine of the angle between the qq and bb planes in the center-of-mass frame of the four leading jets cos θ * qq,bb . In practice, two BDTs are trained with the same input variables using the selections corresponding to the two sets of events. This distinction is necessary because the properties of the selected events are significantly different between the two selections (set A and set B). Figure 8 shows the output of the BDT for the two sets of events.
Figure caption (y axis: Events / 0.10): Data are shown by the points, while the simulated backgrounds are stacked. The LO QCD cross sections are scaled such that the total number of background events equals the number of events in data, while the VBF and GF Higgs boson signal yields are multiplied by a factor of 10 for better visibility. The panels at the bottom show the fractional difference between the data and the background simulation, with the shaded band representing the statistical uncertainties of the MC samples.

Fit of the dijet invariant mass spectrum
Taking into account the expected sensitivity of the analysis and the available number of MC events (necessary to build the various m bb templates), seven categories are defined according to the BDT output: four for set A and three for set B. The boundaries of the categories and the respective event yields are summarized in Table 3. In an m bb interval of twice the width of the Gaussian core of the signal distribution (m H = 125 GeV), the signal-over-background ratio reaches 1.7% in the most sensitive category (category 4). It should be noted that both the VBF and GF contributions are added to the Higgs boson signal, with the fraction of the latter ranging from ∼50% in category 1 to ∼7% in category 4.
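The assignment of an event to a category from its BDT output can be sketched as follows. The boundary values used here are hypothetical placeholders and are not those of Table 3; the function name and the convention that events below the lowest boundary are rejected are likewise assumptions of this sketch.

```python
from bisect import bisect_right

def assign_category(bdt_output, lower_edges, first_category=1):
    """Map a BDT output to a category number.

    lower_edges: ascending lower boundaries of the categories.
    Returns None for events below the lowest boundary.
    """
    index = bisect_right(lower_edges, bdt_output)
    if index == 0:
        return None
    return first_category + index - 1

# Hypothetical boundaries (NOT the values of Table 3)
SET_A_EDGES = [-0.5, 0.0, 0.5, 0.8]   # categories 1-4
SET_B_EDGES = [0.0, 0.4, 0.8]         # categories 5-7

category_a = assign_category(0.9, SET_A_EDGES)                    # most signal-like in set A
category_b = assign_category(0.5, SET_B_EDGES, first_category=5)  # middle category of set B
```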
The analysis relies on the assumption that the QCD m bb spectrum shape is the same in all BDT categories of the same set of events. In reality, a small correction is needed to account for residual differences between the m bb spectrum in category 1 vs. categories 2, 3, and 4, and in category 5 vs. categories 6 and 7. The correction factor (transfer function) is a linear function of m bb in set A and a quadratic one in set B (because a stronger dependence is observed in set B between m bb and the multivariate discriminant). With the introduction of the transfer functions, the fit model for the Higgs boson signal is given by Eq. (2):

$$ f_i(m_{bb}) = \mu_H \, N_{i,H} \, H_i(m_{bb}; k_{\mathrm{JES}}, k_{\mathrm{JER}}) + N_{i,Z} \, Z_i(m_{bb}; k_{\mathrm{JES}}, k_{\mathrm{JER}}) + N_{i,t} \, T_i(m_{bb}; k_{\mathrm{JES}}, k_{\mathrm{JER}}) + N_{i,\mathrm{QCD}} \, K_i(m_{bb}) \, B(m_{bb}; p_{\mathrm{set}}) \tag{2} $$

where the subscript i denotes the category, and µ H and N i,QCD are free parameters for the signal strength and the QCD event yield, respectively. N i,H, N i,Z, and N i,t are the expected yields for the Higgs boson signal, the Z+jets background, and the top quark background, respectively. The shape of the top quark background T i (m bb; k JES, k JER) is taken from the simulation (sum of the tt and single top quark contributions) and is described by a broad Gaussian. The Z/W+jets background Z i (m bb; k JES, k JER) and the Higgs boson signal H i (m bb; k JES, k JER) shapes are taken from the simulation and are parametrized as a Crystal Ball function (Gaussian core with power-law tail) on top of a polynomial. The position and the width of the Gaussian core of the MC templates (signal and background) are allowed to vary by the free factors k JES and k JER, respectively, which quantify any mismatch of the jet energy scale and resolution between data and simulation. Finally, the QCD shape is described by a polynomial B(m bb; p set), common within the categories of each set, and a multiplicative transfer function K i (m bb) per category, accounting for the shape differences between the categories. The parameters of the polynomial, p set, and those of the transfer functions are determined by the fit, which is performed simultaneously in all categories in each set. For set A, the polynomial is of fifth order, while for set B it is of fourth order.
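The structure of the per-category fit model described above (a signal term scaled by the signal strength, Z and top quark terms with expected yields, and a QCD term given by a common polynomial multiplied by a transfer function) can be sketched as follows. This is a minimal illustration under stated assumptions: the shapes are passed in as generic callables, only a Gaussian with k JES and k JER scaling is written out, the Crystal Ball parametrization is not implemented, and all function names are hypothetical.

```python
import math

def gaussian_pdf(m, mean, sigma, k_jes=1.0, k_jer=1.0):
    """Normalized Gaussian whose position is scaled by k_JES and width by k_JER."""
    mu = k_jes * mean
    s = k_jer * sigma
    return math.exp(-0.5 * ((m - mu) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

def polynomial(m, coefficients):
    """Evaluate sum_j c_j * m**j; usable for B(m_bb; p_set) or for K_i(m_bb)."""
    return sum(c * m ** j for j, c in enumerate(coefficients))

def category_model(m, mu_h, n_h, n_z, n_t, n_qcd,
                   h_shape, z_shape, t_shape, qcd_shape, transfer):
    """Expected event density at mass m in one category.

    mu_h and n_qcd play the role of the free parameters; n_h, n_z, n_t are
    expected yields; the *_shape arguments and transfer are callables of m.
    """
    return (mu_h * n_h * h_shape(m)
            + n_z * z_shape(m)
            + n_t * t_shape(m)
            + n_qcd * transfer(m) * qcd_shape(m))

# Toy evaluation with flat shapes, to make the arithmetic transparent
flat = lambda m: 1.0
toy_value = category_model(125.0, 2.0, 1.0, 3.0, 4.0, 5.0,
                           flat, flat, flat, flat, lambda m: 2.0)
```

Passing the shapes as callables keeps the sketch independent of any particular parametrization; a linear transfer function for set A would, for instance, be `lambda m: polynomial(m, [1.0, slope])` with a hypothetical `slope` parameter.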
The choices of the QCD background shape and of the event category transfer function parametrizations are fully based on data, and have been made among classes of functions (e.g. polynomials, exponentials, power laws, and their combinations) with the minimum number of degrees of freedom suited to fit the data in all categories. Each function considered is used to generate different MC pseudo-data sets, and each data set is fitted using the different functional models.
A potential bias on the signal estimation is computed for each pair of possible functions used to generate and fit to the pseudo-data sets. The background model chosen yields a maximum potential bias on the fitted signal strength of less than six times the statistical uncertainty on the background. Hence the systematic uncertainty associated with the background shape can be neglected.

Systematic uncertainties
Table 4 summarizes the sources of uncertainty related to both the background and the signal processes. The leading uncertainty comes from the QCD background description: both the parameters of its shape and the overall normalization in each category are allowed to float freely, being determined by the simultaneous fit to the data. The resulting covariance matrix is used to compute the uncertainty. For the smaller background contributions from Z/W+jets and top quark production, the m bb shapes are taken from the simulation, while their corresponding yields are allowed to float in the fit with a 30% log-normal constraint centered on the SM expectations.
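A log-normal constraint of this kind can be sketched as a penalty term added to the negative log-likelihood. The sketch assumes the common convention in which a 30% uncertainty corresponds to κ = 1.30 and the yield is written as N = N exp · κ^θ, with θ following a unit Gaussian; this convention and the function name are assumptions, not details given in the text.

```python
import math

def lognormal_penalty(n_fit, n_expected, rel_uncertainty=0.30):
    """Negative log-likelihood penalty for a log-normally constrained yield.

    Assumed convention: kappa = 1 + rel_uncertainty, n_fit = n_expected * kappa**theta,
    with theta distributed as a unit Gaussian (constant terms dropped).
    """
    kappa = 1.0 + rel_uncertainty
    theta = math.log(n_fit / n_expected) / math.log(kappa)
    return 0.5 * theta ** 2

penalty_nominal = lognormal_penalty(100.0, 100.0)  # yield at its expectation
penalty_up = lognormal_penalty(130.0, 100.0)       # yield scaled up by kappa
```

In this convention, scaling the yield up or down by a factor κ costs the same half unit of negative log-likelihood, so the constraint is symmetric in the logarithm of the yield.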
The experimental uncertainties on the jet energy scale (JES) and jet energy resolution (JER) affect the signal acceptance and the shape of the multivariate discriminant output, and are included as nuisance parameters. The effect of the JES and JER uncertainties on the m bb shape is taken into account in the fit function. By varying the JES and JER by their measured uncertainties [39], the impact on the signal yield per analysis category is estimated. These variations affect the acceptance by up to 10%, while the peak position of the m bb shape is shifted by 2%, and the width by 10%.
Additional uncertainties are assigned to the flavor tagging of the jets. The CSV and QGL discriminant outputs are shifted according to the observed agreement between data and simulation, and the effect on the signal acceptance is estimated to range from 3% to 10% for the former, and from 1% to 3% for the latter. The impact of the CSV shift is more significant, both because it is used for the event selection, and because the multivariate discriminant depends more strongly on the b tagging of the jets. The shift of QGL only affects the shape of the discriminant, and thus the distribution of signal events in the analysis categories.
The trigger uncertainty is estimated by propagating the uncertainty in the data vs. MC simulation scale factor for the efficiency. This is achieved by convolving the two-dimensional efficiency scale factor with the signal distribution. As a result, the uncertainty in the signal yield ranges from 1% to 6% for the VBF process, and from 5% to 20% for the GF.
Theoretical uncertainties affect the signal simulation. First, the uncertainty due to PDFs and strong coupling constant α S variation is computed to be 2.8% (VBF) and 7.5% (GF) [52]. A residual uncertainty from these sources is estimated for the particular kinematical phase space of the search: following the PDF4LHC prescription [53,54], the PDF and α S uncertainty ranges from 2% to 5%, while the renormalization and factorization scale variations in the signal simulation induce an uncertainty of 1% to 5% in the analysis categories, on top of a global cross section uncertainty of 0.2% (VBF) and 7.7-8.1% (GF). Finally, the variation of the UE and parton-shower (PS) model (using PYTHIA 8.1 [55] instead of the default PYTHIA 6) affects the signal acceptance by 2% to 7% (VBF) and by 10% to 45% (GF).
Lastly, an uncertainty of 2.6% is assigned to the total integrated luminosity measurement [56].

Results
The m bb distributions in data, for all categories, are fitted simultaneously with the parametric functions described in Section 9.2 under two different hypotheses: background only, and background plus a Higgs boson signal. The fit is a binned likelihood fit incorporating the systematic uncertainties discussed in Section 10 as nuisance parameters. Due to the smallness of the GF contribution in the most signal-sensitive categories, we do not attempt to fit the VBF and the GF signal strengths independently. The fits in sets A and B are shown in Figs. 9 and 10, respectively. The limits on the signal strength are computed with the asymptotic CL s

Combination with other CMS Higgs boson to b-quarks searches
The CMS experiment has also performed searches for the Higgs boson decaying to bottom quarks, where the Higgs boson is produced in association with a vector boson [14] (VH), or with a top quark pair [16,17] (ttH). The VH results have been recently updated and combined with ttH [11]. Here we combine those results with the ones from the VBF production search described in this paper. Event selection overlaps between different analyses have been checked and are either empty by construction or have negligible effects on the combination. The combination methodology is based on the likelihood ratio test statistic employed in Section 11, and takes into account correlations among sources of systematic uncertainty. Care is taken to understand the behavior of the parameters that are correlated between analyses, in terms of the fitted parameter values and uncertainties.

Summary
A search has been carried out for the SM Higgs boson produced in vector boson fusion and decaying to bb with two data samples of pp collisions at √s = 8 TeV collected with the CMS detector at the LHC, corresponding to integrated luminosities of 19.8 fb −1 and 18.3 fb −1. Upper limits, at the 95% confidence level, on the production cross section times the H → bb branching fraction, relative to expectations for a SM Higgs boson, are extracted for a Higgs boson in the mass range 115-135 GeV. In this range, the expected upper limits in the absence of a signal vary from 2.2 to 3.7 times the SM prediction, while the observed upper limits vary from 5.0 to 5.8. For a Higgs boson mass of 125 GeV, the observed and expected significances are, respectively, 2.2 and 0.8 standard deviations, and the fitted signal strength is µ = σ/σ SM = 2.8 +1.6 −1.4. This is the first search of this kind, and the only search for the SM Higgs boson in all-jet final states, at the LHC.
The combination of the results obtained in this search with other CMS H → bb searches in the VH and ttH production modes yields an H → bb signal strength µ = 1.03 +0.44 −0.42, with a signal significance of 2.6 standard deviations for m H = 125 GeV, which is consistent with the SM.

Figure 1: (a) Normalized distribution in absolute pseudorapidity difference between the two VBF-jet candidates (|∆η qq|). (b) Normalized distribution of the azimuthal difference between the two b-jet candidates (∆φ bb). The selection corresponds to set A, data are shown by the points, and the sum of all simulated backgrounds is shown by the filled histograms. The VBF Higgs boson signal is displayed by a solid line, and the GF Higgs boson signal is shown by a dashed line. The panels at the bottom show the fractional difference between data and background simulation, with the shaded band representing the statistical uncertainties in the MC samples.

Figure 2: Simulated invariant mass distribution of the two b-jet candidates before and after the jet p T regression, for VBF signal events. The generated Higgs boson signal mass is 125 GeV and the event selection corresponds to set A. By FWHM we denote the full width of the distribution at half of its maximum height.

Figure 3: Distribution in invariant mass of the two b-jet candidates, after the jet p T regression, for the events of set A. Data are shown by the points, while the simulated backgrounds are stacked. The LO QCD cross section is multiplied by a factor of 1.68 so that the total number of background events equals the number of events in the data, while the VBF and GF Higgs boson signal cross sections are multiplied by a factor of 10 for better visibility. The last bin is the sum of all the events beyond the range of the x axis (overflow). The panel at the bottom shows the fractional difference between the data and the background simulation, with the shaded band representing the statistical uncertainties in the MC samples.

Figure 4: Normalized distribution in quark-gluon likelihood discriminant of the first light-jet candidate. Quark jets are expected to have low likelihood values (closer to 0), while gluon jets are expected to have higher ones (closer to 1). The selection corresponds to set A, data are shown by the points, and the sum of all simulated backgrounds is shown by the filled histogram. The VBF Higgs boson signal is displayed by a solid line, and the GF Higgs boson signal is shown by a dashed line. The panel at the bottom shows the fractional difference between the data and the background simulation, with the shaded band representing the statistical uncertainties in the MC samples.

Figure 5: Normalized distribution of the scalar p T sum of TrackJets that are associated with the soft QCD activity (H soft T). The selection corresponds to set A, data are shown by the points, and the sum of all simulated backgrounds is shown by the filled histogram. The VBF Higgs boson signal is displayed by a solid line, and the GF Higgs boson signal is shown by a dashed line. The panel at the bottom shows the fractional difference between the data and the background simulation, with the shaded band representing the statistical uncertainties in the MC samples.

Figure 6: Normalized distribution in Z boson Fisher discriminant. Data are shown by the points, and the sum of all simulated backgrounds is shown by the filled histogram. The Z+jets signal is displayed by a solid line. The panel at the bottom shows the fractional difference between the data and the background simulation, with the shaded band representing the statistical uncertainties in the MC samples.

Figure 7: Invariant mass distribution of the two b-jet candidates for the Z boson signal in the three event categories that are based on the Z boson Fisher discriminant output, starting from the most background-like (upper left) and ending at the most signal-like (bottom). Data are shown by the points. The solid line is the sum of the postfit background and signal shapes, while the dashed line is the background component alone. The bottom panel shows the background-subtracted distribution, overlaid with the fitted signal, and with the 1σ and 2σ background uncertainty bands. The measured (simulated) parameters of the Gaussian core of the signal shape in category 3 are 97.7 (96.6) GeV and 9.3 (9.1) GeV for the mean and the sigma, respectively.

Figure 8: Distribution of the BDT output for the events of set A (a) and set B (b). Data are shown by the points, while the simulated backgrounds are stacked. The LO QCD cross sections are scaled such that the total number of background events equals the number of events in data, while the VBF and GF Higgs boson signal yields are multiplied by a factor of 10 for better visibility. The panels at the bottom show the fractional difference between the data and the background simulation, with the shaded band representing the statistical uncertainties of the MC samples.

Figure 9: Fit of the invariant mass of the two b-jet candidates for the Higgs boson signal (m H = 125 GeV) in the four event categories of set A. Data are shown by the points. The solid line is the sum of the postfit background and signal shapes, the dashed line is the background component, and the dashed-dotted line is the QCD component alone. The bottom panel shows the background-subtracted distribution, overlaid with the fitted signal, and with the 1σ and 2σ background uncertainty bands.

Figure 10: Fit of the invariant mass of the two b-jet candidates for the Higgs boson signal (m H = 125 GeV) in the three event categories of set B. Data are shown by the points. The solid line is the sum of the postfit background and signal shapes, the dashed line is the background component, and the dashed-dotted line is the QCD component alone. The bottom panel shows the background-subtracted distribution, overlaid with the fitted signal, and with the 1σ and 2σ background uncertainty bands.

Figure 11: Expected and observed 95% confidence level limits on the signal cross section in units of the SM expected cross section, as a function of the Higgs boson mass, including all event categories. The limits expected in the presence of a SM Higgs boson with a mass of 125 GeV are indicated by the dotted curve.

Table 1: Summary of selection requirements for the two analyses.

Table 2: Definition of the event categories for the Z boson signal extraction and corresponding yields in the m bb interval [60, 170] GeV.

Table 3: Definition of the event categories and corresponding yields in the m bb interval [80, 200] GeV, for the data and the MC expectation. The BDT output boundary values refer to the distributions shown in Fig. 8.

Table 4: Sources of systematic uncertainty and their impact on the shape and normalization of the background and signal processes.

H = 115 GeV to 5.8 (3.7) at m H = 135 GeV, together with the expected limits in the presence of a SM Higgs boson with a mass of 125 GeV. For the 125 GeV Higgs boson signal the observed (expected) significance is 2.2 (0.8) standard deviations, and the fitted signal strength is µ = σ/σ SM = 2.8 +1.6 −1.4. The measured signal strength is compatible with the SM Higgs boson prediction µ = 1 at the 8% level.
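For orientation, a significance quoted in standard deviations can be translated into a p-value and back. The sketch below assumes the one-sided Gaussian tail convention that is commonly used for such significances; the convention and the function names are assumptions of this illustration and are not specified in the text.

```python
import math
from statistics import NormalDist

def p_value_from_significance(z):
    """One-sided Gaussian tail probability for a significance of z standard deviations."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def significance_from_p_value(p):
    """Inverse conversion: significance corresponding to a one-sided p-value."""
    return NormalDist().inv_cdf(1.0 - p)

p_observed = p_value_from_significance(2.2)  # observed significance quoted above
p_expected = p_value_from_significance(0.8)  # expected significance quoted above
```

Under this convention, the observed significance of 2.2 standard deviations corresponds to a p-value of about 1.4%.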

Table 5 lists the 95% CL expected and observed upper limits and the best-fit signal strength values from the individual channels and from the combined fit. For m H = 125 GeV the combination yields an H → bb signal strength µ = 1.03 +0.44 −0.42 with a significance of 2.6 standard deviations.

Table 5: Observed and expected 95% CL limits, best-fit values of the signal strength parameter µ = σ/σ SM, and signal significances for m H = 125 GeV, for each H → bb channel and their combination.