Observation of $\mathrm{t\overline{t}}$H production

The observation of Higgs boson production in association with a top quark-antiquark pair is reported, based on a combined analysis of proton-proton collision data at center-of-mass energies of $\sqrt{s}=$ 7, 8, and 13 TeV, corresponding to integrated luminosities of up to 5.1, 19.7, and 35.9 fb$^{-1}$, respectively. The data were collected with the CMS detector at the CERN LHC. The results of statistically independent searches for Higgs bosons produced in conjunction with a top quark-antiquark pair and decaying to pairs of W bosons, Z bosons, photons, $\tau$ leptons, or bottom quark jets are combined to maximize sensitivity. An excess of events is observed, with a significance of 5.2 standard deviations, over the expectation from the background-only hypothesis. The corresponding expected significance from the standard model for a Higgs boson mass of 125.09 GeV is 4.2 standard deviations. The combined best fit signal strength normalized to the standard model prediction is 1.26 ${^{+0.31}_{-0.26}}$.


Proton-proton (pp) collisions at the CERN LHC, at center-of-mass (CM) energies of $\sqrt{s}=$ 7, 8, and 13 TeV, have allowed direct measurements of the properties of the Higgs boson [1-3]. In particular, the 13 TeV data collected so far by the ATLAS [4] and CMS [5] experiments have led to improved constraints on the couplings of the Higgs boson compared to those obtained at the lower energies [6], permitting more precise consistency checks with the predictions of the standard model (SM) of particle physics [7-9]. Nonetheless, not all properties of the Higgs boson have been established, in part because of insufficiently large data sets. The lack of statistical precision can be partially overcome by combining the results of searches in different decay channels of the Higgs boson and at different CM energies. Among the properties that are not yet well established is the tree-level coupling of Higgs bosons to top quarks.
In this Letter, we present a combination of searches for the Higgs boson (H) produced in association with a top quark-antiquark pair ($\mathrm{t\overline{t}}$), based on data collected with the CMS detector. Results from data collected at $\sqrt{s}=$ 13 TeV [10-14] are combined with analogous results from $\sqrt{s}=$ 7 and 8 TeV [15]. As a result of this combination, we establish the observation of $\mathrm{t\overline{t}}$H production. This constitutes the first confirmation of the tree-level coupling of the Higgs boson to top quarks.
A top quark decays almost exclusively to a bottom quark and a W boson, with the W boson subsequently decaying either to a quark and an antiquark or to a charged lepton and its associated neutrino. The Higgs boson exhibits a rich spectrum of decay modes that includes the decay to a bottom quark-antiquark pair, a $\tau^+\tau^-$ lepton pair, a photon pair, and combinations of quarks and leptons from the decay of intermediate on- or off-shell W and Z bosons. Thus, $\mathrm{t\overline{t}}$H production gives rise to a wide variety of final-state event topologies, which we consider in our analyses and in the combination of results presented below.
In the SM, the masses of elementary fermions are accounted for by introducing a minimal set of Yukawa interactions, compatible with gauge invariance, between the Higgs and fermion fields. Following the spontaneous breaking of electroweak symmetry [16-21], charged fermions of flavor $f$ couple to H with a strength $y_f$ proportional to the mass $m_f$ of those fermions, namely $y_f = m_f/v$, where $v \approx 246$ GeV is the vacuum expectation value of the Higgs field. Measurements of the Higgs boson decay rates to down-type fermions (τ leptons and bottom quarks) agree with the SM predictions within their uncertainties [22,23]. However, the top quark Yukawa coupling ($y_\mathrm{t}$) cannot be similarly tested from the measurement of a decay rate since on-shell top quarks are too heavy to be produced in Higgs boson decay. Instead, constraints on $y_\mathrm{t}$ can be obtained through the measurement of the pp → $\mathrm{t\overline{t}}$H production process. Example tree-level Feynman diagrams for this process are shown in Fig. 1. To date, $\mathrm{t\overline{t}}$H production has eluded definite observation, although first evidence has been recently reported by the ATLAS [24] and CMS [10] Collaborations.
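As a purely illustrative aside, the relation $y_f = m_f/v$ and the kinematic argument about the top quark can be checked numerically. The sketch below uses the value of $v$ and the Higgs boson mass quoted in this Letter; the fermion masses are rounded values assumed for illustration and are not taken from the text, and the function names are hypothetical.

```python
# Illustrative only: evaluates y_f = m_f / v as written in the text.
V_GEV = 246.0          # vacuum expectation value quoted in the text
M_HIGGS_GEV = 125.09   # Higgs boson mass used in the text

# Assumed, rounded fermion masses in GeV (not taken from the text).
ASSUMED_MASSES_GEV = {"top": 172.5, "bottom": 4.18, "tau": 1.777}


def yukawa(mass_gev, v_gev=V_GEV):
    """Return the coupling strength y_f = m_f / v."""
    return mass_gev / v_gev


def decay_to_pair_allowed(mass_gev, m_higgs_gev=M_HIGGS_GEV):
    """True if an on-shell fermion-antifermion pair fits within the Higgs mass."""
    return 2.0 * mass_gev < m_higgs_gev


for name, mass in ASSUMED_MASSES_GEV.items():
    print(f"{name}: y = {yukawa(mass):.4f}, "
          f"on-shell H -> pair allowed: {decay_to_pair_allowed(mass)}")
```

With these assumed masses, only the top quark fails the on-shell condition, which is why its coupling has to be probed through a production process rather than a decay rate.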
The overall agreement observed between the SM predictions and data for the rate of Higgs boson production through gluon-gluon fusion and for the H → γγ decay mode [6] suggests that the Higgs boson coupling to top quarks is SM-like, since the quantum loops in these processes include top quarks. However, non-SM particles in the loops could introduce terms that compensate for, and thus mask, other deviations from the SM. A measurement of the production rate of the tree-level $\mathrm{t\overline{t}}$H process can provide evidence for, or against, such new-physics contributions.
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter, and a brass and scintillator hadron calorimeter, each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid. A detailed description of the CMS detector can be found in Ref. [5].
Events of interest are selected using a two-tiered trigger system [25] based on custom hardware processors and a farm of commercial processors running a version of the full reconstruction software optimized for speed. Offline, a particle-flow algorithm [26] is used to reconstruct and identify each particle in an event based on a combination of information from the various CMS subdetectors. Additional identification criteria are employed to improve purities and define the final samples of candidate electrons, muons, hadronically decaying τ leptons ($\tau_\mathrm{h}$) [27,28], and photons. Jets are reconstructed from particle-flow candidates using the anti-$k_\mathrm{T}$ clustering algorithm [29] implemented in the FASTJET package [30]. Multivariate algorithms [31,32] are used to identify (tag) jets arising from the hadronization of bottom quarks (b jets) and discriminate against gluon and light flavor quark jets. The algorithms utilize observables related to the long lifetimes of hadrons containing b quarks and the relatively larger particle multiplicity and mass of b jets compared to light flavor quark jets. The $\tau_\mathrm{h}$ identification is based on the reconstruction of the hadronic τ decay modes $\tau^- \to h^-\nu_\tau$, $h^-\pi^0\nu_\tau$, $h^-\pi^0\pi^0\nu_\tau$, and $h^-h^+h^-\nu_\tau$ (plus the charge conjugate reactions), where $h^\pm$ denotes either a charged pion or kaon. More details about the reconstruction procedures are given in Refs. [10-15].
The 13 TeV data employed for the current study were collected in 2016 and correspond to an integrated luminosity of up to 35.9 fb$^{-1}$ [33]. The 7 and 8 TeV data, collected in 2011 and 2012, correspond to integrated luminosities of up to 5.1 and 19.7 fb$^{-1}$ [34], respectively. The 13 TeV analyses are improved relative to the 7 and 8 TeV studies in that they employ triggers with higher efficiencies, contain improvements in the reconstruction and background-rejection methods, and use more precise theory calculations to describe the signal and the background processes. For the 7, 8, and 13 TeV data, the theoretical calculations of Ref. [35] for Higgs boson production cross sections and branching fractions are used to normalize the expected signal yields.
The event samples are divided into exclusive categories depending on the multiplicity and kinematic properties of reconstructed electrons, muons, $\tau_\mathrm{h}$ candidates, photons, jets, and tagged b jets in an event. Samples of simulated events based on Monte Carlo event generators, with simulation of the detector response based on the GEANT4 [36] suite of programs, are used to evaluate the detector acceptance and optimize the event selection for each category. In the analysis of data, the background is, in general, evaluated from data control regions. When this is not feasible, either because the background process has a very small cross section or a control region depleted of signal events cannot be identified, the background is evaluated from simulation with a systematic uncertainty assigned to account for the known model dependence. Multivariate algorithms [37-41] based on deep neural networks, boosted decision trees, and matrix element calculations are used to reduce backgrounds.
Figure 2: Best fit value of the $\mathrm{t\overline{t}}$H signal strength modifier $\mu_{\mathrm{t\overline{t}H}}$, with its 1 and 2 standard deviation confidence intervals (σ), for (upper section) the five individual decay channels considered, (middle section) the combined result for 7+8 TeV alone and for 13 TeV alone, and (lower section) the overall combined result. The Higgs boson mass is taken to be 125.09 GeV. For the H → ZZ* decay mode, $\mu_{\mathrm{t\overline{t}H}}$ is constrained to be positive to prevent the corresponding event yield from becoming negative. The SM expectation is shown as a dashed vertical line.
At 13 TeV, we search for $\mathrm{t\overline{t}}$H production in the H → $\mathrm{b\overline{b}}$ decay mode by selecting events with at least three tagged b jets and with zero leptons [11], one lepton [12], or an opposite-sign lepton pair [12], where "lepton" refers to an electron or muon candidate. A search for $\mathrm{t\overline{t}}$H production in the H → γγ decay mode is performed in events with two reconstructed photons in combination with reconstructed electrons or muons, jets, and tagged b jets [13]. The signal yield is extracted from a fit to the diphoton invariant mass spectrum. Events with combinations of jets and tagged b jets and with two same-sign leptons, three leptons, or four leptons are used to search for $\mathrm{t\overline{t}}$H production in the H → $\tau^+\tau^-$, WW*, or ZZ* decay modes [10,14], where in this case "lepton" refers to an electron, muon, or $\tau_\mathrm{h}$ candidate (the asterisk denotes an off-shell particle). The searches in the different decay channels are statistically independent from each other. Analogous searches have been performed with the 7 and 8 TeV data [15].
The presence of a $\mathrm{t\overline{t}}$H signal is assessed by performing a simultaneous fit to the data from the different decay modes, and also from the different CM energies as described below. A detailed description of the statistical methods can be found in Ref. [42]. The test statistic $q$ is defined as the negative of twice the logarithm of the profile likelihood ratio [42]. Systematic uncertainties are incorporated through the use of nuisance parameters treated according to the frequentist paradigm. The ratio between the normalization of the $\mathrm{t\overline{t}}$H production process and its SM expectation [35], defined as the signal strength modifier $\mu_{\mathrm{t\overline{t}H}}$, is a freely floating parameter in the fit. The SM expectation is evaluated assuming the combined ATLAS and CMS value for the mass of the Higgs boson, which is 125.09 GeV [43]. We consider the five Higgs boson decay modes with the largest expected event yields, namely H → WW*, ZZ*, γγ, $\tau^+\tau^-$, and $\mathrm{b\overline{b}}$. Other Higgs boson decay modes and production processes, including pp → tH + X (or $\mathrm{\overline{t}}$H + X), with X a light flavor quark or W boson, are treated as backgrounds and normalized using the predicted SM cross sections, subject to the corresponding uncertainties.
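The definition of the test statistic can be made concrete with a toy example. The sketch below is a deliberately simplified illustration and not the fit used in this analysis: it assumes independent Poisson-distributed bins with known background and no nuisance parameters, so the profile likelihood ratio reduces to a plain likelihood ratio. The function names, the fit range for the signal strength, and the toy yields are all hypothetical.

```python
import math


def nll(mu, n_obs, s_exp, b_exp):
    """Negative log-likelihood for independent Poisson bins (constants dropped)."""
    total = 0.0
    for n, s, b in zip(n_obs, s_exp, b_exp):
        lam = mu * s + b              # expected yield in this bin
        total += lam - n * math.log(lam)
    return total


def fit_mu(n_obs, s_exp, b_exp, lo=0.0, hi=10.0, iters=200):
    """Best fit signal strength by ternary search (the NLL is convex in mu).

    Restricting mu to [lo, hi] with lo >= 0 keeps every expected yield positive.
    """
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if nll(m1, n_obs, s_exp, b_exp) < nll(m2, n_obs, s_exp, b_exp):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def q_mu(mu, n_obs, s_exp, b_exp):
    """q(mu) = -2 ln [ L(mu) / L(mu_hat) ], clipped at zero against rounding."""
    mu_hat = fit_mu(n_obs, s_exp, b_exp)
    return max(0.0, 2.0 * (nll(mu, n_obs, s_exp, b_exp)
                           - nll(mu_hat, n_obs, s_exp, b_exp)))


# Toy yields (hypothetical): three bins with increasing signal purity.
signal = [2.0, 5.0, 8.0]
background = [50.0, 20.0, 6.0]
observed = [s + b for s, b in zip(signal, background)]  # data equal to the mu = 1 expectation

print("mu_hat =", round(fit_mu(observed, signal, background), 4))
print("q(0)   =", round(q_mu(0.0, observed, signal, background), 3))
```

In the real analysis each evaluation of the likelihood also involves a fit of the nuisance parameters, which this sketch omits.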
The measured values of the five independent signal strength modifiers, corresponding to the five decay channels considered, are shown in the upper section of Fig. 2 along with their 1 and 2 standard deviation confidence intervals obtained in the asymptotic approximation [44]. Numerical values are given in Table 1. The individual measurements are seen to be consistent with each other within the uncertainties.
Table 1: Best fit value, with its uncertainty, of the $\mathrm{t\overline{t}}$H signal strength modifier $\mu_{\mathrm{t\overline{t}H}}$, for the five individual decay channels considered, the combined result for 7+8 TeV alone and for 13 TeV alone, and the overall combined result. The total uncertainties are decomposed into their statistical (Stat), experimental systematic (Expt), background theory systematic (Thbgd), and signal theory systematic (Thsig) components. The numbers in parentheses are those expected for $\mu_{\mathrm{t\overline{t}H}} = 1$.

We also perform a combined fit, using a single signal strength modifier $\mu_{\mathrm{t\overline{t}H}}$, that simultaneously scales the $\mathrm{t\overline{t}}$H production cross sections of the five decay channels considered, with all Higgs boson branching fractions fixed to their SM values [35]. Besides the five decay modes considered, the signal normalizations for the Higgs boson decay modes to gluons, charm quarks, and Zγ, which are subleading and cannot be constrained with existing data, are also scaled by $\mu_{\mathrm{t\overline{t}H}}$. The results combining the decay modes at 7+8 TeV, and separately at 13 TeV, are shown in the middle section of Fig. 2. The overall result, combining all decay modes and all CM energies, is shown in the lower section, with numerical values given in Table 1. Table 1 includes a breakdown of the total uncertainties into their statistical and systematic components. The overall result is $\mu_{\mathrm{t\overline{t}H}} = 1.26\,{^{+0.31}_{-0.26}}$. The principal sources of experimental systematic uncertainty in the overall result for $\mu_{\mathrm{t\overline{t}H}}$ stem from the uncertainty in the lepton and b jet identification efficiencies and in the $\tau_\mathrm{h}$ and jet energy scales. The background theory systematic uncertainty is dominated by modeling uncertainties in $\mathrm{t\overline{t}}$ production in association with a W boson, a Z boson, or a pair of b or c quark jets. The dominant contribution to the signal theory systematic uncertainty arises from the finite accuracy in the SM prediction for the $\mathrm{t\overline{t}}$H cross section because of missing higher order terms and uncertainties in the proton parton density functions [35].
To highlight the excess of data over the expectation from the background-only hypothesis, we classify each event that enters the combined fit by the ratio S/B, where S and B are the expected post-fit signal (with $\mu_{\mathrm{t\overline{t}H}} = 1$) and background yields, respectively, in each bin of the distributions considered in the combination. The distribution of $\log_{10}(S/B)$ is shown in Fig. 3.
The main sensitivity at high values of S/B is given by events selected in the H → γγ analysis with a diphoton mass around 125 GeV and by events selected in the H → $\tau^+\tau^-$, H → WW*, and H → $\mathrm{b\overline{b}}$ analyses with high values of the multivariate discriminating variables used for the signal extraction. A broad excess of events in the rightmost bins of this distribution is observed, consistent with the expectation for $\mathrm{t\overline{t}}$H production with a SM-like cross section.
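The classification of events by S/B described above amounts to regrouping the bins of all input distributions according to their expected signal purity. The following sketch illustrates the bookkeeping under simple assumptions: each input bin is represented by a hypothetical tuple of expected signal, expected background, and observed count, with both expectations strictly positive; values outside the chosen range are merged into the first or last output bin. The function name, the bin edges, and the toy numbers are illustrative and do not come from the analysis.

```python
import bisect
import math


def rebin_by_log_sb(bins, edges):
    """Group input bins by log10(S/B).

    bins  : iterable of (s, b, n_obs) with s > 0 and b > 0
    edges : ascending bin edges in log10(S/B)
    Returns one dict per output bin with summed s, b, and observed counts.
    """
    merged = [{"s": 0.0, "b": 0.0, "n": 0} for _ in range(len(edges) - 1)]
    for s, b, n_obs in bins:
        x = math.log10(s / b)
        idx = bisect.bisect_right(edges, x) - 1
        idx = min(max(idx, 0), len(merged) - 1)   # clamp under/overflow
        merged[idx]["s"] += s
        merged[idx]["b"] += b
        merged[idx]["n"] += n_obs
    return merged


# Toy input bins (hypothetical): (expected signal, expected background, observed)
toy_bins = [(0.5, 100.0, 97), (0.3, 10.0, 11), (1.5, 10.0, 12), (3.0, 2.0, 6)]
toy_edges = [-3.0, -2.0, -1.0, 0.0]

for lo, hi, content in zip(toy_edges, toy_edges[1:],
                           rebin_by_log_sb(toy_bins, toy_edges)):
    print(f"[{lo}, {hi}): S={content['s']}, B={content['b']}, data={content['n']}")
```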
The value of the test statistic $q$ as a function of $\mu_{\mathrm{t\overline{t}H}}$ is shown in Fig. 4, with $\mu_{\mathrm{t\overline{t}H}}$ based on the combination of decay modes described above for the combined fit. The results are shown for the combination of all decay modes at 7+8 TeV and at 13 TeV, separately, and for all decay modes at all CM energies. To quantify the significance of the measured $\mathrm{t\overline{t}}$H yield, we compute the probability of the background-only hypothesis (p-value) as the tail integral of the test statistic using the overall combination evaluated at $\mu_{\mathrm{t\overline{t}H}} = 0$ under the asymptotic approximation [45]. This corresponds to a significance of 5.2 standard deviations for a one-tailed Gaussian distribution. The expected significance for a SM Higgs boson with a mass of 125.09 GeV, evaluated through use of an Asimov data set [45], is 4.2 standard deviations.
Figure 4: The test statistic $q$, described in the text, as a function of $\mu_{\mathrm{t\overline{t}H}}$ for all decay modes at 7+8 TeV and at 13 TeV, separately, and for all decay modes at all CM energies. The expected SM result for the overall combination is also shown. The horizontal dashed lines indicate the p-values for the background-only hypothesis obtained from the asymptotic distribution of $q$, expressed in units of the number of standard deviations.
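The relation between the p-value and the number of standard deviations for a one-tailed Gaussian distribution can be sketched as follows. The conversion from the test statistic at zero signal strength to a significance uses the standard asymptotic result $Z = \sqrt{q_0}$, and the single-bin Asimov expression is the corresponding closed form for one Poisson bin with known background; both are textbook simplifications shown for illustration, not the full multi-channel computation performed here. The function names and the toy yields are hypothetical.

```python
import math


def p_value_from_z(z):
    """One-tailed Gaussian tail probability for a significance of z sigma."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def z_from_q0(q0):
    """Asymptotic significance from the test statistic evaluated at mu = 0."""
    return math.sqrt(max(q0, 0.0))


def asimov_z_single_bin(s, b):
    """Expected significance for one Poisson bin with known background b.

    Obtained by evaluating q0 on the Asimov data set n = s + b.
    """
    return math.sqrt(2.0 * ((s + b) * math.log(1.0 + s / b) - s))


# Significances quoted in the text.
for z in (5.2, 4.2):
    print(f"Z = {z}: p = {p_value_from_z(z):.3e}")

# Toy single-bin example (hypothetical yields).
print("Asimov Z for s=10, b=100:", round(asimov_z_single_bin(10.0, 100.0), 3))
```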
In summary, we have reported the observation of $\mathrm{t\overline{t}}$H production with a significance of 5.2 standard deviations above the background-only hypothesis, at a Higgs boson mass of 125.09 GeV. The measured production rate is consistent with the standard model prediction within one standard deviation. In addition to comprising the first observation of a new Higgs boson production mechanism, this measurement establishes the tree-level coupling of the Higgs boson to the top quark, and hence to an up-type quark.

Figure 1: Example tree-level Feynman diagrams for the pp → $\mathrm{t\overline{t}}$H production process, with g a gluon, q a quark, t a top quark, and H a Higgs boson. For the present study, we consider Higgs boson decays to a pair of W bosons, Z bosons, photons, τ leptons, or bottom quark jets.

Figure 3: Distribution of events as a function of the decimal logarithm of S/B, where S and B are the expected post-fit signal (with $\mu_{\mathrm{t\overline{t}H}} = 1$) and background yields, respectively, in each bin of the distributions considered in this combination. The shaded histogram shows the expected background distribution. The two hatched histograms, each stacked on top of the background histogram, show the signal expectation for the SM ($\mu_{\mathrm{t\overline{t}H}} = 1$) and the observed ($\mu_{\mathrm{t\overline{t}H}} = 1.26$) signal strengths. The lower panel shows the ratios of the expected signal and observed results relative to the expected background.