Measurements of ttH Production and the CP Structure of the Yukawa Interaction between the Higgs Boson and Top Quark in the Diphoton Decay Channel

The first observation of the ttH process in a single Higgs boson decay channel with the full reconstruction of the final state (H → γγ) is presented, with a significance of 6.6 standard deviations (σ). The CP structure of Higgs boson couplings to fermions is measured, resulting in an exclusion of the pure CP-odd structure of the top Yukawa coupling at 3.2σ. The measurements are based on a sample of proton-proton collisions at a center-of-mass energy $\sqrt{s} = 13$ TeV collected by the CMS detector at the LHC, corresponding to an integrated luminosity of 137 fb$^{-1}$. The cross section times branching fraction of the ttH process is measured to be $\sigma_{ttH} \mathcal{B}_{\gamma\gamma} = 1.56^{+0.34}_{-0.32}$ fb, which is compatible with the standard model prediction of $1.13^{+0.08}_{-0.11}$ fb. The fractional contribution of the CP-odd component is measured to be $f_{CP}^{Htt} = 0.00 \pm 0.33$.

Since its observation [1][2][3], the properties of the Higgs boson (H) have been studied using a variety of decay channels and production modes. Among these properties, the tree-level top quark Yukawa (Htt) coupling and its CP structure can be tested by studying H production in association with a top quark-antiquark pair (ttH). The CMS [4] and ATLAS [5] Collaborations reported the observation of the ttH process by combining several H decay channels, with a cross section compatible with the standard model (SM) expectation. One of the most sensitive channels for probing the ttH process is H → γγ. By probing the interaction between the H and vector bosons, CMS [6][7][8][9][10][11][12][13] and ATLAS [14][15][16][17][18][19] have determined that the H quantum numbers are consistent with $J^{PC} = 0^{++}$. However, small anomalous contributions were not excluded, and studies of the Htt coupling provide an alternative and independent path for CP tests in the Higgs sector [20][21][22].
This Letter reports on the measurement of the production rate of ttH with H → γγ, giving the first observation of the tree-level Htt coupling in a single H decay channel, along with a first test of its CP structure. Results are based on data from proton-proton (pp) collisions at a center-of-mass energy of $\sqrt{s} = 13$ TeV collected with the CMS detector at the LHC between 2016 and 2018, corresponding to an integrated luminosity of 137 fb$^{-1}$.
The central feature of the CMS apparatus [23] is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Inside the solenoid there is a silicon tracker, a lead-tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter. Forward calorimeters extend the coverage to higher pseudorapidity (η), and muon detectors are embedded in the flux-return yoke of the solenoid.
The particle-flow (PF) algorithm [24] reconstructs individual particles (photons, charged and neutral hadrons, muons, and electrons) by combining information from all detectors. Jets are built from PF particles with the anti-$k_T$ algorithm [25,26] with a distance parameter of 0.4. The missing transverse momentum ($p_T^{\text{miss}}$) is defined as the negative vector sum of the transverse momenta ($p_T$) of all PF particles. The primary pp interaction vertex is taken as the vertex with the largest value of summed physics object $p_T^2$ [27]. Charged hadrons originating from additional pp interactions are removed from the analysis. Jets from the hadronization of bottom quarks are tagged by a secondary vertex algorithm based on the score from a deep neural network (DNN) [28].
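The definition of the missing transverse momentum as a negative vector sum can be sketched in a few lines. This is only an illustration: the function name `missing_pt` and the representation of each PF particle as a `(pt, phi)` pair are assumptions made here, not part of the CMS software.

```python
import math

def missing_pt(particles):
    """Return (magnitude, phi) of the negative vector sum of transverse momenta.

    `particles` is an iterable of (pt, phi) pairs; this input layout is a
    hypothetical choice made for the example.
    """
    px = -sum(pt * math.cos(phi) for pt, phi in particles)
    py = -sum(pt * math.sin(phi) for pt, phi in particles)
    return math.hypot(px, py), math.atan2(py, px)

# Two back-to-back particles with unequal pT leave a 10 GeV imbalance
# pointing opposite to the harder particle.
met, met_phi = missing_pt([(50.0, 0.0), (40.0, math.pi)])
```

In this toy event the imbalance points along the direction of the softer particle, as expected when something recoiling against the harder one escapes detection.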
Signal and background processes are generated with several Monte Carlo (MC) programs. All H production processes are modeled with MADGRAPH5_aMC@NLO 2.4.2 at next-to-leading order (NLO) [29] in quantum chromodynamics (QCD), with cross sections and decay branching fractions taken from Ref. [30]. A separate ttH sample, generated with POWHEG 2.0 [31][32][33][34] at NLO in QCD, is used to increase the number of events used for training the multivariate discriminants described below. For the CP study, ttH anomalous coupling samples of CP-odd, CP-even, and a mixture of the two are generated at leading order (LO) with JHUGEN 7.0.2 [22,35-37] and reweighted with the MELA matrix element library [22,35-37]. JHUGEN 7.0.2 and MELA are also used for the study of CP effects in the tH process. The MADGRAPH5_aMC@NLO program is also used to generate most background processes, e.g., tt + γγ, tt + γ, tt + jets, γ + jets, V + γ, Drell-Yan, diboson, and t + V, where V is a W or a Z boson. In contrast, the diphoton background (γγ + jets) is generated with SHERPA 2.2.4 [38], which includes tree-level processes with up to three additional jets, as well as box processes at LO accuracy. In all MC samples, the parton fragmentation and hadronization as well as the underlying events are modeled with PYTHIA 8.205 [39], with the CUETP8M1 [40] and CP5 [41] tunes used for the simulation of 2016 and 2017-2018 data, respectively. Finally, the detector response is simulated with the GEANT4 package [42].
The trigger [43] selects diphoton events with a loose calorimetric identification [44] and asymmetric photon transverse energy ($E_T$) thresholds of 30 and 18 (22) GeV for the data collected during 2016 (2017-2018). The trigger efficiency is >95% and is measured as a function of $E_T$, η, and $R_9$ of the photons using an alternative trigger, where $R_9$ is the energy sum of the 3 × 3 crystals centered around the most energetic crystal in the cluster divided by the energy of the photon.
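The shower-shape variable defined above can be illustrated with a toy calculation. The sketch below assumes the cluster is given as a small rectangular grid of crystal energies and that the photon energy is supplied by the caller; the function name `r9` and the grid layout are hypothetical, and the real reconstruction applies energy corrections that are not modeled here.

```python
def r9(crystals, photon_energy):
    """Energy in the 3x3 crystals around the most energetic one, over photon energy.

    `crystals` is a list of rows of crystal energies (hypothetical layout).
    """
    # Locate the most energetic (seed) crystal.
    seed_r, seed_c = max(
        ((r, c) for r in range(len(crystals)) for c in range(len(crystals[r]))),
        key=lambda rc: crystals[rc[0]][rc[1]],
    )
    # Sum the 3x3 block centered on the seed, clipped at the grid edges.
    e3x3 = sum(
        crystals[r][c]
        for r in range(max(seed_r - 1, 0), min(seed_r + 2, len(crystals)))
        for c in range(max(seed_c - 1, 0), min(seed_c + 2, len(crystals[r])))
    )
    return e3x3 / photon_energy

# Toy 5x5 cluster (energies in GeV); here the photon energy is taken to be
# the plain sum of all crystals, which is a simplifying assumption.
grid = [
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 2.0, 5.0, 2.0, 0.0],
    [1.0, 5.0, 60.0, 5.0, 1.0],
    [0.0, 2.0, 5.0, 2.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
]
cluster_energy = sum(sum(row) for row in grid)
r9_value = r9(grid, cluster_energy)
```

Values close to 1 correspond to compact showers in which nearly all of the energy sits in the central 3 × 3 block.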
H candidates are built from pairs of photon candidates, which are reconstructed from energy clusters in the ECAL not linked to charged-particle tracks (with the exception of converted photons). The photon energies are corrected for the containment of electromagnetic showers in the clustered crystals and the energy losses of converted photons with a multivariate regression technique based on simulation [44]. The ECAL energy scale in data is corrected using Z → e+e− simulated events smeared to reproduce the energy resolution measured in data. The off-line diphoton selection criteria are similar to, but more stringent than, those used in the trigger [44].
Photons are further required to satisfy a loose identification (photon ID [44]) criterion based on a boosted decision tree (BDT) classifier trained to separate photons from jets. Inputs to photon ID such as shower shape and isolation variables in simulation are corrected with a chained quantile regression method [45] based on studies of Z → e+e− events. Each variable is corrected with a separately trained BDT, taking the photon kinematic properties, per event energy density, and the previously corrected features as inputs to ensure that correlations between the inputs are preserved and closer to those in data. This method improves the modeling of the photon ID BDT discriminant in MC simulation with respect to the previous CMS H → γγ results [44].
After the preselection described above, we require $100 < m_{\gamma\gamma} < 180$ GeV and $p_T/m_{\gamma\gamma} > 1/3$ and $1/4$ for the leading (in $p_T$) and subleading photons, respectively, and then divide events into two channels. The leptonic channel is aimed at selecting events where at least one top quark decays leptonically and demands the presence of ≥1 jet with $p_T > 25$ GeV and $|\eta| < 2.4$, and ≥1 isolated e or μ with $p_T > 10$ GeV for electrons, $p_T > 5$ GeV for muons, and $|\eta| < 2.4$. The hadronic channel targets tt hadronic decays by requiring at least three jets, at least one b-tagged jet, and no isolated leptons (e/μ).
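The selection logic described above can be summarized in a short sketch. The function names and the flat argument lists are hypothetical, and the jet, b-tag, and lepton counts are assumed to have been computed already with the $p_T$, η, and isolation requirements quoted in the text.

```python
def passes_diphoton_selection(m_gg, lead_pt, sublead_pt):
    """Mass window and scaled-pT requirements on the photon pair (GeV)."""
    return (
        100.0 < m_gg < 180.0
        and lead_pt / m_gg > 1.0 / 3.0
        and sublead_pt / m_gg > 1.0 / 4.0
    )

def assign_channel(n_jets, n_btag, n_leptons):
    """Return 'leptonic', 'hadronic', or None for events in neither channel.

    Counts are assumed to include only objects that already satisfy the
    kinematic and isolation thresholds quoted in the text.
    """
    if n_leptons >= 1 and n_jets >= 1:
        return "leptonic"
    if n_leptons == 0 and n_jets >= 3 and n_btag >= 1:
        return "hadronic"
    return None
```

For example, an event with $m_{\gamma\gamma} = 125$ GeV, photon $p_T$ of 50 and 35 GeV, four jets of which one is b tagged, and no leptons would pass the diphoton selection and enter the hadronic channel.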
A dedicated BDT discriminant ("BDT-bkg") is employed in each channel to distinguish between ttH and background events. These BDTs are trained with the XGBOOST [46] framework on signal and background MC samples, with one exception as noted below. The background MC samples include γ + jets, γγ + jets, tt + jets, tt + γ, tt + γγ, Z + γ, and W + γ processes, as well as a variety of other rarer backgrounds. Non-ttH production modes of H are also treated as background. The dominant background in the hadronic channel consists of γ + jets events, where one jet is misidentified as a photon. To improve the performance of the hadronic BDT-bkg, the γ + jets background is modeled from a large sample of data events with one photon candidate failing the photon ID requirement; these are almost exclusively multijet and γ + jets events. For each such event, the photon ID value of the misidentified jet is replaced by a value drawn from the MC distribution of photon ID values of misidentified jets passing the photon ID requirement. These events, appropriately weighted, are then used in the hadronic BDT-bkg training instead of the γ + jets MC sample.
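The replacement of the photon ID value in the data-driven γ + jets sample can be sketched as follows. This is a simplified illustration under stated assumptions: events are represented as dictionaries with a `photon_ids` list, the MC distribution is approximated by a plain list of values to sample from, the threshold value is arbitrary, and the event weighting mentioned in the text is not modeled. None of these names come from the analysis code.

```python
import random

def resample_photon_id(data_events, mc_fake_ids_passing, id_threshold, rng):
    """Replace failing photon ID values with draws from an MC sample.

    `mc_fake_ids_passing` holds photon ID values of simulated misidentified
    jets that pass the requirement (hypothetical input).
    """
    resampled = []
    for event in data_events:
        ids = list(event["photon_ids"])  # copy so the input is not modified
        for i, value in enumerate(ids):
            if value < id_threshold:
                ids[i] = rng.choice(mc_fake_ids_passing)
        resampled.append({**event, "photon_ids": ids})
    return resampled

# Toy usage: one event whose second photon candidate fails the requirement.
rng = random.Random(1)
events = [{"photon_ids": [0.9, -0.5]}]
mc_values = [0.1, 0.3, 0.7]
training_events = resample_photon_id(events, mc_values, 0.0, rng)
```

After the replacement, every photon candidate in the returned events satisfies the ID requirement, so the sample populates the same phase space as the selected events while retaining the kinematic properties observed in data.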
Input features of BDT-bkg include kinematic properties of jets, leptons, photons, and diphotons (but not $m_{\gamma\gamma}$), jet and lepton multiplicity, b-tagging scores of jets, and $p_T^{\text{miss}}$. The inclusion of b-tagging scores reduces the non-tt background; furthermore, jets and leptons in ttH events tend to have higher $p_T$ and smaller $|\eta|$ than in background events. The BDT-bkg also uses the output of the photon ID BDT and the outputs of other machine learning (ML) algorithms described below as input features. One such ML algorithm is a top quark tagger BDT (top tagger) [47] that distinguishes events with top quarks decaying into three jets from events that do not contain top quarks. We also use DNNs based on long short-term memory [48], trained to separate ttH from the dominant backgrounds in a signal-enriched phase space: γγ + jets and tt + γγ for the hadronic channel, and tt + γγ for the leptonic channel. In addition to the features that are used in BDT-bkg, the DNNs exploit low-level information including the full four-vectors of each jet and lepton and the jet flavor scores [28]. The four-vectors allow for a more effective use of the kinematic properties of the jet and the lepton, while the jet flavor scores allow the differentiation of the origins of hadronic jets between ttH and the dominant backgrounds that the DNNs are designed to reject. The DNNs are trained only on MC samples with a large number of simulated events and are used as additional inputs to the BDT-bkg, rather than in place of the BDT-bkg. When a DNN is trained on all background components, its performance is worse than that of the BDT-bkg because of severe overfitting, as the other background samples have a lower number of simulated events than γγ + jets and tt + γγ. The modeling of the input features has been validated by comparing data and MC distributions for events passing the preselection in both channels.
The BDT-bkg score has been validated by comparing the distributions in data and MC in the $m_{\gamma\gamma}$ sidebands, satisfying either $100 < m_{\gamma\gamma} < 120$ GeV or $130 < m_{\gamma\gamma} < 180$ GeV (as in Fig. 1), as well as in dedicated control regions that target tt + Z events.
Events are either rejected or, according to their BDT-bkg output, divided into eight categories chosen to maximize the expected significance, as shown in Fig. 1 and Table I. When measuring the CP structure of the Htt coupling that is discussed later, nonrejected events are divided into four categories to maximize the sensitivity to the CP structure of the Htt amplitude. We perform a simultaneous binned maximum likelihood fit to the $m_{\gamma\gamma}$ distributions in the eight categories to extract the product of the ttH cross section and H → γγ branching fraction ($\sigma_{ttH} \mathcal{B}_{\gamma\gamma}$) and the signal strength $\mu_{ttH}$, defined as the ratio of the measured $\sigma_{ttH} \mathcal{B}_{\gamma\gamma}$ to its SM expectation. In the fit, all other H production modes are constrained to their SM predictions.
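As a quick arithmetic check of the signal strength definition, the central values quoted in the abstract can be divided directly. This ignores all uncertainties and their correlations, so it only illustrates the relation between the two quantities and is not a substitute for the fit.

```latex
\mu_{ttH}
  = \frac{\left(\sigma_{ttH}\,\mathcal{B}_{\gamma\gamma}\right)_{\text{measured}}}
         {\left(\sigma_{ttH}\,\mathcal{B}_{\gamma\gamma}\right)_{\text{SM}}}
  \approx \frac{1.56~\text{fb}}{1.13~\text{fb}}
  \approx 1.38
```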
The ttH signal distribution is parameterized using a double-sided Crystal Ball [49] plus Gaussian function. The background is modeled from data with the discrete profiling method [50], which accounts for the uncertainty associated with the choice of analytic function used to model the background $m_{\gamma\gamma}$ distribution.
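For readers unfamiliar with the signal shape, a common form of the double-sided Crystal Ball function (a Gaussian core with a power-law tail on each side) combined with a Gaussian is sketched below. The parameter names, the shared mean of the two components, the mixing fraction `frac_cb`, and the unnormalized form are assumptions made for this illustration; the exact parameterization used in the analysis is not specified in the text.

```python
import math

def dscb(x, mean, sigma, alpha_l, n_l, alpha_r, n_r):
    """Unnormalized double-sided Crystal Ball; alphas and ns must be positive."""
    t = (x - mean) / sigma
    if t < -alpha_l:
        # Left power-law tail, matched to the Gaussian core at t = -alpha_l.
        a = (n_l / alpha_l) ** n_l * math.exp(-0.5 * alpha_l ** 2)
        b = n_l / alpha_l - alpha_l
        return a * (b - t) ** (-n_l)
    if t > alpha_r:
        # Right power-law tail, matched to the Gaussian core at t = +alpha_r.
        a = (n_r / alpha_r) ** n_r * math.exp(-0.5 * alpha_r ** 2)
        b = n_r / alpha_r - alpha_r
        return a * (b + t) ** (-n_r)
    return math.exp(-0.5 * t * t)

def signal_shape(x, mean, sigma_cb, alpha_l, n_l, alpha_r, n_r, sigma_g, frac_cb):
    """Weighted sum of a DSCB and a Gaussian sharing the same mean (assumption)."""
    gauss = math.exp(-0.5 * ((x - mean) / sigma_g) ** 2)
    core = dscb(x, mean, sigma_cb, alpha_l, n_l, alpha_r, n_r)
    return frac_cb * core + (1.0 - frac_cb) * gauss

# Illustrative parameter values only; they are not taken from the analysis.
peak = signal_shape(125.0, 125.0, 1.5, 1.2, 3.0, 1.8, 5.0, 3.0, 0.8)
```

The power-law tails fall off more slowly than a Gaussian, which is what lets this shape describe the non-Gaussian tails of a reconstructed mass peak.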
All other systematic uncertainties are also included as nuisance parameters, and results are obtained using asymptotic distributions of test statistics based on the profile likelihood ratio [51][52][53]. The dominant theoretical uncertainty in $\mu_{ttH}$ arises from the SM prediction of the ttH cross section and is estimated by varying the QCD renormalization and factorization scales [30], with a resulting impact of 8%. The uncertainties in parton distribution functions, QCD coupling, underlying event and parton showers, and the H → γγ branching fraction each affect $\mu_{ttH}$ by 2%-5%. The main experimental uncertainties that affect $\mu_{ttH}$ are those related to the b quark and photon identification, the jet energy scale and resolution, and the integrated luminosity [54][55][56]. Their effects are in the 2%-6% range. Other systematic uncertainties, including those related to preselection and trigger efficiencies, the lepton identification, and $p_T^{\text{miss}}$, have a <2% effect on the measurement of $\mu_{ttH}$ and $\sigma_{ttH} \mathcal{B}_{\gamma\gamma}$. The data and fit results are shown in Fig. 2 [30]. The observed significance relative to the background-only hypothesis is 6.6 standard deviations (σ), while the expected significance assuming the SM H is 4.7σ.
The CP structure of the Htt amplitude can be parameterized as [22]
$$\mathcal{A}(Htt) = -\frac{m_t}{v}\, \bar{\psi}_t \left( \kappa_t + i \tilde{\kappa}_t \gamma_5 \right) \psi_t, \tag{1}$$
where $\bar{\psi}_t$ and $\psi_t$ are the Dirac spinors, $m_t$ is the top quark mass, $v$ is the SM H field vacuum expectation value, and $\kappa_t$ and $\tilde{\kappa}_t$ are the CP-even and CP-odd Yukawa couplings. In the SM, $\kappa_t = 1$ and $\tilde{\kappa}_t = 0$. We measure the CP structure with
$$f_{CP}^{Htt} = \frac{|\tilde{\kappa}_t|^2}{|\kappa_t|^2 + |\tilde{\kappa}_t|^2}\, \operatorname{sign}(\tilde{\kappa}_t / \kappa_t). \tag{2}$$
When the cross sections of the CP-even and CP-odd contributions are equal, $f_{CP}^{Htt} = 0.72$ [22]. It has been shown in Ref. [22] that an optimal analysis of the CP structure in the ttH process can be performed with two observables, $D_{0-}$ and $D_{CP}$. $D_{0-}$ is designed to separate CP-even from CP-odd contributions, and $D_{CP}$ to differentiate the interference. Reference [57] shows that the two observables built by matrix element and ML techniques achieve the same sensitivity. In this study, we use a BDT to obtain $D_{0-}$ and do not include $D_{CP}$ since it requires tagging the flavor of light jets. As a consequence, it is not possible to measure the relative sign, or phase, of the $\kappa_t$ and $\tilde{\kappa}_t$ couplings. Nonetheless, this sign is incorporated into the $f_{CP}^{Htt}$ definition in Eq. (2) for consistency with other possible studies sensitive to the sign of $f_{CP}^{Htt}$, such as in the gluon fusion production with the top quark loop [57].
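The fractional CP-odd contribution defined in Eq. (2) is straightforward to evaluate for given couplings. The sketch below assumes real-valued couplings and, as a convention chosen here for the pure CP-odd case where the sign of the ratio is undefined, returns the positive value; the function name `f_cp` is hypothetical.

```python
import math

def f_cp(kappa, kappa_tilde):
    """Signed CP-odd fraction for real CP-even (kappa) and CP-odd (kappa_tilde) couplings."""
    denom = kappa ** 2 + kappa_tilde ** 2
    if denom == 0.0:
        raise ValueError("at least one coupling must be nonzero")
    magnitude = kappa_tilde ** 2 / denom
    if kappa_tilde == 0.0:
        return 0.0  # SM-like, purely CP-even coupling
    if kappa == 0.0:
        return magnitude  # pure CP-odd; positive sign is a convention assumed here
    return math.copysign(magnitude, kappa_tilde / kappa)
```

With these definitions the SM point ($\kappa_t = 1$, $\tilde{\kappa}_t = 0$) gives 0, the pure CP-odd point gives 1, and equal coupling magnitudes give ±0.5 depending on the relative sign. Note that equal couplings do not correspond to equal cross sections, which is why the equal-cross-section point quoted in the text lies at a different value.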
We train a BDT to distinguish CP-even and CP-odd contributions. The observables used in the training include the kinematic variables of the first six jets (in $p_T$) and the diphoton system (but not $m_{\gamma\gamma}$), the b-tagging scores of jets, and, in the leptonic channel, the lepton multiplicity and the kinematic variables of the leading lepton. The output of the BDT is the $D_{0-}$ observable. Simulation shows that $D_{0-}$ has negligible correlation with the BDT-bkg discriminant. The events selected for the signal strength measurements are split into 12 categories: two channels (leptonic or hadronic), two BDT-bkg categories, as shown in Fig. 1, and three $D_{0-}$ bins, as shown in Fig. 3.
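The count of 12 categories follows from taking every combination of channel, BDT-bkg category, and $D_{0-}$ bin. A minimal enumeration is shown below; the category and bin labels are placeholders invented for this example, since the text does not name them.

```python
from itertools import product

channels = ["leptonic", "hadronic"]
bdt_bkg_categories = ["bkg_cat_0", "bkg_cat_1"]           # hypothetical labels
d0minus_bins = ["d0m_bin_0", "d0m_bin_1", "d0m_bin_2"]    # hypothetical labels

# Every (channel, BDT-bkg category, D0- bin) combination is one fit category.
categories = list(product(channels, bdt_bkg_categories, d0minus_bins))
```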
A simultaneous fit to the $m_{\gamma\gamma}$ distribution is performed using the 12 categories to measure $f_{CP}^{Htt}$. The $\mu_{ttH}$ parameter is left unconstrained. An additional systematic uncertainty is introduced to cover possible small differences in the modeling of the distributions between the JHUGEN generator, used to vary the CP structure of the ttH coupling, and the MADGRAPH5_aMC@NLO generator, used to model SM distributions. However, statistical uncertainties dominate the measurement of $f_{CP}^{Htt}$. In addition to the ttH process, we parameterize the tH production with the $\mu_{ttH}$ and $f_{CP}^{Htt}$ parameters, where the H couplings to other particles are constrained to their SM values and the sign of $\kappa_t$ is taken to be positive [58]. The weak dependence of $D_{0-}$ distributions for the tH events is neglected. Studies show that this decreases the sensitivity by 0.1σ. The other processes are constrained to their SM predictions.
The fit results are shown in Fig. 3 and are obtained using the profile likelihood method as $f_{CP}^{Htt} = 0.00 \pm 0.33$, with the constraint $|f_{CP}^{Htt}| < 0.67$ at 95% confidence level (C.L.). The coverage was determined with pseudodatasets and found to agree with that expected in the asymptotic limit [59]. The pure pseudoscalar model of CP structure of the Htt coupling ($f_{CP}^{Htt} = 1$) is excluded at 3.2σ. The expected constraints based on SM simulation are $f_{CP}^{Htt} = 0.00 \pm 0.49$ at 68% C.L., $|f_{CP}^{Htt}| < 0.82$ at 95% C.L., and 2.6σ exclusion of the $f_{CP}^{Htt} = 1$ model.
To conclude, we presented the first single-channel observation of the ttH process and the first measurement of the CP structure of the Htt coupling using the H → γγ channel. The cross section of the ttH process is measured to be $\sigma_{ttH} \mathcal{B}_{\gamma\gamma} = 1.56^{+0.34}_{-0.32}$ fb, corresponding to $1.38^{+0.36}_{-0.29}$ times the SM prediction, with a significance of 6.6σ. The data disfavor the pure CP-odd model of the Htt coupling at 3.2σ, and a possible fractional CP-odd contribution is constrained to be $f_{CP}^{Htt} = 0.00 \pm 0.33$ at 68% C.L.
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies: BMBWF and FWF