Evidence for the associated production of the Higgs boson and a top quark pair with the ATLAS detector

A search for the associated production of the Higgs boson with a top quark pair ($t\bar t H$) is reported. The search is performed in multilepton final states using a dataset corresponding to an integrated luminosity of 36.1 fb$^{-1}$ of proton--proton collision data recorded by the ATLAS experiment at a center-of-mass energy $\sqrt{s} = 13$ TeV at the Large Hadron Collider. Higgs boson decays to $WW^*$, $\tau\tau$, and $ZZ^*$ are targeted. Seven final states, categorized by the number and flavor of charged-lepton candidates, are examined for the presence of the Standard Model Higgs boson with a mass of 125 GeV and a pair of top quarks. An excess of events over the expected background from Standard Model processes is found with an observed significance of 4.1 standard deviations, compared to an expectation of 2.8 standard deviations. The best fit for the $t\bar t H$ production cross section is $\sigma(t\bar t H) = 790^{+230}_{-210}$ fb, in agreement with the Standard Model prediction of $507^{+35}_{-50}$ fb. The combination of this result with other $t\bar t H$ searches from the ATLAS experiment using the Higgs boson decay modes to $b\bar b$, $\gamma\gamma$ and $ZZ^* \to 4\ell$, has an observed significance of 4.2 standard deviations, compared to an expectation of 3.8 standard deviations. This provides evidence for the $t\bar t H$ production mode.


Introduction
The study of the origin of electroweak symmetry breaking is one of the key goals of the Large Hadron Collider (LHC) [1]. In the Standard Model (SM) [2][3][4][5], the symmetry is broken through the introduction of a complex scalar field doublet, leading to the prediction of the existence of one physical neutral scalar particle, commonly known as the Higgs boson [6][7][8][9][10]. The discovery of a Higgs boson with a mass of approximately 125 GeV by the ATLAS [11] and CMS [12] collaborations was a crucial milestone. Measurements of its properties performed so far [13][14][15][16][17][18] are consistent with the predictions for the SM Higgs boson.
These measurements rely primarily on studies of the bosonic decay modes, $H \to \gamma\gamma$, $H \to ZZ^*$ and $H \to WW^*$; it is therefore crucial to also measure the Yukawa interactions, which are predicted to account for the fermion masses [3,19]. Thus far, only the Yukawa coupling of the Higgs boson to $\tau$ leptons has been observed [18,20,21,22], and evidence for the Yukawa coupling of the Higgs boson to $b$-quarks has been found through direct searches [23][24][25]. The Yukawa coupling of the Higgs boson to the top quark, the heaviest particle in the SM, is expected to be of the order of unity and could be particularly sensitive to effects beyond the SM (BSM). The ratio of this coupling to its SM prediction has been measured to be 0.87 ± 0.15 in the combined fit of the ATLAS and CMS Higgs boson measurements [18]. This result depends largely on the indirect measurement using the top quark contribution to gluon-gluon fusion production and diphoton decay loops, for which no BSM contribution is assumed. Therefore, a direct measurement of the coupling of the Higgs boson to top quarks is highly desirable, both to disentangle any deviation in the top quark's Yukawa coupling from effects due to couplings to new particles and to significantly reduce the model dependence in the extraction of the top quark's Yukawa coupling.
A direct measurement can be achieved by measuring the rate of the process in which the Higgs boson is produced in association with a pair of top quarks, $gg/q\bar q \to t\bar t H$, which is a tree-level process at lowest order in perturbation theory. Although the $t\bar t H$ production cross section at the LHC is two orders of magnitude smaller than the total Higgs boson production cross section, the distinctive signature from the top quarks in the final state gives access to many Higgs boson decay modes. The ATLAS and CMS collaborations have searched for $t\bar t H$ production using proton--proton ($pp$) collision data collected during LHC Run 1 at center-of-mass energies of $\sqrt{s} = 7$ TeV and $\sqrt{s} = 8$ TeV, with analyses mainly sensitive to $H \to WW^*$, $H \to \tau^+\tau^-$, $H \to b\bar b$ and $H \to \gamma\gamma$ [26][27][28][29][30]. The combination of these results yields a best fit of the ratio of observed and SM cross sections, $\mu = \sigma/\sigma_{\mathrm{SM}}$, of $2.3^{+0.7}_{-0.6}$ [18].
The ongoing data-taking at the LHC at an increased center-of-mass energy of $\sqrt{s} = 13$ TeV allows the collection of a larger dataset because of an increased $t\bar t H$ production cross section relative to Run 1. This article reports the results of a search for $t\bar t H$ production using a dataset corresponding to an integrated luminosity of 36.1 fb$^{-1}$ collected with the ATLAS detector at $\sqrt{s} = 13$ TeV during 2015 and 2016. Examples of tree-level Feynman diagrams are given in Figure 1, where the Higgs boson is shown decaying to $WW^*/ZZ^*$ or $\tau\tau$. The search uses seven final states distinguished by the number and flavor of charged-lepton (electron, muon and hadronically decaying $\tau$ lepton) candidates, denoted $l$. In the following, the term "light lepton", denoted $\ell$, refers to either electrons or muons and is understood to mean both particle and antiparticle as appropriate. These signatures are primarily sensitive to the decays $H \to WW^*$ (with subsequent decay to $l\nu l\nu$ or $l\nu qq$), $H \to \tau^+\tau^-$ and $H \to ZZ^*$ (with subsequent decay to $ll\nu\nu$ or $llqq$), and their selection is designed to avoid any overlap with the ATLAS searches for $t\bar t H$ production with $H \to b\bar b$ [36], $H \to \gamma\gamma$ [37] and $H \to ZZ^* \to 4\ell$ [38] decays. Backgrounds to the signal arise from associated production of a top quark pair and a $W$ or $Z$ (henceforth $V$) boson. Additional backgrounds arise from $t\bar t$ production with leptons from heavy-flavor hadron decays and additional jets (non-prompt leptons), and from other processes where the electron charge is incorrectly assigned (labeled as "q mis-id") or where jets are incorrectly identified as $\tau$ candidates. Backgrounds are estimated with a combination of simulation and data-driven techniques (labeled as "Pre-Fit"), and then a global fit to the data, in all final states, is used to extract the best estimate for the $t\bar t H$ production rate and adjust the background predictions (labeled as "Post-Fit").
The article is organized as follows. Section 2 introduces the ATLAS detector; Section 3 describes the Monte Carlo (MC) simulation samples as well as the recorded data used for this analysis. The reconstruction and identification of the physics objects are discussed in Section 4. The event selection and classification are explained in Section 5. Section 6 describes the methods used to estimate the backgrounds. The theoretical and experimental uncertainties are discussed in Section 7. The results are presented in Section 8, and the combination with the three other ATLAS searches for ttH production mentioned above is reported in Section 9.

ATLAS detector
The ATLAS experiment [39] at the LHC is a multipurpose particle detector with a forward-backward symmetric cylindrical geometry and a near $4\pi$ coverage in solid angle.$^{1}$ It consists of an inner tracking detector surrounded by a superconducting solenoid providing a 2 T axial magnetic field, electromagnetic and hadron calorimeters, and a muon spectrometer. The inner tracking detector, covering the pseudorapidity range $|\eta| < 2.5$, consists of silicon pixel and silicon microstrip tracking detectors inside a transition-radiation tracker that covers $|\eta| < 2.0$. It includes, for the $\sqrt{s} = 13$ TeV running period, a newly installed innermost pixel layer, the insertable B-layer [40]. Lead/liquid-argon (LAr) sampling calorimeters provide electromagnetic (EM) energy measurements for $|\eta| < 2.5$ with high granularity and longitudinal segmentation. A hadron calorimeter consisting of steel and scintillator tiles covers the central pseudorapidity range ($|\eta| < 1.7$). The endcap and forward regions are instrumented with LAr calorimeters for EM and hadronic energy measurements up to $|\eta| = 4.9$. The muon spectrometer surrounds the calorimeters and is based on three large air-core toroid superconducting magnets with eight coils each. It includes a system of precision tracking chambers ($|\eta| < 2.7$) and fast detectors for triggering ($|\eta| < 2.4$). A two-level trigger system is used to select events [41]. The first-level trigger is implemented in hardware and uses a subset of the detector information to reduce the accepted rate to a design maximum of 100 kHz. This is followed by a software-based trigger with a sustained average accepted event rate of about 1 kHz.
$^{1}$ ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the center of the detector and the $z$-axis along the beam pipe. The $x$-axis points from the IP to the center of the LHC ring, and the $y$-axis points upwards. Cylindrical coordinates $(r, \phi)$ are used in the transverse plane, $\phi$ being the azimuthal angle around the $z$-axis.

Data and Monte Carlo samples
The data were collected by the ATLAS detector during 2015 and 2016 with a peak instantaneous luminosity of $L = 1.4 \times 10^{34}$ cm$^{-2}$ s$^{-1}$. The mean number of $pp$ interactions per bunch crossing in the dataset is 24 and the bunch spacing is 25 ns. After the application of beam and data-quality requirements, the integrated luminosity considered corresponds to 36.1 fb$^{-1}$.
Monte Carlo simulation samples were produced for signal and background processes using the full ATLAS detector simulation [42] based on Geant4 [43] or, for selected smaller backgrounds, a fast simulation using a parameterization of the calorimeter response and Geant4 for tracking systems [44]. To simulate the effects of additional $pp$ collisions in the same and nearby bunch crossings (pileup), additional interactions were generated using the low-momentum strong-interaction processes of Pythia 8.186 [45, 46] with a set of tuned parameters referred to as the A2 tune [47] and the MSTW2008LO set of parton distribution functions (PDF) [48], and overlaid onto the simulated hard-scatter event. The simulated events are reweighted to match the pileup conditions observed in the data and are reconstructed using the same procedure as for the data. The event generators used for each signal and background sample, together with the program and the set of tuned parameters used for the modeling of the parton shower, hadronization and underlying event, are listed in Table 1. The simulation samples for $t\bar t H$, $t\bar t V$, $VV$ and $t\bar t$ are described in Refs. [49][50][51]. The samples used to estimate the systematic uncertainties are indicated in parentheses in Table 1.
A Higgs boson mass of 125 GeV, from the combined ATLAS and CMS Run 1 measurements [52], and a top quark mass of 172.5 GeV are assumed. The overall $t\bar t H$ cross section is 507 fb, which is computed at next-to-leading order (NLO) in quantum chromodynamics (QCD) with NLO electroweak corrections [31][32][33][34][35]. Uncertainties include $^{+5.8\%}_{-9.2\%}$ due to the QCD factorization and renormalization scales and $\pm 3.6\%$ due to the PDFs and the strong coupling $\alpha_{\mathrm{S}}$. The cross sections for $t\bar t V$ production, including the process $pp \to t\bar t l^+l^- + X$ over the full $Z/\gamma^*$ mass spectrum, are computed at NLO in QCD and electroweak couplings following Refs. [53,54]. The cross section is 124 fb for $t\bar t l^+l^-$ with $m(l^+l^-) > 5$ GeV, and 601 fb for $t\bar t W^\pm$ [31]. The QCD scale uncertainties are $\pm 12\%$ and uncertainties from PDF and $\alpha_{\mathrm{S}}$ variations are $\pm 4\%$.
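The uncertainty components quoted above can be related to the total uncertainty on the SM prediction given in the abstract ($507^{+35}_{-50}$ fb). The following is a minimal sketch, assuming the scale and PDF+$\alpha_{\mathrm{S}}$ components are independent and combined in quadrature; that treatment is an assumption made here for illustration and is not stated in the text.

```python
import math

SIGMA_TTH_FB = 507.0  # SM ttH cross section quoted in the text, in fb


def combine_in_quadrature(*relative_uncertainties):
    """Combine independent relative uncertainties in quadrature (assumed treatment)."""
    return math.sqrt(sum(u * u for u in relative_uncertainties))


# Relative uncertainties quoted in the text
scale_up, scale_down = 0.058, 0.092  # QCD scale variations (+5.8%, -9.2%)
pdf_alphas = 0.036                   # PDF and alpha_S (+-3.6%)

total_up_fb = SIGMA_TTH_FB * combine_in_quadrature(scale_up, pdf_alphas)
total_down_fb = SIGMA_TTH_FB * combine_in_quadrature(scale_down, pdf_alphas)

print(f"sigma(ttH) = {SIGMA_TTH_FB:.0f} +{total_up_fb:.0f} -{total_down_fb:.0f} fb")
# prints "sigma(ttH) = 507 +35 -50 fb"
```

Under this assumption the result reproduces the rounded total uncertainty quoted in the abstract.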
Events in the tt sample with radiated photons of high transverse momentum (p T ) are vetoed to avoid overlap with those from the ttγ sample. Dedicated samples are included to account for backgrounds from tt(Z/γ * ) where the Z/γ * has low invariant mass but the leptons enter the analysis phase space via asymmetric internal conversions, or rare t → W b radiative decays (referred to as "rare top decay" in the following).

Object reconstruction and identification
All analysis channels share a common trigger, jet, lepton and overall event preselection. The selections are detailed here and the lepton selection is summarized in Table 2. Unless otherwise specified, light leptons are required to pass the loose lepton selection. Further channel-specific requirements are discussed in Section 5.
The selection of events is based on the presence of light leptons, with either single-lepton or dilepton triggers. For data recorded in 2015, the single-electron (single-muon) trigger required a candidate with transverse momentum $p_{\mathrm{T}} > 24$ (20) GeV [41]; in 2016 the lepton $p_{\mathrm{T}}$ threshold was raised to 26 GeV.
Table 1: The configurations used for event generation of signal and background processes. The samples used to estimate the systematic uncertainties are indicated in parentheses. "V" refers to production of an electroweak boson ($W$ or $Z/\gamma^*$). "Tune" refers to the underlying-event tuned parameters of the parton shower program. The parton distribution function (PDF) shown in the table is the one used for the matrix element (ME).
In the region $|\eta| < 0.1$, where muon spectrometer coverage is reduced, muon candidates are also reconstructed from inner detector tracks matched to isolated energy deposits in the calorimeters consistent with the passage of a minimum-ionizing particle. Candidates are required to satisfy $p_{\mathrm{T}} > 10$ GeV and $|\eta| < 2.5$ and to pass loose identification requirements [77]. To reduce the non-prompt muon contribution, the track is required to originate from the primary vertex by imposing a requirement on its transverse impact parameter significance, $|d_0|/\sigma_{d_0} < 3$, and on its longitudinal impact parameter multiplied by the sine of the polar angle, $|z_0 \sin\theta| < 0.5$ mm. Additionally, muons are required to be separated by $\Delta R > \min(0.4,\ 0.04 + (10\ \mathrm{GeV})/p_{\mathrm{T},\mu})$ from any selected jets (see below for details of jet reconstruction and selection). The requirement is chosen to maximize the acceptance for prompt muons at a fixed rejection factor for non-prompt and fake muon candidates.
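The $p_{\mathrm{T}}$-dependent muon-jet separation requirement can be sketched as follows. This is an illustrative example only: the dictionary layout of the muon and jet objects is hypothetical, and $\Delta R$ is assumed to be the usual distance in the $(\eta, \phi)$ plane.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in the (eta, phi) plane, with delta-phi wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def muon_jet_min_delta_r(muon_pt_gev):
    """Separation threshold from the text: min(0.4, 0.04 + 10 GeV / pT)."""
    return min(0.4, 0.04 + 10.0 / muon_pt_gev)


def muon_is_separated(muon, jets):
    """True if the muon is farther than the threshold from every selected jet.

    `muon` and each entry of `jets` are dicts with keys 'pt' (GeV), 'eta', 'phi'
    (a hypothetical layout chosen for this sketch).
    """
    threshold = muon_jet_min_delta_r(muon["pt"])
    return all(
        delta_r(muon["eta"], muon["phi"], jet["eta"], jet["phi"]) > threshold
        for jet in jets
    )
```

With this form the threshold stays at 0.4 for muons below roughly 28 GeV and shrinks for harder muons, so a 100 GeV muon at $\Delta R = 0.2$ from a jet is kept while a 20 GeV muon at the same distance is not.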
Electron candidates are reconstructed from energy clusters in the electromagnetic calorimeter that are associated with charged-particle tracks reconstructed in the inner detector [78,79]. They are required to have a transverse momentum $p_{\mathrm{T}} > 10$ GeV and $|\eta_{\mathrm{cluster}}| < 2.47$, and the transition region between the barrel and endcap electromagnetic calorimeters, $1.37 < |\eta_{\mathrm{cluster}}| < 1.52$, is excluded. A multivariate likelihood discriminant combining shower shape and track information is used to distinguish real prompt electrons from electron candidates from hadronic jets, photon conversions and heavy-flavor (HF) hadron decays (fake and non-prompt electrons). Loose and tight electron discriminant working points are used [79], both including the number of hits in the innermost pixel layer to discriminate between electrons and converted photons. The same longitudinal impact parameter selection as for muons is applied, while the transverse impact parameter significance is required to be $|d_0|/\sigma_{d_0} < 5$. If two electrons closer than $\Delta R = 0.1$ are preselected, only the one with the higher $p_{\mathrm{T}}$ is considered. An electron is rejected if, after passing all the above selections, it lies within $\Delta R = 0.1$ of a selected muon.
Hadronically decaying $\tau$-lepton candidates ($\tau_{\mathrm{had}}$) are reconstructed from clusters in the calorimeters and associated inner detector tracks [80]. Candidates are required to have either one or three associated tracks, with a total charge of ±1. Candidates are required to have a transverse momentum $p_{\mathrm{T}} > 25$ GeV and $|\eta| < 2.5$, excluding the electromagnetic calorimeter's transition region. A boosted decision tree (BDT) discriminant using calorimeter- and tracking-based variables is used to identify $\tau_{\mathrm{had}}$ candidates and reject jet backgrounds. Three types of $\tau_{\mathrm{had}}$ candidates are used in the analysis, referred to as loose, medium and tight: the latter two are defined by working points with a combined reconstruction and identification efficiency of 55% and 45% (40% and 30%) for one (three)-prong $\tau_{\mathrm{had}}$ decays, respectively [81], while the first one has a more relaxed selection and is only used for background estimates. The corresponding expected rejection factors against light-quark/gluon jets vary from 30 for loose candidates to 300 for tight candidates [80]. Electrons that are reconstructed as one-prong $\tau_{\mathrm{had}}$ candidates are removed via a BDT trained to reject electrons. Additionally, $\tau_{\mathrm{had}}$ candidates are required to be separated by $\Delta R > 0.2$ from any selected electrons and muons. The contribution of fake $\tau_{\mathrm{had}}$ from $b$-jets is removed by vetoing the candidates that are also $b$-tagged, which rejects a large fraction of the $t\bar t$ background. The contribution of fake $\tau_{\mathrm{had}}$ from muons is removed by vetoing the candidates that overlap with low-$p_{\mathrm{T}}$ reconstructed muons. Finally, the vertex matched to the tracks of the $\tau_{\mathrm{had}}$ candidate is required to be the primary vertex of the event, in order to reject fake candidates arising from pileup collisions.
Jets are reconstructed from three-dimensional topological clusters built from energy deposits in the calorimeters [82, 83], using the anti-k t algorithm with a radius parameter R = 0.4 [84,85]. Their calibration is based on simulation with additional corrections obtained using in situ techniques [86] to account for differences between simulation and data. Jets are required to satisfy p T > 25 GeV and |η| < 2.5. In order to reject jets arising from pileup collisions, a significant fraction of the total summed scalar p T of the tracks in jets with p T < 60 GeV and |η| < 2.4 must originate from tracks that are associated with the primary vertex [87]. The average efficiency of this requirement is 92% per jet from the hard scatter. The calorimeter energy deposits from electrons are typically also reconstructed as jets; in order to eliminate double counting, any jets within ∆R = 0.3 of a selected electron are not considered. This is also the case for any jets within ∆R = 0.3 of a τ had candidate.
Jets containing $b$-hadrons are identified ($b$-tagged) via a multivariate discriminant combining information from algorithms using track impact parameters and secondary vertices reconstructed within the jet [88, 89]. These $b$-tagged jets will henceforth be referred to as $b$-jets. The working point used for this search corresponds to an average efficiency of 70% for jets containing $b$-hadrons with $p_{\mathrm{T}} > 20$ GeV and $|\eta| < 2.5$ in $t\bar t$ events. The expected rejection factors against light-quark/gluon jets, $c$-quark jets and hadronically …
The lepton requirements are summarized in Table 2. Isolation requirements are applied to all lepton types except the loose definition. Two isolation variables, based on calorimetric and tracking variables, are computed. Calorimetric isolation uses the scalar sum of transverse energies of clusters within a cone of size $\Delta R = 0.3$ around the light-lepton candidate. This excludes the electron candidate's cluster itself and clusters within $\Delta R = 0.1$ of the muon candidate's track, respectively, and is corrected for leakage from the electron's shower and for the ambient energy in the event [91,92]. Track isolation uses the sum of transverse momenta of tracks with $p_{\mathrm{T}} > 1$ GeV consistent with originating at the primary vertex, excluding the light-lepton candidate's track, within a cone of $\Delta R = \min(0.3,\ 10\ \mathrm{GeV}/p_{\mathrm{T}}(\ell))$. Calorimeter- and track-based isolation criteria are applied to electrons and muons to obtain a 99% efficiency in $Z \to \ell\ell$ events.
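The variable-cone track isolation described above can be illustrated with a short sketch. The object layout (dicts with 'pt', 'eta', 'phi' and a 'from_primary_vertex' flag) is hypothetical, and the track list is assumed to already exclude the lepton's own track.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in the (eta, phi) plane, with delta-phi wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def track_isolation_cone(lepton_pt_gev):
    """Cone size from the text: min(0.3, 10 GeV / pT(lepton))."""
    return min(0.3, 10.0 / lepton_pt_gev)


def track_isolation(lepton, tracks):
    """Sum of track pT (GeV) inside the variable cone around the lepton.

    Only tracks with pT > 1 GeV that are flagged as compatible with the
    primary vertex are counted. `tracks` must not contain the lepton track.
    """
    cone = track_isolation_cone(lepton["pt"])
    return sum(
        trk["pt"]
        for trk in tracks
        if trk["pt"] > 1.0
        and trk["from_primary_vertex"]
        and delta_r(lepton["eta"], lepton["phi"], trk["eta"], trk["phi"]) < cone
    )
```

The cone shrinks for leptons above about 33 GeV, which keeps the isolation sum from absorbing nearby activity in boosted topologies.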
Non-prompt leptons are further rejected using a multivariate discriminant, referred to as the non-prompt lepton BDT, which takes as input the energy deposits and charged-particle tracks (including the lepton track) in a cone around the lepton direction. The jet reconstruction and $b$-tagging algorithms are run on the track collection, and their output is used to train the algorithm together with isolation variables. A reconstructed track-jet that is matched to a non-prompt lepton is typically a jet initiated by $b$- or $c$-quarks, and may contain a displaced vertex. The most discriminating variables are thus found to be the angular distance between the lepton and the reconstructed jet, the outputs of the $b$-tagging algorithms, the calorimetric and track isolation variables of the lepton, the number of tracks within the jet and the ratio of the lepton $p_{\mathrm{T}}$ to the jet $p_{\mathrm{T}}$. The training is performed separately for electrons and muons on prompt and non-prompt leptons from simulated $t\bar t$ events and validated using data in various control regions. The efficiency at the chosen working point to select well-identified prompt muons (electrons) is about 70% (60%) for $p_{\mathrm{T}} \sim 10$ GeV and reaches a plateau of 98% (96%) at $p_{\mathrm{T}} \sim 45$ GeV, as shown in Figure 2, while the rejection factor against leptons from the decay of $b$-hadrons is about 20. Simulated events are corrected to account for differences between data and simulation for this prompt-lepton isolation efficiency, as well as for the lepton trigger, reconstruction, and identification efficiencies. The corrections were determined using a so-called tag-and-probe method as described in Refs. [77, 78] and studied as a function of the number of nearby light- and heavy-flavor jets. This is illustrated in Figure 2, which shows that the corrections for the non-prompt lepton BDT efficiencies are at most 10% at low transverse momentum and decrease with increasing transverse momentum.
The largest contribution to the associated systematic uncertainties comes from pileup effects.
There is a small, but non-negligible, probability that electrons and positrons are reconstructed with an incorrect charge. This occurs when an electron (positron) emits a hard bremsstrahlung photon; if the photon subsequently converts to an asymmetric electron-positron pair, and the positron (electron) has high momentum and is reconstructed, the lepton charge can be misidentified. Otherwise it occurs when the curvature of a track is poorly estimated, which typically happens at high momentum. The probability for muons to be reconstructed with incorrect charge is small enough that the charge misassignment is negligible. To reject electrons reconstructed with an incorrect electric charge, a BDT discriminant is built, using the following electron cluster and track properties as input: the electron's transverse momentum and pseudorapidity, the track curvature significance (defined as the ratio of the electric charge to the track momentum divided by the estimated uncertainty in the measurement) and its transverse impact parameter times the electric charge, the cluster width along the azimuthal direction, and the quality of the matching between the track and the cluster, in terms of both energy/momentum and azimuthal position. The chosen working point achieves a rejection factor of $\sim$17 for electrons passing the tight identification requirements with a wrong charge assignment while providing an efficiency of 95% for electrons with correct charge reconstruction. This requirement is only applied to the very tight electrons. Correction factors to account for differences in the selection efficiency between data and simulation, which are within a few percent for $|\eta| < 2.4$ but larger in the forward region, $2.4 < |\eta| < 2.47$, were applied to the selected electrons in the simulation.
The missing transverse momentum $\vec{p}_{\mathrm{T}}^{\,\mathrm{miss}}$ (with magnitude $E_{\mathrm{T}}^{\mathrm{miss}}$) is defined as the negative vector sum of the transverse momenta of all identified and calibrated leptons and jets and remaining unclustered energy, the latter of which is estimated from low-$p_{\mathrm{T}}$ tracks associated with the primary vertex but not assigned to any lepton or jet candidate [93, 94].

Event selection and classification
The analysis is primarily sensitive to decays of the Higgs boson to WW * or ττ with a small additional contribution from H → Z Z * . If the Higgs boson decays to either WW * or ττ, the ttH events typically contain either WWWW bb or ττWW bb. In order to reduce the tt background, characterized by a final state of WW bb, final states including three or more charged leptons, or two same-charge light leptons, are selected. Seven final states are analyzed, categorized by the number and flavor of charged-lepton candidates after the preselection requirements, as illustrated in Figure 3. Each of the seven final states is called a "channel" and certain channels are further split into categories to gain in significance. Categories include signal and control regions. Additional control regions used for the estimates of the non-prompt backgrounds are discussed in Section 6.
The selection criteria are designed to be orthogonal to ensure that each event only contributes to a single channel. Channels are made orthogonal through the requirements on the number of loose light leptons and medium $\tau_{\mathrm{had}}$ candidates. A veto on events containing medium $\tau_{\mathrm{had}}$ candidates is therefore applied for the 2$\ell$SS and 3$\ell$ channels, but no veto is applied for the 4$\ell$ channel because there is no corresponding $\tau_{\mathrm{had}}$ channel. In all channels, the light lepton(s) are required to be matched to the lepton(s) selected by either the single-lepton or dilepton triggers. As the 1$\ell$+2$\tau_{\mathrm{had}}$ channel has only one light lepton, only single-lepton triggers are used. In order to reduce the diboson background, all channels also require events to include at least two reconstructed jets, at least one of which must be $b$-tagged.
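The orthogonality argument can be made concrete with a sketch of the lepton-counting step alone. This is a simplified illustration, not the analysis code: the function name and channel label strings are hypothetical, and the lepton quality, $p_{\mathrm{T}}$, jet and mass requirements of each channel are deliberately left out.

```python
def assign_channel(n_light, n_tau_had, light_same_charge=False):
    """Map lepton counts to one of the seven channels, or None if no channel matches.

    n_light           -- number of loose light leptons (electrons or muons)
    n_tau_had         -- number of medium hadronic tau candidates
    light_same_charge -- for two-light-lepton events, whether the charges agree
    """
    if n_light == 4:
        # No tau_had veto in the 4l channel: there is no 4l + tau_had channel.
        return "4l"
    if n_light == 3:
        return {0: "3l", 1: "3l+1tau_had"}.get(n_tau_had)
    if n_light == 2:
        if n_tau_had == 0:
            return "2lSS" if light_same_charge else None
        if n_tau_had == 1:
            return "2lSS+1tau_had" if light_same_charge else "2lOS+1tau_had"
        return None
    if n_light == 1 and n_tau_had == 2:
        return "1l+2tau_had"
    return None
```

Because each (light lepton, $\tau_{\mathrm{had}}$) multiplicity maps to at most one label, an event cannot enter two channels at this stage.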
The detailed criteria for each channel are described below, and summarized in Table 3. In addition, Table 4 provides a comparison of the key aspects of the selection used in each channel. After the selection, assuming Standard Model $t\bar t H$ production, the total expected number of reconstructed signal events summed over all categories is 91, corresponding to 0.50% of all produced $t\bar t H$ events. The breakdown in each channel is given in Table 5. In total 332 030 events are selected in data. As the background contamination is still large in all channels, except one of the 4$\ell$ categories and the 3$\ell$+1$\tau_{\mathrm{had}}$ category, further separation of the signal from the background is achieved using multivariate techniques. The TMVA package [95] is used in all channels except for 3$\ell$, which uses XGBoost [96]. Independent cross-check analyses using a simpler cut-and-count categorization were developed for the most sensitive 2$\ell$SS, 3$\ell$ and 2$\ell$SS+1$\tau_{\mathrm{had}}$ channels.
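The quoted selection fraction follows from the cross section and integrated luminosity given earlier. A brief arithmetic check, using only numbers stated in the text (the variable names are illustrative):

```python
SIGMA_TTH_FB = 507.0          # SM ttH cross section, in fb
LUMINOSITY_FB_INV = 36.1      # integrated luminosity, in fb^-1
EXPECTED_SELECTED = 91.0      # expected reconstructed signal events after selection

# Number of ttH events produced = cross section x integrated luminosity
produced = SIGMA_TTH_FB * LUMINOSITY_FB_INV
selected_fraction = EXPECTED_SELECTED / produced

print(f"{produced:.0f} ttH events produced, {100.0 * selected_fraction:.2f}% selected")
# prints "18303 ttH events produced, 0.50% selected"
```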

2$\ell$SS channel
Selected events are required to include exactly two reconstructed light leptons with the same electric charge. To reduce the background from fake and non-prompt leptons as well as electrons reconstructed with incorrect electric charge, the very tight selection requirements described in Section 4 are applied and the leptons are required to satisfy p T > 20 GeV. Events must include at least four reconstructed jets to suppress tt and ttW backgrounds, among which either one or two are required to be b-tagged. A slight disagreement is observed between the Standard Model prediction and the data for events containing two same-charge light leptons and three or more b-jets. To avoid any potential systematic bias, these events are vetoed, at no expense in sensitivity.
Two independent BDTs are trained using the selected events. The first aims to separate the signal from the non-prompt and fake background, while the second aims to separate the signal from the ttV background. The data-driven estimate of the non-prompt and fake background described in Section 6.2.1 is used in the training, which is performed for both BDTs with the nine variables listed in Table 6. The outputs of the two BDT classifiers are combined to maximize the signal significance.
A cross-check is provided by an independent cut-and-count analysis using twelve categories, which places requirements on the jet multiplicity, b-tagged jet multiplicity and the lepton flavor.

3$\ell$ channel
Selected events are required to include exactly three reconstructed light leptons with the total charge equal to ±1. The lepton of opposite charge to the other two is found to be prompt in 97% of the selected events in $t\bar t$ simulated samples and is therefore only required to be loose, isolated and to pass the non-prompt BDT selection requirements, as described in Section 4. To reduce the background from fake and non-prompt leptons, leptons in the same-charge pair are required to be very tight and to satisfy $p_{\mathrm{T}} > 15$ GeV. Events containing a same-flavor opposite-charge lepton pair with an invariant mass below 12 GeV are removed to suppress background from resonances that decay to light lepton pairs. A Z-veto is applied, excluding events containing a same-flavor opposite-charge lepton pair with an invariant mass within 10 GeV of the $Z$ mass, to suppress the $t\bar t Z$ background. Finally, to eliminate potential backgrounds from $Z$ decays in which one lepton has very low momentum and is not reconstructed, the three-lepton invariant mass must satisfy $|m(3\ell) - 91.2\ \mathrm{GeV}| > 10$ GeV.
Table 3 (selection criteria; rows for the channels with $\tau_{\mathrm{had}}$ candidates):
1$\ell$+2$\tau_{\mathrm{had}}$: one tight light lepton with $p_{\mathrm{T}} > 27$ GeV; two medium $\tau_{\mathrm{had}}$ candidates of opposite charge, at least one being tight; $N_{\mathrm{jets}} \geq 3$.
2$\ell$SS+1$\tau_{\mathrm{had}}$: two very tight light leptons with $p_{\mathrm{T}} > 15$ GeV; same-charge light leptons; one medium $\tau_{\mathrm{had}}$ candidate, with charge opposite to that of the light leptons; $N_{\mathrm{jets}} \geq 4$; $|m(ee) - 91.2\ \mathrm{GeV}| > 10$ GeV for $ee$ events.
2$\ell$OS+1$\tau_{\mathrm{had}}$: two loose and isolated light leptons with $p_{\mathrm{T}} > 25, 15$ GeV; opposite-charge light leptons; one medium $\tau_{\mathrm{had}}$ candidate; $m(\ell^+\ell^-) > 12$ GeV and $|m(\ell^+\ell^-) - 91.2\ \mathrm{GeV}| > 10$ GeV for the SFOC pair; $N_{\mathrm{jets}} \geq 3$.
3$\ell$+1$\tau_{\mathrm{had}}$: 3$\ell$ selection, except: one medium $\tau_{\mathrm{had}}$ candidate, with charge opposite to the total charge of the light leptons; the two same-charge light leptons must be tight and have $p_{\mathrm{T}} > 10$ GeV; the opposite-charge light lepton must be loose and isolated.
Table 4: Summary of the basic characteristics of the seven analysis channels. The lepton selection follows the definition in Table 2 and is labeled as loose (L), loose and isolated (L$^{\dagger}$), loose, isolated and passing the non-prompt BDT (L*), tight (T) and very tight (T*), respectively. The $\tau_{\mathrm{had}}$ selection is labeled as medium (M) and tight (T).
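The three mass requirements of the 3$\ell$ channel (low-mass veto, Z-veto and three-lepton mass veto) can be sketched as below. This is an illustration under stated assumptions: leptons are treated as massless, the dictionary layout ('pt', 'eta', 'phi', 'flavor', 'charge') is hypothetical, and a $Z$ mass of 91.2 GeV is used as in the selection criteria.

```python
import math
from itertools import combinations

Z_MASS_GEV = 91.2


def invariant_mass(leptons):
    """Invariant mass (GeV) of a set of leptons, assumed massless."""
    e = px = py = pz = 0.0
    for lep in leptons:
        px += lep["pt"] * math.cos(lep["phi"])
        py += lep["pt"] * math.sin(lep["phi"])
        pz += lep["pt"] * math.sinh(lep["eta"])
        e += lep["pt"] * math.cosh(lep["eta"])
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def passes_3l_mass_vetoes(leptons):
    """Apply the pair-mass and three-lepton-mass requirements to three light leptons."""
    for a, b in combinations(leptons, 2):
        # Same-flavor opposite-charge (SFOC) pairs only
        if a["flavor"] == b["flavor"] and a["charge"] == -b["charge"]:
            m_pair = invariant_mass([a, b])
            if not (m_pair > 12.0 and abs(m_pair - Z_MASS_GEV) > 10.0):
                return False
    # Veto Z decays where one soft lepton is not reconstructed
    return abs(invariant_mass(leptons) - Z_MASS_GEV) > 10.0
```

For example, an event containing an $e^+e^-$ pair with $m(ee) = 91.2$ GeV fails the Z-veto regardless of the third lepton.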
Selected events are classified using a five-dimensional multinomial boosted decision tree. The five classification targets used in the training are $t\bar t H$, $t\bar t W$, $t\bar t Z$, $t\bar t$ and diboson. In total, 28 variables based on topological aspects of the events, as listed in Table 6, are used in the training. The output discriminants are mapped into five categories ($t\bar t H$, $t\bar t W$, $t\bar t$, $t\bar t Z$ and diboson) to maximize the signal significance using a variable multidimensional binning procedure [97], while accounting for the uncertainties in the background estimates. The $t\bar t H$ category is the signal region and the remaining four categories are control regions. Events not explicitly assigned to any category are found to largely contain non-prompt or fake leptons and hence are included in the $t\bar t$ category. The Z-veto is removed during the categorization process and then applied in the $t\bar t H$, $t\bar t W$ and $t\bar t$ categories because this was found to decrease the $t\bar t Z$ background in the signal region. The data-driven estimate of the non-prompt and fake background described in Section 6.2.1 is used for the categorization process, while the simulation is used for the training due to the small size of the sample used in the non-prompt estimate. The $t\bar t H$ discriminant is used in the signal region.
A cross-check is provided by an independent cut-and-count analysis using twelve categories, which places requirements on the jet multiplicity, b-tagged jet multiplicity, the lepton flavor and the invariant mass of the opposite-charge pair of leptons with the smallest ∆R separation.

4$\ell$ channel
Selected events are required to include exactly four loose light leptons with the total charge equal to zero. To reduce the background from fake and non-prompt leptons, the third and fourth leptons ordered by decreasing transverse momentum are required to satisfy the tight selection requirements described in Section 4. No requirements are applied to the number of $\tau_{\mathrm{had}}$ candidates, and any jets also reconstructed as $\tau_{\mathrm{had}}$ candidates are treated only as jets. To further suppress the $t\bar t Z$ background, the Z-veto described for the 3$\ell$ channel in Section 5.2 is applied. To suppress background from resonances that decay to light leptons, events containing a same-flavor opposite-charge lepton pair with an invariant mass below 12 GeV are also removed. To reduce contamination from other Higgs boson production processes and to ensure minimal overlap with the dedicated search for $t\bar t H$ production with $H \to ZZ^* \to 4\ell$ [38] decay, an $H \to 4\ell$ veto, $|m(4\ell) - 125\ \mathrm{GeV}| > 5$ GeV, is applied.
Selected events are separated by the presence or absence of a same-flavor, opposite-charge lepton pair into two categories, referred to respectively as the Z-enriched and Z-depleted categories. Background events in the Z-enriched category can arise from off-shell $Z^*$ and $\gamma^* \to \ell^+\ell^-$ processes, while in the Z-depleted category these backgrounds are absent. Therefore, a BDT is trained in the Z-enriched category to further discriminate the signal from the $t\bar t Z$ background. Seven variables listed in Table 6 are used in the training, including a pseudo-matrix-element discriminator exploiting partially reconstructed resonances ($t$, $H$ and $Z$) [98]. A requirement on the BDT discriminant is then imposed to define the Z-enriched signal region.
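The four-lepton vetoes and the Z-enriched/Z-depleted split can be illustrated with a minimal sketch. It is not the analysis code: the `Lepton` class, the function names and the massless-lepton approximation are assumptions made for illustration, and the Z-veto is assumed to reject any same-flavor, opposite-charge pair within 10 GeV of 91.2 GeV (the form of the dilepton-mass criterion that appears in Table 7). Lepton identification (loose/tight) and the BDT requirement are not modeled.

```python
import math
from dataclasses import dataclass
from itertools import combinations

Z_MASS = 91.2       # GeV, value appearing in the Table 7 dilepton-mass criterion
HIGGS_MASS = 125.0  # GeV


@dataclass
class Lepton:
    """Hypothetical container for a light lepton (illustration only)."""
    flavor: str  # "e" or "mu"
    charge: int  # +1 or -1
    pt: float    # GeV
    eta: float
    phi: float


def inv_mass(leptons):
    """Invariant mass in GeV, treating the leptons as massless."""
    e = px = py = pz = 0.0
    for lep in leptons:
        px += lep.pt * math.cos(lep.phi)
        py += lep.pt * math.sin(lep.phi)
        pz += lep.pt * math.sinh(lep.eta)
        e += lep.pt * math.cosh(lep.eta)
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def sfoc_pairs(leptons):
    """All same-flavor, opposite-charge lepton pairs."""
    return [(a, b) for a, b in combinations(leptons, 2)
            if a.flavor == b.flavor and a.charge == -b.charge]


def classify_4l(leptons, z_window=10.0):
    """Return 'Z-enriched', 'Z-depleted', or None if the event is rejected."""
    if len(leptons) != 4 or sum(lep.charge for lep in leptons) != 0:
        return None
    pairs = sfoc_pairs(leptons)
    for a, b in pairs:
        m_ll = inv_mass([a, b])
        if m_ll < 12.0:                     # low-mass resonance veto
            return None
        if abs(m_ll - Z_MASS) <= z_window:  # assumed form of the Z-veto
            return None
    if abs(inv_mass(leptons) - HIGGS_MASS) <= 5.0:  # H -> 4l veto
        return None
    return "Z-enriched" if pairs else "Z-depleted"
```

An event with no same-flavor, opposite-charge pair can only be rejected by the total-charge or $H \to 4\ell$ requirements, which is why the Z-depleted category is free of the $Z^*/\gamma^*$ backgrounds.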

$1\ell$+$2\tau_{\text{had}}$ channel
Selected events are required to include exactly one tight light lepton and exactly two medium $\tau_{\text{had}}$ candidates of opposite charge. At least one of the $\tau_{\text{had}}$ candidates is required to be tight. In order to suppress the $t\bar t$ and $t\bar t V$ backgrounds, events must include at least three reconstructed jets. A BDT is trained to further reduce the main $t\bar t$ background, in which events have one or two fake $\tau_{\text{had}}$ candidates. Seven variables listed in Table 6 are used in the training, including the invariant mass of the visible decay products of the $\tau_{\text{had}}\tau_{\text{had}}$ system.

$2\ell$SS+$1\tau_{\text{had}}$ channel
Selected events are required to contain exactly one medium $\tau_{\text{had}}$ candidate but otherwise to meet the requirements for the $2\ell$SS channel discussed in Section 5.1, except that the light-lepton $p_{\text{T}}$ threshold is lowered from 20 to 15 GeV and that events with three or more $b$-jets are included. The reconstructed charge of the $\tau_{\text{had}}$ candidate must be opposite to that of the light leptons. The Z-veto is applied to dielectron events to suppress Z+jets events with a misassigned charge. A BDT is trained using the 13 variables listed in Table 6 on events with relaxed selection requirements: the light leptons are required to be loose instead of tight and the requirement on the number of jets is reduced to two. This BDT is used to further reduce the $t\bar t$ background.
A cross-check is provided by an independent cut-and-count analysis using three categories, which places requirements on the maximum $|\eta|$ of the two light leptons and the $p_{\text{T}}$ of the subleading jet.

$2\ell$OS+$1\tau_{\text{had}}$ channel
Selected events are required to include exactly two reconstructed loose and isolated leptons of opposite charge with leading (subleading) $p_{\text{T}} > 25$ (15) GeV, and exactly one medium $\tau_{\text{had}}$ candidate. In order to reduce the $t\bar t$, Z+jets and $t\bar t V$ backgrounds, events must include at least three reconstructed jets. The Z-veto is applied to same-flavor lepton pairs to suppress the Z+jets background with a fake $\tau_{\text{had}}$ candidate. To suppress background from resonances that decay to light leptons, events containing a same-flavor lepton pair with an invariant mass below 12 GeV are also removed. A BDT is trained using the 13 variables listed in Table 6 on the selected events, with the aim of further reducing the main $t\bar t$ background with a fake $\tau_{\text{had}}$ candidate.

$3\ell$+$1\tau_{\text{had}}$ channel
Selected events are required to contain exactly one medium $\tau_{\text{had}}$ candidate but otherwise to meet the requirements for the $3\ell$ channel discussed in Section 5.2, except that the two same-charge leptons must be tight and have $p_{\text{T}} > 10$ GeV and the opposite-charge lepton must be loose and isolated. The reconstructed charge of the $\tau_{\text{had}}$ candidate must be opposite to the total charge of the light leptons. Due to the high purity of the signal, no further selection is required and only the event yields are used in the fit.

Channel summary
Twelve categories are defined in the previous subsections: eight signal regions and four control regions (CR) from the $3\ell$ channel. The fraction of the expected signal arising from different Higgs boson decay modes in each signal region is shown in Figure 4 (left). The signal-to-background ratio $S/B$ for each signal and control region is shown in Figure 4 (right). The ratio ranges from 0.014 to almost 2. The ratio $S/\sqrt{B}$ is also indicated. The acceptance for each channel is shown in Table 5. The background composition in each region is shown in Figure 5. The background prediction methods are described in the next section. Multivariate techniques have been applied in most channels to improve the discrimination between the signal and the background. The variables used in each channel are indicated in Table 6. The modeling of each variable was checked and no significant disagreement between data and simulation was found.

Table 6: Variables used in the multivariate analysis (denoted by ×) for the $2\ell$SS, $3\ell$, $4\ell$ (Z-enriched category), $1\ell$+$2\tau_{\text{had}}$, $2\ell$SS+$1\tau_{\text{had}}$ and $2\ell$OS+$1\tau_{\text{had}}$ channels. For $2\ell$SS and $2\ell$SS+$1\tau_{\text{had}}$, lepton 0 and lepton 1 are the leading and subleading leptons, respectively. For $3\ell$, lepton 0 is the lepton with charge opposite to that of the same-charge pair, while the same-charge leptons are labeled with increasing index (lepton 1 and lepton 2) as $p_{\text{T}}$ decreases. The best Z-candidate dilepton invariant mass is the mass of the dilepton pair closest to the Z boson mass. The variables also used in the cross-check analyses are indicated by a *.

Table 6 columns: Variable, $2\ell$SS, $3\ell$, $4\ell$, $1\ell$+$2\tau_{\text{had}}$, $2\ell$SS+$1\tau_{\text{had}}$, $2\ell$OS+$1\tau_{\text{had}}$ (first row group: Lepton properties).

Figure 5: The fractional contributions of the various backgrounds to the total predicted background in each of the twelve analysis categories. The background prediction methods are described in Section 6: "Non-prompt", "Fake $\tau_{\text{had}}$" and "q mis-id" refer to the data-driven background estimates (largely $t\bar t$ but also including other electroweak processes), and rare processes ($tZ$, $tW$, $tWZ$, $t\bar t WW$, triboson production, $t\bar t t$, $t\bar t t\bar t$, $tH$, rare top decay) are labeled as "Other".

Background estimation
The irreducible backgrounds all have selected light leptons produced in W or Z/γ * boson decays or leptonic τ decays (prompt leptons, Section 6.1). The reducible backgrounds have at least one lepton arising from another source (Section 6.2). In the latter case, light leptons originate from heavy-flavor hadron decays, photon conversions, improper reconstruction of other particles such as hadronic jets, or prompt leptons whose charge is misassigned. Such misidentified and non-prompt light leptons are collectively referred to as non-prompt leptons in the following, as this is the dominant source. The fake τ had candidates are typically jets, including HF jets.

Backgrounds with prompt leptons
Background contributions with prompt leptons originate from a wide range of processes and the relative importance of individual processes varies by channel. The largest backgrounds with prompt leptons are from top quark pair production in association with a vector boson, $t\bar t W$ and $t\bar t (Z/\gamma^*)$, and diboson production, $VV$. These background estimates are a crucial part of the analysis, because their final state and kinematics are similar to the signal. In addition, there are contributions from a number of rare processes: rare top decay, $tZ$, $tW$, $tWZ$, $t\bar t WW$, $VVV$, $t\bar t t$ and $t\bar t t\bar t$ production. The associated production of single top quarks with a Higgs boson, which contributes at most 2% in any signal region, is also considered as a background process. All other Higgs boson production mechanisms contribute negligibly (<0.2%) in any signal region.
All these backgrounds are estimated from simulation using the samples described in Section 3. The systematic uncertainties in the modeling of these processes by the simulation are discussed in Section 7.
The prompt-lepton estimates were validated in various regions, as illustrated in Figure 6 for the $3\ell$ $t\bar t Z$ and $t\bar t W$ control regions.

Backgrounds with non-prompt leptons and fake τ had candidates
Data-driven methods are used to estimate the backgrounds with non-prompt light leptons and fake τ had candidates, defining control regions enriched in such backgrounds and extrapolating the observed yields to the signal regions. The control regions used for this purpose are summarized in Table 7. They are orthogonal to the signal regions. Figure 7 summarizes the origin of the non-prompt leptons and fake τ had candidates in these control regions and some signal regions based on predictions from simulation, where the statistical uncertainties of the absolute fractions can be as large as 7%. Table 8 summarizes the strategies used to estimate the non-prompt lepton and fake τ had backgrounds in each of the channels, motivated by the different event topologies and the statistical power available in the control regions. The matrix method and fake-factor method are largely similar, but differ in that the fake-factor method estimates the prompt contribution from simulation, while the matrix method uses the measured prompt lepton efficiency from data.
Figure 6: Comparison of data and prediction of the jet multiplicity in the (left) $3\ell$ $t\bar t Z$ and the (right) $3\ell$ $t\bar t W$ control regions. The last bin in each figure contains the overflow. The bottom panel displays the ratio of data to the total prediction. The hatched area represents the total uncertainty in the background. The background prediction for non-prompt leptons is described in Section 6.2 and the other backgrounds are normalized according to the predictions from simulation.

Figure 7: The composition from simulation of (left) the fake and non-prompt light leptons and (right) the fake $\tau_{\text{had}}$ in selected analysis regions. The light-lepton composition is shown separately depending on the lepton flavor in the regions used in the estimate of the non-prompt contribution. The control regions labeled '2lSSxx' are used for the $2\ell$SS and $3\ell$ channels; those labeled '3lx' are used for the $4\ell$ channel, where x denotes the flavor of the lowest-$p_{\text{T}}$ lepton, and those labeled '2lSSx+1τ' are used for the $2\ell$SS+$1\tau_{\text{had}}$ channel. The non-prompt lepton background has been separated into the components from $b$-jets, $c$-jets, other jets, $J/\psi$, photon conversions and other contributions. The latter includes pion, kaon and non-prompt tau decays and cases where reconstructed leptons cannot be assigned unambiguously to a particular source. The $\tau_{\text{had}}$ composition is shown both in the control regions used in the estimates and in the signal regions of each channel. The $\tau_{\text{had}}$ background has been separated into the components from $b$-jets, $c$-jets, light-quark jets, gluon jets, electrons and other contributions. The latter includes muons, hadrons and cases where reconstructed leptons cannot be assigned unambiguously to a particular source.

Table 7: Selection criteria applied to define the control regions used for the non-prompt lepton (top part) and fake $\tau_{\text{had}}$ (bottom part) estimates. The $2\ell$SS CR is used for both the $2\ell$SS and $3\ell$ channels, as indicated by putting $3\ell$ in parentheses. Same-flavor, opposite-charge (same-charge) lepton pairs are referred to as SFOC (SFSC) pairs.

Channel, Region, Selection criteria
$2\ell$SS ($3\ell$): $2 \le N_{\text{jets}} \le 3$ and $N_{b\text{-jets}} \ge 1$; one very tight, one loose light lepton with $p_{\text{T}} > 20$ (15)
Zero or one medium $\tau_{\text{had}}$ candidate, opposite in charge to the light leptons
$1\ell$+$2\tau_{\text{had}}$: $N_{\text{jets}} \ge 3$ and $N_{b\text{-jets}} \ge 1$; one tight light lepton, with $p_{\text{T}} > 27$ GeV; two $\tau_{\text{had}}$ candidates of same charge; at least one $\tau_{\text{had}}$ candidate has to satisfy tight identification criteria
$2\ell$OS+$1\tau_{\text{had}}$: two loose and isolated light leptons, with $p_{\text{T}} > 25$, 15 GeV; one loose $\tau_{\text{had}}$ candidate; $|m(\ell^+\ell^-) - 91.2\ \text{GeV}| > 10$ GeV and $m(\ell^+\ell^-) > 12$ GeV; $N_{\text{jets}} \ge 3$ and $N_{b\text{-jets}} = 0$

Non-prompt leptons in the $2\ell$SS and $3\ell$ channels
The non-prompt lepton background in the $2\ell$SS and $3\ell$ channels is a mixture of leptons from semileptonic HF decays and conversions. These backgrounds are estimated using a matrix method similar to that described in Refs. [99, 100]. The matrix method estimates the number of non-prompt leptons in the signal region by selecting events passing all selection requirements except the tight-lepton requirements and splitting the events into four categories. The four categories contain exactly two tight leptons, one tight and one loose-but-not-tight lepton, one loose-but-not-tight and one tight lepton, and two loose-but-not-tight leptons (where the leptons are ordered according to their $p_{\text{T}}$). The probabilities for both the loose prompt and non-prompt leptons to be tight are measured in control regions independent from the signal regions. These are used to estimate the number of non-prompt events in the signal regions via the following formula: $f_{\text{SR}} = w_{TT} N_{TT} + w_{\bar{T}T} N_{\bar{T}T} + w_{T\bar{T}} N_{T\bar{T}} + w_{\bar{T}\bar{T}} N_{\bar{T}\bar{T}}$. The $w$ weights depend on the measured prompt and non-prompt lepton efficiencies, and $T$ and $\bar{T}$ denote leptons passing the tight and loose-but-not-tight lepton selections, respectively. In the $2\ell$SS channel, the method allows either of the candidate leptons to be non-prompt, while in the $3\ell$ channel, the opposite-charge lepton is assumed to always be prompt, as is seen in the simulation for 97% of the cases. The efficiencies are measured separately for electrons and muons.

Table 8: Summary of the non-prompt lepton and fake $\tau_{\text{had}}$ background estimate strategies of the seven analysis channels. DD means data-driven background estimates and the techniques used are the matrix method (MM) and the fake-factor method (FF). The scale factor method (SF), which scales the estimate from simulation by a correction factor measured in data, is partially data-driven. The lower half of the table lists the selection requirements used to define the control regions. The lepton selection follows the same convention as in Table 2 and is labeled as loose (L), loose and isolated (L$^\dagger$), loose, isolated and passing the non-prompt BDT (L*), tight (T) and very tight (T*), respectively. Analogously, the $\tau_{\text{had}}$ selection is labeled as medium (M) and tight (T).

Table 8 columns: $2\ell$SS, $3\ell$, $4\ell$, $1\ell$+$2\tau_{\text{had}}$, $2\ell$SS+$1\tau_{\text{had}}$, $2\ell$OS+$1\tau_{\text{had}}$, $3\ell$+$1\tau_{\text{had}}$ (first row: Non-prompt lepton strategy).
The control regions used to measure the prompt ($\varepsilon_{\text{real}}$) and non-prompt ($\varepsilon_{\text{fake}}$) lepton efficiencies are defined in Table 7. They have lower jet multiplicity than the signal regions. The lepton efficiencies are parameterized as a function of $p_{\text{T}}$. The non-prompt electron efficiency is additionally parameterized as a function of the number of $b$-jets in the events to account for changes in the composition of fakes. The non-prompt muon efficiency is additionally parameterized as a function of the angular distance between the lepton and the closest jet to account for effects of nearby jets. The residual prompt background in the control regions is subtracted using the prediction from simulation, while the background from charge misassignment is subtracted using the estimate described in Section 6.2.4.
The efficiency for electrons from conversions is significantly higher than that for electrons from HF decays; therefore the change in the fraction of conversions when going from the control to the signal regions is estimated from simulation and used to correct $\varepsilon_{\text{fake}}$. Systematic uncertainties in this correction are estimated to be 40%. They include a 15% uncertainty in the modeling of conversions in the simulation [101], a 20% uncertainty from a measurement of $t\bar t\gamma$ [102], a 50% uncertainty in the modeling of semileptonic $b$-decays and the uncertainties in the non-prompt lepton efficiencies.
The performance of the matrix method was tested in simulation using a closure test by comparing the prediction from the method to the results from the simulation. Closure tests were performed for each channel using $t\bar t$ simulation and the level of the non-closure is found to be at most (11 ± 8)% and (9 ± 18)% for the $2\ell$SS and $3\ell$ channels, respectively, which is accounted for as a systematic uncertainty. Additional systematic uncertainties due to the subtraction of the prompt backgrounds in the control regions are included. The total uncertainty in the non-prompt lepton estimate varies from 20% for $e^\pm\mu^\pm$ to 30% for $3\ell$. The ratio of the non-prompt background yield in data to the prediction from simulation is found to be 2.0 ± 0.5 for $ee$, 1.5 ± 0.5 for $\mu\mu$ and 1.7 ± 0.4 for $e\mu$ in the $2\ell$SS signal region. It is 1.8 ± 0.8 for $3\ell$ in the signal region and 2.2 ± 0.5 in the $t\bar t$ control region. The non-prompt lepton estimates were validated in various regions, as illustrated in Figures 8(a) and 8(b) in a region identical to the $2\ell$SS signal region except for being orthogonal in the $N_{\text{jets}}$ requirement (low multiplicity, $N_{\text{jets}} = 2, 3$).
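To make the structure of the four-category weighted sum concrete, the following is a minimal sketch of a generic two-lepton matrix method. It is an illustration under stated assumptions, not the analysis implementation: the function names are hypothetical, the efficiencies are taken as single numbers per lepton (in the analysis they are parameterized in $p_{\text{T}}$ and other variables, so the weights would be evaluated event by event), and the weights are the textbook ones obtained by inverting the $2\times 2$ efficiency matrix of each lepton.

```python
def matrix_method_weights(r1, f1, r2, f2):
    """Weights for the (TT, TL, LT, LL) categories, L = loose-but-not-tight.

    r_i / f_i are the probabilities for a loose prompt / non-prompt lepton i
    to also pass the tight selection (assumed inputs, with r_i != f_i).
    """
    d = (r1 - f1) * (r2 - f2)
    rr = r1 * r2
    return (
        1.0 - rr * (1.0 - f1) * (1.0 - f2) / d,  # w_TT
        rr * (1.0 - f1) * f2 / d,                # w_TL: lepton 2 fails tight
        rr * f1 * (1.0 - f2) / d,                # w_LT: lepton 1 fails tight
        -rr * f1 * f2 / d,                       # w_LL
    )


def nonprompt_yield(counts, r1, f1, r2, f2):
    """Estimated tight-tight yield with at least one non-prompt lepton.

    counts = (N_TT, N_TL, N_LT, N_LL) observed in the loose selection.
    """
    weights = matrix_method_weights(r1, f1, r2, f2)
    return sum(w * n for w, n in zip(weights, counts))
```

The weights follow from solving for the number of events with two prompt leptons and subtracting their tight-tight contribution from $N_{TT}$. A sample that contains only prompt leptons therefore yields an estimate of zero, which is the kind of closure property tested in simulation.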

Non-prompt leptons in the $4\ell$ channel
A semi-data-driven estimate of the non-prompt leptons is used in the $4\ell$ channel. Leptons are separated according to their origin: prompt, heavy-flavor and light-flavor, with the latter designation including leptons from photon conversions. As the rate of non-prompt muons originating from light-flavor hadrons is extremely low, the muons of heavy- and light-flavor origin are treated together. The control region defined in Table 7 for the non-prompt lepton estimate in the $4\ell$ channel, where three light leptons are required, is used. It is composed of roughly 50% Z+jets, 30% diboson and 20% $t\bar t$ events. The control region is separated into four categories according to the flavor of the leptons ($eee$, $ee\mu$, $e\mu\mu$ and $\mu\mu\mu$) and a fit to the leading jet $p_{\text{T}}$ distribution is performed to extract three normalization factors: $\lambda_e^{\text{heavy}} = 1.48 \pm 0.22$, $\lambda_e^{\text{light}} = 0.72 \pm 0.53$ and $\lambda_\mu = 0.66 \pm 0.19$, where the uncertainties are statistical. The normalization factors are applied to all events containing non-prompt leptons to correct the yields from the simulation in each category to data. The composition of the non-prompt leptons in the control region is shown in Figure 7(a). The systematic uncertainty in each normalization factor is estimated to be 30% by varying the $p_{\text{T}}$ requirements on the leptons. The non-prompt lepton estimates were validated in various regions, as illustrated in Figure 8(c) in a region identical to the control region used to extract the normalization factors except for being orthogonal in the $N_{\text{jets}}$ requirement (higher multiplicity, $N_{\text{jets}} > 2$).

Non-prompt leptons and fake $\tau_{\text{had}}$ candidates in other channels
In the $3\ell$+$1\tau_{\text{had}}$, $2\ell$OS+$1\tau_{\text{had}}$ and $1\ell$+$2\tau_{\text{had}}$ channels, the background from non-prompt light leptons is a few percent and is estimated from simulation, but the fake $\tau_{\text{had}}$ background, mainly arising from $t\bar t$ and $t\bar t V$, is estimated from data. In the $2\ell$SS+$1\tau_{\text{had}}$ channel, both backgrounds are significant and hence are estimated from data.
In the $2\ell$OS+$1\tau_{\text{had}}$ channel, the fake-factor method is used to estimate the background from events containing a fake $\tau_{\text{had}}$ candidate. The method assumes that the real contribution is described well by simulation. The fake factors are estimated using the control region defined in Table 7, which applies the nominal $2\ell$OS+$1\tau_{\text{had}}$ selection but requires at least three jets and vetoes events containing $b$-jets. The fake factors are parameterized as a function of $p_{\text{T}}^{\tau_{\text{had}}}$ and no significant dependence on other key event properties was found. Systematic uncertainties include the statistical uncertainty in the control regions, differences in the fake composition between the control and signal regions and the variation in the fake factors between different control regions. The total systematic uncertainty in the fake $\tau_{\text{had}}$ background estimate in this channel is 11%. Figure 8(d) illustrates a validation of this estimate in the $2\ell$OS+$1\tau_{\text{had}}$ selection region, which is largely dominated by events with a fake $\tau_{\text{had}}$.
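A generic fake-factor calculation, binned in the $\tau_{\text{had}}$ transverse momentum, can be sketched as follows. This is a schematic illustration only: the definition of the "pass" and "fail" identification samples, the binning and all function names are assumptions introduced here, and the source specifies only that the real contribution is taken from simulation and that the factors depend on $p_{\text{T}}^{\tau_{\text{had}}}$.

```python
def fake_factors(cr_pass_data, cr_pass_real, cr_fail_data, cr_fail_real):
    """Per-bin fake factors measured in a control region.

    Each argument is a list of yields per tau pT bin. The '*_real' lists are
    the simulated contributions from real tau_had, subtracted from data.
    """
    return [
        (p_data - p_real) / (f_data - f_real)
        for p_data, p_real, f_data, f_real
        in zip(cr_pass_data, cr_pass_real, cr_fail_data, cr_fail_real)
    ]


def fake_estimate(factors, sr_fail_data, sr_fail_real):
    """Fake-tau yield in the signal region.

    Events that satisfy the signal selection but fail the tau identification
    are weighted by the fake factor of their pT bin, after subtracting the
    simulated real-tau contribution.
    """
    return sum(
        ff * (n_data - n_real)
        for ff, n_data, n_real in zip(factors, sr_fail_data, sr_fail_real)
    )
```

With hypothetical yields, `fake_factors([30.0, 12.0], [10.0, 2.0], [100.0, 50.0], [0.0, 0.0])` gives one factor per bin, which `fake_estimate` then applies to the failing-identification sample of the signal selection.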
As the origin of the $\tau_{\text{had}}$ fakes is very similar between the channels, as demonstrated in Figure 7(b), an extrapolation is made to the $2\ell$SS+$1\tau_{\text{had}}$ and $3\ell$+$1\tau_{\text{had}}$ channels. The fake factors derived in the $2\ell$OS+$1\tau_{\text{had}}$ channel are converted into a scale factor to correct the simulation of fake $\tau_{\text{had}}$ candidates coming from jets in order to better describe the data. The scale factor is derived in the $2\ell$OS+$1\tau_{\text{had}}$ control region and then applied in the respective signal regions. Its dependence on $p_{\text{T}}$ was found to be negligible. Uncertainties in the scale factor are derived by comparing the value in the nominal control region to those obtained in control regions enriched in $t\bar t$ and Z boson events, respectively. The final scale factor is 1.36 ± 0.16 including statistical and systematic uncertainties.
In the $2\ell$SS+$1\tau_{\text{had}}$ channel, this scale factor is applied only to backgrounds containing prompt leptons and fake $\tau_{\text{had}}$ candidates. An additional fake-factor method is used to estimate the background from events containing non-prompt light leptons. This fake factor is derived in a control region defined in Table 7, which differs from the signal region by looser lepton requirements and lower jet multiplicity. As in the $2\ell$SS and $3\ell$ non-prompt lepton estimates, the change in the fraction of conversions from the control to the signal region is taken into account, with the same associated uncertainties. The total systematic uncertainty in the non-prompt lepton estimate in this channel is 55%, dominated by the statistical uncertainty in the closure test of the method found in simulation.
The dominant background in the $1\ell$+$2\tau_{\text{had}}$ signal region is $t\bar t$ production where one or two $\tau_{\text{had}}$ are fakes from $t\bar t$ decays. As there is equal probability for a jet to be reconstructed as a positively or negatively charged $\tau_{\text{had}}$, the fakes are estimated from a control region identical to the signal region except that the $\tau_{\text{had}}$ candidates are required to have the same charge, as shown in Table 7. This region contains almost entirely fakes from $t\bar t$ decays. The estimate is extrapolated to the signal region after using simulation to subtract the contribution from real $\tau_{\text{had}}$ in the control region. Using simulation, the non-closure of this method was found to be below 30%, which is included as a systematic uncertainty.

Charge misassignment
The electron charge misassignment rate is measured in data, and the corresponding background is taken into account in the $2\ell$SS and $2\ell$SS+$1\tau_{\text{had}}$ channels and, indirectly, in the $3\ell$ channel via the non-prompt background estimate, by scaling opposite-charge data events by this rate. The measurement is performed within a sample of $Z \to ee$ events reconstructed as same-charge pairs and as opposite-charge pairs. Six bins in $|\eta|$ and four bins in $p_{\text{T}}$ are used. The bins were chosen in accord with the size of the event sample and the variation of the rate with $|\eta|$ and $p_{\text{T}}$. The background is subtracted using a sideband method. The charge misassignment rate varies from $5 \times 10^{-5}$ for low-$p_{\text{T}}$ electrons ($p_{\text{T}} \approx 10$ GeV) at small $|\eta|$ to $10^{-2}$ for high-$p_{\text{T}}$ electrons ($p_{\text{T}} \sim 100$ GeV) with $|\eta| > 2$.
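As an illustration of how per-electron rates can be turned into a weight for opposite-charge events, the sketch below uses a commonly adopted convention: the probability that exactly one of the two leptons has its charge flipped, divided by the probability that the pair is still reconstructed with opposite charge. This specific formula and the function name are assumptions made here; the source states only that opposite-charge data events are scaled by the measured rate.

```python
def charge_flip_weight(eps1, eps2):
    """Weight applied to an opposite-charge event to predict same-charge yield.

    eps1, eps2: charge misassignment rates of the two leptons, looked up in
    the (|eta|, pT) bins of the measurement. Use 0.0 for a muon, whose charge
    misassignment is not considered here.
    """
    # Probability that exactly one of the two charges is flipped.
    p_flip = eps1 * (1.0 - eps2) + eps2 * (1.0 - eps1)
    # Observed opposite-charge events are those that were not flipped.
    return p_flip / (1.0 - p_flip)
```

For an $e\mu$ pair only the electron rate enters, so the weight reduces to $\varepsilon/(1-\varepsilon)$, which is close to the rate itself for the small values quoted above.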
The electron charge misassignment measurement is validated by a closure test in simulation using same-charge pairs, with the observed difference between measured and predicted rates being taken as the systematic uncertainty. An additional validation is performed in data by comparing the measured and estimated numbers of same-charge events. The results are found to agree within uncertainties. Additional systematic uncertainties applied to the estimate include the statistical uncertainty from the data and the variation in the rates when the Z-peak range definition is varied. The total systematic uncertainty in the charge misassignment background estimate is about 30%, with the dominant contribution at low $p_{\text{T}}$ from the closure tests and at high $p_{\text{T}}$ from the statistical uncertainty.

Systematic uncertainties
The sources of systematic uncertainty considered in this analysis are summarized in Table 9. They impact the estimated signal and background rates, the migration of events between categories and/or the shape of the BDT discriminants used in the final fit. Systematic uncertainties are implemented in the fit as normalization factors that affect the normalization of a process in a given analysis category or as a shape variation that only affects the distribution of a discriminant in a given category but not its normalization.
The impact of all these systematic uncertainties on the measured signal strength is discussed quantitatively in Section 8.
The uncertainty in the combined 2015+2016 integrated luminosity is 2.1%. It is derived, following a methodology similar to that detailed in Ref. The uncertainties in the b-tagging efficiencies measured in dedicated calibration analyses [88] are also decomposed into uncorrelated components. The large number of components for b-tagging is due to the calibration of the distribution of the BDT discriminant. The approximate relative size of the b-tagging efficiency uncertainty is 2% for b-jets, 10% for c-jets and τs, and 30% for light jets. The impact of the tagging uncertainty for jets containing either c-hadrons or τ had is significant and, due to the calibration procedure applied, is taken as fully correlated between the two jet flavors.
Uncertainties in light-lepton reconstruction, identification, isolation and trigger efficiencies have negligible impact. The uncertainty in the identification efficiency for τ had is 6% [81].
The systematic uncertainties associated with the estimation of the fake and non-prompt lepton backgrounds, as well as electron charge misassignment, are discussed in Section 6. They have large effects on the background estimates in all channels.
The systematic uncertainties associated with the generation of signal and background processes are due to uncertainties in the assumed cross sections and acceptance modeling for each process, and they are assessed in each category. The former are evaluated by varying the cross section of each process within its uncertainty, as described in Section 3. The latter are estimated by comparing the results with those obtained using alternative simulated samples detailed in Section 3. The most important uncertainty arising from theoretical predictions is in the assumed SM cross sections and the modeling of the acceptance for $t\bar t H$, $t\bar t Z$ and $t\bar t W$ production. The uncertainty in the shape of the simulated $t\bar t W$ and $t\bar t Z$ backgrounds due to the choice of event generator varies by at most 10% between bins. The uncertainties for $t\bar t\gamma$, $tZ$, $tWZ$, and $VV(\to XX)$ include extrapolation uncertainties into the analysis phase space.

Table 9: Sources of systematic uncertainty considered in the analysis. "N" means that the uncertainty is taken as normalization-only for all processes and channels affected, whereas "S" denotes uncertainties that are considered shape-only in all processes and channels. "SN" means that the uncertainty applies to both shape and normalization. Some of the systematic uncertainties are split into several components, as indicated by the number in the rightmost column.

Statistical model and results
A maximum-likelihood fit is performed on all twelve categories simultaneously to extract the $t\bar t H$ signal cross section normalized to the prediction from the SM ($\mu$), with the signal acceptance in the different regions derived assuming the SM. The statistical analysis of the data uses a binned likelihood function $L(\mu, \vec{\theta})$, which is constructed from a product of Poisson probability terms to estimate $\mu$. The Higgs boson branching fractions and the cross section for associated production of a Higgs boson and a single top quark, which is treated as background, are set to their SM expectations with appropriate theoretical uncertainties. As mentioned in Section 5 and summarized in Table 11, a BDT shape is used as the final discriminant in five of the eight signal regions. The exceptions are the $4\ell$ Z-enriched (defined after placing a requirement on a BDT discriminant), the $4\ell$ Z-depleted and the $3\ell$+$1\tau_{\text{had}}$ categories, which use a single bin because there are few events. A single bin is also used in the four control regions from the $3\ell$ channel. The total number of bins used in the fit is 32 and the details of each category are presented in Table 11.
The impact of systematic uncertainties on the signal and background expectations is described by nuisance parameters (NPs), $\vec{\theta}$, which are constrained by Gaussian or log-normal probability density functions. The latter are used for normalization factors to ensure that they are always positive. The expected numbers of signal and background events are functions of $\vec{\theta}$. The prior for each NP is added as a penalty term to the likelihood, $L(\mu, \vec{\theta})$, to decrease it when $\theta$ is shifted away from its nominal value. The statistical uncertainties in the simulated background predictions and the control regions used for the non-prompt and fake estimates are included as bin-by-bin NPs using the Beeston-Barlow technique [106].
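Schematically, a binned likelihood with constrained nuisance parameters of the kind described here has the following structure. This is a sketch of the general form only: the bin index $i$, the per-bin signal and background expectations $s_i$ and $b_i$, and the constraint terms $\pi_j$ are notation introduced for illustration and are not taken from the source.

```latex
L(\mu, \vec{\theta}) =
  \prod_{i \,\in\, \text{bins}}
    \text{Pois}\!\left( n_i \,\middle|\, \mu\, s_i(\vec{\theta}) + b_i(\vec{\theta}) \right)
  \times \prod_{j} \pi_j(\theta_j)
```

Here $n_i$ is the observed number of events in bin $i$ and $\pi_j$ is the Gaussian or log-normal constraint of the $j$-th NP. Taking $-\ln L$ turns the product of constraints into a sum of penalty terms that grows as each $\theta_j$ moves away from its nominal value.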
The test statistic, $q_\mu$, is constructed from the profile log-likelihood ratio: $q_\mu = -2 \ln \Lambda_\mu = -2 \ln \left[ L(\mu, \hat{\hat{\vec{\theta}}}) / L(\hat{\mu}, \hat{\vec{\theta}}) \right]$, where $\hat{\mu}$ and $\hat{\vec{\theta}}$ are the parameters that maximize the likelihood and $\hat{\hat{\vec{\theta}}}$ are the NPs that maximize the likelihood for a given $\mu$. The test statistic is used to quantify how well the observed data agree with the background-only hypothesis.
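For intuition, the background-only test can be worked out in closed form for a toy case: a single Poisson bin with a known background and no nuisance parameters. The sketch below is such a toy, not the 32-bin profiled fit of the analysis; the function names are hypothetical, and the square root of the test statistic is the standard asymptotic approximation to the significance.

```python
import math


def q0_single_bin(n_obs, b_exp):
    """Test statistic for mu = 0 in one Poisson bin with known background.

    With L(mu) = Pois(n_obs | mu * s + b_exp), the best-fit signal yield is
    n_obs - b_exp, so -2 ln[L(0) / L(mu_hat)] has the closed form below.
    An observed deficit is not counted as evidence for signal.
    """
    if n_obs <= b_exp:
        return 0.0
    return 2.0 * (n_obs * math.log(n_obs / b_exp) - (n_obs - b_exp))


def significance(n_obs, b_exp):
    """Asymptotic significance Z = sqrt(q0), in standard deviations."""
    return math.sqrt(q0_single_bin(n_obs, b_exp))
```

In the full analysis the same ratio is evaluated with all NPs profiled, which generally reduces the significance relative to this idealized case.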
The fitted $\hat{\mu}$ value is obtained by maximizing the likelihood function with respect to all parameters and the total uncertainty, $\sigma_\mu$, is obtained from the variation of $-2 \ln \Lambda_\mu$ by one unit from its minimum. Systematic uncertainties are found by subtracting in quadrature the statistical uncertainty, determined by fixing all NPs to their best-fit values, from the total uncertainty. The expected results are obtained in the same way as the observed results by replacing the data in each input bin by the prediction from simulation and the data-driven fake and non-prompt estimates with all NPs set to their best-fit values obtained from the fit to data. The significance is obtained from the test statistic in the asymptotic limit [107]. As the $4\ell$ channel has few events, the validity of this assumption was verified using pseudo-experiments.

Table 10: Background, signal and observed yields in the twelve analysis categories in 36.1 fb$^{-1}$ of data at $\sqrt{s} = 13$ TeV. Uncertainties in the background estimates due to systematic effects and to limited simulation sample size are shown. "Non-prompt", "Fake $\tau_{\text{had}}$" and "q mis-id" refer to the data-driven background estimates described in Section 6. Rare processes ($tZ$, $tW$, $tWZ$, $t\bar t WW$, triboson production, $t\bar t t$, $t\bar t t\bar t$, $tH$, rare top decay) are labeled as "Other". In the top part, the pre-fit values are quoted, i.e. using the initial values of background systematic uncertainty nuisance parameters and the signal expected from the SM. In the bottom part, the corresponding post-fit values are quoted. In the post-fit case, the prediction and uncertainties for $t\bar t H$ reflect the best-fit production rate of $1.6^{+0.5}_{-0.4}$ times the Standard Model prediction and the uncertainty in the total background estimate is smaller than for the pre-fit values due to anticorrelations between the nuisance parameters obtained in the fit.

As described in Section 7, a large number of systematic uncertainties, whose effects are accounted for using NPs, affect the final results. In total 315 NPs are considered, most having experimental origin. The experimental uncertainties are fully correlated across categories, with the exception of those related to the quark/gluon jet composition and some uncertainties associated with the fake and non-prompt lepton background determinations, which are specific to the different categories, as detailed in Section 6. As the residual prompt (mainly $t\bar t W$ and $VV$) background contribution is subtracted from the control regions to extract the fake and non-prompt leptons, the associated nuisance parameters are taken as fully correlated with the theoretical cross-section systematic uncertainties. The same treatment is used for the uncertainty associated with the measurement of the background from charge misassignment, which is also subtracted from the control regions.
The fit uses templates constructed from the predicted yields for the signal and the various backgrounds in the bins of the input distribution in each region. The systematic uncertainties are encoded in templates of variations relative to the nominal template for each upward or downward (±σ) variation. A smoothing procedure is applied to remove large local fluctuations in the templates for some background processes in certain regions. Systematic uncertainties that have a negligible impact on the final results are removed to improve the speed of the fit: a normalization or a shape uncertainty is not applied if the associated variation is below 1% in all bins; this reduces the number of nuisance parameters to 230. Most of the neglected nuisance parameters are those related to flavor tagging.
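The 1% pruning criterion can be sketched as follows. This is a simplified illustration, not the procedure used in the analysis: the function names and the dictionary layout are hypothetical, and the sketch tests the total per-bin variation of a template, whereas the text applies the criterion separately to the normalization and shape components.

```python
def is_negligible(nominal, up, down, threshold=0.01):
    """True if both variations stay below the threshold in every bin.

    nominal, up, down: per-bin yields of the nominal template and of its
    upward and downward variations. Empty nominal bins are skipped.
    """
    for n, u, d in zip(nominal, up, down):
        if n <= 0.0:
            continue
        if abs(u - n) / n >= threshold or abs(d - n) / n >= threshold:
            return False
    return True


def prune(nominal, variations, threshold=0.01):
    """Keep only the systematic variations that are not negligible.

    variations: dict mapping a systematic name to its (up, down) templates.
    """
    return {
        name: (up, down)
        for name, (up, down) in variations.items()
        if not is_negligible(nominal, up, down, threshold)
    }
```

A variation that shifts any single bin by 1% or more is retained, so pruning of this kind removes only uncertainties that are small everywhere in the fitted distribution.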
The behavior of the global fit is studied by performing a number of checks including evaluating how much each NP is pulled from its nominal value, how much its uncertainty decreases from the nominal uncertainty and which correlations develop between initially uncorrelated systematic uncertainties. The stability of the results was tested by performing fits for each channel independently and in combination.
The impact of each systematic uncertainty on the final result is assessed by repeating the fit with the corresponding NP fixed to its fitted value shifted up or down by its fitted uncertainty, with all other parameters allowed to vary, and calculating the change ∆µ relative to the baseline fit. The ranking obtained for those nuisance parameters with the largest contribution to the uncertainty in the signal strength is shown in Figure 9. The NP with the largest pull from its nominal value is the uncertainty in the non-prompt lepton estimate due to the non-closure in the 3ℓ channel. This is mainly due to the slight deficit observed in the 3ℓ tt control region relative to the background prediction. As the fit includes bins with high purity of non-prompt light leptons and fake τ_had backgrounds, the precision of these estimates is increased, as shown in Table 10. The correlations between the nuisance parameters were checked and no unexpected correlations were observed.
The impact of the most important groups of systematic uncertainties on the measured value of µ is shown in Table 12. The uncertainties with the largest impact are those associated with the signal modeling, the jet energy scale and the non-prompt light-lepton estimate. The signal uncertainty is separated into two components to show the uncertainty due to the acceptance and the one due to the cross section. The uncertainties in the non-prompt light-lepton estimates, the fake τ_had estimates and the charge misassignment have large statistical components due to the small data sample size. The large impact of the luminosity uncertainty is due to its effect on both the signal and simulated background predictions. Although the individual groups are initially largely uncorrelated, a small correlation is introduced by the fit to data.
The observed and expected best-fit values of µ are shown in Figure 13 and Table 13. The individual channel results are extracted from the full fit but with a separate parameter of interest for each channel.
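The impact procedure described above (fix one NP at its fitted value shifted by its fitted uncertainty, refit, take ∆µ) can be illustrated with a toy. The sketch below assumes a single-bin counting model with Gaussian-constrained NPs that scale only the background; the model, the NP names and all numbers are hypothetical and far simpler than the actual fit.

```python
def fit_mu(n_obs, s, b, deltas, fixed):
    """Best-fit mu for the toy model n = mu*s + b*(1 + sum_k delta_k*theta_k).

    `fixed` maps NP names to the values at which they are held. With a
    single bin and mu free, every NP that is not fixed stays at its
    constrained optimum theta = 0, so using 0 for it is the profiled result.
    """
    shift = sum(deltas[k] * fixed.get(k, 0.0) for k in deltas)
    return (n_obs - b * (1.0 + shift)) / s


def impacts(n_obs, s, b, deltas, theta_err=1.0):
    """Return [(np_name, (dmu_up, dmu_down)), ...] sorted by largest |dmu|.

    theta_err is the fitted NP uncertainty; in this toy the data do not
    constrain the NPs, so it equals the pre-fit value of 1.
    """
    mu_hat = fit_mu(n_obs, s, b, deltas, {})
    result = {}
    for name in deltas:
        up = fit_mu(n_obs, s, b, deltas, {name: +theta_err}) - mu_hat
        down = fit_mu(n_obs, s, b, deltas, {name: -theta_err}) - mu_hat
        result[name] = (up, down)
    return sorted(
        result.items(),
        key=lambda item: max(abs(item[1][0]), abs(item[1][1])),
        reverse=True,
    )


# Hypothetical yields and relative background variations per NP
ranking = impacts(
    n_obs=30.0, s=10.0, b=20.0,
    deltas={"np_a": 0.05, "np_b": 0.20, "np_c": 0.02},
)
```

Here `ranking` lists `np_b` first, since a 20% background shift moves the fitted µ the most. In a real fit the other NPs are re-minimized numerically rather than being known analytically.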
The probability that the fitted signal strengths in the seven channels are compatible is 34%. When assuming that the observed signal is due to the SM Higgs boson, the excess over the SM signal-plus-background hypothesis has a significance of 1.4σ. A model-dependent extrapolation is made to the inclusive phase space, and the measured ttH production cross section is $\sigma(t\bar t H) = 790\,^{+150}_{-150}\,\text{(stat.)}\,^{+170}_{-150}\,\text{(syst.)}\ \text{fb} = 790^{+230}_{-210}$ fb. The predicted cross section is $\sigma(t\bar t H) = 507^{+35}_{-50}$ fb.
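As a quick arithmetic check of the quoted uncertainties, the statistical and systematic components can be combined in quadrature and compared with the total. The helper names below are hypothetical. Because the quoted values are rounded, agreement is expected only at the level of the rounding.

```python
import math


def combine_in_quadrature(*components):
    """Total uncertainty from independent components."""
    return math.sqrt(sum(c * c for c in components))


def subtract_in_quadrature(total, stat):
    """Systematic component obtained from the total and statistical ones."""
    return math.sqrt(total * total - stat * stat)


# Values in fb, as quoted for the measured cross section
stat_up, stat_down = 150.0, 150.0
syst_up, syst_down = 170.0, 150.0

total_up = combine_in_quadrature(stat_up, syst_up)        # about 227 fb
total_down = combine_in_quadrature(stat_down, syst_down)  # about 212 fb
```

Rounded to the nearest 10 fb these reproduce the quoted total of +230 and −210 fb.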

For the 4ℓ, 2ℓOS+1τ_had and 3ℓ+1τ_had channels, the uncertainties in µ are mainly statistical, while the statistical and systematic uncertainties are of comparable size for the 2ℓSS, 3ℓ, 2ℓSS+1τ_had and 1ℓ+2τ_had channels. Figure 14 shows the data, background and signal yields, where the final-discriminant bins in all signal regions are combined into bins of log(S/B), S being the expected signal yield and B the fitted background yield.
The most sensitive analyses, 2ℓSS, 3ℓ and 2ℓSS+1τ_had, were cross-checked with simpler cut-and-count analyses with reduced sensitivity. The observed significance relative to the background-only hypothesis is 1.2σ, 2.3σ and 2.3σ, respectively. The observed signal strengths in the cross-check analyses are found to be statistically compatible with those from the nominal analyses.
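The regrouping of final-discriminant bins by log(S/B) can be sketched as below. This is an illustration under stated assumptions: the logarithm is taken to be base 10, bins with non-positive S or B are skipped, values outside the edge range are placed in the first or last bin, and the function name and input yields are hypothetical.

```python
import math


def rebin_by_log_sb(bins, edges):
    """Sum (S, B, data) of discriminant bins into bins of log10(S/B).

    bins:  iterable of (expected_signal, fitted_background, observed_data)
    edges: increasing list of log10(S/B) bin edges
    Returns a list of [S_sum, B_sum, data_sum], one entry per output bin.
    """
    out = [[0.0, 0.0, 0.0] for _ in range(len(edges) - 1)]
    for s, b, data in bins:
        if s <= 0 or b <= 0:
            continue  # log10(S/B) undefined; skipped in this sketch
        x = math.log10(s / b)
        idx = 0
        # advance to the bin containing x, clamping overflow to the last bin
        while idx < len(edges) - 2 and x >= edges[idx + 1]:
            idx += 1
        out[idx][0] += s
        out[idx][1] += b
        out[idx][2] += data
    return out


# Hypothetical discriminant bins taken from several signal regions
example_bins = [
    (2.0, 1000.0, 990.0),
    (2.0, 100.0, 105.0),
    (3.0, 20.0, 25.0),
    (0.5, 400.0, 410.0),
]
merged = rebin_by_log_sb(example_bins, edges=[-3.0, -2.0, -1.0, 0.0])
```

Ordering by S/B places the most signal-like bins of all regions together, which is what makes a single summary plot of this kind readable.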
An alternative fit in which the ttW and ttZ normalizations were left free together with µ was performed as a cross-check. The expected sensitivity to µ is 15% worse than with the nominal fit. The observed best-fit value of µ is $1.6^{+0.6}_{-0.5}$, in agreement with the result obtained with the nominal fit. The fitted ttW and ttZ cross-section modifiers are $0.92 \pm 0.32$ and $1.17^{+0.25}_{-0.22}$, respectively, in agreement with the SM predictions.
Table 13: Observed and expected best-fit values of the signal strength µ and associated significance under the SM background-only hypothesis. The expected values are shown for the pre-fit background estimates. The observed significance is indicated with a dash (−) for the channels where µ is negative.

Columns: Channel; Best-fit µ (Observed, Expected); Significance (Observed, Expected).

Figure 14: Data, background and signal yields in bins of log(S/B), combining the final-discriminant bins of all signal regions. The background yields are shown as the fitted values, while the signal yields are shown for the fitted value (µ=1.6) and the SM prediction (µ=1). The total background before the fit is shown as a dashed blue histogram. The pull (residual divided by its uncertainty) of the data relative to the background-only prediction is shown in the lower panel, where the full red line (dashed orange line) indicates the pull of the prediction for signal with µ=1.6 (µ=1) and background relative to the background-only prediction. The background is also shown after the fit to data assuming zero signal contribution, as well as its pull (dotted black line) relative to the background from the nominal fit.

Combination of ATLAS ttH searches
In addition to the results reported in Section 8 (referred to hereafter as the multilepton analysis), the ATLAS Collaboration has carried out searches for ttH production at $\sqrt{s} = 13$ TeV using other Higgs boson decay modes:
• H → bb, in the lepton+jets and dileptonic tt final states [36].
• H → γγ, in lepton+jets/dileptonic and all-hadronic tt decay channels [37]. In addition, specialized categories sensitive to tHqb/WtH production also have significant ttH acceptance and are included.
The combined likelihood function $L(\mu, \vec{\theta})$ is obtained from the product of likelihood functions of the individual analyses. The nuisance parameters associated with the same sources in the different analyses are treated as follows:
• Higgs boson production and decay: All analyses use the same nominal production cross sections and decay branching fractions. All theoretical uncertainties associated with these parameters are fully correlated between analyses.
• Background uncertainties: The cross-section and modeling uncertainties for MC-estimated ttZ, ttW, tZqb/WtZ, WZ/ZZ, Wt, tttt, and ttWW production are correlated between the H → bb and multilepton analyses. The modeling systematic uncertainties of the dominant tt background in the H → bb analyses are not applied to any other channels, as the relevant regions of phase space are not similar and other channels have independent methods of estimating the relevant tt background.
• Experimental uncertainties: The dominant experimental systematic uncertainties are associated with the jet energy scale, jet energy resolution, and flavor tagging. Nuisance parameters related to the jet energy scale are correlated between the analyses with the exception of the uncertainty in the fractions of jets initiated by quarks and by gluons, which differs between the channels. The jet energy resolution is correlated between all channels except for the control regions of the H → bb analysis, to avoid constraining this systematic uncertainty in the signal regions; this gives a conservative estimate of the impact. The H → γγ and H → 4ℓ analyses use a different calibration for the flavor-tagging efficiencies and mistag rates compared to the H → bb and multilepton analyses. For this reason, the flavor-tagging uncertainties are correlated between H → γγ and H → 4ℓ and between H → bb and multilepton analyses, but are uncorrelated between the two pairs. The flavor-tagging uncertainties are constrained significantly by the H → bb analysis, due to its large samples of b- and c-jets, and this constraint carries over to the multilepton analysis.
Other experimental systematic uncertainties such as luminosity, pileup effects, lepton identification, isolation, and trigger efficiencies are treated as correlated, except for statistical uncertainties associated with efficiency measurements for different working points.
None of the NPs in the fit are strongly constrained by more than one analysis, and the value of µ obtained from the combined fit does not depend on the choice of the correlation scheme.
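The structure of such a combination, a product of per-analysis likelihoods in which a correlated NP appears as one shared parameter with a single constraint term, can be sketched with a toy. Everything below is hypothetical: two Gaussian "analyses" each measuring µ, one shared NP θ, invented inputs, and a simple grid scan in place of a real minimizer. The significance uses the asymptotic relation Z = sqrt(q0), with q0 the profiled −2 ln likelihood ratio at µ = 0.

```python
import math

# Hypothetical inputs: each analysis measures x = mu + c*theta with a
# Gaussian statistical uncertainty sigma; theta is shared by both analyses.
ANALYSES = [
    {"x": 1.6, "sigma": 0.4, "c": 0.2},
    {"x": 0.8, "sigma": 0.5, "c": 0.2},
]

THETA_GRID = [(j - 60) / 20.0 for j in range(121)]  # -3 to 3, step 0.05
MU_GRID = [i / 100.0 for i in range(301)]           # 0 to 3, step 0.01


def nll_single(analysis, mu, theta):
    """Negative log-likelihood of one analysis (constants dropped)."""
    pull = (analysis["x"] - mu - analysis["c"] * theta) / analysis["sigma"]
    return 0.5 * pull * pull


def nll_combined(mu, theta):
    """Product of likelihoods = sum of NLLs; the constraint on the shared
    NP is included once, not once per analysis."""
    total = sum(nll_single(a, mu, theta) for a in ANALYSES)
    return total + 0.5 * theta * theta


def profiled_nll(mu):
    """Minimize over the shared NP at fixed mu (grid scan)."""
    return min(nll_combined(mu, theta) for theta in THETA_GRID)


def fit():
    """Return (mu_hat, Z) with Z the asymptotic significance vs mu = 0."""
    mu_hat = min(MU_GRID, key=profiled_nll)
    q0 = 2.0 * (profiled_nll(0.0) - profiled_nll(mu_hat))
    return mu_hat, math.sqrt(max(q0, 0.0))


mu_hat, z = fit()
```

Treating θ as uncorrelated would instead give each analysis its own parameter and its own constraint term; comparing the two choices is one way to test the dependence on the correlation scheme.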
The best-fit value of the ttH signal strength is determined from the combined likelihood function. The background-only hypothesis (µ = 0) is excluded at 4.2σ, with an expectation of 3.8σ in the case of a SM signal. This constitutes evidence for ttH production.
The values of µ obtained in each analysis, and the result of the combination, are shown in Figure 15 and Table 14. The probability that the signal strengths from the individual analyses are compatible with the combined value of µ is 38%. The impact of various uncertainties on the combination is shown in Table 15. The leading systematic uncertainties are those associated with the ttH signal modeling and cross section and the tt background modeling in the H → bb analysis. The cross section for ttH production corresponding to the best-fit value of µ is $590^{+160}_{-150}$ fb, as compared to the SM prediction of $\sigma(t\bar t H) = 507^{+35}_{-50}$ fb. As no events are observed in the H → 4ℓ analysis, a 68% confidence level (CL) upper limit on µ, computed using the CL_s method [108], is reported.
Because the analysis categories have different acceptances for the different Higgs boson decay modes, it is possible to determine µ independently for each decay mode. In particular, the multilepton analysis has categories with zero and ≥ 1 τ_had candidates, which are enriched in H → WW* and H → ττ, respectively (see Figure 4). The result of a fit for four signal strengths is shown in Figure 16. Due to the very weak sensitivity to H → ZZ*, the ratio of the branching fractions of H → ZZ* and H → WW* is assumed to be as in the SM and a single combined signal strength for H → VV is computed. For H → bb and H → γγ the result is essentially the same as for the individual analyses, due to the high purity of those signal regions for the respective Higgs boson decays. The H → WW* and H → ττ decays are distinguished only by their different contributions to the various multilepton signal regions, resulting in a significant anticorrelation. Two-dimensional scans of the signal strengths are shown in Figure 17 for H → bb versus H → VV and for H → ττ versus H → VV; in these plots the two signal strengths not shown are profiled in the scan.
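To illustrate how an upper limit arises when no events are observed, as for the H → 4ℓ analysis mentioned above, consider the simplest case of a single-bin Poisson counting experiment without systematic uncertainties. This is an idealization and does not reproduce the limit of the actual analysis; the function names and the expected SM signal yield in the usage line are hypothetical. For zero observed events CL_s reduces to exp(−s), independent of the background.

```python
import math


def cls_zero_observed(s):
    """CL_s for n_obs = 0 in a single-bin Poisson counting experiment.

    CL_s = P(n <= 0 | s + b) / P(n <= 0 | b)
         = exp(-(s + b)) / exp(-b) = exp(-s)
    """
    return math.exp(-s)


def upper_limit_events(cl):
    """Signal yield s for which CL_s equals 1 - cl."""
    return -math.log(1.0 - cl)


def upper_limit_mu(cl, s_expected_sm):
    """Upper limit on mu given the SM-expected signal yield."""
    return upper_limit_events(cl) / s_expected_sm


s_up_68 = upper_limit_events(0.68)   # about 1.14 events
s_up_95 = upper_limit_events(0.95)   # about 3.00 events
mu_up_68 = upper_limit_mu(0.68, s_expected_sm=0.5)  # hypothetical yield
```

The limit on µ scales inversely with the expected SM yield, so a channel with a small expected signal gives only a weak constraint.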

The ttH analyses are sensitive to the Htt, Hbb, and Hττ fermion couplings, the HWW and HZZ gauge boson couplings, and the effective Hγγ coupling. Accordingly, constraints can be placed on deviations of these couplings from the SM. An interpretation is made using the κ-parameterization, in which Higgs boson couplings to particle species i are linearly scaled by factors κ_i. Here, all fermion couplings are assumed to scale by a common factor κ_F and the WW/ZZ couplings by a common factor κ_V. As only the relative sign of the κ factors is meaningful, the convention that κ_V ≥ 0 is chosen. Modifications to loop-induced processes are determined by multiplying the contributing SM amplitudes by the relevant κ-factors; no contributions from non-SM particles are considered and no non-SM Higgs boson decay modes are allowed. The relevant parameterizations are given in Ref. [18]. In particular, the factor κ_γ modifying the effective Hγγ coupling is expressed in terms of κ_V and κ_F, and κ_g is set equal to κ_F. The total width of the Higgs boson is modified appropriately.
The ttH analyses, especially the H → γγ, multilepton, and H → 4ℓ channels, have acceptance for tHqb and WtH production. The amplitudes for the H → γγ decay and the production of tHqb and WtH involve interference between the Htt and HWW couplings. In the SM, the interference is destructive, and almost complete in the case of tHqb and WtH. As a result, a global analysis of the ttH channels, in this parameterization, is able to resolve the relative sign of the two couplings.
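The sign sensitivity described above can be illustrated for the H → γγ decay, where the W-boson and top-quark loop amplitudes enter with opposite signs. The sketch below is an assumption-laden illustration: the amplitude weights 1.26 and −0.26 are approximate values for a 125 GeV Higgs boson, normalized so that the SM point gives 1, and are not taken from this text. The parameterization actually used is the one given in Ref. [18].

```python
def kappa_gamma_sq(kappa_v, kappa_f, a_w=1.26, a_t=-0.26):
    """Approximate scaling of the H -> gamma gamma rate.

    The effective coupling is modeled as a_w*kappa_v + a_t*kappa_f, with
    a_w + a_t = 1 so that kappa_v = kappa_f = 1 reproduces the SM rate.
    The weights are illustrative assumptions of this sketch.
    """
    amplitude = a_w * kappa_v + a_t * kappa_f
    return amplitude * amplitude


sm_point = kappa_gamma_sq(1.0, 1.0)       # SM: destructive interference
flipped_sign = kappa_gamma_sq(1.0, -1.0)  # kappa_F < 0: constructive
```

With these weights, flipping the sign of κ_F turns the destructive interference into a constructive one and raises the rate to 1.52² times the SM value, which is why rates of this kind can distinguish the two signs. The tHqb and WtH production rates have the same quadratic structure with their own coefficients.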
A likelihood scan is performed in the κ_V-κ_F plane. The analysis acceptances for all Higgs boson production mechanisms and decays are assumed to be constant as the κ parameters are varied over the scanned region, with only rates being modified. The results are shown in Figure 18, and are in good agreement with the Standard Model values κ_F = κ_V = 1. The possibility that κ_F < 0 is excluded at 95% CL in this parameterization.
Figure 18: Allowed regions at 68% and 95% CL in the κ_V-κ_F plane from the combination of all ttH channels. The Higgs boson is assumed to not couple to any particles beyond the Standard Model, and the H → γγ and H → gg couplings are expressed in terms of κ_F and κ_V.

Conclusions
A search for ttH production in multilepton final states using a dataset corresponding to an integrated luminosity of 36.1 fb$^{-1}$ of proton-proton collision data at $\sqrt{s} = 13$ TeV recorded by the ATLAS experiment at the LHC is presented. Seven final states, targeting Higgs boson decays to WW*, ττ, and ZZ*, categorized by the number and flavor of charged-lepton candidates, are analyzed. An excess of events over the expected background from SM processes is found, corresponding to an observed significance of 4.1 standard deviations for a SM Higgs boson of mass 125 GeV. The expected significance for a SM Higgs boson is 2.8 standard deviations. The best-fit value of the production cross section is $\sigma(t\bar t H) = 790^{+230}_{-210}$ fb, in agreement with the SM prediction of $507^{+35}_{-50}$ fb.
The combination of this result with other ttH studies from the ATLAS experiment using the Higgs boson decay modes to bb, γγ and ZZ* → 4ℓ is presented. The combination has an observed significance of 4.2 standard deviations, compared to an expectation of 3.8 standard deviations. The cross section for ttH production is measured to be $\sigma(t\bar t H) = 590^{+160}_{-150}$ fb, in agreement with the SM prediction. This provides evidence for the ttH production mode.