Anatomy of the tthh Physics at HL-LHC

The tthh production at colliders contain rich information on the nature of Higgs boson. In this article, we systematically studied its physics at High-Luminosity Large Hadron Collider (HL-LHC), using exclusive channels with multiple (≥ 5) b-jets and one lepton (5b1`), multiple (≥ 5) b-jets and opposite-sign di-lepton (5b2`), same-sign di-lepton (SS2`), multiple leptons (multi-`), and di-tau resonance (ττ). The scenarios analyzed include: (1) the tthh production in Standard Model; (2) the tthh production mediated by anomalous cubic Higgs self-coupling and tthh contact interaction; (3) heavy Higgs (H) production with ttH → tthh; and (4) pair production of fermionic top partners (T ) with TT → tthh. To address the complication of event topologies and the mess of combinatorial backgrounds, a tool of BoostedDecision-Tree was applied in the analyses. The 5b1` and SS2` analyses define the two most promising channels, resulting in slightly different sensitivities. For the non-resonant tthh production, a combination of these exclusive analyses allows for its measurment in the SM with a statistical significance ∼ 0.9σ (with S/B > 1%), and may assist partially breaking the sensitivity degeneracy w.r.t. the cubic Higgs self-coupling, a difficulty usually thought to exist in gluon fusion di-Higgs analysis at HL-LHC. These sensitivities were also projected to future hadron colliders at 27 TeV and 100 TeV. For the resonant tthh productions, the heavy Higgs boson in type II Two-Higgs-Doublet-Model could be efficiently searched for between the mass thresholds 2mh < mH < 2mt and even beyond that, for relatively small tanβ (vacuum alignment), while the fermionic top partners in composite Higgs models could be probed for up to ∼ 1.5 TeV and ∼ 1.7 TeV, for Br(T → th) = 25% and 50%, respectively. ar X iv :1 90 5. 03 77 2v 1 [ he pph ] 9 M ay 2 01 9

Due to its relatively large cross section at hadron colliders, the gg → hh production has received the most attentions in literatures so far. The analyses performed by the ATLAS and CMS collaborations [9,10] show that the cubic Higgs self-coupling could be measured at HL-LHC, with a precision of ∼ O(1) at 2σ C.L.. Recently, the analysis of the gg → hh → bbγγ channel was extended to 27 TeV, indicating that the cubic Higgs self-coupling could be measured with a precision < O(1) [9,[11][12][13]. Yet, because of the strategic difference in addressing the backgrounds, pileups, etc., the sensitivities obtained in literatures vary by a factor of two or three [14]. At 100 TeV, the gg → hh → bbγγ analysis has been extensively carried out (see, e.g., [8,11,12,[15][16][17][18][19]). It shows that a precision of 10% for the cubic Higgs self-coupling measurement is possible, and could be further improved, using the detector tailered for a 100 TeV machine [8,11].
Despite of this, the tthh production may play a special role in measuring the Higgs self-couplings. As is well-known, the gluon fusion di-Higgs production involves destructive interference between Feynman diagrams with and without the cubic Higgs self-coupling. This may result in a sensitivity degeneracy at HL-LHC between two disjointed regions of the cubic Higgs self-coupling: one near the SM and another one far away [9], though this degeneracy could be broken with kinematics at future hadron colliders [14]. Differently, the relevant interference in the tthh production is constructive. Thus there might be a chance to break this degeneracy at HL-LHC either completely or partially with the assistance of this channel. Second, because of its higher energy threshold of production, the tthh cross section increases faster than the gg → hh one, as the beam energy √ s increases. As a matter of fact, their difference in the SM is reduced from a factor ∼ 35 at 14 TeV to ∼ 14 at 100 TeV [1]). This indicates that the tthh measurements may gain more in sensitivity at future hadron colliders, compared to gg → hh. More than that, the di-Higgs productions can be applied to probe for the quartic Higgs self-coupling [20,21], since it may renormalize the cubic Higgs selfcoupling and contribute to the form factor of the relevant vertices at NLO. Yet, as is discussed in [21], a combined analysis of multiple di-Higgs productions is valuable in reducing the dependence of the generated limits on renormalization scheme, a non-physical effect resulting from incomplete treatment of higher-order corrections. The tthh measurements may serve for such a purpose. Aside from measuring the Higgs self-couplings, the tthh channel can also serve as a probe for theories of electroweak symmetry breaking (EWSB) such as Composite Higgs Models(CHM) and supersymmetry. In the CHM, there typically exists an extended top sector which couples with the Higgs boson. With the heavy degrees of freedom being integrated out at a cutoff of strong dynamics, a contact interaction of tthh can be generated [22][23][24][25][26]. This operator will contribute to the non-resonant tthh production at tree level and hence mediate its signal rate at colliders (for studies on the impact of the tthh contact interaction for the gg → hh kinematics, see, e.g., [27]). The CHM may contribute to the resonant tthh production as well. For example, the top partner (T ) can be pair produced in the CHM with both of them decaying as T → th. Actually, this is one of the major channels for its searches at LHC [28]. Besides that, the resonant tthh production may arise from Two Higgs Doublet Model (THDM) of, e.g., type II, including Minimal Supersymmetry Standard Model (MSSM). In this scenario, a heavy CP-even Higgs resonance H can be produced in association with a pair of top quarks, and subsequently decays into a pair of Higgs bosons. This channel may become important for the H searches, if 2m h < m H < 2m t , with a relatively small tan β (i.e., vacuum alignment). Indeed, in this nearly decoupling limit the on-shell decay of H → tt is turned off, while the couplings of H with gauge bosons tend to be suppressed. It has been shown numerically that the decay of H → hh could be dominant in this parameter region [29]. A branching ratio of Br(H → hh) ∼ O(10%) can be extended to m H ∼ 400 GeV and tan β ∼ 5 [29]. Note, the tthh contact interaction can be also generated in these models by integrating out heavy bosonic degrees of freedom.
However, the tthh sensitivities at colliders have been analyzed in very few papers so far. The first analyses at HL-LHC were pursued in [30,31], using a specific channel tthh → ttbbbb and a cut-based method. Subsequently, the ATLAS experimentalists re-analyzed its sensitivity, in a similar setup but with more aggressive kinematic cuts [32]. It showed that a statistical significance of S/ √ B ∼ 0.35 could be achieved, by combining two significances in quadrature: the bin with 5 b-tags (0.23σ, S/B ∼ 0.3%) and the bin with ≥ 6 b-tags (0.26σ, S/B ∼ 1.1%). Yet, such analyses could be further optimized. The tthh production is characterized by a complicated topology. The jet multiplicity in signal and, strongly correlated with this, the combinatorial backgrounds make the event reconstruction very challenging. It has been known for a while, such a difficulty could be addressed to some extent by a tool of Boosted-Decision-Tree (BDT). Actually, the BDT method has been extensively applied in the measurements of top physics at LHC [33], such as the ones of tt + V [34], tt + h → bb [35], and tt + h → γγ/ZZ * [36]. It has also been suggested to use for the searches of heavy Higgs bosons which are produced in association with top or bottom quarks at hadron colliders [37,38]. More than that, almost all studies so far were focused on the specific channel of tthh → ttbbbb, given its relatively large signal rate. The channels with, e.g., same-sign dilepton and multiple leptons, were largely ignored, which could be valuable also, because of their relatively clean backgrounds.
Motivated by these considerations, in this article we will systematically analyze the sensitivities of measuring the tthh physics at HL-LHC, using the tool of BDT, in the following physics scenarios: 1. (non-resonant) the tthh production in the SM (denoted as tthh(SM)) (their tree-level Feynman diagrams are shown in Figure 1).
Limited by statistics, we will focus on five exclusive channels with multiple (≥ 5) b-jets and one lepton, multiple (≥ 5) b-jets and opposite-sign di-lepton, same-sign di-lepton, multiple leptons, and di-tau resonance. The tthh measurements are expected to gain more at future hadron colliders, given the increased tthh cross section, improved kinematics such boost, and potentially larger luminosity. We will leave their study in a separate paper [39]. We organize this article as follows. We will introduce analysis strategy in Section 2. The sensitivities of measuring the non-resonant and resonant tthh physics at HL-LHC are presented respectively in Section 3 and Section 4. To qualitatively measure the potential of future hadron colliders, we also project the sensitivities of probing for the anomalous cubic Higgs self-coupling and the tthh contact interaction to 27 TeV (HE-LHC) and 100 TeV (SppC and FCC-hh) in Section 3. We summarize our studies and discuss potential directions to explore in Section 5. Technical details on our analyses are provided in Appendices.

Generation of Signal and Background Events
In this study, we use MadGraph 5 [40] and Pythia 8 [41] to generate the signal and background events at leading order. The heavy components (t, W, Z and h) in each event decay inclusively in either MadGraph 5 [40], Pythia 8 [41] or MadSpin [42].
The detector simulation is performed using Delphes 3 [43], with a detector performance suggested for the HL-LHC runs. All leptons are required to have p T > 10 GeV and be isolated. The isolation is defined with a cone ∆R = 0.2, and the net p T within it (i.e., the total p T excluding the contribution from target lepton l) being smaller than 0.2p T ( ). Jets are clustered using anti-k t algorithm with ∆R = 0.4. All of them are required to have p T > 20 GeV.
The tagging of b-jets plays an important role in this study. We define its efficiencies with the working points of MV2 b-tagging algorithm [44]. Explicitly, we choose the tight, moderate and loose b-tagging efficiencies to be 60%, 77% and 85%, respectively. We also analyze the sensitivities with a softened tight b-tagging efficiency of 70%, to see the potential impact of the b-tagging efficiency for the tthh sensitivities at HL-LHC. Unless specified, the results based on the tight and softened tight b-tagging efficiencies will be presented. τ jets are tagged with an efficiency ∼ 60%, with a mis-tagging rate ∼ 1.5% for jets with p T > 20 GeV and |η| < 2.3. This setup is consistent with the HL-LHC expectation [45].
For event reconstruction in BDT, we reconstruct fat jets using anti-k t algorithm in a second clustering exercise, with ∆R = 1.0 and p T > 200 GeV. The fat jets are pruned using a jet trimming algorithm [46], with ∆R=0.3 and p T fraction of trimming being 0.05. This to some extent can weaken the impact of the pileups [47] which are not turned on in this simulation. The MET reconstruction can be also smeared by high pileups in detector [48]. Our analyses do not significantly rely on the MET measurements. But still we mimic this impact by smearing the MET with a vector of 50 GeV.  For the non-resonant tthh analysis, the signal events are generated based on a simplified Lagrangian [49] This Lagrangian is reduced to the SM if the dimensionless top Yukawa coupling y, the scaled cubic Higgs self-coupling κ and the tthh Wilson coefficient c t are valued as Given that the top Yukawa coupling can be measured separately (e.g., using the tth production) at HL-LHC [14] with a precision better than the other two, we fix its value to be one in this article. Totally seven signal samples are generated. The first one is defined by the tthh(SM) production. The BDT performance will be illustrated mainly using this sample. This sample will be also used as one of the backgrounds for the resonant tthh analyses. Another six samples are denoted by a set of coupling products {y 4 , y 2 κ 2 , c 2 t , y 3 κ, y 2 c t , yκc t }. They represent contributions of the six terms in the LO tthh cross section. These samples will be used to analyze the sensitivities, given the {κ, c t } values, by reweighting their relative contributions. For the resonant tthh analysis, totally five samples are generated: two for the scenario of heavy Higgs boson, and three for the scenario of fermionic top partner. The samples in each scenario are characterized by different resonance masses.
The major backgrounds shared by the non-resonant and the resonant tthh productions include four-top (4t) and di-top + X, with X = 4b, bbcc, etc. We will not include the contributions of di-top + X with X = 4c, 2b/2c + 2j, 4j. By approximately matching the simulations with fiducial cross section of the tt with multiple b-jets measured at ATLAS [50], we find that in the one lepton + light jets channel (≥ 5b, ≥ 2j), the tt+4b/2b2c sample explains ∼ 70% of the expected event number from QCD. Here the MV2 b-tagging algorithms [44] with an efficiency of 60% are applied. So, we will define a K factor with K = σ f id σ sim = 1.44 for the tt + 4b/2b2c backgrounds to represent this discrepancy 1 . As for the other backgrounds, their cross sections will be universally scaled by an NLO K factor 1.2. This is comparable, e.g., for the 4t background, to the numbers suggested in literatures (see, e.g., [51]). The details on the event generation of signals and backgrounds is summarized in Table 1.

Selection Rules
In this study, the tthh events are exclusively selected for five analyses after a preselection: ≥ 1 lepton with p T > 15 GeV and |η| < 2.5; ≥ 2 tight b-jets (70% tagging efficiency) and ≥ 1 moderate b-jets with p T > 20 GeV and |η| < 2.5. Then each of them is analyzed using BDT. The five exclusive analyses include: • Multiple b-jets (≥ 5) + one lepton (5b1 ). ≥ 5 tight b-jets are required. This analysis targets on the decay mode of hh → 4b, and was suggested to be one of the most promising channels for the tthh measurement [30,31].
• Multiple b-jets (≥ 5) + opposite sign di-lepton (5b2 ). Similar to the previous one, ≥ 5 tight b-jets are required. This analysis also targets on the decay mode of hh → 4b.
• Same-sign di-lepton (SS2 ). Of the two same-sign leptons, the leading one needs to have p T > 17 GeV and the subleading one p T > 14 GeV. The events with a third lepton will be vetoed. Additionally, ≥ 4 b-jets are required, among which at least three are tight, or two tight and one moderate. This analysis targets on the decay mode of h → V V * . This is reminiscent of the SS2 search of the top quark associated single top production at LHC [52,53] which plays an important role in the measurement of top Yukawa coupling.
• Multiple leptons (≥ 3) (multi-). The same requirements for b-jets are applied, as the ones for the SS2 analysis. ≥ 3 leptons with p T > 10 GeV are required. This analysis also targets on the decay mode of h → V V * .
• Di-tau jets (τ τ ). The same requirements for b-jets are applied, as the ones for the SS2 analysis. Both τ jets need to have p T > 20 GeV. This analysis targets on the decay model of h → τ τ .
Being lack of precise knowledge on the trigger setup at HL-LHC, we use the ATLAS 2018 HLT triggers [54] as a reference in the analyses. Below are the relevant ones: 1. Single lepton trigger: one isolated lepton with p T > 25 GeV.
2. Two lepton trigger: two isolated leptons. Electrons need to have p T > 17 GeV and muons p T > 14 GeV.
3. Three muon trigger: at least three isolated muons with p T > 6 GeV.
5. 3b + j trigger: at least three b-jets (70%) and one extra jet, with each having p T > 35 GeV.
We will demonstrate the potential sensitivities combining the samples defined by these triggers. The information of the sample selection for each exclusive analysis is summarized in Table 2. Table 2: Sample selection in the five exclusive analyses.

Boosted Decision Tree
To improve the sensitivity, we apply a BDT tool in the analyses, based on the TMVA package [55]. The workflow of the BDT-based event reconstruction is presented in Figure 2. For event reconstruction in our analyses, two intermediate-object taggers, i.e., one hadronic top tagger and one Higgs tagger, are constructed from lower-level inputs. The top tagger mainly targets on boosted or relatively boosted top quarks in the tthh events. It is constructed using the information of paired fat jet and b-jet, with ∆R < 2 between them. Here the fat jets and the b-jets are generated from two separate clustering exercises. Besides p T , mass and η of the fat jet and the b-jet, subjetness (τ 1−4 ) [56] and track number of the fat jet, tagging level (tight/moderate/loose) of the b-jet, and their separation in azimuthal angle ∆φ are also used as the variable inputs of the tagger. For each fat jet, only the pairing with a b-jet which scores highest by the BDT is kept for subsequent data processing. The four momentum of the reconstructed top is defined as that of the fat jet, if the fat jet has ∆R < 1 w.r.t. the b-jet or its mass is larger than 70 GeV (typical upper limit for the mass of the trimmed W -jets), and as the total four momentum of the paired fat jet and b-jet for the other cases. The tagger is trained using single top events (t + 1j) with p T > 150 GeV for the top quark, against the SM backgrounds.
The Higgs tagger is constructed using the information of paired b-jets with 160 GeV > m bb > 90 GeV. The variable inputs include p T , mass and η of each paired b-jet, their tagging levels (tight/moderate/loose) and their separation in azimuthal angle ∆φ. The four momentum of the reconstructed Higgs is defined as the total four momentum of the paired b-jets. The tagger is trained using the hZ → bb events with p T > 25 GeV for the leptons and p T > 20 GeV for the b-jets.
In each event at most two BDT top quarks and three BDT Higgs bosons will be kept which are sorted by their BDT scores. These top and Higgs candidates, together with the other objects in the events, are used for the event reconstruction. The variables include not only the ones which have been widely accepted [34][35][36], but also the ones encoding the kinematics of the reconstructed objects and their correlation with each other (e.g., their separations ∆R h i ,h j , ∆R h i ,t j , etc., and their invariant mass m h i ,h j , m h i ,t j , etc.) and with the other objects in the event. To characterize their overlapping, we flip the sign of the correlation variables if the two reconstructed objects share some common elements (e.g., if a tagged top t i and a tagged Higgs h j share the same b-jet). A list of these variables are summarized in Table 5. Note, in the non-resonant tthh analyses we take the variables of separation for event reconstruction. They are replaced with the variables of invariant mass in the resonant analysis.
3 Non-Resonant tthh Analysis  Table 3: Cut flow of the tthh(SM) signal and its major backgrounds at HL-LHC. In the five exclusive analyses, the numbers outside (inside) the brackets are based on a tight (softened tight) b-tagging efficiency of 60% (70%). All significances are calculated against the background-only hypothesis.
To illustrate its effectiveness, we apply the strategy developed in Section 2 to analyze the tthh(SM) first, starting with a tight b-tagging efficiency of 60%. The cut flow of the signal and backgrounds before the BDT analysis is summarized in Table 3. We take into account the K factor in calculating statistical significance and the S/B value. For the 5b1 and 5b2 analyses, the backgrounds are dominated by tt4b. Their cut-based significances at HL-LHC are ∼ 0.5σ and ∼ 0.2σ, respectively, with both S/B values being of sub-percent level. For the SS2 and multi-analyses, the main backgrounds are 4t, tt4b and ttbbV /h. Their cutbased significances at HL-LHC are ∼ 0.4σ and ∼ 0.3σ, respectively. The SS2 sensitivity performance is relatively better than the multi-one. Different from the gg → hh → 4W channel [57], where the SS2 analysis suffers a lot from the di-boson backgrounds (i.e., V V , hV , etc.), yielding a sensitivity relatively lower than the multi-one, the relevant di-boson backgrounds for the SS2 analysis of tthh (i.e., ttV V , tthZ, etc.) can be efficiently suppressed with a cut of multiple b-jets. As for the τ τ analysis, its main backgrounds are relatively diverse, yielding a cut-based significance comparable to that of 5b2 . The S/B values of these three analyses are approximately ∼ 1%. The BDT tool is applied to the five exclusive analyses after event selection. Considering that 5b1 and SS2 represent probably the two most promising analyses at HL-LHC, we show their normalized event distributions w.r.t. BDT response in Figure 3. The BDT method results in a wider separation between signal and backgrounds in the 5b1 analysis, compared to the SS2l one. Because suffering more from the combinatorial backgrounds due to the b-jet multiplicity in its final state, this channel gains more from the BDT method.
Based on the BDT response of signal and backgrounds, the ROCs, the gain of statistical significance and the gain of S/B are shown as a function of signal efficiency (or the threshold of BDT response) in Figure 4, for all of the five exclusive analyses. The statistical significances are optimized for some signal efficiency between 50% − 90% which improves the cut-based sensitivities by a factor ∼ 1.2 − 1.3 for 5b1 and 5b2 and a factor ∼ 1.1 − 1.2 for the other three. Compared to the statistical significance, these analyses actually gain more from BDT in improving the S/B values. For the signal efficiency with an optimized significance, the S/B values are improved by a factor of ∼ 1.5 − 3.5. For example, the S/B value in the analyses of 5b1 and 5b2 are raised from sub percent to precent level. An softened tight b-tagging efficiency may yield a better separation between signal and backgrounds, especially for the analyses of 5b1 and 5b2 , and hence improve their sensitivities further (similar for the cut-based analyses). This can be seen by comparing the panels between the left and right columns in Figure 4. But, in that case we need to simulate the impact of the backgrounds tt + 4c, 2b2j, 2c2j, and 4j in a more solid way.
The optimized statistical significances and the correspondent S/B values are summarized in Table 3. The 5b1 and SS2 analyses provide a sensitivity at HL-LHC more promising than the other ones, for the tthh(SM) measurement. If combined with multi-(in quadrature), the SS2 analysis will result in a significance ∼ 0.6σ, comparable to that of the 5b1 one. A  combination of all allows measuring the tthh(SM) production with a significance ∼ 0.9σ. As introduced in Section 1, the tthh production can serve as a probe for the anomalous Higgs self-interaction and the tthh contact interaction [27,49]. In principle we can apply the BDT model which is trained for the tthh(SM) measurement, to extract the relevant sensitivity information. But, the variation of the Higgs self-coupling and the tthh contact interaction may bring new kinematic features to assist the separation between signal and backgrounds. So it will be useful to make a study on such features.

Kinematics and Sensitivities of κ and c t
In Figure 5, we show the normalized kinematic distributions of the tthh events at parton level, w.r.t. varied κ and c t values (for discussions on the impact of the six-dimensional operator ∂ µ (H † H)∂ µ (H † H) for the tthh kinematics at parton level, see [18]). It is easy to see that, as κ increases, the tt invariant mass tends to be larger, while the hh invariant mass tends to be smaller. But, the dependence of p T on the κ value is relatively weak for the leading Higgs boson or top quark. Differently, a deviation of c t from its SM value, i.e., c t = 0 tends to yield a larger invariant mass for both hh and tt pairs in the events. The leading Higgs boson also tends to be harder. These features are strongly correlated with the y 2 κ 2and c 2 t -mediated dynamics in the tthh production, which will be manifested later in Figure 9 and the relevant discussions in the text. To utilize these features, therefore, we train two BDT models using the specific samples of y 2 κ 2 and c 2 t respectively (more information on the sample generation can be found in subsection 2.1). Then the sensitivity to probe for κ and c t via the tthh production is analyzed using the 2D BDT responses for optimization. A comparison of exclusion limits at HL-LHC, which are generated using different BDT models, is presented in left panel of Figure 6. In this panel, the blue region represents the sensitivity reach of a reference analysis of gg → hh → bbγγ pursued by the ATLAS collaboration [9], which remains not excluded at HL-LHC. This analysis was taken in the limit of c t = 0. To project its results to the κ − c t plane, we use the relation [17] σ(gg → hh → bbγγ) 14 σ(gg → hh → bbγγ) SM 14 = 1.70 − 0.82κ + 0.12κ 2 − 3.79c t + 0.98c t κ + 2.68c 2 t , (3.1) by requiring the exclusion contour to reduce to the ATLAS results in the limit of c t = 0. Note, this relation was generated with some basic cuts being applied, for the two b-jets and the two photons in the signal events and for the Higgs invariant mass reconstructed from them [17]. Though some imprecision can be caused by the mismatch between the phase spaces involved in these two analyses, we would expect that it will not qualitatively change our conclusions. So we will tolerate this imprecision in the discussions. In the blue region, the central part can be probed for with a relatively higher sensitivity, compared to the iso-significance elliptical belt near the edge. This is related to the deviation of the signal rate from its SM prediction, to a large extent. In the central part, the signal rate is largely suppressed (also see the σ(gg → hh) 14 contours in the right panel of Figure 6) due to destructive interference, as is manifested by the minus sign of the κ and c t linear terms in Eq. (3.1). As for the elliptical belt near the edge, the predicted signal rate is degenerated or approximately degenerated with the SM one. The colored curves in this panel represent the exclusion limits generated using different BDT models. For each specific BDT model, we generate one set of signal efficiency and background rejection { ij sig , i bg }. Here i runs over the five exclusive analyses, and j runs over the six samples defined by {y 4 , y 2 κ 2 , c 2 t , y 3 κ, y 2 c t , yκc t }. Different from ij sig , i bg is fixed in favor of some specific signal sample (e.g., the y 2 κ 2 sample or the c 2 t sample), and hence be universal for all of the six samples in each analysis. With this information, we will be able to calculate the sensitivity for any given values of {κ, c t }, by reweighting the relative contributions of these six samples, using the LO relation at 14 TeV (generated with MadGraph 5 [40]): Here σ(tthh) SM 14 = 0.99 fb is the tthh(SM) cross sections at 14 TeV. The σ(tthh) 14 contours in terms of κ and c t are presented in right panel of Figure 6. Obviously, their absolute gradient is larger along the line from bottom-left to top-right, compared to its orthogonal direction. This is because the quadratic terms of κ and c t in Eq. (3.2) contribute to σ(tthh) 14 in a topping-up way along this line. Differently, a cancellation occurs between the y 2 κ 2 , c 2 t terms and the κc t term in the orthogonal direction of this line.
In the left panel of Figure 6, the solid (dashed) grey contour is set by the specific BDT model which is trained for the SM tthh analyses with a tight (softened tight) b-tagging efficiency of 60% (70%). The exclusion contours based on the t 2 κ 2 -and c 2 t -specific BDT models are presented in green and red in the figure, respectively. The t 2 κ 2 -specific BDT model can also improve the sensitivity to c 2 t . This is because the t 2 κ 2 and c 2 t samples share similar kinematics on m tt . The orange contour represents the sensitivity resulting from the 2D analysis, based on the two specific BDT models. The impact of the σ(tthh) 14 magnitude and the κ-and c t -induced kinematics for the analysis sensitivity can be easily seen by comparing the contours in Figure 6: the σ(tthh) 14 magnitude sets up the orientation of the exclusion contours, whereas the κ-and c t -induced kinematics further improves them. These contours all exclude the right edge of the blue area, despite of their difference in extent, and partially breaks the degeneracy which is usually thought to exist in the gg → hh → bbγγ analysis at HL-LHC 2 .

Projected Sensitivities at Future Hadron Colliders
The study on the tthh physics will benefit a lot from future hadron colliders, such as the increased signal rate, the enhanced boost of signal kinematics, and even a larger luminosity. Pursuing comprehensive analyses on this is beyond the scope of this article. Instead, we will project the tthh sensitivities at HL-LHC to 27 TeV and 100 TeV, to qualitatively measure their potential.
The projection of the tthh(SM) measurement sensitivities is straightforward. It gives a significance of 3.1σ and 14.3σ at 27 TeV and 100 TeV, respectively, based on the combined BDT analyses. To project the exclusion limits of 2D BDT at the κ − c t plane, we simulate  Figure 7. To project the 100 TeV sensitivity of the reference analysis, i.e., gg → hh → bbγγ, to the κ − c t plane, we will use the relation [17] σ(gg → hh → bbγγ) 100 σ(gg → hh → bbγγ) SM 100 = 1.59 − 0.68κ + 0.09κ 2 − 3.83c t + 0.92c t κ + 3.20c 2 t . (3.5) The relation at 27 TeV was not provided in [17], and is hence generated by interpolating the ones at 14 TeV and 100 TeV. Again we will tolerate the imprecision caused by using these relations for the analysis of gg → hh → bbγγ. Compared to 14 TeV, the tthh analyses may benefit more from the y 2 κ 2 -and c 2 t -induced kinematics at 27 TeV and 100 TeV. To make this clear, we define the weights of the y 2 κ 2 term (denoted as σ y 2 κ 2 ) and the c 2 t term (denoted as σ c 2 t ) in σ(tthh) as  in the SM limit. As for the weight W c 2 t (κ, c t ), its value is suppressed in the SM limit. Yet, with a small deviation in c t , e.g., c t = 0.5, we have In contrast, such an enhancement w.r.t. beam energy is vague for σ(gg → hh → bbγγ) in Eq. (3.1) and Eq. (3.5).
In Figure 9, we show normalized kinematic distributions of the {y 4 , y 2 κ 2 , c 2 t , y 3 κ, y 2 c t , yκc t } samples at 14 TeV and 100 TeV. Compared to the others, the y 2 κ 2 and c 2 t samples tend to be separated more from the y 4 one in kinematics. They are away from the y 4 sample in opposite directions w.r.t. m hh and the leading Higgs p T , while in the same direction w.r.t. m tt and the leading top p T . This can be explained as follows: the two Higgs bosons in the y 2 κ 2 event are produced via off-shell Higgs decay, and hence tend to be soft and central; differently, the two Higgs bosons in the c 2 t event are generated more energetically, resulting in a large m hh . These separations become even wider from 14 TeV to 100 TeV.
A combination of the discussions above justifies that we take a 2D BDT strategy for the analyses at HL-LHC which are based on the y 2 κ 2 and c 2 t samples, and meanwhile, raises the expectation that the y 2 κ 2 -and c 2 t -induced kinematics may further improve the signal efficiency and background rejection { ij sig , i bg } at 27 TeV, and even more at 100 TeV. As for the backgrounds of the tthh analyses, the 4t cross section increases faster as the beam energy raises (by a factor of ∼ 8.6 (252) for 27 (100) TeV) than most of the tt + X ones (by a factor of ∼ 5.7 (105) for 27 (100) TeV), because of its relatively higher energy threshold of production. Then, we are able to calculate the projected exclusion limits, assuming that the signal efficiency and background rejection { ij sig , i bg } obtained from the 14 TeV simulations are not changed at 27 TeV and 100 TeV. The projected sensitivities are presented in Figure 10. In this Figure, the light blue and blue belt regions represent the sensitivity reaches of the reference analysis of gg → hh → bbγγ at 27 TeV and 100 TeV, respectively. We notice the difference of the 27 TeV sensitivity in literatures (a precision of ∼ 30% in [11] and ∼ 80% or worse in [9,13] in measuring κ at 2σ C.L.), and a comparison made between them in [14]. Given that the pileup effect and more backgrounds were simulated in [9], when projecting the 27 TeV sensitivity to the κ − c t plane using Eq. (3.5), we require the exclusion contours to reduce in the limit of c t = 0 to the results in [9]. As for the 100 TeV contours, they are required to reduce to the results in [11]. The sensitivities generated for the reference analysis by this method are not optimized with kinematics. With the assistance of kinematics, the belt regions are expected to shrink to circular ones [17]. Despite of this, such a treatment is sufficient for the reference purpose in this analysis. In this figure, the 27 TeV and 100 TeV exclusion contours of tthh are presented in red and orange, respectively. Compared to the 14 TeV (see Figure 6) and 27 TeV contours, both of which are elliptical, the 100 TeV ones are deformed from left-upper to right-bottom. This deformation is caused by the offset of the SM point from the σ(tthh) contour center which is defined by {κ, c t } = {0, 0}, together with that the exclusion contours are defined against the SM prediction. At 14 TeV and 27 TeV, the collider sensitivities are relatively low. The scales of their exclusion contours are much larger than this offset. Thus its impact for the contour profile is negligibly small. In the limit of c t = 0, the projected tthh sensitivities are worse than that of the reference analysis by a factor of 2 − 3. Considering that these sensitivities have not been optimized by, e.g., utilizing the y 2 κ 2 -and c 2 t -induced kinematics, we would view the projected sensitivities to be encouraging.

Heavy Higgs Boson: ttH → tthh
In type II THDM, physics at the decoupling limit [29,58,59] provides a natural organizing principle for the collider searches of the heavy Higgs boson H. Because its couplings with W W , ZZ and hh are highly suppressed, the H is produced and decays mainly through its Yukawa vertices bbH and ttH, at high and low tan β regions, respectively. It has been shown that its exclusion limits could be pushed up to ∼ 1 TeV at HL-LHC and nearly one order more at 100 TeV for moderate and low tan β regions [37,38], using the decay mode of H → tt 3 and a BDT method. However, as its mass decreases to 2m h < m H < 2m t , the H becomes nearly decoupled. Its decay mode of H → tt is kinematically forbidden, whereas its couplings with W W , ZZ and hh may not be negligibly small. Especially, in the low tan β region the Hhh coupling is larger than the HW W and HZZ ones by an approximate factor m 2 H /m 2 Z [29], which could result in a branching ratio of H → hh larger than 50% [29]. A natural expectation is then: the gg → H → hh production will generate a sensitivity to this region. Yet, because of the low-tan β enhancement for the ttH coupling, the H could be produced efficiently in association with a pair of top quarks also. The ttH → tthh analysis thus may provide a probe for this parameter region, alternative to gg → H → hh.
Note, in general type II THDM the tree-level HW W/HV V couplings depend on the mixing angle of the Higgs sector which is a free parameter. For concreteness, we will fix its  Table 4: Exclusion limits of the ttH → tthh and T T → tthh analyses at HL-LHC. The numbers outside (inside) the brackets are based on a tight (softened tight) b-tagging efficiency of 60% (70%). value using its tree-level relation with m H and tan β in the analyses, as is in the MSSM. We will use MadGraph 5 [40] to calculate the H production cross section and HDECAY [60] to calculate its decay branching ratios. Two benchmark points (m H = 300, 500 GeV) are simulated with a BDT strategy described in subsection 2.3. The BDT model is trained against the SM backgrounds including the tthh(SM) events at each point (see Table 6). The modelindependent exclusion limits in the five exclusive analyses at HL-LHC and their combination are presented in Table 4. The same as before, the best sensitivities result from the 5b1 and SS2 analyses. These analyses may further gain from using a softened tight b-tagging efficiency. A combination of them results in an exclusion limit of ∼ O(1) fb for the ttH → tthh production with 300 GeV < m H < 500 GeV.  Figure 11: σ(ttH → tthh) contours (left) and combined exclusion limits of the ttH → tthh analyses (right), with a luminosity of 300 fb −1 (blue) and 3 ab −1 (black). The solid (dashed) blue and black contours are based on a tight (softened tight) b-tagging efficiency of 60% (70%). The exclusion limits of the H/A → τ + τ − analysis are based on [29].
The interpretation of the combined sensitivities in Type II THDM is shown in Fig. 11. As is expected, the tthh analyses can efficiently probe for the nearly decoupling limit with 2m h < m H < 2m t and relatively low tan β. For tan β ∼ 1, the exclusion limit with a luminosity of 300 fb −1 goes beyond the tt threshold, reaching m H ∼ 430 GeV, with a tight b-tagging efficiency of 60%. The limit can be pushed up to m H ∼ 480 GeV at HL-LHC, fully complementing the bbH/A → bbτ τ measurement for the relevant m H range. Encouragingly, such a sensitivity seems not bad, compared to the gg → H → hh one [29] (also see [61]), though the latter gains a lot from its relatively large cross section.

Fermionic Top Partner: T T → tthh
Vector-like top partners extensively exist in the CHM. Because they can be produced via QCD interactions at LHC, the exclusion limits for these particles, either a SU (2) singlet or doublet, have been pushed to above 1.3 TeV [62]. As is shown in [28], one of the the major channels involved in this effort is T T → tthh. Below we will analyze its sensitivity at HL-LHC.  Figure 12: Combined exclusion limits of the T T → tthh analyses, with a luminosity of 300 fb −1 and 3 ab −1 . The solid (dashed) blue and black contours are based on a tight (softened tight) b-tagging efficiency of 60% (70%). The yellow and red curves are theoretical predictions in two benchmark scenarios which are defined by different branching ratios of T → th.
We simulate three benchmark points with m T = 1.5, 1.75 and 2.0 TeV, with a BDT strategy described in subsection 2.3. As is indicated in Table 6, the BDT model is trained at m T = 1.5 and 2.0 TeV. The analyses at m T = 1.75 TeV are done using the BDT model trained at m T = 1.5 TeV. The model-independent exclusion limits in the five exclusive analyses at HL-LHC and their combination are presented in Table 4. As the signal events are characterized by a large H T (largely due to its large m T ), i.e., the scalar sum of the jet p T , they can be well-separated from the backgrounds. Thus, the T T cross section is strongly constrained to be smaller than 0.20−0.25 fb, with a weak dependence on m T . The combined exclusion limits are also shown in Figure 12, with a luminosity of 300 fb −1 and 3 ab −1 . Two theoretical benchmark scenarios are projected to this figure. They are characterized by different Br(T → th), i.e., 25% and 50%, which could be achieved for a singlet and doublet top partner, respectively (see, e.g., [63]). At HL-LHC, the top partners in these two scenarios could be excluded up to m T ∼ 1.5 TeV and m T ∼ 1.7 TeV, respectively. Since no exclusive T T → tthh analysis at LHC is known to us, for comparison we scale the exclusion limit at m T = 1.43 TeV, obtained in the T T → th + X analysis at ATLAS, with a luminosity of 36.1 fb −1 at √ s = 13 TeV [28] and an assumption of Br(T → th) = 100%, to 3 ab −1 using Gaussian statistics. This yields a projected limit of σ(T T ) < 0.33 fb, higher than the one at m T = 1.5 TeV by a factor ∼ 1.4.

Conclusion
In this article, we systematically studied the tthh physics at HL-LHC, using the five exclusive analyses: 5b1 , 5b2 , SS2 , multi-and τ τ resonance. To address the complication of event topologies and the mess of combinatorial backgrounds, the BDT method was applied. We showed that the tthh production can serve as a useful probe for the cubic Higgs self-coupling, the tthh contact interaction, the new resonances such as heavy Higgs boson and fermion top partners. The measurement results may assist a further study on the nature of EWPT and EWSB. The main conclusions include: • In addition to 5b1 which was first suggested in [30,31], the SS2 (or SS2 +multi-) analysis provides another promising channel to study the tthh physics. For the tthh(SM) production at HL-LHC, the BDT sensitivity of 5b1 is ∼ 30% better than that of SS2 , and comparable to that of SS2 +multi-, with a tight b-tagging efficiency of 60%. But, its S/B value is ∼ 25% smaller than that of SS2 or SS2 +multi-. A combination of these exclusive analyses potentially allows for detecting tthh(SM) with a statistical significance ∼ 0.9σ. The 5b1 sensitivity could be further improved with a softened tight b-tagging efficiency. But, in this case, the backgrounds tt + 4c, 2b/2c2j, 4j need to be simulated in a more solid way.
• The tthh analyses at HL-LHC can be applied to probe for anomalous cubic Higgs selfcoupling and the tthh contact interaction. To utilize the kinematic features brought in by both, we defined a 2D BDT framework. Along each of its dimensions one specific signal sample (i.e., the samples of y 2 κ 2 and c 2 t ) is favored. With this method we showed that a combination of the five exclusive analyses potentially can constrain κ to {6.9, −10} for c t = 0, and c t to {−2.7, 1.8} for κ = 1. This may partially address the sensitivity degeneracy w.r.t. the cubic Higgs self-coupling, a difficulty usually thought to exist in the gluon-fusion di-Higgs analysis at HL-LHC.
• To qualitatively measure the potential of the future hadron colliders, we projected the non-resonant tthh sensitivities at HL-LHC to 27 TeV and 100 TeV, assuming the signal efficiency and background rejection at HL-LHC to be unchanged. We showed that a combination of the five exclusive analyses could detect the tthh(SM) production with a significance of 3.1σ and 14.3σ at 27 TeV and 100 TeV, respectively, and constrain κ to {−2.6, −1.6} ∪ {0.2, 1.6} for c t = 0, and c t to {−0.34, 0.16} for κ = 1, at 2σ C.L. at 100 TeV 4 .
• The tthh analyses can be also applied to search for new heavy resonances. For illustration, we considered two representative scenarios: ttH → tthh in type II THDM and T T → tthh in the CHM. At HL-LHC, the exclusion limit for the H can be pushed beyond the H → tt threshold, up to ∼ 480 GeV for tan β ∼ 1. This complements well the bbH → bbτ τ analysis in the relevant m H range. In the latter case, the tthh analyses set up an upper limit ∼ 0.2 fb for the T T cross section with m T between 1.5 − 2.0 TeV. This constraint excludes a scenario with Br(T → th) = 25% and 50% up to ∼ 1.5 TeV and ∼ 1.7 TeV, respectively.
We would bring it to attention that the following effects were not systematically taken into account in this study. They may weaken the analysis sensitivities obtained in this article. First, though we introduced a K factor to encode the NLO uncertainty in the background cross sections, we did not explore the impact of systematic errors in detail. Also, we did not simulate the pileup effect directly, though we applied a trimming to fat jets and smeared the MET. Additionally, for a softened tight b-tagging efficiency of 70%, we did not simulate the impact of the tt + 4c, 2b/2c2j, 4j backgrounds. Though this choice may improve the analysis sensitivities, especially for 5b1 and 5b2 , a solid simulation of these backgrounds is necessary.
The study on the tthh physics will greatly benefit from future hadron colliders. Because of its relatively large energy threshold of prodction, the tthh cross section increases faster as the beam energy increases, compared to some main di-Higgs production mechanisms such as gluon fusion [1]. The enhanced boost in kinematics with a higher beam energy will benefit to reconstructing the signal events also. Furthermore, as √ s increases, the weights of the y 2 κ 2 and c 2 t terms in the non-resonant σ(tthh) will be gradually enhanced (except some narrow regions), resulting in more contributions from them to the tthh production at 27 TeV and 100 TeV, compared to 14 TeV. This may strengthen the tthh sensitivities at future hadron colliders to the anomalous cubic Higgs coupling and the tthh contact interaction. Together with a potentially larger luminosity, these factors may significantly improve the sensitivities of these tthh analyses. The good sensitivities of these analyses may extend to some new channels, e.g., the one with di-photon Higgs decay. As a matter of fact, ∼ 10 4 di-photon tthh(SM) events, three orders more than those at HL-LHC, can be generated with a luminosity of 30 ab −1 at 100 TeV. Given that di-photon Higgs boson can be reconstructed with a high efficiency and its backgrounds are relatively clean, this channel may also play a non-trivial role in studying the tthh physics. Naturally, pursuing a comprehensive study regarding these lays out an ongoing project [39].
The study on the tthh physics, together with the other Higgs measurements, at future colliders may allow a deeper exploration on the nature of Higgs boson, the drive of EWPT, and the underlying theory of EWSB. One example is the reconstruction of Higgs potential. It is well-known that di-Higgs productions can be applied to probe for the cubic and quartic Higgs self-couplings. Yet, the renormalization of the quartic Higgs self-coupling to the cubic one result in a dependence on renormalization scheme, due to incomplete treatment of higherorder effects. As was discussed in [21], a combined analysis of multiple di-Higgs productions can greatly reduce such a scheme dependence at linear level. The tthh production at hadron colliders may well-pair with the gg → hh production to achieve this. Another example is pinning down the underlying theory of the Higgs self-couplings or the tthh contact interaction, given an anomaly. In either case, such an anomaly could arise from multiple new physics scenarios: an anomalous cubic Higgs coupling can be induced in the SM extension with, e.g., Higgs singlet, doublet or triplet, while a nonzero tthh contact interaction can arise from the CHM or supersymmetry at either tree or loop level. To recognize the real one, a combination of multiple Higgs probes is necessary. We leave these explorations to a future work.