Probing Trilinear Higgs Self-coupling at the HL-LHC via Multivariate Analysis

We perform a multivariate analysis of Higgs-pair production in $HH \to b\bar b \gamma\gamma$ channel at the HL-LHC to probe the trilinear Higgs self--coupling $\lambda_{3H}$, which takes the value of 1 in the SM. We consider all the known background processes. Also, for the signal we are the first to adopt the most recent event generator of POWHEG-BOX-V2 to exploit the NLO distributions for Toolkit for Multivariate Data Analysis (TMVA), taking account of the full top--quark mass dependence. Through Boosted Decision Tree (BDT) analysis trained for $\lambda_{3H}=1$, we find that the significance can reach up to 1.95 with about $9$ signal and $18$ background events. In addition, the Higgs boson self-coupling can be constrained to $1.00<\lambda_{3H}<6.22$ at 95\% confidence level (CL). We also perform a likelihood fitting of $M_{\gamma\gamma bb}$ distribution and find the $1\sigma$ confidence interval (CI) of $0.1<\lambda_{3H}<2.2\, \cup\,5.4<\lambda_{3H}<6.6$ for the $\lambda_{3H}=1$ nominal set. On the other hand, using BDTs trained for each value of $\lambda_{3H}$, we find a bulk region of $0.5<\sim \lambda_{3H}<\sim~ 4.5$, for which it is hard to pin down the trilinear coupling.


I. INTRODUCTION
Since the discovery of the Higgs boson in 2012 [1], the most pressing question is to understand the underlying mechanism for electroweak symmetry breaking (EWSB). There is no particular reason why the EWSB sector only consists of a single Higgs doublet. Indeed, the simplest version suffers from the so-called gauge hierarchy problem. After completing Run I and II at the LHC, the identity of the Higgs boson has been established. It is best described as the standard model (SM) Higgs boson [2], although there is an upward trend in the overall signal strength [3].
All current measurements of the Higgs boson properties confine to the couplings of the Higgs boson to the SM particles, like gauge bosons and fermions . However, the self-couplings of the Higgs boson is not established at all, which depends on the dynamics of the EWSB sector. The self-couplings of the Higgs boson can be very different between the SM and other extensions of the EWSB sector, like two-Higgs doublet models (2HDM), and MSSM.
Higgs boson pair production at the LHC provides a very useful avenue to investigate the selfcouplings of the Higgs boson [4][5][6]. There have been a large number of works in literature on Higgs-pair production in the SM [7], in model-independent formalism [8], in models beyond the SM [9], and in SUSY [10].
Furthermore, the high luminosity option of the LHC running at 14 TeV (HL-LHC) was approved. It is a legitimate machine to investigate the EWSB sector. In a previous work [11], we showed that even with the HL-LHC one cannot establish the self-coupling λ 3H at the SM value using the most promising decay mode HH → (bb)(γγ). Indeed, one can only constrain the self-coupling to be within −1.0 λ 3H 7.6 at 95% CL [11]. These results were based on a conventional cut-based analysis.
In this work, we show that with the use of Boosted Decision Tree (BDT) method, the significance of the signal can be improved by 80%, which is a substantial improvement from the cut-based analysis.
The organization is as follows. In the next section, we give some details on generation of the signal and background event samples. In Sec. III, we set up the TMVA variables and various BDT methods. We present the numerical results in Sec. IV. We end our discussion and conclusion in Sec. V.

II. GENERATION AND SIMULATION OF SIGNAL AND BACKGROUNDS
The Higgs bosons in the signal event samples are generated on-shell with zero width by POWHEG-BOX-V2 [12,13] with the damping factor hdamp set to the default value of 250 to limit the amount of hard radiation. This code provides NLO distributions matched to a parton shower taking account of the full top-quark mass dependence. The variation of the trilinear Higgs coupling, λ 3H , is also allowed in this code. The Higgs and the top quark masses are set to the default values of M H = 125 GeV and m t = 173 GeV, respectively, and the bottom quark is considered massless. The MadSpin code [14] is used after generating a pair of Higgs bosons in order to decay both Higgs bosons into two bottom quarks and two photons.
For parton showering and hadronization, PYTHIA8 [15] is used. Here an appropriate setup provided by POWHEG-BOX-V2 is used to correctly perform a matching of POWHEG-BOX-V2 with PYTHIA8. Finally, fast detector simulation and analysis at the HL-LHC are performed using Delphes3 [16] with the ATLAS template. The parameters in the template are tuned as in Ref. [11].
For generation and simulation of backgrounds, we closely follow Ref. [11], except for the use of the post-LHC PDF set of CT14LO [17] and merged cross sections for non-resonant backgrounds. More precisely, for the two main non-resonant backgrounds of bbγγ and ccγγ, we use the merged cross sections and distributions by MLM matching [18,19] with xqcut and Q cut set to 20 GeV and 30 GeV, respectively. For the remaining non-resonant backgrounds, we are using the cross sections and distributions obtained by applying the generator-level cuts listed in Eq. (2) as adopted in Ref. [20] which might provide more reliable and conservative estimation of the non-resonant backgrounds containing light jets [11].
The information on the matrix-element generation, parton showering, and hadronization is summarized in Table I. The signal cross section at NNLO order in QCD is calculated according to where λ 3H -dependent NLO cross section of σ NLO (λ 3H ) is computed by the use of POWHEG-BOX-V2 and we take K N N LO/N LO SM = 1.116 [21]. For the cross sections of non-resonant and ttγ backgrounds, the following generator-level cuts are applied at parton level in order to remove Di-photon trigger condition, ≥ 2 isolated photons with P T > 25 GeV, |η| < 2.5 2 ≥ 2 isolated photons with P T > 30 GeV, |η| < 1.37 or 1.52 < |η| < 2.37, ∆R γγ,jγ > 0.4 Events are required to contain ≤ 5 jets with P T > 30 GeV within |η| < 2.5 5 No isolated leptons with P T > 25 GeV, |η| < 2.5 the divergence associated with the photons or jets: Note that, in Table I, signal and the ggH(→ γγ) and tt backgrounds are generated at NLO and normalized to the cross sections computed at the accuracy denoted in 'Order in QCD'.
And the remaining backgrounds are generated at LO and normalized to the cross sections computed at the accuracy denoted in 'Order in QCD'.

III. TMVA ANALYSIS
Before performing a multivariate analysis using Toolkit for Multivariate Data Analysis (TMVA) [22] with ROOTv6.18 [23], a sequence of event selections is applied to the signal and background event samples, see Table II. And then we choose the following eight kinematic variables for TMVA: We observe that significance can be meaningfully improved by judiciously choosing the two photons or two b quarks for the above TMVA variables. For M bb,γγ and P bb,γγ T , in terms of P T , we choose the least energetic two photons or two b quarks while the most energetic with λ 3H = 1 (blue) and backgrounds (red) after applying the event pre-selections cuts in Table II . For comparisons, the LO signal distributions are also shown in dashed-blue lines.
ones are chosen for ∆R bb,γγ and M γγbb . For ∆R γb , on the other hand, we choose the least energetic b and the next-to-the-least energetic photon.
In Fig. 1, we show the normalized distributions of the eight kinematic variables for the SM signal with λ 3H = 1 (blue) and backgrounds (red) after applying the event pre-selection cuts in Table II. We observe the broad peak around 125 GeV in the M bb distribution of the signal while the peak in the M γγ distribution of the signal is very sharp. The signal tends to give larger transverse momenta of P bb,γγ T while it is more populated in the region of smaller ∆R bb,γγ , implying a strong negative correlation between P T and ∆R. Furthermore, the signal has larger M γγbb and its distribution is peaked around 400 GeV, and ∆R γb provides another good discriminant observable. First of all, we try various multivariate analysis (MVA) methods provided by TMVA with the eight kinematic variables listed above. For this we use the default TMVA setup for each method. The receiver operating characteristic (ROC) curves for various methods are shown in Fig. 2. We find that the BDT-related methods show higher performance with better signal efficiency and stronger background rejection. We choose the best method of BDT for our analysis.
Before presenting the results of our analysis, we describe our setup for BDT briefly here.
For each event sample of signal and backgrounds, we randomly divide it into two halves with a default split seed. The first half is used for training and the second one for testing.
Then in order to improve performance of a trained BDT, we use 800 trees and node splitting is allowed only when the number of events in a node is larger than 2.5% of total number of events of the training sample. Maximum tree depth is set to 4. Training is carried out using Adaptive Boost with learning rate β = 0.5. One half of the training sample is UseBaggedBoost:BaggedSampleFraction=0.5:SeparationType=GiniIndex:nCuts=20 .

IV. RESULTS
In the left panel of Fig. 3 provide quite better separation between signal and background. Incidentally, by the vertical lines, we denote the position of the optimal cut on the BDT response which maximizes the significance of Z: where s and b represent the numbers of signal and background events, respectively.
In Fig. 4, using BDT SM , we show the behavior of signal and background efficiencies and significance Z according to the variation of the cut value on BDT response. We observe the significance can reach up to 1.95 when 0.145 is taken for the BDT response cut and, at which, the signal and background efficiencies are 0.50 and 4.9 × 10 −4 , respectively.
In Table III, we present expected number of signal and background events at the HL-LHC assuming 3000 fb −1 using BDT SM with the BDT response cut of 0.145. We find that the significance is 1.95 with about 9 signal and 18 background events for λ 3H = 1. Comparing to the results using the cut-and-count analysis [11], we find that the number of signal events decreases by only 10% while the number of backgrounds by almost 80%, resulting in an increase in significance from 1.09 to 1.95. Note that the composition of backgrounds changes drastically by the use of BDT. In the cut-and-count analysis, the non-resonant background is about two times larger than the single-Higgs associated background. While, in the BDT analysis, the single-Higgs associated background is larger than the non-resonant one and tt associated background becomes negligible. Also, we find that the Higgs boson self-coupling can now be constrained to 1.00 < λ 3H < 6.22 at 95% confidence level (CL), which removes the region of negative λ 3H in contrast to the results based on the cut-and-count analysis.
Even the significance standing at 1.95 may not be high enough to make a precise measurement of the trilinear Higgs self-coupling at the HL-LHC, we implement a likelihood fitting of M γγbb distribution to quantify the uncertainty in the determination of λ 3H and to see how much the two-fold ambiguity in the determination could be lifted up.
So far we have used the BDT trained for λ 3H = 1 or BDT SM independently of the input value of λ 3H . Without knowing the value of λ 3H a priori, it would be more desirable to use separate BDTs trained for specific values of λ 3H , which we wish to call BDT λ 3H in short for further references. In Fig. 6, we show the contour plot showing the 1σ and 2σ CI regions obtained by likelihood fitting of M γγbb distributions. For each value of λ in 3H , we use the corresponding BDT λ 3H together with λ 3H = λ in 3H nominal set and the BDT response cut is set to maximize significance. Using BDT λ 3H , the 95% CL region is narrowed into 1.01 < λ 3H < 5.42 at 95% CL. Compared to that obtained using BDT SM , we observe the noticeable changes of Z for λ 3H > ∼ 4. And we also find there exists a bulk region of 0.5 < ∼ λ 3H < ∼ 4.5 in which it is hard for one to pin down the trilinear coupling, see the 1-σ error region delimited by solid lines in Fig. 6.
In Fig. 3, we show that using the NLO distributions of signal for TMVA may lead to the better results. For a quantitative comparison, we use the LO distributions for TMVA and find that the significance can reach up to 1.79 with about 9 signal and 20 background events.
The details of the NLO and LO results based on BDT SM are presented in Table IV where we also make comparison with the recent ATLAS result without systematic uncertainties [24] in which the LO distributions of signal and the generator-level cuts Eq. (2) adopted in Ref. [20] are used for TMVA.
HH⟶bbγγ, s =14 TeV, 3000 fb -1 BDT SM , HH signal + background significance due to the changes of random seeds is negligible.

V. CONCLUSIONS
Higgs-pair production is the most useful avenue to the understanding of the EWSB sector. We have studied in great details, with the help of machine learning, the sensitivity of measuring the trilinear Higgs self-coupling λ 3H that one can expect at the HL-LHC with 3000 fb −1 . With TMVA one can improve upon the signal-to-background significance over  [24]. Z max , s| Zmax , and b| Zmax denote the significance, the number of signal events, and the number of background events, respectively, obtained after applying the BDT cut which maximizes the significance. Also compared are the 95% CL and 1σ CI ranges of λ 3H .
BDT SM ATLAS 2018 [24] MLM the traditional cut-based analysis. In this work, we have shown that the significance is improved by about 80% and found a narrower range of λ 3H below the sensitivity. With BDTs trained for each value of λ 3H , we found the bulk region down to 0.5 λ 3H 4.5 in which one cannot pin down the trilinear coupling.

ACKNOWLEDGMENT
We thank Chih-Ting Lu for initial participation. This work was supported by the Na-