Discovering Higgs boson pair production through rare final states at a 100 TeV collider

We consider Higgs boson pair production at a future proton collider with centre-of-mass energy of 100 TeV, focusing on rare final states that include a bottom-anti-bottom quark pair and multiple isolated leptons: $hh \rightarrow (b\bar{b}) + n \ell + X$, $n = \{2,4\}$, $X = \{ E_T^\mathrm{miss}, \gamma, -\}$. We construct experimental search strategies for observing the process through these channels and make suggestions on the desired requirements for the detector design of the future collider.


I. INTRODUCTION
In the 'aftermath' of the Large Hadron Collider's (LHC) 'Run 1' at 7 and 8 TeV, we are left with the grim possibility that striking new effects will be beyond the reach of the collider, perhaps even in 'Run 2' at 13 TeV. The LHC will continue to collect data into a high-luminosity 'Run 3' phase, possibly at 14 TeV, which will allow us to perform precision tests on the properties of the Higgs boson and the underlying mechanism of electroweak symmetry breaking (EWSB). Taking a rather pessimistic point of view when it comes to beyond-the-Standard Model physics, a future circular hadron collider (FCC-hh), at a potential proton-proton centre-of-mass energy of 100 TeV, will resume this task, zooming into the properties of the Higgs boson. An important process that has received considerable attention in recent years, as part of this effort, is that of Higgs boson pair production. By examining this process, several aspects of EWSB can be probed: e.g. the consistency of the self-coupling of the Higgs boson with the Standard Model expectation [1][2][3][4][5][6][7][8][9][10][11][12], whether it comes as part of a doublet or is affected by new physics effects, as well as effects of higher-dimensional operators [44][45][46].
The $hh$ process is currently under investigation at the LHC, both by phenomenologists and experimentalists. The ATLAS and CMS collaborations have already presented, as a "warm-up" exercise, results for resonant and non-resonant production of $hh$ using 'Run 1' data [47][48][49]. Detection of this process remains, however, a serious challenge, even at the end of the high-luminosity run of the LHC, where 3 ab$^{-1}$ of integrated luminosity will be collected. The question examined in this article is whether a 100 TeV proton-proton collider can yield significant contributions to the investigation of this process.
At present, it is clear that the task will not be trivial, even at the FCC-hh, but initial results encourage optimism. Several studies examining the $hh \rightarrow (b\bar{b})(\gamma\gamma)$ channel, with diverse assumptions on detector performance, indicate that it would provide a clear signal at the end of a 3 ab$^{-1}$ run of the FCC-hh [45,50]. This process falls into the 'rare but clean' category, allowing good rejection of relatively manageable backgrounds with high signal efficiency, on account of the prospect of excellent resolution in the photon momentum measurement. An obvious question is whether the increase of energy opens up access to other such channels, previously unavailable at the LHC due to the small total cross section. In the present article we consider a simple set of such channels, where one Higgs boson is allowed to decay via $h \rightarrow b\bar{b}$ and the other to either a pair of gauge bosons, $h \rightarrow ZZ$, $Z\gamma$, $W^+W^-$, with subsequent decay to leptons, or directly to leptons, $h \rightarrow (\tau^+\tau^-)/(\mu^+\mu^-)$. We will not consider hadronic decays of the τ lepton, as these will require reconsideration of tagging algorithms versus large QCD backgrounds that will rely crucially on the, presently unknown, detector performance of the future collider.*
The final states considered, their branching ratios and cross sections at 14 TeV are shown in Table I. We also give the relevant numbers for the $hh \rightarrow (b\bar{b})(\gamma\gamma)$ process, even though it has not been considered here. Evidently, of the multilepton channels examined here, all but the $hh \rightarrow (b\bar{b})(W^+W^-)$ channel provide a negligible number of events at the LHC at 14 TeV. These calculations, and the rest of the signal cross sections in this article, are based on the EFT-approximate total NNLO cross section for Higgs boson pair production of σ(14 TeV) = 40.2 fb and σ(100 TeV) = 1638 fb [52].†
At the FCC-hh, the factor ∼ 40 increase in cross section, with respect to the LHC at 14 TeV, allows for exploration of these rare channels, given that they provide a non-negligible inclusive event sample. In what follows and the rest of this article, we define $\ell = \{e, \mu\}$.
* Electronic address: apapaefs@cern.ch
The paper is organised as follows: we first give details of the simulation of detector effects and the generation of Monte Carlo event samples used for the analyses. Subsequently, we go through each final state, highlighting the important attributes and potential for measurement at the FCC-hh. The main results, in Tables III, IV, VI and VII, appear as the expected number of events at 3 ab$^{-1}$ of integrated luminosity and can be trivially translated into cross sections or events at higher luminosities. We conclude by summarizing and making general recommendations on desired features of the FCC-hh detectors.
* We note also the recent study, at the LHC at 14 TeV and the FCC-hh at 100 TeV, of the three-lepton final state arising from $hh \rightarrow (W^+W^-)(W^+W^-)$ that appeared in [51].
† For further details on the cross section calculations see also [42][53][54][55][56][57][58].

A. Detector simulation
In what follows, we consider all particles within a pseudorapidity of $|\eta| < 5$ and with $p_T > 500$ MeV. Following [50], we smear the momenta of all reconstructed objects according to [59] for jets and muons and according to [60] for electrons. We also apply the relevant reconstruction efficiencies according to [59] for jets and muons and [60] for electrons. We simulate b-jet tagging by looking for jets containing B-hadrons and applying a flat b-tagging efficiency of 70% and a mis-tag rate of 1% for light-flavour jets. We do not consider c-jets. We do not apply any smearing to the missing transverse energy.
We reconstruct jets using the anti-$k_t$ algorithm available in the FastJet package [61,62], with a radius parameter of $R = 0.4$. We only consider jets with $p_T > 40$ GeV within $|\eta| < 3$ in our analysis. The jet-to-lepton mis-identification probability is taken to be $P_{j\rightarrow\ell} = 0.0048 \times e^{-0.035\, p_{T,j}/\mathrm{GeV}}$, as in [50]. We apply a transverse momentum cut on all leptons of $p_T > 20$ GeV. We also consider the mis-tagging of two bottom quarks with a flat probability of 1% for each mis-tag, corresponding to a b-jet identification rate of 70%, and demand that they lie within $|\eta| < 2.5$. We demand all leptons to be isolated, an isolated lepton having $\sum_i p_{T,i}$ less than 15% of its transverse momentum in a cone of $\Delta R = 0.2$ around it.
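The flat tagging rates, the fake-lepton probability and the isolation criterion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the caller is assumed to have already selected the particles inside the $\Delta R = 0.2$ cone, and none of this is the code used for the study.

```python
import math

def fake_lepton_probability(pt_jet_gev: float) -> float:
    """Jet-to-lepton mis-identification probability quoted in the text:
    P = 0.0048 * exp(-0.035 * pT_jet / GeV)."""
    return 0.0048 * math.exp(-0.035 * pt_jet_gev)

def is_isolated(lepton_pt: float, cone_pts, max_fraction: float = 0.15) -> bool:
    """True if the summed pT of the other particles inside the isolation
    cone is below max_fraction of the lepton pT.
    cone_pts: pT values of particles already found inside the cone
    (cone selection is assumed to be done by the caller)."""
    return sum(cone_pts) < max_fraction * lepton_pt

def b_tag_weight(n_true_b: int, n_light: int,
                 eff_b: float = 0.70, mistag: float = 0.01) -> float:
    """Probability that n_true_b real b-jets and n_light light-flavour jets
    are all b-tagged, assuming flat and independent rates."""
    return eff_b ** n_true_b * mistag ** n_light

if __name__ == "__main__":
    print(fake_lepton_probability(40.0))  # fake rate for a 40 GeV jet
    print(b_tag_weight(2, 0))             # two genuine b-jets both tagged
    print(b_tag_weight(0, 2))             # two light jets both mis-tagged
```

With these flat rates a genuine $b\bar{b}$ pair survives double tagging about half of the time, whereas a pair of light-flavour jets is suppressed by four orders of magnitude.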
Since the design of the detectors for the FCC-hh is still under development, we consider, in conjunction with the above, which we call the 'LHC parametrization' of the detector effects, an alternative 'ideal' parametrization. This is obtained by setting all efficiencies to 100%, within the given considered acceptance regions for jets and leptons, and by removing all momentum smearing effects. Table II summarizes the differences between the two parametrizations. The mis-tagging rates for b-jets, leptons and photons were kept identical in both parametrizations. We note, however, that these were not found to be particularly important for the channels considered here, provided that they remain at the levels that can be achieved at the high-luminosity LHC.

B. Event generation
We generate the signal at leading order using the Herwig++ event generator [63][64][65][66] interfaced with the OpenLoops package for the one-loop amplitudes [10,67].‡ The backgrounds have been generated with the MadGraph 5/aMC@NLO package [68][69][70], at next-to-leading order (NLO) in QCD, except for the case of the $t\bar{t}$ background, which was generated at leading order and merged with the Herwig++ parton shower using the MLM algorithm, including $t\bar{t}$+1 parton matrix elements. For the latter, we normalized the cross section to the total NLO cross section. All simulations include modelling of hadronization as well as the underlying event through multiple parton interactions, as they are made available in Herwig++. No simulation of additional interacting protons (pile-up) is included in this study. The CT10nlo parton density function set [71] was used for all parts of the simulation.
To facilitate the Monte Carlo event generation, we reduce the rather large $t\bar{t}$ cross section at 100 TeV by applying the following generation-level cuts to the final state objects $(\ell^+ b\, \nu_\ell)(\ell^- \bar{b}\, \bar{\nu}_\ell)$: We also apply generation-level cuts to b-quarks in other backgrounds. These are indicated on the corresponding tables.

III. ANALYSIS
A. $hh \rightarrow (b\bar{b})(4\ell)$
Since the $h \rightarrow ZZ \rightarrow 4\ell$ channel has played an important role in the discovery of the Higgs boson, it is reasonable to ask whether the four-lepton final state, in association with $h \rightarrow (b\bar{b})$, could be employed equivalently in the di-Higgs channel. Unfortunately, at the LHC at 14 TeV, the SM cross section in this channel is hopelessly low: $\sigma((b\bar{b}) + 4\ell)_{14\,\mathrm{TeV}} \sim 6 \times 10^{-3}$ fb, giving an expected number of ∼ 20 events at the end of the high-luminosity run at 3 ab$^{-1}$. Hence, even considering just triggering and basic acceptance cuts, one can conclude that this channel will never be observed at the LHC. At a 100 TeV collider, the cross section increases to about 0.26 fb, leading to ∼ 780 events at 3 ab$^{-1}$. Evidently, the channel is still challenging, even at this future collider. Nevertheless, the final state is easier to reconstruct than others, and one should consider whether significance could in principle be obtained. Particularly interesting is the scenario of an integrated luminosity of 30 ab$^{-1}$, where one would have an initial sample of 7800 events.
‡ Note that O(10%) systematic uncertainties can be introduced by using the LO sample instead of a sample that includes higher-order real emission matrix elements, as pointed out in [10]. For the purposes of this initial study, it is sufficient to use LO $hh$ production.
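The event counts quoted for the $(b\bar{b})(4\ell)$ channel follow from $N = \sigma \times \mathcal{L}$. A minimal Python sketch of that arithmetic, using only the cross sections quoted in the text; the function name is illustrative and no acceptance or efficiency is applied:

```python
FB_PER_AB = 1000.0  # 1 ab^-1 corresponds to 1000 fb^-1

def expected_events(sigma_fb: float, lumi_ab: float) -> float:
    """Expected number of events for a cross section in fb and an
    integrated luminosity in ab^-1, before any selection."""
    return sigma_fb * lumi_ab * FB_PER_AB

if __name__ == "__main__":
    scenarios = {
        "14 TeV, 3 ab^-1": (6e-3, 3.0),     # sigma ~ 6e-3 fb
        "100 TeV, 3 ab^-1": (0.26, 3.0),    # sigma ~ 0.26 fb
        "100 TeV, 30 ab^-1": (0.26, 30.0),
    }
    for label, (sigma_fb, lumi_ab) in scenarios.items():
        print(f"{label}: {expected_events(sigma_fb, lumi_ab):.0f} events")
```

The 14 TeV entry evaluates to 18 events, consistent with the rounded "∼ 20" quoted in the text; the 100 TeV entries give 780 and 7800 events.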
TABLE I: Higgs boson pair production rare final state branching ratios (BR) and cross sections that are considered in the present article. Decays to τ leptons that subsequently decay to electrons or muons are included. The $hh \rightarrow (b\bar{b})(\gamma\gamma)$ channel is not considered here and is given for comparison.
TABLE II: The differences between the two detector parametrizations we consider. One is effectively an 'ideal' detector, with perfect efficiency within the considered pseudorapidity ranges for each type of object, whereas the other is an LHC-like parametrization, with equivalent assumptions as for the high-luminosity LHC. The b-tagging probability, $p_{b\text{-tag}}$, the efficiency for jets, electrons and muons ($\epsilon(j/e/\mu)$) and the corresponding standard deviation of the smearing Gaussian ($\sigma(j/e/\mu)$) are given.
The backgrounds relevant to this process are listed in Table III. Here, we only consider mis-tagging of a single lepton, with the dominant process in this case being $W^\pm Zh$. Evidently, there are six objects relevant to the hard process: two b-jets and four leptons. We demand pairs of leptons of opposite charge and same flavour as well as two identified b-jets. As a simulation of a possible 4-lepton trigger, we ask for the following staggered lepton cuts: Since the signal in this case is not expected to possess a large amount of missing transverse energy, we impose $\slashed{E}_T < 100$ GeV. It was observed that the distance between all leptons in the $hh$ signal is substantially smaller than in most of the background processes, and hence a cut of $\Delta R(\ell_1, \ell_j) < 1.0$, with $j = \{2, 3, 4\}$, is imposed. We apply cuts tailored to rejecting events with two on-shell Z bosons: if there are two combinations of same-flavour opposite-sign leptons that have an invariant mass in $m_{\ell^+\ell^-} \in (80, 100)$ GeV, we reject the event. We also demand that no single pair of same-flavour opposite-sign leptons possesses a mass above 120 GeV. The final reconstructed observables are the invariant mass of the four leptons, the invariant mass of the b-jet pair and the invariant mass of all six reconstructed objects. These observables are shown for the $hh$ signal and the two significant backgrounds, $t\bar{t}h$ and $t\bar{t}Z$, after the aforementioned cuts in Fig. 1. To obtain the final result we further impose the following cuts: and no cut on the invariant mass of all the reconstructed objects.
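The rejection of events with two on-shell Z bosons can be illustrated with a short Python sketch. It is one possible reading of the cut, not the author's implementation: the lepton representation (a dict with keys `p4`, `flavour`, `charge`) and the function names are assumptions made for this example, and "two combinations" is interpreted as any two same-flavour opposite-sign pairs falling in the mass window.

```python
import math
from itertools import combinations

def invariant_mass(*momenta) -> float:
    """Invariant mass of the sum of four-momenta given as (E, px, py, pz)."""
    e, px, py, pz = (sum(c) for c in zip(*momenta))
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against small negative rounding

def sfos_pair_masses(leptons):
    """Invariant masses of all same-flavour opposite-sign lepton pairs."""
    return [
        invariant_mass(a["p4"], b["p4"])
        for a, b in combinations(leptons, 2)
        if a["flavour"] == b["flavour"] and a["charge"] == -b["charge"]
    ]

def passes_z_veto(leptons, z_window=(80.0, 100.0), max_pair_mass=120.0) -> bool:
    """Reject events with two SFOS pairs inside the Z window, or with any
    SFOS pair heavier than max_pair_mass (all masses in GeV)."""
    masses = sfos_pair_masses(leptons)
    n_on_shell = sum(z_window[0] < m < z_window[1] for m in masses)
    return n_on_shell < 2 and all(m <= max_pair_mass for m in masses)

if __name__ == "__main__":
    # Toy event with massless leptons: one pair at 90 GeV, one at 30 GeV.
    event = [
        {"p4": (45.0, 45.0, 0.0, 0.0), "flavour": "e", "charge": +1},
        {"p4": (45.0, -45.0, 0.0, 0.0), "flavour": "e", "charge": -1},
        {"p4": (15.0, 0.0, 15.0, 0.0), "flavour": "mu", "charge": +1},
        {"p4": (15.0, 0.0, -15.0, 0.0), "flavour": "mu", "charge": -1},
    ]
    print(sfos_pair_masses(event), passes_z_veto(event))
```

A signal-like event, with one on-shell and one off-shell Z boson, passes this veto, while a ZZ-like event with both pairs near the Z mass does not.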
It is important to emphasize at this point a crucial element of this analysis, namely the minimal cuts that should be applied to each of the four leptons in the final state to avoid excessive signal rejection. In particular, for the fourth, softest, lepton in the final state, about 65% of the events fall in 20 GeV $< p_{T,\ell_4} <$ 30 GeV. Not including this bin in $p_{T,\ell_4}$ would automatically reduce the signal by more than a factor of two. This is demonstrated in Fig. 2, where the ordered transverse momenta of the four leptons are shown. The ordered transverse momenta of the b-jets are shown for completeness.
The $hh \rightarrow (b\bar{b})(\ell^+\ell^-\gamma)$ channel contains two leptons from the on-shell Z decay and a hard photon, and possesses a cross section only slightly less than $hh \rightarrow (b\bar{b})(4\ell)$. The backgrounds are, however, substantially larger in this case. Here we only include the most significant irreducible ones, coming from $b\bar{b}Z\gamma$ and $t\bar{t}\gamma$, as well as the dominant reducible ones, where a photon is mis-tagged in $b\bar{b}Z$ or $t\bar{t}$ production.
In the analysis of this channel we require two leptons of the same flavour with GeV and a photon with $p_{T,\gamma} > 40$ GeV. No isolation requirements are imposed on the photon. We ask for $\Delta R(\ell_1, \ell_2) < 1.8$, $\Delta R(\ell_1, \gamma) < 1.5$ and $0.5 < \Delta R(b_1, b_2) < 2.0$. We construct the invariant mass of the b-jet pair and the invariant mass of the two-lepton and photon system and impose the cuts: $100 < M_{b\bar{b}} < 150$ GeV and $100 < M_{\ell^+\ell^-\gamma} < 150$ GeV.
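The angular and mass-window requirements listed above can be written down compactly. The Python sketch below is illustrative only: the function names are hypothetical, the transverse momentum requirements are left out, and the inputs are assumed to be already-reconstructed quantities in GeV and radians.

```python
import math

def delta_r(eta1: float, phi1: float, eta2: float, phi2: float) -> float:
    """Angular distance sqrt(d_eta^2 + d_phi^2), with the azimuthal
    difference wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def passes_zgamma_selection(dr_ll: float, dr_l1_gamma: float, dr_bb: float,
                            m_bb: float, m_llgamma: float) -> bool:
    """Angular and mass-window cuts quoted for the (bb)(l+ l- gamma) channel."""
    return (
        dr_ll < 1.8
        and dr_l1_gamma < 1.5
        and 0.5 < dr_bb < 2.0
        and 100.0 < m_bb < 150.0
        and 100.0 < m_llgamma < 150.0
    )

if __name__ == "__main__":
    # Two objects on either side of the phi = +-pi boundary are close together.
    print(delta_r(0.0, 3.0, 0.0, -3.0))
    print(passes_zgamma_selection(1.0, 1.0, 1.0, 125.0, 125.0))
```

Wrapping the azimuthal difference matters in practice: without it, two leptons on either side of the $\phi = \pm\pi$ boundary would appear far apart and fail the $\Delta R$ cuts.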
Even after these cuts, the $b\bar{b}Z\gamma$ background dominates the sample, resulting in a signal-to-background ratio of O(2−3%) with only O(10) signal events at 3 ab$^{-1}$ of integrated luminosity. Therefore, this channel is not expected to provide significant information at the 100 TeV pp collider, unless a significant alteration of the $hh$ channel is manifest due to new physics effects.
Another interesting channel to consider is the final state that includes a $b\bar{b}$ pair and two oppositely-charged leptons. This channel receives signal contributions from three different $hh$ decay modes. The largest contribution comes from $hh \rightarrow (b\bar{b})(W^+W^-)$ with the W bosons decaying (either directly, or indirectly through taus) to electrons or muons. The second-largest contribution to this channel comes from $hh \rightarrow (b\bar{b})(\tau^+\tau^-)$, with both taus decaying to electrons or muons. Both of these channels will include final-state neutrinos, and hence will be associated with large missing energy. The specific final state has been considered at the LHC at 14 TeV, e.g. coming from $(b\bar{b})(W^+W^-)$ in [4] or included implicitly as part of the channel $(b\bar{b})(\tau^+\tau^-)$ in [3][4][5][10]. The smallest contribution comes from $hh \rightarrow (b\bar{b})(\mu^+\mu^-)$, i.e. through direct decay of one of the Higgs bosons to muons. This has been considered in [2] at a 200 TeV proton collider, which was envisioned to have an integrated luminosity of either 600 fb$^{-1}$ or 1200 fb$^{-1}$. The channel was shown to be able to provide some information on the $hh$ process at a higher-energy collider, and hence we re-examine it here at a 100 TeV collider, allowing for the possibility of collecting a higher integrated luminosity sample.
Due to the different origin of the leptons in the three different signal processes, the kinematical details vary substantially. As already mentioned, in $(b\bar{b})(W^+W^-)$ and $(b\bar{b})(\tau^+\tau^-)$, we expect large missing energy. The τ leptons in $(b\bar{b})(\tau^+\tau^-)$ are light compared to the Higgs boson, and hence the leptons and neutrinos in their decays are expected to be collimated. On the other hand, in $(b\bar{b})(W^+W^-)$, both W bosons are heavy, one being most of the time on-shell ($M_W \sim 80.4$ GeV) and the other off-shell with $M_{W^*}$ peaking at ∼ 40 GeV. This implies that the distribution of the leptons and neutrinos will differ considerably from the $(b\bar{b})(\tau^+\tau^-)$ case. This will be reflected in the analysis efficiency for the two processes: the fact that the leptons in boosted τ decays are collimated with the associated neutrinos makes the $h \rightarrow \tau^+\tau^-$ decay easier to separate from the backgrounds. The $(b\bar{b})(\mu^+\mu^-)$ final state is expected to have little missing energy, possibly only due to B-hadron decays, and the two muons are expected to reconstruct the Higgs boson. To account for all of these properties, we construct two separate signal regions. One aims to capture events containing rather large missing energy, targeting the $(b\bar{b})(W^+W^-)$ and $(b\bar{b})(\tau^+\tau^-)$ channels, whereas the second is aimed towards events with minimal missing energy that are expected to characterize the $(b\bar{b})(\mu^+\mu^-)$ channel. The same set of observables is considered, with substantial variations of cuts.
We consider events with two isolated leptons, with isolation criteria as in the previous sections. We construct their transverse momenta, $p_{T,\ell_{\{1,2\}}}$, the distance between them, $\Delta R(\ell_1, \ell_2)$, and their invariant mass, $M_{\ell\ell}$. We ask for two b-jets, for which we construct the transverse momenta, distance and invariant mass: $p_{T,b_{\{1,2\}}}$, $\Delta R(b_1, b_2)$ and $M_{b\bar{b}}$. We also consider the missing transverse energy, $\slashed{E}_T$, and the invariant mass of all the reconstructed objects, $M_{b\bar{b}\ell\ell}$. Moreover, we consider a further observable, $M_\mathrm{reco.}$, constructed by assuming that the missing energy arising from neutrinos in the decays of the τ leptons is collinear to the observed leptons:

$$M_\mathrm{reco.}^2 = \left[ p_{b_1} + p_{b_2} + (1+f_1)\, p_{\ell_1} + (1+f_2)\, p_{\ell_2} \right]^2 ,$$

where $p_{b_i}$, $p_{\ell_i}$ are the observed momenta of the i-th b-jet and i-th lepton and $f_{1,2}$ are constants of proportionality between the neutrino and lepton momenta from the decay of the two τ leptons: $p_{\nu_i} = f_i\, p_{\ell_i}$. These can be calculated from the observed missing transverse energy by inverting the missing transverse momentum balance relation $L f = \slashed{E}$, where $L$ is the matrix $L^j_i = p^j_{\ell_i}$, for which the superscript denotes the component of the i-th lepton momentum, $j = \{x, y\}$, and $\slashed{E}$ and $f$ are the vectors $\slashed{E} = (\slashed{E}^x, \slashed{E}^y)$ and $f = (f_1, f_2)$. We consider this observable for the $(b\bar{b})(W^+W^-)$ signal sample as well, even though the collinearity approximation fails, as it is still expected to be correlated with the invariant mass of the Higgs boson pair. One may also define the reconstructed Higgs boson mass from the reconstructed τ lepton momenta:

$$M_{h,\mathrm{reco.}}^2 = \left[ (1+f_1)\, p_{\ell_1} + (1+f_2)\, p_{\ell_2} \right]^2 .$$

We call the signal regions $\mathrm{SR}_{\slashed{E}}$ and $\mathrm{SR}_\mu$, corresponding to the $hh$ decay modes $(b\bar{b})(W^+W^-)$ and $(b\bar{b})(\tau^+\tau^-)$, and $(b\bar{b})(\mu^+\mu^-)$, respectively. Table V shows the cuts chosen for these observables in the two signal regions. Note that the $M_{T2}$ observable can also be constructed for the rejection of the $t\bar{t}$ background, but we do not take this approach here [5].
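The collinear reconstruction described above amounts to solving a 2×2 linear system for $f_1$ and $f_2$ and then rescaling each lepton four-momentum by $(1 + f_i)$, which follows from $p_{\nu_i} = f_i\, p_{\ell_i}$. A minimal Python sketch under those assumptions; the function names and the (E, px, py, pz) tuple convention are choices made for this example and not taken from the original analysis code:

```python
import math

def collinear_fractions(l1, l2, met_x: float, met_y: float):
    """Solve met = f1 * pT(l1) + f2 * pT(l2) for (f1, f2) by Cramer's rule.
    l1, l2: lepton four-momenta as (E, px, py, pz)."""
    det = l1[1] * l2[2] - l2[1] * l1[2]
    if abs(det) < 1e-12:
        raise ValueError("leptons are collinear in the transverse plane")
    f1 = (met_x * l2[2] - l2[1] * met_y) / det
    f2 = (l1[1] * met_y - met_x * l1[2]) / det
    return f1, f2

def add(*momenta):
    """Component-wise sum of four-momenta."""
    return tuple(sum(c) for c in zip(*momenta))

def scale(p, k: float):
    """Rescale every component of a four-momentum by k."""
    return tuple(k * c for c in p)

def mass(p) -> float:
    """Invariant mass of a four-momentum (E, px, py, pz)."""
    m2 = p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2
    return math.sqrt(max(m2, 0.0))

def reconstructed_masses(b1, b2, l1, l2, met_x: float, met_y: float):
    """Return (M_h_reco, M_reco): the mass of the two rescaled leptons and
    the mass of the two b-jets plus the two rescaled leptons."""
    f1, f2 = collinear_fractions(l1, l2, met_x, met_y)
    tau1, tau2 = scale(l1, 1.0 + f1), scale(l2, 1.0 + f2)
    return mass(add(tau1, tau2)), mass(add(b1, b2, tau1, tau2))

if __name__ == "__main__":
    l1, l2 = (10.0, 10.0, 0.0, 0.0), (10.0, 0.0, 10.0, 0.0)
    b1, b2 = (50.0, 0.0, 0.0, 50.0), (50.0, 0.0, 0.0, -50.0)
    print(collinear_fractions(l1, l2, 20.0, 30.0))
    print(reconstructed_masses(b1, b2, l1, l2, 20.0, 30.0))
```

Nothing in the linear solve forces $f_i > 0$: when the collinear assumption fails, as for $(b\bar{b})(W^+W^-)$, the solution can be negative, and how such events are treated is an analysis choice that this sketch does not address.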
The backgrounds considered for the $(b\bar{b})(\ell^+\ell^-)(+\slashed{E}_T)$ final state include the irreducible ones coming from $t\bar{t}$, with subsequent semi-leptonic decays of both top quarks, those from $b\bar{b}Z$ with decays of the Z boson to leptons and those from $b\bar{b}h$ with subsequent decays of the Higgs boson to two leptons. We also consider here the mis-tagging of a jet to a single lepton through the $b\bar{b}W^\pm$ channel and the mis-tagging of $b\bar{b}$ in the $\ell^+\ell^-$+jets background, which was considered using $\ell^+\ell^-$+1 parton at NLO. As before, we do not consider mis-identification of c-jets to b-jets.
We show the resulting events after analysis for the signal region $\mathrm{SR}_{\slashed{E}}$ in Table VI. Both the "LHC" and the "ideal" parametrizations are shown. Evidently, since we are using the same set of cuts for both scenarios, the "ideal" parametrization does not necessarily provide a substantial improvement to the signal efficiency. This effect is observed in particular for the $(b\bar{b})(W^+W^-)$ sample. Moreover, it is evident that this channel could be addressed by employing more advanced statistical methods: cuts could be devised separately for the two sub-channels $(b\bar{b})(W^+W^-)$ and $(b\bar{b})(\tau^+\tau^-)$ and then combined taking into account the correlations and cross-contamination.§ For example, this could be done via appropriate cuts on $M_{h,\mathrm{reco.}}$, which was found, as expected, to peak at the Higgs boson mass for the $(b\bar{b})(\tau^+\tau^-)$ sample and around 50 GeV for the $(b\bar{b})(W^+W^-)$ sample. This study is beyond the scope of this initial investigation.
TABLE IV: The backgrounds that have been considered in the $(b\bar{b})(\ell^+\ell^-\gamma)$ channel. All the cross sections were calculated at NLO using the MadGraph5/aMC@NLO package. The $t\bar{t}$ process with the mis-tagged γ was generated at tree-level, merged to one jet via the MLM method, and normalized to the NLO cross section. The statistical uncertainty on the expectation values after the analysis is applied is percent level and we omit it for the sake of clarity.
Due to the fact that both signal and backgrounds possess larger cross sections, harder cuts are imposed in this channel than in the $(b\bar{b})(4\ell)$ and $(b\bar{b})(Z\gamma)$ channels. This involves cutting on variables that are expected to have a high degree of correlation with the invariant mass of the Higgs boson pair, which implies that this channel would be less sensitive to variations of the self-coupling than, for example, $hh \rightarrow (b\bar{b})(4\ell)$, where no such cuts are imposed. Nevertheless, insofar as Standard Model-like $hh$ production is concerned, this process is expected to provide important information, contributing to the detection of this channel at a 100 TeV collider, with a signal-to-background ratio of ∼ 0.1 and large statistical significance at 3 ab$^{-1}$.
On the other hand, the situation for the $(b\bar{b})(\mu^+\mu^-)$ channel after the $\mathrm{SR}_\mu$ cuts are applied is rather bleak: at 3 ab$^{-1}$ of integrated luminosity, only a handful of events are expected with the "LHC" detector parametrization, against a few hundred background events, even after hard transverse momentum cuts on the muons and a tight mass window on the di-muon invariant mass around the Higgs boson mass. Because of the latter cut, turning to the "ideal" situation improves the signal efficiency substantially, since the smearing of the muon momenta is absent. Despite this, only O(10) events would be obtained at 3 ab$^{-1}$, with a similar number of background events as for the "LHC" parametrization. Hence, barring any significant enhancements of the rate due to new physics, the $(b\bar{b})(\mu^+\mu^-)$ contribution to the $hh \rightarrow (b\bar{b})(\ell^+\ell^-)$ final state is not expected to provide significant information.
§ Another improvement would involve separating the $(b\bar{b})(\tau^+\tau^-)$ final states into same-flavour and different-flavour. The author thanks Fabio Maltoni for useful discussion on this.

IV. DISCUSSION AND CONCLUSIONS
In this article we have considered three rare, and potentially clean, final states coming from Standard Model-like Higgs boson pair production at a future 100 TeV proton-proton collider. These processes are made available for investigation due to the fact that the total Higgs boson pair production cross section increases by a factor of ∼ 40 over that expected at the LHC.
For the $hh \rightarrow (b\bar{b})(ZZ)$ channel, where both Z bosons decay to leptons, resulting in $(b\bar{b})(4\ell)$, it was shown that a few events could be obtained at 3 ab$^{-1}$, versus O(10) background events. A crucial aspect of this analysis, that should be taken into account in detector design, is being able to observe four-lepton final states with at least one lepton possessing transverse momentum down to ∼ 20 GeV.
The $hh \rightarrow (b\bar{b})(Z\gamma)$ channel, with the Z boson decaying to leptons, was also considered briefly and was found to be of negligible importance even at 30 ab$^{-1}$. This is due to large backgrounds originating from the processes $b\bar{b}Z\gamma$ and $t\bar{t}\gamma$.
The last final state considered was that containing $(b\bar{b})$ and two leptons. This receives contributions from $(b\bar{b})(W^+W^-)$, $(b\bar{b})(\tau^+\tau^-)$ and $(b\bar{b})(\mu^+\mu^-)$. The former two channels possess large missing energy due to neutrinos. The latter channel is not expected to be associated with significant missing energy, with the muons coming directly from the decay of the Higgs boson. The analysis is split into two signal regions, depending on the amount of missing energy. The signal region targeting $(b\bar{b})(W^+W^-)$ and $(b\bar{b})(\tau^+\tau^-)$ with leptonic final states provides promising results, with a few tens of events versus a few hundred background events at 3 ab$^{-1}$ of integrated luminosity. This can be further improved by designing a method that captures the individual features of the two sub-channels, as well as using flavour information for the leptons. The other signal region, designed for the $(b\bar{b})(\mu^+\mu^-)$ channel, yields only a few events at 3 ab$^{-1}$, with significantly larger backgrounds. Potential enhancement of the latter channel could arise from an improvement in the resolution of the muon momenta that would allow a tighter mass window around their invariant mass.
A boost in the importance of all the channels containing an $h \rightarrow b\bar{b}$ decay would similarly be obtained by improvement of the resolution of the b-jet momenta, as the mass window around the Higgs boson mass could then be shrunk below the 40-50 GeV width that has been considered here. Additionally, in a final analysis, performed with a more complete FCC-hh detector design in mind, boosted decision tree or neural network methods would increase significances by taking into account the intricate correlations between the observables, going beyond the simple "rectangular" cuts that we have applied here. Finally, the initial investigation of the processes considered here, along with the $hh \rightarrow (b\bar{b})(\gamma\gamma)$ process, indicates that the study of Higgs boson pair production at a future hadron collider with 100 TeV centre-of-mass energy would greatly benefit from the collection of 10 or 30 ab$^{-1}$ of integrated luminosity.
TABLE V: [cuts chosen for the two signal regions; surviving entries: ∈ (50, 80) GeV, ∈ (120, 130) GeV, > 600 GeV, none]
TABLE VI: The backgrounds that have been considered in the $(b\bar{b})(\ell^+\ell^- + \slashed{E})$ channel coming from $h \rightarrow W^+W^-$. All the cross sections were calculated at NLO using the MadGraph5/aMC@NLO package. The $t\bar{t}$ process was generated at tree-level, merged to one jet via the MLM method, and normalized to the NLO cross section. We show 1σ-equivalent errors, derived according to the Poisson distribution, for those background event samples that exhibit a low number of events after the analysis. For the rest of the samples the uncertainty is percent level and we omit it for the sake of clarity.
TABLE VII: The backgrounds that have been considered in the $(b\bar{b})(\mu^+\mu^-)$ channel. All the cross sections were calculated at NLO using the MadGraph5/aMC@NLO package. The $t\bar{t}$ process was generated at tree-level, merged to one jet via the MLM method, and normalized to the NLO cross section. We show 1σ-equivalent errors, derived according to the Poisson distribution, for those event samples that exhibit a low number of events after the analysis. For the case of $b\bar{b}Z$ in the 'ideal' parametrization, the '<' indicates the 1σ-equivalent region, since no events were obtained after analysis. For the rest of the samples the uncertainty is percent level and we omit it for the sake of clarity.
ful comments, as well as Nikiforos Nikiforou for useful discussions during various coffee breaks. We would also like to thank the Physics Institute, University of Zürich, for allowing continuous use of their computing resources while this project was being completed. This