Naturalness, the Hyperbolic Branch and Prospects for the Observation of Charged Higgs at High Luminosity LHC and 27 TeV LHC

One of the early criteria proposed for naturalness was a relatively small Higgs mixing parameter $\mu$ with $\mu/M_Z$ of order a few. A relatively small $\mu$ may lead to heavier Higgs bosons ($H^0, A, H^{\pm}$ in the MSSM) which are significantly lighter than other scalars such as squarks. Such a situation is realized on the hyperbolic branch of radiative breaking of the electroweak symmetry. In this analysis we construct supergravity unified models with relatively small $\mu$ in the sense described above and discuss the search for the charged Higgs boson $H^{\pm}$ at HL-LHC and HE-LHC, where we also compare the discovery potential of the two machines using the decay channel $H^{\pm} \to \tau \nu$. It is shown that an analysis based on the traditional linear cuts on signals and backgrounds is not very successful in extracting the signal while, in contrast, machine learning techniques such as boosted decision trees prove to be far more effective. Thus models not discoverable with the conventional cut analyses become discoverable with machine learning techniques. Using boosted decision trees we consider several benchmarks and analyze the potential for their $5\sigma$ discovery at the 14 TeV HL-LHC and at the 27 TeV HE-LHC. It is shown that while the ten benchmarks considered, with the charged Higgs boson mass in the range 373 GeV-812 GeV, are all discoverable at HE-LHC, only four of the ten, with charged Higgs boson masses in the range 373 GeV-470 GeV, are discoverable at HL-LHC. Further, while the model points discoverable at both HE-LHC and HL-LHC would require up to 7 years of running time at HL-LHC, they could all be discovered in a period of a few months at HE-LHC.


Introduction
The discovery of the Higgs boson [1][2][3] in 2012 by the ATLAS and CMS collaborations [4,5] was a landmark and contains clues to the nature of physics beyond the standard model. Thus in the standard model the Higgs boson can be as large as 800 GeV while in supergravity (SUGRA) grand unified models [6] (for a review see [7]) it is predicted to lie below 130 GeV [8]. Further, the Higgs boson is discovered with a mass of ∼ 125 GeV exhibiting the fact that the supergravity limit of 130 GeV is respected. However, within supersymmetry (SUSY) the tree level Higgs boson mass is predicted to lie below the Z-boson mass, which indicates that the loop corrections are rather large which in turn points to the size of weak scale supersymmetry lying in the several TeV region [8]. The large size of weak scale supersymmetry makes the observation of sparticles more difficult. Further, with larger sfermion masses efficient annihilation of dark matter particles becomes more difficult and typically requires coannihilation [9] to be consistent with the WMAP [10] and Planck data [11]. Coannihilation in turn implies that the decay of the next-to-lightest supersymmetric particle (NLSP) will produce soft final states in models with R-parity which makes the detection of supersymmetry also more difficult. These constraints are softened in models where the wino or the higgsino content of the neutralino is significant as shown for some of the models discussed in section 3.
It should be noted that the large size of weak scale supersymmetry resolves some of the problems associated with low scale supersymmetry. One of these concerns taming the CP phases that arise in the soft breaking sector of supersymmetry and can produce large electric dipole moments in conflict with the current experimental limits. One of the ways to control them is the cancellation mechanism [12,13]. However, if the sfermion masses lie in the several TeV region, the CP phases would be automatically controlled [14,15]. In unified models based on supersymmetry one persistent problem relates to the dangerous proton decay arising from baryon and lepton number violating dimension five operators. However, such operators are significantly suppressed if the scalar masses lie in the several TeV region [16,17]. Another problem that finds resolution if the weak scale is large relates to the so-called gravitino problem, in that a gravitino with a mass larger than 10 TeV will decay early enough not to interfere with big bang nucleosynthesis [18]. It is also quite remarkable that supergravity models with sizable scalar masses in the range of several TeV are consistent with the unification of gauge couplings [18]. Thus the case for supersymmetry is stronger as a consequence of the discovery of the standard model-like Higgs boson [19,20].
One of the signatures of supersymmetric models is the existence of at least two Higgs doublets which leads to two more neutral Higgs bosons, one CP even $H^0$ and one CP odd $A^0$, and two charged Higgs bosons $H^{\pm}$. Thus an indication of the existence of new physics beyond the standard model and an indirect support for supersymmetry can also come via discovery of one or more heavier Higgs bosons beyond the Higgs boson of the standard model. In supergravity unified models radiative breaking of the electroweak symmetry leads in general to two branches, one is the so-called ellipsoidal branch and the other is the hyperbolic branch [21][22][23] (for related works see [24][25][26][27]). On the hyperbolic branch the MSSM Higgs mixing parameter $\mu$ can be relatively small, with $\mu/M_Z$ of order a few, while the squark masses can lie in the several TeV region and provide the desired loop correction to the standard model-like Higgs boson to lift its tree level value from below $M_Z$ to its experimentally measured value. However, a relatively small $\mu$ points to relatively light heavier Higgs bosons (relative to the squark masses), which are thus candidates for discovery at colliders. In this work we focus on the potential of the LHC to discover the charged Higgs within SUGRA models which are high scale models where SUSY is broken by gravity mediation. Currently LHC is in its second phase which we might call LHC Run 2 after the very successful LHC Run 1 which discovered the Higgs boson. The LHC Run 2 will run till the end of 2018 and each detector will collect about 150 fb$^{-1}$ of data. It will then shut down for two years for an upgrade to LHC Run 3 which will resume its run in the period 2021-2023 and it is expected to collect 300 fb$^{-1}$ of additional data. After that there will be a major upgrade to LHC Run 4 in the period 2023-2026 with an upgrade to $\sqrt{s} = 14$ TeV and to high luminosity. It is expected that LHC Run 4 will resume its operations in 2026 and run for ten years at the end of which an integrated luminosity of 3000 fb$^{-1}$ will be achieved.
Assuming that the discovery of a sparticle or a heavier Higgs is made at the LHC, a full exploration of the spectrum of the sparticle masses and of the Higgses will require a higher energy machine and there are dedicated groups investigating this possibility. One of the possibilities discussed is a 100 TeV proton-proton collider at CERN. This would require a 100 km circular ring in the Lake Geneva basin. Another possibility discussed is that of a 100 TeV proton-proton collider in China [28,29]. However, a new possibility has recently been discussed which is a 27 TeV proton-proton collider [30][31][32][33][34] which can be built in the existing CERN ring with 16 Tesla superconducting magnets using the FCC technology. Such a machine will operate with a luminosity of $2.5 \times 10^{35}$ cm$^{-2}$ s$^{-1}$ and collect up to 15 ab$^{-1}$ of data. In a recent work [35] an analysis on the potential for the discovery of supersymmetry at HL-LHC vs HE-LHC was carried out. In this work we carry out a similar analysis to discuss the potential for the discovery of the charged Higgs at the HL-LHC vs HE-LHC. Here we make a further comparison of the conventional linear cuts vs machine learning tools for the discovery. Specifically we use in our analysis boosted decision trees (BDT) and show that some of the models which are undiscoverable at HL-LHC using the conventional linear cut analysis can be discovered using the boosted decision tree technique. We carry out a similar analysis for HE-LHC. Here we show that HE-LHC is much more powerful for the discovery of the charged Higgs than HL-LHC.
The outline of the rest of the paper is as follows: In section 2 we give an overview of the Higgs sector in the MSSM, in section 3 we give a review of the hyperbolic branch of radiative breaking of electroweak symmetry and in section 4 we describe the SUGRA model and the benchmark points used in this analysis satisfying the Higgs boson mass and the dark matter relic density. In section 5 we describe the production modes of the charged Higgs in association with a top [bottom] quark in the four and five flavour schemes and give the respective production cross-sections and charged Higgs branching ratios for the benchmark points. The codes used for simulation of signal and background samples are described in section 6 along with the selection criteria used to study the discovery potential of the charged Higgs in its τ ν decay. Also the two methods for signal analysis, linear cut-based and boosted decision trees, are explained and compared. In section 7 we discuss dark matter direct detection for the SUGRA benchmark points and in section 8 we give conclusions.

The Higgs sector in the MSSM
In the minimal supersymmetric standard model (MSSM), the Higgs sector contains two Higgs doublets $H_d$ and $H_u$, and $\mu$ is the Higgs mixing parameter appearing in the superpotential term $\mu \hat{H}_u \cdot \hat{H}_d$. Minimization of the potential which preserves color and charge gives two constraints, one of which can be used to determine $\mu$ up to a sign, and the other to eliminate $B$ in favor of $\tan\beta$, the ratio of the two Higgs VEVs. The neutral components of the Higgs doublets can be expanded around their VEVs, and after spontaneous breaking one obtains the mass diagonal charged and CP odd neutral Higgs fields. At the tree level the charged and the CP odd neutral Higgs boson masses are related by

$$m^2_{H^{\pm}} = m^2_A + M^2_W. \qquad (6)$$

In the MSSM, the couplings of the charged Higgs boson to up-type fermions go as $\cot\beta$ whereas the couplings to down-type fermions go as $\tan\beta$. In this paper we will be looking at the leptonic decays of the charged Higgs boson and so enhancing this channel requires larger $\tan\beta$ values. This has become increasingly difficult for low masses of the charged Higgs since exclusion limits tend to be more severe for the high $\tan\beta$, low mass regime as we will explain in the coming sections. The Higgs sector of the MSSM is similar to the 2HDM type II with some differences such as the SUSY QCD corrections which are present only for the MSSM case. For reviews on the Higgs sector of the MSSM and the 2HDM see Refs. [37][38][39].
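As a quick numerical illustration of how closely the charged Higgs mass tracks the CP odd Higgs mass, the following minimal sketch evaluates the standard tree-level MSSM relation $m^2_{H^{\pm}} = m^2_A + M^2_W$. The function name and the value used for $M_W$ are assumptions made for this example; the masses used in the actual analysis include loop corrections and will differ.

```python
import math

# Approximate W boson mass in GeV (an assumed input for this illustration).
M_W = 80.4


def charged_higgs_mass_tree(m_A, m_W=M_W):
    """Tree-level MSSM relation m_H+^2 = m_A^2 + M_W^2 (all masses in GeV)."""
    return math.sqrt(m_A ** 2 + m_W ** 2)


if __name__ == "__main__":
    # Hypothetical CP odd Higgs masses, chosen only to show the trend.
    for m_A in (300.0, 400.0, 800.0):
        m_Hpm = charged_higgs_mass_tree(m_A)
        print(f"m_A = {m_A:6.1f} GeV -> m_H+ = {m_Hpm:6.1f} GeV")
```

For $m_A$ of a few hundred GeV the two masses differ by only a few GeV, which is why limits on the CP odd Higgs carry over almost directly to the charged Higgs.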

Naturalness and the hyperbolic branch of radiative breaking
Issues of naturalness arise in the context of radiative breaking of the electroweak symmetry where one of the stability conditions is given by

$$\mu^2 = -\frac{1}{2} M_Z^2 + \frac{m^2_{H_d} - m^2_{H_u} \tan^2\beta}{\tan^2\beta - 1} + \delta\mu^2,$$

where $\delta\mu^2$ is the loop correction [40]. To illustrate the origin of the hyperbolic branch we consider the case of universal boundary conditions given by $m_0$, $A_0$, $m_{1/2}$, $\tan\beta$ and sign($\mu$), where $m_0$ is the universal scalar mass, $A_0$ is the universal trilinear coupling, $m_{1/2}$ is the universal gaugino mass, all at the GUT scale, and $\tan\beta$ is as defined earlier (the analysis of the hyperbolic branch for the non-universal case can be found in [16]). Thus for the universal boundary conditions at the GUT scale we may write this equation in terms of the parameters at the GUT scale [21,41] so that

$$\mu^2 = -\frac{1}{2} M_Z^2 + m_0^2 C_1 + A_0^2 C_2 + m_{1/2}^2 C_3 + m_{1/2} A_0 C_4 + \Delta\mu^2_{\rm loop}, \qquad (8)$$

where the coefficients $C_i$ ($i = 1-4$) and the function $D_0(t)$ entering them are as given in [21,41], $b_i = (-3, 1, 11)$ for SU(3), SU(2) and U(1), and $t = \ln(M_G^2/Q^2)$ where $Q$ is the renormalization group point. Our normalizations are such that $\alpha_3(0) = \alpha_2(0) = \frac{5}{3}\alpha_1(0) = \alpha_G(0)$. The functions $e, f, g, k$ are as defined in [42]. An interesting aspect of Eq. (8) is that it relates $\mu$, which enters in the superpotential, to the soft breaking terms. This raises the issue of what the size of $\mu$ is. One very obvious choice is the following: the radiative breaking equation is supposed to generate masses for the vector bosons $W$ and $Z$. Thus a reasonable choice is to have $\mu/M_Z$ of order a few, i.e., $\mu/M_Z \sim (1-5)$, which was essentially the criterion of naturalness adopted in [21]. We note here that small $\mu$ models have been investigated quite extensively recently (see, e.g., [43][44][45] and the references therein).
It is important to note that a relatively small µ discussed above does not necessarily imply that m 0 need be small. To illustrate this point, it is useful to exhibit the underlying geometry of the radiative breaking equation. Thus, as is well known, both the tree value of µ 2 given by Eq. (8) and the loop correction ∆µ 2 loop , have significant dependence on the renormalization group scale. Their sum, however, is relatively insensitive to the changes in the renormalization group scale [21]. Thus suppose we go to the renormalization group point where the loop correction is small, and here we may simply consider the tree formula for µ 2 . It was seen in [21] that while for part of the parameter space of supergravity models all the C i , (i = 1 − 4) are positive, there are regions of the parameter space where C 1 can vanish or even turn negative. Reference to Eq. (8) shows that for the case when C 1 = 0, µ 2 becomes independent of m 0 (this is the so-called focal point region). Similarly when C 1 < 0, one finds curves in the m 0 − A 0 plane where the sum of the contributions to µ 2 involving m 0 and A 0 vanish. This is what one may call the focal curve region [23]. The same idea extends to focal surfaces. In these regions µ and one or more of the soft parameters are uncorrelated. Thus, for example, m 0 and A 0 can be chosen without affecting µ in the focal curve region.
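The geometry referred to above can be made explicit with a short worked equation. This is only a sketch: it assumes that, at the scale where the loop correction is small, the radiative breaking relation is quadratic in the soft parameters with coefficients $C_i$, and it sets $A_0 = 0$ and takes $C_3 > 0$ for simplicity. The symbol $K$ is introduced here purely for illustration.

```latex
% Assumed schematic tree-level relation at A_0 = 0:
%   C_1 m_0^2 + C_3 m_{1/2}^2 = \mu^2 + \tfrac{1}{2} M_Z^2 \equiv K > 0 .
\begin{align*}
C_1 > 0 &: \quad C_1 m_0^2 + C_3 m_{1/2}^2 = K
  && \text{(ellipse: } m_0 \le \sqrt{K/C_1}\text{)} \\
C_1 = 0 &: \quad C_3 m_{1/2}^2 = K
  && \text{(focal point: } \mu \text{ independent of } m_0\text{)} \\
C_1 < 0 &: \quad C_3 m_{1/2}^2 - |C_1|\, m_0^2 = K
  && \text{(hyperbola: } m_0 \text{ unbounded at fixed } \mu\text{)}
\end{align*}
```

Under these assumptions, a fixed small $\mu$ bounds $m_0$ from above on the ellipsoidal branch, whereas on the hyperbolic branch $m_0$ and $m_{1/2}$ can grow together along the hyperbola while $\mu$ stays fixed.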
We wish to note here that concepts such as naturalness are generally invoked only in the context of an incomplete theory which means that some parameters in the theory are unknown and one must make reasonable choices for them for investigating the theory. However, choices which might appear unreasonable need not be so if they are dictated by the internal constraints of the more complete theory, which means that no doors need be closed when working with an incomplete theory. Thus in the analysis below we will consider two ranges for $\mu$: one which fits the criteria discussed above, i.e., $\mu/M_Z$ in the range $(1-5)$, and for the other we will step outside this range. We note that the analysis above shows that the observation of one of the heavier Higgs bosons $H^0$, $A^0$, $H^{\pm}$ with masses much less than $m_0$ would point to radiative breaking of the electroweak symmetry on the hyperbolic branch and further if these Higgs bosons are observed with masses in the few hundred GeV range, that would lend support to naturalness defined by small $\mu$.

SUGRA model benchmarks
The focus of this work is to explore the potential of HL-LHC and HE-LHC for discovering a charged Higgs boson in a class of high scale models, specifically SUGRA models consistent with the experimental constraints on the light Higgs mass at $\sim 125$ GeV. The analysis is done under the constraints of R-parity so the LSP is stable. Further, in a large part of the parameter space in SUGRA models it is found that the LSP is also the lightest neutralino and thus a candidate for dark matter, and the models are thus subject to the constraint that they be consistent with the observed amount of cold dark matter [11], Eq. (14). Consistency with Eq. (14) would require non-universalities in the gaugino sector. Further, from Eq. (6) we note that the charged Higgs mass depends on the mass of the CP odd neutral Higgs $A^0$ which in turn depends on the Higgs mixing parameter $\mu$ and $\tan\beta$. We wish to have charged Higgs masses in the range $\sim (300-800)$ GeV, which requires non-universalities in $m_{H_d}$ and $m_{H_u}$ at the grand unification scale. Including the non-universalities in the gaugino sector (for recent works see [46]) and in the Higgs masses (for recent works see [47] and for a review see [48]), the extended SUGRA parameter space at the GUT scale is given by

$$m_0, \ A_0, \ m_1, \ m_2, \ m_3, \ m^0_{H_u}, \ m^0_{H_d}, \ \tan\beta, \ {\rm sgn}(\mu),$$

where $m_1$, $m_2$, $m_3$ are the masses of the U(1), SU(2), and SU(3)$_C$ gauginos, and $m^0_{H_u}$ and $m^0_{H_d}$ are the masses of the up and down Higgs bosons, all at the GUT scale. In Table 1 we exhibit ten benchmarks which lead to Higgs masses and sparticle masses in the ranges not excluded. We note here that satisfaction of the relic constraint requires coannihilation in the models we consider. Coannihilation has been considered in a variety of recent works [49][50][51] where stop, gluino and stau coannihilations were considered.
Here as in the analysis of [18] the coannihilating particle is the lightest chargino, $\chi^{\pm}_1$, which implies that the mass gap between the chargino and the LSP must be small. The mass spectrum of the model is calculated using SoftSUSY [52,53] and the relic density is evaluated using micrOMEGAs [54]. Taking the computational uncertainties in the codes into consideration, the light Higgs mass constraint is taken to be $125 \pm 2$ GeV and the relic density constraint as $\Omega h^2 < 0.126$. For the case when the cold dark matter constitutes only a fraction of dark matter, one would have multi-component dark matter [55] (one such recent possibility is the ultralight dark axion, see, e.g., [56,57]). Table 2 exhibits the light and heavy Higgs masses and the masses of the electroweakinos, the stop and gluino masses along with the $\mu$ value and the relic density. Here $\mu$ is in the range $(200-1600)$ GeV. For small $\mu$, the neutralino has a larger Higgsino content leading to an efficient annihilation of these neutralinos in the early universe. Thus the relic density here can be significantly smaller than indicated by Eq. (14).
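The two acceptance conditions quoted above, a light Higgs mass within $125 \pm 2$ GeV and a relic density $\Omega h^2 < 0.126$, amount to a simple filter on candidate model points. The sketch below shows one way such a filter could look. The function name and the sample points are hypothetical and are not taken from Tables 1 or 2; in practice the inputs would come from the spectrum and relic density codes.

```python
def passes_constraints(m_h, omega_h2,
                       m_h_central=125.0, m_h_tol=2.0, omega_max=0.126):
    """Return True if a model point satisfies both constraints.

    m_h      : light Higgs mass in GeV
    omega_h2 : neutralino relic density (Omega h^2)
    """
    higgs_ok = abs(m_h - m_h_central) <= m_h_tol
    # Only an upper bound is imposed on the relic density, so that the
    # neutralino may make up just a fraction of the dark matter.
    relic_ok = 0.0 < omega_h2 < omega_max
    return higgs_ok and relic_ok


if __name__ == "__main__":
    # Hypothetical (m_h, Omega h^2) pairs used only to exercise the filter.
    candidates = {"A": (124.1, 0.110), "B": (121.5, 0.050), "C": (125.8, 0.200)}
    for name, (m_h, omega) in candidates.items():
        print(name, passes_constraints(m_h, omega))
```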

Charged Higgs production in association with a top (and bottom) quark
The production of the charged Higgs boson has been extensively studied theoretically and experimentally for most mass ranges. A charged Higgs is considered light when its mass is much smaller than that of the top quark. Such a particle has been excluded by Tevatron [58] and LEP [59] for the entire $\tan\beta$ range. For moderate mass ranges, $m_{H^{\pm}} \sim 150-170$ GeV, no firm experimental analysis exists because of the absence of theoretical studies of the signal that include important width effects, and a full amplitude analysis for $pp \to H^{\pm} W^{\mp} b\bar{b}$ is needed [60]. Thus in the exclusion limits for the charged Higgs given by ATLAS and CMS one finds a mass gap as noted above. In this analysis we consider a heavy charged Higgs, i.e., $m_{H^{\pm}} > m_t$. In this region ATLAS and CMS [61][62][63][64][65][66][67] have excluded masses up to 1100 GeV for $\tan\beta \sim 60$ while masses up to 400 GeV are excluded for $\tan\beta < 2$ for the channel $H^{\pm} \to \tau\nu$, except for the small gap mentioned previously. In the same channel, all masses up to 600 GeV are excluded for $\tan\beta \sim 50$. The more stringent constraints on the charged Higgs mass come from constraints on the CP odd Higgs [68][69][70][71]. In the $A \to \tau\tau$ channel, all masses up to 1000 GeV are excluded for $\tan\beta < 6$, while masses up to 1500 GeV are excluded for $\tan\beta \gtrsim 45$. Low masses ($300 < m_A < 500$ GeV) are only allowed for $\tan\beta < 10$ whereas higher $\tan\beta$ values require larger masses. Constraints on the CP odd Higgs translate into constraints on the charged Higgs mass according to Eq. (6). We note that while the Higgs mass relations given by Eq. (6) are tree level MSSM mass relations, in the actual analysis the Higgs masses calculated at full one-loop order with SoftSUSY are used. The masses of the CP odd and charged Higgses of the ten benchmark points of Table 2 are still allowed, along with the masses of the charginos and neutralinos which belong to a compressed spectrum [72][73][74][75] (for experimental searches on compressed spectra, see Refs. [76][77][78][79]). The largest production mode of the charged Higgs at hadron colliders is the one that proceeds in association with a top quark (and a low transverse momentum b-quark), $pp \to t[b] H^{\pm}$, Eq. (16). This production mode can be realized in two schemes, namely, the four and five flavour schemes (4FS and 5FS, respectively), where in the former the b-quark is produced in the final state and in the latter it is considered as part of the proton's sea of quarks and folded into the parton distribution functions (PDF). This difference between 4FS and 5FS comes about mainly due to the collinear splitting of an incoming gluon into a $b\bar{b}$ pair resulting in large logarithms which can be absorbed into the DGLAP equations [80], thus making up the 5FS approach. Here, the final state b-quark is assumed massless and has low transverse momentum. Also, the virtual b-quark has zero virtuality (i.e., $m \approx 0$). The cross-sections of the two production modes are evaluated at next-to-leading order (NLO) in QCD with MadGraph5_aMC@NLO-2.6.3 [81] using FeynRules [82] UFO files [83,84] for the Type-II two Higgs doublet model (2HDM). The simulation is done at fixed order, i.e., with no matching to a parton shower. For NLO accuracy at both fixed order and with parton shower matching see Ref. [85]. The couplings of the 2HDM are the same as in the MSSM, but when calculating production cross-sections in the MSSM, one should take into account the SUSY-QCD effects. In our case, as one can see from Table 2, gluinos and stops are rather heavy and thus their loop contributions to the cross-section are minimal. In this case, the 2HDM is the decoupling limit of the MSSM and this justifies using the 2HDM code to calculate cross-sections. For the 5FS, the bottom Yukawa coupling is assumed to be non-zero and normalized to the on-shell running b-quark mass which is also calculated with MadGraph at the hard scale of the process. Eq. (17) shows the parton level subprocesses responsible for the production of a charged Higgs in association with a top quark at leading order (LO). Note that in Eqs. (16) and (17), $t$ may refer to a top or antitop, and $b$ to a bottom or antibottom, depending on the sign of the charged Higgs.
In the 5FS, the process is initiated via gluon-b-quark fusion while in the 4FS it proceeds through either quark-antiquark annihilation (small contribution) or gluon-gluon fusion. In fact, the NLO cross-section of the 5FS process contains an $\mathcal{O}(\alpha_s)$ correction which includes, at tree-level, the 4FS processes. At finite order in perturbation theory, the cross-sections of the two schemes do not match due to the way the perturbative expansion is handled, but one expects to get the same results for 4FS and 5FS when taking into account all orders in the perturbation. In order to combine both estimates of the cross-section, we use the Santander matching criterion [86] whereby

$$\sigma^{\rm matched} = \frac{\sigma^{\rm 4FS} + \alpha\, \sigma^{\rm 5FS}}{1 + \alpha}, \qquad \alpha = \ln\frac{m_{H^{\pm}}}{m_b} - 2.$$

The matched cross-section of the inclusive process lies between the 4FS and 5FS values but closer to the 5FS value owing to the weight $\alpha$ which depends on the charged Higgs mass. The uncertainties are combined as

$$\delta\sigma^{\rm matched} = \frac{\delta\sigma^{\rm 4FS} + \alpha\, \delta\sigma^{\rm 5FS}}{1 + \alpha}.$$

Table 3 shows the NLO cross-sections for the charged Higgs top-associated production in the 4FS and 5FS for two center-of-mass energies, 14 TeV and 27 TeV. The matched cross-sections along with the uncertainties are given. A factor of $\sim 5$ to 8 increase in the production cross-section is seen when going from 14 TeV to 27 TeV. In the 5FS, the cross-sections of the ten benchmark points are up to $\sim 2$ times larger than those obtained with the 4FS due to the presence of an additional coupling factor in the latter. The renormalization and factorization scales are chosen as the hard scale of the process with $\mu_R = \mu_F = \frac{1}{3}(m_t + \bar{m}_b + m_{H^{\pm}})$, with $\bar{m}_b$ being the running b-quark mass. The charged Higgs production cross-section is proportional to

$$m_t^2 \cot^2\beta + \bar{m}_b^2 \tan^2\beta,$$

which points to a dip in the cross-section around $\tan\beta = 8$ to 9. The cross-section increases for $\tan\beta \lesssim 8$ and $\tan\beta > 10$. Table 3 demonstrates the fall of the cross-section with the charged Higgs mass but it also shows the effect of $\tan\beta$. For example, in going from point (f) to point (g), one can observe a rise in the cross-section due to the increase in $\tan\beta$.

[...] Table 1. The running b-quark mass, in GeV, is also shown, evaluated at the factorization and renormalization scales, $\mu_F = \mu_R$ (in GeV).
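The matching procedure and the $\tan\beta$ dependence discussed above can be sketched numerically. The code below assumes the standard form of the Santander weight, $\alpha = \ln(m_{H^{\pm}}/m_b) - 2$, and a coupling factor $m_t^2\cot^2\beta + m_b^2\tan^2\beta$. The function names, the choice of b-quark mass passed in, and all numerical inputs are illustrative assumptions and are not values from Table 3.

```python
import math


def santander_weight(m_charged_higgs, m_b):
    """Santander weight alpha = ln(m_H+ / m_b) - 2 (assumed standard form)."""
    return math.log(m_charged_higgs / m_b) - 2.0


def santander_match(x_4fs, x_5fs, m_charged_higgs, m_b):
    """Weighted average of a 4FS and a 5FS quantity.

    The same combination is applied to cross-sections and to their
    uncertainties.
    """
    alpha = santander_weight(m_charged_higgs, m_b)
    return (x_4fs + alpha * x_5fs) / (1.0 + alpha)


def coupling_factor(tan_beta, m_t, m_b):
    """Factor m_t^2 cot^2(beta) + m_b^2 tan^2(beta) controlling the rate."""
    return (m_t / tan_beta) ** 2 + (m_b * tan_beta) ** 2


def tan_beta_at_dip(m_t, m_b):
    """Value of tan(beta) minimizing the coupling factor: sqrt(m_t / m_b)."""
    return math.sqrt(m_t / m_b)


if __name__ == "__main__":
    # Hypothetical inputs: cross-sections in fb, masses in GeV.
    sigma = santander_match(10.0, 20.0, m_charged_higgs=500.0, m_b=4.0)
    print(f"matched cross-section: {sigma:.2f} fb")
    print(f"dip at tan(beta) ~ {tan_beta_at_dip(173.0, 2.5):.1f}")
```

Because $\alpha$ grows logarithmically with the charged Higgs mass, the matched value moves further toward the 5FS result for heavier benchmarks. With a running b-quark mass of a few GeV the minimum of the coupling factor falls near $\tan\beta \approx 8$, in line with the dip described in the text.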
We give in Table 4 the branching ratios of the dominant charged Higgs decays. The competing channels are $tb$ (first column) and the electroweakinos (last column). In the MSSM, the coupling of the charged Higgs to up-type fermions goes as $\cot\beta$, whereas the couplings to down-type fermions (down quarks and leptons) go as $\tan\beta$. Typically, the branching ratios of the charged Higgs to $tb$ and $\tau\nu$ increase with $\tan\beta$, which can be observed in most of the benchmark points. However, there are exceptions since the chargino-neutralino channel is kinematically open. Thus one can see the branching ratios into the electroweakinos vary from about 2% (point h) to $\sim 65$% (point j). For point (h), the only open channel for the electroweakinos is $\chi^{\pm}_1 \chi^0_1$ and since the $\mu$ parameter is $\sim 1$ TeV, the Higgsino content of the LSP is small and so is the coupling to the charged Higgs, hence the observed small branching ratio. However, for point (j), all channels are open. This point has a small $\mu$ parameter and hence the Higgsino content of the LSP is larger. Indeed, we observe a 6-fold and a 50-fold increase in the branching ratios to $\chi^{\pm}_1 \chi^0_1$ and $\chi^{\pm}_1 \chi^0_2$, respectively. The largest branching ratio, however, comes from $\chi^{\pm}_1 \chi^0_4$ and $\chi^{\pm}_2 \chi^0_2$ which appear to have a considerable amount of Higgsino content.

Discovery potential in the H ± → τ ν channel
We study the discovery prospects of the MSSM charged Higgs boson in its decay to a hadronic tau and missing energy. Theoretical studies of the possibility of observation of a charged Higgs at various colliders are numerous [87][88][89][90][91] including analysis in the tb channel [92] which showed that a mass range of about 300 to 600 GeV for tan β = 30 may be observable with about 1000 fb −1 of integrated luminosity. The τ ν channel has also been studied [93,94]. Before we discuss the result of our analysis we describe the different codes utilized in simulation of the signals and backgrounds. The simulation of the charged Higgs associated production, In the analysis presented here we look at the hadronic tau decay of the charged Higgs accompanied by missing transverse energy (neutrino). This channel has the smallest branching ratio but it is of interest since jets can be tau-tagged and the tau has leptonic and hadronic decay signatures. In the hadronic final states, the tau decay is characterized by its one-prong and three-prong decays which can be utilized to suppress possible SM backgrounds. Hence for such a final state (hadronic tau with missing transverse energy), the SM backgrounds are mainly tt, t+jets, W/Z/γ * +jets, diboson production and QCD multijet events which can fake the hadronic tau decays.

Selection criteria
The selection criteria are based on the flavour scheme under consideration. In the 4FS, the production of a charged Higgs is in association with a bottom and a top quark while the 5FS does not involve an additional b-quark in the final state. Hence for the 4FS one has an extra b-tagged jet. We consider the hadronic decay of the top quark ($t \to bW \to 3$ jets, with at least one of the jets being b-tagged) and the hadronic tau ($\tau_h$) decay of the charged Higgs. So we can summarize the selection criteria in the two flavour schemes as

4FS : Lepton veto, $\geq 5j$ ($2b$, $1\tau_h$), (21)

5FS : Lepton veto, $\geq 4j$ ($1b$, $1\tau_h$), (22)

where the lepton veto involves rejecting events with electrons and/or muons. The minimum $p_T$ of the leading jet is 20 GeV and that of the tau-tagged jet is 25 GeV. We will discuss two types of analyses in this work, a cut-based analysis which uses the traditional linear cuts on select kinematic variables and a boosted decision tree analysis, and we will give a relative comparison of these.
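The preselection above can be written as a small event filter. The sketch below is illustrative only: the event layout (a dictionary with a lepton count and a list of jets carrying `pt`, `b_tag` and `tau_tag` fields) is a hypothetical format chosen for this example, and the jet, b-tag and tau-tag counts are interpreted as minimum requirements, which is an assumption.

```python
# (minimum number of jets, minimum number of b-tagged jets) per scheme.
REQUIREMENTS = {"4FS": (5, 2), "5FS": (4, 1)}

LEADING_JET_PT_MIN = 20.0  # GeV
TAU_JET_PT_MIN = 25.0      # GeV


def passes_selection(event, scheme):
    """Apply the lepton veto and jet requirements for the given scheme."""
    min_jets, min_b = REQUIREMENTS[scheme]
    if event["n_leptons"] > 0:  # lepton veto: no electrons or muons
        return False
    jets = event["jets"]
    if len(jets) < min_jets:
        return False
    if max(j["pt"] for j in jets) < LEADING_JET_PT_MIN:
        return False
    n_b = sum(1 for j in jets if j["b_tag"])
    n_tau = sum(1 for j in jets
                if j["tau_tag"] and j["pt"] >= TAU_JET_PT_MIN)
    return n_b >= min_b and n_tau >= 1


def make_jet(pt, b_tag=False, tau_tag=False):
    """Helper that builds a jet record in the hypothetical event format."""
    return {"pt": pt, "b_tag": b_tag, "tau_tag": tau_tag}


if __name__ == "__main__":
    event = {"n_leptons": 0,
             "jets": [make_jet(120.0, b_tag=True), make_jet(80.0, b_tag=True),
                      make_jet(60.0), make_jet(45.0),
                      make_jet(30.0, tau_tag=True)]}
    print(passes_selection(event, "4FS"), passes_selection(event, "5FS"))
```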

Cut-based analysis
We begin the analysis by using a series of linear cuts on select kinematic variables used to discriminate the signal from the background. We carry out the analysis on two benchmark points using 4FS at 14 and 27 TeV. The following set of variables is used for discriminating the signal from the background:

1. $E_T^{\rm miss}$ and $E_T^{\rm miss}/\sqrt{H_T}$, where the former is the missing transverse energy and the latter is a powerful variable in discriminating against QCD multijet events, with $H_T$ being the sum of all $p_T$'s of visible final state particles in an event.

2. $m_{T2}^{\rm jets}$, the stransverse mass [104][105][106] of the leading b-tagged and tau-tagged jets, defined as

$$m_{T2} = \min_{\mathbf{q}_T}\left[\max\left(m_T(\mathbf{p}_T(j_1), \mathbf{q}_T),\ m_T(\mathbf{p}_T(j_2), \mathbf{p}_T^{\rm miss} - \mathbf{q}_T)\right)\right],$$

where $\mathbf{q}_T$ is an arbitrary vector chosen to find the appropriate minimum and the transverse mass $m_T$ is given by

$$m_T(\mathbf{p}_{T1}, \mathbf{p}_{T2}) = \sqrt{2\left(p_{T1}\, p_{T2} - \mathbf{p}_{T1} \cdot \mathbf{p}_{T2}\right)}.$$

3. $m_T^{\tau}$ and $p_T^{\tau}$, the transverse mass and leading transverse momentum of the hadronic tau. Since $H^{\pm}$ decays to $\tau\nu$, the transverse mass, $m_T^{\tau}$, has a kinematical endpoint at the charged Higgs mass. This proves to be an important discriminant in this analysis.

4. $E_T^{\rm miss}/m_{\rm eff}$, where $m_{\rm eff}$ is the effective mass.

As a precursor to the analysis based on BDT in section 6.3 we first discuss for some sample points an analysis based on linear cuts. As an illustration, we consider at 14 TeV the analysis of benchmark point (c). We present in Fig. 1 distributions in four kinematic variables at an integrated luminosity of 3000 fb$^{-1}$ with the signal scaled up by a factor of 100 for clarity. Table 5 shows a cut-flow for signal and background at 14 TeV. After applying the cuts, $t\bar{t}$ and QCD remain significant and a calculation of the required integrated luminosity for a $5\sigma$ discovery gives $\mathcal{O}(10^6)$ fb$^{-1}$, which is beyond the capabilities of the HL-LHC.
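Several of the kinematic variables listed above are simple functions of transverse momenta and can be sketched in a few lines. The code below is a minimal illustration, not the analysis code: it uses the massless-particle form of the transverse mass, takes $H_T$ as the scalar sum of visible $p_T$'s, and assumes $m_{\rm eff} = H_T + E_T^{\rm miss}$, which is a common convention but may differ from the definition used in the analysis. The stransverse mass is omitted since it requires a numerical minimization.

```python
import math


def transverse_mass(pt1, phi1, pt2, phi2):
    """m_T = sqrt(2 pT1 pT2 (1 - cos(dphi))) for two massless objects."""
    arg = 2.0 * pt1 * pt2 * (1.0 - math.cos(phi1 - phi2))
    return math.sqrt(max(0.0, arg))


def scalar_ht(visible_pts):
    """H_T: scalar sum of the pT of all visible final state objects."""
    return sum(visible_pts)


def met_over_sqrt_ht(met, visible_pts):
    """E_T^miss / sqrt(H_T), useful against QCD multijet events."""
    return met / math.sqrt(scalar_ht(visible_pts))


def met_over_meff(met, visible_pts):
    """E_T^miss / m_eff with m_eff = H_T + E_T^miss (assumed definition)."""
    return met / (scalar_ht(visible_pts) + met)


if __name__ == "__main__":
    # Hypothetical event: hadronic tau back-to-back with the missing momentum.
    m_t_tau = transverse_mass(100.0, 0.0, 100.0, math.pi)
    print(f"m_T(tau, MET) = {m_t_tau:.1f} GeV")
    print(f"MET/sqrt(HT)  = {met_over_sqrt_ht(100.0, [250.0, 150.0]):.2f}")
    print(f"MET/m_eff     = {met_over_meff(100.0, [250.0, 150.0]):.2f}")
```

For a tau and neutrino from an $H^{\pm}$ decay, the transverse mass computed this way cannot exceed the parent mass, which is the origin of the kinematical endpoint mentioned in item 3.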

A similar analysis is carried out for benchmark point (a) at 27 TeV. We show in Fig. 2 a display of signal and background distributions for four kinematic variables related to benchmark point (a) of Table 1. The distributions are shown for a 27 TeV center-of-mass energy and an integrated luminosity of 15 ab$^{-1}$. The signal is scaled up by a factor of ten to show the best possible cut value for a particular variable.

Table 6: Cut-flow for signal (point (a) in 4FS) and SM background at $\sqrt{s} = 27$ TeV. Samples are normalized to their respective cross-sections (in fb).
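To see how a cut-flow translates into a required integrated luminosity, one can use a simple significance estimate. The sketch below assumes $Z = S/\sqrt{S+B}$ with $S = \sigma_S L$ and $B = \sigma_B L$, so that $L = Z^2(\sigma_S + \sigma_B)/\sigma_S^2$. This estimator is an assumption made for illustration and is not necessarily the one used in the analysis, and the cross-sections in the example are hypothetical rather than taken from Tables 5 or 6.

```python
def luminosity_for_discovery(sigma_signal, sigma_background, z_target=5.0):
    """Integrated luminosity (fb^-1) needed to reach significance z_target.

    Uses Z = S / sqrt(S + B) with S = sigma_signal * L and
    B = sigma_background * L; cross-sections after cuts are in fb.
    """
    if sigma_signal <= 0.0:
        raise ValueError("signal cross-section must be positive")
    return z_target ** 2 * (sigma_signal + sigma_background) / sigma_signal ** 2


if __name__ == "__main__":
    # Hypothetical post-cut cross-sections (fb).
    lumi = luminosity_for_discovery(0.01, 4.0)
    print(f"required luminosity: {lumi:.3g} fb^-1")
```

With a surviving signal at the level of 0.01 fb on top of a few fb of background, the required luminosity is of order $10^6$ fb$^{-1}$, which illustrates why linear cuts alone can leave a benchmark point out of reach.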
Here we note that previous analyses on H ± → τ ν channel used the τ polarization (one and three-prong decays) [107][108][109] which showed that a signal may be extracted for up to 100 fb −1 of integrated luminosity for a mass range of 200-800 GeV in the moderate and high tan β ranges. Most of those points have already been excluded by ATLAS and CMS. Nevertheless, the technique showed success in suppressing tt and QCD backgrounds (for further discussion of this topic and of other techniques see [110][111][112][113]). Despite the successes of many of those techniques one can still face trouble especially in low mass regime where the final states of the charged Higgs decay look very much like the SM background especially with the presence of hadronic tau fakes from QCD. In order to refine the search for the charged Higgs we resort to machine learning techniques which are taking center stage in high energy physics in data analysis.

Analysis using Boosted Decision Trees
Boosted decision trees have been around and used in high energy physics for some time now. Analyses based on BDTs have appeared in searches by ATLAS and CMS collaborations, and helped in the discovery of the SM Higgs boson and more recently in the observation of the decay h → bb [114] (In fact ATLAS used BDTs in this analysis while CMS used another machine learning technique known as deep neural network [115]). BDTs prove to be very helpful in separating signal and background especially for the cases where the signal is small and the background is overwhelming such that simple linear cuts fail to be of much help in such an environment. Based on multivariate analysis technique, BDTs use a set of kinematic variables to make a decision on whether an event is to be classified as a signal or a background. At the end of the training process, a single variable is created, known as the BDT response or score, which is used as a discriminator.
BDTs consist of many decision trees that form a series of "weak learners" [116], and combining them is what makes BDTs powerful tools for classification problems. A tree consists of nodes and leaves which all branch out from the main node, called the root node. Each node corresponds to a criterion set on a variable, which an event can "pass" or "fail". The training of a tree starts with the algorithm selecting the variable which best separates the signal from the background. A cut value on this variable is chosen and applied to the events, which are split into left or right nodes depending on whether they are classified as signal or background. A new variable is then chosen with the best cut to further split the data into signal or background. The splitting into nodes ends when the maximum depth of the tree is reached or some other stopping criterion is met. The tree ends with leaves, where events classified as signal are assigned the value +1 and events classified as background the value −1. Misclassified events, i.e. signal events that end up in background nodes and vice versa, are given larger weights and the whole process starts again with a new root node. The reason for giving extra weight to those events is that in the next iteration more attention is paid to them, so that the separation efficiency improves. More trees are created until the grown "forest" has the specified number of trees, at which point the training process ends. The next step is the testing process, which checks how well the BDTs have learned the signal and background features. The testing is done on a separate Monte Carlo sample so that the training and testing processes are statistically independent. The end result of the testing phase is the BDT score variable, which can be used as a discriminating variable. An important issue to be aware of is overtraining. The performance of the BDT in the training phase should not significantly outdo that in the testing phase. This can happen in cases where the BDT classifies events according to specific features found only in the training sample. Overtraining can be avoided by controlling the number of trees to be trained and their maximum depth. Usually a choice of a maximum depth of more than 4 on a sample with insufficient statistics will result in overtraining.
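The training loop described above can be illustrated with a minimal sketch. It is not the configuration used in this analysis: it uses AdaBoost-style reweighting of misclassified events with depth-1 trees rather than GradientBoost, and the two-variable Gaussian toy events, the function names and the number of trees are all hypothetical choices made for illustration.

```python
import math
import random

def stump_predict(stump, x):
    """A depth-1 tree: +sign if the chosen variable passes the cut, -sign otherwise."""
    var, cut, sign = stump
    return sign if x[var] > cut else -sign

def best_stump(xs, ys, ws):
    """Pick the variable and cut value with the smallest weighted misclassification."""
    best_err, best = float("inf"), None
    for var in range(len(xs[0])):
        for cut in sorted({x[var] for x in xs}):
            for sign in (+1, -1):
                stump = (var, cut, sign)
                err = sum(w for x, y, w in zip(xs, ys, ws)
                          if stump_predict(stump, x) != y)
                if err < best_err:
                    best_err, best = err, stump
    return best_err, best

def train_forest(xs, ys, n_trees):
    """Grow n_trees stumps, raising the weights of misclassified events each round."""
    ws = [1.0 / len(xs)] * len(xs)
    forest = []
    for _ in range(n_trees):
        err, stump = best_stump(xs, ys, ws)
        err = min(max(err, 1e-10), 1.0 - 1e-10)    # keep the logarithm finite
        alpha = 0.5 * math.log((1.0 - err) / err)  # weight of this tree in the vote
        forest.append((alpha, stump))
        # Misclassified events (y * prediction = -1) get their weight increased.
        ws = [w * math.exp(-alpha * y * stump_predict(stump, x))
              for x, y, w in zip(xs, ys, ws)]
        total = sum(ws)
        ws = [w / total for w in ws]
    return forest

def bdt_score(forest, x):
    """Weighted vote of all trees, normalized to lie between -1 and +1."""
    norm = sum(alpha for alpha, _ in forest) or 1.0
    return sum(alpha * stump_predict(s, x) for alpha, s in forest) / norm

def toy_sample(n, rng):
    """Hypothetical two-variable events: signal (y=+1) and background (y=-1)."""
    xs, ys = [], []
    for i in range(n):
        y = 1 if i % 2 == 0 else -1
        xs.append((rng.gauss(y, 1.0), rng.gauss(y, 1.0)))
        ys.append(y)
    return xs, ys

rng = random.Random(1)
train_x, train_y = toy_sample(100, rng)
test_x, test_y = toy_sample(200, rng)  # statistically independent test sample
forest = train_forest(train_x, train_y, n_trees=15)
test_accuracy = sum((bdt_score(forest, x) > 0) == (y > 0)
                    for x, y in zip(test_x, test_y)) / len(test_x)
```

Comparing `test_accuracy` with the same quantity evaluated on the training sample gives a crude overtraining check in this toy setting: a training accuracy far above the testing one signals that the trees have learned sample-specific features.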
The type of BDT we use in this analysis is known as the gradient boosted decision tree, GradientBoost. The main differences between the various kinds of BDTs lie in the loss functions used. GradientBoost uses a binomial log-likelihood loss function, which is well suited to weak classifiers, i.e. trees with a depth of 2 to 4. The effectiveness of GradientBoost can be enhanced by reducing the learning rate using the Shrinkage parameter, which was set to 0.2 in this analysis. The number of trees in the forest ranged from 600 to 1500 and the maximum depth was 3 or 4, depending on whether enough statistics were present in the samples. For each choice of the number of trees and maximum depth we made sure that no overtraining was present. A large set of variables was tried, and the ones which produced the best results were kept and used for all the signal points (a)-(j). The kinematic variables used in the previous section, along with the ones listed here, enter into the training of the BDTs:

1. The minimum transverse mass, $m_T^{\rm min}(j_{1-2}, E_T^{\rm miss})$, of the two leading untagged jets. This variable is effective in reducing $t\bar{t}$, $W$ + jets and QCD multijet backgrounds.

2. $\Delta\phi(p_T^{\tau}, E_T^{\rm miss})$, the opening angle between the leading hadronic tau and the missing transverse momentum. This variable tends to be larger for the signal, i.e. 1.5 rad.

3. $\ln p_T$, the logarithm of the leading jet $p_T$ if present, and zero if no jet exists.

4. The number of tracks associated with the hadronic tau decay, $N^{\tau}_{\rm tracks}$. It is a very effective variable which enables the BDT to differentiate between tau decays according to their charge multiplicities, since tau decays can proceed as one-prong or three-prong decays.

5. $\sum_{\rm tracks} p_T$, the sum of the track $p_T$'s.
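For orientation, the binomial log-likelihood loss and the role of the shrinkage can be written out explicitly. The form below is the one commonly quoted for gradient boosting (for instance in the TMVA documentation); the symbols $F(x)$ for the summed response of the forest, $y = \pm 1$ for the true class, $h_m(x)$ for the $m$-th tree and $\nu$ for the shrinkage are notation assumed here, not taken from the text.

```latex
L(F, y) = \ln\!\left(1 + e^{-2\, y\, F(x)}\right), \qquad
F_m(x) = F_{m-1}(x) + \nu\, h_m(x), \qquad 0 < \nu \le 1 .
```

With $\nu = 0.2$ each new tree contributes only a fraction of its fitted response, so more trees are needed but each individual step is less prone to following fluctuations in the training sample.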
The training and testing of the samples are carried out using ROOT's own TMVA (Toolkit for Multivariate Analysis) framework [117]. In the training of the BDTs, the algorithm ranks the variables in decreasing order of importance. The variable ranked at the top is the one the BDT has used the most during the training in order to separate the signal from the background. The ranking of the variables differs from one point to another, especially between points with very different charged Higgs masses. For this reason, we split the benchmark points into two sets, one with low charged Higgs mass, i.e. $m_{H^{\pm}} < 500$ GeV (points (a)-(e)), and another with high charged Higgs mass, i.e. $m_{H^{\pm}} > 500$ GeV (points (f)-(j)). We present in Table 7 the ranking of the variables for the two charged Higgs mass ranges.
Table 7: The ranking of the variables entering the training of the BDTs, in decreasing order of importance, for the two charged Higgs mass ranges (columns: Rank, Low mass range, High mass range).
After the training and testing phase, the variable "BDT score" is created. Cuts on this variable allow us to eliminate most of the background events. However, this is not enough. In addition to the selection criteria discussed in the previous section, further cuts on some variables are necessary to extract the signal. Those cuts vary from one point to another and are summarized in Table 8. Note that the spikes in the signal appearing at a BDT score of around −0.2 and −0.3 can be attributed to statistical fluctuations resulting from the training and testing phase. Those are events that are misclassified as background and given a BDT score < 0.
The cuts on the BDT score range from 0.9 to 0.98 for the different benchmark points. Fig. 5 shows how the estimated integrated luminosity varies as a function of the BDT score cut for the different benchmark points at 27 TeV. Starting from very high integrated luminosities for cuts between −1 and 0, we see a drop for cuts > 0.6 until a major dip is observed for values > 0.9. A zoomed-in plot (on the right) shows that most of the activity occurs between 0.95 and 1, where the dip lies. The integrated luminosity shoots back up when the cut becomes so strong that no signal events survive.

Figure 5: The calculated integrated luminosities as a function of the BDT cut for the ten benchmark points at 27 TeV.
We apply the selection criteria along with a BDT score cut > 0.95 to the SM background and to each of the 4FS and 5FS signal samples to obtain the remaining cross-sections. The signal cross-sections are combined using Eq. (18) in order to evaluate the minimum integrated luminosity required for $S/\sqrt{S+B}$ to reach the $5\sigma$ discovery level. The results for both the 14 and 27 TeV cases are shown in Table 9.

Table 9: Comparison between the estimated integrated luminosity ($\mathcal{L}$) for a $5\sigma$ discovery of the charged Higgs at 14 TeV (middle column) and 27 TeV (right column) following the selection cuts and BDT > 0.95, where the minimum integrated luminosity needed for a $5\sigma$ discovery is given in fb$^{-1}$. Entries with $\cdots$ mean that the evaluated $\mathcal{L}$ is much greater than 3000 fb$^{-1}$.
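The luminosity estimate can be sketched as follows. Writing $S = \sigma_S \mathcal{L}$ and $B = \sigma_B \mathcal{L}$ for the signal and background cross-sections that survive the cuts, the condition $S/\sqrt{S+B} = 5$ gives $\mathcal{L} = 25(\sigma_S + \sigma_B)/\sigma_S^2$. The function name and the cross-section values in the scan below are hypothetical and chosen only to reproduce the qualitative dip-and-rise behaviour described for the BDT cut; `sigma_s` stands for the already combined signal cross-section.

```python
import math

def required_luminosity(sigma_s, sigma_b, n_sigma=5.0):
    """Integrated luminosity (fb^-1) at which S/sqrt(S+B) = n_sigma.

    sigma_s, sigma_b: signal and background cross-sections after cuts, in fb.
    Returns infinity when no signal survives the cuts.
    """
    if sigma_s <= 0.0:
        return math.inf
    return n_sigma**2 * (sigma_s + sigma_b) / sigma_s**2

# Hypothetical (BDT cut, signal fb, background fb) values, for illustration only.
scan = [
    (0.0,   1.00, 500.0),
    (0.6,   0.80,  50.0),
    (0.9,   0.50,   2.0),
    (0.95,  0.40,   0.5),
    (0.999, 0.00,   0.0),  # cut so hard that no signal survives
]

lumis = {cut: required_luminosity(s, b) for cut, s, b in scan}
best_cut = min(lumis, key=lumis.get)  # cut that minimizes the required luminosity
```

In this toy scan the required luminosity falls steeply as the cut removes background, reaches its minimum at the 0.95 entry, and becomes infinite once the signal is cut away entirely.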
One can see from Table 9 that four of the ten points may be discoverable at the HL-LHC as it nears the end of its run, when a maximum integrated luminosity of 3000 fb$^{-1}$ will have been collected. Given the rate at which the HL-LHC will be collecting data, points (a)-(d) will require ∼ 7 years of running time. On the other hand, the results for the 27 TeV collider show that all points may be discoverable with integrated luminosities much less than 15 ab$^{-1}$. The HE-LHC will be collecting data at a rate of ∼ 820 fb$^{-1}$ per year, and with that points (a) and (c) may be discoverable within the first 3 months of operation, points (b), (d) and (e) may take ∼ 1.2 years, points (f) and (h) ∼ 3.5 years and the rest of the points > 6 years. In the analysis we have not included the effects of CP phases, which can be large in supergravity models and can in general have a significant effect on phenomena while remaining consistent with electric dipole moment constraints (see, e.g., [118]). It should be of interest, however, to investigate such effects at HL-LHC and HE-LHC in future work.
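The conversion from a required integrated luminosity to a running time is a simple division by the yearly collection rate. The sketch below uses the HE-LHC rate of ∼ 820 fb$^{-1}$ per year quoted above and the approximate luminosities quoted in the Conclusions for points (c) and (j); the function name is hypothetical, and a constant collection rate is assumed.

```python
HE_LHC_RATE = 820.0  # fb^-1 per year, the HE-LHC rate quoted in the text

def running_time_years(lumi_fb, rate_fb_per_year):
    """Years of data taking needed to collect lumi_fb at a constant yearly rate."""
    if rate_fb_per_year <= 0.0:
        raise ValueError("collection rate must be positive")
    return lumi_fb / rate_fb_per_year

# Approximate 27 TeV luminosities quoted in the Conclusions, in fb^-1.
months_point_c = 12.0 * running_time_years(200.0, HE_LHC_RATE)
years_point_j = running_time_years(8000.0, HE_LHC_RATE)
```

With these inputs point (c) comes out at just under three months and point (j) at more than six years, in line with the running times quoted in the text.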

Dark matter direct detection
Finally we discuss constraints on the model from dark matter direct detection experiments. For most of the parameter points of Table 1, $\mu$ is small, which yields an LSP with a considerable Higgsino content, as shown in Table 10. A Higgsino LSP has a large spin-independent (SI) proton-neutralino cross-section, which puts strong constraints on the model from dark matter direct detection experiments [119][120][121]. Recent results from XENON1T [121] show a sensitivity in the SI $p$-$\chi_1^0$ cross-section reaching just below $\mathcal{O}(10^{-46})$ cm$^2$ for an LSP mass less than 100 GeV, with the limit rising above that value for masses greater than 100 GeV (see Fig. 6). The projected sensitivity of XENONnT may reach $\mathcal{O}(10^{-47})$ cm$^2$ in the near future. The neutralino LSP is a mixture of bino, wino and higgsinos such that $\chi^0 = \alpha\lambda^0 + \beta\lambda^3 + \gamma H_1 + \delta H_2$, where $\alpha$ is the bino content, $\beta$ is the wino content and $\gamma^2 + \delta^2$ is the higgsino content of the LSP. In Table 10 we give the individual contents of the LSP along with the SI proton-neutralino cross-section, $R \times \sigma^{\rm SI}$, where $R = (\Omega h^2)_{\chi_1^0}/(\Omega h^2)_{\rm PLANCK}$ and $(\Omega h^2)_{\rm PLANCK}$ is given by Eq. (14). In Fig. 6 the ten benchmark points of Table 10 are overlaid on the exclusion plot of [121] and appear in the red box. One finds that all points lie close to but below the XENON1T upper limit. Thus improved experiments can either discover dark matter or eliminate some of the parameter points on the plot. We emphasize again that the neutralino content of dark matter in the model is typically small, of order a percent or less, and thus the bulk of the dark matter must have a different source, such as the ultralight dark axion mentioned earlier [56,57]. The fact that the neutralino content of dark matter is small also appears in other recent models with small $\mu$ discussed in [43][44][45].

Figure 6: The SI proton-neutralino cross-section exclusion limits as a function of the LSP mass from the XENON1T results taken from [121]. The ten benchmark points are overlaid on the plot, showing them lying below but close to the upper limit (black curve). The inset shows the limits from LUX 2017, PandaX-II and XENON1T along with the uncertainty bands normalized to the sensitivity median defined in [121].
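The rescaling of the SI cross-section by the relic density fraction $R$ can be sketched as below. The function names and all numerical inputs are hypothetical; in particular the value 0.12 is only a stand-in for $(\Omega h^2)_{\rm PLANCK}$, which the text takes from Eq. (14), and the mixing coefficients are assumed to be real and normalized to unity.

```python
def relic_fraction(omega_h2_lsp, omega_h2_planck):
    """R: fraction of the measured relic density supplied by the neutralino LSP."""
    return omega_h2_lsp / omega_h2_planck

def rescaled_sigma_si(sigma_si_cm2, omega_h2_lsp, omega_h2_planck):
    """R x sigma_SI, the quantity compared with the direct detection limits."""
    return relic_fraction(omega_h2_lsp, omega_h2_planck) * sigma_si_cm2

def higgsino_content(gamma, delta):
    """Higgsino content gamma^2 + delta^2, as defined in the text."""
    return gamma**2 + delta**2

# Hypothetical inputs, for illustration only.
OMEGA_H2_PLANCK = 0.12  # assumed stand-in for the value given by Eq. (14)
alpha, beta, gamma, delta = 0.6, 0.0, 0.64, 0.48  # assumed real, normalized mixing

norm = alpha**2 + beta**2 + gamma**2 + delta**2
higgsino = higgsino_content(gamma, delta)
effective_sigma = rescaled_sigma_si(1.0e-44, 0.0012, OMEGA_H2_PLANCK)
```

With a neutralino relic fraction of one percent, a cross-section of $10^{-44}$ cm$^2$ is rescaled to $10^{-46}$ cm$^2$, which illustrates how a large Higgsino cross-section can still sit below the experimental limit when the neutralino makes up only a small part of the dark matter.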

Conclusions
In this paper we have given an analysis of the potential of the HL-LHC and HE-LHC for the discovery of the charged Higgs boson in the $\tau\nu$ channel for a mass range of 370-800 GeV and moderate values of $\tan\beta$, using the machine learning technique of boosted decision trees. It is shown that the use of machine learning techniques allows one to differentiate a signal from the background more efficiently and thus to discover models which would otherwise not be discoverable using traditional linear cuts. It is found that using BDTs, a charged Higgs with a mass in the range ∼ (370 − 470) GeV and $\tan\beta$ in the range 8 to 11 (benchmarks (a)-(d)) may be discoverable at the HL-LHC with an integrated luminosity in the range ∼ (2300 − 3000) fb$^{-1}$. The same analysis is carried out at 27 TeV for the HE-LHC, and it is found that all of the ten benchmark points may be discoverable, with an integrated luminosity as low as ∼ 200 fb$^{-1}$ for point (c) and up to ∼ 8000 fb$^{-1}$ for point (j). Based on the rate at which data will be collected at the HL-LHC and HE-LHC, it is found that for points (a)-(d), which are discoverable at both machines, one requires a run of ∼ 7 years at the HL-LHC, whereas the run time drops to a few months at the HE-LHC. For the remaining parameter points, which are only discoverable at the HE-LHC, a run time ranging from one year to more than 6 years may be required for the higher mass ranges. These results suggest that a transition from HL-LHC to HE-LHC, when technologically feasible, would significantly expedite the discovery of the charged Higgs in the mass range considered in this work. The analysis was done in the context of the SUGRA-MSSM model, and thus exclusion limits from ATLAS and CMS pertinent to this class of models were used. Discovery of the charged Higgs has also been studied in models such as the hMSSM and 2HDM. Further, other charged Higgs decay channels, such as $tb$ and electroweak gauginos, are all interesting and require separate analyses.
Regarding the $tb$ channel, this has the leading branching ratio for the benchmark points under consideration. However, signatures investigated for this decay mode are often not very successful owing to the difficulty in separating the signal from the $t\bar{t}$ background. For this reason, the mode $H^{\pm} \to \tau\nu$ is favorable because of its cleaner signature. Finally we note that the observation of a charged Higgs boson with a mass much less than $m_0$ would point to the hyperbolic branch where radiative breaking of the electroweak symmetry occurs. Further, if the charged Higgs boson mass is seen to lie in the few hundred GeV range, then such an observation would lend support to the idea of naturalness defined by a small MSSM Higgs mixing parameter $\mu$.