Probing Heavy Charged Higgs Boson at the LHC

The signature of a heavy charged Higgs boson, with mass well above the top quark mass, is investigated for the LHC Run 2 experiments through its decay to a top and a bottom quark, focusing on both hadronic and leptonic signal final states. The generic two Higgs doublet model framework is considered, with special emphasis on the supersymmetry-motivated Type II model. The signal is found to be heavily affected by the huge irreducible backgrounds from top pair production and QCD events. The jet substructure technique is used to tag moderately boosted top jets in order to reconstruct the charged Higgs mass. A simple cut-based analysis is performed, optimizing various kinematic selections, and the signal sensitivity is found to be reasonable only for the lower range of charged Higgs masses, even for the very high luminosity option of 3000 fb$^{-1}$. However, employing the multivariate analysis (MVA) technique, a remarkable improvement in signal sensitivity is achieved. We find that the charged Higgs signal for the mass range of about $300-600$ GeV is observable with the 1000 fb$^{-1}$ luminosity option. For the higher luminosity option of 3000 fb$^{-1}$, the discovery potential can be extended to $700-800$ GeV.


Introduction
The recent discovery of the 125 GeV Higgs boson [1,2] at the CERN Large Hadron Collider (LHC) provides the last missing piece of the Standard Model (SM), and opens up a new window to explore physics beyond the Standard Model (BSM). Although the current precision measurements of various properties of the Higgs boson, in particular its couplings to fermions and gauge bosons, indicate that it is indeed a candidate for the SM Higgs [3], they do not rule out BSM scenarios. Among the plethora of BSM candidates, supersymmetry-based models such as the minimal supersymmetric standard model (MSSM) are the most popular and best studied. The MSSM provides elegant solutions to some of the shortcomings of the SM, and predicts a rich and diverse phenomenology that can be tested directly at colliders.
Recall that the MSSM requires at least two Higgs doublets to make the theory anomaly free, and also to generate the masses of up- and down-type fermions. Theories with an extended Higgs sector predict more Higgs boson states, neutral or charged. In general, the two Higgs doublet model (2HDM), in which an extra SU(2) Higgs doublet is added to the SM Higgs doublet, is well motivated and consistent with the Higgs discovery. In fact, the 2HDM can be interpreted as the low-energy effective theory of many UV-complete BSM theories. For example, the Higgs sector of a supersymmetric model may appear as a simple 2HDM (Type II) if all sparticles decouple at a very high scale. Generally, the 2HDM is classified into four categories, Type I, II, III and IV, depending on the nature of the Yukawa couplings of the Higgs bosons to fermions, subject to a $Z_2$ symmetry imposed in order to avoid flavor changing neutral currents (for more details about the 2HDM, see the reviews of Refs. [4] and [5]). In all classes of 2HDM there exist five physical Higgs boson states: two CP-even ($h$, $H$, with the assumption $m_h < m_H$), one CP-odd ($A$), and two charged Higgs bosons ($H^\pm$). The lightest CP-even Higgs $h$ can be interpreted as the SM-like Higgs boson in the decoupling limit, where the other Higgs boson states are very heavy, much above the electroweak scale [6]. However, other studies show that a CP-even Higgs state can behave SM-like with a mass of 125 GeV in the alignment limit even without decoupling [7][8][9][10]. The presence of extra physical Higgs boson states alongside the SM-like one is a characteristic of BSM physics, and the discovery of an extra Higgs boson would confirm its existence. Hence, looking for these additional Higgs bosons in various channels over a wide range of masses is an important part of the program at the current LHC experiments.
In this context, the search for a charged Higgs boson signal is unique, since its discovery would clearly and unambiguously confirm the presence of BSM physics. Therefore, the study of the charged Higgs has received special attention both phenomenologically and experimentally. The phenomenology of the charged Higgs boson is well studied for the lower mass range, below the top quark mass ($m_{H^\pm} < m_t$), which has also been probed experimentally in various decay channels. However, detection of the charged Higgs boson in the heavier mass range, above the top quark mass ($m_{H^\pm} > m_t$), is found to be very challenging due to the huge contamination from irreducible SM backgrounds. In this study, we attempt to assess the discovery potential of the charged Higgs boson for this heavier mass range ($m_{H^\pm} > m_t$). The study is carried out within the framework of the generic 2HDM, with an emphasis on the Type II 2HDM motivated by supersymmetry. The charged Higgs boson couplings to fermions depend very strongly on $\tan\beta$, and hence its production and subsequent decays are very sensitive to it. At hadron colliders, in the lower mass range ($m_{H^\pm} < m_t$), charged Higgs bosons are produced via top quark pair production, $pp/p\bar{p} \to t\bar{t}$, followed by the decay $t \to bH^\pm$. In the intermediate and heavier mass range, the range of our interest, the charged Higgs is mainly produced directly in association with a top quark (and also a b quark) [11]. Furthermore, the charged Higgs boson can be produced in SUSY cascade decays via heavier chargino and neutralino production in gluino and squark decays [12,13].
In the program of new physics searches, looking for an $H^\pm$ signal is an important item on the agenda of collider experiments. So far, the non-observation of any charged Higgs signal events in direct searches constrains its production and decay in a model-independent way, which in turn is translated into exclusions of the relevant parameter space, in particular $\tan\beta$ and $m_{H^\pm}$, for a given model framework. For example, direct searches at the LEP [14] and Tevatron [15] experiments excluded the lower mass range of $m_{H^\pm}$ in terms of $\tan\beta$. The most recent limits are obtained by the LHC experiments, including Run 2 data, covering a wide range of $m_{H^\pm}$, from low ($m_{H^\pm} < m_t$) to high masses, $m_{H^\pm} \sim 1$ TeV. In Run [23] are considered to probe it up to $\sim 1$ TeV mass. The non-observation of any signal events in $H^\pm$ searches at 13 TeV with an integrated luminosity of 12.9 fb$^{-1}$ leads to exclusion limits on the cross section times branching ratio for the mass range 180 GeV $< m_{H^\pm} <$ 3 TeV, whereas limits on Br($t \to H^+ b$) $\times$ Br($H^+ \to \tau^+\nu$) are set for the range 80 GeV $< m_{H^\pm} <$ 160 GeV. Eventually, these exclusion limits rule out $m_{H^\pm} \sim 90-160$ GeV for the entire range of $\tan\beta$ up to 60 in the framework of the MSSM $m_h^{\rm mod+}$ scenario [24], except for a hole around $m_{H^\pm} \sim 150-160$ GeV for $\tan\beta \sim 10$, which is still to be constrained [22,24]. Similar results are also published by ATLAS [21] at $\sqrt{s} = 13$ TeV. The heavier mass range, $m_{H^\pm} \geq m_t$, is also probed by both CMS and ATLAS in the decay modes $H^- \to \tau^-\bar{\nu}, \bar{t}b$, with the hadronic decay mode of the $\tau$ lepton and the leptonic decay of the top quark [21,22,25]. The resulting limits from these searches are no more stringent than the ones obtained from the $H^\pm \to \tau\nu$ mode [17,21,22]. Remarkably, the most stringent constraints on the charged Higgs sector in the context of the SUSY-motivated Type II model come indirectly from the neutral Higgs boson searches, $pp \to h, H, A \to \tau\tau$, at the LHC [26].
This can be attributed to the fact that the neutral Higgs couplings to tau leptons are very sensitive to $\tan\beta$, in particular at higher values. The region excluded by these searches implies an exclusion of $\tan\beta > 6$ for $m_A < 250$ GeV, whereas high values of $\tan\beta = 60$ are completely ruled out for $m_A \sim m_{H^\pm} \sim 1000$ GeV, where $m_A$ is the mass of the pseudoscalar Higgs in a SUSY model (like the Type II model), related to the charged Higgs mass by $m_{H^\pm}^2 = m_W^2 + m_A^2$. In addition to these constraints from direct searches, the charged Higgs sector is also constrained by flavor physics data. Its strong loop contributions to the branching ratios of rare B meson decay modes make flavor physics observables very sensitive to it. Measurements of these branching ratios by the B-factories, and also at the LHC and LHCb, put very strong constraints on the charged Higgs sector. More details about these latest constraints in the framework of the 2HDM can be found in the recent review of Ref. [27], and references therein.
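As a quick numerical illustration of the tree-level relation $m_{H^\pm}^2 = m_W^2 + m_A^2$ quoted above, the short sketch below evaluates the charged Higgs mass for a few pseudoscalar masses. The value of $m_W$ and the chosen $m_A$ points are assumed inputs for illustration only, and radiative corrections are ignored.

```python
import math

M_W = 80.4  # GeV, approximate W boson mass (assumed input)


def charged_higgs_mass(m_a, m_w=M_W):
    """Tree-level relation m_H+^2 = m_W^2 + m_A^2, all masses in GeV."""
    return math.sqrt(m_w ** 2 + m_a ** 2)


# Illustrative m_A points (not taken from the paper's tables).
for m_a in (250.0, 500.0, 1000.0):
    print(f"m_A = {m_a:7.1f} GeV  ->  m_H+ = {charged_higgs_mass(m_a):7.1f} GeV")
```

The mass splitting shrinks as $m_A$ grows, which is why $m_A \sim m_{H^\pm}$ is a good approximation in the heavy mass range considered here.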
On the phenomenological side, there have been numerous studies exploring the $H^\pm$ signal in various decay channels in the context of the MSSM Higgs sector [28][29][30][31][32][33][34][35][36][37][38][39], as well as in the 2HDM framework [30][40][41][42], using various interesting techniques. More details about charged Higgs phenomenology can be found in Ref. [43]. It is worth mentioning the use of $\tau$ lepton polarization in its 1- and 3-prong decays for $H^\pm \to \tau\nu$, which is found to be very useful in extracting the signal while suppressing the $t\bar{t}$ and QCD backgrounds [44][45][46]. The charged Higgs signal is also probed in the subdominant production channels $H^\pm W^\mp$ [47] and $H^+H^-$ [48]. The discovery potential of the charged Higgs in the heavier mass range ($m_{H^\pm} > m_t$) with its dominant decay mode $H^- \to \bar{t}b$ has been investigated by many authors in the framework of SUSY models [49][50][51][52]. For instance, the authors of Ref. [52] used triple and quadruple b-tagging to suppress the SM background, which also costs signal significantly. Consequently, for the heavier mass range of the charged Higgs it is found to be very hard to achieve a reasonable signal sensitivity, since the signal suffers from large $t\bar{t}$ and QCD backgrounds. A recent study [53] reports on the detection prospects of a charged Higgs signal for heavier masses of about 1 TeV, applying a boosted technique to tag top quarks from the charged Higgs decay in the framework of the 2HDM. The authors predicted reasonable sensitivities around the 1 TeV mass range, and found the intermediate mass range difficult to probe. The boosted technique is also used to look for a heavy charged Higgs boson signal in the decay channel $H^\pm \to W^\pm A$ for lighter $A$ boson states [54,55].
In this study, we explore the detection prospects of the charged Higgs boson from the intermediate to the heavier mass range, $300-1000$ GeV, considering the decay mode $H^- \to \bar{t}b$ with the hadronic and leptonic channels of the associated top quark. For heavier $H^\pm$, the top quark from its decay is expected to be boosted, and we exploit this feature by employing the jet substructure method to reconstruct the top quark, and subsequently the charged Higgs boson. This method helps to avoid the combinatorial problem that arises when the top quark is reconstructed simply by combining hard jets. Following past studies, we first attempt to obtain the signal sensitivity using a cut-based analysis, and then attempt to improve it with a multivariate analysis (MVA), simulating within the framework of the SUSY Higgs sector. Performing a detailed analysis in the MVA framework, we achieve a remarkable improvement in the $H^\pm$ signal sensitivity. We present our results for three integrated luminosity options, $L = 300$ fb$^{-1}$, 1000 fb$^{-1}$ and 3000 fb$^{-1}$. Finally, for the sake of completeness, charged Higgs signal sensitivities are presented for all classes of 2HDM for a few benchmark parameter choices.
We present this study as follows. After a brief description of the 2HDM in Sec. 2, the charged Higgs production is discussed in Sec. 3. In Sec. 4, the signal and backgrounds are discussed, with a brief description of top tagging in subsection 4.1 and details of the simulation in subsection 4.2. The results based on the cut-and-count analysis are discussed in subsection 4.3, while the results based on the MVA analysis are presented in Sec. 5. Finally, we summarize in Sec. 6.

Two Higgs doublet Model
In the context of our present study, it is instructive to discuss the 2HDM very briefly. In this model, an extra SU(2) Higgs scalar doublet is added to the SM Higgs doublet. The most general 2HDM potential for two doublets $\phi_1$ and $\phi_2$ with hypercharge $Y = +1$ is given in Refs. [4,5]. For simplicity, all the free parameters are assumed to be real so that CP is conserved, and the discrete $Z_2$ symmetry, $\phi_1 \to -\phi_1$ and $\phi_2 \to +\phi_2$, is imposed to suppress FCNC at tree level. The $Z_2$ symmetry is softly broken by the terms proportional to $m_{12}$.
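For orientation, with real parameters and a softly broken $Z_2$ symmetry the potential is commonly written in the form below, in the notation of the review [4]. That the normalization of the quartic couplings and the sign of the soft-breaking term match the convention intended here is an assumption.

```latex
V(\phi_1,\phi_2) =
    m_{11}^2\,\phi_1^\dagger\phi_1 + m_{22}^2\,\phi_2^\dagger\phi_2
  - m_{12}^2\left(\phi_1^\dagger\phi_2 + \phi_2^\dagger\phi_1\right)
  + \frac{\lambda_1}{2}\left(\phi_1^\dagger\phi_1\right)^2
  + \frac{\lambda_2}{2}\left(\phi_2^\dagger\phi_2\right)^2
  + \lambda_3\left(\phi_1^\dagger\phi_1\right)\left(\phi_2^\dagger\phi_2\right)
  + \lambda_4\left(\phi_1^\dagger\phi_2\right)\left(\phi_2^\dagger\phi_1\right)
  + \frac{\lambda_5}{2}\left[\left(\phi_1^\dagger\phi_2\right)^2
  + \left(\phi_2^\dagger\phi_1\right)^2\right]
```

Only the $m_{12}^2$ term is odd under $\phi_1 \to -\phi_1$, which is the sense in which the symmetry is softly broken.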
The minimum of the potential $V$ is characterized by two vacuum expectation values (vevs), $v_1$ and $v_2$, of the neutral components of $\phi_1$ and $\phi_2$ respectively, which break the symmetry down to U(1)$_{\rm em}$, with $v = \sqrt{v_1^2 + v_2^2}$. The ratio of the two vevs, $\tan\beta = v_2/v_1$, is one of the free parameters of the model. The Higgs fields are obtained by expanding the doublets around the minimum of the potential [4,5]. As already mentioned in the previous section, after symmetry breaking the potential predicts five physical Higgs boson states: two neutral CP-even states, $h$ and $H$ ($m_h < m_H$), one neutral CP-odd state $A$, and two charged states $H^\pm$. The physical charged and CP-odd neutral states are linear combinations of the corresponding components of the two doublets. The two CP-even neutral weak states mix through an angle $\alpha$, providing the two mass eigenstates $h$ and $H$. The input parameters of the potential $V$ can be re-expressed in terms of the physical masses and other parameters. Note that $v$ is set at the electroweak scale ($v = 246$ GeV), and one of the CP-even Higgs bosons can be interpreted as the recently discovered Higgs boson of mass 125 GeV under certain scenarios of the model, which are already mentioned in the earlier section [6][7][8][9][10]. Since the topic of interest in this study is the charged Higgs signal, we focus only on this sector of the 2HDM. In the generic 2HDM, the Yukawa couplings of the charged Higgs to fermions are given in Refs. [4,5] in terms of the CKM matrix elements $V_{ud}$ and the couplings $\lambda$, which represent either $\tan\beta$ or $\cot\beta$ depending on the assignment of $Z_2$ charges to the right-handed fermions; this assignment finally defines the four types of 2HDM. Table 1 presents the $\lambda$s corresponding to the four types of 2HDM. As shown in Table 1, in the Type I model the couplings of the charged Higgs to fermions are heavily suppressed for $\tan\beta \gg 1$, which is also true for the Type III model, except for the coupling to leptons, making it lepton specific.
In the Type II model, which is the same as the supersymmetric Higgs sector, the couplings to u-type quarks are favored at low $\tan\beta$, whereas those to d-type quarks and leptons are enhanced at high $\tan\beta$. The Type IV model is lepto-phobic at high $\tan\beta$, but the couplings to quarks are the same in the Type II and Type IV models. Evidently, the charged Higgs branching ratios (Br) to fermions depend strongly on $\tan\beta$. The decays of the charged Higgs to the $\tau\nu$ or $tb$ channels are very sensitive to $\tan\beta$ once they are kinematically allowed. The charged Higgs Br computed with HDECAY [56] are shown in Fig. 1 for various values of $\tan\beta$ and $m_{H^\pm} = 500$ GeV, for the four types of 2HDM. The input parameters are set as $m_h = 125$ GeV, $m_H = m_A = m_{H^\pm}$ and $\sin(\beta - \alpha) = 1$, as in the MSSM scenario [27] in the decoupling limit. In the Type I model, due to the $\cot\beta$ dependence of the couplings, Br($H^- \to \tau^-\bar{\nu}$) is suppressed by $m_\tau^2/m_t^2$ relative to Br($H^- \to \bar{t}b$), leading to an almost 100% Br to the $\bar{t}b$ mode. The dominant decay mode of the charged Higgs in the Type II model is, as expected, the $\bar{t}b$ channel, followed by the sub-dominant $\tau\nu$ channel with Br $\sim 10-15\%$ and by other very suppressed modes such as $H^- \to b\bar{c}, \bar{c}s$. However, in the Type III model, which is lepton specific, the charged Higgs decays dominantly to the $\tau\nu$ mode, except in the lower region of $\tan\beta \sim 1-12$, where the $\bar{t}b$ mode becomes important. In contrast, the $\tau\nu$ mode is suppressed in the Type IV model because of its $\cot\beta$ dependence, and the $\bar{t}b$ channel takes over. It is to be noted that the pattern of these Br changes significantly in the presence of the $H^\pm \to W^\pm\phi$ ($\phi = h, H, A$) mode, whose decay width is proportional to $\cos(\beta - \alpha)$, making it the dominant one ($\sim 100\%$) for the choice $\sin(\beta - \alpha) = 0$.
Interestingly, in the case of the SUSY-motivated Higgs sector, i.e. in the Type II model, the charged Higgs can also decay, if kinematically allowed, to a chargino-neutralino pair, $H^\pm \to \tilde{\chi}^\pm_i \tilde{\chi}^0_j$ ($i = 1,2$; $j = 1,\dots,4$), which may be dominant in Higgsino-like scenarios [57]. As pointed out earlier, the charged Higgs sector is severely constrained by flavor physics data in addition to the direct searches; details can be found in the reviews [27,43].
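The $\tan\beta$ pattern of the charged Higgs couplings described in this section can be summarized in a small lookup. The sketch below encodes only the magnitudes of the factors $\lambda$ for u-type quarks, d-type quarks and charged leptons, as inferred from the verbal description of the four types given here; overall signs are convention dependent and are deliberately left out, and the function and dictionary names are hypothetical.

```python
# Magnitudes of the lambda factors for (u-type quarks, d-type quarks, leptons).
# Assumption: pattern inferred from the text; signs are convention dependent.
COUPLING_PATTERN = {
    "I": ("cot", "cot", "cot"),
    "II": ("cot", "tan", "tan"),
    "III": ("cot", "cot", "tan"),  # lepton specific
    "IV": ("cot", "tan", "cot"),   # lepto-phobic at high tan(beta)
}


def coupling_factors(model_type, tan_beta):
    """Return |lambda_u|, |lambda_d|, |lambda_l| for a given 2HDM type."""
    tb = float(tan_beta)
    values = {"tan": tb, "cot": 1.0 / tb}
    up, down, lepton = COUPLING_PATTERN[model_type]
    return {"u": values[up], "d": values[down], "l": values[lepton]}


for model in ("I", "II", "III", "IV"):
    print(model, coupling_factors(model, 30.0))
```

At $\tan\beta = 30$ every factor in Type I is $1/30$, while Type II enhances the d-type and lepton factors to 30, which is the origin of the very different branching ratio patterns discussed above.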

Charged Higgs production
In the intermediate to heavier mass range ($m_{H^\pm} > m_t$), the charged Higgs is produced inclusively and directly in proton-proton collisions. At the parton level, the production mechanism is initiated via two subprocesses, $gg \to tbH^\pm$ and $gb \to tH^\pm$, at leading order (LO) in the 4 flavor (4FS) and 5 flavor (5FS) schemes, respectively. In fact, the 4FS process is part of the NLO QCD correction to the 5FS mechanism. The total NLO QCD effect on inclusive $H^\pm$ production is essentially the NLO correction to the process $gb \to tH^\pm$ plus the total contribution of the tree-level processes. In the 5FS, the NLO QCD corrections have been known for some time [58][59][60][61], and approximate NNLO calculations have also been published very recently [62]. The total theoretical uncertainty in $H^\pm$ production in association with a top quark (5FS) is found to be in the range 15-20% [63]. In the 4FS, the final state bottom quark originating from the hard scattering is assumed to have a non-zero mass, whereas in the 5FS the b quark is treated as massless, being part of the parton flux. In the 4FS, the corresponding NLO correction is estimated to be around 20% for the lower range of $m_{H^\pm}$, and it goes up slightly for higher masses [63].
At finite order, the cross section in the 4FS does not match that in the 5FS, as expected, because the perturbative calculation is organized differently in the two schemes. However, the results are expected to agree within their respective uncertainties once all orders in perturbation theory are taken into account. To obtain a precise estimate of the charged Higgs production cross section, one needs to combine the 4FS and 5FS predictions appropriately. This combination is performed following the so-called Santander-matching prescription [64]. In the IR limit ($m_{H^\pm}/m_b \to 1$), the cross sections obtained from the 4FS and 5FS calculations match nicely. The main difference between the 4FS and 5FS arises from the large logarithm generated by the splitting of an incoming gluon into two nearly collinear b quarks [65]. Thus, the cross sections calculated in the two schemes should be combined in such a manner that these logarithmic effects are taken into account appropriately. The prescription to match the cross sections computed in the two schemes is given by [11,63] $\sigma^{\rm matched} = (\sigma^{\rm 4FS} + w\,\sigma^{\rm 5FS})/(1 + w)$, with the weight $w = \ln(m_{H^\pm}/m_b) - 2$. Similarly, the theoretical uncertainties are combined as $\Delta\sigma^{\rm matched} = (\Delta\sigma^{\rm 4FS} + w\,\Delta\sigma^{\rm 5FS})/(1 + w)$. With this matching methodology, the overall theoretical uncertainty of the combined NLO cross section is found to be around 10%, whereas the individual 4FS and 5FS cross sections at NLO are in reasonable agreement, within $\sim 20\%$ of the central value [11,63]. The production cross section and the corresponding uncertainty are very sensitive to $\tan\beta$, owing to the dependence of the Yukawa coupling on it. The uncertainty decreases with decreasing $\tan\beta$ through the correction to the bottom Yukawa coupling, which is proportional to $\tan\beta$. We first estimate the charged Higgs boson production in the SUSY-motivated Type II 2HDM for given inputs $\tan\beta$ and $m_{H^\pm}$, and then derive the corresponding cross sections for the other classes of 2HDM (Type I, III and IV) simply by rescaling the couplings appropriately.
It is to be noted that in the MSSM, the NLO QCD corrections may involve additional loop contributions from gluinos and squarks, which also depend on $\tan\beta$. This extra contribution can be absorbed through a rescaling of the NLO QCD prediction of the bottom Yukawa coupling [66]. The total cross section, primarily governed by the $tbH^\pm$ coupling, is found to be at its minimum for $\tan\beta \approx 7-8$. In Table 2, the charged Higgs boson production cross sections in both schemes, and the final matched values, are presented for a few representative choices of $m_{H^\pm}$ and $\tan\beta = 30$ in the Type II model. The cross sections are computed at both LO and NLO using MadGraph5-2.6.1 [67], with the FeynRules [68] model file uploaded by the authors of [69], setting the factorization and renormalization scales equal, as shown in the first row of the table along with the value of the running b-quark mass [70]. We notice that for $\tan\beta = 3$, the cross sections go down by a factor of $\sim 1/2$ in the Type II model. The cross sections vary from $O(100)$ fb to $O(1)$ fb over the mass range $300-1000$ GeV of $m_{H^\pm}$.
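The Santander matching [64] referred to above is a logarithmically weighted average of the two scheme predictions. The following is a minimal sketch, assuming the commonly quoted weight $w = \ln(m_{H^\pm}/m_b) - 2$; the b-quark mass and the cross-section values used in the example are hypothetical placeholders, not numbers from Table 2.

```python
import math

M_B_POLE = 4.75  # GeV, b-quark mass entering the weight (assumed value)


def santander_weight(m_hpm, m_b=M_B_POLE):
    """Weight w = ln(m_H+/m_b) - 2 that favours the 5FS at large mass ratio."""
    return math.log(m_hpm / m_b) - 2.0


def santander_match(value_4fs, value_5fs, m_hpm, m_b=M_B_POLE):
    """Weighted average (x_4FS + w * x_5FS) / (1 + w).

    The same combination is applied to central values and to uncertainties.
    """
    w = santander_weight(m_hpm, m_b)
    return (value_4fs + w * value_5fs) / (1.0 + w)


# Hypothetical inputs in pb, for illustration only.
sigma_4fs, sigma_5fs = 0.10, 0.14
for mass in (300.0, 500.0, 1000.0):
    w = santander_weight(mass)
    matched = santander_match(sigma_4fs, sigma_5fs, mass)
    print(f"m_H+ = {mass:6.0f} GeV  w = {w:.2f}  matched = {matched:.4f} pb")
```

Because $w$ grows only logarithmically with $m_{H^\pm}$, the 4FS result retains a sizeable weight over the whole mass range studied here.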
In the Type I model (see Table 1), the charged Higgs boson coupling to top and bottom quarks scales as $\sim (m_b + m_t)\cot\beta$. The cross sections in the Type I model can be obtained from the values for the Type II model simply by rescaling the Yukawa couplings [11,63,69]. The total cross section can be parametrized as $\sigma^{\rm Type\text{-}II} = g_t^2\,\sigma_t + g_b^2\,\sigma_b + g_t g_b\,\sigma_{bt}$, where $g_t$ and $g_b$ are the parts of the Yukawa coupling proportional to the top and b quark masses, respectively. The contributions $\sigma_b$, $\sigma_t$ and $\sigma_{bt}$ can be obtained by evaluating the cross section with $m_t = 0$ ($g_t = 0$) and with $m_b = 0$ ($g_b = 0$). The cross sections in the Type I model can then be estimated by rescaling each contribution by $\cot\beta$. This prescription works to all orders in QCD, but is not appropriate to all orders in the electroweak corrections [11]. The cross sections in both the Type I and Type II 2HDM are presented in Fig. 2 for various values of $\tan\beta$ and three choices of $m_{H^\pm}$: 300 GeV, 500 GeV and 800 GeV. Clearly, as expected, the cross sections in the Type I model are suppressed relative to the Type II model by approximately $\tan^2\beta$ for $\tan\beta \gg 1$. The cross sections in the Type III (Type IV) model are the same as in Type I (Type II), due to the identical Yukawa coupling structure for quarks. A dip in the cross section is observed for the Type II model around $\tan\beta \sim 7-8$, unlike in Type I, which can be understood from the dependence of the respective couplings on $\tan\beta$ or $\cot\beta$.
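The position of the dip can be understood with a simple estimate. Assuming the Type II rate is governed by the combination $m_t^2\cot^2\beta + m_b^2\tan^2\beta$ (neglecting the interference term, which does not depend on $\tan\beta$ in Type II, and any kinematic differences between the two contributions), the minimum lies at $\tan\beta = \sqrt{m_t/m_b}$. The sketch below checks this numerically; the quark mass values, in particular the running b-quark mass, are illustrative assumptions.

```python
import math

M_T = 173.0  # GeV, top quark mass (assumed)
M_B = 2.8    # GeV, running b-quark mass at a high scale (illustrative assumption)


def type2_strength(tan_beta, m_t=M_T, m_b=M_B):
    """Type II coupling combination m_t^2 cot^2(beta) + m_b^2 tan^2(beta)."""
    return (m_t / tan_beta) ** 2 + (m_b * tan_beta) ** 2


def type1_strength(tan_beta, m_t=M_T, m_b=M_B):
    """Type I combination (m_t^2 + m_b^2) cot^2(beta); it has no minimum."""
    return (m_t ** 2 + m_b ** 2) / tan_beta ** 2


def analytic_minimum(m_t=M_T, m_b=M_B):
    """tan(beta) minimising the Type II combination: sqrt(m_t / m_b)."""
    return math.sqrt(m_t / m_b)


# Scan tan(beta) from 1 to 60 in steps of 0.01 and locate the minimum.
grid = [1.0 + 0.01 * i for i in range(5901)]
numeric_minimum = min(grid, key=type2_strength)

print(f"analytic minimum : tan(beta) = {analytic_minimum():.2f}")
print(f"numerical minimum: tan(beta) = {numeric_minimum:.2f}")
```

With these inputs the minimum falls close to $\tan\beta \approx 8$, in line with the dip quoted above, while the Type I combination falls monotonically with $\tan\beta$.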

Signal and Background
As mentioned before, in this study the signature of the charged Higgs boson is explored in its decay mode $H^\pm \to tb$. Br($H^\pm \to tb$) is dominant, more than 70%, for a wide range of $\tan\beta$ and for all classes of 2HDM, as shown in Fig. 1, except for the Type III model, which is lepton specific. The signal is simulated considering both $H^\pm$ production mechanisms, Eq. 3.2, and the final results are obtained by combining them following the recipe given in Eq. 3.3.
The resulting signal final state contains multiple top quarks. Both leptonic and hadronic final states are considered, following the semi-leptonic and hadronic decays of the top quarks, respectively. Note that the signal final states contain multiple b quarks, which is characteristic of the heavier charged Higgs signal in the considered decay channel [51,52]. The top quark originating from the $H^\pm$ decay is tagged in its hadronic mode, and the charged Higgs mass is reconstructed by combining it with the appropriate identified b-jet. Tagging of the top quark is performed with the powerful jet substructure analysis [71], which is discussed in the next section. In the case of the purely hadronic signal final state, the associated top quark is also identified through kinematic reconstruction in order to make the signal more robust. In addition to the reconstruction of top quarks, we exploit the presence of extra hard b-jets in the final state to separate out the background. Therefore, we consider the charged Higgs signal final state in two categories, hadronic and leptonic (Eq. 4.2), where $H^\pm_{\rm reco}$ and $t_{\rm reco}$ represent the reconstructed charged Higgs and top quark, and $n$ is the number of leptons in the final state, required to be at least one. The dominant irreducible SM backgrounds are $t\bar{t}$ production, which presents exactly identical final states, and inclusive hard QCD jet production. In these backgrounds, extra b-jets may arise via gluon splitting in the initial state radiation. QCD jet production becomes the dominant source of irreducible background, in particular for the hadronic signal final state, due to the non-negligible probability of mis-tagging a hard jet as a top jet. Moreover, the process $t\bar{t}g$, which predominantly produces the final state $t\bar{t}b\bar{b}$, is also taken into account in our background simulation. Before discussing the signal and background estimation strategy, we briefly discuss the top tagging methodology used in our simulation.
Top-tagged jets are the essential component of the signal events considered here. As pointed out earlier, the top quark originating from the $H^\pm$ decay is expected to be reasonably boosted (boost factor $\gamma_t \sim m_{H^\pm}/m_t$), in particular for heavier charged Higgs boson masses. The $p_T$ distributions of those top quarks are shown in Fig. 3 for three masses of $H^\pm$, along with those of the associated top quarks. Clearly, this figure indicates that the top quark from a heavier $H^\pm$ decay is moderately boosted, whereas the $p_T$ distribution of the associated top quarks is found not to be sensitive to $m_{H^\pm}$. A top quark decays to a b quark and a $W^\pm$, which subsequently decays to a pair of light quarks, leading to jets in the calorimeter. However, for a fast top quark these decay products may not be separated well enough to be resolved as separate jets. In such cases, the boosted top quark may look like a single jet, called a fat jet, with three or more subjets as constituents corresponding to its decay products. These subjets are contained within an angular cone of order $\sim 2m_t/p_T$. Following these kinematic features, we attempt to tag top jets in a busy hadronic environment using the top tagger HepTopTagger [72][73][74][75]. In the process of tagging tops, we first cluster particles with $p_T \geq 0.5$ GeV and $|\eta| < 5$ with the Cambridge/Aachen [76] jet algorithm implemented in Fastjet-3.3.0 [77], using jet radius $R = 1.5$, to form fat jets. We then require at least one hard fat jet in the event with $p_T \geq 200$ GeV. In our searches, top-tagged jets are likely to be more contaminated by QCD radiation, since a wider radius $R = 1.5$ is used in order to contain all subjets from the moderately boosted top quark. Therefore, it is advisable to take extra measures to eliminate QCD effects due to soft radiation when reconstructing subjets. The substructure of the fat jets is obtained following the mass drop method, using recursive steps that are built into HepTopTagger [72][73][74][75].
In this process, the last step of the clustering is undone to obtain two subjets $j_1$ and $j_2$, such that $m_{j_1} > m_{j_2}$. If $m_{j_1} + m_{j_2} \sim m_j$ and $m_{j_1} > 0.8\,m_j$, then $j_2$ is expected to originate from QCD emission or the underlying event, and we discard $j_2$; otherwise we keep both $j_1$ and $j_2$. If the mass of a subjet is 30 GeV or less we keep it, otherwise we decompose it further (both $j_1$ and $j_2$, or just $j_1$, depending on how symmetrically the mass splits). The subjets obtained at the end of the recursive declustering procedure are cleaned further through filtering [71] to eliminate contamination from QCD radiation. Two of the subjets are supposed to originate from the $W$ decay, which is ensured by requiring the invariant mass of two subjets to satisfy $m_{jj} = m_W \pm 15$ GeV. Finally, the top is tagged by adding the third subjet, which is a b-like jet, identified by matching with the b quark in the event. The invariant mass of the three subjets after filtering is required to be $m_{jjb} = m_t \pm 30$ GeV. If there is more than one top-tagged jet, we choose the one whose mass is closest to the pole mass of the top quark. Using the default conditions in HepTopTagger, we find that the single top tagging efficiency is about 10% for this kind of moderately boosted top in signal events. Note that no pile-up effects are taken into account in calculating this efficiency. The mis-tagging efficiency is obtained using QCD events and is found to be around $2-3\%$.
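The final mass-window step of the tagging procedure described above can be sketched as follows. This is a simplified illustration rather than the HepTopTagger implementation: it assumes the subjet invariant masses have already been computed, and the candidate dictionaries, key names and numerical mass values are hypothetical.

```python
M_W = 80.4   # GeV, W boson mass (assumed input)
M_T = 173.0  # GeV, top quark mass (assumed input)


def in_window(mass, center, half_width):
    """True if the mass lies within center +/- half_width (GeV)."""
    return abs(mass - center) <= half_width


def select_top_candidate(candidates, m_w=M_W, m_t=M_T):
    """Pick the top candidate passing both mass windows.

    candidates: list of dicts with 'm_jj' (two-subjet mass) and
    'm_jjb' (three-subjet mass) in GeV.
    Returns the passing candidate closest to m_t, or None.
    """
    passing = [
        c for c in candidates
        if in_window(c["m_jj"], m_w, 15.0) and in_window(c["m_jjb"], m_t, 30.0)
    ]
    if not passing:
        return None
    return min(passing, key=lambda c: abs(c["m_jjb"] - m_t))


# Hypothetical candidates for illustration.
fat_jets = [
    {"m_jj": 82.0, "m_jjb": 190.0},
    {"m_jj": 78.0, "m_jjb": 170.0},
    {"m_jj": 120.0, "m_jjb": 173.0},  # fails the W mass window
]
print(select_top_candidate(fat_jets))
```

The third candidate has the best three-subjet mass but is rejected because its two-subjet mass is incompatible with a $W$ decay, so the second candidate is returned.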

Top Tagging
We attempt to improve this top tagging efficiency by employing a multivariate analysis. The multivariate analysis is implemented within TMVA [78], combining the HepTopTagger mass drop method with additional kinematic variables, including N-subjettiness and energy correlations, instead of using the full chain of HepTopTagger. The variables are listed below:
1. N-Subjettiness [79]: The variables are ratios of the form $\tau_N/\tau_{N-1}$, where $\tau_N$ is the N-subjettiness variable [80], defined as $\tau_N = \frac{1}{d_0}\sum_i p_{T,i}\,\min(\Delta R_{i1}, \Delta R_{i2}, \dots, \Delta R_{iN})$ with $d_0 = \sum_i p_{T,i}\,R_0$. Here $\Delta R_{ik}$ is the geometrical separation between the $i$-th subjet and the $k$-th reference axis, and $R_0$ is the jet cone size parameter. Clearly, a smaller $\tau_N$ implies that the radiation is more closely aligned with the given axes, i.e. the jet is better described by $N$ or fewer subjets, whereas a large $\tau_N$ means that the jet is better described by more than $N$ subjets. It is found that $\tau_N/\tau_{N-1}$ is an efficient discriminating variable for distinguishing boosted objects [79][80][81].
2. Mass difference: It is defined as $\Delta m_t = |m_{j_t} - m_t|$, where $m_{j_t}$ is the mass of the tagged top jet. This mass difference is also crucial in tagging tops.

Number of b-quarks:
The number of b-like subjets, $n_j^b$, is counted by matching subjets with b partons within $|\eta| < 2.5$ and $p_T > 5$ GeV, using a matching cone $\Delta R < 0.3$ around the subjet.

Variable related with reconstructed masses: It is defined to be,
This ratio determines the quality of reconstructed W with respect to the overall quality of reconstructed top mass.
6. Energy correlations: The energy correlators among the subjets or particles inside a jet distinguish various properties of jets [82]. The correlation function uses information about the energies and pair-wise angles of the particles within a jet. It is also a useful function for classifying jets.
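To make the N-subjettiness variable in the list above concrete, the sketch below evaluates $\tau_N$ for a toy three-prong jet, assuming the standard definition in which each constituent's $p_T$ is weighted by its distance to the nearest of $N$ axes. The axes are supplied by hand here, whereas a real analysis would determine them by a clustering or minimization procedure; all kinematic values are hypothetical.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance with the azimuthal difference wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def n_subjettiness(constituents, axes, r0):
    """tau_N = sum_i pT_i * min_k dR(i, axis_k) / (sum_i pT_i * R0).

    constituents: list of (pt, eta, phi); axes: list of N (eta, phi) pairs.
    """
    d0 = sum(pt for pt, _, _ in constituents) * r0
    weighted = sum(
        pt * min(delta_r(eta, phi, a_eta, a_phi) for a_eta, a_phi in axes)
        for pt, eta, phi in constituents
    )
    return weighted / d0


# Toy three-prong jet (hypothetical values), fat-jet radius R0 = 1.5.
jet = [(100.0, 0.0, 0.0), (60.0, 0.5, 0.4), (40.0, -0.4, 0.6)]
axes_2 = [(0.0, 0.0), (0.5, 0.4)]
axes_3 = [(0.0, 0.0), (0.5, 0.4), (-0.4, 0.6)]

tau_2 = n_subjettiness(jet, axes_2, r0=1.5)
tau_3 = n_subjettiness(jet, axes_3, r0=1.5)
print(f"tau_2 = {tau_2:.4f}, tau_3 = {tau_3:.4f}")
```

For this idealized jet $\tau_3$ vanishes while $\tau_2$ does not, so the ratio $\tau_3/\tau_2$ is small, which is the behaviour expected of a genuine three-prong top jet.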
With this set of variables, including the HepTopTagger mass drop method, we train Boosted Decision Trees to separate top jets in $t\bar{t}$ events from mis-tags in QCD events. In Fig. 4, we show the results as receiver operating characteristic (ROC) curves in terms of the signal acceptance and background (QCD) rejection efficiencies. This figure clearly demonstrates an improvement in the top tagging efficiency, along with suppressed background mis-tag rates. The efficiency obtained using HepTopTagger is also shown by a star. The top tagging efficiency with the MVA method is improved significantly. We use this improved efficiency in the simulation of signal and background.
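A ROC curve of the kind shown in Fig. 4 is built by scanning a threshold on the classifier output. The sketch below illustrates only this bookkeeping step, using made-up scores; it does not reproduce the TMVA training itself, and the function name and score values are hypothetical.

```python
def roc_points(signal_scores, background_scores, thresholds):
    """Return (threshold, signal efficiency, background rejection) tuples.

    An object is accepted as a top jet if its score exceeds the threshold.
    """
    points = []
    for cut in thresholds:
        eff_sig = sum(s > cut for s in signal_scores) / len(signal_scores)
        eff_bkg = sum(b > cut for b in background_scores) / len(background_scores)
        points.append((cut, eff_sig, 1.0 - eff_bkg))
    return points


# Hypothetical classifier outputs for true top jets and QCD jets.
top_scores = [0.9, 0.8, 0.4, 0.7]
qcd_scores = [0.1, 0.3, 0.6, 0.2]

for cut, eff, rej in roc_points(top_scores, qcd_scores, [0.25, 0.5, 0.75]):
    print(f"cut = {cut:.2f}  signal eff = {eff:.2f}  background rejection = {rej:.2f}")
```

A single fixed tagger, such as the default HepTopTagger working point, corresponds to one point in this plane, which is why it appears as a single marker next to the MVA curve.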

Signal and Background Simulation
PYTHIA8-8.2.26 (PYTHIA8) [83] is used to generate events via the process $gb \to tH^\pm$, whereas MadGraph aMC@NLO-2.6.1 (MG5) [67] is used for $gg \to tbH^\pm$, with subsequent showering through PYTHIA8. The dominant SM background processes, $t\bar{t}$ and QCD events, are generated using PYTHIA8, while MG5 interfaced with PYTHIA8 is used for the $t\bar{t}b\bar{b}$ process. We generate events by dividing the phase space into $\hat{p}_T$ bins, where $\hat{p}_T$ is the transverse momentum of the final state partons in the center of mass frame.
Isolation of the lepton is ensured by requiring $E_T^{AC} \leq 30\%$ of the lepton $p_T$, where $E_T^{AC}$ is the sum of the transverse momenta of the particles within a cone $\Delta R\,(= \sqrt{\Delta\eta^2 + \Delta\phi^2}) < 0.3$ around the direction of the lepton. It is to be noted that the lepton isolation criterion is not imposed when selecting events with a lepton veto.
2. b-jet identification: The jets are reconstructed using Fastjet [77] with the anti-$k_T$ algorithm [84], setting the jet size parameter $R = 0.5$. A jet is identified as b-like if it matches a parton-level b quark within a cone $\Delta R < 0.3$, with $p_T \geq 20$ GeV and $|\eta| < 2.5$.

Top reconstruction:
The details of the top tagging are already discussed in the previous section. By parton-level matching, we find that in about 60-70% of cases, or slightly more, the top quark from the $H^\pm$ decay is the one that is tagged. In addition, after the reconstruction of the charged Higgs using top tagged jets, an additional top jet is reconstructed out of the remaining jets for hadronic signal events. For leptonic signal events no such top quark is reconstructed.

Charged Higgs mass reconstruction:
We observed via matching that the leading identified b-jet with $p_T > 50$ GeV corresponds (∼ 70-80%) to the b quark originating from the $H^\pm$ decay (for $m_{H^\pm}$ 500 GeV). Hence, the charged Higgs mass is reconstructed by combining the leading top tagged jet with the leading b-like jet. Events are then selected within a mass window of 30% around the peak.
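The mass reconstruction amounts to adding the four-momenta of the leading top tagged jet and the leading b-like jet and testing the result against the mass window. The sketch below illustrates this under stated assumptions: the helper names and the toy jet kinematics are hypothetical, and the window is taken as ±30% of the input charged Higgs mass.

```python
import math

def four_vector(pt, eta, phi, mass=0.0):
    """Return (E, px, py, pz) from collider coordinates."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)

def invariant_mass(*vectors):
    """Invariant mass of the summed four-vectors (clipped at zero against rounding)."""
    e, px, py, pz = (sum(v[i] for v in vectors) for i in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def in_mass_window(m_reco, m_input, half_width=0.30):
    """True if the reconstructed mass lies within +-30% of the input mass."""
    return abs(m_reco - m_input) <= half_width * m_input

# Hypothetical leading top-tagged jet and leading b-like jet
top_jet = four_vector(250.0, 0.0, 0.0, mass=173.0)
b_jet = four_vector(250.0, 0.0, math.pi)
m_tb = invariant_mass(top_jet, b_jet)
print(m_tb, in_mass_window(m_tb, 500.0))
```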

Multiplicity of b jets:
In signal events the multiplicity of b-jets is higher than in the $t\bar{t}$ and QCD backgrounds. A hard b-jet remains in the signal final state even after the reconstruction of two (one) tops, and subsequently a charged Higgs, in the hadronic (leptonic) final state. The additional b-jets in background events arise from gluon splitting and are not expected to be hard. Therefore, the requirement of at least one hard b-jet in the final state is expected to be useful in rejecting backgrounds. Hence, such a selection is imposed in the simulation.

Results
We simulate both production processes in the 4FS and 5FS schemes, and then obtain the final yield by appropriately weighting the two resulting cross sections, following the prescription given by Eq. 3.3 and 3.4. For illustration, Table 3 presents the event yields in terms of cross sections after each set of cuts described above, for signal and backgrounds corresponding to the hadronic final state, see Eq. 4.2(a). The second row shows the total production cross sections of the respective processes at 13 TeV center of mass energy. Results for signal events are shown only for a single representative charged Higgs mass, $m_{H^\pm}$ = 500 GeV, although simulations are performed for a wide range of masses, up to 1 TeV. Also note that the results are presented for $\tan\beta$ = 30 and within the framework of the supersymmetry-motivated (Type II) model. After selecting events with a lepton veto and at least one b-identified jet, the fat jets are reconstructed. In order to access the boosted region, events are selected with $p_T > 200$ GeV on fat jets. These high-$p_T$ fat jets are used as input to HepTopTagger to tag them as top jets. We employ the MVA method described above to tag top jets, and find that about 30% of events are tagged as containing a top jet. Subsequently, after top tagging, we look for the hardest b-jet with a cut $p_T > 50$ GeV, which is found to originate from the $H^\pm$ decay in about 70-80% of events. Combining the top tagged jet and the hardest b-jet, the charged Higgs mass is reconstructed, and events are selected within a mass window of ±30% of the input charged Higgs mass. Notice that a good fraction of background events remain within this reconstructed charged Higgs mass window.
With the remaining untagged jets and identified b-jets, the associated top quark is reconstructed. The requirement of a second reconstructed top quark suppresses the background, in particular QCD, more than the signal. Finally, demanding a hard b-jet with $p_T > 30$ GeV rejects backgrounds substantially.
Similarly, cross section yields for the leptonic final state (Eq. 4.2(b)) are presented in Table 4. Events are selected with at least one identified b-jet and one isolated lepton. A top jet is tagged, and the top tagging efficiency is observed to be lower because fewer hadronic top quarks are available. As before, requiring a hard identified b-jet with $p_T > 50$ GeV and combining it with the tagged top jet, the charged Higgs mass is reconstructed. Finally, the requirement of a hard b-jet suppresses the background more than the signal. An additional cut on missing transverse momentum, motivated by the neutrinos from the leptonic top decay, is found not to be very useful.
In Table 5 we summarize the signal and background cross sections normalized by the kinematic acceptance efficiencies for the hadronic and leptonic final states. For illustration, we show results for three choices of charged Higgs mass, $m_{H^\pm}$ = 300, 500 and 800 GeV, corresponding to the signal cross section in both 4FS and 5FS mechanisms. The signal cross sections are found to be O(fb), whereas the total background contribution is huge, in particular for the hadronic final state. Tables 5 and 6 reveal that a 300 GeV charged Higgs can be discovered with a reasonable significance for the high luminosity option (3000 fb$^{-1}$), but for higher masses, ∼ 500 GeV or more, the signal is barely observable. Clearly, it is hard to achieve discoverable signal sensitivity for heavier charged Higgs masses in this channel. However, the discovery potential of the charged Higgs in the leptonic final state is comparatively better. Table 6 shows that the charged Higgs signal is observable with a moderate significance for the mass range around 500 GeV even for the 1000 fb$^{-1}$ integrated luminosity option.
In summary, this cut based analysis shows how difficult it is to achieve discoverable sensitivity for the charged Higgs signal, owing to the huge background cross section with identical event topology. The present set of cuts is not efficient enough to suppress backgrounds to the level required for better signal sensitivity. One may think of constructing better kinematic observables and devising a set of cuts that is optimized more efficiently to reduce the effect of the background. It is therefore a challenging task to establish the feasibility of the charged Higgs signal for heavier masses at the LHC. This motivates us to develop a search strategy using the technique of multivariate analysis, which is discussed in the next section.

Multivariate Analysis
In the previous section, we observed that no single kinematic variable, nor any combination of them, has the potential to isolate the tiny signal from the huge backgrounds. In this section, we discuss the MVA in order to improve the signal to background ratio, aiming at a better significance for a given luminosity option. The basic idea of this method is to combine many kinematic variables that are characteristic of signal events into a single discriminator, which is then used to separate the signal while suppressing backgrounds. The MVA framework is a powerful tool used very widely in high energy physics to extract tiny signals out of huge backgrounds, including the single top discovery [85] and, more recently, the Higgs boson at the LHC [86].
Here we carry out the MVA through the Boosted Decision Tree (BDT) method within the framework of TMVA [78].
In the BDT method, events are classified by sequentially applying a set of cuts, making subsets of events with different signal purity. Several disjoint decision trees consisting of two branches are constructed through the best selection of cuts out of the listed input variables of the given process, and this is repeated using subsequent sets of cuts until all the events are classified. While training on the sample events, if an event is misclassified, i.e., a signal event is labeled as background or a background event as signal, then it is boosted by increasing the weight of that event. Subsequently, a second tree is made using the new weights, which may not be the same as the previous tree. This process is repeated, and we construct about 1000 trees. There are a few methods of boosting [87], and we use the gradient boosting technique [88]. In the BDT algorithm, these trees are made by training on half of the signal and background events. The remaining half of the signal and background events are used to check the performance of the trained BDT.
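The reweighting idea described above can be illustrated with a toy example. The sketch below is not TMVA and does not use gradient boosting: it swaps in the simpler AdaBoost-style reweighting with one-cut trees (decision stumps), trained on half of a synthetic sample and evaluated on the other half. The two Gaussian toy features, the number of trees (20), and all function names are assumptions made for illustration only.

```python
import math
import random

def make_events(n, mean, label, rng):
    """Toy events: two Gaussian features centred at `mean`, plus a +1/-1 label."""
    return [([rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)], label) for _ in range(n)]

def stump_vote(x, feature, threshold, sign):
    """One-cut tree: vote `sign` above the threshold and `-sign` below it."""
    return sign if x[feature] > threshold else -sign

def best_stump(events, weights, grid):
    """Scan all candidate cuts and return the one with the smallest weighted error."""
    best = None
    for feature in range(2):
        for threshold in grid:
            for sign in (1, -1):
                err = sum(w for (x, y), w in zip(events, weights)
                          if stump_vote(x, feature, threshold, sign) != y)
                if best is None or err < best[0]:
                    best = (err, feature, threshold, sign)
    return best

def train(events, n_trees=20):
    grid = [-3.0 + 0.25 * i for i in range(25)]  # candidate cut values
    weights = [1.0 / len(events)] * len(events)
    forest = []
    for _ in range(n_trees):
        err, feature, threshold, sign = best_stump(events, weights, grid)
        err = min(max(err, 1e-9), 1.0 - 1e-9)
        alpha = 0.5 * math.log((1.0 - err) / err)
        forest.append((alpha, feature, threshold, sign))
        # Boost: misclassified events (y * vote = -1) gain weight for the next tree.
        weights = [w * math.exp(-alpha * y * stump_vote(x, feature, threshold, sign))
                   for (x, y), w in zip(events, weights)]
        norm = sum(weights)
        weights = [w / norm for w in weights]
    return forest

def discriminator(forest, x):
    """Weighted vote of all trees, normalised to lie in [-1, 1]."""
    total = sum(alpha for alpha, _, _, _ in forest)
    return sum(alpha * stump_vote(x, f, t, s) for alpha, f, t, s in forest) / total

def accuracy(forest, events):
    correct = sum(1 for x, y in events if (discriminator(forest, x) > 0.0) == (y > 0))
    return correct / len(events)

rng = random.Random(1)
events = make_events(200, 1.0, 1, rng) + make_events(200, -1.0, -1, rng)
rng.shuffle(events)
train_half, test_half = events[:200], events[200:]  # train on one half, test on the other
forest = train(train_half)
train_acc = accuracy(forest, train_half)
test_acc = accuracy(forest, test_half)
print(f"training accuracy {train_acc:.2f}, test accuracy {test_acc:.2f}")
```

A large gap between the training and test accuracies in such a setup would be a first hint of overtraining.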
Following the production and decay mechanism, Eq. 4.1, events are selected for the final state consisting of one top tagged jet, more than one identified b-jet and untagged jets, corresponding to the hadronic signal final state. For the leptonic signal, at least one isolated lepton is required in addition. A large number of kinematic variables are constructed out of the momenta of these objects, and eventually 10 input variables are used in the BDT to train the signal and background samples. In Tables 7 and 8, the input variables are ranked according to their importance in the BDT analysis for $m_{H^\pm}$ = 500 GeV, corresponding to the hadronic and leptonic final states respectively. In the third column of each table a brief description is provided for each of the variables. Importance here means the effectiveness of a variable in suppressing backgrounds while maintaining a better signal purity.

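Table 7: Rank, variable and description of the BDT input variables for the hadronic final state with $m_{H^\pm}$ = 500 GeV.
... jets, and $p_T$ of the second b-jet all appear to be useful variables in eliminating the background events.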
In Table 8, the set of kinematic variables is presented for the leptonic final state and for $m_{H^\pm}$ = 500 GeV. As before, this set remains the same for $m_{H^\pm}$ = 300 and 1000 GeV, but the ranking changes. For instance, for the lower mass of $m_{H^\pm}$ = 300 GeV, variable 3 becomes more important than variable 1. Due to the presence of neutrinos, the variable related to missing transverse energy, MHT, plays a role in discriminating background, in particular from QCD. As in the hadronic case, the number of b-jets and their corresponding transverse momenta are very effective in increasing the signal to background ratio.
In this type of analysis based on machine learning models, one issue often encountered is the overtraining of the sample. The training can be checked using a test data sample. Ideally, for a sufficiently large and random Monte Carlo sample, the performance on training and testing data should be similar. A significant deviation between the two would be an indication of overtraining. These overtraining tests are performed for all masses $m_{H^\pm}$ = 300 − 1000 GeV. In Fig. 6, the distribution of the MVA output discriminator is presented for signal events with $m_{H^\pm}$ = 500 GeV and backgrounds from QCD, $t\bar{t}$ and $t\bar{t}b\bar{b}$, along with the significance $S/\sqrt{B}$ for an integrated luminosity of 300 fb$^{-1}$. A significance close to 3σ can be achieved with a selection on the discriminator, D > 0.9. With this cut on D, and for an integrated luminosity of 300 fb$^{-1}$, the number of events turns out to be 2830 for signal and 1710000 for total backgrounds, of which 70% comes from QCD. The selection D > 0.9 leads to a significance of 2.65, which increases for higher luminosity options.
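One common way to quantify the agreement between training and test samples is a two-sample Kolmogorov-Smirnov comparison of the discriminator distributions. The sketch below shows this under the assumption that such a test is used; the text does not specify which comparison was performed, and the function names and the 5% critical coefficient (1.358, the standard asymptotic value) are choices made here for illustration.

```python
import math

def ks_statistic(sample_a, sample_b):
    """Largest distance between the two empirical cumulative distributions."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def looks_overtrained(train_scores, test_scores, c_alpha=1.358):
    """Flag a train/test mismatch at roughly the 5% level (asymptotic approximation)."""
    n, m = len(train_scores), len(test_scores)
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(train_scores, test_scores) > critical

# Hypothetical discriminator outputs for training and test signal events
train_scores = [0.1 * k for k in range(1, 11)]
test_scores = [0.1 * k - 0.05 for k in range(1, 11)]
print(ks_statistic(train_scores, test_scores), looks_overtrained(train_scores, test_scores))
```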
Unlike the hadronic case, in the leptonic signal final state (see Fig. 7) the dominant background appears to be due to $t\bar{t}$ production. A cut on the BDT output D > 0.9 leads to a significance of about 3σ for L = 300 fb$^{-1}$. The study is extended up to a charged Higgs mass of 1000 GeV.
Signal significances are presented for both the hadronic and leptonic (in parentheses) final states in Table 9 for three masses of the charged Higgs and for three integrated luminosity options. Remarkably, a significant improvement in sensitivity for both hadronic and leptonic signals is achieved by performing the analysis using the MVA technique. This table suggests that the charged Higgs in the hadronic channel can be probed with a reasonable sensitivity for $m_{H^\pm}$ up to ∼ 500 GeV, much better than what is obtained using the simple cut based analysis shown in Table 5. Table 9 also shows that the signature of a charged Higgs of mass around 800 GeV is observable for the 3000 fb$^{-1}$ luminosity option, unlike in the hadronic final state. For the lower range of masses (∼ 500 GeV) the signal is feasible even for the 1000 fb$^{-1}$ luminosity option.
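For a fixed selection, both signal and background yields grow linearly with integrated luminosity, so a statistics-only significance $S/\sqrt{B}$ grows as $\sqrt{L}$. The sketch below illustrates this scaling; the reference value of 3σ at 300 fb$^{-1}$ is used purely as an illustrative input, the function names are hypothetical, and systematic uncertainties are ignored, so the numbers are not a substitute for the entries of Table 9.

```python
import math

def significance(n_signal, n_background):
    """Statistics-only significance S / sqrt(B)."""
    return n_signal / math.sqrt(n_background)

def scale_to_luminosity(z_ref, lumi_ref, lumi_new):
    """S and B both scale with luminosity, so S/sqrt(B) scales with sqrt(L)."""
    return z_ref * math.sqrt(lumi_new / lumi_ref)

def luminosity_for(z_target, z_ref, lumi_ref):
    """Integrated luminosity needed to reach z_target under the same scaling."""
    return lumi_ref * (z_target / z_ref) ** 2

z_300 = 3.0  # illustrative reference significance at 300 fb^-1
for lumi in (300.0, 1000.0, 3000.0):
    print(lumi, round(scale_to_luminosity(z_300, 300.0, lumi), 2))
print(round(luminosity_for(5.0, z_300, 300.0), 1))
```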
The results presented so far in Table 9 correspond to the SUSY motivated Type II model. However, the signal cross sections for other classes of 2HDM can be obtained from these estimated values simply by rescaling the couplings and appropriately multiplying by Br($H^- \to \bar{t}b$), which is very sensitive to $\tan\beta$, as seen in Fig. 1. The significances for all four types of models are presented for the hadronic and leptonic (in parentheses) final states in Tables 10 and 11, corresponding to $\tan\beta$ = 30 and 3 respectively. Table 10 suggests that for the high $\tan\beta$ scenario, the charged Higgs boson can be discovered only in the context of the Type II model. However, for the low $\tan\beta$ scenario, in the Type I and III models, the charged Higgs coupling with top and bottom quarks is favored, yielding better significances. Because the charged Higgs couplings with quarks are identical in the Type II and IV models, the corresponding significances are almost the same. The feasibility of observing the signal of a charged Higgs of mass around 500 GeV in both hadronic and leptonic final states, for all classes of 2HDM, is quite promising for 1000 fb$^{-1}$ luminosity; however, for L = 300 fb$^{-1}$, only a charged Higgs boson in the lower mass range can be discovered at the LHC.
The discovery potential of a charged Higgs of 300 GeV mass is quite promising even for the 300 fb$^{-1}$ luminosity option at 13 TeV energy. However, for higher masses, e.g. $m_{H^\pm}$ = 500 GeV, one needs high luminosity options such as 1000 fb$^{-1}$ or more. This study shows that for higher masses, ∼ 1000 GeV, it is very hard to achieve better signal sensitivity even for the high luminosity option.
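The rescaling between model types can be sketched as follows, under simplifying assumptions that are not taken from this section: the squared $H^\pm tb$ coupling is approximated as $m_t^2\cot^2\beta + m_b^2\tan^2\beta$ for Type II and IV and as $(m_t^2 + m_b^2)\cot^2\beta$ for Type I and III, the interference term is neglected, and fixed quark masses are used instead of running masses. With the background unchanged, $S/\sqrt{B}$ then scales linearly with the signal rate. The function names and the `br_ratio` input are hypothetical, and the branching-ratio ratio must be supplied separately.

```python
M_TOP = 173.0     # GeV, assumed value
M_BOTTOM = 4.18   # GeV, assumed value; a running mass at the relevant scale is smaller

def tb_coupling_sq(model, tan_beta):
    """Approximate squared H+- t b coupling strength (arbitrary units)."""
    cot_sq = 1.0 / tan_beta ** 2
    if model in ("II", "IV"):
        return M_TOP ** 2 * cot_sq + M_BOTTOM ** 2 * tan_beta ** 2
    if model in ("I", "III"):
        return (M_TOP ** 2 + M_BOTTOM ** 2) * cot_sq
    raise ValueError(f"unknown 2HDM type: {model}")

def rescaled_significance(z_type2, model, tan_beta, br_ratio=1.0):
    """Scale a Type II significance to another model at the same tan(beta).

    br_ratio is Br(model) / Br(Type II) for the tb decay, supplied by the user.
    """
    coupling_ratio = tb_coupling_sq(model, tan_beta) / tb_coupling_sq("II", tan_beta)
    return z_type2 * coupling_ratio * br_ratio

for tan_beta in (3.0, 30.0):
    ratio = tb_coupling_sq("I", tan_beta) / tb_coupling_sq("II", tan_beta)
    print(f"tan(beta) = {tan_beta:g}: Type I / Type II production ratio = {ratio:.4f}")
```

In this approximation the Type I and III production rate is strongly suppressed relative to Type II at $\tan\beta$ = 30 but comparable at $\tan\beta$ = 3, in line with the qualitative trend described above.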

Summary
We explored the detection prospects of the charged Higgs boson in the heavier mass range at the LHC Run 2 experiments with center of mass energy $\sqrt{s}$ = 13 TeV, within the framework of the generic 2HDM. A very brief discussion of the 2HDM is presented in order to set up the model framework for the analysis. It is observed that, in all classes of 2HDM, Br($H^- \to \bar{t}b$) is always the dominant one, except in the Type III model where this holds only for the lower range of $\tan\beta$ (< 10). It is to be noted that other decay modes, such as $H^- \to W^-\phi$ ($\phi$: h, H, A), also open up with a large Br once the condition $\sin(\beta - \alpha)$ = 1 is relaxed, as discussed in Sec. 2. The charged Higgs boson production cross sections at $\sqrt{s}$ = 13 TeV are computed in both the 4FS and 5FS mechanisms, and the matched values are presented for a few $m_{H^\pm}$ values. In the context of the SUSY motivated Type II model, the matched cross sections vary from O(100 fb) to O(10 fb) for the range $m_{H^\pm}$ ∼ 300 − 1000 GeV for large values of $\tan\beta$, and are found to be smaller for other classes of 2HDM. The signature of the charged Higgs signal is investigated for the final state consisting of the reconstructed charged Higgs mass and extra b-jets, plus an additional reconstructed top quark, for hadronic and leptonic events. The jet substructure technique is used to tag the moderately boosted top quark from the heavier charged Higgs decay in order to avoid the combinatorial problem while reconstructing the charged Higgs mass. The MVA method is employed along with the HepTopTagger methodology to tag top jets. A better top tagging efficiency with lower mis-tagging rates is obtained in comparison to the result using only HepTopTagger. The detailed simulation is performed for the signal and the dominant irreducible SM backgrounds from top quark pair production and QCD.
Constructing a few kinematic variables and optimizing the event selections, the cut based analysis predicts very poor signal sensitivity, even for the very high luminosity option. However, for the lower mass range of the charged Higgs boson, around 300 GeV, one can expect modest sensitivity for the 3000 fb$^{-1}$ luminosity option. In order to improve the signal significance, the analysis is carried out using the techniques of multivariate analysis within the framework of TMVA. Several kinematic variables are used to train the BDT, and overtraining of the samples is then checked. Remarkably, the MVA improves the signal significance to a reasonable level. For example, it demonstrates that with L = 1000 fb$^{-1}$, the charged Higgs signal for the mass range ∼ 300 − 700 GeV can be probed in both the hadronic and leptonic channels. For a higher luminosity option, such as 3000 fb$^{-1}$, the discoverable mass range can be extended moderately up to ∼ 800 GeV. Simply scaling the charged Higgs couplings, and then the production cross sections, we present the signal significances for all classes of 2HDM for three representative choices of charged Higgs mass and two values of $\tan\beta$, 30 and 3. The results show that for the high $\tan\beta$ = 30 scenario, it is difficult to achieve any detectable signal sensitivity for classes of 2HDM other than the Type II model. Interestingly, for the low $\tan\beta$ = 3 case, the signature of the charged Higgs boson for the mass range ∼ 300 − 700 GeV seems to be detectable with high luminosity options with a reasonable sensitivity for both leptonic and hadronic signal final states. Finally, it is indeed hard to discover a charged Higgs of mass beyond 800 GeV in the current and future options of the LHC experiments. If the charged Higgs is not discovered within this mass range, then one may need higher energy collider options such as $\sqrt{s}$ = 100 TeV [89].
[20] CMS Collaboration, "Search for Charged Higgs boson to cb in lepton+jets channel using top quark pair events," Tech. Rep. CMS-PAS-HIG-16-030, CERN, Geneva, 2016.