Probing Inert Doublet Model using jet substructure with multivariate analysis

We explore the challenging but phenomenologically interesting hierarchical mass spectrum of Inert Doublet Model where relatively light dark matter along with much heavier scalar states can fully satisfy the constraints on the relic abundance and also fulfill other theoretical as well as collider and astrophysical bounds. To probe this region of parameter space at the LHC, we propose a signal process that combines up to two large radius boosted jets along with substantial missing transverse momentum. Aided by our intuitive signal selection, we capture a hybrid process where the di-fatjet signal is significantly enhanced by the mono-fatjet contribution with minimal effects on the SM di-fatjet background. Substantiated by sizable mass difference between the scalars, these boosted jets, originally produced from the hadronic decay of massive vector bosons, still carry the inherent footprint of their root. These features implanted inside the jet substructure can provide additional handles to deal with large background involving QCD jets. We adopt multivariate analysis using boosted decision tree to provide a robust mechanism to explore the hierarchical scenario, that would bring almost the entire available parameter space within the reach of the 13 TeV LHC.


Introduction
The Standard Model (SM) of particle physics encapsulates our knowledge of fundamental interactions of the particle world with all its glory. Until now, apart from a few minor exceptions, the SM is in perfect agreement with all the high energy collider experiments like the Large Hadron Collider (LHC) experiments at CERN. The reputation of the SM being the complete theory gets tarnished when it cannot explain the presence of tiny yet nonzero masses of the neutrinos that is already established in neutrino oscillation experiments. Observations of cosmic microwave background radiation in various experiments unambiguously establish that 26% of the energy budget of our universe is made up of an inert, stable component, termed as the 'dark matter' (DM). The SM does not contain any particle that can satisfy the observed density of the DM, along with explaining its other properties. Inert doublet model (IDM) is proposed [1,2] as a minimal extension of the SM that can provide an inert weakly interacting DM candidate, stabilized by the discrete symmetry of the model. The SM is extended with an extra SU(2) L scalar doublet which is odd under a discrete Z 2 symmetry, and thus stabilizes the lightest neutral scalar of the model to be an ideal DM candidate. The neutrino mass can also be arranged in this set-up by introducing lepton portals [3].
Exploring the dark sector of the IDM, as done in Refs. [4,5], reveals that only small islands of parameter space can account for the full DM relic density dictated by the WMAP and Planck results. Only a light DM with mass close to half of the Higgs mass can produce the full relic density, through resonant Higgs portal annihilation. Even this requires a large mass difference between the DM and the other beyond the SM (BSM) scalars in the model; otherwise, coannihilation effects in a degenerate mass spectrum reduce the relic density and under-produce DM. The other part of the parameter space where the full DM relic density can be explained is at m DM ∼ 550 GeV, and even that is possible only for an extremely degenerate BSM scalar mass spectrum. Among the various DM scenarios discussed above, the heavy DM with a hierarchical spectrum can accommodate only a few percent of the observed relic density, which makes this scenario uninteresting to probe at the LHC.
One of the earlier collider studies of the IDM is performed in Ref. [6]. The dilepton and trilepton signatures at the LHC originating from the IDM have been investigated in Refs. [7,8]. Although the degenerate heavy DM scenario can provide full relic density, inertness of the model leads to kinematic suppression in heavy DM production at colliders and therefore, makes the signal very weak. Moreover, detection of the soft decay products from such a compressed mass spectrum remains challenging due to poor signal efficiency. This scenario is probed using charged track signatures by the CMS collaboration [9].
Among the two light DM scenarios in this model, the degenerate BSM scalar mass spectrum can satisfy only about 10% of the observed DM relic density. Nonetheless, this case is probed at the LHC through the mono jet search in Ref. [10]. We motivate our framework with the light DM along with hierarchical mass spectrum where the full DM relic density is achieved. Although very challenging due to tiny production cross sections of the unstable heavy BSM scalars at the LHC, their large mass differences with the DM candidate give rise to interesting signal topology characterized by two boosted jets along with large missing transverse energy (MET) from the DM production. This gives us a scope to employ sophisticated multivariate analysis equipped with jet substructure variables to isolate signal from the overwhelmingly large SM background. The search for BSM scalars for this case is studied in the dijet plus MET channel, in a recent study [11]. All these searches do not exhibit bright discovery potential even with highest possible LHC luminosity. We look to explore for a suitable discovery potential of this scenario, in this paper.
To reiterate the scenario under consideration, the heavier BSM scalars (a pseudoscalar A and a charged Higgs H ± ) reside in the mass range ∼ 250 - 700 GeV, much higher than the small mass window (55 - 80 GeV) where the DM candidate can lie. Hence, the mass difference is large enough for these heavy scalars to decay dominantly to vector bosons (V = {W, Z}), which in turn are sufficiently boosted that their hadronic deposits in the calorimeter behave like large-radius fatjets, characterized by the jet radius R ∼ 2m V /p T ≲ 0.8. Accompanied by large MET from the undetected pair of DM particles, the presence of these fatjets in the signal brings additional variables like the fatjet mass (M J ) and the N-subjettiness ratio (τ 21 ), which carry the characteristics of a boosted W/Z decay. These observables can distinguish well between the signal and the background, which comes dominantly from SM V +jets, as only a tiny fraction of QCD jets mimic boosted V-jets. Still, when the overwhelmingly large background cross section is pitted against a signal cross section suppressed by the inertness of the model, even a tiny fraction can overshadow the signal and deny a significant discovery potential.
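As a rough numerical check of the boosted regime described above, the angular separation of the two quarks from a V → qq decay can be estimated with the standard rule of thumb ∆R ≈ 2m V /p T . The following minimal sketch evaluates it; the function names and the approximate W/Z masses are illustrative choices, not part of the analysis code.

```python
# Rule-of-thumb opening angle of a boosted two-body decay: dR ~ 2 m / pT.
# Masses are approximate values in GeV; names are illustrative only.
M_W = 80.4
M_Z = 91.2


def opening_angle(mass, pt):
    """Approximate angular separation of the decay products (valid for pt >> mass)."""
    return 2.0 * mass / pt


def min_pt_for_radius(mass, radius=0.8):
    """Smallest pT for which both decay products fit in a jet of the given radius."""
    return 2.0 * mass / radius


if __name__ == "__main__":
    for name, mass in (("W", M_W), ("Z", M_Z)):
        print(f"{name}: pT > {min_pt_for_radius(mass):.0f} GeV for R = 0.8")
```

Under this estimate a W or Z needs p T of roughly 200 GeV or more to be contained in a single R = 0.8 jet, which is the regime reached when the heavy scalars are much heavier than the DM.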
There is a possibility of mono-fatjet signal topology [12,13] with roughly one order of magnitude higher cross section than the di-fatjet one. In this case, although we have bigger cross section, the corresponding background becomes uncontrollably large. Therefore, the mono-fatjet topology alone is not sufficient to achieve discovery significance. The di-fatjet topology, on the other hand, also alone is not enough to produce discovery significance due to tiny production cross section. In this paper, we propose a hybrid topology where the signal selections are designed aiming the di-fatjet topology but can allow a substantial fraction of mono-fatjet signal. In doing this, we not only gain in signal but at the same time the huge mono-fatjet background can also be tamed down.
Probing the IDM with a cut-based analysis (CBA) in the di-fatjet plus MET channel fails to reach the discovery significance of 5σ for any of our chosen benchmark points. From the event distribution profiles, we observe that two variables, M J and τ 21 , are very powerful in separating the tiny signal from the enormous SM background. Discovery significance in the CBA nevertheless remains elusive even when these jet substructure variables are used to the hilt. A multivariate analysis (MVA) in general performs better than a CBA if appropriate kinematic variables are used. So, a sophisticated MVA involving the jet substructure variables quoted above is imperative to achieve better discovery potential in the IDM. With events selected only after the baseline cuts (defined later), the signal is still too tiny compared to the background to train the MVA. Therefore, the baseline selection should be supplemented with stronger cuts that reduce the large background without harming the signal too much before events are passed to the MVA. These cuts should be chosen optimally: if they are too similar to the ones used in the CBA, they reduce the predictive power of the MVA. Finally, our selection cuts are designed so that the signal consists of events with two high-p T fatjets together with a large contribution from mono-fatjet events that mimic the di-fatjet signal. This significantly increases the signal cross section but simultaneously brings some extra background processes into the picture. We perform an MVA with jet substructure variables to achieve the improved signal vs background discrimination that could not be achieved in the CBA. It helps us reach a significantly higher LHC discovery potential in the di-fatjet plus MET channel of the IDM.
This paper is organized as follows. In Sec. 2 we briefly discuss the IDM, outlining its scalar sector. Next, in Sec. 3, we invoke all the possible theoretical, collider and astrophysical constraints applicable to the IDM, to ascertain the viability of a hierarchical BSM scalar sector along with the presence of a light DM. Subsequently, in Sec. 4, we discuss four possible DM scenarios depending on the DM mass and its mass differences to the other BSM scalars, to motivate our choice of benchmark points. To define our analysis setup, we list in Sec. 5 the possible IDM processes contributing to our signal process, the di-fatjet plus MET channel. We also discuss all possible SM backgrounds for this channel. At this point we present our sample benchmark points covering our region of interest, which are also consistent with all the discussed constraints. In Sec. 6, we first use the baseline cuts and then introduce two fatjet-specific observables, study how these increase the signal vs background ratio, and obtain the LHC reach for all the benchmark points. In Sec. 7, we improve our probe using the MVA, which recasts the signal vs background numbers with non-rectangular cuts and therefore has better sensitivity for the LHC search. Finally, we summarize our results and conclude in Sec. 8.

Inert Doublet Model
We first discuss the traditional IDM, where one adds an SU(2) L complex scalar doublet Φ 2 to the SM Higgs doublet Φ 1 ; Φ 2 and Φ 1 are respectively odd and even under a discrete Z 2 symmetry. The most general scalar potential that respects the SU(2) L ⊗ U(1) Y ⊗ Z 2 symmetry of the IDM can be written as [5], where Φ 1 and Φ 2 both have hypercharge Y = +1, and can be written as Here h is the SM Higgs, with G + , G 0 being the charged and neutral Goldstone bosons, respectively. The charged scalar H + is present in Φ 2 , along with the neutral scalars H and A, which are respectively CP-even and CP-odd. For the vacuum expectation values (VEVs) of the two doublets, we adopt the notation ⟨Φ 1 ⟩ = v/√2, ⟨Φ 2 ⟩ = 0, keeping in mind the exact nature of the Z 2 symmetry. The zero VEV of Φ 2 is responsible for the inertness of this model. Since all the SM fermions are even under Z 2 , the new scalar doublet does not couple to the SM fermions and therefore the new scalars have no fermionic interactions. The scalar-gauge boson interactions originate from the kinetic terms of the two doublets. All parameters in the scalar potential are assumed to be real in order to keep the IDM CP-invariant.
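For orientation, a commonly used parametrization of the potential, doublets and kinetic terms described in this paragraph is sketched below. The factors of 1/2 multiplying λ 1 and λ 2 are an assumed normalization and may differ from the convention of Ref. [5]. The normalization of λ 3,4,5 is chosen to be consistent with λ L = λ 3 + λ 4 + λ 5 and with the HHV V coupling quoted in Sec. 4.

```latex
\begin{align}
V(\Phi_1,\Phi_2) &= \mu_1^2\,|\Phi_1|^2 + \mu_2^2\,|\Phi_2|^2
  + \frac{\lambda_1}{2}\,|\Phi_1|^4 + \frac{\lambda_2}{2}\,|\Phi_2|^4
  + \lambda_3\,|\Phi_1|^2|\Phi_2|^2 \nonumber\\
 &\quad + \lambda_4\,|\Phi_1^\dagger\Phi_2|^2
  + \frac{\lambda_5}{2}\left[(\Phi_1^\dagger\Phi_2)^2 + \text{h.c.}\right], \\[4pt]
\Phi_1 &= \begin{pmatrix} G^+ \\ \frac{1}{\sqrt{2}}\,(v + h + i\,G^0) \end{pmatrix},
\qquad
\Phi_2 = \begin{pmatrix} H^+ \\ \frac{1}{\sqrt{2}}\,(H + i\,A) \end{pmatrix}, \\[4pt]
\mathcal{L}_{\rm kin} &= (D_\mu\Phi_1)^\dagger (D^\mu\Phi_1) + (D_\mu\Phi_2)^\dagger (D^\mu\Phi_2).
\end{align}
```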
Here, the electroweak symmetry breaking takes place through the SM doublet Φ 1 getting a VEV and after this, the masses of the physical scalars at tree level can be written as Here, m h is the SM-like Higgs boson mass, and m H(A) are the masses of the CP-even (odd) scalars from the inert doublet, while m H ± is the charged scalar mass. Either of the neutral scalars can be the DM candidate in this IDM framework, since DM observations can not probe the CP-behaviour. For the present analysis, we consider the CP-even scalar H as the DM candidate, which corresponds to negative values of λ 5 parameter. We define λ 3 + λ 4 + λ 5 = λ L , which can be either positive or negative. The relations between the λ's and the scalar masses get modified when the QED corrections are considered for both the scalar masses and scalar potential parameters. As the inert scalars do not couple to the SM quarks, higher-order QCD corrections are negligible for these parameters. Compared to the SM, only the scalar sector is modified in the IDM. Similar to the SM, λ 1 and µ 2 1 are determined by m h ≈ 125 GeV and v ≈ 246 GeV. There are five free parameters in the scalar sector of the IDM viz. λ L , λ 2 , m A , m H ± , and m H that are expressed in terms of the five scalar potential parameters, µ 2 2 and λ 2,3,4,5 , as shown in Eq. 2.4. The new doublet, being inert to the SM fermionic sector, does not introduce any additional new parameter in this set-up. The inert doublet self coupling parameter λ 2 , mostly limited to fixing unitarity and stability of the potential, does not affect the scalar masses and their phenomenology. The Higgs portal coupling λ L to the chosen DM candidate H determines the rate of the DM annihilation through the Higgs and therefore, is an important parameter in the DM sector along with the DM mass m DM = m H . 
The collider phenomenology of the IDM depends on the scalar masses m H ± , m A and m H , as the mass differences between them play a major role in proposing the search channels for different scenarios.
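The translation between the physical inputs (m H , m A , m H ± , λ L ) and the potential parameters (µ 2 2 , λ 3,4,5 ) can be sketched as follows. This assumes the common tree-level relations m² H± = µ²₂ + λ₃v²/2, m² H = µ²₂ + (λ₃+λ₄+λ₅)v²/2 and m² A = µ²₂ + (λ₃+λ₄−λ₅)v²/2, which are consistent with the HHV V coupling quoted in Sec. 4 but are not copied from the paper's own equations. The function names are illustrative.

```python
import math

V_EW = 246.0  # electroweak VEV in GeV


def potential_from_masses(m_h_dm, m_a, m_hpm, lam_l, v=V_EW):
    """Map physical masses (GeV) and lambda_L to (mu2^2, lambda_3, lambda_4, lambda_5).

    Assumed tree-level convention, with lambda_L = lambda_3 + lambda_4 + lambda_5.
    """
    v2 = v * v
    return {
        "mu2_sq": m_h_dm**2 - 0.5 * lam_l * v2,
        "lam3": lam_l + 2.0 * (m_hpm**2 - m_h_dm**2) / v2,
        "lam4": (m_h_dm**2 + m_a**2 - 2.0 * m_hpm**2) / v2,
        "lam5": (m_h_dm**2 - m_a**2) / v2,
    }


def masses_from_potential(mu2_sq, lam3, lam4, lam5, v=V_EW):
    """Inverse map: return (m_H, m_A, m_H+-) in GeV from the potential parameters."""
    v2 = v * v
    m_h_dm = math.sqrt(mu2_sq + 0.5 * (lam3 + lam4 + lam5) * v2)
    m_a = math.sqrt(mu2_sq + 0.5 * (lam3 + lam4 - lam5) * v2)
    m_hpm = math.sqrt(mu2_sq + 0.5 * lam3 * v2)
    return m_h_dm, m_a, m_hpm


if __name__ == "__main__":
    # Hypothetical hierarchical point: light DM, heavy degenerate A and H+-.
    params = potential_from_masses(60.0, 300.0, 300.0, 0.005)
    print(params)
    print(masses_from_potential(**params))
```

In this convention a DM candidate H lighter than A automatically gives λ 5 < 0, in line with the statement above that H as the DM corresponds to negative λ 5 .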

Constraints on Inert Doublet Model
The IDM parameter space is constrained from various theoretical as well as experimental considerations. In the model, we have an extra doublet which brings extra parameters in the scalar potential. Therefore, it is imperative to check whether the extended potential is bounded from below or not i.e. stable at tree level along with the potential parameters being within the unitary and perturbative regime. With the presence of extra doublet, oblique parameters should be re-examined with respect to the presence of a light DM and custodial symmetry breaking, respectively. Presence of light scalars can also upset the LEP constraints and the Higgs invisible decay limits. Since a DM is present in the model, we should satisfy the observed relic density keeping the DM-nucleon interactions below the DM direct detection reach.

Theoretical constraints
The scalar sector is modified in the IDM. We ensure the enlarged potential is stable i.e. not unbounded from below and the global minimum is a neutral one. It is also checked if the potential parameters are perturbative at the tree level along with satisfying unitarity bounds.
Bounded from below: Theoretical constraints on quartic potential parameters (λ's) can arise from restricting the scalar potential in Eq. 2.1 not to produce large negative numbers for large field values (i.e. V > 0 ∀ Φ i → ±∞). The mixed quartic terms can be combined to form complete square terms and demanding their coefficients to be positive, leads to the following conditions 1 .
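A numerical check of vacuum stability can be sketched as below, using the widely quoted tree-level conditions for a two-doublet potential with an exact Z 2 symmetry. The conditions are written for a potential in which λ 1 and λ 2 multiply |Φ|⁴/2; this normalization is an assumption, and with the other common convention √(λ 1 λ 2 ) is replaced by 2√(λ 1 λ 2 ). The function name is illustrative.

```python
import math


def bounded_from_below(lam1, lam2, lam3, lam4, lam5):
    """Tree-level bounded-from-below check for the inert doublet potential.

    Assumes the quartic terms (lam1/2)|Phi1|^4 + (lam2/2)|Phi2|^4; for a
    potential written with lam1|Phi1|^4 the square root gets a factor of 2.
    """
    if lam1 <= 0.0 or lam2 <= 0.0:
        return False
    root = math.sqrt(lam1 * lam2)
    return (lam3 + root > 0.0) and (lam3 + lam4 - abs(lam5) + root > 0.0)


if __name__ == "__main__":
    # Hypothetical coupling values, for illustration only.
    print(bounded_from_below(0.26, 1.0, 1.0, -0.5, -0.5))   # stable
    print(bounded_from_below(0.26, 1.0, -2.0, -0.5, -0.5))  # unstable
```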
Due to the introduction of new scalars, multiple minima are possible. For the inert vacuum to be the global minimum, we restrict it from being charged by ensuring the condition,
Perturbativity and unitarity: We form the S-matrix from the amplitudes of the 2 → 2 scalar scattering processes, taking into account all the extra quartic terms in the scalar potential. The eigenvalues of the S-matrix turn out to be combinations of these couplings. The perturbative unitarity constraints on those eigenvalues are |Λ i | ≤ 8π, where the scattering matrix provides the combinations as

Collider constraints
Precision measurements at the LEP and the LHC, contributed in pinning down the trace of new physics effects in different forms. The effects of hierarchical heavy BSM scalar mass spectrum and the presence of a lighter DM are under consideration. After the discovery of the Higgs boson, the LHC also measured its properties. Two such measurements, the Higgs decay to γγ and its invisible decay are important to consider in the context of IDM.
Oblique correction constraints: The oblique parameters S, T and U , proposed by Peskin and Takeuchi [15], are different combinations of the oblique corrections i.e. radiative corrections to the two-point functions of the SM gauge bosons. The S parameter encodes the running of the neutral gauge boson two-point functions (Π ZZ , Π Zγ , Π γγ ) in the lower energy range, from zero momentum to the Z-pole. Therefore, the S parameter is sensitive to the presence of light particles with masses below m Z , which is the case here due to the presence of the light DM. The T parameter, on the other hand, measures the difference between the W W and the ZZ two-point functions, Π W W and Π ZZ , at zero momentum. Mass splitting of the scalars inside an SU(2) L doublet breaks the custodial symmetry, which modifies T . In the IDM, the mass splittings between the neutral and the charged scalars are therefore constrained by the T parameter. The experimentally measured values of oblique parameters that we use in our analysis are [16]:
h → γγ signal strength constraint: The signal strength for the h → γγ channel is given by the following ratio [17], In the IDM, the Higgs production rate is similar to that of the SM, as it is gluon fusion dominated in both models. So, in the IDM, the ratio can be approximated as The combined CMS and ATLAS fit in the diphoton channel provides a 2σ limit on this observable as [18], The presence of a charged Higgs in the h → γγ decay loop can induce a significant shift in this ratio for large values of the hH + H − coupling. In the IDM, this coupling depends on λ 3 , which is also related to the charged Higgs mass, and this parameter is constrained by the range over which the ratio R γγ is allowed to deviate from unity.
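The sensitivity of the oblique T parameter to the scalar mass splittings can be illustrated with the commonly quoted one-loop inert doublet contribution, T = [F(m² H± , m² A ) + F(m² H± , m² H ) − F(m² A , m² H )]/(16π²αv²), with F(x, y) = (x + y)/2 − xy ln(x/y)/(x − y). The sketch below assumes this expression and the value α ≈ 1/128. It is not taken from the paper's own formulas, and the function names are illustrative.

```python
import math

V_EW = 246.0            # electroweak VEV in GeV
ALPHA_EM = 1.0 / 128.0  # assumed fine-structure constant near the Z pole


def _f(x, y):
    """Loop function F(x, y) of squared masses; vanishes for x == y."""
    if abs(x - y) < 1e-12 * max(x, y):
        return 0.0
    return 0.5 * (x + y) - x * y / (x - y) * math.log(x / y)


def t_parameter_idm(m_h_dm, m_a, m_hpm, v=V_EW, alpha=ALPHA_EM):
    """Inert-doublet contribution to T for masses in GeV (assumed one-loop form)."""
    h2, a2, c2 = m_h_dm**2, m_a**2, m_hpm**2
    return (_f(c2, a2) + _f(c2, h2) - _f(a2, h2)) / (16.0 * math.pi**2 * alpha * v**2)


if __name__ == "__main__":
    # Hypothetical mass points, for illustration only.
    print(t_parameter_idm(60.0, 300.0, 300.0))  # degenerate A and H+-
    print(t_parameter_idm(60.0, 300.0, 350.0))  # 50 GeV splitting
```

The contribution vanishes when m A = m H ± , regardless of how light the DM is, and grows quickly once the two heavy states are split. This is the mechanism that pushes the heavy scalars towards degeneracy.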
Constraint from the Higgs invisible decay: Another constraint from the Higgs data, applicable for the scenario when Higgs can decay to a pair of DM particle i.e. m DM < m h /2. The invisible decay width is given by The latest ATLAS constraint on the invisible Higgs decay is [19] In case for a light DM when the Higgs decay to a pair of DM particles is kinematically allowed, this limit can significantly constrain the parameter space of IDM.
LEP bounds: A reinterpretation of the neutralino search results at LEP-II has ruled out the parameter region [20,21] that satisfies the following three conditions simultaneously:
$$m_H < 80~\mathrm{GeV}, \qquad m_A < 100~\mathrm{GeV}, \qquad m_A - m_H > 8~\mathrm{GeV}. \tag{3.9}$$
A reinterpretation of the chargino search results at LEP-II has put a bound [22] on the charged Higgs mass,
$$m_{H^\pm} > 70~\mathrm{GeV}. \tag{3.10}$$
A hierarchical IDM scalar spectrum is not restricted by these constraints. Moreover, due to the large mass gap in the spectrum, the off-shell decays Z → HA, W ± → HH ± and W ± → AH ± have a negligible effect on the total widths of the W and Z bosons, which are very precisely measured at the LEP experiments.
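The LEP-II exclusions quoted above translate directly into a simple filter on the scalar masses. The following minimal sketch applies them; the function name and the example mass points are illustrative only.

```python
def passes_lep_bounds(m_h_dm, m_a, m_hpm):
    """Return True if the mass point (GeV) survives the LEP-II bounds quoted above.

    A point is excluded by the reinterpreted neutralino search only if all three
    conditions m_H < 80, m_A < 100 and m_A - m_H > 8 hold at once; the
    reinterpreted chargino search requires m_H+- > 70.
    """
    neutral_excluded = (m_h_dm < 80.0) and (m_a < 100.0) and (m_a - m_h_dm > 8.0)
    charged_allowed = m_hpm > 70.0
    return (not neutral_excluded) and charged_allowed


if __name__ == "__main__":
    print(passes_lep_bounds(60.0, 300.0, 300.0))  # hierarchical spectrum
    print(passes_lep_bounds(60.0, 90.0, 300.0))   # light A, excluded
```

A hierarchical point with heavy A and H ± passes trivially, since the m A < 100 GeV condition is never met.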

Astrophysical constraints
This model contains a DM candidate, the CP-even scalar in Φ 2 . Therefore, astrophysical constraints on this model consist of the DM relic density and the direct probe of DM in Xenon and LUX experiments.
Relic density: There is compelling observational evidence for the presence of DM in the Universe, most recently from the Planck data, which suggest that DM currently comprises approximately 26% of the energy budget of the Universe. The observed abundance of DM is usually represented in terms of the density parameter Ω as [23]
$$\Omega_{\rm DM} h^2 = 0.1187 \pm 0.0017, \tag{3.11}$$
where h is defined through the observed Hubble constant, H 0 = 100 h km s −1 Mpc −1 . The DM relic density is inversely proportional to the rate of DM annihilation to SM particles, and therefore constraints are imposed to avoid overproduction of the relic in the IDM.
Direct detection constraints: Along with the constraints from the relic abundance measurement by Planck, there exist strict bounds on the DM-nucleon cross section from DM direct detection experiments like Xenon100 (Xenon1T) [24] and, more recently, LUX [25]. For the scalar DM considered in this work, the spin independent DM-nucleon scattering cross section mediated by the SM Higgs is given as [1]
$$\sigma_{\rm SI} = \frac{\lambda_L^2 f^2}{4\pi}\,\frac{\mu^2 m_n^2}{m_h^4\, m_{\rm DM}^2},$$
where µ = m n m DM /(m n + m DM ) is the DM-nucleon reduced mass and λ L = (λ 3 + λ 4 + λ 5 ) is the quartic coupling involved in the DM-Higgs interaction. A recent estimate of the Higgs-nucleon coupling is f = 0.32 [26], although the full range of allowed values is f = 0.26 − 0.63 [27]. As shown in Fig. 1 later, the Xenon1T upper bound on the DM-nucleon scattering puts a stringent limit on the allowed λ L values, which constrains the Higgs-DM coupling.
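To get a feeling for the size of this cross section, the Higgs-mediated expression can be evaluated numerically. The sketch below assumes the standard Higgs-portal form σ SI = λ L ² f ² µ² m n ² /(4π m h ⁴ m DM ²) in natural units and converts it to cm². The nucleon mass, the unit conversion constant and the function name are inputs chosen here for illustration.

```python
import math

GEV_M2_TO_CM2 = 3.894e-28  # 1 GeV^-2 expressed in cm^2
M_NUCLEON = 0.939          # GeV
M_HIGGS = 125.0            # GeV


def sigma_si_cm2(m_dm, lam_l, f=0.32):
    """Spin-independent DM-nucleon cross section in cm^2 (assumed Higgs-portal form).

    m_dm is the DM mass in GeV, lam_l the Higgs-portal coupling and f the
    Higgs-nucleon coupling (0.32 by default, as quoted in the text).
    """
    mu = M_NUCLEON * m_dm / (M_NUCLEON + m_dm)  # DM-nucleon reduced mass
    sigma_natural = (lam_l**2 * f**2 * mu**2 * M_NUCLEON**2
                     / (4.0 * math.pi * M_HIGGS**4 * m_dm**2))
    return sigma_natural * GEV_M2_TO_CM2


if __name__ == "__main__":
    # Hypothetical point: 60 GeV DM with lambda_L = 0.01.
    print(f"{sigma_si_cm2(60.0, 0.01):.2e} cm^2")
```

Since the cross section scales as λ L ², an experimental upper limit on σ SI at a given DM mass maps directly onto an upper limit on |λ L |.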

Possible searches and benchmarking
We describe four DM paradigms inside the IDM depending on the DM mass and the hierarchical or degenerate nature of the mass spectrum. In each scenario, we discuss the thermal DM relic abundance along with the phenomenological studies done to probe the BSM scalars. We also point out how sign-reversal of λ L can alter the relic density dependence on the DM mass.
Table 1. Illustration of four DM paradigms inside the IDM parameter space comparing DM mass as well as scalar mass hierarchy. Available DM density as a fraction of required relic density is also pointed out for these cases. We study the phenomenologically interesting but challenging region marked by Case II.
Case-I: We first consider a light DM with mass m DM ≲ 80 GeV, together with all other heavy scalars within a narrow mass range. This case is severely constrained by the LEP data, which rule out m DM < 45 GeV: precise LEP measurements of the Z-width constrain the Z → AH decay, together with the conditions in Eq. 3.9. Even for DM masses above 45 GeV, the degenerate nature of the spectrum ensures that all the inert scalars take part in co-annihilation processes and reduce the relic density to somewhat below 10% of the total relic. A sharp dip appears when the DM mass is m h /2, at the resonant peak of DM annihilation through the Higgs portal. Moreover, some shallower dips in the relic density are observed when the W W and ZZ annihilation modes open up, enhancing the annihilation cross section. In this low mass region, DM annihilation through the Higgs portal is the dominant contribution; it is unaffected by the sign of λ L , which therefore has no effect on the relic density. This scenario is explored at the LHC in the mono-jet signal, as discussed in Ref. [10].
Case-II: Here, we consider the light DM with hierarchical scalar mass spectrum i.e. large mass differences (∆M ) with other heavy scalars. Due to this large mass difference between H and A/H ± , the LEP Z-width measurements do not constrain such a low DM mass. DM annihilates only through the Higgs portal and therefore for tiny λ L , relic is overproduced. However, entire relic density can be described at larger λ L values which are progressively bounded from the DM direct detection data from LUX and Xenon1T. Contrasting this with the degenerate case as pointed out in 'Case-I', here the co-annihilation effect is absent in annihilation cross section and increases the relic density to produce full relic in the range m DM ∼ 53 − 70 GeV depending on different λ L values. Phenomenologically this parameter range is quite interesting although detection of such a very light DM along with much heavier other scalars are challenging at the collider. One has to encounter very small production cross section along with extremely large SM background where the signal characteristics are very background-like. The LHC potential of this case is studied in the dijet plus MET channel in Ref. [11]. Here, we take up this scenario for further analysis.
Case-III: If we move towards the heavier DM regime, degenerate mass spectrum can provide full relic density at around m DM ∼ 550 GeV 2 . Exact mass depends strongly on the value of λ L parameter. From a 10% relic for m DM ∼ 100 GeV, it steadily increases as the HH → W W, ZZ annihilations open up and the cross section decreases with mass. The HHV V coupling turns out to be λ HHV V ∼ (4m DM ∆M/v 2 + λ L ) in the limit DM and other heavy scalars are mass degenerate i.e. ∆M → 0. Even if the DM annihilation rate increases with the DM mass, that increase is strongly suppressed due to tiny mass differences between the different BSM scalars in a nearly degenerate mass spectrum. The DM relic density, along with being inversely dependent on annihilation cross section, also is directly proportional to the DM mass. Therefore, interplay of these two competing effects finally ends up in a gradual increase of the DM relic density. The quartic coupling essentially depends only on λ L in ∆M → 0, even then a λ L sign reversal does not affect the DM pair annihilation. This scenario is phenomenologically interesting but quite difficult to probe at the LHC. Challenging compressed scenario can be probed at the LHC with charged track signal of a long lived charged scalar [9].
Case-IV: In the heavier DM regime with hierarchical mass spectrum, the annihilation cross section increases with the DM mass, due to rapid increase of the DM-gauge boson quartic couplings with its mass (λ HHV V = 2(2m DM + ∆M )∆M/v 2 + λ L ), which is a result of large mass difference between the BSM scalars. This enhances the DM annihilation with m DM , which therefore, leads to decrease of the relic density with increasing m DM within a few percent of the full observed relic density. Here, λ L dependence is mostly overshadowed by large mass differences and does not affect much.
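The contrast between Case-III and Case-IV can be made concrete by evaluating the HHV V quartic coupling quoted in the text, λ HHV V = 2(2m DM + ∆M )∆M/v² + λ L , together with its ∆M → 0 form, 4m DM ∆M/v² + λ L . The following is a minimal sketch with illustrative function names and hypothetical parameter points.

```python
V_EW = 246.0  # electroweak VEV in GeV


def lambda_hhvv(m_dm, delta_m, lam_l, v=V_EW):
    """HHVV quartic coupling for a mass gap delta_m (GeV), as quoted in the text."""
    return 2.0 * (2.0 * m_dm + delta_m) * delta_m / v**2 + lam_l


def lambda_hhvv_degenerate(m_dm, delta_m, lam_l, v=V_EW):
    """Leading form of the same coupling in the nearly degenerate limit."""
    return 4.0 * m_dm * delta_m / v**2 + lam_l


if __name__ == "__main__":
    # Hypothetical points: compressed heavy DM vs. hierarchical heavy DM.
    print(lambda_hhvv(550.0, 1.0, 0.01), lambda_hhvv_degenerate(550.0, 1.0, 0.01))
    print(lambda_hhvv(550.0, 200.0, 0.01))
```

For a GeV-scale splitting the coupling stays close to λ L , while a splitting of a few hundred GeV drives it to values of order ten, which is why λ L barely matters in the hierarchical heavy-DM case.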
Among the four DM scenarios in the IDM described above and summarized in Table 1, two cases emerge as phenomenologically exciting. A light DM (m DM ∼ 50 − 80 GeV) with a hierarchical mass spectrum, i.e. a substantial mass gap (∆M ≳ 100 GeV) to the other heavy scalars, can provide the full observed DM relic density. On the other side, a rather heavy DM (m DM ∼ 550 GeV) with an extremely degenerate mass spectrum can also provide the required relic density. Both scenarios are challenging to probe, as the heavier BSM scalars are difficult to produce in the inert model and the signal confronts large irreducible SM backgrounds. We now focus on the low DM mass (50 GeV - 70 GeV) with a hierarchical mass spectrum, i.e. ∆M HA , ∆M H ± H ≳ 200 GeV, for our phenomenological study. To demonstrate the exact numerical evaluation, in Fig. 1 we explore the m DM − λ L parameter plane of the IDM for a light DM with ∆M = 100 GeV, applying the constraints from the DM relic density measurements, the DM direct detection experiments and the Higgs invisible decay. This choice of 100 GeV is representative, since the major annihilation mode for the DM is through the Higgs portal, and the parameter space of this plot is equally valid for larger ∆M choices. Blue (Red) dots are the points where the constraints from LUX (Xenon1T) put stringent upper bound on λ L , for all values of light DM. All other constraints described above provide weaker bounds in this parameter space. With our understanding of the allowed DM mass and λ L in the light DM scenario, we now attempt to constrain the remaining parameters. To do so, we set these two parameters to a particular choice from the allowed region of the relic density plot and then perform a scan over the remaining three parameters, comprising the heavy scalar masses (M H ± , M A ) and λ 2 .
One such frame of the allowed parameter space, after imposing the theoretical constraints (unitarity, perturbativity etc.) from section 3.1, is shown by the blue points of the scatter plot in Fig. 2. The red dots in the same plot represent the values of M H ± and M A which satisfy the oblique parameter constraints. The oblique parameters, mainly the T parameter, force the heavy scalar masses M H ± and M A to be almost degenerate. To study the specific low mass DM scenario within the IDM, we choose a set of seven benchmark points (BPs) from the allowed parameter space. These BPs, covering heavy scalar masses between 250 GeV and 550 GeV along with the corresponding input DM mass and λ parameters, are summarized in Table 2. It is worth noting that the choice of M H and λ L is made for theoretical and experimental consistency, but the collider analysis proposed in this paper holds equally well for all the allowed points in Fig. 1.

Collider analysis
We make use of various publicly available HEP packages for our subsequent collider study, aiming for a consistent and reliable detector-level analysis. We implement the IDM Lagrangian in FeynRules [29] to create the UFO [30] model files for the event generator MadGraph5 (v2.5.5) [31], which is used to generate all signal and background events. These events are generated at leading order (LO), and higher-order corrections are included by multiplying by appropriate QCD K-factors. We use CTEQ6L1 [32] parton distribution functions for event generation, with the default dynamical renormalization and factorization scales of MadGraph5 [33]. Events are passed through Pythia8 [34] for showering and hadronization and are matched with up to two to four additional jets, depending on the process, using the MLM matching scheme [35,36] with virtuality-ordered Pythia showers to remove double counting between matrix element partons and parton showers. The matching parameter, QCUT, is determined appropriately for each process as discussed in [37]. Detector effects are simulated using Delphes [38] with the default CMS card. Fatjets are reconstructed with the FastJet [39] package by clustering Delphes tower objects. We employ the Cambridge-Aachen (CA) [40] algorithm with radius parameter R = 0.8 for jet clustering. Each fatjet is required to have p T of at least 180 GeV. We use the adaptive Boosted Decision Tree (BDT) algorithm in the TMVA framework [41] for the MVA.
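The fatjet reconstruction step can be illustrated with a toy version of inclusive Cambridge-Aachen clustering, in which the pair of objects closest in ∆R is merged repeatedly until no pair is closer than R. The sketch below is a simplified pure-Python stand-in, not the FastJet implementation used in the analysis. It assumes massless inputs given as (p T , rapidity, φ) tuples and a p T -weighted recombination scheme instead of four-vector addition, and all names are illustrative.

```python
import math


def delta_phi(a, b):
    """Signed azimuthal difference a - b wrapped into (-pi, pi]."""
    d = (a - b) % (2.0 * math.pi)
    return d - 2.0 * math.pi if d > math.pi else d


def delta_r2(p, q):
    """Squared angular distance between two (pt, y, phi) objects."""
    return (p[1] - q[1]) ** 2 + delta_phi(p[2], q[2]) ** 2


def merge(p, q):
    """pT-weighted recombination (a simplification of the usual E-scheme)."""
    pt = p[0] + q[0]
    y = (p[0] * p[1] + q[0] * q[1]) / pt
    phi = (p[2] + (q[0] / pt) * delta_phi(q[2], p[2])) % (2.0 * math.pi)
    return (pt, y, phi)


def cluster_ca(particles, radius=0.8, pt_min=180.0):
    """Toy inclusive Cambridge-Aachen clustering; returns jets sorted by pT."""
    objs = list(particles)
    while len(objs) > 1:
        # Find the closest pair in Delta R (brute force, fine for a toy example).
        d, i, j = min((delta_r2(objs[a], objs[b]), a, b)
                      for a in range(len(objs)) for b in range(a + 1, len(objs)))
        if d >= radius ** 2:
            break  # no pair is closer than R: everything left is a final jet
        merged = merge(objs[i], objs[j])
        objs = [o for k, o in enumerate(objs) if k not in (i, j)] + [merged]
    return sorted((o for o in objs if o[0] >= pt_min), key=lambda o: o[0], reverse=True)


if __name__ == "__main__":
    # Hypothetical towers: two nearby hard deposits and one far-away soft one.
    towers = [(120.0, 0.0, 1.0), (100.0, 0.3, 1.2), (50.0, 2.0, 4.0)]
    print(cluster_ca(towers))
```

The brute-force pair search scales poorly with the number of inputs and is meant only to make the algorithm explicit; a dedicated package such as FastJet should be used for real events.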

Signal topology
As discussed in the introduction, the hierarchical mass pattern in the IDM scalar sector (i.e. M A ∼ M H ± ≫ M H ) provides interesting final states. Once a pair of heavy scalars (or one heavy scalar in association with the DM candidate) is produced at the LHC, they decay dominantly to two (or one) boosted vector bosons, each of which decays hadronically and thus produces a V -jet (J V ), where V = {W, Z}. These boosted V -jets are always associated with large MET ( / E T ), an outcome of our inability to detect the DM pair in the detector. Representative Feynman diagrams of these signal topologies are shown in Fig. 3. The 1J V + / E T channel, although larger in cross section than 2J V + / E T , has less sensitivity at the LHC due to the overwhelmingly large SM background. Therefore, we primarily focus on the 2J V + / E T channel, where the large background can be tamed by employing jet substructure variables in an MVA framework. Before moving on to the actual analysis, we give some useful details about these two signal topologies.
2J V + / E T channel: This final state can arise in the IDM for the aforementioned benchmarks from the following three different channels.
Table 3. Production cross sections for the signal processes that contribute to the 1J V + / E T and 2J V + / E T final states at the 13 TeV LHC. These numbers are for pp → xy level before the decay of IDM scalars.
Here, A and H ± decay to ZH and W ± H, respectively. As Z and W are originating from a heavy resonance, it is possible that they have sufficient boost to be reconstructed in a large radius jet. We do not distinguish a Z-jet or a W -jet and call them as V -jet as we always select fatjets with a broad mass range. A V -jet possesses two prong substructure i.e. energy will be centered around two subjet axes. We utilize the N -subjettiness ratio τ 21 (defined later) to tag V -jets.
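For readers unfamiliar with the variable, the N-subjettiness used to tag two-prong V -jets is commonly defined as τ N = Σ k p T,k min(∆R 1,k , ..., ∆R N,k ) / (R 0 Σ k p T,k ), where the sum runs over the jet constituents and ∆R a,k is the distance to the a-th candidate subjet axis. The sketch below assumes this standard definition and takes the axes as given inputs (in practice they come from a reclustering or minimization step, which is not shown). All names are illustrative.

```python
import math


def delta_r(y1, phi1, y2, phi2):
    """Angular distance in the rapidity-azimuth plane, with phi wrap-around."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(y1 - y2, dphi)


def tau_n(constituents, axes, r0=0.8):
    """N-subjettiness for constituents [(pt, y, phi), ...] and N axes [(y, phi), ...]."""
    norm = r0 * sum(pt for pt, _, _ in constituents)
    weighted = sum(pt * min(delta_r(y, phi, ay, aphi) for ay, aphi in axes)
                   for pt, y, phi in constituents)
    return weighted / norm


def tau21(constituents, one_axis, two_axes, r0=0.8):
    """Ratio tau_2 / tau_1; small values indicate a two-prong jet (needs tau_1 > 0)."""
    return tau_n(constituents, two_axes, r0) / tau_n(constituents, one_axis, r0)


if __name__ == "__main__":
    # Hypothetical two-prong jet: two hard deposits plus one soft particle between them.
    jet = [(100.0, 0.0, 0.0), (100.0, 0.0, 0.4), (1.0, 0.0, 0.2)]
    print(tau21(jet, [(0.0, 0.2)], [(0.0, 0.0), (0.0, 0.4)]))
```

A jet whose energy sits around two axes gives τ 2 much smaller than τ 1 and hence a small τ 21 , whereas a typical single-prong QCD jet has τ 21 closer to one.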
1J V + / E T channel: This final state can arise from the following two different channels.
Extra jets can arise in the final state from initial state radiation (ISR) and can form another fatjet, so these channels can mimic the 2J V + / E T final state. We generate matched samples of this signal with up to two additional jets in the final state. In this topology, only one of the two fatjets has a V -jet-like structure; the other originates from QCD radiation that mimics the fatjet characteristics. We find that the contributions to the 2J V + / E T final state from the 1J V + / E T topologies are quite significant, and sometimes larger than the 2J V + / E T contribution itself after our final selection. This is mainly due to the larger production cross sections of the pp → AH, H ± H processes and to the tail events that satisfy the fatjet criteria of our analysis (see footnote 3).
The leading-order production cross sections for the signal processes discussed above are given in Table 3 for the different BPs. We have used NLO QCD K-factors of 1.27 and 1.50 for the qq- and gg-initiated signal production, respectively [42].

Footnote 3: The motivation for choosing the 2J V + / E T channel is that it offers more features than the 1J V + / E T channel for handling the enormous background; the 1J V + / E T topology also contributes to the 2J V + / E T signal when an extra QCD jet mimics a fatjet. The 1J V + / E T channel is explored in the searches of Refs. [12,13].
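The way an LO cross section, a K-factor and an integrated luminosity combine into an expected event yield can be sketched as follows. This is only an illustration: the function name and the 10 fb cross section are hypothetical placeholders, not values from Table 3; only the K-factors 1.27 and 1.50 are taken from the text.

```python
# NLO QCD K-factors quoted in the text for qq- and gg-initiated production.
K_FACTOR = {"qq": 1.27, "gg": 1.50}


def expected_events(sigma_lo_fb, initial_state, lumi_fb_inv, efficiency=1.0):
    """Expected yield N = sigma_LO * K * L * efficiency.

    sigma_lo_fb   : LO cross section in fb (hypothetical input)
    initial_state : "qq" or "gg", selects the K-factor
    lumi_fb_inv   : integrated luminosity in fb^-1
    efficiency    : overall selection efficiency (1.0 = no cuts)
    """
    return sigma_lo_fb * K_FACTOR[initial_state] * lumi_fb_inv * efficiency


# Placeholder example: a 10 fb qq-initiated process at 3000 fb^-1.
n_example = expected_events(10.0, "qq", 3000.0)
```

Multiplying by a cut efficiency read off a cut-flow table gives the yield after selection.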

Backgrounds
For our signal topologies, major backgrounds come from the following SM processes which we discuss briefly below.
V + jets: The following two types of mono-vector-boson processes contribute dominantly to the background.
• Z + jets: This is the most dominant background in our case. We generate the event samples by simulating the inclusive pp → Z + jets → νν + jets process matched up to four extra partons. Here, the invisible decay of the Z gives rise to a large amount of / E T and QCD jets mimic fatjets.
• W + jets: This process also contributes significantly to the background when the W decays leptonically and the lepton does not satisfy the selection criteria. This is often known as the lost-lepton background. The neutrino from the W decay contributes to the missing energy and QCD jets mimic fatjets. We generate the event samples by simulating the inclusive pp → W + jets → (e,µ) ν + jets process matched up to four extra partons.
In order to get statistically significant background events coming from the tail phase space region with large / E T , we apply a hard cut of / E T > 100 GeV at the generation level to generate these background events.
V V + jets: Different diboson processes, namely W Z, W W and ZZ, also mimic the signal and contribute to the SM background. The pp → W Z process contributes most significantly among these three diboson channels, when the W decays hadronically and the Z decays invisibly. We refer to this background as W h Z ν . Similarly, W h W , where one W decays hadronically and the other leptonically, and Z h Z (a hadronic Z and a leptonic Z) can also contribute to the SM background when the leptons remain unidentified. All the diboson processes are generated with up to two extra jets with MLM matching. In this case, one of the fatjets can come from the hadronic decay of V and the other from the hard partons.
Single top: Single top production in the SM includes three types of processes, viz. top production in association with a W (i.e. the pp → tW process), the s-channel single top process (i.e. pp → tb) and the t-channel single top process (i.e. pp → tj). Among these, the associated production tW contributes significantly to the SM background for our signal topologies.
tt + jets: This can be a background for our signal topologies when it decays semileptonically, i.e. one of the tops decays leptonically and the other hadronically. This background contains b-jets, and we control it by applying a b-veto. It always has one V -jet. Another fatjet can originate from an untagged b-jet or from QCD radiation.
Apart from the above background processes, we also calculate the contributions from triboson and QCD multijet processes. However, these contributions are found to be insignificant compared to the backgrounds discussed above, and we therefore neglect them in the analysis. The production cross sections, with higher-order QCD corrections, for all the background processes considered in this analysis at the 13 TeV LHC are listed in Table 4.

Table 4. Cross sections for the background processes considered in this analysis at the 13 TeV LHC. These numbers are shown with the QCD correction order provided in brackets.

Cut-based analysis
We perform a CBA to estimate the sensitivity of observing the IDM signatures at the high luminosity LHC runs. It is evident that the signal cross sections are too small compared to the large SM background. Therefore, one needs sophisticated kinematic observables for the isolation of signal events from the background events. Our signal processes always include at least a hadronically decaying vector boson that can provide a V -like fatjet. Therefore, we make use of the jet substructure variables for our purpose.

V -jet tagging: jet substructure observables
Jet substructure observables have emerged as a powerful tool to search for new physics signatures at colliders. In our case, boosted W and Z bosons originating from the decay of the heavy IDM scalars (H ± , A) give rise to collimated decay products that can be captured in a single large-radius jet (fatjet). These fatjets have a two-prong substructure. We utilize two jet substructure observables, viz. the jet mass (M J ) and the N -subjettiness ratio (τ 21 ). The jet mass M J is a useful observable for separating V -jets from fatjets that originate from QCD jets. We calculate the jet mass from $M_J^2 = \left(\sum_{i \in J} P_i\right)^2$, where P i are the four-vectors of the energy deposits in the calorimeter. The discriminating power of M J is reduced if the jet picks up extra contributions from partons that do not originate from the V decay, which broadens the peak in the M J distribution. To remove this soft and wide-angle radiation, different jet grooming techniques have been proposed, such as trimming, pruning and filtering [46][47][48][49]. We choose pruning for grooming the fatjets.
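As a minimal sketch of the jet-mass definition above, the invariant mass can be computed by summing the constituent four-vectors. The function name and the (E, px, py, pz) tuple layout are assumptions made for this illustration; in the analysis itself the constituents are Delphes towers clustered with FastJet.

```python
import math


def jet_mass(constituents):
    """Invariant mass of a jet from constituent four-vectors.

    constituents: iterable of (E, px, py, pz) tuples in GeV (assumed layout).
    Returns sqrt(E^2 - |p|^2) of the summed four-vector, clipped at zero
    to protect against small negative values from rounding.
    """
    e = px = py = pz = 0.0
    for ce, cx, cy, cz in constituents:
        e += ce
        px += cx
        py += cy
        pz += cz
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))


# Two massless constituents, back to back: the pair has mass 100 GeV.
pair = [(50.0, 50.0, 0.0, 0.0), (50.0, -50.0, 0.0, 0.0)]
m_pair = jet_mass(pair)
```

A single massless constituent gives zero mass, which is why a one-prong QCD jet tends to populate the low-mass region unless it acquires mass through additional radiation.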

Pruned jet mass:
We perform pruning with the standard method prescribed in Refs. [47,48], which cleans soft and wide-angle emissions by rerunning the clustering algorithm and vetoing such recombinations. At each recombination step, one calculates two variables, z and ∆R ij , where z = min(P T i , P T j )/P T i+j and ∆R ij is the angular separation between the two proto-jets. If z < z cut and ∆R ij > R f act , the i-th and j-th proto-jets are not recombined and the softer one is discarded. Here, z cut and R f act are the parameters of the pruning algorithm. We take the default values R f act = 0.5 and z cut = 0.1, as suggested in Ref. [47]. In Fig. 4, we show the pruned jet mass distributions for the signal (BP3) and the important backgrounds. For the signal, the peak around 80-90 GeV reflects the V mass, whereas for most of the background processes the peak below 20 GeV reflects fatjets that arise from a single-prong hard QCD jet.
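The veto applied at each recombination step can be sketched as below. This follows the condition exactly as worded in the text (z < z cut and ∆R ij > R f act); the function names and argument layout are hypothetical, and a real analysis would use the pruning tool shipped with FastJet rather than this standalone check.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation in the (eta, phi) plane, with phi wrapped to [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)


def is_pruned(pt_i, pt_j, pt_sum, dr_ij, z_cut=0.1, r_fact=0.5):
    """Return True if the recombination of proto-jets i and j is vetoed.

    pt_sum is the transverse momentum of the combined proto-jet (i+j).
    When the veto fires, the softer proto-jet is discarded by the caller.
    """
    z = min(pt_i, pt_j) / pt_sum
    return z < z_cut and dr_ij > r_fact


# A soft (5 GeV), wide-angle (dR = 0.6) emission off a 100 GeV proto-jet is vetoed.
vetoed = is_pruned(5.0, 100.0, 104.0, 0.6)
```

Both conditions must hold for the veto: a soft but collinear emission, or a wide-angle but hard splitting, is still recombined.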

N -subjettiness ratio:
N -subjettiness is a jet shape variable that quantifies how well a jet is described by N subjets. It is defined as the P T -weighted angular separation of the jet constituents from their nearest subjet axis and can be calculated as [50,51]
$$\tau_N = \frac{1}{N_0} \sum_i p_{i,T} \, \min\left\{\Delta R_{i,1}, \Delta R_{i,2}, \ldots, \Delta R_{i,N}\right\}.$$
Here, i runs over the constituent particles inside the jet, p i,T is the respective transverse momentum, and ∆R i,k is the angular separation between constituent i and the k-th subjet axis. The normalization factor is defined as $N_0 = \sum_i p_{i,T} R$ for a jet of radius R. In Fig. 5, the distributions of the N -subjettiness ratio τ 21 = τ 2 /τ 1 for the signal (BP3) and the leading backgrounds are shown. The value of τ 21 is smaller for fatjets from the signal than for those from the background. In general, τ N is close to zero when the jet is well described by N or fewer subjets, so a small τ 21 indicates a two-prong structure.
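A minimal sketch of the N -subjettiness definition, assuming the subjet axes are already known, is given below. How the axes are found (for example by exclusive clustering or by minimization) is not specified here and is left to the caller; the function names and the (pt, eta, phi) tuple layout are illustrative assumptions.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation in the (eta, phi) plane, with phi wrapped to [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)


def tau_n(constituents, axes, radius=0.8):
    """N-subjettiness for given subjet axes.

    constituents: list of (pt, eta, phi) tuples
    axes        : list of N (eta, phi) subjet axes (assumed precomputed)
    radius      : jet radius R entering the normalization N_0 = sum_i pt_i * R
    """
    n0 = radius * sum(pt for pt, _, _ in constituents)
    total = 0.0
    for pt, eta, phi in constituents:
        total += pt * min(delta_r(eta, phi, a_eta, a_phi) for a_eta, a_phi in axes)
    return total / n0


# Toy two-prong jet: two hard constituents plus one soft one between them.
toy_jet = [(100.0, 0.2, 0.0), (100.0, -0.2, 0.0), (10.0, 0.0, 0.1)]
tau1 = tau_n(toy_jet, [(0.0, 0.0)])
tau2 = tau_n(toy_jet, [(0.2, 0.0), (-0.2, 0.0)])
tau21 = tau2 / tau1
```

For this toy jet τ 2 is much smaller than τ 1 because the two axes absorb the hard prongs, so τ 21 falls well below the 0.35 threshold used in the final selection.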

Event selection
We list our baseline selection criteria to select events for further analysis.
Baseline selection criteria:
• Events are selected with missing transverse energy / E T > 100 GeV.
• We demand at least two fatjets of radius parameter R = 0.8, constructed using the CA algorithm, with fatjet transverse momentum P T (J) > 180 GeV.
• We apply a lepton veto: events are rejected if they contain a lepton with transverse momentum P T (ℓ) > 10 GeV and pseudorapidity |η(ℓ)| < 2.4.
• We further demand that the azimuthal separation between the fatjets and / E T satisfies |∆φ(J, / E T )| > 0.2. This minimizes the effect of jet mismeasurement contributing to / E T .
Events that satisfy the baseline selection are then subjected to the following final selection criteria.
Final selection criteria:
• After optimization with signal and background, the minimum / E T requirement is raised from 100 GeV to 200 GeV.
• In order to reduce the huge background coming from tt + jets, we apply a b-veto with the p T -dependent b-tagging efficiency implemented in Delphes. Here, jets are formed using the anti-k t algorithm with radius parameter R = 0.5.
• We demand that the pruned jet masses of the leading and sub-leading fatjets lie in the range 65 GeV < M J i < 105 GeV to tag J V candidates.
• To further discriminate the fatjet J V from QCD jets, we exploit the two-prong nature of the fatjet using N -subjettiness and select events with τ 21 (J i ) < 0.35, computed on the unpruned fatjet.
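The baseline and final selections listed above can be sketched as two predicate functions. The event layout (a dictionary with keys such as `met`, `fatjets`, `leptons` and `n_bjets`) is hypothetical, the fatjets are assumed to be sorted by decreasing P T , and applying the ∆φ requirement to the two leading fatjets is an assumption of this sketch.

```python
def passes_baseline(ev):
    """Baseline selection: MET, two hard fatjets, lepton veto, dphi(J, MET)."""
    if ev["met"] <= 100.0:
        return False
    fatjets = [j for j in ev["fatjets"] if j["pt"] > 180.0]
    if len(fatjets) < 2:
        return False
    # Lepton veto: reject events with any lepton of pT > 10 GeV and |eta| < 2.4.
    if any(l["pt"] > 10.0 and abs(l["eta"]) < 2.4 for l in ev["leptons"]):
        return False
    # Assumption: the dphi requirement is applied to the two leading fatjets.
    return all(abs(j["dphi_met"]) > 0.2 for j in fatjets[:2])


def passes_final(ev):
    """Final selection applied on top of the baseline selection."""
    if not passes_baseline(ev):
        return False
    if ev["met"] <= 200.0 or ev["n_bjets"] > 0:
        return False
    for j in ev["fatjets"][:2]:
        if not 65.0 < j["pruned_mass"] < 105.0:
            return False
        if not j["tau21"] < 0.35:  # tau21 computed on the unpruned fatjet
            return False
    return True


# Hypothetical signal-like event used only to exercise the functions.
example = {
    "met": 250.0,
    "n_bjets": 0,
    "leptons": [],
    "fatjets": [
        {"pt": 300.0, "dphi_met": 2.5, "pruned_mass": 82.0, "tau21": 0.20},
        {"pt": 200.0, "dphi_met": -1.0, "pruned_mass": 90.0, "tau21": 0.30},
    ],
}
```

Keeping the two stages separate mirrors the text: the baseline predicate alone defines the sample on which further optimization is done.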

In Table 5, we present the cut-flow for the signal (BP3), with the cut efficiencies and the number of events for an integrated luminosity of 3000 fb −1 at the 13 TeV LHC. Similarly, Table 6 presents the cut-flow for the different backgrounds. These numbers show that τ 21 and M J are powerful variables that provide a large background reduction with good signal acceptance. We can further infer from Table 5 that, in spite of the quite low efficiencies of the AH and H ± H channels to satisfy the 2J V + / E T criteria, they give comparable contributions to the signal because of their large production cross sections.
We compute the statistical signal significance using $S = N_S/\sqrt{N_S + N_B}$, where N S and N B are the numbers of signal and background events remaining after all the cuts. We show the statistical significance for the different benchmark points in Table 7. The highest significance is found for BP3. We would like to emphasize that, even with jet substructure techniques, this particular region of parameter space is very challenging to probe with high sensitivity at the HL-LHC. In order to optimize our search further, we use an MVA with jet substructure variables.
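The significance formula above is simple enough to evaluate directly. The sketch below uses placeholder yields chosen for round numbers; they are not taken from Table 7.

```python
import math


def significance(n_s, n_b):
    """Statistical significance S = N_S / sqrt(N_S + N_B)."""
    total = n_s + n_b
    if total <= 0.0:
        return 0.0  # no events at all: define the significance as zero
    return n_s / math.sqrt(total)


# Placeholder yields: 100 signal events on top of 2400 background events.
s_example = significance(100.0, 2400.0)
```

With these placeholder yields the result is 2, which illustrates how a background about 24 times larger than the signal keeps the significance low even with a sizable signal count.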

Multivariate analysis
In the previous section, we presented the reach of our model using a CBA. Although we have not achieved the discovery significance of 5σ for any of our benchmark points, we see that the two variables M J and τ 21 are very powerful in separating the tiny signal from the large SM background. In this section, we use an MVA to achieve better sensitivity than the CBA. Two points are worth discussing here. First, we observe that the MVA does not perform well on events selected with the baseline cuts alone, since the signal is tiny compared to the overwhelmingly large background. Therefore, in addition to the baseline selection, we apply a cut on the hardest fatjet mass, M J 0 > 40 GeV, and a b-veto on jets to trim down the large background before passing events to the MVA. These cuts drastically reduce the background while retaining most of the signal, and are chosen to be neither too close to nor too relaxed compared to the cuts used in the CBA. If the extra cuts for the MVA are too close to those of the CBA, the MVA will not improve the sensitivity. On the other hand, if they are too relaxed, the performance of the MVA will degrade because the background becomes too large. Second, although we select events with two high-p T fatjets, we only demand that the jet mass of the leading-p T fatjet be greater than 40 GeV. This passes a large fraction of the mono-fatjet signal events along with the di-fatjet ones. It therefore increases the signal, but it also increases the background.

Table 8. Number of signal and background events at the 13 TeV LHC with 3000 fb −1 integrated luminosity. These numbers are obtained by applying M J 0 > 40 GeV and a b-jet veto in addition to the baseline cuts defined in the text.
In Table 8, we show the number of signal (1J V and 2J V categories) and background events at the 13 TeV LHC with 3000 fb −1 integrated luminosity. Observe that, although we demand two fatjets in our selection, the number of 1J V events contributing to the signal is always larger than the 2J V contribution for all BPs. This is because the cross sections for the 1J V topologies are much larger than those for the 2J V topologies, and a significant fraction of 1J V events pass the selection cuts. Therefore, it is necessary to design a hybrid selection, stricter than a 1J V selection but looser than a 2J V one, to which both the 1J V and 2J V topologies contribute. Our selection cuts are designed with this in mind to achieve better sensitivity.
For our MVA, we use the adaptive BDT algorithm. We obtain two statistically independent event samples for the signal as well as for the background by splitting each data set randomly, with 50% used for training and the rest for testing. Note that multiple processes contribute to the signal, and similarly to the background. In the MVA, we construct the signal class by combining the 1J V and 2J V topologies that pass our MVA selection criteria. These different signal samples are generated separately at LO and then mixed according to their proper weights to obtain the kinematical distributions for the combined signal. Similarly, all the different background samples are mixed to obtain the corresponding distributions for the background class.
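The sample mixing and the 50/50 split described above can be sketched as follows. The per-event weight is assumed here to be σ × K × L / N generated , which is a common convention but is not spelled out in the text; the function names and the numbers in the example are hypothetical.

```python
import random


def event_weight(sigma_fb, k_factor, lumi_fb_inv, n_generated):
    """Per-event weight so that a sample of n_generated events sums to sigma*K*L.

    Assumption: this normalization is what 'proper weights' refers to.
    """
    return sigma_fb * k_factor * lumi_fb_inv / n_generated


def split_half(events, seed=0):
    """Randomly split a list of events into equal training and testing halves."""
    shuffled = list(events)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Placeholder sample: 2 fb at LO, K = 1.27, 3000 fb^-1, 100000 generated events.
w_example = event_weight(2.0, 1.27, 3000.0, 100000)
train, test = split_half(range(10), seed=1)
```

Each process in a class carries its own weight, so a process with a large cross section but few generated events contributes few, heavily weighted events to the combined distributions.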
The final set of variables used in the MVA is chosen from a larger set of kinematic variables according to their power of discrimination between the signal and background classes. The four substructure variables of the two fatjets, i.e. M J 0,1 and τ 21 (J 0,1 ), have already proved to be very important discriminators in our CBA. Stronger transverse momentum cuts on these jets help these variables classify jets correctly. We already require reasonably high P T for both jets. However, in constructing the hybrid selection, P T (J 0 ) can still play a role in determining the purity of the hardest fatjet J 0 . We also include the relative separation between the fatjets, ∆R(J 0 , J 1 ), and the azimuthal separation between the leading fatjet and the missing transverse momentum direction, ∆φ(J 0 , / E T ). The scale of new physics is relatively high, and it is typically captured by topology-independent inclusive variables such as H T , / H T , / E T , etc. We utilize the global inclusive variable Ŝ min , proposed to determine the mass scale of new physics for events with invisible particles such as ours [52][53][54]. This variable, constructed out of all reconstructed objects in the detector, demonstrates better efficiency than its inclusive counterparts. For example, we do not use / E T as a feature after the baseline cut, since it shows a high correlation with Ŝ min and turns out to be less important.
In Fig. 6, we show the normalized distributions of all eight input variables used in the MVA. The signal distributions are obtained for BP3, including the 1J V and 2J V topologies, and the background includes all the dominant backgrounds discussed in section 5.2 for the 13 TeV LHC. For the same benchmark scenario, the method-unspecific relative importance of the kinematic variables, as provided by TMVA, is presented in Fig. 7. Relative importance is a measure used to rank the variables in the MVA: a variable with greater relative importance has better discriminating power. For this particular benchmark point, BP3, the M J 0,1 variables are very good discriminators according to their relative importance. The N -subjettiness variables, τ 21 (J 0,1 ), are also very good discriminators, as expected. Note that the relative importance can change for different benchmark points, different LHC energies, etc., since these can change the shapes of the variables. We mostly keep variables that are weakly correlated (or anti-correlated) for both the signal and the background. The linear correlation matrices for the signal and the background are shown in Fig. 8. Observe that the M J 1 and τ 21 (J 1 ) variables are strongly anti-correlated. This correlation arises from the mixture of the 1J V and 2J V topologies in the signal. However, we keep both variables in the MVA, since both are very powerful discriminators for the 2J V topology.
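Each entry of a linear correlation matrix such as the one in Fig. 8 is a Pearson correlation coefficient between two input variables. A minimal, unweighted sketch is shown below; the function name and the toy values are hypothetical and carry no physical meaning.

```python
import math


def linear_correlation(x, y):
    """Pearson linear correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy values: as the first variable rises, the second falls linearly.
toy_mass = [20.0, 40.0, 60.0, 80.0]
toy_tau21 = [0.8, 0.6, 0.4, 0.2]
rho = linear_correlation(toy_mass, toy_tau21)
```

A coefficient near -1, as in this toy case, corresponds to the strong anti-correlation noted for M J 1 and τ 21 (J 1 ); a value near zero would indicate that the two variables carry largely independent linear information.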
Since the BDT algorithm is prone to overtraining, one should be careful while using it. Overtraining usually arises during training from inappropriate choices of the BDT-specific parameters. One can guard against it by checking the Kolmogorov-Smirnov probability between the training and test responses. We train the algorithm separately for every benchmark point and ensure that it is not overtrained in our analysis. Note that the set of eight variables used in our analysis may not be the optimal one, and there is always scope for improving the analysis by choosing a better set of variables. However, since the variables we use in the MVA are very good discriminators, the sensitivities we obtain are fairly robust.
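The idea behind the Kolmogorov-Smirnov check can be sketched with the two-sample KS statistic, i.e. the largest distance between the empirical cumulative distributions of the training and test responses. This sketch is unweighted and returns only the statistic, not the probability that TMVA reports, so it is an illustration of the principle rather than a reproduction of the TMVA output.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max |F_a(x) - F_b(x)|."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    n_a, n_b = len(a), len(b)
    i = j = 0
    d_max = 0.0
    while i < n_a and j < n_b:
        x = min(a[i], b[j])
        # Advance both empirical CDFs past all entries equal to x.
        while i < n_a and a[i] <= x:
            i += 1
        while j < n_b and b[j] <= x:
            j += 1
        d_max = max(d_max, abs(i / n_a - j / n_b))
    return d_max


# Toy BDT responses: identical shapes give 0, disjoint shapes give 1.
d_same = ks_statistic([0.1, 0.4, 0.7], [0.1, 0.4, 0.7])
d_disjoint = ks_statistic([0.1, 0.2, 0.3], [0.6, 0.7, 0.8])
```

A small statistic for both the signal and the background class indicates that the classifier responds to unseen events as it does to the training events; a large value is a symptom of overtraining.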
In Fig. 9, we show the normalized BDT response for the signal and the background (training and test samples for both classes) for BP3. One can clearly see that the BDT responses for the signal and background classes are well separated. We apply a cut on the BDT response, i.e. BDT res > BDT cut , and show the corresponding cut efficiencies for the signal (blue) and the background (red), and the significance (green), as functions of BDT cut . The significance is computed using the formula $\sigma = N_S/\sqrt{N_S + N_B}$, where N S and N B are the numbers of signal and background events that survive the BDT res > BDT cut requirement for a given integrated luminosity. The optimal BDT cut, BDT opt , is the cut for which the significance is maximized. In Table 9, we show N S , N B and σ for the different BPs at the 13 TeV LHC, for an integrated luminosity of 3000 fb −1 . We also show this significance as a function of M H ± ,A in Fig. 10 (red curve), whereas the blue curve represents the luminosity required for the 2σ exclusion of the different BPs.
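The scan over BDT cut that defines BDT opt can be sketched as a simple loop. The function name, the use of a single weight per class and the toy scores are assumptions of this illustration; in practice each event carries its own weight and the scan is performed by TMVA.

```python
import math


def best_bdt_cut(sig_scores, bkg_scores, sig_weight, bkg_weight, cuts):
    """Scan candidate cuts and return (cut, significance) maximizing
    N_S / sqrt(N_S + N_B), where N_S and N_B count weighted events
    with BDT response above the cut."""
    best_cut, best_sig = None, 0.0
    for cut in cuts:
        n_s = sig_weight * sum(1 for s in sig_scores if s > cut)
        n_b = bkg_weight * sum(1 for s in bkg_scores if s > cut)
        if n_s + n_b <= 0.0:
            continue  # nothing survives this cut
        z = n_s / math.sqrt(n_s + n_b)
        if z > best_sig:
            best_cut, best_sig = cut, z
    return best_cut, best_sig


# Toy BDT responses with unit weights, for illustration only.
toy_sig = [0.1, 0.6, 0.8, 0.9]
toy_bkg = [-0.5, -0.2, 0.0, 0.7]
cut_opt, z_opt = best_bdt_cut(toy_sig, toy_bkg, 1.0, 1.0, [-1.0, 0.5, 0.75])
```

In the toy scan the intermediate cut wins: the loosest cut keeps too much background and the tightest cut discards too much signal, which is the same trade-off that produces the peak of the green curve.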

Results and Discussion
The IDM is a simple theoretical framework with rich phenomenology that provides possible DM candidates. We classify the model space into four categories depending on the masses of the scalars in the model, as summarized in Table 1. Some of them are quite interesting in view of the observed properties of the Z boson, the Higgs and the DM, while also satisfying all the available theoretical constraints and those from low energy experiments. All such constraints on the IDM are critically analyzed to establish that a hierarchical BSM spectrum with a light DM (m DM ≲ 80 GeV) provides an appealing scenario, as it accounts for the full observed relic density. Furthermore, additional constraints from the Higgs invisible decay and the DM direct detection limits leave only a small allowed region of parameter space to be explored at the LHC, and this region is rather difficult to probe. Exploiting the fact that, after production, a heavy BSM scalar essentially decays into a boosted vector boson together with a light DM candidate, we propose a search strategy based on two boosted fatjets with large MET. The hadronic decays of such boosted vector bosons carry a distinctive substructure, characteristically different from that of single-prong large-radius QCD jets, and can be distinguished with moderate efficiency using jet substructure observables.

Table 9. The total number of signal events before the BDT opt cut is denoted by N bc S (including the 1J V and 2J V topologies, as shown in Table 8), and the number of background events before the cut by N SM . The numbers of signal and background events after the BDT opt cut are denoted by N S and N B , respectively.
It turns out that our boosted 2J V + / E T signal also receives significant contributions from the production of a single heavy scalar in association with the light DM, where the second J V is mimicked by a QCD jet, especially since the latter production rate is roughly one order of magnitude higher than that of the two-J V processes. In effect, the di-fatjet signal, after our selection cuts, is a hybrid of the di-J V and mono-J V signals. The background corresponding to the mono-J V channel is also very large and contributes to the overall background. The V +jets SM processes are the dominant backgrounds to the above signal, and the sheer magnitude of these backgrounds, of order ∼ 1000 pb, makes it very difficult to search for the BSM scalars of the IDM in any channel. We use jet substructure variables such as the fatjet mass (M J ) and the N -subjettiness ratio (τ 21 ), which encode the internal structure of the fatjets.
Even with these variables, it is extremely difficult to overcome the huge background, and the best-case discovery potential of the cut-based analysis remains restricted to less than 3σ. Cuts on these variables, as detailed in Tables 5 and 6, bring the background down to less than the 1% level of the generated events, while simultaneously bringing the signal down to 10-20%. In the end, the cut-based analysis does not reach the desired 5σ threshold for discovery. The best LHC sensitivity is obtained for BP3, with m H ± ≈ m A ∼ 350 GeV, and the significance decreases on both sides of this mass. With increasing m H ± , m A , the decaying vector bosons receive a higher boost, resulting in better discrimination power of the jet substructure variables. On the other hand, heavier particles lead to a suppressed signal cross section. Therefore, the best signal-to-background sensitivity is obtained only in an intermediate mass range.
To improve the LHC discovery potential, we undertake an MVA in which a boosted decision tree is trained on a total of eight kinematic variables to provide the optimum separation between signal and background. Instead of the rectangular cuts used in the CBA, the MVA can use the full potential of the jet substructure variables to study the full hierarchical parameter space of the IDM that is allowed after imposing all the theoretical and experimental constraints. The LHC sensitivity is improved to 5.6σ for BP3 using the MVA. Hence, much of the parameter space of a well-motivated scenario within the IDM framework, which provides a hierarchical BSM spectrum with a light DM (m DM ≲ 80 GeV) along with an almost degenerate heavy charged Higgs and pseudoscalar A within the mass range 250-700 GeV, can be excluded with 1300 fb −1 of integrated luminosity at the 13 TeV LHC.
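Because both N S and N B grow linearly with integrated luminosity, the significance N S / √(N S + N B ) grows as its square root. Under that assumption alone (statistical uncertainties only, fixed selection), the luminosity needed for a target significance can be estimated as below. The function name is hypothetical, and the procedure actually used for the exclusion curve may differ.

```python
def required_luminosity(z_ref, lumi_ref_fb_inv, z_target):
    """Luminosity needed to reach z_target, given z_ref at lumi_ref.

    Assumes Z scales as sqrt(L), i.e. L_target = L_ref * (Z_target / Z_ref)^2.
    """
    return lumi_ref_fb_inv * (z_target / z_ref) ** 2


# BP3 reaches 5.6 sigma at 3000 fb^-1 (quoted in the text); estimate the
# luminosity at which the same selection would give 2 sigma.
lumi_2sigma_bp3 = required_luminosity(5.6, 3000.0, 2.0)
```

For BP3 this naive scaling gives a few hundred fb −1 , well below the 1300 fb −1 quoted for the full mass range, as expected since BP3 is the most sensitive benchmark point and the others need more data.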