Tagging a jet from a dark sector with Jet-substructures at colliders

The phenomenology of dark matter becomes complicated if dark matter is a composite particle, such as a hadron under a dark gauge group. Once a dark parton is produced at a high-energy collider, it showers and evolves into a jet-like object, and the resulting collider signature depends on its interactions with particles of the Standard Model (SM). For example, a finite lifetime of the dark hadron would produce a displaced vertex. Thus, by considering features in various subdetectors, one can identify a jet from a dark parton (“dark jet”) with the analysis methods of conventional exotic searches. However, if the lifetime of the dark hadron is negligible on collider scales (too short to manifest a displaced vertex), it is hard to tag a dark jet against the Quantum Chromodynamics (QCD) jets of the SM. Conventional analyses with information from various subdetectors are therefore not enough to probe dark matter physics in general at colliders. We propose an analysis that uses a combination of jet-substructure variables to identify dark jets over backgrounds. We study the features of jet-substructure variables for a dark jet and identify which dark-sector parameters are relevant to the performance of a given variable. To maximize performance, we apply a boosted decision tree (BDT) to jet-substructure variables in tagging dark QCD jets over QCD jets. As an illustration, we perform an LHC four-jet analysis with and without jet-substructure variables. Our results show that combining various jet-substructure variables gives good discrimination performance in identifying a dark jet over QCD backgrounds. We also discuss systematic uncertainties in the estimated tagging efficiency arising from the choice of parameters in a Monte Carlo simulation.
arXiv:1712.09279v2 [hep-ph] 18 Oct 2018


Introduction
The existence of Dark Matter (DM) in our universe has been confirmed indirectly through its gravitational effects [1]. Still, the nature of DM remains unknown, as no DM experiment has detected it "directly". In particular, the WIMP (Weakly Interacting Massive Particle), the most popular DM paradigm, has been the subject of various experiments including space-based indirect searches, nucleon-scattering direct searches, and collider experiments. However, null results in these searches have excluded a wide range of the WIMP parameter space [2,3]. In addition to the WIMP paradigm, another DM scenario called asymmetric DM [4-22], which is inspired by the coincidence of the abundances of visible matter and DM, $\Omega_{DM} \simeq 5\,\Omega_B$, has attracted attention as one of the alternative DM scenarios.
In the asymmetric DM paradigm, DM and its antiparticle aDM (anti-Dark Matter) are not produced in equal amounts in the early universe. Efficient annihilation between DM and aDM then eliminates aDM, and the remaining DM particles make up the current relic density. Space-based indirect experiments thus become ineffective in searches for asymmetric DM, as they rely on the currently negligible population of aDM.
A mechanism linking the asymmetries in the visible sector and the dark sector is required in asymmetric DM models, and most such mechanisms imply an approximate equality of the visible matter and DM number densities, $n_{DM} \simeq n_B$, in the current universe. Combined with the abundance ratio $\Omega_{DM} \simeq 5\,\Omega_B$, this number density relation naturally suggests a DM mass in the range $\mathcal{O}(1 \sim 10)$ GeV. In such a low mass region, especially below 5 GeV, nucleon-scattering direct search experiments become insensitive.
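The mass range quoted above follows from simple arithmetic on the two ratios. The sketch below is only an illustration: it assumes the mass per baryon is the proton mass (0.938 GeV), and the function name and the sample number-density ratios are hypothetical.

```python
# Illustrative arithmetic only:
#   Omega_DM / Omega_B = (n_DM * m_DM) / (n_B * m_B)
PROTON_MASS_GEV = 0.938  # assumed mass per baryon


def dm_mass_gev(omega_ratio=5.0, number_density_ratio=1.0):
    """DM mass implied by Omega_DM/Omega_B and n_DM/n_B (both dimensionless)."""
    return omega_ratio * PROTON_MASS_GEV / number_density_ratio


if __name__ == "__main__":
    # Sample n_DM/n_B values are hypothetical, chosen to bracket n_DM ~ n_B.
    for n_ratio in (0.5, 1.0, 5.0):
        mass = dm_mass_gev(5.0, n_ratio)
        print(f"n_DM/n_B = {n_ratio}: m_DM ~ {mass:.2f} GeV")
```

For $n_{DM} = n_B$ this gives a mass close to 5 GeV, and order-one variations of the number-density ratio keep the result within the GeV range.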
However, we can study the properties of DM if high-energy collider experiments can create particles in the dark sector. This motivates us to develop ideas for understanding features of collider signatures that depend on the DM paradigm. If DM is charged under a $U(1)'$, then energetic DM produced at a collider will radiate the $U(1)'$ gauge boson, sometimes called a dark photon. Such a dark photon decays back to Standard Model (SM) particles through kinetic mixing with the SM photon, and leads to prompt or long-lived lepton jets or a narrow jet signal at colliders [23-25]. If the gauge group in the dark sector is $SU(N_d)$, which confines at a certain scale $\Lambda_d$, an energetic dark parton (a particle charged under $SU(N_d)$) produced at a collider will give rise to a jet-like signal (footnote 1). Studies of such dark QCD or dark jet phenomenology can be found in [27-35]. Depending on the parameters of the dark QCD sector, various collider signals are possible. Dark sector phenomenology for diverse dark hadron types and dark glueballs is discussed in [27], where the authors suggested b-jet (bottom-quark-initiated jet) tagging and displaced vertex finding to search for dark QCD signals at a collider. Signatures with bottom/tau tagging, missing energy, lepton jets or lepton pair mass, or displaced vertices/tracks have been studied in the literature [28-33]. Recently, a new dark jet signature based on a flavor structure in the dark quark sector, called the semi-visible jet, was proposed in [34]. In that scenario, missing energy is collimated with a QCD jet, and the transverse mass of the two leading jets in the final state becomes useful for discriminating the dark jet pair signal from the SM background. A comprehensive study based on the mechanism proposed in [26] is given in [35], where the authors introduce a rather heavy mediator linking the dark sector and the visible sector, leading to long-lived dark mesons with a finite lifetime.
In that case, using displaced tracks can enhance the collider search sensitivity. As we will point out, there is still a range of models and parameter space in which most of the dark hadrons produced at a collider decay back to visible particles promptly. In such a case, analyses based on displaced vertices lose performance, and signals from dark QCD look exactly like backgrounds from SM QCD. In Fig. 1 we categorize signatures according to the lifetime of the dark hadron and the fraction of invisible particles inside a jet. We divide dark matter searches into three categories:
• Exotic (I): One can identify dark hadron decays via displaced vertices (D.V.).
• Exotic (II): Some stable dark hadrons (dark baryons and also some dark mesons) make up a non-negligible portion of a dark jet, which makes various kinematic variables useful.
• QCD-like: The dark jet looks like a SM QCD jet under conventional treatments of jets.
1 Under dark confinement, a dark matter particle, which is the lightest baryon under $SU(N_d)$, with a mass of $\mathcal{O}(1 \sim 10)$ GeV can be obtained more naturally with the help of mediator particles in the bi-fundamental representation; see [26].
Figure 1. A diagram dividing the jet types from dark QCD in terms of the percentage of stable (invisible) hadrons in a jet (x-axis) and the lifetime of dark mesons (y-axis). Here C.S. means a lifetime long enough to be "collider stable" and D.V. stands for a lifetime sizable enough to be tagged with "displaced vertices".
As reviewed above, previous studies of dark QCD collider phenomenology are closely tied to non-conventional signals, especially displaced vertices. In this paper, we propose to use various jet-substructure techniques to tag "SM QCD-like" dark jets. Given recent improvements in quark-gluon jet discrimination with jet substructure and the corresponding applications in different New Physics searches [37-40,76], we argue that we are at the stage of discriminating dark QCD jets from SM QCD jets. In fact, beyond promptly decaying dark hadrons, jet-substructure analysis can be extended to more general cases, as long as most of the dark hadrons decay inside the detector volume and energy deposits or tracks can be used to reflect the properties of dark jets. Another advantage of jet-substructure analysis is its applicability to different models. In this work we show how one can combine several jet-substructure variables to unveil various dark confinement models at colliders.
In the next section we briefly introduce our models and show how the dark hadrons can decay to SM states promptly. Section 3 is dedicated to a comprehensive demonstration of the ability of jet substructure to discriminate dark jets from QCD jets, and explains why these variables are useful. In section 4 we use an example at the LHC to show the effect of our dark jet tagging method. We summarize this work in section 5. A brief discussion of theoretical uncertainties in our jet discrimination is given in appendix A.

Benchmark scenarios for Dark QCD models
We introduce a new non-Abelian gauge group $SU(N_d)$, which describes the dynamics of the dark sector, in addition to the SM gauge group $SU(3) \times SU(2) \times U(1)$. Several light dark quarks in the fundamental representation of $SU(N_d)$ are also required to form dark hadrons.
Here, a light dark quark means a dark quark that contributes to the running of the dark strong coupling $\alpha_d(\mu)$ from the dark confinement scale $\Lambda_d$ up to a higher energy scale. For dark color confinement, the number of dark quark flavors $n_f$ should be smaller than $\frac{11}{2} N_d$. At an energy scale much higher than $\Lambda_d$, the Lagrangian of the dark sector can be written as
$$\mathcal{L}_{dark} = -\frac{1}{4} G^{a}_{\mu\nu} G^{a\,\mu\nu} + \bar{q}'_i \left( i\gamma^{\mu} D_{\mu} - m_{q'_i} \right) q'_i ,$$
where $q'$ and $G_{\mu\nu}$ denote the dark quarks and the dark gluon field strength respectively, $D_\mu$ is the covariant derivative of $SU(N_d)$, and $i$ is the flavor index of the dark quarks. For minimality, we take the dark quarks to be SM singlets. A mediator between the dark sector and the SM sector is required to produce energetic dark partons at colliders. It could be a particle in a bi-fundamental representation [26], a heavy $Z'$, or a scalar [41]. In the illustrative Lagrangians for these mediators, $q_j$ denotes the SM quarks, $X$ is a bi-fundamental scalar charged under both $SU(N_d)$ and SM $SU(3)$, and $Z'$ is a vector mediator connecting a dark quark pair and a SM quark pair (footnote 2); $i$ and $j$ are the flavor indices of dark quarks and SM quarks. The decay of dark hadrons depends on their spin, mass, and the mediator to the visible sector. Below we analyze the different kinds of dark hadrons and point out in which cases they decay to SM particles promptly.
Generally, the dark pion is the lightest meson in the dark hadron spectrum and makes up a large fraction of the particles in a dark jet. As the dark pion $\pi_d$ is a spin-0 pseudo-scalar, it decays to a quark pair through a higher-dimensional effective operator. In this case, due to chirality-flip suppression, $\pi_d$ tends to decay to a heavy SM quark pair, and its lifetime is closely related to the dark pion mass $m_{\pi_d}$. We take the formula used in [35] to estimate the partial width of $\pi_d$ to a SM quark pair. In that formula, $\kappa$ is the coupling among the mediator $X$, a SM quark $q$ and a dark quark $q'$, $f_{\pi_d}$ is the decay constant of the dark pion, $m_q$ is the pole mass of the SM quark, and $M_X$ is the mass of the mediator $X$. $\kappa = 1$ is a natural choice. An approximate relation [...] GeV. Thus if $f_{\pi_d} \simeq m_{\pi_d} \simeq 2$ GeV, the decay channel to SM K-mesons is open. In such a case, a mediator lighter than 300 GeV can make the proper decay length of $\pi_d$ shorter than 1 mm, i.e. a promptly decaying dark pion. This range of mediator masses is still allowed by previous displaced track/vertex searches, as summarized in [35]. If the dark pion is heavier, so that its decay to D-mesons or B-mesons is open, the allowed parameter space for a prompt decay is much larger.
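The statement that a mediator lighter than about 300 GeV gives a proper decay length below 1 mm can be checked with a rough numerical sketch. The width formula is not reproduced in the text, so the form used below, $\Gamma \simeq N_c \kappa^4 f_{\pi_d}^2 m_q^2 m_{\pi_d} / (32\pi M_X^4)$, is an assumption, as are the color factor $N_c = 3$ and the strange-quark mass of 0.1 GeV. All function names are hypothetical.

```python
import math

HBAR_C_GEV_MM = 1.973269804e-13  # hbar * c in GeV * mm


def pion_width_gev(kappa, f_pi, m_pi, m_q, m_x, n_c=3):
    """ASSUMED width: N_c kappa^4 f^2 m_q^2 m_pi / (32 pi M_X^4), all masses in GeV."""
    return n_c * kappa**4 * f_pi**2 * m_q**2 * m_pi / (32.0 * math.pi * m_x**4)


def proper_decay_length_mm(kappa, f_pi, m_pi, m_q, m_x, n_c=3):
    """Proper decay length c*tau = hbar*c / Gamma, in millimetres."""
    return HBAR_C_GEV_MM / pion_width_gev(kappa, f_pi, m_pi, m_q, m_x, n_c)


if __name__ == "__main__":
    # kappa = 1 and f = m = 2 GeV follow the text; m_q = 0.1 GeV is assumed.
    for m_x in (300.0, 1000.0):
        ctau = proper_decay_length_mm(1.0, 2.0, 2.0, 0.1, m_x)
        print(f"M_X = {m_x:6.0f} GeV -> c*tau ~ {ctau:.2f} mm")
```

Under these assumptions the decay length scales as $M_X^4$, so lowering the mediator mass from 1 TeV to 300 GeV shortens it by roughly two orders of magnitude, to below 1 mm.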
Another possibility is that there is an extra $U(1)'$ under which the dark quark is charged [45]. In this case the dark pion behaves like the SM pion and decays promptly to a dark photon pair, $\pi_d \to \gamma'\gamma'$. The dark photon can decay into SM particles through kinetic mixing with the SM hypercharge $U(1)_Y$, where the kinetic mixing is parameterized by $\epsilon$. With current limits on the parameters ($\epsilon$ and the mass) of a dark photon [44], we find that a large region of parameter space survives in which the dark photon decays promptly. For instance, a 0.4 GeV dark photon decays promptly if $\epsilon \gtrsim 10^{-5}$, which induces the prompt decay of the dark pion into SM particles.
One can also consider the situation where the dark quark carries SM electric charge. In this case the dark pion decays directly into a SM photon pair. This kind of dark pion has been used to explain the galactic center gamma-ray excess [46]. Denoting the electric charge of the dark quark as $\epsilon e$, a simple estimate shows that $\epsilon$ of order 0.01 would be enough for a prompt decay $\pi_d \to \gamma\gamma$. Since there are stringent constraints on "milli-charged" dark matter, an electrically neutral object would be more natural as a dark matter candidate [47].
In [34] and a more recent paper [48], the authors consider a dark meson composed of dark quarks of different flavors. In this case, the dark meson is stable and the corresponding collider signature is missing energy. But this assumption is model (or parameter) dependent. For example, consider an interaction Lagrangian between two dark quark flavors and a mediator $X$, Eq. (2.5). By integrating out the heavy mediator $X$, one obtains an effective operator (footnote 3). So, depending on the parameters, a flavor-mixed dark meson $\pi_d$ can decay promptly into SM particles through this dark-flavor-violating operator. In addition, as pointed out in [34], most of the dark hadrons from fragmentation can decay promptly once a specific mass hierarchy among the dark quarks is satisfied. Since the production rate of a heavy quark pair through fragmentation is suppressed by a factor of $\exp\left(-\frac{4\pi |M^2 - m^2|}{\Lambda_d^2}\right)$, most of the dark mesons from dark fragmentation would be the lightest ones, which decay promptly.
A dark rho meson is a spin-1 bound state of dark quarks. Generally there is a mass splitting between the dark pion and the dark rho meson, which depends on the pole mass of the dark quarks. If $m_{q'} \ll \Lambda_d$, the dark pion can be treated as a Goldstone boson with a mass smaller than $\Lambda_d$. In this case, the dark rho meson decays promptly through the channel $\rho_d \to \pi_d \pi_d$. If $m_{q'}$ is not much smaller than $\Lambda_d$, the mass splitting is not large enough to allow the two-pion decay. But since $\rho_d$ has spin 1, its decay width is not chirally suppressed. Thus the parameter space for prompt decay is larger than in the $\pi_d$ case, since $\rho_d$ does not preferentially decay to heavy-flavor quarks. In the $U(1)'$-extended or electrically charged dark quark case, the preferred decay processes are $\rho_d \to \pi_d \gamma'$ or $\rho_d \to \pi_d \gamma$. The discussion of the multi-flavor case is similar to that of the dark pion, so we do not repeat it here.
The lightest dark baryon is stable and can be a dark matter candidate. In the $SU(3)_d$ case, the population ratio of baryons to mesons in hadronization is O(10)%, which is negligible. If $N_d > 3$, the baryon fraction is further suppressed. Only in the $SU(2)_d$ case does a considerable part of the hadrons in a dark jet consist of stable dark baryons. Thus in this work we focus on the $N_d = 3$ case, as we aim to distinguish a SM-like dark jet from SM backgrounds.
If all the dark quarks are much heavier than the confinement scale of $SU(N_d)$ ($m_{q'} \gg \Lambda_d$), the lightest dark hadron is made of dark gluons and is called a dark glueball. As the dark gluon and the SM gluon belong to different gauge groups, the decay of a dark glueball is loop-induced by heavy particles charged under both gauge groups. Thus the lifetime of a dark glueball is quite long in general. We do not discuss this scenario in this work.
We have discussed various model settings and parameter choices for which most of the dark hadrons in a dark jet decay into SM particles promptly. As mentioned before, methods based on displaced vertices or missing energy lose sensitivity in these cases. In Table 1 we list four benchmark settings of the dark sector, with different spectra, confinement scales, and decay modes. Due to the non-perturbative nature of a QCD-like theory, some of these parameters need to be set by hand; the guiding principle is to cover the various features that a dark jet could have. Based on the above arguments, we take all of the dark hadrons in Table 1 to decay promptly. In the next section we show how one can use jet-substructure variables to distinguish a dark jet from SM QCD jets.
Table 1. Models considered in this work. All dark hadrons are assumed to decay promptly. We mainly consider two cases: the high-$\Lambda_d$ case (A and C) and the low-$\Lambda_d$ case (B and D). The dark sector parameters of A and C, and of B and D, are the same except for the decay channel of the dark pion $\pi_d$. The $\pi_d$ and $\rho_d$ masses obey two equations [...]. Here $m_q$ is the constituent dark quark mass, and the parameter $\Omega$ can be determined from the other input parameters. The branching ratios of the decay modes shown here are all 100% unless a specific value is given. Decay modes of a dark photon $\gamma'$ with different masses can be found in [24].

Jet-substructure Variables Analysis
The underlying parameters of a dark sector affect the collider phenomenology of a dark jet. The RGE running of the dark sector gauge coupling $\alpha_d(\mu)$ is controlled by these parameters, with the boundary condition $\alpha_d^{-1}(\Lambda_d) = 0$. A comparison of the running coupling between SM QCD and various dark QCD models is shown in Fig. 2 (the corresponding dark sector settings can be found in Table 1). The running coupling determines the parton shower, which takes place at distances shorter than $1/\Lambda_d$. The showered partons then fragment into dark hadrons. Finally, the dark hadrons decay back to SM particles, which are measured by the detector. Combining these three processes, the detector-level measurements of jet-substructure variables such as jet mass or track multiplicity for a dark jet can be quite different from the expectations for SM QCD jets.
A dark jet originating from a single dark parton can be considered a 1-prong jet. Thus jet grooming methods [50-52], including the mass-drop algorithm and pruning, which are suited to reconstructing a boosted heavy object such as a gauge boson (W/Z/H) or top quark, are not expected to be effective in tagging a dark jet. Compared to 2- or 3-prong jet tagging, 1-prong jet tagging is easier due to a simpler jet structure. Jet-substructure variables used to tag a 1-prong jet fall roughly into two categories: infrared and collinear (IRC) safe ones and IRC unsafe ones. An IRC safe variable is insensitive to soft or collinear radiation inside a jet; equivalently, the contribution of extra radiation to an IRC safe variable approaches zero as the radiation becomes soft or collinear. An analytical description of IRC safe variables is therefore possible. We choose jet mass, the two-point energy correlation function $C_1^{(\beta)}$ [37], and the linear radial geometric moment (Girth) [53] as our IRC safe variables. As clear analytical descriptions have been given for these three variables, they make it easier to understand our results, which are mainly based on Monte Carlo simulation.
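To illustrate how the dark-sector parameters enter the running, here is a minimal sketch that assumes one-loop running only, $\alpha_d^{-1}(\mu) = \frac{b_0}{2\pi}\ln(\mu/\Lambda_d)$ with $b_0 = \frac{11}{3}N_d - \frac{2}{3}n_f$, which satisfies the boundary condition $\alpha_d^{-1}(\Lambda_d) = 0$. The loop order and the sample parameter values are assumptions for illustration and are not the benchmark values of Table 1.

```python
import math


def beta0(n_colors, n_flavors):
    """One-loop coefficient b0 = 11/3 N_d - 2/3 n_f (positive: asymptotic freedom)."""
    return 11.0 / 3.0 * n_colors - 2.0 / 3.0 * n_flavors


def alpha_one_loop(mu, lam, n_colors, n_flavors):
    """One-loop coupling with alpha^{-1}(lam) = 0: alpha = 2 pi / (b0 ln(mu / lam))."""
    if mu <= lam:
        raise ValueError("mu must be above the confinement scale")
    b0 = beta0(n_colors, n_flavors)
    if b0 <= 0.0:
        raise ValueError("theory is not asymptotically free for these inputs")
    return 2.0 * math.pi / (b0 * math.log(mu / lam))


if __name__ == "__main__":
    # Hypothetical settings: N_d = 3, n_f = 2, two illustrative confinement scales.
    for lam in (1.0, 10.0):
        alpha = alpha_one_loop(200.0, lam, 3, 2)
        print(f"Lambda_d = {lam:5.1f} GeV -> alpha_d(200 GeV) ~ {alpha:.3f}")
```

In this approximation a higher confinement scale gives a larger coupling at a fixed scale $\mu$, and the condition $b_0 > 0$ reproduces the bound $n_f < \frac{11}{2} N_d$.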
An IRC unsafe variable, for example the charged track multiplicity, is sensitive to soft and collinear radiations. Besides that, some IRC unsafe variables are also dependent on the detail of fragmentation and dark meson decay channel. For those variables we will provide Monte Carlo based results and give some qualitative arguments.
We choose Pythia 8 [54] for simulating hadronization processes. It has been shown that Monte Carlo samples from Pythia 8 describe experimental data well in jet substructure analyses [55,56]. The Hidden Valley model [27] included in Pythia 8 can be used to simulate dark QCD processes, and the running of the dark gauge coupling has recently been added to Pythia 8, which greatly enhances the reliability of dark QCD simulation. We generate three processes at the LHC, $f\bar{f} \to Z' \to q'\bar{q}'$, $qg \to Zq$, and $q\bar{q} \to Zg$, to study the signal and background processes of dark jets, quark jets, and gluon jets respectively, with initial state radiation (ISR) and multiparton interactions (MPI) switched on with default tunes. For realistic analyses, we work at detector level with DELPHES 3 [57]. We use FastJet [58] to cluster final state particles with the anti-$k_t$ algorithm [59]. The inputs to jet clustering are energy deposits in the electromagnetic calorimeter and the hadronic calorimeter, together with muons without an isolation criterion, because a fraction of the dark jet energy can be carried by muons, depending on the decay channel of the dark pion (footnote 4). Examining the discrimination performance of jet substructure variables with different choices of jet radius ($R$), jet transverse momentum ($p_T$), and jet algorithm can be interesting. In our study, we choose $R = 0.4$, as it is a typical jet radius in LHC analyses of QCD jets and this choice was studied in the ATLAS light-quark and gluon jet discrimination [60]. For the jet transverse momentum, we start with the range $p_T \in$ (180 GeV, 220 GeV), as this range has the minimum systematic uncertainties [61] and overlaps with the $p_T$ range in the ATLAS jet discrimination study [60]. We consider a detector geometry with pseudorapidity $\eta \in (-2.5, 2.5)$. Finally, for completeness, we provide results for the $p_T$ ranges (360 GeV, 440 GeV) and (720 GeV, 880 GeV) to cover high-$p_T$ jets.

Jet mass
Jet mass, a simple and intuitive variable that reflects the underlying structure of a jet, has been studied for decades [62-67]. Jet mass originates from the virtuality of the primordial parton of a jet. Considering the first-order splitting process, one obtains a normalized differential cross section in the virtuality, where $\sigma = \int (d\sigma/dp^2)\,dp^2$ is the integrated jet cross section, $C$ is the color factor, $p$ is the 4-momentum of the primordial parton and $p^2$ is its virtuality; $\epsilon$ is an infrared cut, $z$ is the energy fraction carried by the radiated parton, and $\alpha(\mu)$ and $P(z, p^2)$ are the QCD running coupling and the splitting kernel respectively. This fixed-order result diverges as the jet mass goes to zero, in conflict with experimental data. To get a reasonable distribution, one needs to resum higher-order corrections. At leading-log order, the differential cross section becomes
$$\frac{1}{\sigma}\frac{d\sigma}{dp^2} = \frac{d}{dp^2} S(p^2, Q^2),$$
i.e. the derivative of the Sudakov factor $S(p^2, Q^2)$, where $Q$ is the energy scale of the corresponding hard process. This leading-order result roughly reproduces the shape of the distribution in real LHC data. The distribution is evidently determined by the running coupling $\alpha(\mu)$ and the color factor $C$. To gain intuition for jet mass distributions, we approximate Eqs. (3.3) and (3.4) by fixing the running $\alpha(\mu)$ to a constant $\alpha$, setting $P(z, k^2) = 1/z$, and choosing $\epsilon = p^2/Q^2$, which gives the approximation of Eq. (3.5). As seen in Eq. (3.5), the peak of the jet mass distribution moves to the right as $C\alpha$ becomes larger. Thus the peak of the jet mass distribution for a gluon-initiated jet lies to the right of the peak for a quark-initiated jet, as the color factor of a gluon, $C_A = 3$, is larger than the color factor of a quark, $C_F = 4/3$, as in Fig. 3.
In SM QCD, the only difference between a quark jet and a gluon jet is the color factor, $C_F$ for a quark and $C_A$ for a gluon. Even so, the dimensionless parameter $m_J/p_T$, the jet mass divided by the jet $p_T$, is a good variable for quark/gluon jet discrimination. For a dark jet, because of a quite different running coupling and a possibly different color factor, one can expect the jet mass distribution to differ substantially from that of SM QCD. By considering subleading contributions, one can include the effects of the jet size or hadronization [64,67]. In our study we do not go further analytically, but use Monte Carlo simulation (Pythia 8) to obtain numerical results. Jet mass distributions for the different models in Table 1 and for SM QCD are shown in Fig. 3.
As the gauge coupling of dark QCD model A (C) is larger than that of B (D) according to Fig. 2, a jet from model A or C has a larger mass than a jet from model B or D. Equivalently, dark QCD with a high confinement scale $\Lambda_d$ is easier to distinguish from SM QCD jets than dark QCD with a low confinement scale. The discrimination performance can be checked with the ROC (receiver operating characteristic) curves in the right column of Fig. 3. We also argue that jet mass is not sensitive to the final states (SM particles from the decay of dark mesons), as the jet mass distributions of A (B) almost overlap with those of C (D) in Fig. 3.
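The dependence of the peak position on $C\alpha$ can be made concrete with a small numerical sketch. It assumes a fixed coupling and a double-logarithmic Sudakov factor $S = \exp\left[-\frac{a}{2} L^2\right]$ with $L = \ln(Q^2/p^2)$ and $a = C\alpha/(2\pi)$. This normalization and the value $\alpha = 0.12$ are assumptions chosen for illustration and need not match the paper's Eq. (3.5).

```python
import math


def log_mass_distribution(log_ratio, c_alpha):
    """(1/sigma) dsigma/dL for S = exp(-a L^2 / 2), with L = ln(Q^2/p^2), a = C*alpha/(2 pi)."""
    a = c_alpha / (2.0 * math.pi)
    return a * log_ratio * math.exp(-0.5 * a * log_ratio**2)


def peak_mass_over_q(c_alpha):
    """The distribution in L peaks at L* = 1/sqrt(a); return sqrt(p^2)/Q at that point."""
    a = c_alpha / (2.0 * math.pi)
    return math.exp(-0.5 / math.sqrt(a))


if __name__ == "__main__":
    alpha = 0.12  # assumed fixed coupling, for illustration only
    for label, color_factor in (("quark, C_F = 4/3", 4.0 / 3.0), ("gluon, C_A = 3", 3.0)):
        peak = peak_mass_over_q(color_factor * alpha)
        print(f"{label}: peak at m/Q ~ {peak:.3f}")
```

With these inputs the gluon peak sits at a larger $m/Q$ than the quark peak, in line with the qualitative statement above, and a stronger coupling shifts the peak further in the same direction.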
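For readers who want to reproduce such ROC curves from their own samples, the following is a minimal sketch of a one-dimensional cut scan. It assumes the signal is selected by requiring the variable to be larger than a threshold, which fits variables such as jet mass where dark jets populate higher values. The function name and the toy numbers are hypothetical.

```python
def roc_points(signal, background, n_cuts=50):
    """Scan cuts x >= t and return (signal efficiency, background efficiency) pairs."""
    lo = min(min(signal), min(background))
    hi = max(max(signal), max(background))
    points = []
    for k in range(n_cuts + 1):
        threshold = lo + (hi - lo) * k / n_cuts
        eff_sig = sum(x >= threshold for x in signal) / len(signal)
        eff_bkg = sum(x >= threshold for x in background) / len(background)
        points.append((eff_sig, eff_bkg))
    return points


if __name__ == "__main__":
    # Toy values of m_J / p_T, invented for illustration only.
    toy_signal = [0.30, 0.40, 0.50, 0.45, 0.35]
    toy_background = [0.10, 0.20, 0.30, 0.15, 0.25]
    for eff_sig, eff_bkg in roc_points(toy_signal, toy_background, n_cuts=5):
        print(f"signal eff = {eff_sig:.2f}, background eff = {eff_bkg:.2f}")
```

A curve that keeps the signal efficiency high while the background efficiency falls quickly corresponds to better discrimination.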

Two-point energy correlation function
Another variable useful for probing the properties of a one-prong jet is the two-point energy correlation function [37]:
$$C_1^{(\beta)} = \sum_{i<j \in J} z_i z_j \left(R_{ij}\right)^{\beta},$$
where $z_i = p_{Ti} / \sum_{i \in J} p_{Ti}$ is the $p_T$ fraction carried by constituent $i$ of jet $J$, and $R_{ij}$ is the distance between constituents $i$ and $j$. As studied in [37], the advantage of infrared and collinear safe variables including $C_1^{(\beta)}$ is that they can be calculated analytically. Here we adopt the analytical results of [37] to see how $C_1^{(\beta)}$ depends on the parameters of dark QCD. One can first consider the simplest case, the fixed leading-order distribution (treating the coupling $\alpha$ as a constant in this part for simplicity), in which $R_0$ is the size of the jet, i.e. the upper limit of the splitting angle in the shower. Similar to the fixed-order jet mass distribution, the resulting $C_1^{(\beta)}$ distribution is divergent in the soft and collinear region. With leading-order resummation, the probability in the soft and collinear region is suppressed by an exponential factor. As we have seen for the jet mass distribution, the peak of the dark jet $C_1^{(\beta)}$ distribution lies at a larger value than the peak of the SM QCD jet $C_1^{(\beta)}$ distribution, as dark QCD has a larger coupling than SM QCD.
Two more factors can enhance the discriminating power of $C_1^{(\beta)}$. First, there is a contribution to $C_1^{(\beta)}$ from non-perturbative fragmentation. It can be estimated by convolving the resummed perturbative distribution with a so-called "shape" function [69,70]. The effect of this convolution is to shift the perturbative distribution of $C_1^{(\beta)}$ to higher values, and the shift is roughly proportional to the corresponding confinement scale. Thus the fragmentation process further separates the $C_1^{(\beta)}$ distributions of dark jets and QCD jets.
Figure 4. Top left: $C_1^{(\beta)}$ distribution of dark jets with $p_T \in$ (180 GeV, 220 GeV) at parton level, meson level, final state particle level, and detector level for dark QCD model A (corresponding to a high dark QCD confinement scale). Top right: the same as top left, but for dark QCD model B (corresponding to a low dark QCD confinement scale). Bottom left: $C_1^{(\beta)}$ distribution of different kinds of jets with $p_T \in$ (180 GeV, 220 GeV). Bottom right: Corresponding ROC curves for discrimination between dark QCD jets and SM QCD gluon-initiated jets. $\beta$ is chosen to be 0.2 for these four plots.
Second, when the mass of a dark meson is much larger than the SM QCD confinement scale $\Lambda_{QCD}$, the decay of dark mesons inside a jet strongly affects the distribution of $C_1^{(\beta)}$. This effect can be understood from the following simple estimate. Consider two nearly collinear dark mesons inside a dark jet, with energy fractions $z_1$, $z_2$ and angular distance $\theta$ between them; $\theta$ is small because the two dark mesons are assumed to be nearly collinear. In this case, the contribution of these two mesons to $C_1^{(\beta)}$ is $z_1 z_2 \theta^{\beta}$. After both mesons decay to two SM particles with roughly equal sharing of energy, this contribution changes to the expression in Eq. (3.10), in which $m_{\pi_d}$ is the mass of the dark meson and $p_T$ is the average transverse momentum of dark mesons inside the dark jet. As we consider the collinear limit between the two dark mesons, the angular distance between the decay products of the dark mesons is approximated as $m_{\pi_d}/p_T$. Thus the mass of the dark meson increases $C_1^{(\beta)}$ of a dark jet for $\beta > 0$. For discrimination between quark-initiated and gluon-initiated jets, $\beta$ has been chosen as 0.2 [37,40]. In this paper we follow this choice of $\beta = 0.2$ to compare a jet from dark QCD with the SM gluon-initiated jet, which is the major background.
Simulation results are shown in Fig. 4. First we show the $C_1^{(\beta)}$ distributions from parton level to detector level in the top row. Here, parton-level $C_1^{(\beta)}$ means that the objects used for jet clustering are the dark partons after the dark shower and before dark hadronization; meson-level $C_1^{(\beta)}$ comes from dark mesons after dark hadronization; particle-level $C_1^{(\beta)}$ [...] $= 0$ becomes lower and the distribution is shifted to a higher value. Together with this effect, due to the decay of dark mesons, the particle-level distribution of $C_1^{(\beta)}$ [...] only a little. In conclusion, jets from a dark QCD model with a high dark confinement scale are easier to tag over SM QCD jets than those from a model with a low dark confinement scale. We also observed that the tagging efficiency is not sensitive to the decay channel of the dark meson, as $C_1^{(\beta)}$ [...]
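As a concrete illustration of the pairwise sum $\sum_{i<j} z_i z_j R_{ij}^{\beta}$ implied by the definitions of $z_i$ and $R_{ij}$ above, here is a minimal sketch. It assumes constituents are given as (pT, rapidity, azimuth) tuples and that $R_{ij}$ is the usual distance in the rapidity-azimuth plane. The function names are hypothetical.

```python
import math


def delta_r(y1, phi1, y2, phi2):
    """Distance in the rapidity-azimuth plane, with the azimuth difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(y1 - y2, dphi)


def c1(constituents, beta=0.2):
    """Two-point energy correlation: sum over pairs of z_i z_j R_ij^beta.

    constituents: list of (pt, rapidity, phi) tuples for one jet.
    """
    pt_sum = sum(pt for pt, _, _ in constituents)
    if pt_sum <= 0.0:
        return 0.0
    total = 0.0
    for i in range(len(constituents)):
        pt_i, y_i, phi_i = constituents[i]
        for j in range(i + 1, len(constituents)):
            pt_j, y_j, phi_j = constituents[j]
            total += pt_i * pt_j * delta_r(y_i, phi_i, y_j, phi_j) ** beta
    return total / pt_sum**2


if __name__ == "__main__":
    toy_jet = [(120.0, 0.00, 0.00), (60.0, 0.05, 0.10), (20.0, -0.20, 0.15)]
    print(f"C1(beta=0.2) = {c1(toy_jet, beta=0.2):.4f}")
```

A jet with a single constituent gives zero, since there is no pair to sum over, while wide-angle or energetic secondary constituents raise the value.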
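The direction of this effect can be seen from a rough counting of pairs. The estimate below is a sketch under stated assumptions, not a reproduction of the paper's Eq. (3.10): each meson decays to two massless particles that share its energy equally, the two particles from the same parent are separated by $\delta \simeq m_{\pi_d}/p_T$, and particles from different parents stay separated by roughly $\theta$ (which requires $\delta \lesssim \theta$).

```latex
% Rough pair counting; an illustrative estimate, not the paper's Eq. (3.10).
\begin{align}
C_1^{(\beta)}\Big|_{\text{before decay}} &\simeq z_1 z_2\,\theta^{\beta}, \\
C_1^{(\beta)}\Big|_{\text{after decay}} &\simeq
  \underbrace{4 \cdot \frac{z_1}{2}\,\frac{z_2}{2}\,\theta^{\beta}}_{\text{four cross pairs}}
  + \underbrace{\left(\frac{z_1^2}{4} + \frac{z_2^2}{4}\right)\delta^{\beta}}_{\text{two same-parent pairs}}
  = z_1 z_2\,\theta^{\beta} + \frac{z_1^2 + z_2^2}{4}\,\delta^{\beta},
  \qquad \delta \simeq \frac{m_{\pi_d}}{p_T}.
\end{align}
```

In this counting the decays add a strictly positive term for $\beta > 0$, and the added term grows with the dark meson mass through $\delta$.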

Linear Radial Geometric Moment
Angularity-style variables, including jet broadening and width, have been studied since the LEP era [71-76]. Here we study the linear radial geometric moment (Girth), which is known as an effective observable for discriminating between quark and gluon jets [53]. Girth is defined as
$$g = \sum_{i \in J} \frac{p_{Ti}}{p_{TJ}}\, r_i ,$$
where $p_{TJ}$ is the transverse momentum of the jet and $r_i$ is the distance between constituent $i$ of the jet and the jet axis. Unlike $C_1^{(\beta)}$, which does not require a jet axis, Girth is sensitive to the direction of the jet axis. For the jet axis, we take the vector sum of the momenta of all constituents inside the jet.
Girth, as a jet width variable, has been analyzed analytically in [69]. Here we give a rough description; interested readers can find more details in [69]. At parton level, perturbative calculation shows that the quark/gluon discrimination power relies mainly on the color factor ratio $C_A/C_F$, which is called Casimir scaling. For dark jet discrimination, due to the different coupling, the ratio should be replaced by $\alpha_S C_A/\alpha_d C_d$. Thus one can expect better discrimination power if $\alpha_d$ is quite different from $\alpha_S$. The meson-level distribution, as described in the last subsection, can be obtained by convolving the parton-level distribution with a shape function whose mean value is proportional to the confinement scale. So a large $\Lambda_d/\Lambda_{QCD}$ further separates the Girth distributions of dark jets and QCD jets. Finally, the decay of heavy dark mesons pushes up the Girth value of a dark jet.
Our simulation results are presented in Fig. 5. We show the evolution of the distributions for model A and model B from parton level to detector level, as we did for $C_1^{(\beta)}$.
The relationships between the different levels are as expected, but the changes are smaller than for $C_1^{(\beta)}$. This is because $C_1^{(\beta)}$ is more sensitive to the small-angle distribution. Also, unlike $C_1^{(\beta)}$, which needs at least two constituents for a non-zero value, Girth has a non-zero value with one constituent. Thus a large-angle parton splitting does not cause a spike at zero in the Girth distribution like the one found in Fig. 4. We conclude that the performance of Girth depends on the confinement scale of dark QCD: a dark jet from a higher confinement scale is easier to distinguish than one from a low confinement scale. Comparing model A with C (and model B with D), we find that Girth is not sensitive to the decay channel of the dark meson. The discriminating power of Girth is a little weaker than that of $C_1^{(\beta)}$.
Figure 6. Top left: Dark meson multiplicity, charged particle multiplicity, and track multiplicity distributions of dark jets with $p_T \in$ (180 GeV, 220 GeV) and setting B. Top right: Same as top left, but with setting D. Bottom left: Charged track multiplicity distribution of different kinds of jets with $p_T \in$ (180 GeV, 220 GeV). Bottom right: Corresponding ROC curves for discrimination between dark QCD jets and SM QCD gluon-initiated jets.
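A minimal sketch of a Girth calculation with the jet axis taken from the vector sum of constituent momenta, as described above. It assumes massless constituents given as (pT, pseudorapidity, azimuth) tuples and normalizes by the transverse momentum of the summed jet momentum. These conventions and the function name are assumptions made for illustration.

```python
import math


def girth(constituents):
    """Sum of (pt_i / pt_jet) * r_i, with the axis from the vector sum of momenta.

    constituents: list of (pt, eta, phi) tuples, treated as massless.
    """
    if not constituents:
        return 0.0
    px = sum(pt * math.cos(phi) for pt, _, phi in constituents)
    py = sum(pt * math.sin(phi) for pt, _, phi in constituents)
    pz = sum(pt * math.sinh(eta) for pt, eta, _ in constituents)
    energy = sum(pt * math.cosh(eta) for pt, eta, _ in constituents)
    jet_pt = math.hypot(px, py)
    if jet_pt <= 0.0:
        return 0.0
    jet_y = 0.5 * math.log((energy + pz) / (energy - pz))  # rapidity of the jet axis
    jet_phi = math.atan2(py, px)
    total = 0.0
    for pt, eta, phi in constituents:
        dphi = (phi - jet_phi + math.pi) % (2.0 * math.pi) - math.pi
        total += pt * math.hypot(eta - jet_y, dphi)
    return total / jet_pt


if __name__ == "__main__":
    toy_jet = [(120.0, 0.00, 0.00), (60.0, 0.05, 0.10), (20.0, -0.20, 0.15)]
    print(f"girth = {girth(toy_jet):.4f}")
```

Because every term is weighted linearly in the distance to the axis, the result depends on how the axis is chosen, which is why the axis convention is fixed explicitly.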

Charged track multiplicity
Multiplicity-type variables, which count the number of sub-jets, hadrons, or tracks inside a jet, turn out to be useful in discriminating different kinds of one-prong jets. Among them, charged track multiplicity, owing to the high resolution and trigger efficiency of track reconstruction at the LHC, is the best discriminant among the multiplicity-type variables used in quark and gluon jet discrimination [53,77,78]. Unlike jet mass or $C_1^{(\beta)}$, which are IRC safe, charged track multiplicity increases with soft and collinear radiation. It is also closely related to the decay channel of the dark meson. So we rely on Monte Carlo simulation to show its properties. Our simulation results are shown in Fig. 6. In order to show how the track multiplicity is affected by the dark meson's decay channel, in the first row we count the numbers of dark mesons, charged particles, and tracks with $p_T > 0.5$ GeV inside a dark jet, which correspond to meson level, particle level, and detector level respectively. With an identical dark sector setting, the dark meson multiplicity distributions of model B and model D are almost the same. But the different decay channels of the dark meson make their track multiplicities quite different. Thus, compared to a dark jet in model B, a dark jet in model D is much easier to discriminate from a QCD jet. In general, track multiplicity is a better discriminant variable than the IRC safe variables.
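As a minimal sketch of the detector-level counting described above, the following Python snippet counts charged constituents above the $p_T > 0.5$ GeV threshold. The `Constituent` record and the example jet are hypothetical illustrations, not taken from our simulation chain.

```python
from dataclasses import dataclass


@dataclass
class Constituent:
    """Hypothetical jet constituent: pT in GeV and electric charge in units of e."""
    pt: float
    charge: int


def track_multiplicity(constituents, pt_min=0.5):
    """Count charged constituents with pT above pt_min (in GeV)."""
    return sum(1 for c in constituents if c.charge != 0 and c.pt > pt_min)


# Illustrative jet: two charged constituents pass the threshold, one charged
# constituent is too soft, and one constituent is neutral.
jet = [
    Constituent(pt=50.0, charge=+1),
    Constituent(pt=30.0, charge=0),
    Constituent(pt=0.3, charge=-1),
    Constituent(pt=12.0, charge=-1),
]
n_tracks = track_multiplicity(jet)
```

Two dark jets with the same dark meson content can give very different values of this count once the mesons decay through different channels, which is the effect seen when comparing model B with model D.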

Energy deposit ratio on different kinds of calorimeters
In order to further reflect the final states from the dark meson's decay, we suggest using a variable that depends on the types of reconstructed particles. At the LHC, most SM particles, except muons and neutrinos, are stopped by the calorimeters and deposit their energy there. Two kinds of calorimeters are used at the LHC, the electromagnetic calorimeter (ECAL) and the hadronic calorimeter (HCAL). Electrons and photons deposit their energy in the ECAL, and hadrons deposit their energy in the HCAL if their lifetime is long enough. So different kinds of jets, because of the different composition of final states inside them, will deposit their energy differently in the ECAL and the HCAL. Here we define a variable called E-ratio, which compares the energy a jet deposits in the ECAL with the energy it deposits in the HCAL. For certain kinds of dark jets, such as a dark jet from model B, this ratio could be quite different from that of a QCD jet. Dark mesons in this kind of dark jet mainly decay to a strange quark pair, so most of the energy of a dark jet from model B is carried by long-lived kaons. Its E-ratio will then be much smaller than the E-ratio of QCD jets.
The distribution of E-ratio is shown in Fig. 7. As we expected, the E-ratio distribution of model B is quite different from those of the other jets, and the corresponding ROC curve also shows a good discriminant performance for model B. For model A and model C, however, this variable is not so effective.

Corresponding ROC curves for discrimination between dark QCD jets and SM QCD gluon-initiated jet.

Sub-jet
Properties of a one-prong jet can also be revealed by measuring observables associated with smaller sub-jets inside it, because different kinds of jets have different energy profiles in the $y$-$\phi$ plane. For example, most of the energy of a quark jet is concentrated in a small central region, while the energy of a gluon jet spreads over a larger area [53]. Here we define a sub-jet by re-clustering the constituents of the original jet with the anti-kt algorithm and a jet radius R = 0.1. We require the $p_T$ of these sub-jets to be larger than 5% of the original jet's $p_T$. We define $f^{(i)}_{p_T}$ as the $p_T$ of the $i$-th hardest sub-jet divided by the $p_T$ of the original jet:

$$ f^{(i)}_{p_T} = \frac{p_T^{\,\text{sub-jet } i}}{p_T^{\,\text{jet}}} $$

Three variables are used here: 1) the number of sub-jets, 2) the $p_T$ fraction carried by the hardest sub-jet, $f^{(1)}_{p_T}$, and 3) the $p_T$ fraction carried by the second hardest sub-jet, $f^{(2)}_{p_T}$. Simulation results are shown in Fig. 8. These distributions have a clear physical meaning. A QCD quark jet, with a small coupling and color factor, triggers a large-angle shower only with a quite low probability. Hence a quark jet is very likely to concentrate most of its energy in a tiny cone with a radius smaller than 0.1. Due to a larger color factor, a QCD gluon jet is "broad" compared to a "narrow" quark jet, which means that the energy of a gluon jet is distributed over a larger area and a gluon jet is more likely to contain more sub-jets. A dark jet, through a larger coupling, becomes even broader and contains more sub-jets. The $p_T$ fractions of the sub-jets follow naturally from this argument. Among these three variables, the $p_T$ fraction of the hardest sub-jet, $f^{(1)}_{p_T}$, shows the best discriminant ability. Similar to $C_1^{(\beta)}$, Girth, and jet mass, this variable is only useful for tagging dark jets with a high confinement scale.
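The three sub-jet variables can be sketched in Python as follows. This is only an illustration: it assumes that the anti-kt re-clustering with R = 0.1 has already been done elsewhere (for example with a jet clustering library) and that only the resulting sub-jet $p_T$ values are passed in. The function name, the example numbers, and the convention of returning 0 for a missing sub-jet are our assumptions for this sketch.

```python
def subjet_variables(subjet_pts, jet_pt, min_fraction=0.05):
    """Return (number of sub-jets, f1, f2) for one jet.

    subjet_pts   : pT values (GeV) of the re-clustered R = 0.1 sub-jets
    jet_pt       : pT (GeV) of the original jet
    min_fraction : sub-jets below this fraction of jet_pt are dropped (5% here)
    """
    fractions = sorted((pt / jet_pt for pt in subjet_pts), reverse=True)
    kept = [f for f in fractions if f > min_fraction]
    # Assumed convention: a missing sub-jet contributes a pT fraction of 0.
    f1 = kept[0] if len(kept) > 0 else 0.0
    f2 = kept[1] if len(kept) > 1 else 0.0
    return len(kept), f1, f2


# Illustrative 200 GeV jet with four candidate sub-jets; the 6 GeV one
# carries only 3% of the jet pT and is removed by the 5% requirement.
n_sub, f1, f2 = subjet_variables([120.0, 50.0, 20.0, 6.0], jet_pt=200.0)
```

A narrow quark-like jet would typically return one sub-jet with `f1` close to 1, while a broad jet would return several sub-jets with a smaller `f1`.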

Combine multiple variables
To maximize the tagging performance with multiple jet-substructure variables, we need to consider the correlations among them. For example, the correlation plot in Fig. 9, a two-dimensional profile in the plane of $C_1^{(\beta)}$ and track multiplicity, can be used to separate different jets. A standard cut flow behaves like the ordinary "ABCD" method, which cuts along the x-axis and the y-axis with straight lines. To cut away the high density region of background QCD jets in the multi-dimensional profile built from various jet-substructure variables, we use a Boosted Decision Tree (BDT) [79] in the TMVA toolkit [80].
We use 500 decision trees, set the minimum node size in a leaf node to 2.5%, and set the maximum depth to 3. To avoid overtraining, half of the events are chosen as test events and the Kolmogorov-Smirnov test value is required to be larger than 0.01. Generally, if we use more variables we might [...] $f^{(2)}_{p_T}$}. In Fig. 10, we show that the minimal combination of {$C_1^{(\beta)}$, [...]} can achieve a discriminant power comparable to the result with all eight variables (see footnote 5). For comparison we also provide the ROC curve of dark jet vs. quark jet. As one could expect from the distributions of the variables in the previous subsections, the difference between dark jets and quark jets is much larger than the difference between dark jets and gluon jets. In order to understand how the performance behaves with increasing jet $p_T$, we studied mid-$p_T$ jets of 400 GeV and high-$p_T$ jets of 800 GeV by choosing jet $p_T$ ranges of (360 GeV, 440 GeV) and (720 GeV, 880 GeV), shown in the bottom of Fig. 10. Larger $p_T$ gives a better discriminant performance.

5 Actually, besides these eight variables we also considered other variables, such as jet charge, $p_T$-weighted jet charge obtained from tracks, and $p_T$-weighted track multiplicity. But the improvement from adding these variables is negligible.

LHC example
In section 2, we introduced a mediator particle X, which links the dark sector with the SM. As X is charged under both the SM SU(3) and the dark SU(3), pairs of X particles would be produced at the LHC through QCD processes. Once a mediator X is produced, it decays into a SM quark and a dark quark, which evolve into a QCD jet and a dark jet respectively. If the decay length of a dark meson is around O(10) to O(100) mm, a dark jet will leave displaced vertices in the detector. By counting the number of displaced vertices, one can obtain a robust limit on the mass of the mediator particle X [35,83]. If the decay length of a dark meson is shorter than 1 mm, analyses with displaced vertices will lose sensitivity. In this section we show how tagging dark jets with jet-substructure variables can be used to enhance the search sensitivity for promptly decaying dark mesons. We consider dark sector setting A in Tab. 2 as an example for the LHC study. Our analysis is based on the ATLAS search for pair-produced resonances in four-jet final states [84].
Here we briefly describe the cut flow used in the ATLAS report [84]:
• Events are required to have at least 4 jets with $p_T > 120$ GeV and $|\eta| < 2.4$.
• These 4 jets are paired by minimizing $\Delta R_{\min} = \sum_{i=1,2} |\Delta R_i - 1|$, with $\Delta R_i$ the angular distance between the two jets in pair $i$.
• Define $m_{avg}$ as the average of the invariant masses $m_1$ and $m_2$ of the two jet pairs, $m_{avg} = \frac{1}{2}(m_1 + m_2)$.
• Boost the system of the two resonances (two jet pairs) to their centre-of-mass frame. $\cos\theta^*$ is defined as the cosine of the angle between one of the resonances and the beam-line in the centre-of-mass frame. The mass asymmetry $\mathcal{A}$ is defined as $\mathcal{A} = |m_1 - m_2| / (m_1 + m_2)$. Events are selected by requiring $\mathcal{A} < 0.05$ and $|\cos\theta^*| < 0.3$. This cut defines the inclusive signal region (SR) selection.

This analysis uses only the kinematics of the final state jets, namely $p_T$, $\eta$, and $\phi$. However, as we have presented in section 3, one can get more information by looking inside a jet. If the resonance is the mediator particle X, there will be two dark jets in the final state. So, by tagging dark jets, the search sensitivity can be enhanced. Our strategy is to use training samples of QCD jets (background) and dark jets (signal) to build a map between the jet-substructure variables and the BDT score. Dark jet tagging can then be performed by cutting on the BDT score. A similar method has been used in a SUSY study [40].
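The pairing, average mass, and mass asymmetry steps of the cut flow above can be sketched in Python as follows. This is a simplified illustration, not the ATLAS implementation: jets are hypothetical `(pT, eta, phi)` tuples treated as massless, the mass asymmetry is taken as $|m_1 - m_2|/(m_1 + m_2)$, and the boost needed for $\cos\theta^*$ is left out.

```python
import math


def four_vector(pt, eta, phi, mass=0.0):
    """Convert (pT, eta, phi, mass) to (E, px, py, pz)."""
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    return (math.sqrt(px * px + py * py + pz * pz + mass * mass), px, py, pz)


def delta_r(j1, j2):
    """Angular distance in the eta-phi plane, with phi wrapped to [-pi, pi)."""
    deta = j1[1] - j2[1]
    dphi = (j1[2] - j2[2] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(deta, dphi)


def pair_mass(j1, j2):
    """Invariant mass of the two-jet system."""
    e, px, py, pz = (a + b for a, b in zip(four_vector(*j1), four_vector(*j2)))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


# The three ways to split four jets into two pairs.
PAIRINGS = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]


def pair_jets(jets):
    """Choose the pairing minimizing sum_i |dR_i - 1|; return it with m_avg and A."""
    def cost(pairing):
        return sum(abs(delta_r(jets[a], jets[b]) - 1.0) for a, b in pairing)

    best = min(PAIRINGS, key=cost)
    m1, m2 = (pair_mass(jets[a], jets[b]) for a, b in best)
    m_avg = 0.5 * (m1 + m2)
    asymmetry = abs(m1 - m2) / (m1 + m2)  # assumes m1 + m2 > 0
    return best, m_avg, asymmetry


# Four illustrative leading jets as (pT [GeV], eta, phi).
jets = [(300.0, 0.5, 0.0), (250.0, -0.4, 0.5), (280.0, -0.5, 3.1), (260.0, 0.4, 2.7)]
best, m_avg, asymmetry = pair_jets(jets)
passes_asymmetry_cut = asymmetry < 0.05
```

In this example jets 0 and 1 are close to each other, as are jets 2 and 3, so that pairing has the smallest cost.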
Training samples are still generated by ff → Z → q q , qg → Zq, and qq → Zg in [...]

Figure 12. The 95% CL upper limit on the production cross section of the X pair (in fb), where the branching ratio of X to a SM quark and a dark quark is assumed to be 100%. The red line is the production cross section of the X pair at the 13 TeV LHC. The blue dashed line is the upper limit obtained by recasting the cut flow of the ATLAS report [84]. The black dashed line is the upper limit obtained with our dark jet tagging method.
The SM QCD four-jet background and the signal events from X pair production are generated with Pythia 8. For the background simulation, we generate over 1 billion events, and the number of events after the inclusive cut is normalized to the data observed in the ATLAS report [84]. The production cross section of the X pair is the production cross section of a stop pair multiplied by 3 [85], because we are considering a dark SU(3) gauge group. In Fig. 11 we show the BDT score distributions of the 4 leading jets of background and signal after requiring at least 4 jets with $p_T > 120$ GeV and $|\eta| < 2.4$. If we tag a jet with a BDT score larger than 0.4 as a dark jet, and require one or two dark jets in the final state, the direct search sensitivity can be greatly enhanced. In Tab. 4 we list the numbers of background and signal events after requiring one or two jets in the final state to be tagged as dark jets. It can be seen that the QCD background is strongly suppressed by the dark jet tagging requirement, while the signal does not change much. The significances in this table are estimated by $S/\sqrt{B + \epsilon^2 B^2}$. Here S and B are the numbers of signal and background events respectively, and $\epsilon$ is the systematic uncertainty, which we assume to be 10% as a conservative approach. Finally, we give the 95% confidence level upper limit on the cross section of X pair production for different masses in Fig. 12.
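As a short illustration of the significance estimate, the snippet below evaluates $S/\sqrt{B + \epsilon^2 B^2}$ with a 10% systematic uncertainty. The event counts are made-up numbers chosen only to show the arithmetic; they are not entries of Tab. 4.

```python
import math


def significance(n_signal, n_background, syst=0.10):
    """S / sqrt(B + (syst * B)^2), with syst the relative background uncertainty."""
    return n_signal / math.sqrt(n_background + (syst * n_background) ** 2)


# Hypothetical counts: with B = 100 the statistical term (B = 100) and the
# systematic term ((0.1 * 100)^2 = 100) contribute equally under the root.
z_with_syst = significance(50.0, 100.0)
z_stat_only = significance(50.0, 100.0, syst=0.0)
```

Because the systematic term grows like $B^2$, it dominates for large backgrounds, so suppressing the background through dark jet tagging improves the significance more than the purely statistical $S/\sqrt{B}$ would suggest.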
In order to compare with the method without dark jet tagging, in Fig. 12 we also show the upper limit obtained by using the cut flow of the ATLAS report [84]. In report [84], after the inclusive cut, several mass windows are designed to further increase the sensitivity. For a given resonance mass, the average mass $m_{avg}$ is required to lie in a narrow region around it. However, due to the strong shower in dark sector setting A, the average mass obtained from the 4 leading final state jets is distributed over a large mass region. Thus the mass window cut discards too many signal events and results in a low sensitivity. Fig. 12 shows that the limit from the recast of the ATLAS report is much weaker than the limit from our dark jet tagging method.

Conclusion
A dark sector with a strong interaction provides composite states and a correspondingly attractive phenomenology. The large number of theoretical degrees of freedom in this scenario leads to a diverse and model dependent phenomenology. At colliders, such models introduce jet-like signals (called "dark jets"), some of which may not be tagged by distinct or exotic signatures such as missing energy or displaced vertices. In this work, inspired by the success of quark/gluon jet discrimination, we try to distinguish dark jets from background SM QCD jets by using jet-substructure variables. A series of jet-substructure variables, such as the jet mass, $C_1^{(\beta)}$, and track multiplicity, are used to discriminate dark jets from QCD jets. The combination of these variables in a boosted decision tree (BDT) shows a good discriminant performance. For all of our model settings and jet $p_T \sim 200$ GeV, we can exclude 99% of background gluon jets while retaining more than 30% of signal dark jets, or exclude 99% of background quark jets while retaining more than 50% of signal dark jets. The corresponding theoretical uncertainty is also briefly discussed. Our results demonstrate that by considering the information inside a jet, we get a much better understanding of dark jets and enhance the collider search sensitivity to signatures of dark QCD models at the LHC.

A Uncertainty Discussion
The discriminant ability shown in the previous sections might be quite sensitive to the theoretical uncertainty of the Monte Carlo event generator. In analyses of quark-gluon jet tagging, one can tune the parameters of the Monte Carlo event generator to real data to reduce systematics and improve predictability. Thus, one can simulate quark jets very well with Monte Carlo simulation. For gluon jets, it is also known that the real data lie in between the Pythia and Herwig [81] expectations. More information can be found in the recent review [69].

For a dark jet, however, we cannot estimate the systematics, as we do not have dark jet signals at the LHC. Thus the parameters used in simulating dark QCD hadronization and showering leave unfixed systematics in our analyses. On top of this difficulty, as no Monte Carlo generator other than Pythia 8 is available for dark jet simulation, we cannot compare different event generators to estimate the uncertainty from different showering and hadronization schemes. Instead, we make a simple estimate in this work. Changing the renormalization scale in the parton shower has been shown to be a good method to estimate the theoretical uncertainty in Pythia [82]. Following this method, we rescale the renormalization scale in the dark sector shower from $0.5\mu^2$ to $2.0\mu^2$ and see how the ROC curve obtained in section 3 changes. Our result is shown in [...] $C_1^{(\beta)}$, E-ratio, Track Multiplicity}. If we fix the acceptance of background gluon jets to be 1%, the acceptance of signal dark jets changes from 30% to 25%. So we can conclude that our dark jet discriminant method is quite robust against this theoretical uncertainty.
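The working point quoted above, a fixed 1% background acceptance and the resulting signal acceptance, can be read off a set of BDT scores as in the following Python sketch. The function and the toy score lists are hypothetical and only illustrate the procedure; repeating it on samples generated with different renormalization scales would give the kind of comparison described in this appendix.

```python
def signal_acceptance_at(bkg_scores, sig_scores, bkg_acceptance=0.01):
    """Return (threshold, signal acceptance) at a fixed background acceptance.

    Jets scoring strictly above the threshold are tagged, so at most
    bkg_acceptance * len(bkg_scores) background jets are accepted.
    """
    # The small offset guards against floating point round-off in the product.
    n_keep = int(bkg_acceptance * len(bkg_scores) + 1e-9)
    ranked = sorted(bkg_scores, reverse=True)
    # The threshold is the score of the first background jet that is rejected.
    threshold = ranked[n_keep] if n_keep < len(ranked) else float("-inf")
    n_pass = sum(1 for s in sig_scores if s > threshold)
    return threshold, n_pass / len(sig_scores)


# Toy scores: 200 background jets spread evenly over [0, 1) and 10 signal jets.
bkg = [i / 200 for i in range(200)]
sig = [0.5, 0.9, 0.99, 0.995, 0.999, 1.0, 0.2, 0.99, 0.3, 0.7]
threshold, sig_acc = signal_acceptance_at(bkg, sig)
bkg_acc = sum(1 for s in bkg if s > threshold) / len(bkg)
```

Only the change of the signal acceptance at the fixed background working point enters the comparison, which keeps it independent of how the BDT score itself is normalized.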