Double-charming Higgs identification using machine-learning assisted jet shapes

We study the possibility of identifying a boosted resonance that decays into a charm pair against different sources of background using QCD event shapes, which are promoted to jet shapes. Using a set of jet shapes as input to a boosted decision tree, we find that observables utilizing the simultaneous presence of two charm quarks can access complementary information compared to approaches relying on two independent charm tags. Focusing on Higgs associated production with subsequent $H\to c \bar{c}$ decay and on a CP-odd scalar $A$ with $m_A \leq 10$ GeV, we obtain the limits $\mathcal{B}r(H\rightarrow c\bar{c})\leq 6.09\%$ and $\mathcal{B}r(H\rightarrow A(\rightarrow c\bar{c}) Z)\leq 0.01\%$ at $95\%$ C.L.

In the SM the Yukawa interaction describes the coupling of the Higgs boson to a fermion $f$ with a strength given by the Yukawa coupling $y_f^{\rm SM}$. Deviations from the SM expectation can be parametrised by $\kappa_f = y_f/y_f^{\rm SM}$, which can be deduced from a measurement of the signal strength $\mu_f$, defined as $\mu_f = \sigma_H\,\mathcal{B}r_{f\bar{f}}/(\sigma_H^{\rm SM}\,\mathcal{B}r_{f\bar{f}}^{\rm SM})$. Here $\sigma_H$ is the Higgs boson production cross section and $\mathcal{B}r_{f\bar{f}}$ is the branching ratio of the decay process $H \to f\bar{f}$. Currently the couplings between the Higgs boson and the third-generation fermions are consistent with the SM expectations: $\mu_t = 2.2 \pm 0.6$ [6] (see [7] for slightly older values), $\mu_b = 0.90 \pm 0.18\,^{+0.21}_{-0.19}$ [8] and $\mu_\tau = 0.98 \pm 0.18$ [9]. However, much less is known about the couplings of the Higgs boson to fermions of the first two families: the current bounds found by ATLAS [10] and CMS [11] are $\mu_e \leq 4 \times 10^5$ and $\mu_\mu \leq 7$. During the LHC's High-Luminosity run $\mu_\mu \sim 1$ might be achievable [12], while the electron coupling to the Higgs is far below the experimental sensitivity. Here a future $e^+e^-$ collider could get close to the SM value [13,14].
A global fit to Higgs signal strengths gives the strongest bound, $\kappa_c \leq 6.2$ [7]. Modifications of the charm Yukawa coupling can occur in different new physics models [23][24][25][26][27][28]; the coupling can even be zero [6]. Our aim is to develop a strategy that allows one to set a direct upper limit on the charm Yukawa coupling.
The improvement in our bounds on $\mu_c$, derived from inclusive analyses, depends strongly on the c-tagging efficiency at the LHC. While dedicated charm tagging algorithms are relatively new [29], flavour tagging has been used in the identification of jets derived from the hadronization of b quarks for more than 20 years, and was employed at the Tevatron for the discovery of the top quark [30,31]. Two features of the b-mesons are exploited to achieve a good b-tagging performance: 1) the dominance of semileptonic rates when a b-hadron decays and 2) the long lifetime of b-hadrons. For the latter one can search for displaced secondary vertices (decay vertices) of b-hadrons with respect to the primary vertex (interaction point) in a given event. This distance, known as impact parameter, is normally larger for b-hadrons than for states obtained from the hadronization of light quarks (u, d, s) and gluons. A similar approach can be followed for c-jets. However, as the tagging procedures for b-jets and c-jets are quite similar, their mutual mis-identification rates are quite large.
In general, bottom- or charm-taggers are designed to find jets initiated by individual b or c quarks, allowing for a generic use of these algorithms in a wide range of applications. However, in searches for light or boosted resonances that decay into a charm or bottom pair, such algorithms might not be ideal, as they neglect correlations between the decay products. For example, if the decaying resonance is a colour-singlet particle, its decay products are colour connected, and soft gluon emissions from either decay product are preferentially emitted into the cone between the quark pair [32]. Thus, to increase the sensitivity in searches for new physics or in Higgs boson measurements, it can be beneficial to design dedicated 2-prong reconstruction algorithms that make use of more information about the decaying resonance. Observables that are particularly sensitive to the radiation profile of the event are the so-called event shape observables [33,34], which have been proposed as hypothesis testers in the study of Higgs boson properties [35,36]. By promoting those well-studied observables to jet shape observables, applied to a fat jet, they can be used as input to machine-learning algorithms to separate signal from large QCD backgrounds.
In this letter we present a procedure to identify jets initiated by $c\bar{c}$ pairs from Higgs boson decays based on the application of different event shapes and the transverse momenta of leptons ($e^\pm$ and $\mu^\pm$). It is expected that high-$p_T$ jets arising from highly boosted Higgs bosons have a different energy flow in comparison to jets arising from pure QCD backgrounds. We would like to emphasize that in the double tagging strategy presented in this work, we study the energy distribution of the full jet associated with the boosted Higgs boson decaying into the $c$ and the $\bar{c}$ quark, without separating the corresponding subjets after hadronization. Our analysis is based on fully showered and hadronised Monte Carlo events, and the results obtained can be considered as an upper bound on the performance of a more complete study in which detector effects are also included.
The structure of the paper is as follows: In Sec. II we describe the event generation and the selection criteria. Then in Sec. III we present the performance of our approach for the selection of the SM Higgs boson $H$ against different sources of background. Using the tagging efficiencies derived from the optimization against QCD c-jets, we present an upper bound for our sensitivity to $\mathcal{B}r(H \to c\bar{c})$. In order to evaluate the efficiency of the simultaneous double c-tagging identification relative to strategies based on the double application of a single c-tagger, we compare our results with those obtained by applying the ATLAS single charm tagging algorithm JetFitterCharm. Sec. IV is devoted to the study of the decay channel $H(\to A(\to c\bar{c})Z) + \text{jets}$, with $A$ the THDM CP-odd scalar. Finally, in Sec. V we conclude. The discussion is complemented with the correlation matrices among the event shapes used, as well as with the distributions for the leading ones in each of our studies. A brief description of most of the observables considered is included in the appendix.

II. EVENT GENERATION AND EVENT SELECTION
The signal channels are $pp \to H(\to c\bar{c})Z$ and $pp \to H(\to A(\to c\bar{c})Z) + \text{jets}$. Here $H$ is the SM Higgs boson and $A$ denotes the CP-odd THDM scalar. As background channel we include $pp \to Z + \text{jets}$. In all cases we consider $Z \to l^+ l^-$, for $l = e, \mu$. We take into account two possible values for the mass of the scalar $A$, $m_A = 4$ GeV and $m_A = 10$ GeV. We generate our samples with SHERPA 2.2.1 [37] at $\sqrt{s} = 13.0$ TeV, and include parton shower, hadronization and underlying event contributions. For the jet reconstruction we use the jet finding package FastJet 3.2.1 [38]. The event selection is performed with version 2.4.2 of the RIVET analysis framework [39].
Our selection strategy is based on the identification of the Higgs and a $Z$ boson in the highly boosted regime, when both particles have a large transverse momentum and are back-to-back. In order to reconstruct the $Z$ boson we require two isolated leptons $l^+ l^-$ (for $l = e, \mu$) with a combined mass satisfying $80.0\ \text{GeV} < m_{ll} < 100.0\ \text{GeV}$. A lepton $l$ is considered isolated if the inequality $E_l/E_R < 0.1$ is satisfied, where $E_l$ is the energy of $l$ and $E_R$ is the total energy inside a cone of radius $R = 0.3$ around $l$. The identification of the $Z$ boson concludes by imposing a cut $p_T > 200.0$ GeV on the combined transverse momentum of the pair $l^+ l^-$. We proceed with the next steps only if the $Z$ boson has been successfully reconstructed as described in this paragraph.
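The mass window and the transverse momentum cut on the lepton pair can be illustrated with a short sketch. It is not the RIVET analysis used here: the function names and the toy four-vectors are hypothetical, and the leptons are assumed to have already passed the isolation requirement and to have opposite charge and the same flavour.

```python
import math

# Four-vectors are (E, px, py, pz) in GeV. All names are illustrative
# and are not taken from the analysis code.

def dilepton_mass_and_pt(l1, l2):
    """Invariant mass and transverse momentum of the lepton pair."""
    e = l1[0] + l2[0]
    px = l1[1] + l2[1]
    py = l1[2] + l2[2]
    pz = l1[3] + l2[3]
    m2 = e * e - (px * px + py * py + pz * pz)
    return math.sqrt(max(m2, 0.0)), math.hypot(px, py)

def is_boosted_z(l1, l2, m_min=80.0, m_max=100.0, pt_min=200.0):
    """Apply the mass window and the p_T cut quoted in the text.

    Assumes both leptons are already isolated, oppositely charged
    and of the same flavour.
    """
    mass, pt = dilepton_mass_and_pt(l1, l2)
    return m_min < mass < m_max and pt > pt_min

# Toy Z boson with m = 91 GeV and p_T = 250 GeV decaying to massless leptons.
e_lep = math.hypot(125.0, 45.5)
lep_plus = (e_lep, 125.0, 45.5, 0.0)
lep_minus = (e_lep, 125.0, -45.5, 0.0)
toy_mass, toy_pt = dilepton_mass_and_pt(lep_plus, lep_minus)
toy_pass = is_boosted_z(lep_plus, lep_minus)
```

The toy pair is built so that the dilepton system has a mass of about 91 GeV and a transverse momentum of 250 GeV, so it satisfies both requirements; the same Z boson produced at rest would fail the transverse momentum cut.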
A boosted Higgs decaying into a pair of quarks $q\bar{q}$ produces a jet with a relatively large active area $R_{q\bar{q}}$, and thus is commonly referred to as a fat jet. As a matter of fact, in the boosted regime, the radius of the jet depends on the mass and the transverse momentum of the Higgs ($m_H$ and $p_{T,H}$) as well as on the momentum fractions of the quark and the antiquark ($z$ and $1 - z$) according to $R_{q\bar{q}} = m_H/(p_{T,H}\sqrt{z(1 - z)})$. Thus, for a Higgs boson of mass $m_H \simeq 125$ GeV and a transverse momentum $p_T \simeq 200$ GeV decaying symmetrically into a charm-anticharm pair, we expect an angular separation of the Higgs decay products of $R_{c\bar{c}} \simeq 1.25$. In practice we demand jets with radius $R = 1.2$ and a transverse momentum $p_T > 200$ GeV, reconstructed with the anti-$k_T$ algorithm, and select the jet with the highest $p_T$. We translate all the constituents of this jet to the plane $\eta = 0$ by taking $p_z = 0$ and replacing their total energy by their corresponding transverse energy [40].
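The quoted separation can be checked numerically. The sketch below uses the boosted-regime estimate $R = m_H/(p_{T,H}\sqrt{z(1-z)})$, which reproduces the value 1.25 for a symmetric decay; the function name is hypothetical.

```python
import math

def fat_jet_radius(mass, pt, z=0.5):
    """Angular separation R = m / (pT * sqrt(z (1 - z))) of the two decay quarks."""
    return mass / (pt * math.sqrt(z * (1.0 - z)))

# Symmetric decay (z = 1/2) of a 125 GeV Higgs boson with pT = 200 GeV.
r_symmetric = fat_jet_radius(125.0, 200.0)

# An asymmetric momentum sharing opens the pair further.
r_asymmetric = fat_jet_radius(125.0, 200.0, z=0.2)
```

For the symmetric case the result is 1.25, which motivates the jet radius $R = 1.2$ used in the selection; asymmetric decays give larger separations and are therefore captured less completely by a jet of that size.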
From NLO-QCD calculations, in the SM the decay fractions of c-quarks into leptons obey with good approximation [41] $\mathcal{B}r(c \to \bar{l}\nu_l X) = (21.74 \pm 3.90)\%$ and $\mathcal{B}r(c \to X') = 100\% - \mathcal{B}r(c \to \bar{l}\nu_l X)$, where $\bar{l} = \bar{e}, \bar{\mu}$ and $X$, $X'$ denote quark final states. Hence, if we consider jets originating from the hadronization of $c\bar{c}$ pairs, we can expect to find 0, 1 and 2 leptons with probabilities of 61.24%, 34.03% and 4.73% respectively. For each of our analyses we perform three independent studies: non-leptonic, single-leptonic and double-leptonic, if zero, one or two non-isolated leptons are found inside the fat jet respectively. A cut on the transverse momentum of the leptons of $p_{T,l} \geq 2.0$ GeV allows us to reproduce these numbers to a good approximation. Nevertheless, we consider this to be a relatively soft cut, hence in practice we impose the constraint $p_{T,l} \geq 5.0$ GeV.
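The three lepton multiplicities follow from treating the semileptonic decays of the charm and the anticharm quark as independent, which is the assumption made in the sketch below (the function name is hypothetical).

```python
from math import comb

def lepton_multiplicity_probs(br_lep, n_quarks=2):
    """Binomial probabilities of finding k = 0..n_quarks leptons,

    assuming each quark decays semileptonically with probability br_lep,
    independently of the other.
    """
    return [
        comb(n_quarks, k) * br_lep**k * (1.0 - br_lep) ** (n_quarks - k)
        for k in range(n_quarks + 1)
    ]

# Central value of the semileptonic charm branching fraction quoted in the text.
p_zero, p_one, p_two = lepton_multiplicity_probs(0.2174)
```

With the central value 21.74% the three probabilities agree with the quoted 61.24%, 34.03% and 4.73% up to rounding in the last digit.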
If an event is selected, we probe the substructure of the highest-$p_T$ fat jet by applying a collection of different event shapes to its constituents, thereby promoting the event shapes to jet shapes. We follow this procedure separately for each of the leptonic categories introduced in the previous paragraph. To evaluate the signal efficiency and mis-tag rate of our observables we use a multivariate analysis implemented in the TMVA package [42] and consider a Boosted Decision Tree (BDT) as our classifier. In addition to the event shapes, for the single-leptonic and double-leptonic categories we also include the transverse momentum of the highest-$p_T$ light lepton found inside the selected fat jet.

III. STANDARD MODEL HIGGS $c\bar{c}$-TAGGING USING EVENT SHAPES

A. Performance
We begin by obtaining the performance of our strategy when selecting the signal channel $pp \to H(\to c\bar{c})Z$ against $pp \to Z + \text{jets}$. The set of observables that gives the best performance is presented in Table I. Our curves for the signal selection efficiencies as well as our background fake rates in each of the leptonic studies are shown in Fig. 1. We can combine the three leptonic studies to obtain a single selection efficiency for signal (S) and for background (B) according to Eq. (1), in which the partial efficiency of each leptonic category enters together with the corresponding leptonic fraction. Our optimal point after the combination of the different leptonic categories corresponds to
$\varepsilon_{c\bar{c}} = 0.40$, $\varepsilon_{\rm QCD,jets} = 0.03$ (2)
obtained from the following partial efficiencies S. = 0.49, ε S. = 0.19, ε B. = 0.04 (3) and the leptonic fractions shown in Table II. Our branching ratio for the process $H \to c\bar{c}$ is then $\mathcal{B}r(H \to c\bar{c}) = 6.1\%$, leading to the cross section $\sigma_{pp\to H(\to c\bar{c})Z} = 0.08$ fb. In the case of the background the corresponding cross section is $\sigma_{pp\to Z+\text{jets}} = 23.56$ fb. Based on these results and considering the integrated luminosity $\int L\,dt = 3000\ \text{fb}^{-1}$, we can verify the $2\sigma$ condition for the significance, $S/\sqrt{B} = 2.0$.
In order to evaluate the performance of our double charm identification approach against more conventional lifetime-based single charm tagging procedures, we provide a "naive" comparison with the ATLAS JetFitterCharm algorithm [29]. In the ATLAS study two main sources of background are considered: the first is light-flavor jets, i.e. jets arising from the hadronization of $g, u, d, s, \bar{u}, \bar{d}, \bar{s}$; the second is heavy-flavor jets, in this context b-jets.
From [29] we extract the JetFitterCharm single selection efficiencies $\varepsilon_c$, $\varepsilon_b$ and $\varepsilon_{\rm light}$ for the charm-jets, b-jets and light-jets respectively. The double tagging coefficients for each category are calculated as $\varepsilon_{2c} = \varepsilon_c^2$, and analogously for the other categories. For the comparison of the different tagging strategies, we used the boosted Higgs search described in Sec. II, where the dominant backgrounds are light-flavor jets $+ Z$ and $b\bar{b}$ jets $+ Z$. As the JetFitterCharm efficiencies are not provided in terms of separate analyses for the different leptonic categories introduced in Sec. II, we combine the selection efficiencies achieved in our approach for the non-leptonic, single-leptonic and double-leptonic studies for a given background according to Eq. (1).
We find the best results in rejecting light-flavor jets, which have in this analysis a cross section that is at least an order of magnitude bigger than the $b\bar{b}$ background. In Fig. 2 we show the ROC curves for the different leptonic analyses and in Fig. 3 we present the performance obtained from the combination of the leptonic categories. Without access to the ATLAS detector simulation a direct comparison between the two approaches is not feasible. However, it can be inferred from Fig. 3 that for $\varepsilon_{2c} < 0.16$ the jet-shapes strategy shows a strong performance and is likely to add to the tagging strategy employed by ATLAS. Consequently, using event shapes it is possible to outperform the double application of a charm tagger by a single application of a double-charm tagger. This is achieved by looking at the full radiation profile inside a fat jet, without disentangling the radiation signatures of the $c$-quark and the $\bar{c}$-quark independently.
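As a rough cross-check of the significance condition quoted in Sec. III A, the expected signal and background yields can be recomputed from the rounded numbers given there. The sketch assumes that the quoted cross sections are multiplied by the combined tagging efficiencies of Eq. (2) and by the integrated luminosity; the function name is hypothetical.

```python
import math

def significance(sigma_s_fb, sigma_b_fb, eff_s, eff_b, lumi_fb):
    """Simple S / sqrt(B) estimate from cross sections, efficiencies and luminosity."""
    n_signal = sigma_s_fb * eff_s * lumi_fb
    n_background = sigma_b_fb * eff_b * lumi_fb
    return n_signal / math.sqrt(n_background)

# Rounded inputs quoted in the text: 0.08 fb and 23.56 fb, efficiencies
# 0.40 and 0.03, and an integrated luminosity of 3000 fb^-1.
z_value = significance(0.08, 23.56, 0.40, 0.03, 3000.0)
```

With these rounded inputs the estimate comes out slightly above 2, consistent with the $2\sigma$ condition within the precision of the quoted numbers.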

IV. CP ODD THDM SCALAR
The coupling between the CP-odd THDM scalar $A$ and the pair $c\bar{c}$ is directly proportional to the charm quark mass $m_c$ and inversely proportional to the THDM vacuum ratio $\tan\beta$. As shown in [43], the decay channel $A \to c\bar{c}$ is expected to be dominant for $4.0\ \text{GeV} \lesssim m_A \lesssim 10.0\ \text{GeV}$ and low values of $\tan\beta$. Here we determine a 95% C.L. upper bound for the branching ratio $\mathcal{B}r(H \to A(\to c\bar{c})Z)$ in this mass range. Our signal is the process $pp \to H(\to A(\to c\bar{c})Z) + \text{jets}$ and our background is given by $pp \to Z + \text{jets}$. For $m_A = 4.0$ GeV the combination of observables that gives the best performance is presented in Table IV; from these the ROC curves corresponding to the different leptonic categories are determined, see Fig. 4. The optimal selection efficiency point results from the efficiencies S. = 0.69, ε B. = 0.01 (5) combined with the leptonic fractions presented in Table V as given in Eq. (1). Thus, we get the following 95% C.L. upper limit for the branching ratio, $\mathcal{B}r(H \to A(\to c\bar{c})Z) < 0.01\%$, leading to the cross section for the signal process $\sigma_{pp\to H(\to A(\to c\bar{c})Z)+\text{jets}} = 0.02$ fb. For comparison, using track-based substructure observables and considering $m_A = 4.0$ GeV, the 95% C.L. bound $\mathcal{B}r(H \to A(\to c\bar{c})Z) \leq 2.1\%$ has been previously determined in [19].
[Table fragment, light-quark jets: the observables listed for the non-leptonic, single-leptonic and double-leptonic categories include the Fox Wolfram-like moment ($n = 1/4$), the fractional energy correlation ($x = 1.5$), the C parameter, the 3-jet resolutions $y_3$ Durham (P-scheme) and $y_3$ Jade, and $P_{T,e}$.]
For $m_A = 10$ GeV the observables per leptonic category that yield the best selection efficiency curves, shown in Fig. 5, are presented in Table VI. Our optimal result corresponds to
$\varepsilon_{c\bar{c},\,m_A = 10\ \text{GeV}} = 0.38$, $\varepsilon_{\rm QCD,jets} = 0.0004$ (6)
calculated from the individual efficiencies per leptonic category (S. = 0.29, ε [...]) and the partial fractions of leptonic events presented in Table VII. The 95% C.L. limit on the branching ratio is $\mathcal{B}r(H \to A(\to c\bar{c})Z) \leq 0.003\%$, leading to the signal cross section $\sigma_{pp\to H(\to A(\to c\bar{c})Z)+\text{jets}} = 0.01$ fb. The correlation matrices for the analyses of this section and the histograms for the main discriminating observables are shown in Figs. 8-9 and Figs. 12-13 respectively.

V. CONCLUSIONS
[Table fragment: observables listed include $P_{T,e}$, the C parameter, the 3-jet resolutions $y_3$ Jade (E-scheme) and $y_3$ Durham (P-scheme), and the 4-jet resolution $y_4$ Durham (P-scheme).]
We have studied the efficiency of event shapes for tagging jets resulting from $c\bar{c}$ pairs originating in the decay $H \to c\bar{c}$. The results obtained can be considered as an optimal limit for the performance of our selection strategy, as we have not included detector effects. We have optimized our analysis depending on the main backgrounds in the selected processes and have taken into account the following possibilities: $pp \to q\bar{q}Z$ for $q = \{u, d, s, c\}$ and $pp \to ggZ$. Our signal channel is $pp \to H(\to c\bar{c})Z$ and we select highly boosted Higgs bosons. Using jet shape observables as input to a BDT, we find a good performance in separating the $c\bar{c}$ signal from $b\bar{b}$ and light-flavor fat jets.
Thus, with this approach we can project an upper limit $\mathcal{B}r(H \to c\bar{c}) \leq 6.1\%$ with SM production rates for $\int L\,dt = 3000.0\ \text{fb}^{-1}$ and $\sqrt{s} = 13.0$ TeV. Following an analogous strategy we have studied the CP-odd THDM scalar $A$ decaying into $c\bar{c}$ pairs. In particular we have determined $\mathcal{B}r(H \to A(\to c\bar{c})Z) \leq 0.01\%$ by considering masses for $A$ inside the range $4.0\ \text{GeV} \lesssim m_A \lesssim 10.0\ \text{GeV}$, where the channel $A \to c\bar{c}$ is particularly dominant for low values of $\tan\beta$.
This appendix summarizes most of the observables considered during our analysis. For a more extensive discussion see [33,44,45] and the references cited therein.
We begin by introducing the definition of Thrust [46,47]
$T = \max_{\vec{n}_T} \frac{\sum_i |\vec{p}_i \cdot \vec{n}_T|}{\sum_i |\vec{p}_i|}$, (A.1)
where $\vec{n}_T$ is the direction that maximizes the numerator. To avoid confusion, in the subsequent discussion the symbol "$\perp$" will be used to denote the transverse contribution of different kinematical variables. Then, the Thrust major is determined according to [48]
$T_M = \max_{\vec{n}} \frac{\sum_i |\vec{p}_i \cdot \vec{n}|}{\sum_i |\vec{p}_i|}$,
where it should be understood that $\vec{n}$ is perpendicular to $\vec{n}_T$.
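A minimal sketch of how Thrust and Thrust major can be evaluated is given below. It assumes the standard definition $T = \max_{\vec{n}} \sum_i |\vec{p}_i \cdot \vec{n}| / \sum_i |\vec{p}_i|$ and that the jet constituents have already been projected onto the plane $\eta = 0$, so that all momenta are two-dimensional; the function names are hypothetical and this is not the implementation used in the analysis. The maximization exploits the fact that the optimal axis is parallel to a signed sum of the momenta, with the signs fixed by a line perpendicular to one of the momenta.

```python
import math

def planar_thrust(momenta):
    """Thrust and unit thrust axis for 2D momenta [(px, py), ...]."""
    norm = sum(math.hypot(px, py) for px, py in momenta)
    if norm == 0.0:
        raise ValueError("need at least one non-zero momentum")
    best, axis = 0.0, (1.0, 0.0)
    for i, (qx, qy) in enumerate(momenta):
        if qx == 0.0 and qy == 0.0:
            continue
        nx, ny = -qy, qx  # trial direction perpendicular to momentum i
        sx = sy = 0.0
        for j, (px, py) in enumerate(momenta):
            if j == i:
                continue
            sign = 1.0 if px * nx + py * ny >= 0.0 else -1.0
            sx += sign * px
            sy += sign * py
        for sign in (1.0, -1.0):  # momentum i can belong to either hemisphere
            vx, vy = sx + sign * qx, sy + sign * qy
            length = math.hypot(vx, vy)
            if length > best:
                best, axis = length, (vx / length, vy / length)
    return best / norm, axis

def planar_thrust_major(momenta):
    """Thrust major: projection onto the in-plane direction normal to the axis."""
    _, (ax, ay) = planar_thrust(momenta)
    norm = sum(math.hypot(px, py) for px, py in momenta)
    return sum(abs(-px * ay + py * ax) for px, py in momenta) / norm

pencil = [(1.0, 0.0), (-1.0, 0.0)]
cross = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
t_pencil, _ = planar_thrust(pencil)
t_cross, _ = planar_thrust(cross)
```

A pencil-like configuration gives a thrust of one and a vanishing thrust major, while the symmetric four-particle configuration gives $1/\sqrt{2}$ for both.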
We use the Thrust of $e^{-\eta}$ momenta [45], calculated according to Eq. (A.1) but with the three-momenta of each of the subjets in the event modified according to [...]. We include the Fox Wolfram moment inspired observable [49] $H_n$, Eq. (A.4), with $E_{\rm Tot}$ being the total energy of the jet constituents, thus $E_{\rm Tot} = \sum_i E_i$. The sum in the numerator of Eq. (A.4) considers only pairs of particles within the same hemisphere, i.e. those particles satisfying $\vec{p}_i \cdot \vec{p}_j > 0$, and $n$ is a rational number. In what follows we will refer to $H_n$ as the Fox Wolfram-like $n$ moment, and we will consider the value $n = 1/4$. The Transverse spherocity [34] is given by [...], where $\vec{n}$ is the direction that minimizes the sum in the numerator. The 3-jet resolution $y_3$ defines the lower bound for the jet recombination parameter $y_{ij}$ in order to have a 3-jet event. Before presenting the determination algorithm for $y_3$ according to different schemes, let us first introduce the possible definitions for the parameter $y_{ij}$ [...], where $E_{\rm vis}$ is the sum of the energies of the different final state subjets before the recombinations.
In addition, the recombination schemes between the $i$-th and $j$-th subjets are [...]. Then, for example, in order to calculate the Resolution $y_3$ Durham (P-scheme) [50,51], we start by assigning an arbitrarily low value to $y_3$. Next, we calculate the parameter $y_{ij}$ between all the subjets inside a given fat jet using the Durham recombination rule shown before. We then determine the pair of elements whose $y_{ij}$ is minimum, $y_{ij}^{\min}$, and recombine them applying the P-scheme presented above. Finally, if $y_3 < y_{ij}^{\min}$, we do the substitution $y_3 = y_{ij}^{\min}$ and repeat the entire process, starting with the re-calculation of the values $y_{ij}$ over the set of subjets determined in the last iteration. The algorithm stops when the total number of subjets left after all the recombinations is equal to 3. The value of $y_3$ obtained in the final iteration is the number we are aiming for. The determination of the Resolution $y_3$ Jade (E-scheme) and the Resolution $y_3$ Jade (E0-scheme) proceeds in an analogous way; however, the Durham parameter $y_{ij}$ should be substituted by the Jade distance parameter, and the P recombination scheme should be replaced by the E-scheme (E0-scheme).
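The iterative determination of $y_3$ Durham (P-scheme) described above can be sketched as follows. Several ingredients are assumptions rather than statements from the text: the Durham measure is taken in its standard form $y_{ij} = 2\min(E_i^2, E_j^2)(1 - \cos\theta_{ij})/E_{\rm vis}^2$, the P-scheme is taken to add three-momenta and set the energy to the modulus of the summed momentum, and the running value of $y_3$ is initialised to zero so that the update condition $y_3 < y_{ij}^{\min}$ can be fulfilled. All function names are hypothetical.

```python
import math

# Subjets are four-vectors (E, px, py, pz) with non-zero three-momentum.

def durham_y(a, b, e_vis):
    """Assumed Durham measure: 2 min(Ea^2, Eb^2) (1 - cos theta_ab) / E_vis^2."""
    pa = math.sqrt(a[1] ** 2 + a[2] ** 2 + a[3] ** 2)
    pb = math.sqrt(b[1] ** 2 + b[2] ** 2 + b[3] ** 2)
    cos_theta = (a[1] * b[1] + a[2] * b[2] + a[3] * b[3]) / (pa * pb)
    return 2.0 * min(a[0], b[0]) ** 2 * (1.0 - cos_theta) / e_vis ** 2

def p_scheme(a, b):
    """Assumed P-scheme: add three-momenta and set E = |p| (massless result)."""
    px, py, pz = a[1] + b[1], a[2] + b[2], a[3] + b[3]
    return (math.sqrt(px * px + py * py + pz * pz), px, py, pz)

def y3_durham_p(subjets):
    """Largest minimal y_ij met while clustering down to three subjets."""
    jets = list(subjets)
    e_vis = sum(jet[0] for jet in jets)  # fixed before any recombination
    y3 = 0.0
    while len(jets) > 3:
        y_min, i_min, j_min = min(
            (durham_y(jets[i], jets[j], e_vis), i, j)
            for i in range(len(jets))
            for j in range(i + 1, len(jets))
        )
        merged = p_scheme(jets[i_min], jets[j_min])
        jets = [jet for k, jet in enumerate(jets) if k not in (i_min, j_min)]
        jets.append(merged)
        if y3 < y_min:
            y3 = y_min
    return y3

def massless(energy, angle):
    """Toy massless subjet in the transverse plane."""
    return (energy, energy * math.cos(angle), energy * math.sin(angle), 0.0)

toy_subjets = [massless(10.0, 0.0), massless(1.0, 0.1),
               massless(8.0, 2.0), massless(6.0, 4.0)]
y3_toy = y3_durham_p(toy_subjets)
```

In the toy configuration a single recombination is needed: the soft subjet at an angle of 0.1 rad is merged with its hard neighbour, and the corresponding $y_{ij}$ is returned as $y_3$. Replacing `durham_y` by a Jade-type measure and `p_scheme` by another recombination rule would give the other variants mentioned in the text.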
The Directly global $y_3$ [34] is constructed using the $k_t$ jet algorithm. To begin with, for all $n$ final state particles we define the beam-distance measure $d_{kB}$, and for constituent pairs we calculate $d_{kl}$ in terms of their corresponding pseudorapidity $y$ and azimuthal angle $\phi$ [...].
In our analysis we use $R = 0.7$. Let $d^{(n)} = \min\{d_{kB}, d_{kl}\}$, where the entire set of distances calculated at a given stage is considered. If $d^{(n)}$ is one of the values $d_{ij}$, then the pseudojets $i$ and $j$ are recombined using the E-scheme defined above. If $d^{(n)}$ is one of the [...], with $p_{\perp,1}$ and $p_{\perp,2}$ being the transverse momenta of the jets obtained by continuing to recluster the event down to 2 pseudojets.
The observable $\tau_x$ can be modified to give the Fractional energy correlation [45], Eq. (A.10). Here $x$ is a continuous parameter. During the analysis we use $x = 1.5$, which makes the observable particularly sensitive to collinear emissions for fixed transverse momentum.
To define the Transverse sphericity let us first introduce the transverse momentum tensor
$M_{xy} = \sum_i \begin{pmatrix} p_{x,i}^2 & p_{x,i}\,p_{y,i} \\ p_{x,i}\,p_{y,i} & p_{y,i}^2 \end{pmatrix}$. (A.11)
Then the transverse sphericity can be determined in terms of the eigenvalues $\lambda_1$ and $\lambda_2$ of $M_{xy}$ (for $\lambda_1 \geq \lambda_2$) as [52]
$S^{\rm pheri}_{\perp,g} \equiv \frac{2\lambda_2}{\lambda_1 + \lambda_2}$. (A.12)
For circular events in the transverse plane we have $S^{\rm pheri}_{\perp,g} \to 1$, whereas for pencil-like events $S^{\rm pheri}_{\perp,g} \to 0$.
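The two limiting cases can be verified with a short sketch that builds the $2\times 2$ transverse momentum tensor from the components $p_{x,i}$, $p_{y,i}$ and uses the closed-form eigenvalues of a symmetric matrix. The function name and the toy configurations are hypothetical.

```python
import math

def transverse_sphericity(momenta):
    """2 * lambda_2 / (lambda_1 + lambda_2) for 2D momenta [(px, py), ...]."""
    mxx = sum(px * px for px, _ in momenta)
    myy = sum(py * py for _, py in momenta)
    mxy = sum(px * py for px, py in momenta)
    trace = mxx + myy  # equals lambda_1 + lambda_2
    if trace == 0.0:
        raise ValueError("need at least one non-zero momentum")
    # Eigenvalues of a symmetric 2x2 matrix: (trace +- sqrt(discriminant)) / 2.
    root = math.sqrt((mxx - myy) ** 2 + 4.0 * mxy * mxy)
    lambda_2 = 0.5 * (trace - root)
    return 2.0 * lambda_2 / trace

s_pencil = transverse_sphericity([(1.0, 0.0), (-1.0, 0.0)])
s_circular = transverse_sphericity(
    [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
)
```

The back-to-back pair gives zero and the symmetric four-particle configuration gives one, in line with the pencil-like and circular limits quoted above. An overall normalisation of the tensor cancels in the ratio.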
To describe the cone jet mass let us start by introducing some definitions. The components of the highest-$p_T$ fat jet selected in our studies are first reclustered using the $k_t$ algorithm. Then, the region $C$ results from the union of the cones around the two new highest transverse momentum subjets (with coordinates $\eta_{J,j}$, $\phi_{J,j}$, for $j = 1, 2$) according to [...], where the subindex $i$ runs over the rest of the newly generated subjets. During our implementation we considered $R = 1$. The central transverse thrust axis $\vec{n}_{T,C}$ is then defined as the vector that maximizes
$\frac{\sum_{i \in C} |\vec{p}_{\perp i} \cdot \vec{n}_{T,C}|}{Q_{\perp,C}}$, (A.14)
where [...]. The vector $\vec{n}_{T,C}$ allows us to divide the region $C$ into the subregions $C_U$ and $C_D$, defined in terms of the conditions $0 < \vec{p}_\perp \cdot \vec{n}_{T,C}$ and $\vec{p}_\perp \cdot \vec{n}_{T,C} < 0$ respectively. The partial masses in each of these regions are [...]. We can further add the exponentially suppressed term EC given by [...]. The Central heavy jet mass with exponentially suppressed forward term [45] is calculated according to
$\rho_{H,E} = \rho_{H,C} + \mathrm{EC}$. (A.21)
Finally, we consider the following C parameter-like observable [53][54][55]