Higgs boson decays into narrow di-photon jets and their search strategies at the Large Hadron Collider

In many extensions of the Standard Model the Higgs boson can decay into two light scalars, each of which subsequently decays into two photons. The underlying event is h $\to$ 4$\gamma$, but the kinematics of boosted light scalar decays combined with realistic detector resolutions may cause the events to fail to register in straightforward categories and thus be lost. In this article we investigate the phase space for highly boosted di-photon events from these exotic Higgs decays and discuss search strategies that aim to capture and label events in this difficult region. In the process we develop a new category, $\xi$-jets, which identifies with high selectivity highly collimated di-photon decay modes of the Higgs boson.


Introduction
Nearly a decade after the discovery of the Higgs boson it remains to be decided whether the discovered particle interacts with other known elementary particles in precisely the way the Standard Model (SM) dictates [1,2,3,4,5,6]. Deviations from SM expectations can arise by virtue of the Higgs boson being composite, part of a larger Higgs sector, coupled through its portal interactions to hidden sector states, or embedded in extra dimensions, to name just a few examples. Alternatives remain viable because the SM Higgs boson couplings to other SM states are known to 10% at best for some, and only to within O(1) factors for others, including the couplings to muons, electrons, and charm quarks, and the Higgs self-interaction [7,8]. The possibility of the Higgs boson decaying into final states that are not allowed by the SM is also not well constrained in many cases.
In this article we take up the case of the Higgs boson (h) decaying into other very light scalars (φ_1 and φ_2), each of which subsequently decays into photons,
h → φ_1 φ_2 → (γγ)(γγ) (target observable). (1)
In several different limits this process has been studied already [9,10]. If φ_1 and φ_2 both have mass above about 10 GeV, the events register as unambiguous 4γ events in the detector, for which searches are straightforward. Within this regime, current studies limit this process to B(h → 4γ) ≲ 3 × 10^{-4} [11,12].
At the other extreme, if φ_1 and φ_2 both have mass less than a few hundred MeV, the φ_i produced in the Higgs decay are so highly boosted, and the photons from φ_i → γγ so collimated, that each φ_i decay appears to yield a single photon. In that case, h → φ_1 φ_2 is simply combined with the standard h → γγ analysis, and it becomes a statistical question to determine what overabundance of such a signal would be consistent with data. At 95% CL the answer to this question is that the non-SM contribution to B(h → γγ) cannot exceed 2.2 × 10^{-4} [13]. Such light scalars may also be disentangled from the SM h → γγ process with sophisticated substructure techniques [14,15].
Combining both extremes leads to an apparent detection of h → 3γ. This arises when one of the φ_i has mass less than a few hundred MeV and the other more than about 10 GeV. This process is forbidden in the SM, and the branching ratio B(h → 3γ) is currently limited to roughly the 10^{-3} level, as can be gleaned from [16].
In between these two extremes, from the point of view of observables, is a murky region where the mass of one or both φ_i states is between ∼ 0.1 GeV and 10 GeV. In that case, the two photons coming out of the φ_i decays are neither highly collimated nor cleanly separated. Roughly speaking, when the photon separation lies in the range 0.04 < ∆R < 0.4, the ATLAS and CMS detectors see something distinct from a standard photon that nevertheless does not register as two photons [17,18]. It is this difficult middle ground that we wish to address in this letter.
It should be stated that extending the scalar sector of the SM by one (or multiple) singlets is a mature and well studied subfield [19,20,21]. Much of the parameter space for exotic heavy and light scalars (relative to the Higgs boson mass) is well constrained by direct searches and by precision electroweak measurements [22]. Our simplified model highlights a region of parameter space in a class of singlet extended models that has been less explored by previous studies.
The value in exploring such a regime lies in its ability to utilize the available experimental power from the LHC to investigate one of the most interesting loose ends in the Standard Model. Many models exist coupling new light scalars to the Standard Model in ways that are highly susceptible to the search strategy we advocate here [23]. The nature of the Higgs boson makes such couplings to new physics generic and apparent in a broad swath of theory parameter space. Furthermore, the rough knowledge we have of the Higgs boson to date deserves significant tightening in every reasonable direction. Our goal here is to consider this particular case in detail, highlight the experimental challenges for discovery, proffer some suggestions, and suggest a benchmark theory with points that may be useful for serious further study by experimental groups within the ATLAS and CMS collaborations.

Theory description
The phenomenon we are after is h → φ_1 φ_2 with subsequent decay of φ_i → γγ. Such decays arise generically in a broad class of BSM theories, many of which give rise to additional exotic phenomena. Most commonly these are other, similar gauge interactions, such as Z → φγ, but the possibilities are wide and varied. Many BSM theories of this type are not yet constrained by experiment and have h → φ_1 φ_2 → 4γ as their most accessible phenomenon, provided there are dedicated searches for it. Our focus lies in this last type of theory.
To devise an experimental strategy and analysis to discover this class of targeted theories, we must begin by constructing a representative theory within the class and finding ways to obtain evidence for it. Ideally the representative theory should be maximally simple without losing the key features under consideration for our exotic Higgs decays. In this case there is such a simple theory, with the Lagrangian of Eq. 2, in which F_µν is the photon field strength tensor and each φ_i couples to photons through a φ_i F_µν F^µν operator with an associated scale Λ_i. Of course, one could write down non-trivial |H|^2 φ_1^2 and |H|^2 φ_2^2 terms among others, but that would add complexity without contributing significantly to the final phenomenology. One might also object that φ_i F_µν F^µν should be traded in for gauge-invariant couplings of φ_i to the hypercharge field strength tensor, φ_i B_µν B^µν, and to the SU(2) field strength tensor, φ_i W^a_µν W^{a,µν}. That would be fine, except that upon diagonalizing these interactions to those of the mass eigenstates one nevertheless finds φ_i F^2 terms, which completely dominate the decays of φ_i over the φ_i Z_µν F^µν and φ_i Z^2 terms because the Z boson is much heavier than the φ_i that we consider below. The φ_i ZF interaction can give rise to Z → φ_i γ decays, constrained by searches at the Tevatron and the LHC [24,11], but as the scale Λ_i becomes higher this constraint goes away while B(φ_i → γγ) remains 100%. For that reason we drop these extra considerations and extraneous interactions from the theory description and retain only the Lagrangian of Eq. 2.
From the point of view of devising experimental search strategies to find evidence for the Higgs boson decaying into a single light scalar, say σ such that h → σσ → 4γ, the benchmark theory above is adequate. It merely corresponds to the case of m_1 = m_2. That is not to say the two theories are exactly the same, only that the subsequent search strategies are the same. That is why we propose to work with only one theory, the representative theory of Eq. 2, which we believe forms a basis upon which benchmark points can be established and strategies devised.

Photon ξ-jets
As we mentioned in the introduction, the target observable of Eq. 1 implies a photon separation from φ_i decays that is sensitive to the φ_i masses. This is illustrated in Fig. 1, which shows that m_φ = 10 GeV gives well separated photons (∆R > 0.4) and m_φ = 0.1 GeV gives very collimated photons (∆R < 0.04), while a mass of 1 GeV gives intermediate separation. Recall that ∆R = √((∆φ)^2 + (∆η)^2), where ∆φ is the azimuthal angle separation and ∆η is the pseudorapidity separation of the two photons in the φ_i → γγ decay.
Thus, the phenomenology is relatively straightforward if both φ_1 and φ_2 have mass greater than 10 GeV. The two states decay into well separated photons, φ_i → γγ, and the target observable becomes four well separated photons that together reconstruct the Higgs mass, m_h = m_4γ. Such a prospect does not require further discussion here, as all the standard tools of experimental analysis to identify well isolated photons can be employed to make straightforward searches, as has been done in [11,12].
For φ_i masses below a few hundred MeV, the decay chain h → φ_1 φ_2 → (γγ)(γγ) yields highly boosted light φ_i states that can decay into highly collimated photon pairs, which then register in the electromagnetic calorimeter as a single photon. Once m_{φ_i} dips below 100 MeV that rate is nearly 100%. Thus, experimentally, for such light φ_i the target observables register in the detector as γγ events with m_γγ = m_h, and thus contribute to the count of such non-exotic events already produced by the direct decays of h → γγ through top and W loops. The sensitivity to this possibility then becomes a statistics question of how many exotic sources of h → γγ events the data can tolerate. As we mentioned in the introduction, that rate is approximately 2.2 × 10^{-4} [13]. Additional discussion is not needed here.
We then turn to the more ambiguous case in which the φ_i masses fall within the "intermediate mass" range of 0.1 GeV < m_{φ_i} < 10 GeV. Within the LHC environment, the production of Higgs bosons and their subsequent decay into such scalars yields photon pairs separated by
0.04 < ∆R < 0.4. (3)
It is well known that photon pairs that fall within the intermediate separation range of Eq. 3 are extremely difficult to separate or identify. We will speak much more on that below, but here we wish to pay respect to that difficulty by giving it a name. We call two photons that are within the range specified by Eq. 3 a "ξ-jet". The ξ-jet is a purely theoretical object, and it is defined by underlying "truth data" and not with respect to any detector performance. If a photon has another photon within the intermediate separation annulus of Eq. 3, and nothing else is within the outer ring of that annulus, then it ceases to be a photon and the two together form a ξ-jet. Such a concept can be generalized to more than two photons, but it is not of much importance to do that here. We also specify that, as a theoretical object, a photon is defined to be either a single photon or two photons within ∆R < 0.04 of each other.
With these theory definitions of photon and ξ-jet, our target observable is broken into several distinct and non-overlapping final states, depending on the masses of the φ_i intermediate states in the decay chain:
h → φ_1 φ_2 → 2γ, 3γ, 4γ, γξ, γγξ, 2ξ. (4)
The first three of these observables we have already discussed. The remaining observables have not been fully explored in the literature, and we wish to consider them in more detail below.
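The theory-level bookkeeping above can be sketched in a few lines. The example below is only an illustration under simplifying assumptions: it takes the truth-level ∆R of each φ_i → γγ pair as input, uses the 0.04 and 0.4 boundaries quoted in the text, ignores the isolation condition on the outer ring as well as overlaps between photons from different φ_i, and assigns the boundary values by convention. The function names and the ASCII labels (`gamma`, `xi`) are hypothetical.

```python
MERGED_MAX = 0.04  # below this, a photon pair counts as one theory-level photon
XI_MAX = 0.4       # at or above this, the pair counts as two isolated photons


def classify_pair(delta_r):
    """Map the truth-level separation of one phi -> gamma gamma pair to a theory object."""
    if delta_r < MERGED_MAX:
        return "photon"       # two photons merged into a single "photon"
    if delta_r < XI_MAX:
        return "xi"           # intermediate separation: a xi-jet
    return "two_photons"      # two well separated photons


def final_state(delta_r_pair_1, delta_r_pair_2):
    """Label the h -> phi_1 phi_2 -> 4 gamma event by its theory-level final state."""
    objects = [classify_pair(delta_r_pair_1), classify_pair(delta_r_pair_2)]
    n_xi = objects.count("xi")
    n_gamma = objects.count("photon") + 2 * objects.count("two_photons")
    if n_xi == 2:
        return "2xi"
    if n_xi == 1:
        return ("gamma" if n_gamma == 1 else "2gamma") + "+xi"
    return f"{n_gamma}gamma"


if __name__ == "__main__":
    # One very collimated pair and one pair at intermediate separation.
    print(final_state(0.01, 0.10))
```

Each event lands in exactly one of the six labels, mirroring the non-overlapping partition of the target observable described in the text.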

Benchmark model points
We are interested in exploring three observables: γξ, γγξ, and 2ξ. To do so we need benchmark points that give rise to each of these types of observables. They can be obtained rather straightforwardly from our representative theory of Eq. 2, where the masses of φ_1 and φ_2 are chosen to be various permutations of the masses 0.1 GeV, 1 GeV and 10 GeV. In particular, m_φ = 0.1 GeV almost always gives φ → γ decays, m_φ = 1 GeV typically gives φ → ξ decays, and m_φ = 10 GeV generally gives φ → γγ decays according to our definitions in the previous section. These are so far entirely defined theoretically. In the next section we will pursue more carefully how a theoretical ξ-jet registers in an experimental analysis.
From these considerations we can construct the following three benchmark points A, B, and C, specified in Table 1. Fig. 2 shows the relative fraction of each observable for each benchmark point. The dominant and subdominant modes of decay for each benchmark point are listed in Table 1 and can be gleaned from the fraction data given in Fig. 2. Table 1 shows that several combinations of light scalar masses give interesting decay signatures involving combinations of ξ-jets and photons which (to the authors' best knowledge) are not being searched for in current LHC analyses.
Table 1: Benchmark points for h → φ_1 φ_2 → 4γ, which then partition into various theory-object observables (modes) according to our definitions of ξ (photon pairs with 0.04 < ∆R < 0.4) and γ (an isolated photon with ∆R > 0.4 or two photons within ∆R < 0.04).
Figure 2: Branching fraction into each final state theory observable for the benchmark points A (blue), B (orange), C (red) and D (green) given in Table 1.

Experimental search strategies
So far our discussion has been mainly theoretical. We have identified a rare Higgs decay whose cascade we claim may be difficult to detect by experiment. In this section, we discuss how our theoretical objects translate into experimental manifestations. We have suggested that some mass ranges of φ_i are problematic for experiment. We will discuss in some detail why they are challenging and some strategies by which to possibly overcome those challenges.

Multi-photon final states
Isolated photons and extremely highly collimated photon pairs both get identified simply as photons, and analyses based on those standard objects proceed without much subtlety regarding how to process the data into well-defined final states of 2γ, 3γ and 4γ.

ξ-jet final states
Some of the final states from the decays of Eq. 4 yield ξ-jets. Underneath, a ξ-jet is merely two photons with intermediate ∆R separation (see Eq. 3). But a key question is, how does a ξ-jet, defined as a theoretical object, get processed into various experimental categories? A perfect detector would register it as merely two photons, a bad detector as a single photon or nothing, and a realistic good detector, such as ATLAS or CMS, registers it as something altogether different within several possible categories of varying sensitivity and selectivity.
To address this question of how a ξ registers in a detector it is useful to describe the various categories into which a single photon can fall. As an example we take the standard categories which ATLAS uses for photon identification. There are eight possible standard categories: six are the permutations among three isolation possibilities (non-isolated, loose isolation, and tight isolation) and two ID possibilities (loose ID and tight ID). The other two categories are jet and "lost." Jet is the standard QCD jet from fragmentation of quarks or gluons, and "lost" refers to the possibility that the data does not conform to any other category and is not registered in any higher abstracted category except for mere energy depositions in the detector.
A ξ-jet will register with some probability into one or more of the standard photon categories. The probability to do so depends on the underlying event kinematics. Under typical assumptions, the ξ-jet will often register as "lost": the two photons cannot be resolved individually, yet the deposit covers more than one cell in the electromagnetic calorimeter, which a single photon would not do. As no category becomes applicable, it has no option but to be relegated to "lost." The implication of a ξ-jet arising from a Higgs decay being categorized as "lost" is that an analysis that requires reconstructing the invariant mass of the Higgs boson from well-defined decay products can no longer register the event. It is therefore necessary to build a ninth category, "ξ-candidate", under which ξ events can fall. ξ-candidates must be defined entirely through detector response, with the goal of producing high sensitivity to underlying ξ-jets with reasonably good selectivity (i.e., mostly only ξ-jets register as ξ-candidates).

ξ-candidates
A detailed definition of the "ξ-candidates" category satisfying the demands stated above is best constructed by a team of experimental experts within the ATLAS and CMS collaborations deeply familiar with their detectors. However, it is likely that such a definition meeting the demands of sensitivity and selectivity will have several key characteristics which we would like to discuss here. We will then make illustrative estimates of the utility of a ξ-candidate definition based on these characteristics.
We make use of MadGraph5_aMC@NLO [25] simulations to produce our signal events at leading order with the Lagrangian of Eq. 2, which are then hadronized via Pythia 8 [26,27]. For our detector studies we utilize the Delphes [28] fast detector simulation framework with the default CMS card, and FastJet [29] for jet clustering algorithms.
To begin one must have a cluster, established by standard techniques. One useful criterion to impose on the pre-ξ-candidate cluster is a strong isolation requirement against QCD activity within a small cone around the ξ-candidate system, reducing QCD backgrounds from decaying pions. Additional criteria for the definition must also appeal to the stoutness of the photon jet: there are two photons separated enough to not look like one photon, and that separation shows up as a larger-than-normal spatial spread among cells within the electromagnetic detector. Furthermore, vetoing on charged tracks eliminates electron-induced showers. Finally, recently established jet N-subjettiness algorithms [30] can be employed to select clusters that have discernible sub-jet structure compatible with 2 collimated photons. Refs. [14,31] go into detail on the ability to use these and other, similar variables to separate ξ-candidates (called photon-jets in these papers) from photons and QCD jets, and all of these considerations will be in play in the definitions below.
Our ξ-jet theory definition was for underlying two-photon clusters with ∆R separation in the range of 0.04 to 0.4. In addition, within the range of 0.025 < ∆R < 0.04 there is a possibility of using electromagnetic shape variables to discern that the underlying event was likely not a single photon, although the signature is certainly not clear enough to establish the presence of two photons. Nevertheless, our list of ξ-candidate criteria will be applicable for two-photon separations from about ∆R ≳ 0.025 up to about ∆R ≲ 0.25. We will not discuss the range 0.25 ≲ ∆R ≲ 0.4 here, because our understanding is that more traditional photon identification tools may be applicable to separate the photons just well enough to help discern signal from photon backgrounds.
Let us now turn to a more precise definition of ξ-candidates (underlying two-photon separation 0.025 ≲ ∆R ≲ 0.25). This regime targets events that have two photons in sufficiently close proximity that their cores overlap, thereby interfering with one another's identification procedure. This should appear as a cluster of energy in the EM calorimeter, with no tracks or corresponding energy in the hadronic calorimeter, and a small ratio τ_2/τ_1 of 2-subjettiness to 1-subjettiness, indicating two subjets. We provide an example definition of ξ-candidate criteria in Table 2. In Fig. 3 we also show distributions of signal and background for QCD jets and ξ-jets. These distributions reproduce those of [31,14] and show that ξ-jets can be separated from QCD backgrounds with high efficiency.

Variable | Definition | Cut | Reasoning
log θ_J | Hadronic energy fraction | < −0.8 | Excludes QCD and τ
N_T | Number of tracks | = 0 | Excludes single converted photons and jet activity
τ_2/τ_1 | Ratio of 2- to 1-subjettiness | < 0.3 | Selects events with 2 subjets
Table 2: ξ definition meant to capture underlying events with 0.025 < ∆R < 0.25. These objects are defined as a cone of radius ∆R = 0.25 about a central cluster in the EM calorimeter, centered on the highest energy pixel. Unless otherwise stated, the region considered for each variable is this ξ region. Here θ_J is the hadronic energy fraction.
Figure 3: Subset of kinematic variables useful for discriminating between ξ-jets (green) and QCD jets (blue), which are a major background. Here θ_J is the hadronic energy fraction for a jet, and τ_2/τ_1 is the ratio of 2-subjettiness to 1-subjettiness, which is useful for picking out events with 2 subjets.
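The example cuts of Table 2 can be sketched as a simple selection function. This is a minimal illustration, not the selection code used for the results here: the function name and argument names are hypothetical, the logarithm is assumed to be base 10 (the text does not state the base), and a jet with no hadronic energy at all is assumed to pass the hadronic fraction cut.

```python
import math

LOG_HAD_FRACTION_MAX = -0.8  # cut on log(theta_J), theta_J = E_had / E_jet
TAU21_MAX = 0.3              # cut on tau_2 / tau_1


def is_xi_candidate(e_had, e_jet, n_tracks, tau1, tau2):
    """Apply the three example xi-candidate cuts to one reconstructed cluster."""
    if e_jet <= 0.0 or tau1 <= 0.0:
        return False  # degenerate input: ratios are undefined
    had_fraction = e_had / e_jet
    # Assumption: base-10 logarithm; zero hadronic energy maps to -infinity.
    log_theta = math.log10(had_fraction) if had_fraction > 0.0 else -math.inf
    passes_hadronic = log_theta < LOG_HAD_FRACTION_MAX
    passes_tracks = n_tracks == 0
    passes_substructure = (tau2 / tau1) < TAU21_MAX
    return passes_hadronic and passes_tracks and passes_substructure


if __name__ == "__main__":
    # A mostly electromagnetic, trackless cluster with clear two-prong structure.
    print(is_xi_candidate(e_had=1.0, e_jet=100.0, n_tracks=0, tau1=0.5, tau2=0.05))
```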

Reconstructing ξ-jets and Higgs decays
Now that we have precise definitions of photons and ξ-candidates we can ask how well the Higgs boson signal can be reconstructed, especially in the case of its decay into one or more ξ-jets. Fig. 4 shows the analysis flow of our reconstruction of ξ-jets using Delphes fast detector simulation. Additional photons not covered by that flow, as well as electrons, muons, jets, etc. are identified and labeled by other analysis flows.
First, one must reconstruct the ξ-jets, which we attempt to do by following a strategy similar to Ref. [14]. The method is as follows. First, energy flow (eflow) objects [32] (composed of deposits in calorimeter cells) are clustered into jets using the anti-kt algorithm with R = 0.25. Then we re-cluster the energy deposits found in each jet using the kt algorithm, which determines a recombination tree for the jet. This tree specifies the subjets at each level of recombination N, from N = 1 (the full jet) to N equal to the number of constituent eflow objects in the jet (no recombination). From here we can compute the N-subjettiness variable τ_N for the jet for each N. This variable becomes small when N is large enough to describe all of the relevant substructure of the jet. It is defined to be
$$\tau_N = \frac{\sum_k p_{T,k}\,\min\{\Delta R_{1,k}, \Delta R_{2,k}, \ldots, \Delta R_{N,k}\}}{\sum_k p_{T,k}\, R},$$
where k runs over all the constituents of the jet, p_{T,k} is the transverse momentum of the k-th constituent, ∆R_{J,k} is the separation between constituent k and the J-th subjet, and R is the characteristic jet radius used in the original jet clustering algorithm.
Figure 4: Analysis flow of our reconstruction of ξ-jets using Delphes fast detector simulation. Additional photons not covered by that flow, as well as electrons, muons, jets, etc. are identified and labeled by other analysis flows.
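To make the N-subjettiness variable concrete, the sketch below evaluates its standard form, τ_N = Σ_k p_{T,k} min_J ∆R_{J,k} / (R Σ_k p_{T,k}), for a given set of subjet axes. It is an illustration only: the axes are passed in by hand rather than taken from a kt recombination tree as in the analysis, and the function and argument names are hypothetical.

```python
import math


def _delta_r(eta1, phi1, eta2, phi2):
    """Angular separation with the azimuthal difference wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return math.hypot(eta1 - eta2, dphi)


def n_subjettiness(constituents, axes, jet_radius):
    """tau_N for constituents [(pt, eta, phi), ...] and N subjet axes [(eta, phi), ...]."""
    if not axes:
        raise ValueError("at least one subjet axis is required")
    d0 = jet_radius * sum(pt for pt, _, _ in constituents)
    if d0 <= 0.0:
        raise ValueError("jet must have positive total transverse momentum")
    weighted = sum(
        pt * min(_delta_r(eta, phi, ax_eta, ax_phi) for ax_eta, ax_phi in axes)
        for pt, eta, phi in constituents
    )
    return weighted / d0


if __name__ == "__main__":
    # Toy xi-jet: two equal-pT photons separated by 0.1 in eta, jet radius 0.25.
    photons = [(30.0, 0.0, 0.0), (30.0, 0.1, 0.0)]
    tau1 = n_subjettiness(photons, [(0.05, 0.0)], 0.25)
    tau2 = n_subjettiness(photons, [(0.0, 0.0), (0.1, 0.0)], 0.25)
    print(tau1, tau2, tau2 / tau1)
```

For this idealized two-photon jet τ_2 vanishes while τ_1 does not, so τ_2/τ_1 is small, which is the behavior the substructure cut relies on.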
After jet clustering is completed we then check if a reconstructed ξ-candidate already contains a reconstructed photon. Reconstructed photons are composed of eflow objects originating from the ECAL which must pass isolation requirements (cuts on electromagnetic and hadronic activity within a cone around the photon). If a ξ-candidate contains an already reconstructed, isolated photon then this ξ-candidate is deleted.
Before applying additional cuts, we would like to characterize the efficiency at which we reconstruct ξ-jets. To do this we utilize Delphes GenJet objects. GenJets are jets that are clustered, not with calorimeter cells or towers or eflow objects, but with the actual generator level particles. By utilizing GenJets we can define "generated ξ-jets" and see at what rate we correctly reconstruct these.
GenJets are clustered with the same strategy as above, first with the anti-kt algorithm with R = 0.4, and then reclustered with the kt algorithm. A GenJet is selected as a generated ξ-jet if it has (1) at most two photons with p_T > 0.5 GeV and (2) no non-photons with p_T > 0.5 GeV. Our theoretical ξ-jets were defined as pairs of photons with ∆R between 0.04 and 0.4, but since the ξ-candidate criteria extend down to ∆R of about 0.025, we throw out only those generated ξ-jets with ∆R < 0.025, as these will most likely be reconstructed as one photon.
Once a generated ξ-jet is identified, we loop over all reconstructed ξ-candidates and attempt to find a match. Matching is done by comparing the ∆R between the momenta of the generated and reconstructed jets. If ∆R_gen/reco < 0.05 we consider the jet matched. We also require that the reconstructed ξ-jets pass a cut on the hadronic energy fraction, log(E_had/E_jet) < −0.8. Fig. 5 shows ∆R_gen/reco, which indicates the level of matching between generated and reconstructed ξ-jets. It also serves as a check that the matching is independent of our model parameters.
Now we would like to understand how often we can reconstruct the Higgs mass using our reconstructed photons and ξ-candidates. To simplify matters we choose m_{φ_1} = m_{φ_2}, which is equivalent to having only one light scalar in addition to the observed h_125. We scan over light scalar masses from 100 MeV to 14 GeV. This range ensures we see a smooth transition between photon dominated decays and ξ-jet dominated decays. The following discussion can be generalized by choosing different masses for the light scalars.
After reconstruction, we first collect all of our reconstructed objects, which for now are photons and ξ-candidates. We require only that our reconstructed ξ-candidates pass the hadronic energy fraction cut; no other cuts are applied (besides the minimum p_T cuts used for clustering). We then form all possible subsets of this collection that contain between 1 and 4 objects (as the Higgs decays into at most 4 separable photons). If one combination of ξ-candidates and photons yields an invariant mass consistent with that of the h_125, we count the Higgs boson as reconstructed.
From 100 to 300 MeV the signal from photons + ξ-jets becomes the most efficient channel, as one of the pairs of photons is collimated enough to form a ξ-jet. Immediately above 300 MeV the signal from pairs of ξ-jets (ξ-jets only) becomes an order of magnitude more efficient than the photon only channel and remains so until 6 GeV.
Overall, searches including ξ-jets are more than an order of magnitude more efficient at reconstructing the Higgs for scalar masses between 100 MeV and 10 GeV. Fig. 6 shows that searches including ξ-jets would be invaluable if a light scalar connected to the gauge and Higgs sector as in Eq. 2 exists in nature. We would like to stress that even though our analysis and definitions are quite simple, our results should be robust even after the introduction of stricter experimental search strategies and analysis cuts.
It is interesting to compute how many such ξ-jet events one could expect for a given luminosity at the LHC. This is of course a function of the φ mass and the efficiency for reconstructing the h_125. To give an estimate, we can take the m_φ = 2 GeV point as an example. This has an efficiency for reconstruction of about 50%. If we take Br(h → φφ) = 10^{-4} and an integrated luminosity of 300 fb^{-1}, this leaves us with about 7500 reconstructed events.
While a comprehensive study of Standard Model backgrounds is necessary for an experimental search, we can still make qualitative statements about discriminating ξ-jets from objects which fake ξ-jets. The largest backgrounds will be from jj and γj production where a QCD jet fakes a ξ-jet. References [31,14] use Boosted Decision Trees based on energy and substructure variables to discriminate between QCD jets and photon jets. They quote a fake rate from QCD jets of 10^{-4} to 10^{-5}, though this fake rate depends on the rate at which one accidentally rejects ξ-jets. Additionally, the requirement that the invariant mass of the two ξ-candidates fall within a 3 GeV window of the h_125 mass lowers the background as well, as the rate for QCD jet production tends to fall at high invariant masses. Combined, these factors should allow for a bump hunt search for ξ-jets with high sensitivity.
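The subset search over reconstructed photons and ξ-candidates described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the code behind Fig. 6: the function names and the object tuple layout are hypothetical, the Higgs mass is taken to be 125 GeV, and the acceptance window is assumed to be ±3 GeV around it, which is one reading of the 3 GeV window mentioned in the background discussion.

```python
import math
from itertools import combinations

HIGGS_MASS = 125.0       # GeV
MASS_HALF_WINDOW = 3.0   # GeV; assumed +/- 3 GeV acceptance window


def four_vector(pt, eta, phi, mass=0.0):
    """Return (E, px, py, pz) from collider coordinates."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)


def invariant_mass(vectors):
    """Invariant mass of the summed four-vectors."""
    energy, px, py, pz = (sum(v[i] for v in vectors) for i in range(4))
    m2 = energy * energy - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding


def higgs_combinations(objects, max_size=4):
    """Find subsets of 1..max_size objects whose mass falls in the Higgs window.

    objects: list of (label, pt, eta, phi, mass) tuples for photons and xi-candidates.
    """
    found = []
    for size in range(1, min(max_size, len(objects)) + 1):
        for combo in combinations(objects, size):
            mass = invariant_mass([four_vector(*obj[1:]) for obj in combo])
            if abs(mass - HIGGS_MASS) <= MASS_HALF_WINDOW:
                found.append((tuple(obj[0] for obj in combo), mass))
    return found


if __name__ == "__main__":
    event = [
        ("xi", 62.5, 0.0, 0.0, 1.0),
        ("xi", 62.5, 0.0, math.pi, 1.0),
        ("photon", 20.0, 1.0, 1.0, 0.0),
    ]
    print(higgs_combinations(event))
```

Labelling each accepted combination by its object content is what allows the reconstruction efficiency to be split into photon-only, photon + ξ-jet, and ξ-jet-only channels.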

Conclusion
The discovery of the Higgs boson has lent strong support to the Standard Model, but has also allowed us to search for new avenues along which to extend it. In this work we have investigated exotic decays of the 125 GeV Higgs boson into light scalars which may so far be missed by current analysis techniques. We have discussed, first theoretically and then experimentally, a new object dubbed a ξ-jet, which could play a pivotal role in the discovery of any light scalars minimally coupled to the Standard Model Higgs and to photons as in Eq. 2. If experimentalists are able to identify and reconstruct ξ-jets, these new objects could be strong evidence for an extended Higgs sector and Beyond the Standard Model physics.

Acknowledgments
We thank Advanced Research Computing at the University of Michigan, Ann Arbor for their computational resources, as well as D. Amidei, C. Hayes, R. Hyneman and T. Schwarz for helpful conversations on these issues. This work was supported in part by the DOE under grant DE-SC0007859. B. Sheff is supported by the NSF GRFP program, and N. Steinberg is supported by a fellowship from the Leinweber Center for Theoretical Physics.
Figure 6: Top: Efficiency of h_125 reconstruction as a function of scalar mass, split into different categories based on the number of ξ-jets and photons. Bottom: Ratio to only using photons. The reconstruction efficiency is more than an order of magnitude better when including ξ-jets (pink) over a wide range of masses from 100 MeV to 10 GeV. More specifically, the 2 ξ-jet channel dominates in the range from 300 MeV to 6 GeV.