Higgs to charm quarks in vector boson fusion plus a photon

Experimentally probing the charm-Yukawa coupling in the LHC experiments is important, but very challenging due to an enormous QCD background. We study a new channel that can be used to search for the Higgs decay $H\to c\bar c$, using the vector boson fusion (VBF) mechanism with an associated photon. In addition to suppressing the QCD background, the photon gives an effective trigger handle. We discuss the trigger implications of this final state that can be utilized in ATLAS and CMS. We propose a novel search strategy for $H\to c\bar c$ in association with VBF jets and a photon, where we find a projected sensitivity of about 13 times the SM charm-Yukawa coupling at 95$\%$ $\text{CL}_s$ at High Luminosity LHC (HL-LHC). Our result is comparable and complementary to existing projections at the HL-LHC. We also discuss the implications of increasing the center of mass collision energy to 30 TeV and 100 TeV.


I. INTRODUCTION
Since the discovery of the Higgs boson ($H$) by the ATLAS [1] and CMS [2] collaborations, determining its properties has become a high priority for the experiments at the LHC. Higgs boson couplings to weak gauge bosons are governed by the spontaneous symmetry breaking of the gauge theory, and have been well measured. The mass generation of fermions, however, is a distinct question. In the Standard Model (SM), fermion mass terms emerge from Higgs boson Yukawa interactions, and thus the Yukawa couplings are proportional to the fermion masses. It is therefore crucial to establish the pattern of the Yukawa couplings to fermions in order to verify the SM and seek hints of Beyond-the-Standard-Model (BSM) physics. To date, Higgs couplings to third-generation fermions have been observed for $b\bar b$ [3,4], $t\bar t$ [5,6] and $\tau^+\tau^-$ [7][8][9]. Direct observations of the Higgs couplings to the second generation of fermions are thus of critical importance to further confirm the non-universal pattern of Yukawa couplings [10,11]. Because of its distinctive experimental signature, the decay mode $H\to\mu^+\mu^-$ via gluon-fusion ($gg$) production [12] and vector-boson fusion (VBF) production [13] is a promising candidate for observation in the near future [14,15]. Testing the Higgs Yukawa coupling to the charm quark ($y_c$), on the other hand, is known to be challenging at hadron colliders due to the formidable QCD backgrounds.
At the LHC, billions of $pp$ collisions happen every second, but only a small fraction of these events is recorded due to limitations in data storage capacity and the rate limitations of the detector readout electronics. A judicious selection of events, using triggers on objects such as isolated leptons, photons, or jets with high transverse momentum, is needed to record events of physical interest. Because the decay products of $H\to c\bar c$ have limited energy, searching for this decay mode requires exploiting the Higgs boson production mechanism to develop an efficient trigger strategy. There are currently two experimental probes of the charm-Yukawa coupling at the LHC. One approach is to use Higgs associated production with a leptonically decaying $Z$ boson ($ZH$ channel) [16,17]. The $ZH$ channel provides a bound on $\mu$ of 110 for 36.1 fb$^{-1}$ of data, where $\mu$ is defined as the ratio of the new physics cross section to the SM expectation. An extrapolation of the ATLAS analysis leads to a projection of $\mu < 6$ using 3 ab$^{-1}$ at the HL-LHC [18]. A recent preliminary result from ATLAS improves the sensitivity to $\mu < 26$ by utilizing 139 fb$^{-1}$ and the leptonic decays of the $W$ boson as well as $Z$ to invisible decays [19]. A similar result from CMS, also incorporating associated production with a $W$ boson and $Z$ to invisible decay modes and utilizing substructure techniques, yields an observed constraint of $\mu < 70$ [20]. The LHCb experiment has provided limits using 1.98 fb$^{-1}$ of data, with an observed constraint of $\mu < 6400$ [21] and a projected upper limit on $\mu$ of 50 after collecting 300 fb$^{-1}$ of data at 14 TeV, assuming no improvements in the detector performance or analysis [22].
Another approach does not rely on tagging charm quarks from the Higgs decay, but instead uses the decay $H\to J/\psi+\gamma$ [23][24][25][26][27], a process that has been searched for by ATLAS [28,29] and CMS [30]. This process gives a looser bound on the charm-Yukawa coupling of 50 times the SM prediction even at the HL-LHC [31], due to the contamination from $H\to\gamma\gamma^*$ with vector meson dominance in $\gamma^*$-$J/\psi$ mixing, which is about an order of magnitude larger than the contribution from the direct $Hc\bar c$ coupling [23,25,32].
In Table I, we collect the current results from the LHC searches (upper two rows) and the HL-LHC projections (lower two rows). For the $VH$ channel, the HL-LHC projections at CMS are estimated by simply scaling the signal strength $\mu$ with the increase in luminosity. We further translate the results to estimate the sensitivity to the modification of the SM coupling, $\kappa_c = y_c^{\rm BSM}/y_c^{\rm SM}$, as described in Sec. III C. For the channel $H\to J/\psi+\gamma$, $\kappa_c$ does not have a simple relation with $\mu$ due to the contamination from $H\to\gamma\gamma^*$ as noted above. It is instead estimated using Eq. (53a) in [25].
Rather than searching for $H\to c\bar c$, it has also been proposed to constrain $y_c$ by requiring a charm tag in the production process $gc\to Hc$ with $H\to\gamma\gamma$. This channel yields a 95% CL$_s$ limit on $\kappa_c$ ranging from 2.6 to 3.9 at the HL-LHC, depending on theoretical uncertainties [34]. A global fit of the Higgs couplings may lead to a tighter bound on the charm-Yukawa coupling [27,35,36,37], with a few model-dependent assumptions. In particular, a 95% CL$_s$ upper bound on $\kappa_c$ of 1.2 at the HL-LHC is claimed in [36], obtained from the upper limit on the branching ratio of Higgs boson decays to untagged BSM particles assuming $|\kappa_V| \le 1$. Exploiting kinematic information of the Higgs boson, such as its transverse momentum and rapidity distributions, was proposed for probing the light-quark Yukawa couplings [38][39][40] and implemented in a combined fit by CMS [41]. The asymmetry between $W^+$ and $W^-$ production has also been proposed to constrain $y_c$ [35,42], as has di-Higgs production [43,44]. There are proposals to further enhance the sensitivity to $H\to c\bar c$ by utilizing an additional photon radiation [45][46][47].
In this work, we propose a novel approach to probe the Yukawa coupling of the charm quark via VBF Higgs boson production with an additional photon. This work builds on the idea of introducing a new subset of the VBF production mode that utilizes photon radiation as an additional handle [48][49][50][51][52][53][54] for triggering and background suppression. The new channel we propose can provide complementary information to the existing searches using the $WH$ and $ZH$ channels.
The rest of the paper is organized as follows. We first lay out our search strategy in Sec. II, in particular a proposal for triggering on the signal events. We then present our analyses and results in Sec. III, including projections for the HL-LHC. We finally summarize our results and draw our conclusions in Sec. IV.

II. PROPOSED SEARCH STRATEGY
The decay branching fraction for $H\to c\bar c$ is about 3%, which leads to a cross section of about 0.1 pb (1 pb) from VBF ($gg$) production at the 13 TeV LHC [55]. This yields a sizeable signal sample with the currently achievable luminosity. However, the process $H\to c\bar c$ is not only difficult to trigger on, but also challenging to distinguish from the large QCD multi-jet background. More sophisticated search strategies are needed to reach the sensitivity required for signal observation. First, the VBF channel has a striking experimental signature in which a central Higgs boson is accompanied by two light jets with a large rapidity gap. Second, the addition of the photon improves the trigger efficiency compared to what can be achieved using only multi-jet final states, and suppresses the dominant gluon-rich multi-jet background. We therefore propose to search for the signal process

$$pp \to qqH\gamma \quad \text{with} \quad H\to c\bar c. \tag{1}$$

Our signal process has distinctive features, which are characterized by two c-jets from the Higgs boson decay and an energetic photon in the central region, with two light jets separated by a large rapidity gap. The dominant background is QCD multi-jet production associated with a photon, where at least two jets are tagged (or mistagged) as c-jets. Other backgrounds include $Z\gamma$+jets and VBF Higgs production+$\gamma$ with $H\to b\bar b$, where both b-jets are mistagged as c-jets. However, their contributions are expected to be much less significant than the QCD multi-jet background, and are thus not included in this analysis. Representative Feynman diagrams of the signal and leading background are shown in Fig. 1 for illustration. Our analyses are designed to isolate the signal based on its kinematic features, described in detail in Sec. III. To estimate the achievable sensitivity to the signal in the hadron collider environment, we use simulated events from Monte Carlo tools.
With the very large hadronic production rate at the LHC, the trigger system is designed to record events of physical interest. For the relatively soft final states in our signal process, a dedicated trigger strategy is essential to record the signal events.

A. Monte Carlo simulation
Our targeted signal as seen in Eq. (1) is $H\to c\bar c$ plus an additional $\gamma$. Both the signal and the QCD multi-jet background are generated at LO with MG5_aMC@NLO v2.6.5 [56] at a $pp$ collider center-of-mass (c.m.) energy of $\sqrt{s} = 13$ TeV using the PDF4LHC15_nlo_mc PDFs [57]. The Higgs boson in the signal process is then decayed into $c\bar c$ by MadSpin [58]. The renormalization and factorization scales are set to the EW scale of the $W$ mass ($m_W$).

TABLE I: Summary of existing search results at the LHC (upper two rows) and the HL-LHC projections (lower two rows). The CMS entry marked with * is scaled from the reported $\mu$ value to higher luminosity. The entries marked with † were computed from the reported $\mu$ values (see Sec. III C). The entry marked with †† is scaled according to the description in the text following [25]. Recoverable entries, current limits on $\mu$: 70 [20], 6400 [21], 120 [29]; HL-LHC projection on $\mu$: 6.3.
To enhance the generation efficiency, both samples are generated with the following parton-level requirements, which are slightly looser than the thresholds used in the analysis. We require two VBF jets inside the detector acceptance in pseudo-rapidity ($\eta$), with transverse momenta [...] and an isolated photon in the central region with transverse momentum [...]. The parton shower and hadronization are simulated with Pythia8 [59], and a fast detector simulation is implemented with Delphes3 using the default cards [60,61]. Jets are reconstructed using the anti-$k_t$ algorithm [62] with a radius parameter [...]. After these basic acceptance cuts, the cross sections of the signal and leading background processes at different center-of-mass energies are listed in Table II, using the same calculation set-up. We see that the signal rates are sizeable with the current and anticipated luminosities. However, the signal-to-background ratios are quite low, roughly of the order of $10^{-5}$, rendering the signal identification extremely challenging.

The ATLAS [63] and CMS [64] experiments both contain a two-level trigger system. The first level, or level-1, trigger system is composed of custom electronics, while the second level, or high-level trigger (HLT), runs software algorithms. The level-1 trigger systems of both experiments currently only utilize information from the calorimeter or muon sub-systems and, as a consequence, have shared items to trigger on electrons and photons [65]. Furthermore, due to the importance of selecting electrons with $p_T \approx m_W/2$, approximately 25% of the total level-1 rate is devoted to electron/photon triggers. More information on the breakdown of the rate and the specific implementation can be found in the documents describing the ATLAS trigger menu [66][67][68][69].
After events have been selected by the level-1 trigger using a single EM object, these events can be used to seed a variety of triggers in the HLT, including requiring additional VBF jets or jets with a b-tag to reduce the rate. Relevant to our current considerations, the ATLAS analyses described in [70,71] utilize a trigger with the following offline requirements:
• Photon $E_T^\gamma > 30$ GeV;
• At least four jets with $p_T^j > 40$ GeV;
• At least one pair of jets with $m_{jj} > 700$ GeV;
• At least one b-tagged jet with 77% efficiency.
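The offline requirements listed above can be written as a simple event filter. The following is a minimal Python sketch, not the ATLAS implementation: the jet tuple layout, the function names, and the treatment of jets as massless are assumptions made for illustration, as is the choice to form the high-mass pair only from jets above the $p_T$ threshold.

```python
import math
from itertools import combinations


def dijet_mass(jet_a, jet_b):
    """Invariant mass (GeV) of two jets treated as massless."""
    pt_a, eta_a, phi_a = jet_a[:3]
    pt_b, eta_b, phi_b = jet_b[:3]
    m_squared = 2.0 * pt_a * pt_b * (
        math.cosh(eta_a - eta_b) - math.cos(phi_a - phi_b)
    )
    return math.sqrt(max(m_squared, 0.0))


def passes_trigger(photon_et, jets):
    """Apply the four quoted offline requirements.

    jets: list of (pt [GeV], eta, phi, is_b_tagged) tuples (hypothetical layout).
    """
    # Photon E_T > 30 GeV
    if photon_et <= 30.0:
        return False
    # At least four jets with pT > 40 GeV
    hard_jets = [jet for jet in jets if jet[0] > 40.0]
    if len(hard_jets) < 4:
        return False
    # Assumption: the pair with m_jj > 700 GeV is formed from the hard jets.
    if not any(dijet_mass(a, b) > 700.0 for a, b in combinations(hard_jets, 2)):
        return False
    # Assumption: the b-tag decision is supplied per jet; the 77% working
    # point itself is not modelled here.
    return any(jet[3] for jet in hard_jets)
```

A charm-oriented variant, as discussed in the text, would replace the last requirement by a charm-tag flag or raise the 700 GeV threshold.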
There are also VBF triggers described in [69,72] with higher $m_{jj}$ and jet $p_T$ thresholds, which would have a lower acceptance for genuine VBF events. To develop a trigger for $H\to c\bar c$ final states, it seems plausible that the trigger described above for $H\to b\bar b$ could be modified for charm final states, by either requiring a charm tag or raising the $m_{jj}$ threshold. We leave the exact details and optimization to the experiments, and proceed by providing motivation for the use case of such a trigger.

A. Cut-based analysis
To gain physical intuition about the characteristics of our signal and background processes, we start with a simple cut-based analysis that applies thresholds on different kinematic observables.
At least two jets in the central region $|\eta_j| < 2.5$ are required to be c-tagged, using fixed efficiency values that depend on the truth flavor of the jet, inspired by [17]. Charm jets are assumed to have a tagging efficiency of 41%, while b-jets have a contamination probability of 25% and light jets a contamination probability of 5%. The two highest-$p_T$ c-tagged jets are identified as the signal jets from the Higgs boson decay, while the remaining jets are identified as the VBF jets. The VBF jet pair is required to have an invariant mass of at least 800 GeV so that the trigger requirement is fully efficient. In addition, the signal c-jet pair is required to have $p_T^{cc} > 80$ GeV to remove potential bias in the $m_{cc}$ distribution caused by the $p_T^j$ thresholds in the trigger.
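To make the fixed-efficiency tagging concrete, the short Python sketch below emulates it with the three probabilities quoted above. The dictionary keys, function names, and the assumption that the two jets are tagged independently are illustrative choices, not details taken from the analysis code.

```python
import random

# Fixed c-tagging probabilities by truth flavor, as quoted in the text.
C_TAG_PROB = {"c": 0.41, "b": 0.25, "light": 0.05}


def is_c_tagged(truth_flavor, rng):
    """Randomly decide whether a jet of the given truth flavor is c-tagged."""
    return rng.random() < C_TAG_PROB[truth_flavor]


def pair_tag_probability(flavor_1, flavor_2):
    """Probability that both jets are c-tagged, assuming independent tagging."""
    return C_TAG_PROB[flavor_1] * C_TAG_PROB[flavor_2]
```

Under the independence assumption, a true charm pair is double-tagged with probability $0.41^2 \approx 0.17$, a charm-light pair with about 0.02, and a light-light pair with 0.0025, which illustrates why the large multi-jet rate still leaves a sizeable mistagged background.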

Optimized analysis selections
In order to exploit the full phase space of the kinematic features, we choose several additional kinematic observables that are useful to further distinguish signal from background. The distributions of these observables after the pre-selections are shown in Figures 2 and 3. In addition to the pre-selections, we make some further judicious cuts based on the signal kinematics as follows. The invariant mass of the VBF jets (and photon) is large due to their back-to-back nature, so we require $m_{jj}, m_{jj\gamma} > 1000$ GeV.
The transverse momentum of the VBF jet pair is governed by the $W/Z$ exchange and is thus relatively low. We limit its value to be [...]. Since final states from electroweak processes tend to be more back-to-back than the QCD multi-jet background, we select events using the ratio between the magnitudes of the vector and scalar sums of the jet and photon momenta [...]. Furthermore, because the VBF signal features a large rapidity gap between the two forward-backward jets, events with a large pseudo-rapidity separation between the two jets are selected [...]. The multi-jet background also peaks at a relatively high value in Figure 2 because of the $m_{jj}$ requirement in the pre-selections.
As the photon is not radiated from the c-jets from the Higgs decay in our signal process, the angular separation between the signal c-jets and the photon tends to be larger than in the QCD processes. Therefore, we require [...], where $c_{1,2}$ are the leading and sub-leading c-jets. We also make use of the centrality of the photon relative to the VBF jets and require [...], where $y$ is the rapidity of the jet or photon [73]. Additionally, we utilize the azimuthal angular information in the transverse plane and require $\Delta\phi(cc, jj) > 2.3$ and $\Delta\phi(jj) < 2.1$. It should be noted that $\Delta\phi(jj)$, which is motivated by [74], has to our knowledge not been used before in the $H\to b\bar b$ searches. We then require the invariant mass of the c-jets and photon system to satisfy $m_{cc\gamma} < 700$ GeV.
Since this observable is highly correlated with $m_{cc}$, we choose a relatively loose cut here. The invariant mass of the signal c-jet pair, $m_{cc}$, is used as the final discriminant. After the above selection requirements, the distribution of $m_{cc}$ is shown in Fig. 4. As expected, the Higgs signal peaks at $m_H \approx 125$ GeV, while the multi-jet background has a flatter shape. Anticipating finite jet-mass resolution, we therefore require the c-jet pair to be in the Higgs mass window $100~\text{GeV} < m_{cc} < 140~\text{GeV}$.
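The selection requirements whose thresholds are stated explicitly in the text can be collected into a small cut-flow helper. This is a sketch only: the dictionary keys and function name are hypothetical, the observables are assumed to be precomputed (masses and momenta in GeV, angles in radians), and the cuts whose numerical values are not given in the text are deliberately left out.

```python
# (label, test) pairs for the cuts with explicitly stated thresholds.
STATED_CUTS = [
    ("pt_cc > 80", lambda o: o["pt_cc"] > 80.0),
    ("m_jj > 1000", lambda o: o["m_jj"] > 1000.0),
    ("m_jjgamma > 1000", lambda o: o["m_jjgamma"] > 1000.0),
    ("dphi_cc_jj > 2.3", lambda o: o["dphi_cc_jj"] > 2.3),
    ("dphi_jj < 2.1", lambda o: o["dphi_jj"] < 2.1),
    ("m_ccgamma < 700", lambda o: o["m_ccgamma"] < 700.0),
    ("100 < m_cc < 140", lambda o: 100.0 < o["m_cc"] < 140.0),
]


def failed_cuts(observables):
    """Return the labels of all stated cuts that the event fails."""
    return [label for label, test in STATED_CUTS if not test(observables)]


def passes_stated_cuts(observables):
    """True if the event passes every cut in STATED_CUTS."""
    return not failed_cuts(observables)
```

Returning the list of failed cuts rather than a single boolean makes it easy to build a cut-flow table and to see which requirement removes most of the background.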
The expected numbers of signal and background events for an integrated luminosity of 3 ab$^{-1}$ are shown in Table III. The truth-flavor components of the c-tagged jet pairs in the background are also shown in Table III. The major component is true c-jet pairs as expected, but it should also be noted that the sub-leading components mainly involve light jets mistagged as c-jets. This suggests that improving the discrimination between c-jets and light jets can enhance the significance of such a search.

B. Multivariate analysis
Recent analyses of data in high energy physics have made extensive use of machine learning techniques, including boosted decision trees (BDT) [3,4,75]. To improve on the sensitivity reached by the simple cut-based study in Sec. III A, a multivariate analysis is employed, starting with only the pre-selection cuts. A BDT is trained using the TMVA [...]. The BDT is constructed from 850 trees, each with a maximum depth of 3. A small depth is chosen because shallow trees are less susceptible to over-training but still perform very well with the aid of boosting algorithms. At each node of a tree, events are split into two subsets by cutting on an observable. The performance of the separation is assessed by the Gini index, defined as $p(1-p)$, where $p$ is the ratio of signal events to all events in that node. Therefore, a pure signal or background node corresponds to a Gini index of zero. The event sample is randomly split in half into training and test samples, where the former is used for BDT training and the latter is used for the analysis and for deriving limits. The distributions of the BDT score for signal and background, along with a receiver operating characteristic (ROC) curve, are shown in Figure 5. The background rejection in the ROC curve in the right panel is defined as one minus the background survival probability after the selection cuts. The BDT performs as expected, with a positive score being more "signal-like" and a negative score more "background-like". We can see that the separation between signal and background is fairly good. The test sample distribution is superimposed [...].

To maximize the sensitivity, instead of a single cut, the BDT score is divided into three signal regions: a low signal region with BDT score from -0.07 to 0.01, a medium signal region with BDT score from 0.01 to 0.08, and a high signal region with BDT score > 0.08, as indicated in Table IV.
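Two ingredients of this section are simple enough to spell out: the Gini index used as the node separation criterion and the assignment of events to the three BDT signal regions. The Python sketch below is illustrative only; the function names are hypothetical, and the treatment of events lying exactly on a region boundary is an assumption, since the text does not specify it.

```python
def gini_index(n_signal, n_background):
    """Gini index p(1 - p) of a node, with p the signal fraction."""
    total = n_signal + n_background
    if total == 0:
        return 0.0  # empty node: treated as pure by convention here
    p = n_signal / total
    return p * (1.0 - p)


def signal_region(bdt_score):
    """Map a BDT score to the low / medium / high signal region.

    Assumption: lower edges are exclusive and upper edges inclusive.
    Returns None for scores below the low region.
    """
    if bdt_score > 0.08:
        return "high"
    if bdt_score > 0.01:
        return "medium"
    if bdt_score > -0.07:
        return "low"
    return None
```

The index vanishes for a pure node and reaches its maximum of 0.25 for an equal mixture of signal and background, so a split is good when it lowers the event-weighted Gini index of the daughter nodes.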
The invariant mass distribution of the c-jet pair is not used in the BDT training but serves as the final discriminator, and is shown in Figure 6. The invariant mass $m_{cc}$ could be included in the BDT training to improve the separation between signal and background. However, as is commonly practiced by experiments, we reserve $m_{cc}$ as the most discriminating variable, which can be used in a combined fit for signal plus background. A mass window of 100 GeV to 140 GeV is again selected in the $m_{cc}$ distribution. The expected numbers of signal and background events for an integrated luminosity of 3 ab$^{-1}$ are shown in Table IV for the different BDT score intervals. With the BDT cut, we can reach a signal efficiency (background rejection in the interval of BDT score) of 15% (72%), 28% (85%) and 51% (95%) in the low, medium and high signal regions, respectively. In comparison, the signal efficiency (background rejection) of the optimized selection cuts in Sec. III A is 24% (97%), shown together with the BDT ROC curve in the right panel of Figure 5. For the same background rejection, the BDT achieves a signal efficiency of 40%, outperforming the cut-based analysis. The overall significance is calculated by combining the significances in the three signal regions in quadrature, with the largest contribution coming from the high signal region. A relative change in significance of roughly 50% is seen, where the relative change is defined as $(\delta_{\rm BDT} - \delta_{\text{cut-based}})/\delta_{\text{cut-based}}$ and $\delta = S/\sqrt{B}$.

Like the other extrapolations to the HL-LHC [18,77], we have not considered the effects of systematic errors. On the one hand, it is important to include the systematic effects to draw robust conclusions, especially given the rather small $S/B$ for the signal searches. On the other hand, the systematic effects due to the background measurement are largely unknown for the HL-LHC.
We believe that when the large data sample becomes available, the systematic errors may be controlled to a desirable level of a few percent or lower.

The expected 95% CL$_s$ upper limit on the signal strength $\mu$ in the absence of systematic uncertainties is approximated by $2\sqrt{B}/S$. The BSM modification of the charm-Yukawa coupling is parametrized using the $\kappa$-scheme as $\kappa_c = y_c^{\rm BSM}/y_c^{\rm SM}$; the number of signal events would then approximately scale as [...], where $\kappa_H$ denotes the BSM modification of the Higgs width. In principle, $\kappa_H$ depends on all BSM modifications of SM Higgs decay channels and any new channels. If we assume that $\kappa_c$ is the only non-SM modification, the upper limit on the signal strength can be translated into a limit on the charm-Yukawa coupling [...], which are shown in [...].

It is natural to ask to what extent the probe of the charm-Yukawa coupling can be improved at future higher-energy hadron colliders, such as the HE-LHC [78] and the FCC-hh [79]. The answer obviously depends on the detector performance in charm tagging, photon detection, and QCD jet rejection. We nevertheless perform a crude estimate of the sensitivity reach by assuming the same detector performance as in the HL-LHC study. The sensitivity is estimated by extrapolating the HL-LHC performance shown in the previous sections. We again calculate the signal cross section and the leading QCD background for $\sqrt{s}$ values of 30 TeV and 100 TeV. By scaling the expected numbers of signal and background events for $\sqrt{s} = 13$ TeV to higher energies, we extrapolate the sensitivities, assuming the same luminosity of 3 ab$^{-1}$, as shown in Table VI. We have assumed that the cross section increase from $\sqrt{s} = 13$ TeV to higher energies does not change as a function of the kinematic variables that are used as input to the BDT.
Since both signal and background cross sections increase approximately linearly with center-of-mass energy, we see that the sensitivity scales roughly as the square root of the center-of-mass energy for the same integrated luminosity, reaching $S/\sqrt{B}$ of 0.25 and $\kappa_c \sim 3$.
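The chain from event yields to a coupling limit used in Secs. III B and III C can be sketched numerically. The Python example below combines region significances in quadrature, evaluates the relative change and the approximate limit $2\sqrt{B}/S$, and converts a limit on $\mu$ into a limit on $\kappa_c$. The conversion assumes the commonly used form $\mu = \kappa_c^2/[1 + \mathrm{BR}_{c\bar c}^{\rm SM}(\kappa_c^2 - 1)]$, valid when $\kappa_c$ is the only modification; this expression, the value $\mathrm{BR}_{c\bar c}^{\rm SM} = 0.0289$ (the text quotes "about 3%"), and all function names are assumptions of this sketch rather than quantities taken from the analysis.

```python
import math

# Assumed SM branching fraction for H -> c cbar (text: "about 3%").
BR_CC_SM = 0.0289


def combined_significance(regions):
    """Quadrature sum of S/sqrt(B) over (S, B) pairs, one per signal region."""
    return math.sqrt(sum(s * s / b for s, b in regions))


def relative_change(delta_new, delta_old):
    """Relative change in significance, (new - old) / old."""
    return (delta_new - delta_old) / delta_old


def mu_upper_limit(delta):
    """Approximate 95% CL_s limit 2*sqrt(B)/S = 2/delta, without systematics."""
    return 2.0 / delta


def kappa_c_from_mu(mu, br=BR_CC_SM):
    """Invert mu = k^2 / (1 + br * (k^2 - 1)) for k = kappa_c (assumed form)."""
    denominator = 1.0 - mu * br
    if denominator <= 0.0:
        # mu >= 1/br cannot be reached for any finite kappa_c in this model.
        raise ValueError("no finite kappa_c reproduces this signal strength")
    return math.sqrt(mu * (1.0 - br) / denominator)
```

With these assumptions, `kappa_c_from_mu(29.0)` gives a value of roughly 13, and since $\delta = 2/\mu$, the quoted limits of 43 and 29 correspond to `relative_change(2/29, 2/43)` of about 0.48. Note that the assumed relation saturates at $\mu = 1/\mathrm{BR}_{c\bar c}^{\rm SM} \approx 35$, so the translation is very sensitive to the branching fraction for limits on $\mu$ near that value.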

IV. SUMMARY AND CONCLUSIONS
Testing the charm-Yukawa coupling at the LHC is an important but very challenging task due to the overwhelmingly large QCD background. In this paper, we first reviewed the existing searches at the LHC and obtained the projection at the HL-LHC in probing the charm-Yukawa coupling, as summarized in Table I.
We proposed to study a new channel: Higgs boson production via the VBF mechanism plus an additional hard photon, in the hope of observing the challenging decay mode $H\to c\bar c$. The additional photon helps both to trigger on this hadronic decay process and to suppress the gluon-rich QCD multi-jet background. The search that we proposed can utilize existing ATLAS and CMS triggers or offer new opportunities, for instance utilizing charm tagging in the HLT. We presented our specific proposal for the trigger design in Sec. II.
Based on the trigger considerations and the kinematic features of the signal, we first performed a cut-based analysis in Sec. III A, which yielded a sensitivity to the signal strength $\mu$ of about 43 times the SM value at 95% CL$_s$ at the HL-LHC. A boosted decision tree described in Sec. III B enhanced the sensitivity by roughly 50% (using the same definition of relative change as in Sec. III B), to about 29 times the SM value at 95% CL$_s$ at the HL-LHC, corresponding to an upper limit on $y_c$ of 13 times the SM value. Our constraint on the charm-Yukawa coupling, summarized in Table V, is better than that from the $H\to J/\psi+\gamma$ channel [31]. Even though the limit obtained in our analyses is slightly weaker than that of the $ZH$ direct search by ATLAS [18], our channel will provide complementary information, and a combination of different search channels can further improve the limit.
Global analyses of all the Higgs couplings [27,35,36,37] could result in a more sensitive probe than the direct search for $H\to c\bar c$, but they rely on model-dependent assumptions such as $|\kappa_V| \le 1$. Direct measurements of the charm-Yukawa coupling are nevertheless indispensable.
Finally, we provided the first investigation of the VBF cross section with an associated photon at higher collider energies of 30 TeV and 100 TeV. Assuming the same signal and background acceptance as well as similar detector performance, some improvement in the sensitivity would be anticipated, as shown in Table VI.
As we are entering the new phase of the LHC mission, it is important to push for the challenging measurements and to fully realize the potential for discovery at both the energy frontier and the precision frontier.