Approaching robust EFT limits for CP-violation in the Higgs sector

Constraining CP-violating interactions in effective field theory (EFT) of dimension six faces two challenges. Firstly, degeneracies in the multi-dimensional space of Wilson coefficients have to be lifted. Secondly, quadratic contributions of CP-odd dimension six operators are difficult to disentangle from squared contributions of CP-even dimension six operators and from linear contributions of dimension eight operators. Both problems arise when new sources of CP violation are present in the interactions between the Higgs boson and heavy strongly-interacting fermions. We show that degeneracies in the Wilson coefficients can be removed by combining measurements of Higgs-plus-two-jet production via gluon fusion with measurements of top-pair associated Higgs production. In addition, we demonstrate that the sensitivity of the analysis can be improved by exploiting the top-quark threshold in the gluon fusion process. Finally, we substantiate a perturbative argument about the validity of EFT by comparing the quadratic and linear contributions from CP-odd dimension six operators and use this to show explicitly that high-statistics measurements at future colliders enable the extraction of perturbatively robust constraints on the associated Wilson coefficients.


I. INTRODUCTION
The search for new physics beyond the Standard Model (SM) is a central task of the Large Hadron Collider (LHC). With established and motivated models under increasing pressure as more data get scrutinised, phenomenological analyses have turned to largely model-independent measurement and interpretation strategies adopting the framework of SM effective field theory (EFT) [1][2][3][4][5]. SMEFT as a theoretical framework has undergone rapid development over the past years, see e.g. [6][7][8][9][10][11][12][13].
EFTs facilitate the communication between the weak or measurement scale, and a UV completion that the EFT approach would like to see itself contrasted with. As the UV completion of the SM is currently unknown, the leading operator dimension six deformations of the SM imply 2499 independent parameters [14] that should be considered as a priori free when we would like to constrain generic beyond the SM (BSM) physics that is sufficiently close to the decoupling limit to justify the dimension six approach.
Established phenomena such as the observed matter-anti-matter asymmetry, however, provide us with a hint where motivated physics might be found, without making too many assumptions about the precise form of the UV completion itself. For instance, Sakharov's criteria [15] of baryogenesis motivate the direct search for CP-violating effects in addition to the CP-violating sources in the SM, which are insufficient to account for the observed matter-anti-matter asymmetry. As the only source of CP violation in the SM is associated with the fermion-Higgs interactions, the Higgs sector naturally assumes a central role in such a search, in particular because its precise form is a lot less well-constrained compared to the gauge sectors.
* Electronic address: christoph.englert@glasgow.ac.uk; † Electronic address: peter.galler@glasgow.ac.uk; ‡ Electronic address: andrew.pilkington@manchester.ac.uk; § Electronic address: michael.spannowsky@durham.ac.uk
CP-violating effects associated with "genuine" dimension six effects, i.e. contributions that arise from the interference of SM contributions with dimension-six operators, are limited to genuine CP-odd observables and asymmetries thereof [16,17]. In the context of Higgs physics, one motivated observable is the so-called signed $\Delta\phi_{jj}$ [18] (see also [19][20][21][22][23]) in gluon and weak boson fusion. The first measurements of the signed $\Delta\phi_{jj}$ were recently published by the ATLAS Collaboration in the $h \to \gamma\gamma$ [24] and $h \to ZZ$ [25] decay channels. An analogous observable can be constructed for top quark-associated Higgs production as well, as discussed in detail recently in Ref. [26]. Working in the dimension six linearised approximation, such observables are the only phenomenologically viable ones because the interference terms cancel identically for any CP-even observable, such as total cross sections, decay widths as well as momentum-transfer-dependent observables such as transverse momenta and invariant masses.
Issues arise, however, when multiple operators affect the same observable. In this case, large CP-violating effects in two or more operators can completely cancel, yielding a result that resembles the SM prediction. Higgs-plus-two-jet production via gluon fusion receives corrections from heavy strongly-interacting fermions, including the top quark and possible as-yet-undiscovered heavy fermions that lie far above the electroweak scale. In the effective field theory approach, we can express this as corrections from the two operators $\tilde{O}_g$ and $\tilde{O}_t$ of Eq. (1), where $t$ denotes the top quark, $G^a_{\mu\nu}$ is the gluon field strength with dual $\tilde{G}^{a\,\mu\nu} = \epsilon^{\mu\nu\rho\delta} G^a_{\rho\delta}/2$, $h$ represents the physical Higgs boson with mass $m_h = 125$ GeV and $v = 246$ GeV is the Higgs vacuum expectation value.$^1$ In the following we denote by $\tilde{c}_g$, $\tilde{c}_t$ the Wilson coefficients corresponding to the operators of Eq. (1).
It is well known that the $m_t$-associated threshold effects allow us to differentiate between these parameters in their CP-even manifestation using momentum-transfer-dependent observables [30][31][32][33][34][35][36][37]. Together with the information from top quark-associated Higgs production, this is enough to sufficiently disentangle the gluon-Higgs interactions from the top-Higgs contributions [38][39][40]. In the case of the CP-odd operators of Eq. (1), the momentum-transfer-dependent differential distributions used for the CP-even operators are identically zero for the new physics contribution. This makes the extraction of the CP-violating effects in the fermion-Higgs interactions and their separation from competing modifications of the gauge sector-Higgs interactions much more complicated at order $\sim \tilde{c}_t, \tilde{c}_g$.
The purpose of this work is to provide a detailed analysis of this issue and point out possible improvements that are straightforward to implement in existing experimental analyses. This paves the way to obtaining a more detailed picture of the Higgs CP properties at the LHC in the future.
Furthermore, we look at this analysis from the perspective of perturbative validity of the EFT approach. This is done by comparing linearised results to results obtained from including squared dimension six effects. The latter have been discussed in the past in detail (see e.g. [18,19,41-44]). Yet, it is important to highlight that in this case any CP-even observable also acts as a probe of the CP-odd interactions. Hence, by including quadratic contributions, searches for and interpretations of CP-violating effects in the context of EFT become highly dependent on (often implicit) EFT assumptions. Our aim is to find experimental and phenomenological setups in which the quadratic contributions are negligible, such that constraints on CP-violating interactions can be extracted in a perturbatively robust way and with minimal assumptions on CP-even contributions. In the context of these considerations we extrapolate our analysis to experiments at future colliders.
This work is organised as follows: In Sec. II we outline our numerical setup and provide an overview of the relevant observables. We also place our analysis into the context of existing LHC analyses in the Higgs final states that we consider. We present our results in Sec. III. In particular, we will comment on the comparison of the dimension-six linearised approach with CP-even effects from CP-odd interactions as alluded to above and extrapolate our results to obtain LHC and future hadron collider projections. We conclude in Sec. IV.
$^1$ The normalisation of $\tilde{O}_g$ corresponds to integrating out the top quark with CP-odd couplings with Yukawa coupling size $\sqrt{2} m_t/v$ in the limit $m_t \to \infty$ [27][28][29].

A. Processes
To analyse the prospects of discriminating $\tilde{O}_g$ from $\tilde{O}_t$ via the process $pp \to hjj$, we use a modified version of Vbfnlo [45,46]. Including dimension six interactions, we can write the full squared amplitude as
$$|\mathcal{M}|^2 = |\mathcal{M}_{\text{SM}}|^2 + 2\,\text{Re}\left(\mathcal{M}_{\text{SM}}\,\mathcal{M}^*_{\text{d6}}\right) + |\mathcal{M}_{\text{d6}}|^2. \tag{2}$$
Our modifications are such that the SM-interference and squared dimension six amplitude parts can be extracted individually, while keeping the full top mass dependence of $\tilde{O}_t$ [18,20-22]. The $\tilde{O}_g$ contributions were tested against $\tilde{O}_t$ by approaching the $m_t \to \infty$ limit numerically. This provides a strong cross check of both implementations and our modifications. The check is non-trivial because for linearised dimension six effects the integrated cross section is numerically zero (it is a CP-even observable), so that genuine CP-sensitive observables need to be employed for such cross checks. We output Les Houches events [47] of Higgs production in association with two light jets, $hjj$, and subsequently shower and hadronise them with Herwig [48,49]. For the analysis, we pass this output through a Rivet [50] analysis which closely follows the event selection of Ref. [24]. From the SM sample, we determine the event selection efficiencies on a bin-by-bin level by comparing parton-level with particle-level (Rivet) analysis and Ref. [24]. We study the $hjj$ production channel in the $h \to \gamma\gamma$ decay mode with the possibility of including modifications of the Higgs branching ratios for comparisons (see below). We use flat K factors of 1.5 (13/27 TeV) and 1.18 (100 TeV) following Ref. [51].
To disentangle Higgs-gluon from Higgs-top interactions we also consider $t\bar{t}h$ events, which are generated with MadGraph 5 [52]. The contributions from the effective operators in Eq. (1) have been implemented through a UFO [53] model file which we generated using FeynRules [54]. We study $t\bar{t}$-associated Higgs production in the $h \to b\bar{b}$ decay mode, whose branching ratio is indirectly affected by $\tilde{O}_g$ and $\tilde{O}_t$ as well. We detail the computation of branching ratios in the appendix. Similar to the $hjj$ case, we separate the $t\bar{t}h$ contributions according to Eq. (2). We focus exclusively on the production couplings and do not include CP-sensitive information that can be obtained from the Higgs decay through, e.g., angular observables. Such a measurement would constrain the properties of the Higgs coupling to a particular final state particle and not the properties related to its production. To reflect the impact of higher-order QCD corrections we include flat K factors of 1.30 (13/27 TeV) and 1.36 (100 TeV) [55-57].$^2$

B. Observables
Since we are interested in studying the CP-odd couplings of the Higgs to gluons and the top quark, we study CP-sensitive observables. In the $hjj$ channel, we therefore calculate the signed azimuthal angle between the jets, which is defined as
$$\Delta\phi_{jj} = \phi_{j,1} - \phi_{j,2},$$
where $\phi_{j,1}$ ($\phi_{j,2}$) is the azimuthal angle of the first (second) jet. The jets are ordered by their rapidity, i.e.
$$y_{j,1} > y_{j,2},$$
which promotes this angular distribution to a P-sensitive observable.
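As a minimal illustration of the rapidity-ordered signed azimuthal angle, the following Python sketch computes it for two reconstructed objects. The function name, the convention that the object with the larger rapidity is taken as the first one, and the wrapping of the result into $(-\pi, \pi]$ are assumptions made for illustration; they are not taken from the analysis code used in this work.

```python
import math


def signed_delta_phi(phi_a, y_a, phi_b, y_b):
    """Signed azimuthal angle phi_1 - phi_2 of two rapidity-ordered objects.

    Assumption: the object with the larger rapidity is labelled "1".
    The result is wrapped into the interval (-pi, pi].
    """
    if y_a >= y_b:
        dphi = phi_a - phi_b
    else:
        dphi = phi_b - phi_a
    # Reduce to (-2*pi, 2*pi) first, then fold into (-pi, pi].
    dphi = math.fmod(dphi, 2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    elif dphi <= -math.pi:
        dphi += 2.0 * math.pi
    return dphi
```

Because the ordering is fixed by rapidity rather than by transverse momentum, exchanging the two input objects leaves the result unchanged, while the sign of the angle carries the parity information.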
In Fig. 1, we show the $\Delta\phi_{jj}$ differential distribution for the linear approximation using a particular choice of $\tilde{c}_g$ and $\tilde{c}_t$ as an example. Fig. 1 (a) illustrates that the effects of $\tilde{O}_g$ and $\tilde{O}_t$ can be very small even for large values of $\tilde{c}_g$ and $\tilde{c}_t$ in the vicinity of the blind direction $\tilde{c}_t \sim -\tilde{c}_g$ if inclusive observables are considered. As can be seen from Fig. 1 (b), once $\Delta\phi_{jj}$ is defined with an additional binning in a kinematic observable such as the transverse momentum of the Higgs$^3$ that focuses on more exclusive events around the top quark threshold and above ($p_{T,h} \geq 150$ GeV), we can start disentangling the $\tilde{c}_g$ and $\tilde{c}_t$ directions. In principle, a fully-binned two-dimensional distribution $(\Delta\phi_{jj}, p_{T,h})$ could be considered. However, this would come at the price of a large reduction in statistics per bin and an enlarged statistical uncertainty. While we consider two search regions separated by a 150 GeV $p_{T,h}$ cut, the number of search regions, as well as their separation, could be treated as tuning parameters in a more realistic analysis. The ratio plot of Fig. 1 (b) also shows that the linear EFT contribution to the distribution is asymmetric and that the integrated cross section vanishes.
The qualitative behaviour of Fig. 1 can be understood from the $p_{T,h}$ differential distributions of the CP-even operators. For momentum transfers that resolve the top-Higgs interactions (and $\tilde{O}_t$ accordingly) the effect relative to $\tilde{O}_g$ should decrease as absorptive contributions of the top-loop are probed. The effects are not large, as can be expected from the success of the $m_t \to \infty$ approximation for SM $hjj$ production [58][59][60][61][62].
While Fig. 1 shows the $\Delta\phi_{jj}$ distribution in the linearised approximation, Fig. 2 presents an example for the case where the quadratic terms are included. As Fig. 2 shows, employing a top-threshold-related kinematical cut also helps to lift the blind direction when quadratic contributions are included. In this example we have chosen smaller values for the Wilson coefficients than for the linearised case because the $\tilde{c}_g$-$\tilde{c}_t$ resolving power for larger values is mostly driven by the total cross section, which does not vanish for the quadratic EFT contribution. This behaviour can already be observed for the inclusive case in Fig. 2 (a), where the ratio plot shows a slight offset in the EFT contribution with respect to the SM. In the high-$p_{T,h}$ sample in Fig. 2 (b) the relative contribution of the linear dimension six part is increased with respect to the inclusive case.
In complete analogy to $hjj$, for the $t\bar{t}h$ channel we consider the dileptonic decay of the top-quark pair (assuming an event selection efficiency of 2.5% [63]) and study the signed azimuthal angle $\Delta\phi_{\ell\ell}$ between the two charged leptons, defined as
$$\Delta\phi_{\ell\ell} = \phi_{\ell,1} - \phi_{\ell,2},$$
where $\phi_{\ell,1}$ ($\phi_{\ell,2}$) is the azimuthal angle of the first (second) charged lepton [26]. The leptons are ordered according to their rapidity, i.e.
$$y_{\ell,1} > y_{\ell,2}.$$
As leptons we consider electrons and muons from the decays $t \to bW \to b\ell^+\nu_\ell$ and $t \to bW \to b\tau^+\nu_\tau \to b\ell^+\nu_\ell\nu_\tau\bar{\nu}_\tau$ and the charge conjugated processes. The clean fully-leptonic final states of $t\bar{t}h$ production can possibly be augmented by semi-leptonic final states with appropriate jet-matching that removes the Higgs final states. We limit ourselves here to the clean final state, as we can expect reconstruction to be feasible without relying on non-transparent multivariate techniques, at the price of reduced statistics. Example $\Delta\phi_{\ell\ell}$ distributions for the linear approximation as well as including quadratic terms are shown in Fig. 3. Note that, although $t\bar{t}h$ receives corrections $\sim \tilde{c}_g$, these contributions solely arise from dressing the $gg \to t\bar{t}$ topologies with initial-state Higgs radiation. This renders $t\bar{t}h$ almost insensitive to $\tilde{O}_g$.

A. EFT-Linearised Approximation
In the first part of the analysis we investigate the SM and interference contributions. The SM contributions to the considered Higgs production channels are CP-even while the interference contributions are CP-odd. Since the inclusive cross section is a CP-even observable the contribution from the interference part is exactly zero. The Higgs branching ratios are not affected along the same lines and we adopt the branching ratios of the Higgs Cross Section Working Group in the following [64].
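To set limits on the CP-odd couplings in Eq. (1) we study the differential distribution
$$\frac{d\sigma(\tilde{c}_g, \tilde{c}_t)}{d\Delta\phi_X} = \frac{d\sigma_{\text{SM}}}{d\Delta\phi_X} + \tilde{c}_g\,\frac{d\sigma_g}{d\Delta\phi_X} + \tilde{c}_t\,\frac{d\sigma_t}{d\Delta\phi_X},$$
where $X = jj, \ell\ell$, and $\sigma_g$ ($\sigma_t$) is specific to the operator $\tilde{O}_g$ ($\tilde{O}_t$) but is independent of the Wilson coefficient $\tilde{c}_g$ ($\tilde{c}_t$) by construction. Since in this P-odd differential distribution the linear dependence $\sim \tilde{c}_g, \tilde{c}_t$ is non-vanishing, it is possible to scan the behaviour of $d\sigma(\tilde{c}_g, \tilde{c}_t, \Delta\phi_X)/dp_{T,h}$ to isolate the individual contributions $\sigma_g$, $\sigma_t$ through their characteristic momentum dependencies.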
To facilitate the limit setting in an adapted way, we can scan the entire parameter space by sampling only two points, $(\tilde{c}_g, \tilde{c}_t) = (1, 0)$ and $(\tilde{c}_g, \tilde{c}_t) = (0, 1)$, for each Higgs production channel. We then perform a fit on the basis of a $\chi^2$ of the differential distribution obtained from the three different data sets: $hjj$ with $p_{T,h} < 150$ GeV, $hjj$ with $p_{T,h} \geq 150$ GeV, and $t\bar{t}h$. The $\chi^2$ test statistic is given by
$$\chi^2(\tilde{c}_g, \tilde{c}_t) = \sum_{i,j} \left(b^i_{\text{SM+D6}}(\tilde{c}_g, \tilde{c}_t) - b^i_{\text{SM}}\right) V^{-1}_{ij} \left(b^j_{\text{SM+D6}}(\tilde{c}_g, \tilde{c}_t) - b^j_{\text{SM}}\right),$$
where $b^i_{\text{SM}}$ is the expected measurement in the $i$th bin assuming the SM is correct, and $b^i_{\text{SM+D6}}(\tilde{c}_g, \tilde{c}_t)$ represents the theoretical prediction for specified values of the Wilson coefficients. $V_{ij}$ is the covariance matrix that accounts for theoretical and experimental uncertainties. For the statistical uncertainty in the $\Delta\phi_{jj}$ distribution in the $h \to \gamma\gamma$ channel, we take the uncertainty in the measured fiducial cross section for Higgs-plus-two-jet production at 13 TeV and 36 fb$^{-1}$ [24] and redistribute this across the bins of the observable. This is then rescaled to the respective centre-of-mass energies and luminosities used in this analysis. For the statistical uncertainty in the $\Delta\phi_{\ell\ell}$ distribution in $t\bar{t}h$ production, we take the measured uncertainty of the dilepton channel in Ref. [63], redistribute this across the bins of the observable, and then rescale to the appropriate centre-of-mass energy and luminosity. We assume each systematic error to be fully correlated and adopt the following values:
$\Delta\phi_{\ell\ell}$: $\delta_{\text{th}} = 10\%$, $\delta^{\text{flat}}_{\text{sys}} = 20\%$ [63], $\delta^{\text{shape}}_{\text{sys}} = 1.0\%$ [65],
$\Delta\phi_{jj}$: $\delta_{\text{th}} = 10\%$ [24], $\delta^{\text{flat}}_{\text{sys}} = 10\%$ [24], $\delta^{\text{shape}}_{\text{sys}} = 2.5\%$ [24],
where $\delta_{\text{th}}$ is the theoretical uncertainty in the fiducial cross section of each process, and $\delta^{\text{flat}}_{\text{sys}}$ and $\delta^{\text{shape}}_{\text{sys}}$ represent experimental uncertainties in the normalisation and shape of the expected measurements, respectively. Note that for $t\bar{t}h$ production, the current theoretical uncertainty due to background mismodelling in the experimental analysis is much larger than 10%.
However, we assume that this will be reduced in future analyses, due to improved theoretical models and increasing use of control regions with the larger datasets.
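The structure of such a covariance-based $\chi^2$ can be sketched in a few lines of Python. This is only a toy under stated assumptions: the function names `covariance` and `chi2` are hypothetical, statistical uncertainties are taken as uncorrelated between bins, and every systematic uncertainty (including the shape uncertainty, whose exact treatment is not specified here) is modelled as a fully correlated relative shift of the SM expectation.

```python
import numpy as np


def covariance(b_sm, stat_unc, rel_sys):
    """Toy covariance matrix: diagonal statistical part plus fully
    correlated relative systematic uncertainties (an assumption)."""
    b_sm = np.asarray(b_sm, dtype=float)
    stat_unc = np.asarray(stat_unc, dtype=float)
    cov = np.diag(stat_unc ** 2)
    for delta in rel_sys:
        shift = delta * b_sm
        # Full bin-to-bin correlation: outer product of the absolute shifts.
        cov = cov + np.outer(shift, shift)
    return cov


def chi2(b_pred, b_sm, cov):
    """chi^2 = d^T V^{-1} d with d = prediction - SM expectation."""
    diff = np.asarray(b_pred, dtype=float) - np.asarray(b_sm, dtype=float)
    # Solve V x = d instead of inverting V explicitly.
    return float(diff @ np.linalg.solve(cov, diff))


# Two-bin toy: equal SM expectation, 10% statistical error per bin,
# and one fully correlated 10% normalisation uncertainty.
b_sm = [100.0, 100.0]
V = covariance(b_sm, stat_unc=[10.0, 10.0], rel_sys=[0.10])
chi2_antisymmetric = chi2([110.0, 90.0], b_sm, V)   # shape-like deviation
chi2_rate_only = chi2([110.0, 110.0], b_sm, V)      # pure rate deviation
```

In this toy setup the antisymmetric deviation is unaffected by the correlated normalisation uncertainty, whereas the pure rate shift is diluted by it, which mirrors why asymmetric distributions remain useful when normalisation uncertainties are sizeable.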
As an alternative strategy for the analysis of the linear dimension six contributions we also studied asymmetries based on $\Delta\phi_{jj}$ and $\Delta\phi_{\ell\ell}$, but they provide weaker constraints on $\tilde{c}_g$ and $\tilde{c}_t$ than the full distributions.
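For orientation, one common way to build an asymmetry from a binned signed-angle distribution is to compare the event yields at positive and negative angles. The Python sketch below implements this generic definition; it is an assumption for illustration, since the precise asymmetry definition used in the analysis is not spelled out here, and the function name is hypothetical.

```python
def sign_asymmetry(bin_centres, contents):
    """Generic asymmetry (N_plus - N_minus) / (N_plus + N_minus) of a
    binned distribution in a signed angle (illustrative definition)."""
    n_plus = sum(c for x, c in zip(bin_centres, contents) if x > 0)
    n_minus = sum(c for x, c in zip(bin_centres, contents) if x < 0)
    total = n_plus + n_minus
    if total == 0:
        raise ValueError("empty distribution")
    return (n_plus - n_minus) / total
```

Such a single number discards the bin-by-bin shape information, which is consistent with asymmetries being less constraining than the full distributions.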

B. Including Quadratic Dimension Six Terms
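In the second part of the analysis we include the quadratic contributions. Analogously to the linear case we use the $\Delta\phi_X$ distributions to calculate the $\chi^2$. Including quadratic contributions, the $\Delta\phi_X$ distributions are given by
$$\frac{d\sigma(\tilde{c}_g, \tilde{c}_t)}{d\Delta\phi_X} = \frac{d\sigma_{\text{SM}}}{d\Delta\phi_X} + \tilde{c}_g\,\frac{d\sigma_g}{d\Delta\phi_X} + \tilde{c}_t\,\frac{d\sigma_t}{d\Delta\phi_X} + \tilde{c}_g^2\,\frac{d\sigma_{gg}}{d\Delta\phi_X} + \tilde{c}_t^2\,\frac{d\sigma_{tt}}{d\Delta\phi_X} + \tilde{c}_g\tilde{c}_t\,\frac{d\sigma_{gt}}{d\Delta\phi_X},$$
i.e. we have to sample five parameter points per channel in order to scan the entire parameter space. Choosing $(\tilde{c}_g, \tilde{c}_t) = (1, -1)$ for the $\sigma_{gt}$ sample provides results with larger numerical stability, as the histogram is sampled close to the blind direction in $\tilde{c}_g$-$\tilde{c}_t$ space and therefore receives only a small contribution from the dimension six operators.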
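The idea of reconstructing the full $(\tilde{c}_g, \tilde{c}_t)$ dependence from a handful of sampled histograms can be sketched as a small linear-algebra problem. The Python code below is a toy under stated assumptions: the function names are hypothetical, the SM histogram is taken as known, and apart from the point $(1, -1)$ mentioned in the text the sample points used in the usage example are chosen for illustration only.

```python
import numpy as np


def basis_row(cg, ct):
    """Coefficients multiplying the templates (g, t, gg, tt, gt)."""
    return [cg, ct, cg ** 2, ct ** 2, cg * ct]


def extract_templates(points, histograms, sm_hist):
    """Solve for the five per-bin templates from five sampled histograms.

    points     : five (cg, ct) pairs with linearly independent basis rows
    histograms : the corresponding binned distributions (5 x nbins)
    sm_hist    : the SM distribution (nbins)
    """
    A = np.array([basis_row(cg, ct) for cg, ct in points], dtype=float)
    B = np.asarray(histograms, dtype=float) - np.asarray(sm_hist, dtype=float)
    return np.linalg.solve(A, B)  # rows ordered as g, t, gg, tt, gt


def predict(cg, ct, sm_hist, templates):
    """Binned prediction at an arbitrary point in (cg, ct)."""
    coeffs = np.array(basis_row(cg, ct), dtype=float)
    return np.asarray(sm_hist, dtype=float) + coeffs @ templates
```

A possible usage pattern, with illustrative sample points, is `extract_templates([(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1)], hists, sm)` followed by `predict(cg, ct, sm, templates)` inside the $\chi^2$ scan, so that no further event generation is needed during the fit.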

C. EFT validity
The Monte Carlo event simulation allows us to extract the average probed energy scales of the process. A criterion for the validity of the perturbative series expansion of Eq. (2) can then be phrased for an individual coupling as in Eq. (10), as the respective matrix element distributions sample the probed energy scales without making reference to the statistical sampling of the energy scales. Note that, similar to using renormalisation and factorisation scales as measures to quantify associated uncertainties, this choice is ad hoc and more constraining criteria can be formulated. While such a scaling is a typical behaviour of perturbative models, we can expect it to be violated for non-perturbative SM extensions. For the latter models, the naive hierarchy between dimension six and higher dimensional operators will be violated once we move closer to the characteristic energy scale of the strong interactions, which signals the need to transition from the effective picture to the new relevant microscopic degrees of freedom (see also [66,67] for related discussions). Put differently, Eq. (10) is a reason why we typically do not see large CP-violating phases in perturbative scenarios like two-Higgs doublet models or the (N)MSSM.
On a more practical level, as we need to employ Monte Carlo techniques to simulate LHC final states that make use of fixed-order perturbation theory, our phenomenological modelling of a particular scenario cannot be trusted when Eq. (10) is badly violated. Rearranging leads straightforwardly to Note that for our dimension six operators Q = Q (0) . Numerically, we find approximate linear dependencies of the average probed scales as i.e. under the criteria of Eq. (10) we can expect BSM contributions in the vicinity of $\lesssim 20\%$ compared to the SM. Again this is a typical ballpark of perturbative SM UV-completions. As to good approximation it becomes clear that the range of the Wilson coefficients is quickly pushed to small values if the probed energy scale that characterises consistency with the SM is pushed to high values.

D. Results and Comparison
In Fig. 4 the constraints on $\tilde{c}_g$ and $\tilde{c}_t$ at 95% confidence level (CL) are presented for the LHC at 13 TeV, showing the contributions from different production channels and kinematical regions. These contours are obtained by including only the SM and linear dimension six contributions. The green band in Fig. 4 represents the constraints using only the $t\bar{t}h$ sample. It illustrates that this channel is only very weakly sensitive to contributions from $\tilde{c}_g$, as mentioned in Sec. II B. However, the sensitivity to $\tilde{c}_t$ allows us to constrain the otherwise blind direction visible in the $hjj$ samples (orange and blue bands) to some extent. Comparing Fig. 4 with the perturbative bounds in Eq. (13) shows that generic searches for (C)P violation in production processes in the Higgs and top sector will be difficult at the LHC in the decay modes considered. Including the quadratic dimension six contributions results in the constraints shown in Fig. 5.$^4$ These constraints are much tighter than those obtained from the analysis in the linear case. Fig. 6 directly compares the limits obtained from the linear approximation and from the analysis which includes the quadratic contributions. This supports the previous point, highlighting that the quadratic contributions are significant, which results from the fact that they contribute to the total cross section, in contrast to the linear contributions. This is also illustrated in Fig. 6 by comparing to bounds that are obtained from only the shape of the distributions, discarding the information on the total cross section. The large effect of the quadratic contributions signals a violation of the perturbative constraint in Eq. (10). In other words, the stronger constraints in Fig. 6 rely on contributions that are perturbatively not under control and therefore should be treated with caution.
In addition, including quadratic effects (which are CP-even) amounts to specific assumptions about the CP-even operators in the Higgs sector which cannot be disentangled straightforwardly anymore.
We explore how this situation changes as we move to future colliders. Specifically, we study the two benchmark scenarios given in Tabs. I and II. Scenario 1 can be considered as a worst-case scenario where the event selection efficiency $\epsilon_{t\bar{t}h}$ for $t\bar{t}h$ events and the systematic uncertainties do not improve and the integrated luminosity only moderately increases between the different colliders. Scenario 2 is an optimistic one where systematic uncertainties are reduced by a factor of about two when going to higher energies and $\epsilon_{t\bar{t}h}$ increases by a factor of two. Furthermore, the integrated luminosity increases by an order of magnitude going from 13 TeV to 100 TeV in scenario 2. Note that in scenario 2 the event selection efficiencies for $hjj$ events are not adapted to the different collider energies but are kept fixed to the value at 13 TeV.
The results of this study are shown in Fig. 7, where the same analysis strategy for linear and quadratic contributions was applied. The increased centre-of-mass energy allows us to probe considerably higher energy scales, thus tightening the range of Wilson coefficients that can be considered to have a dominant effect from interference contributions (see Eq. (13)). However, the measurements come under increasing statistical control, which will allow us to sharpen the exclusion. As can be seen in Fig. 7, the constraints from the linearised approach approximate the quadratic exclusion. This shows that the quadratic contributions are considerably less relevant than we find for the LHC. In this way the constraints at a 27 TeV HE-LHC will not only surpass those of the LHC, but will be more robust as well.$^5$ As Fig. 7 shows, this is further strengthened at a 100 TeV machine, where the constraints for scenario 2 lie within the perturbative bounds given in Eq. (13). Even in scenario 1 the bounds from linear terms are very close to those where quadratic terms are included. Hence, we can probe Wilson coefficients in generic CP-violating dimension six extensions in a perturbatively solid way.

IV. CONCLUSIONS
The interactions of the Higgs boson with the heaviest quarks in the SM are motivated sources of CP violation. Analyses of top quark-related interactions that do not rely on particular Higgs final states and, consequently, are free of assumptions on the Higgs decay couplings are largely limited to the dominant top-related Higgs production processes, $t\bar{t}h$ and $hjj$.$^6$ Controlling competing effects from gluon-Higgs contact interactions that might arise from additional heavy fermions is crucial in this context. The small statistics expected in the $t\bar{t}h$ channel with clean leptonic final states, which enable a clean definition of sensitive observables based on the signed $\Delta\phi_{\ell\ell}$, limit the expected sensitivity as well as the possibility of lifting blind directions in the gluon-fusion related channels.
Furthermore, and quite different from (C)P-even deformations of the SM, power-counting arguments for the effective interactions have a direct phenomenological consequence. While fully binned distributions provide a sensitive probe under all considerations of our work, for small CP-violating phases where we would expect SM interference-driven contributions to play a significant role in the limit setting, the decoupling of rate information seriously impacts the overall sensitivity of CP-analyses at the LHC. This is only partially mended at the high-energy LHC with 27 TeV as energy thresholds and expected statistics do not lead to a big enough improvement. While the precise specifications of a 100 TeV hadron collider are currently debated, the expected statistical improvement at such a machine locates the expected limit in a parameter region where the interference-driven interpretation starts saturating the EFT limit, i.e. power-counting assumptions do not impact the constraint quantitatively. The latter point is also supported by estimates of the EFT-related parameter validity ranges that are accessed through Monte Carlo simulations.
In summary, we can state two main conclusions of our analysis. First, CP-violating effects in the top-Higgs sector can be extracted in a perturbatively robust way when measurements with high statistics are available. This can be realised by increased production cross sections at larger centre-of-mass energies and increased integrated luminosities. We observe that, for example, at a 100 TeV collider $\tilde{c}_g$ and $\tilde{c}_t$ are constrained to a parameter region where quadratic dimension six contributions are considerably reduced, resulting in perturbatively robust exclusion limits. This leaves the linear contribution as the dominant effect from dimension six operators. Second, since we consider CP-odd operators, the linear contribution is indeed CP-odd while the quadratic contribution is CP-even, which makes it difficult to disentangle from contributions of other CP-even operators. The fact that we can determine perturbatively robust results because the linear contribution is dominant therefore also puts us in the position to cleanly study CP-odd SM deformations, which otherwise would be intertwined with CP-even contributions.
The operators $\tilde{O}_g$ and $\tilde{O}_t$ add a pseudoscalar component to the following partial decay widths of the Higgs: $\Gamma(h \to gg)$, $\Gamma(h \to \gamma\gamma)$ and $\Gamma(h \to Z\gamma)$. Hence, the branching ratios $\text{BR}(h \to \gamma\gamma)$ and $\text{BR}(h \to b\bar{b})$ depend on $\tilde{c}_g$ and $\tilde{c}_t$,
$$\text{BR}(h \to X) = \frac{\Gamma^X_{\text{SM}} + \Gamma^X_{\text{dim.6}}}{\Gamma_{\text{SM}} + \Gamma_{\text{dim.6}}},$$
where $\Gamma_{\text{SM}}$ is the total SM decay width of the Higgs, $\Gamma^X_{\text{SM}}$ is the SM partial decay width into the final state $X$, $\Gamma_{\text{dim.6}}$ is the total decay width induced by the operators in Eq. (1) and $\Gamma^X_{\text{dim.6}}$ is the partial decay width into the final state $X$ due to dimension six operators. $\Gamma^{\gamma\gamma}_{\text{dim.6}}$, $\Gamma^{Z\gamma}_{\text{dim.6}}$ and $\Gamma^{gg,i}_{\text{dim.6}}$ can be read off the pseudoscalar part of the decay widths given, for example, in Ref. [68], together with the associated loop functions; there $c_w = \cos\theta_w$ and $s_w = \sin\theta_w$, where $\theta_w$ is the weak mixing angle. Finally, we rescale the partial decay widths by the respective K factors [69], including $K_{Z\gamma} = 1 - \alpha_s/\pi$. The K factor for a pseudoscalar decaying into $\gamma\gamma$ is one at NLO. The numerical values of the branching ratios as functions of $\tilde{c}_g$ and $\tilde{c}_t$ used in the analysis are given by
$$\text{BR}(h \to b\bar{b}) = \frac{0.577}{1 + 0.190\,\tilde{c}_g^2 + 0.397\,\tilde{c}_g\tilde{c}_t + 0.208\,\tilde{c}_t^2},$$
$$\text{BR}(h \to \gamma\gamma) = \frac{0.00228 + 0.000413\,\tilde{c}_t^2}{1 + 0.190\,\tilde{c}_g^2 + 0.397\,\tilde{c}_g\tilde{c}_t + 0.208\,\tilde{c}_t^2},$$
where the PDG [70] values for $G_F$, $\alpha$, $m_Z$, the Higgs Cross Section Working Group [64] values for the SM branching ratios of the Higgs, and $m_t = 173$ GeV, $m_h = 125$ GeV were used. We have cross-checked these results against an independent calculation using FeynArts/FormCalc/LoopTools [71][72][73].
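The numerical branching ratios quoted above lend themselves to a direct implementation. The following Python sketch reads the two expressions as ratios sharing a common denominator (an interpretation of the flattened formulas); the function names are hypothetical, and the coefficients are copied from the text rather than recomputed.

```python
def width_ratio(cg, ct):
    """Total-width rescaling 1 + Gamma_dim6 / Gamma_SM (coefficients from the text)."""
    return 1.0 + 0.190 * cg ** 2 + 0.397 * cg * ct + 0.208 * ct ** 2


def br_bb(cg, ct):
    """BR(h -> b bbar) as a function of the CP-odd Wilson coefficients."""
    return 0.577 / width_ratio(cg, ct)


def br_gamgam(cg, ct):
    """BR(h -> gamma gamma) as a function of the CP-odd Wilson coefficients."""
    return (0.00228 + 0.000413 * ct ** 2) / width_ratio(cg, ct)
```

Both functions depend on the Wilson coefficients only quadratically, so they are unchanged under $(\tilde{c}_g, \tilde{c}_t) \to (-\tilde{c}_g, -\tilde{c}_t)$, and the common denominator stays very close to one along $\tilde{c}_t = -\tilde{c}_g$.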