Machine learning the Higgs boson-top quark CP phase

We explore the direct Higgs-top CP measurement via the $pp\to t\bar{t}h$ channel at the high-luminosity LHC. We show that a combination of machine learning techniques and efficient kinematic reconstruction methods can boost new physics sensitivity, effectively probing the complex $t\bar{t}h$ multi-particle phase space. Special attention is devoted to top quark polarization observables, uplifting the analysis from a raw rate to a polarization study. Through a combination of hadronic, semi-leptonic, and di-leptonic top pair final states in association with $h\to \gamma\gamma$, we obtain that the HL-LHC can probe the Higgs-top coupling modifier and CP-phase, respectively, up to $|\kappa_t|\lesssim 8\%$ and $|\alpha|\lesssim 13^{\circ}$ at $68\%$~CL.


I. INTRODUCTION
New sources of CP violation can be a key ingredient to explain the matter-antimatter asymmetry of the universe [1][2][3]. Hence, the quest for new CP violating interactions is a clear target for beyond the Standard Model (SM) searches, being a critical component of the physics program of the LHC. A particularly interesting option is that the Higgs boson couplings present these new physics sources.
From the theoretical point of view, some Higgs interactions are more inclined to display CP violation effects than others. While the widely studied beyond the SM CP structure of the Higgs to vector boson couplings is loop suppressed, arising only at dimension-6 or higher [4,5], CP violation in Higgs to fermion interactions can manifest already at tree level [6], being naturally larger. Owing to its magnitude, the top quark Yukawa coupling can play a significant role in this context and be most sensitive to new physics.
In the present manuscript, we perform a detailed investigation of the Higgs-top CP sensitivity with the $pp \to t\bar{t}h$ channel at the HL-LHC, considering the most promising decay mode, $h \to \gamma\gamma$. We explore the complex multiparticle final state with a combination of machine learning techniques and efficient kinematic reconstruction methods. Since distinct Higgs-top CP-phases affect the net top and anti-top quark polarization, propagating the spin effects to the top quark final states, we devote special attention to including the top polarization observables in our study. In particular, these spin effects are used to define genuine CP-observables. After motivating and constructing the relevant kinematic observables, we evaluate how much information can be extracted with them, using the Fisher information as a convenient metric. We show that probing the $pp \to t\bar{t}h$ channel not only in terms of a raw rate, but as a polarized process that exploits the complex multiparticle final state, offers a crucial pathway to probe the underlying production dynamics and access possible new physics effects.
The structure of this paper is as follows. In Section II, we present the theoretical parametrization for the top Yukawa coupling. We discuss the new physics effects on the top polarization, define the CP-sensitive observables, and quantify how much information on the CP-phase can be extracted using distinct observables. In Section III, we present the kinematic reconstruction methods, which are needed to build the observables most sensitive to new physics and to maximally explore the $t\bar{t}h$ final state. Next, in Section IV, we move on to the detailed analysis, where we derive the projected sensitivities for the Higgs-top CP-phase at the HL-LHC. This study is inclusive with respect to the top pair final states, combining the di-leptonic, semi-leptonic, and hadronic channels. Finally, a summary of our key findings is delivered in Section V.
We parametrize the top quark Yukawa interaction as

$$\mathcal{L} = -\frac{m_t}{v}\,\kappa_t\,\bar{t}\left(\cos\alpha + i\gamma_5\sin\alpha\right)t\,h, \qquad (1)$$

where $m_t$ is the mass of the top quark, $v$ is the vacuum expectation value in the SM ($v = 246$ GeV), $\kappa_t$ is a real number, and $\alpha$ is the CP-phase. The interaction between the CP-even Higgs boson and the top quark in the SM is represented by $(\kappa_t, \alpha) = (1, 0)$, while $\alpha = \pi/2$ results in a pure CP-odd Higgs-top interaction. New physics contributions in Eq. (1) will display effects both in the $t\bar{t}h$ production and in the Higgs decay, $h \to \gamma\gamma$. Whereas the Higgs decay mainly changes the total signal rate [15], we will devote special attention to probing the new physics effects in the Higgs production, exploring the kinematics of the top quark final states. This will be an essential ingredient to enhance the sensitivity to CP-phase effects.

A. Top Polarization
Among the observables sensitive to the structure of the top quark Yukawa interaction, the spin correlations between the top and anti-top in $t\bar{t}h$ associated production offer a prominent pathway for precision studies [6, 15-20, 24, 32, 36-41]. Owing to its short lifetime ($\sim 10^{-25}$ s) [42], the top quark is expected to decay before hadronization occurs ($\sim 10^{-24}$ s) and spin decorrelation effects take place ($\sim 10^{-21}$ s) [43]. Thus, the spin-spin correlations between $t$ and $\bar{t}$ can be traced back from the top quark decay products. In particular, it is possible to observe correlations between any two decay products, one from the top quark and the other from the anti-top quark. The correlations scale with the spin analyzing power associated with each top decay product.
More precisely, the top quark decay products in the leptonic $t \to W^+ b \to \ell^+ \nu b$ and hadronic $t \to W^+ b \to \bar{d} u b$ channels are correlated with the top quark spin axis as follows:

$$\frac{1}{\Gamma}\frac{d\Gamma}{d\cos\xi_i} = \frac{1}{2}\left(1 + \beta_i P_t \cos\xi_i\right), \qquad (2)$$

where $\Gamma$ is the partial decay width, $\xi_i$ is the angle between the $i$-th decay product and the top quark spin axis in the top quark rest frame, $P_t$ is the polarization of the decaying top ($-1 \leq P_t \leq 1$), and $\beta_i$ is the spin analyzing power of the final state particle $i$ [44]. At leading order, the coefficient $\beta_i$ is $+1$ for the charged lepton $\ell^+$ and the $\bar{d}$-quark, $-0.3$ for the neutrino and the $u$-quark, $-0.4$ for the $b$-quark, and $0.4$ for the $W$-boson. The sign of the coefficient $\beta_i$ is flipped for anti-top decays. Owing to the $V-A$ current structure of the weak interaction, the charged lepton is a prominent spin analyzer, favoring studies with di-leptonic top pairs. Exploring this phenomenology, the $\Delta\phi^{\rm lab}_{\ell\ell}$ observable, which is the azimuthal angle difference between the two charged leptons in the lab frame, is a good example of a probe that has been found effective in accessing the Higgs-top CP properties [6,24]. Remarkably, the sensitivity of $\Delta\phi^{\rm lab}_{\ell\ell}$ improves further in the boosted Higgs regime due to the change in the net polarization of the top pair at high energies.
Analogously to the charged lepton, the $d$-quark also presents maximal spin analyzing power. However, it is a challenging task to tag a $d$-quark jet in a collider environment. An efficient solution is to select the softest of the two light-quark jets, $j_{\rm soft}$, in the top quark rest frame. This choice raises the spin analyzing power of $j_{\rm soft}$ to 50% of the lepton's [45]. This approach boosts the spin correlation analyses for the semi-leptonic and hadronic top quark pairs. Several observables can be defined exploiting this fact. A particularly relevant example, which we will explore in this manuscript, is the azimuthal angle difference between the charged lepton and the softest light jet in the top pair rest frame, $\Delta\phi^{t\bar{t}}_{\ell j_{\rm soft}}$.
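As a small numerical illustration of how the spin analyzing power controls the angular distribution, the sketch below assumes the standard linear form $(1/\Gamma)\,d\Gamma/d\cos\xi_i = (1+\beta_i P_t\cos\xi_i)/2$ and evaluates the resulting forward-backward asymmetry, which equals $\beta_i P_t/2$. The dictionary keys and function names are illustrative choices, not taken from any analysis code.

```python
# Leading-order spin analyzing powers for top decay products, as quoted in the text.
# The sign of each coefficient flips for anti-top decays.
SPIN_ANALYZING_POWER = {
    "lepton": 1.0,
    "d_quark": 1.0,
    "neutrino": -0.3,
    "u_quark": -0.3,
    "b_quark": -0.4,
    "W_boson": 0.4,
}


def decay_density(cos_xi, beta, polarization):
    """Normalized angular density (1/Gamma) dGamma/dcos(xi)."""
    if not -1.0 <= polarization <= 1.0:
        raise ValueError("polarization must lie in [-1, 1]")
    if not -1.0 <= cos_xi <= 1.0:
        raise ValueError("cos_xi must lie in [-1, 1]")
    return 0.5 * (1.0 + beta * polarization * cos_xi)


def forward_backward_asymmetry(beta, polarization, n_steps=1000):
    """Midpoint-rule integration of the density over cos_xi > 0 and cos_xi < 0."""
    h = 1.0 / n_steps
    forward = h * sum(
        decay_density((i + 0.5) * h, beta, polarization) for i in range(n_steps)
    )
    backward = h * sum(
        decay_density(-(i + 0.5) * h, beta, polarization) for i in range(n_steps)
    )
    return (forward - backward) / (forward + backward)
```

For a fully polarized top, the charged lepton gives an asymmetry of 1/2, while the $b$-quark gives $-0.2$, which makes the hierarchy among spin analyzers explicit.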

B. CP-sensitive Observables
Various kinematic observables have been studied in the literature to access the Higgs-top CP structure in the $pp \to t\bar{t}h$ channel. Some illustrative distributions are presented in Fig. 1, such as the transverse momentum of the Higgs boson $p_{T,h}$ (top left) [46,47], the invariant mass of the top pair $m_{t\bar{t}}$ (top center), the product of projections of the top and anti-top momenta $b_4 = p^z_t\, p^z_{\bar{t}} / (p_t\, p_{\bar{t}})$ (top right) [37], and the angle between the top quark and the beam direction in the $t\bar{t}$ CM frame, $\theta^*$, which is also known as the Collins-Soper angle (bottom left) [24]. These observables result in distinct profiles for different Higgs-top CP-phases. The pure CP-odd phase, $\alpha = \pm\pi/2$, leads to a shift to higher energies in the peak of the distributions compared to the SM scenario, $\alpha = 0$. Different CP-phases interpolate between these two profiles without sensitivity to the sign of the phase.
The variables $p_{T,h}$, $m_{t\bar{t}}$, $b_4$, and $\theta^*$ are CP-even observables, being sensitive to the squared terms $\cos^2\alpha$ and $\sin^2\alpha$. Thus, these probes are indifferent to the interference between the CP-even and CP-odd Higgs-top couplings, which is proportional to $\cos\alpha\sin\alpha$. In particular, they are not sensitive to the sign of the CP-phase. Genuine CP-sensitive observables can be constructed from antisymmetric tensor products that require four linearly independent four-momenta. Since the top polarization is carried over to the decay products, it is possible to construct such observables using, for instance, the top, the anti-top, and their decay products [16,20,24]. In general, the antisymmetric tensor product can be expressed as

$$\epsilon(p_t, p_{\bar{t}}, p_i, p_k) \equiv \epsilon_{\mu\nu\rho\sigma}\, p_t^{\mu}\, p_{\bar{t}}^{\nu}\, p_i^{\rho}\, p_k^{\sigma}, \qquad (3)$$

where $\epsilon_{0123} = 1$, and $\{i, k\}$ represent the final state particles produced from the top and the anti-top decays, respectively.
In the $t\bar{t}$ CM frame, Eq. (3) conveniently simplifies to a quantity proportional to $\vec{p}_t \cdot (\vec{p}_i \times \vec{p}_k)$. This relation can be used to define azimuthal angle differences between the decay products, in the $t\bar{t}$ CM frame, that are odd under CP-transformations [24]:

$$\Delta\phi^{t\bar{t}}_{ik} = \mathrm{sgn}\left[\vec{p}_t \cdot (\vec{p}_i \times \vec{p}_k)\right] \arccos\left[\frac{\vec{p}_t \times \vec{p}_i}{|\vec{p}_t \times \vec{p}_i|} \cdot \frac{\vec{p}_t \times \vec{p}_k}{|\vec{p}_t \times \vec{p}_k|}\right]. \qquad (4)$$

For illustration, we present in Fig. 1 the azimuthal angle between the two charged leptons, $\Delta\phi^{t\bar{t}}_{\ell\ell}$, in the fully leptonic case (bottom center) and between the charged lepton and the softest light jet in the top rest frame, $\Delta\phi^{t\bar{t}}_{\ell j_{\rm soft}}$, in the semi-leptonic case (bottom right). Two comments are in order. First, we notice that $\Delta\phi^{t\bar{t}}_{ik}$ is indeed sensitive to the sign of the CP-phase, as illustrated by a comparison between the distribution profiles for $\alpha = \pi/4$ and $\alpha = -\pi/4$. Second, in light of the spin analyzing power of the charged lepton in relation to $j_{\rm soft}$, the relative CP-sensitivity of the di-leptonic against the semi-leptonic correlation follows our expectation. Namely, the beyond the SM effects in the $\Delta\phi^{t\bar{t}}_{\ell j_{\rm soft}}$ observable are $\sim 50\%$ weaker than in $\Delta\phi^{t\bar{t}}_{\ell\ell}$. This can be observed by comparing the bottom panels of these plots, where we display the BSM/SM ratio.
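The signed azimuthal angle can be computed directly from three-momenta. The following is a minimal sketch, assuming the definition in terms of the planes spanned by $(\vec{p}_t, \vec{p}_i)$ and $(\vec{p}_t, \vec{p}_k)$, with the sign taken from the triple product $\vec{p}_t\cdot(\vec{p}_i\times\vec{p}_k)$. All inputs are assumed to be three-vectors already boosted to the $t\bar{t}$ CM frame; the helper names are illustrative.

```python
import math


def cross(a, b):
    """Cross product of two three-vectors given as (x, y, z) tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def delta_phi_ttbar(p_top, p_i, p_k):
    """Signed angle between the (p_top, p_i) and (p_top, p_k) planes, in (-pi, pi]."""
    n_i = cross(p_top, p_i)
    n_k = cross(p_top, p_k)
    denom = norm(n_i) * norm(n_k)
    if denom == 0.0:
        raise ValueError("decay product collinear with the top direction")
    # Clamp against rounding errors before taking the arccos.
    cos_phi = max(-1.0, min(1.0, dot(n_i, n_k) / denom))
    triple = dot(p_top, cross(p_i, p_k))
    sign = 1.0 if triple >= 0.0 else -1.0
    return sign * math.acos(cos_phi)
```

Exchanging the two decay products flips the sign of the triple product and hence of the angle, which is the property that makes the observable sensitive to the sign of $\alpha$.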

C. Observable Information
Before proceeding to a full analysis, let us pause for a moment to better understand which distributions and channels are sensitive to the CP-phase α. In particular, we would like to quantify and compare how much information on the CP-phase is available using the different observables in a parton level setup. This will provide some benchmarks and highlight the main ingredients required for an efficient analysis strategy that will be presented in Sec. IV.
Let us first consider the spin correlation observables $\Delta\phi^{t\bar{t}}_{ik}$ between two decay products from the top and anti-top, which probe the new physics effects linear in $\alpha$. A convenient metric to quantify the sensitivity of these observables to the parameters of our model is given by the Fisher information [48,49]. Its component describing the sensitivity to the CP-phase $\alpha$ is defined as

$$I_{\alpha\alpha} = \mathbb{E}\left[\frac{\partial \log p(x|\kappa_t, \alpha)}{\partial\alpha}\, \frac{\partial \log p(x|\kappa_t, \alpha)}{\partial\alpha}\right]. \qquad (5)$$

Here, $p(x|\kappa_t, \alpha)$ is the likelihood function, which describes the probability to observe a set of events with corresponding observables $x$ as a function of the model parameters $\kappa_t$ and $\alpha$. $\mathbb{E}[\cdot]$ denotes the expectation value evaluated at the SM point, $(\kappa_t, \alpha)_{\rm SM} = (1, 0)$. In the following, we use the MadMiner package to calculate the Fisher information [50].
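For a binned observable with Poisson-distributed counts, the Fisher information takes the simple form $I_{\alpha\alpha} = \sum_b (\partial n_b/\partial\alpha)^2 / n_b$, where $n_b$ is the expected count in bin $b$. The sketch below evaluates this with a central finite difference. The two-bin toy model, its slope, and the function names are hypothetical and serve only to illustrate the scaling with the event rate; this is not the MadMiner implementation.

```python
def histogram_fisher_information(expected_counts, alpha0=0.0, eps=1e-3):
    """Fisher information of a Poisson histogram with respect to alpha.

    expected_counts: callable mapping alpha to a list of expected bin counts.
    """
    n_0 = expected_counts(alpha0)
    n_plus = expected_counts(alpha0 + eps)
    n_minus = expected_counts(alpha0 - eps)
    info = 0.0
    for c_0, c_plus, c_minus in zip(n_0, n_plus, n_minus):
        if c_0 <= 0.0:
            continue  # empty bins carry no information
        derivative = (c_plus - c_minus) / (2.0 * eps)
        info += derivative * derivative / c_0
    return info


def toy_sign_asymmetry(alpha, n_total=100.0, slope=0.2):
    """Hypothetical two-bin model: counts with positive and negative angle,
    with an asymmetry that grows linearly in alpha."""
    return [
        0.5 * n_total * (1.0 + slope * alpha),
        0.5 * n_total * (1.0 - slope * alpha),
    ]
```

In this toy model the information equals `n_total * slope**2`, so it grows linearly with the event rate and quadratically with the size of the asymmetry.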
In the left panel of Fig. 2, we show the Fisher information associated with the CP-sensitive spin correlation observables for the di-leptonic (red), semi-leptonic (gray), and hadronic (blue) channels. The bars on the left show the full information, i.e., the information that could be accessed via a comprehensive multivariate analysis. This was estimated using the machine learning method based on the SALLY algorithm [51][52][53] trained with all possible spin correlation observables. The remaining bars show the information in individual observables $\Delta\phi^{t\bar{t}}_{ik}$, which was estimated using a histogram-based approach.
Focusing first on the di-leptonic channel, the most sensitive among these observables is the spin correlation between the leptons, $\Delta\phi^{t\bar{t}}_{\ell\ell}$, since the spin analyzing power of the charged leptons is maximal. The next most sensitive observables are those where a charged lepton has been replaced with a $b$-jet or a $W$ boson. We observe that the corresponding Fisher information in $\Delta\phi^{t\bar{t}}_{\ell b}$ and $\Delta\phi^{t\bar{t}}_{\ell W}$ is suppressed compared to $\Delta\phi^{t\bar{t}}_{\ell\ell}$ by the square of the spin analyzing power, $\beta^2_{b/W} \sim 0.4^2$, as expected. The information in the spin correlation observables between a pair of $b$-jet(s) and/or $W$ boson(s) is further suppressed by an additional factor of $\beta^2_{b/W}$. Let us now also consider the other top decay channels. As the Fisher information is proportional to the rate [48], we expect it to increase relative to the fully leptonic channel by a factor $2 \times \mathrm{BR}_{W\to\mathrm{had}}/\mathrm{BR}_{W\to\mathrm{lep}} \sim 6$ for the semi-leptonic channel and $(\mathrm{BR}_{W\to\mathrm{had}}/\mathrm{BR}_{W\to\mathrm{lep}})^2 \sim 9$ for the hadronic channel. Looking at the last three observables involving $b$-jets and $W$-bosons, this is indeed the case.
For the other observables, we notice an additional loss of about a factor 2 in spin analyzing power, and hence a factor 4 in the Fisher information, which is caused by probing $j_{\rm soft}$ instead of the $d$-quark.
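The scaling factors quoted above follow from simple arithmetic. The sketch below reproduces them, assuming approximate branching ratios of 0.67 for hadronic $W$ decays and 0.22 for decays to electrons and muons combined; these two input values are assumptions inserted for illustration and are not taken from the text.

```python
# Approximate W branching ratios (assumed for illustration).
BR_W_HAD = 0.67
BR_W_LEP = 0.22  # electrons and muons combined


def rate_factor_semileptonic():
    """Rate of the semi-leptonic channel relative to the di-leptonic one."""
    return 2.0 * BR_W_HAD / BR_W_LEP


def rate_factor_hadronic():
    """Rate of the hadronic channel relative to the di-leptonic one."""
    return (BR_W_HAD / BR_W_LEP) ** 2


def information_suppression(beta_i, beta_k):
    """Fisher information of a spin correlation between decay products i and k,
    relative to a pair of ideal analyzers with beta = 1."""
    return (beta_i * beta_k) ** 2
```

With these inputs the rate factors are close to 6 and 9, a lepton and $b$-jet pair retains about 16% of the lepton-lepton information, and halving one analyzing power (as for $j_{\rm soft}$) costs a factor of 4.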
We see that the different observables have distinct importance in the three channels. For di-leptonic top decays, most of the information is contained in the spin correlation between the leptons, while the information in other observables is significantly suppressed. In contrast, for the hadronic decay channel, all shown observables carry comparable information. In this case, the full information, which can be obtained by combining the different spin correlation observables, significantly exceeds the information of any individual observable. Overall, all three channels carry a similar amount of information on the CP-phase $\alpha$, which suggests performing a combined analysis.
Due to the limited $t\bar{t}h$ event rate at the LHC, we expect the non-linear new physics effects to dominate over the linear ones. We therefore expect most of the sensitivity to the CP-structure of the top Yukawa coupling to arise from these non-linear terms, despite the fact that the corresponding observables are not genuinely CP-sensitive. To quantify the sensitivity of these CP-even observables to the squared terms, we use a modified version of the Fisher information that was introduced in Ref. [54]. In this approach, we simply consider the square of the coupling as our new model parameter and define

$$I'_{\alpha\alpha} = \mathbb{E}\left[\frac{\partial \log p(x|\kappa_t, \alpha)}{\partial(\alpha^2)}\, \frac{\partial \log p(x|\kappa_t, \alpha)}{\partial(\alpha^2)}\right]. \qquad (6)$$

The result is shown in the right panel of Fig. 2. Here, we show the information associated with a two-dimensional distribution of two observables, relative to the full information associated with a multivariate analysis using all observables. As none of the presented observables relies on the top quark final state kinematics, the results are identical for all three top quark decay channels. The distribution of the invariant mass of the photon pair, $m_{\gamma\gamma}$, is only sensitive to the theory parameters through its normalization. Correlating it with itself, we obtain the information associated with the signal strength measurement, which accounts for 31% of the information on the CP-phase. In the absence of background, the correlation of $m_{\gamma\gamma}$ with any other observable is equivalent to the information in a single differential distribution of that observable. This is shown in the bottom row. As expected, it is also identical to the information for the correlation of an observable with itself, which is shown on the diagonal. We can identify $\Delta\eta_{t\bar{t}}$ and $\theta^*$ as the two most sensitive observables, which individually carry about 60% of the full information.
Combining two different observables further increases the information. The two most promising combinations are $\Delta\eta_{t\bar{t}}$ vs. $p_{T,h}$ as well as $\theta^*$ vs. $m_{t\bar{t}h}$, which carry about 73% of the full information. Successively adding more observables further increases the information. This shows that a multivariate analysis is vital to maximize the sensitivity to the CP-phase $\alpha$.

III. KINEMATIC RECONSTRUCTION
Most of the new physics probes discussed so far, viz. $m_{t\bar{t}}$, $\theta^*$, $b_4$, and $\Delta\phi^{t\bar{t}}_{ik}$, require a full reconstruction of the top and anti-top momenta. This is a challenging task at the LHC due to combinatorial ambiguities and the presence of up to two neutrinos in the $t\bar{t}(h \to \gamma\gamma)$ final state. In this section, we discuss the strategies adopted for the kinematic reconstruction of the semi-leptonic and hadronic channels, and the more complex di-leptonic mode.
Semi-leptonic channel: In the semi-leptonic channel, the full reconstruction of the $t\bar{t}$ system requires the determination of the longitudinal momentum of the neutrino, $p_{z,\nu}$. We compute it by constraining the invariant mass of the lepton and the neutrino to the $W$-boson mass. Typically, either two solutions or zero solutions are obtained. Around 35% of events give zero solutions, and discarding all such events would lead to a significant reduction in event statistics. Therefore, in such events, we vary the transverse momentum of the missing system (by at most $\pm 10\%$) while keeping the azimuthal angle of the neutrino unchanged until physical solutions for $p_{z,\nu}$ are obtained. Events which give zero solutions even after this variation are discarded. We perform the reconstruction of the top quarks by iterating over all possible partitions of light jets ($j$) and $b$-jets forming the hadronic top ($jjb$), and of the lepton, neutrino, and $b$-jet forming the leptonic top ($\ell\nu b$). The two possible neutrino solutions are separately accounted for, forming different partitions. We select the combination that minimizes

$$(m_{jjb} - m_t)^2 + (m_{\ell\nu b} - m_t)^2, \qquad (7)$$

where $m_t$ is the on-shell mass of the top quark.
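The $W$-mass constraint leads to a quadratic equation for $p_{z,\nu}$. The sketch below solves it for a massless lepton and neutrino and, when no real solution exists, rescales the missing transverse momentum at fixed azimuth by up to $\pm 10\%$. The value of the $W$ mass, the step size, and the order in which rescalings are tried (closest to unity first) are assumptions, since the text does not specify how the variation is scanned.

```python
import math

M_W = 80.4  # GeV, assumed value of the W-boson mass


def neutrino_pz_solutions(lepton, met_x, met_y, m_w=M_W):
    """Real solutions for the neutrino p_z from m(lepton, nu) = m_w.

    lepton: (E, px, py, pz) of a massless charged lepton, in GeV.
    Returns a list with two solutions, or an empty list if none is real.
    """
    e_l, px_l, py_l, pz_l = lepton
    mu = 0.5 * m_w ** 2 + px_l * met_x + py_l * met_y
    a = e_l ** 2 - pz_l ** 2  # equals the squared lepton pT for m_lepton = 0
    discriminant = mu ** 2 - a * (met_x ** 2 + met_y ** 2)
    if discriminant < 0.0:
        return []
    root = e_l * math.sqrt(discriminant)
    return [(mu * pz_l - root) / a, (mu * pz_l + root) / a]


def neutrino_pz_with_rescaling(lepton, met_x, met_y, max_shift=0.10,
                               n_steps=20, m_w=M_W):
    """Try the nominal MET first, then rescale |MET| within +-max_shift.

    Returns (scale, solutions); (None, []) if no physical solution is found.
    """
    for k in range(n_steps + 1):
        shift = max_shift * k / n_steps
        scales = (1.0,) if k == 0 else (1.0 - shift, 1.0 + shift)
        for scale in scales:
            solutions = neutrino_pz_solutions(
                lepton, scale * met_x, scale * met_y, m_w
            )
            if solutions:
                return scale, solutions
    return None, []
```

Each returned solution defines a separate neutrino hypothesis, which would then enter the mass-based selection of the best partition.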
Hadronic channel: We follow a similar mass minimization strategy in the hadronic channel. We reconstruct the two top quarks, $t_1$ and $t_2$, by iterating over all possible combinations of light jets and $b$-jets. The combination which minimizes

$$(m_{t_1} - m_t)^2 + (m_{t_2} - m_t)^2 \qquad (8)$$

is chosen.
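The mass minimization over jet partitions can be sketched as a brute-force loop. The example below assumes that each top candidate is built from one $b$-jet and two light jets, that exactly two $b$-jets are supplied, and that $m_t = 172.5$ GeV; the function names and the return format are illustrative.

```python
import itertools
import math

M_TOP = 172.5  # GeV, assumed on-shell top quark mass


def invariant_mass(*momenta):
    """Invariant mass of a set of four-momenta given as (E, px, py, pz)."""
    e, px, py, pz = (sum(p[i] for p in momenta) for i in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def best_hadronic_partition(b_jets, light_jets, m_top=M_TOP):
    """Find the assignment minimizing (m_t1 - m_top)^2 + (m_t2 - m_top)^2.

    b_jets: two four-momenta; light_jets: at least four four-momenta.
    Returns (score, (b_index_1, light_pair_1), (b_index_2, light_pair_2)).
    """
    best = None
    indices = range(len(light_jets))
    for pair_1 in itertools.combinations(indices, 2):
        remaining = [i for i in indices if i not in pair_1]
        for pair_2 in itertools.combinations(remaining, 2):
            if pair_1[0] > pair_2[0]:
                continue  # skip the mirror image of an already tested split
            for b_1, b_2 in ((0, 1), (1, 0)):
                m_1 = invariant_mass(b_jets[b_1], *(light_jets[i] for i in pair_1))
                m_2 = invariant_mass(b_jets[b_2], *(light_jets[i] for i in pair_2))
                score = (m_1 - m_top) ** 2 + (m_2 - m_top) ** 2
                if best is None or score < best[0]:
                    best = (score, (b_1, pair_1), (b_2, pair_2))
    return best
```

The same loop structure carries over to the semi-leptonic case by replacing one light-jet pair with the lepton and each neutrino solution.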
Di-leptonic channel: In the more complex di-leptonic $t\bar{t}h$ channel, the invisible system consists of two neutrinos. Therefore, in addition to determining the unknown longitudinal momentum of the missing particles, it is also indispensable to partition the four-momentum of the missing system into the two neutrinos in order to fully reconstruct the top and the anti-top. An additional combinatorial ambiguity arises from the pairing of $b$-jets and leptons. The study in Ref. [24] reconstructed the $t\bar{t}(h \to b\bar{b})$ system in the di-leptonic mode using the $M_2$-assisted reconstruction algorithm and a boosted $h \to b\bar{b}$, with jet substructure techniques, to suppress the additional combinatorics between the Higgs boson and top quark decays. In contrast, the present analysis reconstructs the $t\bar{t}(h \to \gamma\gamma)$ system following the Recursive Jigsaw Reconstruction (RJR) algorithm presented in Ref. [55]. The RJR approach utilizes a series of jigsaw rules optimized to estimate the unknown kinematic degrees of freedom in an event topology and to resolve the combinatorial ambiguities between and within the final state visible and invisible objects. It results in a complete kinematic basis which can be used to define the four-momenta of all the final state and intermediate objects in an event decay tree. The first step involves the resolution of the combinatorial ambiguity between the $b$-jets and the leptons by using the "Combinatorial Minimization" Jigsaw Rule (JR) [55], identifying the ($b$-jet, $\ell$) pairs by minimizing

$$m_{V_a}^2 + m_{V_b}^2. \qquad (9)$$

After establishing the two visible hemispheres corresponding to the top and the anti-top, we apply the "Invisible Mass" JR to estimate the invariant mass of the invisible system, $m_I$ [55], defined as

$$m_I^2 = m_V^2 - 4\, m_{V_a} m_{V_b}, \qquad (10)$$

where $m_V$ is the invariant mass of the visible system formed by the two $b$-tagged jets and the two leptons in the final state, while $m_{V_a}$ and $m_{V_b}$ correspond to the invariant masses of the two visible hemispheres associated with the top and the anti-top that were reconstructed in the previous step.
$m_I$ is chosen such that it is the smallest Lorentz invariant mass that ensures non-tachyonic four-momenta for the individual neutrinos upon partitioning the invisible system. Next, we determine the longitudinal momentum of the invisible system, $\slashed{p}_z$, using the following relation given by the "Invisible Rapidity" JR [55]:

$$\slashed{p}_z = p^V_z\, \frac{\sqrt{|\slashed{\vec{p}}_T|^2 + m_I^2}}{\sqrt{|\vec{p}^{\,V}_T|^2 + m_V^2}}. \qquad (11)$$

Here, $p^V_z$ and $p^V_T$ represent the longitudinal and transverse momenta, respectively, of the visible system constituted by the two $b$-jets and the two leptons, while $\slashed{p}_T$ is the missing transverse momentum.
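The two jigsaw rules for the invisible system reduce to closed-form expressions. The sketch below assumes the forms commonly quoted for the RJR algorithm, namely $m_I^2 = m_V^2 - 4 m_{V_a} m_{V_b}$ for the invisible mass and equal rapidities of the visible and invisible systems for the longitudinal momentum. It is an illustration of these formulas only and does not use or reproduce the RestFrames software; the function names are hypothetical.

```python
import math


def invisible_mass(m_v, m_va, m_vb):
    """Invisible-system mass from the total visible mass and hemisphere masses."""
    m2 = m_v ** 2 - 4.0 * m_va * m_vb
    return math.sqrt(max(m2, 0.0))


def invisible_pz(pz_visible, pt_visible, m_v, met, m_i):
    """Longitudinal momentum of the invisible system, chosen such that the
    invisible and visible systems have the same rapidity."""
    return pz_visible * math.sqrt(met ** 2 + m_i ** 2) / math.sqrt(
        pt_visible ** 2 + m_v ** 2
    )


def rapidity(pz, pt, mass):
    """Rapidity expressed through p_z and the transverse mass."""
    return math.asinh(pz / math.sqrt(pt ** 2 + mass ** 2))
```

Under these assumptions, `rapidity(invisible_pz(...), met, m_i)` reproduces the rapidity of the visible system by construction, which is a convenient closure check.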
At this point, we have all the ingredients required to reconstruct the $t\bar{t}$ system. However, in order to reconstruct the top and the anti-top individually, the invisible four-momentum has to be correctly partitioned into the two neutrinos. This is achieved by using the "Contraboost Invariant" JR specified in Ref. [55], which estimates the four-momenta of the neutrinos produced from the top and anti-top decays in the $t\bar{t}$ CM frame under the assumption that both $t$ and $\bar{t}$ have the same invariant mass. The resolved four-momenta of the neutrinos, along with the correctly paired $b$-jets and leptons, allow defining the $t$ and $\bar{t}$ systems independently. The reconstruction efficiency of this method is about 80%, which is comparable with that of the $M_2$-assisted reconstruction algorithm [24].
With the fully resolved $t\bar{t}h$ system, we can reconstruct a multitude of CP-even and CP-odd spin correlation observables defined in the $t\bar{t}$ CM frame and the lab frame. Several observables that do not depend on the spin polarization of the $t\bar{t}$ pair are also considered. Our goal here is to maximally explore the $t\bar{t}h$ multi-particle final state, augmenting the CP-sensitivity of the $pp \to t\bar{t}h$ channel at the HL-LHC.

A. Simulation and Event Selection
In this section, we explore the direct Higgs-top CP measurement combining machine learning techniques and efficient kinematic reconstruction methods. We consider the $t\bar{t}h$ signal with $h \to \gamma\gamma$ in the di-leptonic, semi-leptonic, and hadronic top decay modes at the HL-LHC. The dominant background to this process is continuum $t\bar{t}\gamma\gamma$ production. We simulate both the signal and background event samples with MadGraph5_aMC@NLO [56] within the MadMiner framework [50] at leading order (LO) with a center-of-mass energy of $\sqrt{s} = 14$ TeV. Higher order effects on the signal rate are included via a flat next-to-leading order k-factor [57,58]. We use the NNPDF2.3QED parton distribution functions [59]. No generation-level cuts have been applied to the signal events, while the backgrounds have been generated in the mass window $105~\mathrm{GeV} < m_{\gamma\gamma} < 145~\mathrm{GeV}$. Parton shower and hadronization effects have been included with Pythia 8 [60] and fast detector simulation with the Delphes3 package [61], using the default HL-LHC detector card [33,62].
To obtain the cross section and likelihood function as a function of the theory parameters, we use the morphing technique that is already implemented in MadMiner. Here, we take into account the dependence on the new physics theory parameters in both $t\bar{t}h$ production and $h \to \gamma\gamma$ decay, and therefore choose a quartic ansatz in the morphing setup, which is used to interpolate the event weights as a function of $\kappa_H = \kappa_t \cos\alpha$ and $\kappa_A = \kappa_t \sin\alpha$.
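The change of variables between the polar parameters $(\kappa_t, \alpha)$ and the morphing parameters $(\kappa_H, \kappa_A)$ is a plain polar-to-Cartesian map. A minimal sketch follows; the function names are illustrative, and the inverse map adopts the convention $\kappa_t \geq 0$ with $\alpha \in (-\pi, \pi]$, which is an assumption made here for definiteness.

```python
import math


def to_morphing_parameters(kappa_t, alpha):
    """Map (kappa_t, alpha) to (kappa_H, kappa_A)."""
    return kappa_t * math.cos(alpha), kappa_t * math.sin(alpha)


def to_polar_parameters(kappa_h, kappa_a):
    """Inverse map, returning kappa_t >= 0 and alpha in (-pi, pi]."""
    return math.hypot(kappa_h, kappa_a), math.atan2(kappa_a, kappa_h)
```

The SM point $(\kappa_t, \alpha) = (1, 0)$ maps to $(\kappa_H, \kappa_A) = (1, 0)$, and a pure CP-odd coupling with $\alpha = \pi/2$ maps to $(0, \kappa_t)$ up to rounding.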
We start our analysis by selecting events consisting of two photons and at least two $b$-tagged jets. In addition, we require the final state to contain exactly two opposite-sign leptons for the di-leptonic channel, exactly one lepton and at least two light jets for the semi-leptonic channel, and at least four light jets for the hadronic channel. We demand the individual particles to pass the following identification cuts:
In addition, we require the di-photon invariant mass to satisfy $|m_{\gamma\gamma} - 125~\mathrm{GeV}| < 10~\mathrm{GeV}$.
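The channel assignment described above can be written as a short classification function. This sketch encodes only the object multiplicities and the di-photon mass window stated in the text; it assumes exactly two photons and a lepton veto in the hadronic channel, neither of which is spelled out explicitly, and it does not include the object identification cuts. Function and argument names are illustrative.

```python
def in_diphoton_window(m_gamma_gamma, m_higgs=125.0, half_width=10.0):
    """Di-photon invariant mass requirement, all values in GeV."""
    return abs(m_gamma_gamma - m_higgs) < half_width


def classify_channel(n_photons, n_b_jets, n_light_jets, lepton_charges):
    """Return 'di-leptonic', 'semi-leptonic', 'hadronic', or None.

    lepton_charges: sequence with the charge (+1 or -1) of each selected lepton.
    """
    if n_photons != 2 or n_b_jets < 2:
        return None
    n_leptons = len(lepton_charges)
    if n_leptons == 2 and lepton_charges[0] * lepton_charges[1] < 0:
        return "di-leptonic"
    if n_leptons == 1 and n_light_jets >= 2:
        return "semi-leptonic"
    if n_leptons == 0 and n_light_jets >= 4:  # lepton veto is an assumption
        return "hadronic"
    return None
```

For example, an event with two photons, two $b$-jets, three light jets, and one lepton falls into the semi-leptonic category, provided its di-photon mass lies within the window.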
We fully reconstruct the $t\bar{t}h$ system following the strategy described in Sec. III. In particular, this allows us to obtain both the lab frame and the $t\bar{t}$ CM frame observables. As an example of an observable that requires the top reconstruction, we present the distribution of the Collins-Soper angle $\theta^*$ in Fig. 3. When comparing these detector level distributions to the result at parton level, presented in Fig. 1, we observe the robustness of our analysis with respect to the reconstruction strategy and detector effects. The distributions are found to retain the CP sensitivity at the detector level, albeit with a reduction of about 20% for the di-leptonic channel, 40% for the semi-leptonic channel, and 50% for the hadronic channel, compared to parton level.

B. Analysis Methodology
As we have seen in Sec. II, there is no single observable that carries all the information on the CP-structure of the top quark Yukawa. Instead, there is a variety of sensitive observables. Hence, a multi-variate analysis is needed to extract the maximal information on the theory parameters from the data. In the following, we will summarize the adopted observables and the analysis methodology.
In this analysis, we consider the following list of 80 observables to describe the kinematics of signal and background events.
We include the complete set of observables used by the ATLAS collaboration in a recent Higgs-top CP study [34] and complement this set with additional CP-even observables that show strong sensitivity to the CP-phase ($\theta^*$, $b_4$, $m_{t\bar{t}}$, $m_{t\bar{t}h}$) together with the transverse momentum and pseudorapidity of all final state and reconstructed objects. We also incorporate a comprehensive list of spin correlations, as introduced in Eq. (4), which are constructed between all possible final state pairs. We include both observables constructed in the $t\bar{t}$ rest frame, $\Delta\phi^{t\bar{t}}_{ik}$, and in the lab frame, $\Delta\phi^{\rm lab}_{ik}$. Finally, we account for the correlation observables $\Delta\phi^{t\bar{t}}_{hi(k)}$ that arise from the tensor products involving the Higgs boson momentum, $(p_t, p_{\bar{t}}, p_h, p_{i(k)})$. The following pairs $\{i, k\}$ are considered for the different channels:
In the semi-leptonic case, $b_\ell/W_\ell$ and $b_{\rm had}/W_{\rm had}$ represent the $b$-jets/$W$-bosons produced from the leptonically and hadronically decaying top quarks, respectively. In events with more than two $b$-tagged jets, the hardest two are considered while reconstructing the top and the anti-top quarks. $j_{\rm hard}$ corresponds to the hardest light jet, from the hadronic top quark, in the top rest frame.
To interpret the results of our analysis and obtain projected sensitivities, we follow a likelihood-based approach. According to the Neyman-Pearson lemma, the most powerful test statistic to discriminate between two hypotheses, in our case a new physics model parameterized by $\theta = (\kappa_t, \alpha)$ and the SM with $\theta_{\rm SM} = (1, 0)$, is the likelihood ratio $r(x|\theta; \theta_{\rm SM})$. Here, $x$ denotes the set of reconstructed observables introduced above.
The likelihood ratio involving detector level observables is intractable, meaning that it cannot be computed directly, but it can be estimated using simulations. To this end, we use the machine learning based technique introduced in Refs. [51][52][53][63][64][65][66], which has been implemented in the MadMiner tool [50]. This approach uses both reconstructed observables and matrix-element information, which are then used to train neural network models that estimate the likelihood ratio. It therefore accounts for the effects of parton shower, hadronization, and detector response, while the matrix-element information helps to significantly improve the performance of the neural network training. Using the estimated likelihood ratio function $r(x|\theta; \theta_{\rm SM})$, which describes both the linear and non-linear new physics effects, we then perform a likelihood ratio test to obtain our projected sensitivities.
We simulate $10^6$ signal and $10^6$ background events before event selection. Using MadMiner, we train neural networks to estimate the likelihood ratio using the ALICES algorithm with its hyperparameter set to unity [63]. We use fully connected neural networks with three hidden layers, each containing 100 nodes and a tanh activation function. The neural network training is performed over 100 epochs using the Adam optimizer. To avoid overtraining, we evaluate the loss function on an independent validation set and employ an early stopping procedure. We use a batch size of 128 and an exponentially decaying learning rate (from $10^{-4}$ to $10^{-5}$). The limit setting is performed with MadMiner's Likelihood class.
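To make the training schedule concrete, the sketch below implements an exponentially decaying learning rate that interpolates from $10^{-4}$ at the first epoch to $10^{-5}$ at the last, together with a simple selection of the best epoch from validation losses. The exact interpolation formula and the early-stopping criterion are assumptions made for illustration; they are not claimed to match MadMiner's internals.

```python
def exponential_learning_rate(epoch, n_epochs=100, lr_initial=1e-4, lr_final=1e-5):
    """Learning rate at a given epoch (0-based), decaying geometrically
    from lr_initial at epoch 0 to lr_final at epoch n_epochs - 1."""
    if n_epochs < 2:
        return lr_initial
    fraction = epoch / (n_epochs - 1)
    return lr_initial * (lr_final / lr_initial) ** fraction


def best_epoch(validation_losses):
    """Index of the epoch with the lowest validation loss, i.e. the model
    snapshot an early-stopping procedure would keep."""
    if not validation_losses:
        raise ValueError("no validation losses recorded")
    return min(range(len(validation_losses)), key=validation_losses.__getitem__)
```

With these defaults the learning rate drops by a constant factor of $10^{-1/99}$ per epoch, i.e. by about 2.3% from one epoch to the next.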

C. Results
Let us now turn to the results of our study. In Fig. 4 we show the projected sensitivity to the top Yukawa coupling in terms of $\kappa_t$ and $\alpha$ using the $t\bar{t}(h \to \gamma\gamma)$ measurement. In the left panel, we present the 68% CL contours for the individual top decay channels as colored dashed lines. A combination of all channels is shown as the black solid line. The studied channels can be organized in ascending order of sensitivity as: di-leptonic, semi-leptonic, and hadronic modes. Since the leading observables display efficient reconstruction for all channels, as illustrated in Fig. 3, the order of sensitivity among the final state modes follows the corresponding event rates.
In the right panel, we show the 68% and 95% CL contours as dashed and solid lines, respectively. The p-values in the $(\kappa_t, \alpha)$ parameter space are presented through the color palette. We observe that $\kappa_t$ can be constrained to within $\mathcal{O}(8\%)$ of the SM value at 68% CL through a combination of direct searches in the $t\bar{t}(h \to \gamma\gamma)$ channel at the HL-LHC. Assuming $\kappa_t = 1$, the combined search would be able to probe the Higgs-top CP phase up to $|\alpha| \lesssim 13^{\circ}$ at 68% CL.
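For orientation, the relation between a confidence level and the corresponding threshold on the test statistic $q = -2\log r$ can be written in closed form when $q$ follows a $\chi^2$ distribution with two degrees of freedom. The sketch below assumes this asymptotic (Wilks) behavior with two parameters of interest, $(\kappa_t, \alpha)$; whether the contours in the figure were obtained exactly in this way is an assumption.

```python
import math


def p_value_two_parameters(q):
    """Asymptotic p-value for q = -2 log(likelihood ratio), assuming a
    chi-squared distribution with two degrees of freedom."""
    if q < 0.0:
        raise ValueError("the test statistic must be non-negative")
    return math.exp(-0.5 * q)


def threshold_two_parameters(confidence_level):
    """Value of q that bounds the region at the given confidence level."""
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence level must lie in (0, 1)")
    return -2.0 * math.log(1.0 - confidence_level)
```

Under these assumptions, the 68% and 95% CL contours correspond to $q \approx 2.28$ and $q \approx 5.99$, respectively.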
We also perform a separate analysis in which we train a neural network exclusively with the CP-even observables shown in the right panel of Fig. 2. We observe that the projected sensitivity of such an analysis, using this smaller set of CP-even observables that are most sensitive to the non-linear new physics effects, is comparable to the projected sensitivity of the combination study performed using the full set of observables. Overall, almost all the sensitivity to the Higgs-top CP-structure is provided by the non-linear terms in $\alpha$. The limited $t\bar{t}(h \to \gamma\gamma)$ event statistics render the observables that probe the linear terms sub-leading in sensitivity.

D. Systematic Effects
In this section, we explore the implications from systematic uncertainties on the projected sensitivity of κ t and α. In particular, we will consider two sources of uncertainty associated with the normalization of both signal and background.
In the statistical analysis, these uncertainties are parameterized through nuisance parameters ν s and ν b for the signal and background normalization, respectively. These nuisance parameters encode theoretical and experimental uncertainties on the normalization of distributions, neglecting possible shape uncertainties. As before, we train a neural network using the ALICES method in MadMiner to estimate the likelihood ratio r (x|θ, ν; θ SM , ν SM ). This is now a function of both the model parameters θ = (κ t , α) and the nuisance parameters ν = (ν s , ν b ) which have a nominal value ν SM = (0, 0). Before setting limits, a constraint term describing our prior knowledge on the nuisance parameter is added. Adopting a conservative approach, we assume a prior constraint of 20% and 50% in the tt(h → γγ) signal and the ttγγ background, respectively. Finally, we profile over the nuisance parameters following the procedure described in Ref. [51]. Before turning to the sensitivity contours, let us remind ourselves that the presented results are based on a multi-variate analysis. In particular, this includes the invariant mass of the di-photon pair. The considered range, 115 GeV < m γγ < 135 GeV, was chosen sufficiently wide to contain both a signal dominated region at the Higgs resonance and a background dominated region around it. MadMiner uses this background dominated region to constrain the nuisance parameter associated with the background normalization ν b , and therefore effectively performs a data-driven side-band analysis. As we will see in a moment, the effective uncertainty of the background normalization is therefore significantly smaller than the 50% which we assumed as a prior.
In the following, we analyze three scenarios to study the impact of systematic uncertainties on the projected sensitivity in the (α, κ_t) plane. In the first scenario, we study the impact of the uncertainty on the background normalization alone. To do so, we fix ν_s = 0 in the estimated likelihood ratio and profile over ν_b. Similarly, in the second scenario, we fix ν_b = 0 and profile over ν_s to study the impact of the signal uncertainty. Finally, in the third scenario, we obtain the limits after profiling the likelihood ratio over both ν_s and ν_b. In Fig. 5 we present the projected sensitivity on α and κ_t for all scenarios. The blue, green, and red contours correspond to the first, second, and third scenarios, respectively. The black contour represents the sensitivity assuming no systematic uncertainty and corresponds to the black-solid contours in Fig. 4.
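The difference between fixing and profiling a nuisance parameter in these three scenarios can be sketched in the Gaussian approximation, where profiling a parameter corresponds to taking a Schur complement of the Fisher information matrix and fixing it corresponds to dropping its row and column. The toy below is an assumption-laden illustration, not the neural-network likelihood used in the analysis: it uses a generic signal-strength parameter μ, one signal-region bin and one side-band bin, and hypothetical yields (s = 100, b = 50, b_sb = 200). All function names are invented for this sketch.

```python
import math

def toy_information(s, b, b_sb, prior_s, prior_b):
    """Expected Fisher information for (mu, nu_s, nu_b) at mu = 1, nu = 0.

    Signal region yield: s * mu * (1 + nu_s) + b * (1 + nu_b)
    Sideband yield:      b_sb * (1 + nu_b)
    """
    grad_sr = [s, s, b]          # derivatives of the signal-region yield
    grad_sb = [0.0, 0.0, b_sb]   # derivatives of the sideband yield
    info = [[grad_sr[i] * grad_sr[j] / (s + b)
             + grad_sb[i] * grad_sb[j] / b_sb
             for j in range(3)] for i in range(3)]
    info[1][1] += 1.0 / prior_s**2   # Gaussian constraint on nu_s
    info[2][2] += 1.0 / prior_b**2   # Gaussian constraint on nu_b
    return info

def drop(info, j):
    """Fix parameter j at its nominal value: remove its row and column."""
    keep = [k for k in range(len(info)) if k != j]
    return [[info[a][c] for c in keep] for a in keep]

def profile_out(info, j):
    """Profile over parameter j: Schur complement with respect to entry j."""
    keep = [k for k in range(len(info)) if k != j]
    return [[info[a][c] - info[a][j] * info[j][c] / info[j][j]
             for c in keep] for a in keep]

def sigma_mu(info, actions):
    """Uncertainty on mu; actions maps nuisance index -> 'fix' or 'profile'."""
    m = [row[:] for row in info]
    # Remove the highest index first so the remaining indices stay valid
    for j in sorted(actions, reverse=True):
        m = profile_out(m, j) if actions[j] == "profile" else drop(m, j)
    return 1.0 / math.sqrt(m[0][0])

info = toy_information(s=100.0, b=50.0, b_sb=200.0, prior_s=0.2, prior_b=0.5)
scenarios = {
    "no systematics":       {1: "fix", 2: "fix"},
    "profile nu_b only":    {1: "fix", 2: "profile"},
    "profile nu_s only":    {1: "profile", 2: "fix"},
    "profile nu_s and nu_b": {1: "profile", 2: "profile"},
}
for name, actions in scenarios.items():
    print(f"{name}: sigma_mu = {sigma_mu(info, actions):.3f}")
```

With these hypothetical inputs the uncertainty grows only mildly when ν_b is profiled, because the side-band constrains it, and much more when ν_s is profiled, because the signal normalization is degenerate with a rate-like parameter.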
First, we observe that the sensitivity in α remains unaffected by systematic uncertainties [32]. The reason is that at κ_t = 1 the sensitivity in α is dominantly controlled by the shape information from kinematic distributions and is largely independent of the event rate, due to the combination of two competing effects. On the one hand, the signal cross section σ_tt(h→γγ) decreases with α: for example, at κ_t = 1 the cross section σ_tt(h→γγ) falls by O(25%) from α = 0 to α ∼ π/3 and then remains roughly unchanged up to α = π/2. On the other hand, the signal efficiency improves with α. These two effects roughly offset any overall dependence on the event rate, thereby leading to unchanged projection contours in the direction of α even after profiling over the nuisance parameters.
The situation is qualitatively different in the κ_t direction. When α = 0, the measurement is based purely on rate information, implying that the Higgs coupling strength κ_t and the signal normalization, as parameterized by ν_s, are essentially degenerate. Therefore, our prior uncertainty on the signal normalization directly propagates into a systematic uncertainty on κ_t. The new physics effects in the Higgs-top coupling manifest as ∼ κ_t^2 at the tth production level and as ∼ (1.28 − 0.28κ_t)^2 in the h → γγ decay [7]. After combining these two factors, an uncertainty of 20% in the pp → tt(h → γγ) cross section translates to roughly 12% uncertainty in κ_t. We observe this effect in Fig. 5: for α = 0 the projected sensitivity degrades from |κ_t| ≲ 8% in the absence of systematic uncertainties to |κ_t| ≲ 13% after profiling over ν_s. We also observe that, despite a prior uncertainty of 50% on the background normalization compared to 20% on the signal, its impact on the projection contours in the (κ_t, α) plane is milder. As discussed above, this is a consequence of the sideband measurement and illustrates the robustness of our multi-variate analysis.
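The propagation of the signal-rate uncertainty into κ_t at α = 0 can be illustrated with a linearized estimate. The sketch below keeps only the two scaling factors quoted above, κ_t^2 for production and (1.28 − 0.28κ_t)^2 for the decay, and ignores any other κ_t dependence or fit effects; the function names are hypothetical.

```python
def rate_modifier(kappa_t):
    """Signal-rate scaling at alpha = 0 from the two factors quoted in the text."""
    production = kappa_t**2
    decay = (1.28 - 0.28 * kappa_t)**2
    return production * decay

def log_slope(kappa_t):
    """Analytic d ln(rate) / d ln(kappa_t) for the scaling above."""
    return 2.0 - 2.0 * 0.28 * kappa_t / (1.28 - 0.28 * kappa_t)

def kappa_uncertainty(rel_rate_unc, kappa_t=1.0):
    """Linearized uncertainty on kappa_t from a relative rate uncertainty."""
    return rel_rate_unc * kappa_t / log_slope(kappa_t)

print(f"slope at kappa_t = 1: {log_slope(1.0):.2f}")
print(f"delta kappa_t for a 20% rate uncertainty: {kappa_uncertainty(0.20):.3f}")
```

At κ_t = 1 the logarithmic slope is 2 − 0.56 = 1.44, so in this linearized estimate a 20% rate uncertainty corresponds to about 14% on κ_t, of the same size as the roughly 12% quoted above.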

V. SUMMARY
In this study, we derived the prospects for directly measuring the Higgs-top CP structure in the tt(h → γγ) channel at the HL-LHC. We show that a combination of machine learning techniques and efficient kinematic reconstruction methods can boost new physics sensitivity, effectively exploring the complex tth multi-particle phase space.
Among the several probes included in our machine learning analysis, this study encompasses a comprehensive set of spin correlation observables. Beyond the SM CP-phases steer the spin-polarization of the top pair, and the spin correlations are carried forward by their decay products. We harness the potential of the spin correlation observables via the full reconstruction of the top and anti-top, evaluating these particular observables in the tt CM frame, where the correlations are maximal. In the hadronic and semi-leptonic tth channels, we used mass minimization to fully reconstruct the tth system. In the more complex di-leptonic channel, we employed the Recursive Jigsaw Reconstruction technique to resolve the combinatorial ambiguities and determine the unknown degrees of freedom. In all channels, the effects of parton showering, hadronization, and detector resolution were included.
Exploring the intricate tth multi-particle phase space with CP-odd and CP-even observables defined in the laboratory frame and the tt CM frame, we obtain strong projections for the Higgs-top CP-phase. Through a combined semi-leptonic, hadronic, and di-leptonic tt(h → γγ) search, the HL-LHC can directly probe the Higgs-top coupling modifier and CP-phase, respectively, up to |κ_t| ≲ 8% and |α| ≲ 13° at 68% CL.
Further improvements can be expected from including other relevant channels, such as tt(h → bb) [6,21,24,32]. While this channel accounts for the bulk of Higgs decays, BR(h → bb) ∼ 58%, it results in sub-leading limits in comparison to tt(h → γγ), as it suffers from a substantial QCD background that is associated with sizable uncertainties [67,68]. Rapidly progressing precision calculations [69][70][71] and a possible combination of side-band analyses with tth/ttZ ratios [32,72] may change this scenario by controlling the respective uncertainties, pushing the sensitivity of this extra channel further in the near future.

ACKNOWLEDGMENTS
We thank Johann Brehmer and Sam Homiller for helpful discussions. RKB and DG thank the U.S. Department of Energy for the financial support, under grant number DE-SC0016013. The work of FK is supported by the U.S. Department of Energy under Grant No. DE-AC02-76SF00515 and by the Deutsche Forschungsgemeinschaft under Germany's Excellence Strategy - EXC 2121 Quantum Universe - 390833306. Part of this work was performed at the Aspen Center for Physics, which is supported by National Science Foundation grant PHY-1607611. Some computing for this project was performed at the High Performance Computing Center at Oklahoma State University, supported in part through the National Science Foundation grant OAC-1531128.