Study of the CP property of the Higgs boson to electroweak boson coupling in the VBF 𝑯 → 𝜸𝜸 channel with the ATLAS detector

A test of CP invariance in Higgs boson production via vector-boson fusion has been performed in the H → γγ channel using 139 fb⁻¹ of proton–proton collision data at √s = 13 TeV collected by the ATLAS detector at the LHC. The Optimal Observable method is used to probe the CP structure of interactions between the Higgs boson and electroweak gauge bosons, as described by an effective field theory. No sign of CP violation is observed in data. Constraints are set on the parameters describing the strength of the CP-odd component in the coupling between the Higgs boson and the electroweak gauge bosons in two effective field theory bases: d̃ in the HISZ basis and c_HW̃ in the Warsaw basis. The results presented are the most stringent constraints to date on CP violation in the coupling between the Higgs boson and the weak bosons. The constraints on d̃ are further tightened through combination with results from the H → ττ channel.

The violation of charge-conjugation and parity (CP) symmetry is one of the three Sakharov conditions [1] needed to explain the observed baryon asymmetry of the universe. The only established source of CP violation is the complex phase in the quark mixing matrix [2], from which the derived magnitude of CP violation in the early universe is insufficient to explain the observed value of the baryon asymmetry [3–5]. The discovery of the Higgs boson by the ATLAS and CMS experiments [6,7] at the Large Hadron Collider (LHC) [8] opened a new direction in the search for sources of CP violation: the interactions of the Higgs boson. The Standard Model (SM) Higgs boson (H) is even under simultaneous charge-conjugation and parity inversion. However, CP-violating interactions are still allowed experimentally. Any deviation from a pure CP-even interaction of the Higgs boson with other SM particles would be a new source of CP violation and also a direct indication of physics beyond the SM (BSM). The CP structure of Higgs boson couplings to electroweak gauge bosons and fermions has been studied extensively by the ATLAS and CMS experiments [9–18]. The results are consistent with the SM prediction, and no sign of CP violation has been found so far.
A CP-odd component in the Higgs boson coupling to electroweak bosons (HVV, V = W/Z) can be described by adding dimension-6 operators to the SM Lagrangian, using an effective field theory (EFT) approach. The squared matrix element can be written as

|M|² = |M_SM|² + c̃ · 2 Re(M_SM* M_CP-odd) + c̃² |M_CP-odd|².   (1)

The first term describes the SM contribution. The second term (interference term) is CP-odd, representing a new source of CP violation in Higgs boson couplings, and is parameterized linearly by the Wilson coefficient c̃. The third term (quadratic term) describes a CP-even BSM contribution parameterized by c̃². The interference term only affects CP-odd observables and does not contribute to CP-even observables, e.g. the inclusive cross-section [19].
Several methods have been developed to construct CP-odd observables that can isolate CP-violating contributions, e.g. in Refs. [12,17]. This study adopts the Optimal Observable [20–24], defined as

OO = 2 Re(M_SM* M_CP-odd) / |M_SM|²,

to test the CP structure of the Higgs boson coupling to electroweak bosons in vector-boson-fusion (VBF) production; it combines event-based information from a multidimensional phase space into a single CP-sensitive observable.
The Optimal Observable is evaluated with the momentum fractions x₁ (x₂) of the initial-state parton from the proton moving in the positive (negative) z-direction (along the beam), and with the four-momenta of the Higgs boson and the two VBF jets. At the reconstruction level, the momentum fractions are derived as x^reco_{1,2} = (m_Hjj/√s) e^{±y_Hjj} by exploiting energy and momentum conservation; the Higgs boson is built from the two selected photons and combined with the selected VBF jets. Here, m_Hjj (y_Hjj) is the invariant mass (rapidity) of the system formed by the Higgs boson and the VBF jets, and √s is the center-of-mass energy of the proton-proton collision. A detailed description of the Optimal Observable calculation can be found in Ref. [9].
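The momentum-fraction reconstruction above can be sketched numerically. The following is a minimal illustration, not the ATLAS code: the function name and the toy event are invented, and the system's transverse momentum is taken to be zero, in which case x^reco_{1,2} = (E ± p_z)/√s exactly.

```python
import math

def x_reco(E, px, py, pz, sqrt_s=13000.0):
    """Reconstruct x1, x2 = (m_Hjj / sqrt(s)) * exp(+-y_Hjj) from the
    four-momentum (E, px, py, pz) of the Higgs + VBF-dijet system (GeV)."""
    m = math.sqrt(E * E - px * px - py * py - pz * pz)  # invariant mass m_Hjj
    y = 0.5 * math.log((E + pz) / (E - pz))             # rapidity y_Hjj
    x1 = (m / sqrt_s) * math.exp(+y)
    x2 = (m / sqrt_s) * math.exp(-y)
    return x1, x2

# Toy check: a system produced by partons with x1 = 0.10 and x2 = 0.05 and no
# transverse momentum has E = (x1 + x2) * sqrt_s / 2 and pz = (x1 - x2) * sqrt_s / 2.
x1, x2 = x_reco(E=975.0, px=0.0, py=0.0, pz=325.0)
```

For a system with nonzero transverse momentum the formula remains the one used in the analysis; the zero-p_T toy simply makes the inversion exact and easy to verify.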
In the SM, the OO distribution is expected to be symmetric with a mean value of zero, and any asymmetry would indicate a contribution from the CP-violating term, in the absence of rescattering by new light particles in loops [25]. For a given event, the matrix elements in the OO definition are calculated using the four-momenta of the Higgs boson and the two forward VBF jets, and have no dependence on the decay mode of the Higgs boson. This method was first introduced in the H → ττ analysis [9] by ATLAS and can be used in all Higgs boson decay channels. This Letter reports an analysis testing the CP invariance of the HVV coupling with the Optimal Observable method in the VBF H → γγ channel, using the 139 fb⁻¹ of proton-proton (pp) collision data at √s = 13 TeV recorded during 2015-2018 with the ATLAS detector. The VBF signal yield in OO bins is extracted from a simultaneous fit to the diphoton invariant mass spectra split into the OO bins, and is then used to determine the CP-violating contributions to the HVV coupling.
Results are interpreted in two EFT bases: the HISZ [26] and Warsaw [27–29] bases. The HISZ basis is used in order to combine the results with the previous measurement in the H → ττ channel [9], whereas the Warsaw basis is used to provide measurements for future combinations with other Higgs boson measurements. In both bases, three Wilson coefficients multiplying CP-odd operators describe possible CP-odd couplings between the Higgs boson and electroweak gauge bosons. In the HISZ basis, d̃ is constrained by assuming d̃ = d̃_B and setting the third coefficient to zero, as in Ref. [9]. In the Warsaw basis, c_HW̃ is constrained by setting c_HB̃ and c_HW̃B to zero. In both bases, all CP-even operator coefficients are set to zero. Constraints on all three coefficients in the Warsaw basis were obtained previously in the H → ZZ* channel [13,16] and in the H → γγ channel using differential cross-sections [12]. Those measurements have significant correlations, since these channels cannot distinguish between the three operators. The VBF topology in this analysis is mainly sensitive to c_HW̃ and could help to reduce this correlation.
The ATLAS detector [30–32] is a multipurpose particle detector with a forward-backward symmetric cylindrical geometry and near-4π coverage in solid angle [33]. The trigger system consists of a hardware-based first-level trigger and a software-based high-level trigger [34]. Events used in this analysis were accepted by a diphoton trigger requiring the leading and subleading photons to have transverse energies (E_T) greater than 35 GeV and 25 GeV, respectively, during the whole data-taking period. This trigger had a Loose photon identification requirement in 2015-2016 [35], but due to the increasing instantaneous luminosity the identification requirement was tightened for data-taking in 2017-2018 [35]. In addition, a single-photon trigger with Loose identification criteria and an E_T threshold of 120 (140) GeV in 2015-2016 (2017-2018) was used to recover events with collimated diphoton pairs with very high transverse momentum (p_T) [35]. The average trigger efficiency is over 98% for events passing the full diphoton event selection of this analysis [35]. An extensive software suite [36] is used in the reconstruction and analysis of real and simulated data, in detector operations, and in the trigger and data acquisition systems of the experiment. Higgs boson production via VBF was simulated with Powheg Box v2 [37] using the PDF4LHC15nlo [38] parton distribution function (PDF) set. The generation is accurate to next-to-leading order (NLO) in QCD, and the total cross-section is normalized to a calculation including QCD corrections at full NLO and approximate next-to-next-to-leading-order (NNLO) accuracy, as well as electroweak (EW) corrections at full NLO accuracy [39–41]. Higgs boson production via gluon-gluon fusion (ggF) was modeled at NNLO accuracy in QCD using Powheg Box v2 [42,43] and the NNLO family of PDF4LHC15 PDFs. The simulation achieves NNLO accuracy for arbitrary inclusive observables by reweighting the Higgs boson rapidity spectrum in Hj-MiNLO [44–46] to that in HNNLO [47], and the total cross-section is normalized to a prediction calculated at next-to-next-to-next-to-leading-order (N³LO) accuracy in QCD, including NLO EW corrections [48–58]. Other Higgs boson production processes, e.g. in association with a vector boson (VH) or top quark(s) (ttH, tH), were also modeled using Powheg Box v2. Prompt diphoton production (γγ) was simulated with the Sherpa 2.2.4 [59] generator. More details can be found in Ref. [12].
To simulate the effects of nonzero values of d̃ and c_HW̃ in the HVV vertex, a reweighting method is implemented for the HISZ and Warsaw bases, respectively, and applied to the aforementioned SM VBF signal sample. For the d̃ coefficient in the HISZ basis, as detailed in Ref. [9], two weights are calculated by the HAWK program [39,40,60] for each event using generator-level information with a specific amount of CP mixing (given in terms of d̃), to model the contributions from the interference term and the quadratic term, respectively, as shown in Eq. (1). For the interpretation in the Warsaw basis, a reweighting of the reconstructed OO distribution at different values of c_HW̃ is obtained as the ratio of the 'MG5' and 'NLO' predictions, where 'MG5' labels the prediction from MadGraph [61,62] using SMEFTsim [27,28], and 'NLO' labels the aforementioned SM VBF signal sample. MadGraph events for nonzero values of c_HW̃ were generated setting the scale of new physics to Λ = 1 TeV and fixing all other Wilson coefficients to zero. For both interpretations, higher-order QCD and electroweak corrections are assumed to factorize from the new-physics effects. Limits in the two bases are extracted from the effect of the interference term alone and also from the effect of the interference plus quadratic terms. The OO value is calculated using HAWK because the CP-odd operators in the two EFT bases are similar. HAWK uses the HISZ basis assuming d̃ = d̃_B, which corresponds to c_HW̃ = c_HB̃ in the Warsaw basis. However, since c_HB̃ has negligible impact on VBF, only c_HW̃ is varied (setting c_HB̃ = 0) and the computed OO is assumed to be equally optimal for c_HW̃ alone.
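The per-event reweighting implied by Eq. (1) can be illustrated with a toy sketch. The complex amplitudes below are randomly generated stand-ins for the generator-level matrix elements provided by HAWK, and the variable names are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
# Toy per-event amplitudes; in the analysis these come from HAWK at
# generator level. m_odd plays the role of the CP-odd BSM amplitude.
m_sm = rng.normal(1.0, 0.1, n) + 1j * rng.normal(0.0, 0.1, n)
m_odd = rng.normal(0.0, 0.3, n) + 1j * rng.normal(0.0, 0.3, n)

# Per-event interference and quadratic weights relative to the SM.
w_int = 2.0 * (np.conj(m_sm) * m_odd).real / np.abs(m_sm) ** 2
w_quad = np.abs(m_odd) ** 2 / np.abs(m_sm) ** 2

def weight(d_tilde):
    """|M(d)|^2 / |M_SM|^2 = 1 + d * w_int + d^2 * w_quad, per event."""
    return 1.0 + d_tilde * w_int + d_tilde ** 2 * w_quad
```

At d̃ = 0 every event weight is exactly 1, and for any real d̃ the weights stay non-negative, since each equals |M_SM + d̃ M_CP-odd|²/|M_SM|².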
All generated events were passed through a full simulation of the ATLAS detector response [63] using Geant4 [64], except the Sherpa γγ sample, which was passed through a fast parametric simulation of the detector response [63]. The effects of multiple pp interactions in the same or neighboring bunch crossings (pileup) are included by overlaying events generated with Pythia 8 [65]. Events are weighted such that the distribution of the average number of interactions per bunch crossing matches that observed in data.
Photons are reconstructed from variable-size topological clusters formed from electromagnetic-calorimeter cells with significant energy deposits and from tracks, initiated by converted photons, measured in the inner detector (ID) [66]. Events must have at least two photon candidates outside the calorimeter's transition region between the barrel and the end-cap, 1.37 < |η| < 1.52, and within |η| < 2.37, where the two leading (highest-E_T) photons are used to reconstruct the Higgs boson candidate and the primary vertex of the event [67]. The diphoton invariant mass m_γγ is required to be in the range 105-160 GeV. The leading and subleading photons are further required to have E_T/m_γγ greater than 0.35 and 0.25, respectively, and to fulfill the Tight identification selection and the Tight calorimetric and track-based isolation requirements [66]. Jets are reconstructed using the anti-k_t algorithm [68,69] with a radius parameter R = 0.4 from inputs formed with a particle-flow algorithm [70], which uses information from both the calorimeter and the ID. Jet candidates are required to have p_T > 30 GeV and |η| < 4.4. To suppress jets from pileup collisions, jet candidates with |η| < 2.4 and p_T < 60 GeV are required to pass the Tight jet vertex tagger (JVT) selection [71]. For jets with |η| ≥ 2.4, the Loose forward JVT selection [72] is applied to remove pileup jet contamination. To construct the region enriched in VBF signal events, two loose criteria are applied: events must have at least two jets with pseudorapidity separation |Δη_jj| > 2 and Zeppenfeld variable [73] η_Zepp = |η_γγ − (η_j1 + η_j2)/2| < 5.
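The two loose VBF criteria above translate directly into code; a minimal sketch (the function name is illustrative, not from the ATLAS software):

```python
def passes_vbf_preselection(eta_yy, eta_j1, eta_j2):
    """Apply the two loose VBF criteria: |Delta eta_jj| > 2 and
    Zeppenfeld variable |eta_yy - (eta_j1 + eta_j2) / 2| < 5."""
    delta_eta_jj = abs(eta_j1 - eta_j2)
    eta_zepp = abs(eta_yy - 0.5 * (eta_j1 + eta_j2))
    return delta_eta_jj > 2.0 and eta_zepp < 5.0
```

A central diphoton system between two well-separated forward jets passes, while events with a small dijet rapidity gap, or a diphoton system far from the dijet midpoint, are rejected.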
To increase the VBF signal purity, two boosted decision trees (BDTs) [74] are trained. BDT_VBF/ggF is used to separate the VBF signal from ggF events, which are the major background from Higgs boson production.
BDT_VBF/Continuum is used to distinguish VBF H → γγ events from continuum background events, which consist of prompt diphoton events (γγ) and events where one or two of the photon candidates originate from jets misidentified as photons (γj or jj). The γγ events, which are the dominant component of the continuum background, are obtained from simulation, while the γj and jj events are obtained from dedicated data control regions, as described later. The two BDTs use the same input variables: the invariant mass of the dijet system formed by the two leading jets (m_jj), the pseudorapidity separation of the dijet system (Δη_jj), the p_T of the system of the Higgs boson and the leading two jets (p_T^Hjj), the azimuthal angle between the diphoton and dijet systems (Δφ(γγ, jj)), the minimum angular separation between the photons and the two leading jets (ΔR_min^γj), η_Zepp, and the perpendicular projection of the diphoton p_T onto the diphoton thrust axis (p_Tt) [75]. These input variables are all CP-even, so as to be insensitive to the CP properties of the VBF signal, and have negligible correlation with m_γγ. Figure 1 shows the BDT output distributions of the VBF signal, the ggF background, the continuum background, and the data in the m_γγ sideband (m_γγ ∈ [105, 118] GeV or [132, 160] GeV). The agreement between the continuum background and the sideband data shows that the continuum background used in the BDT training is well modeled. Events are categorized as follows: firstly, a requirement is placed on BDT_VBF/ggF to separate events into 'tight' (T) and 'loose' (L) regions. The ratio of VBF signal to ggF background is improved by a factor of ten in the 'tight' region. Then, two independent requirements on BDT_VBF/Continuum are applied to the 'tight' and 'loose' regions to maximize the combined significance of the VBF signal. Three signal regions are defined: TT, TL, and LT, where the first (second) letter corresponds to the BDT_VBF/ggF (BDT_VBF/Continuum) separation type. More details on the BDT input variables and the categorization requirements can be
found in the Supplemental Material [76]. In the TT and TL categories, the dominant Higgs boson background is from the ggF process, and the contributions from non-ggF Higgs boson processes, e.g. VH, ttH and tH, are found to be negligible. In the LT category, Higgs boson backgrounds are still mostly from the ggF process, while those from non-ggF Higgs boson processes increase to about 1%-3% of the VBF event yield. This novel BDT-based strategy improves the significance of the VBF signal by 10% with respect to the latest H → γγ analyses [77] with the same dataset. The signal yield is extracted via a combined unbinned maximum-likelihood fit to the m_γγ distribution of the observed data in each OO bin, as shown in Figure 2. Both the signal and background shapes are modeled with analytic functions. The H → γγ signal shape is described by a double-sided Crystal Ball (DSCB) function [12], consisting of a Gaussian distribution in the region around the peak, continued by power-law tails at lower and higher m_γγ values. The parameters of the DSCB function in each category are obtained from a fit to the simulated VBF sample, together with the other Higgs boson production modes in proportion to their SM cross-sections.
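The DSCB shape described above can be sketched as follows. This is a common parameterization of the double-sided Crystal Ball (Gaussian core with two power-law tails, continuous at the crossover points), shown unnormalized with made-up tail parameters rather than the fitted ATLAS values:

```python
import math

def dscb(t, a_lo, n_lo, a_hi, n_hi):
    """Unnormalized double-sided Crystal Ball density as a function of
    t = (m_yy - mu) / sigma: a Gaussian core for -a_lo <= t <= a_hi,
    with power-law tails of exponents n_lo and n_hi outside."""
    if -a_lo <= t <= a_hi:
        return math.exp(-0.5 * t * t)
    if t < -a_lo:  # low-mass tail, continuous at t = -a_lo
        base = (a_lo / n_lo) * (n_lo / a_lo - a_lo - t)
        return math.exp(-0.5 * a_lo * a_lo) * base ** (-n_lo)
    # high-mass tail, continuous at t = a_hi
    base = (a_hi / n_hi) * (n_hi / a_hi - a_hi + t)
    return math.exp(-0.5 * a_hi * a_hi) * base ** (-n_hi)
```

At t = −a_lo and t = a_hi the tail expressions reduce to the Gaussian values exp(−a²/2), so the density is continuous, and far into the tails it falls off like a power law rather than a Gaussian.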
The modeling of the continuum background relies on both simulation and data-driven methods. The m_γγ shape of the γγ component is estimated using the Sherpa sample, while the m_γγ shapes of the γj and jj components are obtained from data control regions formed by inverting the Tight photon identification and isolation requirements. The template is then built by summing the γγ, γj and jj components, whose fractions are measured in data using a two-dimensional double-sideband method [78]. The composition of the continuum background is found to be approximately 85% γγ events and 15% γj + jj events. The background templates are smoothed using Gaussian process regression (GPR) [79] with the Gibbs kernel to reduce fluctuations due to the limited sample size. The m_γγ distribution of the continuum background is found to have a smoothly falling shape. The analytic function chosen to model the continuum background is either a power-law function, a Bernstein polynomial [12], or the exponential of a polynomial, and it is selected for each OO bin independently. The selected function is required to have the smallest spurious signal, defined as the systematic bias in the fitted signal yield due to differences between the chosen fit function and the background template. The coefficients of these functions are considered to be independent across categories, and in all cases are treated as free parameters in the fits to data. More details can be found in Ref. [12].
An unbinned likelihood is constructed from the m_γγ spectra of each OO bin in the signal regions TT, TL and LT. The negative log-likelihood (NLL) is evaluated for various d̃ and c_HW̃ hypotheses. Confidence intervals are obtained by reading values off the NLL curve, which is constructed by interpolating between the scan points with spline functions. The normalization of the signal is allowed to float in the fit. The analysis therefore exploits only the shape of the distribution of the Optimal Observable, and ignores the potential dependence of the inclusive cross-section on CP-mixing scenarios. If present, any BSM CP-even effects would mainly change the normalization and produce very small symmetric changes in the OO distribution, which are found not to bias the parameter of interest for the CP-odd effect. All other Higgs boson production modes are considered as backgrounds and are normalized to their SM predicted yields. The expected ΔNLL curve is obtained using a pseudo-dataset in which the event yields and distributions in the signal regions are set to the SM expectations for both the signal and background processes.
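The interval extraction from the ΔNLL scan can be sketched with a toy parabolic curve. The grid, the toy uncertainty, and the use of linear interpolation in place of the spline are illustrative assumptions:

```python
import numpy as np

grid = np.linspace(-0.05, 0.05, 201)  # scan points in the Wilson coefficient
sigma = 0.01                          # toy measurement uncertainty
dnll = 0.5 * (grid / sigma) ** 2      # parabolic DeltaNLL around the minimum

def crossing_interval(grid, dnll, level):
    """Interpolate the scan on each side of the minimum and return the points
    where DeltaNLL crosses `level` (0.5 for 68% CL, 1.92 for 95% CL)."""
    i_min = int(np.argmin(dnll))
    # np.interp needs increasing abscissae, so reverse the left branch.
    lo = np.interp(level, dnll[: i_min + 1][::-1], grid[: i_min + 1][::-1])
    hi = np.interp(level, dnll[i_min:], grid[i_min:])
    return lo, hi

lo68, hi68 = crossing_interval(grid, dnll, 0.5)
lo95, hi95 = crossing_interval(grid, dnll, 1.92)
```

For a parabolic ΔNLL the 68% crossings sit at ±σ and the 95% crossings at about ±1.96σ; a non-parabolic curve, such as one saturated by the quadratic term, simply yields asymmetric or open intervals.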
Both the theoretical and experimental systematic uncertainties are incorporated into the likelihood model of the measurement as nuisance parameters. Theoretical uncertainties arise from the modeling of the VBF and ggF processes because of the missing higher-order terms in the perturbative QCD calculations, the modeling of the underlying event and parton shower, the parton distribution functions, and the value of α_s. These uncertainties are estimated following the procedure described in Ref. [12]. The experimental uncertainties include the uncertainties in the photon energy scale and resolution [66], the jet energy scale and resolution [80], the luminosity measurement, and the modeling of pileup events and of the photon identification and isolation criteria [81]. The spurious signal that could arise from mismodeling of the continuum background is estimated in each OO bin.
Figure 3 shows the ΔNLL curves as functions of d̃ or c_HW̃. Here, the d̃ results use the interference-plus-quadratic terms in Eq. (1), while the c_HW̃ results use only the interference term. The confidence intervals for the two scenarios, interference-only and interference-plus-quadratic, are shown in Table 1. The difference between the results in the two scenarios is found to be small. The results are compatible with the SM, and the precision is limited by the statistical uncertainty of the data. For example, the total impact of the systematic uncertainties on the 95% confidence interval of d̃ is less than 2%. The measurement is sensitive enough to determine an observed 95% confidence interval for d̃, which was not achieved in previous analyses. The expected 68% confidence interval shown for the H → ττ channel in Table 1 differs slightly from that presented in Ref. [9], where the expected H → ττ results were obtained with the nuisance parameters constrained only by the control regions. In the present analysis, the expected results are obtained with the nuisance parameters constrained by both the control regions and the signal regions. The saturation of the ΔNLL shape at larger values of d̃ is a result of the dominance of the quadratic term. This was a limiting factor in the H → ττ analysis, where this saturation, together with the larger statistical and systematic uncertainties, prevented setting intervals at the 95% confidence level, the level most commonly used to constrain the corresponding EFT operators.
This Letter reports a significantly improved expected 95% confidence interval compared to the previous H → ττ analysis and presents the observed 95% confidence interval for the first time. The 95% confidence interval for c_HW̃ obtained using the interference-only term is a factor of five more restrictive than in the H → γγ differential measurement reported in Ref. [12], thanks to the dedicated BDTs for the VBF signal selection and the use of the Optimal Observable. The 68% confidence interval for c_HW̃ is about twice as restrictive as that from either the ATLAS or CMS H → ZZ* four-lepton analysis [13,16]. The luminosity uncertainty of the 2015-2016 data, the uncertainties of the photon energy scale and resolution, and the theoretical uncertainties of the VBF and ggF processes are correlated in the combination. The jet-related uncertainties are not correlated, since a different jet reconstruction technique was used in the H → ττ analysis.
In conclusion, a test of CP invariance in Higgs boson production via vector-boson fusion is performed in the H → γγ channel using 139 fb⁻¹ of √s = 13 TeV proton-proton collision data collected by the ATLAS detector at the LHC. The Optimal Observable method is used to probe CP-violating interactions between the Higgs boson and electroweak gauge bosons described by an effective field theory. The results are compatible with the SM. No sign of CP violation is observed in the Optimal Observable distributions. The constraints on CP-violating effects in the HVV coupling are the most stringent to date. They allow 68% and 95% confidence intervals to be set for the parameters describing the strength of the CP-odd component of the HVV coupling in two effective field theory bases: d̃ in the HISZ basis and c_HW̃ in the Warsaw basis.
The sensitivity is sufficient to set a 95% confidence interval for d̃ for the first time, and the constraints on d̃ are tightened further by combining them with the previous results from the H → ττ channel. The constraints on c_HW̃ are about twice as restrictive as those from either the ATLAS or CMS four-lepton analysis.
[78] ATLAS Collaboration, Measurement of the isolated diphoton cross section in 𝑝

Figure 1: Distribution of the output of BDT_VBF/ggF (left) and BDT_VBF/Continuum (right). The agreement between the continuum background and the sideband data indicates that the continuum background used in the BDT training is well modeled.

Figure 2: Distribution of the Optimal Observable OO for events with m_γγ ∈ [118, 132] GeV. Contributions in the three signal regions are summed with a weight of ln(1 + S/B) for each signal region, where S and B are the expected yields of signal and background events with m_γγ ∈ [118, 132] GeV. The overflow and underflow are included in the highest and lowest bins, respectively. The uncertainty band shown includes all systematic uncertainties. The weighted summed m_γγ distribution of data events is shown in the inner panel along with the signal and background contributions. The lower panel shows the OO distribution in data after subtraction of all backgrounds, compared with the SM VBF process and with VBF processes with d̃ = 0.06 and d̃ = −0.06. The sensitivity to d̃ is dominated by the tails of the OO distribution.

Figure 3: ΔNLL curves as a function of (a) d̃ and (b) c_HW̃. In figure (a), the ΔNLL of d̃ considers the interference-plus-quadratic terms, whereas in figure (b) the ΔNLL of c_HW̃ considers the interference-only term. The solid lines are the observed results, while the dashed lines are the expected results. In figure (a), the blue lines represent the results of this analysis, while the red lines represent the results from the H → ττ analysis [9]. The black lines show the combination of these two analyses. For all figures, the dashed horizontal lines show the values of ΔNLL used to define the 68% and 95% confidence intervals.