Measurement of $\sigma(e^+e^- \to HZ) \times {\cal B}r(H \to ZZ^*)$ at the 250 GeV ILC

We report on studies of the $e^+e^- \to HZ$ process with the subsequent decay of the Higgs boson $H \to Z Z^\star$, where the $Z Z^\star$ combination is reconstructed in the final states with two jets and two leptons. The analysis is performed using Monte Carlo data samples obtained with detailed ILD detector simulation assuming the integrated luminosity 2 ab$^{-1}$, the beam polarizations ${\cal{P}}_{e^-e^+} = (-0.8, +0.3)$, and the center-of-mass energy $\sqrt{s}$ = 250 GeV. The analysis is also repeated for the case of two 0.9 ab$^{-1}$ data samples with polarizations ${\cal{P}}_{e^-e^+} = (\mp0.8, \pm0.3)$. The process is measured in four decay channels, which correspond to two combinations for the Higgs final states and two decay modes of the directly produced $Z$ boson, $Z \to q \bar{q}$ and $Z \to \nu \bar{\nu}$. To obtain the Higgs boson mass distributions, we used the variables $M(jj\ell\ell)$ and $M_{\Delta} = M(jj\ell\ell) - M(jj) + M(Z_{\rm nom})$, where $M(Z_{\rm nom})$ = 91.2 GeV. Contributions of the potential background processes are taken into account based on the available MC event samples. We propose a model-independent method for obtaining the width of the Higgs boson using the measurement of the $e^+e^- \to HZ$ process.


I. INTRODUCTION
Since the discovery of the Higgs boson by the ATLAS and CMS collaborations [1,2] in 2012, the next important task is to measure parameters of this particle with the highest possible accuracy. In recent years, the mass, couplings, production cross sections, and quantum numbers of the Higgs boson have been measured in the LHC experiments with increasing accuracies. However, the width of the Higgs boson is difficult to measure at LHC in a model-independent approach. Indirect methods will be used to measure the Higgs width at LHC after the high-luminosity upgrade, but even with the data sample of 3 ab −1 the uncertainty is expected to be ∼20 % [3,4].
The width of the Higgs boson is strictly theoretically defined within the Standard model (SM) for a fixed Higgs mass. The width value of 4.1 MeV/c 2 has been calculated for the mass of 125 GeV/c 2 [5]. This result may be distorted by beyond the Standard Model (BSM) contributions. Therefore, the model-independent measurement of the Higgs width provides an important test of SM. A high accuracy of the Higgs width measurement can be reached at the future e + e − linear collider ILC [6,7]. A large number of Higgs bosons will be produced at ILC, whereas backgrounds are expected to be relatively small. Events in the ILD detector [8] at ILC have clean and well-defined signatures and, therefore, processes of interest can be identified and studied in detail.
We propose to use the process e + e − → ZH with the subsequent decay H → ZZ to measure the product of the cross section and the decay branching fraction, which can theoretically be expressed as: σ(e + e − → HZ) × Br(H → ZZ ) = C · g 4 Z /Γ H Here C is a constant which can be calculated theoreti-cally with an uncertainty of less than 1 % [7], g Z is the Higgs boson coupling HZZ, and Γ H is the Higgs boson width. Therefore, the measurement of this product can be used to obtain the width of the Higgs boson with a high accuracy, because the coupling g Z is expected to be determined combining ILC and LHC results with an uncertainty of about 0.5 % [9] using other processes. In this analysis we assume the data sample of 2 ab −1 collected by the ILD experiment in the e + e − collisions at a center-of-mass energy 250 GeV. The accuracy of the e + e − → ZH process measurement at 250 GeV with the subsequent decay H → ZZ has also been evaluated using the SiD detector simulation [6]. In the studied process the two on-shell Z bosons and one off-shell Z boson have to be reconstructed. The directly produced Z boson is denoted below as Z 1 to separate it from the Z bosons produced in the Higgs decay. The decays of the Z ( ) (1) bosons to two hadronic jets, two opposite sign leptons ( ± = e ± , µ ± ), and two neutrinos are considered in the analysis. The number of signal events is expected to be small if two of these three Z ( ) (1) bosons are reconstructed in leptonic modes. Therefore we reconstruct only one of three Z ( ) (1) bosons in the leptonic mode, and other two in hadronic or neutrino modes. In this paper four channels are studied and the final precision of the Higgs width measurement is calculated combining accuracies obtained in these channels.

II. MC SAMPLES AND ANALYSIS TOOLS
In this analysis we study the following subprocesses: arXiv:2108.08867v2 [hep-ex] 19 Nov 2021 The official Monte Carlo (MC) data samples produced by the ILD group are used. All processes are generated using Whizard 2.8.5 package [10] with the LCIO [11] output format; hadronization is performed by Pythia6 [12]. The detailed simulation of the ILD detector effects is performed using the ILD_l5_o1_v02 model from the ILCSoft toolkit [13] v02-00-02 using the DD4HEP [14] software package. Finally, the events are reconstructed with the MarlinReco [15] package.
The official MC samples are generated assuming four possible combinations with 100 % beam polarization, P e − e + = (± 1.0, ± 1.0), and 250 GeV center-of-mass energy. For the signal events two sets of MC samples are obtained: e − L e + R (LR) with left-handed electrons and righthanded positrons and e − R e + L (RL) with right-handed electrons and left-handed positrons. Initial state radiation (ISR) and beam radiation processes are properly included at the generation level. The γγ beam induced processes are overlaid on the generated events before reconstruction. The MC samples contain data arrays, so-called data collections, with information about all particles in an event. In particular, the MCParticles [16] and Pan-doraPFOs (the Particle Flow Objects reconstructed with PandoraPFA [17]) collections are included in the samples. Table I shows the basic information taken from logbook ELOG [18] for MC samples selected from repository for this analysis. Zero background contribution is obtained in studies of the MC samples ZZ(4j) + γ (2 ), ZZ/W W (4j)+2ν, H(all)+X, and Z(2ν)Z(2j)+γ (2 ), which are not included in Table I. ILCSoft includes the Marlin software package, which contains special program codes, so-called processors, used, in particular, for the separation of isolated leptons and the jet reconstruction. The Marlin package provides an option to use the FastJet [19] software codes, designed for clustering particles in jets based on various reconstruction algorithms. This method is used in the analysis.

III. EVENT PRESELECTION AND INITIAL ANALYSIS
The MC samples studied are preselected using the MCParticle collection containing information from the MC event generator level. The signal samples are extracted requiring only specific process and decay chains. The background samples are studied without preselections, however the most dangerous background processes are also preselected and studied separately. All following selections are applied using the information on the reconstruction level.
The first step of the event selection is to identify two isolated lepton candidates.
The standard IsolatedLeptonTagging [20] processor is applied for this goal. This processor finds high energy leptons in events using multivariate double-cone method and TMVA [21] machine learning algorithms. We used the default set of parameters and weights included in this processor. The Z and Z reconstruction efficiencies in the leptonic modes in the channel with four jets (two jets) are ∼67 % (∼72 %) and ∼90 % (∼91 %), respectively. Only events with two identified isolated leptons are kept for the following analysis. These leptons are excluded from the following jet reconstruction procedure.
Energetic ISR photons can be observed inside the detector, this can affect the analysis. A simple ISR identification procedure [22] is applied after the lepton identification procedure. If an ISR photon candidate is found, it is removed from the PFOs collection and not used in the jet reconstruction.
The next step is the jet reconstruction which is performed using FastJet clustering tools. For this goal we choose the Valencia [23] algorithm, which was specially developed for jet reconstruction at electron-positron colliders. We select this algroithm for its high efficiency of jet reconstruction near the beam direction. After excluding isolated leptons and ISR photons all remaining particles in an event are expected to be a part of jets or to come from a γγ low P t process. On average about 0.4 γγ low P t hadron events are expected per bunch-crossing at √ s = 250 GeV [24]. We use an exclusive k T clustering algorithm [25] with a generalized jet radius of 1.5 to remove the γγ overlay particles and the Valencia algorithm to force the remaining particles into two or four jets, depending on the studied channel. Three parameters should be adjusted in the Valencia algorithm, the generalized jet cone radius R, and the β and γ parameters, which are used to control the clustering order and the background resilience. We set the β parameter to 1.0 that corresponds to behavior of the k T algorithm. The radius R and the parameter γ are tuned to optimize the invariant two-jet mass shape from the Z → jj decay. The position and the width of the two-jet mass distribution are evaluated for different R and γ values using the M (jj) − M (Z), IQR 34 , RM S 90 and Median parameters proposed in [23]. We chose the values of these parameters which provide the best Z boson mass reconstruction quality. Table II shows the sets of the best R, β and γ parameters chosen for the Valencia jet clustering algorithm in each channel.
The product of the cross section and the branching fraction discussed above can be measured experimentally using the formula: where N sig is the number of signal events measured in a specific channel, and L int is the integrated luminosity of a used data sample. For a studied channel the selection efficiency is denoted by and the relevant decay branching fractions of the Z bosons decays taken from PDG [3] are denoted as Br(Z 1 ), Br(Z), and Br(Z * ).
To get the expected number of signal or background events with P e − e + = (−0.8, +0.3) polarization and the integrated luminosity 2 ab −1 , we apply a weight factor to each event from the MC samples. The MC samples are generated assuming 100 % polarized beams; the sample nominal integrated luminosities L nom are given in Table I. The weight factor W is calculated as: The numbers of initial MC events, the numbers of events remained after lepton identification, the weight factors, and the final number of weighted events are given in Table III. The number of Higgs boson signal events is obtained by fitting distributions of the invariant mass M (jj ). However for the channels with Z → jj and Z → decays the following formula gives a better resolution: where M (Z nom ) = 91.2 GeV. This formula results in a narrower Higgs boson mass peak, because uncertainties of the jet reconstruction are mostly canceled in the mass difference.

IV. RESULTS
The four channels are studied and the signal statistical uncertainties are evaluated. To suppress backgrounds various cuts are applied as summarized in Table IV. First, the signal and background distributions are obtained with the weighted bin contents and uncertainties. These distributions are fitted to obtain shape parameters separately for the signal and background. Then, the signal statistical uncertainties are estimated using the obtained distribution shapes and normalizations. To reproduce the real data distribution, the weighted signal and background distributions are summed, the content of each bin is rounded to the integer number and the Poisson uncertainties for the bin contents are assumed. The binned extended maximum likelihood fit method is applied to the combined distributions with the function including signal and background terms with the fixed shapes determined in the first step and free normalizations. Finally, the toy MC method is applied to obtain precise estimates for the signal statistical uncertainties.
A. Study of the e + e − → Z1(j1j2) H(ZZ ) process with Z → j3j4, Z → + − The final state of the first studied channel includes two leptons and four jets. To form the Z 1 and Z bosons from these four jets we calculate χ 2 for six possible two-jet combinations: where P (Z 1 ) = 60.0 GeV/c is the mean Z 1 momentum in the e + e − → HZ 1 process at the 250 GeV center-ofmass energy. All σ parameters are the mean widths of corresponding mass or momentum distributions on the reconstruction level. The combination with the minimal χ 2 is selected for the following analysis. After jet matching is performed, several cuts are applied. To remove random backgrounds, the full visible energy in the event is required to lie in the range 200 < E(jjjj + − ) < 260 GeV. After this cut, the dominant backgrounds come from the e + e − → W + W − γ and e + e − → ZZγ processes, with the off-shell γ decaying to two leptons and the W and Z bosons decaying to two jets. The distribution of the off-shell photons falls sharply with increasing mass. The invariant mass of the two leptons rarely exceeds 10 GeV/c 2 , while the mass of two leptons in the signal events starts from 10 GeV/c 2 . To obtain the best signal significance, the M ( + − ) > 13 GeV/c 2 cut is applied to suppress these backgrounds. A small contribution comes from the four jets W + W − and ZZ backgrounds, where b or c-quarks decay semileptonically and the produced leptons split off the corresponding jet. To suppress these backgrounds, the minimum lepton momenta is required to be larger than 9 GeV/c. The maximum lepton momentum is required to be P max ( ) < 32 GeV/c. We also apply the cut M (jj) > 70 GeV/c 2 , which suppresses the contribution from the H → Z Z process. Figure 1 shows the M ∆ distributions for the signal and background events separately (a) and for the sum of the signal and background events (b), obtained as described above.
The signal distribution is modeled by the sum of two functions: a Breit-Wigner function convolved with a Gaussian function and an additional wide Gaussian function to account for residual Z Z events and a few events due to a wrong jet matching in the χ 2 selection. The width of the Breit-Wigner function is fixed to the value Γ = 2.495 GeV/c 2 , because the Z boson natural width transfers into the M ∆ value. The background is described by a third order Chebychev polynomial function. First, the signal and background distributions are fitted separately to obtain the shapes of the distributions. Then, the distribution of the sum of the signal and background contributions is fitted with the sum of the corresponding functions with fixed shapes and free normalizations. A clear signal peak is observed in the combined distribution. The fit yields 193.4 ± 24.5 signal events.
The two jet mass distributions for the Z 1 and Z bosons are shown in Fig. 2. These distributions are wide, however the large uncertainties in the jet reconstruction mostly cancel in the M ∆ distribution. The signal significance is checked with a toy MC using the RooFit package. The 10000 M ∆ mass distributions are generated using the shapes and normalizations for the sum of the signal and background distributions obtained separately. The generated mass distributions are fitted with a function including both signal and background terms with free normalizations. Figure 3 demonstrates the distribution of the numbers of the signal events obtained from the toy MC. The fit of this distribution to the Gaussian function gives the mean value and width of 192.4 ± 0.3 and 24.9 ± 0.2 events, respectively. The toy MC results agree within uncertainties with the combined fit results. Therefore the statistical uncertainty for this channel is 12.9%. We quote as the final results the toy MC results for this and other channels. In this channel the Z boson is reconstructed in the leptonic mode and the Z boson is reconstructed in the hadronic mode. The minimal χ 2 value is calculated from six possible jet combinations using the formula: where additionally to the M (Z nom ) and P (Z 1 ) values defined above, the mean Z 1 energy E(Z 1 ) = 110.0 GeV is introduced for the e + e − → HZ 1 process at 250 GeV. The energy term slightly improves the quality of the χ 2 selection. All σ parameters are obtained using the corresponding distributions on the reconstuction level. We apply 70 < M ( ) < 95 GeV/c 2 and 200 < E(jjjj ) < 260 GeV requirements to suppress backgrounds due to random lepton pairs and possible H → Z Z contribution. Kinematically, uncorrelated lepton pairs with masses in the Z boson mass region are rarely produced at √ s = 250 GeV. The cut on the di-lepton mass removes almost all combinatorial backgrounds. Additionally we reject candidates with the mass M (jj) > 50 GeV/c 2 corresponding to the Z → jj decay. We found no significant remaining backgrounds in this channel after the application of all cuts. Figure 4 shows the distribution of the mass M (jj ), which peaks around of the Higgs boson mass. The integral of the signal distribution in the mass range 100 < M (jj ) < 160 GeV/c 2 is 275.3 events. The background is very small, the integral over all bins is 18.3 events. This background is flat and can be subtracted from the final number of events. Using this method, the signal mean value and uncertainty are estimated to be 275.3 ± 17.2 events. The statistical uncertainty for this channel is 6.3%.
We also studied the processes in which the directly produced Z 1 boson decays to neutrinos. Comparing with decays in the hadronic mode, a smaller number of signal events is expected in the neutrino mode, due to the smaller Z decay branching fraction. However the final state has a simple signature with only two jets and two leptons.
We studied different background sources to this channel with the Z → jj and Z → + − decays. There are many background sources with large cross sections which can contribute to this channel. Special attention must be paid to the e + e − → Z(2j)Z(τ + τ − ) process with leptonic τ decays, and also to the e + e − → W (2j)W ( ν) process with a lepton produced within one of the jets. Another potentially dangerous background is due to bb pair production, where both b-quarks decay semileptonically. These two leptons have to split off the jets to imitate the studied configuration. The probability for two leptons produced in hadronic jets to be identified as isolated leptons is very low, however it is compensated by high rates for this process. These backgrounds have a signature similar to the signal configuration. To reduce these backgrounds a set of cuts given in Table IV is applied. The effective cuts are on the full visble momentum 30 < P (jj ) < 70 GeV/c and energy E(jj ) < 145 GeV/c. The angular cuts on the azimuthal angle of the full system relative to the beam direction, |cos θ vis | < 0.8, and on the angle between the Z and Z boson directions, ∆φ ZZ < 120 • , are used to further suppress the backgrounds. Some additional suppression of specific background configurations can be achieved if dedicated cuts are applied on the minimum and maximum momenta of the leptons. We also tested the processes e + e − → bb and e + e − → ZH(bb) and found a small background contribution. To suppress these backgrounds we applied the 13 < M ( ) < 34 GeV/c 2 cut. The cut 80 < M (jj) < 113 GeV/c 2 is used to suppress the contribution from the H → Z Z process. Figure 5 shows the M ∆ distributions for the signal and background events separately (a) and their sum (b). The fit procedure is applied to estimate the statistical significance. The signal is modeled with a convolution of a Breit-Wigner function and a Gaussian function. The width of the Breit-Wigner function is fixed to the value Γ = 2.495 GeV/c 2 . The background is described by the third order Chebychev polynomial function. First, the As in the previous channel, the Z 1 boson here decays also to neutrinos. However the hadronic and leptonic modes for the Z and Z bosons are swapped.
The dangerous background sources are similar to the previous channel, except the bb background. However this channel's backgrounds are more effectively suppressed due to the large dilepton mass. We select events in the intervals 80 < M ( ) < 95 GeV/c 2 , 13 < M (jj) < 38 GeV/c 2 , 40 < P (jj ) < 70 GeV/c and E(jj ) < 145 GeV to suppress random lepton pairs and other backgrounds. Angular cuts |cos θ vis | < 0.9 and ∆φ ZZ < 140 • are also applied. Finally we require that at least one jet from the Z decays has the momentum < 22 GeV/c, whereas the second one has the momentum < 42 GeV/c. Figure 6 shows the mass distribution M (jj ) obtained after all the cuts applied for the signal and background events separately (a) and their sum (b). It has to be noted that the mass distribution is relatively narrow in this channel. This is because of only two jets and two leptons in the final state. Therefore we do not need to apply the minimum χ 2 method and the invariant mass of the two jets will not change if some of particles are wrongly assigned to jets. The signal distribution is described by the sum of two Gaussians. The wide Gaussian accounts for the H → Z Z contribution. The background is described by the third order Chebychev polynomial function. Using the fit procedure we obtain 74.1 ± 13.9 events. The toy MC estimation gives 73.3 ± 14.2 events, corresponding to a statistical uncertainty of 19.3%. The combined fit results perfectly agree with the toy MC values in all four channels.

V. COMBINED SIGNAL SIGNIFICANCE ESTIMATE
An important result of this study is an estimate of accuracy which can be reached for the Higgs width measurement. To estimate the accuracy, we calculate the combined statistical uncertainty for the four studied channels using the formula S comb = 1/ 4 i=1 S −2 i . Results obtained for all studied channels and the combined value of statistical uncertainty are given in Table V. As given in Table V, the statistical uncertainty of the proposed method is 5.29 % for the integrated luminosity 2 ab −1 and polarization P e − e + = (−0.8, +0.3). Alternatively, we assumed two data samples with the polarizations P e − e + = (−0.8, +0.3) and P e − e + = (+0.8, −0.3) and the integrated luminosity of 0.9 ab −1 each. The same analysis is repeated for this data taking scheme and the total statistical uncertainty of 6.15 % is obtained.
We note that the branching fraction Br(H → ZZ ) will have a small contribution from the H → Z Z decay, which is around (5-10) % depending on the studied channels. This contribution can be accurately evaluated and corrected for. In the last two channels there is also a contribution from to the W -fusion process e + e − → H(ZZ ) ν eνe . For the used cuts its fraction is about 15 % of the selected signal events. As in the case of the previous correction, this fraction can be precisely evaluated and, therefore, does not result in a loss of accuracy.
The systematic uncertainties are not studied in this analysis. The largest systematic uncertainties are expected from the uncertainty in the selection efficiency and the uncertainty due to the signal and background shape modeling in the fit. The later systematic uncertainty will dominate. Unfortunately accurate estimates of the systematic uncertainties cannot be performed without real data.

VI. CONCLUSIONS
We studied the e + e − → HZ process with subsequent H → ZZ decay. The analysis is performed assuming the integrated luminosity 2 ab −1 collected at the e + e − collisions with center-of-mass energy 250 GeV and the beam polarizations P e − e + = (−0.8, +0.3). Four channels are studied and the corresponding signal and background contributions are estimated using MC simulation. Summing results obtained in the four studied channels we obtain the combined statistical uncertainty 5.29 %. This indicates, that the Higgs width can be measured using this method with an accuracy of about (5-6) % within the model-independent approach. We also repeated the analysis assuming two data samples with integrated luminosities 0.9 ab −1 and two beam polarizations P e − e + = (∓0.8, ±0.3) and obtained the statistical uncertainty of 6.15 %. The accuracy of this method is similar to one obtained in [6,7], where measurements of four or five processes have to be performed to determine the Higgs width. The results of both methods can be combined to further improve the accuracy.
The Higgs width can be potentially constrained in the future with an accuracy of about 2 % by applying a global fit with many Higgs related parameters included in the frame of the effective field theory (EFT) approach [26]. Our measurement can be used to test the Higgs width value obtained within the SM, as well as within the EFT approach. Moreover, the results obtained in this analysis can be included in the global EFT fit, that can improve its accuracy. Quantitatively it will be studied in a global fit for the upcoming Snowmass 2021 Higgs white paper with the input measurement from this paper.