Probing a Scalar Singlet-Catalyzed Electroweak Phase Transition with Resonant Di-Higgs Production in the $4b$ Channel

We investigate the prospective reach of the 14 TeV HL-LHC for resonant production of a heavy Higgs boson that decays to two SM-like Higgs bosons in the $4b$ final state in the scalar singlet extended Standard Model. We focus on the reach for choices of parameters yielding a strong first order electroweak phase transition. The event selection follows the $4b$ analysis by the ATLAS Collaboration, enhanced with the use of a boosted decision tree method to optimize the discrimination between signal and background events. The output of the multivariate discriminant is used directly in the statistical analysis. The prospective reach of the $4b$ channel is compatible with previous projections for the $bb\gamma\gamma$ and $4\tau$ channels for heavy Higgs boson mass $m_2$ below 500 GeV and superior to these channels for $m_2>500$ GeV. With 3 ab$^{-1}$ of integrated luminosity, it is possible to discover the heavy Higgs boson in the $4b$ channel for $m_2<500$ GeV in regions of parameter space yielding a strong first order electroweak phase transition and satisfying all other phenomenological constraints.


I. INTRODUCTION
After the discovery of the Higgs boson at the Large Hadron Collider (LHC) [1,2], understanding the details of electroweak symmetry-breaking (EWSB) in the context of the thermal history of the universe remains an important challenge for particle physics. In particular, it is possible that EWSB was accompanied by generation of the cosmic baryon asymmetry if new physics beyond the Standard Model (BSM) was active during that era. The Planck measurement of this asymmetry, characterized by the baryon-to-entropy density ratio Y B = n b /s, gives [3]: Explaining the origin and magnitude of Y B is a key problem for BSM scenarios. Electroweak baryogenesis (EWBG) is one of the appealing possibilities, in part due to its linking of Y B to EWSB and in part due to its testability in current and near future experiments. Three "Sakharov conditions" [4] need to be satisfied for a successful EWBG: baryon number (B) violation, C and CP violation, and departure from thermal equilibrium (through a strong first order electroweak phase transition) or a breakdown of CPT symmetry. In the Standard Model (SM), the first condition -baryon number -violation can be induced by the process of electroweak sphalerons. However, the CP violation in the SM is too feeble, and the EWSB transition is a crossover transition given the observed SM Higgs mass m h ∼ 125 GeV [5][6][7][8][9].
Therefore, the minimal SM cannot generate a successful strong first order electroweak phase transition (SFOWEPT). On the other hand, if new scalars exist in addition to the SM Higgs doublet, their interactions with the SM Higgs doublet may catalyze a SFOEWPT, thereby providing the necessary conditions for successful EWBG. 1 In this paper, we focus on the singlet extension to the SM, the xSM, which is proven to be able to give a SFOEWPT [10,11]. In the xSM, after EWSB, the gauge eigenstates of the singlet scalar and the SM Higgs doublet mix with each other to form the mass eigenstates h 1 (SM-like) and h 2 (singlet-like). Further, we restrict our study to searching for a signal of the on-shell production of the heavy singlet-like Higgs h 2 decaying into two SM-like Higgs h 1 (i.e. m 2 > 2m 1 ), because the regions of parameter space that can generate SFOEWPT simultaneously tend to enhance the h 2 h 1 h 1 tri-linear couplings [10][11][12]. Currently, the AT-LAS and CMS experiments are searching for a resonant di-Higgs signal through different 1 New CP-violating interactions would also be required, a topic we do not treat further here.
Higgs decay final states: 4b [13,14], bbW W * or bbZZ * [15,16], bbτ τ [17,18], bbγγ [19,20], W W * W W * [21], and γγW W * [22]. Thus far, no significant excess over SM backgrounds has been observed. On the theoretical side, several studies have been performed in the parameter regions that are viable for SFOEWPT. The singlet-like h 2 with a relatively light mass (∼270 GeV) can be discovered in the bbτ τ final state at the 14 TeV LHC with a luminosity of 100 fb −1 [12]. In the bbγγ and 4τ final states, a discovery is possible for m 2 up to 500 GeV at the 14 TeV high-luminosity LHC (HL-LHC) with a luminosity of 3 ab −1 [23]. In the bbW W * final state, a resonant signal can be discovered for m 2 in the range between 350 GeV and 600 GeV at the 13 TeV LHC with a luminosity of 3 ab −1 [24].
In this paper, we study the prospective discovery/exclusion in the 4b final state at the 14 TeV HL-LHC with a luminosity of 3 ab −1 . To that end, we first identify 22 benchmark points with m 2 ∈ [300, 850] GeV that produce the maximal and minimal di-Higgs signal rate σ h 2 × BR(h 2 → h 1 h 1 ) in consecutive 50 GeV intervals. The selected benchmark points satisfy all the current phenomenological constraints from the Higgs signal rate and electroweak precision data, and also satisfy the theoretical constraints from vacuum stability, perturbativity, and a SFOEWPT. We perform a full simulation of signal and background processes with the MadGraph5 parton level event generator [25] using PYTHIA6 [26] to simulate the parton shower and the DELPHES3 fast detector simulation [27]. Further, we use the TMVA package [28] to implement the Boosted Decision Tree (BDT) algorithm to optimize the event selection, finally obtaining the signal significance from the BDT score distributions of background and signal events.
Based on this analysis and the results shown in Fig. 4 below, we arrive at the following conclusions: • For singlet-like Higgs masses below 500 GeV, the significance of the 4b final state is competitive with the bbγγ and 4τ final states, and it is possible to make a discovery at the 14 TeV HL-LHC with a luminosity of 3 ab −1 for some portions of the SFOEWPTviable parameter space.
• For singlet-like Higgs masses above 500 GeV, the significance of the 4b final state is higher than in the bbγγ and 4τ final states but somewhat below recent projections for the bbW W * final state.
• With the results of the benchmark models that produce minimal di-Higgs signal rate, we found that it is impossible to exclude (at the 95% confidence level) all portions of parameter space consistent with a SFOEWPT and present phenomenological constraints at the HL-LHC.
The discussion of our analysis leading to these conclusions is organized as follows: Sec. II introduces the xSM framework and describes both theoretical and phenomenological constraints. In Sec. III, we describe the requirements for a SFOEWPT and the parameter scan.
In Sec. IV, we discuss the simulation and analysis of the 4b signal and background in detail and also present prospects for the 14 TeV HL-LHC. Section V is dedicated to the conclusions.
In the Appendix, we perform a global analysis of ATLAS Run 2 single Higgs measurements and present the distributions of the kinematic variables used in the BDT analysis.

A. The Model
The most general, renormalizable scalar potential in the xSM model is given by: where S is the real singlet and H is the SM Higgs doublet. When S obtains a vacuum expectation value (vev, see below), the a 1 and a 2 parameters induce mixing between the singlet scalar and the SM Higgs doublet, thereby providing a portal for the singlet scalar to interact with other SM particles. A Z 2 symmetry is present in the absence of a 1 and b 3 terms, a necessary condition for S to be a viable dark matter candidate. In what follows, however, we retain both parameters in our study as they play an important role in the strength of the electroweak phase transition (EWPT) and also in the di-Higgs signal rate at collider experiments.
After EWSB, H → (v 0 + h)/ √ 2 with v 0 = 246 GeV, and S → x 0 + s where x 0 is the vev for S without loss of generality. The stability of the scalar potential requires the quartic coefficients along all the directions in the field space to be positive. This translates into a requirement of a positive Hessian determinant of the potential with respect to fields s and h: det This leads the bounds λ > 0, b 4 > 0 and a 2 > −2 √ λb 4 . Another way to obtain these bounds is by parameterizing (h,s) as (r cos α, r sin α) in the field space, and we are able to extract the quartic coefficients of r along the α direction in the field space: Requiring the above expression be larger than zero for any value of cos α also leads to the same conditions.
Utilizing the minimization conditions, one can express two potential parameters in Eq. (2) in terms of the vevs and other parameters: Two additional conditions need to be satisfied for (v 0 , x 0 ) to be a stable minimum. One of them is that (v 0 , x 0 ) minimizes the potential locally, implying that: Also, this minimum point should be a global minimum, a requirement that we impose numerically.
As for the perturbativity consideration, we have the following naïve requirements on the quartic couplings: However, as discussed in Sec. III when scanning over the parameter space for benchmark points we implement more stringent bounds on those parameters compared with the above requirements. One may refer to Refs. [29][30][31] for more details about the perturbativity bound in the xSM.
Now we obtain the elements of the mass-squared matrix by: After the diagonalization of the above mass matrix, the physical masses of two neutral scalars can be expressed as: with m 2 > m 1 by construction. The mass eigenstates and gauge eigenstates are related by a rotation matrix: where h 1 is the SM-like Higgs boson with m 1 = 125 GeV, and h 2 is identified as the singletlike mass eigenstate. The mixing angle θ can be expressed in terms of the vevs, physical masses and potential parameters: From Eq. (11), one can observe that the couplings of h 1 and h 2 to the SM vector bosons and fermions are rescaled with respect to their SM Higgs couplings: where XX represents final states consisting of pairs of SM vector bosons or fermions. In this case, all the signal rates associated with the single Higgs measurements are rescaled by the mixing angle only: where σ h 1 and BR are the production cross section and branching ratio in the xSM, and the quantities with the superscript SM are the corresponding values in the SM. In the xSM for m 2 > m 1 , we have BR = BR SM because the partial width of each decay mode is rescaled by cos 2 θ and there is no new decay channel appearing.
In order to investigate the di-Higgs production, we also require the tri-Higgs couplings.
The one relevant for the resonant di-Higgs production is λ 211 : In this work, we focus on the situation where m 2 > 2m 1 such that a resonant production of h 2 and a subsequent decay to h 1 h 1 is allowed. Therefore, we are able to calculate the partial width Γ h 2 →h 1 h 1 : and the total width of h 2 : where Γ SM (m 2 ) represents the total width of the SM Higgs boson with a mass of m 2 , which is taken from Ref. [32]. The signal rate for pp → h 2 → XX normalized to the SM value is given by: which will be used to constrain the parameter space in the next section. The production cross section for the process pp → h 2 → h 1 h 1 can also be calculated: where s θ ≡ sin θ and, for future reference, c θ ≡ cos θ.

B. Phenomenological Constraints on the Model Parameters
The mixing angle θ between the singlet and the SM Higgs doublet in the xSM is constrained by measurements of the single SM-like Higgs signal strengths. We obtain a 95% C.L. upper limit on sin 2 θ of 0.131 by performing a global fit with current ATLAS Run 2 single Higgs measurements as discussed in Appendix A 1.
The constraints on the (m 2 , c θ ) plane can be found in our previous work [24]. We will also guarantee each benchmark point in the parameter scan in the next section satisfies all the limits mentioned above.
Finally, we discuss the constraints from electroweak precision observables ( In the xSM, the parameter U = 0 is a good approximation; we therefore focus only on the deviations in the S and T parameters, which we take from the Gfitter group [41]: where ρ ij is the covariance matrix in the (S,T ) plane. Again, we will impose the criteria in the parameter scan in the next section such that for each benchmark point, ∆χ 2 (m 2 , c θ ) defined below is less than 5.99, which corresponds to deviations of S and T parameters within 95% C.L.: where the ∆O 0 i denote the central values in Eq. (21) and (σ 2 ) ij ≡ σ i ρ ij σ j , with σ i being the error in S or T as indicated in Eq. (21). One can observe from Fig. 1 in Ref [24] that in general the upper limit for sin 2 θ extracted from EWPO is more stringent than the bound obtained from the Higgs global fit, with a limit changing from 0.12 for m 2 = 250 GeV to 0.04 for m 2 = 950 GeV.
The character of EWPT is understood in terms of the finite-temperature effective po- However, the fact that the standard derivation of V T =0 ef f suffers from gauge dependence is well known which is discussed in depth in Ref. [42]. Here we employ a hightemperature expansion to restore the gauge independence in our analysis (see Ref. [43] for details). In such a case, we include in our finite temperature effective potential the T = 0 tree level potential and the gauge-independent thermal mass corrections to V T =0 ef f , which are crucial to restore electroweak symmetry at high temperature. In this limit, the a 1 and b 3 parameters will generate a tree-level barrier between the broken and unbroken electroweak phases, thereby allowing for a first-order EWPT. We also note that the presence of the a 2 term may also strengthen the first order transition, as discussed in Ref. [10]. In the hightemperature limit, we follow Ref. [10,44] and write the T -dependent, gauge-independent (indicated by the presence of a bar) vevs in a cylindrical coordinate representation as withv(T = 0) = v 0 andx(T = 0) = x 0 . The critical temperature T c is defined as the temperature at which the broken and unbroken phases are degenerate, i.e.
Once the critical temperature is found, one is able to evaluate the quenching effect of the sphaleron transitions in the broken electroweak phase (see, e.g., Ref. [45]), which is related to the energy of the electroweak sphaleron that is proportional to the vev of SU(2) L doubletv(T ). A first-order EWPT is strong when the quenching effect is sufficiently large, and the criterion is approximately given by: To select the benchmarks parameter points for the collider simulation, we perform a scan over the parameters a 1 , b 3 , x 0 , b 4 , and λ within the following ranges: while the remaining parameters are fixed from the input values of v 0 = 246 GeV and m h = 125 GeV. Our lower bound on quartic couplings b 4 and λ guarantees tree-level vacuum stability. We also require a naïve perturbativity bound on the Higgs portal coupling a 2 /2 5. For each set of randomly chosen parameters, we calculate c θ , m 2 , and λ 211 , and only keep the points that satisfy all the phenomenological constraints mentioned in the previous section (Higgs signal rate, LHC search for heavy Higgs h 2 , and EWPO). We then pass these sets of parameters into the CosmoTransitions package [46] and numerically evaluate all the quantities related to the EWPT, such as critical temperature, sphaleron energy, tunneling rate into the electroweak symmetry-broken phase, using as an input the xSM finite temperature effective potential in the high-temperature limit. Finally, we only keep the sets of parameters that satisfy the strong first-order EWPT criterion defined above and also have a sufficient tunneling rate to prevent the universe from remaining in a metastable vacuum.
From the randomly chosen parameters satisfying the foregoing requirements, we identify benchmark points with maximum and minimum signal rate from 11 consecutive h 2 mass windows of width 50 GeV ranging from 300 to 850 GeV. The upper bound of m 2 = 850 GeV is obtained by the observation that we did not find a choice of parameters for m 2 larger than 850 GeV that give a SFOEWPT, even though our scan in m 2 reaches one TeV. We list all the benchmark points in Tables I and II. We would like to mention that the benchmark points B3 and B4 for maximum signal rate in Table I has already been excluded by the CMS h 2 → ZZ search [33], but we retain them here to make contact with the results of previous studies for comparison. In contrast, the new ATLAS and CMS limits on resonant di-Higgs production in the bbτ τ channel [17,18] do not yet appear to constrain the SFOEWPT-viable parameter space.

A. Reproduction of 13 TeV LHC results
For the signal process, the h 2 mass is varied from 300 GeV to 1500 GeV in steps of 100 GeV. For the background processes, we generate pp → 4b and pp → tt with top quarks decaying hadronically. We follow the ATLAS resolved analysis in Ref. [47], and reproduce the signal efficiency and background distributions in Figs. 1 and 2  The default CMS DELPHES card is used rather than the ATLAS DELPHES card as it better approximates the b-tagging and jet reconstruction performance. Jets are constructed using the anti-k t clustering algorithm with a radius parameter R set to 0.4, and the efficiency for a b-quark-initiated jet to pass the b-tagging requirements is parameterized as a function of the jet transverse momentum p T in a manner corresponding to an average 70% efficiency working point described in Ref. [49]. (This is the default setting in the DELPHES CMS card).
Benchmark cos θ The selection criteria for the ATLAS analysis are as follows: • Events must have at least four b-tagged jets with p T > 40 GeV and |η| < 2.5. If the number of b-tagged jets is greater than four, the four jets with the highest p T are selected to reconstruct two dijet systems in each event.
• Two dijet systems are formed using the selected b-tagged jets. The two jets in each dijet system are required to have ∆R < 1.5 and the transverse momentum of the leading (subleading) dijet system must be greater than 200 (150) GeV.
• The leading and subleading dijet systems must satisfy the following set of requirements depending on the reconstructed invariant mass (m 4j ) of the four selected b-tagged jets: The central values for m lead 2j and m subl 2j in the above equation are somewhat lower than in the ATLAS analysis [47] to account for differences in the treatment of jets in DELPHES compared to the ATLAS simulation.
The acceptance times efficiency values for signal events with m 2 ranging from 500 to 1000 GeV are compared with the ATLAS results in Fig. 1. Overall, the signal region efficiencies obtained in this analysis agree well with those from Ref. [47].
The background event yields in the signal region are summarized in Table III. In addition to the yields from 4b and tt production, the contribution from bbcc production with the cquark jets passing the b-tagging requirements is estimated assuming that the kinematic distributions of jets in bbcc events are similar to those of 4b events: where N 4b is the estimated number of QCD 4b events, σ bbcc and σ 4b are parton level cross sections for bbcc and 4b processes, tag   • Dijet systems are formed such that the separation ∆R jj between the two jets satisfies the following requirements: • If more than one pair of dijet systems satisfies this constraint, the pair with the smallest variable D h 1 h 1 is selected with In order to optimize the separation between signal and background events, the analysis in this paper relies on a BDT trained on half of the simulated signal and background events and validated with the other half. Separate training is performed for each benchmark point studied. The kinematic quantities included in the training of the BDT are Among those variables, ∆R lead jj , ∆R subl jj , and m 4j are consistently ranked high in terms of discrimination power for all benchmark points. To derive the optimal sensitivity, BDT score distributions for signal and background events are rebinned such that each bin contributes the maximum S/ √ B (S and B are the numbers of signal and background events in that bin), starting from the bin with the highest BDT score where the signal contributes the most.
This rebinning also requires a minimum of ten background events per bin to minimize the impact of statistical fluctuations. As an illustration, the rebinned BDT score distributions for two benchmark points are shown in Fig. 3.
The production cross sections and the efficiencies of backgrounds before the BDT selection are summarized in Table IV    probability that the background-only model yields an observed number of events at least as large as the expectation for the signal plus background model, is then translated into the corresponding N σ Gaussian significance. As a test of the statistical analysis, it was verified that the 95% upper limit on the cross section as a function of resonance mass derived from our emulation of the 13 TeV ATLAS analysis (discussed in Sec. IV A) agrees with the results from Ref. [47] within 10% for h 2 masses up to 750 GeV and within 20% up to 850 GeV.
The slight deviation at higher mass may be due to the use of the asymptotic formula which is known to produce upper limits that are too aggressive for the low number of expected events at high mass with only 3.2 fb −1 of luminosity.
The significance N σ as a function of resonance mass is shown in Fig. 4, where the upper and lower boundaries of the band correspond to the influence of uncertainties in the produc-tion cross sections for the 4b and tt backgrounds as given in Table IV. The two boundaries   are obtained by coherently changing the number of events for the two backgrounds by the   1σ uncertainties listed in Table IV,  The significance is compared to that obtained with the same method for the bbγγ and 4τ channels at the 14 TeV HL-LHC [23] and for the bbW W channel at the 13 TeV LHC [24] in Fig. 5. We only compare the benchmark points from BM3 to BM11 because the first two BM points are different from those in Ref. [23]. We find that for a heavy Higgs mass m 2 less than 500 GeV, the bbγγ channel is the most sensitive channel in the search for a resonant di-Higgs signal. Moreover, the 4b channel is competitive with the bbγγ channel, which could serve as a complementary check if a signal is observed in the bbγγ channel.
However, for m 2 larger than 500 GeV the 4b channel provides better sensitivity than the bbγγ or 4τ channels but not as good as the bbW W * channel [24]. We note, however, that the analysis given in Ref. [24] employs a novel Heavy Mass Estimator (HME) and assumptions that the systematic uncertainties will be improved compared to those quoted in the recent CMS bbW W * analysis [16] that did not implement the HME. These differences may account for the stronger projected limits given in Ref. [24] than one would infer by rescaling the results in Ref. [16] by the improved statistics expected for the HL-LHC [57]. We also note that Ref. [24] assumes an ATLAS-CMS combination, thereby doubling the number of events.
We do not make such an assumption in the present study.

V. CONCLUSION
Investigating the thermal history of EWSB is important for determining whether or not the cosmic matter-antimatter asymmetry was generated through EWBG. Monte Carlo simulations indicate that the EWSB transition is cross over in the minimal SM, given the the observed Higgs mass. In this context of a SM-only universe, EWBG cannot occur.
However, introducing new scalar degrees of freedom can change the behavior of thermal obtained from Ref. [23] whereas those for the bbW W * channel are obtained from Ref. [24].
effective potential and make the SFOEWPT possible during the EWSB era. Adding a real scalar singlet is one of the simplest ways to extend the SM -yielding the xSM -and realize this possibility. Previous studies have demonstrated the existence of a strong correlation between an enhanced coupling of a heavy singlet-like scalar to a SM-like di-Higgs pair and the occurrence of a SFOEWPT in the xSM parameter space. Therefore, there exists strong motivation to search for resonant production of heavy singlet-like scalar that decays to SM-like di-Higgs state as a probe of SFOEWPT in the xSM.
In this paper, we focused on the possibility of discovering at the HL-LHC a resonant gluon fusion production of the heavy singlet-like scalar in the xSM that decays into a pair of SMlike Higgs with a four b-quark final state. The four b-quark final state is a promising channel, given its large branching ratio, but it also suffers from a significant QCD background. In analyzing this process, we first validated our simulation against the ATLAS 13 TeV cutbased analysis, then implemented the BDT, a multi-variable analysis method to help to classify signal and background events for the HL-LHC. We selected 11 benchmark points for both maximum and minimum di-Higgs signal rates that yield a SFOEWPT and that satisfy all the theoretical and phenomenological bounds for a heavy singlet-like scalar mass in successive 50 GeV energy bins ranging from 300 to 850 GeV. We then analyzed the signal significance for the 14 TeV HL-LHC with a luminosity of 3 ab −1 . We also compared the results with earlier projections for the bbγγ and 4τ channels and find that for the mass of the singlet-like scalar larger than 500 GeV, the significance for the 4b channel is superior to both of these other channels. For heavy singlet-like scalar mass less than 500 GeV, the significance for the 4b state with maximum signal rate can be larger than 5. This significance is comparable to that of the bbγγ final state, and is somewhat better than that projected for the 4τ final state. While our projection for the reach using the 4b channel is somewhat below that for the bbW W * channel as analyzed in Ref. [24], the latter work utilized a new Heavy Mass Estimator and assumptions about future reductions in systematic uncertainties that await validation with new data. Thus, inclusion of the 4b channel in a comprehensive search strategy that also includes the bbγγ, 4τ , and bbW W * channels is strongly motivated.
In terms of exclusion, we find that for the future 14 TeV HL-LHC, one can exclude the mass of a heavy singlet-like scalar up to around 680 GeV for the benchmark points with maximum signal rate. However, a signal in the case of the minimum signal rate benchmark points is far from being excluded. Therefore in this sense, a future 100 TeV pp collider may be required to fully exclude the possibility of generating SFOEWPT in the xSM.

Appendix B: Distributions of BDT variables
We plot the signal and background distributions of kinematic variables used in the BDT analysis here. The signal is taken to be the benchmark point B7 in Table I.