Search for two Higgs bosons in final states containing two photons and two bottom quarks in proton-proton collisions at 8 TeV

A search is presented for the production of two Higgs bosons in final states containing two photons and two bottom quarks. Both resonant and nonresonant hypotheses are investigated. The analyzed data correspond to an integrated luminosity of 19.7  fb^(−1) of proton-proton collisions at √s = 8  TeV collected with the CMS detector. Good agreement is observed between data and predictions of the standard model (SM). Upper limits are set at 95% confidence level on the production cross section of new particles and compared to the prediction for the existence of a warped extra dimension. When the decay to two Higgs bosons is kinematically allowed, assuming a mass scale Λ_R = 1  TeV for the model, the data exclude a radion scalar at masses below 980 GeV. The first Kaluza-Klein excitation mode of the graviton in the RS1 Randall-Sundrum model is excluded for masses between 325 and 450 GeV. An upper limit of 0.71 pb is set on the nonresonant two-Higgs-boson cross section in the SM-like hypothesis. Limits are also derived on nonresonant production assuming anomalous Higgs-boson couplings.


I. INTRODUCTION
The discovery of a boson with a mass of approximately 125 GeV, with properties close to those expected for the Higgs boson (H) of the standard model (SM) [1,2], has stimulated interest in the exploration of the Higgs potential. The production of a pair of Higgs bosons (HH) is a rare process that is sensitive to the structure of this potential through the self-coupling mechanism of the Higgs boson. In the SM, the cross section for the production of two Higgs bosons in proton-proton (pp) collisions at 8 TeV is 10.0 AE 1.4 fb for the gluon-gluon fusion process [3][4][5], which lies beyond the reach of analyses based on the first run of the CERN LHC.
Many theories beyond the SM (BSM) suggest the existence of heavy particles that can couple to a pair of Higgs bosons. These particles could appear as a resonant contribution in the invariant mass of the HH system. If the new particles are too heavy to be observed through a direct search, they may be sensed in the HH production through their virtual contributions (as shown, e.g., in Refs. [6,7]); also, the fundamental couplings of the model can be modified relative to their SM values (as shown, e.g., in Refs. [8,9]); in both cases, a nonresonant enhancement of the HH production could be observed.
Models with a warped extra dimension (WED), as proposed by Randall and Sundrum [10], postulate the existence of one spatial extra dimension compactified between two fixed points, commonly called branes. The region between the branes is referred to as bulk, and controlled through an exponential metric. The gap between the two fundamental scales of nature, such as the Planck scale (M Pl ), and the electroweak scale, is controlled by a warp factor (k) in the metric, which corresponds to one of the fundamental parameters of the model. The brane where the density of the extra dimensional metric is localized is called "Planck brane," while the other, where the Higgs field is localized, is called "TeV brane." This class of models predicts the existence of new particles that can decay to a Higgs-boson pair, such as the spin-0 radion (R) [11][12][13], and the spin-2 first Kaluza-Klein (KK) excitation of the graviton [14][15][16].
There are two possible ways of describing a KK graviton in WED that depend on the choice of localization for the SM matter fields. In the RS1 model, only gravity is allowed to propagate in the extradimensional bulk. In this model the couplings of the KK graviton to matter fields are controlled by k=M Pl [10], with the reduced Planck massM Pl defined by M Pl = ffiffiffiffiffi ffi 8π p . For the possibility of SM particles to propagate in the bulk (the so-called bulk-RS model), the coupling of the KK graviton to matter depends on the choice for the localization of the SM bulk fields. This paper uses the phenomenology of Ref. [17], where SM particles are allowed to propagate in the bulk, and follows the characteristics of the SM gauge group, with the righthanded top quark localized on the TeV brane (so-called elementary top hypothesis).
The R is an additional element of WED models that is needed to stabilize the size of the extra dimension l. It is usual to express the benchmark points of the model in terms of the dimensionless quantity k=M Pl , and the mass scale Λ R ¼ ffiffi ffi 6 p exp½−klM Pl , with the latter interpreted as the ultraviolet cutoff of the model [18]. The addition of a scalar-curvature term can induce a mixing between the scalar radion and the Higgs boson [18,19]. This possibility is discussed, for example, in Ref. [20]. Precision electroweak studies suggest that this mixing is expected to be small [21]. In our interpretations of the constraints we neglect the possibility of Higgs-radion mixing. On one hand, the choice of localization of the SM matter fields for the KK-graviton resonance impacts the kinematics of the signal and drastically modifies the production and decay properties [22]. The physics of the radion, on the other hand, does not depend much on the choice of the model [18], which obviates the need to distinguish the RS1 and bulk-RS possibilities.
Models with an extended Higgs sector also predict one spin-0 resonance that, when sufficiently massive, decays to a pair of SM Higgs bosons, and corresponds to an additional Higgs boson. Examples of such models are the singlet extension [23], the two Higgs doublet models [24] (in particular, the minimal supersymmetric model [25,26]), and the Georgi-Machacek model [27]. The majority of these models predict that heavy scalar production occurs predominantly through the gluon-gluon fusion process. The Lorentz structure of the coupling between the scalar and the gluon is the same for a radion or a heavy Higgs boson. Therefore the models for the production of a radion or an additional Higgs boson are essentially the same, provided the interpretations are performed in a parameter space region where the spin-0 resonance is narrow. The results of this paper can therefore be easily applied to constrain this class of models.
Phenomenological explorations of the two-Higgs-boson channel were studied prior to the observation of the Higgs boson [28], and, since then, other studies have become available [29][30][31][32][33][34][35]. Most of these indicate that in BSM physics an enhancement of the HH production cross section is expected, together with modified signal kinematics for the HH final state. This paper describes a search for the production of pairs of Higgs bosons in the γγbb final state in pp collisions at the LHC, using data corresponding to an integrated luminosity of 19.7 fb −1 collected by the CMS experiment at ffiffi ffi s p ¼ 8 TeV. Both nonresonant and resonant production are explored, with the search for a narrow resonance X conducted at masses m X between 260 and 1100 GeV. The fully reconstructed γγbb final state discussed in this paper combines the large SM branching fraction (B) of the H → bb decay with the comparatively low background and good mass resolution of the H → γγ channel, yielding a total BðHH → γγbbÞ of 0.26% [36]. The search exploits the mass spectra of the diphoton (m γγ ), dijet (m jj ), and the four-body systems (m γγjj ), as well as the direction of Higgs bosons in the Collins-Soper frame [37], to provide discrimination between production of two Higgs bosons and SM background.
A search in the same final state was performed by the ATLAS collaboration [38]. Complementary final states such as HH → bbbb, HH → ττbb, and HH to multileptons and multiphotons were also explored by the ATLAS [39,40] and CMS [41][42][43][44] collaborations. This paper is organized as follows: Section II contains a brief description of the CMS detector. In Sec. III we describe the simulated signal and background event samples used in the analysis. Section IV is dedicated to the discussion of event selection and Higgs-boson reconstruction. The signal extraction procedure is discussed in Sec. V. In Sec. VI we present the systematic uncertainties impacting each analysis method. Section VII contains the results of resonant and nonresonant searches, and Sec. VIII provides a summary.

II. THE CMS DETECTOR
The CMS detector, its coordinate system, and main kinematic variables used in the analysis are described in detail in Ref. [45]. The detector is a multipurpose apparatus designed to study physics processes at large transverse momentum p T in pp and heavy-ion collisions. The central feature of the apparatus is a superconducting solenoid, of 6 m internal diameter, providing a magnetic field of 3.8 T. A silicon pixel and strip tracker covering the pseudorapidity range jηj < 2.5, a crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL) reside within the field volume. The ECAL is made of lead tungstate crystals, while the HCAL has layers of plates of brass and plastic scintillator. These calorimeters are both composed of a barrel and two endcap sections and provide coverage up to jηj < 3.0. An iron and quartz-fiber Cherenkov hadron calorimeter covers larger values of 3.0 < jηj < 5.0. Muons are measured in the jηj < 2.4 range, using detection planes based on three technologies: drift tubes, cathode strip chambers, and resistive-plate chambers.
The first level of the CMS trigger system, composed of special hardware processors, uses information from the calorimeters and muon detectors to select the most interesting events in a time interval of less than 4 μs. The highlevel trigger (HLT) processor farm further decreases the event rate from around 100 kHz to less than 1 kHz, before data storage.

III. SIMULATED EVENTS
The MADGRAPH version 5.1.4.5 [46] Monte Carlo (MC) program generates parton-level signal events based on matrix element calculations at leading order (LO) in quantum chromodynamics (QCD), using LO PYTHIA version 6.426 [47] for showering and hadronization of partons. The models provide a description of production through gluon-gluon fusion of particles with narrow width (width set to 1 MeV) that decay to two Higgs bosons, with mass m H ¼ 125 GeV, in agreement with Ref. [48].
Events are generated either for spin-0 radion production, or spin-2 KK-graviton production predicted by the bulk-RS model.
The samples for nonresonant production are generated considering the cross section dependence on three parameters: the Higgs-boson trilinear coupling λ, parametrized as κ λ ≡ λ=λ SM , where λ SM ≡ m 2 H =ð2v 2 Þ ¼ 0.129, with v ¼ 246 GeV being the vacuum expectation value of the Higgs boson; the top Yukawa coupling y t , parametrized as κ t ≡ y t =y t SM , where y t SM ¼ m t =v is the SM value of the top Yukawa coupling, and m t the top quark mass; and the coefficient c 2 of a possible coupling of two Higgs bosons to two top quarks. The first two parameters reflect changes relative to SM values, while the third corresponds purely to a BSM operator. In this parametrization the SM production corresponds to the point κ λ ¼ 1, κ t ¼ 1, and c 2 ¼ 0. The parameters κ λ and c 2 cannot be directly constrained by alternative measurements at the LHC. Therefore we vary these parameters in a wide range: −20 ≤ κ λ ≤ 20 and −3 ≤ c 2 ≤ 3. The range 0.75 ≤ κ t ≤ 1.25 is compatible with constraints from the single Higgs-boson measurements provided in Ref. [49].
The part of the Higgs potential ΔL relevant to two-Higgs-boson production and their interactions with the top quark can be expressed as in Ref. [50], where t L and t R are the top quark fields with left and right chiralities, respectively, and H is the physical Higgsboson field. Besides being used to predict SM single-Higgs-boson production, the MC predictions for the background processes are used also in comparisons with data, to optimize the selection criteria, and for checking backgroundestimation methods based on control samples in data. The dominant background, originating from events with two prompt photons and two jets in the final state, is generated at next-to-leading order (NLO) in QCD using SHERPA version 1.4.2 [51]. Multijet production with or without a single-prompt photon represents a subdominant background, and is generated with the PYTHIA 6 package. Other minor backgrounds, including Drell-Yan (pp → Z=γ Ã → e þ e − ), SM Higgs-boson production with jets, as well as vector boson and top quark production in association with photons, are generated using MADGRAPH and PYTHIA 6, or the generator POWHEG version 1.0 [52][53][54] at NLO in QCD. The generated events are processed through GEANT4-based [55,56] detector simulation.

IV. EVENT RECONSTRUCTION
The events are selected using two complementary HLT paths requiring two photons. The first trigger requires an identification based on the energy distribution of the electromagnetic shower and loose isolation requirements on photon candidates. The second trigger applies tighter constraints on the shower shape, but a looser kinematic selection. The trigger thresholds on the p T range between 26 and 36 GeV,and between 18 and 22 GeV, respectively, for photons with highest (leading) and next-to-highest (subleading) p T , with specific choices that depend on the instantaneous LHC luminosity. The HLT paths are more than 99% efficient for the selection criteria used in this analysis [57].
A. The H → γγ candidate Photon candidates are constructed from clusters of energy in the ECAL [58,59]. They are subsequently calibrated [60] and identified through a cutoff-based approach (referred to as "cut-based analysis" in Ref. [57]). The identification criteria include requirements on p T of the electromagnetic shower, its longitudinal leakage into the HCAL, its isolation from jet activity in the event, as well as a veto on the presence of a track matching the ECAL cluster. These criteria provide efficient rejection of objects that arise from jets or electrons but are reconstructed as photons. Both photons are required to be within the ECAL fiducial volume of jη γ j < 2.5. Small transition regions between the ECAL barrel and the ECAL endcaps are excluded in this analysis, because the reconstruction of a photon object in this region is not optimal.
The directions of the photons are reconstructed assuming that they arise from the primary vertex of the hard interaction. However on average ≈20 additional pp interactions (pileup) occur in the same or neighboring pp bunch crossings as the main interaction. Many additional vertexes are therefore usually reconstructed in an event using charged particle tracks. We assume that the primary interaction vertex corresponds to the one that maximizes the sum in p T 2 of the associated charged particle tracks. For the simulated signal, it is shown that this choice of vertex lies within 1 cm of the true hard-interaction vertex in 99% of the events. With this choice for energy reconstruction and vertex identification, the diphoton mass resolution remains close to 1 GeV independent of the signal hypothesis.
Diphoton candidates are preselected by requiring 100 < m γγ < 180 GeV. The two photons are further required to satisfy the asymmetric selection criteria p γ1 T =m γγ > 1=3 and p γ2 T =m γγ > 1=4, where p γ1 T and p γ2 T are the transverse momenta of the leading and subleading photons. The use of different p T thresholds scaled by the diphoton invariant mass minimizes turn-on effects that can distort the distribution at the low-mass end of the m γγ spectrum. If there is more than one diphoton candidate selected through the above requirements, the pair with the largest scalar sum in the p T of the two photons is chosen for analysis.
B. The H → bb candidate The Higgs-boson candidate decaying into two b quarks is reconstructed following a procedure similar to that used in CMS searches for SM Higgs bosons that decay to b quarks [61].
The particle-flow event algorithm reconstructs and identifies each individual particle (referred to as candidates) with an optimized combination of information from the various elements of the CMS detector [62,63]. Then the anti-k T algorithm [64] clusters particle-flow candidates into jets using a distance parameter D ¼ 0.5. Jets are required to be within the tracker acceptance (jη j j < 2.4), and separated from both photons through a condition on the angular distance in η × ϕ space of ΔR γj ≡ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ffi ðΔηÞ 2 þ ðΔϕÞ 2 p > 0.5, where ϕ is the azimuth angle in radians. The jet energy is corrected for extra depositions from pileup interactions, using the jet-area technique [65] implemented in the FASTJET package [66]. Jet energy corrections are applied as a function of η j and p j T [67,68]. Identification criteria are applied to reject detector noise misidentified as jets, and the procedure is verified using simulated signal.
The identification of jets likely to have originated from hadronization of b quarks exploits the combined secondary vertex (CSV) b quark tagger [69]. This algorithm combines the information from track impact parameters and secondary vertexes within a given jet into a continuous output discriminant. Jets with CSV tagger values above some fixed threshold are considered as b tagged. The working point chosen in this analysis corresponds to an efficiency, estimated from simulated multijet events, of ≈70% and a mistag rate for light quarks and gluons of 1%-2%, depending on jet p T . This efficiency and the mistag rate are measured in data samples enriched in b jets (e.g., in tt events). Correction factors of ≈0.95 are determined from data-to-simulation comparisons and applied as weights to all simulated events.
Events are kept if at least two jets are selected and at least one of them is b tagged. To improve signal sensitivity, events are subsequently classified in two categories: events with exactly one b-tagged jet (medium purity) and events with more than one b-tagged jet (high purity). In the former category, the H → bb decay is reconstructed by pairing the b-tagged jet with a non-b-tagged jet, while in the latter category a pair of b-tagged jets is used. In both cases, when multiple pairing possibilities exist for the Higgs-boson candidate, the dijet system with largest p T is retained for further study. For medium-and high-purity simulated signal events, this procedure selects the correct jets in more than 80% and more than 95%, respectively.
The resolution in m jj improves from 20 GeV for m X ¼ 300 GeV to 15 GeV for m X ¼ 1 TeV in the high-purity category, and from 25 GeV for m X ¼ 300 GeV to 15 GeV for m X ¼ 1 TeV in the medium-purity category. In the search for a low-mass resonance, the dijet mass resolution is improved using a multivariate regression technique [61] that uses the global information from the events as well as the particular properties of each jet, in an attempt to identify the semileptonic decays of B mesons and correct for the energy carried away by undetected neutrinos. The relative improvement in resolution is typically 15%. For the highmass analysis and nonresonant analysis the m jj resolution is better than for low-mass analysis. The improvement provided by the regression technique was found to be very limited. Therefore in those cases no regression was used.
Independent of whether a search involves the usage of jet energy regression, all jets are required to have p j T > 25 GeV. Finally, we require that 60 < m jj < 180 GeV.

C. The two-Higgs-boson system
The object selections discussed thus far are summarized in Table I.
In each category, two Higgs bosons are obtained by combining the diphoton and the dijet boson candidates. To improve the resolution in m γγjj , an additional constraint is imposed requiring m jj to be consistent with m H . This is achieved by modifying the jet 4-momenta using multiplicative factors. The value of each factor is obtained event by event through a χ 2 minimization procedure where the size of the denominator is defined by the estimated resolution for each jet [70]. The procedure, similar to the one used in Ref. [70], is referred to as a kinematic fit and the resulting four-body mass is termed m kin γγjj . The scattering angle, θ CS HH , is defined in the Collins-Soper frame of the four-body system state, as the angle between the momentum of the Higgs boson decaying into two photons and the line that bisects the acute angle between the colliding protons. In the Collins-Soper frame, the two-Higgs-boson candidates are collinear, and the choice of the one decaying to photons as reference is therefore arbitrary. Using the absolute value of the cosine of this angle, j cos θ CS HH j, obviates this arbitrariness.

D. Backgrounds
The SM background in m γγ can be classified into two categories: the nonresonant background, from multijet and electroweak processes, and a peaking background corresponding to events from single Higgs bosons decaying to two photons.
After the baseline selections of Table I, the dominant nonresonant background with two prompt photons and more than two extra jets, referred to as γγþ ≥ 2 jets, represents ≈75% of the total background. The nonresonant background with one prompt photon and a jet misidentified as a photon as well as more than two extra jets, referred to as γjetþ ≥ 2 jets, represents in turn ≈25%. The background from two jets misidentified as photons is negligible.
The remaining nonresonant and resonant backgrounds contribute much less than 1% to the total. They represent associated production of photons with top quarks or single electroweak bosons decaying to quarks, and Drell-Yan events with their decay electrons misidentified as photons. The resonant backgrounds correspond to different SM processes contributing to single-Higgs-boson production.
All nonresonant backgrounds are estimated from data, and the resonant background from SM single-Higgs-boson production in different channels is taken from the MC simulation normalized to NLO or next-to-NLO (NNLO) production cross sections, whichever are available [36].
The comparison between data and MC predictions is provided in Fig. 1. The γγ=γjetþ ≥ 2 jets background is normalized to the total integral of data in the signal free region, defined by the condition m γγ > 130 or m γγ < 120 GeV in addition to the selections of Table I.

V. ANALYSIS METHODS
In the final step, this analysis exploits kinematic properties of the final state to discriminate either the resonant or nonresonant signal from SM background: the Higgs boson masses m γγ and m jj , the cosine of their scattering angle j cos θ CS HH j, and the mass of the two-Higgs-boson system, m kin γγjj . Distributions in these variables are shown for different signal assumptions in Fig. 2. The signal peaks in m γγ and m jj are shown in Figs. 2(a) and 2(b). The corresponding distributions for the QCD background are smoothly varying over the shown ranges. The j cos θ CS HH j is rather uniform for signal, as shown in Fig. 2(c), while it peaks toward 1 for background. Finally, a resonant signal appears as a narrow Reconstructed spectra for data compared to the γγ=γjetþ ≥ 2jets background after the selections described in Table I (selections on photons and jets and a requirement of at least one b-tagged jet): (a) m γγ , (b) m jj , (c) j cos θ CS HH j, and (d) m kin γγjj . The hatched area corresponds to the bin-by-bin statistical uncertainties on the background prediction reflecting the limited size of the generated MC sample. The comparison is provided for illustrative purpose, the backgrounds, except the one coming from single-Higgs-boson production, are evaluated from a fit to the data without reference to the MC simulation. peak in the m kin γγjj spectrum, while the nonresonant signal has a broad contribution as shown in Fig. 2(d).
The dominant background from nonresonant production of prompt photons and jets exhibits a kinematic peak around m kin γγjj ≈ 300 GeV followed by a slowly falling tail at high m kin γγjj . In the resonant case, we consider two strategies, one for m X close to the kinematic peak, and one for m X heavier than the kinematic peak. A third strategy is considered for the nonresonant case, since the signal distribution as a function of m kin γγjj is broad. In all cases a categorization is used based on the number of b-tagged jets. All the strategies are summarized in Table II and briefly described below.
(1) Resonant search in the low-mass region (260 ≤ m X ≤ 400 GeV): the events are selected in a narrow window around the m X hypothesis in the m kin γγjj spectrum, and the signal is identified simultaneously in the m γγ and m jj spectra. This approach avoids a direct search for a resonance in the m kin γγjj spectrum near the top of the kinematic peak of the SM background.
(2) Resonant search in the high-mass region (400 ≤ m X ≤ 1100 GeV): the events are selected in a window around m H in both the m γγ and m jj spectra, and the signal is identified in the m kin γγjj spectrum.
(3) Nonresonant search: a selection is applied in the j cos θ CS HH j variable to reduce the background. In addition to the categorization in the number of b-tagged jets, a categorization is applied in m kin γγjj by defining a high-mass region and a low-mass region. The signal is identified simultaneously in the m γγ and m jj spectra. The nonresonant background is described through different functions such as exponentials, power law, or polynomials in the Bernstein basis [57]. When the search is performed simultaneously in the diphoton and dijet mass spectra, these functions are used to construct a two-dimensional (2D) probability density (PD) for the background in each category, following an approach similar to Ref. [71]. Otherwise, a one-dimensional (1D) PD is used. In all cases, FIG. 2. Simulated spectra for the spin-0 radion signal at m X ¼ 300 and 700 GeV, and for some values of the anomalous couplings, compared to SM Higgs-boson production and QCD background, after the selections described in Table I (selections on photons   we choose the background PD to minimize the bias on signal. The bias is always found to be at least a factor of 7 smaller than the statistical uncertainty in the fit, and can be safely neglected [1]. In each invariant mass distribution used to identify the signal, the signal PD is modeled, following the same approach as in Ref. [57], through the sum of a Gaussian function and a crystal ball (CB) function [72], using the parameters extracted from fits to MC simulations. The resolution parameters in both functions are kept independent, σ G x for the Gaussian and σ CB x for the CB function, but in the fits to each of the channels (x ¼ γγ; jj; γγjj), we let the μ parameter for both the Gaussian and the CB component float, which provides three independent μ x values.
Finally, we consider the contribution from SM single-Higgs-boson production in 2D searches. The gluon-gluon and vector-boson fusion processes are modeled in m γγ by a sum over Gaussian and CB functions, and through a constant term in m jj . The associated production of vector bosons that subsequently decay to jets, and the SM single Higgs bosons are modeled in the same way as the signal. The parameters of the distribution are extracted from a fit to the MC simulation.   The total PD used for signal extraction corresponds to a sum over separate PD contributions from the signal component, single-Higgs boson production, and nonresonant backgrounds. We also verify that 2D PD functions can be considered as uncorrelated between m γγ and m jj within the statistical uncertainties. To obtain this result we calculated the correlation in data. The uncertainty in the correlation was estimated by generating pseudoexperiments from a model assuming no correlation between m γγ and m jj and calculating the root mean square of the resulting distribution.

A. Low-mass resonant
In addition to the preselections summarized in Table I, each mass hypothesis has a selection applied on m kin γγjj in a narrow window around m X . The window sizes increase with m X to account for the increasing experimental resolution from Δm X ¼ AE10 GeV at m X ¼ 260 GeV to Δm X ¼ þ31 −20 GeV at m X ¼ 400 GeV. A possible signal can be extracted from data using a simultaneous fit to the m γγ and m jj spectra. The sensitivity to the signal in this search is increased through the b jet energy regression that improves the resolution of the signal in m jj . The background-only PD is a first-order polynomial in the Bernstein basis and a power law in the medium-and high-purity categories, respectively, as shown in Fig. 3, together with their 68% and 95% confidence level (C.L.) contours for the selection optimized for the search with m X ¼ 300 GeV, 290 < m kin γγjj < 310 GeV. As a cross-check, two alternative signal extraction techniques are tested. In one, a selection is performed in the m jj spectrum, and the signal extracted in the m γγ spectrum. In the other, a selection is performed in the m jj spectrum and the m γγjj spectrum is exploited, using a normalization extracted from sidebands in the m γγ spectrum. The two procedures give compatible results within the statistical uncertainties.

B. High-mass resonant
In addition to the requirements in Table I, selections are applied on m γγ and m jj , as summarized in Table III. A possible signal can be extracted from a fit to the m kin γγjj distribution for mass points between 320 ≤ m kin γγjj ≤ 1200 GeV. The background-only PD is a power law for each category, and is seen to well describe the data in Fig. 4. The lower threshold of 320 GeV is chosen to avoid the kinematic turn-on, while still ensuring full containment of signal for the m X ≥ 400 GeV mass hypotheses. Single-Higgs-boson production is a negligible background in this phase space region, and is absorbed into the parametrization of the nonresonant background.

C. Nonresonant
We apply a selection on j cos θ CS HH j in the search for nonresonant two-Higgs-boson production. To increase the sensitivity to a large variety of BSM topologies (see examples shown in Fig. 2), an additional categorization is applied in m kin γγjj . For the SM-like topology in gg → HH production, the m kin γγjj spectrum peaks roughly at 400 GeV, while for jκ λ j ≳ 10 the peak shifts down to the kinematic threshold of m kin γγjj ≈ 250 GeV. Large values of the c 2 (jc 2 j ≈ 3) parameter usually lead to an opposite effect by shifting the peak in the m kin γγjj spectrum above 400 GeV. Two categories are defined for m kin γγjj smaller or larger than 350 GeV, a value optimized for the SM-like search. The details of the selections and categorizations are provided in Table IV. Events / 10 GeV A possible signal can be extracted using a simultaneous fit to the m γγ and m jj spectra. The backgroundonly PDs are exponentials and power-law expressions for the medium-and high-purity categories, respectively, which agree with the data, as can be seen in Figs. 5 and 6.

D. Signal efficiency
The signal efficiency is a function of the mass hypothesis, as shown in Fig. 7. It is estimated with respect to all events generated in a given signal sample. The efficiency increases as the resonance mass increases from m X ¼ 260 to 900 GeV because of higher photon and jet reconstruction efficiencies. The efficiency starts to drop for m X > 900 GeV. At this point, the typical angular distance in the laboratory frame between two b quarks produced in Higgs-boson decay is of the order of the distance parameter D [73]. The minimum in efficiency is observed at m X ¼ 300 GeV. It results from an optimization procedure designed to maximize the overall analysis sensitivity. This procedure chooses an optimal size of m kin γγjj window for each m X hypothesis. For m X ¼ 300 GeV, the background is largest and the resulting m kin γγjj window is smallest, inducing a small drop in signal selection efficiency.   Finally, the single and double b tag categories contribute in roughly equal ways to the total efficiency. Figure 8 provides the efficiencies of selecting the signal events as a function of κ λ for different values of κ t and assuming c 2 ¼ 0. The left plot provides efficiencies for m kin γγjj < 350 GeV categories and right for m kin γγjj > 350 GeV categories. For large absolute values of jκ λ j (typically larger than 10) the efficiency is rather flat, while for small values of jκ λ j the efficiency in the m kin γγjj < 350 GeV (m kin γγjj > 350 GeV) categories is reduced (increased). The change in efficiency is caused by the interference between two-Higgs-boson box diagrams and the Higgs-boson self-coupling channel. The total efficiency in four categories is ≈15%-30%, depending on the model parameters. This figure illustrates the way that m γγjj categorization can help separate different nonresonant signal hypotheses.

VI. SYSTEMATIC UNCERTAINTIES
The analysis defines a likelihood function based on the total PD and the data. The parameters for total signal and for the background-only PD are constrained in the fit to maximize this function. A uniform prior is used to parametrize the background PD. When converting the fitted yields into production cross sections, we use simulations to estimate the selection efficiency for the signal. The difference between the simulation and the data is corrected through scaling factors. The uncertainty in those factors is taken into account through parameters included in the likelihood function. The nuisance parameters (parameters not of immediate interest) are varied in the fit according to a log-normal probability density function. They can be classified into three categories. The first category contains the uncertainty in the estimation of the integrated luminosity, which is taken as 2.6% [74]. The second category includes systematic uncertainties that modify the efficiency of signal selection. Finally the third category contains the uncertainties that impact the signal or the Higgs-boson PD. More precisely, the values of the PD parameters are taken from fits to the MC simulation of signal and Higgs-boson  Table I and Sec. V. The efficiency is shown for a spin-0 hypothesis of a radion particle, but is similar for a spin-2 hypothesis of a KK graviton. The error bars associated with statistical uncertainties are smaller than the size of the markers.   production. The systematic uncertainties affect the mean values and the resolution parameters of the PD, while all other CB parameters are fixed to their best values. The sources of nuisance parameters are described below and their contribution to different categories is presented in Table V. The photon-related uncertainties are discussed in Ref. [57]. While the photon energy scale (PES) is known at the subpercent level in the region of p γ T characteristic of the SM H → γγ signal, the uncertainty increases to 1% for p γ T > 100 GeV. The photon energy resolution (PER) is known with a 5% precision [57]. A 1% normalization uncertainty is estimated in the offline diphoton selection efficiency and in the trigger efficiency. An additional normalization uncertainty of 5% is estimated for the high-mass region to account for differences in the p T spectrum of signal photons and of electrons from Z → ee production used to estimate the quoted uncertainties.
The uncertainty in the jet energy scale (JES) is accounted for by changing the jet response by 1%-2% [68], depending on the kinematics, while the uncertainty in the jet energy resolution (JER) is estimated by changing the jet resolution by 10% [67]. An additional 1% uncertainty in the four-body mass accounts for effects in the high-mass region related to the partial overlap between the two b jets from the Higgs-boson decay. The uncertainty in the btagging efficiency is estimated by changing the b-tagging scale factor up and down by 1 standard deviation in each purity category [69]. The related systematic uncertainties are known to be anticorrelated between the two categories.
Theoretical systematic uncertainties are considered for the single-Higgs-boson contribution from SM production, corresponding to the scale dependence of higher-order terms and impact from the choice of proton parton distribution functions (PDF) [36,75]. No theoretical TABLE V. Summaries of systematic uncertainties. For the normalization uncertainties, the values in the right column indicate the impact on the signal normalization. The uncertainty in the b-tagging efficiency is anticorrelated between the b tag categories. The uncertainty in the m kin γγjj categorization is anticorrelated between m kin γγjj categories for the nonresonant search.

General uncertainties in normalization
Integrated luminosity 2.6% Diphoton trigger efficiency 1.0% Diphoton selection efficiency 1.0% Resonant low-mass and nonresonant analyses: 2D fit to m γγ and m jj Uncertainties in normalization Acceptance in p j T (JES and JER) 1.0% b-tagging efficiency in the high-purity category 5.0% b-tagging efficiency in the medium-purity category Low-mass resonant and nonresonant m kin γγjj < 350 GeV 2.1% Nonresonant m kin γγjj > 350 GeV 2.8% uncertainties are assumed on BSM signals. However, there is one exception. We consider the situation where the kinematic properties of the new signal are identical to those of the SM, but the cross section is different (SM-like search). In that case we parametrize the BSM cross section σ BSM HH by the ratio μ HH ¼ σ BSM HH =σ SM HH . When such a search is performed the theoretical uncertainties on σ SM HH are included in the likelihood. Finally, an additional systematic uncertainty of 0.24 GeV is assigned to account for the experimental uncertainty in the Higgs-boson mass [48]. The impact of this uncertainty is comparable to the one from PES.
The analysis is limited by the statistical precision. The systematic uncertainties worsen the expected cross section limits by at most 1.5% and 3.8% in the resonant and nonresonant searches, respectively.

VII. RESULTS
No significant excess is observed over the background expectation in the resonant or nonresonant searches. Upper limits are computed using the modified frequentist approach for confidence levels, taking the profile likelihood as a test statistic [76,77] in the asymptotic approximation. The limits are subsequently compared to theoretical predictions assuming SM branching fractions for Higgs-boson decays.

A. Resonant signal
The observed and median expected upper limits for all the data at 95% C.L. are shown in the top of Fig. 9, and at the bottom in a zoomed-in view of the low-mass region. The expected limits range from 1.99 fb for m X ¼ 310 GeV to 0.39 fb for m X ¼ 1 TeV. At the transition point between the low-mass and high-mass searches, m X ¼ 400 GeV, results with both methods are provided. An improvement of about 20% is observed from the use of the 2D model approach with respect to the 1D analysis.
The result is compared with the cross sections for KKgraviton and radion production in WED models. The tools used to calculate the cross sections for the production of the KK graviton in the bulk and RS1 models are described in Refs. [78,79]. The implementation of the calculations is described in Ref. [80]. In analogy with the Higgs boson, the radion field is predominantly produced through gluon-gluon fusion [81,82]. The cross section for radion production is calculated at NLO electroweak and next-to-next-to-leading logarithmic QCD accuracy, using the recipe suggested in Ref. [18]. This recipe consists of multiplying the radion cross section based on the fundamental parameter of the theory, Λ R , by a K-factor calculated for SM-like Higgs-boson production through gluon-gluon fusion [36,83]. The calculations are performed for the SM-like Higgs boson with masses up to 1 TeV. We use the CTEQ6L PDF [84] in these calculations. No mixing between a radion and the Higgs boson is considered in this paper.
In Table VI, we summarize the inclusive production cross sections and the branching fractions of the heavy resonances in the theoretical benchmarks we use for interpretation. The absolute values of the production cross sections with ðk=M Pl Þ 2 for the KK graviton [22] and scale with 1=Λ R 2 for the radion [85]. The values for the branching fractions of the resonances in the theory benchmarks do not depend on the fundamental parameters of the theory. The resonance decays have a phase space suppression, related to the mass difference between the resonance and its decay products. In this way, the decay to a Higgs-boson pair is not allowed if m X < 250 GeV nor to top quark pairs if m X < 350 GeV. In Table VI, we see that the value of the branching fraction    . The green and yellow bands represent, respectively, the 1 and 2 standard deviation extensions beyond the expected limit. Also shown are theoretical predictions corresponding to WED models for radions and RS1 KK gravitons. The upper plot with a logarithmic scale for the y axis also provides the prediction for the production cross section of a bulk KK graviton. The vertical dashed line in the upper plot shows the separation between the low-mass and high-mass analyses. The limits for m X ¼ 400 GeV are shown for both methods.

SEARCH FOR TWO HIGGS BOSONS IN FINAL STATES …
PHYSICAL REVIEW D 94, 052012 (2016) changes with the resonance mass from m X ¼ 300 to m X ¼ 500 GeV. The exact pattern of this phenomenon is related to the balance between the different phase space suppressions for decays to HH or to tt, which depends on the model under consideration. The analysis excludes a radion with masses below 980 GeV for the radion scale Λ R ¼ 1 TeV. The search has also a sensitivity to the presence of a radion with an ultraviolet cutoff Λ R ¼ 3 TeV in the region between 200 and 300 GeV.
The difference in total selection efficiency between the spin-0 (radion) and the spin-2 (KK-graviton) models does not exceed 3%. Thus, the same upper limits that are extracted using a radion simulation can be used directly to exclude a KK graviton with masses between 325 and 450 GeV, assuming k=M Pl ¼ 0.2. The analysis is not yet sensitive to the presence of a KK graviton in the bulk scenario with the same parameters.

B. Nonresonant signal
We consider the kinematic properties for a new signal identical to those of the SM, but with a different cross section. The observed and expected upper limits on SM-like gg → HH → γγbb production are, respectively, 1.85 and 1.56 fb. This can be translated into 0.71 and 0.60 pb, respectively, for the total gg → HH production cross section. The results can also be interpreted in terms of observed and expected limits on the scaling factor μ HH < 74 and < 62 þ37 −22 , respectively. This result provides a quantification of the current analysis relative to the SM prediction.
We also interpret the results in the context of Higgsboson anomalous couplings. The cross section for nonresonant two-Higgs-boson production σ BSM HH in this context can be written as a polynomial in the parameters of the theory relative to the SM nonresonant cross section σ SM HH as σ HH σ SM The numerical coefficients of Eq.
(2) can be calculated by fitting cross sections as described in Ref. [86], obtaining thereby Under the assumption that radiative corrections to gluon-gluon fusion of two Higgs bosons do not depend significantly on anomalous interactions [87,88], we normalize σ HH such that, when κ t ¼ 1, κ λ ¼ 1, and c 2 ¼ 0, to the cross section that equals the SM prediction at NNLO in QCD. In Fig. 10, 95% C.L. limits on nonresonant cross sections are shown, assuming changes only in the trilinear Higgsboson couplings, with the other parameters fixed to their SM values. All κ λ values are excluded below −17.5 and above 22.5. These results are obtained by extrapolating the limits between the simulated points, as well as above the highest simulated value of κ λ using Eq. (2), which relies on the similarity of distributions for signal at large values of jκ λ j [86,89], as well as on the behavior of the signal efficiency described in Sec. V D. Figure 11 shows the 95% C.L. limits for nonresonant two-Higgs-boson production in the c 2 and κ t planes for different values of κ λ . The specific interference pattern for each combination of parameters produces different exclusion limits for different simulated points of parameter space [86,89]. Only discrete values are provided for limits because a linear interpolation between the simulated points could not follow the strong variations due to interference terms. The points in the theoretical phase space excluded by the data are surrounded by small black boxes. Certain combinations of c 2 , κ λ , or κ t parameters can be excluded under the assumption that Higgs bosons have their usual SM branching fractions. For example, we observe that jc 2 j ≥ 3 is disfavored by the data when κ λ and κ t are fixed to SM values. VI. Cross section and branching fractions for the benchmark theories used in this paper [22,85]

VIII. SUMMARY
A search is performed by the CMS collaboration for resonant and nonresonant production of two Higgs bosons in the decay channel HH → γγbb, based on an integrated luminosity of 19.7 fb −1 of proton-proton collisions collected at ffiffi ffi s p ¼ 8 TeV. The observations are compatible with expectations from standard model processes. No excess is observed over background predictions. Resonances are sought in the mass range between 260 and 1100 GeV. Upper limits at a 95% C.L. are extracted on cross sections for the production of new particles decaying to Higgs-boson pairs. The limits are compared to BSM predictions, based on the assumption of the existence of a warped extra dimension. A radion with an ultraviolet cutoff Λ R ¼ 1 TeV is excluded with masses below 980 GeV. The search has sensitivity to the presence of a radion with an ultraviolet cutoff Λ R ¼ 3 TeV when its mass lies between 200 and 300 GeV. The RS1 KK graviton is excluded with masses between 325 and 450 GeV for k=M Pl ¼ 0.2. The analysis is not yet sensitive to the presence of a KK graviton in the bulk scenario with the same parameters.
For nonresonant production with SM-like kinematics, a 95% C.L. upper limit of 1.85 fb is set for the product of the HH cross section and branching fraction, corresponding to a factor 74 larger than the SM value. When only the trilinear Higgs-boson coupling is changed, values of the self coupling are excluded for κ λ < −17 and κ λ > 22.5. The parameter space is also probed for the presence of other anomalous Higgs-boson couplings.

ACKNOWLEDGMENTS
We are grateful to B. Hespel, F. Maltoni, E. Vryonidou, and M. Zaro for a customized model of the nonresonant signal generation. We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the    [3] D. de Florian and J. Mazzitelli, Higgs Boson Pair Production at Next-to-Next-to-Leading Order in QCD, Phys. Rev. Lett. 111, 201801 (2013