Search for pair production of Higgs bosons in the b¯bb¯b final state using proton-proton collisions at √s = 13 TeV with the ATLAS detector

A search for Higgs-boson pair production in the b ¯ bb ¯ b final state is carried out with 3 . 2 fb − 1 of proton-proton collision data collected at ﬃﬃﬃ s p ¼ 13 TeV with the ATLAS detector. The data are consistent with the estimated background and are used to set upper limits on the production cross section of Higgs-boson pairs times branching ratio to b ¯ bb ¯ b for both nonresonant and resonant production. In the case of resonant production of Kaluza-Klein gravitons within the Randall-Sundrum model, upper limits in the 24 to 91 fb range are obtained for masses between 600 and 3000 GeV, at the 95% confidence level. The production cross section times branching ratio for nonresonant Higgs-boson pairs is also constrained to be less than 1.22 pb, at the 95% confidence level.


Introduction
The discovery of a Higgs boson (h) [1,2] at the Large Hadron Collider (LHC) provides an opportunity to search for physics beyond the Standard Model (SM) in channels involving couplings with the Higgs boson.In particular, the production cross section for Higgs-boson pairs in the SM is significantly smaller than predicted by a host of models, making this channel promising for a search for new phenomena.Examples of such models are the bulk Randall-Sundrum (RS) model [3,4] with a warped extra dimension and the two-Higgs-doublet model (2HDM) [5].In the RS model, spin-2 Kaluza-Klein (KK) excitations of the graviton G * KK are produced via gluon fusion with subsequent decay to the hh final state.Similarly, a heavy spin-0 scalar, H, in the 2HDM also gives rise to a resonant hh signature.Enhanced nonresonant hh production is expected in models with light colored scalars [6] or direct t thh vertices [7,8].
Previous searches for hh production have been performed by the ATLAS and CMS collaborations with pp collisions at √ s = 8 TeV.The final states include b bb b [9,10], b bτ + τ − [11,12], b bγγ [13,14], and γγWW * [11].A combination of these different channels has been performed by ATLAS [11], which shows that for resonance masses above 500 GeV the sensitivity is highest in the b bb b channel.
The dominant h → b b decay mode is exploited in this paper to search for both resonant and nonresonant production of Higgs-boson pairs in pp collisions at √ s = 13 TeV.Two analyses are presented.The "resolved" analysis is optimized for hh systems that have sufficiently low mass to be resolved into four distinct b-jet signatures in the ATLAS detector.The "boosted" analysis focuses on higher-mass hh systems that are characterized by higher-momentum Higgs bosons for which the two b-jets cannot be resolved due to the high boost.In this situation, large-radius jets are utilized to capture the by-products of each Higgs-boson decay and small-radius track jets are used to identify the presence of b-hadrons.The final results are obtained using the resolved analysis up to resonance masses of 1100 GeV, where its expected sensitivity is higher than that of the boosted analysis, whereas the boosted analysis is used for masses above 1100 GeV.
The two analyses generally follow the same approach as that adopted for the 8 TeV data (see Ref. [9]).The analysis of the 13 TeV data reported in this paper benefits from an enhanced sensitivity to highmass resonances due to the significant increase in the production cross section in that kinematic region.Furthermore, the boosted analysis includes a channel with only three b-tagged track jets, in addition to the channel with four b-tagged track jets already included in the previous analysis.This new channel improves sensitivity for resonances with mass above 2000 GeV because the b-jet identification efficiency decreases sharply at high transverse momenta.The boosted analysis also operates with smaller track-jet radii to account for the larger boost at 13 TeV.

The ATLAS detector
The ATLAS experiment [15] at the LHC is a multipurpose particle detector with a forward-backward symmetric cylindrical geometry and a near 4π coverage in solid angle. 1 It consists of an inner tracking 1 ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the center of the detector and the z-axis along the beam pipe.The x-axis points from the IP to the center of the LHC ring, and the y-axis points upwards.Cylindrical coordinates (r, φ) are used in the transverse plane, φ being the azimuthal angle around the z-axis.
detector surrounded by a thin superconducting solenoid providing a 2 T axial magnetic field, electromagnetic and hadronic calorimeters, and a muon spectrometer.The inner tracking detector (ID) covers the pseudorapidity range |η| < 2.5.It consists of silicon pixel, silicon microstrip, and transition radiation tracking (TRT) detectors.A new innermost pixel layer [16] inserted at a mean radius of 3.3 cm is used for the first time in the 2015 data taking.Lead/liquid-argon (LAr) sampling calorimeters provide electromagnetic (EM) energy measurements.A steel/scintillator-tile hadronic calorimeter covers the central pseudorapidity range (|η| < 1.7).The endcap and forward regions are instrumented with LAr calorimeters for both the EM and hadronic energy measurements up to |η| = 4.9.The muon spectrometer (MS) surrounds the calorimeters and includes three large superconducting air-core toroids.The field integral of the toroids ranges between 2.0 and 6.0 T m for most of the detector.The MS includes a system of fast detectors for triggering and precision tracking chambers.A dedicated trigger system is used to select events.The first-level trigger is implemented in hardware and uses the calorimeter and muon detectors to reduce the accepted rate to 100 kHz.This is followed by a software-based high-level trigger (HLT) that reduces the accepted event rate to 1 kHz on average.

Data and simulation samples
The data sample used in this analysis was collected during the 2015 LHC run with pp collisions at √ s = 13 TeV.After requiring that the data be collected during stable beam conditions and that relevant detector systems be functional, the total integrated luminosity is estimated to be 3.2 fb −1 with an uncertainty of 5.0% derived following the methodology detailed in Ref. [17].In the resolved analysis, events are selected by a combination of three triggers requiring either one or two jets selected by a dedicated HLT b-tagging algorithm [18].These triggers require either one b-tagged jet with transverse momentum p T > 225 GeV, two b-tagged jets with p T > 55 GeV and an additional jet with p T > 100 GeV, or four jets with p T > 35 GeV, two of which are b-tagged.A trigger requiring a single jet of radius 1.0 and p T > 360 GeV is used to select events in the boosted analysis.The p T thresholds for these single-or multiple-jet triggers are lower at the first level of the trigger system.The combination of all the above triggers has an efficiency rising from 95% to 99% for selecting b bb b signal events passing the full analysis selection as the resonance mass increases.Simulated Monte Carlo (MC) event samples are used to model signal production and the background from t t and Z+jets events.A method based on data is used to model the dominant multijet backgound.Signal G * KK events are generated at leading order (LO) with MG5_aMC@NLO v2.2.2 [19] using the NNPDF2.3LO parton distribution function (PDF) set [20], and Pythia 8.186 [21] to model the parton shower and hadronization process using the A14 set of tuned underlying-event parameters [22]  2 .Generation of the heavy H scalar in a simplified model with a fixed narrow width Γ H = 1 GeV is performed with MG5_aMC@NLO and the CT10 PDF set [23].With this Γ H choice, the width of the reconstructed hh resonance is dominated by the experimental resolution.For this model, parton showering and hadronization are handled by Herwig++ [24] with the CTEQ6L1 PDF set [25] and the UEEE5 underlying-event tune [26].The scalar interpretation for this search only makes use of the acceptance times efficiency from this model and no interpretation in terms of 2HDM parameters is presented.Nonresonant SM pp → hh → b bb b events are generated via the gluon-fusion process with MG5_aMC@NLO using form factors for the top-quark loop from HPAIR [27,28].The cross section times branching ratio to the b bb b final state, evaluated at next-to-next-to-leading order with the summation of logarithms at next-to-next-leading-logarithm accuracy, is 11.3 +0.9 −1.0 fb [29].The uncertainty includes the effects due to renormalization and factorization scales, PDF set, α S , effects of finite top-quark mass in loops, and the h → b b branching ratio.Generation of t t events is performed with Powheg-box v1 using the CT10 PDF set.The parton shower, hadronization, and the underlying event are simulated using Pythia 6.428 [30] with the CTEQ6L1 PDF set and the corresponding Perugia 2012 tune [31].The top-quark mass is set to 172.5 GeV.Higher-order corrections to t t cross sections are computed with Top++ 2.0 [32].These incorporate NNLO corrections in QCD, including resummation of next-to-next-to-leading logarithmic soft gluon terms.The overall t t normalization is extracted from the data while the shape of kinematic distributions is taken from MC simulation.The Z+jets sample is generated using Pythia 8.186 with the NNPDF2.3LO PDF set.
For all MC samples, charm-hadron and bottom-hadron decays are handled by EvtGen 1.2.0 [33].To simulate the impact of multiple pp interactions that occur within the same or nearby bunch crossings (pileup), minimum-bias events generated with Pythia 8 are overlaid on top of the hard scatter event.The detector response is simulated with Geant 4 [34,35] and the events are processed with the same reconstruction software as that used for the data.

Event reconstruction
The resolved and boosted analyses rely on the reconstruction of jets with the anti-k t clustering algorithm [36] but with different values of the radius parameter R. Calorimeter jets with R = 0.4 (1.0) are used to determine the kinematic properties of Higgs-boson candidates in the resolved (boosted) analysis.Those jets are reconstructed from topological clusters of energy deposits in calorimeter cells.The R = 0.4 jet energies are determined from reconstructed cluster energies at the electromagnetic scale with correction factors derived from simulation to account for the response of the calorimeter to hadrons [37].Jets from pileup are suppressed with the use of tracking information as detailed in Ref. [38].The R = 1.0 jets are built from locally calibrated clusters [37] and are trimmed [39] to minimize the impact of pileup.This trimming proceeds by reclustering the jet with the k t algorithm [40] into smaller R = 0.2 subjets and removing those subjets with p subjet T /p jet T < 0.05, where p subjet T is the transverse momentum of the subjet and p jet T that of the original jet.In addition to the above large-R trimmed jets, the boosted analysis uses track jets with R = 0.2 to identify b-hadrons from Higgs-boson decays [41].Such jets are reconstructed from charged-particle tracks with p T > 0.4 GeV and |η| < 2.5 that satisfy a set of hit and impact parameter criteria to make sure that the tracks originate from the primary vertex, thereby minimizing the impact of pileup.Track jets are associated to large-R jets using ghost association [42].In this method, the large-R jet algorithm is rerun with both the four-momenta of track jets modified to have infinitesimally small momentum (the "ghosts") and all topological energy clusters in the event as potential constituents of jets.As a result, the presence of track jets does not alter the large-R jets already found and their association to specific large-R jets is determined by the jet algorithm.Collision vertices are reconstructed requiring a minimum of two tracks with p T > 0.4 GeV in each vertex.The primary vertex is chosen to be the vertex with the largest p 2 T , where the sum extends over all tracks associated with the vertex.
The identification of jets containing b-hadrons is based on the R = 0.4 calorimeter (R = 0.2 track) jets in the resolved (boosted) analysis and a multivariate tagging algorithm [43].This algorithm is applied to a set of tracks with loose impact parameter constraints in a region of interest around each jet axis to enable the reconstruction of the b-hadron decay vertex.The b-tagging requirements result in an efficiency of 70% (77%) for jets containing b-hadrons in the resolved (boosted) analysis, as determined in a sample of simulated t t events.The corresponding efficiencies for c-hadron jets and light-quark or gluon jets are 12% (29%) and 0.2% (1.4%), respectively.Different b-tagging operating points are chosen in the two analyses to maximize their respective sensitivities.
Muons are reconstructed by combining tracks in the ID and MS, and are required to satisfy tight muon identification criteria [44].The four-momentum of muons with p T > 4 GeV and |η| < 2.5, that are within ∆R of 0.4 (0.2) of jets used for b-tagging in the resolved (boosted) analysis, is added to the calorimeter jet four-momentum to partially account for the energy lost in semileptonic b-hadron decays.

Event selection
The event selection for the resolved and boosted analyses is described below.These analyses are optimized independently for the reconstruction and selection of hh → b bb b final states, with the resolved analysis aiming at event topologies containing four distinct b-jets, whereas the boosted analysis focuses on topologies with higher-momentum Higgs bosons resulting in merged jets.
Different selection and background estimation strategies are adopted for the two analyses.To facilitate the comparison between these different choices, Table 1 summarizes each of the requirements and approaches described in this section.

Selection
Events selected for the resolved analysis must contain at least four b-tagged jets with |η| < 2.5 and p T > 40 GeV.The four highest-p T b-tagged jets are used to form two dijet systems, requiring an angular distance ∆R between the jets within the dijet system smaller than 1.5.The transverse momentum of the leading (subleading) dijet system is required to be greater than 200 (150) GeV.These requirements are made to ensure a high trigger efficiency and to avoid ambiguities in forming dijets.In the rare case that a jet is assigned to more than one dijet system, only the combination containing the jets with the highest probability of being b-jets according to the multivariate b-tagging algorithm is considered.
The resolved analysis considers resonance masses in the range 400-1500 GeV.Event selection that varies as a function of the reconstructed resonance mass (m 4j ) is used to increase the analysis sensitivity across the mass range searched.Mass-dependent selection requirements are made on the leading dijet p T , the subleading dijet p T and the pseudorapidity difference between the dijets as follows [9]: These selection requirements were optimized simultaneously by performing a three-dimensional scan of threshold values, using the expected exclusion limit on the G * KK resonance with k/ MPl = 1 as a metric.After selecting two dijets that satisfy the mass-dependent criteria, 15% of the total background consists of t t events.This t t background mainly comprises events where both top quarks decay hadronically.These hadronic decays often lead to three jets for each top quark -one b-jet directly from the top-quark decay and two from the decay of the W boson. Reduction of the t t background is important as relatively large systematic uncertainties are associated with modeling t t in the signal region.In order to reduce the t t background, jets not already used in the formation of the two dijets ("extra jets") in the event are used to reconstruct W-boson and top-quark candidates by combining them with one or both of the jets in a given dijet.These extra jets are required to have p T > 30 GeV, |η| < 2.5, and ∆R < 1.5 relative to the dijet.The W-boson candidate is reconstructed by adding the four-momentum of the extra jet to the four-momentum of the jet in the dijet system with the lowest probability of being a b-jet according to the multivariate b-tagging algorithm.The top-quark candidate is reconstructed by summing the dijet and the extra jet.The compatibility with the top-quark decay hypothesis is then determined using the variable: where m W and m t are the invariant masses of the W-boson and top-quark candidates.The values in the denominator approximate the dijet and three-jet system mass resolutions.If either dijet in an event has X tt < 3.2 for any possible combination with an extra jet, the event is rejected.This requirement, referred to as the "t t veto", reduces the t t background by ∼60%, while retaining ∼90% of signal events.The event selection criteria described above are collectively referred to as the "4-tag" selection requirements.
Following the 4-tag selection, a requirement on the combination of the leading and subleading dijet masses (m lead 2j and m subl 2j , respectively) is used to define the signal region.The signal region is defined using the variable: where the 0.1 m 2 j terms approximate the widths of the mass distributions.The center of the signal region was optimized using G * KK samples with k/ MPl = 1.On average, the subleading Higgs-boson candidate is reconstructed at lower masses as a result of energy lost from semileptonic b-hadron decays and final-state radiation.The signal region is defined as X hh < 1.6.This corresponds to the kinematic requirements illustrated by the inner region in Figure 1.The data shown in this figure are derived from a sample of events that satisfy all selection criteria except for having only two jets that pass the b-tagging requirements, referred to as the "2-tag" sample.
The acceptance times efficiency for each stage of the resolved-analysis event selection is shown in Figure 2 for spin-2 and spin-0 resonances.The acceptance times efficiency, A × ε, of the full selection for the G * KK with k/ MPl = 1 ranges from 0.1% for a G * KK of mass 400 GeV to 5.3% for a G * KK with a mass of 1000 GeV.The spin of the resonance affects the angular distribution of the decay products, resulting in a lower acceptance in the case of a spin-0 H boson than for the spin-2 G * KK .As a result, the spin-0 resonance search is performed starting at a mass of 500 GeV.Nonresonant di-Higgs production occurs primarily at the low end of the m 4j spectrum, leading to A × ε = 0.64% for the full selection.
The final step in the resonant analysis is to search for an excess in the m 4j distribution for events in the signal region.The sensitivity of the search is increased by improving the m 4j resolution in this region.This is achieved by scaling the four-momentum of each of the Higgs-boson candidates such that their mass is equal to the Higgs-boson mass.This leads to an improvement of ∼ 30% in the signal m 4j resolution with little impact on the background.

Background estimation
After the 4-tag selection described above, ∼90% of the remaining background in the signal region originates from multijet events, which are modeled using data.The remaining ∼10% of the background is expected from t t events.The t t yield is determined from data, while the m 4j shape is taken from MC simulation.The Z+jets contribution is less than 1% of the total background and is estimated from MC simulation.The background from all other sources -including processes featuring Higgs bosons -is negligible.

Multijet background
The multijet background is modeled using an independent data sample selected using the same trigger and selection requirements as described above, except for the b-tagging requirement: only one of the two selected dijets is formed from b-tagged jets, while the other dijet is formed from jets that both fail the b-tagging requirements.This "2-tag" selection yields a data sample that consists of 98% multijet events and 2% t t events.The predicted signal contamination is negligible.
The 2-tag sample is normalized to the 4-tag sample and its kinematics corrected for differences introduced by the additional b-tagging requirement on the 4-tag sample.These kinematic differences arise because the b-tagging efficiency varies as a function of jet p T and η, the various multijet processes contribute in different fractions, and the fraction of events passed by each trigger path changes.The normalization and kinematic corrections are determined using a signal-free sideband region of the m lead 2j -m subl 2j plane.The resulting background model is verified and the associated uncertainties are estimated using a control region.The sideband and control regions are shown in Figure 1.The sideband region is defined as (m lead 2j − 124 GeV) 2 + (m subl 2j − 115 GeV) 2 > 58 GeV, while the control region is defined as the region in the m lead 2j -m subl 2j plane between the signal and sideband regions.These definitions are chosen to be orthogonal to the signal region and to give approximately equal event yields in both the sideband and control regions.
The normalization of the multijet background prediction is set by scaling the number of events in each region of the 2-tag sample by the following factor µ Multijet calculated in the sideband region: where is the number of events observed in the sideband region in the 2-or 4-tag data sample, respectively.The yields N 2-/4-tag tt are the estimated number of t t events in the 2-/4-tag selected sideband region estimated from MC simulation.To predict the distributions of the multijet background in each region, the predicted t t 2-tag distributions are first subtracted from the 2-tag data sample before the distribution is scaled by µ Multijet .
The correction for the kinematic differences between 2-tag and 4-tag samples is performed by reweighting events in the 2-tag sample.The weights are derived in the sideband region, from linear fits to the ratio of the total background model to data for three kinematic distributions that are found to have the largest disagreement between 2-tag and 4-tag: the leading dijet p T , the angular separation between the jets in the subleading dijet, and the angular separation between the two dijets.The reweighting is performed using one-dimensional distributions but is iterated so that correlations between the three variables are taken into account.After the correction process, there is agreement between the background model and sideband region data.
The multijet background model is validated in the control region.Table 2 compares the observed data yield in the control region with the corresponding background estimate.The modeling of the m 4j distribution in the control region is shown in Figure 3.The 4-tag events in the control region are well-described by the background model in both normalization and m 4j shape.

t t background
The normalization of the t t background is derived from data in a t t control region.Due to the limited yield in this control region, the shape of the t t background is taken from MC simulation.To further decrease statistical uncertainties, the t t shape is derived from MC simulation using the "2-tag" selection, with a systematic uncertainty assigned to cover the differences between the 2-tag and 4-tag m 4j distributions.The t t control region is formed from events which pass the 4-tag selection, except for the t t veto, which is reversed: if either of the dijets fails the X tt requirement, the event enters the t t control region.This selection leads to a sample of 21 events, of which 13.3 are estimated to be multijet events using the 2tag sample described previously.After subtracting the multijet background, the t t control region yield is extrapolated to predict the t t yield in the signal region, N tt , using the following equation: where N CR tt is the number of events in the t t control region, after subtraction of multijet background, and t is the efficiency for a selected dijet in a t t event to pass the t t veto.This equation relies on the assumption that the t of each dijet in the event is uncorrelated, an assumption validated in t t MC simulation.The t is measured using an independent, high-purity "semileptonic t t " data sample.Events in this sample are selected by requiring one dijet candidate passing the nominal selection with p T > 150 GeV and one "leptonic top-quark" candidate.The leptonic top-quark candidate is defined using a reconstructed muon and one b-tagged jet.This b-tagged jet is required to be distinct from jets in the dijet candidate, and the muon is required to have p T > 25 GeV, be isolated, and fall a distance ∆R < 1.5 of the b-tagged jet.The leptonic top-quark candidate is required to have p T > 150 GeV, where the leptonic top p T is defined as the vector sum of the b-jet p T and the muon p T .The t t veto efficiency is then measured as the fraction of the reconstructed dijet candidates which passed the t t veto, yielding tt = 0.60 ± 0.04 (stat.)± 0.06 (syst.).A 10% systematic uncertainty is assigned to cover potential differences between t as measured in the semileptonic t t sample and t in the full 4-tag selection, where the method is applied in t t MC simulation to evaluate such differences.The measured t agrees well with the corresponding semileptonic t t MC prediction of 0.58.Equation ( 4) gives a data-driven t t background prediction of 4.2±3.8events.The uncertainty is dominated by the statistical uncertainty in the yield in the t t control region, with a smaller contribution from the uncertainty in the measured t t veto efficiency.

Systematic uncertainties
Two classes of systematic uncertainties are evaluated: those affecting the modeling of the signal and those affecting the background prediction.
The signal modeling uncertainties comprise: theoretical uncertainties in the acceptance, uncertainties in the jet energy scale (JES) and resolution (JER), and uncertainties in the b-tagging efficiency.
The following sources of theoretical uncertainty in the acceptance are evaluated: missing higher-order terms in the matrix elements and PDF set, as well as modeling of the underlying event, hadronic showers, initial-and final-state radiation.The total theoretical uncertainty is dominated by the uncertainties associated with the modeling of the initial-and final-state radiation.
The jet energy uncertainties are derived based on in situ measurements performed during Run 1 and from MC simulation extrapolations from Run-1 to Run-2 conditions [45].The JES systematic uncertainty is evaluated using three separate and orthogonal uncertainty components [46].The JER uncertainty is evaluated by smearing jet energies according to the systematic uncertainties of the resolution measurement [46].The uncertainty in the b-tagging efficiency is evaluated by propagating the systematic uncertainty in the measured tagging efficiency for b-jets [47].The efficiencies are measured as a function of b-jet p T and η.For b-jets with p T > 300 GeV, systematic uncertainties in the tagging efficiencies are extrapolated with MC simulation and are consequently larger [18].Systematic uncertainties in the normalization and shape of the multijet background model are assessed in the control region.The background prediction in the control region agrees with the observed data to within ±5%, which is taken as the uncertainty in the predicted multijet yield.To further test the robustness of the background estimation, the background model is re-evaluated using different sideband and control region definitions and different b-tagging requirements on the "2-tag" sample.These changes affect the kinematic and flavor compositions of the various regions used in the background prediction.The control region agreement and signal region predictions of all variations considered are all consistent to within the assigned ±5% uncertainty in the multijet background prediction.
The uncertainty in the description of the multijet m 4j distribution is determined by comparing the background prediction to the data in the control region as shown in Figure 3.To evaluate the level of agreement, a linear fit is performed on the ratio of the distributions.This fit, along with its uncertainties, shown in the bottom panel of Figure 3, gives a slope consistent with zero.The uncertainty in the multijet background shape is defined using the uncertainty in the fitted slope.
The uncertainty in the t t normalization is described above.The uncertainty in the MC-derived t t m 4j distribution is dominated by the uncertainty associated with using the 2-tag selection to model the 4-tag selection.This uncertainty is assessed by comparing the 2-tag and 4-tag t t MC predictions in the signal region.Table 3 summarizes the relative impact of the uncertainties in the event yields.

Event yields
The predicted number of background events in the signal region, the number of events observed in the data, and the predicted yield for two potential signals are presented in Table 4.The numbers of predicted background events and observed events are in agreement.

Selection
The boosted analysis selects events with at least two large-R jets with 250 < p T < 1500 GeV, |η| < 2.0, and mass m J > 50 GeV.The upper bound on the transverse momentum and the mass requirement correspond to the kinematic region where jet calibration uncertainties are available from Refs.[41] and [48].Only the two large-R jets with highest p T are retained for further selection.In order to reduce the contamination from t t events, the leading jet is additionally required to have p T > 350 GeV, thus ensuring that all topquark decay products are contained in a single large-R jet with mass close to that of the top quark.
At least two track jets must be found by the ghost method [42] to be associated with each large-R jet.
They are required to be consistent with the primary vertex of the event as well as to satisfy p T > 10 GeV and |η| < 2.5.
Since high-mass resonances tend to produce jets that are more central than multijet background processes, the two large-R jets are required to have a separation |∆η| < 1.7.
Signal event candidates are selected if each of the large-R jets has a mass consistent with that of the Higgs boson.This is defined as for the resolved analysis in Eq. ( 2), where the small-R dijet mass is replaced by the large-R jet mass, requiring X hh < 1.6.This requirement defines the signal region in the leading-subleading large-R jet mass plane.
Two samples of events are selected based on the number of b-tagged leading and subleading track jets associated with each large-R jet.They are referred to as the "3-tag" and the "4-tag" samples, and require exactly three or at least four track jets passing the b-tagging selection, respectively.In the 3-tag sample, the fourth jet is explicitly required to fail the b-tagging requirements to define orthogonal samples.
The signal region corresponds to the kinematic requirements illustrated by the inner region in Figure 5.
The data shown in this figure are derived from a sample of events that satisfy all selection criteria except for having only two track jets that pass the b-tagging requirements, referred to as the "2-tag" sample.This sample is used to estimate the background contribution as described below.
The acceptance times efficiency for each stage of the boosted-analysis event selection is shown in Figure 6 for the G * KK and heavy scalar models.The requirement that at least two individual track jets be associated to the large-R jets becomes less efficient at high mass due to merging.The full selection for a G * KK resonance with a mass of 1000 GeV (2000 GeV) and k/ MPl = 1 has an acceptance times efficiency of 9% (11%) in the 3-tag sample and 8% (5%) in the 4-tag sample.

Background estimation
As in the resolved analysis, the dominant source of background stems from multijet (80-90%) events and the rest is primarily due to t t production.The background estimation method generally follows the same approach as that described in Section 5.1.2.Differences are highlighted below.
The shape of the multijet background in both the 3-tag and 4-tag samples is derived from the 2-tag sample.Due to the large statistical uncertainty in the background prediction for dijet masses (m 2J ) above 1500 GeV, an exponential fit to the data in the range between 900 and 2000 GeV is used to model the high-mass tail of the dijet distribution in the signal region.The estimated background yield in each signal region, N 3(4)-tag bkg , is computed according to where N

2-tag
Multijet is the number of multijet events in the 2-tag sample, N 3(4)-tag tt and N

3(4)-tag Z
are the numbers of events predicted by the 3(4)-tag t t and Z+jets MC samples.The parameter µ

3(4)-tag
Multijet corresponds to the ratio of multijet event yields in the 3(4)-tag and 2-tag samples, as defined in Eq. ( 3), except for considering 3-or 4-tag events in the numerator.Finally, the parameter α 3(4)-tag tt is a scale factor designed to correct the t t event yield estimated from the MC simulation.are extracted from a binned likelihood fit to the leading large-R jet mass distribution obtained in the sideband region of the 3(4)-tag sample, as shown in Figure 7.In this fit, the multijet distribution is extracted from the 2-tag sample, after subtraction of the t t and Z+jets contributions predicted by the MC simulation.The t t and Z+jets distributions in the sideband region of the 3(4)-tag sample are taken from the MC simulation.The resulting fit values and their statistical uncertainties for the 3-tag sample are µ 3-tag Multijet = 0.160 ± 0.003 and α 3-tag tt = 1.02 ± 0.09, with a correlation coefficient of −0.60 between these two parameters.The corresponding values measured in the 4-tag sample are µ 4-tag Multijet = 0.0091 ± 0.0007 and α 4-tag tt = 0.82 ± 0.39, with a correlation coefficient of −0.58.A large anti-correlation is observed since the multijet and t t background contributions are constrained to add up to the total number of events in the sideband region of the 3-tag and 4-tag data samples.
The modeling of the background yield and kinematics is validated in the control region of the 3-tag and 4-tag samples.Good agreement is observed between the data and the predicted background in both the sideband and control regions of the 3-tag and 4-tag samples as shown in Table 5.The shapes of the t t kinematic distributions in the 4-tag signal region are extracted from the MC simulation in the 3-tag signal region due to the limited size of the 4-tag MC sample.The number of events in data and predicted background events in the hh sideband and control regions of the 3-tag and 4-tag samples for the boosted analysis.The number of multijet and t t background events in the sideband regions are constrained by the number of observed events, as explained in the text.The uncertainties are purely statistical.

Systematic uncertainties
Evaluation of systematic uncertainties in the boosted analysis generally follows the same approach as that described in Section 5.1.3.Differences are highlighted here.
The large-R jet energy resolution and scale uncertainties as well as the jet mass resolution (JMR) and scale (JMS) uncertainties are derived in situ from 8 TeV pp collisions, taking into account MC simulation extrapolations for the different detector and beam conditions present in 8 and 13 TeV data-taking periods [49].The uncertainty in the b-tagging efficiency for track jets is evaluated with the same method used for R = 0.4 calorimeter jets.
Systematic uncertainties in the normalization and shape of the background model are assessed in the control region.The background predictions in both the 3-tag and 4-tag control regions agree with the observed data to within statistical uncertainties.The statistical uncertainties in the control region yields are assigned as systematic uncertainties in the multijet background normalization.The uncertainty in the shape of the multijet background is assessed in the control region via a linear fit to the ratio of the distributions shown in Figure 8.
An additional uncertainty in the shape of the tail of the background prediction is assigned by fitting the 2-tag dijet mass distribution with a variety of empirical functions designed to model power-law behavior, as described in Ref. [50].The largest difference between the exponential function predictions and those from alternative fit functions, considering the variation of the fitted parameters within their statistical uncertainties, is taken as a systematic uncertainty.
Relative systematic uncertainties in both the background and signal event yields are summarized in Table 6 for the 3-tag and 4-tag selections.For the background, the entry labeled "Statistical" corresponds to the statistical uncertainty from the fit to the leading large-R jet mass (see Section  extract the multijet and t t background yields, taking the correlation between these yields into account.It also includes the t t modeling uncertainties and the statistical uncertainty associated with the data yield in the 2-tag sample.Uncertainties in the m 2J shape of the multijet and t t backgrounds are not listed in Table 6, as they do not affect the event yields, but are accounted for in the statistical analysis.

Event yields
The predicted number of background events in the 3-tag and 4-tag signal regions, the number of events observed in the data, and the predicted yield for a potential signal are reported in Table 7.One event in the 4-tag signal region, with a mass of 852 GeV, is in common with the resolved analysis.The dijet mass distribution in the signal region is shown in Figure 9.An excess of data is observed in the 3-tag signal region for m 2J ∼ 900 GeV and in the range between 1600 and 2000 GeV.The significance of these excesses is evaluated below.

Results
The results from the resolved and boosted analyses are interpreted separately using the statistical procedure described in Ref. [1] and references therein.A test statistic based on the profile likelihood ratio [51] is used to test hypothesized values of µ, the global signal strength factor, separately for each model tested.The statistical analysis described below is performed using the data observed in the signal regions.The systematic uncertainties are treated as independent within each signal region using Gaussian or lognormal constraint terms in the definition of the likelihood function.In the boosted analysis, the data from the 3-tag and 4-tag signal regions are fitted simultaneously treating data-derived systematic uncertainties related to the multijet background estimate as uncorrelated and all other systematic uncertainties as fully correlated.In the case of the search for nonresonant hh production, only the number of events passing the final selection is used whereas the m 4j or m 2J distributions are used in the case of the search for hh resonances.G * KK (1000 GeV), k/ MPl = 1 3.4 ± 0.9 2.9 ± 1.1

Source
Table 7: The number of predicted background events in the hh 3-tag and 4-tag signal regions, compared to the data for the boosted analysis.Errors correspond to the total uncertainties in the predicted event yields.The yields for a 1000 GeV G * KK in the bulk RS model with k/ MPl = 1 is also given.

Background-only hypothesis tests
In order to determine if there are any statistically significant local excesses in the data, a test of the background-only hypothesis (µ = 0) is performed.The significance of an excess is quantified using the local p 0 , the probability that the background could produce a fluctuation greater than or equal to the excess observed in data.A global p 0 is also calculated for the most significant discrepancy, using background-only pseudoexperiments to derive a correction for the look-elsewhere effect across the mass range tested [52].
In the case of the resolved analysis, the largest deviation from the background-only hypothesis occurs around 900 GeV and is found to have a local significance less than 2 σ.
In the case of the boosted analysis, the largest local deviation corresponds to a broad data excess in the 3-tag signal region starting at m 2J ∼ 1700 GeV.The local significance of this excess is 2.0 σ assuming a G * KK resonance with k/ MPl = 1.

Exclusion limits
The data are used to set upper limits on the cross sections for the different benchmark signal processes.Exclusion limits are based on the value of the statistic CL s [53], with a value of µ regarded as excluded at the 95% confidence level (CL) when CL s is less than 5%.
The nonresonant search is performed using the resolved analysis, since it has better sensitivity than the boosted analysis.Using the SM hh nonresonant production as the signal model, the observed 95% CL upper limit is σ(pp → hh → b bb b) < 1.22 pb, a value to be compared with the inclusive SM prediction (as defined in Section 3) of σ(pp → hh → b bb b) = 11.3 +0.9 −1.0 fb.
For the resonant Higgs-boson pair production search, the resolved and boosted analyses offer their best sensitivity in complementary resonance mass regions.The resolved analysis gives a more stringent expected exclusion limit for resonance masses up to (and including) 1100 GeV, while the boosted analysis offers better sensitivity beyond that mass.A simple combination of the separate exclusion limits from the resolved and boosted analyses is used.This is achieved by taking the limit from the analysis with the more stringent expected exclusion at each mass point for each of the signal models.
Figure 10 shows the combined 95% CL upper limits for three different resonances: a spin-2 G * KK in the bulk RS model with k/ MPl = 1 and 2, and a spin-0 narrow-width H boson.For the spin-2 G * KK with k/ MPl = 1, limits on σ pp → G * KK → hh → b bb b are set in the range between 21 and 73 fb for masses between 600 and 3000 GeV.The corresponding range of limits for the G * KK resonance with k/ MPl = 2 is 34 to 86 fb.Although no events are observed at masses near 3000 GeV, the observed limit remains about 1 σ weaker than the expected limit due to a substantial low-mass tail in the shape of high-mass resonance signals and the slight data excess observed at high mass.The cross-section limits for resonance masses below 600 GeV weaken substantially due to the lower acceptance times efficiency (see Figure 2) and the increased level of background.These cross-section upper limits translate into observed (expected) excluded mass ranges of 480-770 (470-735) GeV for k/ MPl = 1 and < 965 (< 995) GeV for k/ MPl = 2.The cross-section upper limits for the spin-0 narrow-width H boson are similar, with 95% CL exclusion limits ranging from 30 to 300 fb in the mass range between 500 and 3000 GeV.
The search sensitivity of this analysis is similar to that achieved at √ s = 8 TeV with 19.5 fb −1 for resonance masses below 1350 GeV but exceeds it above that mass by factors of 1.4 at 1500 GeV and 12 at 2000 GeV.The search has also been extended to resonance masses beyond 2000 GeV, up to 3000 GeV.The results of the resolved analysis are used up to a mass of 1100 GeV and those of the boosted analysis are used at higher mass where its expected sensitivity is higher.The red curves show the predicted cross sections as a function of resonance mass for the models considered.

Conclusions
A search for both resonant and nonresonant production of pairs of Standard Model Higgs bosons has been carried out in the dominant b bb b channel with 3.2 fb −1 of pp collision data collected by ATLAS during the 2015 run of the LHC at √ s = 13 TeV.Results are reported for the resolved analysis with each h → b b decay reconstructed as two separate b-tagged jets and for the boosted analysis with each h → b b decay reconstructed as a single large-radius jet associated with two small-radius track jets and a minimum of three b-tags for the hh system.No significant data excess is observed above the estimated background consisting mainly of multijet and t t events.Upper limits on the production cross section times branching ratio to the b bb b final state are set for spin-0 and spin-2 resonances with values ranging between 24 and 113 fb (at 95% CL) for resonance masses in the range between 600 and 3000 GeV.For nonresonant production, the upper limit is 1.22 pb (at 95% CL).The search sensitivity of this analysis exceeds that achieved at √ s = 8 TeV with 19.5 fb −1 for resonance masses above 1350 GeV.Furthermore, the search has been extended to cover the mass range between 2000 and 3000 GeV.
. The Higgs-boson mass is set to 125.0 GeV.Values of the signal cross section times branching ratio for G * KK → hh → b bb b with the coupling constant k/ MPl = 1 are 11.2 fb and 0.185 fb for G * KK masses of 1000 GeV and 2000 GeV, respectively.The parameter k corresponds to the curvature of the warped extra dimension and the effective four-dimensional Planck scale MPl = 2.4 × 10 18 GeV.Signal samples are also generated with k/ MPl = 2 to study broader resonances.Both the cross section and natural width depend on (k/ MPl )

Figure 1 :Figure 2 :
Figure 1: The m subl 2j vs. m lead 2j distribution for the 2-tag data sample used to model the multijet background in the resolved analysis.The signal region is the area surrounded by the inner black contour line, centered on m lead 2j =124 GeV, m subl 2j =115 GeV.The control region is the area inside the outer black contour line, excluding the signal region.The sideband region is the area outside the outer contour line.

Figure 3 :
Figure3: The m 4j distribution in the control region of the resolved analysis for the data and the predicted background (top panel).The small hatched bands drawn on the histogram and on the horizontal line in the data to background ratio (bottom panel) represent the statistical uncertainty in the total background estimate.The bottom panel also includes a first-order polynomial fit to the data-to-background ratio.The dashed lines show the ±1σ uncertainties in the two fitted parameters.

Figure 4
Figure4shows a comparison of the predicted m 4j background distribution to that observed in the data.The predicted background and observed distributions are in agreement, with no significant local excesses.

Figure 4 :
Figure 4: Distribution of m 4 j in the signal region of the resolved analysis for data compared to the predicted background.The hatched bands represent the combined statistical and systematic uncertainty in the total background estimate.The expected signal distribution for a G * KK resonance with mass of 800 GeV is also shown.

Figure 5 :Figure 6 :
Figure 5: The m subl J vs. m lead J distribution for the 2-tag data sample used to model the multijet background in the boosted analysis.The signal region is the area surrounded by the inner black contour line, centered on m lead J = 124 GeV, m subl J = 115 GeV.The control region is the area inside the outer black contour line, excluding the signal region.The sideband region is the area outside the outer contour line.The kinematic region enriched in t t events is indicated by the dashed white contour line.

A
sideband region defined by (m lead J − 124 GeV) 2 + (m subl J − 115 GeV) 2 > 36 GeV is used to measure µ tt from the data.The background estimate is validated in a control region defined to be complementary to the sideband and signal regions.

Figure 7 :
Figure7: The leading large-R jet mass distribution in the hh sideband region for data (points) and background estimate (histograms) in the boosted analysis for events in the (left) 3-tag and (right) 4-tag categories.The shape of the multijet distributions is taken from the 2-tag region and is fitted to the data.The hatched bands represent the statistical uncertainty in the total background estimate.

Figure 8 :
Figure 8: Dijet mass distribution in the control region for data (points) and background estimate (histograms) in the boosted analysis for events in the (left) 3-tag and (right) 4-tag categories.The hatched bands represent the statistical uncertainty in the total background estimate.

Figure 9 :
Figure 9: Dijet mass distribution in the hh signal region for data (points) and background estimate (histograms) in the boosted analysis for events in the (left) 3-tag and (right) 4-tag categories.The expected signal distributions for G * KK masses of 1000, 1500 and 1800 GeV are also shown.The uncertainty band includes both the statistical and systematic uncertainties in the background estimate.

Figure 10 :
Figure 10: The expected and observed upper limit for pp → G * KK → hh → b bb b in the bulk RS model with (a) k/ MPl = 1 and (b) k/ MPl = 2, as well as (c) pp → H → hh → b bb b with fixed Γ H = 1 GeV, at the 95% confidence level.The results of the resolved analysis are used up to a mass of 1100 GeV and those of the boosted analysis are used at higher mass where its expected sensitivity is higher.The red curves show the predicted cross sections as a function of resonance mass for the models considered.

Table 1 :
T > 10 GeV, |η| < 2.5, |∆η| < 1.7 Event selection requirements and definition of the different regions used in the resolved and boosted analyses.The methodologies used to estimate the background normalization and shape are also outlined.The variables are defined in the text.Dijet and large-R jet minimum p T values are indicated for leading (subleading) such objects.The functions f (m 4j ) and f (m 4j ) represent the mass dependence of the minimum p T and maximum |∆η| requirements placed on the dijet candidates in the resolved analysis.

Table 2 :
The number of events in data and predicted background events after applying the t t veto in the sideband and control regions for the resolved analysis.The uncertainties are purely statistical.The t t yield in this table, in contrast to the final result, is estimated using MC simulation.

Table 3 :
Summary of systematic uncertainties (expressed in percentage yield) in the total background and signal event yields in the signal region of the resolved analysis.Uncertainties are provided for nonresonant SM Higgs pair production, for a G * KK resonance with k/ MPl = 1 and m = 500 GeV, and for three resonances with m = 800 GeV: a G * KK resonance with k/ MPl = 1, a G * KK resonance with k/ MPl = 2, and a spin-0 narrow-width H boson.

Table 4 :
The number of predicted background events in the hh signal region for the resolved analysis, compared to the data.The yield for two potential signals, SM nonresonant Higgs pair production and an 800 GeV G * KK resonance with k/ MPl = 1 are shown.The quoted errors include both the statistical and systematic uncertainties.

Table 6 :
Summary of systematic uncertainties (expressed in percentage yield) in the total background and signal event yields in the 3-tag and 4-tag signal regions in the boosted analysis.Uncertainties are provided for a G * KK resonance mass of 1500 GeV with k/ MPl = 1 or 2, as well as for a spin-0 narrow-width H boson.
Also at School of Physics and Engineering, Sun Yat-sen University, Guangzhou, China al Also at Institute for Nuclear Research and Nuclear Energy (INRNE) of the Bulgarian Academy of Sciences, Sofia, Bulgaria am Also at Faculty of Physics, M.V.Lomonosov Moscow State University, Moscow, Russia an Also at Institute of Physics, Academia Sinica, Taipei, Taiwan ao Also at National Research Nuclear University MEPhI, Moscow, Russia ap Also at Department of Physics, Stanford University, Stanford CA, United States of America aq Also at Institute for Particle and Nuclear Physics, Wigner Research Centre for Physics, Budapest, Hungary ar Also at Flensburg University of Applied Sciences, Flensburg, Germany as Also at University of Malaya, Department of Physics, Kuala Lumpur, Malaysia at Also at CPPM, Aix-Marseille Université and CNRS/IN2P3, Marseille, France ak * Deceased