Search for Higgs bosons decaying to aa in the μμττ final state in pp collisions at √s = 8 TeV with the ATLAS experiment

A search for the decay to a pair of new particles of either the 125 GeV Higgs boson ( h ) or a second charge parity ( CP )-even Higgs boson ( H ) is presented. The data set corresponds to an integrated luminosity of 20 . 3 fb − 1 of pp collisions at ﬃﬃﬃ s p ¼ 8 TeV recorded by the ATLAS experiment at the LHC in 2012. The search was done in the context of the next-to-minimal supersymmetric standard model, in which the new particles are the lightest neutral pseudoscalar Higgs bosons ( a ). One of the two a bosons is required to decay to two muons while the other is required to decay to two τ leptons. No significant excess is observed above the expected backgrounds in the dimuon invariant mass range from 3.7 to 50 GeV. Upper limits are placed on the production of h → aa relative to the standard model gg → h production, assuming no coupling of the a boson to quarks. The most stringent limit is placed at 3.5% for m a ¼ 3 . 75 GeV. Upper limits are also placed on the production cross section of H → aa from 2.33 to 0.72 pb, for fixed m a ¼ 5 GeV with m H ranging from 100 to 500 GeV.

Search for Higgs bosons decaying to aa in the μμττ final state in pp collisions at ffiffi s p ¼ 8 TeV with the ATLAS experiment

I. INTRODUCTION
Since its introduction, the standard model (SM) has successfully predicted several new particles, culminating in the discovery of the Higgs boson (h) in July, 2012 [1,2].The h boson discovered by ATLAS and CMS appears to have properties consistent with that predicted by the SM within current experimental uncertainties [3,4].While the many achievements of the SM are remarkable, several unanswered questions remain, such as the hierarchy problem [5][6][7][8] and the nature of the observed dark-matter abundance in the Universe.By introducing an additional symmetry between fermions and bosons, supersymmetric extensions [9][10][11][12][13][14][15][16][17] of the SM are able to address both of these issues.
The minimal supersymmetric SM (MSSM) [18][19][20][21][22] is the simplest extension to the SM that incorporates supersymmetry.It predicts four additional Higgs bosons, generally assumed to be heavier than the h boson: two neutral states, the H and A bosons, as well as two charged states, the H AE charged bosons.However, the recent discovery of only an h boson and several null-result searches for the associated enhancements in Higgs boson decay rates [23][24][25] have made the model less attractive.Specifically, the measured mass of the Higgs boson [4,26] close to 125 GeV results in the reintroduction of a small hierarchy problem in the MSSM, which is removed in the next-to-MSSM (NMSSM) [27,28] by allowing for additional contributions to the mass of the Higgs boson from new scalar particles.In particular, the NMSSM contains an additional pseudoscalar Higgs boson (a), generally assumed to have a mass lower than the h boson since its mass is protected by a Peccei-Quinn symmetry [29].The 95% confidence level (CL) upper limit on the branching ratio (BR) of the h boson decaying to non-SM particles is independently set by the ATLAS and CMS experiments to be 37% and 49% respectively [4,30].
The analysis presented in this paper targets the H → aa decay for a boson masses above the τ-lepton pairproduction threshold.The current lower limit on m H for such decays was set by the ALEPH experiment at LEP to be m H > 107 GeV for BRðH → aaÞ ¼ 1, under the assumption that the H boson couples to the Z boson with SM-strength Zh coupling [31].The DØ experiment at the Tevatron also set limits on H → aa in this m a range in the μμττ channel [32].At the LHC, the CMS experiment looked for direct a boson production with a → μμ, with m a between 5.5 and 14 GeV [33].Another search for the a boson below the τ-lepton pair-production threshold has been performed by the CMS experiment, which looked for the 4μ final state [34].
This analysis uses data corresponding to an integrated luminosity of 20.3 fb −1 of pp collisions at ffiffi ffi s p ¼ 8 TeV recorded by the ATLAS detector at the LHC in 2012.A search is performed for events consistent with the H → aa process, where one a boson decays to two muons and the other decays to a pair of τ leptons with at least one τ lepton decaying to an electron or muon.The search is performed in a range of m a values from 3.7 to 50 GeV, with m H set to 125 GeV to be consistent with the measured mass of the h boson, m h [4,26].A range of m H values from 100 to 500 GeV, with m a set to 5 GeV, is also considered.For the purposes of this analysis, the assumption is made that there is no coupling of the a boson to quarks, and the event selection has been optimized for sensitivity to m a ≲ 10 GeV (approximately the b b production threshold), for which the decay products of either a boson are likely to overlap.In this range, restricting the decays of one a boson to a pair of muons (rather than allowing for both to decay to pairs of τ leptons) reduces the total production rate of the signal by a factor of approximately 100.This decrease in signal efficiency is accepted in exchange for a very high trigger efficiency, a larger signal-to-background ratio, and an expected narrow resonance in the dimuon invariant mass (m μμ ) spectrum.The latter is used to discriminate between background and signal hypotheses based on templates derived in both data and simulation.

II. ATLAS DETECTOR
The ATLAS experiment [35] at the LHC is a multipurpose particle detector with a forward-backward symmetric cylindrical geometry and a near 4π coverage in solid angle. 1  It consists of an inner tracking detector surrounded by a thin superconducting solenoid providing a 2 T axial magnetic field, electromagnetic and hadron calorimeters, and a muon spectrometer.The inner tracking detector covers the pseudorapidity range jηj < 2.5.It consists of silicon pixel, silicon microstrip, and transition radiation tracking detectors.Lead/liquid-argon (LAr) sampling calorimeters provide electromagnetic (EM) energy measurements with high granularity.A hadron (iron/scintillator-tile) calorimeter covers the central pseudorapidity range (jηj < 1.7).The end cap and forward regions are instrumented with LAr calorimeters for EM and hadronic energy measurements up to jηj ¼ 4.9.The muon spectrometer surrounds the calorimeters and is based on three large aircore toroid superconducting magnets with eight coils each.Its bending power is in the range from 2.0 to 7.5 T m.It includes a system of precision tracking chambers and fast detectors for triggering.A three-level trigger system is used to select events.The first-level trigger is implemented in hardware and uses a subset of the detector information to reduce the accepted rate to at most 75 kHz.This is followed by two software-based trigger levels that together reduce the accepted event rate to 400 Hz on average depending on the data-taking conditions during 2012.

III. MONTE CARLO SIMULATION
Monte Carlo (MC) simulations are used in this analysis to model the H → aa signal process and to roughly estimate the composition of the SM background, which is then fitted to the data, with the exception of the low-mass SM resonances (J=ψ, ψ 0 , ϒ 1S , ϒ 2S , and ϒ 3S ).All MC simulated samples are obtained with the full ATLAS detector simulation [36] in which the particle propagation through the detector is modeled with GEANT4 [37].The effect of multiple proton-proton collisions from the same or nearby beam bunch crossings (in-time and out-of-time pileup, respectively) is incorporated into the simulation by overlaying additional minimum-bias events generated with the PYTHIA8 generator [38] onto events with a hard scattering.Simulated events are weighted to match the distribution of the average number of interactions per bunch crossing observed in data, but are otherwise reconstructed in the same manner as data.The event generators, cross sections, underlying-event parameter tunes, and the parton distribution function (PDF) sets used for simulating the SM background processes are summarized in Table I.
In each simulated H → aa signal event, a scalar H boson is produced via gluon fusion and is then forced to always decay to two pseudoscalar a bosons.Both the scalar H and pseudoscalar a bosons are expected to have narrow widths, which are negligible when compared to the detector TABLE I. Simulated signal and background samples.The background simulation is used to understand the background composition and μμ invariant mass distribution.The term "tune" refers to the choice of parameters used for the underlying-event generation.Limits are set on the signal cross section, so the entry here is listed as 'Á Á Á'.

Process
Generator Cross section Tune PDF set CTEQ6L1 [40] t t POWHEG [41,42] 1 ATLAS uses a right-hand coordinate system with its origin at the nominal interaction point (IP) in the center of the detector and the z axis along the beam pipe.The x axis points from the IP to the center of the LHC ring, and the y axis points upwards.Cylindrical coordinates ðr; ϕÞ are used in the transverse plane, ϕ being the azimuthal angle around the z axis.The pseudorapidity is defined in terms of the polar angle θ as η ¼ − ln tanðθ=2Þ.Angular distance is measured in units of ΔR ≡ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ffi ðΔηÞ 2 þ ðΔϕÞ 2 p .
resolution.The H width is set to the value for a SM Higgs boson with the same mass and the a boson width is set to 1 MeV [33].This analysis searches for a resonance in the m μμ spectrum and is therefore insensitive to the resolution for m H .The a bosons are each allowed to decay only to either two muons or two τ leptons.The decay rate of a → μμ is expected to depend on the mass of the a boson as well as the muon and τ-lepton masses [29]: The width values calculated from this equation are used by Pythia to generate signal events, under the assumption that BRða → μμÞ ¼ 1-BRða → ττÞ.BRða → μμÞ ranges from 1.25% for m a ¼ 3.7 GeV down to 0.35% for m a ¼ 50 GeV.

IV. EVENT SELECTION
The data used for this analysis were collected using a combination of a single-muon trigger and an asymmetric dimuon trigger.The transverse momentum (p T ) threshold for the single-muon trigger is 36 GeV, while the dimuon trigger thresholds for the highest p T (leading) and the second-highest p T (subleading) muons are 18 and 8 GeV, respectively [57].Muons are reconstructed by combining tracks in the inner detector (ID) with those in the muon spectrometer (MS) [58] and are selected if they satisfy the following criteria: each muon must have a p T > 16 GeV, jηj < 2.5, projected longitudinal impact parameter jz 0 sin θj < 0.4 mm, the ratio of the transverse impact parameter d 0 to its estimated uncertainty σ d 0 such that jd 0 j=σ d 0 < 3, and a track isolation requirement.The transverse and longitudinal impact parameters are defined with respect to the primary vertex, defined as the vertex with the largest P p 2 T of the associated tracks.The track isolation of each muon is defined as the scalar sum of the p T of all tracks with p T > 1 GeV, in a surrounding ΔR ¼ 0.3 cone, excluding those tracks associated with either muon.The track isolation is required to be less than 12% of the muon p T .The selected muons are organized into oppositely charged pairs, which are then ordered by the vector sum of the p T of the constituent muons.The leading pair is identified as the a → μμ candidate.Events are required to have an a → μμ candidate with p T > 40 GeV.

A. Signal regions
Two signal regions (SR) are designed to select H → aa events where one a boson directly decays to a pair of muons, and the other to a pair of τ leptons.The regions are optimized for the scenario where one τ lepton decays to either a muon or an electron and the other τ lepton is identified by selecting one to three additional tracks. 2 Each signal event must have an a → μμ candidate with a dimuon invariant mass between 2.8 and 70 GeV. 3 Additionally, the event must contain a third lepton (muon or electron), with the flavor indicated in the SR name (i.e.SRμ or SRe).In the case where the third lepton is a reconstructed muon, it must have p T > 7 GeV and jηj < 2.5.If the third lepton is a reconstructed electron, it is required to satisfy identification requirements optimized for good efficiency extending to low p T [59], and must have p T > 7 GeV and jηj < 2.47.Electrons in the transition region between the barrel and end cap EM calorimeters (1.37 < jηj < 1.55) are rejected.The two a bosons are expected to be produced back to back in the transverse plane.This topology is enhanced by requiring the azimuthal separation between the tagged a → μμ candidate and the third lepton to be greater than 1 rad.Furthermore, the two τ leptons from the decay of the a boson tend to be highly collimated for low values of m a .Therefore, the third lepton (muon or electron) is required to have one, two, or three tracks, in a cone of ΔR ¼ 0.4, not including the track of the lepton itself.Tracks are reconstructed in the ID and are required to have p T > 1 GeV, jηj < 2.5, transverse and longitudinal impact parameters, jd 0 j and jz 0 j, less than 1 mm, at least seven hits in the two silicon tracking detectors, and one hit in the innermost layer of the pixel detector if such a hit is expected.The leading p T track within the ΔR ¼ 0.4 cone around the third lepton, but not matched to the lepton itself, is denoted the lead track.It is used to approximate the axis of the second τ lepton from the a → ττ decay.The ΔR ¼ 0.2 cone, centered on this axis, may have no more than three tracks, not including the track of the third lepton.The charge of the lead track is taken to be the same as its τ-lepton ancestor and is required to be opposite to that of the lepton.Finally, the third lepton is required to pass a cut on its track isolation, defined as the scalar sum of the p T of the tracks that are within a cone of ΔR ¼ 0.4.The lead track is excluded from this calculation.The track isolation is required to be less than 12% of the lepton's p T .
If the event has more than one candidate to be the third lepton, the leading muon is used to define the signal region.The signal region has been designed to select events in which the charged decay product(s) of the second τ lepton is either an electron, a muon, a single charged hadron or three charged hadrons; each decay product is identified as a selected track.For the signals targeted by this analysis, one or more of the tracks in the last case can have p T insufficient to be selected.This results in final states with less than three reconstructed tracks.The mass range used to define the two signal regions (2.8 to 70 GeV), and ultimately used to fit the observed m μμ spectra, is wider than the range of signal hypotheses (3.7 to 50 GeV) in order to be able to increase the stability of the fit when considering the low-mass SM resonances and the low-end tail of the SM Z resonance.
If no third muon is identified, the leading electron is used.Events with muon as well as electron candidates are classified as SRμ events in order to maximize the signalto-background ratio.No veto is applied to events with more than three leptons in order to maintain signal selection efficiency for events where both τ leptons decay to muons or electrons.In these cases, the additional lepton may be selected as the lead track.Table II shows the selection efficiency for a signal simulated with m a ¼ 5 GeV and m H ¼ m h ¼ 125 GeV.

B. Validation and control regions
Two validation regions (VR) are used to test the methods of the analysis in a signal-free environment.The background in the signal region is expected to be dominated by events with jets that pass the a → ττ selection outlined above.Furthermore, the charges of the reconstructed lepton and lead track in such events are not correlated.Events in the validation regions are therefore selected with the same criteria as signal events, with the sole exception that the lepton and lead track are required to have the same charge.The signal yield in the validation regions is expected to be less than 5% (10%) of that in the muon (electron) signal region, as determined in simulation.As done on the signal region, the two regions are denoted VRμ and VRe based on the flavor of the third lepton.
Two control regions (CR) are used in this analysis to constrain the SM background, one light-flavor-dominated region CRj and one heavy-flavor-dominated region CRb.Events in each control region are required to have an a → μμ candidate with a mass between 2.8 ( 15) and 70 GeV for CRj (CRb). 4The two regions are further defined by imposing requirements on jets in the event.The light-flavor-dominated region is used to measure the nonresonant Drell-Yan background as well as the lowmass SM resonances.It is defined by requiring at least one selected jet and exactly zero b-jets.The heavy-flavordominated region must have at least two b-jets and is used to measure the t t background.All events in the signal regions are explicitly removed from the control regions.
Jets are reconstructed using the anti-k t algorithm [60] and radius parameter R ¼ 0.4 with three-dimensional energy clusters, measured in the calorimeter [61], as input.The calibration of the clusters is performed by applying different weights to the energy deposits arising from the electromagnetic and hadronic components of the showers.The final calibration of the jet energy corrects the response of the calorimeter to match the particle-level jet energy [62,63].Corrections are first determined in simulation and then improved and validated using data.Additional corrections for pileup from other pp interactions in the same or neighboring bunch crossings are applied as well [64].Jets are required to have p T > 40 GeV and jηj < 2.5.Jets with p T between 40 and 50 GeV are further required to pass a cut on the jet vertex fraction (JVF), which is defined as the p T -weighted fraction of tracks inside the jet that are determined to originate from the primary vertex of the event.The JVF is required to be greater than 0.25, but is not applied to jets that have no tracks.Jets can be tagged as containing a b-hadron using a multivariate b-tagging algorithm, and then denoted b-jets.The algorithm is based on the impact parameters of tracks and information from reconstructed secondary vertices [65].The operating point chosen for the b-tagging algorithm corresponds to a true b-jet efficiency of 60% and purity of 95%, as determined in t t simulation.

V. SIGNAL AND BACKGROUND MODELS
An unbinned, log-likelihood fit is performed on the observed dimuon invariant mass spectra in the signal regions (SRμ and SRe) to a combination of background and signal models, the details of which are described in this section.

A. Signal model
A double-sided crystal ball (CB) function [66,67] is used for the signal model; it was found to be the simplest function that was robust enough to describe the shape of the simulated signal resonances and the The generator-level selection requires at least two muons with p T above 15 GeV and jηj < 2.5.Generator-level events are further required to have a pair of such muons that have opposite charge and have p T ðμμÞ > 30 GeV.Only 0.5% of generated events have a μμττ final state due to the small rate of a → μμ.As discussed in Sec.V, a small number of events with four τ leptons is also accepted, but the rate is too small to impact the efficiency reported in this table .observed SM resonances. 5The crystal ball function is composed of a Gaussian (GA) core, with mean μ CB and width σ CB , and power-law distributions of orders 2.5 and 10 for the low-end and high-end tails.The threshold parameter α CB , which is in units of σ CB , determines the point of transition from the core to either tail.The mean of the crystal ball function is assumed to be proportional to m a , with slope a μ , while its width is assumed to be linearly dependent on both m a and m H with slopes a σ and b σ respectively.
The crystal ball function is used to model the line shape of the a → μμ resonance.A small contribution from a → ττ → μμ þ 4ν is included as a Gaussian distribution, which, due to the kinematics of the τ → μ þ 2ν decay, is expected to have a lower mean and worse resolution than the a → μμ resonance.The mean, μ ττ , and width, σ ττ , are set proportional to the corresponding parameters of the crystal ball function with the parameters k μ < 1 and k σ > 1.The fraction of a → ττ → μμ þ 4ν in the total signal is given by f ττ .The full signal model is The values of the parameters b σ , k μ , k σ , and f ττ are determined by fitting the signal simulation, while the parameters α CB , a μ , and a σ are measured in data following the procedure described in the Sec.V B, and are found to be consistent with simulation.Figure 1 shows the result of a simultaneous fit of both signal regions to all simulated signal samples, projected into SRμ for one benchmark mass point with m a ¼ 5 GeV and m H ¼ m h ¼ 125 GeV.

B. Background model
The full background model consists of several pieces: six SM resonances (J=ψ; ψ 0 ; ϒ 1S ; ϒ 2S ; ϒ 3S ; Z), a t t component, and one piece for the nonresonant continuum background (dominated by low-mass Drell-Yan events).Each SM resonance is modeled by the same double-sided crystal ball function used for the signal a → μμ resonance (see Sec. I), with the offset b σ set to zero.The mean, μ X , and width, σ X , of each resonance X are assumed to be linearly dependent on its mass (m X ) with the same slopes as in the signal model.The measured value of the mass of each resonance is found to be consistent with the PDG best-fit value [68] and are therefore constrained to the PDG value and its associated uncertainty.The low-mass resonances are combined into two composite models, ψ and ϒ, by adding the resonances of the higher spin states with fractions f ψ 0 , f ϒ 2S , and f ϒ 3S , as shown in Eqs. ( 4)- (5).
The background from t t production is modeled with a Rayleigh distribution [69], defined by multiplying m μμ by a Gaussian distribution with mean set to zero and width, σ tt .The mostly Drell-Yan continuum background is modeled by an exponential decay function, with parameter α γ Ã < 0,

FIG. 1 (color online)
. Simulated dimuon invariant mass (m μμ ) distribution and the result of the simultaneous fit projected into SRμ for one benchmark mass point with m a ¼ 5 GeV and m H ¼ m h ¼ 125 GeV.The best fit of the μμ resonance model to the h → aa signal simulation in SRμ is shown in blue with its uncertainty as a yellow band.Also shown are the forced AE1σ systematic uncertainty variations, defined in Sec.VII, in α CB and f ττ (dashed magenta).The top plot shows the simulation and fits on a linear scale, the middle on a logarithmic scale, and the % residual of each fit is shown at the bottom.The errors on the signal simulation are statistical only. 5For a given resonance X, the double-sided crystal ball function has been simplified to remove those parameters that are common to all resonances.The expression CBðμ X ; σ X ; multiplied by m μμ raised to the power n γ Ã.The full expression for the continuum background is m n γÃ μμ e α γÃ m μμ .A contribution to the background from b b production followed by two semileptonic decays of b-hadrons to muons was also considered and found to be small in the signal region, with a m μμ shape similar to the Drell-Yan component but with a rate about 1% as large.This is expected since dimuon events from b b are highly suppressed after applying the muon isolation requirements, the muon p T cuts, and the dimuon p T cut.Events from double semileptonic b-hadron decays (b → cμ þ X → μμ þ X) are found to contribute as well, but only for m μμ < 3.5 GeV (which is below the signal region) at a rate of about 10% that of Drell-Yan events.
Lastly, the full model in each region (CRj, CRb, SRμ or SRe) is defined by adding four background models: one each for the ψ and ϒ resonances, one for the t t component, and one for the Drell-Yan component.The shape parameters of each component, the fractional contribution of higher spin states, and the relative contribution of Z boson to the total Drell-Yan production (f Z ) are constrained by simultaneously fitting the control regions as described in Sec.VI.These parameters are therefore assumed to be the same in the control and signal regions.The relative contributions of each of the four background components (quantified by the independent parameters f tt , f ϒ , and f Res ) are expected to vary between the regions and are therefore measured in the fit to the signal regions (SRμ and SRe).The complete background model is shown in Eq. ( 6):

VI. BACKGROUND MEASUREMENT
A simultaneous fit of the full background model is performed on the m μμ data in the two control regions (CRj and CRb), with m μμ in the range from 2.8 (15) to 70 GeV,for CRj (CRb).The results of the fit are used to constrain the dependent parameters of the four background components as described in Sec.V B. The fitted values and uncertainties of the relevant parameters are reported in Table III and the fitted m μμ spectra in the two regions are shown in Fig. 2. A large correlation is found between a μ , a σ , and α CB while a large anticorrelation is found between σ tt and f Z .The figure also includes a comparison with Drell-Yan and t t simulation normalized to the data. 6In both control regions, the data are well described by the background model, as indicated by the small residuals shown in the figure and by the good agreement with simulation.The large number of events in the control regions provides a tight constraint on the parameters of the model.The small shift of the resonance mean masses to ð99.86 AE 0.01Þ% of their PDG values [68] is consistent with the Oð0.1%Þ systematic uncertainty on the muon energy scale.The fractional mass resolution, a σ , is found to be ð1.68AE 0.02Þ%, corresponding to an ð84 AE 1Þ MeV width for a 5 GeV dimuon resonance.

VII. SYSTEMATIC UNCERTAINTIES A. Signal model
In the signal model for the a → μμ resonance, the parameters α CB , a μ , and a σ are limited by the experimental resolution, and are thus assumed to be 100% correlated with the corresponding parameters of the fits to the SM resonances and are measured in the observed CRj data.To account for the extrapolation uncertainty from parameters for the SM resonances to those for signal, an additional uncertainty is assigned to the crystal ball function's threshold parameter, α CB .Due to the large correlation between the three parameters, this additional uncertainty assigned to α CB covers the uncertainty of the other two.The uncertainty is determined by fitting the signal model in each simulated signal sample separately and taking the maximum difference between them.Another assumption in the signal model is that the parameter, b σ , and the Gaussian description of the a → ττ → μμ þ 4ν tail, parametrized by k μ , k σ , and f ττ , are properly described in the simulation.The systematic uncertainty on α CB covers the uncertainty on b σ , since it is strongly correlated with α CB , a μ , and a σ .The other three parameters are similarly correlated with one another and, thus, only one additional systematic uncertainty is introduced.The final results of the analysis are found to have little sensitivity to the amount of a → ττ → μμ, so a conservative 50% systematic uncertainty is assigned to f ττ .

B. Signal normalization
The dominant source of systematic uncertainty on the signal normalization is found to be the theoretical uncertainty on the rate of SM gg → h=H production.In the m H range relevant for this analysis (from 100 to 500 GeV) the total uncertainty varies from 10% to 11% and is determined from the spread of the cross-section predictions using different PDF sets and their associated uncertainties, as well as from varying the factorization and renormalization scales [56].A constant 11% is used in this analysis.The next largest systematic uncertainty is on the p T resolution of the lead track and is found to be 5%.This uncertainty on the signal normalization is determined by varying the p T of each track by a conservative AE2%, and propagating the effect through the full analysis.Additional sources of systematic uncertainty include those on the trigger efficiency, the lepton reconstruction efficiency, the lepton energy scale and resolution, and the charge of the track.All of these sources were found to contribute a negligible amount to the total uncertainty on the normalization of the signal.

C. Background model
The results of the background measurement reported in Sec.VI are fitted values and associated uncertainties of the dependent parameters of the four background components-Drell Yan, t t, ψ, and ϒ.In the fit to the signal regions, each parameter is constrained by a Gaussian prior with a mean equal to its fitted value from the background measurement and width equal to the corresponding uncertainty.
Two general assumptions are made in the construction of the background model.First, the chosen functional form accurately describes the background in the signal regions.Second, the dependent parameters are the same in each region.Both of these assumptions introduce a potential bias in the final result, whereby a nonzero signal may be observed incorrectly.The tolerance of the background model for such a spurious signal is measured for values of m a and m H corresponding to each simulated signal point.The measurement is performed using a large sample of signal-free events.Simulated t t events with an identified a → μμ candidate are used as a large sample of events for the t t background component.In lieu of simulation, the observed data in the light-flavor-dominated control region are used for the Drell-Yan component.The simulated t t events are then weighted and combined with the data such that the relative contribution of t t matches the simulationbased expectation in the signal region with m μμ between 20 and 60 GeV. 7Finally, the resulting sample is scaled to the expected normalization of the signal region.The combined signal and background model is fit to the large sample of events.The potential bias is taken to be the measured rate of spurious signal þ 1σ and is found to be between 0.2% and 3.2% of the signal rate normalized to the SM gg → h production rate with BRðh → aaÞ ¼ 100%, with a maximum at m a ¼ 20 GeV and m H ¼ m h ¼ 125 GeV.The measured bias is taken as an additional uncertainty on the signal normalization.For the spurious signal calculation, a narrower m μμ range (20 to 60 GeV instead of 2.8 to 70 GeV) is used to scale the t t simulation because it is the range in which the t t background is expected to dominate.

VIII. VALIDATION OF METHODS
To test the methods of the analysis, two validation regions, VRμ and VRe (as defined in Sec.IV B), are used in place of SRμ and SRe.The validation regions are designed to have properties similar to the signal regions and to also test the robustness of the method against variations in the background composition, since no a priori assumptions are made about the relative contributions from t t, Drell Yan, J=ψ or ϒ.Furthermore, the validation checks for non-negligible backgrounds with dimuon invariant mass distributions that are unaccounted for in the background model.The results of the simultaneous fit of the full background model (including all relevant systematic uncertainties) to the validation regions are shown in Fig. 3.In the fit, all region-independent parameters are constrained by the results of the fit to the control regions (reported in Table III).The strong correlations between some of the region-independent parameters, which are reported in Sec.VI, are not explicitly accounted for in the fit to the two validation regions.This simplification is found to have a negligible effect on the results of the fit.The consistency with the background-only model is evaluated by scanning the local p-value as a function of m μμ from 3.7 to 50 GeV,   (5,10,and 20 GeV).Simulated SM backgrounds are shown in the stack, with the Z=γ Ã sample only valid above m μμ > 10 GeV.calculated using frequentist hypothesis tests based on the profile-log-likelihood ratio test statistic and approximated with the asymptotic formulas [70].The p-values are evaluated in 50 MeV intervals below 15 GeV, then 100 MeV intervals up to 30 GeV, and 200 MeV intervals up to m μμ ¼ 50 GeV.The minimum local p-value is found for m μμ ¼ 47.4 GeV to be 0.0074, corresponding to a local significance of 2.44σ.Correcting for the look-elsewhere effect [71] gives a global p-value of 0.31, indicating that at least one excess of this magnitude, or larger, is expected from background fluctuations in at least 31% of experiments.

IX. RESULTS
A simultaneous fit of the full background model is performed on the m μμ spectra in the two signal regions (SRμ and SRe), with m μμ in the range from 2.8 to 70 GeV.The observed m μμ distribution and the background-only fit are shown in Fig. 4; the data are well described by the fit.In the fit, all region-independent parameters are constrained by the results of the fit to the control regions, reported in Table III.The strong correlations between some of the region-independent parameters, which are reported in Sec.VI, are not explicitly accounted for in the fit to the two signal regions.This simplification is found to have a negligible effect on the results of the fit.The fitted values and uncertainties of the remaining parameters, as well as the corresponding values from the fits to the control and validation regions, are shown in Table IV.No significant correlations are found between the parameters listed in Table IV.The consistency with the background-only model is evaluated by scanning the local p-value as a function of m μμ from 3.7 to 50 GeV, using the same calculation, m a    [56].In both figures, the observed and expected limits have been scaled by an Oð1Þ parameter, BRða → ττÞ 2 , to account for the branching ratios assumed in this analysis, and facilitate reinterpretation of the results.range, and intervals used in the scan of the validation region.The results of this scan are reported in Fig. 5.The minimum local p-value is found for m μμ ¼ 8.65 GeV to be 0.0223, corresponding to a local significance of 2.01σ.Correcting for the look-elsewhere effect [71] gives a global p-value > 0.5, indicating that at least one excess of this magnitude, or larger, is expected from background fluctuations in at least 50% of experiments.
With no evidence to support the NMSSM hypothesis, a 95% CL limit can be set using the CL s prescription [72].Figure 6 shows the observed and expected limits on the rate ðσðgg → hÞ × BRðh → aaÞÞ relative to the SM Higgs boson gluon-gluon fusion production cross section (σ SM ), calculated at NLO þ NNLL precision [56], as a function of m a with m H set to 125 GeV.The limits are evaluated in the same intervals used for the p-value scan.Also shown in the figure is the total rate ðσðgg → HÞ × BRðH → aaÞÞ as a function of m H with m a set to 5 GeV, evaluated at 50 GeV intervals from m H ¼ 100 GeV to 500 GeV and at m H ¼ m h ¼ 125 GeV.In both panels of Fig. 6, the observed and expected limits have been scaled by BRða → ττÞ 2 to explicitly account for the branching ratios assumed in this analysis and facilitate reinterpretation of the results.

X. CONCLUSION
A search for the decay of a scalar Higgs boson to two pseudoscalar a Higgs bosons (H → aa) in the context of the NMSSM is presented with LHC data corresponding to an integrated luminosity of 20.3 fb −1 of pp collisions at ffiffi ffi s p ¼ 8 TeV, collected in 2012 by the ATLAS experiment.
Final states are considered with two muons consistent with the decay of one a boson as well as a third lepton (e or μ) and tracks, consistent with collimated τ leptons from the other a boson.A scan of the dimuon invariant mass distribution from 3.7 to 50 GeV shows no significant excess of data over SM backgrounds.Limits are set assuming no coupling of the a boson to quarks.The observed 95% CL upper limits on the production rate, σðgg → HÞ × BRðH → aaÞ, are consistent with the expected limit and are determined to be from 2.33 to 0.72 pb, for m H between 100 and 500 GeV (and m a ¼ 5 GeV).A 95% CL upper limit for the production of the h boson and its decay rate to two pseudoscalar a bosons is set for m a from 3.7 to 50 GeV, with the most stringent limit placed at 3.5% for m a ¼ 3.75 GeV.

FIG. 2 (
FIG. 2 (color online).Observed m μμ distribution in CRj (top) and CRb (bottom) and the SM background model after a simultaneous fit.The Z=γ Ã component of the fit is the combination of the Z boson resonance and the γ Ã continuum models.The % residual of the fit is shown below each plot.Simulated SM backgrounds are shown in the stack, with the Z=γ Ã sample only valid above m μμ > 10 GeV.The two insets show magnified versions of the J=ψ and ϒ resonances.

FIG. 3 (
FIG. 3 (color online).Observed m μμ distribution in VRμ (top) and VRe (bottom) and the background-only fit.The Z=γ Ã component of the fit is the combination of the Z boson resonance and the γ Ã continuum models.The % residuals are shown below each plot.Bins below 4 GeV are 200 MeV wide, between 4 and 15 GeV they are 500 MeV wide, and above 15 GeV they are 2 GeV wide.Simulated SM backgrounds are shown in the stack, with the Z=γ Ã sample only valid above m μμ > 10 GeV.

FIG. 4 (
FIG. 4 (color online).Observed m μμ distribution in SRμ (top) and SRe (bottom) and the background-only fit.The Z=γ Ã component of the fit is the combination of the Z boson resonance and the γ Ã continuum models.The % residuals are shown below each plot.Bins below 4 GeV are 200 MeV wide, between 4 GeV and 15 GeV they are 500 MeV wide, and above 15 GeV they are 2 GeV wide.The expected distribution from a signal with BRðh → aaÞ ¼ 10% is shown for three different m a hypotheses(5,10, and 20 GeV).Simulated SM backgrounds are shown in the stack, with the Z=γ Ã sample only valid above m μμ > 10 GeV.

FIG. 5 (
FIG. 5 (color online).Observed p-value as a function of m μμ , with downward fluctuations of the data represented by a p-value of 0.5.The p-values are evaluated in 50 MeV intervals below 15 GeV, then 100 MeV intervals up to 30 GeV, and 200 MeV intervals up to m μμ ¼ 50 GeV.The p-values shown have not been corrected for the look-elsewhere effect.

FIG. 6 (
FIG. 6 (color online).Observed (solid red) and expected (dashed black) limits with the expected AE1σ and AE2σ bands shown in green and yellow respectively.The top figure shows the expected and observed limits on the rate ðσðgg → hÞ × BRðh → aaÞÞ relative to the SM Higgs boson gluon-gluon fusion production cross section (σ SM ) as a function of m a with m H set to 125 GeV.The limits are evaluated in 50 MeV intervals below 15 GeV, then 100 MeV intervals up to 30 GeV, and 200 MeV intervals up to m a ¼ 50 GeV.Shown in the bottom figure is the total rate ðσðgg → HÞ × BRðH → aaÞÞ as a function of m H with m a set to 5 GeV, evaluated at 50 GeV intervals from m H ¼ 100 GeV to 500 GeV and at m H ¼ m h ¼ 125 GeV.The width of the black band in the bottom figure indicates the theoretical uncertainty on the SM gg → H cross section[56].In both figures, the observed and expected limits have been scaled by an Oð1Þ parameter, BRða → ττÞ 2 , to account for the branching ratios assumed in this analysis, and facilitate reinterpretation of the results.

TABLE II .
Relative efficiency for each selection step for a signal simulated with m a ¼ 5 GeV and m H ¼ m h ¼ 125 GeV.The first entry is a generator-level selection relative to the number of generated h → aa → μμττ events.a The top section of the table shows the efficiency for selecting a → μμ candidates while the bottom part is for the selection of events in the two signal regions.

TABLE III .
Measured values and uncertainties for the SM background model's parameters, which are constrained by a simultaneous fit to the control regions.The errors on the fit parameters are statistical only.

TABLE IV .
Measured values and uncertainties of regiondependent parameters.The m μμ distribution is fit between 2.8 and 70 GeV for all regions, except for CRb, which has a lower bound at 15 GeV.There is no contribution to the total background from the ψ or ϒ resonances.