Search for the Higgs boson decaying to two muons in proton-proton collisions at $\sqrt{s} =$ 13 TeV

A search for the Higgs boson decaying to two oppositely charged muons is presented using data recorded by the CMS experiment at the CERN LHC in 2016 at a center-of-mass energy $\sqrt{s} =$ 13 TeV, corresponding to an integrated luminosity of 35.9 fb$^{-1}$. Data are found to be compatible with the predicted background. For a Higgs boson with a mass of 125.09 GeV, the 95% confidence level observed (background-only expected) upper limit on the production cross section times branching fraction to a pair of muons is found to be 3.0 (2.5) times the standard model expectation. In combination with data recorded at center-of-mass energies $\sqrt{s} =$ 7 and 8 TeV, the background-only expected upper limit improves to 2.2 times the standard model value with a standard model expected significance of 1.0 standard deviations. The corresponding observed upper limit is 2.9 with an observed significance of 0.9 standard deviations. This corresponds to an observed upper limit on the standard model Higgs boson branching fraction to muons of 6.4 $\times$ 10$^{-4}$ and to an observed signal strength of 1.0 $\pm$ 1.0 (stat) $\pm$ 0.1 (syst).


In the standard model (SM), the masses of fermions are generated by their Yukawa coupling to the Higgs field [1-4], whose existence was confirmed by the Higgs boson (H) discovery [5-7]. Measurements at CMS and ATLAS provided evidence that the Higgs boson couples to bottom quarks [8,9], and established that it couples to tau leptons [10,11] and top quarks [12,13]. The Higgs boson mass has been measured to be $m_{\mathrm{H}} = 125.09 \pm 0.24$ GeV in a combination of ATLAS and CMS data samples [14]. The study of Higgs boson decays to muons is of particular importance because it extends the investigation to the couplings to fermions of the second generation. For a Higgs boson with a mass of 125.09 GeV, the expected branching fraction ($\mathcal{B}$) to muons is $2.17 \times 10^{-4}$ [15], and the narrow decay width of the Higgs boson [16,17] is several orders of magnitude smaller than the $\mathcal{O}$(GeV) experimental dimuon mass resolution. The signal would appear as a narrow resonance over a smoothly falling mass spectrum from the SM background processes, primarily Drell-Yan (DY) and leptonic $\mathrm{t\bar{t}}$ decays.
The CMS and ATLAS Collaborations placed upper limits on the product of the Higgs boson production cross section and branching fraction $\mathcal{B}(\mathrm{H} \to \mu^{+}\mu^{-})$ of approximately seven times the SM value at 95% confidence level (CL) with LHC Run 1 data [18,19], collected at center-of-mass energies $\sqrt{s} =$ 7 and 8 TeV. The ATLAS Collaboration improved its observed (expected) limit to 2.8 (2.9) times the SM expectation by adding data collected at 13 TeV [20]. This Letter presents a search for $\mathrm{H} \to \mu^{+}\mu^{-}$ events with the CMS detector using 35.9 fb$^{-1}$ of proton-proton (pp) collision data collected in 2016 at $\sqrt{s} =$ 13 TeV, and its combination with the data collected at $\sqrt{s} =$ 7 and 8 TeV, corresponding to integrated luminosities of 5.0 fb$^{-1}$ and 19.7 fb$^{-1}$, respectively.
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter, and a brass and scintillator hadron calorimeter, each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity ($\eta$) coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [21].
The Monte Carlo (MC) simulated events used to model the signal include the four leading Higgs boson production processes: gluon-gluon fusion (ggH), vector boson fusion (VBF), and associated production with a vector boson (VH, where V = W or Z) or top quarks ($\mathrm{t\bar{t}H}$). The Higgs boson MC samples are generated at next-to-leading order (NLO) for masses of 120, 125, and 130 GeV with POWHEG 2.0 [22], using the parton distribution function sets of NNPDF3.0 [23]. The ggH acceptance in each analysis category is found to be in agreement with that calculated at NLO with the MADGRAPH5_aMC@NLO 2.2.2 [24] generator. The SM background processes considered are DY, single and pair production of top quarks (st, $\mathrm{t\bar{t}}$), and di- and triboson production (VV, VVV). Background samples are generated using MADGRAPH5_aMC@NLO and POWHEG. Spin correlations in multiboson processes generated using MADGRAPH5_aMC@NLO are simulated using MADSPIN [25]. The parton shower and hadronization processes are modeled by the PYTHIA 8.212 [26] generator with the CUETP8M1 [27] underlying event tune. The detector response is based on a detailed description of the CMS detector and is simulated with the GEANT4 package [28]. Simultaneous pp interactions overlapping the event of interest (pileup) are included in the simulated samples. The distribution of the number of additional interactions per bunch crossing in the simulation corresponds to that observed in the 13 TeV data collected in 2016, with an average of 23 interactions. The SM Higgs boson cross section and branching fractions are taken from the LHC Higgs boson cross section working group recommendations [15], while cross sections for the background processes are taken from FEWZ 3.1 [29], TOP++ 2.0 [30], HATHOR [31,32], and MCFM [33]. Simulated background processes are used only to optimize the event selection, not for the final background estimate, which is obtained from data.
The particle-flow (PF) algorithm [34] is used to reconstruct observable particles in each event. It combines all subdetector information to reconstruct individual particles and identify them as charged or neutral hadrons, photons, or leptons. Electrons and muons are formed by associating a track in the silicon detectors with a cluster of energy in the electromagnetic calorimeter [35] or a track in the muon system, respectively. The relative transverse momentum ($p_{\mathrm{T}}$) resolution of muon candidates with $p_{\mathrm{T}} < 100$ GeV is 1.6% in the barrel [36]. Jets are reconstructed using the anti-$k_{\mathrm{T}}$ clustering algorithm [37] with a distance parameter of 0.4, as implemented in the FASTJET package [38]. Jets are required to have a minimum $p_{\mathrm{T}}$ of 30 GeV and a maximum $|\eta|$ of 4.7. Further identification criteria are applied in order to reject jets from pileup or noise present in the detector [39]. For jets with $|\eta| < 2.4$, multivariate algorithms discriminate jets arising from the hadronization of b quarks [40]. The missing transverse momentum $p_{\mathrm{T}}^{\text{miss}}$ is defined as the magnitude of the negative vector $p_{\mathrm{T}}$ sum of all reconstructed particles (charged and neutral) in the event and is modified by corrections to the energy scale of reconstructed jets. The reconstructed vertex with the largest value of summed physics-object $p_{\mathrm{T}}^{2}$ is taken to be the primary pp interaction vertex.
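The $p_{\mathrm{T}}^{\text{miss}}$ definition above can be illustrated with a minimal sketch. The function name `missing_pt` and the `(pt, phi)` tuple layout are illustrative assumptions, not part of any CMS software, and the jet energy scale corrections mentioned in the text are not applied here.

```python
import math


def missing_pt(particles):
    """Magnitude of the negative vector pT sum.

    `particles` is an iterable of (pt, phi) pairs, with pt in GeV and phi in
    radians. This layout is a hypothetical choice made for the illustration.
    """
    px = -sum(pt * math.cos(phi) for pt, phi in particles)
    py = -sum(pt * math.sin(phi) for pt, phi in particles)
    return math.hypot(px, py)
```

Two back-to-back particles of equal $p_{\mathrm{T}}$ balance each other and give a value close to zero, while a single unbalanced particle gives a value equal to its own $p_{\mathrm{T}}$.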
Events are selected by the trigger system requiring the presence of at least one isolated muon with $p_{\mathrm{T}} > 24$ GeV [41]. The offline selection requires two oppositely charged muons with $p_{\mathrm{T}} > 26$ GeV ($p_{\mathrm{T}} > 20$ GeV) for the leading (subleading) muon and $|\eta| < 2.4$. To reject events with muons from nonprompt decays, muons must be isolated, with a relative isolation sum below 25%. The relative isolation sum is calculated as the scalar $p_{\mathrm{T}}$ sum of PF objects, excluding the muon, within a cone of radius $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2} = 0.4$ centered on the direction of the muon, divided by the muon $p_{\mathrm{T}}$. Charged particles not associated with the event vertex are not considered in this sum, and a correction is applied in order to account for the neutral particle contamination arising from pileup [42]. The invariant mass of the Higgs boson candidate ($m_{\mu\mu}$) is constructed from the two highest $p_{\mathrm{T}}$ oppositely charged muons, and the event is retained for further analysis if $110 < m_{\mu\mu} < 150$ GeV. The overall trigger efficiency for these events is 98.5%.
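The cone variable $\Delta R$, the relative isolation sum, and the dimuon invariant mass described above can be sketched as follows. All function names and the `(pt, eta, phi)` tuple layout are assumptions made for the illustration. The sketch omits the vertex association of charged particles and the pileup correction for neutral particles that the text describes.

```python
import math

MUON_MASS_GEV = 0.1056583745  # muon mass in GeV


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def relative_isolation(muon, pf_objects, cone=0.4):
    """Scalar pT sum of PF objects inside the cone, divided by the muon pT.

    `muon` and each entry of `pf_objects` are (pt, eta, phi) tuples;
    `pf_objects` must not contain the muon itself.
    """
    pt, eta, phi = muon
    cone_sum = sum(o_pt for o_pt, o_eta, o_phi in pf_objects
                   if delta_r(eta, phi, o_eta, o_phi) < cone)
    return cone_sum / pt


def dimuon_mass(mu1, mu2):
    """Invariant mass of two muons given as (pt, eta, phi) tuples."""
    energy = px = py = pz = 0.0
    for pt, eta, phi in (mu1, mu2):
        x, y, z = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
        px, py, pz = px + x, py + y, pz + z
        energy += math.sqrt(x * x + y * y + z * z + MUON_MASS_GEV ** 2)
    mass_squared = energy ** 2 - (px ** 2 + py ** 2 + pz ** 2)
    return math.sqrt(max(mass_squared, 0.0))
```

For example, two central back-to-back muons with $p_{\mathrm{T}} = 62.5$ GeV each give a dimuon mass very close to 125 GeV, inside the 110 to 150 GeV window retained by the selection.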
Events are classified into categories using variables that are largely uncorrelated with $m_{\mu\mu}$ in order to enhance the sensitivity to the Higgs boson signal. The primary Higgs boson production mechanisms targeted by this analysis are VBF and ggH. The $p_{\mathrm{T}}$ and $\eta$ of the dimuon system, and the $|\Delta\eta|$ and $|\Delta\phi|$ between the muons, distinguish between ggH signal events and the DY background. The $|\eta|$ of each of the two highest $p_{\mathrm{T}}$ jets, the mass and $|\Delta\eta|$ between the jets in each of the two highest mass dijet pairs, and the number of jets with $|\eta| < 2.4$ (central jets) and $|\eta| > 2.4$ (forward jets) identify VBF signal events. Finally, the number of b-tagged jets and $p_{\mathrm{T}}^{\text{miss}}$ identify events with $\mathrm{t\bar{t}}$ decays. These variables are used as input to a boosted decision tree (BDT) [43], which was trained with simulated signal and background events normalized to their respective SM cross sections. The dimuon mass resolution is intentionally not used as input to the BDT in order to avoid biasing the background shape. Simulated signal events used in the training steps are not used later in the analysis. Figure 1 shows the BDT output distributions for data and for simulated events. The output of the classifier was transformed such that the sum of all signal events has a uniform distribution. A large fraction of the VBF signal events can be distinguished from background processes, and these events have the highest BDT scores.
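One common way to obtain a classifier output that is uniform for signal is to map each raw score through the cumulative distribution of the signal scores. The text does not state how the transformation was carried out, so the sketch below is only an assumption of that approach; the function name and its arguments are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def make_signal_uniform_transform(signal_scores, signal_weights=None):
    """Return a function mapping a raw score to the signal's empirical CDF.

    Applied to the signal sample itself, the transformed score is
    approximately uniform on [0, 1].
    """
    if signal_weights is None:
        signal_weights = [1.0] * len(signal_scores)
    pairs = sorted(zip(signal_scores, signal_weights))
    scores = [score for score, _ in pairs]
    cumulative = list(accumulate(weight for _, weight in pairs))
    total = cumulative[-1]

    def transform(score):
        # Weighted fraction of signal events with a raw score <= `score`.
        n_below = bisect_right(scores, score)
        return cumulative[n_below - 1] / total if n_below else 0.0

    return transform
```

With such a mapping, equal-width intervals of the transformed score contain equal fractions of the expected signal, so background-like and signal-like regions can be compared directly.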
The event categories are defined using the BDT score and the expected dimuon mass resolution, gauged by the largest $|\eta|$ of the two muons. The best mass resolution is obtained when both muons are located in the central part of the detector, $|\eta| < 0.9$, where the muon momentum resolution is approximately constant, and degrades when one of the muons is more forward, especially in the region $|\eta| > 1.9$.
The number of categories and the values of the BDT and $|\eta|$ boundaries of the categories were optimized according to an iterative process using $\sum_i S_i^2/B_i$ as a figure of merit, where $S_i$ and $B_i$ are the numbers of expected signal and background events in the $i$th dimuon mass bin of a category, with bins of 0.5 GeV spacing from 120 to 130 GeV. A first category boundary is created by optimizing the figure of merit against all possible boundaries in $|\eta|$ and in BDT score separately, and then choosing the one with the larger gain. The process is then repeated recursively within each of the two newly created categories to create additional category boundaries within them until a set number of categories is achieved. The values of the boundaries were rounded afterward, after checking that this simplification does not significantly worsen the expected limit. This procedure incorporates the dimuon mass resolution into the definition of the categories, optimizing the sensitivity of the analysis. The optimization results in the 15 categories shown in Table 1. Simulated events are used to optimize the event categories and to estimate the selection efficiency for signal events. In each category, the shape and the normalization of the dimuon mass distribution of the background contributions are obtained from a parametric fit to the data using a set of empirical functions. The product of signal acceptance and efficiency for the $\mathrm{H} \to \mu^{+}\mu^{-}$ signal varies depending on the production process. This product is shown in Table 1 for each category for a Higgs boson mass of 125 GeV, together with the functional form used to derive the background from data and the $S/\sqrt{B}$ ratio within the full width at half maximum (FWHM) of the expected signal distribution.
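A single step of the boundary search described above can be sketched as follows. The event layout `(mass, variable, weight)`, the function names, and the list of candidate boundaries are assumptions made for the illustration. The recursion over sub-categories and the separate comparison between $|\eta|$ and BDT boundaries are left out.

```python
def mass_histogram(events, lo=120.0, hi=130.0, width=0.5):
    """Sum event weights in dimuon mass bins; events are (mass, variable, weight)."""
    n_bins = int(round((hi - lo) / width))
    hist = [0.0] * n_bins
    for mass, _, weight in events:
        if lo <= mass < hi:
            hist[int((mass - lo) / width)] += weight
    return hist


def figure_of_merit(signal, background):
    """Sum over mass bins of S_i^2 / B_i (bins with no background are skipped)."""
    s_hist = mass_histogram(signal)
    b_hist = mass_histogram(background)
    return sum(s * s / b for s, b in zip(s_hist, b_hist) if b > 0.0)


def best_boundary(signal, background, candidates):
    """Return the candidate cut on `variable` with the largest summed figure of merit.

    Returns (None, unsplit_value) if no candidate improves on the unsplit category.
    """
    best_cut, best_fom = None, figure_of_merit(signal, background)
    for cut in candidates:
        fom = 0.0
        for keep in (lambda e: e[1] < cut, lambda e: e[1] >= cut):
            fom += figure_of_merit([e for e in signal if keep(e)],
                                   [e for e in background if keep(e)])
        if fom > best_fom:
            best_cut, best_fom = cut, fom
    return best_cut, best_fom
```

Running `best_boundary` once for the BDT score and once for the largest muon $|\eta|$, then keeping whichever gives the larger gain, would mimic one iteration of the procedure in the text.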
The reconstructed invariant mass of the signal is modeled with a sum of up to three Gaussian functions, which provides a satisfactory description of the distribution, including its low-mass tail. These empirical parametric shapes are fit separately to the simulated dimuon invariant mass distribution for each production process in each category for $m_{\mathrm{H}} =$ 120, 125, and 130 GeV. The fit parameters are interpolated for masses within that range.
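The signal model above can be sketched as a weighted sum of Gaussian functions, from which the FWHM is obtained numerically. The component layout `(fraction, mean, sigma)`, the grid-scan FWHM estimate, and the piecewise-linear interpolation are assumptions made for the illustration; the text does not specify the interpolation scheme.

```python
import math


def gaussian(x, mean, sigma):
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def signal_pdf(x, components):
    """Sum of Gaussians; `components` is a list of (fraction, mean, sigma)."""
    return sum(f * gaussian(x, m, s) for f, m, s in components)


def fwhm(components, lo=110.0, hi=150.0, step=0.001):
    """Full width at half maximum from a grid scan (assumes a single peak)."""
    n_points = int(round((hi - lo) / step)) + 1
    xs = [lo + i * step for i in range(n_points)]
    ys = [signal_pdf(x, components) for x in xs]
    half = 0.5 * max(ys)
    above = [x for x, y in zip(xs, ys) if y >= half]
    return above[-1] - above[0]


def interpolate(m_h, mass_points, values):
    """Piecewise-linear interpolation of one fit parameter between mass points."""
    for i in range(len(mass_points) - 1):
        m0, m1 = mass_points[i], mass_points[i + 1]
        if m0 <= m_h <= m1:
            v0, v1 = values[i], values[i + 1]
            return v0 + (v1 - v0) * (m_h - m0) / (m1 - m0)
    raise ValueError("mass outside the interpolation range")
```

For a single Gaussian the scan reproduces the familiar relation FWHM $\approx 2.355\,\sigma$, up to the grid spacing.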
The invariant mass distribution of the background primarily follows the smoothly falling spectrum of the high-mass DY background. Secondary contributions come from the single and pair production of top quarks. In each category, the background distribution is modeled by fitting the data with a single analytic function, chosen from a set of alternative options. These include a sum of exponential functions, Bernstein polynomials ($B_{\text{deg } n}$), and a modified version of the Breit-Wigner Z boson line shape derived and validated by fitting FEWZ predictions of the DY invariant mass distribution at next-to-NLO [44,45], in which the mass $m_{\mathrm{Z}}$ and the width $\Gamma_{\mathrm{Z}}$ of the Z boson are fixed to known values [46]. In addition, FEWZ spectra templates multiplied by polynomial functions are considered, as well as a modified Breit-Wigner distribution multiplied by a Bernstein polynomial of up to degree 4 ($B_{\text{deg } 4}$). The chosen function maximizes the expected sensitivity without introducing a bias in the measured signal yield. The bias is determined as follows. In each category, background-only fits to the data are performed with every function. From each of these fits, thousands of pseudo-data sets are generated, taking into account the uncertainties in the fit parameters and their correlations, and simulated signal events are added according to their expected SM yields. Each of the background functions is then used to fit the pseudo-data sets generated from every other function, with the total signal yield floating freely in the fit. The bias is estimated as the median excess or deficit in the measured signal yield relative to the SM expectation. Accepted functions in each category have a maximum possible bias of less than 20% of the statistical uncertainty for $m_{\mathrm{H}} =$ 120, 125, and 130 GeV. This corresponds to an overall uncertainty in the calculated limit of less than 1%, which is neglected. For the final result, the signal yield is measured with a simultaneous signal-plus-background profile likelihood fit of all categories.
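The idea behind the bias study can be illustrated with a heavily simplified toy. It is not the procedure used in the analysis: the toy replaces the likelihood fit with a least-squares fit in the sidebands, injects no signal, uses a Gaussian approximation to Poisson fluctuations, and takes a hypothetical 120 to 130 GeV window and hypothetical background shapes. It only shows how a mismatch between the generating and fitting functions appears as a nonzero median signal yield.

```python
import math
import random
import statistics

CENTERS = [110.5 + i for i in range(40)]          # 1 GeV bins over 110-150 GeV
IN_WINDOW = [120.0 < c < 130.0 for c in CENTERS]  # hypothetical signal window


def fit_exponential(xs, ys):
    """Least-squares straight line through (x, log y); returns x -> exp(a + b*x)."""
    logs = [math.log(y) for y in ys]
    mean_x = sum(xs) / len(xs)
    mean_l = sum(logs) / len(logs)
    slope = (sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_l - slope * mean_x
    return lambda x: math.exp(intercept + slope * x)


def median_bias(truth, n_toys=2000, seed=1):
    """Median spurious signal in units of the window's statistical uncertainty."""
    rng = random.Random(seed)
    side_x = [c for c, w in zip(CENTERS, IN_WINDOW) if not w]
    pulls = []
    for _ in range(n_toys):
        # Gaussian approximation to Poisson fluctuations (bin contents are large).
        counts = [rng.gauss(truth(c), math.sqrt(truth(c))) for c in CENTERS]
        side_y = [n for n, w in zip(counts, IN_WINDOW) if not w]
        model = fit_exponential(side_x, side_y)
        observed = sum(n for n, w in zip(counts, IN_WINDOW) if w)
        predicted = sum(model(c) for c, w in zip(CENTERS, IN_WINDOW) if w)
        pulls.append((observed - predicted) / math.sqrt(observed))
    return statistics.median(pulls)


def exponential(m):
    """Hypothetical background shape, same family as the fit function."""
    return 1.0e4 * math.exp(-0.04 * (m - 110.0))


def power_law(m):
    """Hypothetical alternative background shape."""
    return 1.0e4 * (m / 110.0) ** -5
```

In this toy, `median_bias(exponential)` stays well below the 20% criterion quoted in the text, because the fit function matches the generating shape. `median_bias(power_law)` does not, which is the kind of function mismatch the acceptance criterion is designed to reject.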
The systematic uncertainties considered in the analysis account for possible mismodeling in the signal shape or rate. The shape of the reconstructed Higgs boson invariant mass is affected by the muon momentum scale and resolution. Uncertainties in the calibration of these values are propagated to the shape of the invariant mass distribution of the Higgs boson, assuming a Gaussian prior, yielding variations of up to 0.05% in the position of the peak and up to 10% in its width. Jet energy uncertainties in scale and resolution affect the analysis through migrations between categories. The largest variation of this kind amounts to 6% of the relative yield. Uncertainty in the simulation of additional pileup events is modeled by varying the total inelastic cross section [47,48] by ±5%, which translates to ≈1% variations in the yields. The systematic uncertainties in the b tagging efficiency and in the light-quark and gluon jet mistagging efficiencies result in event migration across categories of ≈1%. Lepton efficiency mismodeling is accounted for with trigger and isolated muon identification uncertainties (≈2%). The factorization and renormalization scales used in the MC simulations are varied up and down separately by a factor of 2, translating to changes of up to 6% in the signal acceptance per category. The parton distribution functions used in the signal MC simulations are varied using the NNPDF3.0 replicas, which yield differences of ≈2%. In the comparison of measured signal yields with expectation, additional uncertainties in the calculated signal cross sections are considered. They are due to the choice of factorization and renormalization scale (3.9, 0.4, 3.8, 1.9, and 10%, for ggH, VBF, ZH, WH, and $\mathrm{t\bar{t}H}$, respectively) and parton distribution functions (3.2, 2.1, 1.6, 1.9, and 3.7%), as well as the 1.7% uncertainty in the $\mathrm{H} \to \mu^{+}\mu^{-}$ branching fraction [15]. Finally, a 2.5% uncertainty is associated with the integrated luminosity measurement [49].

Table 1: The optimized event categories, the product of acceptance and selection efficiency in % for the different production processes, the total expected number of SM signal events ($m_{\mathrm{H}} =$ 125 GeV), the estimated number of background events per GeV at 125 GeV, the FWHM of the signal peak, the background functional fit form, and the $S/\sqrt{B}$ ratio within the FWHM of the expected signal distribution.
A maximum likelihood signal-plus-background fit to the dimuon invariant mass spectrum is performed across all categories to measure the signal strength modifier $\mu$, defined as $(\sigma\mathcal{B})_{\text{obs}} / (\sigma\mathcal{B})_{\text{SM}}$, where $\sigma$ indicates the Higgs boson production cross section. The best fit signal strength for a Higgs boson mass hypothesis of 125.09 GeV ($\hat{\mu}_{125}$) and its 68% CL interval are extracted with a profile likelihood ratio, according to the procedure described in Ref. [50], yielding $\hat{\mu}_{125} = 0.7 \pm 1.0$ (stat) $^{+0.2}_{-0.1}$ (syst) for $m_{\mathrm{H}} =$ 125.09 GeV [51]. Figure 2 shows the background component and the signal-plus-background fits to the data in all categories combined, weighted by the expected signal-to-background ratio in each category. The 95% CL upper limit on the signal strength modifier, computed with the asymptotic CL$_{\mathrm{s}}$ method [52-54], and the compatibility of the dimuon yield with the background-only hypothesis for the 2016 data set (13 TeV) are also derived. The observed (expected for $\mu = 0$) upper limit at 95% CL for $m_{\mathrm{H}} =$ 125.09 GeV is 2.95 (2.45), with an observed (expected for $\mu = 1$) significance of the incompatibility with the background-only hypothesis of 0.6 (0.9) standard deviations (s.d.).
The 95% CL upper limit on the signal strength as a function of $m_{\mathrm{H}}$ in the region around the Higgs boson mass for a combination of data recorded at center-of-mass energies of 7, 8, and 13 TeV is shown in Fig. 3, and yields an observed (expected for $\mu = 0$) limit on the production rate of 2.92 (2.16) times the SM value at $m_{\mathrm{H}} =$ 125.09 GeV. The observed limit generally agrees well with the expected limit curve for $\mu = 1$ that is also shown, and corresponds to an upper limit on the $\mathrm{H} \to \mu^{+}\mu^{-}$ branching fraction of $6.4 \times 10^{-4}$, assuming the SM production cross sections. The best fit signal strength for $m_{\mathrm{H}} =$ 125.09 GeV is $\hat{\mu}^{\text{comb}}_{125} = 1.0 \pm 1.0$ (stat) $^{+0.1}_{-0.1}$ (syst), and the observed combined significance is 0.9 s.d. The expected value for $\mu = 1$ is $\hat{\mu}^{\text{comb}}_{125} = 1.0^{+1.1}_{-1.0}$ and the combined expected significance is 1.0 s.d. Theoretical uncertainties are considered correlated across the data sets, while the main experimental uncertainties are considered uncorrelated.

In summary, we present a search for the Higgs boson decaying to two muons using data recorded by the CMS experiment at the LHC in 2016 at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 35.9 fb$^{-1}$. No significant evidence for this decay is observed. Limits are set on the cross section times branching fraction of the Higgs boson decaying to two muons. The combination with data recorded at center-of-mass energies of 7 and 8 TeV yields a 95% confidence level observed upper limit of 2.92 times the standard model value for $m_{\mathrm{H}} =$ 125.09 GeV. The corresponding expected upper limit in the absence of a SM decay in this channel is 2.16, which is the most sensitive to date. Assuming standard model production cross sections for the Higgs boson, the observed limit corresponds to an upper limit of $6.4 \times 10^{-4}$ on the Higgs boson branching fraction to two muons.

Figure 1: The transformed BDT output distributions in data (solid points) and MC simulation (histograms). The stacked solid histograms represent the background processes, while the stacked dashed histograms represent the signal. In the legend, V denotes the vector bosons W and Z, and TTX indicates the top quark pair production in association with a vector boson V or another top quark pair. The vertical lines denote the BDT response intervals indicated in Table 1.

Figure 2: Data and weighted sum of signal-plus-background fits to each category. Events are weighted according to the expected signal-to-background ratio in the category to which they belong. The lower panel shows the difference between the data and the background component of the fit.

Figure 3: The 95% CL upper limit on the signal strength modifier, $\mu$, in the region around the Higgs boson mass for the combination of the 7, 8, and 13 TeV data sets together with the expected limit obtained in the background-only hypothesis (dashed black line) and in the signal-plus-background hypothesis (dashed red line) for the SM Higgs boson with $m_{\mathrm{H}} =$ 125 GeV.