Search for narrow resonances in the b-tagged dijet mass spectrum in proton-proton collisions at $\sqrt{s}$ = 13 TeV

A search is performed for narrow resonances decaying to final states of two jets, with at least one jet originating from a b quark, in proton-proton collisions at $\sqrt{s}$ = 13 TeV. The data set corresponds to an integrated luminosity of 138 fb$^{-1}$ collected with the CMS detector at the LHC. Jets originating from energetic b hadrons are identified through a b-tagging algorithm that utilizes a deep neural network or the presence of a muon inside a jet. The invariant mass spectrum of jet pairs is well described by a smooth parametrization and no evidence for the production of new particles is observed. Upper limits on the production cross section are set for excited b quarks and other resonances decaying to dijet final states containing b quarks. These limits exclude at 95% confidence level models of Z' bosons with masses from 1.8 to 2.4 TeV and of excited b quarks with masses from 1.8 to 4.0 TeV. This is the most stringent exclusion of excited b quarks to date.


Introduction
Searches for heavy resonances decaying into jet pairs provide a powerful tool with the potential for discovering new physics at hadron colliders. These resonances can be produced via parton-parton interactions, and can then decay to two partons. The final-state partons hadronize and are observed as two jets, referred to as dijets.
Such heavy resonances at the TeV scale are a feature of several models that extend the standard model (SM) to address some of its shortcomings, such as the extreme fine tuning required in quantum corrections to accommodate a Higgs boson observed at a mass of 125 GeV [1-4]. These models introduce heavy resonances that couple to the SM bosons and fermions [5,6]. A minimal extension of the SM is represented by the sequential standard model (SSM) [7], which introduces a spin-1 Z' boson with the same couplings to fermions as the SM Z boson, but with a much larger mass. The SSM, among many others, is generalized in the heavy vector triplet (HVT) framework [8], which extends the SM by introducing a triplet of heavy vector bosons, one neutral Z' and two oppositely charged W', collectively referred to as V'. The heavy vector bosons couple to SM bosons and fermions with strengths $g_\mathrm{H} = g_\mathrm{V} c_\mathrm{H}$ and $g_\mathrm{F} = g^2 c_\mathrm{F}/g_\mathrm{V}$, respectively, where $g_\mathrm{V}$ is the strength of the new interaction; $c_\mathrm{H}$ is the coupling between an HVT boson, the Higgs boson, and a longitudinally polarized SM vector boson; $c_\mathrm{F}$ is the coupling between an HVT boson and an SM fermion; and $g$ is the electroweak coupling. In this search, we consider two benchmarks, defined in Ref. [8]. In the Model A scenario of the HVT framework, the coupling strengths of the heavy vector bosons to SM bosons and fermions are of the same order, and the new particles decay primarily to fermions. In the Model B scenario of the same framework, the couplings to fermions are suppressed with respect to the couplings to bosons, resulting in a branching fraction of the new particles to SM bosons that is close to unity.
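The dependence of the two coupling strengths on $g_\mathrm{V}$ can be illustrated with a short numerical sketch. The function name, the default value of the electroweak coupling, and the choice $c_\mathrm{H} = c_\mathrm{F} = 1$ are assumptions made for illustration only; they are not the benchmark values of Ref. [8].

```python
def hvt_couplings(g_v, c_h, c_f, g=0.65):
    """Return (g_H, g_F) with g_H = g_V * c_H and g_F = g**2 * c_F / g_V.

    g = 0.65 is an approximate electroweak coupling (assumption).
    """
    if g_v == 0:
        raise ValueError("g_V must be nonzero")
    return g_v * c_h, g ** 2 * c_f / g_v


# Placeholder inputs: c_H = c_F = 1 is NOT taken from the benchmark definitions.
g_h_low, g_f_low = hvt_couplings(g_v=1.0, c_h=1.0, c_f=1.0)
g_h_high, g_f_high = hvt_couplings(g_v=3.0, c_h=1.0, c_f=1.0)
```

For fixed $c_\mathrm{H}$ and $c_\mathrm{F}$, raising $g_\mathrm{V}$ increases $g_\mathrm{H}$ linearly and suppresses $g_\mathrm{F}$ as $1/g_\mathrm{V}$, which is the qualitative pattern that distinguishes a boson-dominated scenario from a fermion-dominated one.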
Other models extending the SM postulate that leptons and quarks are composite objects, composed of more fundamental constituents. At an energy beyond the scale of constituent binding energies, the compositeness scale Λ, a new interaction should emerge [9-11]. Such models foresee the existence of excited states of quarks (q$^*$) and leptons ($\ell^*$), which could be produced in high-energy collisions and then detected through their decays to SM particles. Excited states of composite quarks [12], which would be produced through a gauge interaction or via contact interactions (CI) [13], can result in large cross sections and could decay predominantly to a quark and a gluon (qg). In the model considered in this search, the compositeness scale Λ is set equal to the resonance mass and the couplings of excited quarks to other particles are assumed to be the same as for nonexcited fermions.
Since the early 1980s, searches for resonances in the dijet invariant mass spectrum [14] have been common at hadron colliders. From proton-proton (pp) collision data at $\sqrt{s}$ = 7 and 8 TeV, the CMS Collaboration has published searches for dijet resonances where at least one of the two jets in the final state arises from a b quark [15-17], based on a jet flavor identification commonly referred to as b tagging. The most recent search from the CMS Collaboration at $\sqrt{s}$ = 13 TeV [18] did not attempt to identify the type of final-state parton that produced the jets, and considered resonance models where the final-state partons could be gluons or any flavor of SM quarks. Similar to a recent search from the ATLAS experiment [19], the present search identifies heavy-flavored quarks among the two final-state partons, and has increased sensitivity to models of dijet resonances that decay to b quarks. Examples include models that predict an excited b quark (b$^*$), or a Z' whose couplings to the third generation are enhanced relative to the couplings to the first and second generations. Such enhanced couplings are also favored in models created to accommodate the possible anomalies observed in the low-energy heavy-flavor sector of the SM [20,21].
In this paper we report a search for a Z' decaying into a b quark pair, as well as for a b$^*$ decaying into a b quark and a gluon, where two production modes are considered for the latter: bg → b$^*$ and b$^*$ production via CI. Both resonances are assumed to be narrow. The data used for the analysis were collected by the CMS detector from pp collisions at a center-of-mass energy of 13 TeV during the LHC Run 2 (2016-2018), and correspond to a total integrated luminosity of 138 fb$^{-1}$.
Tabulated results are provided in the HEPData record for this analysis [22].

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. The ECAL and HCAL provide pseudorapidity coverage up to |η| < 3.0, which is further extended by forward calorimeters [23]. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. For nonisolated particles of transverse momentum $1 < p_\mathrm{T} < 10$ GeV and |η| < 1.4, the reconstructed tracks have a $p_\mathrm{T}$ resolution of 1.5%, a transverse impact parameter resolution of 25-90 µm, and a longitudinal impact parameter resolution of 45-190 µm [24].
Events of interest are selected using a two-tiered trigger system. The first level, composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100 kHz within a fixed latency of about 4 µs [25]. The second level, known as the high-level trigger, consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to around 1 kHz before data storage [26].
A detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [23].

Data and simulated samples
The data sample analyzed has been collected by the CMS experiment with trigger algorithms that require large hadronic activity in the event. In particular, two sets of trigger algorithms are used: single-jet triggers that require a jet with $p_\mathrm{T} > 550$ GeV to be present in the event, and triggers requiring the scalar $p_\mathrm{T}$ sum of all jets with $p_\mathrm{T} > 30$ GeV and |η| < 3.0 in the event to exceed 1050 GeV.
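The logical OR of the two trigger conditions described above can be sketched as follows. This is a simplified offline emulation under stated assumptions: the function name and the representation of jets as ($p_\mathrm{T}$, η) pairs are hypothetical, and the actual triggers operate on online-reconstructed objects.

```python
def passes_trigger(jets, single_jet_pt=550.0, ht_min=1050.0):
    """jets: list of (pt_GeV, eta) pairs. True if either trigger condition holds."""
    # Single-jet condition: any jet above the pT threshold.
    has_hard_jet = any(pt > single_jet_pt for pt, _ in jets)
    # Scalar pT sum over jets with pT > 30 GeV and |eta| < 3.0.
    ht = sum(pt for pt, eta in jets if pt > 30.0 and abs(eta) < 3.0)
    return has_hard_jet or ht > ht_min
```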
The $\mathrm{q\bar{q}} \to \mathrm{Z}' \to \mathrm{b\bar{b}}$ signal Monte Carlo (MC) simulated events are produced at leading-order (LO) accuracy in quantum chromodynamics (QCD) with the MADGRAPH5_aMC@NLO v2.4.2 matrix element generator [27]. Various Z' mass hypotheses in the range of 1.6 to 8 TeV are considered for these narrow resonance signals, which have a natural width smaller than the experimental dijet mass resolution (2-3%). This approximation is valid in a large fraction of the HVT parameter space, and is fulfilled in the Model A and Model B benchmarks discussed earlier [8].
For the b$^*$ production processes, the signal samples are produced with PYTHIA 8.230 [28] for resonance masses up to 6 TeV. The characteristic dijet mass shape corresponding to the bg → b$^*$ → bg signal component is reported in Fig. 1 for resonance masses from 1 to 6 TeV. The contribution of the low-mass tail to the signal shape becomes more prominent at large resonance masses. This is because the resonance natural mass shape coming from a Breit-Wigner distribution is convolved with falling parton distribution functions (PDFs), where for larger resonance masses the PDFs fall more steeply.
In all cases parton showering and hadronization processes are simulated by interfacing the event generators with PYTHIA 8.230 with the CUETP8M1 [29, 30] tune. The NNPDF2.3LO [31,32] PDFs are used to model the momentum distribution of the colliding partons inside the protons. The generated events include additional pp interactions (pileup) with a distribution that is chosen to match that observed in the data. These events are processed through a full detector simulation based on GEANT4 [33] and reconstructed with the same algorithms as those used for data.

Event reconstruction and selection
The event reconstruction is performed using a particle-flow (PF) algorithm [34], which uses an optimized combination of information from the various elements of the CMS detector to reconstruct and identify individual particles produced in each collision. The algorithm identifies each reconstructed particle either as an electron, a muon, a photon, or a charged or neutral hadron. The charged component of the pileup contribution is removed by applying the charged hadron subtraction mitigation algorithm. The momenta of neutral hadrons are rescaled according to their probability to originate from the primary interaction vertex deduced from the distribution of associated energy deposits in the calorimeters [35]. The PF candidates are then clustered into jets using the anti-$k_\mathrm{T}$ algorithm [36] implemented in the FASTJET package [37,38] with a distance parameter R = 0.4.
Jet energy corrections, extracted from simulation and data in multijet, γ + jets, and Z + jets events, are applied as a function of $p_\mathrm{T}$ and η to correct the jet response and to account for residual differences between data and simulation. Multijet events refer to SM events composed uniquely of jets produced through the strong interaction. The energy resolution for jets with $p_\mathrm{T} \sim 1$ TeV is approximately 5% [39]. Jets are required to pass identification criteria, which have negligible impact on the signal efficiency, in order to remove spurious jets arising from detector noise [40].
Events are required to contain at least two jets, each with $p_\mathrm{T} > 30$ GeV and |η| < 2.5, and each separated by $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2} > 0.4$ from any electrons or muons. Here, φ is the azimuthal angle in radians. Geometrically close jets are combined into "wide jets", and used to determine the dijet mass, as in our previous dijet searches [18]. In forming wide jets, the two jets with the largest $p_\mathrm{T}$ are used as seeds, and the four-momenta of jets within $\Delta R < 1.1$ of a seed jet are added to the seed jet's four-momentum. The wide-jet algorithm, designed for dijet resonance event reconstruction, reduces the analysis sensitivity to gluon radiation from the final-state partons.
The pseudorapidity separation of the two wide jets is required to be |∆η| < 1.1. This requirement suppresses t-channel multijet production, and enhances the contribution to the signal from s-channel production.
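The wide-jet construction and the |∆η| requirement can be sketched in a few lines. This is a minimal illustration, not the analysis code: jets are treated as massless ($p_\mathrm{T}$, η, φ) triplets, the function names are hypothetical, and a jet lying within the cone of both seeds is assigned to the closer one, which is an assumption not stated in the text.

```python
import math


def delta_phi(a, b):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (a - b + math.pi) % (2.0 * math.pi) - math.pi


def p4(pt, eta, phi):
    """Massless four-vector [E, px, py, pz] from (pt, eta, phi)."""
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    return [math.sqrt(px * px + py * py + pz * pz), px, py, pz]


def eta_of(v):
    """Pseudorapidity of a four-vector [E, px, py, pz]."""
    return math.asinh(v[3] / math.hypot(v[1], v[2]))


def wide_jets(jets, r_max=1.1):
    """Merge jets within r_max of the two leading-pT seeds; return two four-vectors."""
    jets = sorted(jets, key=lambda j: j[0], reverse=True)
    seeds = jets[:2]
    wide = [p4(*s) for s in seeds]
    for j in jets[2:]:
        dr = [math.hypot(j[1] - s[1], delta_phi(j[2], s[2])) for s in seeds]
        i = 0 if dr[0] <= dr[1] else 1  # closer seed (assumption)
        if dr[i] < r_max:
            wide[i] = [a + b for a, b in zip(wide[i], p4(*j))]
    return wide


def dijet_mass(w1, w2):
    """Invariant mass of the two wide jets."""
    e, px, py, pz = (a + b for a, b in zip(w1, w2))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def passes_delta_eta(w1, w2, max_deta=1.1):
    """Pseudorapidity separation requirement on the two wide jets."""
    return abs(eta_of(w1) - eta_of(w2)) < max_deta
```

The jet-level requirements ($p_\mathrm{T} > 30$ GeV, |η| < 2.5, lepton separation) are assumed to have been applied before `wide_jets` is called.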
In order to ensure the full efficiency of the trigger, only events with a dijet invariant mass $m_\mathrm{jj}$ larger than 1530 GeV are considered. The trigger efficiency measurement is performed with a sample acquired with an independent trigger algorithm that requires the presence of a muon with $p_\mathrm{T} > 50$ GeV at the high-level trigger.
Jets from b quarks are identified using the DeepJet b-tagging discriminator, which applies convolutional neural networks to low-level objects, such as PF candidates and secondary vertex information, and improves the jet flavor identification capabilities of the CMS experiment, especially at high jet momenta [41,42]. The operating point of the discriminator was chosen to maximize the sensitivity of the Z' and b$^*$ searches. It corresponds to a misidentification probability of 1% for jets from light (u, d, s) quarks and gluons with $p_\mathrm{T}$ of about 80 GeV. For this mistagging value, the efficiency of identifying genuine b quarks is approximately 75, 45, and 10% for a jet with $p_\mathrm{T}$ of 0.2, 1.0, and 3.0 TeV, respectively. The simulated event distributions passing the tagger are adjusted via scale factors such that they agree with the observed data.
Events are divided into independent categories, depending on the signal model considered.
When searching for a b$^*$ resonance decaying to a b quark and a gluon, only a single category with at least one b-tagged jet is needed. For a Z' signal, three categories are defined: 2b, 1b, and muon. Events are included in the 2b category if both jets fulfill the b-tagging requirements, and in the 1b category if only one of the two jets is b tagged. The muon category contains the events in which neither jet passes the b tag selection, but at least one jet contains a muon, identified within both the inner tracker and the muon detector. The presence of the muon naturally enriches the heavy-flavor component of the jets, mitigating the loss of signal efficiency of the b-tagging algorithm at very large $p_\mathrm{T}$. This category dominates the signal sensitivity for Z' masses larger than 5 TeV, a region where the muon requirement has a signal efficiency roughly 50% larger than that of the 2b category. The product of acceptance and efficiency of the selections for the signal was very similar in all three data-taking years, and the average values are shown in Fig. 2. The dominant source of inefficiency is the requirement |∆η| < 1.1, which reduces the acceptance to approximately 41%.
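The mutually exclusive categorization used for the Z' search, and the single inclusive category used for the b$^*$ search, can be summarized in a short sketch. The function names and the boolean inputs (b-tag and muon-in-jet decisions for the two leading jets) are hypothetical.

```python
def z_prime_category(tag1, tag2, mu1, mu2):
    """Return '2b', '1b', 'muon', or None for the two leading jets.

    tag1, tag2: b-tag decisions; mu1, mu2: whether each jet contains a muon.
    """
    n_b = int(bool(tag1)) + int(bool(tag2))
    if n_b == 2:
        return "2b"
    if n_b == 1:
        return "1b"
    # Neither jet is b tagged: recover the event if a jet contains a muon.
    if mu1 or mu2:
        return "muon"
    return None


def b_star_selected(tag1, tag2):
    """Single category for the b* search: at least one b-tagged jet."""
    return bool(tag1 or tag2)
```

Because the muon category is only filled when neither jet is b tagged, the three Z' categories do not share events and can be combined statistically as independent samples.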

Background estimation
Estimates based on simulation predict that the background is dominated by multijet production, which accounts for more than 95% of the total. The contribution of top quark pair production is approximately 3-4% of the total background, depending on the b-tagging category. Production of vector bosons in association with partons, and production of boson pairs, contribute the remaining 1-2% of the total background.
The background is estimated directly from data assuming that it can be described by a smooth, monotonically decreasing function. The validity of this assumption is checked in simulation.
Starting from the simplest functional form, an iterative procedure based on the Fisher F-test [43] is used to check at 90% confidence level (CL) if additional parameters are needed to model the individual background distributions. Depending on the dataset considered for the fit, a functional form with three or more parameters is necessary to describe the data. The observed dijet mass spectra are well modeled by the background functional forms, as shown in Fig. 3, where the widths of the dijet mass bins correspond to the dijet mass resolution.
The expected shape of the reconstructed signal mass distribution is extracted from the simulated signal samples. The b$^*$ signal shapes are dijet mass distributions obtained directly from the simulation, as shown in Fig. 1, while the Z' signal shapes are parametrizations that are fit separately to the simulation samples for each category, using a double-sided Crystal Ball function [44]. This functional form consists of a Gaussian distribution that models the core of the shape, and two power law functions that model the upper and lower tails, and requires a total of six parameters. The resolution of the reconstructed Z' is given by the width of the Gaussian core, and is found to be constant at 2% of the resonance mass. The signal shape, for a Z' of arbitrary mass, is then obtained by a spline interpolation of the simulated signal shape parameter values as a function of mass. Examples of the Z' and b$^*$ shapes are shown in Fig. 3, with arbitrary normalizations.
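A common way of writing a double-sided Crystal Ball shape with six parameters is sketched below. The parameter names and the tail values in the example are hypothetical, the function is left unnormalized, and the exact convention of Ref. [44] may differ.

```python
import math


def dscb(x, mean, sigma, alpha_lo, n_lo, alpha_hi, n_hi):
    """Unnormalized double-sided Crystal Ball: Gaussian core with power-law tails.

    alpha_lo, alpha_hi > 0 set where the tails start, in units of sigma;
    n_lo, n_hi > 0 are the tail exponents.
    """
    t = (x - mean) / sigma
    if t < -alpha_lo:
        a = (n_lo / alpha_lo) ** n_lo * math.exp(-0.5 * alpha_lo ** 2)
        b = n_lo / alpha_lo - alpha_lo
        return a * (b - t) ** (-n_lo)
    if t > alpha_hi:
        a = (n_hi / alpha_hi) ** n_hi * math.exp(-0.5 * alpha_hi ** 2)
        b = n_hi / alpha_hi - alpha_hi
        return a * (b + t) ** (-n_hi)
    return math.exp(-0.5 * t * t)


# Example: 2 TeV resonance with a Gaussian width of 2% of the mass.
mass = 2000.0
width = 0.02 * mass
peak = dscb(mass, mass, width, alpha_lo=1.2, n_lo=3.0, alpha_hi=1.8, n_hi=5.0)
```

The tail coefficients are chosen so that the function is continuous at the transition points, which keeps the shape suitable for interpolating the parameters smoothly as a function of mass.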
Statistical tests have been performed to check the robustness of the fit method. Pseudoexperiments are generated after injecting a simulated signal with a range of values for the signal mass and cross section. The dijet mass distribution of the pseudodata is then fitted with the signal distribution combined with an alternate background function containing one more parameter than the nominal function. The fitted signal yield is found to be compatible with the injected yield to within one third of the statistical uncertainty, regardless of the injected signal strength and resonance mass. These tests confirm that the background estimation is insensitive both to the choice of the function used and to the possible presence of a signal in the data.

Systematic uncertainties
The background shape is estimated from the fit to the data in the considered categories, and no assumption or constraint is applied to the function parameters. These parameters are considered as uncorrelated among categories and data-taking periods. As described in Sec. 5 [...] uncertainty.
Figure 3: The observed differential cross sections as a function of the dijet mass, shown as fit with the background functions, for the four tagging categories (rows) and the three data-taking periods (columns). The number of parameters in the fit, and the goodness of fit "$\chi^2/\mathrm{ndf}$", are listed, where "ndf" is the number of degrees of freedom. The lower panel within each row shows the pulls, (data − fit)/uncertainty, in units of the statistical uncertainty in data. The upper three rows are used to search for Z' models, the bottom row is used to search for the b$^*$ model, and example shapes of these signal models are shown with the same arbitrary normalization for three choices of resonance mass.
The dominant uncertainties in the signal arise from the b tagging and the jet-reconstruction uncertainties. The uncertainty in the b-tagging scale factors [45] yields an uncertainty in the signal normalization of 2 and 4% in the 1b and ≥1b categories, respectively, and 15% in the 2b and muon categories. Uncertainties in the reconstruction of the hadronic jets affect mainly the shape of the reconstructed resonance mass. The four-momenta of the reconstructed jets are scaled and smeared according to the uncertainties in the jet $p_\mathrm{T}$ and momentum resolution. These effects result in a 2% uncertainty in the mean, and 10% in the width of the signal. The signal normalization is also affected by the uncertainty in the selection of the muons inside the jets. The efficiency of the identification of the muons inside jets is measured in a statistically independent and almost pure sample of high-$p_\mathrm{T}$ b jets, originating from the decay of pair-produced top quarks. This uncertainty is deduced from the statistical uncertainty in the efficiency measurement, ranging up to 43% for muons with $p_\mathrm{T} < 100$ GeV. An uncertainty of 100% is assumed for jets with the $p_\mathrm{T}$ of an associated muon beyond this 100 GeV threshold, because no data were available in this region. Additional systematic uncertainties affecting the signal normalization include the vetoes for high-$p_\mathrm{T}$ leptons and missing transverse momenta (accounting for 1% each), pileup contributions (0.1%), and the integrated luminosity (1.2% in 2016 [46], 2.3% in 2017 [47], and 2.5% in 2018 [48]). The systematic uncertainty from the choice of PDFs [49] is estimated to be 8-41% of the normalization of the signal cross section, depending on the resonance mass. The factorization and renormalization scale uncertainties are estimated by varying the scales up and down by a factor of 2, both simultaneously and independently, using the maximum obtained value. The resulting effect is a variation of 6-14% of the normalization of the signal cross section.

Results and interpretation
Exclusion limits are obtained by performing a background-only fit and a combined signal-plus-background fit to the dijet mass distributions, separately by category and data-taking year. In the fit, based on a profile likelihood, the parameters and the normalization of the background in each category are left free to float. The systematic uncertainties in the signal are treated as nuisance parameters, with Gaussian constraints for the jet energy scale and resolution, and log-normal constraints for the integrated luminosity, and are profiled in the statistical interpretation [50]. The uncertainties that affect the signal normalization (PDFs and factorization and renormalization scales) are treated differently depending on how the exclusion is presented. When deriving upper limits on the cross section, these uncertainties are not varied in the fit, but are reported separately as the uncertainty in the theoretical cross sections from the model. When placing limits on the model parameters, these nuisance parameters are fixed at the best-fit values, in the same manner as with the other systematic uncertainties. A more detailed description of the statistical treatment and the likelihood function is reported in Ref. [51]. Upper limits at 95% CL are set using the CL$_\mathrm{s}$ modified frequentist method [52,53], adopting the asymptotic approximation [54].
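The CL$_\mathrm{s}$ construction can be illustrated with a single-bin counting experiment. This toy is not the procedure of the analysis, which uses a profile likelihood over the binned dijet mass spectra and the asymptotic approximation; the function names are hypothetical and systematic uncertainties are ignored. Here CL$_\mathrm{s}$ is the ratio of the probability to observe at most the observed count under the signal-plus-background hypothesis to the same probability under the background-only hypothesis.

```python
import math


def poisson_cdf(n, mu):
    """P(N <= n) for N ~ Poisson(mu)."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total


def cls(n_obs, s, b):
    """CLs = CL_{s+b} / CL_b for a single-bin counting experiment."""
    return poisson_cdf(n_obs, s + b) / poisson_cdf(n_obs, b)


def upper_limit(n_obs, b, cl=0.95, s_max=100.0, tol=1e-7):
    """Signal yield at which CLs falls to 1 - cl, found by bisection."""
    lo, hi = 0.0, s_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cls(n_obs, mid, b) > 1.0 - cl:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With zero observed events the limit is $-\ln(0.05) \approx 3.0$ signal events for any expected background, which shows how dividing by CL$_\mathrm{b}$ prevents a downward background fluctuation from excluding arbitrarily small signals.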
A model-independent representation of the observed upper limit on the product of cross section and branching fraction of a $\mathrm{b\bar{b}}$ resonance, as well as the expected limit and its relative 68 and 95% uncertainty bands, are shown in Fig. 4. The acceptance, which is included in the product on the right side of Fig. 4, is defined exclusively via the geometric requirements $p_\mathrm{T} > 30$ GeV, |η| < 2.5, and |∆η| < 1.1. This analysis sets an observed mass limit of 2.4 TeV, and an expected mass limit of 2.3 TeV, at 95% CL, on a narrow Z' resonance in both the SSM and the HVT Model A. No mass limit can be set for Model B, in which the fermionic couplings are suppressed.
Figure 4: The observed 95% CL upper limits (solid curve) on the product of the cross section and branching fraction (left), and multiplied by signal acceptance (right), for a resonance decaying to $\mathrm{b\bar{b}}$. The corresponding expected limits (dashed curve) and their variations at the 1- and 2-standard deviation levels (shaded bands) are also shown. Limits are compared to predicted cross sections for Z' bosons from the sequential SM (SSM) and the heavy vector triplet (HVT) models A and B. The latter two models follow the parameter choices of $g_\mathrm{V} = 1$ and $g_\mathrm{V} = 3$, respectively.
The cross section limit shown in Fig. 4 (left) is reinterpreted in Fig. 5 as limits on the coupling strengths of a heavy vector boson to SM bosons ($g_\mathrm{H}$) and fermions ($g_\mathrm{F}$). The subset of the parameter space where the natural width of the resonance is larger than the typical experimental resolution, and hence the narrow width approximation is invalid, is indicated in Fig. 5 with a gray shaded area.
Figure 5: The coupling strengths to SM bosons ($g_\mathrm{H}$) and fermions ($g_\mathrm{F}$) of a Z' boson with mass 2.0 TeV (blue) and 2.5 TeV (red) that are excluded at 95% CL for the HVT model. The shading indicates the excluded side of the contour. The benchmark scenarios corresponding to HVT models A and B are represented by a purple cross and a red point, respectively. The gray shaded area corresponds to the region where the resonance natural width is predicted to be larger than the typical experimental resolution, and thus the narrow-width approximation is not fulfilled.
For resonances that decay to a b quark and a gluon, upper limits are set by using the category where at least one of the two leading jets is b tagged. Upper limits on the joint product of cross section, branching fraction, and acceptance are reported in Fig. 6. When considering the single b$^*$ production process, bg → b$^*$ → bg, with the cross section shown in Fig. 6, this analysis sets 95% CL mass limits of 2.5 TeV (observed) and 2.6 TeV (expected) for an excited b quark. Additionally, b$^*$ production via contact interactions has been considered. Several processes contribute to b$^*$ production via CI [13], the dominant one being $\mathrm{q\bar{q}} \to \mathrm{b}\bar{\mathrm{b}}^*$ (or $\bar{\mathrm{b}}\mathrm{b}^*$) $\to \mathrm{b\bar{b}g}$, where three jets are present in the final state. In order to display the theory cross section for this new process on the same exclusion limit plot, only its resonant component has been considered. In particular, events are selected when the reconstructed dijet system originates from the b$^*$ decay and does not include the other b quark in the event, which is the case for 50 (80)% of the events within the acceptance at 2 (4) TeV. Including both processes in the total signal cross section (single b$^*$ production and the resonant component of the b$^*$ production via CI) gives a significantly more stringent 95% CL mass limit of 4 TeV on excited b quarks.
Figure 6: The observed 95% CL upper limits on the product of the cross section, branching fraction, and acceptance for dijet resonances decaying to a b quark and a gluon (points). The corresponding expected limits (short dashed) and their variations at the 1- and 2-standard deviation levels (shaded bands) are also shown. Limits are compared to predictions for single b$^*$ production (blue, dot dashed), the resonant component of the b$^*$ production via contact interactions (magenta, long dashed), and the total b$^*$ signal from the sum of these two production modes (red, solid).

Summary
A search for heavy resonances decaying into b quarks has been presented and no excess has been found over the standard model (SM) expectations. The data were collected by the CMS experiment at $\sqrt{s}$ = 13 TeV during 2016-2018 and correspond to an integrated luminosity of 138 fb$^{-1}$. Model-independent upper limits are set on the product of the cross section of the resonance and its branching fraction to b quarks. Signals of Z' bosons decaying to pairs of b quarks are considered, for both the previously explored sequential standard model (SSM), and also for a new heavy vector triplet (HVT) model. The decays of Z' bosons in both the SSM and the HVT Model A are excluded at 95% confidence level for masses from 1.8 to 2.4 TeV, and limits are set on the coupling strengths of the HVT boson to SM bosons and fermions. Signals of an excited b quark are considered for a previously explored channel, bg → b$^*$ → bg, and a production mode via contact interactions. The excited b quark is excluded at 95% confidence level for masses from 1.8 to 4.0 TeV. This is the most stringent exclusion of excited b quarks to date.

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid and other centers for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC, the CMS detector, and the supporting computing infrastructure provided by the funding agencies.

References
[21] LHCb Collaboration, "Test of lepton universality in beauty-quark decays", Nature Phys. 18 (2022) 277, doi:10.1038/s41567-021-01478-8, arXiv:2103.11769.
[27] J. Alwall et al., "The automated computation of tree-level and next-to-leading order differential cross sections, and their matching to parton shower simulations", JHEP 07