Search for low-mass quark-antiquark resonances produced in association with a photon at $\sqrt{s} = $ 13 TeV

A search for narrow low-mass resonances decaying to quark-antiquark pairs is presented. The search is based on proton-proton collision events collected at 13 TeV by the CMS detector at the CERN LHC. The data sample corresponds to an integrated luminosity of 35.9 fb$^{-1}$, recorded in 2016. The search considers the case where the resonance has high transverse momentum due to initial-state radiation of a hard photon. To study this process, the decay products of the resonance are reconstructed as a single large-radius jet with two-pronged substructure. The signal would be identified as a localized excess in the jet invariant mass spectrum. No evidence for such a resonance is observed in the mass range 10 to 125 GeV. Upper limits at the 95% confidence level are set on the coupling strength of resonances decaying to quark pairs. The results obtained with this photon trigger strategy provide the first direct constraints on quark-antiquark resonance masses below 50 GeV obtained at a hadron collider.


New resonances coupling to pairs of quarks (generally referred to as $Z'$) are ubiquitous signatures in theories beyond the standard model (SM), appearing in dark matter models [1,2] and models with extra dimensions [3], amongst others [4-9]. The first dijet searches at a hadron collider were performed by UA1 [10] and UA2 [11], and have been extended to higher resonance masses by CDF [12] and D0 [13] at the Tevatron, and by ATLAS [14] and CMS [15] at the LHC. However, as collision energy and beam intensity have increased, there has been a loss of sensitivity to lower mass resonances, which stems from the increasing cross section of background multijet events, tighter online requirements needed to handle growing event rates, and the large numbers of simultaneous collisions per bunch crossing (pileup). These issues can be partially mitigated by focusing on events in which the resonance is produced in association with high momentum initial-state radiation (ISR). In such a scenario, the two quarks hadronize into a single massive jet. In particular, by considering events with a high transverse momentum ($p_\mathrm{T}$) ISR photon or jet, the ATLAS Collaboration searched for a $Z'$ decaying to quark-antiquark pairs [16] and reported a result for resonance masses as low as 100 GeV. The CMS Collaboration used this method with ISR jets to search for $Z'$ resonances with masses as low as 50 GeV [17], the lowest mass then probed by collider experiments.
This analysis considers events produced with ISR photons in pp collisions at $\sqrt{s} = 13$ TeV, using data collected by the CMS detector in 2016 and corresponding to an integrated luminosity of 35.9 fb$^{-1}$. It extends dijet searches to low $Z'$ masses, where only indirect measurements [18] provide constraints on the hadronic production of such new physics. This extension to low $Z'$ masses is possible because the analysis relies on a photon trigger, for which it is feasible to select dijet events using a lower $p_\mathrm{T}$ threshold than for jet triggers. However, the mass of the $Z'$ is sufficiently low compared to its momentum that the separate hadronizations of the resulting quark and antiquark merge into a single large-radius jet. This search is performed by looking for a localized excess in the jet mass spectrum in events with a photon and a jet with the two-pronged jet substructure expected for the signal.
The main background, arising from photons produced in association with jets by SM processes, is derived using a data-driven method. Additional resonant SM background processes, composed of $t\bar{t}$ events and the SM production of W + γ and Z + γ, are estimated from simulation, with corrections obtained from control regions in data. The results are interpreted within the framework of a $Z'$ with mass between 10 and 125 GeV, decaying into quarks, and are used to set limits on the quark coupling $g_q$ as a function of the $Z'$ mass.
The CMS detector consists of a silicon tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), a brass and scintillator hadron calorimeter (HCAL), and gas-ionization muon detectors. A superconducting solenoid provides a uniform magnetic field within the detector. Events are sorted by a two-tiered triggering system [19] to ensure that only events of potential physics interest are recorded. A more detailed description of the CMS detector, including its angular coordinates η and φ, can be found in Ref. [20].
Events are reconstructed using the CMS particle-flow (PF) [21] algorithm, which combines information from every element of the CMS detector to reconstruct and identify individual particles (called PF candidates). Each particle is classified as a muon, electron, photon, charged hadron, or neutral hadron. The energy of photons is obtained directly from ECAL measurements. Similar measurements, along with information from the tracker, are used to determine the energy of electrons. Misidentification of particles is possible, so additional isolation and purity requirements on potential photons are imposed [22]. The momentum of muons is measured from the curvature of their tracks. Neutral and charged hadron energies are measured from their deposits in the ECAL and HCAL, with information from the tracker used to further constrain the energy of the charged hadrons. The missing transverse momentum ($p_\mathrm{T}^\mathrm{miss}$) is defined as the magnitude of the negative vector $p_\mathrm{T}$ sum of all reconstructed particles in an event. The PF candidates are clustered into jets using FastJet [23] with the anti-$k_\mathrm{T}$ algorithm [24] and a distance parameter of 0.4 and 0.8 for AK4 and AK8 jets, respectively. Particles produced in additional collisions within the same bunch crossing are suppressed by applying a weight to each PF candidate, calculated by the pileup-per-particle identification [25] algorithm. Jets are corrected as a function of their $p_\mathrm{T}$ and pseudorapidity (η) to match the observed detector response [26]. Jets arising from the hadronization of b quarks are identified using the CSVv2 algorithm [27].
The signal benchmark model [28] used in this analysis and in Refs. [16] and [17] features a vector resonance $Z'$, with the coupling constant to quarks set to $g_q = 1/6$, at which the $Z'$ width is well below the resolution of the detector. It was simulated to leading order with the MADGRAPH5_aMC@NLO [29] generator, with MLM matching [30] between jets from matrix element calculations and the parton showers. Up to 3 additional jets are allowed in the matrix element calculation. The model assumes no interaction between the SM Z boson and the $Z'$. The same generator is used to model at leading order the quantum chromodynamic production of multijet events, which can include radiated photons, and the γ+jets background, where the photon is part of the hard interaction, as well as, to next-to-leading order, the backgrounds W + γ and Z + γ. The multijet and γ+jets components are treated together as a single nonresonant background, with the angle between the leading photon and the nearest jet used to define a phase space for each sample. Events from the multijet sample are removed if they are in the γ+jets phase space. The POWHEG 2.0 [31][32][33] generator is used to model $t\bar{t}$ events at next-to-leading order. All signal and background generators are interfaced with PYTHIA 8.212 [34], with the CUETP8M1 underlying event tune [35], to simulate parton showering and hadronization effects. The generated events are processed through a GEANT4 [36] simulation of the CMS detector. This simulation includes effects from both in-time and out-of-time pileup. The parton distribution function set NNPDF3.0 [37] is used to produce all simulated samples. Where necessary, differences between the reconstruction of simulated and real quantities are corrected by applying scale factors to the simulation, derived from control regions in data [26].
The trigger strategy used by this search is to require one photon with $p_\mathrm{T} > 175$ GeV and |η| < 3.0. To ensure a full triggering efficiency for events that satisfy the subsequent selection, offline photons are required to have $p_\mathrm{T} > 200$ GeV and |η| < 2.4. Events with additional identified photons of $p_\mathrm{T} > 14$ GeV or leptons of $p_\mathrm{T} > 10$ GeV are discarded to avoid overlap with other searches and to reduce backgrounds from electroweak sources. Even leptons in a pair that are sufficiently collinear to be reconstructed as a single jet are generally also tagged as separate leptons and thus excluded. The $Z' \to q\bar{q}$ decay is assumed to correspond to the highest momentum AK8 jet in the event. Only events with leading jet $p_\mathrm{T} > 200$ GeV are considered. To reduce the contribution from $t\bar{t}$, events with an AK4 jet with $p_\mathrm{T} > 30$ GeV and satisfying the loose working point [27] of the CSVv2 algorithm (excluding AK4 jets within ∆R < 0.6 of the leading AK8 jet), or with $p_\mathrm{T}^\mathrm{miss} > 75$ GeV, are discarded. A separation $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2} > 2.2$ is required between the leading AK8 jet and the photon in the event.
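The offline selection described above can be sketched as a simple event filter. This is only an illustration of the stated thresholds: the event layout (a dictionary with `photon`, `ak8_jet`, `ak4_jets`, `n_extra_photons`, `n_leptons`, `met`) and the `btag_loose` flag are hypothetical names, and the extra-photon and lepton counts are assumed to have been computed upstream with the 14 GeV and 10 GeV thresholds.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi

def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation sqrt(d_eta^2 + d_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def passes_selection(event):
    """Apply the offline cuts quoted in the text (hypothetical event layout)."""
    pho, jet = event["photon"], event["ak8_jet"]
    if pho["pt"] <= 200.0 or abs(pho["eta"]) >= 2.4:
        return False                      # photon pT and |eta| requirements
    if jet["pt"] <= 200.0:
        return False                      # leading AK8 jet pT requirement
    if event["n_extra_photons"] > 0 or event["n_leptons"] > 0:
        return False                      # extra photon / lepton veto
    if event["met"] > 75.0:
        return False                      # missing transverse momentum veto
    for j in event["ak4_jets"]:
        outside = delta_r(j["eta"], j["phi"], jet["eta"], jet["phi"]) >= 0.6
        if j["pt"] > 30.0 and j["btag_loose"] and outside:
            return False                  # b-tagged AK4 jet away from the AK8 jet
    # photon and leading AK8 jet must be well separated
    return delta_r(pho["eta"], pho["phi"], jet["eta"], jet["phi"]) > 2.2

example = {
    "photon": {"pt": 250.0, "eta": 0.5, "phi": 0.0},
    "ak8_jet": {"pt": 300.0, "eta": -0.3, "phi": 3.0},
    "ak4_jets": [],
    "n_extra_photons": 0,
    "n_leptons": 0,
    "met": 20.0,
}
print(passes_selection(example))
```

Note that the azimuthal difference has to be wrapped before it enters ∆R; otherwise a photon and jet on either side of φ = ±π would appear far apart.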
The soft drop mass algorithm (with $\beta = 0$, $z_\mathrm{cut} = 0.1$) [38,39] is used to remove soft and wide-angle radiation from the jet, and the resulting distribution of "groomed jet mass" ($m_\mathrm{SD}$) is inspected for localized excesses. The modeling of $m_\mathrm{SD}$ has been tested for masses only down to 10 GeV [40]; thus 10 GeV is the lowest signal mass considered by this analysis. The highest signal mass considered is 125 GeV, above which there is a low probability of reconstructing the $Z'$ as a single jet. The selected events are divided into signal and control regions based on the η of the photon, with the boundary between the regions chosen to maximize the sensitivity of the analysis. Events with photon |η| < 2.1 are considered to be in the signal region. The events with |η| > 2.1, in which the photon is more likely to have been radiated in a multijet process rather than in a hard scattering, define the η control region, which is used to perform substructure measurements of jets with kinematic variables similar to those of jets in the signal region. These variables are computed only on jet constituents that have survived the soft drop algorithm. The variable $N_2^1$ [42] is used to further separate signal jets with two-pronged substructure from the background. This variable is defined using a combination of functions that correlate angles among the constituents of the jet to categorize the substructure. A jet originating from a two-pronged decay is more likely to have a low value of $N_2^1$. In addition, we define the dimensionless quantity $\rho = \ln(m^2/p_\mathrm{T}^2)$ [43], which, unlike the mass itself, is approximately uncorrelated with the jet $p_\mathrm{T}$.
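As a rough illustration of the two quantities introduced above, the sketch below evaluates the standard soft drop condition for a single declustering step and the scaling variable ρ. The function names are hypothetical, the groomed mass is assumed to be the mass entering ρ, and the full grooming (iterating over the clustering history of the jet) is not reproduced here.

```python
import math

def soft_drop_keeps(pt1, pt2, delta_r12, z_cut=0.1, beta=0.0, r0=0.8):
    """Soft drop condition for one declustering step.

    The two subjets are kept if the softer one carries a momentum fraction
    z > z_cut * (delta_r12 / r0)**beta; otherwise the softer branch is dropped
    and the procedure continues on the harder one. With beta = 0 the angular
    factor is 1 and the condition reduces to z > z_cut.
    """
    z = min(pt1, pt2) / (pt1 + pt2)
    return z > z_cut * (delta_r12 / r0) ** beta

def jet_rho(m_sd, pt):
    """Dimensionless scaling variable rho = ln(m^2 / pT^2)."""
    return math.log(m_sd ** 2 / pt ** 2)

print(soft_drop_keeps(80.0, 20.0, 0.4))   # z = 0.20, kept
print(soft_drop_keeps(95.0, 5.0, 0.4))    # z = 0.05, softer branch dropped
print(jet_rho(50.0, 500.0), jet_rho(20.0, 200.0))
```

Two jets with the same ratio of mass to $p_\mathrm{T}$ have the same ρ, which is why binning in ρ rather than mass keeps the substructure behavior approximately independent of the jet $p_\mathrm{T}$.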
While the $N_2^1$ variable offers considerable discrimination power, the background efficiency for retaining jets based on a fixed cut on $N_2^1$ depends on the jet ρ and $p_\mathrm{T}$. These dependencies distort the $m_\mathrm{SD}$ distribution, making a search for a peak difficult. To preserve the shape of the mass distribution, a varying cut on $N_2^1$ is used to remove 90% of the background. To achieve this, a decorrelated variable $N_2^\mathrm{DDT}$ is built, which is similar to the one proposed in Ref. [43]. This variable is defined as:

$$N_2^\mathrm{DDT}(\rho, p_\mathrm{T}) = N_2^1 - X_{10\%}(\rho, p_\mathrm{T}),$$

where $X_{10\%}$ is the value of $N_2^1$ where a cut would retain 10% of the background. The values for $X_{10\%}$ in bins of the jet ρ and $p_\mathrm{T}$ are taken from the η control region. A smoothing procedure is applied to the $X_{10\%}$ distribution to reduce unphysical features where the statistical uncertainty is large. The selection $N_2^\mathrm{DDT} < 0$ is applied to events in the signal region. By construction, this selection will have a background efficiency of exactly 10% for the sample in which $N_2^\mathrm{DDT}$ was constructed. Signal hypotheses across the entire parameter range of the analysis were injected into simulated background distributions to evaluate potential contamination of the η control region and its effect on the mass distribution of events passing the $N_2^\mathrm{DDT} < 0$ requirement. The effects were found to be negligible compared to the statistical uncertainty in the mass distribution. Differences in the $p_\mathrm{T}$ and ρ dependence of $X_{10\%}$ in the signal and control regions are expected, and are explicitly parametrized as part of the background estimation procedure.
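The construction of the decorrelated variable can be sketched as follows, assuming it is the raw $N_2^1$ value minus the 10% quantile of $N_2^1$ in the jet's (ρ, $p_\mathrm{T}$) bin. The jet dictionaries, bin edges, and helper names are hypothetical, the quantile uses plain linear interpolation, and the smoothing of the quantile map mentioned in the text is omitted.

```python
import bisect

def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

def bin_index(x, edges):
    """Index of the bin containing x, clamped to the first/last bin."""
    i = bisect.bisect_right(edges, x) - 1
    return max(0, min(i, len(edges) - 2))

def build_x10_map(control_jets, rho_edges, pt_edges, q=0.10):
    """Quantile of N2 per (rho, pT) bin, taken from control-region jets."""
    per_bin = {}
    for j in control_jets:
        key = (bin_index(j["rho"], rho_edges), bin_index(j["pt"], pt_edges))
        per_bin.setdefault(key, []).append(j["n2"])
    return {key: quantile(vals, q) for key, vals in per_bin.items()}

def n2_ddt(jet, x10_map, rho_edges, pt_edges):
    """Decorrelated variable: N2 minus the per-bin 10% quantile."""
    key = (bin_index(jet["rho"], rho_edges), bin_index(jet["pt"], pt_edges))
    return jet["n2"] - x10_map[key]

# Toy control sample: 100 jets in a single bin with evenly spaced N2 values.
rho_edges, pt_edges = [-7.0, -2.0], [200.0, 1000.0]
control = [{"rho": -4.0, "pt": 300.0, "n2": i / 100.0} for i in range(100)]
x10 = build_x10_map(control, rho_edges, pt_edges)
n_pass = sum(1 for j in control if n2_ddt(j, x10, rho_edges, pt_edges) < 0)
print(n_pass, "of", len(control), "control jets pass")
```

In this toy sample the requirement of a negative decorrelated value keeps 10 of the 100 control jets, which mirrors the statement that the efficiency is 10% by construction in the sample used to build the map.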
The dominant background is due to nonresonant events in which a light quark or gluon jet passes the $N_2^\mathrm{DDT}$ requirement. The second component consists of events with two-pronged jets arising from a mixture of Z + γ, W + γ and $t\bar{t}$ events. Events from $t\bar{t}$ production enter the signal selection largely through electrons being incorrectly reconstructed as photons. Other sources of background were found to be negligible. The nonresonant background is estimated from a data-driven method described below, with the simulated samples used for validation only. The other backgrounds are taken from simulation, and their shapes and normalizations are allowed to vary in a final fit of the passing and failing regions, with a correction derived from a $t\bar{t}$ control region, also described below.
The nonresonant background in the signal region is estimated by considering the events that have passed all selection requirements except $N_2^\mathrm{DDT} < 0$. In the η control region, the pass-to-fail ratio of background events for the $N_2^\mathrm{DDT} < 0$ requirement is one to nine, independent of jet $p_\mathrm{T}$ and ρ. In the signal region, this ratio is taken to be a smooth function $F(p_\mathrm{T}, \rho)$ which models the differences between the $N_2^\mathrm{DDT}$ variable in the η control and signal regions. The relationship between passing ($N_\mathrm{P}$) and failing ($N_\mathrm{F}$) events is then $N_\mathrm{P} = F(p_\mathrm{T}, \rho)\, N_\mathrm{F}$, for each bin of $p_\mathrm{T}$ and ρ. The deviation of F from a flat ratio corresponds to the difference between the signal and control regions. The unknown function F is expanded into a polynomial series:

$$F(p_\mathrm{T}, \rho) = \sum_{i,j} a_{ij}\, \rho^{i}\, p_\mathrm{T}^{j},$$

where the unknown coefficients $a_{ij}$ are determined by a simultaneous likelihood fit of the passing and failing events, in which the signal and resonant backgrounds are allowed to float. The number of coefficients in the fit is determined by performing Fisher F-tests [44] on progressively higher order polynomial combinations of $p_\mathrm{T}$ and ρ. The optimal polynomial form is found to be third order in both $p_\mathrm{T}$ and ρ.
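The pass-to-fail relation can be illustrated with a small sketch, assuming the transfer factor is a double polynomial in ρ and $p_\mathrm{T}$ with coefficients $a_{ij}$. The function names and numerical inputs are hypothetical, and the F-statistic shown is the standard textbook form for nested least-squares fits, which is not necessarily the exact variant used in the analysis.

```python
def transfer_factor(pt, rho, coeffs):
    """Evaluate F(pT, rho) = sum_ij a_ij * rho**i * pT**j.

    coeffs[i][j] is the (hypothetical) coefficient a_ij.
    """
    return sum(a * rho ** i * pt ** j
               for i, row in enumerate(coeffs)
               for j, a in enumerate(row))

def predicted_pass(n_fail, pt, rho, coeffs):
    """Predicted passing yield in one (pT, rho) bin: N_P = F * N_F."""
    return transfer_factor(pt, rho, coeffs) * n_fail

def f_statistic(rss_low, rss_high, n_par_low, n_par_high, n_bins):
    """Textbook F-statistic comparing a lower-order and a higher-order fit.

    rss_* are residual sums of squares, n_par_* the numbers of free
    parameters, and n_bins the number of fitted bins.
    """
    numerator = (rss_low - rss_high) / (n_par_high - n_par_low)
    denominator = rss_high / (n_bins - n_par_high)
    return numerator / denominator

# A flat transfer factor of 1/9 reproduces the one-to-nine pass-to-fail ratio.
flat = [[1.0 / 9.0]]
print(predicted_pass(900.0, 300.0, -4.0, flat))

# Adding higher-order terms is justified only if the fit improves significantly.
print(f_statistic(rss_low=120.0, rss_high=100.0,
                  n_par_low=4, n_par_high=6, n_bins=56))
```

A large F-statistic indicates that the additional polynomial terms describe real structure in the pass-to-fail ratio rather than statistical fluctuations; the order is increased until the improvement is no longer significant.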
While this fit to data ensures that differences in the nonresonant background modeling of the $N_2^\mathrm{DDT}$ variable are accounted for, consistent behavior in data and simulation for resonances is not assured. A dedicated $t\bar{t}$ control region is defined, built from events containing a high-$p_\mathrm{T}$ muon, with the selection optimized to be dominated by $t\bar{t}$ production. The efficiency of the $N_2^\mathrm{DDT} < 0$ requirement is measured by fitting the W-mass peak (where the hadronization products from both quarks merge into one jet) in the passing and failing jet mass distributions of this control region, for both data and simulated samples. This efficiency, an explicit parameter of the fit, is used to correct relative yields for resonant $t\bar{t}$ events and the W + γ and Z + γ backgrounds obtained from simulation in the passing and failing regions. The data-to-simulation efficiency scale factor is found to be 0.909 ± 0.046 (stat+sys), and is applied to all the resonant backgrounds, as well as to the signal.
To model the $m_\mathrm{SD}$ distribution in the signal region, a binned 2D maximum likelihood fit is performed on the events passing and failing the $N_2^\mathrm{DDT} < 0$ requirement, in all ($p_\mathrm{T}$, ρ) bins of the signal region [17]. In the fit, all SM processes and the signal are allowed to float simultaneously. Signal shapes are taken from simulation. The fit is performed for the background-only (null) hypothesis and for signal hypotheses for each simulated signal mass (10, 25, 50, 75, 100 and 125 GeV), as well as for interpolated mass shapes, derived by applying vertical template morphing [45] to these simulated event distributions, to cover signal hypotheses in steps of 5 GeV from 10 to 125 GeV. To ensure proper modeling of the high mass tail, the fit is performed on events with masses up to 201 GeV. The $m_\mathrm{SD}$ distribution of the signal region, summed over all $p_\mathrm{T}$ and ρ bins, is shown in Fig. 1. The contributions from resonant backgrounds are evaluated as part of the likelihood, with their shapes and normalizations allowed to vary within the systematic uncertainties in the initial estimates (see Table 1). The average value of the nonresonant background efficiency in the signal region determined by the fit is 9%.
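Vertical template morphing can be sketched as a bin-by-bin linear interpolation between the templates at the two neighboring simulated masses. This is a minimal sketch under that assumption: the function name and the three-bin toy templates are hypothetical, and the implementation of Ref. [45] used in the analysis may differ in detail.

```python
def morph_vertical(h_lo, h_hi, m_lo, m_hi, m):
    """Interpolate bin contents linearly between two mass templates.

    h_lo and h_hi are lists of bin contents with identical binning,
    simulated at masses m_lo and m_hi; m is the target mass.
    """
    if len(h_lo) != len(h_hi):
        raise ValueError("templates must share the same binning")
    t = (m - m_lo) / (m_hi - m_lo)
    return [(1.0 - t) * a + t * b for a, b in zip(h_lo, h_hi)]

# Mass hypotheses in 5 GeV steps from 10 to 125 GeV.
mass_grid = list(range(10, 126, 5))

# Toy templates at 50 and 75 GeV, morphed to a 60 GeV hypothesis.
template_50 = [0.0, 10.0, 0.0]
template_75 = [0.0, 0.0, 10.0]
print(morph_vertical(template_50, template_75, 50.0, 75.0, 60.0))
```

Because each bin is interpolated independently, the total yield of the morphed template is the same linear interpolation of the two input yields, and the inputs are recovered exactly at the simulated masses.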
The uncertainty in the nonresonant background originates from the systematic uncertainty in the fit and the statistical uncertainty from the number of events in the region failing the $N_2^\mathrm{DDT} < 0$ requirement. The signal, $t\bar{t}$, W + γ and Z + γ backgrounds are affected by correlated shape and normalization uncertainties. We constrain the efficiency of the selection based on $N_2^\mathrm{DDT}$ in the $t\bar{t}$ control region, with the scale factor uncertainty applied to the yields of signal and the resonant backgrounds in the final fit to the signal region. The jet mass scale and resolution uncertainties are considered as uncertainties in the shape of the signal and the resonant background components in the fit. Finally, uncertainties associated with the jet energy corrections [26], trigger efficiency, lepton veto efficiency, resonant background normalizations and the integrated luminosity determination [46] are applied to the expected yields of the signal and the resonant backgrounds. These are summarized in Table 1. To validate the robustness of the fit, a goodness-of-fit test and bias tests are performed using simulated data with a variety of simulated signals injected. No significant bias is observed for any $Z'$ mass.
The results of the fit are used to set 95% confidence level (CL) upper limits on $g_q$. Upper limits are computed under a modified frequentist approach, using the CL$_\mathrm{s}$ criterion [47,48]. A profile likelihood ratio is used as the test statistic, and its distributions under the null and alternate hypotheses are determined with asymptotic approximations [49]. Limits are shown in Fig. 2 as a function of the resonance mass. Coupling values above the solid curves are excluded at 95% CL. Systematic uncertainties are treated as nuisance parameters, which are modeled with lognormal priors and profiled in the limit calculations. Values of $g_q$ greater than 0.3 are excluded at 95% CL for the entire mass range. For most of the mass range below 50 GeV, made accessible by the trigger strategy, the exclusion from this analysis is more stringent than the indirect limits set by measurements of the Z boson and Υ meson decay widths [18].
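For reference, the CL$_\mathrm{s}$ criterion used above can be written in its usual form. The notation here is an assumption introduced for illustration: $q$ denotes the test statistic, with larger values indicating worse agreement with the signal hypothesis, and $q_\mathrm{obs}$ is its observed value.

$$\mathrm{CL_s} = \frac{P(q \geq q_\mathrm{obs} \mid \text{signal + background})}{P(q \geq q_\mathrm{obs} \mid \text{background only})}$$

A signal hypothesis is excluded at 95% CL when $\mathrm{CL_s} < 0.05$. Dividing by the background-only probability protects against excluding signals to which the analysis has little sensitivity, since a downward fluctuation of the background lowers the numerator and the denominator together.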
In summary, a search for a low-mass $Z'$ resonance decaying to $q\bar{q}$ pairs has been presented, using data from proton-proton collisions at the LHC with a center-of-mass energy of 13 TeV. Jet substructure and decorrelation techniques are implemented to search for narrow resonances over a smoothly falling background of the groomed jet mass. No significant excess is observed above the standard model expectation. Upper limits are placed on the quark coupling strength $g_q$ of $Z'$ bosons with masses between 10 and 125 GeV. Below 50 GeV, the results obtained with this trigger strategy probe the lowest diquark resonance masses reached by a hadron collider.

Table 1: The systematic uncertainties included in the computation of the limit on the coupling strength of $Z'$ to quarks. Parameters denoted by the symbol affect both the shape and normalization of the affected processes; otherwise only the normalization is modified. The parameters affecting normalizations have log-normal priors, and those affecting the shape have Gaussian priors, unless marked with the † symbol, which denotes that this parameter was floating and constrained by the final simultaneous fit of the passing and failing distributions.

Systematic effect | Affected processes | Uncertainty (%)
Polynomial (remaining table entries not recoverable)

Figure 2: Upper limits at 95% CL on the coupling strength $g_q$ of $Z' \to q\bar{q}$. The observed limit is shown as a solid black line, while the expected limit is dashed. The green (dark) and yellow (light) bands represent 1 and 2 standard deviation intervals. Limits from other searches and the indirect constraint from measurements of the Υ and Z boson decay widths [18] are also shown.

[19] CMS Collaboration, "The CMS trigger system", JINST 12 (2017) P01020, doi:10.1088/1748-0221/12/01/P01020, arXiv:1609.02366.