Search for pair production of scalar and vector leptoquarks decaying to muons and bottom quarks in proton-proton collisions at √ s = 13 TeV

A search for pair production of scalar and vector leptoquarks (LQs) each decaying to a muon and a bottom quark is performed using proton-proton collision data collected at √s = 13 TeV with the CMS detector at the CERN LHC, corresponding to an integrated luminosity of 138 fb⁻¹. No excess above the standard model expectation is observed. Scalar (vector) LQs with masses less than 1810 (2120) GeV are excluded at 95% confidence level, assuming a 100% branching fraction of the LQ decaying to a muon and a bottom quark. These limits represent the most stringent to date.


Introduction
The standard model (SM) of particle physics describes the interactions of elementary particles to high accuracy. There remain, however, observations of particle physics that are not fully explained, such as the similarities between the quark and lepton families. In some theoretical models of physics with extensions beyond the SM, these similarities are a manifestation of a deeper symmetry between the two types of particles, and there naturally arise particles that bridge the two families. Leptoquarks (LQs) are new bosons that would manifest this fundamental connection between quarks and leptons and are predicted by numerous extensions of the SM, such as grand unified theories [1][2][3][4][5][6][7][8], composite models with lepton and quark substructure [9], technicolor models [10][11][12], and superstring-inspired models [13]. Particles with similar decay modes are also found in R-parity violating supersymmetry [14][15][16][17][18][19][20][21][22][23]. Leptoquarks are color-triplet scalar or vector bosons carrying both lepton and baryon numbers. They generally decay either to a charged lepton and a quark, or to a neutrino and a quark. In recent years, there have been a number of observed tensions with the SM in measurements of B meson decays and tests of lepton universality [24][25][26][27][28][29][30][31][32][33][34][35], often present in measurements of processes involving muons and third-generation quarks, as well as in precision measurements of the anomalous magnetic moment of the muon [36][37][38]. These tensions have increased interest in LQs, and a significant body of theoretical literature [39][40][41][42][43] has re-examined the phenomenological and historical constraints of LQ models and searches, in particular removing the traditional constraint that LQs should decay within the same generation of quarks and leptons. Leptoquarks may provide a tree-level explanation of some anomalies in scenarios where the LQ-lepton-quark coupling is not restricted to remain in one generation [40].
At hadron colliders, LQs can be produced singly or in pairs. The dominant leading-order (LO) processes for pair production of LQs at the LHC involve gluon-gluon fusion and quark-antiquark annihilation, shown in Fig. 1. Pair production can also arise from so-called t-channel production, which does not provide a significant contribution for the model and parameters chosen in this analysis and is not considered [44]. Interpretations of direct searches for LQs are typically based on a general model where LQ-lepton-quark interactions are added to the SM Lagrangian [45]. The interactions of scalar LQs with SM particles are completely determined by three parameters [45]: the LQ mass m LQ, the Yukawa coupling at the LQ-lepton-quark vertex λ LQ, and the branching fraction β of the LQ decay to a charged lepton and a quark. Vector LQs depend on an additional parameter κ that relates to their anomalous magnetic and electric quadrupole moments [46]. Values of κ = 0 and 1 are considered in this analysis, corresponding to the minimal coupling and Yang-Mills scenarios, respectively. Leptoquark pair production cross sections (σ) are independent of λ LQ over many orders of magnitude [45]. In this analysis, λ LQ is set to 1, which produces LQs that decay very close to the point of production [40] and ensures that the pair production cross sections and results are independent of λ LQ.
Pair production of LQs is characterized by final states with two leptons and two jets with large transverse momentum p T. Previous limits on scalar LQ pair production decaying to muons and jets in proton-proton (pp) collisions have been published by the CMS and ATLAS Collaborations [47][48][49][50][51]. The CMS result excludes, at 95% confidence level (CL), LQs decaying to muons and light (u, d, s) quarks with m LQ < 1.5 (1.3) TeV for β = 1 (0.5) [48], and ATLAS excludes LQs decaying to muons and b quarks with m LQ < 1.7 TeV for β = 1 [50,51]. Previous limits on vector LQs have been reported by CMS [47] and ATLAS [51], where CMS excludes vector LQs decaying to muons and light quarks with m LQ < 1.3 (1.5) TeV for β = 1 in the minimal coupling (Yang-Mills) scenario, and ATLAS excludes vector LQs decaying to muons and b quarks with m LQ < 1.7 (2.0) TeV for β = 0.5 in the minimal coupling (Yang-Mills) scenario. The present analysis searches for scalar and vector LQs that decay to a muon and a bottom quark, using data collected by the CMS detector during the 2016-2018 pp LHC runs at √s = 13 TeV. Results are presented for a value of β = 1, corresponding to maximal production of the µµbb final state, as well as in the range 0 < β < 1.

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity (η) coverage provided by the barrel and endcap detectors. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [52].
Events of interest are selected using a two-tiered trigger system. The first level (L1), composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100 kHz within a fixed latency of about 4 µs [53]. The second level, known as the high-level trigger, consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to around 1 kHz before data storage [54].

Event reconstruction
The primary vertex (PV) is taken to be the vertex corresponding to the hardest scattering in the event, evaluated using tracking information alone, as described in Section 9.4.1 of Ref. [55]. The global event reconstruction (also called particle-flow event reconstruction [56]) aims to reconstruct and identify each individual particle in an event, with an optimized combination of all subdetector information. In this process, the identification of the particle type (photon, electron, muon, charged or neutral hadron) plays an important role in the determination of the particle direction and energy. Photons (e.g., coming from π0 decays or from electron bremsstrahlung) are identified as ECAL energy clusters not linked to the extrapolation of any charged particle trajectory to the ECAL. Electrons (e.g., coming from photon conversions in the tracker material or from B hadron semileptonic decays) are identified as primary charged particle tracks and potentially many ECAL energy clusters corresponding to these track extrapolations to the ECAL and to possible bremsstrahlung photons emitted along the way through the tracker material. Muons (e.g., from B hadron semileptonic decays) are identified as tracks in the central tracker consistent with either a track or several hits in the muon system, and associated with calorimeter deposits compatible with the muon hypothesis. Charged hadrons are identified as charged particle tracks identified neither as electrons nor as muons. Finally, neutral hadrons are identified as HCAL energy clusters not linked to any charged hadron trajectory, or as a combined ECAL and HCAL energy excess with respect to the expected charged hadron energy deposit.
Jets originating from bottom quarks are identified using the DEEPJET discriminator [57], a multiclass jet flavor tagging algorithm that uses deep learning and exploits low-level features from a high number of jet constituents. A working point with a 75% efficiency to identify b jets and a 1% misidentification probability for light jets is used in this analysis [58,59].
The energy of photons is obtained from the ECAL measurement. The energy of electrons is determined from a combination of the track momentum at the main interaction vertex, the corresponding ECAL cluster energy, and the energy sum of all bremsstrahlung photons attached to the track. The energy of muons is obtained from the corresponding track momentum. The energy of charged hadrons is determined from a combination of the track momentum and the corresponding ECAL and HCAL energy deposits, corrected for the response function of the calorimeters to hadronic showers. Finally, the energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energy deposits.
For each event, hadronic jets are clustered from these reconstructed particles using the infrared and collinear safe anti-k T algorithm [60,61] with a distance parameter of 0.4. Jet momentum is determined as the vectorial sum of all particle momenta in the jet, and is found from Monte Carlo (MC) simulation to be, on average, within 5-10% of the true momentum over the whole p T spectrum and detector acceptance. Additional pp interactions within the same or nearby bunch crossings (pileup) can contribute additional tracks and calorimetric energy depositions to the jet momentum. To mitigate this effect, charged particles identified to be originating from pileup vertices are discarded and an offset correction is applied to correct for remaining contributions [62]. Jet energy corrections are derived from simulation to bring the measured response of jets to that of particle level jets on average. In situ measurements of the momentum balance in dijet, photon+jet, Z+jet, and multijet events are used to account for any residual differences in the jet energy scale between data and simulation [63]. The jet energy resolution amounts typically to 15-20% at 30 GeV, 10% at 100 GeV, and 5% at 1 TeV [63]. Additional selection criteria are applied to each jet to remove jets potentially dominated by anomalous contributions from various subdetector components or reconstruction failures [62].
Muons are measured in the range |η| < 2.4, with detection planes made using three technologies: drift tubes, cathode strip chambers, and resistive plate chambers. The single muon trigger efficiency exceeds 90% over the full η range, and the efficiency to reconstruct and identify muons is greater than 96%. Matching muons to tracks measured in the silicon tracker results in a relative p T resolution of 1% in the barrel and 3% in the endcaps for muons with p T up to 100 GeV, and of better than 7% in the barrel for muons with p T up to 1 TeV [64]. Muons are required to be isolated from tracks arising from electromagnetic and hadronic activity in the event that may otherwise mimic muon signatures in the detector.
The missing transverse momentum vector $\vec{p}_{\mathrm{T}}^{\,\mathrm{miss}}$ is computed as the negative vector p T sum of all the particle-flow candidates in an event, and its magnitude is denoted as $p_{\mathrm{T}}^{\mathrm{miss}}$ [65]. The $\vec{p}_{\mathrm{T}}^{\,\mathrm{miss}}$ is modified to account for corrections to the energy scale of the reconstructed jets in the event.

Data and simulated samples
The data set used in this paper was collected by the CMS detector during the 2016-2018 pp LHC runs at √s = 13 TeV and corresponds to an integrated luminosity of 138 fb⁻¹ [66][67][68]. The integrated luminosities for the 2016, 2017, and 2018 data-taking years have 1.2-2.5% individual uncertainties [66][67][68], while the overall uncertainty for the 2016-2018 period is 1.6%. Events are selected using triggers that require at least one muon with p T > 50 GeV, with no isolation requirements.
Signal samples are produced in 100 GeV steps for values of m LQ from 300 to 3000 GeV at LO in quantum chromodynamics (QCD) with MADGRAPH5_aMC@NLO [69,70] versions 2.2.2 and 2.3.3, using the triplet scalar S3 model of Ref. [39]. For this analysis, the LQ and its antiparticle are each forced to decay to a muon and a bottom quark with a branching fraction of 100%, giving rise to a µ−µ+bb final state. These samples are used to study the signal efficiency.
The main SM background processes in this final state are tt+jets and Z/γ*+jets production. Other minor contributions from SM processes include diboson (WW/WZ/ZZ)+jets, tt + Z/W/H (ttV), single top quark production, and W+jets. Background from QCD multijets has been shown to be negligible [48] and is not considered in this analysis.
The parton distribution function (PDF) sets used for generating the 2016 (2017-2018) signal and background samples are NNPDF3.0 NLO (NNPDF3.1 LO for signal, NNPDF3.1 NNLO for background) [78]. The full CMS detector geometry and response are simulated using GEANT4 [79,80]. All samples use the CUETP8M1 [81] (CP2 for signal, CP5 for SM backgrounds [82]) underlying event tune, with additional pp interactions overlaid and corrected to match the distribution measured in data.
The scalar LQ pair production cross sections are calculated with MADGRAPH5_aMC@NLO using the tool from Ref. [83] with NLO QCD corrections [83][84][85] and the PDF4LHC15 [86] NLO PDF set, and are used for comparison with data and background estimations in the statistical analysis. For the vector LQ interpretation, LQ production cross sections are calculated at LO in QCD using a singlet U1 model in which the LQ couples only to left-handed fermions [83]. The scalar LQ simulated samples are again used for the signal efficiency. It has been shown that relevant vector LQ pair production kinematic distributions agree within 10% with equivalent scalar pair production, for m LQ > 500 GeV [40].
The simulated samples are corrected so that the detector response and resolution for both leptons and jets (including b tagged jets) and the trigger efficiency match those measured in data [63,64].

Background estimation
The main SM processes that can mimic the LQ signal in this channel are Z/γ*+jets, tt+jets, and to a lesser extent diboson production. Backgrounds are estimated and validated using a selection dominated by background events, referred to as the preselection. This preselection requires two muons with p T > 53 GeV, to remain fully efficient with respect to the trigger requirements, and at least two jets with p T > 50 GeV, at least one of which is tagged as a bottom jet. Each event is required to have only two muons, and a veto on a third muon or an electron with p T > 20 GeV is imposed. No requirement is placed on the muon charges. The muons are required to be separated from each other by $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2} > 0.3$, where ϕ is the azimuthal angle (in radians) with respect to the counterclockwise beam axis. Two further requirements are m µµ > 50 GeV and $S_{\mathrm{T}}^{\mu\mu jj}$ > 300 GeV, where m µµ is the invariant mass of the dimuon system, and $S_{\mathrm{T}}^{\mu\mu jj}$ is defined as the scalar p T sum of the two leading jets and two muons in the event.
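As an illustration of the two kinematic quantities defined above, the following minimal sketch computes the ∆R separation (with the azimuthal difference wrapped into [−π, π]) and the scalar p T sum of the two leading muons and two leading jets. The function names and the event values are hypothetical and are not taken from the analysis software.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)  # Python's % maps into [0, 2*pi)
    return dphi - 2.0 * math.pi if dphi > math.pi else dphi

def delta_r(eta1, phi1, eta2, phi2):
    """Delta R = sqrt(d_eta^2 + d_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def s_t(muon_pts, jet_pts):
    """Scalar pT sum (GeV) of the two leading muons and two leading jets."""
    lead_mu = sorted(muon_pts, reverse=True)[:2]
    lead_jet = sorted(jet_pts, reverse=True)[:2]
    return sum(lead_mu) + sum(lead_jet)

# Hypothetical event: muons as (pT [GeV], eta, phi), jets as pT [GeV] only.
muons = [(320.0, 0.4, 3.0), (150.0, -1.1, -3.0)]
jet_pts = [410.0, 95.0, 60.0]

dr = delta_r(muons[0][1], muons[0][2], muons[1][1], muons[1][2])
st = s_t([m[0] for m in muons], jet_pts)
passes = dr > 0.3 and st > 300.0  # the two preselection thresholds quoted above
```

The wrap in `delta_phi` matters for this event: the raw difference in ϕ is 6.0 rad, but the two muons are only about 0.28 rad apart in azimuth.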
The Z/γ*+jets, tt+jets, diboson, and ttV contribution estimates make use of background-dominated data control regions (CRs) after the preselection. Background shapes are taken from simulation, and show good shape agreement with the data in the CRs. For normalization, the simulation is compared to data in the different CRs, and the measured data normalization scale factors are applied to simulated events in the analysis. The Z/γ*+jets and tt normalizations are computed iteratively, then held constant to compute the simultaneous diboson and ttV normalization. For Z/γ*+jets, the CR is an m µµ window of 80-100 GeV around the Z peak, and for tt+jets it is a window of 100-250 GeV. Both are computed at the preselection level. For diboson and ttV processes, the normalization is performed simultaneously, with a CR again of 80 < m µµ < 100 GeV around the Z peak, but with the preselection modified by the additional requirement of a third lepton (to remain orthogonal to the Z/γ*+jets CR) and no b tag requirement (to diminish the statistical uncertainty). Background contributions from single top quark and W+jets events are estimated from simulation. The CR m µµ distributions, after application of the data normalization scale factors, are shown in Fig. 2. The signal contribution in all control regions is negligible.
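The text does not spell out the iterative procedure, so the following is only a sketch of one plausible reading: each scale factor is solved for in its own CR while the other is held at its current value, and the two steps are alternated until they stabilize. The function name, dictionary keys, and all yields are hypothetical.

```python
def fit_scale_factors(cr_z, cr_tt, n_iter=50):
    """Alternately solve for the Z+jets and ttbar normalization scale factors.

    Each CR is a dict of event yields: observed "data", simulated "z" and "tt",
    and "other" for the remaining simulated backgrounds (kept fixed).
    """
    sf_z, sf_tt = 1.0, 1.0
    for _ in range(n_iter):
        # Z-enriched CR: subtract the (rescaled) ttbar and other contributions.
        sf_z = (cr_z["data"] - sf_tt * cr_z["tt"] - cr_z["other"]) / cr_z["z"]
        # ttbar-enriched CR: subtract the (rescaled) Z and other contributions.
        sf_tt = (cr_tt["data"] - sf_z * cr_tt["z"] - cr_tt["other"]) / cr_tt["tt"]
    return sf_z, sf_tt

# Hypothetical yields, constructed so that the exact answer is (1.1, 0.9).
cr_z = {"data": 1055.0, "z": 900.0, "tt": 50.0, "other": 20.0}
cr_tt = {"data": 860.0, "z": 100.0, "tt": 800.0, "other": 30.0}
sf_z, sf_tt = fit_scale_factors(cr_z, cr_tt)
```

Because each CR is dominated by its target process, the cross-contamination terms are small and the alternation converges in a few iterations.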
The background predictions are validated at the preselection level by comparing them with data in all relevant kinematic distributions. Distributions at preselection of the muon and jet p T spectra are shown in Fig. 3. There is good agreement except in the high-p T regions of the jet spectra, where there is a tendency for the simulation to overestimate the rate. The small discrepancy is not important since the analysis sensitivity is dependent on the final selections discussed in Section 6.1, and the jet p T distributions play a limited role in the final selection.

Final selection optimization
In order to separate signal from background, a final selection is defined for each m LQ hypothesis. Multivariate techniques are used to increase the signal sensitivity compared to an analysis based on a set of independent selection criteria. A boosted decision tree (BDT) is trained on background and signal events at a modified preselection level for each m LQ hypothesis. The modified preselection requires additional selection criteria of m µµ > 250 GeV and m µµjj > m LQ, to maintain the separation of the CR from the signal regions and reduce training bias in background-enriched and signal-depleted regions. An iterative procedure identifies eleven minimally-correlated kinematic variables with strong signal-to-background separation power to use as the input variables to the BDTs:
• invariant masses: m µµ and m µµjj;
• reconstructed LQ invariant masses m µj1 and m µj2, defined by pairing the two muons and jets such as to minimize the mass difference;
• final-state momenta of muons and jets: p T(µ1), p T(µ2), p T(j1), and p T(j2);
• combined momenta: $S_{\mathrm{T}}^{\mu\mu jj}$ and $p_{\mathrm{T}}^{\mathrm{miss}}$;
• ∆R separation between the dimuon pair and the leading-p T jet momentum vectors.
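The muon-jet pairing used for the reconstructed LQ masses can be sketched as follows. With two muons and two jets there are only two possible pairings, and the one with the smaller difference between the two muon-jet invariant masses is kept. This is a minimal illustration with hypothetical massless four-vectors; the ordering of the two returned masses (larger first) is an assumption, since the text does not define it.

```python
import math

def four_vector(pt, eta, phi, mass=0.0):
    """Return (E, px, py, pz) from collider coordinates."""
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)

def inv_mass(a, b):
    """Invariant mass of the sum of two four-vectors."""
    e, px, py, pz = (x + y for x, y in zip(a, b))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def pair_muons_jets(mu1, mu2, j1, j2):
    """Pick the muon-jet pairing with the smallest mass difference."""
    pairings = [((mu1, j1), (mu2, j2)), ((mu1, j2), (mu2, j1))]
    best = min(pairings, key=lambda p: abs(inv_mass(*p[0]) - inv_mass(*p[1])))
    m_a, m_b = inv_mass(*best[0]), inv_mass(*best[1])
    return max(m_a, m_b), min(m_a, m_b)  # ordering is an assumption

# Hypothetical event: each muon is back to back with "its" jet.
mu1 = four_vector(600.0, 0.0, 0.0)
j1 = four_vector(400.0, 0.0, math.pi)
mu2 = four_vector(500.0, 0.0, math.pi / 2)
j2 = four_vector(480.0, 0.0, -math.pi / 2)
m_hi, m_lo = pair_muons_jets(mu1, mu2, j1, j2)
```

In this example the chosen pairing gives two equal masses of about 980 GeV, whereas the swapped pairing would give roughly 759 and 632 GeV.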
Training samples are constructed using the full MC data set, with random event sampling from the background and signal simulated samples. The gradient boost [87][88][89] algorithm is used for the BDT training, avoiding input variables that are too highly correlated, and studies of the independent training and testing events show no evidence of overtraining. Examples of the BDT response at the preselection level are shown in Fig. 4.
The signal-to-background separation is optimized individually for each m LQ using the relevant BDT discriminant as the sensitive variable. The figure of merit in the optimization is the Punzi significance [90] for a discovery potential of five standard deviations. This method is optimal for both making a discovery and for setting limits, and is valid in cases with low background event counts. A constant BDT discriminant selection is used for signal mass points m LQ > 1800 GeV to avoid instabilities in the optimization due to a limited event count in the simulated samples. The signal efficiency for each m LQ hypothesis, defined as the number of events passing final selection divided by the total number of generated events, is shown in Fig. 5. The discrete nature of the final selection for each LQ candidate mass produces the observed variation in the efficiency. The background rejection ranges from 55 to 96% and is above 90% for all m LQ > 600 GeV.
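For orientation, the Punzi figure of merit is commonly written as ε(t) / (a/2 + √B(t)), where ε is the signal efficiency and B the expected background yield above a threshold t, and a is the target significance in standard deviations (a = 5 here). The text does not state which variant of the formula is used, so the sketch below assumes this simplest form; the scores, weights, and threshold grid are hypothetical.

```python
import math

def punzi_fom(sig_eff, n_bkg, n_sigma=5.0):
    """Punzi figure of merit: eff / (a/2 + sqrt(B))."""
    return sig_eff / (n_sigma / 2.0 + math.sqrt(n_bkg))

def best_threshold(sig_scores, bkg_scores, bkg_weights, thresholds, n_sigma=5.0):
    """Scan discriminant thresholds and return (threshold, fom) with maximal fom."""
    best = None
    for t in thresholds:
        eff = sum(1 for s in sig_scores if s > t) / len(sig_scores)
        n_bkg = sum(w for s, w in zip(bkg_scores, bkg_weights) if s > t)
        fom = punzi_fom(eff, n_bkg, n_sigma)
        if best is None or fom > best[1]:
            best = (t, fom)
    return best

# Hypothetical BDT scores; background events carry weights (expected yields).
sig_scores = [0.9, 0.8, 0.95, 0.6]
bkg_scores = [0.1, 0.3, 0.7, 0.85]
bkg_weights = [10.0, 10.0, 4.0, 1.0]
best_t, best_fom = best_threshold(sig_scores, bkg_scores, bkg_weights,
                                  thresholds=[0.0, 0.5, 0.75, 0.9])
```

Unlike S/√B, the a/2 term keeps the figure of merit finite as B approaches zero, which is why it remains usable at the low background counts mentioned above.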
A detailed table of the event counts in data, expected background, and expected signal at final selection is available in the HEPData record for this analysis [91]. These data are also represented visually in Fig. 6. The x axis shows the final selection event yields for each of the individual m LQ hypotheses shown on the y axis. Each bin on the y axis represents an independent m LQ hypothesis. For example, the uppermost bin is a comparison of the yields after final selection in the m LQ = 300 GeV signal mass hypothesis. The hatched band represents the combined statistical and systematic uncertainty in the full background estimate.

Systematic uncertainties
Theoretical uncertainties in the LQ signal production cross sections vary from 13 to 36% across the m LQ range of 300-2500 GeV. They are estimated by varying the PDF eigenvectors within their uncertainties and the renormalization and factorization scales by factors of one-half and two [83].
Systematic uncertainties in the background yields and in the signal efficiency are calculated for each final selection by running the full analysis with separately varied detector conditions, particle momenta, or scale factors. These yields are compared to those for the nominal analysis, and the differences from nominal are used in the analysis of uncertainties.
Systematic uncertainties in the jet energy resolution [63] and muon momentum resolution [64] are measured by smearing the jet and muon momenta, including high-p T specific corrections for muons [92]. Uncertainties due to the jet energy scale and the muon momentum scale are estimated by propagating up and down the uncertainties in the applied jet and muon momentum corrections.
Shape uncertainties of the main background predictions are estimated by independently varying the factorization and renormalization scales in the simulation by factors of one-half and two. The PDF uncertainty is estimated by varying the NNPDF eigenvectors within their uncertainties, following the PDF4LHC prescription [86,93].
The uncertainties in the Z/γ * +jets, tt+jets, diboson, and ttV background normalizations are estimated by varying the normalization scale factors described in Section 5 up and down by their statistical uncertainties.
Other sources of systematic uncertainty include: the luminosity measurement [66][67][68]; muon reconstruction, identification, and isolation [64]; b tagging efficiency [58,59]; pileup [94]; trigger efficiency and prefiring (L1 trigger incorrectly assigned to an earlier bunch crossing); top p T reweighting (to account for differences in top quark p T spectra observed in data and simulation) [95]; and track reconstruction efficiency. For LQ mass hypotheses above 1 TeV, the systematic uncertainties in the muon reconstruction efficiency become dominant and have a significant effect on the signal samples. These large uncertainties are due to limited numbers of very high momentum muons in the data control samples used to derive the data-to-simulation correction factors.
The effects of these systematic uncertainties on the signal efficiency and total background yield are shown in Table 1. The maximum values given in Table 1 are only relevant for large values of m LQ, where the total uncertainty is dominated by the statistical uncertainty in the simulated background samples. For most values of m LQ, the systematic uncertainties are at the lower end of the range, as can be seen in the two rightmost columns of the table, which show the uncertainties for the m LQ = 1800 GeV signal point.

Limit setting
The data are compared to background predictions after the final selections have been applied. No significant excess above the predicted background is identified for any m LQ. Limits are set on the LQ pair production cross section as a function of m LQ, obtained using the asymptotic approximation [96] of the modified frequentist CLs approach [97,98], which uses the ratio of the tail probabilities in the signal+background to background hypotheses. The systematic uncertainties described above are introduced as nuisance parameters in the limit setting procedure using log-normal probability functions. Uncertainties of a statistical nature are described by Γ distributions with widths determined by the number of events in simulated samples or observed in data CRs. Individual data sets from different years are treated separately, then statistically combined into a single limit on the full data set. Most systematic uncertainties are treated as fully uncorrelated across years in the combination, with the exception of PDF, pileup, and shape uncertainties, which are fully correlated, and the integrated luminosity uncertainty, which is partially correlated.
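The CLs ratio can be illustrated with a toy single-bin counting experiment. This sketch is not the asymptotic profile-likelihood procedure used in the analysis: it ignores all nuisance parameters, uses the observed count itself as the test statistic, and takes CLs = P(n ≤ n_obs | s+b) / P(n ≤ n_obs | b). The function names and inputs are hypothetical.

```python
import math

def poisson_cdf(n, mu):
    """P(N <= n) for a Poisson distribution with mean mu."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total

def cls(n_obs, s, b):
    """Ratio of tail probabilities under signal+background and background-only."""
    return poisson_cdf(n_obs, s + b) / poisson_cdf(n_obs, b)

def upper_limit(n_obs, b, cl=0.95, s_max=100.0, tol=1e-6):
    """Signal yield at which CLs falls to 1 - cl, found by bisection.

    s_max is an assumed search ceiling, adequate for small counts only.
    """
    lo, hi = 0.0, s_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cls(n_obs, mid, b) > 1.0 - cl:
            lo = mid  # not yet excluded, move the lower edge up
        else:
            hi = mid
    return 0.5 * (lo + hi)

limit_no_bkg = upper_limit(n_obs=0, b=0.0)
limit_with_bkg = upper_limit(n_obs=0, b=3.0)
```

With zero observed events, both calls give about 3.0 signal events regardless of the expected background. Dividing by the background-only tail probability is what prevents a downward background fluctuation from excluding signals to which the search has little sensitivity.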
The 95% CL upper limits on σβ² as a function of m LQ are shown in Fig. 7, together with the NLO (LO) predictions for the scalar (vector) LQ pair production cross sections. Theoretical uncertainties in the LQ signal production cross sections are shown as a band around the signal production cross section. Uncertainties in the vector model cross sections are larger than in the scalar model because of the LO assumptions in the interference of the helicity states of the vector model [83,99]. The discrete nature of the final selection for each LQ candidate mass produces the observed variation in the observed limit. By comparing the observed upper limit with the theoretical cross section values, scalar LQs with m LQ < 1810 GeV are excluded under the assumption of β = 1, in agreement with the median expected limit of 1810 GeV. Vector LQs with m LQ < 2120 (2460) GeV are excluded in the minimal coupling (Yang-Mills) scenario, under the assumption of β = 1, compared to the median expected limit of 2200 (2580) GeV.
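The mass limit is the point where the falling theoretical cross section crosses the upper-limit curve. The text does not say how the crossing is interpolated between the simulated mass points, so the sketch below assumes linear interpolation of the logarithm of the ratio between neighbouring points. All cross section values are hypothetical and do not reproduce the curves in Fig. 7.

```python
import math

def excluded_mass(masses, theory_xsec, limit_xsec):
    """Mass at which the theory curve first drops below the upper limit.

    Interpolates log(theory / limit) linearly between adjacent mass points.
    Returns None if the curves do not cross in the scanned range.
    """
    diffs = [math.log(t) - math.log(l) for t, l in zip(theory_xsec, limit_xsec)]
    for i in range(len(masses) - 1):
        d0, d1 = diffs[i], diffs[i + 1]
        if d0 > 0.0 >= d1:  # excluded at masses[i], not excluded at masses[i + 1]
            frac = d0 / (d0 - d1)
            return masses[i] + frac * (masses[i + 1] - masses[i])
    return None

# Hypothetical inputs (masses in GeV, cross sections in pb).
masses = [1700.0, 1800.0, 1900.0]
theory = [4e-4, 2e-4, 1e-4]
limit = [1e-4, 1e-4, 2e-4]
m_excl = excluded_mass(masses, theory, limit)
```

Here the theory prediction exceeds the limit at 1700 and 1800 GeV and falls below it at 1900 GeV, so the interpolated exclusion lies at 1850 GeV.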
Figure 8 shows the expected and observed exclusion limits at 95% CL as a function of the scalar LQ mass and β. Scalar LQs with m LQ < 1540 GeV are excluded for β = 0.5, compared to the median expected limit of 1560 GeV.

Figure 8: The expected and observed exclusion limits at 95% CL as a function of the leptoquark mass and the branching fraction β. The solid line represents the observed limits, the dashed line represents the median expected limits, and the inner dark-green and outer light-yellow bands represent the 68 and 95% CL intervals. The area left of the observed limit is excluded.

Summary
A search has been performed for pair production of leptoquarks (LQs) decaying to muons and bottom quarks using proton-proton collision data collected at √s = 13 TeV in 2016-2018 with the CMS detector at the LHC, corresponding to an integrated luminosity of 138 fb⁻¹. Limits are set at 95% confidence level on the product of the scalar LQ pair production cross section and β², as a function of the LQ mass m LQ, where β is the branching fraction of the LQ decaying to a muon and a bottom quark. Scalar LQs with m LQ < 1810 GeV are excluded for β = 1. The results are also presented as a function of β, and scalar LQs with m LQ < 1540 GeV are excluded for β = 0.5. A further interpretation is performed with a vector LQ model, and vector LQs with m LQ < 2120 (2460) GeV are excluded in the minimal coupling (Yang-Mills) scenario for β = 1. These represent the most stringent limits to date on these models.

Figure 1: Dominant leading order Feynman diagrams for pair production of LQs at the LHC.

Figure 2: Comparison of data and background m µµ distribution at the preselection level for the Z/γ*+jets and tt+jets (left) and diboson and ttV (right) background control regions, with the corresponding data-to-background ratio shown below. For Z/γ*+jets, the control region is an m µµ window of 80-100 GeV around the Z peak, and for tt+jets it is a window of 100-250 GeV. For diboson and ttV processes, the normalization is performed simultaneously, with a control region again of 80-100 GeV around the Z peak, but with a third lepton requirement (to remain orthogonal to the Z/γ*+jets control region) and no b tag requirement (to diminish the statistical uncertainty). The error bars are the data statistical uncertainties, while the shaded band represents the combined statistical and systematic uncertainty in the full background estimate. The signal contribution in all control regions is negligible.

Figure 3: Comparison of data and background p T distribution at the preselection level for the leading two muons and jets. The error bars are the data statistical uncertainties, while the shaded band represents the combined statistical and systematic uncertainty in the full background estimate.

Figure 4: Comparison of data and background BDT discriminant distributions at the preselection level for LQ mass hypotheses of 1500 GeV (upper left), 1800 GeV (upper right), and 2000 GeV (lower). The error bars are the data statistical uncertainties, while the shaded band represents the combined statistical and systematic uncertainty in the full background estimate.

Figure 5: Total signal selection efficiency, defined as the number of events passing the final selection divided by the number of generated events. The discrete nature of the individual BDT training and final selection for each LQ candidate mass produces the observed variation in the efficiency. Relative uncertainties are less than one percent in all cases.

Figure 6: Data, background, and signal event yields after final selections, for each scalar m LQ hypothesis. Each bin on the y axis represents an independent m LQ hypothesis. The hatched band represents the combined statistical and systematic uncertainty in the full background estimate.

Figure 7: The expected and observed upper limits at 95% CL on the product of the LQ pair production cross section and the branching fraction squared, β², as a function of m LQ. The black solid line represents the observed limits, the dotted line is for the median expected limits, and the inner dark-green and outer light-yellow bands are for the 68 and 95% CL intervals. The solid blue line and corresponding blue band represent the theoretical scalar LQ pair production cross sections and the uncertainties in the cross sections due to the PDF prediction and renormalization and factorization scales, respectively. Similarly, the dash-dotted (dashed) line and corresponding band represent the cross sections of theoretical vector LQ pair production and uncertainties in the minimal coupling (Yang-Mills) scenario.

Table 1: Systematic uncertainties in signal efficiency and background yields in the combined 2016-2018 data set, shown as a range over all final selections (second and third columns) as well as for the m LQ = 1800 GeV point (rightmost two columns). The last two rows show the total systematic and statistical uncertainties in the simulated samples.