Search for baryon number violation in top quark production and decay using proton-proton collisions at √s = 13 TeV

A search is presented for baryon number violating interactions in top quark production and decay. The analysis uses data from proton-proton collisions at a center-of-mass energy of 13 TeV, collected with the CMS detector at the LHC with an integrated luminosity of 138 fb⁻¹. Candidate events are selected by requiring two oppositely charged leptons (electrons or muons) and exactly one jet identified as originating from a bottom quark. Multivariate discriminants are used to separate the signal from the background. No significant deviation from the standard model prediction is observed. Upper limits are placed on the strength of baryon number violating couplings. For the first time the production of single top quarks via baryon number violating interactions is studied. This allows the search to set the most stringent constraints to date on the branching fraction of the top quark decay to a lepton, an up-type quark (u or c), and a down-type quark (d, s, or b). The results improve the previous bounds by three to six orders of magnitude, depending on the fermion flavor combination of the baryon number violating interactions.


In the standard model (SM), the baryon number is a conserved quantum number. Its conservation, however, is not a direct consequence of fundamental symmetries within the SM, and it can be violated by nonperturbative effects [1]. The size of such violations is too small to explain the observed matter-antimatter asymmetry in the universe [2]. Certain scenarios of physics beyond the SM, such as grand unified theories [3] and supersymmetry [4], naturally include baryon number violation (BNV) and could provide a mechanism to explain this observation. Various low-energy direct searches for signatures of BNV have been conducted over the past decades, with constraints set on the BNV energy scale via processes such as nucleon [5], τ lepton [6], c quark [7], and b quark [8] decays. There are also stringent indirect constraints from proton stability involving heavy quarks [9] under specific theoretical assumptions [10]. Experiments at the CERN LHC provide the highest sensitivity for potential high-energy BNV processes involving the top quark. Previously, the CMS Collaboration performed a search for BNV decays of the top quark in single-lepton (electron or muon) channels in proton-proton (pp) collisions at √s = 8 TeV [11]. This Letter presents the first search for top quark BNV interactions via single top quark production in association with a lepton in pp collisions at 13 TeV in dilepton final states. The data used in the analysis correspond to an integrated luminosity of 138 fb⁻¹, collected by the CMS experiment at the LHC during 2016-2018.
Assuming the mass scale of new physics responsible for BNV processes is larger than the energy scale directly accessible at the LHC, BNV interactions of top quarks can be described through an effective Lagrangian, L_eff. Assuming the SM field content and gauge symmetries, the terms in the BNV effective Lagrangian must involve three quark fields and one lepton field [12]. Including up to dimension-six operators, the most general effective Lagrangian that describes the BNV interactions of the top quark and a charged lepton takes the form given in Eq. (1) [13], where d, u, and ℓ are down-type quark, up-type quark, and charged-lepton fields, respectively, and the superscript "c" denotes charge-conjugated fields. Colors are labeled by Greek indices, Λ is the generic scale of new physics, and C_s and C_t are fermion-flavor-dependent effective couplings. The s and t labels in Eq. (1) denote that the new physics scale may be linked to the mass of a heavy mediator exchanged in the s or t channels, respectively [13]. No specific chirality is assumed for the BNV interactions. The terms in L_eff violate baryon and lepton numbers simultaneously. These effective interactions open new top quark decay and production channels at the LHC.
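As a reading aid, a schematic form of the effective Lagrangian consistent with the symbols defined above can be written as follows. This is only a sketch: the Lorentz and chirality structure inside each bilinear is left unspecified, and the placement of the color indices α, β, γ is an assumption that should be checked against Ref. [13].

```latex
\mathcal{L}_{\text{eff}} \sim
  \frac{C_s}{\Lambda^2}\,\epsilon^{\alpha\beta\gamma}
  \left[\overline{t^{c}_{\alpha}}\, d_{\beta}\right]
  \left[\overline{u^{c}_{\gamma}}\, \ell\right]
  + \frac{C_t}{\Lambda^2}\,\epsilon^{\alpha\beta\gamma}
  \left[\overline{t^{c}_{\alpha}}\, \ell\right]
  \left[\overline{u^{c}_{\beta}}\, d_{\gamma}\right]
  + \text{h.c.}
```

Each term contains three quark fields and one lepton field, so both baryon and lepton numbers change by one unit, as stated in the text.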
Figure 1 displays representative Feynman diagrams for single top quark production ("ST mode") and top quark decay ("TT mode") via BNV interactions in top quark-antiquark pair production (tt̄). This analysis uses events in dileptonic final states (e⁺e⁻, e±µ∓, and µ⁺µ⁻) where one lepton is produced via the BNV interaction and a second lepton comes from the decay of the W boson produced in the dominant t → bW decay. The strengths of the twelve flavor combinations of top quark four-fermion BNV interactions are probed in these final states. These take the form tℓq_u q_d, where ℓ can be an electron or muon, q_u can be an up or charm quark, and q_d can be any down-type quark. The BNV interactions with a tau lepton can contribute to the dileptonic final states considered in this analysis via its leptonic decays. However, only a small fraction of these events appears in the signal selection because of the low branching fraction of the leptonic final states and the lower energies of the decay electrons or muons. Therefore, tτq_u q_d interactions are not probed in this analysis.
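The count of twelve flavor combinations follows from two lepton flavors, two up-type quarks, and three down-type quarks. A minimal sketch of the enumeration is given below; the function name and the ASCII label "mu" for the muon are illustrative choices, not notation from the analysis.

```python
from itertools import product

# Flavors probed in the tℓq_u q_d interactions, as listed in the text.
LEPTONS = ["e", "mu"]
UP_TYPE = ["u", "c"]
DOWN_TYPE = ["d", "s", "b"]


def bnv_flavor_combinations():
    """Return labels such as 'teud' for every lepton/up-type/down-type choice."""
    return [
        f"t{lep}{q_up}{q_down}"
        for lep, q_up, q_down in product(LEPTONS, UP_TYPE, DOWN_TYPE)
    ]


combos = bnv_flavor_combinations()
print(len(combos), combos)
```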
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter, and a brass and scintillator hadron calorimeter, each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity (η) coverage provided by the barrel and endcap detectors. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [14].
The data are modeled by Monte Carlo (MC) simulations of the signal and background processes. Simulated events are produced with event generator programs using the next-to-next-to-leading order (NNLO) parton distribution function (PDF) sets from NNPDF3.1 [15]. Parton showering and hadronization are done with PYTHIA v8.240 [16] using the underlying-event tune CP5 [17]. Generated events undergo a full simulation of the detector response using GEANT4 [18]. The presence of simultaneous pp collisions in the same or nearby bunch crossings, referred to as pileup (PU), is modeled by superimposing inelastic pp interactions, simulated using PYTHIA, on all MC events. Simulated events are then reweighted to reproduce the PU distribution observed in data.
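The PU reweighting step can be illustrated with a short sketch. It assumes the PU distributions in data and simulation are available as binned counts over the same bins, and it takes the per-bin weight to be the ratio of the normalized distributions. The function name and the input numbers are hypothetical, and the actual CMS procedure may differ in detail.

```python
def pileup_weights(data_counts, mc_counts):
    """Per-bin weights that reshape the simulated PU distribution to match data.

    Both inputs are binned counts over the same PU bins. Each distribution is
    normalized to unit area before taking the ratio, so the weights change the
    shape of the simulated distribution but not its overall normalization.
    """
    data_total = float(sum(data_counts))
    mc_total = float(sum(mc_counts))
    weights = []
    for n_data, n_mc in zip(data_counts, mc_counts):
        if n_mc > 0:
            weights.append((n_data / data_total) / (n_mc / mc_total))
        else:
            # No simulated events in this bin, so there is nothing to reweight.
            weights.append(0.0)
    return weights


# Hypothetical three-bin example.
example_weights = pileup_weights([10, 20, 10], [20, 10, 10])
print(example_weights)
```

Each simulated event would then carry the weight of the PU bin it falls in.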
Contributions to the background include SM tt̄ production, single top quark production in association with a W boson (tW), W or Z bosons produced in association with tt̄ (tt̄+W/Z), Drell-Yan production in association with jets (DY+jets), W+jets production, and diboson processes (including WW, WZ, and ZZ). The contribution from quantum chromodynamics (QCD) multijet production is found to be negligible. The POWHEG v2.0 next-to-leading order (NLO) MC generator is used to simulate the SM tt̄, tW, and diboson events [19][20][21]. The MADGRAPH5_aMC@NLO v2.6.5 generator is used to simulate the tt̄+W/Z, DY+jets, W+jets, and diboson events [22]. The tt̄ sample is normalized to the cross section calculated at NNLO in QCD, including resummation of next-to-next-to-leading logarithmic (NNLL) soft-gluon terms, with the TOP++ 2.0 program [23]: 832 +20/−29 (scale) ± 35 (PDF + α_S) pb, where α_S is the strong coupling. To improve the modeling of the transverse momentum (p_T) spectrum of the top quark in POWHEG, simulated SM tt̄ events are weighted as a function of the p_T of the top quark to match the expectations at NNLO QCD accuracy, including electroweak corrections [24]. Other simulated samples are normalized to their cross section predictions at NNLO (for DY+jets and W+jets [25]), NLO+NNLL (for tW production [26]), or NLO (for the diboson and tt̄+W/Z processes [27,28]).
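Normalizing a simulated sample to a cross section prediction amounts to scaling each event by σ × L divided by the number of generated events. The sketch below uses the tt̄ cross section and integrated luminosity quoted in the text; the number of generated events is a hypothetical placeholder, and generator-level event weights are ignored for simplicity.

```python
PB_PER_FB = 1000.0  # 1 fb^-1 corresponds to 1000 pb^-1


def expected_yield(cross_section_pb, luminosity_fb):
    """Number of events expected before any selection: sigma times luminosity."""
    return cross_section_pb * (luminosity_fb * PB_PER_FB)


def normalization_weight(cross_section_pb, luminosity_fb, n_generated):
    """Per-event weight that scales a simulated sample to the expected yield."""
    return expected_yield(cross_section_pb, luminosity_fb) / n_generated


# tt-bar cross section (832 pb) and integrated luminosity (138 fb^-1) from the
# text; the generated-event count of 1e8 is a hypothetical placeholder.
ttbar_yield = expected_yield(832.0, 138.0)
ttbar_weight = normalization_weight(832.0, 138.0, 1.0e8)
print(ttbar_yield, ttbar_weight)
```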
The simulated signal samples are generated using the MADGRAPH5_aMC@NLO v2.6.5 generator with the TopBNV model [13] at leading order in QCD. The top quark BNV signal sample has two independent components: (i) ST mode, and (ii) TT mode, as shown in Fig. 1. Independent samples are generated for the various possible fermion flavor combinations. The top quark mass and width are set to 172.5 and 1.33 GeV, respectively. No interference between the BNV signal and SM processes is considered, and the BNV couplings affect only the signal yield. Since the BNV couplings are probed individually, signal samples are generated separately for nonzero C_t and C_s couplings, assuming Λ = 1 TeV. The BNV signal cross sections and the branching fractions for the BNV top quark decays depend quadratically on C/Λ² [13]. Theoretical cross sections for single top quark production and top quark decays via the BNV interactions are shown in Table 1. The dominant signal process is the ST mode because of its larger cross section. Final-state particles in the ST mode also have a harder p_T spectrum compared to SM processes and the TT mode [13]. Therefore, the analysis is optimized with respect to the ST mode signatures and the TT mode contribution is added for completeness.
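Because the signal cross sections depend quadratically on C/Λ², a cross section computed at the reference point (C = 1, Λ = 1 TeV) can be rescaled to any other coupling or scale, and a cross section limit can be converted to a coupling limit. The sketch below illustrates this; the function names and the numerical inputs are hypothetical and are not values from Table 1.

```python
import math


def scale_cross_section(sigma_ref, coupling, scale_tev,
                        ref_coupling=1.0, ref_scale_tev=1.0):
    """Rescale a reference cross section using sigma proportional to (C / Lambda^2)^2."""
    ratio = (coupling / scale_tev**2) / (ref_coupling / ref_scale_tev**2)
    return sigma_ref * ratio**2


def coupling_from_cross_section(sigma, sigma_ref, scale_tev=1.0):
    """Invert the quadratic scaling: the coupling C that gives cross section sigma.

    sigma_ref is the cross section at C = 1 and Lambda = 1 TeV.
    """
    return math.sqrt(sigma / sigma_ref) * scale_tev**2


# Hypothetical reference cross section of 2.0 pb.
print(scale_cross_section(2.0, 0.5, 1.0))   # halving C lowers sigma by a factor of 4
print(scale_cross_section(2.0, 1.0, 2.0))   # doubling Lambda lowers sigma by a factor of 16
```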
Table 1: Theoretical inclusive cross sections, in picobarns (pb), for single top quark production (ST) and top quark-antiquark pair production with the decay (TT) via BNV interactions, assuming a top quark mass of 172.5 GeV, a top quark decay width of 1.33 GeV, Λ = 1 TeV, and C_t = 1 or C_s = 1. The uncertainties arising from the choice of the renormalization and factorization scales and PDFs are given as σ ± scale ± PDF. Here, the sum of the two cross sections is given, where ℓ = e or µ.

Signal events in the ST mode contain two opposite-sign leptons, one jet originating from a bottom quark, and missing energy due to the undetected neutrino from the top quark decay.
Events were selected online during data taking by a combination of single-electron and dielectron triggers for the e⁺e⁻ events, as in Refs. [29,30]. Single-muon triggers are used for the e±µ∓ and µ⁺µ⁻ events, as described in Ref. [31]. The particle-flow (PF) algorithm aims at reconstructing individual particles (photons, charged and neutral hadrons, muons, and electrons) by combining information from the various components of the CMS detector [32]. The primary vertex is taken to be the vertex corresponding to the hardest scattering in the event, evaluated using tracking information alone, as described in Section 9.4.1 of Ref. [33]. Electron and muon candidates [34,35] are required to lie within |η| < 2.4 to keep them within the silicon tracker coverage. Electron candidates in the transition region between barrel and endcap calorimeters (1.44 < |η| < 1.57) are removed. The same high-p_T lepton identification and isolation criteria described in Ref. [31] are used to reject nonprompt leptons. Events are required to have exactly two opposite-sign leptons. To operate well above the trigger threshold, the selected electron (muon) is required to have p_T > 35 (53) GeV. Selected events are divided based on their lepton flavors into three mutually exclusive categories: e⁺e⁻, e±µ∓, and µ⁺µ⁻. To suppress backgrounds, especially from the DY+jets processes, we reject events in which the two leptons have an invariant mass below 106 GeV.
Jets are reconstructed from the PF candidates using the anti-k_T clustering algorithm with a distance parameter of 0.4 [36,37]. We select jets with |η| < 2.4 and p_T > 30 GeV. Jets originating from b quarks (b jets) are identified (b-tagged) using the DEEPJET algorithm [38] with an average b tagging efficiency of 68% and a light quark and gluon jet misidentification rate of 1.1%. Events are required to have exactly one b-tagged jet. The missing transverse momentum (p⃗_T^miss) is defined as the negative vector p⃗_T sum of all PF particles, and its magnitude is denoted as p_T^miss [39]. Events with p_T^miss < 60 GeV are rejected to further suppress the DY+jets events.
The selected events in the signal region have exactly one opposite-sign lepton pair with invariant mass greater than 106 GeV, p_T^miss > 60 GeV, and exactly one b-tagged jet, irrespective of the number of untagged jets. The dominant background is the tt̄ process (∼89%), followed by the tW (∼9%) and DY+jets (∼1%) processes. Signal selection efficiencies for the ST (TT) mode are about 2.8 (1.1)% with respect to an inclusive MC sample for the tℓud flavor combination, assuming C_s = 0 and nonzero C_t. The fractional uncertainties in the signal efficiencies due to the effects described below range from 4 to 9%.
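The kinematic part of the signal region definition can be summarized in a short sketch. The thresholds are those quoted in the text; the `Lepton` class, the function names, and the choice to pass the dilepton mass, p_T^miss, and b-tagged jet multiplicity as precomputed inputs are illustrative assumptions. Trigger, identification, and isolation requirements are not modeled, and counting only leptons that pass the kinematic cuts is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass
class Lepton:
    flavor: str   # "e" or "mu"
    charge: int   # +1 or -1
    pt: float     # GeV
    eta: float


def passes_lepton_cuts(lep):
    """Kinematic lepton requirements quoted in the text."""
    abs_eta = abs(lep.eta)
    if abs_eta >= 2.4:
        return False
    if lep.flavor == "e":
        # Remove electrons in the barrel-endcap transition region.
        if 1.44 < abs_eta < 1.57:
            return False
        return lep.pt > 35.0
    return lep.pt > 53.0


def in_signal_region(leptons, m_ll, met, n_btag):
    """Exactly two opposite-sign leptons, m_ll > 106 GeV, p_T^miss > 60 GeV, one b tag."""
    good = [lep for lep in leptons if passes_lepton_cuts(lep)]
    if len(good) != 2:
        return False
    if good[0].charge * good[1].charge >= 0:
        return False
    return m_ll > 106.0 and met > 60.0 and n_btag == 1


def channel(leptons):
    """Label the dilepton category as 'ee', 'emu', or 'mumu'."""
    return "".join(sorted(lep.flavor for lep in leptons))
```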
In the ST mode, the lepton and top quark are produced directly from the annihilation of the incoming quarks, and are Lorentz-boosted and approximately back-to-back. The subleading lepton in the ST mode is primarily from the top quark decay chain. To use these specific features of the signal events, the four-momentum vectors of the top quarks are reconstructed from the decay products: the subleading lepton, the neutrino, and the b jet candidate. The neutrino p_T can be inferred from the p⃗_T^miss. The longitudinal momentum of the neutrino is inferred assuming energy-momentum conservation at the W boson decay vertex and constraining the W boson mass to 80.4 GeV, as discussed in Ref. [40].
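The W boson mass constraint leads to a quadratic equation for the neutrino longitudinal momentum. The sketch below solves it for a massless lepton and neutrino. Taking the real part when the discriminant is negative, and preferring the solution with the smaller magnitude, are common conventions assumed here; they are not necessarily the prescription of Ref. [40].

```python
import math

W_MASS = 80.4  # GeV, the value used in the text


def neutrino_pz_solutions(lep_px, lep_py, lep_pz, nu_px, nu_py, m_w=W_MASS):
    """Solve (p_lepton + p_neutrino)^2 = m_W^2 for the neutrino pz.

    The lepton and neutrino are treated as massless. Returns two real
    solutions, or the single real part if the discriminant is negative.
    """
    lep_pt2 = lep_px**2 + lep_py**2
    lep_e2 = lep_pt2 + lep_pz**2
    nu_pt2 = nu_px**2 + nu_py**2
    mu = 0.5 * m_w**2 + lep_px * nu_px + lep_py * nu_py
    a = mu * lep_pz / lep_pt2
    disc = a**2 - (lep_e2 * nu_pt2 - mu**2) / lep_pt2
    if disc < 0.0:
        return [a]
    root = math.sqrt(disc)
    return [a - root, a + root]


def choose_neutrino_pz(solutions):
    """Assumed convention: keep the solution with the smaller |pz|."""
    return min(solutions, key=abs)


def lepton_neutrino_mass(lep, nu):
    """Invariant mass of a massless lepton and neutrino given as (px, py, pz)."""
    e_lep = math.sqrt(sum(c**2 for c in lep))
    e_nu = math.sqrt(sum(c**2 for c in nu))
    p_sum2 = sum((l + n)**2 for l, n in zip(lep, nu))
    return math.sqrt((e_lep + e_nu)**2 - p_sum2)
```

With the neutrino four-momentum fixed in this way, the top quark candidate is the sum of the subleading lepton, the neutrino, and the b jet four-momenta.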
A boosted decision tree (BDT) [41,42] is employed to distinguish signal from the sum of the background processes. Independent signal and background samples are used for training the BDT. A merged sample of tt̄, tW, and DY+jets background events, weighted by their cross sections, is used as the background training and testing sample. For the signal, events from various BNV flavor combinations of the ST mode are merged with equal weights and are used in the BDT training and testing. Ten variables are inputs to the BDT: the transverse momenta of the leading lepton (ℓ_1), subleading lepton (ℓ_2), and the top quark candidate (t); the distances between the leading and subleading leptons [∆R(ℓ_1, ℓ_2) = √((η_ℓ1 − η_ℓ2)² + (ϕ_ℓ1 − ϕ_ℓ2)²)] and ∆R(ℓ_1, t); the azimuthal angles between the leading and subleading leptons [∆ϕ(ℓ_1, ℓ_2) = |ϕ_ℓ1 − ϕ_ℓ2|] and ∆ϕ(ℓ_1, t); the invariant mass and p_T of the dilepton system; and |p […].

The templates describing the BDT distributions for the signal and background events are taken from simulation. The normalization of the DY+jets background, which is important in the e⁺e⁻ and µ⁺µ⁻ channels, is determined by applying a scale factor to the simulation derived from data in a control region where the reconstructed dilepton mass is close to the Z boson mass [43].
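The angular BDT inputs can be computed as in the following sketch. Wrapping ∆ϕ into the range [0, π] is a standard convention assumed here, since the text quotes only the absolute difference; the function names are illustrative.

```python
import math


def delta_phi(phi1, phi2):
    """Absolute azimuthal separation, wrapped into [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the eta-phi plane: sqrt(delta_eta^2 + delta_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


# Two objects close to phi = +pi and phi = -pi are nearly collinear in azimuth.
print(delta_phi(3.0, -3.0))
print(delta_r(0.0, 0.0, 0.3, 0.4))
```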
The list of uncertainties considered and the techniques used to estimate their values are very similar to those in Ref. [44]. We consider uncertainties in the integrated luminosity [45][46][47], pileup effects [48], trigger, lepton identification [34,49], and b tagging [38,50] efficiencies, in the calculation of p_T^miss, and those related to the jet energy scale and resolution [51]. Uncertainties arising from choices in signal and tt̄ modeling include PDFs, renormalization and factorization scales, and initial- and final-state QCD radiation. The uncertainty arising from the modeling of the top quark p_T spectrum is evaluated by varying the renormalization and factorization scales in the NNLO QCD calculation, which includes NLO electroweak corrections [24]. Modeling uncertainties from the matching of the matrix element level calculation to the parton shower simulation, the modeling of the underlying event defined in PYTHIA tunes, and the models of color reconnection are considered for the SM tt̄ process, as described in Ref. [52]. The modeling uncertainties apply only to the signal acceptances. Normalization uncertainties of 5, 10, and 30% are considered for the tt̄, tW, and other processes, respectively, based on experimental measurements [52,53]. An additional 20% normalization uncertainty is added for DY+jets processes to account for PU mismodeling in large p_T^miss events.

Figure 2 shows the BDT discriminant distributions for events in the three channels (e⁺e⁻, e±µ∓, and µ⁺µ⁻) passing the event selection for the three data-taking years (2016-2018) combined. To illustrate signal distributions, simulated "teud" and "tµud" samples are included in the figure, assuming Λ = 1 TeV and C_t = C_s = 1. Signal events are well separated from background events. To extract the signal contribution, a simultaneous binned maximum-likelihood fit of the BDT output distributions is performed in the signal region for the three years and three channels, with the systematic uncertainties described above treated as nuisance parameters.
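The principle of a binned maximum-likelihood fit can be shown with a toy example. The sketch below fits a single signal-strength parameter µ to binned data with Poisson statistics, using fixed signal and background templates. It is a strong simplification: there are no nuisance parameters, only one channel, and a simple ternary search replaces a full minimizer. All names and numbers are hypothetical.

```python
import math


def nll(mu, data, signal, background):
    """Poisson negative log-likelihood, dropping terms that do not depend on mu."""
    total = 0.0
    for n, s, b in zip(data, signal, background):
        expected = mu * s + b  # requires b > 0 in every bin
        total += expected - n * math.log(expected)
    return total


def fit_signal_strength(data, signal, background, mu_max=10.0, iterations=200):
    """Minimize the NLL over 0 <= mu <= mu_max with a ternary search.

    The NLL is convex in mu, so the search converges to the global minimum.
    """
    lo, hi = 0.0, mu_max
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if nll(m1, data, signal, background) < nll(m2, data, signal, background):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


# Hypothetical three-bin templates, with data built as background + 2 x signal.
background = [100.0, 50.0, 10.0]
signal = [1.0, 5.0, 8.0]
data = [102.0, 60.0, 26.0]
mu_hat = fit_signal_strength(data, signal, background)
print(mu_hat)
```

In the analysis itself, the likelihood is the product over all years and channels, and each systematic uncertainty enters as a constrained nuisance parameter that modifies the templates.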
The best fit for the BNV effective couplings is consistent with zero, and no significant excess over the background expectations is observed. The sources of systematic uncertainty with the largest impact on the estimated signal contribution depend on the fermion flavor combination of the BNV interactions. The three main sources of uncertainty that are common among the BNV interactions are the normalization of the SM tW process, the muon energy scale, and the modeling of the top quark p_T spectrum in the SM tt̄ simulation. The exclusion limits are calculated using the asymptotic approximation of the CL_s method [54]. The adequacy of the asymptotic approximation has been validated with pseudo-experiments. The limit-setting procedure is performed for each individual BNV coupling while setting the other BNV couplings to zero. The observed and expected limits at 95% confidence level (CL) on the BNV effective coupling strengths are listed in Table 2. The limits on the strengths of the BNV couplings are translated to limits on the branching fractions for the BNV top quark decays. The differences among the limits for different quark flavor combinations stem mainly from the different PDFs involved in the production mode. The results for limits on various BNV branching fractions are displayed in Fig. 3.
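The CL_s criterion can be illustrated with a single-bin counting experiment. This is a toy example and not the procedure used in the analysis, which applies the asymptotic approximation to a profile likelihood with nuisance parameters. Here CL_s is the ratio of the Poisson probabilities to observe at most n_obs events under the signal-plus-background and background-only hypotheses, and the 95% CL upper limit is the signal yield at which CL_s falls to 0.05. All names are hypothetical.

```python
import math


def poisson_cdf(n, mean):
    """P(N <= n) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n + 1):
        term *= mean / k
        total += term
    return total


def cls_value(n_obs, s, b):
    """CL_s = CL_(s+b) / CL_b for a single-bin counting experiment."""
    return poisson_cdf(n_obs, s + b) / poisson_cdf(n_obs, b)


def upper_limit(n_obs, b, cl=0.95, s_max=100.0, iterations=100):
    """Signal yield excluded at the given CL, found by bisection.

    CL_s decreases monotonically with s, so bisection on [0, s_max] suffices.
    """
    lo, hi = 0.0, s_max
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if cls_value(n_obs, mid, b) > 1.0 - cl:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# With zero observed events the limit is -ln(0.05), about 3 events,
# independent of the expected background.
print(upper_limit(0, 0.0), upper_limit(0, 3.0))
```

A limit on the signal yield obtained in this way maps onto a limit on the coupling through the quadratic dependence of the cross section on C/Λ².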
Tabulated results are provided in the HEPData record for this analysis [55].

In summary, a search for baryon number violation (BNV) in events with top quarks is performed using the LHC proton-proton collision data at a center-of-mass energy of 13 TeV. The analysis explores baryon number violating effects in single top quark production for the first time. Data were collected by the CMS experiment in 2016-2018 and correspond to an integrated luminosity of 138 fb⁻¹. Events with a lepton pair and exactly one b-tagged jet are selected. A boosted decision tree (BDT) is used to separate signal events from background events. A binned maximum likelihood fit to the BDT output distribution is performed to search for the BNV processes. Considering BNV vertices in the production of top quarks dramatically increases the sensitivity of this search. No significant excess of events over the background prediction is observed. Upper limits are placed on the strengths of the BNV couplings, which are multiple orders of magnitude more stringent than the previous limits [11].

Figure 1: Representative Feynman diagrams for single top quark production (left) and top quark decays (right) via BNV interactions. The red circles mark the BNV vertices.

Figure 2: The BDT output distributions for data (points) and backgrounds (histograms) for the e⁺e⁻ (upper left), µ⁺µ⁻ (upper right), and e±µ∓ (lower) channels, including the ratio of data to the predicted total background yield. The hatched bands indicate the total uncertainty (statistical and systematic added in quadrature) for the SM background predictions. The predicted yields of the backgrounds and the uncertainty bands are shown after the simultaneous fits for the signal-plus-background hypothesis. Examples of the predicted signal contribution for the BNV interactions via teud (solid gray line) and tµud (dashed black line) vertices are shown.

Figure 3: The observed upper limits on the branching fractions of the top quark BNV decays are shown with circles and triangles for electron and muon couplings, respectively. The observed limits corresponding to the C_t and C_s coefficients are shown with filled and open markers, respectively. The light yellow (dark green) bands indicate the ranges within plus or minus one (two) standard deviations of the expected limits.