Search for pair production of vectorlike quarks in the fully hadronic final state

The results of two searches for pair production of vectorlike T or B quarks in fully hadronic final states are presented, using data from the CMS experiment at a center-of-mass energy of 13 TeV. The data were collected at the LHC during 2016 and correspond to an integrated luminosity of 35 . 9 fb − 1 . A cut-based analysis specifically targets the bW decay mode of the T quark and allows for the reconstruction of the T quark candidates. In a second analysis, a multiclassification algorithm, the “ boosted event shape tagger, ” is deployed to label candidate jets as originating from top quarks, and W , Z , and H . Candidate events are categorized according to the multiplicities of identified jets, and the scalar sum of all observed jet momenta is used to discriminate signal events from the quantum chromodynamics multijet background. Both analyses probe all possible branching fraction combinations of the T and B quarks and set limits at 95% confidence level on their masses, ranging from 740 to 1370 GeV. These results represent a significant improvement relative to existing searches in the fully hadronic final state.


Introduction
With the discovery of a light Higgs boson (H) by the ATLAS and CMS Collaborations in 2012 [1][2][3], the standard model (SM) is complete as a low-energy effective theory describing all known fundamental particles and their interactions.However, several questions still remain with the theory, for example, why the mass of the observed Higgs boson is 125 GeV, whereas quantum loop corrections would be expected to drive the mass up towards the Planck scale.Many models of new physics beyond the SM predict additional particles that can affect the quantum corrections to the Higgs boson mass and resolve this so-called "hierarchy problem".New states proposed include new particles such as supersymmetric partners of SM particles, or fourth-generation quarks.
Chiral fourth-generation quarks, t or b , with identical properties to the SM third-generation t and b quarks, but with larger masses, are effectively excluded because of their impact on the Higgs boson production cross section.However, many models of new physics, such as those predicting a composite Higgs boson [4][5][6][7], or "little-Higgs" models [8,9], include fourthgeneration particles of a new type, called vector-like quarks (VLQs), labeled T and B, having electric charges of +2e/3 and −1e/3, respectively.These VLQs do not obtain their mass via the Higgs boson Yukawa coupling, and will not affect the values of the Higgs boson production cross section or decay width.Therefore, these are viable search candidates for the LHC experiments, and are predicted to have masses at the TeV scale [10], allowing the hierarchy problem to be resolved.
The VLQs are called "vector-like" because their left-handed and right-handed chiralities transform under the same SU(2) ⊗ U(1) symmetry group of the SM electroweak gauge bosons.This leads to several decay modes of the VLQs, through charged-and neutral-current interactions.Although decays to light first-and second-generation quarks are possible, the dominant decay modes of the VLQs are to third-generation SM quarks [11].The possible decay modes of the VLQs to the third-generation quarks are as follows (charge-conjugate modes implied): T → bW, B → tW, T → tZ, B → bZ, T → tH, B → bH.
(1) Specific model assumptions can influence the proportions of these VLQ decay modes.Both single and pair production of VLQs are possible, with single production dominating at larger VLQ masses (≈2 TeV), while single and pair production rates are comparable for VLQ masses ≈1 TeV.This analysis considers only the pair production of VLQs.
Both the ATLAS and CMS Collaborations have recently presented searches for pair production of VLQs.The CMS Collaboration has searched for T and B quarks in the dilepton final state, targeting VLQ decays to Z bosons [12], and excluding T (B) quark masses up to 1280 (1130) GeV.An analysis from CMS including single-lepton, dilepton, and multilepton final states [13] probes all decay modes of the VLQs, and excludes T quark masses in the range 1140-1300 GeV and B quark masses in 910-1240 GeV, depending on the combination of the VLQ branching fractions.Finally, a CMS result optimized for the bWbW channel, using singlelepton final states, excludes T quark masses up to 1295 GeV [14].The ATLAS Collaboration has recently presented a search for VLQ pair production in the fully hadronic channel, with sensitivity to all possible decay modes of the VLQs [15].This analysis most strongly excludes T and B quarks when they decay to Higgs bosons, with mass exclusion limits of 1010 GeV.The AT-LAS Collaboration has also performed a combination of searches utilizing various final states, resulting in mass exclusion limits of up to 1370 GeV [16].
In this paper, we describe two independent analyses targeting pair production of vector-like quarks in fully hadronic final states.We first present an analysis that employs a traditional strategy, utilizing W boson tagging and b quark tagging algorithms.This analysis specifically targets the bW decay mode of the T quark, but is used to evaluate sensitivity to all possible decays of the T or B quark, and is referred to as the "cut-based analysis".The second analysis uses a novel machine learning technique to identify and classify different varieties of Lorentzboosted particles that originate from VLQ decays.This strategy allows the analysis to target all the decay modes of the T or B quark.We refer to this analysis as the "NN (neural network) analysis".
The cut-based analysis uses dedicated algorithms to identify efficiently jets consistent with W bosons and the hadronization of b quarks.These algorithms allow the reconstruction of each VLQ T quark present in the event, providing a mechanism to reduce further the contribution of background processes.At least four jets are required to be present, and events are classified according to the number of jets that are identified as being consistent with a W boson, to obtain signal regions of varying signal purities.The H T distribution, defined as the scalar sum of jet transverse momenta (p T ), is used for signal discrimination in each category.The NN analysis uses a neural network algorithm with a multiple-class output to identify jets as consistent with one of six distinct decay topologies from highly boosted particles: top quark, W boson, Z boson, Higgs boson, b quark, and light u/d/s/c quark or gluon (denoted "light jets").Events with exactly four jets are considered for the analysis, which is the expected final state for fully hadronic decays of VLQ pairs, as seen in Eq. 1.The multiplicities of jets falling into each of the six categories are used to define 126 independent signal regions, in which the value of H T is used to discriminate signal from the expected background processes.
The main background contribution in these fully hadronic final states comprises multijet events from quantum chromodynamics (QCD) processes.Techniques based on control samples in data are used to predict the expected QCD multijet background yield and H T shape.In the cut-based analysis, control regions are used to measure QCD multijet background yields and shapes, which are then extrapolated to the signal regions.In the NN analysis, misidentification rates for each of the six categories of jets considered in the multiclassification algorithm are used to predict the level of contribution of multijet events in the signal regions.Each method is validated using samples of observed and simulated events.
The paper has the following structure.Section 2 provides a description of the CMS detector and trigger system.The event reconstruction, including jet reconstruction, jet substructure, and the multiclassification algorithm used in the NN analysis, is described in Section 3. The data sets and simulated samples used are presented in Section 4. Information about the definition of the signal and control regions is included in Section 5.The methods employed to predict the QCD multijet background from data for each analysis are explained in Section 6, and details of the systematic uncertainties affecting the analyses are itemized in Section 7. Signal region yields and distributions are given in Section 8, and the statistical analysis used to extract the results is described in Section 9. Finally, the results of the two analyses are presented in Section 10, and a summary is given in the last section.

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections.Forward calorimeters extend the pseudorapidity (η) coverage provided by the barrel and endcap detectors.Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid.
Events of interest are selected using a two-tiered trigger system [17].The first level, composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100 kHz within a time interval of less than 4 µs.The second level, known as the high-level trigger, consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to around 1 kHz before data storage.
A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [18].

Event reconstruction
To reconstruct and identify each individual particle in an event, a "particle-flow algorithm" [19] that uses an optimized combination of information from the various elements of the CMS detector is employed.The energy of photons is obtained from the ECAL measurement.The energy of electrons is determined from a combination of the electron momentum at the primary interaction vertex as determined by the tracker, the energy of the corresponding ECAL cluster, and the energy sum of all bremsstrahlung photons spatially compatible with originating from the electron track.The energy of muons is obtained from the curvature of the corresponding track.The energy of charged hadrons is determined from a combination of their momentum measured in the tracker and the matching ECAL and HCAL energy deposits, corrected for zero-suppression effects and for the response function of the calorimeters to hadronic showers.Finally, the energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energy.
The reconstructed vertex with the largest value of summed physics-object p 2 T is taken to be the primary proton-proton (pp) interaction vertex.Here the physics objects are the jets, clustered using the jet finding algorithm [20,21] with the tracks assigned to the vertex as inputs, and the associated missing transverse momentum, taken as the negative vector p T sum of those jets.The output of the particle-flow algorithm provides a list of particles that are used as inputs to the jet finding algorithm.Charged hadrons that are not associated with the primary interaction vertex are removed before jet finding to mitigate the effects of additional pp ("pileup") interactions occurring in the same or neighboring bunch crossings as the interaction of interest.The anti-k T clustering algorithm [20] is used, as implemented in the FASTJET software package [21], to produce two collections of jets, the first obtained with a distance parameter of R = 0.4 (AK4 jets), and the second obtained with R = 0.8 (AK8 jets), where R is the radius of the jet in the η, φ plane (where φ is the azimuthal angle).The AK8 jets are used to identify the hadronic decays of massive SM particles, including top quarks, and W, Z, and Higgs bosons, while the AK4 jets are used to identify other hadronic activity in the event.The cut-based analysis uses both AK4 and AK8 jets, while the NN analysis only uses the AK8 jets for analysis.
The jet momentum is determined as the vectorial sum of all particle momenta in the jet, and is found from simulation to be within 5 to 10% of the true momentum over the whole p T spectrum and detector acceptance.Pileup interactions can contribute additional tracks and calorimetric energy deposits, increasing the apparent jet momentum.To mitigate this effect, tracks identified to be originating from pileup vertices are discarded and an offset correction is applied to correct for remaining contributions.Jet energy corrections are derived from simulation studies so that the average measured response of jets becomes identical to that of particle-level jets.In situ measurements of the momentum balance in dijet, photon+jet, Z+jet, and multijet events are used to determine any residual differences between the observed and simulated jet energy scale, and to derive appropriate corrections [22].Additional selection criteria are applied to each jet to remove jets potentially dominated by instrumental effects or reconstruction failures.The jet energy resolution is approximately 15% at 10 GeV, 8% at 100 GeV, and 4% at 1 TeV.

Jet substructure
To identify the hadronic decays of highly Lorentz-boosted objects, including top quarks, and W, Z, and H, jet substructure information provides powerful discrimination from massive jets originating from QCD multijet production.
The mass of the jet itself can discriminate QCD jets from boosted heavy objects.A grooming algorithm is applied to jet constituents to better estimate the mass of the originating particle of the jet.In the algorithm used, the constituents of the AK8 jets are reclustered using the Cambridge-Aachen algorithm [23,24].The "modified mass drop tagger" algorithm [25], also known as the "soft-drop" (SD) algorithm, with angular exponent β = 0, soft cutoff threshold z cut < 0.1, and characteristic radius R 0 = 0.8 [26], is applied to remove soft, wide-angle radiation from the jet.The SD mass (m SD ) is used to determine the consistency of a jet with a given boosted heavy object.
In addition to the m SD , information about the distribution of particles within the jet can be used for further discrimination.A quantity called "N-subjettiness" [27,28] is used to determine the consistency of a jet with N or fewer subjets.The N-subjettiness values τ N are defined as where the index i refers to each jet constituent, and ∆R ≡ √ (∆η) 2 + (∆φ) 2 is the angular distance between a jet constituent and a candidate subjet axis.The quantity d 0 is a normalization constant.To identify boosted top quarks, the quantity τ 32 ≡ τ 3 /τ 2 is used to target the expected three-subjet signature, while for W, Z, and Higgs bosons, the quantity τ 21 is used because of the expected two-subjet decay topology.
Jets originating from bottom quarks, which hadronize and subsequently decay, are selected with an algorithm to identify and reconstruct displaced vertices, along with their associated tracking information.Known as the combined secondary vertex algorithm (CSVv2) [29], it provides several working points of varying efficiencies and misidentification rates.In the cutbased analysis, the CSVv2 algorithm is applied to AK4 jets using a working point corresponding to a misidentification probability in simulated tt events of 0.01 for u/d/s/g jets and an efficiency for identifying genuine b jets of approximately 0.63.In the NN analysis, the CSVv2 algorithm is applied to the subjets of the AK8 jets to increase the categorization efficiency for decays of top quarks, Z and Higgs bosons, which can have one or more displaced vertices within the jet.A CSVv2 working point is not explicitly used in the NN analysis, however the output value of the CSVv2 discriminator for each subjet is used as an input to the multiclassification algorithm to categorize jets.
In the cut-based analysis, a working point for identifying merged decay products of a highlyboosted W boson in a single jet (W tagging) is chosen.To be considered for W tagging, an AK8 jet must have p T > 200 GeV.The jet must satisfy 65 < m SD < 105 GeV and τ 21 < 0.55 to be W tagged.This working point corresponds to an efficiency of about 0.50 to identify genuine W jets and a misidentification probability of about 0.03 [30].Because of an observed dependence of m SD on the W jet momentum, an additional correction is applied to ensure the W tagged jet m SD peak is stable and the W tagging efficiency remains roughly constant as a function of jet momentum.

Boosted event shape tagger (BEST) algorithm
The NN analysis does not focus on a single VLQ decay mode and thus the expected signatures can contain various combinations of top and bottom quarks along with W, Z, and H. Using standard cut-based working points for each type of particle leads to complications with overlaps in selection criteria when considering many different final states simultaneously.For this reason, a new algorithm is used that simultaneously attempts to identify six categories of jets: t, W, Z, H, b, and light jets.The algorithm is called the boosted event shape tagger (BEST) algorithm, as first detailed in Ref. [31], and uses hypothesized reference frames to determine the consistency of a jet with the expected topology from top quark, W, Z, H decays, b quark and light jets.The algorithm uses a neural network to classify jets according to one of those six possibilities.The NN analysis presented here is the first CMS result to use the BEST algorithm.
The BEST algorithm relies on the fact that jets from very high energy ("highly boosted") heavyparticle decays will have a distinct topology in the rest frame of the decaying object.For example, the decay of a highly boosted t quark produces three collimated particles in the laboratory frame, but in the rest frame of the t quark, the three distinct jet directions lie in a plane.By Lorentz-boosting the particles or constituents in a jet back to the rest frame, it can be seen whether the distribution of particles is consistent with that expected from a top quark decay.This boost transformation is applied four different times to obtain four sets of jet constituents.The boost transformation is performed assuming the jet originates from a top quark, W, Z, or H, after forming the boost vector by using the jet four-vector with the mass altered to be that of the particle under consideration, while keeping the jet momentum constant.
The sets of jet constituents resulting from each boost transformation are used to compute kinematic quantities, including Fox-Wolfram moments [32], aplanarity, sphericity, and isotropy, based on the eigenvalues of the sphericity tensor [33], and the jet thrust [34].In each boosted reference frame, jet constituents are reclustered to obtain a set of objects relative to the transformed jet axis.These objects are used to compute the longitudinal asymmetry, defined as the ratio of the longitudinal-component sum of the momenta to the p T sum of this set of objects.This ratio gives another way to compute the isotropy of constituents that is expected for a jet consistent with one of the hypothesized particles.Additionally, the jet m SD , jet η, charge, τ 32 , τ 21 , and subjet CSVv2 scores from the original jet reference frame are used.In total, 59 kinematic quantities from the original and transformed sets of constituents are used as inputs to a deep neural network to discriminate between the different jet species.These kinematic quantities are validated by examining distributions in data and simulated events, where good agreement in shape is observed.
The BEST neural network is trained using samples of simulated AK8 jets that originate from the decay of heavy resonances and that correspond to the final state objects (t, W, Z, H, b, or light jets).The jets in the training sample are matched to the object of interest using the generatorlevel information.Samples with heavy resonance masses from 1 to 4 TeV are used to populate the jet p T range from 0.4 to ≈2 TeV.The neural network is trained using the Python-based SCIKIT-LEARN package, using the MLPCLASSIFIER module [35].The network architecture consists of 3 hidden layers with 40 nodes in each layer using a rectified-linear activation function.
There are six output nodes, corresponding to the six particle species of interest.A sample of 500 000 jets is used to train the network, split evenly between the six training samples.The six outputs from the network represent probabilities for the jet to originate from the corresponding particle.The classification of an AK8 jet is chosen according to the output node with the highest probability.Several validation studies have been performed in different samples of data events enriched in different types of processes: a muon+jets sample containing boosted top quarks and boosted W bosons, a sample containing events from QCD processes enriched in gluon-initiated jets, and a sample of photon+jets events enriched in quark-initiated jets.In each of these samples, we find good agreement in the shape and rate of the BEST neural network inputs, as well as the output probabilities [36].

Data set and simulated samples
Both the cut-based and NN analyses use the data set collected by the CMS experiment at the CERN LHC in 2016, corresponding to an integrated luminosity of pp collisions of 35.9 fb −1 .Events in the cut-based analysis are selected online using a trigger algorithm requiring an H T value of at least 800 GeV, or 700 GeV if a jet with mass above 50 GeV is present.Events are also selected by another two triggers, which require a single jet with either p T ≥ 450 or 360 GeV with a mass above 30 GeV.The above trigger selection is measured to be fully efficient for the signal regions, with corrections applied for percent-level inefficiencies in control regions.Events in the NN analysis are selected online using the above trigger algorithms in combination with several other algorithms requiring multijet topologies.The trigger requirements for the NN analysis are fully efficient in the signal and control regions, because of the higher jet momenta considered.
Methods utilizing data are employed to estimate the dominant background from QCD multijet production, however, samples of simulated events are used to validate the background estimation techniques described in Section 6.These samples of QCD multijet events are generated at leading order with PYTHIA [37,38].
Simulated events are used to model the subdominant background contributions.The largest of these in both analyses is because of the SM pair production of top quarks, generated at next-to-leading order (NLO) with POWHEG v2 [39,40] and showered with PYTHIA 8.212, using the event tune CUETP8M2T4 [41].The production of a W or Z boson in association with additional jets, where the W/Z boson decays to quarks, is generated at leading-order (LO) with MADGRAPH5 aMC@NLO 2.2.2 [42,43].Diboson events (WW, WZ, ZZ) are generated at LO with PYTHIA, and rare top quark production processes (ttW, ttZ, tttt) are generated at NLO with MADGRAPH5 aMC@NLO and showered with PYTHIA.Background contributions from Higgs boson production in the dominant gluon fusion mode with decays to bb and W + W − are included via events generated with MADGRAPH5 aMC@NLO plus PYTHIA and POWHEG v2 + PYTHIA, respectively.Backgrounds other than tt using PYTHIA use the CUETP8M1 event tune [44].The cut-based analysis considers only the tt and W+jets background contributions.Other processes such as Z+jets were measured to contribute at only the 1% level to the total background expectation, and therefore were not further investigated.
Samples of vector-like T and B quark pair production events are generated at LO using MAD-GRAPH5 aMC@NLO [45] + PYTHIA, with T and B quark masses ranging from 0.7 to 1.8 TeV in increments of 100 GeV.They are inclusive with respect to the VLQ decay mode, and are generated with equal branching fractions for T/B quark decays to each of the three modes (tH/bH, tZ/bZ, bW/tW).Events are weighted to produce results for different combinations of branching fractions, and are normalized to theoretical cross section expectations calculated at the next-to-next-to-leading order (NNLO), including next-to-leading-logarithmic order softgluon resummation, with TOP++2.0 [46], as listed in Table 1.

Event selection
In this section, the event selection and reconstruction techniques applied to the two analyses are described.

Cut-based analysis
The cut-based analysis, optimized for each T decaying to a b quark and W boson, requires at least two AK8 jets with p T > 200 GeV and |η| < 2.4.The AK8 jets serve as boosted W boson candidates, and are evaluated with the W boson tagging algorithm described above.In addition, the analysis requires at least two AK4 jets with p T > 30 GeV and |η| < 2.4, serving as b jet candidates.At least two of the selected AK4 jets must be distinct from the AK8 jets, requiring an angular separation of ∆R > 0.8.The analysis requires the scalar sum of AK4 jet energies, H AK4 T , to be larger than 1200 GeV.For a signal mass of 1200 GeV, this selection is 95% efficient.With the two AK4 and two AK8 jets, there are two possible combinations of a W jet and b jet candidate that can be formed.As signal events are expected to produce two particles with equal mass, we can form the variable where m T1 is the mass of the higher-p T T quark candidate and m T2 the mass of the lower-p T T quark candidate, of two T candidates each formed from one AK8 jet and one AK4 jet.The assignment of AK4 and AK8 jets to T quark candidates is chosen to minimize the value of ∆m.
Events are required to have ∆m < 0.1.
Events passing the H AK4 T and ∆m requirements are further divided into categories.Applying the W boson tagging and b quark tagging working points described above, events are divided into categories based on the multiplicity of W and b tags in the event.There are nine tagging combinations, with possibilities of 0, 1, or ≥2 W tags in combination with 0, 1, or ≥2 b tags.

Neural network analysis
In the NN analysis, each jet is classified according to one of the six categorizations from the BEST algorithm: t, b, W, Z, H, or light.The number of jets with each BEST classification label is used to divide events into exclusive categories of varying signal and background contributions, with categories containing larger numbers of t, b, W, Z, or H candidates being enriched in the VLQ signal, as it is expected to decay to multiple highly boosted massive objects.In each category, the distribution of H AK8 T , which is the scalar sum of the four selected AK8 jet energies, is used to discriminate signal from the background processes.
The signal regions are defined as follows: • exactly 4 AK8 jets, each with p T > 400 GeV and |η| < 2.4; • a unique set of (N t , N H , N W , N Z , N b , N j ), where any N i ≤ 4; The possible combinations of N i satisfying the above conditions give 126 independent signal region categories.In some categories, where there is a lack of simulated events to model the subdominant background processes, a single bin is used as a counting experiment instead of the full H AK8 T shape information.This occurs in 14 of the 126 total categories.No further selections are applied on the jet kinematic variables or the BEST algorithm output probabilities.

Background estimation methodology
After the requirements described above have been applied to select the expected signal events, both the cut-based and NN analyses remain dominated by background events from QCD multijet production processes.Since simulated QCD multijet events do not reliably model the observed data, because of missing higher-order contributions during event generation, both analyses incorporate a method to estimate the background contribution from QCD multijet production directly from observed data events.This section describes the methodology employed by each analysis.The non-QCD background contributions are taken from simulation.

Cut-based analysis
The cut-based analysis uses an "ABCD" matrix method based on observed distributions of two uncorrelated event quantities to predict the shape and rate for the expected QCD multijet background in the signal region.The two quantities used to define the control regions are H AK4 T and ∆m.The shape of the expected QCD multijet background is obtained by selecting data events passing the H AK4 T > 1200 GeV requirement, but failing the ∆m < 0.1 selection.This control region is labeled region B. The expected backgrounds from tt and W+jets events, as estimated from simulation, are subtracted from the observed distribution to obtain the expected contribution solely from QCD multijet events.After obtaining the shape, the rate can be estimated by defining another set of control regions, namely with H AK4 T < 1200 GeV.This sideband, with ∆m < 0.1, is labeled region A. The ratio of the number of events in A to B (again after subtracting the tt and W+jets component) results in an extrapolation scale factor.The control region with H AK4 T > 1200 GeV, ∆m > 0.1 is labeled D. The scale factor is then applied to the shape obtained from region D to describe the expected H AK4 T distribution of QCD multijet events in the signal region, labeled region C in this description.
The above procedure is only valid if the quantities H AK4 T and ∆m are uncorrelated.In simulated QCD multijet events, a small correlation (<5%) is observed, therefore a residual correction is derived from these events.Specifically, the ABCD procedure is performed in the simulated sample, and the resulting prediction is compared with the observed yield of simulated events in the signal region.A trend in the ratio of these two H AK4 T distributions is observed and fit using a linear function.This function is used to scale the resulting H AK4 T distribution in data.Three functions are derived, for simulated events with 0, 1, or 2 W-tagged jets.The procedure is validated by applying it in observed events with exactly 0 W-tagged jets, and agreement is found within 2.5%.

The NN analysis
The NN analysis uses a method based on the classification fractions of the BEST algorithm to estimate the shape and rate of the QCD multijet background using data.In the inclusive sample of events with exactly three AK8 jets, independent from the four jet sample in which signal is extracted, the classification fraction X for a given category X of the BEST algorithm is computed according to the following definition: where N X represents the number of jets in BEST category X, and N represents the total number of jets.The classification fraction is measured as a function of jet p T using data events.There is negligible signal contamination in this region, which is dominated by QCD multijet events.The fractions for each BEST category are shown in Fig. 1.These fractions are used to estimate the yield of events having any arbitrary combination of BEST labels.To obtain the QCD multijet yield as well as the H AK8 T shape, the inclusive sample of events with exactly four AK8 jets is used in data, however, the BEST labels are not utilized.For each of the 126 signal regions in the NN analysis, every event is evaluated as a candidate for the given signal region.There may be multiple permutations of which of the four AK8 jets are labeled according to the expected multiplicity in the signal region, so the following process is repeated for each permutation.Each of the four jets is weighted according to the classification fraction as measured above, as a function of its p T .The four jet weights are then multiplied to obtain the final event weight.An H AK8 T distribution is generated based on the original H AK8 T of the event, with the normalization scaled by the event-level weight.After repeating this process for all possible permutations of the BEST labels, and iterating over all events, the final H AK8 T distribution for the expected QCD multijet contribution is obtained: where r represents the expected QCD multijet shape distribution and yield, and the index i corresponds to one of the four jets in the event.
The procedure above is validated in a sample of simulated QCD multijet events, and agreement is obtained between predicted and observed events in both the yield and shape of the H AK8 T distribution, within the uncertainties propagated from the measurement of the classification fractions, for all of the 126 signal regions considered.

Systematic uncertainties
Several sources of systematic uncertainty are evaluated and included in the final analysis results.Table 2 summarizes the different contributions, and the analysis to which they contribute.They are described in detail below.
[GeV]  distribution in each analysis.Systematic sources with an uncertainty of "±1σ" affect the shape and rate, all others affect the rate only.Sources of systematic error that affect "all simulation" impact both the signal simulation and simulated backgrounds.• Process cross sections: Uncertainties in the cross sections used to normalize simulated background processes are included.For the W+jets and Z+jets backgrounds, uncertainties of 15% are applied [47,48].For the subdominant diboson, rare top quark process (ttV, tttt), and Higgs boson contributions, a conservative uncertainty in the cross section value of 50% is applied.For tt backgrounds, the uncertainty in the cross section value is included through the scale uncertainties described below, which cover both shape and normalization effects.
• Integrated luminosity: The uncertainty in the measurement of the integrated luminosity recorded during the 2016 data-taking period by CMS is 2.5%, and is applied to all simulated signal and background samples [49].
• Pileup reweighting: All simulated samples used in the analysis are reweighted to ensure the distribution of the number of pileup interactions per event matches the corresponding observed distribution for the 2016 run.This pileup distribution is obtained using a proton-proton inelastic cross section value of 69.2 mb [50,51].A systematic uncertainty in the distribution is obtained by varying the value by ±4.6%, resulting in an uncertainty with both a normalization and shape component.
• Jet energy scale and resolution: Uncertainties in the corrections applied to jets are propagated to the final discriminating distributions by reconstructing events with the jet-level corrections shifted within their corresponding uncertainties, which depend on the jet p T and η [22].
• Parton distribution functions: For the tt and VLQ signal simulated samples, we use PDFs from the NNPDF3.0set [52], and evaluate the systematic uncertainty due to the choice of PDF according to the process described in Ref. [53].For the signal samples, changes in the shape and normalization are considered in the NN-based analysis.In the cut-based analysis, we find the shape component to be negligible, and consider only a normalization uncertainty.
• Scale uncertainties: For the tt and VLQ signal simulated samples, we vary the renor- malization and factorization scales up and down independently by factors of 2 to account for uncertainties in the choice of scales used to generate the simulated sample.For the tt samples, the effect associated with this scale variation is sufficiently large to cover the uncertainty in the cross section as well.For the signal samples, changes in the shape and normalization are considered in the NN-based analysis.In the cut-based analysis, we find the shape component to be negligible, and consider only a normalization uncertainty.
• The CSVv2 discriminant reshaping (NN-based): When using the shape of the CSV discriminant, as we do for inputs to the BEST algorithm, a reshaping event weight is applied based on the CSVv2 scores of the AK8 jets [29].We keep the nominal analysis result without the addition of these CSVv2 reshaping weights, but add an additional systematic uncertainty where the standard deviation (s.d.) value is the difference between applying the weights and not applying the weights.
• The BEST classification scale factors (NN-based): Uncertainties in the classification and misclassification scale factors are included through 11 independent nuisance parameters, one each for the classification and misclassification efficiencies for the 5 heavy objects (t, W, Z, H, b), and a final nuisance for the QCD categorization efficiency .Event weights are applied on a jet-by-jet basis in each event to produce shape variations in each of the signal regions.An uncertainty of 5% per BEST classification is used to compute event weights, and shape templates are formed for each category of the BEST algorithm, separately for correctly and incorrectly classified jets.This uncertainty is allowed to float during the signal extraction, to measure a value for each scale factor.
• The BEST classification fractions for data-driven method (NN-based): We propagate the uncertainty in the measurement of the classification fractions, due to limited event counts in data control regions, to the background estimate.The uncertainties from the six classification fractions X used are added in quadrature to obtain the total uncertainty for a given event of the expected QCD multijet background distribution r, as described in Section 6.2.
• Trigger uncertainty: We measure the trigger efficiency to be > 99% in our signal region.A 2% uncertainty is applied to cover small observed trigger inefficiencies for events with low H AK4 T values.The impact of the trigger inefficiency has been measured to be negligible for the NN analysis because the signal regions are higher in jet momenta and therefore further away from the trigger turn-on region.No additional systematic uncertainty is applied to simulated events in the NN analysis.
• W tagging scale factor uncertainty (cut-based): We apply scale factors to account for the difference in W jet tagging efficiency between simulation and data.The factor is applied as a weight to simulated events based on the number of W tags.The uncertainty in this factor is 14%, plus a small factor due to extrapolating the tagging efficiency to higher p T .The uncertainty for each tag is increased by 4.1% log p TW /200, where p TW is the transverse momentum of the tagged W jet.
• Soft drop jet mass scale and resolution (cut-based): To account for the uncertainty in the soft drop selection used in W tagging, the jet m SD is varied in simulation according to an uncertainty in the mass scale and the mass resolution.We consider only the impact on selection efficiency from this variation.The mass is varied by 0.94% to account for the scale, and the resolution on the mass is varied by 20%.These scale factor and mass uncertainties are derived in Ref. [30].
• b tagging scale factor uncertainty (cut-based): We apply scale factors to account for the difference in the b jet tagging efficiency between simulation and data [29].This factor, as well as its uncertainty, depends on the p T , η, and hadron flavor of the jet.This affects the shape of the H AK4 T distribution, and is applied by varying the scale factor of b and c jets simultaneously.Light-jet weights are varied separately, resulting in two separate systematic uncertainties.
• Extrapolation fit (cut-based): The function we use to correct the QCD multijet background prediction from data carries some statistical uncertainty from the fitting procedure.We assign a corresponding systematic uncertainty equal to the combined uncertainty on the fit parameters.We generate templates by shifting the fitted function by these uncertainties, and reevaluating the background in each bin.There is one fit per W tag category, and therefore two independent nuisance parameters.These are correlated across b tag categories with equal W tags.
• Normalization systematic (cut-based): Sideband regions with 0 b-tagged jets and 1 or 2 W-tagged jets are used to validate the cut-based analysis method.A small normalization discrepancy is observed after applying the QCD multijet background estimation technique.Two conservative, independent, log-normal nuisance parameters are therefore included for the QCD multijet background estimation, one applying to the 1W categories and one applying to the 2W categories, each with a value of 20%.We perform a maximum likelihood fit using only the 0 b tag sideband categories, and extract scale factors and associated uncertainties for these two parameters.The extracted scale factors are then applied to the signal regions as listed for each of the four signal region categories in the cut-based analysis.The upper row shows channels with 2 W tags, and 2 or 1 b tags, respectively.The lower row is for 1 W tag. The shaded error band represents the statistical uncertainty in the background.These distributions reflect the nuisance parameters evaluated after a likelihood fit to a background plus signal hypothesis, where the hypothesized signal is a T quark with a mass of 1200 GeV and 100% branching fraction to bW.The signal distributions show the expected yield of events assuming the cross section values in Table 1.The vertical axis labels denote that bin contents in these distributions have been scaled by their corresponding bin widths.The lower panel of each plot shows the ratio of the observed number of events in a bin to the expected number.in Table 2.

Signal discrimination
In this section, we present the distributions used to test for the presence of a signal.For the cut-based analysis, there are 4 independent categories: 1 W tag with either 1 or 2 b-tagged jets, and 2 W tags with either 1 or 2 b-tagged jets.In each category, the H AK4 T distribution is used for signal discrimination.Figure 2 shows the H AK4 T distributions for each of the 4 signal region categories.The amount of signal that falls into these categories depends on the hypothesized mass and decay fraction; for a bWbW decay, the acceptance ranges from 6.1 to 7.5%.
For the NN analysis, there are 126 independent signal region categories, corresponding to all the possible combinations of BEST label multiplicities for 4 AK8 jet events.Between 0.3% and 15% of signal events with a tZtZ decay pass the kinematic requirements to be placed into these Events per category   signal regions, depending on the VLQ mass. Figure 3 shows a visualization of the expected and observed yields in each of the 126 categories.For further signal discrimination, the analysis results use the H AK8 T distribution in each of the signal region categories.Figure 4 shows the H AK8 T distributions for combined categories including at least one W, Z, H, t, or b candidate, as well as the inclusive distribution summing all 126 signal regions.The individual distributions are not independent, as a particular category may satisfy the criteria for several distributions.

Statistical analysis
Independent statistical procedures are performed for the cut-based and NN analyses, using the same methodology.No explicit combination of the two analyses is presented here, as they are performed on many of the same events.The THETA software package [54] is used to perform a Bayesian shape-based analysis using the distributions from the signal region categories.Each bin of the distributions is combined statistically in a likelihood, where contributions from systematic and statistical uncertainties are added through nuisance parameters in the likelihood function.Each of the rate nuisance parameters is implemented with a log-normal prior distribution, while the shape-based nuisance parameters utilize Gaussian prior distributions.
In the cut-based analysis, all four signal regions are fit simultaneously.Most systematic uncertainties are fit simultaneously across the four categories, with the exception of the extrapolation fit and normalization uncertainties.For these parameters, there are independent uncertainties for the two W tag multiplicities.The ratio of events in the control regions to signal regions is fixed when calculating the multijet background component, and is not a parameter considered in the fit.
For the NN analysis, nuisance parameters for the BEST classification efficiency scale factors are allowed to fluctuate unconstrained, allowing a simultaneous measurement of scale factor val-       T for all events entering the 126 signal regions of the NN analysis (upper left), as well as for only categories containing at least one candidate of each of the particle types identified by the BEST algorithm: ≥1 W jet (upper right), ≥1 Z jet (middle left), ≥1 H jet (middle right), ≥1 t jet (lower left), and ≥1 b jet (lower right).The plots shown here are not mutually exclusive, as a particular signal region may satisfy several of the criteria for the individual summary categories.The vertical axis labels denote that bin contents in these distributions have been scaled by their corresponding bin widths.The lower panel of each plot shows the ratio of the observed number of events in a bin to the expected number.ues.A uniform prior distribution is assumed for the signal normalization.Additionally, due to the limited numbers of simulated events, we follow the "Barlow-Beeston lite" method [55] and assign an additional nuisance parameter to each bin of background components relying on simulated events.Prior to the statistical analysis, the discriminating distributions are rebinned to reduce the statistical uncertainty in the tails of the distributions to below 30%, as they can suffer from the effects of having limited simulated events passing all signal criteria.The likelihood function is used to extract Bayesian upper limits on the cross section for pair production of T or B quarks at 95% confidence level (CL).Additionally, samples of pseudodata are formed by sampling the expected backgrounds after varying the uncertainties within their prior distributions.For each pseudodata sample, the statistical analysis is performed to extract a range of upper limit outcomes.The median of these outcomes is the expected limit, and the range of outcomes within one or two standard deviations of the median is also shown for comparison.
We perform the statistical analyses several times while scanning over the possible branching fractions of the T/B quark.We use increments of 0.2 in the branching fractions of the T (B) quark to b W (t W), t Z (b Z), and t H (b H), to produce results for 21 different combinations of branching fractions.This allows the interpretation of the results in several different models that may alter the expected branching fractions significantly, enhancing or suppressing different decay modes.

Results
We observe no statistically significant excess over the expected background.The expected and observed limits on the cross section for pair production of T and B quarks are shown in the case of branching fractions of one for the individual decay modes in Figs. 5 and 6, for the cut-based and NN analyses, respectively.Because the cut-based analysis is optimized for the bW decay mode and includes selections targeting boosted W jets, it lacks sensitivity to the other decay modes.The NN analysis does not target a specific decay mode, but shows the best sensitivity to T quark decays to tZ and tH, or B quark decays to tW.It has lower sensitivity in the bWbW channel due to lower efficiency for correctly identifying b jets using AK8 reconstructed jets with the BEST algorithm.
A scan over all branching fractions considered is performed, with the results translated to limits on the VLQ mass. Figure 7 shows the results for the T quark graphically, with the values tabulated in Table 3. Figure 8 and Table 4 show the corresponding results for the B quark.
We exclude vector-like T quark masses ranging from 740 GeV, up to 1370 GeV for the tH decay mode in the NN analysis.The cut-based analysis provides additional sensitivity to the bW decay mode, with a T quark mass exclusion of 1040 GeV for T decays solely to bW.These results complement the existing results from other decay channels, and in the hadronic channel extends the excluded T quark mass from 705 GeV obtained in the previous 8 TeV analysis [56] to 1040 GeV.For vector-like B quarks, sensitivity is lost because of the additional b quarks present in the B decays, for which the BEST analysis has a larger misidentification rate.The cut-based analysis is not currently optimized for B quarks, however does provide some complementary sensitivity to the bZ decay mode.These analyses exclude vector-like B quarks with masses up to 1230 GeV, for B decays solely to tW.A mass exclusion of 1070 GeV is obtained in the cut-based analysis for the bZ decay mode scenario.

CMS
Figure 5: Limits at 95% confidence level on the ratio of the cross section to the theoretical cross section for pair production T quarks (left) and B quarks (right) in the cut-based analysis, with decays solely to tZ/bZ (upper), tH/bH (middle), and bW/tW (lower).The solid black line shows the observed limit, while the dashed black line shows the median of the distribution of limits expected under the background-only hypothesis.The inner (green) band and the outer (yellow) band indicate the regions containing 68 and 95%, respectively, of the distribution of limits expected under the background-only hypothesis.

CMS
Figure 6: Limits at 95% confidence level on the ratio of the cross section to the theoretical cross section for pair production T quarks (left) and B quarks (right) in the NN analysis, with decays solely to tZ/bZ (upper), tH/bH (middle), and bW/tW (lower).The solid black line shows the observed limit, while the dashed black line shows the median of the distribution of limits expected under the background-only hypothesis.The inner (green) band and the outer (yellow) band indicate the regions containing 68 and 95%, respectively, of the distribution of limits expected under the background-only hypothesis.

Summary
Two independent searches for vector-like T and B quarks using the fully hadronic final states have been presented.Both searches use data collected by the CMS experiment in 2016 at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 35.9 fb −1 .A cutbased analysis, using jet substructure observables to identify hadronic decays of boosted W bosons, targets the bW decay mode of the T quark, and improves sensitivity relative to results of such searches conducted previously.The analysis uses a quantum chromodynamics multijet background estimation method based on shape and rate extrapolations from various control regions to the signal region.Improvements in W tagging techniques, as well as the addition of signal regions requiring just a single W-tagged jet, enhance the performance of this analysis relative to previous searches based on different strategies.This search extends the T quark mass exclusion to 1040 GeV, relative to the previous exclusion of 705 GeV obtained by a similar analysis targeting the bW decay mode using data collected at 8 TeV [56].
A new strategy is presented and compared with the traditional cut-based approach.The neural network analysis uses a multiclassification technique, the boosted event shape tagger algorithm, to identify jets originating from heavy objects such as t or b quarks, and W, Z, or H.This allows the analysis to be sensitive to all decay modes of the T and B quarks.Using classification fractions, the dominant multijet background is estimated using data.The neural network analysis provides sensitivity for the tH and tZ decay modes competitive with that obtained by other searches utilizing lepton+jets or multilepton topologies.
For each analysis, results are presented in terms of cross section limits for the pair production of T and B quarks, along with exclusion limits in terms of the T and B quark masses, for the different combinations of branching fractions considered.The mass exclusion limits at 95% confidence level for the neural network analysis range from 740 to 1370 GeV, providing comparable sensitivity to the searches utilizing leptons, which exclude vector-like quark masses in the range 910-1300 GeV [13].These results represent the most stringent limits on pair produced vector-like quarks in the fully hadronic channel to date.

Figure 1 :
Figure 1: classification fractions for the six categories of the BEST algorithm, measured in data events as a function of jet p T .Error bars shown indicate statistical uncertainties in the fractions to be propagated to the estimate of the QCD multijet background contribution.The rightmost bin includes jets with p T values above 3 TeV.

Figure 2 :
Figure 2: The distributions of H AK4 T

Figure 3 :
Figure 3: A summary of the 126 signal region categories used in the NN analysis.This figure shows the expected yields in each category, while the signal discrimination is performed with the H AK8 T distributions from each of the categories.The bottom panel shows the ratio of observed data to total background in each category, with Poisson error bars where applicable, along with the total background uncertainty shown for each category by the gray band.

Figure 7 :
Figure 7: Observed (left) and expected (right) mass exclusion limits at 95% confidence level for each combination of T quark branching fractions, in the cut-based analysis (upper) and NN analysis (lower).

Figure 8 :
Figure 8: Observed (left) and expected (right) mass exclusion limits at 95% confidence level for each combination of B quark branching fractions, in the cut-based analysis (upper) and NN analysis (lower).

Table 1 :
Theoretical cross sections for TT and BB production, calculated at NNLO with TOP++2.0.

Table 2 :
Sources of systematic uncertainties that affect the H AK4 T or H AK8 T ≥

Table 3 :
Exclusion limits at 95% confidence level presented in terms of the T quark mass, for the different branching fraction scenarios considered, in each of the two analyses.

Table 4 :
Exclusion limits at 95% confidence level presented in terms of the B quark mass, for the different branching fraction scenarios considered, in each of the two analyses.
MST (Taipei); ThEPCenter, IPST, STAR, and NSTDA (Thailand); TUBITAK and TAEK (Turkey); NASU and SFFR (Ukraine); STFC (United Kingdom); DOE and NSF (USA).Individuals have received support from the Marie-Curie program and the European Research Council and Horizon 2020 Grant, contract Nos.675440 and 765710 (European Union); the Leventis Foundation; the A.P. Sloan Foundation; the Alexander von Humboldt Foundation; the Belgian Federal Science Policy Office; the Fonds pour la Formation à la Recherche dans l'Industrie et dans l'Agriculture (FRIA-Belgium); the Agentschap voor Innovatie door Wetenschap en Technologie (IWT-Belgium); the F.R.S.-FNRS and FWO (Belgium) under the "Excellence of Science -EOS" -be.h project n. 30820817; the Beijing Municipal Science & Technology Commission, No. Z181100004218003; the Ministry of Education, Youth and Sports (MEYS) of the Czech Republic; the Lend ület ("Momentum") Programme and the János Bolyai Research Scholarship of the Hungarian Academy of Sciences, the New National Excellence Program ÚNKP, the NKFIA research grants 123842, 123959, 124845, 124850, 125105, 128713, 128786, and 129058 (Hungary); the Council of Science and Industrial Research, India; the HOMING PLUS program of the Foundation for Polish Science, cofinanced from European Union, Regional Development Fund, the Mobility Plus program of the Ministry of Science and Higher Education, the National Science Center (Poland), contracts Harmonia 2014/14/M/ST2/00428, Opus 2014/13/B/ST2/02543, 2014/15/B/ST2/03998, and 2015/19/B/ST2/02861, Sonata-bis 2012/07/E/ST2/01406; the National Priorities Research Program by Qatar National Research Fund; the Programa Estatal de Fomento de la Investigaci ón Científica y Técnica de Excelencia María de Maeztu, grant MDM-2015-0509 and the Programa Severo Ochoa del Principado de Asturias; the Thalis and Aristeia programs cofinanced by EU-ESF and the Greek NSRF; the Rachadapisek Sompot Fund for Postdoctoral Fellowship, Chulalongkorn University and the Chulalongkorn Academic into Its 2nd Century Project Advancement Project (Thailand); the Welch Foundation, contract C-1845; and the Weston Havens Foundation (USA).