Search for single production of a vector-like $T$ quark decaying into a Higgs boson and top quark with fully hadronic final states using the ATLAS detector

A search is made for a vector-like $T$ quark decaying into a Higgs boson and a top quark in 13 TeV proton-proton collisions using the ATLAS detector at the Large Hadron Collider with a data sample corresponding to an integrated luminosity of 139 fb$^{-1}$. The Higgs-boson and top-quark candidates are identified in the all-hadronic decay mode, where $H\to b\bar{b}$ and $t\to b W \to b q \bar{q}^\prime$ are reconstructed as large-radius jets. The candidate Higgs boson, top quark, and associated B-hadrons are identified using tagging algorithms. No significant excess is observed above the background, so limits are set on the production cross-section of a singlet $T$ quark at 95% confidence level, depending on the mass, $m_T$, and coupling, $\kappa_T$, of the vector-like $T$ quark to Standard Model particles. In the considered mass range between 1.0 and 2.3 TeV, the upper limit on the allowed coupling values increases with $m_T$ from a minimum value of 0.35 for 1.07<$m_T$<1.4 TeV to 1.6 for $m_T$ = 2.3 TeV.


Introduction
The discovery of the Higgs boson [1, 2] and measurements of the Higgs-boson couplings [3-6] by the ATLAS and CMS collaborations confirm that the Standard Model of particle physics (SM) is an accurate description of nature at currently accessible energy scales. However, the SM still leaves many questions unanswered and is therefore not a complete theory. For example, radiative corrections to the Higgs-boson propagator from top-quark loops lead to a quadratic divergence in the mass of the Higgs boson [7]. The mechanism to cancel out the contribution from the top quark requires an unreasonable degree of fine-tuning to produce the observed 125 GeV Higgs boson. This so-called hierarchy problem is often considered to indicate that new physics naturally cancels out the divergent contributions to the Higgs-boson mass.
Vector-like quarks are hypothetical spin-1/2 particles that arise in various models that address problems in the SM such as the hierarchy problem. Vector-like quarks are color-triplets whose left- and right-handed chiralities transform in the same way under weak-isospin [8,9]. In Little Higgs [10,11] and Composite Higgs [12,13] models, the Higgs boson is naturally light because it is a pseudo Nambu-Goldstone boson arising from a spontaneously broken global symmetry [14]. Vector-like quarks arise naturally in such models. Unlike the chiral current of SM quarks, vector-like quarks have a pure vector current in the Lagrangian. In addition, vector-like quarks do not acquire mass by interacting with the Higgs field, so they are not excluded by measurements of Higgs-boson properties.
In these models, vector-like quarks are expected to couple preferentially to third-generation quarks [8,15] and can have both neutral-current and charged-current decays. An up-type vector-like quark $T$ with charge +2/3 can decay into $Wb$, $Zt$, or $Ht$, while a down-type quark $B$ with charge −1/3 can decay into $Wt$, $Zb$, or $Hb$ (and the charge conjugate states). To be consistent with results from precision electroweak measurements, the mass-splitting between vector-like quarks belonging to the same SU(2) multiplet should be small [16], preventing cascade decays such as $T \to WB$. Couplings between the vector-like quarks and the first- and second-generation quarks are not excluded [17,18], but they are expected to be small.
Vector-like quarks can be produced singly or in pairs in proton-proton ($pp$) collisions. There have been numerous searches for the pair production of vector-like quarks [19-37] that have excluded $T$-quark masses below 1.37 TeV at 95% confidence level (CL) for a variety of decay modes. For $T$-quark masses above ∼1 TeV, vector-like quarks would mainly be produced singly if the couplings to SM particles were sufficiently large. Searches for single production of $T$ quarks have placed limits on $T$-quark production cross-sections for $T$-quark masses between 1 and 2 TeV at 95% CL for various SM couplings [38-46]. For these higher masses, where single vector-like quark production is expected to dominate [16], the cross-section depends on the vector-like quark mass scale as well as the couplings to SM particles.
This paper reports a search for single production of a singlet vector-like $T$ quark in 13 TeV $pp$ collisions produced at the Large Hadron Collider (LHC) and recorded by the ATLAS detector in a 139 fb$^{-1}$ data sample. The search targets $T$ quarks decaying into a SM Higgs boson and a top quark, $T \to Ht$, where both the Higgs boson and top quark decay hadronically and are reconstructed as jets of particles. A Feynman diagram for this process is shown in Figure 1. The mass, $m_T$, of the $T$ quark and the overall coupling factor, $\kappa_T$, to the SM $W$ boson, $Z$ boson, and Higgs boson [47] are unknown parameters. There are also three additional parameters that determine the $T$-quark branching ratios. In this analysis, the asymptotic limit of these parameters, as $m_T$ goes to infinity, is assumed, leading to branching ratios of 1/2, 1/4, and 1/4 for $T \to Wb$, $T \to Ht$, and $T \to Zt$, respectively. In this model, the unknown parameters $m_T$ and $\kappa_T$ define the expected $T$-quark cross-section and resonance lineshape. The search assumes this signal model in the interpretation of the data.
The $T$ quark is assumed to be a weak singlet state in this analysis; if additional multiplets of vector-like quarks are assumed, the possible final states and branching ratios require an approach involving simultaneous consideration of several final states [16], which is beyond the scope of this paper.
The results reported here significantly extend the sensitivity to events in which a singly produced $T$ quark decays to $Ht$ followed by the hadronic decays $H \to b\bar{b}$ and $t \to bW \to bq\bar{q}^\prime$. The use of fully hadronic decays allows the direct reconstruction of the $T$-quark final state, increasing the expected signal-to-background ratio in the signal region defined for the search. Compared with the most sensitive prior search [46], this search uses ∼4 times more integrated luminosity, improves its sensitivity by using tagging techniques that raise the signal-to-background ratio by a factor of ∼3, and uses a data-driven multijet background estimate that reduces the uncertainty in the background estimate by an order of magnitude.
This fully hadronic final state is of particular interest for vector-like quark masses above 1 TeV. The resulting high-$p_{\mathrm{T}}$ jets from the top quark and Higgs boson are "boosted", so that the decay products of the top quark and Higgs boson are collimated and captured in two large-radius (large-$R$) jets. This final state has the largest branching fraction of all the potential decay modes and the large-$R$ jets can be identified as either Higgs-boson or top-quark candidates through tagging algorithms that use the substructure within the jet [48,49]. In addition, bottom-quark jet identification ($b$-tagging) provides high background rejection with high efficiency given the three bottom-quark jets coming from the $H \to b\bar{b}$ and $t \to Wb$ decays. Assuming the existence of single $T$-quark production, the signal would appear as an excess of events with invariant masses around the $T$-quark mass for values of $\kappa_T \lesssim 0.5$. Above this $\kappa_T$, the invariant mass distribution broadens to masses below the $T$-quark mass as $\kappa_T$ increases due to the convolution of increasing width and partonic densities.
The largest backgrounds come from boosted top-quark pair ($t\bar{t}$) production and multijet events arising from the production of lighter high-$p_{\mathrm{T}}$ quarks ($u$, $d$, $s$, $c$, and $b$) and/or gluons. The ATLAS [50-58] and CMS [59-69] collaborations have published measurements of the $t\bar{t}$ differential cross-sections at center-of-mass energies of $\sqrt{s}$ = 7 TeV, 8 TeV, and 13 TeV in $pp$ collisions. The measured cross-section for the production of top quarks with $p_{\mathrm{T}}$ > 300 GeV is ∼20% lower than predicted by perturbative quantum chromodynamics (QCD) calculations performed at next-to-leading-order (NLO) in the strong coupling constant $\alpha_{\mathrm{s}}$. A control sample of fully reconstructed high-$p_{\mathrm{T}}$ top-quark pairs is used with Monte Carlo (MC) models to normalize the expected background from top-quark pairs in the candidate sample. The multijet background is estimated using data-driven techniques developed for studies of events containing boosted top quarks [57].
This paper is organized as follows. Section 2 describes the ATLAS detector and Section 3 describes the datasets and MC samples that are used in this analysis. Section 4 describes the object definition and event selection, while Section 5 summarizes the estimation of SM backgrounds to the $T$-quark signal. The systematic uncertainties are presented in Section 6 and Section 7 presents the results of the search. Conclusions are drawn in Section 8.

ATLAS detector
The ATLAS detector [70] at the LHC is centered on the collision point and covers nearly the whole $4\pi$ solid angle. It consists of an inner tracking detector surrounded by a 2 T superconducting solenoid, electromagnetic and hadronic calorimeters, and a muon spectrometer incorporating three large superconducting toroid magnets.
ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the center of the detector and the $z$-axis along the beam pipe. The $x$-axis points from the IP to the center of the LHC ring, and the $y$-axis points upwards. Cylindrical coordinates ($r$, $\phi$) are used in the transverse plane, $\phi$ being the azimuthal angle around the $z$-axis. The pseudorapidity is defined in terms of the polar angle $\theta$ as $\eta = -\ln \tan(\theta/2)$. Angular distance is measured in units of $\Delta R \equiv \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$.
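As an illustration of these coordinate conventions, the following minimal Python sketch computes the pseudorapidity from the polar angle and the angular distance between two directions. It assumes the usual definition $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$ with the azimuthal difference wrapped into $(-\pi, \pi]$; the function names are illustrative and are not part of any ATLAS software.

```python
import math


def pseudorapidity(theta: float) -> float:
    """Return eta = -ln tan(theta/2) for a polar angle theta in radians (0 < theta < pi)."""
    return -math.log(math.tan(theta / 2.0))


def delta_phi(phi1: float, phi2: float) -> float:
    """Return the azimuthal difference phi1 - phi2 wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)  # now in [0, 2*pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def delta_r(eta1: float, phi1: float, eta2: float, phi2: float) -> float:
    """Return the angular distance sqrt(d_eta^2 + d_phi^2) (assumed definition)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))
```

The wrapping step matters for objects on either side of the $\phi = \pm\pi$ boundary, which are close in azimuth even though their raw $\phi$ values differ by almost $2\pi$.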
The inner detector, including the insertable B-layer added as a new innermost layer in 2014 [71,72], provides charged-particle tracking information from a pixel detector and silicon microstrip detector in the pseudorapidity range $|\eta|$ < 2.5 and a transition radiation tracker covering $|\eta|$ < 2.0.
The calorimeter system covers the pseudorapidity range $|\eta|$ < 4.9 and measures the positions and energies of electrons, photons, and charged and neutral hadrons. Within the region $|\eta|$ < 3.2, electromagnetic calorimetry is provided by barrel and endcap high-granularity lead and liquid-argon sampling calorimeters. The hadronic sampling calorimeter uses either scintillator tiles or liquid argon as active material and steel, copper or tungsten as absorber.
The muon spectrometer comprises separate trigger and high-precision tracking chambers measuring the tracks of muons in a magnetic field generated by superconducting air-core toroid magnets. The precision chamber system covers the region $|\eta|$ < 2.7, while the muon trigger system covers the range $|\eta|$ < 2.4.
A two-level trigger system is used to select which events to save for offline analysis [73]. The first level is implemented in hardware/firmware and uses a subset of the detector information to reduce the event rate from the 40 MHz proton bunch crossings to less than 100 kHz. This is followed by a software-based high-level trigger that reduces the event rate to approximately 1 kHz. An extensive software suite [74] is used in the reconstruction and analysis of real and simulated data, in detector operations, and in the trigger and data acquisition systems of the experiment.

Data and simulated samples
This analysis studies $pp$ collisions with a center-of-mass energy of $\sqrt{s}$ = 13 TeV recorded by the ATLAS detector between 2015 and 2018. Only data-taking periods in which all the subdetectors were operational are considered. The dataset corresponds to an integrated luminosity of 139 fb$^{-1}$ [75], measured using the LUCID-2 detector [76]. The events used in this analysis were collected by a set of triggers requiring at least one anti-$k_t$ jet [77, 78] with a jet radius parameter of $R$ = 1.0 [73]. The maximum $p_{\mathrm{T}}$ threshold value of these triggers was 480 GeV, which was found to be fully efficient when requiring the offline reconstruction of at least one large-$R$ jet with $p_{\mathrm{T}}$ > 500 GeV and $|\eta|$ < 2.0, as described in Section 4.
The main backgrounds for this search are from $t\bar{t}$ and multijet events. There are also small contributions from single-top-quark and $t\bar{t}+X$ ($X = W, Z, H$) events. The multijet background is estimated using a data-driven method described in Section 5, while the other backgrounds, as well as the $T$-quark signal events, are estimated with MC simulations as described below. The multijet background estimate also includes backgrounds arising from electroweak and QCD processes such as $W/Z$+jets production.
The $T$-quark signal samples were produced at leading order, using the MadGraph5_aMC@NLO MC generator [79] to generate the hard interaction and the Pythia 8 generator for parton showering and hadronization. The parton distribution function (PDF) set used is NNPDF3.0 [80]. Both $W$-mediated and $Z$-mediated production contribute to single $T$-quark production and were included in the MC event generation, with the $Z$-mediated process having a cross-section five times smaller than the $W$-mediated process and comprising less than 1% of the total yield after the event selection described in Section 4. The matrix elements were calculated using the phenomenological model given in Ref. [47]. These include all tree-level processes, ensuring the inclusion of both resonant and nonresonant single $T$-quark production modes. The decay channel considered is $T \to Ht$, with $m_T$ and $\kappa_T$ as unknown parameters. The three additional parameters that determine the $T$-quark branching ratios are set to the asymptotic limit in $m_T$, leading to branching ratios of 1/2, 1/4, and 1/4 for $T \to Wb$, $T \to Ht$, and $T \to Zt$, respectively. In order to accurately model the change in cross-section and lineshape as $m_T$ and $\kappa_T$ are varied, MC samples were created for a variety of mass and coupling values, with $m_T$ ranging from 1.0 to 2.3 TeV in steps of 0.1 TeV and $\kappa_T$ ranging from 0.1 to 1.6 in steps of 0.05 for $\kappa_T$ < 0.5 and 0.1 for larger $\kappa_T$. All signal samples are normalized to cross-sections that have been calculated at NLO in QCD [81]. These cross-sections are computed in a $T$-quark narrow-width approximation and a correction factor is applied [82] to account for finite-width effects.
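The mass and coupling scan described above can be enumerated with a short Python sketch. It assumes that a sample exists for every combination of mass and coupling value (the text only states the ranges and step sizes, so the full Cartesian grid is an assumption), and the function name `signal_grid` is purely illustrative.

```python
def signal_grid():
    """Return (mass in TeV, coupling) pairs for the assumed full signal scan."""
    # Masses from 1.0 to 2.3 TeV in steps of 0.1 TeV.
    masses = [round(1.0 + 0.1 * i, 1) for i in range(14)]
    # Couplings in steps of 0.05 up to 0.5, then in steps of 0.1 up to 1.6.
    fine = [round(0.10 + 0.05 * i, 2) for i in range(9)]
    coarse = [round(0.6 + 0.1 * i, 1) for i in range(11)]
    couplings = fine + coarse
    return [(m, k) for m in masses for k in couplings]
```

Rounding each value avoids the accumulation of floating-point error that would occur if the step were added repeatedly.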
For all MC samples, the masses of the top quark ($m_{\mathrm{top}}$) and Higgs boson were set to 172.5 GeV and 125.0 GeV, respectively. The production of $t\bar{t}$ events was modeled using the Powheg Box v2 [83-86] generator. This provides matrix elements at NLO with the NNPDF3.0 PDF. In addition, the $h_{\mathrm{damp}}$ parameter, which controls the matching of the matrix element to the parton shower in Powheg and effectively regulates the high-$p_{\mathrm{T}}$ radiation against which the $t\bar{t}$ system recoils, was set to 1.  [90]. The decays of bottom and charm hadrons were simulated using the EvtGen 1.6.0 program [91].
The uncertainty due to initial-state radiation (ISR) was estimated by varying the Var3c A14 tune, renormalization scale $\mu_{\mathrm{r}}$, factorization scale $\mu_{\mathrm{f}}$, and the $h_{\mathrm{damp}}$ parameter independently. The Var3c A14 tune variation corresponds to the variation of $\alpha_{\mathrm{s}}$ for ISR in the A14 tune. The renormalization and factorization scales were varied by factors of 0.5 and 2.0, corresponding to an increase and decrease in ISR, respectively. The $h_{\mathrm{damp}}$ uncertainty is estimated by comparing the nominal $t\bar{t}$ sample with a sample using $h_{\mathrm{damp}} = 3\,m_{\mathrm{top}}$. The impact of final-state radiation (FSR) uncertainties was evaluated by increasing and decreasing the renormalization scale for emissions from the parton shower by a factor of two.
The impact of using a different parton-shower and hadronization model was evaluated by comparing the nominal $t\bar{t}$ sample with a sample that was also generated by Powheg Box v2 but used Herwig 7.1.3 [104,105] instead of Pythia 8.230 for parton showering and hadronization. The Herwig 7.1 default set of tuned parameters [105,106] and the MMHT2014 PDF set [107] were employed.
To assess the uncertainty in the matching of NLO matrix elements to the parton shower, the Powheg $t\bar{t}$ sample was compared with a sample of events generated with MadGraph5_aMC@NLO 2.6.0 but retaining the Pythia 8.230 parton-shower and hadronization models. The MadGraph5_aMC@NLO calculation used the NNPDF3.0 set of PDFs, as in the Powheg sample, and Pythia 8 again used the A14 tune and the NNPDF2.3 set of PDFs.
The production of a single top quark in association with a $W$ boson ($tW$) was modeled using the Powheg Box v2 generator [84-86,108] at NLO in QCD with the five-flavor scheme and the NNPDF3.0 set of PDFs. The diagram removal scheme [109] was used to remove interference and overlap with $t\bar{t}$ production. The Pythia 8.230 parton-shower and hadronization models were employed, using the A14 tune and the NNPDF2.3 set of PDFs. The inclusive cross-section for $tW$ production was corrected to the theory prediction calculated at NLO in QCD with NNLL soft-gluon corrections [110,111].
Single-top-quark $t$-channel production was modeled using the Powheg Box v2 [84-86,112] generator at NLO in QCD using the four-flavor scheme and the corresponding NNPDF3.0 set of PDFs. Parton showering and hadronization were performed with Pythia 8.230, using the A14 tune and the NNPDF2.3 set of PDFs. The inclusive cross-section was corrected to the theory prediction calculated at NLO in QCD with the Hathor 2.1 generator [113,114]. Single-top-quark $s$-channel MC events were not generated because the cross-section for this process is much smaller than that for $tW$ production and the $t$-channel processes. However, the $s$-channel process makes a small contribution to the data-driven multijet background estimate and is therefore partially accounted for. The production of SM is treated in a similar manner, as the background yield is negligible due to a combination of small cross-section and low yield in the high-$p_{\mathrm{T}}$ region.
The production of $t\bar{t}$ in association with a Higgs boson ($t\bar{t}+H$) was modeled by the Powheg Box v2 [83-86,115] generator at NLO. The production of $t\bar{t}$ in association with a $W$ or $Z$ boson was modeled using the MadGraph5_aMC@NLO 2.3.3 generator at NLO. Parton showering and hadronization for these processes were performed by Pythia 8.210 and the decays of bottom and charm hadrons were simulated using EvtGen 1.2.0. The cross-sections for the $t\bar{t}+W/Z/H$ processes were calculated using MadGraph5_aMC@NLO at NLO QCD and NLO EW accuracies using Ref. [116]. The $t\bar{t}+Z$ cross-section was corrected to take into account contributions from off-shell $Z$ bosons with masses down to 5 GeV. The predicted values of the cross-sections at 13 TeV are $0.88^{+0.09}_{-0.11}$ pb, $0.60^{+0.08}_{-0.07}$ pb, and $0.51^{+0.04}_{-0.05}$ pb for $t\bar{t}+Z$, $t\bar{t}+W$, and $t\bar{t}+H$, respectively, where the uncertainties reflect QCD scale variations.
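To give a sense of scale for these backgrounds, the sketch below converts the quoted central cross-sections into the number of events produced in 139 fb$^{-1}$, using $N = \sigma \times L$ before any branching fraction, acceptance, or selection efficiency. The assignment of the three values to $t\bar{t}+Z$, $t\bar{t}+W$, and $t\bar{t}+H$ follows the ordering assumed here, and the names in the code are illustrative.

```python
LUMINOSITY_FB = 139.0   # integrated luminosity in fb^-1
FB_PER_PB = 1000.0      # 1 pb = 1000 fb

# Central values in pb; the process labels are an assumed ordering.
CROSS_SECTIONS_PB = {"ttZ": 0.88, "ttW": 0.60, "ttH": 0.51}


def produced_events(xsec_pb: float, lumi_fb: float = LUMINOSITY_FB) -> float:
    """Return the number of produced events, N = sigma * L, before any selection."""
    return xsec_pb * FB_PER_PB * lumi_fb
```

For example, `produced_events(CROSS_SECTIONS_PB["ttZ"])` gives roughly 1.2 × 10$^5$ produced events, of which only a small fraction would survive the boosted all-hadronic selection.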
The effect of multiple interactions in the same and neighboring bunch crossings (pileup) was modeled by overlaying the simulated hard-scattering event with inelastic events generated with the Pythia 8.186 MC generator [117] using the NNPDF2.3 set of PDFs and the A3 set of tuned parameters [118].
The detector response was simulated using the Geant4 framework [119,120], and the data and MC events are reconstructed with the same software algorithms.

Object definition
This analysis makes use of jets, electrons, muons, and event-based quantities formed from their combinations.
Electron candidates are identified from high-quality inner-detector tracks matched to calorimeter energy deposits consistent with an electromagnetic shower [121]. The calorimeter deposits must form a cluster with $p_{\mathrm{T}}$ > 25 GeV, $|\eta|$ < 2.47, and be outside the transition region 1.37 ≤ $|\eta|$ ≤ 1.52 between the barrel and endcap calorimeters. A likelihood-based requirement is used to suppress misidentified jets, and calorimeter- and track-based isolation requirements are imposed using the gradient working point [121], which provides uniform rejection in $\eta$ and improved rejection as $p_{\mathrm{T}}$ increases.
Muon candidates are reconstructed using high-quality inner-detector tracks combined with tracks reconstructed in the muon spectrometer [122]. Only muon candidates with $p_{\mathrm{T}}$ > 25 GeV and $|\eta|$ < 2.5 are considered. Calorimeter- and track-based isolation criteria similar to those used for electrons are used [123]. To reduce the impact of nonprompt leptons, muons within $\Delta R$ = 0.4 of a jet are removed.
The anti-$k_t$ algorithm implemented in the FastJet package [77, 78] is used to define three types of jets for this analysis: (1) VRTrack jets with a variable-radius parameter with values between $R$ = 0.02 and $R$ = 0.4 [124], (2) small-$R$ jets with $R$ = 0.4, and (3) large-$R$ jets with $R$ = 1.0. These are reconstructed independently of each other. The VRTrack jets make use of tracking information from the inner detector, the large-$R$ jets use information from topological clusters [125] in the calorimeter, and the small-$R$ jets use both tracking information and topological clusters [126].
Only jets that have $p_{\mathrm{T}}$ > 25 GeV and $|\eta|$ < 2.5 are considered. To reduce pileup effects, the jet-vertex tagger (JVT) algorithm [127] is used to reject small-$R$ jets that do not originate from the primary interaction vertex. The primary vertex is selected as the one with the largest $\sum p_{\mathrm{T}}^2$, where the sum is over all tracks with transverse momentum $p_{\mathrm{T}}$ > 0.5 GeV that are associated with the vertex. This JVT algorithm is applied only to small-$R$ jets with $p_{\mathrm{T}}$ < 60 GeV and $|\eta|$ < 2.4.
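The primary-vertex choice can be sketched in a few lines of Python. The representation of each vertex as a plain list of track transverse momenta (in GeV) is a simplification adopted for this sketch, and the function name is hypothetical.

```python
def select_primary_vertex(vertices, min_track_pt=0.5):
    """Return the index of the vertex with the largest sum of squared track pT.

    `vertices` is a list in which each entry is a list of track pT values in GeV.
    Only tracks with pT above `min_track_pt` enter the sum.
    """
    if not vertices:
        raise ValueError("no reconstructed vertices in the event")

    def sum_pt2(track_pts):
        return sum(pt * pt for pt in track_pts if pt > min_track_pt)

    return max(range(len(vertices)), key=lambda i: sum_pt2(vertices[i]))
```

Because the sum is quadratic in $p_{\mathrm{T}}$, a vertex with a few hard tracks is preferred over a pileup vertex with many soft tracks.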
The topological clusters used as input to the small- and large-$R$ jet reconstruction are calibrated using the local calibration method [128]. The jet energy scale is energy- and $\eta$-dependent with calibration factors derived from simulation and in situ measurements [125,129,130]. The large-$R$ jet candidates are required to have $|\eta|$ < 2.0 and $p_{\mathrm{T}}$ > 350 GeV. The $|\eta|$ requirement is imposed to optimize the $T$-quark signal-to-background ratio and to select jets in a kinematic regime where the object tagging is efficient and well-understood. The $p_{\mathrm{T}}$ requirement ensures that the large-$R$ jets are sufficiently collimated to contain most of the decay products of the top quark or Higgs boson. A trimming algorithm [131] with parameters $R_{\mathrm{sub}}$ = 0.2 and $f_{\mathrm{cut}}$ = 0.05 is applied to suppress gluon radiation and further mitigate pileup effects. The small-$R$ jets are used to validate the modeling of large-$R$ jets arising from the $t\bar{t}$ and multijet backgrounds and are not used directly in the event selection. Only small-$R$ jets with $p_{\mathrm{T}}$ > 25 GeV and $|\eta|$ < 2.5 are considered, so as to match the VRTrack jet candidates.

Higgs boson, top quark, and $b$-jet tagging
This analysis searches for Higgs bosons, top quarks, and $b$-hadron jets ($b$-jets) to identify $T$-quark candidates that undergo a $T \to Ht$ decay, followed by $H \to b\bar{b}$, $t \to Wb$, and $W \to q\bar{q}^\prime$ decays. Distinct tagging algorithms are employed to identify these three objects.
Higgs-boson candidates are identified by requiring the large-$R$ jet mass [128] to be between 100 and 140 GeV, along with an upper bound on the jet-substructure variable $\tau_{21}$ [132,133], which is a relative measure of whether the jet has a two-body or one-body structure. The upper bound on $\tau_{21}$ is chosen as a function of the jet $p_{\mathrm{T}}$ in order to achieve a tagging efficiency of 70% for Higgs bosons, independent of their $p_{\mathrm{T}}$. The tagger provides a rejection factor between five and ten for light-quark and gluon jets.
The top-quark-tagging algorithm uses a deep-neural-network (DNN) scheme [48]. It makes use of jet-substructure variables to discriminate between top-quark jets and jets arising from $W$, $Z$, Higgs bosons, gluons, and lighter quarks. An 80% efficiency working point is used, which is defined for all top-quark jets whose decay products are clustered together into the large-$R$ jet. In addition, only jets with a reconstructed mass between 140 and 225 GeV are considered. The orthogonal mass window requirements for tagging Higgs bosons and top quarks ensure that a jet can only be identified as either a top-quark or Higgs-boson candidate.
The $b$-tagging algorithm used is known as DL1, a DNN-based tagging scheme that uses the secondary vertex information and the impact parameters of the charged tracks in a VRTrack jet [49]. The working point chosen for this algorithm results in 70% tagging efficiency for $b$-jets in $t\bar{t}$ events, with a rejection of ∼10 and ∼400 for charm and light quarks, respectively. This algorithm is applied to all VRTrack jets that have been geometrically matched to the large-$R$ jets by requiring that the jet axes have an $\eta$-$\phi$ distance $\Delta R$ < 1.0.

Event preselection
A preselection is performed to obtain a sample of candidate signal and background events. Each event is required to have a primary vertex with five or more associated tracks with $p_{\mathrm{T}}$ > 0.5 GeV [134].
To identify the fully hadronic decay topology, events must have at least two large-$R$ jets with $p_{\mathrm{T}}$ > 350 GeV and $|\eta|$ < 2.0. The highest-$p_{\mathrm{T}}$ jet is required to have $p_{\mathrm{T}}$ > 500 GeV to ensure that the inclusive jet trigger used to record the events has 100% efficiency. The two highest-$p_{\mathrm{T}}$ large-$R$ jets are referred to as the leading and second-leading jets. All other large-$R$ jets are ignored. The large-$R$ jets must have a mass between 100 and 225 GeV.
To remove candidates where a $t\bar{t}$ event has resulted in a lepton+jets or dilepton final state, events are rejected if they have an identified electron or muon candidate, as described in Section 4.1.
This preselection defines the data sample used in the $T$-quark search, which comprises about 4 million events.
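The preselection requirements can be summarized in a compact Python sketch. The event representation (a track count for the primary vertex, a list of jet dictionaries with keys `pt`, `eta`, and `mass` in GeV, and a lepton count) is hypothetical, and the sketch assumes that the mass requirement applies to the leading and second-leading large-radius jets only.

```python
def passes_preselection(n_pv_tracks, large_r_jets, n_leptons):
    """Return True if the event passes the preselection sketched here.

    n_pv_tracks  : tracks with pT > 0.5 GeV associated with the primary vertex
    large_r_jets : list of dicts with keys "pt", "eta", "mass" (GeV)
    n_leptons    : number of identified electron or muon candidates
    """
    if n_pv_tracks < 5:
        return False
    if n_leptons > 0:  # veto lepton+jets and dilepton final states
        return False
    # Keep large-radius jets inside the kinematic acceptance, ordered by pT.
    jets = sorted(
        (j for j in large_r_jets if j["pt"] > 350.0 and abs(j["eta"]) < 2.0),
        key=lambda j: j["pt"],
        reverse=True,
    )
    if len(jets) < 2:
        return False
    leading, second = jets[0], jets[1]
    if leading["pt"] <= 500.0:  # trigger plateau requirement
        return False
    # Assumed to apply to the two leading jets; all other jets are ignored.
    return all(100.0 <= j["mass"] <= 225.0 for j in (leading, second))
```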

Event classification by tagging states
The leading and second-leading large-$R$ jet candidates are examined to determine if either jet satisfies the Higgs-boson-tagging or top-quark-tagging criteria. In addition, each VRTrack jet contained within a large-$R$ jet is examined to determine if it is $b$-tagged. In what follows, a $b$-tagged VRTrack jet associated with a large-$R$ jet is referred to as a $b$-tag.
With these tagging definitions, the events are classified according to the tagging states of each large-$R$ jet: the jet could be neither Higgs-boson-tagged nor top-quark-tagged, be Higgs-boson-tagged, or be top-quark-tagged. The jet also could have no $b$-tags, 1 $b$-tag, or ≥2 $b$-tags. Altogether, a large-$R$ jet could be in one of nine different tagging states, so a 9 × 9 matrix is defined as shown in Figure 2 to categorize all possible tagging states of the two jets in an event.
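The counting of tagging states can be made explicit with a short Python sketch. The substructure and DNN decisions are passed in as booleans because their $p_{\mathrm{T}}$-dependent thresholds are not reproduced here; the treatment of a jet mass of exactly 140 GeV and all names in the code are assumptions made for illustration.

```python
from itertools import product

TAG_KINDS = ("untagged", "higgs", "top")
B_TAG_BINS = ("0b", "1b", ">=2b")

# Nine single-jet states and 9 x 9 = 81 event-tagging states.
JET_STATES = list(product(TAG_KINDS, B_TAG_BINS))
EVENT_STATES = list(product(JET_STATES, repeat=2))


def jet_state(mass_gev, passes_tau21, passes_top_dnn, n_btags):
    """Return the (tag kind, b-tag bin) state of one large-radius jet."""
    if 100.0 <= mass_gev < 140.0 and passes_tau21:
        kind = "higgs"
    elif 140.0 <= mass_gev <= 225.0 and passes_top_dnn:
        kind = "top"
    else:
        kind = "untagged"
    return (kind, B_TAG_BINS[min(n_btags, 2)])
```

Since the two mass windows do not overlap, a jet can never be assigned both the Higgs-boson and top-quark labels.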
Three sets of regions are defined: a signal region (SR), a $t\bar{t}$ normalization region (NR), and eight validation regions (VRs), as illustrated in Figure 2. The signal region consists of those events where one jet is Higgs-boson-tagged with ≥2 $b$-tags and the other jet is top-quark-tagged with ≥1 $b$-tag, and comprises four event-tagging states as illustrated in Figure 2.
The $t\bar{t}$ normalization region is designed to contain the highest-purity sample of $t\bar{t}$ pair events. The four event-tagging states that define this region are those with both large-$R$ jets being top-quark-tagged and each having at least 1 $b$-tag. Top-quark-tagged jets with ≥2 $b$-tags are included, as these typically result from mistagging a charm-quark jet arising from a $W \to c\bar{s}$ decay. This region is used to study top-quark-tagging performance and to validate the top-quark acceptance and background estimates.
[Figure 2: matrix of event-tagging states, with the leading large-$R$ jet tagging state on one axis and the second-leading large-$R$ jet tagging state on the other.]
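The signal and normalization region definitions can be checked against the stated count of four event-tagging states each with the following Python sketch. Each jet state is represented as a tuple of a tag kind and a number of $b$-tags capped at 2 (meaning two or more); this encoding and the function name are illustrative choices.

```python
from itertools import product


def classify_region(leading, second):
    """Return "SR", "NR", or "other" for a pair of jet states.

    Each state is (kind, n_b), with kind in {"untagged", "higgs", "top"}
    and n_b in {0, 1, 2}, where 2 stands for two or more b-tags.
    """
    def higgs_2b(state):
        return state[0] == "higgs" and state[1] >= 2

    def top_1b(state):
        return state[0] == "top" and state[1] >= 1

    if (higgs_2b(leading) and top_1b(second)) or (top_1b(leading) and higgs_2b(second)):
        return "SR"
    if top_1b(leading) and top_1b(second):
        return "NR"
    return "other"


# Enumerate all 81 event-tagging states and count those in each region.
JET_STATES = list(product(("untagged", "higgs", "top"), (0, 1, 2)))
REGION_COUNTS = {"SR": 0, "NR": 0, "other": 0}
for lead_state, second_state in product(JET_STATES, repeat=2):
    REGION_COUNTS[classify_region(lead_state, second_state)] += 1
```

The validation regions are not encoded here; they would occupy eight of the remaining event-tagging states counted as "other".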
The validation regions are used to validate the background estimation techniques used in the SR and NR. The regions with a leading large-$R$ jet top-quark-tagged with 1 $b$-tag and the second-leading large-$R$ jet not being top-quark-tagged with 1 $b$-tag (VR1 and VR2) or the second-leading large-$R$ jet being either Higgs-boson-tagged or top-quark-tagged with no $b$-jets (VR3 and VR4) validate the multijet and non-all-hadronic $t\bar{t}$ background estimates. The validation regions defined by the event-tagging states with a Higgs-boson-tagged large-$R$ jet with ≥2 $b$-tags and with the other jet top-quark-tagged with no associated $b$-tags (VR5 and VR6) are expected to be dominated by mistagged events and are used to cross-check the mistagging estimates for the Higgs-boson, top-quark, and $b$-jet tagging schemes. The validation regions defined by one jet that is neither Higgs-boson-tagged nor top-quark-tagged with 1 $b$-tag and the other jet being top-quark-tagged with ≥2 $b$-tags (VR7 and VR8) are considered to validate the background modeling involving ≥3 $b$-tags.

Background estimation and validation
The two largest contributions to the SR and NR are multijet events and $t\bar{t}$ production, with smaller contributions arising from events with only one hadronically decaying top quark or from $t\bar{t}$ production in association with a $W$, $Z$, or Higgs boson.
The multijet background in all the regions is estimated using a data-driven technique employing sidebands and control regions dominated by multijet events and originally developed to study boosted $t\bar{t}$ production [53,58,135]. This background is found to be the largest source of candidate events in the signal region and is determined iteratively, as described in Section 5.1.
The second-largest background in the SR consists of events in which a pair of boosted top quarks decay hadronically to produce two top-quark jets. This background is estimated using MC calculations normalized by the event yield in the NR after subtracting other backgrounds. As this subtraction requires an estimate of the multijet background, the background-subtracted event yield is determined iteratively, as described in Section 5.2.
The third-largest contribution in the SR is from the non-all-hadronic $t\bar{t}$ process where one top quark decayed semileptonically and the other hadronically. In this case, the final-state leptons are not reconstructed or are misidentified and not rejected by the electron and muon veto. The rate of this process, estimated using MC samples, is normalized using the observed $t\bar{t}$ event yield in the NR.
Other contributions from SM processes with at least one top-quark jet are estimated using MC samples as described in Section 5.3.

The data-driven multijet background estimate
The multijet background is estimated by a data-driven method using events from specifically chosen event-tagging states to estimate the multijet background event yields in the signal, normalization, and validation regions. The event-tagging states used are dominated by multijet backgrounds and have small contributions from events with one or more top quarks. These event-tagging states also have potential contributions from $T$-quark production of less than 1% for all choices of $T$-quark masses and couplings considered in this search. Contributions from $W/Z$+jets are negligible due to a combination of a relatively low cross-section for high-$p_{\mathrm{T}}$ hadronically-decaying bosons [136] and the tagging requirements. For a given event-tagging state, the number of events from all MC backgrounds is subtracted from the observed number of events with that event-tagging state. This provides an estimate of the number of multijet events in each event-tagging state. As noted above, the $t\bar{t}$ background subtracted from each region is normalized to the event yield in the NR that depends on the multijet estimate in that region. Hence, the multijet background and $t\bar{t}$ yield are calculated iteratively. This procedure is similar to the algorithm used in Ref. [58].
For example, consider the event-tagging state defined by requiring that the leading jet is top-quark-tagged with 1 $b$-tag and the second-leading jet is Higgs-boson-tagged with ≥2 $b$-tags. The method uses the numbers of multijet events $N_A$, $N_B$, $N_C$, and $N_D$, after MC background subtraction, in four regions A, B, C, and D. In region A the leading jet is neither top-quark-tagged nor Higgs-boson-tagged with no $b$-tags and the second-leading jet is neither top-quark-tagged nor Higgs-boson-tagged with no $b$-tags. In region B the leading jet is neither top-quark-tagged nor Higgs-boson-tagged with no $b$-tags and the second-leading jet is Higgs-boson-tagged with ≥2 $b$-tags. In region C the leading jet is top-quark-tagged with 1 $b$-tag and the second-leading jet is neither top-quark-tagged nor Higgs-boson-tagged with no $b$-tags. In region D, which is one of the SR event-tagging states, the leading jet is top-quark-tagged with 1 $b$-tag and the second-leading jet is Higgs-boson-tagged with ≥2 $b$-tags. If the tagging efficiencies of the two large-$R$ jets are uncorrelated, then the ratio of the numbers of multijet events in two distinct event-tagging states that differ only by the tagging state for one of the large-$R$ jets will be independent of the tagging state of the other large-$R$ jet. In this example, the ratio of $N_D$ to $N_C$ is equal to the ratio of $N_B$ to $N_A$ since the ratios only differ by the tagging state of the leading large-$R$ jet. Hence, the number of multijet events in region D is
$$N_D = \frac{N_B \times N_C}{N_A}.$$
A corresponding method is performed for each of the event-tagging states of the SR and NR, and for all the validation regions using different event-tagging states to define regions A, B, and C. Since the $t\bar{t}$ background subtraction in regions A, B, and C is normalized to the $t\bar{t}$ event yield in the NR, which requires an estimate of the multijet background, the calculation of the multijet background and the $t\bar{t}$ event yield in each region is iterated as described in Section 5.2.
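The sideband extrapolation and its interplay with the $t\bar{t}$ normalization can be sketched in Python as follows. This is a simplified illustration rather than the procedure used in the analysis: it assumes the extrapolation takes the form of the product of the yields in regions B and C divided by the yield in region A, treats the normalization region as a single target region with one set of sidebands, omits the correlation corrections, and uses hypothetical dictionary keys and function names.

```python
def abcd_estimate(n_a, n_b, n_c):
    """Return the extrapolated multijet yield n_b * n_c / n_a for the target region."""
    if n_a <= 0:
        raise ValueError("region A must have a positive multijet yield")
    return n_b * n_c / n_a


def iterate_ttbar_scale(data, ttbar_mc, other_mc, tol=1e-6, max_iter=100):
    """Iterate the ttbar scale factor and the multijet estimate until stable.

    Each argument is a dict of yields keyed by "A", "B", "C" (sidebands)
    and "NR" (normalization region). Returns (scale, multijet yield in NR).
    """
    scale = 1.0
    for _ in range(max_iter):
        # Multijet yields in the sidebands after subtracting scaled MC.
        multijet = {
            r: data[r] - scale * ttbar_mc[r] - other_mc[r] for r in ("A", "B", "C")
        }
        mj_nr = abcd_estimate(multijet["A"], multijet["B"], multijet["C"])
        # Re-derive the ttbar scale from the background-subtracted NR yield.
        new_scale = (data["NR"] - mj_nr - other_mc["NR"]) / ttbar_mc["NR"]
        if abs(new_scale - scale) < tol:
            return new_scale, mj_nr
        scale = new_scale
    raise RuntimeError("ttbar normalization did not converge")
```

The iteration converges quickly when the $t\bar{t}$ contamination of the sidebands is small, since a change in the scale factor then has only a minor effect on the multijet estimate.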
The assumption of uncorrelated jet-tagging states is only approximately true. The level of correlation is determined by examining ratios of the numbers of events with specific event-tagging states that do not overlap with the signal region, normalization region, or validation regions, shown as the gray event-tagging states in Figure 2. Corrections derived from the observed correlations between the jet-tagging states defined by the top-quark, Higgs-boson, and $b$-tagging criteria are applied to the multijet background estimates for each of the event-tagging states that define the signal region, the normalization region, and the eight validation regions, with the total corrections varying from 1.01 to 1.10 with uncertainties ranging from 0.03 to 0.06. In the calculation of the multijet background for the signal-region event-tagging state illustrated above, four corrections are applied as a product. The multijet estimates are calculated independently for each of the four event-tagging states that make up the signal region and the normalization region, after which they are summed. This provides a fully data-driven multijet background estimate in each region.
For example, to calculate the correlation between the mistagging probabilities when the leading jet is top-quark-tagged and the second-leading jet is Higgs-boson-tagged, the event yields $N_E$, $N_F$, $N_G$, and $N_H$ in four regions E, F, G, and H are considered. In region E both the leading and the second-leading jet are neither top-quark-tagged nor Higgs-boson-tagged and have no $b$-tags. In region F the leading jet is top-quark-tagged with no $b$-tags and the second-leading jet is neither top-quark-tagged nor Higgs-boson-tagged and has no $b$-tags. In region G the leading jet is neither top-quark-tagged nor Higgs-boson-tagged and has no $b$-tags, and the second-leading jet is Higgs-boson-tagged with no $b$-tags. In region H the leading jet is top-quark-tagged with no $b$-tags and the second-leading jet is Higgs-boson-tagged with no $b$-tags. The ratio between the numbers of events in regions E and F is related to the ratio of events in regions G and H by
$$\frac{N_E}{N_F} = k \, \frac{N_G}{N_H},$$
where $k$ is the measure of the correlation in mistagging probabilities between the leading jet being top-quark-tagged and the second-leading jet being Higgs-boson-tagged, with both large-$R$ jets having no associated $b$-tags. The value of $k$ in this example is 0.976 ± 0.004, where the uncertainty is statistical only, and it is applied as a correction to the multijet background estimate.
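A minimal Python sketch of such a correlation measurement follows. It assumes the correction factor is defined as the double ratio (N_E / N_F) / (N_G / N_H), so that it equals the directly counted doubly tagged yield divided by the uncorrelated prediction. It also assumes simple Poisson error propagation that ignores the MC subtraction. The function name and the yields are hypothetical.

```python
import math


def correlation_factor(n_e: float, n_f: float, n_g: float, n_h: float):
    """Return (k, statistical uncertainty) for k = (N_E / N_F) / (N_G / N_H).

    The uncertainty assumes independent Poisson counts in the four regions.
    """
    for n in (n_e, n_f, n_g, n_h):
        if n <= 0:
            raise ValueError("all region yields must be positive")
    k = (n_e / n_f) / (n_g / n_h)
    rel_unc = math.sqrt(1.0 / n_e + 1.0 / n_f + 1.0 / n_g + 1.0 / n_h)
    return k, k * rel_unc


# Hypothetical yields, chosen only so that k is close to the quoted central value.
n_e, n_f, n_g, n_h = 1_000_000.0, 50_000.0, 20_000.0, 976.0
k, k_unc = correlation_factor(n_e, n_f, n_g, n_h)

# With this convention, k multiplies the uncorrelated prediction N_F * N_G / N_E.
corrected = k * n_f * n_g / n_e
print(f"k = {k:.3f} +/- {k_unc:.3f}, corrected estimate = {corrected:.1f}")
```

The hypothetical yields are far smaller than those needed to reach the statistical precision quoted in the text, so the printed uncertainty is not comparable to the published one.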
Each correlation is measured in an analogous way using the numbers of events in pairs of event-tagging states. The pairs chosen have MC background contributions of less than 8% of the observed event yield, thus reducing the systematic uncertainties arising from the subtraction of the MC background contributions. The multijet background estimate, taking into account the tagging correlations, is calculated bin-by-bin for each distribution so that the shape of the multijet background distribution is measured as well as the total background event yield.
Since the multijet background depends on the $t\bar{t}$ background subtraction, the two are determined iteratively as described in the next section.

Top-quark pair yields and multijet backgrounds in the normalization and signal regions
Previous measurements of the $t\bar{t}$ differential cross-sections for highly boosted top quarks [58] show that the observed cross-section is lower than MC predictions by ∼20%. To avoid the uncertainty this would create in the $t\bar{t}$ background contribution and the multijet estimates in each region, the ratio of the observed rate to the predicted rate of $t\bar{t}$ events in the normalization region, $\mu_{\text{norm}}$, is used to normalize the predicted $t\bar{t}$ background contributions in the signal region, the validation regions, and the event-tagging states used for the multijet estimate.
The value of $\mu_{\text{norm}}$ is determined after the initial multijet estimate, which uses the nominal $t\bar{t}$ prediction, by requiring the predicted event yield in the normalization region to match the observed yield. However, the multijet estimate itself is a function of $\mu_{\text{norm}}$, as the estimation technique described in the previous section requires the subtraction of the $t\bar{t}$ background contribution in the multijet-dominated event-tagging states during its calculation. Thus, both the multijet estimate and $\mu_{\text{norm}}$ are calculated iteratively using
$$\mu_{\text{norm}}^{i+1} = \frac{N_{\text{Data}} - N_{\text{Multijet}}(\mu_{\text{norm}}^{i}) - N_{\text{top-related}}}{N_{t\bar{t}\,\text{MC}}},$$
where $\mu_{\text{norm}}^{i}$ is the value of $\mu_{\text{norm}}$ resulting from the $i$th iteration, $N_{\text{Data}}$ is the observed event yield in the normalization region, $N_{\text{Multijet}}(\mu_{\text{norm}}^{i})$ is the data-driven multijet background event yield from the $i$th iteration in the normalization region, $N_{\text{top-related}}$ is the sum of the backgrounds from single-top-quark, $t\bar{t}+W$, $t\bar{t}+Z$, and $t\bar{t}+H$ production that are estimated by MC calculations in the normalization region, and $N_{t\bar{t}\,\text{MC}}$ is the sum of the $t\bar{t}$ events with all-hadronic and non-all-hadronic decays in the normalization region.
In each iteration of the multijet estimate, $N_{t\bar{t}\,\text{MC}}$ is scaled by $\mu_{\text{norm}}^{i+1}$ before subtraction. This calculation converges to subpercent level in four iterations to a value of $\mu_{\text{norm}}$ = 0.82 ± 0.01, where only statistical uncertainties are considered. This is consistent with cross-section measurements of boosted $t\bar{t}$ production [57]. The $t\bar{t}$ contribution predicted by the MC calculations in the signal region is scaled by $\mu_{\text{norm}}$.
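The iteration can be sketched as a fixed-point calculation in Python. The sketch assumes a single uncorrected ABCD estimate for the normalization-region multijet yield and uses entirely hypothetical yields and names, so it illustrates the structure of the procedure rather than the analysis implementation.

```python
def multijet_estimate(mu: float, control: dict) -> float:
    """ABCD multijet yield with the ttbar MC in each control state scaled by mu.

    `control` maps "A", "B", "C" to (observed yield, ttbar MC yield).
    """
    n_a, n_b, n_c = (obs - mu * tt for obs, tt in (control[r] for r in "ABC"))
    return n_b * n_c / n_a


def update_mu(mu: float, n_data: float, n_top: float, n_tt_mc: float, control: dict) -> float:
    """One iteration: mu_next = (N_data - N_multijet(mu) - N_top_related) / N_ttMC."""
    return (n_data - multijet_estimate(mu, control) - n_top) / n_tt_mc


def iterate_mu(n_data, n_top, n_tt_mc, control, tol=1e-4, max_iter=50):
    """Iterate from the nominal prediction (mu = 1) until the change is below tol."""
    mu = 1.0
    for n_iter in range(1, max_iter + 1):
        mu_next = update_mu(mu, n_data, n_top, n_tt_mc, control)
        if abs(mu_next - mu) < tol:
            return mu_next, n_iter
        mu = mu_next
    raise RuntimeError("normalization factor did not converge")


# Hypothetical yields, not taken from the analysis.
control_states = {"A": (1.0e6, 1000.0), "B": (40000.0, 2000.0), "C": (40000.0, 2000.0)}
mu_norm, n_iterations = iterate_mu(10000.0, 200.0, 10000.0, control_states)
print(f"mu_norm = {mu_norm:.3f} after {n_iterations} iterations")
```

The iteration converges quickly in this toy setup because the multijet-dominated control states contain only a small fraction of top-quark pair events, so the multijet estimate depends weakly on the normalization factor.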
The resulting $t\bar{t}$ yield estimates are 8587 ± 1369 events and 174 ± 35 events in the normalization and signal regions, respectively, where the uncertainties include the systematic uncertainties described in Section 6. This estimate of the $t\bar{t}$ yield in the signal region is used only for the iterative multijet background estimate.
The multijet yields in the normalization and signal regions after this iterative calculation are estimated to be 1452 ± 57 and 316 ± 9 events, respectively. The uncertainties in the multijet estimates, including the uncertainties in the tagging correlations, consist of the statistical uncertainties in the event-tagging states used for the calculation and the systematic uncertainties arising from the MC background subtraction, as described in Section 6.

Other top-quark backgrounds
Single-top-quark production in the $Wt$ final state and the $t$-channel represents a small contribution to the total background prediction, which is estimated using the Powheg+Pythia 8 MC calculation described in Section 3. The $s$-channel single-top-quark process is not included as an explicit contribution because of its small cross-section and because a part of it is already taken into account in the data-driven multijet estimate.
The uncertainty in the single-top-quark background is increased by 50% to account for the uncertainty in this contribution.
The estimated single-top-quark yields in the normalization and signal regions are 93 ± 52 and 8 ± 6 events, respectively.
The backgrounds from production of a top-quark pair in association with a $W$, $Z$, or Higgs boson are also estimated using the MC samples described in Section 3. The estimated yields in the normalization and signal regions are 115 ± 25 events and 9 ± 2 events, respectively.

Validation of background calculations
Kinematic variables with the ability to distinguish between $t\bar{t}$ and multijet contributions in the normalization region and the validation regions are examined to further validate the background modeling. The potential contribution of $T$-quark production to these regions is <1%. The distributions of the mass of the leading small-$R$ jet associated with the leading large-$R$ jet for events in the normalization region, where both large-$R$ jets are top-quark-tagged and have ≥1 $b$-tags, are shown in Figure 3(a). A $W$-mass peak is observed, which arises when the $W$-boson decay products are collimated into a small-$R$ jet, along with a low-mass peak arising from light quarks and bottom quarks. A shoulder is seen around the top-quark mass, which arises from a small number of highly boosted top-quark jets where all the decay products of the top quark are clustered into the small-$R$ jet. The observed distribution is well-modeled with a large $t\bar{t}$ contribution and a smaller multijet contribution. The invariant mass distribution of the leading large-$R$ jet in the same sample, shown in Figure 3(b), confirms the interpretation that this region is dominated by $t\bar{t}$ production.
The distributions of the jet mass for the leading small-$R$ jet associated with the leading large-$R$ jet are shown in Figure 4 for validation regions 1 through 4 (VR1 to VR4). The relative sizes of the $t\bar{t}$ and multijet contributions vary between these validation regions, further testing the robustness of their modeling and normalization. There is agreement between the observed and predicted distributions in both normalization and shape, except for a small excess in the prediction of events for VR1. This is further discussed in Section 6.2.
Further validation of the multijet background estimates is illustrated in Figure 5, where the distributions of the invariant mass of the two leading jets, or dijet system, are shown for the four validation regions dominated by multijet backgrounds. The distributions for events with a top-quark-tagged jet with no $b$-tags and a Higgs-boson-tagged jet with ≥2 $b$-tags (VR5 and VR6) are shown in Figures 5(a) and 5(b), respectively. Distributions for events with a jet with 1 $b$-tag, but no top-quark or Higgs-boson tag, and another jet with a top-quark tag and ≥2 $b$-tags (VR7 and VR8) are shown in Figures 5(c) and 5(d), respectively. There is also agreement between the observed and predicted distributions.

Figure 4: Distributions of the jet mass of the leading small-$R$ jet associated with the leading large-$R$ jet in four validation regions: (a) VR1 defined by requiring that the leading large-$R$ jet is top-quark-tagged with 1 $b$-tag and the second-leading jet is Higgs-boson-tagged with 1 $b$-tag, (b) VR2 defined by requiring that the leading large-$R$ jet is top-quark-tagged with 1 $b$-tag and the second-leading jet is neither Higgs-boson-tagged nor top-quark-tagged with 1 $b$-tag, (c) VR3 defined by requiring that the leading large-$R$ jet is top-quark-tagged with 1 $b$-tag and the second-leading jet is top-quark-tagged with no $b$-tag, and (d) VR4 defined by requiring that the leading large-$R$ jet is top-quark-tagged with 1 $b$-tag and the second-leading jet is Higgs-boson-tagged with no $b$-tag. The predicted distribution includes the estimated backgrounds and a hypothetical $T$-quark signal with $m_T$ = 1.6 TeV and $\kappa_T$ = 0.5. The blue hashed lines correspond to the sum in quadrature of the statistical and systematic uncertainties of the prediction in a given bin. The lower panels show the ratio of the data to the prediction, along with the uncertainty in the ratio. A ratio outside the bounds of the axis is represented by a blue arrow. The last bin includes the event overflows. Contributions to the predicted yield are stacked in the same order as they appear in the legend.

Figure 5: Dijet invariant mass distributions for the two large-$R$ jets in four validation regions: (a) VR5 defined by requiring a leading jet Higgs-boson-tagged with ≥2 $b$-tags and second-leading jet top-quark-tagged with no associated $b$-tag, (b) VR6 defined by requiring a leading jet top-quark-tagged with no associated $b$-tag and second-leading jet Higgs-boson-tagged with ≥2 $b$-tags, (c) VR7 defined by requiring a leading jet top-quark-tagged with ≥2 $b$-tags and second-leading jet neither top-quark-tagged nor Higgs-boson-tagged with 1 $b$-tag, and (d) VR8 defined by requiring a leading jet neither top-quark-tagged nor Higgs-boson-tagged with 1 $b$-tag and second-leading jet top-quark-tagged with ≥2 $b$-tags. The predicted distributions include the estimated backgrounds and a hypothetical $T$-quark signal with $m_T$ = 1.6 TeV and $\kappa_T$ = 0.5. The blue hashed lines correspond to the sum in quadrature of the statistical and systematic uncertainties of the prediction in a given bin. The lower panels show the ratio of the data to the prediction, along with the uncertainty in the ratio. A ratio outside the bounds of the axis is represented by a blue arrow. The last bin includes the event overflows. Contributions to the predicted yield are stacked in the same order as they appear in the legend.

Systematic uncertainties
Systematic uncertainties that affect the interpretation of the data are estimated using data and MC samples. Variations corresponding to a $+1\sigma$ and $-1\sigma$ confidence interval are derived for each uncertainty.
These systematic uncertainties are broken down into detector-related and modeling uncertainties. They do not have a significant dependence on the choice of $T$-quark mass and coupling, so an example of the size of the systematic uncertainties arising from the likelihood fit described in Section 7 (the "post-fit" results) is provided in Table 1 for $m_T$ = 1.6 TeV and $\kappa_T$ = 0.5.

Detector-related uncertainties
The most significant detector-related systematic uncertainties arise from the measurements of jet properties and tagging efficiencies.
Uncertainties associated with the large-$R$ jets arise from the jet energy scale (JES), jet mass scale (JMS), jet mass resolution (JMR), jet energy resolution (JER), and the JVT requirement. The uncertainties in the JES, JMS, and JMR are evaluated by using in situ measurements [125]. The JES is measured in events where a large-$R$ jet recoils against well-defined reference objects (photons, $Z$ bosons, or calibrated small-$R$ jets). The JMS and JMR uncertainties are measured using both a double-ratio method that compares the calorimeter-to-tracker response ratio between data and simulation [125] and a fit to the $W$-boson mass peak in high-$p_{\mathrm{T}}$ lepton+jets $t\bar{t}$ events. The JER uncertainty is measured by studying dijet mass resolution and the effect of energy flow near the jet radius [128]. The JVT uncertainty arises from the correction factors used to match the efficiencies in the MC samples to data.
The efficiency for tagging $b$-jets is measured in data using dilepton $t\bar{t}$ events [49]. Correction factors are applied to the jets in the MC sample so that the $b$-jet tagging efficiency as a function of jet $p_{\mathrm{T}}$ in MC events matches that in data events. Uncertainties arising in the evaluation of the efficiencies are propagated to the correction factors. The largest source of $b$-jet tagging uncertainty is the extrapolation of tagging efficiencies to $b$-jets with $p_{\mathrm{T}}$ > 300 GeV, as $b$-jet tagging calibrations use data with $p_{\mathrm{T}}$ < 300 GeV.
The efficiency and rejection power of the DNN top-quark tagger are measured in data, and correction factors are applied to MC events to match the measured efficiencies [137]. These corrections take into account the correlations between the tagging efficiencies and other jet observables such as the jet energy and mass. The uncertainties in these corrections are treated as systematic uncertainties.
The efficiency of the $\tau_{21}$ requirement used for the Higgs-boson tagger is measured using the calorimeter-to-tracker response double-ratio method [125]. The corresponding uncertainty, which is approximately 2%, is included in the uncertainty of the Higgs-boson-tagger efficiency.
The relative uncertainty in the integrated luminosity is determined to be 1.7% [75], obtained using the LUCID-2 detector [76] for the primary luminosity measurements.

Modeling and background uncertainties
The most significant modeling uncertainties arise from the MC calculations of the $t\bar{t}$ production process and decay into the all-hadronic final states, the modeling of the non-all-hadronic $t\bar{t}$ background, the cross-sections for processes producing smaller backgrounds involving at least one top quark, and the multijet background estimates.
The $t\bar{t}$ background estimate has systematic uncertainties from initial/final-state radiation (ISR/FSR), the renormalization scale, factorization scale, PDF, parton-shower algorithm, matrix-element calculation, and $h_{\text{damp}}$ parameter value. The effects of ISR/FSR, renormalization scale, and factorization scale uncertainties are evaluated using the method described in Section 3. The PDF uncertainties are evaluated by use of the PDF4LHC15 Hessian uncertainties, where the 30 variations are combined into one nuisance parameter. Uncertainties arising from the choice of parton-shower and hadronization algorithms are evaluated by comparing the nominal Powheg+Pythia 8 sample with the Powheg+Herwig sample. The uncertainty arising from the matrix-element calculation is assessed by comparing the nominal MC sample with the MadGraph5_aMC@NLO+Pythia 8 sample.
Although the non-all-hadronic $t\bar{t}$ background is relatively small in the signal and normalization regions, a 5% excess of predicted events relative to the data is observed in VR1, defined by the event-tagging state with the leading large-$R$ jet top-quark-tagged with 1 $b$-tag and the second-leading jet Higgs-boson-tagged with 1 $b$-tag. This validation region is estimated to have a non-all-hadronic $t\bar{t}$ background fraction of approximately 15%, and it is possible that the observed excess is due to mismodeling of this background. A conservative uncertainty of 62%, which covers the excess if it is attributed entirely to the non-all-hadronic $t\bar{t}$ background, is applied to the size of this background in the signal and normalization regions.
The uncertainty in the multijet background estimate is approximately 4%, as described in Section 5.2. The uncertainty in the predicted single-top-quark background estimate is 75%, while the uncertainty in the predicted $t\bar{t}+W/Z/H$ background estimate is 22%, as described in Section 5.3.

Results
The dijet invariant mass formed from the tagged large-$R$ jets is interpreted as a combination of the expected SM backgrounds and a $T$-quark signal. The dijet mass in the signal region is the invariant mass of the Higgs-boson and top-quark candidates, while in the normalization region it is the invariant mass of the two top-quark candidates. The dijet invariant mass distributions for the signal and normalization regions are shown in Figure 6, assuming a $T$-quark signal contribution with $m_T$ = 1.6 TeV and $\kappa_T$ = 0.5 scaled to the theory cross-section of 41 fb. The overall acceptance times efficiency of $T$-quark detection in the all-hadronic final state is 1.6% for this choice of mass and couplings, taking into account the kinematic requirements and tagging efficiencies. The predicted background rates and shapes are in good agreement with the observed distributions.
The dijet mass is used as a discriminant in the signal and normalization regions to test for the presence of a $T$-quark signal. Two parameters of interest are defined: $\sigma_{\text{obs}}$, the observed cross-section for single production of a $T$ quark, and $\mu_{\text{fit}}$, the $t\bar{t}$ background normalization in the signal and normalization regions.
A binned-likelihood fit is performed in which a $T$-quark signal and the background model are fitted to the signal-region dijet mass distribution and, simultaneously, the background model is fitted to the normalization-region dijet mass distribution. The fit is performed for events with a dijet mass greater than 1 TeV. The fit model in the signal region is the sum of the background distributions and a $T$-quark signal distribution with a given mass, coupling, and signal cross-section $\sigma_{\text{obs}}$. In the normalization region the very small contribution from the $T$-quark signal is neglected. The signal cross-section is allowed to take negative values in the fit whereas $\mu_{\text{fit}}$ is constrained to be positive. The fit of the $t\bar{t}$ background in the signal and normalization regions measures $\mu_{\text{fit}}$ using both regions and thus provides a scaled $t\bar{t}$ background contribution in the signal region.

Table 1: Size of the post-fit uncertainties in the $T$-quark signal cross-section for a $T$-quark mass of 1.6 TeV and coupling $\kappa_T$ = 0.5. The fitted cross-section is −10 fb and is consistent with zero. The background uncertainty is the sum in quadrature of the systematic uncertainty on the multijet background and the statistical uncertainties on the MC-derived backgrounds. The total uncertainty of 25 fb is the sum in quadrature of the total systematic uncertainty and statistical uncertainty. The uncertainty arising from simultaneously fitting the $t\bar{t}$ normalization factor is included in the total systematic uncertainty. The individual uncertainties do not add up in quadrature to the total uncertainty because of their correlations in the fit.
The fit incorporates the systematic uncertainties as Gaussian nuisance parameters. Additional bin-by-bin uncertainties are included to account for the statistical uncertainties in the predicted multijet and MC backgrounds. The $t\bar{t}$ contributions to the signal and normalization regions are fully correlated in the fit. The likelihood is then profiled [138] as a function of each nuisance parameter and used as the test statistic to determine the statistical significance of the fit results. Figure 7 shows the dijet mass distributions for the signal and normalization regions after the fit (post-fit) assuming a signal hypothesis with $m_T$ = 1.6 TeV and $\kappa_T$ = 0.5. The observed and predicted event yields in the signal and normalization regions are given in Table 2. The fitted value of $\mu_{\text{fit}}$ = 0.79 ± 0.12 is consistent with the $t\bar{t}$ normalization factor $\mu_{\text{norm}}$ = 0.82 ± 0.01 determined from the background-subtracted event yield in the normalization region (the uncertainty on $\mu_{\text{norm}}$ is statistical only). There is good agreement between the predicted post-fit signal-region background yield of 494 ± 22 events and the observed yield of 471 events, consistent with no significant excess in data above SM backgrounds over the entire invariant mass distribution, as seen in Figure 7(a). The fit of the $m_T$ = 1.6 TeV and $\kappa_T$ = 0.5 signal hypothesis results in a fitted signal cross-section of $\sigma_{\text{obs}}$ = −10 ± 25 fb, further confirming no excess of events at masses around 1.6 TeV.
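The structure of such a likelihood can be sketched in Python as follows. This is a simplified illustration under stated assumptions: a single Gaussian nuisance parameter with a hypothetical 10% effect on the top-quark pair yield, three bins per region, no bin-by-bin statistical terms, and invented templates. It only evaluates the negative log-likelihood and does not perform the minimization or profiling.

```python
import math


def poisson_nll(n: float, nu: float) -> float:
    """Negative log of the Poisson probability of n events given expectation nu."""
    if nu <= 0:
        raise ValueError("expected yield must be positive")
    return nu - n * math.log(nu) + math.lgamma(n + 1.0)


def nll(sigma: float, mu: float, theta: float, data_sr, data_nr, model: dict) -> float:
    """Two-region binned negative log-likelihood.

    sigma: signal cross-section in fb (may be negative), mu: ttbar normalization,
    theta: one Gaussian-constrained nuisance parameter acting on the ttbar yield.
    """
    total = 0.5 * theta ** 2  # Gaussian constraint term, constant dropped
    tt_scale = mu * (1.0 + model["tt_unc"] * theta)
    # Signal region: signal + ttbar + other backgrounds.
    for n, s, tt, other in zip(data_sr, model["sig_per_fb"], model["tt_sr"], model["other_sr"]):
        total += poisson_nll(n, sigma * s + tt_scale * tt + other)
    # Normalization region: signal neglected, same ttbar scale (fully correlated).
    for n, tt, other in zip(data_nr, model["tt_nr"], model["other_nr"]):
        total += poisson_nll(n, tt_scale * tt + other)
    return total


# Hypothetical templates (events per bin); sig_per_fb is the signal yield per fb.
toy_model = {
    "tt_unc": 0.10,
    "sig_per_fb": [0.1, 0.5, 0.2],
    "tt_sr": [60.0, 30.0, 10.0],
    "other_sr": [100.0, 50.0, 20.0],
    "tt_nr": [3000.0, 1500.0, 500.0],
    "other_nr": [500.0, 250.0, 100.0],
}
# Background-only pseudo-data equal to the expectation at sigma=0, mu=1, theta=0.
toy_sr = [160.0, 80.0, 30.0]
toy_nr = [3500.0, 1750.0, 600.0]

nll_background_only = nll(0.0, 1.0, 0.0, toy_sr, toy_nr, toy_model)
nll_with_signal = nll(20.0, 1.0, 0.0, toy_sr, toy_nr, toy_model)
print(f"NLL(background only) = {nll_background_only:.2f}, NLL(sigma = 20 fb) = {nll_with_signal:.2f}")
```

Because the same scale factor multiplies the top-quark pair templates in both regions, the high-statistics normalization region dominates the constraint on that factor, which is the role the simultaneous fit plays in the text.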
Similarly, fit results with $T$-quark cross-sections consistent with zero are obtained for $T$-quark masses between 1.0 and 2.3 TeV and for $\kappa_T$ values from 0.1 to 1.6. Based on these fit results, for 1.0 < $m_T$ < 2.3 TeV there is no significant evidence of a $T$ quark decaying to the $Ht$ final state.
The fit results are used to set 95% CL upper limits on the single-$T$-quark production cross-section for 1.0 < $m_T$ < 2.3 TeV and 0.1 < $\kappa_T$ < 1.6 using the CL$_{\text{s}}$ method [139]. The predicted cross-sections assume a singlet $T$ quark with a $T \to Ht$ branching fraction of 1/4. Figure 8 shows the 95% CL upper limits as a function of $m_T$ for different values of $\kappa_T$. The cross-section limits range from ∼10 fb to ∼200 fb, depending on $\kappa_T$. The decrease in sensitivity for masses from 1.0 to 1.2 TeV arises from the change in signal shape due to the $p_{\mathrm{T}}$ requirements on the Higgs-boson and top-quark candidate jets. The $p_{\mathrm{T}}$ requirements shape the distribution to peak at roughly 1.2 TeV, which can be seen in Figures 6 and 7. Figure 9 shows the exclusion limits as a function of $m_T$ and $\kappa_T$. Figure 10 shows the observed and expected 95% CL limits on the $T$-quark mass as a function of the $T$-quark width-to-mass ratio $\Gamma/m_T$ and the branching fraction for $T$-quark decay into a Higgs boson and a top quark.
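The logic of a CLs upper limit can be illustrated with a single-bin counting experiment in Python. This is a deliberately simplified sketch: it uses the observed event count as the test statistic and ignores systematic uncertainties and shape information, unlike the profile-likelihood fit used in the analysis. The function names and the yields are hypothetical.

```python
import math


def poisson_cdf(n_obs: int, lam: float) -> float:
    """P(n <= n_obs) for a Poisson distribution with mean lam > 0."""
    return sum(
        math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1.0))
        for k in range(n_obs + 1)
    )


def cls_value(n_obs: int, background: float, signal: float) -> float:
    """CLs = CL_{s+b} / CL_b for a counting experiment."""
    cl_sb = poisson_cdf(n_obs, signal + background)
    cl_b = poisson_cdf(n_obs, background)
    return cl_sb / cl_b


def upper_limit(n_obs: int, background: float, alpha: float = 0.05) -> float:
    """Signal yield at which CLs falls to alpha (95% CL for alpha = 0.05)."""
    low, high = 0.0, 1.0
    while cls_value(n_obs, background, high) > alpha:
        high *= 2.0  # expand until the hypothesis is excluded
    for _ in range(60):  # bisection; CLs decreases monotonically with signal
        mid = 0.5 * (low + high)
        if cls_value(n_obs, background, mid) > alpha:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


# Hypothetical counting experiment: 10 events observed, 10 expected from background.
limit_events = upper_limit(10, 10.0)
print(f"95% CL upper limit on the signal yield: {limit_events:.2f} events")
```

Dividing such a yield limit by the integrated luminosity and the acceptance times efficiency would convert it into a cross-section limit. Dividing by CL_b is what prevents a downward background fluctuation from excluding signals to which the search has no sensitivity.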
For the considered mass range of 1.0 to 2.3 TeV, the upper limit on allowed values of $\kappa_T$ rises from a minimum value of 0.3 at $m_T$ = 1.1 TeV to 1.6 for $m_T$ = 2.3 TeV.
At 95% CL, this analysis excludes $T$ quarks with $\Gamma/m_T$ ≥ 0.05 for 1.05 < $m_T$ < 1.2 TeV, with the mass limits rising with $\Gamma/m_T$ to exclude $m_T$ < 1.7 TeV for $\Gamma/m_T$ ≥ 0.5.

Figure 7: Post-fit dijet invariant mass distributions in the signal and normalization regions, showing the results of the model when fitted to the data. A $T$-quark hypothesis with $m_T$ = 1.6 TeV and $\kappa_T$ = 0.5 is used in the fit. Since the central value of the fitted $T$-quark cross-section is negative, the predicted mass distribution shows no contribution from the signal. The blue hashed lines correspond to the sum in quadrature of the statistical and systematic uncertainties of the prediction. The lower panels show the ratio of the data to the prediction, along with the uncertainty in the ratio. A ratio outside the bounds of the axis is represented by a blue arrow. The last bin includes the event overflows. Contributions to the predicted distributions are stacked in the same order as they appear in the legend.

Conclusion
A search is reported for the single production of a vector-like singlet $T$ quark decaying into a Higgs boson and a top quark, both of which decay hadronically. The search uses 139 fb$^{-1}$ of 13 TeV proton-proton collision data collected with the ATLAS detector at the LHC. The final states are fully reconstructed by clustering the decay products into two large-$R$ jets. The use of fully hadronic decays allows the direct reconstruction of the $T$-quark final state, increasing the signal-to-background ratio for the search. The results significantly extend the sensitivity for the production of $T$ quarks decaying fully hadronically. The search sensitivity is further improved by a larger dataset than used previously, tagging techniques with greater background rejection, and a data-driven multijet background estimate that reduces the uncertainty in the background modeling. The cross-section upper limits are typically a factor of 2 lower than those of previous searches.
The analysis is performed by searching for an excess above SM backgrounds in the invariant mass distribution. This distribution shows no evidence of significant contributions from single $T$-quark production and is consistent with the expected SM background sources. Therefore, limits are set at 95% CL on the production cross-section of a $T$ quark decaying to the $Ht$ final state. These depend on the $T$-quark mass and coupling to SM particles and range from ∼10 fb to ∼200 fb, depending on the assumed value for the couplings. In the resonance mass range between 1.0 and 2.3 TeV, the upper limit on the allowed coupling values rises with $m_T$ from a minimum value of 0.3 for $m_T$ = 1.1 TeV to 1.6 for $m_T$ = 2.3 TeV. This analysis excludes $T$ quarks with $\Gamma/m_T$ ≥ 0.05 for 1.05 < $m_T$ < 1.2 TeV, with the mass limits rising with $\Gamma/m_T$ to exclude $m_T$ < 1.7 TeV for $\Gamma/m_T$ ≥ 0.5.
These results provide significantly improved mass and coupling limits on vector-like quark models involving a $T$ quark decaying into a Higgs boson and a top quark. The exclusion limits set by this analysis extend the limits set by previous searches.