Seeking leptoquarks in the $\bf t\bar{t}$ plus missing energy channel at the high-luminosity LHC

The $t\bar t$ plus missing energy channel is one of the most efficient for detecting third-generation leptoquarks (LQs). It offers an important test of models that explain the flavor anomalies in $B$ meson decays. We outline a search strategy in this channel, relying on tagging the tops and on observables constructed out of the tops, and we assess the reach on LQs of the future high-luminosity LHC program. We find that with 3 ab$^{-1}$ a vector (scalar) LQ decaying 50% (100%) to top and neutrino can be excluded up to masses of 1.96 TeV (1.54 TeV). We also indicate several observables that, in case of a future discovery in the channel, can be used to distinguish a scalar LQ from a vector LQ. Finally, we discuss the implications of our findings for models addressing the recent flavor anomalies.


Introduction
Leptoquarks (LQs), hypothetical particles carrying both lepton and baryon number, appear in a variety of theories beyond the Standard Model (BSM), such as the Pati-Salam model [1], grand unified theories [2] and BSM composite dynamics [3]. Recently, third-generation LQs have attracted special attention from the high-energy physics community, since they are the best candidates [4][5][6][7][8][9] to explain the anomalies in flavor physics observed in B-meson decays by Belle [10][11][12][13], BaBar [14,15] and LHCb [16][17][18]. In particular, the experiments find indications of lepton flavor universality violation in the rather clean ratio observables $R_{D^{(*)}}$, at about the 4σ level (combining the results of the different experiments), and $R_{K^{(*)}}$, at about the 2.5σ level. It is appealing that the anomalies can be explained simultaneously by models with LQs in the TeV range [19], and thus within the reach of the LHC. Furthermore, some models with LQs can also address the discrepancy with the SM in the muon magnetic moment [20][21][22]. Optimizing the search strategies for LQs at the LHC is thus very important to shed light on the physics behind the flavor anomalies and, more generally, to seek BSM physics. The general LQ phenomenology at hadron colliders has been explored in [23] and more recently in [24][25][26]. The relevant production mechanisms at the Large Hadron Collider (LHC) are pair production, driven by QCD interactions, and single production, mediated by model-dependent couplings of the LQs to leptons and quarks. Several searches in the pair production channel, which set bounds on the LQ masses, have been performed by ATLAS and CMS at the 13 TeV LHC [27][28][29][30][31], while the study in [32] considers the single production of third-generation LQs decaying into $b\tau$. Limits can also be obtained by recasting the results of searches for supersymmetry.
The strongest limits on 2/3-charged third-generation LQs are currently set by the CMS analysis in [28], which used 35.9 fb$^{-1}$ of data at a center-of-mass energy $\sqrt{s} = 13$ TeV. Ref. [28] reinterprets the results of a search for gluinos and squarks to constrain pair-produced LQs decaying into a neutrino plus a top, a bottom or a light jet. A vector LQ decaying 50% to $t\nu$ is excluded by this analysis for masses below 1530 GeV in the Yang-Mills (YM) case, and for masses below 1115 GeV in the minimal coupling (MC) scenario. A scalar LQ decaying 100% to $t\nu$ is excluded up to masses of 1020 GeV. In our study we aim to improve on the search strategy applied in that analysis, and we estimate the sensitivity of the LHC at a collision energy of 14 TeV and at high luminosity.
Projections of the reach of the High-Luminosity LHC (HL-LHC) and of future colliders [33] on different types of LQs have been presented in [34], considering pair production in the $\mu\mu jj$ channel, in [35] for pair and single production in the $b\mu\mu$ and $bb\mu\mu$ channels, and in [36] for a scalar LQ in the $bb\nu\nu$ and $cc\tau\tau$ channels. Estimates of the HL-LHC reach on scalar LQs, based on an extrapolation of the results of current experimental searches, have also been shown in [37]. A recent study has also analyzed the HL-LHC reach on a vector LQ in the $t\bar t$ plus missing energy channel [38]. Although it considers the same final state, our analysis and search strategy are different, relying more on the identification of the tops. In our study we consider pair-produced vector and scalar LQs, each decaying into a top and a neutrino, leading to a final state of two tops plus missing energy. This channel, thanks to its peculiar topology and to the possibility of exploiting top tagging to disentangle the signal from the background, proves to be very powerful and, as we will show, is one of the best channels to probe LQs involved in the explanation of the flavor anomalies. We outline a search strategy that relies on tagging the two tops in the final state, indicate the HL-LHC reach, and point out several observables that, in case of a future observation of the LQ signal, can distinguish between a scalar and a vector LQ. Finally, we present the implications of our results for models that explain the recent flavor anomalies.
The paper is organized as follows: we define our model setup in section 2, we define our search strategy in section 3 and present our results in section 4. Shape observables to distinguish between scalar and vector LQs are shown in section 5. In section 6 we discuss the implications of our findings to the flavor anomalies. We offer our conclusions in section 7.

Setup
LQ states can be classified [24,39] in terms of their spin (scalar or vector) and SM quantum numbers, $(SU(3)_c, SU(2)_L, U(1)_Y)$, where the electric charge, $Q = Y + T_3$, is the sum of the hypercharge ($Y$) and the third component of the weak isospin ($T_3$). In scenarios with baryon-number-violating couplings, these particles need to be very heavy in order to avoid the stringent limits on the proton lifetime. On the other hand, if baryon number symmetry is respected, LQ masses and couplings need to satisfy much weaker constraints, allowing the LQs to be considerably lighter. The phenomenology of LQs with O(1 TeV) masses is very rich, including potential signatures in flavor physics observables and in the direct searches performed at the LHC.
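As a quick check of the relation $Q = Y + T_3$, the following minimal Python sketch lists the electric charges of the components of an $SU(2)_L$ multiplet. The function name `electric_charges` is illustrative and not part of the original analysis; the quantum numbers in the example are those of the $S_3$ and $U_1$ states studied in this paper.

```python
from fractions import Fraction


def electric_charges(hypercharge, weak_isospin):
    """Return Q = Y + T3 for every component of a multiplet of isospin T.

    T3 runs from +T down to -T in integer steps.
    """
    n_components = int(2 * weak_isospin) + 1
    return [hypercharge + weak_isospin - k for k in range(n_components)]


# S3: SU(2)_L triplet (T = 1) with Y = 1/3
s3_charges = electric_charges(Fraction(1, 3), Fraction(1))
# U1: SU(2)_L singlet (T = 0) with Y = 2/3
u1_charges = electric_charges(Fraction(2, 3), Fraction(0))

print("S3:", [str(q) for q in s3_charges])
print("U1:", [str(q) for q in u1_charges])
```

Both multiplets contain a state with $|Q| = 2/3$, as charge conservation requires for a decay into a top quark and a neutrino.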
In this paper we are interested in the $t\bar t$ plus missing energy signature at the LHC. This process can be induced by pair-produced LQs, each of which then decays to a top quark and a neutrino. The production mechanism is dominated by gluon fusion, $gg \to \mathrm{LQ}\,\mathrm{LQ}^\dagger$, as illustrated in Fig. 1, which, for the scalar LQ, depends on a single parameter, the LQ mass. The QCD production of the vector LQ is controlled by a second parameter, $k$, which describes non-minimal interactions of the vector LQ $U_1$ with gluons and depends on the underlying dynamics. The branching fraction ($\mathcal{B}$) for $\mathrm{LQ} \to t\nu$ is model dependent. In Table 1, we list the LQ states that can decay to $t\nu$, along with the corresponding operators, which can arise via interactions with a lepton doublet ($L$) or a right-handed neutrino ($\nu_R$). Depending on the type of interaction and on the $SU(2)_L \times U(1)_Y$ representation of the LQ, one can derive the maximal value of $\mathcal{B}(\mathrm{LQ} \to t\nu)$ allowed by gauge symmetry, as listed in the last column of Table 1. For interactions with left-handed doublets this maximal value is either 50% or 100%, depending on whether the Yukawa coupling responsible for the decay also enters the decay into the $SU(2)_L$ counterpart; it can reach 100% if interactions with right-handed neutrinos are allowed.
Our analysis will be performed with two representative models, which can produce the same final state. Motivated by the B-physics anomalies, we consider (i) the scalar LQ $S_3 = (\bar{3}, 3, 1/3)$ and (ii) the vector LQ $U_1 = (3, 1, 2/3)$, which we now describe in detail.
The $S_3$ LQ has been considered in models addressing the B-physics anomalies with two scalar LQs [37,40]. The Yukawa Lagrangian of this model reads [24] where $\tau_k$ ($k = 1, 2, 3$) denote the Pauli matrices, $S_3^k$ are the components of the LQ triplet and $y_L$ is a generic Yukawa matrix. Note that we have neglected LQ couplings to diquarks in the above equation, since they would spoil proton stability [24]. An appropriate symmetry must be imposed to forbid these couplings, which are tightly constrained by experimental limits on the proton lifetime. It is convenient to recast the above expression in terms of charge eigenstates, namely, where $V$ is the CKM matrix.$^{1}$ The superscript denotes the electric charge of the LQ states. In this particular model, the branching fraction we are interested in reads where we have neglected the fermion masses and used the fact that $|V_{tb}| \gg |V_{ts}| \gg |V_{td}|$. We adopted a compact notation where $(y_L \cdot y_L^\dagger)_{ii} \equiv \sum_j |y_L^{ij}|^2$.
The $U_1$ model has attracted a lot of attention because it can provide a simultaneous explanation of the anomalies in $b \to s$ and $b \to c$ transitions with a single mediator [41]. The most general Lagrangian consistent with the SM gauge symmetry allows couplings to both left-handed and right-handed fermions, namely, where $x_L^{ij}$, $x_R^{ij}$ and $w_R^{ij}$ are Yukawa couplings. If we neglect the interactions with right-handed fields, we have, in the mass eigenstate basis: and we obtain that
$^{1}$ The PMNS matrix is not relevant to our study and has been set to the identity [25].
The QCD interactions that control the $U_1$ pair production are determined by the kinetic terms: where $U_1^{\mu\nu}$ denotes the $U_1$ field-strength tensor and $k$ is a dimensionless parameter which depends on the ultraviolet completion of the model. We can identify two scenarios: minimal coupling (MC), $k = 0$, and the Yang-Mills (YM) case, $k = 1$.
In the following, we will assume that the dominant interactions are those with third-generation left-handed fermions, as suggested by the B-physics anomalies. In this case, the branching fractions to $t\nu$ are 100% for $S_3$ and 50% for $U_1$, which are the most optimistic values. Nonetheless, our results can be rescaled and applied to more general flavor structures and to the other models listed in Table 1.
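The rescaling mentioned above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure of the original analysis: it assumes that both pair-produced LQs must decay to $t\nu$ (so the signal rate scales with the branching fraction squared) and that the selection efficiency does not change with the model. The function names and all numerical inputs are hypothetical.

```python
def expected_signal_events(pair_xsec_fb, br_t_nu, efficiency, lumi_fb):
    """Expected ttbar + missing-energy signal events from LQ pair production.

    Both LQs are required to decay to t nu, hence the factor br_t_nu**2.
    """
    return pair_xsec_fb * br_t_nu**2 * efficiency * lumi_fb


def rescale_events(n_ref, xsec_ref_fb, br_ref, xsec_new_fb, br_new):
    """Rescale a reference yield to another cross section and branching
    fraction, assuming an unchanged selection efficiency."""
    return n_ref * (xsec_new_fb / xsec_ref_fb) * (br_new / br_ref) ** 2


# Hypothetical inputs, chosen only to show the arithmetic:
# 1 fb pair cross section, B = 50%, 20% efficiency, 3000 fb^-1.
n_ref = expected_signal_events(1.0, 0.5, 0.2, 3000.0)
# Same luminosity, half the cross section, but B = 100%.
n_new = rescale_events(n_ref, 1.0, 0.5, 0.5, 1.0)
print(n_ref, n_new)
```

The same two factors, a cross-section ratio and a squared branching-fraction ratio, are all that is needed to move between the scenarios under these assumptions.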

Table columns: Field, Spin, Quantum Numbers, Operators.
Table 1: Classification of the LQ states that can decay to $t\nu$, in terms of the SM quantum numbers. We adopt the same notation as Ref. [24] and we omit color, weak isospin and flavor indices for simplicity. The last column corresponds to the maximal value of $\mathcal{B}(\mathrm{LQ} \to t\nu)$, as allowed by gauge symmetries. In the cases where interactions with lepton doublets ($L$) and right-handed neutrinos ($\nu_R$) are both allowed, i.e. for the models $U_1$ and $V_2$, we give the maximal branching fraction assuming only interactions with $L$ or $\nu_R$, respectively.

Search strategy
We outline a search strategy at the 14 TeV LHC for pair-produced scalar and vector LQs, each decaying into a top quark and a neutrino. In particular, we consider the $U_1$ and $S_3$ LQs introduced in section 2, assuming a branching ratio into $t\nu$ of 50% for $U_1$ and of 100% for $S_3$. For $U_1$, we analyze the YM scenario, $k = 1$. At the end, we also present the HL-LHC reach in the MC scenario, $k = 0$, which is calculated from the efficiencies obtained in our analysis by rescaling the number of signal events according to the different values of the production cross section. Fig. 2 shows the cross section values at the 14 TeV LHC for the QCD pair production of $S_3$ and of $U_1$ in both the YM and MC cases. We focus on a final state given by two hadronically decaying tops plus missing energy. The main background consists of Z + jets events where the Z decays to neutrinos and leads to missing energy. Minor backgrounds, which we also include in our analysis, come from W + jets and $t\bar t$ events, where a leptonically decaying W leads to missing energy from the neutrino and a lost lepton [28].$^{2}$
We simulate signal and background events at leading order with MadGraph5_aMC@NLO [42]. Events are then passed to Pythia [43] for showering and hadronization. We also apply a smearing to the jet momenta in order to mimic detector effects [44]. Signal events are generated via UFO files [45], created with FeynRules [46]. For the scalar LQ $S_3$, we apply correction factors to the cross section values, which account for QCD next-to-leading-order effects. We calculate them by using the code in [25]. Jets are clustered with FastJet [47] using the anti-$k_t$ algorithm [48]. We choose a large cone size, $R = 1.0$, which we identify as an optimal choice given the top reconstruction procedure that we apply.$^{3}$
$^{2}$ We checked that other backgrounds, such as QCD multijet events, give a negligible contribution.
$^{3}$ In our simulations we do not include initial state radiation or the underlying event. This is because, as shown for example in [49], the contamination of the jet invariant mass coming from these effects can be removed by applying techniques such as "grooming" [50]. We thus expect that our simulations correctly reproduce the distributions that an experimental analysis would find after the application of these advanced jet "cleaning" techniques.
The signal we want to detect is characterized by large missing transverse energy, $\not\!\!E_T$, and at least two fat jets, coming from the hadronic decays of the two tops. Considering these signal features, as a first step of our analysis we accept the events if they satisfy the conditions: with $n_j$ denoting the number of jets satisfying the $p_T$ and rapidity requirements. Events are rejected if at least one isolated lepton, either a muon or an electron, with $p_T > 10$ GeV and in the central region $|\eta| < 2.5$ is found (lep veto).$^{4}$
$^{4}$ We consider the lepton isolated if it is separated from a jet by $\Delta R > 0.4$. The choice of 10 GeV as a trigger threshold on the lepton $p_T$ is conservative for the evaluation of the W + jets and $t\bar t$ background contributions. Indeed, the current ATLAS trigger threshold is 7 GeV for electrons and 6 GeV for muons [51].
A crucial part of our search strategy relies on the reconstruction of both tops in the final state. To reconstruct the top pair we apply the following procedure. We first consider the invariant mass of the leading jet. Since the jets are clustered with a relatively large cone size and the tops in the signal are boosted, most of the top decay products are collected in a single fat jet. As we can see from the plot on the left in Fig. 3, the invariant mass of the $p_T$-leading jet, $j(1)$, is centered around the top mass for a large portion of the signal events. As a first step of the reconstruction procedure we thus select the events with the $j(1)$ invariant mass, $M_{j(1)}$, in the region [160 GeV, 220 GeV]. $j(1)$ is then identified with the $p_T$-leading top, $t(1)$. We then analyze the invariant mass of the second-leading jet. As evident from the plot on the right in Fig. 3, the majority of signal events are again centered around the top mass. If $M_{j(2)}$ is within the region [160 GeV, 220 GeV] we identify the second-leading top, $t(2)$, with $j(2)$. For a small portion of signal events $M_{j(2)}$ is instead centered around the W mass. In order to retain these signal events we consider the $j(2)$ invariant mass region [70 GeV, 110 GeV] and the system given by the second and the third $p_T$-leading jets. If the invariant mass of the $j(2)$ plus $j(3)$ system is within [160 GeV, 220 GeV], the system is identified with $t(2)$. We select the events where both tops, $t(1)$ and $t(2)$, have been identified with the outlined procedure. The efficiency of our top pair tagging is about 20% for the signal, while the background is rejected by a factor of about $1.4 \times 10^{3}$. Table 2 indicates the cross section values for signal and background after the acceptance cuts and after the reconstruction and tagging of the pair of top quarks.
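The top-pair reconstruction described above can be sketched in Python as follows. This is a minimal illustration, not the code used in the analysis: jets are represented as plain $(E, p_x, p_y, p_z)$ tuples in GeV, assumed to be already sorted by decreasing $p_T$, and the helper names (`four_vector`, `invariant_mass`, `tag_top_pair`) are hypothetical. Only the mass windows are taken from the text.

```python
import math

TOP_WINDOW = (160.0, 220.0)  # GeV, top-mass window quoted in the text
W_WINDOW = (70.0, 110.0)     # GeV, W-mass window quoted in the text


def four_vector(pt, eta, phi, mass):
    """Build (E, px, py, pz) in GeV from collider coordinates."""
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    energy = math.sqrt(px**2 + py**2 + pz**2 + mass**2)
    return (energy, px, py, pz)


def invariant_mass(*vectors):
    """Invariant mass of the sum of one or more four-vectors."""
    energy, px, py, pz = (sum(c) for c in zip(*vectors))
    return math.sqrt(max(energy**2 - px**2 - py**2 - pz**2, 0.0))


def in_window(mass, window):
    low, high = window
    return low <= mass <= high


def tag_top_pair(jets):
    """Return (t1, t2) four-vectors, or None if the tagging fails.

    `jets` must be sorted by decreasing pT.
    """
    if len(jets) < 2:
        return None
    j1, j2 = jets[0], jets[1]
    # Step 1: the leading jet must sit in the top-mass window.
    if not in_window(invariant_mass(j1), TOP_WINDOW):
        return None
    # Step 2a: the second jet is itself top-like.
    m_j2 = invariant_mass(j2)
    if in_window(m_j2, TOP_WINDOW):
        return j1, j2
    # Step 2b: the second jet is W-like; try to add the third jet.
    if in_window(m_j2, W_WINDOW) and len(jets) > 2:
        j23 = tuple(a + b for a, b in zip(j2, jets[2]))
        if in_window(invariant_mass(j23), TOP_WINDOW):
            return j1, j23
    return None
```

An event is kept only if `tag_top_pair` returns a pair; the returned four-vectors can then be used to build observables from the reconstructed tops.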
Having identified the tops $t(1)$ and $t(2)$, we construct several observables based on them, which efficiently discriminate the LQ signals from the background. We thus complete our signal selection by applying cuts on these "top observables". One of these observables, which we denote by $M_{T2}$, is inspired by the $M_{T2}$ variable commonly used in experimental searches [52]. In our study it is constructed from the tops, instead of from jets, and it is defined as where $p_T\,t(1,2)$ is the transverse momentum of the top $t(1,2)$ and $\Delta\phi(\not\!\!E, t(1,2))$ denotes the azimuthal angular separation between the missing energy vector and the top $t(1,2)$. We then consider as a signal-to-background discriminant the invariant mass of the system made of the two tops $t(1)$ and $t(2)$, which we denote by $M_{tt}$.

Table 2. Columns: Acceptance, Top Tagging.
After the top reconstruction we refine our signal selection by imposing the cuts
$\not\!\!E_T > 500$ GeV, $M_{tt} > 800$ GeV, (10)
which exploit the large missing energy and the large invariant mass of the top pair system in the signal events, and the two sets of cuts on the transverse momenta of the tops and on the $M_{T2}$ variable: loose : where the loose (tight) selection is applied to signals with masses up to (above) 1.4 TeV. Signal and background distributions for the relevant observables used in this analysis are shown in Fig. 5. Table 3 presents the cross section values for signal and background after the complete selection, namely after the top tagging plus the cuts in eq. (10) and the loose or tight selection.
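As an illustration of the cuts in eq. (10), the sketch below computes $M_{tt}$ from two reconstructed top four-vectors and applies the two thresholds. It is a minimal example under the assumption that the tops are available as $(E, p_x, p_y, p_z)$ tuples in GeV; the function names and the toy event are hypothetical, and the loose and tight selections are not included.

```python
import math

MET_MIN = 500.0  # GeV, missing transverse energy threshold of eq. (10)
MTT_MIN = 800.0  # GeV, top-pair invariant mass threshold of eq. (10)


def pair_invariant_mass(t1, t2):
    """M_tt from two four-vectors given as (E, px, py, pz) in GeV."""
    energy, px, py, pz = (a + b for a, b in zip(t1, t2))
    return math.sqrt(max(energy**2 - px**2 - py**2 - pz**2, 0.0))


def passes_eq10(met, t1, t2):
    """True if the event satisfies both cuts of eq. (10)."""
    return met > MET_MIN and pair_invariant_mass(t1, t2) > MTT_MIN


# Toy event: two back-to-back tops with pT = 600 GeV and mass 173 GeV,
# for which M_tt equals twice the top energy (about 1249 GeV).
energy = math.hypot(600.0, 173.0)
t1 = (energy, 600.0, 0.0, 0.0)
t2 = (energy, -600.0, 0.0, 0.0)
print(pair_invariant_mass(t1, t2), passes_eq10(650.0, t1, t2))
```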

HL-LHC reach
Based on our final results, shown in Table 3, we calculate the HL-LHC reach on LQs. In particular, we derive the values of the integrated luminosity needed to exclude at 95% confidence level (CL) or to observe at 3σ a scalar LQ $S_3$ and a vector LQ $U_1$, as a function of their mass. The exclusion reach at 95% CL is calculated by a goodness-of-fit test assuming a Poisson distribution for the events. The 3σ reach is estimated from the significance $S/\sqrt{B}$, with $S$ ($B$) the number of signal (background) events. The reach for a scalar LQ $S_3$ can be compared with the expected reach obtained by extrapolating the results of the CMS analysis [28], which has been presented in [37]. The CMS study makes use of variables such as the missing energy and others based on the $p_T$ of the jets in the final state, but does not apply any top tagging. Since the reach of our analysis is considerably larger than the one in [37], we conclude that the identification of the tops in the final state and the use of "top observables" for the signal-to-background discrimination can improve the LHC sensitivity to LQs. Furthermore, in our study we have applied a simple cut-and-count analysis, and we expect our results to be conservative. A more refined top reconstruction, making use for example of substructure techniques such as "jettiness" [53,54], or a statistical analysis of the shape of the relevant distributions considered in this study (see section 5), could extend the reach of the HL-LHC. We leave these analyses to a more specialized experimental work.
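The $S/\sqrt{B}$ estimate used for the 3σ reach translates directly into a required luminosity: since $S = \sigma_S L$ and $B = \sigma_B L$, a significance $Z$ is reached at $L = Z^2 \sigma_B / \sigma_S^2$. The Python sketch below shows this arithmetic only; the function names are illustrative, the cross sections are hypothetical placeholders rather than values from Table 3, and the Poisson-based 95% CL exclusion is not reproduced here.

```python
import math


def significance(sig_xsec_fb, bkg_xsec_fb, lumi_fb):
    """S / sqrt(B) for post-selection cross sections (fb) and luminosity (fb^-1)."""
    n_sig = sig_xsec_fb * lumi_fb
    n_bkg = bkg_xsec_fb * lumi_fb
    return n_sig / math.sqrt(n_bkg)


def lumi_for_significance(sig_xsec_fb, bkg_xsec_fb, z=3.0):
    """Luminosity (fb^-1) at which S / sqrt(B) reaches z."""
    return z**2 * bkg_xsec_fb / sig_xsec_fb**2


# Hypothetical post-selection cross sections, for illustration only.
sigma_s, sigma_b = 0.02, 0.05  # fb
lumi_3sigma = lumi_for_significance(sigma_s, sigma_b, z=3.0)
print(lumi_3sigma, significance(sigma_s, sigma_b, lumi_3sigma))
```

Because the significance grows as $\sqrt{L}$, a tenfold increase in luminosity improves $S/\sqrt{B}$ by a factor of about 3.2 at fixed cross sections.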

Distinguishing between scalar and vector LQs
We consider here the case where the HL-LHC discovers a LQ signal in the channel analyzed in this study, following the search strategy outlined in Sec. 3. In this section we discuss how to distinguish between the two possible signals, from a vector LQ and from a scalar LQ, and we indicate several observables that can be used for this purpose. A first category of observables exploits the difference in the energy of the final states coming from a scalar or a vector LQ. Indeed, due to the different scaling of the QCD pair production cross section with the mass, signals from a vector LQ and from a scalar one, where the vector is considerably heavier than the scalar, can be identified with a similar significance. For example, considering our results, we find that with 3 ab$^{-1}$ a 5σ discovery could be realized for either a vector LQ $U_1$, in the YM case, of about 1.7 TeV ($S/\sqrt{B} = 5.8$) or a lighter scalar $S_3$ of about 1.3 TeV ($S/\sqrt{B} = 5.1$). The observables we identify are constructed from the reconstructed pair of tops. Tagging the two tops is thus important not only to discover the LQs but also to characterize the signal. The "top observables" $M_{tt}$, $M_{T2}$ and the $p_T$ of the two tops, which we already used to disentangle the signal from the background, can also efficiently distinguish between $U_1$ and $S_3$. Another observable that we point out as a signal analyzer is an angular one, specifically the azimuthal angular separation between the two reconstructed tops, $\Delta\phi(tt)$, which is able to probe directly the spin of the LQs. Similar observables, but for different topologies and constructed from the leptonic decays of tops, have been considered to identify properties of dark matter interactions [55,56] and of Higgs couplings [57,58]. Fig. 5 shows the distributions of the "top observables" for the background and for the scalar and vector LQ signals. They have been obtained after the top tagging procedure and after applying the cuts $\not\!\!E_T > 500$ GeV, $M_{tt} > 800$ GeV.
We can see that the signal from $U_1$ is distributed at larger values of the top observables compared to the signal from $S_3$. The difference is particularly clear in the tails of the distributions. Fig. 6 shows the $\Delta\phi(tt)$ distribution for the background and for the $U_1$ and $S_3$ signals. The background tends to be more homogeneously distributed, while the vector signal is characterized by tops at larger azimuthal separation compared to the scalar case. The difference in the $\Delta\phi(tt)$ shape depends only on the spin of the LQs and does not change with the LQ masses, unlike the other "top observables" discussed above. $\Delta\phi(tt)$ is thus particularly helpful to distinguish between the signals from a scalar LQ $S_3$ and a vector $U_1$ in the MC scenario, $k = 0$, where the differences in the energy of the final states are less marked.
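For concreteness, the azimuthal separation between the two reconstructed tops and a unit-normalized histogram for shape comparison can be computed as in the following Python sketch. It assumes that the azimuthal angles of the tops are available in radians; the function names and the binning are illustrative choices, not those of the original analysis.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal separation between two directions, folded into [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi


def normalised_histogram(values, n_bins):
    """Fractions of entries per bin over [0, pi], for shape comparisons."""
    counts = [0] * n_bins
    for value in values:
        # Clamp so that value == pi falls into the last bin.
        index = min(int(value / math.pi * n_bins), n_bins - 1)
        counts[index] += 1
    total = len(values)
    return [c / total for c in counts] if total else [0.0] * n_bins


# Toy top-pair azimuthal angles (phi of t(1), phi of t(2)), in radians.
events = [(3.0, -3.0), (0.2, 3.1), (1.0, -2.0)]
separations = [delta_phi(a, b) for a, b in events]
print(normalised_histogram(separations, 4))
```

Normalizing each histogram to unit area isolates the shape information, which is what distinguishes the spin hypotheses independently of the signal rate.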

Implications to the flavor anomalies
LQs with masses in the TeV range are particularly interesting, since they can mediate flavor-violating processes that can explain the deviations from the Standard Model observed in the decays of B mesons. In this Section we compare the reach on LQs derived in our study for the HL-LHC with the regions of parameters favored by the B-physics anomalies.
First, we consider the model with a vector $SU(2)_L$ singlet. This simplified model is particularly attractive because it allows for a simultaneous explanation of the B-physics anomalies, both in $R_{D^{*}}$ and in $R_{K^{*}}$, with a single particle, the LQ $U_1$ (cf. [59] for a recent review). We adopt the ansatz of Ref. [41], based on a $U(2)_q \times U(2)_\ell$ flavor symmetry. In this case we have, for the couplings in (4), $x_L^{ij} \equiv g_U \beta^{ij}$, with $\beta^{ij} = \delta^{3i}\delta^{3j}$ up to small breaking terms of the flavor symmetry. Ref. [41] calculated the region in the coupling vs mass plane favored by the B-physics anomalies. We show in Fig. 7 which part of this region can be probed by our analysis in the $t\bar t\nu\nu$ channel at the HL-LHC. The plot on the left refers to the YM scenario for the QCD pair production of $U_1$, the plot on the right to the MC case. The green band is extracted from [41] and represents the 1σ region preferred by low-energy flavor observables; the gray band indicates the current limit on $m_{U_1}$, which comes from the CMS analysis [28] at $\sqrt{s} = 13$ TeV with 35.9 fb$^{-1}$. The blue lines are our lower bounds on the $U_1$ mass at $\sqrt{s} = 14$ TeV with a luminosity of 300 (dashed line) and 3000 (dotted line) fb$^{-1}$. We see that the HL-LHC can probe a large part of the parameter space in the $t\bar t\nu\nu$ channel. Note that our projections are slightly better than the ones with $\tau b$ final states presented in [41] for the MC scenario, confirming that the $t\bar t\nu\nu$ channel is very powerful to test models with LQs involved in the explanation of the flavor anomalies. If the HL-LHC program excludes a LQ $U_1$ in the $t\nu$ channel, the B-physics anomalies can only be explained by considering large couplings, $x_L^{33} > 1$.
Finally, we consider scalar LQs. Scalar LQs, either $SU(2)_L$ triplets, $S_3$, or singlets, $S_1$, are particularly motivated in scenarios with BSM composite dynamics and can accommodate the B anomalies [37,60]. In particular, $S_3$ is a good candidate to explain the anomaly in $R_{K^{*}}$ and $S_1$ the one in $R_{D^{*}}$ [59].
We focus on the model in [37]. From the $U(2)_q \times U(2)_\ell$ ansatz, the coupling in (2) reads $y_L^{ij} \equiv g_3 \beta^{ij}$, with $\beta^{ij} = \delta^{3i}\delta^{3j}$. We show in Fig. 8 the part of the coupling vs $m_{S_3}$ parameter space that can be probed by our results. The green band shows the 1σ flavor fit and is extracted from Ref. [37], the gray band indicates the current limit on $m_{S_3}$ from the CMS analysis [28], and the blue lines indicate the 95% CL limits on the $S_3$ mass derived in our study. We see again that our analysis at the HL-LHC can probe a large portion of the parameter space relevant to the B anomalies. We can also note, by comparing our results with the projected reach in different channels shown in Fig. 2 of [37], that the $t\bar t\nu\nu$ channel is one of the most powerful to test the relevant parameter space for the B anomalies.
Figure 7: Region of the $U_1$ parameter space favored by the B-physics anomalies compared to the 95% CL limits obtained in our analysis in the $t\bar t$ plus missing energy channel. The green band is extracted from [41] and represents the 1σ region preferred by low-energy flavor observables; the gray band indicates the current limit on $m_{U_1}$ from the CMS analysis [28] at $\sqrt{s} = 13$ TeV with 35.9 fb$^{-1}$. The blue lines are our lower bounds on the $U_1$ mass at $\sqrt{s} = 14$ TeV with a luminosity of 300 (dashed line) and 3000 (dotted line) fb$^{-1}$. The plot on the left (right) refers to the YM (MC) scenario.

Conclusions
We have studied the pair production of LQs in the $t\bar t$ plus missing energy channel, which is one of the most powerful to detect third-generation LQs. These particles, as highlighted in the recent literature, offer an explanation of the B-physics flavor anomalies. We have outlined a search strategy in this channel, which shows the advantages of tagging the tops in the final state and which uses observables constructed out of the tops to discriminate the signal from the background and to characterize the signal. We have then assessed the reach on LQs of the future high-luminosity LHC program. Our results, presented in Fig. 4, show that with 3 ab$^{-1}$ (300 fb$^{-1}$) the HL-LHC can exclude a vector LQ in the YM scenario, decaying 50% to top and neutrino, up to 1.96 TeV (1.72 TeV), or observe at 3σ the corresponding signal for masses up to 1.83 TeV (1.6 TeV). In the MC case, a vector LQ with a mass up to 1.62 TeV (1.4 TeV) can be excluded with 3 ab$^{-1}$ (300 fb$^{-1}$). For a scalar LQ decaying exclusively into top and neutrino, the exclusion reach extends up to 1.54 TeV (1.3 TeV) with 3 ab$^{-1}$ (300 fb$^{-1}$), while scalar LQs as heavy as 1.41 TeV (1.16 TeV) can be observed at 3σ. We have further presented several observables, again constructed out of the tops, that, in case of a future discovery in the channel, can be used to distinguish between a scalar and a vector LQ (Figs. 5 and 6). Finally, we have discussed the implications of our results for models addressing the recent B-physics anomalies. The search in the $t\bar t\nu\nu$ channel proves to be a very powerful test of these models, with the possibility to constrain a large part of the interesting parameter space.
Figure 8: Region of the $S_3$ parameter space favored by the B-physics anomalies compared to the 95% CL limits obtained in our analysis in the $t\bar t$ plus missing energy channel. The 1σ region preferred by low-energy flavor observables (green band) is extracted from [37]; the gray band indicates the current limit on $m_{S_3}$ from the CMS analysis [28] at $\sqrt{s} = 13$ TeV with 35.9 fb$^{-1}$. The blue lines are our lower bounds on the $S_3$ mass at $\sqrt{s} = 14$ TeV with a luminosity of 300 (dashed line) and 3000 (dotted line) fb$^{-1}$.