LHC Study of Third-Generation Scalar Leptoquarks with Machine-Learned Likelihoods


To date, none of these experimental analyses, even with sophisticated ML tools, has found significant deviations from the standard model (SM) predictions, so it is legitimate to ask whether unbinned methods have the potential to improve on these binned methods. In order to address this question, we work within the framework of Machine-Learned Likelihoods (MLL) [135][136][137], a method that combines supervised ML classification techniques with likelihood-based inference tests, allowing one to estimate the experimental sensitivity of high-dimensional data sets without the need to bin the classifier output: the resulting one-dimensional signal and background probability density functions (PDFs) are instead extracted with kernel density estimators (KDE) [138,139]. It is well known that the major drawback of unbinned methods is the lack of a prescription for properly introducing nuisance parameters in the likelihood estimation, which makes them unsuitable for experimental analyses at present. As an initial approach to this issue, and mainly as a test of the stability of our results under a simplified treatment of some systematic uncertainties, we develop for the first time within the MLL framework a strategy to include them in the unbinned analysis and to estimate their impact on the calculation of signal significances.
In this article we apply the MLL framework to the study of the double production at the LHC of up-type and down-type scalar LQs exclusively coupled to third-generation SM fermions. LQs coupled only to the third generation are interesting in their own right and have been extensively studied from both experimental and phenomenological perspectives (see for instance [17,79,84,108,114,134]). For this scenario we consider the decays of up-type and down-type LQs into tν τ /bτ and bν τ /tτ final states, respectively, and compute the expected exclusion limits in the [m(LQ 3 ), BR(LQ u/d 3 → qℓ)] plane, where BR(LQ u/d 3 → qℓ) is the branching fraction of a given channel into a quark and a tau lepton and m(LQ 3 ) is the mass of the LQ. It is worth stressing that the focus on third-generation LQs is not due to a limitation of the MLL framework. In fact, we expect the framework to be suitable, and even promising, for the study of LQs coupled to other or mixed generations, although dedicated searches for specific signals and backgrounds, beyond the scope of this work, would be necessary to assess the quantitative impact.
The paper is structured in the following way. In Section 2 we present the main features of the LQ model we work with and the LHC experimental setup in which we perform our phenomenological analysis. In Section 3 we describe the signal and background simulation and our event selection criteria. Section 4 is devoted to our collider analysis with the unbinned MLL method. Our results are shown in Section 5, where we incorporate into the MLL method an approach to the inclusion of systematic uncertainties, considering only those that directly affect the most relevant variables with which we feed the ML classifier, individually and without correlations. We also show the prospects for the LHC with a center-of-mass energy of 14 TeV. Finally, the manuscript ends in Section 6, where we discuss our main conclusions.

Phenomenological Framework
Leptoquarks naturally arise as new hypothetical fields in many extensions of the SM, such as Grand Unification Theories (GUTs), technicolor scenarios, or composite models. LQs are either scalar or vector fields and can couple to quark-lepton currents; they therefore mediate local interactions between quarks and leptons, this feature being their main signature. Following Ref. [12], the effective theory containing the most general couplings, invariant under SU(3) × SU(2) × U(1), for scalar leptoquarks which couple third-generation quarks to leptons while preserving baryon and lepton number is defined by the Lagrangian given therein.

In Ref. [82] the ATLAS Collaboration presented a search for new phenomena at the LHC in processes involving tau leptons, b-jets, and missing transverse momentum in the final state. The analyzed data set corresponds to proton-proton collisions at a center-of-mass energy of √ s = 13 TeV and an integrated luminosity of 139 fb −1 . The results were interpreted within two simplified benchmark model scenarios, one of which considers the pair production of up-type or down-type scalar leptoquarks. In both cases, the LQs were assumed to couple only to third-generation SM fermions following the minimal Buchmüller-Rückl-Wyler model, yielding the decays LQ u 3 → tν τ /bτ for up-type LQs, and LQ d 3 → bν τ /tτ for down-type LQs. The model parameters consisted of the masses of the LQs, m(LQ 3 ), and their respective branching fractions for decays into a quark and a charged lepton, β = BR(LQ u/d 3 → qℓ).

We are interested here in the single-tau signal region reported by the ATLAS Collaboration, which corresponds to the analysis involving the LQ simplified model. In particular, in order to explore the impact on the exclusion limits in the [m(LQ 3 ), BR(LQ u/d 3 → qℓ)] plane, we focus on the multi-bin signal region, which is the one where those limits were obtained by ATLAS. This signal region is characterized by selection requirements on the missing transverse momentum (E miss T ), the tau-lepton transverse momentum (p T (τ )), the sum of the transverse momenta of the tau lepton and the two leading jets (s T ), the tau-lepton transverse mass (m T (τ )), and the sum of the transverse masses of the b-jets, Σ m T (b 1 , b 2 ). In the next section we will discuss the details concerning these requirements, as well as the specifics and simulation of the backgrounds. Besides, we will also elaborate on the simulation of the signal data sets, corresponding to LQ masses ranging from 800 GeV up to 1800 GeV and a fixed value of BR(LQ u/d 3 → qℓ) = 0.5 that encompasses different values of the LQ couplings. We will also explain how we extended the analysis to other branching fractions.

Event Simulation and Selection
The proposed signal under study is the double production of up-type or down-type leptoquarks at the LHC. In the up-type case, one of the leptoquarks decays into a top quark and a neutrino and the other one into a bottom quark and a τ lepton. In the down-type case, one of the leptoquarks decays into a bottom quark and a neutrino and the other one into a top quark and a τ lepton. The simulated leptoquark samples were generated with MadGraph5_aMC@NLO [140] at leading order in QCD with the NNPDF2.3LO PDF set [141], at a center-of-mass energy of 13 TeV. The UFO model is the one developed in [142]. Events were processed with Pythia [143,144] for parton showering and hadronization, and Delphes [145] for fast detector simulation, using the default ATLAS card.
In order to reproduce the conditions of the search in [82], we considered the same object identification criteria defined by the ATLAS Collaboration, summarized in Table 1. To validate our event generation pipeline, we reproduced the p T (τ ) distribution of [82] for m(LQ u 3 ) = 1.2 TeV and β = 0.5 in the single-tau final state. This final state is defined by the presence of exactly one τ had and at least two b-jets (n b ≥ 2), with a light-lepton veto (e/µ) and a lower bound on E miss T of 280 GeV. There is also a lower bound on the sum of the transverse masses of the b-jets, Σ m T (b 1 , b 2 ), where b 1 and b 2 are the two leading b-tagged jets and m T is the usual transverse mass built from the object's transverse momentum and the missing transverse momentum. The aforementioned p T (τ ) distribution uses the multi-bin signal region, defined by bounds on the τ transverse momentum as well as on its transverse mass and on s T = p T (τ ) + p T (j 1 ) + p T (j 2 ), where j 1 and j 2 are the two leading jets. We fairly reproduced the shape of the p T (τ ) distribution, fitting the results with an additional global normalization factor. The bounds on these variables are shown in Table 2. Background events were simulated at LO in QCD with MadGraph5_aMC@NLO, and subsequently processed with Pythia and Delphes. We considered the main contributions in the multi-bin single-tau signal region: t t (with 1 real τ had ), fake-t t (with no real τ had ), single-top, W +jets, t tH, and t tV . It is important to mention that, as a simplifying assumption of this work, no QCD corrections are taken into account in the signal and background processes. The ATLAS study [82], on the other hand, considers two extra jets in the signal simulation, while backgrounds are generated with the POWHEG Box [146][147][148][149].
For our multivariate analysis, the event selection criteria are loosened to fully exploit the discrimination power of machine-learning classifiers. We used the same criteria for jet, b-jet, and lepton identification. We keep working with the single-tau final state (n τ had = 1, n b ≥ 2, no light leptons, and E miss T > 280 GeV), but without any requirement on the high-level observables Σ m T (b 1 , b 2 ), m T (τ had ), and s T . The lower bound on p T (τ ) was also relaxed to 20 GeV. In Table 2 we show both selection strategies. These loose cuts also ease data simulation, since the ATLAS cuts for the single-tau multi-bin region are very tight and retain only a few Monte Carlo events per simulation: we find that a fraction of ∼ 10 −4 of the generated background events survives the ATLAS cuts, while with the loose cuts the surviving fraction is two orders of magnitude larger. For the signal events the impact is milder, from ∼ 10 −1 − 10 −2 to ∼ 10 −1 , since the ATLAS cuts are designed to define a signal-enriched region. In Fig. 1 we present the expected number of events for both selection criteria at the LHC with √ s = 13 TeV and 139 fb −1 . For the signals, we have chosen as benchmark m(LQ u/d 3 ) = 1.2 TeV and β = 0.5. The backgrounds are shown in descending order of relevance for the loose cuts, and we can see that the hierarchy of the main backgrounds is modified with respect to the ATLAS cuts. Another important characteristic is that the signal-to-background ratio decreases significantly. However, this is intentional: it highlights that for the multivariate analysis one does not need to design very tight signal regions, which could inadvertently remove significant information that would otherwise help in the discrimination task.

We simulated events over the mass range m(LQ 3 ) ∈ [800, 1800] GeV, selecting benchmark points in steps of 200 GeV in mass, with a fixed value of β = 0.5. For each benchmark point, we simulated enough events to end up with ∼500k events after the selection cuts described above. Similarly, we simulated enough background events to obtain a ∼500k data sample at detector level, taking into account the relative weight of each background channel.
Since the signal samples were generated with β = 0.5, both leptoquark decay channels, into a quark and a neutrino or into a quark and a charged lepton, are possible. These events can be reweighted to different branching fractions to derive limits in the [m(LQ 3 ), β] parameter space. To this end, for every m(LQ 3 ) value we simulated small samples of events, ∼50k, spanning the range β ∈ [0, 1], in order to reweight the β = 0.5 data according to their relative cross sections after selection cuts.
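At its core, this β-reweighting can be sketched as a per-event probability ratio: for pair production, an event in which n of the two LQs decayed into a quark and a charged lepton occurs with probability β^n (1 − β)^(2−n). The snippet below is a minimal sketch under that assumption; the helper name and bookkeeping are illustrative and do not reproduce the acceptance-dependent weights of the actual code [152]:

```python
import numpy as np

def beta_weight(n_charged, beta, beta_gen=0.5):
    """Per-event weight to reweight a sample generated at branching
    fraction beta_gen to a target beta. n_charged is the number of LQs
    in the event (0, 1, or 2) that decayed to a quark and a charged lepton."""
    def prob(b, n):
        # each of the two LQs decays to q + charged lepton with probability b
        return b**n * (1.0 - b) ** (2 - n)
    return prob(beta, n_charged) / prob(beta_gen, n_charged)

# example: event where one LQ decayed to b tau and the other to t nu
w = beta_weight(n_charged=1, beta=0.3)
```

Summing such weights over the β = 0.5 sample (after selection cuts) then yields the expected event counts at any other β.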

Analysis Strategy with Unbinned Machine-Learned Likelihoods
A large but simple set of discriminating variables is used to feed the ML algorithm: the p T , η, and ϕ of the reconstructed τ had , b 1 , and b 2 (the two leading b-tagged jets); the missing transverse momentum information, E miss T and ϕ(E miss T ); the number of identified jets, n jets ; the number of b-tagged jets, n b ; and the hadronic activity H T = Σ i p T (j i ), summed over all identified jets. Following the same motivation explained for the 'loose' event selection criteria, notice that we are not considering any high-level observable. As an example, in Fig. 2 we present the signal (m(LQ u 3 ) = 1.2 TeV and β = 0.5) and background distributions of the most relevant variables for the ML discrimination, p T (τ ), E miss T , and H T , as we will see below (the distributions for LQ d 3 are similar). For each value of m(LQ 3 ), we trained a supervised per-event binary classifier, using the XGBoost toolkit [150,151], with 500k events per class (balanced data set), to distinguish signal (S) from background (B). In the background sample, we weight each background channel by its relative contribution after applying our selection cuts. For further details about the algorithm employed, the code is available at [152]. In the left panel of Fig. 3 we present the feature importance score, using the gain metric, which measures the relative contribution of the corresponding feature: a higher value implies the feature is more important for generating a prediction. We employ the same data set as in Fig. 2 and, as anticipated, p T (τ ), E miss T , and H T are the most relevant features. For different values of m(LQ 3 ) there are small changes in the hierarchy, although the general trend is not modified. The output of the ML classifier, o(x), for this example is shown in the right panel of Fig. 3 when tested with new samples of pure background or pure signal events, blue and red histograms, respectively. The output, o(x) ∈ [0, 1], quantifies whether an event is signal-like (near 1) or background-like (near 0). Within our setup, we obtained an area under the receiver operating characteristic (ROC) curve of 0.992.
To estimate the exclusion limits in this work, we will exploit the entire discriminant ML output, comparing a binned and an unbinned approach. The binning process has an inherent drawback: the information on the probability density of each event inside a bin is lost, which in turn impacts the likelihood estimation and can decrease the significance. This loss of information can be reduced by making the bin width small enough, but this solution is usually limited by the finite statistics, which renders the binned likelihood density estimation unreliable for a large number of bins. Unbinned methods, on the other hand, not only preserve the granularity of individual data points, potentially offering a more accurate representation, but also allow greater flexibility in capturing complex distributions and subtle patterns in the data, because they do not average information across bins.
For the histogram-based approach, the traditional Binned Likelihood (BL) method [153] is used, where a likelihood function is built as the product of Poisson probability functions over the bins. The full ML output is binned to find the expected number of signal and background events in each bin (see the histograms in the right panel of Fig. 3).
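For such a product of Poisson terms, the median expected exclusion significance has a closed form on the background-only Asimov data set (following the standard asymptotic formulae, as in the conventions we read from Ref. [153]). A minimal sketch, with purely illustrative bin contents:

```python
import numpy as np

def binned_exclusion_Z(s, b):
    """Median expected exclusion significance of a binned Poisson
    likelihood on the background-only Asimov data set:
    Z = sqrt( 2 * sum_d [ s_d - b_d * ln(1 + s_d / b_d) ] )."""
    s, b = np.asarray(s, float), np.asarray(b, float)
    return np.sqrt(2.0 * np.sum(s - b * np.log1p(s / b)))

# toy per-bin expected signal/background counts from a binned o(x) histogram
Z = binned_exclusion_Z([9.0, 4.0, 1.0], [3.0, 10.0, 40.0])
```

For s ≪ b this reduces to the familiar s/√b per bin, added in quadrature across bins.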
For the unbinned approach, the strategy applied in this work is based on the MLL framework, which can be schematically summarized as follows:

• It uses a binary classifier to discriminate between signal and background (in this case XGBoost), fed with event-by-event variables. This converts a high-dimensional problem into a one-dimensional one, based on the score of the classifier output.
• From the entire unbinned classifier output, we estimate the signal and background PDFs, p s (o(x)) and p b (o(x)), using KDE to fit the classifier output when tested with pure signal or pure background samples, respectively (see the dashed curves in the right panel of Fig. 3).
• Knowing the signal and background PDFs, we compute the likelihood function of the hypothesis tests of interest. MLL includes both discovery and exclusion tests, which allows us to estimate the signal significance for discovery (5σ) and evidence (3σ), and also to impose exclusion limits at 95% confidence level (CL, 1.64σ) [153].
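In MLL-like setups the unbinned likelihood for a signal strength µ is typically an extended mixture of the two KDE densities. The sketch below is our schematic reading of that construction, not the actual implementation of Ref. [152]; in particular, fixing the best-fit µ to zero in the exclusion test statistic is a simplification:

```python
import numpy as np

def unbinned_log_like(mu, o, ps, pb, S, B):
    """Log of the extended unbinned likelihood (up to a constant ln N!):
    L(mu) = Pois(N | mu*S + B) * prod_i [mu*S*ps(o_i) + B*pb(o_i)] / (mu*S + B),
    which simplifies to -(mu*S + B) + sum_i ln(mu*S*ps(o_i) + B*pb(o_i)).
    ps, pb: callables returning the KDE score densities under S and B."""
    return -(mu * S + B) + np.sum(np.log(mu * S * ps(o) + B * pb(o)))

def unbinned_exclusion_Z(o_bkg, ps, pb, S, B):
    """Exclusion significance on a background-like sample, fixing the
    best-fit mu to 0 for simplicity: Z = sqrt(2 [ln L(0) - ln L(1)])."""
    q = 2.0 * (unbinned_log_like(0.0, o_bkg, ps, pb, S, B)
               - unbinned_log_like(1.0, o_bkg, ps, pb, S, B))
    return np.sqrt(max(q, 0.0))
```

As a sanity check, when ps and pb are identical the event-level terms carry no information and the expression collapses to the counting-only result 2[S − B ln(1 + S/B)] on a sample of N = B events.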
Even though the KDE method is called a non-parametric density estimation method, because it does not assume a specific functional form for the underlying distribution, it involves one parameter known as the bandwidth. This parameter controls the degree of smoothness of the estimated density function, and can be chosen or estimated from the data. A larger bandwidth leads to a smoother estimate, while a smaller bandwidth results in a more variable estimate that is sensitive to local fluctuations. Throughout our work, we selected the bandwidth parameters individually for the signal and background PDFs with a grid search using the GridSearchCV function of the sklearn.model_selection python package [154], which returns the bandwidth that maximizes the data likelihood in a 5-fold cross-validation strategy. For further details about the implementation of the unbinned method and the KDE algorithm see [152], where the code for this work is available.
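The bandwidth selection just described can be reproduced with a few lines of scikit-learn. The scores below are toy Beta-distributed numbers standing in for a background-like classifier output, and the grid range is illustrative:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(1)
# toy stand-in for background-like classifier scores, peaked near 0
scores = rng.beta(2.0, 8.0, size=2000)[:, None]

search = GridSearchCV(
    KernelDensity(kernel="gaussian"),
    {"bandwidth": np.logspace(-3, 0, 20)},
    cv=5,  # 5-fold CV maximizing the held-out log-likelihood
).fit(scores)

kde = search.best_estimator_
p_b = np.exp(kde.score_samples(scores))  # estimated density pb(o(x))
```

The same fit, run separately on the pure-signal sample, yields ps(o(x)); the two densities then enter the unbinned likelihood directly.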

Results
Assuming there is no significant excess, we use the exclusion hypothesis test and compute the exclusion limits for each point in the parameter space of the scalar LQ mass, m(LQ 3 ), and its branching fraction into a quark and a charged lepton, β = BR(LQ u/d 3 → qℓ). We define the exclusion limits at 95% CL as the curve where the significance equals 1.64σ.
In Fig. 4 we show the expected exclusion limits using a binned likelihood method (green) and an unbinned approach (red), both considering the full output of the ML classifier, o(x). The colored regions show the impact of the statistical uncertainty (±1σ stat ); no systematic uncertainties were included in this calculation. As a reference, we also present the expected exclusion-limit contours obtained by ATLAS [82] through a binned likelihood test (multi-bin signal region), considering high-level physics-based variables and the stringent selection cuts in Table 2 instead of the ML output. In our implementation of the binned likelihood method we chose to work with 16 bins, as in the ATLAS search, which turns out to be very close to the maximum number of equal-sized bins allowed when requiring at least 5 background events per bin, a common practice to ensure statistically robust and reliable results at LHC experiments. We have checked nonetheless that the results of the binned likelihood method do not change significantly when increasing the number of bins up to this maximum value.
For both types of leptoquarks, the expected exclusion contours extend to masses of ∼1.28 TeV and ∼1.36 TeV for the binned and unbinned methods, respectively, for intermediate values of BR(LQ u/d 3 → qℓ). Since the fraction of events with exactly one tau lepton decreases when BR(LQ u/d 3 → qℓ) → 0 or 1, the signal acceptance drops, which leads to weaker mass exclusions. The multivariate analysis shows a tendency towards a possible improvement of the exclusion limits set by the ATLAS Collaboration. Moreover, the unbinned method provides more stringent expected exclusion limits over the entire parameter space, although it is computationally more expensive.
We want to highlight that we have performed a series of cross-checks to assess the robustness of our procedure, and our results do not change significantly. Instead of working with a single ML output, o(x), we averaged over the outputs of ten independent ML realizations, ⟨o(x)⟩ = (1/10) Σ 10 i=1 o i (x), where each ML was trained with an independent data set. The estimation of p s,b (o(x)) using the KDE over the averaged variable ⟨o(x)⟩ turns out to be slightly smoother than the estimation over a single ML output o(x) (shown, for example, in the right panel of Fig. 3).
We have also checked that numerical instabilities introduced by the bandwidth parameter do not affect our results. The influence of the bandwidth is expected to be less pronounced for large data sets, because the density estimate tends to converge to the true underlying distribution as the sample size increases. We employ 50k events to estimate the probability density functions with KDE using data cross-validation, and we have checked that increasing or decreasing the bandwidth by a factor of ∼ 1 − 10 with respect to the selected value does not significantly modify the results. The same conclusion holds if we fine-tune the bandwidth search (at more computational cost) by increasing the sample size and/or decreasing the step size used by the search algorithm.
Finally, with new and independent training/testing data sets we have cross-validated that the results using these samples are compatible within statistical fluctuations in both binned and unbinned methods.

Approach to the Inclusion of Systematic Uncertainties
This subsection aims to estimate the impact of some systematic uncertainties on our calculation of the expected exclusion limits. The procedure detailed below is intended as a first approximation to evaluate the stability of our results, especially for the unbinned method; in a multivariate-based approach, it deals directly with the relevant kinematic variables used to feed the ML classifier, instead of with the underlying nuisance parameters affecting them.
First, we need to translate the systematic uncertainties from the physics-based space to an uncertainty in the ML classifier output space. We consider only uncertainties that directly affect the features used to train our ML algorithm, and we neglect the correlations among them. Since the most relevant variables for the ML discrimination are the p T of the tau lepton, E miss T , and H T (see the left panel of Fig. 3), we consider systematic shifts of 5-10% [155] on each of these variables individually. Then, inspired by [156,157], the impact on the ML output is assessed by using the same test data set as in the original setup, with all the input variables unchanged except for the one chosen to be shifted: its value is increased or decreased by the same percentage in all the events of the data set. For example, to estimate the impact of ∆p T (τ ), the systematic uncertainty of the tau transverse momentum, we take the ML algorithm trained with no uncertainties and evaluate it on two new test samples with all variables unchanged but with p T (τ ) → p T (τ ) ± ∆p T (τ ), where ∆p T (τ )/p T (τ ) = 0.1. With this procedure, we obtain two ML outputs, o(x) ± . For the binned method, the uncertainty of the ML output in each bin d can be taken as ∆o(x d ) = |o(x d ) + − o(x d ) − |, and we can estimate the significance by introducing it into the profile-likelihood formulae [158]. For the unbinned case, we do not have an expression to compute the significance including systematic uncertainties. Nevertheless, we can estimate their impact by repeating the entire unbinned procedure twice, with o(x) + and o(x) − . To be conservative, we take as the final result the outcome that provides the less restrictive limit, considering individually the results for each possible shift in each of the three variables (p T (τ ), E miss T , and H T ). The variable that most affects the results is p T (τ ).
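The shifting procedure can be sketched generically: keep the trained model fixed and evaluate it on copies of the test set with a single feature scaled by ±10%. The helper below is illustrative (a logistic regression on toy data stands in for the trained XGBoost model):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def shifted_outputs(clf, X, idx, rel=0.10):
    """Re-evaluate a trained classifier on copies of the test set in which
    a single feature (e.g. pT(tau)) is scaled by +/- rel, all others fixed."""
    X_up, X_dn = X.copy(), X.copy()
    X_up[:, idx] *= 1.0 + rel
    X_dn[:, idx] *= 1.0 - rel
    return clf.predict_proba(X_up)[:, 1], clf.predict_proba(X_dn)[:, 1]

# toy demonstration: feature 0 drives the label, so shifting it matters
rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 4))
y = (X[:, 0] + 0.1 * rng.normal(size=1000) > 0).astype(int)
clf = LogisticRegression().fit(X, y)
o_up, o_dn = shifted_outputs(clf, X, idx=0, rel=0.10)
```

The two shifted outputs play the role of o(x)±: they are binned (or KDE-fitted) exactly like the nominal output to propagate the shift to the significance.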
These results are shown in Fig. 5. Comparing with the corresponding panels of Fig. 4, we can see that the impact on the exclusion contours is only a few per cent. Importantly, the effect is similar in both methods. This indicates that our treatment of systematic uncertainties in the unbinned case provides a good numerical approximation, despite the lack of an analytical expression as in the binned case, and it renders the limits shown in Fig. 4 stable. Finally, we have checked that including uncertainties in the other features does not significantly impact the results. We would like to remark, however, that not all systematic uncertainties originating from detector effects and theoretical assumptions were considered. Thus, although we expect a mild impact on the significance as long as the missing effects do not alter the most relevant variables in the ML discrimination, a full treatment including all sources and their correlations would be needed in a complete analysis.

Prospects for 14-TeV LHC
Next, we compare the expected exclusion limits obtained with √ s = 13 TeV and √ s = 14 TeV. We simulated new but small signal and background samples, ∼50k events per channel, with the same setup described in Section 3, but at a center-of-mass energy of 14 TeV and with an HL-LHC ATLAS card for the Delphes fast detector simulation. In Fig. 6 we show the ratio of the cross sections at √ s = 14 TeV and √ s = 13 TeV for different processes after selection cuts. For the 'loose cuts', the cross sections of all the backgrounds increase by a similar factor, i.e. the hierarchy is the same at both center-of-mass energies. This is not true for the 'ATLAS cuts', which involve cuts on high-level observables: although the overall background hierarchy is conserved, the relative weight of the single-top and W +jets channels increases significantly.
Regarding the leptoquark signal, its cross section increases by a similar factor for both selection cuts. Importantly, the signal-to-background ratio is larger at 14 TeV, which significantly impacts the exclusion-limit reach. We have checked this trend for all leptoquark masses and branching fractions.
Since the generation of a full new set of events to train the ML algorithms is computationally very expensive, we employed the 13 TeV data sets for the training stage, but used the cross-section values, relative weights of the background channels, relative weights of the LQ channels for different values of β, and expected numbers of signal and background events calculated at 14 TeV for the significance calculation in both binned and unbinned methods. In Fig. 7 we present the projected expected exclusion contours at 95% CL for √ s = 14 TeV with 300 fb −1 (dashed curves) and with 3000 fb −1 (dotted curves). These results include systematic uncertainties but, for the sake of simplicity, we do not include the statistical-uncertainty colored band. For comparison, we also include the limits for √ s = 13 TeV and 139 fb −1 (solid curves) that were shown in Fig. 5. For √ s = 14 TeV and 300 fb −1 , the expected exclusion contours extend to masses of ∼1.5 TeV and ∼1.6 TeV for the binned and unbinned methods, respectively, for intermediate values of BR(LQ u/d 3 → qℓ). For √ s = 14 TeV and 3000 fb −1 , these are extended to ∼1.65 TeV and ∼1.8 TeV. As previously pointed out, the unbinned method provides the most stringent constraints. Remarkably, in the right panel we can see that for LQ d 3 the mass limit for the binned case at √ s = 14 TeV and 3000 fb −1 would be the same as the limit established by the unbinned method with ten times less luminosity.

Conclusions
In this work, we have performed a collider analysis of LHC searches for pairs of scalar leptoquarks decaying into b-quarks and tau leptons. To carry out this phenomenological analysis, we have used an unbinned approach based on the so-called Machine-Learned Likelihoods method, in which we have incorporated, as a novelty, a simplified procedure for the inclusion of some systematic uncertainties. We remark that this method could also be applied, with promising prospects, to LQs coupled to other or mixed generations; however, studying these cases is not a goal of the present analysis.
Our strategy employs a binary classifier that discriminates between signal and background, with their PDFs estimated through KDE. Knowledge of the signal and background PDFs allows us to compute the likelihood function of the exclusion hypothesis test, in order to impose 95% CL exclusion limits on the parameter space defined by the LQ mass and its branching fraction into a third-generation quark and a third-generation charged lepton. The results of this unbinned method, for an LHC center-of-mass energy of 13 TeV and a luminosity of 139 fb −1 , show a tendency towards a potential improvement of the exclusion limits set by the ATLAS analysis.
A first approach to the inclusion of systematic uncertainties is made by translating them from the physics-based space to the ML classifier output space. We have considered only individual 5-10% uncertainties on the τ -lepton p T , E miss T , and H T , without correlations among them. The impact on the ML output was then assessed by evaluating the trained classifier on test data sets in which these variables were shifted within their estimated uncertainties, repeating the analysis for each variation. Our results indicate that the impact on the exclusion limits is slight and that the effect is similar in both binned and unbinned methods. Therefore, our approach to the inclusion of systematic uncertainties in the unbinned method provides a good numerical approximation, despite the lack of an analytical expression as in the binned analysis. We have also checked that the inclusion of uncertainties in the other variables does not significantly impact the results. Nevertheless, it is important to remark that a full treatment including all systematic sources and their correlations would be needed in a complete analysis. The simplified approach developed here shows that the results obtained without any systematic uncertainties remain stable when at least a rough estimate of some of them is included.
For the LHC at 13 TeV with 139 fb −1 and intermediate branching fractions, we find exclusion limits for leptoquark masses of ∼1.25 TeV and ∼1.3 TeV for the binned and unbinned methods, respectively. We have also estimated the prospects for the LHC at 14 TeV with luminosities of 300 fb −1 and 3000 fb −1 . For the lower luminosity, the 95% CL exclusion limits reach LQ mass values of ∼1.5 TeV and ∼1.6 TeV for the binned and unbinned methods, respectively. For 3000 fb −1 these limits are extended to ∼1.65 TeV and ∼1.8 TeV, with the unbinned method providing the most stringent constraints.

Figure 1 :
Figure 1: Expected number of events at the LHC with √ s = 13 TeV and 139 fb −1 for both types of LQs with m(LQ u/d 3 ) = 1.2 TeV and β = 0.5, and the main backgrounds considered in this work. Two sets of selection cuts are shown: in red the ones used throughout this work and in black the ones described by ATLAS [82].

Figure 2 :
Figure 2: Distributions of the most relevant variables for the ML discrimination: p T (τ ) (left panel), E miss T (middle panel), and H T (right panel). The signal distributions (red curves) correspond to an LQ u 3 with mass of 1.2 TeV and β = 0.5 as a benchmark. The stacked histograms show the SM background contributions. Minor backgrounds, i.e. fake-t t, t tV , and t tH, are grouped together.

Figure 3 :
Figure 3: Left panel: feature importance score (gain metric), which indicates how useful each feature was for the ML discrimination. Right panel: output of the XGBoost classifier when tested with pure background (blue) or pure signal (red) samples. The dashed curves correspond to the PDFs obtained with the KDE method. Both panels consider as signal the benchmark m(LQ u 3 ) = 1.2 TeV and β = 0.5.

Figure 4 :
Figure 4: Expected exclusion contours at the 95% CL for the third-generation up-type (left panel) and down-type (right panel) scalar leptoquarks, in the plane of the leptoquark mass, m(LQ u/d 3 ), and branching fraction into a quark and a charged lepton, BR(LQ u/d 3 → qℓ). The limits are derived using a binned likelihood method and an unbinned approach, both considering the entire output of the ML classifier, o(x). The colored regions represent ±1σ stat ; no systematic uncertainties are included. As a reference, we present the expected exclusion-limit contours obtained by ATLAS in [82].

Figure 5 :
Figure 5: Same as Fig. 4, but including systematic uncertainties as detailed in the main text.

Figure 6 :
Figure 6: Ratio of the cross sections at √ s = 14 TeV and √ s = 13 TeV for different processes: both types of LQs with m(LQ u/d 3 ) = 1200 GeV and the main backgrounds considered in this work. Two sets of selection cuts are compared: in red the ones used throughout this work and in black the ones described by ATLAS [82].

Figure 7 :
Figure 7: Projected expected exclusion contours at the 95% CL for √ s = 13 TeV and 139 fb −1 (solid curves), √ s = 14 TeV and 300 fb −1 (dashed curves), and √ s = 14 TeV and 3000 fb −1 (dotted curves). All results include systematic uncertainties as detailed in the main text. The rest of the conventions are as in Fig. 4.

Table 1 :
Object identification criteria employed in this work.

Table 2 :
Selection cuts considered by the ATLAS Collaboration [82], used to validate our pipeline, and the loosened cuts employed in our multivariate analysis.