Dijet resonance search with weak supervision using $\sqrt{s}=13$ TeV $pp$ collisions in the ATLAS detector

This Letter describes a search for narrowly resonant new physics using a machine-learning anomaly detection procedure that does not rely on signal simulations for developing the analysis selection. Weakly supervised learning is used to train classifiers directly on data to enhance potential signals. The targeted topology is dijet events and the features used for machine learning are the masses of the two jets. The resulting analysis is essentially a three-dimensional search for $A\rightarrow BC$ with $m_A\sim\mathcal{O}(\text{TeV})$ and $m_B,m_C\sim\mathcal{O}(100\text{ GeV})$, where $B$ and $C$ are reconstructed as large-radius jets, without paying a penalty associated with a large trials factor in the scan of the masses of the two jets. The full Run 2 $\sqrt{s}=13$ TeV $pp$ collision data set of 139 fb$^{-1}$ recorded by the ATLAS detector at the Large Hadron Collider is used for the search. There is no significant evidence of a localized excess in the dijet invariant mass spectrum between 1.8 and 8.2 TeV. Cross-section limits for narrow-width $A$, $B$, and $C$ particles vary with $m_A$, $m_B$, and $m_C$. For example, when $m_A=3$ TeV and $m_B\gtrsim 200$ GeV, a production cross section between 1 and 5 fb is excluded at 95% confidence level, depending on $m_C$. For certain masses, these limits are up to 10 times more sensitive than those obtained by the inclusive dijet search. These results are complementary to the dedicated searches for the case that $B$ and $C$ are Standard Model bosons.

A search for dijet resonances is one of the first analyses performed when a hadron collider reaches a new center-of-mass energy [1][2][3][4][5][6][7][8]. While such searches are sensitive to nearly all resonance decays $A\rightarrow BC$, dedicated searches for particular decays will always be more sensitive. This is the motivation for dedicated resonance searches for the case where $B$ and $C$ are $\tau$-leptons [9,10], $b$-quarks [11][12][13], top quarks [14,15], vector bosons [16,17], Higgs bosons [18][19][20][21][22][23], and more, including asymmetric combinations. In all cases, a selection on the structure of the energy flow from each side of the decay is used to enhance events with the targeted topology. Searches for any combination of Standard Model (SM) particles can be well-motivated by one or more theory frameworks beyond the SM (BSM), but not all combinations are currently covered by dedicated searches [24]. Furthermore, there are only a small number of searches [25][26][27][28][29][30][31][32][33][34][35][36][37][38][39] that cover the vast set of possibilities where at least one of $B$ or $C$ is itself a BSM particle [40]. There is no previous search where all of $A$, $B$ and $C$ are BSM particles and can have different masses.
While it is crucial to continue searching for particular dijet topologies, the fact that not all SM and BSM possibilities are covered suggests that a complementary generic search effort is required. What is needed is a method for searching for many topologies all at once that ideally does not pay a large statistical trials factor. A variety of existing and proposed model-agnostic searches range from nearly signal model-independent but fully background model-dependent [41][42][43][44][45][46][47][48][49][50][51][52][53][54][55][56] (because they compare data with SM simulation) to varying degrees of partial signal-model and background-model independence [57][58][59][60][61][62][63][64][65][66][67][68][69][70][71][72]. The method used for this analysis employs a machine-learning-based anomaly detection procedure to perform a dijet search in which the jets from a potential signal have a nontrivial but unknown structure [70,71]. Simply stated, classifiers are trained to distinguish particular dijet invariant mass bins from their neighbors. Localized resonances will be enhanced with a selection based on the classifier. This Letter presents a search for a generic $A\rightarrow BC$ resonance, in which all of $A$, $B$ and $C$ could be BSM particles and $m_B, m_C \ll m_A$, so that the decay products of $B$ and $C$ can be contained within single large-radius jets. A search by the CMS collaboration [17] involves a three-dimensional fit over jet and dijet masses, but results are reported only for the case that $B$ and $C$ are $W$ or $Z$ bosons. The analysis presented here achieves sensitivity to BSM cases without scanning any mass other than that of the $A$ particle. In particular, the search uses events collected by the ATLAS detector [73,74] using the full 139 fb$^{-1}$ Run 2 $\sqrt{s}=13$ TeV $pp$ collision data set. Weakly supervised classifiers are used to enhance potential signals without using simulations of any particular signal models.
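The idea of training a classifier to separate one dijet invariant mass bin from its neighbors can be illustrated with a minimal labeling sketch. The function name `weak_labels` and the bin edges used in the usage note are hypothetical and are not the binning of the analysis; the sketch only shows how events could be assigned noisy labels from $m_{JJ}$ alone, with no signal simulation involved.

```python
from bisect import bisect_right


def weak_labels(mjj_values, edges, region):
    """Assign weak labels from the dijet mass alone.

    edges  : sorted list of bin edges (hypothetical, same units as mjj_values)
    region : index of the bin treated as the signal region
    Returns 1 for events in the signal-region bin, 0 for events in an
    adjacent bin, and None for events that are not used in the training.
    """
    n_bins = len(edges) - 1
    labels = []
    for mjj in mjj_values:
        idx = bisect_right(edges, mjj) - 1  # index of the bin containing mjj
        if idx < 0 or idx >= n_bins:
            labels.append(None)             # outside the binned range
        elif idx == region:
            labels.append(1)                # signal-region-like
        elif idx in (region - 1, region + 1):
            labels.append(0)                # neighboring sideband
        else:
            labels.append(None)             # too far from the signal region
    return labels
```

For example, with illustrative edges `[2.0, 3.0, 4.0, 5.0]` (TeV) and `region=1`, an event at 3.5 TeV is labeled 1 and events at 2.5 or 4.5 TeV are labeled 0. If a resonance is localized in the signal-region bin, a classifier trained on these labels can only succeed by learning features that differ between the two samples.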
Events with at least two jets are considered, and the invariant mass distribution of the two leading jets is used to perform a 'bump hunt'. Jets are formed [75,76] from locally calibrated calorimeter cell-clusters [77] using the anti-$k_t$ algorithm [78] with a radius parameter of $R = 1.0$. These jets are trimmed [79] by reclustering the jet constituents with the $k_t$ algorithm using $R = 0.2$ and removing the constituents with transverse momentum ($p_\text{T}$) less than 5% of the original jet $p_\text{T}$. The jet four-vectors are then calibrated as detailed in Ref. [80]. The two jets are each required to have $p_\text{T} > 200$ GeV and pseudorapidity $|\eta| < 2.0$. In order to be broadly sensitive to hadronically decaying narrowly resonant particles, events are required to have at least one jet with $p_\text{T} > 500$ GeV and two leading jets with a rapidity difference of $|\Delta y_{JJ}| < 1.2$. The $p_\text{T}$ threshold is chosen so that the online trigger system is fully efficient [81,82]. Furthermore, both jets must have jet mass 30 GeV $< m_J <$ 500 GeV for stability of the neural network (NN) training described below. The upper threshold reduces the $m_{JJ}$-dependence of the $m_J$ distribution. The bump hunt is performed for dijet invariant masses in the range 2.28 TeV $< m_{JJ} <$ 6.81 TeV.
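The event selection described above can be summarized in a short sketch. The `Jet` container and the function name are hypothetical, and the rapidity-difference cut is taken here as $|\Delta y_{JJ}| < 1.2$, the complement of the validation-region requirement $|\Delta y_{JJ}| > 1.2$ quoted later in the text; this is an illustration of the cuts, not the analysis code.

```python
from dataclasses import dataclass


@dataclass
class Jet:
    """Hypothetical container for a calibrated, trimmed large-radius jet."""
    pt: float    # transverse momentum in GeV
    eta: float   # pseudorapidity
    y: float     # rapidity
    mass: float  # jet mass in GeV


def passes_selection(j1: Jet, j2: Jet) -> bool:
    """Apply the cuts quoted in the text to the two leading jets."""
    jets = (j1, j2)
    # Both jets: pT > 200 GeV and |eta| < 2.0
    if any(j.pt <= 200.0 or abs(j.eta) >= 2.0 for j in jets):
        return False
    # At least one jet with pT > 500 GeV (trigger plateau)
    if max(j.pt for j in jets) <= 500.0:
        return False
    # Rapidity difference of the two leading jets (assumed |dy| < 1.2)
    if abs(j1.y - j2.y) >= 1.2:
        return False
    # Both jet masses within 30 GeV < m_J < 500 GeV
    return all(30.0 < j.mass < 500.0 for j in jets)
```

Inverting only the rapidity-difference condition gives the validation region used to test the procedure.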
The masses of the two leading jets, $(m_1, m_2)$, are used for classification. As the first application of fully data-driven machine-learning anomaly detection, this restricted feature set is used to establish the procedure and is already [...] In order to eliminate a trials factor associated with $(m_1, m_2)$, the NN identifies a region of interest, and no event is used to train the NN that is applied to it. A $k$-fold cross-validation procedure is employed in which the full data set is divided randomly into $k$ parts of equal size. Among these, $k-2$ parts are used for training $n$ classifiers (the training set) with different initializations, and the $(k-1)^\text{th}$ part is used to decide, based on the loss, which of these $n$ networks to select (the validation set). The selected network is then mapped to an efficiency $\epsilon$ in the $k^\text{th}$ part (the test set) so that the meaning of the network output can be compared across data sets and trainings. The efficiency $\epsilon$ is defined as the fraction of events with a given NN value or higher. This output is averaged across the $k-1$ other permutations of the training and validation parts. The entire procedure is then repeated $k$ times, so that each part is a test set exactly once. For this analysis, $k = 5$ and $n = 3$, so there are $3 \times 4 \times 5 = 60$ NNs trained for each signal region. Two event selections from thresholds imposed on the NN outputs are used: one that keeps the 10% most signal-region-like events ($\epsilon = 0.1$) and one that keeps the 1% most signal-region-like events ($\epsilon = 0.01$).
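The bookkeeping of the cross-validation procedure can be made concrete with a small sketch. The function names are hypothetical and no network is actually trained; the sketch only enumerates which parts play the roles of test, validation and training sets, and shows the efficiency mapping as defined in the text.

```python
def training_plan(k=5, n=3):
    """Enumerate every NN training for one signal region.

    Each entry is (test_part, validation_part, training_parts, initialization).
    For every choice of test part there are k - 1 choices of validation part,
    the remaining k - 2 parts form the training set, and n networks with
    different initializations are trained on it.
    """
    plan = []
    for test in range(k):
        for val in range(k):
            if val == test:
                continue
            train = tuple(p for p in range(k) if p not in (test, val))
            for init in range(n):
                plan.append((test, val, train, init))
    return plan


def efficiency(score, test_scores):
    """Fraction of test-set events with an NN output of `score` or higher."""
    return sum(s >= score for s in test_scores) / len(test_scores)
```

With the default arguments, `len(training_plan())` is 60, matching the count quoted above. Because the efficiency is a fraction of events rather than a raw network output, the values obtained from the selected networks for a given test part can be averaged, and the same threshold ($\epsilon = 0.1$ or $\epsilon = 0.01$) has the same meaning in every fold.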
As the classifier-based event selection depends on the data, and in particular on the possible presence of true signals, it is not possible to directly define control regions to validate the method. The entire procedure was validated using simulated events as well as a validation region with $|\Delta y_{JJ}| > 1.2$. For $s$-channel resonances, it is expected that this inverted rapidity-difference requirement reduces the signal efficiency while enhancing the dijet background by over an order of magnitude. In these validation tests, the learning works effectively and there is no evidence for selection-induced excesses. The expected limits are comparable to the ones reported for the unblinded data in Figure 3.
Following the validation, the performance of the NNs on data is first studied with and without injected signals. Since the NNs are two-dimensional functions, they can be visualized directly as images. Figure 1 presents the network output from a representative signal region in the absence of signal and also in the presence of injected signals. By construction, there must be a region of low efficiency, and the data are the same in all four plots. In the absence of a signal, regions of low efficiency are located randomly throughout the $(m_1, m_2)$ plane. The signals are $W'\rightarrow WZ$, for a new vector boson $W'$ [103], and the $W$ and $Z$ boson masses are varied, with widths set close to zero. These signals were simulated using Pythia 8.2 [104][105][106] with the A14 set of tuned parameters [107] and the NNPDF 2.3 parton distribution function set [108]. All samples of simulated data were processed using the full ATLAS detector simulation [109] based on Geant4 [110]. The amount of signal injected in all cases is about the same as, or less than, the level already excluded by the inclusive dijet search [101]. In all cases, the low-efficiency (signal-like) regions of the NN are localized near the injected signal. Some signals are easier to find than others; the difficulty is set both by the relative size of the signal and by the total number of events available for training in the signal vicinity.
After applying an event selection based on the NN trained on a particular signal region, the $m_{JJ}$ spectra are fit with a parametric function. The entire $m_{JJ}$ spectrum between 1.8 and 8.2 TeV is fit with a binning of 100 GeV; however, a fit signal region and fit sideband region are defined for evaluating the quality of the fit. The fit signal regions are defined as the $m_{JJ}$ signal regions the NN used for training, combined with the adjacent halves of the left and right neighboring regions; the fit sidebands are defined as the complement of the fit signal regions. An iterative procedure is applied until the $p$-value from the fit sideband $\chi^2$ is greater than 0.05. Since the NN is trained to distinguish the signal region from its neighboring regions, it is expected that the $m_{JJ}$ spectrum is smooth in the fit sideband region in the presence or absence of a true signal. First, the data are fit to $dn/dx = p_1 (1-x)^{p_2 - \xi_1 p_3} x^{-p_3}$, where $x = m_{JJ}/\sqrt{s}$, the $p_i$ are fit parameters, and the $\xi_i$ are chosen to ensure that the $p_i$ are uncorrelated. If the fit quality is insufficient, an extended function from Ref. [101] is used instead. If the fit quality remains insufficient, a variation of the UA2 [2] fit function is tested. If the fit quality is still insufficient, the fit sidebands are reduced by 400 GeV on both sides and the three functions are tried again in order. This procedure is then iterated until the fit is successful. The fit results in the signal regions for the $\epsilon = 0.1$ and $\epsilon = 0.01$ NN efficiency selections are presented in Figure 2. The largest positive deviation from the fit model is $3.0\sigma$ in signal region 1, around 2500 GeV, at $\epsilon = 0.1$ (the corresponding NN output does not show any significant features [102]). Globally, the positive tail of the signal region significance distribution is consistent with a standard normal distribution at the $1.5\sigma$ level.
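The control flow of the iterative fit strategy can be sketched as follows. This is an illustration under stated assumptions, not the analysis implementation: `fit_pvalue` is a hypothetical callable that performs a fit of the named function over a given mass range and returns the sideband $\chi^2$ $p$-value, the sideband reduction is modeled as moving the outer edges of the fitted range inward, and the stopping condition when the range is exhausted is an assumption.

```python
def fit_with_fallback(fit_pvalue, functions, lo, hi, step=400.0, threshold=0.05):
    """Try the fit functions in order, shrinking the fitted range if all fail.

    fit_pvalue : hypothetical callable (name, lo, hi) -> sideband p-value
    functions  : function names in the order they should be tried
    lo, hi     : outer edges of the fitted m_JJ range in GeV
    step       : amount removed from each side when every function fails
    threshold  : minimum acceptable sideband chi2 p-value
    Returns (name, lo, hi, p_value) for the first acceptable fit.
    """
    while lo < hi:
        for name in functions:
            p_value = fit_pvalue(name, lo, hi)
            if p_value > threshold:
                return name, lo, hi, p_value
        # No function describes the sidebands: reduce them on both sides
        lo += step
        hi -= step
    raise RuntimeError("no acceptable fit found (assumed stopping condition)")
```

A call such as `fit_with_fallback(my_fit, ["nominal", "extended", "ua2_variation"], 1800.0, 8200.0)` would first try all three functions on the full 1.8 to 8.2 TeV range before removing 400 GeV from each end; the function names in the list are placeholders.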
The $W'$ signal models can be used to set limits on the production cross section of specific new particles. To illustrate the sensitivity of the analysis to the full three-dimensional parameter space $(m_A, m_B, m_C)$, two $m_A$ points and multiple $(m_B, m_C)$ points are selected. As the NN performance depends on the data, the entire learning procedure has to be repeated every time a new signal model and signal cross section are injected into the data. In order to reduce statistical fluctuations related to the shape of the signal, for each signal cross section the network is retrained with five random samplings from the signal simulation, and the network with the median performance is chosen. A profile-likelihood-ratio test is used to determine 95% confidence intervals for the excluded signal cross section. When the number of expected events is much larger than one, asymptotic formulae [111] are used for this test; otherwise, the test is performed numerically (only Fig. 3d). The excluded cross section is reported as $\max(\sigma_\text{CL}, \sigma_\text{injected})$, where $\sigma_\text{CL}$ is the cross section determined from the profile-likelihood-ratio test and $\sigma_\text{injected}$ is the injected cross section. This procedure is chosen because the network's performance may not be as good if there were truly less signal than was injected. The resulting exclusion limits are presented in Figure 3. As the background expectation is determined entirely from data, the only systematic uncertainty associated with the background is the statistical uncertainty from the fit. The only other relevant uncertainties are those related to the signal $m_{JJ}$ and $m_J$ modeling; experimental uncertainties in the reconstructed jet kinematics account for about a 10% uncertainty in the excluded cross section.
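Two pieces of the limit-setting logic are simple enough to sketch directly: the choice of the median-performing network among the five retrainings, and the rule for the reported cross section. The function names and the numerical performance measure are hypothetical; the text does not specify how performance is quantified.

```python
def median_network(performances):
    """Index of the retraining with the median performance.

    performances : one number per retraining (an odd count, e.g. five);
                   the performance metric itself is an assumption here.
    """
    order = sorted(range(len(performances)), key=performances.__getitem__)
    return order[len(order) // 2]


def reported_limit(sigma_cl, sigma_injected):
    """Reported excluded cross section: never below the injected value.

    sigma_cl       : cross section excluded by the profile-likelihood-ratio test
    sigma_injected : cross section injected into the data for the NN training
    """
    return max(sigma_cl, sigma_injected)
```

The `max` in `reported_limit` is what keeps the result conservative: a limit below the injected cross section would rely on a network trained with more signal than the excluded hypothesis contains.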
The limits on $W'$ production vary with $m_A$, $m_B$, and $m_C$. For $m_B = m_C = 400$ GeV, the excluded cross section is about 1 fb, a significant improvement over existing limits. Lower $m_B$ and $m_C$ result in weaker limits because of the larger SM background in those regions; it is therefore difficult for the NN to learn to tag these signals. The NN is most powerful when the local signal-to-background ratio is high and there are enough events for it to learn effectively. For some models, such as $(m_A, m_B, m_C) = (5000, 80, 80)$ GeV, the NN is not able to identify the signal effectively, resulting in limits weaker than those from previous searches. For comparison, the sensitivities of the ATLAS inclusive dijet search (recast with signals from this paper) [101] and the all-hadronic diboson resonance search [112] are also shown in Figure 3. The inclusive dijet search sensitivity decreases for high $m_B$ and $m_C$ due to the use of small-radius jets that do not capture all of the $B$ and $C$ decay products. The diboson resonance search has greater sensitivity when $m_B, m_C \approx m_W, m_Z$, but it has no sensitivity away from these points. In this case, the diboson search uses more information than the weakly supervised one, but the trend is expected: assuming that the simulations used for developing the analysis selection are reliable, a fully supervised approach should outperform the weakly supervised one for any particular signal model. Direct searches for $B$ and $C$ that trigger on initial-state radiation are also sensitive to these signal models [34][35][36][37][38][39], but the sensitivity is much weaker than 10 fb.
While the regions are chosen with definite boundaries, the analysis is sensitive to signals across the entire range. In particular, it is found that for nearly 75% of the range, the efficiency for a signal is unaffected by a shifted peak location. This efficiency is everywhere above half of the nominal efficiency.
In conclusion, this Letter presents a model-agnostic resonance search in the all-hadronic final state using the full LHC Run 2 $pp$ data set of the ATLAS experiment. Weakly supervised classification NNs are used to identify the presence of potential signals without training on simulations of any particular signal models. For jets produced from Lorentz-boosted heavy-particle decays, this search is more sensitive than the inclusive dijet search and extends the coverage of the all-hadronic diboson search to regions away from the SM boson masses. This is the first search that covers $A\rightarrow BC$ production where all of $A$, $B$ and $C$ are BSM particles that can have different masses. The feature space used by the NNs is only two-dimensional, so there is great potential to extend this method to include additional features and more final states in order to ensure broad coverage of unanticipated scenarios.

Figure 1: The efficiency-mapped output of the NN versus the input variables for the events in signal region 2 for four cases: (a) there is no injected signal; (b) there is an injected signal with $m_A = 3$ TeV, $m_B = 400$ GeV and $m_C = 80$ GeV; (c) there is an injected signal with $m_A = 3$ TeV, $m_B = 200$ GeV and $m_C = 200$ GeV; and (d) there is an injected signal with $m_A = 3$ TeV, $m_B = 400$ GeV and $m_C = 400$ GeV. The location of $(m_B, m_C)$ for the given injected signal is marked with a green ×. The injected cross section is just below the limit at low $m_B$ and $m_C$ from the inclusive dijet search [101]. Additional signal region plots in the absence of an injected signal can be found at [102].

Figure 2: A comparison of the fitted background and the data in all six signal regions, indicated by vertical dashed lines, for (a,c) $\epsilon = 0.1$ and (b,d) $\epsilon = 0.01$. Dashed histograms represent the fit uncertainty. The lower panel is the Gaussian-equivalent significance of the deviation between the fit and data. The fits are performed including the sidebands, but only the signal region predictions and observations in each region are shown. As the NN is different for each signal region, the presented spectrum is not necessarily smooth. The top plots (a,b) show the result without injected signal, and the bottom plots (c,d) present the same results but with signals injected only for the NN training at $m_A = 3$ TeV (Signal 1) and $m_A = 5$ TeV (Signal 2), each with $m_B = m_C = 200$ GeV. The injected cross section for each signal is just below the limit from the inclusive dijet search [101].

Figure 3: 95% confidence level upper limits on the cross section for a variety of signal models, labeled by $(m_B, m_C)$, in GeV. The limits are shown for signal models with (a,b) $m_A = 3000$ GeV and NN trained on signal region 2; and (c,d) $m_A = 5000$ GeV and NN trained on signal region 5. The limits are broken down between the analyses with (a,c) $\epsilon = 0.1$ and (b,d) $\epsilon = 0.01$. Also shown are the limits from the ATLAS dijet search [101] and the ATLAS all-hadronic diboson search [112]. The inclusive dijet limits are calculated using the $W'$ signals from this paper and the full analysis pipeline of Ref. [101]; the diboson search limits are computed using the Heavy Vector Triplet [113] $W'$ signal from Ref. [112]. The acceptance for the $W'$ in this paper, compared to the $W'$ acceptance in Ref. [112], is 86% and 54% for $m_A = 3$ and 5 TeV, respectively. Missing observed markers are higher than the plotted range. Poor limits occur when the NN fails to tag the signal.