Effective Field Theory in the top sector: do multijets help?

Many studies of possible new physics employ effective field theory (EFT), whereby corrections to the Standard Model take the form of higher-dimensional operators, suppressed by a large energy scale. Fits of such a theory to data typically use parton level observables, which limits the datasets one can use. In order to theoretically model search channels involving many additional jets, it is important to include tree-level matrix elements matched to a parton shower algorithm, and a suitable matching procedure to remove the double counting of additional radiation. There are then two potential problems: (i) EFT corrections are absent in the shower, leading to an extra source of discontinuities in the matching procedure; (ii) the uncertainty in the matching procedure may be such that no additional constraints are obtained from observables sensitive to radiation. In this paper, we review why the first of these is not a problem in practice, and perform a detailed study of the second. In particular, we quantify the additional constraints on EFT expected from top pair plus multijet events, relative to inclusive top pair production alone.


Introduction
The search for physics beyond the Standard Model (BSM) is the most pressing problem in particle physics, especially given the on-going experimental programme at the Large Hadron Collider. To date, clear signatures of BSM physics have remained elusive, although it is widely suspected that the new physics may have something to do with the nature of electroweak symmetry breaking, and thus affect the behaviour of the recently discovered Higgs boson or, due to its large Yukawa coupling with the former, the top quark.
The lack of clear evidence for BSM physics thus far motivates the use of effective field theory [1][2][3][4][5] (for a comprehensive review see [6]), in which one characterises corrections to the SM Lagrangian by gauge-invariant higher dimensional operators built from the SM fields. This has the advantage of being manifestly model independent, but is only applicable if the lowest energy scale associated with the new physics is above the typical energies probed by the collider of interest. Given that this is likely to be a viable situation at the LHC, such techniques have been widely used in the contexts of Higgs and electroweak precision physics. A priori, it is not clear whether new physics will first show up in the behaviour of the Higgs boson rather than the top quark. Thus, a number of more recent studies have looked at constraining effective theory in the top quark sector [56][57][58][59][60][61][62][63][64] (see also [65][66][67][68][69][70][71][72][73][74][75][76][77][78][79][80] for analyses using the alternative language of anomalous couplings).
Ideally, one should attempt to constrain all possible higher dimensional operators, using all possible experimental data. To this end, Refs. [81,82] have presented a proof of principle that such global fits are possible (see also Refs. [63,64]). In particular, the fit of Ref. [82] directly constrained the coefficients of all (combinations of) operators affecting single and double top production and decay, as well as associated production of a vector boson, using a wide variety of datasets from the Tevatron and LHC (runs I and II). These datasets were all corrected back to parton level, and compared with tree-level theory results, supplemented with NLO information using (bin-by-bin) K-factors. There are many available datasets, however, which do not admit such a simple theoretical description. A typical example is observables sensitive to multiple final state jets of QCD radiation, which cannot be accurately modelled by available LO or NLO matrix elements. To this end, one may employ parton shower algorithms to simulate additional radiation, and many general purpose codes are available. Furthermore, parton showers may be systematically matched to matrix elements calculated at next-to-leading order in QCD perturbation theory [83][84][85][86][87][88][89][90], and to algorithms which model the hadronisation of final state quarks and gluons. Computations of this type for top-related processes, including higher dimensional operators, have been recently presented in Refs. [38,64].
When many final state jets are required in a given observable, one must systematically improve the output of a parton shower by including higher order tree-level matrix elements. A matching prescription is then needed to avoid double counting of radiation included in both the matrix elements and the shower, and different schemes have been presented in Refs. [91][92][93][94][95][96]. The aim of such a matching procedure, roughly speaking, is to ensure that jets that are widely separated from others in a given event are generated by the hard matrix elements, and those which are approximately collinear with other jets are generated by the parton shower. The separation between these two regimes is specified by a matching scale Q, and this must be carefully chosen so as to avoid discontinuities in distributions relating to additional jet radiation: too small, and matrix elements will be evaluated in momentum regions in which they are becoming collinearly singular; too large, and the parton shower will be used in kinematic regions where it fails to approximate higher order matrix elements sufficiently accurately. Such discontinuities thus represent a mismatch between how QCD radiation is described by a parton shower and by exact matrix elements, and are a problem even within the SM alone.
When higher dimensional operators are added to the SM, a second source of discontinuity arises. The operators generate additional Feynman rules, including couplings to quarks and gluons. Thus, rates for the emission of QCD radiation become modified. If matrix elements in this theory are matched to a parton shower, it follows that widely separated jets (generated by the matrix elements) will include the effect of the BSM physics, whereas those which are not widely separated will be generated by a parton shower which contains SM radiation only, leading to a mismatch between the two descriptions. Naïvely, one may expect this effect to be small: first of all, the higher dimensional operators are typically suppressed by a large energy scale. Moreover, they contain momentum-dependent numerators that tend to boost radiated particles to higher transverse momenta, thus widely separated from other particles. However, there is a more formal argument one can give to explain why any additional discontinuity resulting from the lack of EFT operators in the shower is negligible relative to the SM effect, related to the fact that emissions from EFT operators do not give rise to infrared singularities. Although the ideas involved are well-known to a QCD audience, this argument is not well-known in the BSM literature, and thus we provide a detailed explanation in this paper.
Armed with results for matrix elements containing EFT corrections matched to a parton shower one may address, for a particular production process / observable requiring such a theory description, whether or not useful additional constraints on given operators can be obtained. Our motivation here is extending the global EFT fits of Refs. [81,82] to include particle-level observables, for which a notable example is top quark pair production in association with multiple jets. For this process to provide a worthwhile new input to a global EFT fit, it is important to check that useful additional constraints are obtained by including radiation observables as well as those related to the top particles alone. We will study this in detail for a number of operators, whose coefficients are set to values consistent with recent constraints, finding that indeed there is more information to be gleaned by adding observables sensitive to extra radiation.
The structure of the paper is as follows. In section 2, we examine the role of effective theory corrections to the three-gluon vertex, and argue that any additional discontinuity when matching matrix elements and parton showers is kinematically subleading to the existing SM discontinuity. In section 3, we study a number of observables in top plus multijet production, and examine in detail whether additional constraints can be obtained by using observables sensitive to additional jet radiation, given the matching uncertainties. Finally, we conclude in section 4.

Matching effective theory matrix elements with parton showers
As discussed above, if one matches higher order tree-level matrix elements containing higher dimensional operators to a parton shower, BSM effects are included in the matrix elements, but not the shower. Information about the BSM physics is then missing in additional jets that are generated by the shower, rather than the matrix elements, where the former are formally correct only in the collinear limit. This in turn means that jets which are sufficiently collinear to other jets (defined in terms of the matching scale) in a given event are potentially missing BSM effects. In this section, we review in detail why this is not actually a problem in practice, due to the differing nature of SM and BSM radiation in the collinear limit. Our presentation will be similar to that of Refs. [97,98].
For illustrative purposes, we consider the case of a gluon which branches into a gluon pair, with momenta labelled as in Fig. 1. For a small opening angle $\theta$ between the two outgoing gluons, the virtuality of the branching parton $a$ is
$$p_a^2 \simeq z(1-z)\,E_a^2\,\theta^2, \tag{2.1}$$
where $E_a$ is the energy of parton $a$, and
$$z = \frac{E_b}{E_a} \tag{2.2}$$
is the fraction of the energy of parton $a$ that is carried by parton $b$. Given that this virtuality is small as $\theta \to 0$, we may treat parton $a$ as being approximately on-shell in the collinear limit. We can then consider the amplitude for emission of a gluon from the leg $p_a$, which will be given by Eq. (2.3), where $M_n$ is the amplitude before the gluon branching, $\{a_i\}$ are adjoint indices of the gluons, and the ellipses denote all momenta and colour degrees of freedom associated with partons that are suppressed in Fig. 1. Furthermore, $\epsilon^{\mu_i}$ is the polarisation vector of parton $i$, and $V^{a_1 a_2 a_3}_{\mu_1 \mu_2 \mu_3}$ the three-gluon vertex. We may write the latter as
$$V^{a_1 a_2 a_3}_{\mu_1 \mu_2 \mu_3} = V^{{\rm SM},\,a_1 a_2 a_3}_{\mu_1 \mu_2 \mu_3} + V^{{\rm BSM},\,a_1 a_2 a_3}_{\mu_1 \mu_2 \mu_3}, \tag{2.4}$$
where the first term on the right-hand side is the Standard Model component (coming purely from the QCD Lagrangian), and the second term collects the additional contributions arising from higher dimensional operators. To this end let us consider the following three-gluon operator:
$$O_G = f_{ABC}\,G^{A\,\nu}_{\mu}\,G^{B\,\lambda}_{\nu}\,G^{C\,\mu}_{\lambda}, \tag{2.5}$$
where $G^A_{\mu\nu}$ is the gluon field strength tensor and $f_{ABC}$ are the structure constants of SU(3). We may also consider the associated Lagrangian
$$\mathcal{L}_G = \frac{C_G}{\Lambda^2}\,O_G, \tag{2.6}$$
where $\Lambda$ is the new physics scale, and $C_G$ an unknown coefficient. The effect of this Lagrangian is to generate new interactions involving the gluon field, and in particular to modify the Feynman rule for the three-gluon vertex. Explicit results for the two terms on the right-hand side of Eq. (2.4) are then given by Eqs. (2.7) and (2.8) respectively, where the square brackets in the subscripts denote antisymmetrisation of indices.
In constructing the squared amplitude in the SM, one need only include diagrams in which the radiated gluon $p_b$ lands on the leg $p_a$ on both sides of the final state cut. From Eq. (2.3), the sum of all such diagrams, summed over all final state polarisations and averaged over the polarisation of the branching parton $p_a$, is given (in four spacetime dimensions) by Eq. (2.9). In order to evaluate this expression, one may choose an explicit basis of polarisation vectors for each parton, $\epsilon(p_i) \in \{\epsilon^{\rm in}_i, \epsilon^{\rm out}_i\}$, pointing in and out of the scattering plane respectively. One may then use the dot products of Refs. [97,98] in Eq. (2.9) to arrive at Eq. (2.13). The terms in the first line of that result are recognisable as the usual QCD splitting function $P_{gg}(z)$ describing the probability for a gluon to branch into two gluons. The first term in the second line arises from interference of the BSM contribution with the SM, and is suppressed by the inverse square of the new physics scale. The second term in the second line is quadratic in the new physics, and would mix with potential dimension eight effects in the effective theory expansion; it is thus formally of higher order and can be neglected. It is only the interference term that constitutes those BSM corrections that are missing in the collinear region. However, upon studying this term, we see explicitly that it contains a factor of $\theta^2$ and, hence, is formally kinematically suppressed in the collinear limit. From Eq. (2.1), the prefactor in Eq. (2.13) is $\mathcal{O}(\theta^{-2})$, and thus the SM term contains the well-known collinear enhancement of QCD radiation. The BSM interference term (including the prefactor) is $\mathcal{O}(\theta^{0})$, and will be negligible provided that the matching scale between matrix elements and parton shower is chosen to be sufficiently small. Put another way, any additional source of discontinuity in jet-related distributions coming from the absence of BSM corrections in the shower is kinematically suppressed relative to the discontinuity already present in the SM.
Given the above discussion, one may ponder whether it is possible to nevertheless include the BSM interference contribution in Eq. (2.13) in the gluon branching probability, despite the fact that this corresponds (eventually) to resumming a subleading contribution. However, such a procedure would be formally incorrect. In general, the radiated gluon may be emitted from parton leg $i$, and land on leg $j$ in the conjugate amplitude. Above, we have considered only contributions for which $i = j$, as shown in Fig. 2(a). This is correct for the SM: such contributions, as we have seen above, are $\mathcal{O}(\theta^{-2})$, whereas diagrams with $i \neq j$ (Fig. 2(b)) are kinematically subleading. It is incomplete, however, for the BSM interference contributions. Diagrams with $i = j$ and $i \neq j$ are both of the same kinematic order ($\mathcal{O}(\theta^{0})$). All of them must therefore be included to ensure a gauge invariant result, so that it makes no sense to resum only a subset of them.
Above we have examined only the operator of Eq. (2.5). However, the kinematic suppression that we have observed will be fully general, including operators that affect other parton branchings, involving (anti-)quarks in addition to gluons. This follows from the fact that any dimension six operator appears in the Lagrangian with a dimensionless coefficient $C_i$, and an inverse power of the new physics scale, $\Lambda^{-2}$. For the Lagrangian to have the correct dimension, the momentum space Lagrangian must contain two powers of momentum that combine with $\Lambda$ to make a dimensionless ratio. We can see this explicitly in the example of the three-gluon operator given above: the BSM vertex of Eq. (2.8) contains two additional powers of momentum relative to the SM result of Eq. (2.7). When evaluating the graph of Fig. 1, there is only one momentum scale available, namely the virtuality of the branching parton, $p_a^2$. Thus, the higher dimensional operator must contribute an interference term
$$\sim \frac{C_i\,p_a^2}{\Lambda^2} \propto \theta^2,$$
where we have used Eq. (2.1). This is indeed observed in Eq. (2.13).
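The dimensional argument above can be illustrated numerically. The following is a minimal sketch, assuming the standard small-angle form of the virtuality, $p_a^2 \simeq z(1-z)E_a^2\theta^2$, and dropping all $\mathcal{O}(1)$ $z$-dependent coefficients, so that it gives only the order of magnitude of the interference term relative to the SM term. The function names and the sample kinematics are hypothetical and are not taken from the analysis; energies are in TeV and the coefficient is in TeV$^{-2}$.

```python
def virtuality(z, energy_a, theta):
    """Small-angle virtuality of the branching parton (assumed standard form):
    p_a^2 ~ z (1 - z) E_a^2 theta^2."""
    return z * (1.0 - z) * energy_a ** 2 * theta ** 2


def relative_bsm_size(z, energy_a, theta, c_over_lambda2):
    """Order-of-magnitude size of the BSM interference term relative to the
    SM term, (C / Lambda^2) * p_a^2. O(1) z-dependent factors are dropped,
    so this is a scaling estimate rather than the full result."""
    return c_over_lambda2 * virtuality(z, energy_a, theta)


if __name__ == "__main__":
    # Illustrative inputs only: a 0.5 TeV gluon splitting symmetrically,
    # with C/Lambda^2 = 0.45 TeV^-2 as the sample coefficient value.
    for theta in (0.4, 0.2, 0.1, 0.05):
        ratio = relative_bsm_size(0.5, 0.5, theta, 0.45)
        print(f"theta = {theta:4.2f}  ->  relative BSM size ~ {ratio:.2e}")
```

Halving the opening angle reduces the estimate by a factor of four, and it vanishes for $z \to 0$ and $z \to 1$, in line with the absence of collinear and soft enhancements for the BSM term.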
We have so far seen that the effects of higher dimensional operators are negligible in the collinear region, and thus should not significantly contribute to discontinuities in jet-related kinematic distributions when matching parton showers (based on enhanced collinear radiation) to matrix elements. However, this is not the full story. Parton shower algorithms also include the effects of wide-angle soft radiation, by e.g. explicit angular ordering, or the choice of evolution variable [97,98]. The above dimensional argument applies also in this case: for soft, but not necessarily collinear, emissions, the only momentum scale that can combine with the new physics scale in the BSM vertex for the radiated parton is the virtuality of the emitting parton. This remains small if the emitted parton is soft. Indeed, looking at the interference term in Eq. (2.13), we see that this vanishes as z → 0 or z → 1, corresponding to the two limits in which either parton a or parton b is soft. There is thus no soft singularity from the BSM part, mirroring the lack of collinear singularity.
In this section, we have reviewed in detail the fact that additional radiation produced by EFT operators is not associated with collinear singularities, and thus does not lead to SM-like discontinuities when matching tree-level matrix elements with a parton shower. However, in order to obtain meaningful constraints from observables requiring such a theory description, it is important to examine whether or not deviations due to EFT corrections lead to significant new information, in light of the theoretical uncertainties due to the matching procedure. This is the subject of the following section.

Results
As discussed above, the aim of our study is to ascertain whether or not jet observables in top production provide additional constraints on new physics (as described using EFT), relative to observables only involving the top quarks. To this end, we must first examine the matching uncertainty affecting how the jets are modelled.

Effect of dimension six operators on jet radiation
In this section, we present a number of example distributions, including EFT effects consistent with current constraints on operator coefficients [82]. Results are obtained as follows. We implement EFT operators in a FeynRules [99] model file, and interface this with MadGraph5_aMC@NLO [100] for the generation of tree-level events containing top pairs with up to two additional jets. These are matched to the Pythia 8 parton shower [101], using the default MadGraph MLM-based matching scheme [102], with a central matching scale $Q = 30$ GeV. Our default choice for the renormalisation and factorisation scales is the top mass, $\mu_{\rm ren} = \mu_{\rm fact} = m_t$, and we use the parton distributions of Ref. [103]. We cluster all visible final-state particles into jets using the anti-$k_T$ algorithm [104] with jet radius $R = 0.4$, as implemented in FastJet [105]. We consider only the dilepton final state, and require both leptons to be isolated from hadronic activity, defined via the requirement that the total transverse momentum within a cone of radius $\Delta R = 0.3$ around the lepton satisfies $p_T^{\rm cone} \leq 0.1 \times p_{T,l}$, where $p_{T,l}$ is the transverse momentum of the lepton. We remove the isolated leptons and any b-tagged jets from the list of final state jets and particles, so as to consider only jets originating from additional radiation.
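The lepton isolation requirement can be sketched as follows. This is an illustrative implementation only: the `Particle` container and the function names are hypothetical, and we assume that the cone sum runs over all other visible particles (excluding the lepton itself) and that $\Delta R = \sqrt{\Delta\eta^2 + \Delta\phi^2}$ with the azimuthal difference wrapped into $[-\pi, \pi)$.

```python
import math
from dataclasses import dataclass


@dataclass
class Particle:
    pt: float   # transverse momentum in GeV
    eta: float  # pseudorapidity
    phi: float  # azimuthal angle in radians


def delta_r(a, b):
    """Angular distance in the (eta, phi) plane, with phi wrapped to [-pi, pi)."""
    dphi = (a.phi - b.phi + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a.eta - b.eta, dphi)


def is_isolated(lepton, others, cone=0.3, max_fraction=0.1):
    """True if the summed pT of `others` inside the cone around the lepton
    does not exceed max_fraction times the lepton pT."""
    cone_pt = sum(p.pt for p in others if delta_r(lepton, p) < cone)
    return cone_pt <= max_fraction * lepton.pt


if __name__ == "__main__":
    lepton = Particle(pt=50.0, eta=0.0, phi=0.0)
    hadrons = [Particle(3.0, 0.1, 0.1), Particle(40.0, 2.0, 1.0)]
    print(is_isolated(lepton, hadrons))  # only the 3 GeV particle is in the cone
```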
There are six combinations of dimension-six EFT operators affecting top quark pair production at tree level in the SMEFT (see e.g. [106]), for which we use the Warsaw basis of Ref. [5]. Four of these are four-fermion operators, for which we choose the representative example of Eq. (3.1) in what follows (n.b. this consists of a sum of operators appearing in Ref. [5]). Here $u$, $d$ and $t$ are the up, down and top quark fields respectively. The remaining two operators are a correction to the three-gluon vertex,
$$O_G = f_{ABC}\,G^{A\,\nu}_{\mu}\,G^{B\,\lambda}_{\nu}\,G^{C\,\mu}_{\lambda} \tag{3.2}$$
(where $G^A_{\mu\nu}$ is the gluon field strength tensor and $f_{ABC}$ the structure constants of the SU(3) gauge group), and the chromomagnetic moment operator of Eq. (3.3), where $\sigma^{\mu\nu}$ is a fermionic spin generator, and $T^A$ an SU(3) generator. The interference of the top pair production amplitude containing the gluon operator of Eq. (3.2) with the corresponding SM amplitude vanishes, such that this operator contributes at quadratic, and thus dimension eight, level only. This leads some people to disregard this operator, but a different school of thought is that it should be included as the leading contribution of this operator to the given process. We choose to follow the latter approach.
The combinations $C_i/\Lambda^2$ have already been significantly constrained by global analyses of top quark data. Motivated by the analysis of Ref. [82], we take as representative constraints
$$\frac{C_q}{\Lambda^2} = 1.25~{\rm TeV}^{-2}, \qquad \frac{C_G}{\Lambda^2} = 0.45~{\rm TeV}^{-2}, \qquad \frac{C_{tG}}{\Lambda^2} \equiv \frac{C^{33}_{uG}}{\Lambda^2} = 0.64~{\rm TeV}^{-2}. \tag{3.5}$$
In Fig. 3, we show the distribution of the transverse momentum $p_{T,t}$ of the hardest top particle, together with the invariant mass $m_{t\bar{t}}$ of the top quark pair. Note that the former differs from the top transverse momentum used in the fit of Ref. [82], which used data corrected back to parton level, where extra radiation had been accounted for in the unfolding process. For such datasets (which mimic the 2 → 2 scattering process), the transverse momentum distributions of the top and antitop quarks will be equal.
When extra radiation is involved, the symmetry between the top and antitop $p_T$ distributions is broken, and one may choose whether to isolate the transverse momentum of the top quark (rather than antitop), or to take the hardest top particle. A reason to use the latter is that it should be more sensitive to details of the additional radiation, given that it amplifies the recoil of the top (or antitop) against the extra jets. By contrast, the invariant mass distribution is more stable against radiative corrections.
In each panel of Fig. 3, the orange and red bands depict the (renormalisation and factorisation) scale and matching scale uncertainties associated with the SM result. We see in both cases that the matching uncertainty is smaller than the other scale variation, suggesting that modelling of additional radiation is well under control. Both the gluon and four-fermion operators show a shape difference with respect to the SM, albeit slight in the former case given that the gluon operator is already rather well-constrained. Nevertheless, it is difficult to gauge the statistical significance of the deviation from the SM by eye alone, and a much more quantitative description will be provided in the following section. The effect of the four-fermion operator in the $p_{T,t}$ spectrum is sizeable at large transverse momenta, as expected given that EFT operators often boost final state particles, as they contain extra powers of momentum to offset the inverse new physics scale factor $\Lambda^{-2}$. Furthermore, the transverse momentum of the hardest top particle should be particularly sensitive to the nature of additional radiation, as discussed above. Deviations are less evident, as expected, in the invariant mass spectrum, although there is still a deviation from the SM, which is worth quantifying further. Note that the dipole operator has a very similar shape to the SM contribution (as noted in Ref. [57]), but nevertheless leads to a change in overall normalisation that is still compatible with current constraints. In Fig. 4 we show the transverse momenta of the first and second hardest additional jets, using the same conventions as Fig. 3. Again we see that the matching uncertainty is smaller than the other scale uncertainties, and that there are potentially statistically significant deviations from the SM. It is interesting, however, to note that the effect of the four-fermion operator is much smaller than for the transverse momentum spectrum of the hardest top particle at high $p_T$.
This can be at least partly explained from the fact that top pair production at the LHC is dominated by the gluon channel. Thus, when only the four-fermion operator is switched on, most of the additional radiation will be purely SM-like. Another feature of Fig. 4 is that the dipole operator of Eq. (3.3) leads to a normalisation change of the jet radiation profile, but not a significant shape change. It thus mirrors the properties already observed for top-related observables in Ref. [57], that the shape of kinematic distributions involving the dipole operator is highly similar to the SM alone. Thus, we see that the transverse momenta of the additional jets are in principle useful for distinguishing the dipole and three-gluon operators, whilst providing complementary information to those observables (such as $m_{t\bar{t}}$) that also constrain four-fermion operators.
In Fig. 5, we show the rapidity of the top quark, and of the hardest additional jet (similar results are obtained for the second hardest jet). The results are consistent with previous plots: the effect of the dipole operator is to change the normalisation of the SM contribution, but not the shape. The four-fermion operator again has a smaller effect, although it becomes marginally more pronounced at higher absolute rapidities, due to the fact that the parton luminosity then diminishes the gluon initiated channel.
To summarise, we have seen that EFT contributions to observables sensitive to additional radiation in top pair production indeed lead to deviations from the SM, and of comparable size to those observables (e.g. top transverse momentum, and the top pair invariant mass) that are already used in EFT fits in the top sector. Furthermore, these deviations survive against the matching and scale uncertainties associated with the SM results. How much discriminating power these extra observables have depends on the amount of data collected in coming years, but also on whether the additional jet observables are highly correlated with the top quark kinematics. We explore these issues in the following section.

Distinguishing power of jet observables
Above, we have seen that EFT operators significantly affect additional jet radiation in top pair production, such that observables involving this radiation can potentially provide useful additional constraints in global fits of top quark EFT to data. In order to check whether or not this is realised, however, we must examine the degree of correlation between observables involving the jet radiation, and those involving the top particles alone. There is clearly some degree of correlation, given that the top and antitop will recoil against additional radiation. However, it may well be the case that certain top observables are less correlated with radiation properties than others. This is then useful information for choosing optimal (i.e. the most complementary) sets of observables with which to constrain new physics. In Fig. 6, we show two-dimensional scatter plots of the $p_T$ of the hardest jet, and either the $p_T$ of the hardest top particle, or the top pair invariant mass. All results are calculated in the SM only. In each plot, we also show the Pearson correlation coefficient $\rho$. We see that the transverse momentum of the hardest top particle is more correlated with the properties of the hardest jet than the top pair invariant mass is, as expected given that the latter observable is more stable to higher order corrections. In the case of the invariant mass, the correlation coefficient is less than 0.5, suggesting that indeed the additional jet radiation is capable of providing significant complementary information relative to top properties alone. Similar plots are shown in Fig. 7 for the second hardest jet. Unsurprisingly, the top properties are less sensitive to the second hardest jet than they are to the hardest jet. Again, we see that the invariant mass provides the most complementary information to the properties of the jet radiation.
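For concreteness, the Pearson correlation coefficient quoted in each scatter plot can be computed from per-event values of two observables as in the following minimal sketch. The function name and the toy inputs are hypothetical; in practice the two lists would hold, for example, the jet $p_T$ and the top pair invariant mass for each simulated event.

```python
import math


def pearson_rho(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Toy numbers for illustration only, not simulation output.
    jet_pt = [1.0, 2.0, 3.0]
    top_obs = [1.0, 3.0, 2.0]
    print(pearson_rho(jet_pt, top_obs))
```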
Let us now turn to the question of how sensitivity to new physics is affected by the inclusion of jet radiation observables. To estimate the gain in sensitivity, we perform a binned hypothesis test using two-dimensional distributions based on pairs of observables discussed above (in principle three-dimensional distributions carry even more shape information, but suffer from poor statistics). Our hypothesis test is based on the modified frequentist ${\rm CL}_s$ method [107]. We take the signal hypothesis ($s$) to be each dimension six operator in turn. The background hypothesis ($b$) is the SM only, and we generate (pseudo)-data corresponding to the background, before calculating the binned log-likelihood ratio $q$ of Eq. (3.6), where the sum is over bins $i$, $s_i$ and $b_i$ are the expected numbers of signal and background events respectively, and $d_i$ is the number of observed events. The confidence levels for excluding the $s+b$ and $b$-only hypotheses, ${\rm CL}_{s+b}$ and ${\rm CL}_b$, are given by Eq. (3.7). These represent, respectively, the probability that the test statistic $q$ would be greater than that observed in the data, given the hypothesised number of signal and background events $s+b$ or background-only events $b$. In practice, we numerically evaluate these p-values by generating a large number of Monte Carlo pseudo-experiments, with ${\rm CL}_{s+b}$ being the fraction of pseudo-experiments that generate at least as many events as observed in the data. A signal hypothesis is regarded as excluded at the 95% confidence level if ${\rm CL}_s \equiv {\rm CL}_{s+b}/(1 - {\rm CL}_b) \leq 0.05$.
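The pseudo-experiment procedure can be sketched as below. This is a toy illustration under stated assumptions, not the code used for the results: we assume the standard Poisson form of the binned test statistic, $q = -2\ln[L(d|s+b)/L(d|b)] = 2\sum_i [s_i - d_i \ln(1 + s_i/b_i)]$, we take the numerator to be the fraction of $s+b$ pseudo-experiments with $q \geq q_{\rm obs}$, and we take the denominator $1 - {\rm CL}_b$ to be the same fraction for background-only pseudo-experiments. All function names and the example yields are hypothetical.

```python
import math
import random


def log_likelihood_ratio(signal, background, data):
    """Assumed test statistic: q = 2 * sum_i [s_i - d_i * ln(1 + s_i / b_i)]."""
    return 2.0 * sum(s - d * math.log1p(s / b)
                     for s, b, d in zip(signal, background, data))


def sample_poisson(mean, rng):
    """Poisson variate via Knuth's multiplication method (fine for modest means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def cls_value(signal, background, data, n_toys=2000, seed=1):
    """CL_s estimated from pseudo-experiments, under the conventions stated above."""
    rng = random.Random(seed)
    q_obs = log_likelihood_ratio(signal, background, data)

    def tail_fraction(means):
        # Fraction of toys at least as background-like as the observation.
        hits = 0
        for _ in range(n_toys):
            toy = [sample_poisson(m, rng) for m in means]
            if log_likelihood_ratio(signal, background, toy) >= q_obs:
                hits += 1
        return hits / n_toys

    cl_sb = tail_fraction([s + b for s, b in zip(signal, background)])
    one_minus_cl_b = tail_fraction(background)
    if one_minus_cl_b == 0.0:
        raise ZeroDivisionError("no background-only toy reached q_obs")
    return cl_sb / one_minus_cl_b


if __name__ == "__main__":
    background = [50.0, 30.0, 10.0]   # toy SM yields per bin
    data = [50, 30, 10]               # pseudo-data equal to the SM expectation
    print(cls_value([30.0, 20.0, 10.0], background, data))  # large toy signal
    print(cls_value([1.0, 1.0, 1.0], background, data))     # small toy signal
```

With these toy inputs the large signal is excluded (${\rm CL}_s \leq 0.05$) whereas the small one is not, which is the qualitative behaviour the test is designed to capture.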
To judge the usefulness of observables sensitive to additional jet radiation, we take observables $X \in \{m_{t\bar{t}}, p_{T,t}\}$, and calculate the ${\rm CL}_s$ in two cases: (i) using $X$ alone; (ii) using $X$ in combination with the transverse momentum of the hardest jet, $p_{T,j_1}$. We then examine the ratio
$$\frac{{\rm CL}_s(X, p_{T,j_1})}{{\rm CL}_s(X)}, \tag{3.8}$$
which measures the "improvement" due to including the extra radiation. We choose $p_{T,j_1}$ as a particular example, but similar results are obtained by choosing other radiation observables. Given that the dipole and gluon operators of Eqs. (3.2), (3.3) appear, from Figs. 3-5, to be the hardest to distinguish from the SM, we will focus on these. Results for each operator are shown in Fig. 8, where to obtain the luminosity scale on the horizontal axis, we have multiplied all cross-sections by a factor of 6%, corresponding to a typical event selection efficiency for dileptonic top pair events. We see in both cases that using the additional jet radiation leads to a significant improvement in the ${\rm CL}_s$. More improvement is obtained when adding the radiation to the invariant mass distribution rather than the $p_T$ of the hardest top, as expected given that the former is less correlated with the radiation. More improvement is seen for the gluon operator, presumably due to the fact that this leads to a significant shape change with respect to the SM, whereas the dipole operator does not.
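As an illustration of how the inputs to such a comparison might be assembled, the sketch below converts a cross-section and luminosity into an expected event count using the 6% selection efficiency quoted above, and flattens a two-dimensional distribution of $(X, p_{T,j_1})$ pairs into a list of bin yields suitable for a binned test. The function names, the binning and the toy inputs are hypothetical; events falling outside the grid are simply dropped.

```python
from bisect import bisect_right


def expected_events(cross_section_pb, luminosity_fb, efficiency=0.06):
    """N = sigma * L * efficiency, using 1 pb = 1000 fb."""
    return cross_section_pb * 1000.0 * luminosity_fb * efficiency


def flat_2d_yields(points, x_edges, y_edges, total_events):
    """Histogram (x, y) pairs on a 2D grid and return the bin yields as a flat
    list, normalised so that the full sample corresponds to total_events."""
    nx, ny = len(x_edges) - 1, len(y_edges) - 1
    counts = [0] * (nx * ny)
    for x, y in points:
        ix = bisect_right(x_edges, x) - 1
        iy = bisect_right(y_edges, y) - 1
        if 0 <= ix < nx and 0 <= iy < ny:  # half-open bins; overflow dropped
            counts[ix * ny + iy] += 1
    return [total_events * c / len(points) for c in counts]


if __name__ == "__main__":
    # Toy numbers for illustration only.
    n_total = expected_events(cross_section_pb=1.0, luminosity_fb=10.0)
    sample = [(1.0, 1.0), (1.0, 3.0), (3.0, 1.0), (3.0, 3.0), (5.0, 5.0)]
    print(n_total, flat_2d_yields(sample, [0.0, 2.0, 4.0], [0.0, 2.0, 4.0], n_total))
```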
We did not present results for the four-fermion operator of Eq. (3.1). The negative interference in some kinematic regions means that distributions involving this operator sometimes undershoot, and sometimes overshoot the SM, as can be clearly seen in Figs. 3-5. This in turn leads to cancellations in the log-likelihood ratio of Eq. (3.6), so that this is not the best quantity to use to measure the advantage of using additional information. One could instead use e.g. $\chi^2$ values, although it is in any case already clear from Figs. 3-5 that the four-fermion operator typically leads to much larger deviations from the SM subject to current constraints, thereby rendering the analysis of this section less relevant.

Conclusion
In this paper, we have considered the issue of whether observables relating to additional jet radiation in top pair production provide a useful input to global fits of effective field theory (EFT) in the top sector. In the absence of NLO QCD corrections to processes containing EFT operators, the best way of describing such observables is to use higher order tree-level matrix elements interfaced with a parton shower. One may then worry about potential discontinuities arising from the fact that radiation generated by the matrix elements includes BSM effects, whereas radiation generated by the shower does not. We have reviewed in section 2 why this is not a problem in practice, due to the fact that the new physics contributions do not generate soft or collinear singularities. We then studied top pair production generated at tree-level with up to two additional jets, matched to a parton shower. The matching uncertainty was found to be smaller than the factorisation and renormalisation scale uncertainty. Furthermore, deviations from the SM due to EFT operators could be observed in a number of kinematic distributions, including those associated with additional jet radiation. We quantified this in section 3.2 by looking at the relative improvement in the ${\rm CL}_s$ for the EFT signal plus SM background, when using the $p_T$ of the hardest additional jet in addition to the top pair invariant mass or $p_T$ of the hardest top particle. We saw significant improvements, suggesting that indeed multijet observables can provide highly useful complementary information to inclusive top observables alone. The inclusion of such observables in global EFT fits is in progress.

Acknowledgments
... dileptonic top pair events, and Keith Hamilton for further useful discussions. CDW is supported by the UK Science and Technology Facilities Council (STFC), under grant ST/P000754/1. CE is supported by the IPPP Associateship scheme, and by the STFC under grant ST/P000746/1.