A methodology for theory uncertainties in the SMEFT

A process-specific methodology is defined to systematically assign theoretical uncertainties in the Standard Model Effective Field Theory when performing leading order global fits. The method outlined also minimises the computational and theoretical burden of systematically advancing such analyses to dimension eight.


I. Introduction:
The Standard Model (SM) is an incomplete description of observed phenomena in nature, and is usefully thought of as an Effective Field Theory (EFT) for data analysis with characteristic energies around the Electroweak scale: $\bar{v}_T \equiv \sqrt{2 \langle H^\dagger H \rangle}$.
The Standard Model Effective Field Theory (SMEFT) is based on the infrared assumptions that physics beyond the SM is present at scales $\Lambda > \bar{v}_T$, that there are no light hidden states in the spectrum with couplings to the SM, and that an $SU(2)_L$ scalar doublet ($H$) with hypercharge $y_h = 1/2$ is present in the EFT. A power counting expansion in the ratio of scales $q^2/\Lambda^2 < 1$, with $q^2$ a kinematic invariant associated with experimental measurements in the domain of validity of the EFT, defines the SMEFT Lagrangian
$$\mathcal{L}_{SMEFT} = \mathcal{L}_{SM} + \mathcal{L}^{(5)} + \mathcal{L}^{(6)} + \mathcal{L}^{(7)} + \cdots, \qquad \mathcal{L}^{(d)} = \sum_i \frac{C_i^{(d)}}{\Lambda^{d-4}} Q_i^{(d)} \quad \text{for } d > 4. \qquad (1)$$
Higher dimensional operators ($Q_i$) define SMEFT corrections to the SM predictions, and carry a mass dimension $d$ superscript. The notation is such that $\tilde{C}^{(6)}_i \equiv C^{(6)}_i \bar{v}_T^2/\Lambda^2$ and $\tilde{C}^{(8)}_i \equiv C^{(8)}_i \bar{v}_T^4/\Lambda^4$. The operators multiply dimensionless Wilson coefficients $C^{(d)}_i$, which take on specific values as a result of the Taylor expanded effects of physics beyond the SM. As the nature of physics beyond the SM is unknown, we treat these Wilson coefficients as free parameters to constrain from the data. The sum over $i$, after redundant operators are removed with field redefinitions of the SM fields, runs over the operators in a particular operator basis. We use the Warsaw basis [1,2] for $\mathcal{L}^{(6)}$ in this paper. For higher order operators of mass dimension greater than six, we use the conventions in Refs. [3][4][5][6], which define the geoSMEFT formulation of this theory to all orders in $\sqrt{2 \langle H^\dagger H \rangle}/\Lambda$ for $n$-point functions, with $n \leq 3$.
Using the SMEFT, ATLAS and CMS have started to perform global analyses of LHC data. Such global fits are performed in the context of theoretical predictions of limited precision in the SMEFT. As approximations are used in the theory predictions, it is necessary to define a theoretical error in SMEFT studies, to avoid misinterpreting experimental results in global SMEFT studies. In this paper we define such a methodology.
II. Missing higher order terms: When a prediction is made in the SMEFT at leading order (LO), subleading terms are neglected. The expansions present in the theory are
• the loop expansion in $g_{SM}^2/16\pi^2$, where $g_{SM} \subset [g_1, g_2, g_3, \lambda, y_\psi]$ and $y_\psi$ is the Yukawa coupling for the fermion species $\psi$;
• the expansion in $\bar{v}_T/\Lambda$, where $\bar{v}_T$ is the vacuum expectation value of the Higgs field in the SMEFT; this expansion is relevant when SM kinematics is present in a process due to a resonant SM state fixing $q^2 \simeq \bar{v}_T^2$; and
• the expansion in $q^2/\Lambda^2$.
Collectively we refer to the $\bar{v}_T/\Lambda$ and $q^2/\Lambda^2$ expansions as the SMEFT operator expansion (footnote 1). There are also cross terms in these expansions. LO in the SMEFT means considering $\mathcal{L}^{(6)}$ perturbations to the SM predictions, and roughly 30 new parameters impact Higgs, electroweak, and top-quark processes [7]. Global SMEFT fits seek to constrain these parameters. Uncertainties from truncating the EFT expansion, that is from missing higher-order terms, should be assigned in this effort if the historically standard methodology of EFT studies of experimental data (footnote 2) is to be followed in the case of SMEFT studies of LHC data. At LO, a predicted (dimensionless SM) amplitude includes a perturbation due to $\mathcal{L}^{(6)}$. We illustrate the methodology with a pole observable whose LO amplitude, Eqn. (2), is the SM amplitude plus terms linear in the $C^{(6)}_i$, each weighted by a process-dependent numerical coefficient $a_i$. Sums over repeated indices are implied. The expression is not exact and is only defined to order $O(1/\Lambda^2)$. It should be understood to have arbitrary and unfixed corrections of order $O(1/\Lambda^4)$, until the SMEFT corrections for this process are defined to order $O(1/\Lambda^4)$. We return to this point in Section V. Quantum corrections cannot be forbidden as the SMEFT is built of the SM fields. As higher order terms in the loop and operator expansions are unknown, but necessarily present, it is important to assess the impact of neglecting these terms in LO SMEFT analyses of data. The next-order terms missing in both expansions extend the amplitude, in Eqn. (3), with further contributions weighted by $b_{jk}$, $c_l$, $d_m$, $e_n$. This expression is also not exact, and $a_i, b_{jk}, c_l, d_m, e_n$ are numerical coefficients that are process dependent. In each case, the indices $i, j, k, \cdots$ run over a subset of the full set of operators in $\mathcal{L}^{(6)}$ and/or $\mathcal{L}^{(8)}$. Squaring this expression and integrating over phase space with the relevant experimental cuts, one finds a (schematic) cross section, Eqn. (4), with coefficients $A_i$ multiplying the dependence on the $C^{(6)}_i$ and $B_{jk}$ multiplying terms with two Wilson coefficients. We assume that a LO simulation of the SMEFT, including relevant experimental cuts and acceptance corrections, is known for an observable, so the $A_i$ are fully known for all $i$ in $\mathcal{L}^{(6)}$ in each process using a SMEFTsim [7,17] based simulation. Such results can also be produced directly for a class of terms in $B_{jk}$ with operators residing on different vertices; for discussion see Ref. [17]. A SMEFTsim based simulation does not produce the effects of canonically normalizing the SMEFT to $O(1/\Lambda^4)$, nor those of $\mathcal{L}^{(8)}$ operators.
Footnote 1: The expansion in $\bar{v}_T/\Lambda$ is relevant if some SM particles go on shell or are nearly on shell in an observable. We refer to these observables as pole observables in this work. Conversely, the expansion in $q^2/\Lambda^2$ is relevant when considering non-resonant regions of phase space. We refer to such observables as tail observables in this work.
Footnote 2: See for example Refs. [8][9][10][11][12][13][14][15][16].
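The expansions described in this section can be summarised in a schematic sketch. It is written under assumptions: the arrangement of terms, the placement of the $1/16\pi^2$ factors, the argument of the logarithm, and the shorthand $\tilde{C}$ are illustrative choices matched only to the coefficient names used in the text ($a_i$, $b_{jk}$, $c_l$, $d_m$, $e_n$, $A_i$, $B_{jk}$), and are not a reproduction of Eqns. (2)-(4).

```latex
% Assumed shorthand (illustrative):
%   \tilde{C}^{(6)}_i = C^{(6)}_i \bar{v}_T^2/\Lambda^2 ,
%   \tilde{C}^{(8)}_i = C^{(8)}_i \bar{v}_T^4/\Lambda^4 .
\begin{align}
  % LO: SM amplitude plus a linear L^{(6)} perturbation
  \mathcal{A}_{\rm LO} &\simeq \mathcal{A}_{SM} + a_i \, \tilde{C}^{(6)}_i , \\
  % Next order in the operator and loop expansions
  \mathcal{A} &\simeq \mathcal{A}_{SM} + a_i \, \tilde{C}^{(6)}_i
     + b_{jk} \, \tilde{C}^{(6)}_j \tilde{C}^{(6)}_k
     + c_l \, \tilde{C}^{(8)}_l
     + \frac{1}{16\pi^2} \left[ d_m \, \tilde{C}^{(6)}_m
     + e_n \, \tilde{C}^{(6)}_n \log\frac{\mu^2}{\Lambda^2} \right] + \cdots , \\
  % After squaring, phase-space integration and cuts
  \sigma &\simeq \sigma_{SM} + A_i \, \tilde{C}^{(6)}_i
     + B_{jk} \, \tilde{C}^{(6)}_j \tilde{C}^{(6)}_k + \cdots
\end{align}
```

In this reading, the $A_i$ are what a LO simulation supplies, and every other coefficient is a candidate source of theory error.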
To estimate the error on $\sigma$ due to neglecting higher order terms, one takes the actual prediction of all terms at next order in the loop and operator expansions and varies the unknown parameters in those terms; this is a well defined and straightforward procedure. As such, we focus on how to directly extract sub-leading terms from the results of a LO simulation.
III. Missing perturbative corrections: Higher order terms in the SMEFT expansion that populate phase space with the same kinematics as operators in $\mathcal{L}^{(6)}$ (already in a LO simulation) will receive common numerical corrections due to Monte Carlo event generation, and phase space/acceptance cuts in an observable. This means that classes of $\mathcal{L}^{(8)}$, $(\mathcal{L}^{(6)})^2$ and $\mathcal{L}^{(6)}/16\pi^2$ terms can be produced with appropriate rescaling from LO simulation results, without the need for redundant and costly Monte Carlo event generation.
The key to the methodology we lay out here is to define the appropriate approach to rescaling to leverage this fact in practice.
Consider a partonic scattering process $X \to Y$. This (implicitly) defines a tree-level SMEFT amplitude $\mathcal{A}_{X \to Y}(C_i/\Lambda^2)$ that interferes with the SM amplitude $\mathcal{A}^{SM}_{X \to Y}$, the latter of which can be of any perturbative order. By definition, this interference determines the LO dependence on the $C^{(6)}_i$, up to suppressed phase space integrals with cuts.
We estimate perturbative uncertainties on the EFT parameters $C^{(6)}_i$ using the general form for one-loop perturbative corrections to the SMEFT amplitude, which contains finite terms with coefficients $d_m$ and log-enhanced terms with coefficients $e_{in}$. The $d_m$ terms can be determined with a dedicated one-loop calculation, and are in general unknown. The coefficients $e_{in}$ of the log-enhanced terms, which generally give the largest contribution to the perturbative correction, are known. The subscript has explicit dependence on $i$, the coefficient appearing in the LO simulation, as these corrections have associated divergences. Such divergences are canceled by the one-loop renormalization of the process of interest, and must feed in via a tree-level operator dependence. The relevant calculations were determined in the one-loop renormalization group evolution (RGE) of the SMEFT in Refs. [18][19][20]. A bare SMEFT Lagrangian $\mathcal{L}^{(6),0}$ is coded into SMEFTsim, and it is related to the renormalized Lagrangian (denoted with an $(r)$ superscript) through the SMEFT RGE counterterms $Z_{i,j}$. This rescaling by $Z_{i,n}$ is again transparent to the simulation chain. It is UV physics that scales with the leading order Wilson coefficient dependence, which is known and is being used in the fit. The divergences exactly cancel after renormalization; however, the log terms associated with the divergences, which define $e_{in}$, do not cancel, but are predicted. When retained, the log terms have the interpretation of the Wilson coefficients being fit at the scale $\Lambda$ rather than at the measurement scale $\mu$. The typical momentum transfer of a process fixes $\mu^2$, e.g. for inclusive on-shell Higgs-boson decay $\mu^2 = m_h^2$. If renormalization group improved perturbation theory is used, all of the log terms can be summed by running between $\Lambda$ and $\mu$ using standard EFT techniques. When doing a LO SMEFT fit in fixed order perturbation theory, without performing any resummation, the log terms are the (generally largest) part of the theoretical error due to neglected terms in perturbation theory. The log terms are not the full perturbative correction. $Z_{SM}$ refers to the SM wavefunction/$\bar{v}_T$ renormalization in the renormalization relation above. Importantly, this causes extra log terms in addition to those inferred with this procedure [21]. It is important to note that $Z_{SM}$ is also modified with dependence on $C^{(6)}_i$. This dependence is an example of operators mixing down, and is always proportional to $\lambda$. This dependence is fully given in Ref. [18]. Such corrections can be included in the error estimate; the procedure is the same, one just rescales the SM couplings with these modifications of the running of the SM parameters. The top Yukawa ($y_t$) and gauge coupling ($g_3$) terms are expected to generally introduce the dominant theory error. The RGE is insufficient to characterize the full perturbative corrections and to use for determining central values of parameters [21]; it does not give a particularly good approximation for the full perturbative correction at lower values of $\Lambda$. However, practically, for such lower values of $\Lambda$ the $\mathcal{L}^{(8)}$ corrections that are also neglected are expected to dominate the error estimate. Using the log terms as a reasonable proxy for unknown perturbative corrections, for the purpose of a perturbative theory error when $\Lambda \gg \bar{v}_T$, is sufficient, well defined, and known at this time.
This procedure can be applied to all processes in the fit. It is not limited to processes proceeding through low $n$-point interactions. It is known that to preserve the Ward identities of the SMEFT [4] operator by operator, it is necessary to expand out the propagator shifts in the SM masses due to higher dimensional operators [22]. These linear mass shifts (in particular to the W mass in the case of an $\alpha$ input parameter scheme) should be considered part of the theoretical prediction at leading order. The perturbative error algorithm can be applied to such corrections. On the other hand, the shifts in the width are not required to preserve the Ward identities at one loop, and the perturbative error algorithm should not be applied to such terms. This is due to the one-loop nature of the widths of the SM particles. Similarly, if a process has already been improved to one loop in the SMEFT amplitude, this procedure should not be used, as the equivalent two-loop RGE should be used instead. Partial results of this form are available [23], but the full two-loop RGE remains unknown, making this impractical to execute.
The log-enhanced one-loop correction to the SMEFT amplitude modifies each observable, and this modification provides an estimate of the neglected perturbative corrections. We reserve the notation $\nabla$ for errors in this work.
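The log-enhanced estimate can be sketched numerically. The following is a minimal illustration under stated assumptions, not the calculation of Refs. [18][19][20]: it assumes the leading-log solution $C_i(\mu) \approx C_i(\Lambda) + \gamma_{in} C_n \log(\mu/\Lambda)/16\pi^2$ for an RGE written as $\mu \, dC_i/d\mu = \gamma_{in} C_n/16\pi^2$, and it identifies the shift of an observable with the LO coefficients $A_i$ contracted against that running. The function name, the matrix `gamma`, and all numbers are hypothetical placeholders.

```python
import math


def log_enhanced_shift(A, gamma, C, mu, Lambda):
    """Leading-log shift of an observable (illustrative convention).

    A      : LO coefficients A_i of the observable (e.g. from a LO simulation)
    gamma  : hypothetical anomalous-dimension matrix, gamma[i][n]
    C      : Wilson coefficients C_n at the scale Lambda
    mu     : measurement scale, same units as Lambda
    Returns sum_i A_i * gamma[i][n] * C[n] * log(mu / Lambda) / (16 pi^2).
    """
    loop_factor = 1.0 / (16.0 * math.pi ** 2)
    log_term = math.log(mu / Lambda)
    return sum(
        A_i * loop_factor * log_term * sum(g * c for g, c in zip(row, C))
        for A_i, row in zip(A, gamma)
    )


# Hypothetical two-operator example; none of these numbers are physical.
A = [1.0, -0.5]
gamma = [[2.0, 0.0], [1.0, 3.0]]
C = [0.1, 0.2]
shift = log_enhanced_shift(A, gamma, C, mu=200.0, Lambda=1000.0)
```

With these conventions the shift vanishes when `mu == Lambda` and grows logarithmically as the two scales separate, which is the behaviour the error estimate relies on.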

Example: VBF Higgs production
To demonstrate the determination of perturbative uncertainties we consider the inclusive result for the "qq → hqq VBF-like" process quoted in Table 9 of Ref. [17] (the "direct" contributions). In this case the contributing Wilson coefficients form the set $\{\ldots, C_{Hq}, C_{Hu}, C_{Hd}, C'_{\ell\ell}\}$. For simplicity, we restrict our attention to perturbative corrections proportional to the top quark Yukawa in this example; these results are present in Ref. [19], specifically in Eqns. (A.33, A.27, A.28, A.30, A.29, A.35) of that work. By inspection, the $C^{(6)}_{H\ell}$-type coefficients enter through their flavour-diagonal entries $pp$, and the perturbative error follows from these results. The LO results from Ref. [17] used here have the cuts $m_{jj} > 350$ GeV, $p_T(h) < 200$ GeV. A reasonable kinematic invariant to choose for the $\mu$ dependence in the logarithm is $\mu \sim 200$ GeV. Here $p$ is a flavour label; consistent with the lepton flavour assumption, it is summed over $p = \{1, 2, 3\}$. This procedure is repeated for every coefficient in the set $\{\ldots, C_{Hq}, C_{Hu}, C_{Hd}, C'_{\ell\ell}\}$, for each SM coupling dependence that one wishes to retain. For practical purposes, retaining the dependence on $y_t$, $y_b$, $g_{1,2,3}$ is sufficient. What results is an error estimate that is a linear sum of unknown (nuisance) parameters, with calculated coefficients. It is dominated in its SM coupling dependence by $y_t$, $g_3$ corrections. A distribution of the unknown Wilson coefficients needs to be chosen to produce a number to add in quadrature with other errors. Only a very weak dependence on the distribution chosen is present (other than on the overall choice of the scale $\Lambda$), consistent with the central limit theorem. This statistical behavior is present in the results shown in Refs. [6,24].
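The weak dependence on the chosen distribution can be illustrated with a small Monte Carlo sketch. It assumes a hypothetical set of eight calculated coefficients and compares two unit-variance choices for the nuisance parameters (Gaussian and uniform); the coefficient values, sample size, and function name are placeholders, not results from Refs. [6,24].

```python
import random
import statistics


def error_spread(coeffs, draw, n_samples=20000, seed=1):
    """Sample the linear sum  sum_i coeffs[i] * x_i  with x_i drawn from `draw`.

    Returns (standard deviation of the sum,
             fraction of samples within one standard deviation of zero).
    """
    rng = random.Random(seed)
    totals = [sum(a * draw(rng) for a in coeffs) for _ in range(n_samples)]
    sigma = statistics.pstdev(totals)
    inside = sum(1 for t in totals if abs(t) < sigma) / n_samples
    return sigma, inside


# Hypothetical calculated coefficients multiplying the nuisance parameters.
coeffs = [0.8, -0.3, 0.5, 0.2, -0.6, 0.4, 0.1, -0.7]

# Two unit-variance choices for the unknown Wilson coefficients.
HALF_WIDTH = 3 ** 0.5  # uniform on [-sqrt(3), sqrt(3)] has variance 1


def gaussian(rng):
    return rng.gauss(0.0, 1.0)


def uniform(rng):
    return rng.uniform(-HALF_WIDTH, HALF_WIDTH)


sigma_gauss, inside_gauss = error_spread(coeffs, gaussian)
sigma_unif, inside_unif = error_spread(coeffs, uniform)
analytic_sigma = sum(a * a for a in coeffs) ** 0.5
```

Both choices give a spread close to `analytic_sigma`, and in both cases the fraction of samples within one standard deviation is close to the Gaussian value of about 0.68, even though a single uniform variable is far from Gaussian.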
III. Missing $O(1/\Lambda^4)$ corrections: Many of the $O(1/\Lambda^4)$ corrections can also be determined from LO SMEFT results with rescalings. To identify these rescalings, it is appropriate to organize the theory in terms of specific composite operator kinematics, with scalar dressings that do not introduce new kinematics. This is exactly the geoSMEFT approach developed in Refs. [3][4][5][6][25]. In this case, field-space connections $G_i$ multiply composite operator forms $f_i$ as
$$\mathcal{L}_{SMEFT} = \sum_i G_i(I, A, \phi, \ldots) \, f_i ,$$
where the $G_i$ depend on the group indices $I, A$ of the (non-spacetime) symmetry groups, and the scalar field coordinates of the composite operators. Powers of $D^\mu H$ are included in $f_i$. The kinematic dependence is thus factorized into the $f_i$, and the rescalings by $G_i$ are exactly the rescalings needed to produce $O(1/\Lambda^4)$ corrections from LO simulation results. The geoSMEFT has been defined for all interactions up to four point interactions at this time. The rescaling procedure is best illustrated with a specific example. The three-point function $h-\gamma-\gamma$ in the SMEFT to all orders is given in Ref. [5]. $h \, \gamma_{\mu\nu} \gamma^{\mu\nu}$ is a common kinematic factor for the $\mathcal{L}^{(6,8)}$ contributions. As such, simple replacements can be made on the $\mathcal{L}^{(6)}$ dependence determined in SMEFTsim to directly generate $O(1/\Lambda^4)$ terms from LO results. Using the all-orders definition of the decay width in Refs. [5,6], at LO one finds a result that defines a replacement for the $f^{m_W}_1$ dependence. The use of this replacement in the one-loop result for the $f^{m_W}_1$ dependence in $\Gamma(h \to \gamma\gamma)$ introduces a relative uncertainty of $(\bar{v}_T^4/\Lambda^4)(1/16\pi^2)$. The replacement generates not only the quadratic terms, but also the full set of $\bar{v}_T^4/\Lambda^4$ corrections contributing to $\nabla\Gamma(h \to \gamma\gamma)$. An error can then be assigned by choosing a set of distributions for the $C^{(6)}_i$, $C^{(8)}_i$ and a value for $\Lambda$ when neglecting this class of terms. The choice of $\Lambda$ dictates the size of the error induced, and it is appropriate to choose multiple values of $\Lambda$ when determining errors. A straightforward choice is $\Lambda \sim 1$ TeV and $\Lambda \sim 3$ TeV. $\mathcal{L}^{(8)}$ induced errors dominate for the former choice, while errors due to neglected perturbative corrections dominate for the latter choice.
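The statement about which error dominates at each scale can be checked with naive power counting. The sketch below assumes all unknown coefficients are of order one, takes $\bar{v}_T \approx 246$ GeV and $\mu = m_h \approx 125$ GeV, and compares only the parametric sizes $\bar{v}_T^4/\Lambda^4$ and $(\bar{v}_T^2/\Lambda^2) \, |\log(\mu^2/\Lambda^2)|/16\pi^2$; it is an order-of-magnitude illustration, not the error assignment itself, and the function names are hypothetical.

```python
import math

V_T = 0.246  # electroweak scale in TeV (approximate)
M_H = 0.125  # Higgs mass in TeV (approximate), used as the scale mu


def dim8_size(Lambda):
    """Parametric size of neglected O(1/Lambda^4) terms, O(1) coefficients assumed."""
    return (V_T / Lambda) ** 4


def loop_log_size(Lambda, mu=M_H):
    """Parametric size of neglected log-enhanced one-loop terms."""
    log_term = abs(math.log(mu ** 2 / Lambda ** 2))
    return (V_T / Lambda) ** 2 * log_term / (16.0 * math.pi ** 2)


for Lambda in (1.0, 3.0):  # TeV
    ratio = dim8_size(Lambda) / loop_log_size(Lambda)
    print(f"Lambda = {Lambda} TeV: dim-8 / loop-log = {ratio:.2f}")
```

Under these assumptions the ratio is above one at 1 TeV and below one at 3 TeV, in line with the text.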
The case of $h-\gamma-\gamma$ is particularly simple, due to a narrow width approximation for the Higgs factorising production and decay, and the presence of only one vertex with a common kinematic structure. Extending this procedure to processes where multiple Feynman diagrams contribute, where each vertex building up the individual Feynman diagrams is generalised into the case of the geoSMEFT, requires that the individual dependence on at least one Wilson coefficient present at $\mathcal{L}^{(6)}$ in each type of vertex be identified and isolated, so that a rescaling procedure using the geoSMEFT generalisation of that vertex can be performed. As the same $\mathcal{L}^{(6)}$ correction can appear in multiple vertices, this can require choosing combinations of Wilson coefficients with fixed linear algebra relations to project out the dependence of a Wilson coefficient at a particular vertex. For example, consider a process where the same $\mathcal{L}^{(6)}$ Wilson coefficient $C^{(6)}_1$ appears in two vertices in a Feynman diagram, or sum of diagrams, with dependences
$$\delta V_1 = a_1 C^{(6)}_1 + a_2 C^{(6)}_2 + a_3 C^{(6)}_3 , \qquad \delta V_2 = b_1 C^{(6)}_1 + b_2 C^{(6)}_2 + b_3 C^{(6)}_3 .$$
The kinematics associated with the vertices $\delta V_{1,2}$ can differ in what follows (footnote 3). Each of $V_1$ and $V_2$ has a rescaling in the geoSMEFT, but the appearance in the overall result is a convolution of dependence on $C^{(6)}_{1,2,3}$ from both vertices. Isolating the dependence on $C^{(6)}_1$ in $V_1$, to perform the required rescaling to generate the $O(1/\Lambda^4)$ result, one can choose to fix $\delta V_2 = 0$ in the known SMEFT result by choosing $C^{(6)}_2 = -(b_1 C^{(6)}_1 + b_3 C^{(6)}_3)/b_2$ in the LO result. The resulting shift due to $\delta V_1$ is then modified to
$$\delta V_1 = \left(a_1 - \frac{a_2 b_1}{b_2}\right) C^{(6)}_1 + \left(a_3 - \frac{a_2 b_3}{b_2}\right) C^{(6)}_3 . \qquad (21)$$
The geoSMEFT based rescaling uses the known dependence of $C^{(6)}_1$ and a descendant $\mathcal{L}^{(8)}$ result $C^{(8)}_l$. The net dependence on $C^{(6)}$ in Eqn. (21) can be rescaled back to a net $a_1$ dependence using the known dependence on all vertices in the contribution to the observable; i.e. $a_1, a_2, b_1, b_2, b_3$ are all known analytically in the LO SMEFT results encoded in SMEFTsim, which are consistent with the geoSMEFT generalisation to higher orders in $1/\Lambda$. This procedure can be iterated for Wilson coefficient dependence in more than two vertices. Performing the full set of rescaling replacements for all vertices that build up the Feynman diagrams contributing to an observable then generalises the LO SMEFTsim result with a well defined class of terms at $O(1/\Lambda^4)$. A chosen set of distributions of $C_i$, and far more importantly, a chosen $\Lambda$ scale, then defines a numerical error for this set of calculated terms.
Footnote 3: Here $a_{1,2,3}$ and $b_{1,2,3}$ refer to explicit analytic (or numerically evaluated) dependence on a Wilson coefficient in a vertex. For example, when $\delta V_1 = f^{m_W}_1$ then $a_{HB} = 1$, $a_{HW} = 0.29$, $a_{HWB} = -0.54$.
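The projection step can be sketched with a few lines of linear algebra. The example assumes both vertex shifts are linear in three Wilson coefficients, $\delta V_1 = a_1 C_1 + a_2 C_2 + a_3 C_3$ and $\delta V_2 = b_1 C_1 + b_2 C_2 + b_3 C_3$, uses the $h \to \gamma\gamma$ values quoted for $a$ in the text, and takes hypothetical values for $b$ and for the Wilson coefficients; the function name is a placeholder.

```python
def isolate_vertex_one(a, b, C1, C3):
    """Choose C2 so that the second vertex shift vanishes, then evaluate both shifts.

    a, b : (a1, a2, a3) and (b1, b2, b3), the linear dependence of each vertex
    Returns (C2, delta_V1, delta_V2) with delta_V2 == 0 up to rounding.
    """
    a1, a2, a3 = a
    b1, b2, b3 = b
    if b2 == 0:
        raise ValueError("b2 must be non-zero to solve delta_V2 = 0 for C2")
    C2 = -(b1 * C1 + b3 * C3) / b2
    delta_V1 = a1 * C1 + a2 * C2 + a3 * C3
    delta_V2 = b1 * C1 + b2 * C2 + b3 * C3
    return C2, delta_V1, delta_V2


a = (1.0, 0.29, -0.54)  # a_HB, a_HW, a_HWB as quoted for delta V_1 = f_1^{m_W}
b = (0.5, 2.0, -1.0)    # hypothetical second-vertex dependence
C2, dV1, dV2 = isolate_vertex_one(a, b, C1=0.1, C3=0.2)

# Net coefficients of C1 and C3 in delta_V1 after the projection.
net_a1 = a[0] - a[1] * b[0] / b[1]
net_a3 = a[2] - a[1] * b[2] / b[1]
```

Since `net_a1` is built from known LO quantities, a result obtained with the projected coefficients can be divided by `net_a1` and multiplied by `a1` to recover the dependence of the first vertex alone, which is the input the rescaling needs.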
IV. New kinematics beyond $\mathcal{L}^{(6)}$: The procedure outlined above relies on the LO simulation for global SMEFT studies having a complete set of kinematic functions to rescale. A relevant question is when new kinematic forms first appear at $\mathcal{L}^{(8)}$ in SMEFT global fits of Higgs and EW data. Such anomalous kinematics is remarkably limited when only considering $n \leq 3$ point interactions building up pole observables. The anomalous kinematics are limited to field-space connections of the geoSMEFT of one specific form. The operator contributions to this field-space connection are equation of motion (EOM) reducible at $\mathcal{L}^{(6)}$, and hence are not present in the Warsaw basis. At $\mathcal{L}^{(8)}$ such terms are no longer EOM reducible. Such contributions modify $VV \to h$ production, and $h \to 4\ell$ through the modification of the $hVV$ vertex (here $V$ is a general vector). These corrections must be added in a dedicated extension of SMEFTsim [26]. Modified kinematics is present in $VH$ production, which requires a dedicated extension of simulation tools; see Ref. [27]. These contributions can be directly targeted for code extensions and direct simulation to complete the calculation of relevant observables to $O(1/\Lambda^4)$ using the $\mathcal{L}^{(8)}$ operator basis in Refs. [28,29]. Such dedicated extensions to SMEFTsim by subsets of $\mathcal{L}^{(8)}$ operators, combined with the algorithm above, result in a fully well defined theoretical result to $O(1/\Lambda^4)$ process by process, and such results can then be used to generate a theory error by directly varying the sub-leading terms in a chosen manner. The majority of the $O(1/\Lambda^4)$ results, when considering pole observables, can be generated for this purpose with geoSMEFT based rescaling, avoiding the need for simulation or code modification. On the other hand, when considering tail observables, four point interactions are generally important and unsuppressed compared to other SMEFT corrections. It is then required to add more operators to simulation tools to characterise observables to $O(1/\Lambda^4)$. When considering four point interactions (with no scalar field), a geoSMEFT based rescaling of lower order results is not possible at this time.
V. Quadratic terms: There is some confusion in the literature on the nature of quadratic terms. We address this issue to assess the use of quadratic terms for a theory error estimate, and whether the use of quadratic terms to define central values in SMEFT fits is well defined.
Here "quadratic terms" means the result of squaring Eqn. (2); the resulting $O(1/\Lambda^4)$ term is the quadratic term. On general grounds, retaining only a subset of terms in the power counting of an EFT is an ill-defined procedure, which is not invariant under the field redefinitions that define the theory (footnote 6). In the case of the SMEFT, retaining quadratic terms is subject to the following field redefinition based ambiguities. Eqn. (2) should be understood to have unspecified but existent terms at $O(1/\Lambda^4)$, with arbitrary coefficients $n_i$ and $o_j$, of the form given in Eqn. (24). There are also corrections of $O(1/\Lambda^4)$ with dynamical fields of dimension four and two in each case. The freedom to perform field redefinitions of $O(1/\Lambda^4)$ is fundamental to defining the SMEFT, and such an ambiguity is not fixed by defining the theory to $O(1/\Lambda^2)$. Predictions proportional to $n_i$ and $o_j$ are ambiguous until the full set of corrections is defined at $O(1/\Lambda^4)$ when defining an operator basis for $\mathcal{L}^{(8)}$. Squaring this result gives terms of $O(1/\Lambda^4)$. All of these terms are of order $O(1/\Lambda^4)$, and the $n_i$ and $o_j$ are arbitrary as it stands. These terms can be chosen to have $C_i$ dependence, in particular modifying the dependence on $C^{(6)}_k C^{(6)}_l$. This arbitrariness at $O(1/\Lambda^4)$ is represented by results in the published literature. Ref. [6] demonstrated that the leading quadratic term did not correctly predict the dependence on $(f^{m_W}_1)^2$. The reason is the redefinition of the electric coupling in the geoSMEFT. This redefinition is a specific example of field redefinition based corrections of the form shown in Eqn. (24). In addition, the arbitrariness represented by $n_i$, $o_j$ until $\mathcal{L}^{(8)}$ is defined is required to ensure that SMEFT predictions are well defined at $O(1/\Lambda^4)$ and not intrinsically dependent on the basis chosen for $\mathcal{L}^{(8)}$. When defining $\mathcal{L}^{(8)}$, the use of the Higgs EOM leads to modifications of $\mathcal{L}^{(6)}$ terms that are $\propto \bar{v}_T^2$. These terms are basis dependent artifacts that cancel in a full matching, as demonstrated in Refs. [6,24]. This correlated matching of $\mathcal{L}^{(8)}$ and $\bar{v}_T^2$ corrections to $\mathcal{L}^{(6)}$ is an example of matching effects descending down in operator mass dimension. This physics in matching and defining the operator bases is similar to the mixing of operators of different mass dimension in the SMEFT RGE; both effects come about due to the presence of the classical scale $\bar{v}_T^2$. $C^{(6)}_k a_k$ is not well defined in its predictions to $O(1/\Lambda^4)$ when neglecting such effects, as the Higgs EOM is not consistently applied at this order when such terms are neglected. For these reasons the quadratic terms are in general not well defined contributions at order $O(1/\Lambda^4)$, and they should not be used to fix central values in the global SMEFT fit as a default prediction. Such corrections also cannot be translated unambiguously between operator bases until the theory is fixed to $O(1/\Lambda^4)$. The reason is that Eqn. (25), and other EOM relations between $\mathcal{L}^{(6)}$ operators, have implicit (generally unspecified) corrections that are further suppressed by $O(1/\Lambda^2)$. Nevertheless, the use of quadratic terms to define an error estimate is a reasonable procedure [32][33][34][35], in the absence of complete results developed using the method outlined here. In particular, as the methodology outlined here is focused on improving SMEFT results to $O(1/\Lambda^4)$ for pole observables, the use of quadratic terms to estimate an error for tail observables as advocated in Refs. [33][34][35] can be appropriate.
Footnote 6: One can directly confirm that inconsistent expansions in the power counting expansion among vertex functions violate SMEFT Ward identities [4].
VI. Conclusions: We have defined a methodology to improve predictions for pole observables in the SMEFT systematically to $O(1/\Lambda^4)$ using geoSMEFT results. When new kinematics is first present at $\mathcal{L}^{(8)}$, modifications to code tools, and new simulation and event generation, are required to complete results to $O(1/\Lambda^4)$. However, such corrections are a small subset of the full set of corrections extending predictions to $O(1/\Lambda^4)$. This approach to improving LO results to sub-leading order in the operator expansion relies on simple linear algebra, Taylor expansions of known closed form all-orders expressions in the geoSMEFT, and the known dependence on $\mathcal{L}^{(6)}$ encoded in SMEFTsim. The approach outlined here can be combined with the approach of Refs. [32][33][34][35] for non-pole observables.
Truncation errors result from taking the resulting exact expressions to $O(1/\Lambda^4)$ and varying the unknown higher order terms in a range of values. Then $\sqrt{(\nabla\sigma_{1/\Lambda^4})^2 + (\nabla\sigma_{1/16\pi^2\Lambda^2})^2}$ defines a theory error estimate. As the operator expansion in the SMEFT involves many terms at $O(1/\Lambda^4)$ that are randomly chosen via distributions in linear sums, the central limit theorem applies. In global combinations, common values of $\Lambda$ should be chosen to define errors for observables. The resulting SMEFT theory error for a LO fit is a Gaussian distributed numerical value for each observable, with magnitude determined by the chosen $\Lambda$.
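The final combination is a sum in quadrature of the two error components. A minimal sketch, assuming the two inputs have already been evaluated for a common choice of $\Lambda$ and are treated as independent; the function and argument names are hypothetical.

```python
import math


def theory_error(err_operator, err_loop):
    """Combine the O(1/Lambda^4) and one-loop error estimates in quadrature.

    err_operator : error from neglected O(1/Lambda^4) terms
    err_loop     : error from neglected log-enhanced one-loop terms
    Both are assumed to be in the same units as the observable.
    """
    return math.hypot(err_operator, err_loop)


# Placeholder inputs chosen only to make the arithmetic easy to follow.
combined = theory_error(3.0, 4.0)
```

Whichever component is larger controls the result, so the combined error tracks the $\mathcal{L}^{(8)}$ estimate at low $\Lambda$ and the perturbative estimate at high $\Lambda$.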