Complete Set of Dimension-9 Operators in the Standard Model Effective Field Theory

We present a complete and independent operator basis at dimension 9 in the Standard Model effective field theory, obtained with an automated algorithm based on the amplitude-operator correspondence. A complete basis (y-basis) is first constructed by enumerating Young tableaux of an auxiliary $SU(N)$ group and of the gauge groups, with the equation-of-motion and integration-by-parts redundancies all removed. In the presence of repeated fields, another basis (p-basis) with explicit flavor symmetries among them is derived from the y-basis, which further induces a basis of independent monomial operators through a systematic process called de-symmetrization. Our form of operators has advantages over the traditional way of presenting operators constrained by flavor relations, both in eliminating flavor redundancies and in identifying independent flavor-specified operators. We list the 90456 (560) operators for three (one) generations of fermions, all of which violate baryon number or lepton number conservation; among them we find new violation patterns, such as $\Delta B = 2$ and $\Delta L = 3$, which only appear at dimensions $d \ge 9$.


Introduction
Although it is the most successful theory of particle physics to date, the standard model (SM) still leaves many questions about the nature of matter unanswered, which motivates direct and indirect experimental searches for new physics (NP). For instance, the baryon asymmetry of the universe and nonzero neutrino masses may indicate that the baryon number B and the lepton number L are violated via additional new degrees of freedom. The absence of signals of physics beyond the SM at the Large Hadron Collider (LHC) suggests that new particles are either very weakly coupled or much heavier than the electroweak scale. Assuming that new particles live at a high energy scale Λ, well above the electroweak scale, their effects at experimental energies much below Λ can be systematically described in the effective field theory (EFT) framework.
The Standard Model effective field theory (SMEFT) provides a systematic approach to describe the effects of heavy particles at low energy in a model-independent way. The SMEFT Lagrangian can be organized by the dimension of the effective operators, in inverse powers of the heavy scale Λ, where each operator O is constructed by writing down all possible Lorentz and gauge invariants using the SM fields only. Although it is possible to find a set of Lorentz and gauge invariant operators for a given mass dimension d, such a set might be redundant due to relations between different operators. By means of equations of motion (EOM), Fierz identities, and integration by parts (IBP), one can eliminate the redundancies at each dimension and obtain a complete and independent operator basis. The operators up to dimension 7 have been listed in this way in Refs. [1][2][3][4][5][6][7][8].
At dimension 8 and higher, the number of operators increases tremendously, which makes this task tedious and error-prone. Instead, we provided a systematic and automated method [9] to write down a complete and independent basis directly, and applied it to list the complete set of dimension 8 operators in the SMEFT. At the same time, Ref. [10] wrote down the dimension 8 operators using the traditional treatment of the EOM and IBP redundancies.
Compared to Ref. [10], since we start from operators in which the EOM is absent and the IBP is treated from the beginning, the correctness of our result is theoretically guaranteed from first principles. It is also pointed out in [9] that our method provides a relatively simple way to enumerate all the independent flavor-specified operators, which the traditional method does not.
The baryon number violation (BNV) and lepton number violation (LNV) of an operator are correlated with its mass dimension d, based on the requirement that the operator is invariant under the weak hypercharge symmetry and the Lorentz symmetry [11][12][13]. From this we learn that operators at odd dimensions must have BNV or LNV, and that |∆B − ∆L| = 2 up to dimension 15. At odd dimensions, there exist LNV processes with ∆L = 2, relevant to leptogenesis, neutrinoless double beta decay, and neutrino masses [14,15]. For example, if leptogenesis or baryogenesis occurs at temperatures above the weak scale, B − L violation is required to avoid the washout effect by the electroweak sphalerons, and at the same time constraints from proton two-body decays (which conserve B − L) are not applicable. At dimension 5, the only operator is the Weinberg operator [1] with (∆B, ∆L) = (0, 2), while at dimension 7, all the operators have BLV with possibilities (∆B, ∆L) = (0, ±2), (±1, ∓1) [7], which either break the lepton number by 2 or induce proton two-body decay.
Starting from dimension 9, besides the operators with (∆B, ∆L) = (0, ±2), (±1, ∓1), there are new violation patterns in the operators with (∆B, ∆L) = (±1, ±3), (±2, 0). First, operators relevant for ∆B = 2 processes, such as neutron-antineutron oscillations, first appear at d = 9 [16]; they are directly connected to low-scale realizations of baryogenesis without the need for sphaleron processes. Second, ∆L = 3 processes can only arise from operators of dimension 9 and higher [17,18]. Lepton number violation by exactly three units implies that the proton decay final states must be at least three-body and that the associated new physics scale could be as low as 1 TeV, which opens the possibility of searching for such processes not only in proton decay experiments but also at the LHC [18]. Finally, dimension 9 operators with ∆L = 2 are expected to be subdominant to those at dimensions 5 and 7. However, if the ∆L = 2 operators first appear at dimension 9, the new physics scale could be as low as 1 TeV and the effects could thus be tested at the LHC in the near future. For example, the operators for Majorana neutrino masses, such as the Weinberg operator, are typically related to the tree-level seesaw, and thus the new physics scale is quite high. However, if the Majorana neutrino masses are generated by tree-level mechanisms at dimension 9, the related new physics is around the TeV scale [19]. Thus one expects that the LHC experiments will start to explore these kinds of models in the near future. In neutrinoless double beta decay, if the dominant contributions originate from dimension 9 operators [20,21], one expects the new physics scale to be around a TeV, and thus collider experiments could also shed light on such new physics. Hence, listing a complete set of dimension 9 operators sets up the framework for these phenomenological studies.
We adopt the method in [9] to list the dimension 9 operators in the SMEFT. While the method is elaborated in [9], we present in this paper more details about its motivation stemming from the so-called amplitude-operator correspondence.
By establishing the one-to-one correspondence between the effective operators and the local amplitudes they generate, we first categorize them in terms of the external states in the scattering, that is, a certain collection of particles in the EFT. A category of operators thus found is called a type. For a given type of operators, we define several bases for various uses as follows:
• y-basis: Our algorithm uses group theory techniques to enumerate an independent and complete basis as a collection of Young tableaux for each factor of the operators, hence the name Young tableau basis or y-basis. For the Lorentz factor, the basis is obtained as the Semi-Standard Young Tableaux (SSYT) of an auxiliary $SU(N)$ group, where N is the number of fields; for the gauge groups, the basis is given by Young tableaux constructed from the Littlewood-Richardson (L-R) rule.
• m-basis: For practical purposes, the operator basis should preferably consist of monomials, while the y-basis operators, after transforming to the usual convention, are often long polynomials. By a systematic reduction to the y-basis, we can select a set of monomial operators that have independent coordinates with respect to the y-basis. A complete basis of monomial operators selected this way is called an m-basis, which is highly non-unique.
• p-basis: Even the y-basis is not enough when repeated fields are present, as explained in [9] from the operator viewpoint. In this paper, we also illustrate this extra constraint from the amplitude viewpoint, which introduces the symmetric permutation basis, or p-basis, as the symmetrized flavor-blind amplitude basis and the corresponding operators. The symmetrization procedure provides a full-rank conversion matrix from the y-basis to the p-basis, which guarantees its independence and completeness. When p-basis operators are viewed as flavor tensors of a group of repeated fields, those that form a basis of an irrep of the symmetric group are related by certain permutations.
• p'-basis: To reduce the lengths of operators in the usual notation, while keeping the flavor symmetries manifest, we develop a systematic procedure, the de-symmetrization, to obtain a series of m-basis operators that symmetrize to independent combinations of the p-basis with the same flavor symmetry. The procedure is especially important if multiple representation spaces of the same symmetry exist. This is a new part of our method that was not developed in [9].
The resulting operator basis we obtain with the above method is listed in terms of various levels of categories:
• Class: A (Lorentz) class includes types of operators with a given number of fields under each Lorentz irreducible representation (irrep) and the same number of covariant derivatives, such that they may share the same Lorentz structures. It is different from the concept of operator "class" in other literature, because we distinguish the chiralities of the fields, as their corresponding particles have definite helicities. In particular, fermions and gauge bosons should be written in the chiral basis in our notation. The list of possible classes at a given dimension is model-independent, as we show in the tables at dimension 9, though not all of them show up in specific models.
• Type: The definition is given above. All the types are obtained by plugging the field content of the SMEFT into the dimension 9 classes, ensuring that their representations can form singlets for each symmetry group. We emphasize that our "type" is more rigorously defined than in other literature, as we specify the way to eliminate the EOM redundancy, so that a type of operators as we define it only corresponds to the local amplitudes they can generate for a given collection of external particles.
• Term: The p-basis or p'-basis are operators with free flavor indices, which contract with Wilson coefficient tensors to form a (Lagrangian) term. The corresponding amplitude basis is flavor-blind. Our "terms" are irreducible flavor tensors with a specific flavor symmetry λ, different from the concept of "terms" in other literature [10,22], where flavor tensors with different symmetries may merge into a reducible tensor. We compare the form of our terms to the traditional form of operators with flavor relations [3,11,23] to show their equivalence, and explain the advantages of our form.
• Operator: The number of (flavor-specified) operators per term can be understood as the independent entries in the Wilson coefficient tensor, constrained by the flavor symmetry. One can also view the independent operators as p-basis contracted with independent flavor tensor basis, labeled by flavor Young tableau.
The paper is organized as follows. In section 2, we discuss the principle for finding independent and complete operators with the amplitude-operator correspondence. In section 3, we describe the general ideas of how to obtain a complete set of independent operators with free flavor indices in the y-, m-, and p-bases and how to convert them into each other. In section 4, we take a concrete example to show how to obtain a set of independent terms for a given type of operators and demonstrate the advantages of listing operators at the level of terms with definite flavor permutation symmetry. In section 5, we list all the independent terms at dimension 9 in the SMEFT in different categories. We reach our conclusion in section 6. Additionally, in appendix A, we list useful formulae for transforming operators between two- and four-component spinor notations, and in appendix B we provide a list of sub-classes up to dimension 9.

On-shell Convention for Effective Operators
The Lagrangian of the SMEFT consists of the SM fields and their covariant derivatives $D_\mu$, along with group factors that render the operators invariant under the Lorentz group $SL(2,\mathbb{C}) = SU(2)_l \times SU(2)_r$ and the SM gauge group $SU(3)_C \times SU(2)_W \times U(1)_Y$. Here the indices for the fundamental representations of the $SU(2)_l$ and $SU(2)_r$ groups are denoted by $(\alpha, \beta, \gamma, \delta)$ and $(\dot\alpha, \dot\beta, \dot\gamma, \dot\delta)$. The indices for the fundamental and adjoint representations of $SU(3)_C$ are $(a, b, c, d)$ and $(A, B, C, D)$, while those for $SU(2)_W$ are $(i, j, k, l)$ and $(I, J, K, L)$, respectively. These are conventional notations for the effective operators, which we use to present our final result, the complete set of dim-9 operators in the SMEFT, in section 5.
Although operators in this notation are more familiar to phenomenologists, it is hard to systematically define an independent basis for them, given the redundancies due to the EOM and the IBP relation. The usual way to achieve this goal is to write down an over-complete basis and derive the dependencies manually [3,8,10,22,24,25]. However, this has to be done model by model, and becomes extraordinarily cumbersome at higher dimensions. In Refs. [26][27][28][29][30], it was pointed out that independent operators could be enumerated in terms of their corresponding local on-shell amplitudes, dubbed the amplitude basis. Ref. [31] further proposed an algorithm to enumerate an independent amplitude basis subject to momentum conservation, which is equivalent to the IBP redundancy, as we will explain shortly. In Ref. [9], an integrated algorithm using the correspondence was proposed and applied to the enumeration of the dimension 8 operators in the SMEFT. In this section, we elaborate on the amplitude-operator correspondence and prove its applicability to the task of operator enumeration.

Amplitude-operator correspondence
The correspondence concerns in particular operators as Lagrangian terms, which are Lorentz singlets, so that they directly contribute to scattering amplitudes. Among the amplitudes they contribute to, the set of local amplitudes, or "amplitude basis", spans a linear space isomorphic to the operator space. To prove the isomorphism, we first investigate the general structure of the amplitude basis, which we express in terms of the spinor helicity variables $\lambda_{i\alpha}$, $\tilde\lambda_{i\dot\alpha}$, defined through $p_{i\,\alpha\dot\alpha} = \lambda_{i\alpha}\tilde\lambda_{i\dot\alpha}$ up to the little group transformations (see footnote 3) $\lambda_i \to e^{-i\varphi/2}\lambda_i$, $\tilde\lambda_i \to e^{i\varphi/2}\tilde\lambda_i$, while the spinor indices are raised and lowered by the Levi-Civita tensor with $\epsilon^{12} = -\epsilon^{21} = +1$. The number of constituent spinors is constrained by the little group representations of the external particles, e.g. the helicities for massless particles. The amplitude basis $\mathcal{B}$ should respect the little group representations of all the external particles. For example, under the little group $U(1)_i$ of the $i$th massless particle, it should gain a phase $\mathcal{B} \to e^{i h_i \varphi}\mathcal{B}$. Therefore, in general, a massless particle of helicity $h_i$ contributes a factor $\lambda_i^{r_i - h_i}\tilde\lambda_i^{r_i + h_i}$ that has the correct little group weight, where $r_i \ge |h_i|$ is a free (half-)integer parameter. The general form of the amplitude basis, eq. (2.7), factorizes into a group factor $T$ and a kinematic factor $\mathcal{M}$, where $\phi_i$, $i = 1, \dots, N$, are the external particle multiplets with momenta $p_i$, and $a_i$ are the collections of group indices for them. The mass dimension of the amplitude is determined as $[\mathcal{B}] \equiv r = \sum_i r_i$. The kinematic factor $\mathcal{M}$ is a function of the spinor variables that only depends on the helicities $h_i$ of the external particles, and characterizes the energy dependence and the angular distribution of the amplitude. Global Lorentz invariance demands that all spinor indices are contracted; the contractions of two $\lambda$'s and of two $\tilde\lambda$'s are conventionally denoted by angle brackets $\langle ij\rangle$ and square brackets $[ij]$, respectively. Thus $\mathcal{M}$ must consist of $n = \frac{r-h}{2}$ brackets of the $\langle\cdot\rangle$ type and $\tilde n = \frac{r+h}{2}$ brackets of the $[\cdot]$ type, $h = \sum_i h_i$ being the total helicity.
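The bracket counting above can be sketched in a few lines of Python. This is only an illustrative check of the relations $n = (r-h)/2$ and $\tilde n = (r+h)/2$; the function name `bracket_counts` is hypothetical, and integrality of the two counts is used here as a necessary condition only, not as the full set of constraints on valid classes.

```python
from fractions import Fraction

def bracket_counts(helicities, r):
    """Return (n, n_tilde), the numbers of angle and square brackets.

    helicities: list of helicities h_i; r: total mass dimension sum_i r_i.
    Raises ValueError when the counts are not non-negative integers,
    in which case no Lorentz singlet can be formed.
    """
    h = sum(Fraction(x) for x in helicities)  # total helicity
    r = Fraction(r)
    n, n_tilde = (r - h) / 2, (r + h) / 2
    if n.denominator != 1 or n_tilde.denominator != 1 or n < 0 or n_tilde < 0:
        raise ValueError("no Lorentz singlet for this (helicities, r)")
    return int(n), int(n_tilde)

# Four left-handed fermions, h_i = -1/2 each.
left = [Fraction(-1, 2)] * 4
print(bracket_counts(left, 2))  # two angle brackets, no square bracket
print(bracket_counts(left, 4))  # three angle brackets, one square bracket
```

For the same four fermions, `bracket_counts(left, 3)` raises an error, since the counts would be half-integer.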
The group factor $T$ is the product of tensors for each group under which the multiplets $\phi^a$ transform. For symmetry groups, like the gauge group or some global symmetry group, $T$ has to be an invariant tensor.
The index $a$ can also include the flavor degree of freedom, which does not necessarily carry a symmetry, in which case the corresponding tensor need not be invariant. We define the subspace of local amplitudes with the same set of external particles $\phi_1^{a_1}, \dots, \phi_N^{a_N}$ and the same mass dimension $r$ as a "type", in which the various amplitude basis elements are specified by the group tensors $T$, the partition $r_i$, and the structure of spinor contractions. Furthermore, the types with the same tuple $(h_1, \dots, h_N; r)$ form a "class", whose members share the same bases of kinematic factors $\mathcal{M}$. Note that we do not have to specify the division between the initial and final states, because different divisions are simply related by crossing symmetry and analytic continuation.
Here we take a simple example to illustrate: the amplitude basis for 4 left-handed fermions. Each of them contributes a factor $\lambda_i^{r_i+1/2}\tilde\lambda_i^{r_i-1/2}$, where $r_i$ is a positive half-integer. The lowest mass dimension is reached when $r_i = 1/2$ and $r = 2$, for which there are three possible contractions $\mathcal{M}_1$, $\mathcal{M}_2$, $\mathcal{M}_3$ of the four $\lambda$'s into pairs of angle brackets. The Schouten identity indicates that $\mathcal{M}_1 + \mathcal{M}_2 + \mathcal{M}_3 = 0$, which reduces the number of independent amplitude basis elements by 1. This redundancy is equivalent to the Fierz identity for operators, which we will solve systematically later. For the higher mass dimension $r = 3$, where one of the $r_i$ takes the value 3/2, there is only one $\tilde\lambda$, which cannot form a Lorentz singlet. Thus the next available dimension is $r = 4$, for instance with $r_1 = r_2 = 3/2$ and $r_3 = r_4 = 1/2$, and one possible amplitude basis element is
$\mathcal{M}_1(-1/2, -1/2, -1/2, -1/2; r = 4) = \langle 12\rangle^2 \langle 34\rangle [12]$. (2.10)
Later in section 2.2, we will derive the full constraints on these parameters, so that we can enumerate the valid classes $(h_1, \dots, h_N; r)$ that can form Lorentz singlets.
Footnote 3: The definition can be extended to massive particles, whose little group is $SU(2)$ and hence the spinor variables have an extra $SU(2)$ index $I, J, \dots$. In this paper we only enumerate the amplitude basis for massless particles.
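As a quick numerical illustration of the Schouten identity used in this example, the sketch below evaluates the three pairings of four random two-component spinors. The specific sign and ordering convention $\langle 12\rangle\langle 34\rangle + \langle 13\rangle\langle 42\rangle + \langle 14\rangle\langle 23\rangle = 0$ is the standard textbook form and is an assumption here; the labeling of $\mathcal{M}_{1,2,3}$ in the text may differ by signs. The helper names `angle` and `random_spinor` are hypothetical.

```python
import random

def angle(a, b):
    """Antisymmetric contraction <ab> of two 2-component spinors."""
    return a[0] * b[1] - a[1] * b[0]

def random_spinor(rng):
    """A random complex 2-component spinor (illustrative kinematics only)."""
    return (complex(rng.uniform(-1, 1), rng.uniform(-1, 1)),
            complex(rng.uniform(-1, 1), rng.uniform(-1, 1)))

rng = random.Random(0)
l1, l2, l3, l4 = (random_spinor(rng) for _ in range(4))

# The three ways of pairing four spinors into two angle brackets.
m1 = angle(l1, l2) * angle(l3, l4)
m2 = angle(l1, l3) * angle(l4, l2)
m3 = angle(l1, l4) * angle(l2, l3)

# Schouten identity: the three pairings sum to zero up to rounding.
schouten_residual = abs(m1 + m2 + m3)
print(schouten_residual < 1e-12)
```

The same `angle` function gives `angle(l1, l1) == 0`, which is the self-contraction that vanishes by antisymmetry.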
To find the operator that generates such an amplitude basis, one simply applies the translation of eq. (2.11), where $F_{L/R} = \frac{1}{2}(F \mp i\tilde F)$ are the chiral bases of the gauge bosons, and $\psi$ denotes left-handed Weyl spinors. For a unified notation, right-handed Weyl spinors are denoted as conjugates of some left-handed spinors, $\psi_R^{\dot\alpha} = \epsilon^{\dot\alpha\dot\beta}(\psi_L^\dagger)_{\dot\beta}$. All the spinor indices of the operators on the right-hand side are made totally symmetric, among dotted and undotted indices respectively, the same as those on the left-hand side. These indices are contracted between such building blocks according to how the spinor variables are contracted. Thus the Lorentz structure corresponding to the kinematic factor in eq. (2.7) is given by eq. (2.12). Such an operator can also contribute to an amplitude with an extra photon γ from the covariant derivative of the charged field Ψ, where $J_\Psi$ is the charged Ψ current that minimally couples to the photon field $A$. The first term is the local but gauge-dependent vertex contribution, while the sum is gauge invariant but contains a mass pole for Ψ, at which the amplitude factorizes into an amplitude basis without the photon and an amplitude basis for the minimal coupling.
Do the operators in eq. (2.12) exhaust all the possible forms of gauge invariant operators? The only caveat comes from the requirement of total symmetry among the spinor indices. It turns out that if they are not totally symmetric, so that the corresponding amplitude (of the given type) contains antisymmetrized spinors from the same particle, the resulting amplitude basis must vanish due to the on-shell condition $\lambda_{i[\alpha}\lambda_{i\beta]} = \langle ii\rangle\epsilon_{\alpha\beta} = 0$. In the corresponding relations, terms that convert to other types by the EOM are omitted, which stems from $\langle 11\rangle = [33] = 0$. In summary, taking momentum conservation into account, the amplitude basis corresponds to an IBP non-redundant basis of operators. Inspired by this correspondence, our strategy of operator enumeration is essentially the enumeration of the amplitude basis.
The physical reason for such a correspondence is that the free parameters of the theory should count the same in both the Lagrangian formalism and the on-shell formalism. While it is straightforward to define them in a Lagrangian as the independent Wilson coefficients, the free parameters in the on-shell formalism should be encoded in local amplitudes, because they are the ultimate outcome of the cascade of unitarity factorization of any amplitude. Building a quantum field theory from the operator basis and their Wilson coefficients is already a textbook technique, but building a theory from the corresponding amplitude basis has not been as successful, though we show that they contain the same amount of information. The recursion relations developed in the past decade [33] are only applicable to certain "on-shell constructible" theories [34], whereas a more general on-shell formalism based on the amplitude basis is still waiting to be discovered.

On-shell building blocks and Lorentz classes
In light of the amplitude-operator correspondence eq. (2.11), we adopt the chiral basis of the fields and derivatives, all with spinor indices, which are in the irreducible representations $(j_l, j_r)$ of the Lorentz group $SU(2)_l \times SU(2)_r$:
$\phi \in (0, 0), \quad \psi_\alpha \in (1/2, 0), \quad \psi^\dagger_{\dot\alpha} \in (0, 1/2)$. (2.16)
Footnote 4: Such polynomial functions are regarded as form factors of operators [32], which characterize the state generated by an operator from the vacuum, $F = \langle\psi|\mathcal{O}|0\rangle$; this is not a physical process and does not satisfy momentum conservation.
In this notation, we have the SMEFT field content as in table 1, where the conjugate fields with conjugate representations and opposite helicities and charges are omitted.
To enumerate the valid Lorentz classes at a given dimension $d$, denoted by $F_L^{n_{-1}}\psi^{n_{-1/2}}\phi^{n_0}\psi^{\dagger\, n_{1/2}}F_R^{n_1}D^{n_D}$ and corresponding to classes of amplitudes $\mathcal{M}(h_1, \dots, h_N; r)$, one may adopt the steps described in [9], where the constraints of eq. (2.19) are considered. At dimension 9, we list all the classes in table 2, which is model independent. The types of operators are then obtained by substituting the SMEFT field content from table 1 into eq. (2.12) with varying numbers of derivatives and spinor contractions, while the representations of the constituent fields under the gauge groups should be able to form singlets ($U(1)$ charges should add up to zero). The classes colored in gray are those ruled out by this condition, and thus do not appear in the SMEFT. In the next section, we show the details of obtaining a complete basis for a given type of operators/amplitudes, and how to convert an arbitrary operator (basis) to our basis.

N    (n, ñ)    Classes
9    (0, 0)    φ^9
Table 2: All the Lorentz classes at dimension 9. Classes in gray do not appear in the SMEFT due to global symmetries, such as the odd parity for all the SU(2)_W doublets, which forbids quite a few Lorentz classes with an odd number of scalars.

y-basis: a complete basis from Young Tableau
In this section, we briefly summarize the algorithm to obtain a complete basis for a type of local amplitudes/operators, which we elaborated in [9]. As explained previously, a type of local amplitudes is given by the same external particle species at a certain mass dimension $r$, and consists of the kinematic factor $\mathcal{M}$, which describes the energy dependence and the angular distribution, and the gauge group factor $T = \prod_G T_G$, which describes the gauge group representations. Given the helicities $h_i$ and gauge representations $r^G_i$ of the external particles, the two factors span linear spaces of dimension $\mathcal{N}_M$ and $\mathcal{N}_G$ respectively, whose outer product is the linear space of the amplitude basis. The spin statistics of identical particles puts extra constraints on this product space, which we postpone to the end of this section.
According to the amplitude-operator correspondence, the space of operators of the same type should have the same structure, with total dimension $\mathcal{N} = \mathcal{N}_M \prod_G \mathcal{N}_G$ (3.1). Therefore our first task is to enumerate the $\mathcal{N}_M$ basis elements for a given class $\mathcal{M}(h_1, \dots, h_N; r)$ and the $\mathcal{N}_G$ basis elements for each group $G$ given the representations $r^G_i$. For the kinematic factor, since the EOM redundancies are removed by construction, the remaining redundancies are the momentum conservation and the Schouten identity, both mentioned in the previous section. We utilize an $SU(N)$ transformation introduced in [31], under which the total momentum (in the all-outgoing convention), which vanishes due to momentum conservation, is invariant. This transformation is reformulated in terms of operators and further developed in [9]. The non-redundant amplitudes/operators thus form a particular irreducible representation space of the $SU(N)$ group, the basis of which is given by the $SU(N)$ semi-standard Young tableaux (SSYT). Specifically, the shape of the Young diagram (YD) for this particular irrep, called the primary YD, is determined by a tuple of 3 numbers $(N, n, \tilde n)$, where $n$, $\tilde n$ are the parameters introduced in the previous section as the numbers of $\langle\cdot\rangle$ type and $[\cdot]$ type brackets in the amplitude. They can be derived from the constraints eq. (2.19). The primary YD is given in eq. (3.2) and is translated to amplitudes column by column as in eq. (3.3), where $\mathcal{E}$ is the Levi-Civita tensor of the $SU(N)$ group. As shown in table 2, where classes are organized in terms of the tuple $(N, n, \tilde n)$, there is typically more than one class sharing the same primary YD. It is proved in [9] that the classes are in one-to-one correspondence with the collections of labels to be filled in the YD, so that for a given class the multiplicity of each label $i$ in the collection is fixed. With the collection of labels and the YD, it is not hard to enumerate all the SSYT's and translate them into amplitudes via eq. (3.3), or further into operators using the amplitude-operator correspondence. One can also count the number of SSYT's, $\mathcal{N}_M$, without the label filling: as pointed out in [9], $\mathcal{N}_M$ can be regarded as the multiplicity of the primary YD in the direct product decomposition of the one-row sub-YD's for each label. The direct product decomposition is carried out by the Littlewood-Richardson (L-R) rule. A concrete example for the Lorentz class $F_L\psi^3\psi^\dagger D$ is given in eq. (4.6).
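The enumeration of SSYT's with a prescribed collection of labels can be sketched by brute force. The function below counts fillings of a given Young diagram in which rows weakly increase and columns strictly increase, with each label used a fixed number of times. It is a generic combinatorial sketch, not the algorithm of [9]; the function name `count_ssyt` is hypothetical, and the shape and label content in the usage lines are illustrative assumptions rather than values read off eq. (3.2).

```python
def count_ssyt(shape, content):
    """Count semi-standard Young tableaux of `shape` (a partition, given as
    non-increasing row lengths) in which label i+1 appears content[i] times."""
    if sum(shape) != sum(content):
        return 0
    # Cells in reading order: left to right, top to bottom.
    cells = [(r, c) for r, length in enumerate(shape) for c in range(length)]
    filling = {}
    remaining = list(content)

    def fill(k):
        if k == len(cells):
            return 1
        r, c = cells[k]
        total = 0
        for label in range(len(remaining)):
            if remaining[label] == 0:
                continue
            if c > 0 and filling[(r, c - 1)] > label:
                continue  # rows must be weakly increasing
            if r > 0 and filling[(r - 1, c)] >= label:
                continue  # columns must be strictly increasing
            filling[(r, c)] = label
            remaining[label] -= 1
            total += fill(k + 1)
            remaining[label] += 1
        return total

    return fill(0)

# Illustrative input: a 2x2 diagram filled with labels 1..4, once each.
print(count_ssyt([2, 2], [1, 1, 1, 1]))
```

For this illustrative input the count is 2, the same as the number of independent contractions left by the Schouten identity in the four-fermion example of section 2, assuming that example corresponds to this diagram and label content.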
The gauge group factors $T_G$ are also given by Levi-Civita tensors that contract with the fundamental indices of the fields; fields in non-fundamental irreps (e.g. gauge fields in the adjoint representation) provide multiple fundamental indices with particular symmetries. As the only non-fundamental representations of the SM fields are the adjoint and the anti-fundamental, their conversion is listed in eq. (3.6). The corresponding y-basis group factors are obtained by constructing the singlet Young tableaux following the L-R rule with the corresponding indices filled in, as discussed in Ref. [9]. The singlet Young tableaux for $SU(2)_W$ and $SU(3)_C$ are rectangular, with two and three rows respectively, and hence $n_{\rm box}/2$ and $n_{\rm box}/3$ columns, where $n_{\rm box}$ is the total number of boxes in the YD, equal to the total number of fundamental indices of the fields. As an example, we illustrate the way to construct such singlet Young tableaux with the type $G_L d_C^3 e_C^\dagger D$, which we will discuss in detail in section 4. The $SU(2)_W$ group is trivial for this type of operators, so we focus only on the $SU(3)_C$ part. The conversion of the non-fundamental indices in this case follows eq. (3.6), from which we can construct the singlet Young tableaux by applying the L-R rule field by field in a fixed order. The complete basis of group factors is obtained by contracting the products of the $\epsilon$'s obtained from the Young tableaux with the prefactors converting the non-fundamental indices in eq. (3.6), which yields tensors with exactly the conjugate indices of the fields, so that they contract with the fields to form gauge singlets. The number of basis tensors can, again, be obtained from the L-R rule, as the multiplicity of the singlet in the decomposition of the direct product of the field representations; in the above example this reproduces the number of basis tensors we enumerated.
Since the bases obtained for both $\mathcal{M}$ and the gauge groups $G$ are given by Young tableaux, we call their outer product the Young tableau basis, or y-basis, of local amplitudes/operators. The y-basis operators are denoted as $\mathcal{O}^{(y)}$.

Operators reducing to y-basis
For operators, the y-basis defined above may not be of the most convenient form illustrated at the beginning of section 2.
However, as a complete basis, the y-basis can be used to uniquely identify any operator in the EFT, whether taken from the literature in some conventional form or obtained in some particular computation like the Covariant Derivative Expansion (CDE) [35], as a coordinate in the space of operators. To achieve this goal, one needs to expand an arbitrary operator in terms of the y-basis.
For the Lorentz structure, one should first convert it into the standard form eq. (2.12) with the following steps:
• Decompose Dirac fermions into chiral/Weyl fermions. As we only deal with massless fields, the two Weyl components are independent, so the decomposition is straightforward.
• Convert the covariant derivatives $D_\mu$ and the gauge fields $F_{\mu\nu}$ into the $SL(2,\mathbb{C})$ basis, with dotted or undotted spinor indices.
In the standard form, the spinor contraction structure can be translated into an $SU(N)$ Young tableau, though not necessarily an SSYT. Group theory proves that the SSYT's are an independent and complete basis of all the Young tableaux, given the Fock conditions that relate them. The Fock conditions for the primary YD eq. (3.2) are exactly equivalent to the momentum conservation (the IBP relation for operators) and the Schouten identities, the redundancy relations that we removed to obtain the y-basis. Therefore, we need a systematic replacement rule to apply these relations to an arbitrary Young tableau operator obtained above, until we obtain a linear combination of the independent y-basis elements, eq. (3.18). We want to emphasize here that this process is not for obtaining the complete basis, but for reducing any Lorentz structure to the basis that we define. The replacement rule is described below:
• Remove all derivatives on the first field $\Phi_1$ by the IBP relation. The derivatives are distributed among the rest of the building blocks by the Leibniz rule. The corresponding relation in the spinor helicity formalism follows from momentum conservation. In the sum, the terms with $k = i$ or $k = j$ vanish, which in the corresponding operator amounts to a self-contracting building block that should be converted to other types of operators. We omit these terms, so the relation is understood to hold up to operators of other types; the same applies to the following two steps.
• Remove derivatives on $\Phi_2$ (or $\Phi_3$) when the two spinor indices on them contract with those in building blocks 1 and 2. The corresponding replacement rule for amplitudes is eq. (3.22).
• Remove pairs of derivatives acting on $\Phi_2$ and $\Phi_3$, with indices contracting with each other, by using an identity in which part of the terms are convertible to other types via the EOM and are omitted. It corresponds to a relation among Mandelstam variables.
• The Schouten identity can be applied to any pair of $\epsilon$'s with all-different indices (contracting with 4 different building blocks $i < j < k < l$). The rule is that whenever the third term (specified by the order of the labels) shows up in the operator/amplitude, it is replaced by the other two terms.
• Apply the Schouten identity for the $\tilde\epsilon$'s in the same manner.
For the gauge group tensor $T_G$, one can convert any bases to each other with the help of an inner product defined for the tensors. Since the $T_G$ are all products of Levi-Civita tensors, their contractions are easily calculated algebraically. Then, using the Gram-Schmidt process, one can obtain a set of orthogonal tensors $T^{(o)}_G$ that span the same space as the y-basis. The coordinates of any group tensor $T$ in this orthogonal basis are then obtained from its inner products with the $T^{(o)}_G$. With eq. (3.18) and eq. (3.29), we can reduce any operator to our y-basis.

In particular, we can use the reduction to build other complete bases, like a basis with conventional notation. We define such a basis of conventional monomial operators generally as the m-basis. Given an over-complete set of monomial operators, we can reduce them all to our y-basis and obtain a coefficient matrix $\tilde{K}^{my}$. The m-basis is then constructed by selecting $N$ independent rows of $\tilde{K}^{my}$ that form a full-rank square matrix $K^{my}$, which serves as the conversion matrix between the y-basis and the m-basis. Note that the m-basis is highly non-unique: it depends not only on the notation but also on the selection of rows in $\tilde{K}^{my}$.
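The inner-product reduction of a group tensor can be illustrated on a toy case. The sketch below is an assumption-laden example, not the gauge factors of this paper: it uses three rank-4 $SU(2)$ tensors built from the two-index Levi-Civita symbol, takes the component-wise sum of products as the inner product, and recovers the coordinates of the third tensor in the basis of the first two by Gram-Schmidt. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def eps(i, j):
    """Two-index Levi-Civita symbol with eps(0, 1) = 1."""
    return {(0, 1): 1, (1, 0): -1}.get((i, j), 0)

def flatten(f):
    """Flatten a rank-4 tensor f(i, j, k, l) into a vector in a fixed index order."""
    return [Fraction(f(i, j, k, l)) for i, j, k, l in product(range(2), repeat=4)]

def inner(u, v):
    """Inner product taken as the full contraction of all indices."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthogonalize; also return each orthogonal vector's expansion in the inputs."""
    ortho, combos = [], []
    for idx, v in enumerate(vectors):
        w = list(v)
        c = [Fraction(0)] * len(vectors)
        c[idx] = Fraction(1)
        for o, oc in zip(ortho, combos):
            r = inner(v, o) / inner(o, o)
            w = [a - r * b for a, b in zip(w, o)]
            c = [a - r * b for a, b in zip(c, oc)]
        if any(w):  # skip vectors that are linearly dependent on earlier ones
            ortho.append(w)
            combos.append(c)
    return ortho, combos

def coordinates(t, vectors):
    """Coordinates of t in the input basis, plus the part of t outside its span."""
    ortho, combos = gram_schmidt(vectors)
    coords = [Fraction(0)] * len(vectors)
    residual = list(t)
    for o, oc in zip(ortho, combos):
        r = inner(t, o) / inner(o, o)
        coords = [a + r * b for a, b in zip(coords, oc)]
        residual = [a - r * b for a, b in zip(residual, o)]
    return coords, residual

T1 = flatten(lambda i, j, k, l: eps(i, j) * eps(k, l))
T2 = flatten(lambda i, j, k, l: eps(i, k) * eps(j, l))
T3 = flatten(lambda i, j, k, l: eps(i, l) * eps(j, k))
COORDS, RESIDUAL = coordinates(T3, [T1, T2])
```

Here `COORDS` reproduces the Schouten relation $\epsilon_{il}\epsilon_{jk} = -\epsilon_{ij}\epsilon_{kl} + \epsilon_{ik}\epsilon_{jl}$, and a vanishing `RESIDUAL` signals that the third monomial adds nothing new to the space.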

p-basis: in the presence of repeated fields
There is one more redundancy that is not yet considered for the $N$-dimensional space of a type, which arises when there are repeated fields/identical particles in the operator/amplitude. While it is explained in [9] in terms of operators, we present here a derivation of the constraint from the amplitude point of view.
In the actual physical amplitude, identical particles are subject to spin-statistics, which picks out certain linear combinations of the amplitude basis (i.e. they may not be factorizable as in eq. (2.7)). These combinations are totally symmetric or totally anti-symmetric under the permutations of the bosonic or fermionic identical particles, and are thus called a p-basis. In its defining relation, $D_\phi(\pi)$ is the representation of the permutation $\pi$ for the particle $\phi$, and $(-1)^{\pi}$ denotes the signature of $\pi$. This is reflected in the amplitude-operator correspondence by the Feynman rule that sums up all possible contractions between repeated fields and the external legs in the vertex function. According to the dictionary eq. (2.11), such an amplitude basis corresponds to an operator basis, also called p-basis, with explicit permutation symmetries among the repeated fields.
We would like to clarify that the notion of identical particles applies to particle multiplets, which include the gauge group and even the flavor degrees of freedom. In general, the permutation symmetry of the function $B$ stems from the inner product of the permutation symmetries of $M$ and $T$. To explicitly show the constraint of spin-statistics on the amplitude basis for particles with flavors, we take the flavor index out of the collection $a$, and denote the flavor part of the tensor as $\kappa$. For instance, the symmetry $[2,1]$ is denoted by the corresponding Young diagram. Spin-statistics then imposes a constraint relating the permutation symmetries of $\kappa$, $T$ and $M$, for which we use a short-hand notation. In the following, $\lambda$ without subscript is short for $\lambda_\kappa$ by default, and the p-basis will be organized in terms of $\lambda$.

First we find the $N$-dimensional space of the flavor-blind amplitudes $T \otimes M$ and combine the y-basis into $\lambda$ representation spaces, each being a $d_\lambda$-dimensional subspace of amplitudes. Suppose the number of representation spaces for each $\lambda$ is given by $n_\lambda$, such that $N = \sum_\lambda n_\lambda d_\lambda$, and all the p-basis amplitudes are labeled by $\lambda$, $x = 1, \dots, d_\lambda$, and $\xi = 1, \dots, n_\lambda$. We will describe the derivation of the full-rank conversion matrix $K^{py}$ in the next section. In the meantime, the generic rank-$m$ flavor tensor $\kappa$ with flavor number $n_f$ can be decomposed into tensor bases that form $S(\lambda, n_f)$ copies of the $\lambda$ representation space of the group $S_m$, such that the total degrees of freedom match, $n_f^m = \sum_\lambda d_\lambda\, S(\lambda, n_f)$. The function $S(\lambda, n_f)$ is known as the hook content formula, which also counts the number of semi-standard Young tableaux (SSYT). For example, with $\lambda = [3]$ and $n_f = 2$, we have $S(\lambda, n_f) = 4$, and the 4 flavor tensor bases, labeled by their SSYT fillings, are given by (normalization not relevant in this paper):

1 1 1 : $(\kappa^{[3]}_1)_{111} = 1$, otherwise 0,
1 1 2 : $(\kappa^{[3]}_2)_{112} = (\kappa^{[3]}_2)_{121} = (\kappa^{[3]}_2)_{211} = 1$, otherwise 0,
1 2 2 : $(\kappa^{[3]}_3)_{122} = (\kappa^{[3]}_3)_{212} = (\kappa^{[3]}_3)_{221} = 1$, otherwise 0,
2 2 2 : $(\kappa^{[3]}_4)_{222} = 1$, otherwise 0. (3.38)

Finally, from the $n_\lambda$ representation spaces for the flavor-blind amplitudes and the $S(\lambda, n_f)$ representation spaces for the $\kappa$ tensor, the $S_m$ inner product of an arbitrary pair of them contains a totally symmetric amplitude basis. Hence the total number of flavor-specified amplitude basis elements is given by $\sum_\lambda n_\lambda\, S(\lambda, n_f)$.

As discussed in section 3.3, when repeated fields appear in a given type of operator, the dimension of the subspace may be less than the $N$ calculated in eq. (3.1), due to the permutation symmetry among the flavor indices. Therefore, in this subsection we demonstrate the workflow to obtain the p-basis operators, which are what we call "terms" for a given type of operator, and in the next three subsections we illustrate the whole procedure concretely with a dim-9 example: $G_L d_C^3 e_C^\dagger D$. As studied in detail in Ref. [9] and discussed in section 3.3, the permutation symmetry of the flavor structure is related to that of the gauge and Lorentz structures, as indicated in eq. (3.10) of Ref. [9], where $\lambda$ is the partition of $k$ corresponding to a certain irrep of the symmetric group, $x$ labels the basis vector of the irrep, and $D^\lambda(\pi)$ is the matrix representation of the symmetric group for this irrep. Having introduced the concepts of y-basis and m-basis in section 3, we name the $T^\lambda_x$ and $M^\lambda_x$ the p-basis for gauge group factors and Lorentz structures. With these ingredients in hand, one can construct $\mathcal{O}^{(p)}$ of the flavor symmetry $\lambda$ with the Clebsch-Gordan coefficients (CGCs) of the symmetric group, $C^{(\lambda_1,x_1),(\lambda_2,x_2),(\lambda_3,x_3)}_{(\lambda,x),j}$, where $\lambda, \lambda_1, \lambda_2, \lambda_3$ are irreps of $S_k$ for the flavor, Lorentz structure, $SU(3)_C$ and $SU(2)_W$ group factors respectively, $x$ with and without subscripts labels the basis vectors of each irrep of $S_k$, and $j$ is the multiplicity of the resulting irrep in the decomposition.
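The hook content formula $S(\lambda, n_f)$ and the irrep dimension $d_\lambda$ quoted in the p-basis discussion are simple to evaluate. The following is a minimal sketch with hypothetical function names; the multiplicities passed to `count_flavor_specified` in any usage are inputs that must come from the actual p-basis construction and are not computed here.

```python
from fractions import Fraction
from math import factorial

def cells(shape):
    """Yield (row, col, hook_length) for every box of the Young diagram."""
    for i, row in enumerate(shape):
        for j in range(row):
            arm = row - j - 1                              # boxes to the right
            leg = sum(1 for r in shape[i + 1:] if r > j)   # boxes below
            yield i, j, arm + leg + 1

def hook_content(shape, n_f):
    """S(lambda, n_f): number of SSYT of the given shape with entries 1..n_f."""
    result = Fraction(1)
    for i, j, hook in cells(shape):
        result *= Fraction(n_f + j - i, hook)
    return int(result)

def irrep_dim(shape):
    """d_lambda from the hook length formula."""
    hook_product = 1
    for _, _, hook in cells(shape):
        hook_product *= hook
    return factorial(sum(shape)) // hook_product

def count_flavor_specified(multiplicities, n_f):
    """Sum of n_lambda * S(lambda, n_f) over a {shape: n_lambda} mapping."""
    return sum(n * hook_content(shape, n_f) for shape, n in multiplicities.items())
```

For instance, `hook_content((3,), 2)` reproduces the four tensors listed in eq. (3.38), and summing `irrep_dim(s) * hook_content(s, n_f)` over the partitions of 3 gives $n_f^3$, which is the matching of degrees of freedom stated in the text.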
In figure 1, we show our workflow for obtaining all the terms of operators at a given dimension as a flowchart, and describe each step as follows:
1. Enumerate tuples of the numbers of fields for different helicities and the number of derivatives following the constraints in eq. (2.19); each tuple corresponds to a class of operators.
2. For each class of operators, fill the slots of definite helicities with concrete SM fields; each combination of fields that can form a gauge singlet is retained as a type of operator.
3. For a given type of operator, enumerate the y-basis for the Lorentz and gauge group structures with the corresponding SSYTs.
4. Convert each y-basis to an m-basis with some group identities, the form of which is more familiar to the phenomenology community.
5. After obtaining the y-basis and m-basis, symmetrize them by acting with the corresponding group algebra symmetrizer $b^\lambda_x$ to obtain the p-basis for the Lorentz and gauge group structures. The appropriate irreps $\lambda$ of the symmetric group are obtained by the plethysm technique in advance.
6. With the p-basis Lorentz and gauge group structures, construct the p-basis operators, the "terms", with the inner product decomposition of the symmetric group related to the repeated fields.
7. Finally, to shorten our notation for the "terms", perform a subtle recombination of the p-basis operators for a given type, called "de-symmetrization", to arrive at the form of the terms presented in section 5.

Lorentz Structure
As discussed in Sec. 3.2 of Ref. [9], the y-basis of the Lorentz structure is enumerated by the SSYTs of the corresponding primary Young diagram of the auxiliary $SU(N)$ group, determined by the tuple of three numbers $(N, n, \tilde{n})$ for the given type of operator, where $N$ is the number of field building blocks, $n$ is the number of $\epsilon$ tensors with undotted spinor indices, and $\tilde{n}$ is that of tensors with dotted spinor indices. We automatize the whole procedure in a Mathematica code.
Given the operator type $G_L d_C^3 e_C^\dagger D$, $N$ is equal to 5, while $n = 3$ and $\tilde{n} = 1$ can be obtained by [9]: $n = n_{-1} + \frac{1}{2} n_{-1/2} + \frac{1}{2} n_D$, $\tilde{n} = n_{1} + \frac{1}{2} n_{1/2} + \frac{1}{2} n_D$,
where $n_D = 1$ is the number of derivatives, and $n_{-1}$, $n_{-1/2}$, $n_{1/2}$, $n_1$ are the numbers of fields with helicities equal to $-1$, $-1/2$, $1/2$, $1$ respectively. The next step is to find the numbers $\#i$ of field labels, for $i$ from 1 to 5, that need to be filled in the primary YD. Following eq. (3.51) in Ref. [9], $\#i = \tilde{n} - 2h_i$, where $h_i$ is the helicity of the corresponding field. This leads to $\#1 = 3$, $\#2 = \#3 = \#4 = 2$, $\#5 = 0$, where we have already arranged the fields in the order of increasing helicities. From the direct product decomposition, we know in advance that the number of SSYTs should be 4. The corresponding SSYTs with the numbers filled in give a set of y-basis amplitudes, with operator forms obtained from the amplitude-operator correspondence. In these expressions, $prst$ represent the flavor indices, while $abc$ and $A$ represent color indices for the anti-fundamental representation and the adjoint representation respectively.
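The counting for the example type $G_L d_C^3 e_C^\dagger D$ can be reproduced with a short script. This is a sketch under stated assumptions: the counting rules $n = n_D/2 + \sum_{h_i<0}|h_i|$, $\tilde{n} = n_D/2 + \sum_{h_i>0}|h_i|$ and $\#i = \tilde{n} - 2h_i$, and a primary Young diagram with two rows of length $n + \tilde{n}$ followed by $N-4$ rows of length $\tilde{n}$, are taken as assumptions that reproduce the numbers quoted above; the function names are hypothetical and this is not the Mathematica code mentioned in the text.

```python
from fractions import Fraction

def primary_yd(helicities, n_derivatives):
    """Return (n, n_tilde, label_counts, shape) for helicities in increasing order."""
    half_d = Fraction(n_derivatives, 2)
    n = half_d + sum(-h for h in helicities if h < 0)
    n_tilde = half_d + sum(h for h in helicities if h > 0)
    if n.denominator != 1 or n_tilde.denominator != 1:
        raise ValueError("n and n_tilde must be integers")
    counts = [int(n_tilde - 2 * h) for h in helicities]   # assumed rule #i = n_tilde - 2 h_i
    shape = [int(n + n_tilde)] * 2 + [int(n_tilde)] * (len(helicities) - 4)
    return int(n), int(n_tilde), counts, shape

def count_ssyt(shape, content):
    """Count semi-standard fillings of `shape` using label i exactly content[i] times."""
    cells = [(r, c) for r, length in enumerate(shape) for c in range(length)]
    remaining = list(content)
    filling = {}

    def fill(pos):
        if pos == len(cells):
            return 1
        r, c = cells[pos]
        low = 0
        if c > 0:
            low = max(low, filling[(r, c - 1)])        # rows weakly increasing
        if r > 0:
            low = max(low, filling[(r - 1, c)] + 1)    # columns strictly increasing
        total = 0
        for label in range(low, len(remaining)):
            if remaining[label] > 0:
                remaining[label] -= 1
                filling[(r, c)] = label
                total += fill(pos + 1)
                remaining[label] += 1
        return total

    return fill(0)

# G_L (h = -1), three d_C (h = -1/2), e_C^dagger (h = +1/2), one derivative.
HELICITIES = [Fraction(-1)] + [Fraction(-1, 2)] * 3 + [Fraction(1, 2)]
N_UNDOTTED, N_DOTTED, COUNTS, SHAPE = primary_yd(HELICITIES, 1)
N_SSYT = count_ssyt(SHAPE, COUNTS)
```

The brute-force SSYT count is only practical for small diagrams such as this one, but it gives an independent check of the number of y-basis Lorentz structures stated in the text.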
One can obtain the m-basis Lorentz structures by converting $G_{L\alpha\beta}$ to $G_{L\mu\nu}$ in eqs. (4.9)-(4.12) and finding independent monomials with the method discussed in section 3.3. The allowed irreps of $S_3$ for $d_C$ are those resulting in the primary YD after taking the plethysm. In our example, after taking into account the subtleties related to the Grassmann nature of the fermion fields and the odd number of $\mathcal{E}$ tensors of the $SU(N)$ group that convert the anti-fundamental indices of the $\tilde\epsilon$ to fundamental ones [9], the allowed symmetries remain the same (no transposition of the YDs is needed). Each resulting primary Young diagram must be counted with the dimension $d_\lambda$ of the corresponding irrep of the symmetric group, which leads to $1 + 2 + 1 = 4$ distinct Lorentz structures given that $d_{[2,1]} = 2$ and $d_{[3]} = d_{[1^3]} = 1$, consistent with the result in eq. (4.6). We obtain the p-basis $M^{p}_{(\lambda,x)\xi}$ ($\xi$ labels the multiplicity of the irrep $\lambda$ of the symmetric group) by acting with the corresponding group algebra projector $b^\lambda_x$ on the elements in eq. (4.8) until the multiplicity of that $\lambda$ reaches the required value. Since the multiplicity is equal to 1 for each $\lambda$ in our case, we simply omit the label $\xi$ in what follows.
We finally obtain the matrices K pm relating the p-basis and m-basis: (4.20)

Gauge Group
The treatment of the gauge group is similar to that of the Lorentz group, but the usage of the Young diagrams and Young tableaux is different. First of all, one needs to find the y-basis for each gauge group by finding the singlet Young tableaux constructed by the ordinary LR-rule, with the corresponding gauge group indices of each field filled in, provided that each field is expressed in terms of fundamental indices only. If a field is not in the fundamental representation, one can perform the conversion by contracting with the Levi-Civita tensor and the group generators. We have already worked out the y-basis group factors in section 3.1 in eq. (3.12). To obtain the m-basis, we investigate each monomial in these y-basis factors and select two independent monomials as our m-basis. In practice, the independence of the monomials can be checked numerically by flattening the $T^m$'s into 1-D vectors with components corresponding to specific tuples of $(a, b, c, A)$ in a fixed order. The next step is to find the proper permutation symmetry among the $SU(3)_C$ indices $abc$ of the repeated fields $d_C$ and to obtain the symmetrized group factors in the p-basis by acting with the corresponding $b^\lambda_x$ on the $T^{(m)}_{SU3}$'s. In our example the only possible permutation symmetry is $[2,1]$, since only the plethysm with $[2,1]$ can generate the singlet:

Flavor Basis from Inner Product Decomposition
As we have obtained the symmetrized gauge group factors and Lorentz structures, we construct the p-basis through eq. (4.3). In our example, the $SU(2)_W$ gauge group factor is trivial, so we only need to focus on the inner product decomposition from the Lorentz and $SU(3)_C$ parts. The CGCs are listed for each decomposition. There are two subtleties remaining in the above notation. First, the subscript $\xi$ in $\{\mathcal{O}^{(p)}_{(\lambda,x),\xi}\}$ labels the multiplicity of the irrep $\lambda$. The second subtlety is that $\{\mathcal{O}^{(p)}_{(\lambda,x),\xi}\}$ is over-complete: as discussed in Refs. [9] and [22], for $\lambda$ with dimension larger than one the flavor space spanned by each basis vector is the same, so we only retain the first basis vector for these irreps. Finally we arrive at the complete set of independent terms of operators for $G_L d_C^3 e_C^\dagger D$.

In the de-symmetrized form, $n_\lambda$ is the number of $\lambda$ representation spaces in the operator type. This process looks for a subset of m-basis operators, which need not have any permutation symmetry themselves, whose symmetrizations give independent combinations of the p-basis; hence we call it de-symmetrization. With the action of the Young symmetrizer, this form of operator is still intrinsically a polynomial. Another interpretation is to apply the symmetrizer to the Wilson coefficient tensor instead of the operators, so that the whole term is indeed a monomial, as a singlet under the flavor group $SU(n_f)$, which is exactly spanned by the $\kappa$ tensor basis we introduced in section 3.3. Therefore, by keeping the Young symmetrizer, we actually recover the necessary flavor information that we derived for the amplitude basis.
We have to pick out n λ number of independent operators from the N projections for a given λ in eq. (4.33). It is non-trivial to guarantee the independence, unless n λ = 1 when we only need to find a non-vanishing projection. In general, we need to obtain the coefficient matrix c in eq. (4.33), from which we simply pick out n λ rows to form a full-rank submatrix c ζξ , where ζ takes values from an n λ -size subset of 1 through N . Instead of directly inspecting the matrix representation of the Young symmetrizers for the m-basis, it is easier to see how they act on the p-basis, because the p-basis already have specific symmetries. Due to the following property we have Thus we obtain the matrix representation of the symmetrizer for p-basis where we set the first n λ p-basis to be O (p) (λ,1),ξ for convenience. Therefore we first convert the m-basis operator to p-basis using the matrix K mp = (K pm ) −1 , and obtain where the matrix c iξ is identified as the n λ columns in K mp that correspond to the O (p) (λ,1),ξ basis. As explained above, we only need to select independent rows in c that form our p'-basis for λ.
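The row selection described above amounts to picking rows of the coefficient matrix that increase the rank. The following is a minimal sketch of one greedy way to do this with exact rational arithmetic; the function name and the example matrix are hypothetical, and other choices of rows are equally valid since the selection is not unique.

```python
from fractions import Fraction

def select_independent_rows(matrix, n_rows):
    """Scan rows top to bottom and keep the first n_rows that are linearly independent."""
    basis = []      # list of (pivot_column, normalized reduced row)
    selected = []
    for idx, row in enumerate(matrix):
        reduced = [Fraction(x) for x in row]
        for col, brow in basis:
            if reduced[col] != 0:
                factor = reduced[col]
                reduced = [a - factor * b for a, b in zip(reduced, brow)]
        pivot = next((c for c, x in enumerate(reduced) if x != 0), None)
        if pivot is None:
            continue  # row depends on the rows already kept
        lead = reduced[pivot]
        basis.append((pivot, [x / lead for x in reduced]))
        selected.append(idx)
        if len(selected) == n_rows:
            break
    return selected

# Hypothetical 4 x 2 coefficient matrix c (four m-basis rows, n_lambda = 2).
C_EXAMPLE = [[1, 0], [2, 0], [0, 0], [1, 1]]
ROWS = select_independent_rows(C_EXAMPLE, 2)
```

In this toy input the second row is proportional to the first and the third vanishes, so the scan keeps the first and the last row; a vanishing row corresponds to an m-basis operator whose projection onto the given $\lambda$ is zero.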
As an example, we demonstrate the de-symmetrization for the $\lambda = [2,1]$ representation in eq. (4.31), for which we find the inverse matrix $K^{mp}$, where the highlighted columns correspond to the p-basis $\mathcal{O}^{(p)}_{(\lambda,1),\xi}$. As one may notice, an m-basis operator may be selected multiple times, such as $M^m_2 T^m_{SU3,2}$ in the above case. When that happens, the final result contains terms that could merge into a single Lagrangian term in the traditional sense, which belongs to a reducible representation of the flavor group $SU(n_f)$. This notation is equivalent to the flavor relations in the traditional operator enumeration [3,23,36]. The crucial difference is that in the traditional treatment the flavor relations need to be worked out specifically for each type of operator, involving all of the operator redundancy relations like the EOM, the IBP relation and the Fierz identities for both Lorentz and gauge groups.

At higher dimensions, it may even be necessary to work out flavor relations among different operators of the same type. Suppose we have two monomial terms $\mathcal{O}^{(m)}_{1,2}$ with intrinsic flavor relations that imply p'-basis operators that are independent within each merged term. Also suppose that in our treatment we find $n_{\lambda_1} = 2$ and $n_{\lambda_2} = n_{\lambda_3} = 1$; then two of the symmetrized terms obtained from $\mathcal{O}^{(m)}_{1,2}$ are actually equivalent, which translates into a flavor relation between the two operators. Such flavor relations did not show up at dim-7 or lower, but are inevitable at higher dimensions when the subspaces of a type become larger, and they are quite tricky to work out systematically. Our method of operator enumeration therefore has the advantage that we do not need to work out these relations explicitly, but rather provide an equivalent notation to represent the flavor information. In the following, we use a dim-7 example to show the equivalence between our Young symmetrizer notation and the traditional flavor relations.

Flavor Tensor versus Flavor Relation
To demonstrate the advantage of our notation with Young symmetrizers we take the operator $\mathcal{O}^{trps}_{LdddH}$. On the other hand, the identity in the group algebra of $S_3$ can be written as a sum of the 4 distinct primitive idempotents that are proportional to the 4 Young symmetrizers of the different standard Young tableaux. Since acting with the identity on an arbitrary tensor yields the original tensor, a rank-3 tensor can be decomposed into 4 distinct subspaces with the corresponding permutation symmetries. This is essentially the underlying reason for the decomposition. Therefore we can insert an identity $E$ in front of $\mathcal{O}^{f_1f_2f_3}$ in eq. (4.50) and eq. (4.51), using the results of the group algebra multiplications, which leads to eq. (4.56). (We have changed the order of the flavor indices in the superscripts to match our notation.) Using the properties of the Young symmetrizer and acting with $Y^{[3]}_1$ and $Y^{[1^3]}_1$ on both sides of eq. (4.56), one can deduce that we do not have a totally symmetric or a totally anti-symmetric subspace for the operator $\mathcal{O}^{trps}_{LdddH}$ with regard to the permutation of the flavor indices $r, p, s$ of the three down quark fields. As $Y^{[3]}_1 \circ \mathcal{O}^{f_1f_2f_3} = 0$, eq. (4.55) simplifies accordingly. If one starts from the flavor relation, then finding out the corresponding flavor-specified operators may be difficult.
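The decomposition of the $S_3$ identity into Young symmetrizers can be verified directly in the group algebra. The sketch below assumes the convention "row symmetrizer times signed column antisymmetrizer" for the Young symmetrizer and the normalization $d_\lambda/3!$ for the idempotents; group-algebra elements are stored as dictionaries from permutations to coefficients, and all names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def compose(p, q):
    """Product p*q of permutations stored as image tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def sign(p):
    """Signature of a permutation from its inversion count."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def multiply(a, b):
    """Product of two group-algebra elements stored as {permutation: coefficient}."""
    out = {}
    for p, x in a.items():
        for q, y in b.items():
            g = compose(p, q)
            out[g] = out.get(g, 0) + x * y
    return {g: c for g, c in out.items() if c != 0}

def combine(terms):
    """Linear combination sum(coeff * element) of group-algebra elements."""
    out = {}
    for coeff, elem in terms:
        for g, c in elem.items():
            out[g] = out.get(g, 0) + coeff * c
    return {g: c for g, c in out.items() if c != 0}

def stabilizer(blocks):
    """Permutations of (0, 1, 2) that map every block of positions onto itself."""
    return [p for p in permutations(range(3))
            if all({p[i] for i in block} == set(block) for block in blocks)]

def young_symmetrizer(rows, cols):
    """Row symmetrizer times signed column antisymmetrizer of a tableau."""
    row_sum = {p: Fraction(1) for p in stabilizer(rows)}
    col_sum = {p: Fraction(sign(p)) for p in stabilizer(cols)}
    return multiply(row_sum, col_sum)

# The four standard tableaux of three boxes, with positions labeled 0, 1, 2.
Y3 = young_symmetrizer([(0, 1, 2)], [(0,), (1,), (2,)])
Y21_A = young_symmetrizer([(0, 1), (2,)], [(0, 2), (1,)])
Y21_B = young_symmetrizer([(0, 2), (1,)], [(0, 1), (2,)])
Y111 = young_symmetrizer([(0,), (1,), (2,)], [(0, 1, 2)])

IDENTITY = combine([(Fraction(1, 6), Y3), (Fraction(1, 3), Y21_A),
                    (Fraction(1, 3), Y21_B), (Fraction(1, 6), Y111)])
```

`IDENTITY` collapses to the single identity permutation, and the two $[2,1]$ symmetrizers multiply to zero, which is what allows a rank-3 flavor tensor to be split into four non-overlapping pieces.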

Preview of the Result
In this section, we summarize our main results for the dimension 9 operators in the SMEFT. Table 5 lists the corresponding numbers. Based on the p'-basis, we further perform a few conversions for the convenience of phenomenologists. First, we have transferred the field strength tensors from the chiral basis $F_{L,R}$ to the usual form $F$ and $\tilde{F}$. Although the chiral basis is a more natural choice from the helicity amplitude perspective, the $F, \tilde{F}$ basis has many advantages, like its hermiticity and definite CP. Moreover, many mature techniques are implemented in terms of the $F, \tilde{F}$ basis, like the programs for Feynman rule calculations. The conversion rules between the two bases lead to several useful identities. After the conversion, we do not distinguish types with $F$ or $\tilde{F}$, as they are sometimes not independent of each other. Therefore the counting of the types we present in the following sections does not coincide with the numbers in table 5.
The conversion rules of the fermion bilinears in the SM are obtained by substituting these fields into the relations in eq. (A.2). Finally, unlike the dimension 8 basis in [9], the types are all complex here, so we only present the operators without their Hermitian conjugates. The Hermitian conjugates of 4-component spinor bilinears can be converted using the relations in eq. (5.7).

Classes involving Two-fermions
The classification of different types is based on the number of fermions, as there is no operator without fermion fields.
We first list the operators involving two fermions, all of which describe $\Delta L = 2$ processes, since only the lepton bilinear is allowed to appear. The type $l^2H^4H^{\dagger 2}$ can contribute to the neutrino Majorana mass. The types $Wl^2H^3H^\dagger$ and $Bl^2H^3H^\dagger$ may contribute to the neutrino anomalous magnetic moment. The type $W^2l^2H^2$ contains the operators contributing to neutrinoless double beta decay at tree level.

No gauge boson involved
In this subsection, we deal with the classes $\psi^2\phi^{6-n_D}D^{n_D}$. Note that for even $n_D$ we have operators with fermions of the same helicities, which are chirality violating, while for odd $n_D$ we have operators with fermions of opposite helicities, which are chirality conserving.
Class $\psi^2\phi^6$: This class has a single Lorentz structure. It involves the Weinberg operator with additional Higgses. After all the Higgs fields take their vev, it can give rise to additional contributions to the Majorana neutrino masses.

Class $\psi^2\phi^5D$: The class has to be $\psi\psi^\dagger\phi^5D$. Considering the conservation of hypercharge, there is only one operator in this class.

Class $\psi^2\phi^4D^2$: The class $\psi^2\phi^4D^2$ contains 12 new Lorentz structures that are all absent at lower dimensions. Two types are allowed in this class. The superscripts of the $\mathcal{O}$'s label the terms in a particular type, in the order from left to right and from top to bottom.

(Table caption) Note that "term" in our definition is different from the other literature, so the numbers are larger than those in, for instance, [22]. That is not surprising, since they did an extra step of merging before the counting. However, the numbers of operators are exactly the same as in [22,37]. The links in the rightmost column refer to the list(s) of the terms in the given classes.

Class $\psi^2\phi^3D^3$: With three derivatives, we have 10 independent Lorentz structures, which are also new here. There is only one possible type for these Lorentz structures.

Class $\psi^2\phi^2D^4$: With four derivatives, we have 3 independent Lorentz structures. Still there is only one possible type.

One gauge boson involved
Class $F\psi^2\phi^4$: The class has to be $F_L\psi\psi\phi^4$, which has only one Lorentz structure. There are two possible types, the anti-symmetric flavor representations of which contribute to the neutrino anomalous magnetic moment. The fermion bilinear terms here are always chirality-violating.

Class $F\psi^2\phi^3D$: In this class, the gauge boson contracts with the fermion current and the Higgs current. There are 3 independent Lorentz structures and two types.

Class $F\psi^2\phi^2D^2$: There are 2 classes of this form. One is $F_L\psi^2\phi^2D^2$, a dimension 7 class $F_L\psi^2\phi^2$ with two additional derivatives, which has 7 independent Lorentz structures. The other class is $F_R\psi^2\phi^2D^2$, where the flip of helicity for the gauge boson is made possible by the presence of the two additional derivatives. After converting to the $F, \tilde{F}$ basis, these two classes mix together.

Two gauge bosons involved
Class $F^2\psi^2\phi^2$: Two classes are involved, with the same and opposite helicities for the gauge bosons and fermions. For the class $F_L^2\psi^2\phi^2$, we obtain 2 independent Lorentz structures, while for $F_R^2\psi^2\phi^2$ we have only 1 independent Lorentz structure.

Classes involving Four-fermions
In this subsection, quarks begin to appear in the operators, and $|\Delta B - \Delta L|$ is always equal to 2, such that only $(\Delta B, \Delta L) = (\pm 1, \mp 1)$ and $(\Delta B, \Delta L) = (0, 2)$ are allowed. The classes involve three quarks and one lepton, or two quarks and two leptons, or four leptons. The operators with $(\Delta B, \Delta L) = (\pm 1, \mp 1)$ usually contribute to proton two-body decay processes, while the $\Delta L = 2$ operators can contribute to neutrinoless double beta decay, such as the operator type $WudL^2D$ at tree level. We present the operators ordered by the number of quarks.
Operators with $\Delta B = -1$ or $\Delta L = -2$ are replaced by their conjugates for a neater presentation.

No gauge boson involved
Class $\psi^4\phi^3$: There are two classes of this form, $\psi^2\psi^{\dagger 2}\phi^3$ and $\psi^4\phi^3$. Operators of this class contribute to four-fermion interactions if the Higgs fields take their vev, and operators involving two or three $l$'s are relevant to neutrino non-standard interactions.

One gauge boson involved
Class $F\psi^4\phi$: There are two classes involved, $F_L\psi^4\phi$ and $F_L\psi^2\psi^{\dagger 2}\phi$. Via the simple relations $F_{R\mu\nu}\sigma^{\mu\nu}_{\alpha\beta} = 0$ and $F_{\mu\nu}\sigma^{\mu\nu}_{\alpha\beta} = F_{L\mu\nu}\sigma^{\mu\nu}_{\alpha\beta}$, we replace all $F_L$ with $F$. All types follow this replacement rule.
1. Operators involving three quarks with $\Delta B = 1$ and $\Delta L = -1$:

The p-basis operators are usually combinations of multiple monomials as a result of symmetrization, which makes them lengthy to express. In this work we provide a systematic procedure that we call "de-symmetrization" to solve the problem, expressing our final result in the form of a Young symmetrizer acting on a monomial operator. The algorithm was not discussed in detail in our previous paper. The subtlety is to guarantee the independence among operators with the same symmetrizer acting on different monomials, which is necessary only when $n_\lambda$, the number of a certain representation space in a given type, is greater than one. The de-symmetrization procedure results in independent combinations of the p-basis, thus named the p'-basis, which have quite concise expressions with the Young symmetrizer denoting the flavor symmetry.
One may also let the Young symmetrizer act on the Wilson coefficient tensor with which the monomial operator contracts, so that the operator genuinely becomes a monomial while the Young symmetrizer only serves as a reminder of the symmetry of the Wilson coefficients. This is equivalent to the flavor relations that the traditional method of operator enumeration uses to solve the repeated field issue. We summarize the advantages of our method and final notation over the traditional method and notation:
• The completeness and independence are guaranteed by the underlying mathematical principle. The flavor symmetries among the Wilson coefficients are given systematically, unlike in the traditional treatment where flavor relations must be found manually.
• It enables one to directly write down the flavor-specified operators by enumerating the flavor SSYTs of the corresponding flavor symmetry. This is the most important reason that we insist on expressing our final result as irreducible flavor tensors, as it is tricky to list the independent operators from the flavor relations that accompany the traditional form of the operators.
• We provide a systematic way to convert any basis into our y-basis without any ambiguity, or, by using the conversion matrices that we also obtained, into any other basis that we provide here. Therefore, our basis could serve as the standard basis of operators.
The last point will in principle benefit a lot of studies about effective field theory. For example, in matching between the UV new physics and the SMEFT operators, an independent and complete basis of operators is necessary for an unambiguous result. Therefore we need to identify the operator generated after integrating out heavy particles as a unique coordinate with respect to an independent and complete operator basis. Note that in reducing such an operator to our y-basis, terms eliminated by the EOM or the [D, D] identity in this paper should be kept in the form of other types of operators. We will leave it for our future work.

B List of Classes up to Dimension 9
We list all the classes of Lorentz structures from dimension 5 to dimension 9, where $\psi$ and $\psi^\dagger$ represent particles with helicity $-1/2$ and $1/2$ respectively, $F_L$ and $F_R$ represent gauge bosons with helicity $\mp 1$, and $\phi$ represents scalar fields. The gray operators in each class are those that cannot be formed as $U(1)_Y$ singlets from SM fields.