New method for fitting coefficients in standard model effective theory

We address the issue that certain directions in the space of standard-model-effective-theory operator coefficients may be poorly constrained by experimental observables. We argue that this issue is best dealt with by making use of a principal-component analysis, and we present an efficient method for carrying out that analysis that is based on singular-value decomposition. We demonstrate the method by applying it to the process of top-quark decays to a $b$ quark and a $W$ boson.


I. INTRODUCTION
In recent years, standard model effective theory (SMEFT) [1][2][3][4] has been a focus of activity in both the theoretical and experimental particle-physics communities. SMEFT has been advocated as a means to quantify systematically deviations of the global set of experimental measurements from the predictions of the standard model. To this end, a number of efforts have been undertaken to perform global fits of the Wilson coefficients of SMEFT operators [5][6][7][8][9][10].
Several difficulties can arise in using data to constrain coefficients in SMEFT. First, there are many operators in SMEFT (59 in dimension-6) and potentially many data points that could be used in fitting the coefficients of these operators. Second, theoretical expressions for the SMEFT contributions to a given set of experimental observables can contain "flat directions" (or nearly flat directions) in the space of SMEFT coefficients, that is, directions for which the observables are insensitive to the values of SMEFT coefficients. Third, for analyses involving a limited sector of observables, there may be fewer observables than SMEFT coefficients. This situation will necessarily result in the existence of exactly flat directions.
All of these difficulties can pose computational problems in fitting SMEFT coefficients to data. A global fit may be computationally challenging because standard methods for carrying out fits may bog down when the number of observables and coefficients is large. When the number of coefficients is greater than the number of observables and/or there are flat directions, methods of fitting that minimize $\chi^2$ numerically may not converge reliably.
In addition to these technical issues, there is also an issue of principle: When there are more coefficients than observables and/or flat directions, the uncertainties in the coefficients can be highly correlated, and presentation of bounds in terms of ranges on the individual SMEFT coefficients may be very misleading.
In this paper, we advocate addressing this latter issue of principle by carrying out principal-component analyses of the SMEFT coefficients. We address the technical issues that arise in carrying out fits of the SMEFT coefficients in a reliable and computationally efficient manner by making use of the method of singular-value decomposition (SVD). As we will show, the SVD method is also a very convenient one for carrying out the principal-component analysis.
The method that we present is based on a Gaussian uncertainty analysis. While this, of course, is not completely general, it should prove to be adequate at least for initial exploratory studies of the bounds on SMEFT coefficients. The SVD method also requires that the observables depend linearly on the SMEFT coefficients. This is the case in a computation at leading order in the effective-field-theory expansion. As we will describe later, the method can also be used iteratively to carry out a principal-component analysis of the SMEFT coefficients in the case in which higher-order contributions in the effective-field-theory expansion are considered, provided that the expansion itself converges.
We demonstrate the principal-component/SVD method by applying it to a restricted class of observables that appear in decays of the top quark to a $b$ quark and a $W$ boson. This example allows us to demonstrate the use of the SVD method to deal with the difficulties of flat directions. While we have not specifically demonstrated the utility of the principal-component/SVD method for a situation in which there is a large number of observables and a large number of SMEFT coefficients, we are confident that it would work reliably and efficiently in such a situation because of experience with an application of SVD, in a different context, that involved the fitting of thousands of data points with hundreds of coefficients [15].
The remainder of this paper is organized as follows. In Sec. II, we outline the basics of SVD and present a method for using SVD to carry out principal-component analyses of the SMEFT coefficients. Section III contains an illustration of the use of the principal-component/SVD analysis in top-quark decays to a $b$ quark and a $W$ boson. Here, we present examples of fits involving flat directions and various numbers of SMEFT coefficients, and we contrast the results from the principal-component/SVD approach with those from traditional fitting approaches. In Sec. IV, we discuss the extension of the principal-component/SVD approach to situations in which the theoretical expressions for the observables depend nonlinearly on the SMEFT coefficients. Finally, in Sec. V, we summarize our results.

A. Singular-value decomposition
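The singular-value-decomposition theorem states that an $m \times n$ matrix $M$ that contains either real or complex entries can always be decomposed as [16]
$$M = U W V^\dagger, \tag{1}$$
where $U$ is an $m \times m$ unitary matrix, $W$ is an $m \times n$ diagonal matrix whose diagonal entries (the singular values) are real and non-negative, and $V$ is an $n \times n$ unitary matrix [17][18][19][20].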
SVD has the important property that it can be used to solve the linear least-squares problem, as we will now explain. Suppose that $M$ is an $m \times n$ matrix, $C$ is an $n$-dimensional column vector, and $O$ is an $m$-dimensional column vector. Further suppose that we wish to minimize
$$|MC - O|^2 \tag{2}$$
with respect to the elements of $C$ (coefficients). Here, the square denotes the inner product of $MC - O$ with itself. The solution of this problem is given by [21]
$$C = V W^{-1} U^\dagger O, \tag{3}$$
where $U$, $W$, and $V$ are the factors in the SVD of $M$, and $W^{-1}$ denotes the pseudoinverse of $W$.

The matrix $V^\dagger$ takes the elements of $C$ from the original basis of coefficients to the principal-component basis. Each element of $P = V^\dagger C$ is one of the principal components.
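The least-squares solution and the rotation to the principal-component basis described above can be sketched numerically as follows. This is a minimal NumPy illustration, not the implementation used for the results in this paper; the function name `svd_fit`, the relative cutoff `rcond`, and the random test matrix are assumptions introduced here for illustration.

```python
import numpy as np

def svd_fit(M, O, rcond=1e-12):
    """Minimize |M C - O|^2 via SVD; return coefficients and principal components.

    rcond is an assumed relative cutoff below which a singular value
    is treated as zero (a flat direction).
    """
    U, w, Vh = np.linalg.svd(M, full_matrices=False)  # M = U @ diag(w) @ Vh
    keep = w > rcond * w.max()
    w_inv = np.zeros_like(w)
    w_inv[keep] = 1.0 / w[keep]        # pseudoinverse of the diagonal matrix W
    P = w_inv * (U.conj().T @ O)       # best-fit principal components, P = V^dagger C
    C = Vh.conj().T @ P                # C = V W^-1 U^dagger O
    return C, P, Vh

# Hypothetical test problem: 5 observables, 3 coefficients.
rng = np.random.default_rng(0)
M = rng.normal(size=(5, 3))
O = rng.normal(size=5)
C, P, Vh = svd_fit(M, O)
```

In this sketch, the rows of `Vh` play the role of the rows of $V^\dagger$, so `Vh @ C` reproduces `P`. Singular values below the cutoff are given a vanishing inverse, which yields the minimum-norm solution when flat directions are present.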
Owing to the phase ambiguity in each row of $V^\dagger$ in the SVD, each principal component is defined only up to an overall phase. By virtue of the unitarity of $V$, the principal components are orthogonal to each other. The best-fit values of the principal components are given by the elements of
$$P = V^\dagger C = W^{-1} U^\dagger O, \tag{4}$$
where we have used the least-squares solution for $C$.

A key property of a principal-component analysis is that, because $W$ is diagonal, the fluctuation in $MC$ that is produced by a fluctuation in a given principal component is independent of the fluctuations in the other principal components. That is, the uncertainties in the principal components are uncorrelated, which is equivalent to the statement that the covariance matrix of the coefficients is diagonal in the basis of the principal components.
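As we will see, this property will prove to be very useful in constraining the SMEFT coefficients.

The fitting of the SMEFT coefficients is carried out by minimizing the $\chi^2$, which is defined by
$$\chi^2 = (O_{\rm exp} - O_{\rm SMEFT})^\dagger (\sigma^2)^{-1} (O_{\rm exp} - O_{\rm SMEFT}), \tag{5}$$
where $O_{\rm exp}$ is the $N_{\rm obs}$-dimensional column vector of experimental observables, $O_{\rm SMEFT}$ is the $N_{\rm obs}$-dimensional column vector of theoretical predictions for the observables in the SMEFT, and $\sigma^2$ is the $N_{\rm obs} \times N_{\rm obs}$ covariance matrix of experimental and theoretical uncertainties. We decompose the theoretical predictions in the SMEFT into standard-model (SM) contributions and beyond-the-standard-model (BSM) contributions as
$$O_{\rm SMEFT} = O_{\rm SM} + O_{\rm BSM} \tag{6}$$
and rewrite $\chi^2$ as
$$\chi^2 = (O_{\rm BSM} - \Delta O)^\dagger (\sigma^2)^{-1} (O_{\rm BSM} - \Delta O), \tag{7}$$
where
$$\Delta O = O_{\rm exp} - O_{\rm SM}. \tag{8}$$

Now we wish to put $\chi^2$ in Eq. (7) into the linear-least-squares form. First, since the covariance matrix is symmetric, we can diagonalize it:
$$\sigma^2 = U_{\rm exp}\, \hat{\sigma}^2\, U_{\rm exp}^\dagger. \tag{9}$$
We note that the diagonal matrix $\hat{\sigma}^2$ and the unitary matrix $U_{\rm exp}$ can be found conveniently from the SVD of $\sigma^2$, although other diagonalization methods could also be used. Then, we can write $\chi^2$ as
$$\chi^2 = (O_{\rm BSM} - \Delta O)^\dagger\, U_{\rm exp}\, (\hat{\sigma}^2)^{-1}\, U_{\rm exp}^\dagger\, (O_{\rm BSM} - \Delta O). \tag{10}$$
Since the diagonal matrix $\hat{\sigma}^2$ is positive definite, the quantity $(\hat{\sigma}^2)^{-1/2}$ is well defined. Therefore, we can normalize the observables in the new basis to unit error by writing
$$\hat{O}_{\rm BSM} = (\hat{\sigma}^2)^{-1/2}\, U_{\rm exp}^\dagger\, O_{\rm BSM}, \qquad \Delta\hat{O} = (\hat{\sigma}^2)^{-1/2}\, U_{\rm exp}^\dagger\, \Delta O. \tag{11}$$
Now $\chi^2$ has the form
$$\chi^2 = |\hat{O}_{\rm BSM} - \Delta\hat{O}|^2. \tag{12}$$
Since $\hat{O}_{\rm BSM}$ is linear in the SMEFT coefficients, we can write it in the form
$$\hat{O}_{\rm BSM} = M_{\rm SMEFT}\, C, \tag{13}$$
where $C$ is an $N_{\rm coeff}$-dimensional column vector of SMEFT coefficients and $M_{\rm SMEFT}$ is an $N_{\rm obs} \times N_{\rm coeff}$ matrix. Hence, in order to constrain the SMEFT coefficients, we minimize
$$|M_{\rm SMEFT}\, C - \Delta\hat{O}|^2, \tag{14}$$
which is a linear-least-squares problem. As was described in Sec. II A, the solution of this minimization problem can be obtained from the SVD $M_{\rm SMEFT} = U W V^\dagger$:
$$C = V W^{-1} U^\dagger\, \Delta\hat{O}. \tag{15}$$

As we have mentioned, each element of $V^\dagger C$ gives one of the principal components. The best-fit values of the principal components are given by the elements of $P$ in Eq. (4), and the one-standard-deviation uncertainty on each principal component is given by the inverse of the corresponding singular value in $W$.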
The covariance matrix of the SMEFT coefficients is obtained by using $V$ to rotate $W$ back to the original basis of SMEFT coefficients:
$$V\, (W^\dagger W)^{-1}\, V^\dagger. \tag{16}$$
As is standard, the covariance matrix for the situation in which one has marginalized over some of the coefficients is obtained by striking from the full covariance matrix the rows and columns that correspond to the marginalized coefficients [25]. Although the covariance matrix contains the same information as the uncertainties in the principal components, we will see that the presentation of uncertainties in principal-component form leads to a clearer picture of the constraints on the SMEFT coefficients.
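The steps of this section (normalizing the observables to unit error, reading the principal-component uncertainties from the singular values, and rotating back to obtain the covariance matrix of the coefficients) can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions: the helper name `whiten` and all numerical inputs are hypothetical, and `numpy.linalg.eigh` is used for the diagonalization of the symmetric covariance matrix, which the text notes is one of several possible methods.

```python
import numpy as np

def whiten(delta_O, M_bsm, cov):
    """Rotate to the eigenbasis of cov and scale every observable to unit error."""
    var, U_exp = np.linalg.eigh(cov)     # cov = U_exp @ diag(var) @ U_exp.T
    T = (U_exp / np.sqrt(var)).T         # diag(var)^(-1/2) @ U_exp.T
    return T @ delta_O, T @ M_bsm

# Hypothetical inputs: 3 observables, 3 coefficients.
cov = np.array([[0.04,   0.001,    0.0],
                [0.001,  0.0025,  -0.00075],
                [0.0,   -0.00075,  0.0009]])
M_bsm = np.array([[1.0, 0.4, 0.2],
                  [0.3, 1.0, 0.1],
                  [0.2, 0.1, 0.5]])      # O_BSM = M_bsm @ C
delta_O = np.array([0.1, -0.02, 0.01])   # O_exp - O_SM

O_hat, M_hat = whiten(delta_O, M_bsm, cov)
U, w, Vh = np.linalg.svd(M_hat, full_matrices=False)

sigma_P = 1.0 / w                             # one-sigma uncertainties of the principal components
cov_C = Vh.T @ np.diag(1.0 / w**2) @ Vh       # covariance matrix of the coefficients
sigma_marg = np.sqrt(np.diag(cov_C))          # marginalized one-sigma uncertainties
cov_first_two = cov_C[:2, :2]                 # third coefficient marginalized: strike its row and column
```

For real inputs such as these, the transpose plays the role of the Hermitian conjugate. The sketch assumes that all singular values are nonzero; with flat directions, the corresponding entries of `1.0 / w**2` would diverge and the covariance matrix in the coefficient basis would not exist.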

III. APPLICATION TO TOP-QUARK DECAYS
In this section, we illustrate the principal-component/SVD method for fitting the SMEFT coefficients by applying it to the case of top-quark decay to a b quark and a W boson.

A. SMEFT operators
We work in the Warsaw basis [1] of SMEFT operators, and our notation is similar to that of Ref. [26]. Following Ref. [27], we fit the coefficients $C_{tW}$, $C_{bW}$, $C_{\phi tb}$, $C_{tg}$, and $C_{bg}$ of the corresponding operators. We also consider the coefficients $C_{qu}$ and $C_{lq}$ of the corresponding four-fermion operators. In these operators, $q_p$ ($l_p$) is a left-handed quark (lepton) isospin doublet with generation index $p$, $u_r$ and $d_r$ are the up and down right-handed isospin singlets with generation index $r$, $l_r$ is the lepton right-handed isospin singlet with generation index $r$, $\phi$ is the Higgs isospin doublet, $\tilde{\phi} = i\tau^2\phi^*$ is the hypercharge-conjugate Higgs doublet, $\tau$ is a Pauli matrix, $W^I_{\mu\nu}$ is the field-strength tensor for the SU(2)$_I$ gauge bosons with isospin index $I$, the $\gamma$'s are Dirac matrices, $G^A_{\mu\nu}$ is the gluon field-strength tensor with color index $A$, and $T^A$ is a color matrix in the fundamental representation.

B. Experimental inputs
We take experimental values of the total top-quark decay rate $\Gamma^{\rm exp}_{\rm tot}$ and the helicity fractions $F^{\rm exp}_L$ and $F^{\rm exp}_-$ from the Particle Data Group compilation [28]. In our analysis, we symmetrize the uncertainties in $\Gamma^{\rm exp}_{\rm tot}$ by shifting the central value. That is, we take $\Gamma^{\rm exp}_{\rm tot} = 1.43 \pm 0.17$ GeV. The correlation matrix $\rho$ of the experimental uncertainties is taken from Ref. [29]. The experimental inputs for our SVD analysis are then $O_{\rm exp} = (\Gamma^{\rm exp}_{\rm tot}, F^{\rm exp}_L, F^{\rm exp}_-)^T$ and $\sigma^2_{ij} = \sigma_i \rho_{ij} \sigma_j$, where $i, j = 1, 2,$ and $3$.
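The construction $\sigma^2_{ij} = \sigma_i \rho_{ij} \sigma_j$ can be illustrated with a short NumPy sketch. Only the first uncertainty (0.17 GeV for the total width) is taken from the text; the other two uncertainties and all entries of the correlation matrix below are placeholders chosen for illustration, not the values of Refs. [28] and [29].

```python
import numpy as np

# One-sigma uncertainties for (Gamma_tot, F_L, F_-).
# 0.17 is quoted in the text; 0.02 and 0.015 are placeholder values.
sigma = np.array([0.17, 0.02, 0.015])

# Placeholder correlation matrix (NOT the matrix of Ref. [29]).
rho = np.array([[1.0,  0.0,  0.0],
                [0.0,  1.0, -0.8],
                [0.0, -0.8,  1.0]])

# Covariance matrix: cov[i, j] = sigma[i] * rho[i, j] * sigma[j]
cov = np.outer(sigma, sigma) * rho
```

A covariance matrix built in this way is symmetric, and it is positive definite whenever the correlation matrix is, which is what the normalization to unit error requires.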

C. Theoretical inputs
We make use of the expressions in Ref. [27] for SMEFT contributions to $\Gamma_{\rm tot}$, the total decay width to $bW$, $F_L$, the fractional decay rate for a longitudinally polarized $W$ boson, and $F_-$, the fractional decay rate for a $W$ boson with negative helicity. 3 We include the QCD corrections that are given in Ref. [27]. We note that the standard-model QCD corrections are also given in Ref. [31] and that an analysis of the SMEFT contributions to top-quark decay has also been given in Ref. [32].
For purposes of this demonstration, we do not include uncertainties in the theoretical predictions. They could be incorporated into the analysis by adding the theoretical covariance matrix to the experimental covariance matrix. The input parameters for the theoretical calculation are given in Table I and are identical to those in Ref. [27], except that we evaluate $\alpha_s$ at the scale $m_t$, rather than the scale $M_Z$.

Footnote 3: The published forms of these expressions in Ref. [27] have been corrected [30]. In $\Gamma_{\rm tot}$ in Eq. (13), the denominator factor $x_W$ has been replaced with $x_W^2$. In $F_L^{\rm LO}$ in Eq. (13), the second numerator parenthesis has been moved to the start of the numerator. In $\Delta F_L^{\rm QCD}$ in Eq. (A2), the standard-model contribution has been changed to agree with the expression in Eq. (15) of Ref. [31].
We compute the electroweak coupling $\bar{g}$ following Ref. [33]. We set the SMEFT cutoff to be $\Lambda = 500$ GeV.

D. Fit with one SMEFT coefficient
In Table II, we show the best-fit values of the SMEFT coefficients and their two-standard-deviation uncertainties that are obtained by setting all of the coefficients to zero, except for one. This is a widely used approach to constraining the SMEFT coefficients. However, as we will see, it can be quite misleading. Throughout the remainder of this paper, when we present the array $V^\dagger$, the columns correspond to the order of the SMEFT coefficients in Table II, namely, $C_{tW}$, $C_{tg}$, $C_{bW}$, $C_{bg}$, $C_{\phi tb}$, $C_{lq}$.

TABLE II (surviving entry): $C_{tW} = 0.0644 \pm 0.100$.

E. Fit with three SMEFT coefficients
We now contrast the results in Table II with those that can be obtained through a principal-component analysis. We begin by considering the case in which only the first three coefficients in Table II are nonzero. Then, we can compute the best-fit values of those coefficients and their uncertainties, marginalized over the other two coefficients. As we explained earlier, the latter can be obtained from the diagonal values of the covariance matrix. The results of this computation are shown in Table III. As can be seen, the central values have shifted substantially relative to those in Table II, and the uncertainties have increased, in some cases by almost an order of magnitude. Clearly, the single-coefficient values and uncertainties in Table II are not indicative of the true constraints on the SMEFT coefficients in the presence of three non-zero coefficients. However, the large uncertainties in Table III paint an unduly pessimistic picture of the constraints that can be achieved.
In order to see this, let us consider the results of the SVD analysis with three nonzero coefficients. The principal components $P_1$, $P_2$, and $P_3$ can be read off from the rows of the resulting matrix $V^\dagger$. We see that $P_1$ and $P_2$ are much better constrained than any of the individual coefficients and that only $P_3$ is poorly constrained. The principal-component analysis clearly allows one to access a much more powerful set of constraints than do the analyses of individual SMEFT coefficients.
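The three kinds of uncertainty contrasted in this subsection (single-coefficient, marginalized, and principal-component) can be compared in a small numerical sketch. The unit-error matrix `M_hat` below is hypothetical and is chosen to have strongly correlated columns; it is not the matrix for top-quark decays.

```python
import numpy as np

# Hypothetical unit-error matrix: 3 observables x 3 coefficients,
# with columns that point in similar directions.
M_hat = np.array([[1.0, 0.9, 0.8],
                  [0.5, 0.6, 0.4],
                  [0.2, 0.1, 0.3]])

A = M_hat.T @ M_hat                                # inverse covariance matrix of the coefficients

sigma_single = 1.0 / np.sqrt(np.diag(A))           # one coefficient at a time, the others set to zero
sigma_marg = np.sqrt(np.diag(np.linalg.inv(A)))    # each coefficient, marginalized over the others
w = np.linalg.svd(M_hat, compute_uv=False)         # singular values, largest first
sigma_pc = 1.0 / w                                 # principal-component uncertainties

print("single-coefficient :", sigma_single)
print("marginalized       :", sigma_marg)
print("principal component:", sigma_pc)
```

For any such matrix, the marginalized uncertainties are at least as large as the single-coefficient ones, while the best-constrained principal component is at least as well constrained as any single coefficient. The worst-constrained principal component bounds the marginalized uncertainties from above.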

F. Fit with five SMEFT coefficients
Now suppose that we keep only the first five SMEFT coefficients in Table II nonzero. In this case, we have more SMEFT coefficients than observables, and so the individual coefficients cannot be fit unambiguously. Furthermore, because there are necessarily flat directions, the marginalization over some sets of SMEFT coefficients is ill-defined. Nevertheless, the SVD approach allows us to find meaningful constraints. The best-fit values and uncertainties of the first two principal components change little when the additional SMEFT contributions are introduced. This reflects the fact that the observables are relatively insensitive to the SMEFT contributions that are proportional to $C_{bg}$ and $C_{\phi tb}$, as can be seen from the small coefficients of $C_{bg}$ and $C_{\phi tb}$ in the first two principal components.
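The situation with more coefficients than observables can be sketched as follows. The matrix and data are random placeholders (three observables, five coefficients), and the relative cutoff `1e-12` used to identify vanishing singular values is an assumption of this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
M_hat = rng.normal(size=(3, 5))      # hypothetical: 3 unit-error observables, 5 coefficients
O_hat = rng.normal(size=3)

U, w, Vh = np.linalg.svd(M_hat, full_matrices=True)   # Vh is 5 x 5
n_con = int(np.sum(w > 1e-12 * w.max()))              # number of constrained principal components
constrained = Vh[:n_con]             # rows defining the constrained principal components
flat = Vh[n_con:]                    # rows spanning the exactly flat directions

P = (U.T @ O_hat)[:n_con] / w[:n_con]   # best-fit values of the constrained components
sigma_P = 1.0 / w[:n_con]               # their one-sigma uncertainties

C_min = constrained.T @ P               # minimum-norm coefficients with these components
shift = flat.T @ np.arange(1.0, flat.shape[0] + 1)
C_shifted = C_min + shift               # moved along the flat directions
```

Moving along the flat directions changes the individual coefficients by arbitrary amounts, but leaves both the predicted observables and the constrained principal components unchanged. This is why the principal components can be bounded even though the individual coefficients cannot.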

G. Fit with six SMEFT coefficients
Next we apply the SVD method to the complete set of six SMEFT coefficients in Table II.
We list only the three principal components that are constrained. Among the best-fit values and two-standard-deviation uncertainties for these principal components, we find $P_1 = -0.0645 \pm 0.100$. As can be seen, the first two principal components remain quite stable in best-fit value and uncertainty as new SMEFT coefficients are introduced, reflecting the relative insensitivity of the observables to the new SMEFT coefficients.

H. Fit with a flat direction in coefficient space
Finally, we examine the case in which the number of SMEFT coefficients and the number of observables are equal, but there is a hidden flat direction. In order to construct an example of this situation, we keep three SMEFT coefficients, $C_{tW}$, $C_{tg}$, and $C_{bW}$, nonzero and set the remaining SMEFT coefficients to zero. Let $a_{tW}$, $a_{tg}$, and $a_{bW}$ be the coefficients of $C_{tW}$, $C_{tg}$, and $C_{bW}$ in $O_{\rm SMEFT}$. Then, the replacement in Eq. (32), which involves a scale factor $r$ and a small parameter $\epsilon$, creates an approximate artificial flat direction in the space of $C_{tg}$ and $C_{bW}$. In the limit $\epsilon \to 0$, there is an exact flat direction in the space of SMEFT coefficients.
As a numerical example, we take the scale factor $r$ in Eq. (32) to be $-3.2$. As $\epsilon$ approaches zero, conventional fitting procedures that use gradients of $\chi^2$ to find a minimum in $\chi^2$ have numerical difficulties. Let us consider, for example, the situation for $\epsilon = 10^{-6}$. The Mathematica routine FindMinimum can be used to minimize the $\chi^2$. This routine comes with a number of options for the method to be used in finding the minimum. Using Mathematica version 11.3 [17], we find that the results depend on the method: for example, the conjugate-gradient method yields $C_{tW} = 0.0445$, whereas Newton's method yields $C_{tW} = 0.0410$ and $C_{tg} = 7.00 \times 10^{5}$.

The SVD of $M_{\rm SMEFT}$ implies that, among the principal components, only $P_1$ and $P_2$ are well constrained, while $P_3$ is not, as is evident from the near-vanishing of the corresponding diagonal value of $W$. We see that the principal-component/SVD method has constrained the principal components that contain a contribution that is proportional to $-3.2\,C_{tg} + C_{bW}$ and has identified as unconstrained the principal component that contains a contribution that is proportional to $C_{tg} + 3.2\,C_{bW}$, which corresponds to the flat direction. The best-fit values and two-standard-deviation uncertainties of the well-constrained principal components are $P_1 = -0.0655 \pm 0.0994$ and $P_2 = 0.136 \pm 0.344$. We can invert the relations in Eq. (38), using the rows of $V$, to obtain the coefficients in terms of the principal components [Eq. (40)]. From Eq. (40), it is easily seen that, to good approximation, the differences between the three results from FindMinimum correspond to differences in the value of $P_3$.
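How an SVD exposes a nearly flat direction of this kind can be sketched numerically. The precise form of the replacement in Eq. (32) is not reproduced here, so the sketch assumes the form $a_{tg} \to r\,a_{bW} + \epsilon\,a_{tg}$, which has the stated flat direction in the limit $\epsilon \to 0$. The vectors `a_tW`, `a_tg`, `a_bW` and the data vector `O_hat` are hypothetical numbers, not the top-quark-decay expressions.

```python
import numpy as np

r, eps = -3.2, 1e-6

# Hypothetical unit-error sensitivities of three observables to each coefficient.
a_tW = np.array([1.0, 0.2, -0.1])
a_tg = np.array([0.3, 0.5, 0.2])
a_bW = np.array([0.1, -0.4, 0.6])

# Assumed replacement: a_tg -> r * a_bW + eps * a_tg.
# Column order: C_tW, C_tg, C_bW.
M_hat = np.column_stack([a_tW, r * a_bW + eps * a_tg, a_bW])
O_hat = np.array([0.05, -0.1, 0.2])   # hypothetical data vector

U, w, Vh = np.linalg.svd(M_hat)
flat_dir = Vh[-1]                     # row of V^dagger with the smallest singular value
expected = np.array([0.0, 1.0, 3.2]) / np.hypot(1.0, 3.2)   # C_tg + 3.2 C_bW, normalized

# Keep only the two well-constrained principal components.
P_12 = (U.T @ O_hat)[:2] / w[:2]
sigma_12 = 1.0 / w[:2]

print("singular values:", w)
print("flat direction :", flat_dir)
```

In this sketch the smallest singular value is of order $\epsilon$, and the corresponding row of $V^\dagger$ is proportional, up to an overall sign, to the combination $C_{tg} + 3.2\,C_{bW}$. The first two principal components remain well defined as $\epsilon \to 0$.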
When $\epsilon = 0$ and there is an exact flat direction, numerical minimization of $\chi^2$ with respect to the SMEFT coefficients would fail to converge to a result. However, the SVD method still yields meaningful constraints.

For observables that depend nonlinearly on the SMEFT coefficients, the iterative procedure would yield best-fit values of the coefficients, but would not give accurate results for the principal components. Instead, one could compute the principal components as follows. First one could obtain the inverse covariance matrix by computing analytically two derivatives of $\chi^2$ with respect to the SMEFT coefficients and evaluating the result at the best-fit values of the coefficients from the iterative procedure. The inverse covariance matrix could be diagonalized by standard methods, and the principal components could then be obtained from the elements of the unitary transformation that effects the diagonalization. The uncertainties would be given by the inverses of the square roots of the diagonal components of the diagonalized inverse covariance matrix.
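The last step, obtaining principal components from the second derivatives of $\chi^2$, can be sketched as follows. The text envisages analytic derivatives; this sketch substitutes central finite differences for simplicity. The inverse covariance matrix is taken to be one half of the matrix of second derivatives of $\chi^2$, as is appropriate for the Gaussian $\chi^2$ used in this paper. The function names, the step size `h`, and the linear test problem are assumptions introduced for illustration.

```python
import numpy as np

def numerical_hessian(f, x, h=1e-3):
    """Matrix of second derivatives of f at x, by central finite differences."""
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        ei = np.zeros(n)
        ei[i] = h
        for j in range(n):
            ej = np.zeros(n)
            ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4.0 * h * h)
    return H

def principal_components_from_chi2(chi2, C_best):
    """Diagonalize the inverse covariance matrix evaluated at the best-fit point."""
    inv_cov = 0.5 * numerical_hessian(chi2, C_best)
    lam, V = np.linalg.eigh(inv_cov)        # inv_cov = V @ diag(lam) @ V.T
    return V.T, 1.0 / np.sqrt(lam)          # rows of V.T define the components; one-sigma errors

# Consistency check on a hypothetical linear problem, where the SVD result is known.
M_hat = np.array([[1.0, 0.9], [0.5, 0.6], [0.2, 0.1]])
O_hat = np.array([0.3, 0.1, 0.05])

def chi2_linear(C):
    return np.sum((M_hat @ C - O_hat) ** 2)

C_best = np.linalg.lstsq(M_hat, O_hat, rcond=None)[0]
Vt, sigma = principal_components_from_chi2(chi2_linear, C_best)
w = np.linalg.svd(M_hat, compute_uv=False)
```

In the linear case, the uncertainties obtained in this way coincide with the inverses of the singular values. In a nonlinear case, `C_best` would instead come from the iterative procedure, and the result would be valid only near that point.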

V. SUMMARY
In this paper, we have addressed a difficulty that appears in using experimental measurements of observables to constrain the Wilson coefficients in standard model effective theory (SMEFT). The difficulty is that the observables may be insensitive to certain linear combinations of SMEFT coefficients. That is, there may be "flat directions" in the space of SMEFT coefficients. This difficulty can arise because, in a partial analysis that is restricted to a particular set of physical processes, the number of experimental observables may be less than the number of SMEFT coefficients. In this case, it is clear that some linear combinations of SMEFT coefficients are not constrained. However, it can happen that some linear combinations of SMEFT coefficients are poorly constrained even when the number of observables is equal to or greater than the number of SMEFT coefficients to be fit.
We have advocated for the use of a principal-component analysis to address the issue of flat directions, and we have presented an efficient method for carrying out the principal-component analysis that is based on singular-value decomposition (SVD). The principal-component/SVD analysis isolates the principal components that correspond to poorly constrained linear combinations of SMEFT coefficients. Furthermore, the reduction to principal components allows one to ascribe independent uncertainties to linear combinations of SMEFT coefficients, thereby leading to more stringent bounds than one could obtain simply by marginalizing over some of the SMEFT coefficients. In particular, one can obtain meaningful bounds on SMEFT coefficients, even in a partial analysis in which the number of observables is less than the number of coefficients. We have demonstrated the application of this method to the process of top-quark decays to a $W$ boson and a $b$ quark.
A further difficulty in fitting SMEFT coefficients to data that we did not address explicitly in this paper is that the sheer numbers of SMEFT coefficients and observables may lead to a formidable computational problem. In this regard, we can mention that the SVD method has been used in another context to fit hundreds of coefficients to thousands of observables [15] and can do so in a few seconds of CPU time on a typical present-day desktop computer.
Although the method that we have presented is limited to the case in which the observables depend linearly on the SMEFT coefficients, we have outlined in Sec. IV an iterative extension of the method that can be applied to the non-linear situation, provided that the contributions of the non-linear terms to the observables are small in comparison with the contributions of the linear terms. This is the case if the SMEFT expansion converges.
Finally, the method that we have presented relies on a Gaussian probability analysis. While one might ultimately want to improve on a Gaussian approach, it should certainly be adequate for the purpose of carrying out exploratory studies in SMEFT.