Sharpening the shape analysis for higher-dimensional operator searches

When the Standard Model is interpreted as the renormalizable sector of a low-energy effective theory, the effects of new physics are encoded into a set of higher dimensional operators. These operators potentially deform the shapes of Standard Model differential distributions of final states observable at colliders. We describe a simple and systematic method to obtain optimal estimations of these deformations when using numerical tools, like Monte Carlo simulations. A crucial aspect of this method is minimization of the estimation uncertainty: we demonstrate how the operator coefficients have to be set in the simulations in order to get optimal results. The uncertainty on the interference term turns out to be the most difficult to control and grows very quickly when the interference is suppressed. We exemplify our method by computing the deformations induced by the ${\cal O}_{3W}$ operator in $W^+W^-$ production at the LHC, and by deriving a bound on ${\cal O}_{3W}$ using $8$ TeV CMS data.


Introduction
After the historical discovery of the Higgs boson [1,2] in July 2012, the Large Hadron Collider (LHC) is currently probing matter and spacetime at unprecedented small distances, looking for a signal of physics beyond the Standard Model (SM). However, new physics remains very elusive so far, as no statistically significant signal has been observed in the collected data up to now. Although there are good theoretical arguments to expect new particles nearby the TeV scale, it is also plausible that these states be somewhat too heavy to be on-shell produced at the LHC. In such scenario, the presence of new physics states is best studied using effective field theory methods and their effects can be parametrized by higher dimensional operators made of SM fields. The search for heavy physics beyond the SM then becomes a program of SM precision measurements, aiming at testing the existence of one or several of these higher dimensional effective operators.
The analysis of the distributions of final states kinematic variables plays a central role in this scenario, because their shapes contain important information about the presence of the effective operators. In principle, a thorough analysis of the shapes of the differential distributions may be the key to the discovery of new physics. However, such program is not so straightforward to carry out systematically, because of the large number of effective operators and kinematic variables to take into account, and because of the computational cost of estimating the differential rates. One-dimensional differential rates made available by ATLAS and CMS in run-I Higgs analyses have been included in the global fits performed in Refs. [3][4][5]. Many progresses are still needed in order to systematically and efficiently search for an arbitrary number of higher dimensional operators using multidimensional differential rates. An attempt to improve shape analysis techniques using the moments of the differential distributions has been done in [6].
From the theoretical point of view, it is crucial to be able to optimize the determination of the various contributions (SM, interference and BSM) that constitute the differential rates in presence of dimension-6 operators, especially when a numerical estimator (e.g. Monte-Carlo tool) is available . In the MadGraph5 framework [7], it has become possible to estimate separately the various components of the event rates induced by effective operators, however this useful improvement is not, for the time being, applicable to any kind of processes. It does not apply when several processes are combined, for example when asking for production of a particle on the mass-shell followed by its decay. Also, if the effective operators enter in the propagators, or if the energy dependence of the width has to be taken into account, the option cannot be consistently used.
Whenever we face the situation in which this MadGraph5 option to separate the various components cannot be used, or another event generator is used, an alternative method has to be invoked for the optimal determination of the contributions (SM, interference and BSM) that constitute the differential distributions. The derivation of such method, valid in general, is the topic of this work. This paper is organized as follows. The inspection of the event rates in presence of effective operators is done in section 2. Based on these considerations, an optimal method to estimate the deformations of differential rates induced by an arbitrary number of effective operators is presented in section 3. As illustration and check, in section 4, we obtain the deformations induced by the O 3W operator to the W W production differential rate and we reproduce, within two sigma, the bound obtained by CMS on the coefficient of the O 3W operator using 8 TeV data . Conclusions are presented in section 5.
2 Low-energy amplitudes and phase-space considerations

Effective theory basics
A field theory can sometimes be approximated by a simpler, so-called effective theory, in a given region of phase space. 1 Consider an ultraviolet (UV) theory that lies at a scale Λ which is much larger than the typical energy scale E of a given experiment (Λ E). From the UV point of view, the low-energy effective theory is obtained by expanding the correlation functions computed within the UV theory with respect to the parameter E/Λ.
It is very plausible that the Standard Model is not the ultimate UV-complete theory of nature and its Lagrangian corresponds to the relevant and marginal sector of a lowenergy effective theory. In this picture, the full SM effective Lagrangian also contains a series of operators of dimension higher than four, built from SM fields and invariant under SM symmetries, which are suppressed by inverse powers of the new physics scale Λ. To the low-energy observer, these higher dimensional operators parametrize the new physics effects, which appear as new interactions between the known particles, including vertices with extra derivatives. The SM effective Lagrangian takes the general form where The effective operators O I are dimensionless coefficients. In general, these coefficients can be complex, depending on the operator. However one can always split the operator into its real and imaginary parts in order to end up with only real coefficients. Because of this, in the following we will consider only real coefficients, without loss of generality.
Identifying Λ with the mass scale of a heavy particle, the effective field theory is valid only at energies E < Λ (2.3) and the contributions of the effective operators to any given observable are arranged as an expansion in E/Λ. 2 The power of the effective theory approach lies in the fact that this series can be consistently truncated in order to match a desired precision, which can be chosen accordingly to the available experimental uncertainties. At any given order of truncation, new physics effects are described by a finite set of universal coefficients. Larger effects are expected to come from operators with lower dimension: for example, the leading contributions to Higgs observables come from the dimension-6 Lagrangian L (6) , while the leading contributions to quartic neutral gauge boson interactions come from the dimension-8 Lagrangian L (8) . 3 Here and in the following we will truncate the EFT expansion at dimension 6. The complete list of independent dimension-6 SM effective operators have been reported in [11]. Let us consider, for concreteness, the case of a single dimension-6 effective operator. The general case involving an arbitrary number of higher dimensional operators is very similar. The effective lagrangian is then given by In the presence of the effective operator of Eq. (2.6), a generic amplitude M can be expressed as where M SM is the SM piece and M BSM represents the leading BSM contribution obtained by one insertion of the effective operator. The subleading contributions are given by Feynman diagrams with more than one insertion of the effective operator. Notice that, in general, the BSM component M BSM is not proportional to M SM . Physical observables, like the ones measured at the LHC, are described statistically by event rates σ that are given by the integral of the squared amplitude |M| 2 over some phase space domain D -which can be chosen by the experimentalist to some extent. Typical examples of such observables are total cross-sections and decay widths. In the presence of 2 Another bound comes from requiring a perturbative expansion of the effective interactions, which implies |α| Moreover, requiring perturbative couplings in the UV theory implies a bound on α, that depends on the number of fields n in the effective operator (or equivalently to the number of legs of the corresponding UV amplitudes). For dimension-6 operators one has with n ≤ 6. More details about the validity of the SM effective field theory in relation to possible UV completions can be found in [9]. 3 There can sometimes be selection rules suppressing the interference of certain dimension-6 operators with the SM [10]. In that case dimension-6 and 8 operators have comparable impact. Just in case, we stress that such feature has nothing to do with a possible breakdown of the E/Λ expansion. The effects of operators with d > 8 are still expected to be suppressed by powers of E/Λ with respect to the leading effects. Possible selection rules on the d > 8 operators would only increase even more this suppression.
The leading contributions to the square amplitude in presence of a dimension-6 operator.
the effective operator of Eq. (2.6), the observable event rate σ is given by 4 where M is given by Eq. (2.7) and D dΦ denotes the integral over the phase space domain. This event rate σ can be further decomposed as a sum of three leading contributions (2.10) The ellipses in Eq. (2.9) represent both contributions of higher order in 1/Λ, and O(Λ −4 ) terms coming from the interference of higher-order diagrams with the SM component. It will be made clear in next subsection why all the terms in Eq. (2.9) should be kept in order to describe the leading effects of new physics.
The leading components of a square matrix element in terms of Feynman diagrams is illustrated in Fig. 1. The component σ SM corresponds to the pure SM contribution, which remains in the limit α/Λ 2 → 0. The component σ int is obtained from the interference between the SM and BSM amplitudes, while the component σ BSM is the pure BSM contribution. From the point of view of a new physics searcher, these two latter components together constitute the new physics signal σ NP = σ − σ SM , while σ SM constitutes the irreducible background. 5 Notice that, by definition, σ int and σ BSM have not the same dimension as σ SM .
Finally, we notice the general fact that the modulus of the interference term has an upper bound In the following we will refer to differential cross sections as event rates. We remind that, strictly speaking, an event rate is rather defined as a cross section times the instantaneous luminosity of the collision. 5 A necessary condition for the shape analysis of a differential distribution to provide information on new physics is that M BSM be not proportional M SM . In practice, this condition is realized most of the time.
This is obtained using (2.12) where last step we have used the Cauchy-Schwartz inequality. In the following, we will refer to Eq. (2.11) as the "Cauchy-Schwartz bound" on the interference.

On the consistent truncation of event rates
We can now consider various limit cases of signal and background over a given phase space domain. Consider first a region of phase space D 1 where the SM contribution is much larger than the BSM contribution The Cauchy-Schwartz bound automatically implies that the interference is much smaller than σ SM , namely σ SM |αΛ −2 σ int |. The modulus of the interference term can in principle take any value between zero and 2|α|Λ −2 √ σ SM σ BSM . If we assume the case of an unsuppressed interference then we have (2.14) In this case, the SM background is large with respect to the signal, and the signal appears predominantly through the interference term σ int . The component σ BSM can thus be neglected at leading order in the effective theory expansion. For practical purposes, one may notice that the σ BSM term can also be kept, as long as it is negligible. Whenever the σ BSM term becomes non-negligible, the truncation of the EFT expansion has to be pushed to next-to-leading order: The higher order diagrams in Eq. (2.7) have to be taken into account, as well as the contributions from higher-order operators. In this paper our focus is merely on the leading effects new physics, thus aspects of the EFT at next-to-leading order are beyond our scope. Still in the case described by Eq. (2.13), if the interference is vanishing, then the signal comes mainly from σ BSM . In such case, one should be careful with the next-toleading order contributions of the EFT, as described in the paragraph above. The σ BSM component being O(1/Λ 4 ), other O(1/Λ 4 ) contributions coming from the interference of higher order diagrams or dimension-8 operators with the SM amplitude can be present. These observations have also been made in [9,10].
Consider now a phase space domain D 2 where the SM contribution is much smaller than the BSM contribution Using again the Cauchy-Schwartz bound, Eq. (2.15) automatically implies that the interference term is much smaller than the BSM contribution, namely α 2 Λ −4 σ BSM |αΛ −2 σ int |. Assuming an unsuppressed interference, we have This is the typical situation of a search for "rare events", where the SM background is vanishing and each signal event carries a high statistical significance. In this case, the new physics signal appears predominantly through the pure BSM term σ BSM . This implies that, even at leading order in the EFT expansion, one has to keep the σ BSM term. This fact might seem puzzling at first view, as one ends up with a O(1/Λ 4 ) term at leading order. Naively, there are other O(1/Λ 4 ) contributions coming from the interference of higher order diagrams or dimension-8 operators with the SM amplitude. However, one can easily check that the contribution σ BSM is actually the dominant one, because the other O(Λ −4 ) contributions would come from an interference with the SM amplitude, which is small by assumption (see Eq. (2.15)). We conclude from the above analysis that in general, both the interference term σ int and the quadratic BSM term σ BSM have to be kept in order to describe the leading effects of new physics for any background configuration. The σ int piece dominates when M SM αΛ −2 M BSM -provided that the interference is not suppressed-while the σ BSM piece dominates for M SM αΛ −2 M BSM . Finally, let us comment about specific event rates encountered in collider experiment. In the case of resonant production of an unstable particle, the propagator is resummed, so that the amplitude has not initially the form Eq. (2.7), but the form where Γ tot is the total decay width. This is instead the total width which has the form of a quadratic function, However, if the EFT expansion is valid, one can always expand |M| 2 res with respect to s/Λ 2 and m 2 /Λ 2 , to end up with a quadratic form |M| 2 res,SM + αΛ −2 |M| 2 res,int + α 2 Λ −4 |M| 2 res,BSM + O(Λ −6 ). Similarly, when using the narrow-width approximation in a production+decay process, the event rates take the form σ i ≡ σ prod Γ i /Γ tot , where σ prod is a production cross-section and Γ i a partial decay width. Again, the three quantities σ prod , Γ i , Γ tot are in principle quadratic functions of α. However, if the EFT expansion is valid, one can always expand σ i to quadratic order so that We can see that the EFT expansion always allows us to express the event rate as a quadratic function in α/Λ 2 , even in case of resonances. The discussion of this section applies similarly to the case of an arbitrary number of effective operators.
A comment on the relevance of rare events One may remark that in the case where we have M SM αΛ −2 M BSM and an unsuppressed interference, the signal is proportional to α/Λ 2 . On the other hand, in the case M SM αΛ −2 M BSM , the signal is proportional to α 2 /Λ 4 . The absolute magnitude of the signal is thus much larger in the former case than in the latter one, but the magnitude of the background is also larger in the former case. We may therefore wonder which of these two configurations is the more advantageous in order to detect the signal. The answer to this seemingly straightforward question is not so easy, and needs to involve a test statistic.
Assume a signal discovery test whose significance Z is given by 6 where N is the total number of events and N bkg is the expected number of background events. Let D 1 and D 2 be two domains of phase space satisfying are the components of the event rate σ on D 1,2 . Denoting Z 1 and Z 2 the discovery significances on D 1 and D 2 respectively, we have The proof is given in App. A. This small theorem makes clear that the regions of phase space where |M SM | |αΛ −2 M BSM |, i.e. "rare events" regions, can provide a lot of statistical significance, even though the signal is much weaker in that region compared to the |M SM | |αΛ −2 M BSM | region. The signal search in such "rare events" regions deserves thus a particular attention. We further emphasize that this "rare events" situation naturally happens whenever a higher dimensional operator carries derivatives, as is the case for many of the SM dimension-6 operators. For such operators, the high-energy tails of the kinematic distributions are typically "rare events" zones. The analysis of high-energy tails deserves thus particular attention. For example, regarding presentation of results, the use of overflow bins for the high-energy tail should be prevented as much as possible in order not to lose the precious information from the tail. Instead, all the bins should be kept up to the highest energetic event, no matter how few events are contained in the bins.

Optimal estimation of the differential rates
In the previous section we have considered event rates over a region of phase space D.
Let us now assume that the experiment allows to partition the region D into subdomains such that D = ∪ r D r . We refer to the D r subdomains as "bins". In addition to the total rate, the knowledge of the event rate in each bin provides information about the shape of the distribution. From the experimental point of view one talks about binned distribution. In principle, the size of each bin can be made small enough so that each event is seen separately. In this case one talks about unbinned distribution. Notice that, since the experimental precision is finite, the bin size can never shrink to zero, however it is instructive to keep in mind that the latter case can be seen as a limit of the former.
Let us first consider the case of one single dimension-6 effective operator whose contribution to the amplitudes is given by Eq. (2.7). Let X be a variable of phase space with domain D X ⊂ D. For a collider experiment X can be some transverse momentum, invariant mass, . . . One defines the "differential event rate" along X as where D\D X means that the integral is performed over the complement of D X . Integrating σ X over the domain D X one recovers the total rate, namely D X dX σ X = σ. This σ X should be used when treating unbinned data. The bins D Xr over the domain D X are defined by the partition D X = ∪ r D Xr and the event rate σ r on a bin D Xr is then given by The set of the rates on every bin {σ r } forms an histogram, that constitutes a discrete estimator of the true differential distribution σ X . Let us now proceed to decompose the differential rate σ X defined in Eq. (3.1) as a sum of three components σ SM X , σ int X and σ BSM X , in complete analogy to the case of the total rate σ. We have The same is also true for the binned rates The phase space integral is usually difficult or impossible to evaluate analytically, for example because of the complexity of D. Its evaluation has then to rely on a numerical integration method, for instance a Monte-Carlo simulation. In the following we are going to assume that such estimation method is available.

Reconstructing the differential rates
Assuming we have a way of evaluating a differential rate in presence of effective operators with coefficients fixed to given values, we can now wonder how to efficiently determine the deformations induced by the effective operators.
We first consider the case of a unique dimension-6 operator. We have seen in Sec. 2 that the expansion of the event rate has to be kept up to quadratic order in α and the same argument applies also to the differential rate σ X . Being σ X a quadratic function of α (see Eq. (3.3)), in principle, it is sufficient to know σ X for only three different values of α, namely α 0 , α 1 and α 2 , in order to reconstruct the exact form of σ X (α). Whenever these three evaluations of σ X are available, that we denote by σ i X = σ X (α i ), i = 0, 1, 2, then the three components σ SM X , σ int X and σ BSM X are obtained by just solving a 3 × 3 linear system and this simple task needs to be carried out only once.
In the case in which the estimations are made by means of Monte Carlo simulations, one directly deals with the binned rates σ r which are extracted from the histogram of the σ X distribution. The three components σ SM r , σ int r , σ BSM r are obtained by solving a 3 × 3 linear system for each bin. The σ SM r component can be obtained by simply setting α = α 0 = 0 in the MC simulation. The components σ int r and σ BSM r are instead obtained by running the MC simulation for two non-zero values of α, namely α 1 and α 2 . Therefore, we end up with the following solution of our linear system where σ 0 r = σ r (0), σ 1 r = σ r (α 1 ) and σ 2 r = σ r (α 2 ). These components can then be used in Eq. (3.6), "reconstructing" the formula that gives σ r for any value of α.
Let us consider the general case of n effective operators. We present only the case of dimension-6 operators for simplicity. A similar analysis applies to the case of operators with arbitrary dimension. The amplitude has in general the form The n BSM contributions M BSM I are in general different one from each other. The differential event rate σ X is proportional to the squared modulus of the amplitude and can be decomposed as It is convenient to rewrite the sum of the BSM quadratic contributions as In addition to n interference terms σ int X,I we have n terms σ BSM X,II and n(n − 1)/2 terms σ BSM X,IJ . Including the σ SM X component, the total number of terms to compute is (n + 1)(n + 2)/2. We conclude that (n + 1)(n + 2)/2 simulations are enough to exactly know the event rate σ X as a function of the operators coefficients. The components are obtained by simply solving a (n + 1)(n + 2)/2 × (n + 1)(n + 2)/2 linear system. This operations needs to be done only once. If one uses histograms, the system has to be solved once for each bin, just like in the case of a single operator.
We conclude that the complexity of the simple reconstruction method describe hereabove grows quadratically with the number of operators, once simplifications provided by the EFT expansion are taken into account.

Minimizing the uncertainties
The numerical evaluations of σ X that are needed to reconstruct σ X (α) are usually endowed with an intrinsic uncertainty. The correct approach is to think about estimators of σ X for each value of the chosen α. In the case of a single dimension-6 operator the estimators are three and we denote them byσ i X =σ X (α i ), i = 0, 1, 2. The uncertainties associated to these estimators are naturally propagated to the σ SM X , σ int X , σ BSM X components, because these quantities are linear combinations of the σ i X . In turn, these uncertainties are propagated to the reconstructed σ X (α) distribution because of the relation in Eq. (3.3) and, when it comes to confronting σ X (α) to the observed distribution σ obs X , the uncertainty on σ X (α) should be taken into account. It is therefore important to have a measure of this uncertainty, that is given by the covariance matrix of the estimators of the σ SM X , σ int X , σ BSM X components. In principle, there is a freedom in choosing the α coefficients that are used to perform the numerical estimations of σ X which are needed to reconstruct σ X (α). If there were no uncertainties, any set of values would work fine. However, in presence of uncertainties, it turns out that the choice of the α coefficients has a crucial impact on the reconstruction uncertainties. In the following we will determine the values of α that minimize these uncertainties.
Let us work with the binned distributions σ r and focus on a bin r. The three estimators are denoted byσ 0 r =σ r (α 0 ),σ 1 r =σ r (α 1 ),σ 2 r =σ r (α 2 ). We introduce the relative variance of each estimatorV i r , which is given bȳ where E[ŷ] represents the expectation value of the random variableŷ. No correlation is assumed among estimators related to different values of α i . Furthermore, we assume that the three relative variances have the same magnitude, namelȳ In case the estimators are obtained through Monte Carlo simulations with N MC points, we haveV r ∼ 1/N MC . The estimators of our interest areσ SM r ,σ int r andσ BSM r , which are expressed in terms of some linear combinations of theσ i r as shown in Eq. (3.7). The quantity at the center of our analysis is the relative covariance matrix of these estimators which is given bȳ It is important to notice that for the sake of determining the σ SM X , σ int X and σ BSM X components, there is no need to use values of α and Λ that respect the EFT validity bounds of Eq. (2.3). Note that to apply this trick, the possible higher order terms in the expressions of the event rates must be set to zero to avoid any disturbance. 7 After these preliminary steps, we can discuss the uncertainties that affect theσ SM r , σ int r andσ BSM r estimators. We aim at minimizing simultaneously the uncertainties for every component. We should therefore consider the trace of the relative covariance matrix of Eq. (3.13). Let us first notice that choosing α 0 = 0 gives simplyσ SM r =σ 0 r and therefore the relative variance forσ SM r is simplyV SM r =V r . This choice is arguably optimal and we are left with finding optimal values for α 1 and α 2 .
For a fixed value of α 1 it turns out that trC r (0, α 1 , α 2 ) is minimized for α 2 going to infinity 8 and this runaway direction can be seen in Fig. 2. In this limit, theσ 2 r estimator corresponds exactly toσ BSM r with relative variance equal toV r and vanishing correlation 7 We notice that, as the EFT validity bounds do not matter for the sake of determining the components of σr(α), only the α/Λ 2 combination actually appears in the problem. A value for Λ could thus be fixed without loss of generality. In the text we will not make such assumption. For Fig. 2, it will be convenient to choose Λ so that σ SM (3.14) withσ SM r . In this limit the relative covariance matrix takes the form The trace of the relative variance matrix goes to infinity for both α 1 → 0, ∞ because the linear system becomes degenerate in these limits. Studying further the behaviour of trC r (0, α 1 , ∞) we find out that it admits two minima for It is a non-trivial feature that this expression is independent on σ int r . This implies that the optimization does not depend on whether the interference is suppressed.
The behaviour of the trace ofC r (0, α 1 , α 2 ) is shown in Fig. 2, where we have taken Λ to be such that Eq. (3.14) is verified. One can observe from the plot that the minimum of the trace of the relative covariance matrix occurs for α 1 = 1, α 2 = ∞ or vice-versa. The relative covariance matrix at the positive optimal α 1 takes the following form Notice that the minimal relative covariance being by definition dimensionless, it can depend on the components of the differential rate only through the combinationσ r /σ int r . This quantity is also the one appearing in the Cauchy-Schwartz bound of Eq. (2.12) that now can be rewritten as |σ int r | ≤ 2σ r . Considering the two values of σ int r that saturate the Cauchy-Schwartz bound, the relative covariance matrix at the minimum assumes the following form for σ int r = +2σ r and σ int r = −2σ r respectively. The relative uncertainty on the interference component is given by the square root of (C r ) 22 . From Eq. (3.19) one sees that the uncertainty on the interference component is larger than the ones on σ SM r , σ BSM r by at least a factor 3/ √ 2 when σ int r is positive. In case of a maximal negative interference, the optimal α 1 corresponds in fact to a fully destructive interference, σ 1 r = 0, and the expected uncertainty associated to this result is vanishing. This can be directly seen in the covariance matrix (Eq. (3.19), right side), which features a zero eigenvalue. This situation of a large negative interference gives the smallest uncertainties possible. In contrast, whenever the interference is suppressed, |σ int r | < 2σ r , the uncertainty on the interference component quickly blows up. For example, taking σ int r = ±0.2σ r , we get (C r ) 22 ≈ 175 and (C r ) 22 ≈ 130 respectively. Our calculation provides a quantitative estimate of the difficulties that will be encountered for determining the interference component.
If one evaluates the minimal relative covariance matrix at the negative optimal α 1 of Eq. (3.16), Eq. (3.19) and the subsequent expressions and discussions are valid up to a sign flip of σ int r . Let us notice that if the sign of σ int r is known in advance, the sign of the optimal α 1 can then be chosen so that the interference term is negative, which gives a better uncertainty on the estimation of σ int r . But this refinement matters mostly for an interference near the Cauchy-Schwartz bound.
Finally, one may notice that the optimization is in principle different for each bin, as made clear by the r-dependence of the optimal point given in Eq. (3.16). However a simpler version of the optimization can also be obtained by using the α 1 that minimizes the uncertainty on the total rates, namely α 1 = Λ 2 (σ SM /σ BSM ) 1/2 .

Case of n operators
Let us discuss the reconstruction method in the case of n effective operators. We have seen in section 3.1 that (n+1)(n+2)/2 simulations have to be performed to fully reconstruct the event rate σ X as function of the α I 's. In practice, it turns out it is sufficient to switch on one or two effective operators at a time in order to reconstruct all the components. Using the results of the previous section, we can provide the complete list of (n + 1)(n + 2)/2 optimized points to be used for the simulations. We have: 1) One point with all α I = 0.
2) For every I, a single non-zero α I satisfying α I 1. These are n points.
We thus end up with a system of (n + 1)(n + 2)/2 linear equations that has to be solved. A useful side effect of the optimization described in the previous subsection is that the calculation of the various components becomes more transparent. In particular, 1) provides σ SM X and 2) provides the σ BSM X,II components. 3) provides the interference terms σ int X,I which can be simply obtained by subtracting σ SM X + α 2 I Λ −4 σ BSM X,II from the outcome of the simulations in 3), instead of using the exact formulas of Eqs. (3.7). This simplification is a consequence of having used arbitrary large α I 's in 2). Similarly, 4) provides the σ BSM X,IJ components, which are obtained by subtracting α 2 I σ BSM X,II +α 2 J σ BSM X,JJ from the outcome of the simulations in 4). An analysis similar to the one of section 3.2 shows that the uncertainty on the σ BSM X,IJ terms is minimized for α I σ BSM X,II ∼ α J σ BSM X,JJ .
Comments on Monte Carlo simulations Differential distributions are often estimated using Monte Carlo simulations, which reproduce the experimental setup assuming a fixed number of events N MC or a fixed integrated luminosity L MC . The expected, relative uncertainties associated with estimation of the event rate is given by ) in both cases (see App. B). In order to search for a small deviation in a given set of data, one should require that the MC error be small with respect to the statistical error of the data. For binned data, the number of events in each binN r is Poisson-distributed and its relative statistical error is given by 1/ N r . The requirement that the MC error be small with respect to the experimental error in every bin translates into the condition N MC,r N r for any bin r . (3.20) Note that this condition depends on the actual data one wants to analyse. It applies for both background-only and signal hypothesis, i.e. for both α = 0 and α = 0. Finally, the binning and range of all the MC histograms have to match the bins chosen for the data and are thus completely fixed. and its coefficient is denoted by α 3W Λ 2 . After electroweak symmetry breaking, it contributes to anomalous triple gauge couplings [13] that can be parametrized as follows 9 The lagrangian L ∂ CGC induces new vertices among the weak gauge bosons which carry extra derivatives with respect to the Standard Model ones. This new interactions will potentially deform the W W differential rates at the LHC, especially in the high energy range of the distributions. Figure 3. Differential distributions of dilepton invariant mass. The σ SM m ll , σ int m ll , σ BSM m ll components normalised to σ SM are shown from left to right. The differential distribution from CMS data is shown in red.
A search for the O 3W operator has been performed by the CMS collaboration in [15] where they consider W + W − production in the leptonic decay channel at the LHC, with an integrated luminosity of 19.4 fb −1 at 8 TeV center-of-mass energy. In [15] the same operator is defined using a different normalization convention and the translation to our notation is done by using the following relation: Our aims are: • Determining the deformations of the differential rates in W + W − production induced by the O 3W operator using our optimized technique for Monte Carlo simulations, • Deriving a 95% CL bound on α 3W Λ 2 using the measured differential distributions of [15]. Therefore we consider the process pp → W + W − → l + νl −ν (l = e, µ) at 8 TeV center-ofmass energy. The measured unfolded differential distributions for this process are displayed in Fig. 3 of [15] and they can be consistently compared to the outcome of our MC simulations. The dilepton invariant mass (m ll ) distribution in the "0-jet category" is the one chosen to put a bound on α 3W /Λ 2 . The total number of W + W − + background events measured in the 0-jet category is ∼ 4800 and the quoted 95% CL bound [15] translated to our notation is We simulate events for W + W − production at 8 TeV with MadGraph5 [7] after having implemented the O 3W perator in FeynRules2.0 [16]. These events are showered with Pythia8 [17] and selected using the cuts chosen in the CMS analysis. In particular, the leptons are required to have p T > 20 GeV and |η| < 2.5. Events with one or more jets with p T > 30 GeV and |η| < 4.7 are rejected. Following the notation introduced in the previous sections, the differential rate along the m ll variable will be denoted by σ m ll . We consider the binned distributions for σ m ll in the range [20,200] GeV, accordingly to the choice made in the CMS analysis. We evaluate the components of the binned m ll distribution following our optimal method described in section 3. We first compute the SM component σ SM m ll by setting α 3W = 0 in our Monte Carlo simulation. Then we compute the m ll distribution for a very large value of α 3W /Λ 2 , chosen to be 272 TeV −2 , which turns out to be proportional to the BSM component σ BSM m ll to a very good approximation. The binned σ SM m ll and σ BSM m ll components are shown in Fig. 3.
In order to compute the interference component σ int m ll following the results of section 3.2, we have used Eq. (3.16) to determine the third optimal value of α 3W /Λ 2 . The σ SM /σ BSM ratio being roughly about 100 TeV −4 in most of the bins, we conclude that the third optimal point for the simulation is roughly α 3W /Λ 2 ≈ 10 TeV −2 .
We still need to set the size of our MC samples. We require, for every bin, the number of MC events to be larger than the observed data analyzed by the CMS collaboration in the search for the O 3W operator. For example, N MC = 5 · 10 5 events would give a MC uncertainty that is typically ten times smaller than the statistical ones. Although this number is enough for the sake of analyzing the CMS distribution, it turns out that this amount is not sufficient to resolve the binned interference components because the interference is further suppressed by a m 2 W /E 2 factor with respect to the naive expectation. This can be understood thanks to some helicity selection rules [10,18].
It turns out that a much larger number of events is needed to properly estimate the interference component. We have indeed used N MC = 2.4 · 10 6 events for each of the three simulation points α 3W /Λ 2 = 0, 8.6, and 272 TeV −2 . The relative covariance matrix calculated in section 3.2 readily provides the uncertainty on the σ SM m ll , σ int m ll and σ BSM m ll components.
The uncertainties on σ SM m ll and σ BSM m ll are too small to be visible in Fig. 3. In contrast, the uncertainty on σ int m ll is not negligible, even with such a large MC sample. Having fixed a unique optimized value of α 1 for every bin and taking into account that the number of MC events in a given bin depends on this α 1 , the uncertainty on σ int r is obtained from Eq. (C.2), where the relative variances associated to a given bin is These uncertainties are shown in Fig. 3. Finally, we compare our reconstructed differential rate for pp → W + W − → l + νl −ν to the measured one shown in Fig. 3 of the CMS study [15]. The uncertainties quoted are a combination of statistical and systematic errors. Because combinations of many sources of uncertainties tend to be governed by the central limit theorem [19], we approximate the likelihood for each bin as a Gaussian. The complete likelihood used in our analysis is (4.7) where σ obs r /σ SM r are the experimental numbers given in [15] and ∆ r are the combined uncertainties for each bins, which are typically ∼ 8%. Using Eq. (4.7), we compute the credible intervals for the α 3W /Λ 2 parameter, assuming a flat prior over [−100, 100] TeV. The posterior distribution for α 3W Λ −2 is shown in Fig. 4. We find This bound is in agreement within less than two sigma with the one reported by CMS (see Eq. (4.5)).

Conclusion
If new particles beyond the Standard Model are too heavy to be on-shell produced at the LHC, their presence can still be indirectly detected via the effect of SM higher dimensional operators. In such scenario, the LHC precision physics program would play a central role for new physics searches. The key observables for revealing the existence of higher dimensional operators may be the distributions of final state kinematic variables, that contain precious information about new physics effects. The analysis of these differential rates thus deserves to be optimized in all its aspects. We focus on the case were leading effects from new physics arise from dimension-6 operators. We first inspect the event rates and the problem of detecting the deformations induced by the presence of the effective operators. We make clear that, in general, the pure BSM term should not be neglected in the differential rate analysis-even though it is O(α 2 /Λ 4 ), since there can be regions of phase space where its contribution is dominant. We have also found a bound on the interference term which follows from the application of the Cauchy-Schwartz inequality. Using this bound, it can be quantitatively shown that regions of phase space with rare SM events are very important to search for deformations of the differential rates.
Based on this preliminary analysis, we determine an optimal method to obtain the different contributions to the differential rates in the presence of dimension-6 effective operators, assuming that an estimator (e.g. a Monte Carlo tool) of the distributions is available. In the case of n effective operators, the evaluation of the rate at (n + 1)(n + 2)/2 different points are needed. The various contributions to the differential rate are then simply obtained by solving a linear system.
A crucial aspect of the proposed method is the minimization of the uncertainty through an optimal choice of the higher dimensional operators coefficient to be used in the simulations. In the case of a single dimension-6 operator we have to estimate the differential rate for three values of the coefficient α. The analysis of the relative covariance matrix of the three estimators reveals that the uncertainty is minimized for values of α equal to zero, infinity, and ±Λ 2 σ SM X /σ BSM X . Interestingly, this result turns out to be independent of the value of the interference. The covariance matrix provides the uncertainties on the estimated contributions and their correlations, allowing a well-defined control of the estimations. This covariance matrix should be in principle implemented in any subsequent statistical analysis. It turns out that the uncertainty on the interference component is larger than the ones on the SM and BSM components by a factor 3/ √ 2 if the interference saturates the positive Cauchy-Schwartz bound, and grows very quickly if the interference is smaller than this value.
We illustrate and check our method by determining the deformations induced by the O 3W operator on leptonic final states from W W production, at the 8 TeV LHC. We work at reconstruction level, aiming to approximately reproduce the analysis made by the CMS Collaboration in Ref. [15]. We ultimately reproduce within a two sigma range the bound on α 3W /Λ 2 obtained in this CMS analysis.

Note added
After completing our work we became aware of similar developments made by members of the Higgs Cross Section Working Group [20] and the ATLAS collaboration [21]. The basic method presented in these references (called morphing) is essentially the same as our reconstruction method described in Section 3.1, although the parametrization that has been used is slightly different. However, the question of how the input parameters need to be chosen such that the expected uncertainty of the output is minimal has not been addressed in these studies. Our paper fills this important gap by presenting, for the first time, an optimal morphing method and the statistical approach which gives rise to it. where σ is the theoretical rate. The expected relative error associated with the estimation of σ is thus given by 1/ √ σL MC . Instead of a fixed luminosity, one can require a fixed number of events N MC . In this case, the random variable is the MC luminosityL MC . The combination σL MC follows an Erlang distribution