Direct CP violation and the $\Delta I=1/2$ rule in $K\to\pi\pi$ decay from the Standard Model

We present a lattice QCD calculation of the $\Delta I=1/2$, $K\to\pi\pi$ decay amplitude $A_0$ and $\varepsilon'$, the measure of direct CP-violation in $K\to\pi\pi$ decay, improving our 2015 calculation of these quantities. Both calculations were performed with physical kinematics on a $32^3\times 64$ lattice with an inverse lattice spacing of $a^{-1}=1.3784(68)$ GeV. However, the current calculation includes nearly four times the statistics and numerous technical improvements allowing us to more reliably isolate the $\pi\pi$ ground-state and more accurately relate the lattice operators to those defined in the Standard Model. We find ${\rm Re}(A_0)=2.99(0.32)(0.59)\times 10^{-7}$ GeV and ${\rm Im}(A_0)=-6.98(0.62)(1.44)\times 10^{-11}$ GeV, where the errors are statistical and systematic, respectively. The former agrees well with the experimental result ${\rm Re}(A_0)=3.3201(18)\times 10^{-7}$ GeV. These results for $A_0$ can be combined with our earlier lattice calculation of $A_2$ to obtain ${\rm Re}(\varepsilon'/\varepsilon)=21.7(2.6)(6.2)(5.0) \times 10^{-4}$, where the third error represents omitted isospin breaking effects, and Re$(A_0)$/Re$(A_2) = 19.9(2.3)(4.4)$. The first agrees well with the experimental result of ${\rm Re}(\varepsilon'/\varepsilon)=16.6(2.3)\times 10^{-4}$. A comparison of the second with the observed ratio Re$(A_0)/$Re$(A_2) = 22.45(6)$, demonstrates the Standard Model origin of this"$\Delta I = 1/2$ rule"enhancement.


I. INTRODUCTION
A key ingredient to explaining the dominance of matter over antimatter in the observable universe is the breaking of the combination of charge-conjugation and parity (CP) symmetries. The amount of CP violation (CPV) in the Standard Model is widely believed to be too small to explain the dominance of matter over antimatter, suggesting the existence of new physics not present in the Standard Model. CPV in the Standard Model is highly constrained, requiring the presence of all three quark-flavor doublets and described by a single phase [3]. These properties imply that the "direct" CPV in K → ππ decays is a highly suppressed O(10 −6 ) effect in the Standard Model, making it a quantity which is especially sensitive to the effects of new physics in general, and new sources of CPV in particular.
Direct CPV was first observed in K → ππ decays by the NA48 (CERN) and KTeV (FermiLab) experiments [4,5] in the late 1990s, and the most recent world average of its measure is Re(ε /ε) = 16.6(2.3) × 10 −4 [6], where ε is the measure of indirect CPV ( |ε|= 2.228(11) × 10 −3 ). However, despite the impressive success of these experiments, it was only recently that a reliable, firstprinciples Standard Model determination of ε that could be compared to the experimental value became available. This is due to the presence of low-energy QCD effects that are difficult to model reliably.
Lattice QCD is the only known technique for determining the properties of low-energy QCD from first principles with systematically improvable errors. In this regime the high-energy physics is precisely captured by the ∆S = 1 weak effective Hamiltonian, where the Fermi constant G F = 1.166 × 10 −5 GeV −2 , V q q is the Cabibbo-Kobayashi-Maskawa matrix element connecting the quarks q and q and τ = −V * ts V td /V * us V ud . The quantities z i and y i are the Wilson coefficients that encapsulate the high-energy behavior, and which have been computed to next-to-leading-order (NLO) in QCD perturbation theory and to leading order (with some important NLO terms) in electroweak perturbation theory [7], in the MS scheme as a function of the scale µ. The task of the lattice calculation is to determine the matrix elements ππ|Q i (µ)|K 0 of the weak effective operators Q i renormalized in a scheme which can be defined non-perturbatively.
A further perturbative calculation is subsequently necessary to match such matrix elements to those in the MS scheme. Conventionally, as shown in Eq. (1), the weak Hamiltonian is expressed in terms of 10 operators {Q i } 1≤i≤10 (as defined, for example, in Eqs. (4.1) -(4.5) of Ref. [7] ) that are linearly dependent due to the Fierz symmetry. More convenient for our purposes is a second, 7-operator "chiral" basis [8] {Q j } j=1,2,3,5, 6,7,8 in which the operators are linearly independent and transform as irreducible representations of SU(3) L × SU(3) R . The relationship between these bases is discussed in more detail in Sec. VI B.
For an isospin-symmetric lattice calculation it is convenient to formulate the K → ππ matrix elements in terms of two amplitudes, A 0 and A 2 , where A I = (ππ) I |H W |K 0 and the subscript indicates the isospin representation of the two-pion state. These correspond to ∆I = 1/2 and ∆I = 3/2 decays, respectively. From these amplitudes, ε can be obtained directly as where δ I are the ππ scattering phase shifts and ω = Re(A 2 )/Re(A 0 ). Note that the effects of isospin breaking and electromagnetism are not included in our simulation and are instead treated as systematic errors as discussed in Sec. VIII D.
In In order to obtain on-shell kinematics, i.e. to ensure that E ππ , the energy of the two-pion final state, satisfies E ππ = m K , we exploit the possibility of choosing appropriate spatial boundary conditions. With periodic boundary conditions for all the quarks, the ground state of the two-pion final state corresponds to E ππ = 2m π , with each of the pions at rest, and the state with E ππ = m K appears as an excited state. We would therefore need to resort to multi-state fits to rather noisy data in order to obtain the physical amplitudes. The change in the finite-volume corrections induced by modifying the boundary conditions is exponentially small [9,10] or else accounted for by the Lüscher and Lellouch-Lüscher [11,12] prescriptions with minor alterations [13].
In our calculation of A 2 [2,14,15] we employ antiperiodic spatial boundary conditions (APBC) for the down quark in some or all directions, which results in the charged pions also obeying cor-responding antiperiodic boundary conditions. The momenta of the charged pions are therefore discretized in odd-integer multiples of π/L in these directions, where the spatial volume of the lattice is V = L 3 . Since only the spectrum of the charged pions is changed by the APBC, we compute K + → π + π + matrix elements of operators which change I z , the third component of isospin, by 3/2 and then use the Wigner-Eckart theorem to obtain the physical K + → π + π 0 amplitude which is proportional to A 2 . Note that in order to ensure that E ππ = m K , L must be appropriately tuned.
The technique of using APBC on the down quark naturally breaks the isospin symmetry. For the ∆I = 3/2 calculation this symmetry breaking does not pose an issue because the final state of the measured K + → π + π + matrix element is the only doubly-charged two-pion state and therefore cannot mix with other ππ states due to charge conservation. However, the final state in the ∆I = 1/2 matrix elements has isospin 0 and is a linear combination of |π + π − and |π 0 π 0 states. Thus the breaking of isospin symmetry at the boundaries results in different energies for the |π + π − and |π 0 π 0 states and the APBC technique cannot be used. As a result, for the calculation of the ∆I = 1/2 amplitude we instead utilize G-parity boundary conditions (GPBC). G-parity is the combination of charge conjugation and a 180 • isospin rotation about the y-axis,Ĝ =Ĉe πÎ y .
The charged and neutral pions are both odd eigenstates of this operation, hence when applied as a boundary condition all pion states again become antiperiodic in the spatial boundary. While more general than the APBC approach, the use of GPBC introduces a number of technical and computational difficulties that we discuss in Ref. [10] and below.
Note that due to the ππ interaction being repulsive in the I = 2 channel but attractive in the I = 0 channel, the finite-volume ππ energies in these two representations differ at fixed lattice size and it is therefore not possible to use the same ensemble to measure both the ∆I = 3/2 and ∆I = 1/2 decay amplitudes with on-shell kinematics. In this document we present a detailed update of the calculation of A 0 and will combine it with the results for A 2 given in Ref. [2].
Among the necessary ingredients in the lattice calculation of the K → ππ matrix elements are the two-pion energies E ππ and the amplitudes ππ|O ππ |0 , where O ππ is an interpolating operator that can create the required two-pion state from the vacuum. These quantities are determined by correlation functions describing the propagation of the two-pion state. The matrix elements ππ |Q i |K 0 are obtained from K → ππ correlation functions in which the Euclidean time dependence is exponential in E ππ , and the amplitudes corresponding to the annihilation of the two-pion state (and the creation of the kaon state) have to be removed. From the measurement of E ππ and using the Lüscher formula [11], we also determine the s-wave isospin 0 ππ-phase shift, δ 0 (m K ), which enters the expression relating the matrix elements to (ε /ε), Eq. (2). The derivative of the phase-shift with respect to the energy is additionally required to determine the power-like (i.e. nonexponential) finite-volume corrections through the Lellouch-Lüscher formula [12] (cf. Sec. VI A).
The observation of a discrepancy from the predicted phase shift increased our motivation to extend the earlier calculation by increasing the statistics and using more sophisticated methods to better analyze the I = 0 two-pion system. In Ref. [1] we observed excellent stability in the determination of the ground-state two-pion energy E ππ ; the result was consistent between oneand two-state fits to our data (i.e. whether we assumed that just the ground-state was propagating or allowed for a contribution from an excited state) and was also independent, within the uncertainties, of the time separation between the insertion of the creation and annihilation operators (the O ππ introduced above). Nevertheless, we considered the best explanation for the discrepancy to be contamination from one or more excited states whose contribution with increasing time is masked by the rather rapid reduction in the signal-to-noise of our data. Therefore, in addition to increasing our statistics by more than a factor of 3, we have introduced two additional ππ interpolating operators. For our original calculation we used a ππ operator comprising two quark bilinear operators that create back-to-back moving pions of a particular momentum. Alongside this operator, which we label ππ(111), we have now added a scalar operator σ = 1 √ 2 (ūu +dd), and an operator creating pions with larger relative momenta that we label ππ(311). Here the number appearing in the parentheses of the ππ(. . .) operators is related to the components of the pion momentum in lattice units: (xyz) → (±x, ±y, ±z)π/L (the total ππ momentum is zero in all cases). Here and for the remainder of this document we will assume the lattice size L to be in lattice units unless otherwise stated. All three operators, once suitably projected onto a state that is symmetric under cubic rotations, have the same quantum numbers as the s-wave I = 0 two-pion state of interest and as such project onto the same set of QCD eigenstates, albeit with different coefficients.
In Ref. [18] we demonstrate that a simultaneous fit to the 3 × 3 matrix of ππ two-point correlation functions in which the two-pion states are created or annihilated by one of these three operators, results in a substantial reduction in the statistical and systematic errors. We find that, once the excited states are taken into account, the resulting I = 0 ππ-scattering phase shift at E lat ππ = 479.5 MeV is δ 0 (E lat ππ ) = 32.3(1.0)(1.8) • , where the errors are statistical and systematic, respectively. This significant increase in our result for δ 0 (E lat ππ ) brings us into much closer agreement with the dispersive prediction, which at our present value of E lat ππ is δ 0 (E lat ππ ) disp = 35.9 • , obtained using Eqs. 17.1-17.3 of Ref. [16] with m π = 139.6 MeV. (We refer the reader to Ref. [16] for estimates of the error on the dispersive prediction.) In this paper we present results for the ∆I = 1/2 K → ππ matrix elements obtained from our expanded data set of 741 measurements, using all three ππ interpolating operators.
In this analysis we also include an improved non-perturbative determination of the renormalization factors relating the bare matrix elements in the lattice discretization to those of operators renormalized in the RI-SMOM scheme (see Sec. V). Perturbation theory is then required to match the operators renormalized in the RI-SMOM scheme to those in the MS scheme in which the Wilson coefficients have been computed. This calculation utilizes step-scaling to raise the matching scale from 1.53 GeV to 4.01 GeV, significantly reducing the systematic error associated with the perturbative matching.
Throughout this document results are presented in lattice units unless otherwise stated.
While the current paper is intended to be self-contained it should be viewed as the third in a series of three closely related papers. The first of these is Ref [10] which gives a detailed discussion of the implementation and properties of lattice calculations which impose G-parity boundary conditions. The second paper is Ref. [18] in which the same ensemble of gauge configurations and many of Green's functions used in the current paper are analyzed to study ππ scattering. This second paper contains the two-pion, finite-volume energy eigenvalues from which the I = 0 and I = 2 ππ scattering phase shifts are derived as well as the matrix elements of the ππ interpolating operators between the corresponding energy eigenstates and the vacuum which are used in the current paper.
For the convenience of the reader we summarize the primary results of this work in Tab. I. For further discussion we refer the reader to Sec. VIII. It is important to stress that the results and uncertainties in Tab. I have been obtained by combining a number of elements. The major direct contribution from this work is the evaluation of the matrix elements (ππ) I=0 |Q i |K 0 in isosymmetric QCD, with the operators renormalized in the RI-SMOM( / q, / q) scheme (see Tab.  in Tab. XXVII can be used to improve the precision in the determination of A 0 .
The layout of the remainder of this paper is as follows: In Sec. II we introduce our lattice ensemble and give a general overview of our measurement techniques. In Sec. III we discuss and present results from fits to the single-pion, two-pion and kaon two-point correlation functions, the values of which are required as inputs to the fits of the K → ππ three-point correlation functions from which the matrix elements of the bare lattice operators are determined. In Sec. IV we discuss the measurement of these three-point functions and provide the results from the fits. In Sec. V we discuss our procedure for the non-perturbative renormalization of the operators Q i , the results of which are combined with the matrix elements of the bare lattice operators and other inputs to determine A 0 and ε /ε in Sec. VI. We follow this by a detailed discussion of the systematic errors in Sec. VII and present our final results for the matrix elements, decay amplitudes, and ε /ε, together with a discussion of the ∆I = 1/2 rule, in Sec. VIII. Finally we present our conclusions in Sec. IX. There are two technical appendices in which we present the Wick contractions of some of the correlation functions used in this project.

II. OVERVIEW OF MEASUREMENTS
In this section we provide an overview of the calculation, including information on the ensemble and the measurement techniques. The dislocation suppressing determinant ratio (DSDR) term [19] is a modification of the gauge action that suppresses the dislocations, or tears in the gauge field that enhance chiral symmetry breaking at coarse lattice spacings. This enables the calculation to be performed with larger lattice spacings, and hence larger physical volumes, at fixed computational cost, ensuring good control over finite-volume systematic errors. We use G-parity boundary conditions (GPBC) in three spatial directions in order to obtain nearly physical kinematics for our K → ππ decays.
The lattice parameters are equal to those of the 32ID ensemble documented in Refs. [20,21] except for the boundary conditions and that we now simulate with a lighter, physical pion mass of 142 MeV versus the 170 MeV pion mass of the 32ID ensemble. This allows the use of existing measurements such as the lattice spacing, and also enables the computation of the non-perturbative renormalization factors in a regime free of the complexities associated with GPBC.
The ensemble used for our 2015 calculation comprised 864 molecular dynamics (MD) time units (after thermalization), upon which 216 measurements were performed separated by 4 MD time units. Subsequent to the calculation, it was discovered [22] that an error existed in the generation of the random numbers used to set the conjugate momentum at the start of each trajectory, which gave rise to small correlations between widely separated lattice sites. While the resulting effects were determined to be two-to-three orders of magnitude smaller than our statistical errors, we nevertheless do not include these configurations in the present calculation.
In the period following our previous publication, we have dramatically increased the number of measurements. Configurations were generated on seven independent Markov chains originating from widely seperated configurations of our original ensemble. Subsequent algorithmic improvements, particularly the introduction of the exact one-flavor algorithm (EOFA) [23][24][25]  The bootstrap measurement of the goodness-of-fit is a technique developed specifically for this and our companion work [18], and is detailed in Ref. [27]. To summarize, the goodness-of-fit is typically parameterized by a p-value that represents the likelihood that the data agrees with the model, allowing only for statistical fluctuations. The p-value is computed by first measuring wherex i are the ensemble-means of the data at coordinate i, p the fitting parameters, f the model function, and (cov) ab is the covariance matrix. The value obtained for q 2 is then compared to the null distribution that describes how this quantity varies between independent experiments if only statistical fluctuations are allowed around the model. The null distribution is typically assumed to be the χ 2 distribution, but this is inappropriate when the fluctuations in the covariance matrix between experiments become significant, as is the case for our ππ measurements [27]. In that work we demonstrate that the null distribution can be estimated directly from the data through a simple bootstrap procedure, allowing for a more reliable p-value that is free from assumptions. This procedure also has the benefit of allowing us to neglect the autocorrelations in the determination of the covariance matrix on each bootstrap ensemble, which dramatically improves the statistical error but changes the definition of q 2 in a subtle way that cannot be accounted for by traditional methods.

C. Measurement technique
Measurements are performed using the all-to-all (A2A) propagator technique of Ref. [28], whereby the quark propagator is decomposed into an exact low-mode contribution obtained from a set of, in our case 900, predetermined eigenvectors, and a stochastic approximation to the highmode contribution. This allows for the maximal translation of correlation functions in order to take full advantage of each configuration, as well as easy implemention of arbitrarily smeared source and sink operators. We perform full spin, color, flavor and time dilution such that the stochastic source is required only to produce a delta-function in the spatial location.
For all quantities we use smeared meson sources with an exponential (1s hydrogen wavefunctionlike) structure, where α = 2 is the smearing radius and x and y are the spatial coordinates of the two quark operators. Several technicalities must be considered when using G-parity boundary conditions, including limitations on the allowed quark momenta which has implications for the cubic rotational symmetry, the preservation of which is essential for producing an operator that projects onto the rotationally-symmetric (s-wave) ππ state. These are detailed in Ref. [18].
More specific details of the various measurements are provided in the following sections.

III. RESULTS FROM TWO-POINT CORRELATION FUNCTIONS
In order to compute the K → ππ matrix elements it is necessary to measure the energies and amplitudes of the pion, kaon and ππ two-point Green's functions. In this section we present results for the kaon two-point function and summarize the results of Ref. [18] for the pion and ππ twopoint functions. We also detail the determination of the energy dependence of the phase shift at the kaon mass scale, which is used to obtain the Lellouch-Lüscher [12]   of the opposite quark flavor. In Ref. [10] we introduced a notation whereby the quark field and its G-parity partner are placed in a two-component vector, where C is the charge conjugation matrix. We will refer to the index of these vectors as a "flavor index". In this notation the propagator becomes a 2×2 "flavor matrix", and Pauli matrices inserted appropriately describe the flavor structure. In this notation the Wick contractions assume an almost identical form to those of the periodic case.
The strange quark is introduced into the G-parity framework as a member of an isospin doublet that includes a fictional degenerate partner, s , into which the strange quark transforms at the boundary. The corresponding field operator is With the introduction of this extra quark flavor a square-root of the s/s determinant is required in order to generate a 2+1 flavor ensemble [10].

B. Kaon two-point function
Following Ref. [10], a stationary (G-parity even) kaon-like state can be constructed as where K 0 is the physical kaon and K 0 a degenerate partner with quark contents u. This |K 0 state can be created using the following operator where p = (1, 1, 1) π 2L is the quark momentum and Θ is defined in Eq. (4). Note that in the above equation and for the other operators presented in this document, the projection operators 1 2 (1 ± σ 2 ) appear; these are necessary to define quark field operators that are eigenstates of translation and hence have definite momentum [10].
The two-point function is measured for all t 1 and t 2 , and subsequently averaged over t 2 at fixed t = t 1 − t 2 . The data are folded in t, i.e. data with t = t 1 −t 2 are averaged with those with t = L T − (t 1 −t 2 ), where L T is the lattice temporal extent, to improve statistics. We perform correlated fits to the following function, where the second term accounts for the state propagating backwards in time through the lattice temporal boundary. The chosen fit range, p-value and the results of the fit are given in Tab. II. In physical units our kaon mass is 490.5(2.4) MeV, which is within 2% of the physical neutral kaon mass.

C. Pion two-point function
The isospin triplet of pion states can be constructed from the operators listed in Sec. V.A. of Ref. [10]. Due to the isospin symmetry the resulting two-point functions all have the same Wick contractions, and are most conveniently generated with the neutral pion operator, where p π = p 1 + p 2 is the total pion momentum and P p π is a flavor projection operator of the form 1 2 (1 ± σ 2 ) whose sign depends on the particular choice of the quark momentum, per the discussion in Sec. IV.G. of Ref. [10]. We measure the two-point function with four different momentum orientations related by cubic transformations in order to improve the statistical error: p π ∈ {(1, 1, 1), (−1, 1, 1), (1, −1, 1), (1, 1, −1)} in units of π/L. The corresponding choices of quark momentum are given in Ref. [18]. The two-point function is again averaged over all source timeslices and also over all four momentum orientations, and the data folded to improve statistics. Correlated fits are performed to the function,   Details of the strategy for measuring the I = 0 ππ two-point function can be found in Ref. [18].
In summary, we construct three operators with the quantum numbers of the I = 0 ππ state: The first and second operators, labeled ππ(111) and ππ(311), comprise two single-pion operators carrying equal and opposite momenta separated by ∆ = 4 timeslices in order to reduce the overlap with the vacuum state. The pion momenta in the former reside in the set (±1, ±1, ±1)π/L, and those of the latter in the set (±3, ±1, ±1)π/L and permutations thereof. We average over all nonequivalent directions of the pion momentum in order to project onto the rotationally symmetric state. The final, σ operator corresponds to the scalar two-quark operator 1 √ 2 (ūu +dd). As mentioned previously, the pion and σ bilinear operators are smeared with a hydrogen wavefunction (exponential) smearing function of radius 2 lattice sites in order to improve their overlap with the lowest-energy states.
Two-point correlation functions are constructed from pairs of source and sink operators thus, where we include an explicit vacuum subtraction. Here t 1 specifies the earliest time in which any fermion operator appearing in the annihilation operator O † α is evaluated, and likewise t 2 is the latest time appearing in the creation operator O β , such that t = t 1 −t 2 is the time of propagation of the shortest-lived pion state. We average over many t 2 at fixed t = t 1 − t 2 and the data are folded to improve statistics as follows: where ∆ i = 4 for the ππ(111) and ππ(311) operators and zero for the σ operator. To the matrix of correlation functions we perform simultaneous correlated fits to the functions, We will use the result obtained by uniformly fitting to the temporal range 6 − 15 with all three ππ source/sink operators and allowing for two intermediate states (i max = 1), which represents the "best fit" in Ref. [18]. The results, reproduced from that work are given in the second column of Tab. III for the convenience of the reader. Note that our ππ and kaon energies differ by 2.
where the error is statistical only, and as such our K → ππ calculation is not precisely energyconserving. The effect of this difference is incorporated as a systematic error on our final result, as discussed in Sec. VII B.
It is interesting to compare the statistical errors of our ππ ground-state fit parameters to those of our 2015 analysis, which was performed using a single operator (ππ(111)) and the same t min = 6 as our present analysis. Previously we obtained (74) .
Comparing these to the results of this work in Tab. III we find that the error on the ground-state amplitude has reduced by a factor of 1.9 and the energy by a factor of 6.7. The former is compatible with the expected 741/216 = 1.9 reduction in errors due to the increased statistics, but the latter has improved by a far greater amount. In Ref. [18] we demonstrate that this improvement in the errors is a result of the additional operators, in particular the σ operator, which vastly enhance the resolution on the ground-state energy.
The I = 0 ππ scattering phase shift is obtained via Lüscher's method [11,29] and has the value, where the errors are statistical and systematic, respectively. The procedures by which we estimate our errors are detailed in Ref. [18].
Our decision to fit the ππ two-point function with two states limits the number of states that we can include in our K → ππ matrix element analysis. In order to study the possibility of residual contamination from a third state we repeat the analysis of the ππ two-point function with 3 states, the results of which are given in the third column of Tab. III. For a stable fit to the ππ data we found it necessary to use t min = 4, which is lower than the t min = 6 used for the primary fit and which exposes the result to enhanced excited state contamination. However comparing the results between the second and third columns of Tab. III we find little relative difference in the parameters associated with the ground-state, suggesting any such effects on the K → ππ matrix elements are small.

E. Phase-shift derivative at the kaon mass
As detailed in Sec. VI A, the Lellouch-Lüscher finite volume correction to the K → ππ matrix elements requires the evaluation of the derivative of the phase-shift with respect to the ππ energy evaluated at the kaon mass scale, or more specifically with respect to the variable q = kL/2π π is the square of the interacting pion momentum. This derivative cannot presently be obtained experimentally at this energy scale, and therefore an interpolating ansatz or direct lattice measurement is required.
In Ref. [18], alongside the stationary state examined above, we also compute the ππ energy at several non-zero center-of-mass momenta, allowing us to obtain the phase-shift at two values of the rest-frame energy that are lower than the kaon mass as well as a threshold determination of the scattering length. These results are also close to their corresponding dispersive predictions, albeit with somewhat larger excited-state systematic errors. Using these results we can directly measure the derivative of the phase-shift with respect to the energy using a finite-difference approximation, for which we obtain We can also obtain the derivative from the dispersive prediction of Colangelo et al [16]. The derivative with respect to s = E 2 ππ , computed at our lattice ππ energy using Eqs. 17.1-17.3 of Ref. [16] with m π = 135 MeV, is found to be where the error is the statistical error arising from the uncertainty in the lattice spacing and measured lattice ππ energy. Note that this result is obtained at the physical pion mass, which is 5% smaller than our lattice value. In order to estimate the impact of the difference in pion masses on this derivative we use NLO chiral perturbation theory [16,30] (ChPT) to estimate the derivative with respect to energy at k = 0.1 GeV, at which ChPT is expected to be reliable. Assuming that the slope with respect to √ s is roughly constant (which is well motivated by the dispersion theory result, cf. Fig. 7 of Ref. [16]) we estimate the change in dδ 0 ds evaluated at our lattice ππ energy as 1.2%. This value is small relative to the final systematic error we assign to the derivative in Sec. VII D and can therefore be neglected here. Finally, applying ds/dq = 4.18(5) × 10 5 MeV 2 , where again the errors are statistical, we obtain The near-linearity of the dispersive prediction suggests that a linear ansatz, may also be appropriate. With this ansatz we find (24) dδ 0 dq = 1.259 (36) rad .
Given that the derivative of the phase shift is a subleading contribution and that the above values are all in reasonable agreement, we expect that the Lellouch-Lüscher factor can be obtained reliably. The variation in these results will be taken into account in our systematic error in Sec. VII D.
In our 2015 work [1] we also considered a linear ansatz in q, This value is not as well motivated as the ansatz in Eq. (24) and is in disagreement with all four of the above results. Given the good agreement between our measured phase-shifts and the above estimates of the derivative with the dispersive predictions, we will not include this result in our systematic error estimate.

F. Optimal ππ ππ ππ operator
For use later in this document we define here an optimal operator that maximally projects onto the ππ ground state relative to the first-excited state.
Under the excellent assumption that the backwards-propagating component of the time dependence is small in the fit window, the two-point functions can be described as a sum of exponentials: where again Greek indices denote operators and Roman indices states. We wish to define an optimized operator that projects onto the ground state: for which where the approximate equality indicates that additional exponential terms resulting from excitedstate contamination, although suppressed, still exist for an optimal operator composed of a finite number of ππ operators. Expanding the Green's function, Without loss of generality we can fix A 0 opt = 1, which alongside Eq. (29) is sufficient to define r i : If the number of states is equal to the number of operators this can be interpreted as a matrix equation, where the row index of A is the state index i and the column index the operator index α. Here0 is a unit vector in the 0-direction, and as such which gives (34) i.e. r is the first column of the inverse matrix.
As our ππ fits include only two states, we drop the noisier ππ(311) operator in order to form a square matrix of correlation functions. We then obtain where the elements are the coefficients of the ππ(111) and σ operators, respectively. In Fig. 1 we compare the effective energy obtained with the optimal operator to that of the ππ(111) and σ operators alone. We observe a marked reduction in the ground-state energy and a noticeable improvement in the length of the plateau region resulting from the removal of excited-state contamination, as well as a significant improvement in the statistical error. This optimal operator will also be used in our matrix element fits in the following section.

DECAYS
In this section we detail the measurement and fitting of the K → ππ three-point Green's functions, from which the unrenormalized matrix elements (ππ) I=0 |Q i |K 0 are obtained.

A. Overview of measurements
On the lattice we measure the following three-point functions, where t denotes the time separation between the kaon and four-quark operators, and t K→snk sep the time separation between the kaon and the ππ "sink" operator, O snk . As described in Ref. [31], the Wick contractions of these functions fall into four categories based on their topology, as illustrated in Fig. 2.
Note that here and below we take care to differentiate between the G-parity kaon stateK 0 , which is a G-parity even eigenstate of the finite-volume Hamiltonian, and the physical kaon K 0 that is not an eigenstate of the system. The matrix elements of the physical kaon are related to those of the G-parity kaon by a constant multiplicative factor of √ 2 that serves as the analogue of the Lellouch-Lüscher finite-volume correction as described in Sec. VI.B. of Ref. [10].
In order to maximize statistics we translate the three-point function over multiple kaon timeslices and average the resulting measurements. As the statistical error is dominated by the type3 This divergence is removed by defining the subtracted operators [31,32], We will henceforth denote the unsubtracted operator with a hat notation,Q i . The coefficients α i in Eq. (37) are defined by imposing the condition, where we have allowed α i to vary with time as this was found to offer a minor statistical improvement. Although the matrix element of this pseudoscalar operator vanishes by the equations of motion for energy-conserving kinematics and is therefore not absolutely necessary for our calculation, the subtraction reduces the systematic error resulting from the small difference between our ππ and kaon energies while simultaneously reducing the statistical error and suppressing excitedstate contamination.
Due to having vacuum quantum numbers, the I = 0 ππ operators project also onto the vacuum state and this off-shell matrix element dominates the signal unless an explicit vacuum subtraction is performed, However, due to our definition of the subtraction coefficient α i in Eq. (38), the vacuum matrix elements appearing in the right-hand side vanish making this subtraction unnecessary. In practice this cancellation is not exact in our numerical analysis for the following reason: While the ππ "bubble" 0|O † snk |0 is formally time-translationally invariant we observed a minor statistical advantage in evaluating this quantity with the ππ operator on the same timeslice as it appears in the full disconnected Green's function that is being subtracted, such that it is maximally correlated.
Therefore, for the right-most term in Eq. (39) we compute where t K is the kaon timeslice and {t K } the set of timeslices upon which measurements were performed, i.e. with the product of the K →vacuum matrix element and the ππ bubble performed under the average over the kaon source timeslice rather than after. As suggested by the above, the coefficients α i (t) are computed separately from the t K -averaged matrix elements and therefore the cancellation between the two terms in brackets is exact only up to the degree to which the time translation symmetry is realized at finite statistics. Due to our large statistics we found the difference in the fitted Q 6 matrix element obtained with and without the vacuum subtraction to be at the 0.1% level. Our final choices of cut incorporate data from this set in the range 6 ≤ t ≤ 11 (cf. Sec. IV E 4). In this window we observe that for both the C 2 and C 6 correlation functions, the contribution of the noisy, type4 disconnected diagrams is largely consistent with zero, albeit with much larger errors for the former. C 2 appears dominated by the type1 and type3 diagrams, which both contribute with the same sign, with a negligible contribution from the type2 diagrams. The contribution of the type1 and type3 diagrams appears to behave similarly for the C 6 three-point function, however here we observe a strong cancellation between those and the type2 diagrams.
The subtraction coefficients α i are computed via Eq. (38) as the following ratio of two-point functions, where the average of the correlation functions over the kaon source timeslice is implicit as above.
The Wick contractions for the 0|Q i (t)OK0(0)|0 two-point functions are identical to the components of the type4 K → ππ diagrams that are connected to the kaon. While these connected components are formally independent of the sink two-pion operator, in practice these quantities were computed using code that was organized differently for the ππ and σ operators. As described in Appendix B of this paper and Appendix B.2 of Ref. [33], the factors entering the type4 diagrams that determine the α i were constructed from two separate bases of functions of the quark propagators, one for the σ and the other for the ππ(. . .) operators, where for each basis γ 5 hermiticity was used in a different way. While γ 5 hermiticity is an exact relation, the fact that we are using a stochastic approximation for the high modes of the all-to-all propagator allows small differences to arise between the values of the α i computed in these two bases. We therefore have separate results for the α i from the ππ and σ three-point functions calculations.
In Fig. 4 we plot the time dependence of the α i for all ten operators. We observe excellent agreement between the results obtained from the two different bases of contractions as expected.
For a number of operators we find statistically significant but relatively small excited-state contamination for small t that in all cases appears to die away by t = 6. While the effects of this contamination are unlikely to significantly affect our final results, the cuts that we later apply to our fits nevertheless exclude data with t < 6.
The K → ππ matrix elements of the pseudoscalar operatorsγ 5 d are required to perform the subtraction of the divergent loop contribution. In this section we independently analyze these matrix elements in order to understand their time dependence and the corresponding effect of the subtraction on the amount of excited state contamination in the final K → ππ result.
In the limit of large time separation between the source/sink operators and the four-quark operator, only the lowest-energy ππ and kaon states are present. Since the pseudoscalar matrix elements vanish by the equations of motion when the decay conserves energy and the kaon and ππ groundstate energies in our calculation differ by only 2%, we expect the subtraction to result in only a negligible shift in the central value but a marked improvement in the statistical errors in this limit.
However at finite time separations, the contributions of the excited states may take a long time to die away due to the increasing magnitude of the corresponding matrix elements between initial and final states of different energies. It is this concern that prompts us to study this system more carefully.
The lattice three-point function for a generic sink ππ operator, O snk , has the following time dependence: where the subscript 'in' refers to the incoming kaonic state, 'out' to the outgoing two-pion state, and M i j P is the matrix element for the term involving in and out states i and j, respectively. It is convenient to define an "effective matrix element" by dividing out the ground-state time dependence and operator amplitudes, is the separation between the four-quark operator and the sink and Note that M eff,snk P is dependent on the sink operator through the terms involving the excited states, in which a ratio of ground and excited state amplitudes appears.
We measure the correlation function Eq. (42) for each of our three two-pion operators. Note that a vacuum subtraction is also required here and is performed in the same way as for the four-quark . The corresponding data for the ππ(311) operator is much noisier and has therefore been excluded. The form of this plot can be explained as follows: As E 0 in ≈ E 0 out we expect M 00 P to be small. If we then assume that the dominant excited state contributions come from the term involving the excited kaon state and ground ππ state (i = 1, j = 0) and the term with the ground kaon state and the first excited ππ state (i = 0, j = 1), then we expect the data to behave as This ansatz then implies an exponentially falling contribution from the excited pion state and an exponentially growing piece from the excited kaon state, giving rise to a bowl-like shape assuming that A 1 in and A 1 out have the same sign, which appears to the the case here. Furthermore, the exponentially-growing piece in t is expected to be larger for smaller t K→snk sep , and indeed we observe that the turnover point at which the exponentially-growing term begins to dominate occurs sooner for smaller t K→snk sep .
While the effective matrix elements of both sink operators initially trend towards zero, for the more precise ππ(111) data it seems that none of the five data sets are statistically consistent with zero at their maxima, suggesting we do not reach the limit of ground-state dominance. This is not necessarily an issue for our calculation given that the subtraction will heavily suppress these contributions in our final result, and furthermore the inclusion of multiple sink operators will improve our ability to extract the ππ ground-state matrix element. In order to disentangle these two effects it is convenient to examine the three-point function for the optimized sink operator discussed in Sec. III F. The time dependence of M eff,snk P for this operator is also shown in Fig. 5.
By definition this operator heavily suppresses A j out for j > 0, and indeed we find the data to be much flatter in the low-t region and also considerably closer to zero. The exponential growth and t K→snk sep dependence that enters due to the excited kaon term is expected to be largely unaffected by this transformation, however it seems that in several cases the plateaus extend much further into the large-t region than previously. It is likely that is due to an accidental cancellation owing to the fact that A 1 out is positive for the ππ(111) operator and negative for the σ operator (cf. Tab. III) and hence the exponentially-growing terms for these operators have opposite signs.
We conclude by discussing the expected size of the excited-state contamination in the matrix elements of the subtracted four-quark operators arising from the pseudoscalar operator. In the K → ππ calculation, this dimension-3 operator is introduced to remove what in the continuum limit would be a quadratic divergence resulting from the self-contraction between two of the four quark operators appearing in those operatorsQ i with a component transforming in the (8, 1) or In our lattice calculation these terms behave as 1/a 2 when expressed in physical units. To leading order in a this 1/a 2 coefficient does not depend on the external states and is therefore removed from our 0|(ππ)Q iK 0 |0 amplitude by the subtraction defined above, even though the coefficients α i are determined from the 0|Q iK 0 |0 matrix element in Eq. (41). Because of the chiral structure of the (8, 1) and (8,8) operators, these coefficients have the structure: α i ∼ m s −m d a 2 + . . . [8], where the ellipsis represents terms which are not powerdivergent.
Thus, thesγ 5 d subtraction removes the leading 1/a 2 term in the matrix element ofQ i , leaving behind a finite piece of size ∼(m s − m d )Λ 2 QCDs γ 5 d. This remainder is not physical and depends on the condition chosen to define the α i . However, it will contribute to our final result if E ππ = m K . For the ground-state component (i = 0, j = 0) this term is thus heavily suppressed by the factor (E 0 ππ − m K ). However for the excited states we expect this piece to be on the order of the physical contribution from the dimension-6 four-quark operator. As such it may result in a modest enhancement of the excited state matrix elements. Providing we are able to demonstrate that we have the excited ππ and kaon states under control through appropriate cuts on our fitting ranges, this should pose no obstacle to our calculation.

D. Description of fitting strategy
For a lattice of sufficiently large time extent that around-the-world terms in which states propagate through the lattice temporal boundary can be neglected, and assuming that the four-quark operator is sufficiently separated from the kaon source that the kaon ground state is dominant, the three-point Green's functions C i of the weak effective operators defined in Eq. (36) have the general form, where M j i = (ππ) j |Q i |K 0 is the matrix element of the four-quark operator Q i with the ππ state j, with M 0 i corresponding to the physical K → ππ matrix elements required to compute A 0 . The factor of 1/ √ 2 relates the matrix element involving the kaon G-parity eigenstate to that of the physical kaon [10]. Here A K is the amplitude of the G-parity kaon operator, A j snk are the amplitudes of the sink operator with the state j, and E j is the energy of that state. These parameters are fixed to those obtained from the two-point function fits in Sec. III: A K and m K to the results given in Tab. II, and A j snk and E j to the results obtained from the three-operator, two-state ππ fits given in the second column of Tab. III.
We perform simultaneous correlated fits over multiple sink operators to the form Eq. (48) in order to determine the matrix elements M j i , allowing for one or more states j. Independent one-state fits are also performed to the optimized sink operator defined in Sec. III F. The fits are performed to each weak effective operator separately, in the 10-operator basis (the relationship between these 10 linearly-dependent operators serves as a useful cross-check of the fit results) using the strategy outlined in Sec. II B. We apply a cut t min on the separation t between the kaon and the fourquark operator in order to isolate the ground-state kaon, and also a cut t min on the separation t = t K→snk sep − t between the four-quark and sink operators. These cuts, the number of sink operators, and the number of excited ππ states included in the fit are varied in order to study systematic effects.
For use below we again define an "effective matrix element" in which the ground-state ππ and kaon amplitudes and time dependence are multiplied out, These effective matrix elements converge exponentially to the ground-state matrix element at large t . Note that, unlike in Sec. IV C, we are assuming that a cut, t min , on the separation betwen the kaon and four-quark operators has been applied that is sufficient to isolate the contribution of the kaon ground state. As a result, these effective matrix elements can be assumed to be independent of t K→snk sep and a weighted average of our five datasets of different t K→snk sep can be applied to improve the statistical resolution of the data presented in our plots.

E. Fit results
In this section we examine the results of fitting various subsets of our data, with the goal of finding an optimal fit window in which systematic errors arising from both excited ππ and kaon states are minimized.
In Figs. 6 and 7 we plot the fitted ground-state matrix elements M 0 i as a function of t min for various choices of t min , the number of sink operators and the number of states. The three-operator fits are performed using the ππ(311), ππ(111) and σ sink operators; for the two-operator fits we drop the noisier ππ(311) data; and for the one-operator fits we further drop the σ data. The one-operator, one-state fits are equivalent to those performed in our 2015 work, albeit with more statistics and more reliable ππ energies and amplitudes.
The discussion below will be focused on these figures. We will first discuss general features addressing the quality of the data and the reliability of the fits, and will then concentrate on searching for evidence of systematic effects (or lack thereof) arising from kaon and ππ excited states.
Based on those conclusions we will then present our final fit results.

Discussion of data and fit reliability
We will first comment on the fits to the optimal operator, labeled "opt." in the figures. This approach is outwardly advantageous in that the fits are performed to a single state and the covari- appearance in the legend. The legends are given in the format #ops × #states followed by the cut t min on the time separation between the kaon and the four-quark operators. Here "opt." is the fit to the optimal operator and "sys." is used to estimate the systematic error resulting from a third state. effective matrix elements of this optimal operator to that of the ππ(111) and σ operators alone, where we note a marked improvement in the quality of the plateau. This behavior, which is also accounted for implicitly in the multi-state fits, demonstrates the power of the multi-operator technique for isolating the ground state. In Figs. 6 and 7 we observe that the fit results for the optimal operator agree very well with the multi-state fit results in all cases. While this approach does not appear to offer any statistical advantage, the strong agreement suggests that our complex multi-state correlated fits are under good control.
In Figs. 6 and 7 we observe for several ground-state matrix elements a trend in the fit results up to an extremum at t min = 7, followed by a statistically significant correction at the level of 1-2σ for the fits with t min = 8. In this and Sec. VII A we present substantial evidence that the systematic errors resulting from excited kaon and ππ states are minimal, which makes it unlikely that this rise is associated with excited state contamination. Certainly if it were due to excited ππ states we would expect an improvement as more sink operators are added, but there is little evidence of  The p-values assessing how well the data with t ≥ 7 is described by the model for the C i correlation functions obtained by fitting to 3 operators and 2 states with t min = 5 and t min = 6.
such, and likewise if excited kaon states were the cause we would expect an improvement as we increase the t min cut, whereas no significant change is observed. The most likely explanation is a statistical fluctuation in our correlated data set, and indeed in Fig. 8 we see evidence of such a fluctation peaking at t = 7 which is likely driving this phenomenon.
Given the above, an interesting question we can ask is whether the models we obtain from our fits with t min = 5, which in all cases lie within the plateau region before this rise, are a good description of the subset of data with t ≥ 7, or in other words how likely it is that these data are consistent with this model allowing only for statistical fluctuations. In Tab. IV we list the p-values for these data using the model obtained by fitting to 3 sink operators and 2 states with t min = 5 and t min = 6, computed using the technique discussed in Sec. II B (with no free parameters). We observe excellent p-values in all cases bar M 0 3 , and to a lesser extent M 0 4 . The lower p-values for these operators are common for all of the multi-operator fits and are likely associated with the statistical fluctuations described above which are more apparent for these matrix elements (cf. Fig. 6). We expect that such unusual statistical fluctuations will be found when so many different operators and fitting ranges are examined. Of most importance in a calculation of Im(A 0 ) is M 0 6 , for which we find that the model obtained with t min = 5 is an excellent description of the data with t ≥ 7. The p-value is in fact little different from the value p = 0.525 obtained by fitting to these data directly, suggesting that the models are equally good descriptions despite the tension in the ground-state matrix elements.
For M 0 7 and M 0 8 (and to a lesser extent, M 0 10 ) we observe a discrepancy between the one-operator and multi-operator results at the 1-2σ level that persists even to large t min . Given the very clear plateaus in the multi-state fit results, this disagreement is likely due again to statistical effects in these correlated data. This is evidenced for example in Fig. 9 in which we overlay the M eff,snk 8 effective matrix element for the ππ(111) and σ sink operators by the multi-operator fit curve.
We observe that the fit curve for the ππ(111) operator is completely compatible with the data but favors a value that is consistently within the upper half of the error bar, suggesting that the apparent flatness of the ππ(111) effective matrix element represents a false plateau, and the fact that the multi-operator method is capable of resolving the behavior is a testament to its power.

Excited kaon state effects
We now address excited kaon state effects. Because the data rapidly becomes noisier as we move the four-quark operator closer to the kaon operator and thus further away from the ππ operator, such effects are not expected to be significant. The simplest test is to vary the cut on the time separation between the kaon operator and the four-quark operator, t min . The first three points from the left of each cluster in Figs. 6 and 7 show the result of varying t min between 6 and 8 at fixed t min . As expected we observe no statistically significant dependence on this cut.
We can also test for excited kaon effects by examining the data near the kaon operator in more detail, alongside looking for trends in the five different K → sink separations at fixed t . The optimal operator proves convenient for examining this behavior as it neatly combines the two dominant sink operators and should be flat within the fit window. In Fig. 10 we plot the data for the M eff,snk 4 and M eff,snk 6 effective matrix elements with a distinction drawn between data included and excluded by a cut on the kaon to four-quark operator time separation of t min = 6. We find no apparent evidence of excited kaon state contamination even for data excluded by the cut, nor do we observe any trends of the data in the K → sink separation.
We therefore conclude that excited kaon effects in our results are negligible.

Excited ππ state effects
The dominant fit systematic error is expected to be due to excited ππ states. Fortunately, given that we can change both the number of operators and the number of states alongside varying the fit window within a region where our data is most precise, there are a number of tests we can perform to probe this source of error.
We begin by comparing the multi-operator fits to the one-operator (ππ(111)) fit, the latter being equivalent to the procedure used for our 2015 work. In the majority of cases we see little evidence of excited state contamination in the one-operator data, as evidenced by its agreement with the multi-operator fits as well as the strong consistency between the fits as we vary the fit window. However for the M 0 5 and M 0 6 matrix elements we observe strong evidence of excited-state contamination in these fits at smaller t min . Fig. 6 clearly demonstrates how these effects are suppressed as we add more operators: Initially the one-operator results converge with the 3-operator results at t min = 5 and 6, respectively, at which point the excited states appear to be sufficiently suppressed. Introducing a second operator and state we eliminate part of this contamination and the convergence appears earlier, at t min = 4 and 5, respectively. Finally, in adding the third operator we find results that are essentially flat from t min = 3. This suggests that the 5% excited-state systematic error on our 2015 result which used t min = 4 was significantly underestimated for these matrix elements.
In general we observe excellent agreement between two and three-operator fits with two-states.
Unfortunately, as mentioned above, the ππ(311) data are considerably noisier than those of the other operators, and the associated ππ energy and amplitudes are less-well known, and as such these data contribute relatively little to the fit. Nevertheless we do observe that for the M 0 5 and M 0 6 matrix elements, the introduction of the third operator results in values that for low t min (3 or 4) are in considerably better agreement with the results for larger t min , suggesting that in the regime in which these data are less noisy (i.e. closer to the ππ operator) the third operator is acting to remove some residual excited-state contamination. We conclude that it is beneficial to include the third operator. In order to study the possibility of residual contamination from a third state we perform threeoperator, three-state fits to the matrix elements using the ππ two-point function fit parameters given in the third column of Tab. III and the same fit ranges for t and t used in the three-operator, two-state fits. The results for the ground-state matrix elements are also included in Figs. 6 and 7 with the label "sys.". We find that including this third state has very little impact and the results agree very well with the three-operator, two-state fits. This again suggests that we have the ππ excited-state systematic error under control.
A further test for excited-state contamination is to study the agreement of the fit curves with the data outside of the fit region. To this end in Fig. 11 we plot the ππ(111) and σ operator data for the M eff,snk 6 effective matrix element overlaid by the fit curves for the 3-operator, 2-state fits, and for the 3-operator, 3-state fits described above, using t min = 4 and 5. The fitted groundstate matrix elements in these cases are all in complete agreement to within a fraction of their statistical errors. We observe that the 3-operator, 2-state fit curve with t min = 5 describes well the ππ(111) data at t = 4 but shows a tension for the σ data at this timeslice. Fitting with t min = 4 does not resolve this tension, suggesting the effects of a third state are visible in the σ operator data at t = 4. This is consistent with the pattern of couplings of the operators to the states in Tab. III which show a significant reduction in the couplings to higher states for the ππ (111) operator but almost equal-sized couplings of the σ operator to all three states. The 3-operator, For our final result we choose to focus upon the three-operator, two-state fits. While the majority of the corresponding curves in Figs. 6 and 7 are essentially flat from t min = 3, we opt for a conservative and uniform cut of t min = 5 at which we can strongly claim an absence of significant excited-state effects. In the Sec. VII A we will consider means by which we can assign a systematic error to this result.

Final fit results
As discussed above we choose the 3-operator, 2-state fit with t min = 5 for our final result. As we observe no significant dependence on the cut on the separation between the kaon and four-quark operators we will choose t min = 6. In Tab. V we present the full set of p-values and parameters for these fits. We obtain acceptable p-values in the majority of cases, with the notable exception of the Q 3 four-quark operator for which p = 4%. We find that this p-value is not improved by increasing t min , and also that the p-value of the one-operator, one-state fit with the same fit range -with which our chosen value is in excellent agreement -has a p-value of 15%. The low probability is therefore unlikely to be associated with any systematic effect and can be attributed to lowprobability statistical effects. Comparing these values to those in Tab. V we find that the errors have reduced by factors of 2.8 and 2.4 for M 0 2 and M 0 6 , respectively. Comparing the 3-operator, 2-state fits to the 1-operator, 1-state fits in Fig. 6 we observe that the larger improvement for M 0 2 can be explained by the additional operators, however for M 0 6 these two approaches have similar errors. The fact that the error on M 0 6 has improved considerably more than the factor of 1.9 expected by the increase in statistics can therefore be attributed to the improved precision of the ππ two-point function fits observed in Sec. III D.

V. NON-PERTURBATIVE RENORMALIZATION OF LATTICE MATRIX ELEMENTS
The Wilson coefficients are conventionally computed in the MS (NDR) renormalization scheme, and therefore we are required to renormalize our lattice matrix elements also in this scheme. This is achieved by performing an intermediate conversion to a non-perturbative regularization invariant momentum scheme with symmetric kinematics (RI-SMOM). As the name suggests, these schemes can be treated both non-perturbatively on the lattice (provided the renormalization scale is sufficiently small compared to the Nyquist frequency π/a) and in continuum perturbation theory (providing the renormalization scale is sufficiently high that perturbation theory is approximately valid at the order to which we are working). Thus, we can use continuum perturbation theory to match our RI-SMOM matrix elements to MS, avoiding the need for lattice perturbation theory. The matching factors have been computed to one-loop in Ref. [34].
In our 2015 calculation we computed the renormalization matrix at a somewhat low renormalization scale of µ = 1.529 GeV in order to avoid large cutoff effects on our coarse, a −1 = 1.38 GeV ensemble. Due to this low scale, the systematic error associated with the perturbative RI to MS matching was our dominant error, with an estimated size of 15%. In this paper we utilize the stepscaling procedure [35,36] (summarized below) in order to circumvent the limit imposed by the lattice cutoff and increase the renormalization scale to 4.0 GeV at which the error arising from the use of one-loop perturbation theory is expected to be significantly smaller. A separate step-scaling calculation to 2.29 GeV was performed in Ref. [37] and we will utilize those results to study the scale dependence of the perturbative and discretization errors in our operator normalization.

A. Summary of approach
Due to operator mixing, the renormalization factors take the form of a matrix. This is most conveniently expressed in the seven-operator chiral basis in which the operators are linearly independent and transform in specific representations of the SU(3) L ⊗ SU(3) R chiral symmetry group, an accurate symmetry of our DWF formulation even at short distances. In this basis the renormal-ization matrix is block diagonal, with a 1×1 matrix associated with the Q 1 operator that transforms in the (27, 1) representation, a 4 × 4 matrix for the (8, 1) operators Q 2 , Q 3 , Q 5 and Q 6 , and a 2 × 2 matrix for the (8,8) operators Q 7 and Q 8 .
In the RI-SMOM scheme the renormalized operators are generally defined thus, where Einstein's summation conventions are implied and the label "RI" is used as short-hand for the RI-SMOM scheme. The renormalization factors are defined via where the index m is not summed over. Here α − δ are combined spin and color indices, Z q is the quark field renormalization, q is a four-momentum that defines the renormalization scale and P m are "projection matrices" described below. The quantities F im on the right-hand side are found by evaluating the left-hand side of the equation at tree level. Γ RI im are computed as where the sum is performed over the full four-dimensional lattice volume and q = p 1 − p 2 . Here E m are a set of seven four-quark operators that each create the four quark lines that connect to the weak effective operator, where the momentum arguments indicate the incoming momenta and the quark momenta satisfy symmetric kinematics: p 2 1 = p 2 2 = (p 1 − p 2 ) 2 = q 2 ≡ µ 2 . The subscript "amp." in Eq. (53) implies that the external propagators are amputated by applying the ensemble-averaged inverse propagator, such that the resulting Green's function has a rank-4 tensor structure in the spin-color indices.
These Green's functions are not gauge-invariant, hence the procedure must be performed using gauge-fixed configurations, for which we employ Landau gauge-fixing. The use of momentumspace Green's functions introduces contact terms that prevent the use of the equations of motion so that additional operators, beyond those needed to determine on-shell matrix elements, must be introduced if all possible operator mixings are to be included, as is required if the RI-SMOM scheme is to have a continuum limit. These are discussed below.
Note that the Wick contractions of Eq. (53) result in disconnected penguin-like diagrams that interact only by gluon exchange; these diagrams are evaluated using stochastic all-to-all propagators and are typically noisy, requiring multiple random hits and hundreds of configurations. The presence of disconnected diagrams also precludes the use of partially-twisted boundary conditions and therefore limits our choices of the renormalization momentum scale to the allowed lattice momenta.
The quark field renormalization Z q is also computed in the RI-SMOM scheme via the amputated vertex function of the local vector current operator,qγ µ q, from which we compute Z V /Z q where Z V is the corresponding renormalization factor for the local vector current. The factor Z V is not unity as the local vector current is not conserved, however it can be computed independently from the ratio of hadronic matrix elements containing the local and conserved (five-dimensional) vector current allowing Z q to be obtained from the above. Alternatively, Z q can also be computed from the local axial-vector current operatorqγ µ γ 5 q. Again the ratio Z A /Z q is determined from a three-point function evaluated in momentum space and, providing non-exceptional kinematics are used, is equivalent up to negligible systematic effects at large momentum [38]. The quantity Z A is then determined by comparing the pion-to-vacuum matrix elements of the local and approximately conserved (five-dimensional) axial current.
The independent projection matrices P m contract the external spin and color indices, and are chosen with a tensor structure that reflects that of the operator with the same index. For the weak effective operators, we can choose both parity-even and parity-odd projectors, which project onto the parity-even and parity-odd components of the amputated Green's function, respectively, and which should both provide the same result due to chiral symmetry. In practice however we have found that the parity-odd choices are better protected against residual chiral symmetry breaking effects that induce non-zero mixings between the different SU(3) L ⊗ SU(3) R representations (cf. Sec. 4.5 of Ref. [39]), and so we will use the parity-odd projectors exclusively. We consider two different projection schemes: the "γ µ scheme", for which the parity-odd projectors have the spin structure, (55) P γ µ m = ±γ µ ⊗ (γ 5 γ µ ) − (γ 5 γ µ ) ⊗ γ µ , and the " / q scheme" with spin structure For the full set of parity-odd and parity-even projectors we refer the reader to Sec. 3.3.2 of Ref. [33].
Similar choices of γ µ and / q projector exist also for the quark field renormalization. We will follow the convention of describing our RI-SMOM schemes with a label of the form SMOM (A, B) where the quantities A and B in parentheses describe the choices of projector for the four-quark operator and Z q , respectively. In this work we consider only the SMOM(γ µ , γ µ ) and SMOM( / q, / q) schemes as previous studies of the renormalization of the neutral kaon mixing parameter B K indicate that the non-perturbative running is better described by perturbation theory for these two choices than for the two mixed schemes [40]. We will compare our final results obtained using both intermediate schemes in order to estimate the systematic perturbative and discretization errors in computing the RI to MS matching.

B. Operator mixing
The seven weak effective operators mix with several dimension-3 and dimension-4 bilinear operators. For the parity-odd components these are S 1 =sγ 5 d, where the arrow indicates the direction of the discrete covariant derivative. These are accounted for by performing the renormalization with subtracted operators, The subtraction coefficients b j are obtained by applying the following conditions, = 0 with symmetric kinematics at the scale q 2 . The projection operators can be found in Sec. 7.2.6 of Ref. [37]. In practice we find that the subtraction coefficients are small due to the suppression of the mixing by a factor of the quark mass as a result of chiral symmetry, and also the observation that the amputated vertex function Eq. (53) with a four-quark external state and a two-quark operator necessarily involves only disconnected diagrams that are small at large momentum scales due to the running of the QCD coupling.
Mixing also occurs with the dimension-5 chromomagnetic penguin operator and a similar electric dipole operator, conventionally labeled Q 11 and Q 12 , respectively [41]. These operators do not vanish by the equations of motion and therefore contribute also to the on-shell matrix elements, but break chiral symmetry and as such are expected to be heavily suppressed [41,42]. It is therefore conventional to neglect their effects in, for example, the determination of the Wilson coefficients [7]. In our DWF calculation the dimension-1 mixing coefficients of these dimension-5 operators will be of order the input quark masses used in our RI-SMOM calculations or the DWF residual mass -effects, when combined with the required gluon exchange, should be at or below the percent level. Thus, in this work we neglect these operators.
In addition to the lower-dimension operators there is also mixing with both gauge-invariant and gauge-noninvariant dimension-6 two-quark operators. These operators enter at next-to-leading order and above, and are therefore naturally small provided we perform our renormalization at large energy scales.
The gauge-noninvariant dimension-6 operators vanish due to gauge symmetry and in many cases also by the equations of motion, and therefore do not contribute to on-shell matrix elements [43]. These operators enter the renormalization only at the two-loop level [8] and above, and given that the RI→ MS matching factors are at present only available to one loop, the systematic effect of disregarding these operators is likely to be much smaller than our dominant systematic errors. Nevertheless we are presently investigating position-space renormalization [44] which does not require gauge fixing and therefore does not suffer from such mixing, and as such we may be able to remove this systematic error in future work.
Of the gauge-invariant dimension-6 operators, is the only operator that mixes at one loop [45], with all others entering at two-loops and above. In Ref. [37] we have investigated the impact of including the G 1 operator in our RI-SMOM renormalization and have computed the subsequent effect on the K → ππ amplitudes. This can be achieved without the need for measuring matrix elements of G 1 between kaon and ππ states by taking advantage of the equations of motion to rewrite those matrix elements for on-shell kinematics in terms of the matrix elements of the conventional four-quark operators, such that the entire effect of this operator is captured by changes in the values of the (8, 1) elements of the renormalization matrix. Note that at present the results including the G 1 operator have been computed only at the 2.29 GeV renormalization scale and not the 4.0 GeV scale used for our final result. However, as demonstrated in Ref. [37] and also in Sec. VII F, the effects of including G 1 are at the few percent level as expected, implying that the resulting systematic error is small compared to our other errors.

C. Step-scaling
Step-scaling [36] allows for the circumvention of the upper limit on the renormalization scale imposed by the lattice spacing through independently computing the non-perturbative running of the renormalization matrix to a higher scale using a finer lattice. The multiplicative factor relating the RI-SMOM operators renormalized at two different scales can be obtained from the ratio where µ 1 is a renormalization scale that lies below the cutoff on the original coarser lattice while µ 2 is a higher scale, likely inaccessible on the coarser lattice. The quantity Λ RI (µ 2 , µ 1 ) is computed on finer lattices for which µ 2 also lies below the cutoff and can be applied thus, in order to raise the renormalization scale to µ 2 , giving the renormalization matrix Z RI←lat (µ 2 ) which non-perturbatively converts our course-lattice operators into an RI scheme defined at a scale µ 2 potentially much larger that the inverse of our coarse lattice spacing. We will take advantage of this technique to avoid having to match perturbatively to MS directly at the lower energy scales allowed by our coarse, a −1 = 1.38 GeV lattice.

D. Details and results of lattice calculation
We use the step-scaling procedure to obtain the renormalization matrix at a scale of µ 2 = 4.006 GeV by matching between our β = 1.75, a −1 = 1.378(7) GeV (32ID) ensemble and a second, finer ensemble with β = 2.37 and a −1 = 3.148(17) whose properties are described in Ref. [21] under the label "32Ifine". These ensembles have periodic spatial boundary conditions rather than G-parity boundary conditions, but as previously mentioned, boundary effects can be neglected for these high-energy Green's functions. Such quantities are also constructed to be insensitive to the quark mass scale, and therefore we can disregard the unphysically heavy 170 MeV and 370 MeV pion masses on the 32ID and 32Ifine ensembles, respectively. Note also that, although we do not take the continuum limit of the step-scaling matrix computed on the 32Ifine ensemble, the fine lattice spacing and the typically small size of discretization effects on such quantities [46] suggest the induced error is also negligible compared to our other errors. We remind the reader that these calculations do not include the G 1 operator, and its absence in our calculation is treated as a source of systematic error in Sec. VII.
Due to the presence of disconnected diagrams in our calculation, the choices of quark momenta are restricted to the discrete values allowed by the finite-volume. The closest match between allowed momenta on the 32ID and 32Ifine ensembles that can be chosen as an intermediate scale is µ 32ID 1 = 1.531 GeV and µ 32Ifine 1 = 1.514 GeV, respectively. The fact that these scales differ by 1.1% introduces a systematic error that, given the slow evolution of the QCD β -function, can be treated as negligible.
We obtain the quark field renormalization for the 32Ifine ensemble via the vector current operator as described in Sec. V A. For the 32ID ensemble we use the axial-vector operator as the corresponding renormalization factor, Z A has been measured to much higher precision than Z V (0.05% versus 1.2%, respectively) [47]. The measurements of Z A and Z V are treated as statistically independent from those of the amputated vertex functions and are incorporated into the calculation using the superjackknife technique.
On the 32ID ensemble we extend the calculation at µ we obtain the renormalization matrices Z RI←lat The results for the step-scaling matrix Λ(4.006 GeV, 1.514 GeV) i j in both schemes are given in Tab. VII. In Tab. VIII we combine these step-scaling results with the 32ID Z RI←lat results to produce the final renormalization matrices at 4.0 GeV, where the errors on the two independent ensembles have been propagated using the super-jackknife procedure.
As mentioned previously, we will also utilize step-scaled renormalization matrices computed at µ 2 = 2.29 GeV both with and without the G 1 operator included. This calculation used an intermediate scale of µ = 1.33 GeV to match between the coarse and fine ensemble. Details of this calculation can be found in Ref. [37]. In that work the statistical errors on Z V and Z A were not included in the results, and Z V was used rather than Z A in the determination of Z q on the 32ID ensemble. In order to match the procedure outlined above we have reanalyzed the data from that work, the results of which are presented in Tab. IX for µ = 1.33 GeV and Tab. X for µ = 2.29 GeV. Note, at present only results in the SMOM(γ µ , γ µ ) scheme are available with G 1 included. calculations are listed in Tab. XI and their uncertainties are accounted for as a systematic error in the following section. In this table the value of Re(A 2 ) was obtained from the experimental measurement of K + → π + π 0 decays, and the value of Re(A 0 ) from K S → π + π − and K S → π 0 π 0 decays. The relationship between the isospin amplitudes and the experimental branching fractions and decay widths is described in detail in Secs. III.A and III.B of Ref. [14].

VI. RESULTS FOR
As previous mentioned, the Wilson Coefficients that incorporate the short distance physics "integrated out" from the Standard Model are known in perturbation theory in the 10-operator basis to NLO in the MS scheme. Partial calculations at NNLO are available in the literature [48][49][50][51][52], together with a preliminary study on a direct lattice determination [53]; in this manuscript we utilize the complete NLO results of Ref. [7] in the MS-NDR scheme for our central values, and the LO predictions to assign a systematic error due to the truncation of the perturbative series. For consistency with the NLO determination of the Wilson coefficients we follow Ref. [7] in utilizing the 2-loop determination of α s given in Ref. [7] (and the 1-loop determination for the LO Wilson coefficients used to estimate the systematic error) despite the fact that a 4-loop calculation is available [54]. In order to fix the parameters of the 2-loop (1-loop) calculation, a value of α s at a reference scale is required, and to minimize the perturbative truncation error it is desirable that this scale be close to the typical scale of the physical problem, in our case O(2 GeV). We therefore utilize the 4-loop calculation of α s to run the value of α where δ 0 is the I = 0 ππ scattering phase shift and φ is a known function [11] of q = Lk 2π , appropriately modified for our antiperiodic pion boundary conditions [13], with k the interacting pion momentum defined via (70)   with (upper) and without (lower) the effects of the G 1 operator included. This matrix converts the lattice matrix elements computed in this paper to the SMOM(γ µ , γ µ ) scheme at µ = 1.33 GeV Note that Eq. (69) differs by a factor of two from the corresponding equation in Ref. [12] due to our different conventions on the decay amplitude (cf. Ref. [31]).
The calculation of the Lellouch-Lüscher factor requires the derivative of the phase shift with respect to interacting pion momentum, or correspondingly the ππ energy, evaluated at the kaon mass. The determination of this derivative is detailed in Sec. III E where we present values obtained both directly from the lattice and also from the dispersive prediction. Given the good agreement between our phase shifts and the dispersive predictions [18] we will use the dispersive result given in Eq. (22). The variation in the results will be incorporated as a systematic error in Sec. VII D.
We find

B. Renormalized physical matrix elements
The infinite-volume matrix elements of the seven chiral-basis operators Q R j in a scheme R at the scale µ can be expressed without ambiguity in terms of the matrix elements M lat j = ππ|Q lat j |K of the corresponding lattice operators: where a is the lattice spacing, Z R←lat (µ) a 7 × 7 renormalization matrix and F the Lellouch-   [6], apart from those of Re(A 0 ), Re(A 2 ) and their ratio, ω, which were taken from Ref. [1]. Here φ ε is the phase of the indirect CP-violation parameter ε.
The CKM ratio τ = −V * ts V td /V * us V ud is obtained using the Wolfenstein parameterization expanded to eighth order, with parameters taken from the aforementioned review. The impact upon our result of the errors on those quantities marked with a ( * ) is incorporated as a systematic error in The ten conventional, linearly-dependent operators Q i are defined in terms of the seven independent operators Q j as follows: (73) where 1 ≤ i ≤ 10, j runs over the set {1, 2, 3, 5, 6, 7, 8} and the matrix T is given by which can be found as Eqs. (58) and (59) of Ref. [34]. This relationship applies both to RI scheme and bare lattice operators.
In our lattice calculation we have evaluated the matrix elements of all ten linearly-dependent operators Q i as given in Tab. V. This gives us a consistency test of the three Fierz identities: these identities are obeyed to within statistical errors and with an absolute size at the 1% level, validating our code. We do not expect the Fierz relations to be obeyed to floating point accuracy since our use of all-to-all propagators introduces a stochastic element into the inversion of the Dirac operator and our use of γ 5 hermiticity differs between the ten operators introducing statistical noise in different ways into each evaluation.
Since the Fierz identities are not obeyed exactly by the data in Tab. V, we have a choice as to how the ten linearly-dependent matrix elements M lat i in that table are to be combined to give the seven independent matrix elements M lat i needed on the right-hand side of Eq. (72). To this end we choose to treat M lat i as fit parameters whose best fit values are obtained by minimizing the correlated χ 2 : The result is an optimal combination that provably minimizes the statistical error on the resulting M lat i . The 10 × 10 covariance matrix C i j is estimated by studying the variation of the bootstrap means of the matrix elements, and is given in Tab. XII. Note that we use the same covariance matrix for the fit to each bootstrap sample (a frozen fit) and therefore do not take into account in our errors the fluctuations in the covariance matrix over bootstrap samples. However such effects are expected to be minimal due to our large number of configurations. The results for the bare matrix elements obtained by this procedure, along with those obtained by applying Eq. (73) to convert those results back into the 10-basis, are given in Tab. XIII. These results are quoted in physical units and incorporate the Lellouch-Lüscher finite-volume correction.
The results for the seven operators converted to the SMOM(γ µ , γ µ ) and SMOM( / q, / q) schemes are given in the left two columns of Tab. XIV. The right two columns of that table show the matrix elements of the ten conventional operators in the MS scheme obtained from the left two columns by an application of Eqs. (73) and (74). For the convenience of the reader in utilizing these results we also provide the covariance matrices for the SMOM( / q, / q) scheme matrix elements, which we will use as our central values in Sec. VIII, and also the MS matrix elements derived from them, in Tabs. XV and XVI, respectively.  We can now obtain A 0 from our lattice calculation as follows: The Wilson coefficients have been computed to next-to-leading order in QCD and electroweak perturbation theory in the MS scheme [7], and at µ = 4.006 GeV take the values given in Tab. XVII.
For the CKM matrix element ratio τ we use the value given in Tab. XI. Combining these with the MS-renormalized matrix elements obtained in Tab. XIV we obtain the following for the SMOM( / q, / q) intermediate scheme,      systematic errors is presented.
The contributions of each of the ten operators to the real and imaginary parts of A 0 are given in Tab    The real and imaginary parts of A 0 comprise different linear combinations of the same basis of real lattice matrix elements. As the real part of the amplitude is precisely known from experiment and is not expected to receive significant contributions from new physics, we can use this quantity to replace part of the lattice input and thereby improve the precision of the imaginary part. The appropriate procedure is discussed in Refs. [55,56] in the context of the conventional basis of 10 non-independent operators, where the latter authors use it to eliminate the Q 2 matrix element. For our purpose it is more convenient to express the method in terms of the unrenormalized matrix elements in the 7-operator basis. We write Im where the M lat j = ππ|Q k |K are the matrix elements of the unrenormalized lattice operators in the 7-basis in infinite-volume and physical units, and are the "lattice Wilson coefficients". Here T i j is the 10 × 7 matrix expressing the 10 linearlydependent operators in terms of the seven independent operators in the chiral basis, given in Eq. (74). The matrix Z MS←lat is the product of the 7 × 7 perburbative matrix expressing the seven MS operators in terms of the seven RI operators and the non-perturbative 7 × 7 matrix which determines the RI operators in terms of the lattice operators.
We can then use Eq. (79) to remove the matrix element of the operator Q from Im and choose In Tab We could instead choose the parameter λ to give that result for Im(A 0 ) with the smallest statistical error. Since the value obtained for λ from this procedure is extremely close to that needed to remove the matrix element M lat 3 , we adopt the simpler procedure of eliminating M lat 3 and the results given in Eqs. (85) and (86).
The resulting real part of the complex phase, is in complete agreement with the value of 0.9998(2) obtained by combining PDG inputs [6] and the dispersive values for the phase shifts [16].  (26) and for the SMOM(γ µ , γ µ ) intermediate scheme, where the error is statistical only.
It is illustrative to break the value of Re(ε /ε) into the so-called "QCD penguin" ImA 0 ReA 0 and "electroweak penguin" ImA 2 ReA 2 contributions, the sum of which is equal to Re(ε /ε). These terms have opposite sign such that the sum involves an important cancellation. For the electroweak penguin contribution we find Using the results for Im(A 0 ) obtained using the SMOM( / q, / q) intermediate scheme we find (96) Re ε ε QCDP = 0.00297 (26) , and likewise for the SMOM(γ µ , γ µ ) intermediate scheme, Re ε ε QCDP = 0.00283 (25) .
We observe that the two terms cancel at the 27(4)% and 28(4)% level relative to the QCDP contribution for the SMOM( / q, / q) and SMOM(γ µ , γ µ ) results, respectively. This degree of cancellation is considerably less than the 71(36)% observed in our 2015 analysis. Here the errors are statistical only.
We can also compute a purely lattice value of Re(ε /ε) using Re(A 0 ) from Eqs.  6. The change in the value of the experimental inputs, notably that of the CKM ratio τ from 0.001543 − 0.000635i to 0.001558 − 0.000663i.
We first note that repeating the ππ two-point function analysis for our larger data set but with a one-state fit to a single operator (ππ(111)), and a fit range 6-25 to match that of the 2015 analysis, yields a result (in lattice units), indicating that a 3.4× increase in statistics is not sufficient to account for the difference.
Again we observe a small increase but insufficient to account for the difference.
The result in Eq. (105) differs now from our primary result only in the ππ and K → ππ fitting strategies. Adopting the final fit ranges determined for the ππ and K → ππ fits in Secs. III and IV, such that the analysis now differs only in the number of ππ operators, gives This result is now much closer to our final result. The behavior we observe here is consistent with that displayed in Fig. 6 where we plot the dependence of the fitted matrix elements on the cut t min and the number of ππ operators included in the fits to the matrix elements (the ππ two-point function fits remain unchanged between the results displayed in this figure). This figure shows a significant discrepancy between the Q 6 matrix element obtained from a one-operator, one-state fit with t min = 4 and the plateau observed when further operators are included. With increased statistics the onset of the apparent plateau for the one-operator, one-state fit does not occur until t min = 5 (equal to the t min = 5 used to obtain the result in Eq. (106)) but the resulting value for the Q 6 matrix element is still several standard deviations larger than the strong plateau observed in the multi-operator fits.
We therefore conclude that the difference in Re(ε /ε) between our present and 2015 analysis results can be attributed primarily to unexpectedly large excited-state contamination in our previous analysis masked by the rapid reduction in the signal to noise ratio, and that multiple operators are essential to isolate the ground-state matrix element even with large statistics.

VII. SYSTEMATIC ERRORS
In this section we describe the procedure used to estimate the systematic errors on our results.
We will quote the values as representative percentage errors on either the matrix elements or on A 0 as appropriate. A discussion of the systematic errors in the ∆I = 3/2 calculation can be found in Ref. [2].

A. Excited state contamination
In Sec. IV E we devoted considerable effort to finding an optimal fit window in which excited state effects are minimal. We were unable to find evidence of such effects arising from excited kaon states, which is to be expected given both the large relative energy of such states and also  the fact that the rapid growth of statistical noise as the four-quark insertion is moved away from the ππ operator implies that the data furthest from the kaon operator dominates the fit results. As such we do not assign a systematic error to possible contamination from excited kaon states.
As for the contribution of excited ππ states, we found little evidence for such effects even within the single operator fits to the ππ(111) data, except for the Q 5 and Q 6 matrix elements where the single-operator fits showed statistically significant deviations from the common plateau region that did not die away until t = 6. We observed that by adding further sink operators and allowing for more ππ states substantially reduced the excited-state contamination such that the fits were highly consistent even if we include data at times as low as t = 3. Despite this we chose a conservative uniform cut of t min = 5 for our fits.
In order to assign a numerical error to the contamination from excited ππ states, we consider the comparison of the 3-operator, 3-state fit with t min = 4 and the 3-operator, 2-state fit with t min = 5, the latter being our chosen best fit. The former includes a third state and with t min = 4 appears capable of describing the data well outside of the fit range, as we observed in Fig. 11 (lower-left panel). We compute relative differences under the bootstrap between the values of the groundstate matrix elements, the results of which are shown in Tab. XX. The only statistically resolvable difference, at 1.5σ , is for the Q 9 matrix element, which has only a negligible contribution to Im(A 0 ). For the dominant Q 4 and Q 6 matrix elements the differences cannot be resolved within our errors. We therefore conclude that the excited state systematic error is likely to be much smaller than our dominant systematic errors and can be neglected.

B. Unphysical kinematics
As our values of E ππ and m K differ by 2.2(3)%, the K → ππ matrix elements are not precisely on shell. As discussed in Sec. IV, the primary result of these unphysical kinematics is the rise of a divergent contribution from the pseudoscalar operatorsγ 5 d that vanishes when on shell by the equations of motion. In order to suppress this error we perform an explicit subtraction of the pseudoscalar operator that leaves behind a finite, regulator-independent term that represents the dominant remaining systematic error from the unequal kaon and ππ energies. As we are close to being on shell we can reasonably assume a linear ansatz for the dependence of our result on the energy difference E ππ − m K . We estimate the associated systematic error by observing the change in the Q 2 matrix element as the kaon mass is increased by 4.5%. The measurement was performed using 69 configurations of our original ensemble [1], with 3 different K → π time separations (10, 12, and 14), and we observed a 6.9% increase in the matrix element. We scale this increase by the relative difference between our kaon and ππ energies, giving 3%.
Another means of estimating this systematic error is to vary the subtraction coefficients α i by an amount consistent with the expected size of the residual contribution of the pseudoscalar operator.
Given that the operator is dimension-3, its coefficient is originally O(m s /a 2 ) where the strange quark mass is in physical units. After the subtraction is performed, the residual term is expected to be of size O(m s Λ 2 QCD ), which has a relative size of ∼a 2 Λ 2 QCD , or ∼5%, of the original contribution, for Λ QCD = 300 MeV. Increasing the subtraction coefficients α i by this amount gives rise to the differences in the unrenormalized lattice matrix elements given in Tab. XXI. The observed variations are generally consistent with the above, but to be conservative we assign a relative systematic error of 5% on the matrix elements resulting from the off-shell difference E ππ = m K . We use the value provided in Ref. [1] as an estimate of the finite lattice spacing systematic error. This was obtained by comparing the values of the ∆I = 3/2 matrix elements between the continuum limit [2] and the calculation [14] performed on our 32 3 × 64, β = 1.75 (32ID) lattice.
The parameters of the latter ensemble are identical to those used in this work to compute A 0 , albeit without G-parity boundary conditions and with a larger-than-physical light quark mass giving a unitary pion mass of 170 MeV. The MS values for the three continuum matrix elements that contribute to A 2 are obtained by combining the continuum values of those matrix elements in the SMOM( / q, / q) scheme (Tab. XIV of Ref. [2]) with the RI→ MS renormalization matrix computed on the 32ID lattice (Eq. 66 of Ref. [2]). As such this estimate addresses only the discretization errors on the matrix elements and not to those on the renormalization factors (which are expected to be small). We find the values given in Tab. XXII. Averaging the three relative errors we arrive at an estimate of 12% discretization errors on the matrix elements.

D. Lellouch-Lüscher factor
As described in Sec. VI A, the calculation of the Lellouch-Lüscher factor, F, that accounts for the power-law finite-volume corrections to the matrix element, requires an ansatz for the derivative of the ππ phase shift with respect to energy. In Sec. III E we present values for this derivative obtained from three methods: • The Schenk parameterization [57] of the dispersive energy dependence obtained in Ref. [16] • A linear approximation in the ππ energy above threshold, dδ 0 dE ππ = δ 0 E ππ −2m π , which is inspired by the dispersive low-energy dependence found in Ref. [16] and can be related to dδ 0 /dq via Eq. (70).
• A direct lattice calculation of the phase shift at energies close to and including the kaon mass.
Ignoring the noisier of the two lattice determinations, the results varied between dδ 0 dq = 1.26 and 1.41, a 12% spread. The resulting values of F differ by 1.5% since the dominant contribution arises from the derivative of the analytic function φ . We therefore assign a 1.5% systematic error to the matrix elements from this source.

E. Exponentially-suppressed finite volume corrections
We expect the remaining finite volume corrections to our matrix elements to be dominated by the (exponentially-suppressed) interactions between the final state pions that are not accounted for  by the Lüscher and Lellouch-Lüscher prescriptions. In Refs. [2,14] we performed an in-depth analysis of the finite-volume errors on the matrix elements that comprise A 2 using SU(3) chiral perturbation theory, in which the mesonic loop integrals are replaced by discrete sums over the allowed momenta. We do not expect these effects to depend strongly on the form of the fourquark operator, and indeed comparable O(6 − 6.5%) corrections were estimated for both classes of operator that enter the calculation of A 2 . We therefore assign a representative 7% systematic error to the matrix elements. In the calculation of our step-scaled non-perturbative renormalization factors with scale µ = 4.01 GeV we have not incorporated the effects of the G 1 operator. A previous lattice study [37], performed in the SMOM(γ µ , γ µ ) scheme and utilizing step-scaling from a low-scale of µ = 1.33 GeV on our 32ID ensemble to a high scale of 2.29 GeV on a finer lattice, revealed the effects on A 0 of including this operator to be on the order of a few percent when combined with the matrix elements measured in our 2015 work [1]. Unfortunately the statistical errors on the differences in the renormalized matrix elements at µ = 2.29 GeV with and without G 1 included were found to be too large to resolve the effect with any precision, and we find that this also applies to the matrix elements obtained in the present work. (The renormalization matrices with and without G 1 at µ = 2.29 GeV can be found in Tab. X.) As discussed in Ref. [37], the increase in the relative error on the bootstrap differences is associated largely with the step-scaling matrix Λ RI that describes the running between the low and high energy scales. However it is reasonable to expect that the largest effects of neglecting G 1 appear at the low energy scale in the step-scaling where the QCD coupling is larger. We therefore it is convenient to study instead the differences between the elements of the 7×7 lattice→ MS renormalization matrix where H is the perturbative matching matrix. In the absence of systematic errors the matrix R MS←RI←lat is independent of the intermediate RI scheme. We can then study this systematic error by examining the matrix where I is the 7 × 7 unit matrix and |.| implies that the absolute value of each element is taken.
The ratio of R-matrices in this equation converts from the lattice scheme to MS through one intermediate scheme, and converts back to the lattice scheme via the other scheme, and hence becomes the unit matrix if no systematic errors exist. The difference from the unit matrix is therefore a measure of the size of the systematic error: Under the reasonable assumption that the systematic errors in the two schemes are comparable in size, we expect the elements of Ξ to vary between zero and approximately twice the size of the systematic error present in each. We therefore assign a percentage systematic error that is one half of the largest observed element of Ξ at a scale µ.
In Tab. XXIV we tabulate the non-zero elements of Ξ for various MS scales and step-scaling procedures. Once again we observe that the effects of including or discounting the G 1 operator, while harder to statistically resolve after passing through the step-scaling procedure, are at the percent scale.
As expected there is a general trend towards smaller values as we increase the scale that appears consistent with the factor of three decrease in α 2 s between 1.33 GeV and 4.01 GeV that is expected to describe the scaling of the missing NNLO terms. Unfortunately the statistical errors on the results at 4.01 GeV are too large to resolve the residual systematic effects. Nevertheless, considering the results of this table and also the 3-4% differences observed in ReA 0 and ImA 0 between the schemes in Sec. VI C, we assign a 4% systematic error to the non-perturbative renormalization factors.

H. Parametric errors
We propagate the parametric uncertainties shown in Tab. XI to ReA 0 and ImA 0 . For ReA 0 the largest such uncertainty is the charm-mass dependence, which, however, is only a 0.3% effect.
For ImA 0 , the largest uncertainty is 5% from the τ parameter, 3% from α s , and less than 1% from the charm and top quark masses. The other uncertainties have been estimated but are negligible compared to those quoted. We therefore estimate a total parametric uncertainty of 6% for ImA 0 and 0.3% for ReA 0 .  of Im(A 0 ) are quite consistent, in the range 11-15%. This suggests that the bulk of the observed difference arises from the perturbative 3-to-4 flavor matching and running above the charm threshold, which is common to all of these determinations, and that improved theory input for the 3-to-4 flavor matching could significantly reduce it. (Note that in our calculation we take the matching scale across a flavor threshold equal to the corresponding quark mass in order to avoid large logarithms. Additional insights could be gained by studying the dependence on this matching scale as in Ref. [52].) In conclusion we assign a 12% systematic error on both ReA 0 and ImA 0 associated with the NLO determination of the Wilson coefficients.

J. Error budget
We divide the systematic errors into those that affect the calculation of the matrix elements of the MS weak operators Q j and those that enter when these matrix elements are combined to produce the complex, physical decay amplitude A 0 . The former are collected in Tab    respectively. Note that our 4% estimate of the renormalization systematic error includes both lattice systematic errors and those associated with the truncation of the perturbative series in the RI→ MS matching. While the latter are inappropriate to apply to matrix elements in the non-perturbative schemes, due to our estimation procedure we are at present unable to isolate these two effects and as such apply the full 4% systematic error also to these RI matrix elements.

VIII. FINAL RESULTS AND DISCUSSION
In this section we collect our final results including systematic errors and discuss the implications of our results. For consistency with our previous work we will use the SMOM( / q, / q) intermediate scheme for our central value.

A. Matrix elements
The renormalized, infinite-volume matrix elements in the RI and MS schemes are given in Tab. XIV, where the errors are statistical only. The corresponding relative systematic errors can be found in Tab. XXV. For the convenience of the reader we have reproduced the matrix elements in the SMOM( / q, / q) scheme including their systematic errors in Tab. XXVII. In order to allow the reader to compute derivative quantities from these matrix elements, the covariance matrices for the renormalized matrix elements in the SMOM( / q, / q) and MS schemes at 4.01 GeV can be found in Tabs. XV and XVI, respectively.

B. Decay amplitude
For the real part of the decay amplitude we take the value from Eq. (77a) and apply the systematic errors given in Tab The breakdown of the contributions of each of the 10 operators to these amplitudes can be found in Tab. XVIII. We observe that, at the scale at which we are working, the dominant contribution to Re(A 0 ) (97%) originates from the tree operator Q 2 , while Q 1 has a contribution of about 13% that is largely cancelled by that of the penguin operator [58, 59] Q 6 . Likewise, the dominant contribution to Im(A 0 ) is from the QCD penguin [58,59] operator, Q 6 , with a 14% cancellation from Q 4 . The "∆I = 1/2 rule" refers to the enhancement by almost a factor of 450 of the I = 0 K → ππ decay rate relative to that of the I = 2 decay, corresponding to the experimentally-determined ratio Re(A 0 )/Re(A 2 ) = 22.45 (6). A factor of two contribution to this ratio arises from the perturbative Wilson coefficients [60][61][62]. While the remaining factor of ten has been viewed for some time as a consequence of the strong dynamics of QCD, the origin of this large factor has remained something of a mystery with no widely-accepted dynamical explanation.
In the past [14,15,63], and most recently in Ref. [2], when simulating with physical pion masses we have observed a sizeable cancellation between the two Wick contractions of the dominant (27, 1) operator contributing to the ∆I = 3/2 decay amplitude, leading to a significant suppression of Re (A 2 ). In these calculations we reproduced the experimental value of Re (A 2 ) and concluded that this cancellation was likely to be a very significant element in the ∆I = 1/2 rule. We stress that the cancellation between the two leading contributions to Re (A 2 ) depends sensitively on the light quark mass and becomes much less significant as the light quark mass is increased above its physical value. Note also that such a cancellation is not consistent with naïve factorization, which predicts that both contributions have the same sign and differ in size by a factor of three due to color suppression, although calculations using chiral perturbation theory and the 1/N c expansion [64,65] have previously suggested a reduction in A 2 .
In order to obtain a quantitative, first-principles result for Re (A 0 )/Re (A 2 ), we also require knowledge of Re (A 0 ) which we provide in Eq. (109) of the present paper. Combining this with our earlier result for A 2 [2], we obtain where the errors are statistical and systematic respectively. The value in Eq. (111) agrees very well with the experimental result, demonstrating quantitatively that, within the uncertainties, the ∆I = 1/2 rule is indeed a consequence of QCD and thus providing an answer to an important long-standing puzzle.
For ε /ε we use Eq. (2), combining the lattice values for the imaginary parts of the decay amplitudes with the experimental measurements of the real parts. The systematic error for Im(A 0 ) is taken from Tab. XXVI and that of Im(A 2 ) from Eq. 64 of Ref. [2]. The statistical and systematic errors on Im(A 0 ) and Im(A 2 ) are combined in quadrature and are therefore enhanced by the cancellation between the two terms in Eq. (2). However, one further important systematic error should be addressed: that arising from the effects of electromagnetism and the isospin-breaking difference, m d − m u , between the down and up quark masses.
While for most quantities these corrections enter at the 1% level or below, for ε this familiar situation does not hold. As can be seen from the formula used to compute ε in the Standard Model given in Eq. (2), the I = 0 and I = 2 amplitudes A 0 and A 2 enter with equal weight. However, as is summarized by the ∆I = 1/2 rule, the amplitude A 2 is 22.5 times smaller than A 0 . Thus, a 1% correction to A 0 can introduce an O(20%) correction to A 2 and a potential correction to ε of 20% or more.
The effects on ε of electromagnetism and m d − m u have been the subject of active research for some time [66][67][68]. The most recent results are those of Cirigliano et al. [68]. They provide a correction that is appropriate for our calculation in which the contribution of the electro-weak penguin operators Q 7 and Q 8 has been included. Their result is parametrized byΩ eff which is introduced into a version of Eq. (2) which incorporates these effects: and findΩ eff = (17.0 +9.1 −9.0 ) × 10 −2 . Here we are reproducing Eqs. (54) and (60)  Since a careful discussion of these corrections is beyond the scope of this paper we choose to treat these effects of isospin breaking as a systematic error whose size is given by the effect of includingΩ eff in Eq. (112). We find where the errors are statistical and systematic, with the systematic error separated as isospinconserving and isospin-breaking, respectively. We note that if we were to apply this negative correction directly to our result for Re(ε /ε), the central value obtained, 0.00167, would nearly coincide with the experimental value, albeit with appreciable errors.
Our first-principles calculation of ε /ε also allows us to place a new, horizontal-band constraint on the CKM matrix unitarity triangle in theρ −η plane. In Fig. 12 we overlay this band with constraints arising from other sources. We find that our result is consistent with the other constraints and does not at present suggest any violation of the CKM paradigm. For more information on how this band was obtained, as well as the corresponding plot obtained using our 2015 results, we refer the reader to Ref. [72]. The error bands represent the statistical and systematic errors combined in quadrature. Note that the band labeled ε is historically (e.g. in Ref. [72]) labeled as ε /ε, where ε is taken from experiment.
Möbius domain wall ensemble with the Iwasaki+DSDR gauge action, with an inverse lattice spacing of 1.378(7) GeV and physical pion masses. G-parity boundary conditions are used in the three spatial directions which induces non-zero momentum for the ground-state pions so that the energy of the lightest two-pion state matches the kaon mass to around 2%, thereby ensuring a physical, energy-conserving decay.
The new calculation reported here is based on an increase by a factor of 3.4 in the number of Monte Carlo samples and includes two additional ππ interpolating operators, which have dramatically improved our control over contamination arising from excited ππ states. The greater resolution among the excited finite-volume ππ states provided by our now three interpolating operators has allowed us to resolve the approximately 2σ discrepancy between our earlier lattice result for the I = 0 ππ scattering phase shift and the dispersive prediction, as will be detailed in Ref. [18]. These improved techniques result in a significant, 70% (2.6σ if σ is determined from only the statistical error) relative increase in the size of the unrenormalized lattice value of We are presently working [73] to circumvent this issue by computing the 3-to 4-flavors matching non-perturbatively using a position-space NPR technique [44].
Finally in the current calculation we have adopted a new bootstrap method [27] to determine the χ 2 distributions appropriate for our calculation in which the data is both correlated and non-Gaussian. The resulting improved p-values provide better guidance in our choice of fitting ranges and multi-state fitting functions.
Finite lattice spacing effects remain a significant source of systematic error as at present we have computed ε at a single, somewhat coarse lattice spacing. In the future we intend to follow the procedure used in our A 2 calculation [2] to compute A 0 at two different lattice spacings, allowing us to perform a full continuum limit. This is hampered by the need to generate new ensembles with GPBC, which alongside the high computational cost of the measurements and the need for large statistics requires significantly more computing power than is presently available.
A second important systematic error, which we plan to reduce in future work, comes from the effects of electromagnetic and light quark mass isospin breaking. As discussed in Sec. VIII D, the small size of the amplitude A 2 relative to A 0 gives a potential twenty times enhancement of such effects which are normally at the 1% level. The effects of electromagnetism and the quark mass difference m d − m u have been studied in considerable detail using chiral perturbation theory and large N c arguments, most recently in Ref. [68]. We take the size of their correction as an important systematic error for our present result and are exploring possible methods to also use lattice techniques to determine these effects [74,75].
The third error here is the systematic error associated with isospin breaking and electromagnetic effects, and the first and second errors are the statistical error and the remaining systematic error.
These values are consistent within the quoted errors.
We believe that ε continues to offer a very important test of the Standard Model with exciting opportunities for the discovery of new physics. For this promise to be realized substantially more accurate Standard Model predictions are needed. Important improvements can be expected from a simple extension of the work presented here, studying a sequence of ensembles with decreasing lattice spacing so that a continuum limit can be evaluated. In addition, we are developing a second, complementary approach to the study of K → ππ decay which is based on periodic boundary conditions. This avoids the complexity of the G-parity boundary conditions used in the present work but requires that higher excited ππ states be used as the decay final state [76]. More challenging is the problem posed by the inclusion of electromagnetism where new methods [74,75] are needed to combine the finite-volume methods of Lüscher [11] and Lellouch and Lüscher [12] with the long-range character of electromagnetism.

ACKNOWLEDGMENTS
We would like to thank our RBC and UKQCD collaborators for their ideas and support. We are also pleased to acknowledge Vincenzo Cirigliano for helpful discussion and explanation of isospin breaking effects. The generation of the gauge configurations used in this work was primar-  operators can be found in in Appendix B.1 and B.2 of Ref. [33].
For this appendix we will utilize the notation described in Sec. III A whereby the quark field operators are placed in two-component "flavor" vectors ψ l and ψ h for the light and heavy quarks, respectively, and the corresponding propagators are matrices also in this flavor index. In this notation the creation operator for the G-parity even neutral kaon analog has the form, where the physical component corresponds to the usual neutral kaon operator (cf. Sec. VI.A of Ref. [10]). The σ creation operator has the form, For convenience we will treat the meson bilinears as point operators in which both quarks reside on the same lattice site. (In our actual lattice calculation we use more elaborate source and sink operators but those details are not needed to specify how we evaluate the Wick contractions.) The ten effective four-quark operators Q i for i ∈ {1 . . . 10} written in the above notation are given in Sec. 3.2.2 of Ref. [33]. While the exact forms are not important for this discussion, we highlight the fact that the operators are written in terms of a common set of matrices, where F i are diagonal flavor matrices that pick out either the upper (0) or lower (1) element of the vector upon which they act: The matrices M µ i,V ±A appear inside products of two bilinear operators and the space-time index µ is summed over implicitly. Following the notation of Ref. [33] we will suppress this index. contractions. The type3 and type4 diagrams are those that contain a quark loop at the location of the four-quark operator, with type4 corresponding to that subset of those diagrams that are disconnected (i.e. for which the σ operator self-contracts). For the ππ(. . .) operators the remaining, connected, contractions can be subdivided based on whether the two pion bilinear operators are directly connected by a quark line (type2) or not (type1); no such distinction exists of course for the σ sink operator. Hence we classify all remaining diagrams as type1.
As in Ref. [33] it is convenient to write the ten expressions A i in terms of a common basis of, in this case 23, functions D α (Γ 1 , Γ 2 ) where the subscript indexes the function and Γ 1,2 are spin-flavor matrices.
We will first write down the expressions for the correlation functions A i in terms of these functions and will conclude the section with their definition. We list the contributions for each of The type1 contractions are: (A9a) D 1 (Γ 1 , Γ 2 ) = tr Γ 2 G l y,x γ 5 G h x,y Γ 1 G l y,z G l z,y (A9b) D 6 (Γ 1 , Γ 2 ) = tr G h x,y Γ 1 G l y,x γ 5 tr G l z,y Γ 2 G l we use the shorthand ABC . . . to denote the n-point Green's functions of the operators A, B, C, and so on, in descending time order.
It is easy to see that the A vac