Light-meson leptonic decay rates in lattice QCD+QED

The leading electromagnetic (e.m.) and strong isospin-breaking corrections to the $\pi^+ \to \mu^+ \nu[\gamma]$ and $K^+ \to \mu^+ \nu[\gamma]$ leptonic decay rates are evaluated for the first time on the lattice. The results are obtained using gauge ensembles produced by the European Twisted Mass Collaboration with $N_f = 2 + 1 + 1$ dynamical quarks. The relative leading-order e.m. and strong isospin-breaking corrections to the decay rates are 1.53(19) % for $\pi_{\mu 2}$ decays and 0.24(10) % for $K_{\mu 2}$ decays. Using the experimental values of the $\pi_{\mu 2}$ and $K_{\mu 2}$ decay rates and updated lattice QCD results for the pion and kaon decay constants in isosymmetric QCD, we find that the Cabibbo-Kobayashi-Maskawa matrix element $\vert V_{us}\vert = 0.22538(38)$, reducing by a factor of about $1.8$ the corresponding uncertainty in the PDG review. Our calculation of $|V_{us}|$ allows also an accurate determination of the first-row CKM unitarity relation $\vert V_{ud}\vert^2 + \vert V_{us}\vert^2 + \vert V_{ub}\vert^2 = 0.99988(44)$. Theoretical developments in this paper include a detailed discussion of how QCD can be defined in the full QCD+QED theory and an improved renormalisation procedure in which the bare lattice operators are renormalised non-perturbatively into the RI$^\prime$-MOM scheme and subsequently matched pertubatively at $O(\alpha_{em}\alpha_s(M_W))$ into the W-regularisation scheme appropriate for these calculations.


I. INTRODUCTION
In flavor physics, the determination of the elements of the Cabibbo-Kobayashi-Maskawa (CKM) matrix [1], which contain just four parameters, from a wide range of weak processes represents a crucial test of the limits of the Standard Model (SM) of particle physics. Inconsistencies with theoretical expectations would indeed signal the existence of new physics beyond the SM and subsequently a detailed comparison of experimental measurements and theoretical predictions would provide a guide toward uncovering the underlying theory beyond the SM. For this to be possible nonperturbative hadronic effects need to be evaluated as precisely as possible and in this paper we report on progress in improving the precision of lattice computations of leptonic decay rates by including radiative corrections and strong isospin-breaking (IB) effects. A summary of our results has been presented in Ref. [2]; here we expand on the details of the calculation and include several improvements, most notably the renormalization of the four-fermion weak operators in the combined QCD þ QED theory (see Sec. IV). We also discuss in some detail how one might define the QCD component of the full (QCD þ QED) theory (see Sec. II). Although such a separate definition of QCD is not required in order to obtain results computed in the full theory, it is necessary if one wishes to talk about radiative (and strong IB) "corrections" to results obtained in QCD. For this we need to specify what we mean by QCD.
The extraction of the CKM elements from experimental data requires an accurate knowledge of a number of hadronic quantities and the main goal of large-scale QCD simulations on the lattice is the ab initio evaluation of the nonperturbative QCD effects in physical processes. For several quantities relevant for flavor physics phenomenology, lattice QCD has recently reached the impressive level of precision of Oð1%Þ or even better. Important examples are the ratio f K =f π of kaon and pion leptonic decay constants and the K l3 vector form factor f þ ð0Þ [3], which play the central role in the accurate determination of the CKM entries jV us =V ud j and jV us j, respectively. Such lattice computations are typically performed in the isospin symmetric limit of QCD, in which the up and down quarks are mass degenerate (m u ¼ m d ) and electromagnetic (e.m.) effects are neglected (α em ¼ 0).
Isospin breaking effects arise because of radiative corrections and because m u ≠ m d ; the latter contributions are usually referred to as strong isospin breaking effects. Since both α em and ðm d − m u Þ=Λ QCD are of Oð1%Þ, IB effects need to be included in lattice simulations to make further progress in flavor physics phenomenology, beyond the currently impressive precision obtained in isosymmetric QCD.
Since the electric charges of the up and down quarks are different, the presence of electromagnetism itself induces a difference in their masses, in addition to any explicit difference in the bare masses input into the action being simulated. The separation of IB effects into strong and e.m. components therefore requires a convention. We discuss this in detail in Sec. II, where we propose and advocate the use of hadronic schemes, based on taking a set of hadronic quantities, such as particle masses, which are computed with excellent precision in lattice simulations, to define QCD in the presence of electromagnetism.
In recent years, precise lattice results including e.m. and strong IB effects have been obtained for the hadron spectrum, in particular for the mass splittings between charged and neutral pseudoscalar (P) mesons and baryons (see, for example, Refs. [4,5]). The QED effects were included in lattice QCD simulations using the following two methods: (i) QED is added directly to the action and QED þ QCD simulations are performed at a few values of the electric charge and the results extrapolated to the physical value of α em (see, e.g., Refs. [5][6][7]). (ii) The lattice path integral is expanded in powers of the two small parameters α em and ðm d − m u Þ=Λ QCD . This is the RM123 approach of Refs. [4,8,9] which we follow in this paper. In practice, for all the relevant phenomenological applications it is currently sufficient to work at first order in the small parameters α em and ðm d − m u Þ=Λ QCD . The attractive feature of the RM123 method is that it allows one naturally to work at first order in isospin breaking, computing the coefficients of the two small parameters directly. Moreover, these coefficients can be determined from simulations of isosymmetric QCD.
The calculation of e.m. and strong IB effects in the hadron spectrum has a very significant simplification in that there are no infrared (IR) divergences. The same is not true when computing hadronic amplitudes, where e.m. IR divergences are present and only cancel in well-defined, measurable physical quantities by summing diagrams containing real and virtual photons [10]. This is the case, for instance, for the leptonic π l2 and K l2 and semileptonic K l3 decay rates. The presence of IR divergences requires a new strategy beyond those developed for the calculation of IB effects in the hadron spectrum. Such a new strategy was proposed in Ref. [11], where the determination of the inclusive decay rate of a charged P meson into either a final l AE ν l pair or a final l AE ν l γ state was addressed.
The e.m. corrections due to the exchange of a virtual photon and to the emission of a real one can be computed nonperturbatively, by numerical simulations, on a finite lattice with the corresponding uncertainties. The exchange of a virtual photon depends on the structure of the decaying meson, since all momentum modes are included, and the corresponding amplitude must therefore be computed nonperturbatively. On the other hand, the nonperturbative evaluation of the emission of a real photon is not strictly necessary [11]. Indeed, it is possible to compute the real emission amplitudes in perturbation theory by limiting the maximum energy of the emitted photon in the meson rest frame, ΔE γ , to a value small enough so that the internal structure of the decaying meson is not resolved. The IR divergences in the nonperturbative calculation of the corrections due to the exchange of a virtual photon are canceled by the corrections due to the real photon emission even when the latter is computed perturbatively, because of the universality of the IR behavior of the theory (i.e., the IR divergences do not depend on the structure of the decaying hadron). Such a strategy, which requires an experimental cut on the energy of the real photon, makes the extraction of the relevant CKM element(s) cleaner.
In the intermediate steps of the calculation, it is necessary to introduce an IR regulator. In order to work with quantities that are finite when the IR regulator is removed, the inclusive rate ΓðP þ → l þ ν l ½γÞ is written as [11] ΓðP AE → l AE ν l ½γÞ where the subscripts 0,1 indicate the number of photons in the final state, while the superscript pt denotes the pointlike approximation of the decaying meson and μ γ is an IR regulator. In the first term on the rhs of Eq. (1), the quantities Γ 0 ðLÞ and Γ pt 0 ðLÞ are evaluated on the lattice. Both have the same IR divergences which therefore cancel in the difference. We use the lattice size L as the intermediate IR regulator by working in the QED L [12] formulation of QED on a finite volume (for a recent review on QED simulations in a finite box, see Ref. [13]). The difference ½Γ 0 − Γ pt 0 is independent of the regulator as this is removed [14]. As already pointed out, since all momentum modes contribute to it, Γ 0 ðLÞ depends on the structure of the decaying meson and must be computed nonperturbatively. The numerical determination of Γ 0 ðLÞ for several lattice spacings, physical volumes, and quark masses is indeed the focus of the present study.
In the second term on the r.h.s. of Eq. (1), P is a pointlike meson and both Γ pt 0 ðμ γ Þ and Γ pt 1 ðΔE γ ; μ γ Þ can be calculated directly in infinite volume in perturbation theory, using a photon mass μ γ as the IR regulator. Each term is IR divergent, but the sum is convergent [10] and independent of the IR regulator. In Refs. [11,14], the explicit perturbative calculations of ½Γ pt 0 þ Γ pt 1 ðΔE γ Þ and Γ pt 0 ðLÞ have been performed with a small photon mass μ γ or by using the finite volume respectively, as the IR cutoffs.
In Ref. [2], we have calculated the e.m. and IB corrections to the ratio of K μ2 and π μ2 decay rates of charged pions and kaons into muons [2], using gauge ensembles generated by the European Twisted Mass Collaboration (ETMC) with N f ¼ 2 þ 1 þ 1 dynamical quarks [15,16] in the quenched QED (qQED) approximation in which the charges of the sea quarks are set to 0. The ratio is less sensitive to various sources of uncertainty than the IB corrections to π μ2 and K μ2 decay rates separately. In this paper, in addition to providing more details of the calculation than was possible in Ref. [2], we do evaluate the e.m. and strong IB corrections to the decay processes π μ2 and K μ2 separately. Since the corresponding experimental rates are fully inclusive in the real photon energy, structure-dependent (SD) contributions to the real photon emission should be included; however, according to the chiral perturbation theory (ChPT) predictions of Ref. [17], these SD contributions are negligible for both kaon and pion decays into muons. The same is not true to the same extent for decays into final-state electrons (see Ref. [11]) and so in this paper we focus on decays into muons. The SD contributions to Γ 1 are being investigated in an ongoing dedicated lattice study of light and heavy P-meson leptonic decays.
An important improvement presented in this paper is in the renormalization of the effective weak Hamiltonian. To exploit the matching of the effective theory to the Standard Model performed in Ref. [18], it is particularly convenient to renormalize the weak Hamiltonian in the W-regularization scheme. The renormalization is performed in two steps. First of all, the lattice operators are renormalized nonperturbatively in the RI 0 -MOM scheme at Oðα em Þ and to all orders in the strong coupling α s . Because of the breaking of chiral symmetry in the twisted mass formulation we have adopted in our study, this renormalization includes the mixing with other four-fermion operators of different chirality. In the second step, we perform the matching from the RI 0 -MOM scheme to the W-regularization scheme perturbatively. By calculating and including the two-loop anomalous dimension at Oðα em α s Þ, the residual truncation error of this matching is of Oðα em α s ðM W ÞÞ, reduced from Oðα em α s ð1=aÞÞ in our earlier work [11].
The main results of the calculation are presented in Sec. VI together with a detailed discussion of their implications. Here, we anticipate some key results: after extrapolation of the data to the physical pion mass, and to the continuum and infinite-volume limits, the isospinbreaking corrections to the leptonic decay rates can be written in the form where Γ ð0Þ is the leptonic decay rate at tree level in the Gasser-Rusetsky-Scimemi (GRS) scheme which is a particular definition of QCD [19] (see Sec. II B 2 below). The corrections are about 1.5% for the pion decays and 0.2% for the kaon decay, in line with naïve expectations. Taking the experimental value of the rate for the K μ2 decay, Eq. (3) together with Γ ð0Þ ðK AE → μ AE ν l Þ obtained using the lattice determination of the kaon decay constant we obtain jV us j ¼ 0.22567ð42Þ, in agreement with the latest estimate jV us j ¼ 0.2253ð7Þ, recently updated by the PDG [20] but with better precision. Alternatively, by taking the ratio of K μ2 and π μ2 decay rates and the updated value jV ud j ¼ 0.97420ð21Þ from super-allowed nuclear beta decays [21], we obtain jV us j ¼ 0.22538ð46Þ. The unitarity of the first row of the CKM matrix is satisfied at the per-mille level; e.g., taking the value of V us from the ratio of decay rates and jV ub j¼0.00413ð49Þ [20], we obtain jV ud j 2 þ jV us j 2 þ jV ub j 2 ¼ 0.99988ð46Þ. See Sec. VI for a more detailed discussion of our results and their implications.
The plan for the remainder of this paper is as follows. A discussion of the relation between the "full" QCD þ QED theory, including e.m. and strong IB effects, and isosymmetric QCD without electromagnetism is given in Sec. II. We discuss possible definitions of QCD in the full QCD þ QED theory, and in particular we define and advocate hadronic schemes as well as the GRS scheme which is conventionally used [19]. In Sec. III, we present the calculation of the relevant amplitudes using the RM123 approach. The renormalization of the bare lattice operators necessary to obtain the effective weak Hamiltonian in the W-regularization scheme is performed in Sec. IV, while the subtraction of the universal IR-divergent finite volume effects (FVEs) is described in Sec. V. The lattice data for the e.m. and strong IB corrections to the leptonic decay rates of pions and kaons are extrapolated to the physical pion mass, to the continuum and infinite volume limits in Sec. VI. Finally, Sec. VII contains our conclusions. There are four appendices. The lattice framework and details of the simulation are presented in Appendix A. Appendix B contains a detailed discussion of the relation between observables in the full theory and in QCD, expanding on the material in Sec. II. An expanded discussion of the renormalization of the effective weak Hamiltonian, including electromagnetic corrections, is presented in Appendix C, which contains a general discussion of the nonperturbative renormalization in the RI'-MOM scheme, and Appendix D in which issues specific to the twisted mass formulation are discussed.

II. DEFINING QCD IN THE FULL THEORY (QCD + QED)
Before presenting the detailed description of our calculation of leptonic decay rates, we believe that it is useful to discuss the relation between the "full" QCD þ QED theory, that includes explicit e.m. and strong isospin breaking effects, and QCD without electromagnetism (denoted in the following as the full theory and QCD, respectively).
The action of the full theory can be schematically written as Here g s is the strong coupling constant, S YM is a discretization of the gluon action, S A is the preferred discretization of the Maxwell action of the photon, S kin f is the kinetic term for the quark with flavor f, including the interaction with the gluon and photon fields, m f S m f ¼ m f P xqf ðxÞq f ðxÞ is the mass term, S kin l and S m l are, respectively, the kinetic and mass terms for the lepton l (for details, see Appendix B). For fermion actions which break chiral symmetry, such as the Wilson action, a counterterm is needed to remove the critical mass and m f S m f has to be replaced with m f S m f þ m cr f S cr f . A mass counterterm is in principle needed also in the case of the lepton, but at leading order in α em the lepton critical mass can be ignored.
At the level of precision to which we are currently working it is only the full theory, as defined in Eq. (4), which is expected to reproduce physical results and that is therefore unambiguous. Nevertheless, a frequently asked question is what is the difference between the results for a physical quantity computed in the full theory and in pure QCD, and how big are the strong isospin-breaking effects compared to the e.m. corrections. We particularly wish to underline that in order to properly formulate such questions it is necessary to carefully define what is meant by QCD. It is naturally to be expected that in QCD alone physical quantities will not be reproduced with a precision of better than Oðα em Þ ≃ 1% and this of course is the motivation for including QED. In order to define what is meant by QCD at this level of precision, it is necessary to state the conditions which are used to determine the quark masses and the lattice spacing. The separation of the full theory into QCD and the rest is therefore prescription dependent.
In Ref. [4], the subtle issue of a precise definition of QCD has been discussed by using the scheme originally proposed in Ref. [19], which we refer to as the GRS scheme and which has been widely used [2,4,8]. In the following and in Appendix B, we present an extended and detailed discussion by introducing the hadronic schemes. Indeed, in light of the fact that hadron masses can nowadays be computed very precisely, we strongly suggest using hadronic schemes in future lattice calculations of QED radiative corrections. At the end of this section, we discuss the connection with the GRS scheme that we had adopted at the time in which this calculation was started and that, for this reason, has been used in this work. A summary of the ideas discussed here has already been presented in Ref. [22].

A. Renormalization of the full theory
The main difference in the steps required to renormalize the full theory compared to the procedure in QCD is the presence of a massless photon and the corresponding finitevolume (FV) corrections which appear as inverse powers of L, where L is the spatial extent of the lattice and the volume V ¼ L 3 . By contrast, in QCD for leptonic and semileptonic decays, the FV corrections are exponentially small in the volume. In the discussion below, if necessary, we imagine that the chiral Ward identities have been imposed to determine the critical masses m cr f [23]. A possible strategy in principle is the following: (1) Fix the number of lattice points N, e.g., T ¼ 2aN and L ¼ aN, where T and L are the temporal and spatial extents of the lattice and the lattice spacing a will be determined later. (The specific choice T ¼ 2L is convenient for illustration but not necessary for the following argument.) (2) Using a four-flavor theory for illustration, we now need to determine the four physical bare quark masses, the bare electric charge, and the lattice spacing. To this end, we need to compute six quantities, e.g., the five dimensionless ratios 1 where M phys Ω ¼ 1.672 GeV is the physical value of the mass of the Ω baryon. For illustration, we are considering the masses of QCD þ QED stable pseudoscalar mesons in the numerators of the dimensionless ratios (5) and using M phys Ω to determine the lattice spacing, but of course other quantities can be used instead. For example, in the four-flavor theory that we are considering here one can in principle avoid potentially very noisy baryon observables by using one of the charmed mesons masses already considered above to set the scale. The choice of setting the scale with a charmed-meson observable could, however, generate significant cutoff effects and reduce the sensitivity to the charm mass. In Eqs. (5) and (6), we have used aN instead of L to highlight that the infinite-volume limit should be taken at fixed lattice spacing (see Eq. (7) below). The quantity m represents the vector of bare quark masses m ≡ fm u ; m d ; m s ; m c g. Note that in the RM123 strategy, since one works at first order in α em , it is not necessary to impose a renormalization condition to fix the e.m. coupling [4,8]. In this case, the electric charge can simply be fixed to the Thomson's limit, i.e., e ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ffi 4π=137.036 p , and R 5 becomes a predictable quantity. For the remainder of this section, we assume that we are working to Oðα em Þ and only consider the four ratios R i (i ¼ 1; 2; 3; 4) as well as R 0 when discussing the calibration of the lattices. Notice also that at first order in α em the π 0 cannot decay in two photons, so that it can also be used in the calibration procedure (see Sec. III below).
(3) Up to this point the procedure is the standard one used in QCD simulations. The difference here is in the FV effects which behave as inverse powers of L.
We therefore envisage extrapolating the ratios R i to the infinite-volume limit In practice, of course, depending on the specific choice of the ratios R i , this will require some extrapolations of results obtained at different values of the bare quark masses and electric charge. Note that with such a procedure the bare parameters and the lattice spacing a do not depend on the lattice volume. (6) At first order in isospin breaking, i.e., Oðα em ; m d − m u Þ, the renormalization of the lepton masses is performed perturbatively, by requiring that the onshell masses correspond to the physical ones. If one wishes to go beyond first order, when hadronic effects first enter, then the physical lepton masses should be added to the quantities used in the nonperturbative calibration. The bare lepton masses, together with the other parameters, should be chosen such that, in addition to satisfying the conditions in Eq. (5), the lepton-lepton correlators decay in time as e −m l t , where m l is the physical mass of the lepton l. In Eq. (7), we have taken the infinite-volume limit of the computed hadron masses. By working in the QED L finitevolume formulation of QED, if for each hadron H the FV corrections of order Oðe 2 =ðM H LÞ 3 ; e 4 Þ can be neglected, then the extrapolation to the infinite-volume limit can be avoided by making use of the formula [5,12]

B. Defining observables in QCD
The procedure discussed in Sec. II A provides a full framework with which to perform lattice simulations of QCD together with isospin-breaking effects including radiative corrections. Nevertheless, one may wish to ask how different are the results for some physical quantities in the full theory (QCD þ QED) and in QCD alone. We stress again that, under the assumption that isospin breaking effects are not negligible, QCD by itself is an unphysical theory and requires a definition. Different prescriptions are possible and, of course, lead to different results in QCD. In this section, we propose and advocate hadronic schemes, based on the nonperturbative evaluation of a set of hadronic masses in lattice simulations and contrast this with schemes based on equating the renormalized strong coupling and masses in some renormalization scheme and at a particular renormalization scale which have been used previously.
We recall that the QCD action is given by where the kinetic term only includes the gluon links and the subscripts 0 indicate that the bare coupling and masses are different from those in the full theory of Eq. (4). Indeed, the two theories have different dynamics that, in turn, generate a different pattern of ultraviolet divergences. The difference in the bare parameters of the two theories, for all schemes used to define QCD, can in fact be ascribed to the necessity of reabsorbing the different ultraviolet singularities. In what follows, we present two different approaches to making the choice of the parameters g 0 and m f;0 . Explicit details of the lattice action, discretized using the Wilson formulation for the fermions for illustration, are presented in Appendix B 1.

Defining observables in QCD: hadronic schemes
In hadronic schemes, we choose a value of g 0 and determine the bare quark masses m phys 0 and the lattice spacing a 0 imposing the same conditions as for the full theory for the ratios R 0;…;4 evaluated at vanishing electric charge, i.e., following steps 1-5 in Sec. II A without imposing any constraint on the ratio R 5 . We repeat that, for illustration we define the bare quark masses and lattice spacing using the five ratios R i , but other hadronic quantities could be used instead, both in the full theory and in QCD. These parameters differ by terms of order Oðα em Þ from those in the full theory. For this discussion, we make the natural and convenient choice g 0 ¼ g s . (In order to make the perturbative expansion in Eq. (B11), the difference g s − g 0 should be less than Oðα em Þ.) With this choice, the lattice spacings in QCD (a 0 ) and in the full theory (a) are therefore given by To illustrate the procedure, imagine that we wish to calculate an observable O of mass dimension 1, for example, the mass of a hadron which has not been used for the calibration. The generalization to other cases is straightforward and presented in Appendix B. At a fixed value of g s ¼ g 0 , we denote the best estimate of the observable O, which is the one obtained in the full theory, by O phys , and that obtained in QCD as defined above by O QCD , We define the difference of the two as being due to QED effects, δO QED ≡ O phys − O QCD . There are three contributions to δO QED : (1) The first contribution comes from the fact that the covariant derivatives in the kinetic terms in Eqs. (4) and (11) are different. This generates the diagrams in the correlation functions which contain the explicit exchange of virtual photons. (2) The second contribution comes from the fact that the bare quark masses appearing in Eqs. (4) and (11) are different. The corresponding quark-mass counterterms must therefore be inserted into the correlation functions used to determine O phys . We stress that the need to include quark-mass counterterms is generic and arises from the requirement that the conditions being used to determine the quark masses must be satisfied both in the full theory and in QCD (for the hadronic scheme being used for illustration we impose that the conditions in Eq. (8) are satisfied in both theories). (3) Finally, we must account for the difference in the lattice spacings δa ¼ a − a 0 in the full theory and QCD. Combining these contributions, we arrive at where we have combined the contributions to the correlation functions from the exchange of virtual photons and from the insertion of the mass counterterms into ha 0 δOi QCD . The detailed derivation of Eq. (14) is presented in Appendix B but some further comments may be helpful here. The first term on the righthand side is one that can be calculated within QCD alone. It has a well-defined continuum limit as does the sum of all the terms in Eq. (14). This term allows us to define what is the difference between QCD (defined as above) and the full theory in the hadronic scheme: An important feature of the RM123 approach which we follow in the numerical study presented below, is that the Oðα em Þ terms are computed explicitly and so we do not have to take the difference between numerical calculations performed in the full theory and in QCD. Each of the terms on the right-hand side of Eq. (14) is calculated directly. We now explain the procedure in some more detail by assuming that terms of order Oðα 2 em Þ are negligible (the extension to higher orders in α em is straightforward).
(1) Correlation functions corresponding to diagrams with the exchange of a virtual photon and to the insertion of the mass counterterms are already of Oðα em Þ and are calculated directly in QCD. The term proportional to the time separation in the correlation functions gives us the mass shift δM H i (i ¼ 1, 2, 3, 4) and δM Ω for the five masses (or mass differences) in the ratios R i (i ¼ 1; 2; 3; 4) in Eq. (5); (2) In the hadronic scheme being used for illustration, we impose the condition that the four ratios R i ¼ m H i =m Ω are the same in QCD and in the full theory. This corresponds to requiring that The QED contribution to the left-hand side is different from zero (and also ultraviolet divergent) and we require the terms proportional to the counterterms to cancel this contribution. We therefore (in principle) scan the values of the four mass counterterms δm f ¼ m f − m f;0 (f ¼ u, d, s, c) until the four conditions (15) are satisfied. Also, in this case no subtraction of results obtained in the full theory and in QCD is necessary.
(3) Finally, we determine the difference δa ≡ a − a 0 in the lattice spacing. Having determined the bare masses using item 2, we can calculate the shift in the Ω mass, δM Ω due to both QED and the mass counterterms, and use Eq. (12). Since aδM Ω is calculated directly, there is again no subtraction. We have devoted a considerable discussion to the definition of the isospin-breaking effects due to electromagnetism, δO QED . Having done this, the subsequent definition of the strong isospin breaking effects is straightforward. To do this however, we need to define the isosymmetric theory (labelled by "ISO") by imposing appropriate conditions to determine the bare quark masses and the lattice spacing. Since m u ¼ m d , in the N f ¼ 2 þ 1 þ 1 theory we need to determine only three quark masses and hence we only need three conditions, e.g., we can use the ratios R 1;2;3 in Eq. (5) to determine the physical bare quark masses. For the determination of the lattice spacing, we have two options. The simplest one is to work in a massindependent scheme and set the lattice spacing in the isosymmetric theory, a ISO 0 , equal to the one of QCD with m u ≠ m d , i.e., a ISO 0 ¼ a 0 . Notice that this choice is fully consistent with renormalization because the ultraviolet divergences of the theories that we are considering do not depend on the quark masses. Note however, that they do depend instead on the electric charge. The other option is that we set the lattice spacing in the isosymmetric theory by using R 0 in Eq. (9). The difference between the two options is due to cutoff effects that disappear once the continuum limit is taken consistently. The strong isospin breaking correction δO SIB to the observable O can now be defined by where is the value of the observable obtained in isosymmetric QCD. With these definitions, we have the natural relation O phys ¼ O ISO þ δO QED þ δO SIB . We underline however that δO SIB depends on the quantities used for calibration, both in four-flavor QCD and in isosymmetric QCD.

Defining QCD: the GRS scheme
A different prescription, called the GRS scheme, was proposed in Ref. [19] to relate the bare quark masses and bare coupling of QCD (m f;0 and g 0 ) to those in the full theory (m f and g s ). This prescription has been adopted in Refs. [2,4,8]. In the GRS approach, instead of determining the bare parameters of QCD by requiring that the chosen hadronic masses in QCD are equal to their physical values, one imposes that the renormalized parameters in a given short-distance scheme (e.g., the MS scheme) and at a given scale are equal in the full and QCD theories.
A consistent procedure is the following: (1) The full theory is renormalized by using a physical hadronic scheme as discussed in Sec. II A. This means that for each chosen value of g s we know the corresponding physical value of the bare electric charge e phys ðg s Þ and of the lattice spacing aðg s Þ. (2) The renormalization constants (RCs) of the strong coupling constant and of the quark masses are computed in a short-distance mass-independent scheme both in the full theory and in the theory at vanishing electric charge. (3) In order to set the bare parameters of QCD at a given value of the lattice spacing, we now chose a matching scale μ and impose that the renormalized strong coupling constant and the renormalized quark masses are the same as in the full theory. In practice, we might want to simulate QCD at the same values of the lattice spacing used in the full theory simulations. In this case, the matching conditions are whereˆindicates quantities in the QCD þ QED theory. Notice that quarks with the same electric charge have the same RC, e.g., Z m u ðe; g s ; μÞ ¼ Z m c ðe; g s ; μÞ, and that the quark mass RC at vanishing electric charge is flavor independent, Z m f ð0; g 0 ; μÞ ¼ Z m ðg 0 ; μÞ. (4) In order to define isosymmetric QCD by using this approach, the bare up-down quark mass is determined from Z m ðg 0 ; aðg s ÞμÞm ud;0 ðg 0 Þ ¼m u ðμÞ þm d ðμÞ 2 : Some remarks are in order at this point. The GRS scheme is a short-distance matching procedure that can also be used to match the theories at unphysical values of the renormalized electric charge and/or quark masses with the physical theory.
By following the procedure outlined above, one can perform lattice simulations of the full theory and of (isosymmetric) QCD at the same value of the lattice spacing but, consequently, at different values of the bare strong coupling constant. This is different from the strategy outlined in the previous subsection where, by using hadronic schemes, it was more natural to choose the same value of the bare strong coupling at the price of having two different lattice spacings. The absence of the lattice spacing counterterm (see Eq. (14) above) in the GRS scheme is compensated from the presence of the counterterm ð1=g 2 0 − 1=g 2 s ÞS YM originating from the difference of the bare strong coupling constants in the two theories.
A remark of some practical relevance concerns the possibility of implementing hadronically the GRS scheme. To this end, note that in the GRS scheme the dimensionless ratios R i will not be equal to the corresponding physical values and the difference can be parametrized as follows: where the ϵ GRS i are order Oðα em Þ and depend on the chosen matching scheme and also on the chosen matching scale. Once the ϵ GRS i (and hence the R QCD-GRS i ) are known, for example from a particularly accurate lattice simulation, then they can be used in other lattice computations. The bare quark masses are then determined by requiring that the R i in (isosymmetric) QCD reproduce R QCD-GRS i as given by Eq. (19), and, at this stage, the GRS scheme can be considered to be a hadronic one as it is defined in terms of nonperturbatively computed quantities (in this case meson masses). We stress however that this requires prior knowledge of the ϵ GRS i . Of course, other schemes are also possible. In general, the ϵ i provide a unifying language to discuss the different schemes for the definition of (isosymmetric) QCD in the presence of electromagnetism; in physical hadronic schemes the ϵ i ¼ 0 while in the GRS and other schemes they are of order Oðα em Þ. For later use, we make the simple observation that two schemes can be considered to be equivalent in practice if the ϵ i in the two schemes are equal within the precision of the computations.
Although the GRS scheme is perfectly legitimate, we advocate the use of physical hadronic schemes in future lattice calculations. For lattice simulations of physical quantities, a nonperturbative calibration of the lattice is necessary in general, but the renormalization required for the GRS conditions in Eq. (17) is not generally necessary (except perhaps for the determination of the renormalized coupling and quark masses themselves). Now that hadronic masses are calculated with excellent precision in lattice simulations and their values are well known from experimental measurements, it is natural to use hadronic schemes. By contrast, the renormalized couplings and masses are derived quantities which are not measured directly in experiments. In spite of this, as explained above, at the time that our computation was started we chose to use the GRS scheme. Of course, the physical results in the full theory do not depend on this choice.

III. EVALUATION OF THE AMPLITUDES
At first order in α em and ðm d − m u Þ=Λ QCD , the inclusive decay rate (1) can be written as where Γ QCD is the tree-level decay rate given by and M ð0Þ P and f ð0Þ P are the mass and decay constant of the charged P-meson mass defined in isosymmetric QCD in the chosen scheme.
The decay constant f ð0Þ P is defined in terms of the matrix element of the QCD axial current A ð0Þ P (in the continuum) as where the initial state meson P ð0Þ is at rest. The decay rate is obtained from the insertion of the lowest-order effective Hamiltonian as depicted in the Feynman diagram of Fig. 1, where the decay of a charged kaon is shown as an example. At lowest order in α em the two full dots in the figure represent the two currents in the bare four-fermion operator whereas at order α em they will denote the insertion of the renormalized operator in the W regularization as defined in Sec. IV. In order to compare our results for the e.m. and strong IB corrections to those obtained in Ref. [25] and adopted by the PDG [20,26] however, we will use a modified expression, where Γ ð0Þ is given by and M P is the physical mass of the charged P-meson including both e.m. and leading-order strong IB corrections.
The quantity δR P encodes both the e.m. and the strong IB leading-order corrections to the tree-level decay rate. Its value depends on the prescription used for the separation between the QED and QCD corrections, while the quantity is prescription independent [27] to all orders in both α em and ðm d − m u Þ.
The quantity F π may be used to set the lattice scale instead of the Ω baryon mass. The physical value F phys π can be obtained by taking the experimental pion decay rate Γðπ − → μ −ν μ ½γÞ ¼ 3.8408ð7Þ · 10 7 s −1 from the PDG [20] and the result for jV ud j ¼ 0.97420ð21Þ determined accurately from super-allowed β-decays in Ref. [21]. Consequently, one may replace M Ω with F π [as the denominator of the ratios R 1;…;4 in Eq. (5)], M π þ with M π 0 in the ratio R 1 (when working at leading order in α em ) and set the electron charge directly to its Thomson's limit (instead of using the ratio R 5 ), namely R 1 ðaN; g s ; e; mÞ ¼ aM π 0 aF π ðaN; g s ; e; mÞ; R 2 ðaN; g s ; e; mÞ ¼ aM K 0 aF π ðaN; g s ; e; mÞ; R 3 ðaN; g s ; e; mÞ ¼ aM D s aF π ðaN; g s ; e; mÞ; Note that for the present study we were unable to use M Ω to determine the lattice spacing because the corresponding baryon correlators were unavailable. The choice of using F π instead to set the scale clearly prevents us from being able to predict the value of jV ud j. This is one of the reasons why we advocate the use of hadronic schemes with hadron FIG. 1. Feynman diagram for the process K þ → l þ ν l . In the effective theory, the interaction is given by a local four-fermion operator denoted by the two full dots in the figure.
masses as experimental inputs for future lattice calculations. However, as already explained above, in this work we renormalize the QCD theory using the same set of hadronic inputs adopted in our quark-mass analysis in Ref. [28], since we started the present calculations using the RM123 method on previously generated isosymmetric QCD gauge configurations from ETMC (see Appendix A). The bare parameters of these QCD gauge ensembles were fixed in Ref. [28] by using the hadronic scheme corresponding to M  [20]. Note that in the absence of QED radiative corrections F π reduces to the conventional definition of the pion decay constant f ð0Þ π . The superscript FLAG has been used because the chosen values of three of the four hadronic inputs had been suggested in the previous editions of the FLAG review [3]. For this reason, we refer to the scheme defined from these inputs as the FLAG scheme.
We have calculated the same input parameters (28) used in the FLAG scheme also in the GRS scheme (corresponding to the MS scheme at μ (111) in Sec. VI below). Therefore, the values of the inputs determined in the GRS scheme differ at most by ∼0.15% from the corresponding values adopted in Ref. [28] for the isosymmetric QCD theory and the differences are at the level of our statistical precision. Thus, the result of our analysis of the scheme dependence can be summarized by the conclusion that the FLAG and GRS schemes can be considered to be equivalent at the current level of precision. Nevertheless, we have used the results of this analysis to estimate the systematic error on our final determinations of the isospin breaking corrections δR P induced by residual scheme uncertainties (see the discussion at the end of Sec. VI).
In light of this quantitative analysis, given the numerical equivalence of the two schemes at the current level of precision, in the rest of the paper we shall compare our results obtained in the GRS scheme with the results obtained by other groups using the FLAG scheme and we shall not use superscripts to distinguish between the two schemes.
The correction δR P , defined in Eq. (25), is given by (see Ref. [11]) where (i) the term containing logðM 2 Z =M 2 W Þ comes from the short-distance matching between the full theory (the Standard Model) and the effective theory in the W regularization [18]; (ii) the quantity δΓ ðptÞ P ðΔE γ Þ represents the Oðα em Þ correction to the tree-level decay rate for a pointlike meson [see Eq. (1)], which can be read off from Eq. (51) of Ref. [11]. The cut-off on the final-state photon's energy, ΔE γ , must be sufficiently small for the pointlike approximation to be valid; (iii) δA P is the e.m. and strong IB correction to the decay amplitude P → lν with the corresponding correction to the amplitude with a pointlike meson subtracted [this subtraction term is added back in the term δΓ ðptÞ P ðΔE γ Þ; see Eq. (1)]. (iv) δM P are the e.m. and strong IB corrections to the mass of the P meson. The correction proportional to 2δM P =M ð0Þ P is present because of the definition of f ð0Þ P in terms of the amplitude and of the meson mass in Eq. (22). Since we adopt the qQED approximation, which neglects the effects of the sea-quark electric charges, the calculation of δA P and δM P only requires the evaluation of the In Eq. (29), δA P and δM P contain both the e.m. and the strong IB leading-order corrections where δA W P is the e.m. correction from both the matching of the four-fermion lattice weak operator to the Wrenormalization scheme and from the mixing with several bare lattice four-fermion operators generated by the breaking of chiral symmetry with the twisted-mass fermion action which we are using. Both the matching and the mixing will be discussed and calculated in Sec. IV. As already pointed out, the renormalized operator, defined in the W-renormalization scheme, is inserted in the diagram of In Eqs. (30) and (31), the quantity δA SIB P (δM SIB P ) represents the strong IB corrections proportional to m d − m u and to the diagram of Fig. 4(b), while the other terms are QED corrections coming from the insertions of the e.m. current and tadpole operators, of the pseudoscalar and scalar densities (see Refs. [4,9]). The term δA J P (δM J P ) is generated by the diagrams of Figs. 2(a)-2(c), δA T P (δM T P ) by the diagrams of Figs. 2(d) and 2(e), δA P P (δM P P ) by the diagrams of Figs. 3(a) and 3(b), and δA S P (δM S P ) by the diagrams of Figs. 4(a) and 4(b). The term δA l P corresponds to the exchange of a photon between the quarks and the final-state lepton and arises from the diagrams in Figs. 5(a) and 5(b). The term δA l;self P corresponds to the contribution to the amplitude from the lepton's wave function renormalization; it arises from the self-energy diagram of Fig. 5(c). The contribution of this term cancels out in the difference Γ 0 ðLÞ − Γ pt 0 ðLÞ and could be therefore omitted, as explained in the following section. The different insertions of the scalar density encode the strong IB effects together with the counter terms necessary to fix the masses of the quarks. The insertion of the pseudoscalar density is peculiar to twisted mass quarks and would be absent in standard Wilson (improved) formulations of QCD.
In the following subsection, we discuss the calculation of all the diagrams that do not involve the photon attached to the charged lepton line. The determination of the contributions δA l P and δA l;self P will be described later in Sec. III B.

A. Quark-quark photon exchange diagrams and scalar and pseudoscalar insertions
The terms δA i P and δM i P (i ¼ J; T; P; S) can be extracted from the following correlators: Connected diagrams contributing at Oðα em Þ to the K þ → l þ ν l decay amplitude corresponding to the insertion of the pseudoscalar density related to the e.m. shift of the critical mass, δm crit f , determined in Ref. [8].
FIG. 4. Connected diagrams contributing at Oðα em Þ and Oðm d − m u Þ to the K þ → l þ ν l decay amplitude related to the insertion of the scalar density (see Ref. [8]).
where Δ em μν ðy 1 ; y 2 Þ is the photon propagator, J ρ W ðxÞ is the local version of the hadronic (V − A) weak current renormalized in QCD only, 3 and T em μ is the tadpole operator In Eqs. (32)- (35), ϕ † P ð⃗ x; −tÞ ¼ iq f 1 ð⃗ x; −tÞγ 5 q f 2 ð⃗ x; −tÞ is the interpolating field for a P meson composed by two valence quarks f 1 and f 2 with charges e 1 e and e 2 e. The Wilson r-parameters r f 1 and r f 2 are always chosen to be opposite r f 1 ¼ −r f 2 (see Appendix A). We have also chosen to place the weak current at the origin and to create the P meson at a negative time −t, where t and T − t are sufficiently large to suppress the contributions from heavier states and from the backward propagating P meson (this latter condition may be convenient but is not necessary). In Eq. (35), Z ð0Þ m is the mass RC in pure QCD, which for our maximally twisted-mass setup is given by P is the RC of the pseudoscalar density determined in Ref. [28]. The quantity Z f m is related to the e.m. correction to the mass RC, and can be written in the form where Z f QED is the pure QED contribution at leading order in α em , given in the MS scheme at a renormalization scale μ by [30,31]   Connected diagrams contributing at Oðα em Þ to the K þ → l þ ν l decay amplitude corresponding to photon exchanges involving the final-state lepton. 3 In our maximally twisted-mass setup, in which the Wilson r parameters r f 1 and r f 2 are always chosen to be opposite r f 1 ¼ −r f 2 (see Appendix A), the vector (axial) weak current in the physical basis renormalizes multiplicatively with the RC Z A (Z V ) of the axial (vector) current for Wilson-like fermions, i.e., Z A ¼ Z V (see Appendix D). 4 The use of the conserved e.m. current guarantees the absence of additional contact terms in the product j em μ ðy 1 Þj em ν ðy 2 Þ.
The quantity Z fact m is computed nonperturbatively in Sec. IV and represents the QCD corrections to the "naive factorization" approximation Z f m ¼ Z f QED (i.e., Z fact m ¼ 1) introduced in Refs. [8,32].
Analogously, the term ½δA P SIB and ½δM P SIB can be extracted from the correlator where, following the notation of Ref. [8], we indicate witĥ m f and m f the renormalized masses of the quark with flavor f in the full theory and in isosymmetric QCD only, respectively. We stress again that the separation between QCD and QED corrections is prescription dependent and in this work we adopt the GRS prescription of Refs. [2,4,8], wherê Thus, in Eq. (42), the only relevant quark mass difference ism d − m ud ¼ −ðm u − m ud Þ, whose value in the ðMS; 2 GeVÞ scheme was found to be equal to 1.19 (9) MeV [8] using as inputs the experimental values of the charged and neutral kaon masses. Following Ref. [4], we form the ratio of δC i P ðtÞ with the corresponding tree-level correlator and at large time distances t, we obtain (i ¼ J; T; P; S; QCD) where G ð0Þ P ≡ h0jϕ P ð0ÞjP ð0Þ i ð 46Þ is the coupling of the interpolating field of the P meson with its ground state in isosymmetric QCD. The term proportional to δM i P in the r.h.s. of Eq. (45) is related to the e.m. and strong IB corrections of the meson mass.
The function in the square brackets on the r.h.s. of Eq. (45) is an almost linear function of t. Thus, the correction to the P-meson mass, δM i P , can be extracted from the slope of the ratio δC i P ðtÞ=C ð0Þ P ðtÞ and the quantity δ½G i P A i P from its intercept. As explained in Ref. [11], in order to obtain the quantity δA i P the correction δG i P is separately determined by evaluating a correlator similar to those of Eqs. (32)- (35), in which the weak operator J ρ W p ρ P =M P is replaced by the P-meson interpolating field ϕ P .
For illustration, in Fig. 6, we show the ratios C i P for the charged kaon (P ¼ K) obtained from the ensemble D20.

B. Crossed diagrams and lepton self-energy
The evaluation of the diagrams 5(a) and 5(b), corresponding to the term δA l P in Eq. (30), can be obtained by studying the correlator [11] where S l ð0; x 2 Þ stands for the free twisted-mass propagator of the charged lepton. For the numerical analysis, we have found it to be convenient to saturate the Dirac indices by inserting on the r.h.s. of Eq. (47), the factor ½vðp l Þγ σ ð1 − γ 5 Þuðp ν Þ, which represents the lowest order "conjugate" leptonic (V − A) amplitude, and to sum over the lepton polarizations. In this way, we are able to study the time behavior of the single function δC l P ðtÞ. The corresponding correlator at lowest order (Oðα 0 em Þ) is In Eqs. (47) and (48), the contraction between the weak hadronic current J ρ W ð0Þ [see Eq. (36)] and its leptonic (V − A) counterpart gives rise to two terms corresponding to the product of either the temporal or spatial components of these two weak currents, which are odd and even under time reversal, respectively. Thus, on a lattice with finite time extension T, for t ≫ a and ðT − tÞ ≫ a, one has where s 0 ¼ −1, s 1;2;3 ¼ 1 and is the relevant leptonic trace evaluated on the lattice using for the charged lepton the free twisted-mass propagator and for the neutrino the free Wilson propagator in the P-meson rest frame [p σ P ¼ ðM P ; ⃗ 0Þ]. Similarly, for the lowest-order correlator, one has C lð0Þ P ðtÞ⟶ where A ð0Þ P is the renormalized axial amplitude evaluated on the lattice in isosymmetric QCD in the P-meson rest frame, namely The effect of the different signs of the backwardpropagating signal in Eq. (49) can be removed by introducing the following new correlators: where Thus, the quantity δA l P =A ð0Þ P can be extracted from the plateau of the ratio δC l P ðtÞ=C lð0Þ P ðtÞ at large time separations, viz Note that the diagrams in Figs. 5(a) and 5(b) do not contribute to the electromagnetic corrections to the masses of the mesons and therefore the ratio (55) has no slope in t in contrast to the ratios (45). Moreover, the explicit calculation of X l;j P on the lattice is not required. In terms of the lattice momenta ap l and ap l , defined as the energy-momentum dispersion relations for the charged lepton and the neutrino in the P-meson rest frame are given by The three momentum of the final-state lepton ⃗p l ( ⃗p ν ¼ − ⃗p l ) must be chosen to satisfy the equatioñ Thus, for any given simulated P-meson mass M ð0Þ P , the three-momentum ⃗p l ¼ j⃗p l jð1; 1; 1Þ is calculated from Eq. (60) and is injected on the lattice using nonperiodic boundary conditions [33,34] for the lepton field. A simple calculation yields In Fig. 7, we show the correlators C μð0Þ π ðtÞ, δC μ π ðtÞ, C μð0Þ π ðtÞ, and δC μ π ðtÞ for π μ2 decays, multiplied by the ground-state exponential. These were obtained on the gauge ensembles A40.24 and D30.48 of Appendix A. The subtraction of the backward signals, needed for extracting directly the quantity δA l P given by Eq. (54), is beneficial also for extending the time region from which δA l P (as well as the ratio δA l P =A ð0Þ P ) can be determined. The quality of the signal for the ratio δC μ P ðtÞ=C μð0Þ P ðtÞ is illustrated in Fig. 8 for charged kaon and pion decays into muons for the case of the ensembles B55.32 and D30. 48.
The calculation of the correction due to the diagram 5(c) is straightforward, since it is obtained by simply multiplying the lowest order amplitude, A ð0Þ P , by the one-loop lepton selfenergy evaluated on the lattice. FIG. 7. Time dependence of the correlators C μð0Þ π ðtÞ (left panels) and δC μ π ðtÞ (right panels) for π μ2 decays. These are given in lattice units and multiplied by the ground-state exponential and were obtained from gauge ensemble A40.24 (top panels) and D30.48 (bottom panels). The blue squares represent the correlators δC μ π ðtÞ andC μð0Þ π ðtÞ given by Eqs. (53) and (53). Errors are statistical only. For details of the simulations, see Appendix A.

IV. RENORMALIZATION OF THE EFFECTIVE HAMILTONIAN AND CHIRALITY MIXING
In this section, we provide the basic formalism to derive the e.m. corrections to the RCs nonperturbatively; further details of the calculation will be presented in a forthcoming publication [29]. This procedure relates the bare lattice operators to those in the RI′-MOM (and similar) renormalization schemes up to order Oðα em Þ and to all orders in α s . We also improve the precision of the matching of the weak operator O 1 [see Eq. (24)] renormalized in the RI'-MOM scheme to that in the W regularization by calculating the coefficient of the term proportional to α em α s logðM 2 W =μ 2 Þ in the matching coefficient. Using the two-loop anomalous dimension thus determined, we can evolve the operator to the renormalization scale of M W . Following this calculation, the error due to renormalization is reduced from order Oðα em α s ð1=aÞÞ to order Oðα em α s ðM W ÞÞ.
The effective Hamiltonian, including the perturbative electroweak matching with the Standard Model [18], can be written in the form where the term proportional to the logarithm has been already included in Eq. (29) and O W-reg 1 ðM W Þ is the operator renormalized in the W-regularization scheme, which is used to regularize the photon propagator. Since the W-boson mass is too large to be simulated on the lattice, a matching of the lattice weak operator O 1 to the W-regularization scheme is necessary. In addition, for lattice formulations which break chiral symmetry, like the one used in the present study, the lattice weak operator O 1 mixes with other four-fermion operators of different chirality.

A. The renormalized weak operator in the W-regularization scheme
In order to obtain the operator renormalized in the W-regularization scheme, we start by renormalizing the lattice four-fermion operator O 1 defined in Eq. (24) in the RI′-MOM scheme [35], obtaining O RI0 1 ðμÞ, and then perturbatively match the operator O RI0 1 ðμÞ to the one in the W regularization [11], The coefficient Z W-RI0 ðM W =μ; α s ðμÞ; α em Þ can be computed by first evolving the operator in the RI'scheme to the scale M W and then matching it to the corresponding operator in the W scheme. The coefficient can therefore be written as the product of a matching coefficient and an evolution operator Below we will only consider terms of first order in α em and, therefore we will consistently neglect the running of α em . We note that the original bare lattice operators and O W-reg 1 ðM W Þ are gauge invariant, and thus the corresponding matching coefficients are gauge invariant. This is not the case for O RI' 1 ðμÞ that instead depends not only on the external states chosen to define the renormalization conditions, but also on the gauge. Consequently, the matching coefficient Z W-RI' ð M W μ ; α s ðμÞ; α em Þ and the evolution operator U RI' ðM W ; μ; α em Þ are in general gauge dependent. However, at the order of perturbation theory to which we are working, the evolution operator turns out to be both scheme and gauge independent.
In the following, we discuss in turn the matching coefficient, Z W-RI' ð1; α s ðM W Þ; α em Þ, the evolution operator U RI' ðM W ; μ; α em Þ, and the definition of the renormalized operator O RI' 1 ðμÞ, which will be obtained nonperturbatively.
where the strong interaction corrections for the RI′-MOM operator vanish, at this order, because of the Ward identities of the quark vector and axial vector currents appearing in the operator O 1 in the massless limit. We recall that we currently do not include terms of Oðα s ðM W Þα em Þ in the matching coefficient Z W-RI' . (ii) The evolution operator. The evolution operator U RI 0 ðM W ; μ; α em Þ is the solution of the renormaliztion group equation where U RI' ðM W ; μ; α em Þ satisfies the initial condition U RI' ðM W ; M W ; α em Þ ¼ 1, γðα s ; α em Þ is, in general, the anomalous dimension matrix [36,37], although in our particular case it is actually a number (and not a matrix), and βðα s ; α em Þ is the QCD β function, where N f denotes the number of active flavors, and N u and N d denote the number of uplike and downlike active quarks, respectively, so that N f ¼ N u þ N d . We may expand γðα s ; α em Þ in powers of the couplings as follows: where γ ð1Þ se has been previously calculated in Ref. [38]. In the case of the operator O 1 , both γ It can be demonstrated that, in addition to the leading anomalous dimension γ ð0Þ e , γ ð1Þ se is also independent of the renormalization scheme; thus, in particular it is the same in RI'and in the W-regularization schemes. It is then straightforward to derive U RI' ðM W ; μ; α em Þ, Note that at this order the evolution operator is independent of the QCD β function. This is a consequence of the fact that the QCD anomalous dimension vanishes for the operator O 1 . Combining Eqs. (63)- (65) and (71), we obtain the relation between the operator O 1 in the Wregularization scheme and the one in the RI'scheme, which is valid at first order in α em and up to (and including) terms of Oðα em α s ðM W ÞÞ in the strong coupling constant. (iii) The renormalized operator in the RI'-MOM scheme.
When we include QCD and e.m. corrections at Oðα em Þ, the operator O 1 on the lattice with Wilson fermions mixes with a complete basis of operators with different chiralities. In addition to O 1 , the mixing involves the following operators: where Z QCD is the mixing matrix in pure QCD (corresponding to α em ¼ 0), and is the pure, perturbative QED mixing matrix (corresponding to α s ¼ 0). In Eq. (75), we have introduced the ratio so that, at first order in α em , Eq. (75) is written as The ratio R encodes all the nonperturbative contributions of order Oðα em α n s Þ with n ≥ 1, other than the factorizable terms given by the product Z QED Z QCD . In other words, if Z O were simply given by Z O ¼ Z QED Z QCD at first order in α em , then η would be zero. The case η ¼ 0 thus corresponds to the factorization approximation that was first introduced in Refs. [8,32].
In this work, the ratio R has been computed nonperturbatively on the lattice to all orders in α s and up to first order in α em . Introducing this ratio R in the nonperturbative calculation is useful since by using the same photon fields in the lattice calculation of Z O and Z QED , the statistical uncertainty due to the sampling of the photon field is significantly reduced. Note that the ratio is also free from cutoff effects of Oðα em a n Þ. The nonperturbative calculation of R, in terms of the matrix η, is described in Appendix C, and all the details and results will be presented in a forthcoming publication [29].
As already mentioned, pure QCD corrections in Eq. (78) only induce the mixing of the operator O 1 with the operator O 2 . This mixing produces the renormalized QCD operators which, similarly to the corresponding continuum operators, belong, respectively, to the (8,1) and (1,8) chiral representations with respect to a rotation of the quark fields [23].
Using the results of Ref. [11], obtained in perturbation theory at order Oðα 0 s Þ, we have determined the values for the matching and mixing coefficients, where ξ is the photon gauge parameter [ξ ¼ 0ð1Þ in the Feynman (Landau) gauge]. It is worth noting that the renormalized operator in the W-regularization scheme is gauge independent, at any order of perturbation theory. In particular, as shown by Eq. (82), at first order in α em and at zero order in α s , the gauge dependence of the matching coefficient of O χ 1 cancels in the sum C W-RI' þ ΔZ QED where X l;0 P is the leptonic trace defined in Eq. (61). We then note that O χ 1 and O χ 2 entering in Eq. (81) give opposite contributions to the tree-level amplitude, i.e., with A ð0Þ P given in Eq. (52). Therefore, after averaging the amplitude over the values of the parameterr ¼ AE1, in order to cancel out the contribution of the mixing with O 3 and O 4 , one obtains As already noted, the contribution δA W P of the matching factor at order α em to the decay amplitude, expressed by Eqs. (85) and (86), is gauge independent. It then follows that also the order α em contribution of the bare diagrams to the amplitude, expressed by the other terms in Eq. (30), is by itself gauge independent. Therefore, we can numerically evaluate the two contributions separately by making different choices for the gluon and the photon gauge in the two cases. 5 In particular, we have chosen to compute the matching factor Z W-reg of Eq. (86) in the Landau gauge for both gluons and photons, because this makes RI'equivalent to RI up to higher orders in the perturbative expansions. On the other hand, in the calculation of the physical amplitudes described in Sec. III (and already computed in Ref. [2]), we have used a stochastic photon generated in the Feynman gauge, which has been adopted also in the calculation of Γ pt 0 ðLÞ in Ref. [14]. As discussed in Ref. [11], when we compute the difference Γ 0 ðLÞ − Γ pt 0 ðLÞ in Eq. (1) at leading order in α em , the contribution from the lepton wave function RC cancels out provided, of course, it is evaluated in Γ 0 ðLÞ and Γ pt 0 ðLÞ in the same W-regularization scheme and in the same photon gauge. Since Γ pt 0 ðLÞ has been computed in Ref. [14] by omitting the lepton wave function RC contribution in the Feynman gauge, we have to subtract the analogous contribution from Eq. (86) in the Feynman gauge. The QCD and QED corrections to the lepton wave function RC at Oðα em Þ factorize, so that their contribution does not enter into the nonperturbative determination of the matrix η, which only contains, by its definition, nonfactorizable QCD þ QED contributions. Therefore, as discussed in Ref. [11], the subtraction of the lepton wave function RC only requires the replacement of Z W-reg in Eq. (86) by the subtracted matching factor where 5 It should be noted, however, that while Z W-reg of Eq. (86) is gauge independent at any order of perturbation theory, its actual numerical value may display a residual gauge dependence due to higher order terms in the nonperturbative determination of η 11 which are neglected in the perturbatively evaluated matching coefficient.
The final expression to be used in Eq. (30) is therefore To make contact with the factorization approximation introduced in Refs. [8,32], we rewrite Eq. (90) as where Z W-reg η¼0 is the result in the factorization approximation (i.e., with η ¼ 0), and Z fact is the factor correcting the result forZ W-reg to include the entries of the matrix η determined in Ref. [29], The values of the coefficients Z W-reg η¼0 and Z fact are collected in Table I for the three values of the inverse coupling β adopted in this work and for μ ¼ 1=a. In the same table, we also include the values of the coefficient Z fact m corresponding to the nonfactorizable e.m. corrections to the mass RC [see Eq. (40)], evaluated in Ref. [29]. The two methods M1 and M2 correspond to different treatments of the Oða 2 μ 2 Þ discretization effects and are described in Ref. [28]. The difference of the results obtained with these two methods enters into the systematic uncertainty labeled as ðÞ input in Sec. VI below. The results in Table I show that the nonfactorizable corrections are significant, of O(12%-25%) for Z W-reg and even larger, O(40%-60%), for Z m .
We close this section by noting that Eq. (29) implies that the contribution to δR P from the matching factor in Eq. (89) is 2Z W-reg . Such a term is mass independent. Thus, as already pointed out in Ref. [2], all the matching and mixing contributions to the axial amplitude in Eq. (30) cancel exactly in the difference between the corrections corresponding to two different channels, e.g., in δR K − δR π . A similar cancelation also occurs in the difference between the corrections to the amplitudes corresponding to the meson P decaying into two different final-state leptonic channels.

V. FINITE VOLUME EFFECTS AT ORDER Oðα em Þ
The subtraction Γ 0 ðLÞ − Γ pt 0 ðLÞ in Eq. (1) cancels both the IR divergences and the structure-independent FVEs, i.e., those of order Oð1=LÞ. The pointlike decay rate Γ pt 0 ðLÞ is given by with the coefficients b j (j ¼ IR; 0; 1; 2; 3) depending on the dimensionless ratio m l =M P and given explicitly in Eq. (98) of Ref. [14] (see also Ref. [39]) after the subtraction of the lepton self-energy contribution in the Feynman gauge. An important result of Ref. [14] is that the structure-dependent FVEs start at order Oð1=ðM P LÞ 2 Þ. Consequently, the coefficients b IR;0;1 in the factor Y l P ðLÞ are "universal," i.e., they are the same as in the full theory when the structure of the meson P is considered. 6 Equation (30) is therefore replaced by where δA W P is given by Eq. (89). Notice that the decay rate in the full theory, Γ 0 ðLÞ, can be affected also by nonuniversal FVEs of order O½1=ðM P LÞ n with n ≥ 4 that do not appear in Γ pt 0 ðLÞ.
In order to study the FVEs in detail, we consider four ensembles generated at the same values of β and quark masses, but differing in the size of the lattice; these are the ensembles A40.40, A40.32, A40.24, and A40.20 (see Appendix A). The residual FVEs after the subtraction of the universal terms as in Eq. (96) are illustrated in the plots in Fig. 9 for δR π and δR K in the fully inclusive case, i.e., where the energy of the final-state photon is integrated over the full phase space. In this case, , which corresponds to ΔE max;K γ ≃ 235 MeV and ΔE max;π γ ≃ 29 MeV, respectively. With a muon as the final state lepton, the contribution from photons with energy greater than about 20 MeV is negligible and hence the pointlike approximation is valid. In the top plot, the universal FV corrections have been subtracted and so we would expect the remaining effects to be of order Oð1=ðM P LÞ 2 Þ and this is indeed what we see.
In the bottom plot of Fig. 9, in addition to subtracting the universal FVEs, we also subtract the contribution to the order Oð1=ðM P LÞ 2 Þ corrections from the pointlike contribution to b 2 , which can be found in Eq. (3.2) of Ref. [39]. We observe that this additional subtraction does not reduce the Oð1=ðM P LÞ 2 Þ effects, underlining the expectation that these effects are indeed structure dependent.
It can be seen that after subtraction of the universal terms the residual structure-dependent FVEs are almost linear in 1=L 2 , which implies that the FVEs of order Oð1=ðM P LÞ 3 Þ are quite small; indeed they are too small to be resolved with the present statistics. Nevertheless, since the QED L formulation of QED on a finite box, which is adopted in this work, violates locality [13], we may expect that there are also FVEs of order Oða 3 =L 3 Þ [39]. We have checked explicitly that the addition of such a term in fitting the results shown in Fig. 9 changes the extrapolated value at infinite volume well within the statistical errors.  9. Results for the corrections δR π and δR K for the gauge ensembles A40.20, A40.24, A40.32, and A40.40 sharing the same lattice spacing, pion, kaon, and muon masses, but with different lattice sizes (see Table II). Top panel (a): the universal FVEs, i.e., the terms up to order Oð1=M P LÞ in Eq. (95), are subtracted for each quantity. Bottom panel (b): the same as in (a), but in addition to the subtraction of the universal terms, b pt 2 =ðM P LÞ 2 , where b pt 2 is the pointlike contribution to b 2 in Eq. (95), is also removed. The solid and dashed lines are linear fits in 1=L 2 . The maximum photon energy ΔE γ corresponds to the fully inclusive case ΔE γ ¼ ΔE max;P A more detailed description of the full analysis, including the continuum and chiral extrapolations, is given in the following section. As far as the FVEs are concerned, the central value is obtained by subtracting the universal terms and fitting the residual Oð1=L 2 Þ corrections to where K P and K l P are constant fitting parameters and E l P is the energy of the charged lepton in the rest frame of the pseudoscalar P [see Eq. (98) below]. Such an ansatz is introduced to model the unknown dependence of b 2 on the ratio m l =M P . For the four points in each of the plots of Fig. 9, m l =M P takes the same value, but this is not true for all the ensembles used in the analysis. We estimate the uncertainty due to the use of the ansatz in Eq. (97) by repeating the same analysis, but on the data in which, in addition to subtracting the universal terms in Eq. (95), we also subtract the term b pt 2 =ðM P LÞ 2 , where b pt 2 is contribution to b 2 from a pointlike meson [39]. Since b pt 2 depends on m l =M P , the result obtained with this additional subtraction is a little different from that obtained with only the universal terms removed and we take the difference as an estimate of the residual FV uncertainty.

VI. RESULTS FOR CHARGED PION AND KAON DECAYS INTO MUONS
We now insert the various ingredients described in the previous sections into the master formula in Eq. (29) for the decays π þ → μ þ ν½γ and K þ → μ þ ν½γ.
The results for the corrections δR π and δR K are shown in Fig. 10, where the "universal" FSEs up to order Oð1=LÞ have been subtracted from the lattice data (see the empty symbols) and all photon energies [i.e., ΔE γ ¼ ΔE max;P ] are included, since the experimental data on π l2 and K l2 decays are fully inclusive. As already pointed out in Sec. I, structure-dependent contributions to real photon emission should be included. According to the ChPT predictions of Ref. [17], however, these contributions are negligible in for both kaon and pion decays into muons, while the same does not hold as well for decays into finalstate electrons (see Ref. [11]). This important conclusion needs to be explicitly validated by an ongoing dedicated lattice study of the real photon emission amplitudes in light and heavy P-meson leptonic decays.
The combined chiral, continuum, and infinite-volume extrapolations are performed using the following SU(2)inspired fitting function: where m ud ¼ μ ud =Z P and μ ud is the bare (twisted) mass (see Table II in Appendix A below), E l P is the lepton energy in the P-meson rest frame, R ð0Þ;ð1Þ;ð2Þ P , D P , K P and K l P are free parameters. In Eq. (98), the chiral coefficient R ðχÞ P is known for both pion and kaon decays from Ref. [40]; in QED the coefficients are while in qQED they are where X is obtained from the chiral limit of the Oðα em Þ correction to M 2 π AE [i.e., δM 2 π AE ¼ 4πα em Xf 2 0 þ Oðm ud Þ. In Ref. [8], we found X ¼ 0.658ð40Þ.
Using Eq. (98), we have fitted the data for δR π and δR K using a χ 2 -minimization procedure with an uncorrelated χ 2 , obtaining values of χ 2 =d:o:f: always around 0.9. The uncertainties on the fitting parameters do not depend on the χ 2 value, because they are obtained using the bootstrap samplings of Ref. [28] (see Appendix A). This guarantees that all the correlations among the data points and among the fitting parameters are properly taken into account.
The quality of our fits is illustrated in Fig. 10. It can be seen that the residual SD FVEs are still visible in the data and well reproduced by our fitting ansatz in Eq. (98). Discretization effects, on the other hand, only play a minor role.
At the physical pion mass in the continuum and infinitevolume limits, we obtain δR phys π ¼ þ0.0153ð16Þ statþfit ð4Þ input ð3Þ chiral × ð6Þ FVE ð2Þ disc ð6Þ qQED ¼ þ0.0153ð19Þ; ð101Þ where (i) ðÞ statþfit indicates the uncertainty induced by the statistical Monte Carlo errors of the simulations and its propagation in the fitting procedure. (ii) ðÞ input is the error coming from the uncertainties of the input parameters of the quark-mass analysis of Ref. [28]. (iii) ðÞ chiral is the difference between including and excluding the chiral logarithm in Eq. (98), i.e., taking R χ ≠ 0 or R χ ¼ 0.
(iv) ðÞ FVE is the difference between the analyses of the data corresponding to the FVE subtractions up to the order Oð1=LÞ alone or by also subtracting the term proportional to b pt 2 =ðM P LÞ 2 (see Fig. 9 and the discussion toward the end of Sec. V).
(v) ðÞ disc is the uncertainty coming from including (D ≠ 0) or excluding (setting D ¼ 0) the discretization term proportional to a 2 in Eq. (98). (vi) ðÞ qQED is our estimate of the uncertainty of the QED quenching. This is obtained using the ansatz (98) with the coefficient R χ of the chiral log fixed either at the value (100), which corresponds to the qQED approximation, or at the value (99), which includes the effects of the up, down, and strange sea-quark charges [40]. The change both in δR phys π and in δR phys K is ≃0.0003, which has been already added in the central values given by Eqs. (101) and (102). To be conservative, we use twice this value for our estimate of the qQED uncertainty. Our results in Eqs. (101) and (102) can be compared with the ChPT predictions δR phys π ¼ 0.0176ð21Þ and δR phys K ¼ 0.0064ð24Þ obtained in Ref. [25] and adopted by the PDG [20,26]. The difference is within 1 standard deviation for δR phys π , while it is larger for δR phys K . Note that the precision MeV [28]. The blue dotted lines correspond to the values δR phys π ¼ 0.0176ð21Þ and δR phys K ¼ 0.0064ð24Þ, obtained using ChPT [25] and adopted by the PDG [26]. of our determination of δR phys π is comparable to the one obtained in ChPT, while our determination of δR phys K has a much better accuracy compared to that obtained using ChPT; the improvement in precision is a factor of about 2.2. We stress that the level of precision of our pion and kaon results depends crucially on the nonperturbative determination of the chirality mixing, carried out in Sec. IV by including simultaneously QED at first order and QCD at all orders.
As already stressed, the correction δR P and the QCD quantity f ð0Þ P separately depend on the prescription used for the separation between QED and QCD corrections [27].
Only the product f ð0Þ P ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ffi 1 þ δR P p is independent of the prescription and its value, multiplied by the relevant CKM matrix element, yields the P-meson decay rate. We remind the reader that our results (101) and (102) are given in the GRS prescription (see the dedicated discussion in Secs. II B and III) in which the renormalized couplings and quark masses in the full theory and in isosymmetric QCD coincide in the MS scheme at a scale of 2 GeV [19]. We remind the reader that, to the current level of precision, this GRS scheme can be considered equivalent to the FLAG scheme.
Taking the experimental values Γðπ − → μ −ν μ ½γÞ ¼ 3.8408ð7Þ × 10 7 s −1 and ΓðK − → μ −ν μ ½γÞ ¼ 5.134ð11Þ × 10 7 s −1 from the PDG [20] and using our results (101)  where the first error is the experimental uncertainty and the second is that from our theoretical calculations. The result for the pion in Eq. (103) agrees within the errors with the updated value f ð0Þ π jV ud j ¼ 127.12ð13Þ MeV [20], obtained by the PDG and based on the model-dependent ChPT estimate of the e.m. corrections from Ref. [25]. Our result for the kaon in Eq. (104), however, is larger than the corresponding PDG value f ð0Þ K jV us j ¼ 35.09ð5Þ MeV [20], based on the ChPT calculation of Ref. [25], by about 2 standard deviations.
As anticipated in Sec. I and discussed in detail in Sec. III, we cannot use the result (103) to determine the CKM matrix element jV ud j, since the pion decay constant was used by ETMC [28] to set the lattice scale in isosymmetric QCD and its value, f ð0Þ π ¼ 130.41ð20Þ MeV, was based on the determination of jV ud j obtained from super-allowed β decays in Ref. [42]. On the other hand, adopting the best lattice determination of the QCD kaon decay constant, which is a result with the excellent precision of ≃0.2%.
Since the nonfactorizable e.m. corrections to the mass RC (see the coefficient Z fact m in Table I) were not included in Ref. [2], we update our estimate of the ratio of the kaon and pion decay rates, Using the pion and kaon experimental decay rates, we get Using the best N f ¼ 2 þ 1 þ 1 lattice determination of the ratio of the QCD kaon and pion decay constants, f Taking the updated value jV ud j ¼ 0.97420ð21Þ from superallowed nuclear beta decays [21], Eq. (108) yields the following value for the CKM element jV us j: which agrees with our result (105) within the errors. Note that our result (109) agrees with the latest estimate jV us j ¼ 0.2253ð7Þ, recently updated by the PDG [20], but it improves the error by a factor of approximately 1.5.
Taking the values jV ub j ¼ 0.00413ð49Þ [20] and jV ud j ¼ 0.97420ð21Þ [21], our result in Eq. (109) implies that the unitarity of the first row of the CKM matrix is confirmed to better than the per-mille level, jV ud j 2 þ jV us j 2 þ jV ub j 2 ¼ 0.99988ð46Þ: ð110Þ With the same value jV ud j ¼ 0.97420ð21Þ from superallowed nuclear beta decays [21], our result (103) implies for the QCD pion decay constant (in the GRS prescription) the following value: 7 The average value of f K AE quoted by FLAG [3] includes the strong IB corrections. In order to obtain f ð0Þ K therefore, we have subtracted this correction which is given explicitly in Refs. [43][44][45].
which, as anticipated in Sec. III, agrees within the errors with the value f ð0Þ π ¼ 130.41ð20Þ MeV adopted in Ref. [28] to set the lattice scale in the isosymmetric QCD theory. This demonstrates the equivalence of the GRS and PDG schemes within the precision of our simulation.
In a recent paper [46], the hadronic contribution to the electroweak radiative corrections to neutron and superallowed nuclear β decays has been analyzed in terms of dispersion relations and neutrino scattering data. With respect to the result V ud ¼ 0.97420ð21Þ from Ref. [21], a significant shift in the central value and a reduction of the uncertainty have been obtained, namely V ud ¼ 0.97370ð14Þ [46]. The impact of the new value of V ud on our determinations of V us and f ð0Þ π is V us ¼ 0.22526ð46Þ and f ð0Þ π ¼ 130.72ð12Þ MeV, i.e., well within the uncertainties shown in Eqs. (109) and (111), respectively. On the contrary, the first-row CKM unitarity (110) will be significantly modified into jV ud j 2 þ jV us j 2 þ jV ub j 2 ¼ 0.99885ð34Þ; ð112Þ which would imply a ≃3.4σ tension with unitarity. A confirmation of the new calculation of the radiative corrections made in Ref. [46] is therefore urgently called for.
Before closing this section, we comment briefly about the comparison between our result δR phys K ¼ 0.0024ð10Þ and the corresponding model-dependent ChPT prediction δR phys K ¼ 0.0064ð24Þ from Ref. [25]. The latter is obtained by adding a model-dependent QED correction of 0.0107(21) and a model-independent next-to-leading strong IB contribution equal to −0.0043ð12Þ. Our result on the other hand, obtained in the GRS prescription, stems from a QED correction equal to 0.0088(9) and a strong IB term equal to −0.0064ð7Þ (see also Ref. [47]). The difference between our result and the ChPT prediction of Ref. [25] appears to be mainly due to a different strong IB contribution. Thus, in the present N f ¼ 2 þ 1 þ 1 study, we confirm for the strong IB term a discrepancy at the level of about 2 standard deviations, which was already observed at N f ¼ 2 in Ref. [4].

VII. CONCLUSIONS
In this paper, we have presented the details of the first lattice computation of the leading e.m. and strong IB corrections to the π þ → μ þ ν and K þ → μ þ ν leptonic decay rates, following a method recently proposed in Ref. [11]. This expands significantly on the discussion of Ref. [2], where the results and a brief outline of the calculation had been presented. The results were obtained using the gauge ensembles produced by the European Twisted Mass Collaboration with N f ¼ 2 þ 1 þ 1 dynamical quarks.
Systematics effects are evaluated and the impact of the quenched QED approximation is estimated.
The effective weak Hamiltonian in the W-regularization scheme appropriate for this calculation is obtained from the bare lattice operators in two stages. First of all, the lattice operators are renormalized nonperturbatively in the RI ' -MOM scheme at Oðα em Þ and to all orders in the strong coupling α s . Because of the breaking of chiral symmetry in the twisted mass formulation, we have adopted this renormalization which includes the mixing with other four-fermion operators of different chirality. In the second step, we perform the matching from the RI ' -MOM scheme to the W-regularization scheme perturbatively. By calculating and including the two-loop anomalous dimension at Oðα em α s Þ [38], the residual truncation error of this matching is of Oðα em α s ðM W ÞÞ, reduced from Oðα em α s ð1=aÞÞ in our earlier work [2,11].
The evaluation of isospin breaking (IB) "corrections" raises the question of how QCD without these corrections is defined. Since IB corrections change hadronic masses and other physical quantities, a prescription is needed to define QCD, whether isosymmetric or not, and in Sec. II and Appendix B we discuss this issue in detail. In particular, the correction δR P and the QCD quantity f ð0Þ P separately depend on the prescription used for the definition of QCD [27]. Only the product f ð0Þ P ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ffi 1 þ δR P p is independent of the prescription and its value, multiplied by the relevant CKM matrix element, yields the P-meson decay rate. In this paper, we chose to follow the conventionally used GRS prescription (see the dedicated discussion in Secs. II B and III) in which the renormalized couplings and quark masses in the full QCD þ QED theory and in isosymmetric QCD coincide in the MS scheme at a scale of 2 GeV [19]. For future studies, however, we advocate the use of "hadronic schemes" in which QCD is defined by requiring that a set of hadronic quantities (for example, a set of hadronic masses) take their physical values in QCD and in QCD þ QED.
The main results of the calculation are presented in Sec. VI together with a detailed discussion of their implications. In summary, after extrapolation of the data to the physical pion mass, and to the continuum and infinite-volume limits, the isospin-breaking corrections to the leptonic decay rates can be written in the form, where Γ ð0Þ is the leptonic decay rate at tree level in the GRS scheme [see Eqs. (101) and (102)]. These results can be compared with the ChPT predictions δR phys π ¼ 0.0176ð21Þ and δR phys K ¼ 0.0064ð24Þ obtained in Ref. [25] and adopted by the PDG [20,26]. The difference is within 1 standard deviation for δR phys π , while it is larger for δR phys K . We also underline that our result jV us j ¼ 0.22538ð46Þ in Eq. (109), together with the value of V ud determined in Ref. [21] and jV ub j from the PDG [20], implies that the unitarity of the first row of the CKM matrix is satisfied at the per-mille level [see Eq. (110)].

APPENDIX A: DETAILS OF THE SIMULATION
The gauge ensembles used in this work are those generated by ETMC with N f ¼ 2 þ 1 þ 1 dynamical quarks and used in Ref. [28] to determine the up, down, strange, and charm quark masses. We use the Iwasaki action [48] for the gluons and the Wilson Twisted Mass Action [41,49,50] for the sea quarks. In the valence sector, we adopt a nonunitary setup [51] in which the strange quark is regularized as an Osterwalder-Seiler fermion [52], while the up and down quarks have the same action as the sea. Working at maximal twist such a setup guarantees an automatic OðaÞ improvement [50,51].
We have performed simulations at three values of the inverse bare lattice coupling β and at several different lattice volumes as shown in Table II. We allow a separation of 20 trajectories between each of the N cfg analyzed configurations. For the earlier investigation of finite-volume effects (FVEs), ETMC had produced three dedicated ensembles, A40.20, A40.24, and A40.32, which share the same quark masses and lattice spacing and differ only in the lattice size L. To improve such an investigation, which is crucial in the present work, we have generated a further gauge ensemble, A40.40, at a larger value of the lattice size L.
At each lattice spacing, different values of the light seaquark masses have been considered. The light valence and sea quark masses are always taken to be degenerate. The bare mass of the valence strange quark (aμ s ) is obtained, at each β, using the physical strange mass and the mass RCs determined in Ref. [28]. There the "FLAG" hadronic scheme was adopted in which the pion and kaon masses in isosymmetric QCD are equal to M  TABLE II. Values of the valence and sea bare quark masses (in lattice units), of the pion and kaon masses for the N f ¼ 2 þ 1 þ 1 ETMC gauge ensembles used in Ref. [28] and for the gauge ensemble, A40.40 added to improve the investigation of FVEs. A separation of 20 trajectories between each of the N cfg analyzed configurations. The bare twisted masses μ σ and μ δ describe the strange and charm sea doublet as in to Ref. [41]. The values of the strange quark bare mass aμ s , given for each β, correspond to the physical strange quark mass m phys s ðMS; 2 GeVÞ ¼ 99.6ð4.3Þ MeV and to the mass RCs determined in Ref. [28]. The central values and errors of pion and kaon masses are evaluated using the bootstrap procedure of Ref. [28].

Relation between observables in the full theory and in QCD
Physical observables are determined from correlation functions evaluated from lattice computations in the full theory. For a generic observable O evaluated in the full theory up to Oðα em Þ, we write where in the integrand O is a multilocal composite operator. For a given choice of the strong coupling g s , the parameters of the action, the bare quark masses, and the lattice spacing are determined by imposing that a set of physical quantities take their experimental values as explained in Sec. II A. Physical quantities other than those used for the calibration can now be determined unambiguously up to lattice artefacts, which are removed by taking the continuum limit.
In general, the determination of physical observables requires the processing of correlation functions of the form of Eq. (B11). Hadronic masses, for example, are obtained from the behavior in the time separation of two interpolating operators and the determination of hadronic matrix elements may require the cancelation of interpolating operators at the source and/or sink using a combination of three-and two-point functions. The discussion in this appendix concerns the evaluation of a generic correlation function.
We now turn to the definition of correlation functions in QCD defined in a generic scheme. For a generic observable O, we define its value in QCD by where the bare quark masses and the lattice spacing are defined as discussed in Sec. II B. The free QED action is included in the numerator and denominator of Eq. (B12) since even without radiative corrections the physical quantities such as ΓðK l2 Þ and Γðπ l2 Þ studied in this paper are obtained by combining the results for hadronic matrix elements obtained from QCD simulations with leptonic spinors. Moreover, for other quantities, for example, the long-distance contributions to the amplitude for the rare kaon decay K þ → π þ νν, there are internal free lepton propagators even in the absence of isospin breaking [53].
Comparing Eqs. (B11) and (B12), we arrive at where the subscript "conn" reminds that only connected Feynman diagrams contribute: There is one final subtlety which we must account for. We need to convert the results obtained from simulations in lattice units (i.e., in units of the lattice spacing) into values given in physical units such as MeV. Equation (B13) is also written in lattice units. Imagine that the observable O has mass dimension n and rewrite Eq. (B13) with the lattice spacing included explicitly, where, since we are working to first order in isospin breaking, in the second term on the right-hand side we do not need to distinguish between the lattice spacing in the full theory (a) and that obtained in QCD (a 0 ). The quantity which we wish to determine, hOi full in physical units, is therefore given by hOi full ¼ ha n 0 Oi QCD a n 0 þ ha n 0 δOi QCD a n 0 − nδa a nþ1 0 ha n 0 Oi QCD ; where δa ¼ a − a 0 . The three expectation values on the right-hand side of (B15) are directly computed in QCD simulations.

APPENDIX C: NONPERTURBATIVE RENORMALIZATION IN THE RI'-MOM SCHEME
In this paper, as explained in Sec. IV, we have renormalized the weak four-fermion operator O 1 nonperturbatively on the lattice to all orders in α s and up to first order in α em . In this appendix, we describe the main steps of the nonperturbative renormalization procedure at Oðα em Þ and we refer the reader to a forthcoming publication [29] for further details and results.
Given the amputated Green function, Λ O , of an operator O computed in a given gauge between external states with momentum p and a suitable projector on the relevant Dirac structure, P O , we define the projected Green function as In the RI'-MOM scheme, the renormalization constant (RC) Z O ðμaÞ is found by imposing the condition [35] Z The where Z QCD O and Z QED O are the RCs of the operator O in pure QCD and pure QED, respectively, and we have put The first term, ΔZ QED O , in Eq. (C5) represents the pure QED contribution to the RC at Oðα em Þ, whereas η O contains the Oðα em Þ nonfactorizable QCD þ QED correction.
In terms of the QCD renormalized operators O χ , as those in Eq. (79), we define the QCD renormalized projected Green function Γ χ O and expand it at first order in α em , where we have used the RI'-MOM renormalization condition Z QCD Γ O ðμaÞΓ QCD O ðμaÞ ¼ 1 applied in the pure QCD theory and defined Using Eqs. (C4) and (C6), we can rewrite Eq. (C2) at first order in α em as which provides, in turn, the RI-MOM renormalization condition at order α em , Using the expression of Z Γ O in Eq. (C3) in terms of Z O and the external fields RCs, one also obtains Thus, ΔZ O is expressed directly in terms of the Oðα em Þ contribution to the QCD renormalized projected Green function ΔΓ χ O ¼ Z QCD Γ O ΔΓ O evaluated at p 2 ¼ μ 2 . In the following, we describe a completely nonperturbative determination of the RCs ΔZ O ðμaÞ to all orders in α s . We will assume that all the relevant RCs of fields and composite operators in pure QCD have been already determined, by following the standard RI'-MOM renormalization procedure. With appropriate modifications to the kinematical conditions and projectors, the discussion can readily be adapted to similar schemes, such as the symmetric momentum subtraction one [54].
In addition to the renormalization of the four-fermion operator appearing in the Hamiltonian, the e.m. shift of the quark masses (see Sec. III A) requires the knowledge of the RC of the pseudoscalar density [4]. We therefore start by discussing the nonperturbative renormalization of quark bilinear operators.

Renormalization of the quark field and bilinear operators
We start with the renormalization of the quark fields. The e.m. corrections to a quark propagator can be represented schematically in the form ðC11Þ where the last two diagrams represent the mass and critical Wilson parameter counterterms [4]. The amputated one-particle irreducible two-point function is then given by and the correction to the quark field RC in the RI'-MOM scheme is obtained, according to Eq. (C9), as The e.m. correction to the RC Z O of a generic bilinear operator O Γ ¼q 2 Γq 1 , where Γ is one of the Dirac matrices (Γ ¼ 1; γ 5 ; γ μ ; γ μ γ 5 ; σ μν ), is given by Eq. (C10), which in this case reads Two kinds of corrections contribute to the amputated Green function: either the QCD Green function is amputated with the e.m. corrections on the inverse propagators, or the correction to the Green function itself is amputated with QCD propagators. Thus, we have with where G O is the nonamputated Green function and ΔG O is given diagrammatically by In this work, we have used an improved method to compute the first diagram in Eq. (C17), as well as all the diagrams containing a photon propagator connecting different points. In this method, some of the sequential propagators introduced in Ref. [8] are summed in order to reduce the number of inversions of the Dirac matrix. All details of the calculation will be given in the forthcoming publication [29].
Before closing this subsection, we stress that in the calculation of Z P and its e.m. correction ΔZ P , the Goldstone pole contamination has been taken into account and subtracted. In pure QCD, at each p 2 and for each combination of valence quark masses, μ 1 and μ 2 , the amputated Green function has been fitted to the ansatz where M P ≡ M P ðμ 1 ; μ 2 Þ is the mass of the pseudoscalar meson composed of valence quarks of mass μ 1 and μ 2 .
When including QED in the calculation, Eq. (C18) has to be modified to take into account the e.m. correction to the meson mass. By considering the ansatz in Eq. (C18) in QCD þ QED and expanding it in terms of α em , one finds where ΔM 2 P is the correction to M 2 P evaluated in Ref. [8]. Note, in particular, that ΔΓ P also receives the contribution of a double pole. In Eq. (C19), only the coefficients A 1 , B 1 , and C 1 need to be fitted, since the values of B 0 and C 0 are already obtained from the QCD fit in Eq. (C18).

Renormalization of the four-fermions operators
We conclude this section by describing the calculation of the RCs of the complete basis of four-fermion operators O i (i ¼ 1; …; 5), in the RI'-MOM scheme. In this case, the renormalization condition (C10) for the renormalization matrix at Oðα em Þ reads where ΔZ l is only e.m. and can be computed in perturbation theory. We remind the reader that this term is omitted in the actual calculation since its contribution cancels out in the difference Γ 0 ðLÞ − Γ pt 0 ðLÞ.
In Eq. (C20), ΔΓ χ O is a matrix expressed by As in the case of bilinear operators, the correction to the amputated Green function gets two kind of contributions, and in this case ΔG O i is given by The fermionic lines on the left-hand side of the diagrams in Eq. (C23) represent the ingoing and outgoing light quarks. On the right-hand side, the external charged antilepton and the neutrino propagators are drawn for illustration but not actually included in the calculation. For this reason, their amputation is neglected in Eq. (C22). The lepton self-energy is not reported in Eq. (C23) since its contribution cancels out in the amputation.

APPENDIX D: MATCHING, CHIRALITY MIXING, AND FERMION OPERATORS IN THE TWISTED MASS REGULARIZATION
In the main text and in Appendix C, we have described the renormalization of the relevant operators in the physical basis. This discussion is valid for a generic Wilson-like fermion regularization. In this appendix, we address instead some important aspects peculiar to the twisted mass fermions used in our numerical calculation. We derive, in particular, the relations between RCs in the so-called physical and twisted basis, for the bilinear and four-fermion operators considered in this work.
The relevant observation is that the lattice action for twisted mass fermions at maximal twist in the twisted basis only differs from the standard Wilson fermion lattice action for the twisted rotation of the fermion mass term. The two actions become identical in the chiral limit. It then follows that, in any mass-independent renormalization scheme, the