Lattice computation of the electromagnetic contributions to kaon and pion masses

We present a lattice calculation of the electromagnetic (EM) effects on the masses of light pseudoscalar mesons. The simulations employ 2+1 dynamical flavors of asqtad QCD quarks, and quenched photons. Lattice spacings vary from $\approx 0.12$ fm to $\approx 0.045$ fm. We compute the quantity $\epsilon$, which parameterizes the corrections to Dashen's theorem for the $K^+$-$K^0$ EM mass splitting, as well as $\epsilon_{K^0}$, which parameterizes the EM contribution to the mass of the $K^0$ itself. An extension of the nonperturbative EM renormalization scheme introduced by the BMW group is used in separating EM effects from isospin-violating quark mass effects. We correct for leading finite-volume effects in our realization of lattice electrodynamics in chiral perturbation theory, and remaining finite-volume errors are relatively small. While electroquenched effects are under control for $\epsilon$, they are estimated only qualitatively for $\epsilon_{K^0}$, and constitute one of the largest sources of uncertainty for that quantity. We find $\epsilon = 0.78(1)_{\rm stat}({}^{+\phantom{1}8}_{-11})_{\rm syst}$ and $\epsilon_{K^0}=0.035(3)_{\rm stat}(20)_{\rm syst}$. We then use these results on 2+1+1 flavor pure QCD HISQ ensembles and find $m_u/m_d = 0.4529(48)_{\rm stat}( {}_{-\phantom{1}67}^{+150})_{\rm syst}$.


I. INTRODUCTION
The mass splitting between the charged and neutral kaons, K ± and K 0 , arises from two effects that give comparable contributions: the mass difference between up and down quarks, and electromagnetism. If the electromagnetic (EM) contributions can be determined and removed from the experimental meson masses, the resulting pure-QCD masses can then be used as input to a lattice QCD calculation to determine the light quark masses, and in particular the ratio m u /m d , a fundamental parameter of the standard model which measures the strength of strong isospin violations.
The size of the EM contributions to the K ± -K 0 mass splitting is a long-standing issue.
Almost fifty years ago, Dashen [1] showed that the EM splitting of the charged and neutral kaons is equal to that of the pions in leading order (LO) of chiral SU(3)×SU(3) symmetry.
In other words, at LO, (M 2 where the superscript γ denotes the EM contribution, i.e., the difference between the quantity in the real world and in a world where all quark charges are set to zero (keeping renormalized quark masses unchanged).
However, it has been known for some time that the corrections to this lowest order result are large; see, for example, Ref. [2] for a pedagogical review. These corrections can be estimated in a variety of continuum phenomenological models [3]. The model results differ considerably, however, and do not allow one to make controlled estimates of the systematic errors. Indeed, in lattice determinations of m u /m d that employ phenomenological estimates of EM contributions [4][5][6][7], the error coming from the range of EM estimates dominates all other systematic errors.
Direct lattice calculations of the EM contribution to the kaon splittings can greatly reduce the uncertainties. This approach was pioneered by Duncan, Eichten, and Thacker [8] in the quenched approximation of QCD, and has been applied in full QCD more recently by several groups [9][10][11][12][13][14][15][16][17]. Here we report on our lattice QCD+QED computation of (M 2 K ± − M 2 K 0 ) γ . We then apply our result to compute m u /m d in a pure QCD simulation.
There is an alternative approach to calculating EM effects on the lattice [18,19] in which one expands out QED and isospin-violating interactions to O(α EM , m u − m d ) (where α EM is the fine structure constant) and then computes the resulting matrix elements in isospinconserving pure QCD. We do not discuss this approach further here, but simply note that the existence of two independent methods makes possible important cross checks on the results and errors of both. See Ref. [20] for a review that covers both approaches.
In lattice simulations of QCD+QED, both the QCD and QED should in principle be unquenched, i.e., include all contributions from virtual sea-quark loops. However, Bijnens and Danielsson [21] have shown that QED quenching effects for mass differences such as (M 2 K ± − M 2 K 0 ) γ are computable through next-to-leading order (NLO) in SU (3)×SU (3), with no dependence on unknown low energy constants (LECs). In other words, the sea quarks may be taken to be electrically neutral in the simulation, and the effects of their charges maybe restored, correct to NLO, after the fact. We take advantage of this result here and simulate full, unquenched QCD + quenched QED (the electroquenched approximation) in order to determine the kaon EM splittings. Since the QED part of the simulation is quenched, we need only to calculate valence-quark propagators in a background consisting of pure unquenched QCD and quenched EM fields, which are free fields and therefore easily generated. For the pure QCD backgrounds, we use our large data set of ensembles generated with 2+1 flavors of asqtad staggered quarks [4]. We have added a number of additional ensembles to better study finite-volume effects.
One may parameterize the kaon EM splitting by [2] ≡ where the experimental pion splitting is used in the denominator, rather than the EM pion splitting. The two are equal up to isospin-violating effects, which are O((m u − m d ) 2 ), and therefore small. Determining the EM contribution to the mass of the true π 0 is costly, however, since it has quark-line disconnected EM diagrams even in the isospin limit. Instead, we drop the disconnected diagrams, which are expected to be small, and simply find the RMS average mass of uū and dd mesons. We call the pion obtained in this manner the "π 0 ." Both the true (M 2 π 0 ) γ and our (M 2 "π 0 " ) γ are small because EM contributions to neutral mesons vanish in the chiral limit. For the true π 0 , this is required by Dashen's arguments [1], and may be seen explicitly in chiral perturbation theory ( χ PT) including EM effects [22]. For the "π 0 ," a simple argument in partially quenched χ PT, given below in Sec. III D, shows that (M 2 "π 0 " ) γ also vanishes in the chiral limit. This means that the disconnected EM contributions that we are neglecting are themselves small. (An alternative diagramatic proof of the small size of the disconnected terms has been given previously in Ref. [18].) Further, Zweig's rule suggests that the mass contribution from the disconnected diagrams is in fact still smaller than either (M 2 π 0 ) γ or (M 2 "π 0 " ) γ separately. Summarizing, we use to compute . The systematic error coming from using the "π 0 " will of course need to be estimated.
An alternative estimate of is also possible if we employ the experimental EM pion splitting in the numerator of Eq.
(1) instead of our computed π splitting. This estimate is then independent of any assumptions about the disconnected diagrams in the "π 0 ." For a test of systematic effects in the calculation of , we can therefore look at In Ref. [2], the contribution to the pion splitting coming from quark masses (i.e., the splitting that would be present in QCD alone) is defined to be m (M 2 π ± − M 2 π 0 ) expt . Then Eqs.
(1) and (3) imply At NLO in χ PT, m = 0.04 [23]. Reference [2] adds a conservative error and quotes m = 0.04 (2). In our calculation, − appears to be positive. However, because our systematic errors in both and are significantly larger than 0.04, we are only able to use the difference − ( + 0.04) as one estimate of those errors, and have nothing to report about m itself.
We also calculate the EM contribution to the squared mass of the neutral kaon, (M 2 K 0 ) γ . It is convenient to express this quantity in terms of the experimental pion splitting, just as we have done for the kaon splitting. We follow Ref. [2] and define the dimensionless quantity The following is an outline of the remainder of the paper: Section II gives the details of the 2+1 flavor asqtad staggered QCD ensembles [4] on which we compute (quenched) EM effects. In addition, we describe the pure QCD 2+1+1 HISQ fermion ensembles [24] on which we calculate m u /m d , with input on EM effects from the asqtad simulations. We discuss infinite volume χ PT in QCD+QED in Sec. III. Modifications for partial quenching [21] and staggered discretization errors [25] are detailed, and the staggered result for the meson masses at NLO is presented and explained. In Sec. IV we describe how we define QED in finite volume (FV). Finite-volume effects are then calculated at one loop in staggered χ PT in Sec. V. We show that the resulting formulas give an excellent description of our lattice data over a wide range of volumes. We can therefore correct for FV effects, with a small residual systematic error. Section VI C then presents a variety of chiral fits to the FV-corrected lattice data, and Sec. VII describes our results and systematic errors for the EM contributions to the kaon massess, and the parameters and K 0 . Finally, in Sec. VIII we use our EM results to adjust the experimental kaon masses to their values in a pure-QCD world, which are then taken as input to the calculation of m u /m d following Ref. [26].
We note that m u /m d may be computed on the lattice in other ways that do not depend on knowing the EM contributions to the kaon masses. In particular, Ref. [28] uses a dispersive treatment of the experimental input from the decay ρ → 3π instead of kaon splittings to obtain the ratio m u /m d from their lattice determination of m s /m l , where m l ≡ (m u +m d )/2.
Since the ρ → 3π decay violates isospin but is known to be fairly independent of EM corrections, it gives a handle on m u /m d that does not require EM input, at least to some level of accuracy.

II. LATTICE DETAILS
We calculate meson masses on the (2+1)-flavor MILC asqtad ensembles, with quenched photon fields, and with lattice spacings ranging from ≈ 0.12 fm to ≈ 0.045 fm. Table I shows the ensembles employed. On all ensembles, we generate propagators for valence quarks that have charges 0, ±1/3e, or ±2/3e, where e ≈ 0.303 is the physical electron charge, and we compute the masses of mesons made from various combinations of these quarks. On many ensembles we also have mesons made from quarks with charges greater than physical: ±e and ±4/3e. On some ensembles, we even have quarks with charges ±2e, although charges that high are not included in the analysis at this time.
Quenched photon fields are generated in momentum space in the finite-volume Coulomb gauge QED T L defined in detail in Sec. IV. The momentum-space distribution is Gaussian, and is generated and Fourier transformed to position space by a serial program. The spectrum program reads the photon fields from disk and, for each desired charge, converts the field to a U (1) phase factor with that charge. The SU (3) links are multiplied by the U (1) links, and then the same gauge smearing that we use for SU (3) alone is applied. This amounts to an a 2 -improved action, but without any tadpole improvement of U (1).

A. New ensembles
To study finite-volume errors, which were found to be quite important in our prior work, we have generated a number of new ensembles that are not detailed in Ref. [4]. Our prior finite-volume work used two volumes corresponding to spatial size L = 20 and 28. We have added L = 12, 16, 40, and 48 in order to have data on both larger and smaller volumes. For L = 12, we have generated the ensemble using the R algorithm [29][30][31]  Three of them use the RHMC [32][33][34][35][36] algorithm with a 3G1F Omelyan integrator [37,38].    the first in single precision, and the second in double. We treat them as separate data, and do not average the results. The r 1 /a values are mass-independent, in that they are extrapolated to physical quark masses, rather than the sea mass of the simulations. The errors listed for r 1 /a are the sum in quadrature of the statistical errors and the extrapolation errors. We use the a ≈ 0.12, m l = 0.01, m s = 0.05 result for r 1 /a for those a ≈ 0.12 ensembles where no r 1 /a value has been directly computed.

B. Spectrum Calculations
In order to calculate the meson spectrum, we read an archived dynamical SU ( multi-shift solver is employed so that for each desired charge all desired masses are found with one iterative process.
The calculation of the meson spectrum has been primarily done on GPU based computers at the Texas Advanced Computing Center, National Center for Supercomputing Applications, and Indiana University using the QUDA approach pioneered at Boston University [39], but enhanced to support staggered quarks [40][41][42][43].
In Table III, we summarize the quark charges, masses, and number of channels we study on each ensemble. Figure 1 shows the Goldstone pion propagators as a function of Euclidean time for the a ≈ 0.045 fm ensemble, which is our finest lattice spacing. We show four charge combinations for our lightest valence quark mass on that ensemble. Using the notation further detailed in Sec. III B, the quark charges are q x and q y in units of the fundamental charge e, and the meson charge q xy is q x − q y since the meson is made from an x-quark and y-antiquark. The combinations (q x , q y ) we plot are (0, 0), (2/3, 2/3), (1/3, −2/3), and (2/3, −2/3), with total charges q xy = 0, 0, 1 and 4/3, respectively. We see a nice linear decrease of the propagators in this semi-logarithmic plot over a large range of t, before the periodic boundary conditions result in curvature at large t. In Fig. 2, we show the results of fitting the propagators in Fig. 1. Each plot shows a series of fits starting from D min and extending to the center of the lattice. The symbol size is proportional to the p value of the fit. Crosses are fits with a single particle (two free parameters), and squares correspond to two particles (four parameters). We see that there are many fits with good p values, and that the meson masses depend significantly on the total charge. We can even see a difference between the two cases of a neutral meson, one with uncharged quarks and the other made from a quark and an antiquark whose charges cancel each other. Much of this difference is unphysical, coming from the effect of EM quark-mass renormalization at fixed bare mass -see Sec. III C. Note that the quality of the plateaus for mesons with charged quarks is virtually identical to that for the meson with uncharged quarks; we return to this point in Sec. V. The masses corresponding to fits with D min = 30 are detailed in Table IV. Figure 3 plots these masses versus the square of the meson charge.
In order to construct the correlations among the masses of all the channels on a specific to the center of the lattice assuming a single particle. The first two columns are the charges of the two quarks in units of e. Since the meson is made from a quark and an antiquark, the meson charge q xy in the third column is q x − q y . The mass and its error are in the fourth column. Each fit has 65 degrees of freedom. χ 2 and the p value of the fit are in columns five and six, respectively.
values described in Ref. [4], we extrapolate the r 1 /a values at the simulated quark masses to the physical quark masses (given below in Table V), holding β fixed. This defines a massindependent scale-setting scheme, which is needed in order to apply chiral perturbation theory. The scheme is mass independent because it gives an r 1 /a value that depends only on β and not on the simulated quark masses m l and m s . Mass-independent values of r 1 /a for our ensembles are listed in Table I. The errors shown are a sum (in quadrature) of statistical errors and errors of the extrapolation to the physical quark masses.

A. Continuum chiral theory for QCD+QED
In the continuum, the chiral effective theory for QCD+QED was worked out by Urech [22]. Along with all hadrons heavier than the pseudoscalar mesons, high-momentum photons are integrated out of the chiral theory, resulting in a single effective meson-interaction term at LO. Photons and mesons with low momentum (less than the chiral cutoff Λ χ ), are treated explicitly in this chiral perturbation theory ( χ PT).
The partially quenched version of the chiral theory is relevant here. In partial quenching, the valence and sea quarks are treated as distinct; when EM is included, this means that Danielsson [21] have calculated the meson masses and decay constants at NLO (one loop) in partially-quenched χ PT in QCD+QED with three flavors of sea quarks (u, d, s). A key insight of Ref. [21] is that sea-quark charges affect meson masses in particularly simple ways at NLO. In analytic terms involving the sea-quark charges, only the sums of the squared sea-quark charges appear -there are no cross terms between sea-quark and valence-quark charges. (It is necessary to assume here that the sum of the three sea-quark charges vanishes, as it does in the real world.) Sea-quark charges may also appear in the one-loop chiral logarithms, but these are completely determined in terms of the LO LECs. This implies that in the difference of squared mass of two mesons with the same valence quark masses but different valence quark charges, the analytic terms depending on sea-quark charges cancel.
Thus the difference may be reliably computed on the lattice with a simulation in which the sea quarks are uncharged (the electroquenched approximation). The sea-quark charge dependence, which comes only from one-loop chiral logarithms, may be put in after the fact.
All dependence on unknown NLO LECs cancels in the difference.
Note that the quantities we need to calculate to determine in Eq. (2), namely (M 2 K ± − M 2 K 0 ) γ and (M 2 π ± − M 2 "π 0 " ) γ , are squared-mass differences of the type required to make them reliably calculable with electroquenched simulations, in the sense described in the previous paragraph. We emphasize, however, that the calculability depends on using SU (3) (3-flavor) χ PT at NLO, which will have nonnegligible systematic corrections that need to be estimated. The alternative, treating only the u, d quarks as light (2-flavor SU(2) χ PT), is not necessarily an improvement, despite the fact that in SU(2) χ PT the errors are generically much smaller at a given order than in SU(3). The reason is that the calculability of squaredmass differences in the electroquenched approximation depends on the tracelessness of the quark charge matrix, which holds in SU (3), but not in SU (2). Thus, if SU(2) is used, the chiral errors are likely to be smaller, but one must include a separate quenching error that needs to be estimated in some independent fashion. That is the approach taken in Ref. [17].
In this paper, we compute the EM effect on the neutral kaon mass, (M 2 π 0 ) expt , in addition to . In this case, χ PT does not allow us to control the electroquenching error, because that error it is not computable at lowest nontrivial χ PT order. The quantity (M 2 K 0 ) γ is the difference between the squared mass of a neutral kaon made out of charged valence quarks, with a charged sea, and the squared mass of a kaon made out of neutral valence quarks, with a neutral sea. Even effects that depend on the sea-quark charges alone do not cancel here. Our estimate of the electroquenching error in (M 2 K 0 ) γ is therefore based on large-N c power counting only (N c = 3 is the number of QCD colors), and must be considered a rough guide only.

B. Staggered chiral perturbation theory with EM
With the staggered lattice action, each quark flavor appears as four species, known as tastes. This is a remnant of the 16-fold doubling of species of naive lattice fermions. To obtain standard QCD in the continuum limit, it is necessary to eliminate the unwanted taste degrees of freedom in the sea. Our simulations accomplish this by taking the fourth-root of the fermion determinant for each quark flavor [47]. Numerical and theoretical arguments for the validity of this procedure in the continuum limit can be found in Refs. [48][49][50][51][52][53][54][55][56][57][58][59]. The appropriate chiral theory for staggered quarks with the rooting procedure is called rooted staggered χ PT (rS χ PT) [60,61]. Starting with the staggered chiral Lagrangian of Ref. [61], it is straightforward [25] to include EM effects following Ref. [22].
At leading order, the Euclidean, staggered QCD+QED chiral Lagrangian is 2 where Tr denotes a trace over flavor and staggered taste indices. The quantities A µ , F µν , and λ are the photon gauge potential, the EM field strength, and the gauge-fixing parameter, respectively. The meson fields are contained in where diagonal entries U D, and S, are the quark-antiquark pairs uū, dd, and ss respectively.
As usual in partially quenched and/or staggered calculations, it is convenient to keep this term and use the simple U , D, S basis along the diagonal of Φ. At the end of the calculation, we can take m 0 → ∞ [64] and decouple the η .
In Eq. (6), M is the quark mass matrix, and Q is the quark (electric) charge matrix with the property Tr(Q) = 0. The covariant derivative d µ is given by where we have set vector and axial source terms to zero since they are not needed for present purposes. Electromagnetic effects on the meson masses come both directly, from the low-energy photon field A µ , and indirectly, through the term e 2 CTr(QΣQΣ † ) (with e the fundamental electric charge and C an LEC), which represents the effects of high-energy photons that have been integrated out.
With p a typical meson 4-momentum, and M and m generic meson and quark masses, respectively, the standard power-counting scheme of rS χ PT is this counting treats e 2 ∼ p 4 . Terms that go like e 4 ultimately have negligible impact on our results for , but can be necessary to describe small, but statistically significant, effects in our lattice data, especially when we include data for quarks with larger-than-physical charges.
We consider a generic pseudoscalar meson composed of two different valence quarks x and y with masses m x and m y . In units of e, the quark (not antiquark) charges are q x and q y , so that the meson charge is q xy = q x − q y . At LO, the squared mass of such a meson with taste b is where χ xy,b is the LO squared mass without EM effects, ∆ b is the taste splitting coming from the staggered potential V, and Dashen's theorem is immediately evident from Eq. (12) since the LO EM contribution proportional to ∆ EM is independent of quark masses.
We remark that Eqs. (12) and (13)  Beyond LO, the fourth-root procedure needs to be implemented. This can be done systematically at the level of the chiral theory by using a replica trick [65] for the sea quarks: replicating them n r times and setting n r = 1/4 at the end of the calculation [54,55,66].
(Additional, un-replicated valence quarks, here called x and y, must also be introduced.) We do not show the replications explicitly in Eq. (6); in practice it is actually more convenient at the one-loop level to use quark-flow techniques [67] to keep track of diagrams with seaquark loops, and multiply them by hand by a factor of 1/4. Since both the replica and the quark-flow approaches distinguish sea and valence quarks, it is straightforward to take into account, in the chiral calculations, the fact that our simulations are partially quenched.
From Eq. (6), it is straightforward to compute the squared mass of a pseudoscalar meson to order NLO (O(p 4 ), one-loop). We focus on the taste-ξ 5 (pseudoscalar taste) meson because it is the valence meson that we have simulated. The taste-ξ 5 meson is a true Goldstone boson in the massless limit and in the absence of EM (for electrically charged mesons). From now on, we always mean the taste-ξ 5 meson if we do not otherwise specify the meson's taste.
We are interested in the EM contribution to the squared mass, where the second term on the right-hand side is the squared mass in a world without EM, where all quark charges, both valence (q x , q y ) and sea (q u , q d , q s ) vanish. The difference should be taken at fixed renormalized quark masses, so that only physically meaningful EM effects contribute to (M 2 xy ) γ . This is a nontrivial requirement because the masses of quarks with different charges, e.g., u and d, have different EM renormalization. It is much more convenient to work with an intermediate quantity where the two terms on the right-hand side are computed at the same values of the bare quark masses.
On the lattice, we have computed M xy for various choices of valence quark charges, including vanishing charges, for each valence bare quark mass studied. This means that it is straightforward to construct the quantity ∆M 2 xy , as well as its correlated errors with other choices of quark charges and valence masses. On the other hand, the construction of (M 2 xy ) γ would require theoretical assumptions about the EM mass renormalization, coupled with interpolation or extrapolation of the data to adjust the bare masses in the subtraction in Eq. (15). It is much easier to postpone the renormalization step until after the chiral fit, when we will have the ability to make these adjustments easily. Fortunately, the functional form of the chiral fit that is appropriate to the physical quantity (M 2 xy ) γ may also be applied to a fit of the unphysical intermediate quantity ∆M 2 xy . As we will see, the only consequence of fitting ∆M 2 xy instead of (M 2 xy ) γ is that the former will have unphysical contributions to two LECs that are affected by EM renormalization. We therefore postpone detailed discussion of renormalization until Sec. III C. Except for some comments about the affected LECs, we ignore the difference between ∆M 2 xy and (M 2 xy ) γ in the current section. Separating orders in the chiral expansion, we write the difference in Eq. (16) as where ∆ LO M 2 xy is independent of taste. Equation (18) follows from Eq. (12), and Eq. (19) divides the NLO contribution into logarithmic (non-analytic) and analytic contributions.
For NNLO and higher orders, the chiral logarithms are not known; when such orders are needed in the chiral fits, we therefore include the analytic contributions only.
The mass of the Goldstone meson has been computed to NLO (one loop, O(p 4 , e 2 p 2 )) in rS χ PT with EM in Ref. [25]. Figure 4 shows the NLO contributions to the meson mass. The photon tadpole diagram does not contribute here since it vanishes in dimensional regularization; in FV, however, the momentum integral becomes a sum, and the photon tadpole is nonzero, as discussed in Sec. IV. The photon sunset diagram is essentially the same as in the continuum, since the meson-photon vertex is taste-conserving, and the external pseudoscalar-taste meson is also the meson in the loop. The calculation of the contribution from the meson tadpole, Fig. 4(c), is very similar to that in Ref. [61], with the addition of a new 4-meson vertex from the C term in Eq. (6).
The result of the calculation is that the NLO contribution to the squared mass splits into an EM contribution proportional to e 2 and a non-EM contribution, which is identical to that in Ref. [61], and which cancels in the difference ∆M 2 xy,5 , where we include the subscript 5 to emphasize here that we are talking about the meson with taste ξ 5 . The one-loop diagrams where sea-quark flavors and the 16 meson tastes are labeled by σ and b, respectively, Λ χ is the chiral scale, and (χ) is the renormalized loop integral The result in the first line in Eq. (20) is from the photon sunset diagram, Fig. 4(b), and that in the second line is from the meson tadpole, Fig. 4 where i labels quark flavors, i ∈ {x, y, u, d, s}, and∆ is the mean of the taste splittings ∆ b (weighted by multiplicities). It is straightforward to find the possible analytic terms using the standard spurion approach, but it is much quicker simply to write down all polynomials of a given order using the rules that follow from the symmetries: 1. Charge conjugation symmetry implies that a valence xȳ meson has the same mass as its antiparticle, the yx meson, so terms must be symmetric under the interchange q x , µ x ↔ q y , µ y .
2. In the absence of EM, the partially conserved staggered axial symmetry that rotates x into y quarks guarantees that M 2 xy,5 is proportional to m x + m y (times possible additional mass factors). When EM is turned on, the symmetry is explicitly broken, but only for charged mesons (q x = q y ). Thus, when q x = q y , all terms must either vanish or be proportional to m x + m y .
3. The fact that the sea quarks couple equally to valence quarks implies that terms must be symmetric under sea-quark interchange: 4. The sum of sea-quark charges vanishes in the two cases of interest here, the physical case and the electroquenched case. Therefore terms proportional to the sum q u +q d +q s may be dropped.
Given these rules, there are six independent analytic contributions possible at O(e 2 p 2 )) (NLO): xy µ a 2 , e 2 q 2 xy (µ u + µ d + µ s ), e 2 (q 2 x + q 2 y )(µ x + µ y ), Of these, the last contribution will cancel for ∆M 2 xy since it is independent of the valence charges. The remaining contributions are independent of the sea-quark charges. That means that sea-quark-charge dependence only enters at NLO in the chiral logarithms, Eq. (20), and hence is computable, as discovered by Bijnens and Danielsson [21]. 4 The result of an O(1) shift in the scale of the chiral logarithms suggests that an appropriate scale for the analytic contributions in Eq. (23) is f 2 π (from the first line in Eq. (20)) or ∆ EM (from the second line in Eq. (20)). In fact, these two quantities are the same order of magnitude, as can be seen by estimating ∆ EM by assuming that the experimental π +π 0 mass splitting comes entirely from the leading order contribution e 2 ∆ EM . We therefore choose the scale f 2 π for all the NLO analytic contributions. In addition, we find it helpful to include mean taste splittings in analytic terms that absorb the chiral-scale dependence coming from the meson tadpole, which has an average over tastes in the second line in Eq. (20).
As in Ref. [7], this definition of the LECs at nonzero lattice spacing simplifies the chiral-scale dependence of the LECs, and also tends to capture much of the lattice-spacing dependence of the lattice data, reducing the size of the pure discretization term (proportional to µ a 2 ) in the fit. The NLO analytic contribution to ∆M 2 xy is then The usual expectation would be that the dimensionless LECs κ i are O(1). However, several features of the current problem indicate that the expectation may be violated. First of all, previous work, both in the continuum [3] and on the lattice [10][11][12][13][14][16][17][18], suggests that is large (O(1), rather than 1), which would imply that the NLO terms produce O(1) corrections to the LO result, and hence that at least some of the NLO LECs may be expected to be significantly larger than 1. A second issue arises from the nature of our data set. Because the ensembles we study here all have a strange quark mass tuned to near the physical value (m s ≈ m s ), and a light quark mass significantly lighter than that (m l ≤ 0.2m s ), the κ 2 term in Eq. (24) is approximately a constant up to discretization errors, and may therefore compete in the fit with the LO term q 2 xy e 2 ∆ EM . In most fits, in fact, κ 2 has a tendency to get large and ∆ EM to get small -even negative in some cases! This is a typical problem that occurs with SU(3) fits to data sets in which m s does not take a significant range of values less than m s . Fortunately, the final results for physical quantities depend only mildly on the relative sizes of the LO and κ 2 terms. In most of our fits, including the central fit, we simply set κ 2 = 0, but leave ∆ EM unconstrained. However we also consider fits where both ∆ EM and κ 2 are unconstrained, as well as ones in which κ 2 is constrained by a prior that enforces κ 2 < ∼ 1. Differences between results of these fits and the central one are included in an estimate of the systematic error of the chiral extrapolation.
A final complication is the fact that ∆M 2 xy,5 , the quantity we are fitting, includes unphysical contributions because it has not been adjusted for the effects of EM quark-mass renormalization. In particular, the term multiplied by κ 5 in Eq. (24) is precisely of the form that would be induced by the O(α EM ) EM renormalization of the quark masses m x and m y , so κ 5 will have an unphysical renormalization contribution. Indeed all fits that do not include an additional correction for renormalization give κ 5 ≈ 12, with κ 5 = 12.2(2) in the central fit. After renormalization is taken into account in some way, this effective value of κ 5 is significantly reduced. On our central fit, the preferred nonperturbative scheme described in Sec. III C is nearly equivalent to simply setting κ 5 = 0 after the fit. With an MS scheme and a perturbative determination of the renormalization constant at one loop, κ 5 is reduced, effectively, by a factor of 2 but remains clearly nonzero.
Beyond NLO, the S χ PT logarithms have not been calculated, so we are unable to continue the chiral expansion in a systematic fashion. However, for acceptable chiral fits to the lattice data, we must include some or all of the NNLO analytic terms, and at least one N 3 LO term.
Following the symmetry rules above, the independent NNLO terms (for vanishing sea-quark where the terms with ρ i coefficients are O(e 2 p 4 ), and those with ρ i coefficients are O(e 4 ).
Taste-splitting terms (µ a 2 ) have been added to mass terms (µ j ) in plausible ways based on the example of the NLO chiral logarithms, but of course these choices are merely guesses of how best to absorb discretization errors into the mass terms.
Equation (25) includes taste-violating analytic terms, such as the term multiplied by ρ 1 , that arise naturally in rS χ PT. However, lattice-spacing dependence can also arise simply from "generic" discretization effects that break no continuum symmetries and therefore produce no new LECs. Rather, they induce a-dependence in the LECs that are already present. While the leading taste violations in QCD with asqtad quarks are O(α 2 S a 2 ), the leading generic errors are O(α S a 2 ). The quark couplings to EM do not change the leading generic errors because the combination of paths in the asqtad action removes O(a 2 ) terms as always. However, the EM gauge action we use is unimproved and therefore induces O(a 2 ) generic errors. 5 Generic discretization errors of the NLO analytic parameters κ 1 , . . . , κ 5 in Eq. (24) may produce effects of a size comparable to that from the NNLO parameters, so should be included. Even more important, a generic error on the LO parameter ∆ EM may induce effects comparable to NLO and is therefore required in our fits. We thus include six generic variation parameters ψ 0 , . . . , ψ 5 that give a-dependence to the LO and NLO LECs: The parameters ∆ EM and κ i on the right-hand side here are the continuum (a = 0) values.
In Eqs. (26) and (27) When the charges of the quarks in the mesons are limited to physical values or smaller (±2e/3, ±e/3, or 0), only the λ 6 term is necessary for acceptable fits, and its value is ≈ 4.
(The central fit gives λ 6 = 4.1(1).) Note that this term has the form of an O(e 4 ) quark mass renormalization. This implies that λ 6 , like the NLO LEC κ 5 (Eq. (24)), has an unphysical renormalization contribution. We note that, even though fits with λ 6 set to zero have very low p values, < 10 −10 , the term has little effect on the physical quantities studied here. In particular, if we simply set λ 6 = 0 after the fit, these quantities change by amounts less than or equal to their statistical errors, and much less than their total (systematic plus statistical) errors.

C. Electromagnetic quark-mass renormalization
In this section, we discuss the renormalization of quark masses due to EM effects, i.e., O(e 2 ) or higher. This is important because the multiplicative renormalization factor Z m is different for quarks with different EM charges, and thus affects how we separate "true" EM effects from quark mass effects such as isospin violations. Because we are not interested here in determining absolute, physical quark masses (e.g., MS quark masses in MeV, say), renormalization due to the strong interactions alone can be ignored since the corresponding Z m is the same for all quark flavors. Therefore, when we refer in this paper to "renormalized" or "bare" quark masses, we mean renormalized or bare with respect to EM. All quark masses discussed are bare as far as the strong interactions are concerned.
It is instructive first to estimate the size of the EM renormalization effect on the determination of . At fixed lattice spacing a, let δu and δd be the fractional shift in the u and d bare masses such that their renormalized EM masses are both equal to Assuming that the size of any logarithms in aµ remains modest (µ is the scale of the renormalized masses), the constant C is expected to be of order 1. With α EM ∼ 0.01, this gives δu − δd ∼ 0.003. Compared to the experimental pion splitting, the induced mass-squared splitting between a K + and a K 0 is then approximately Our estimate of the EM renormalization effect on is thus quite small, 0.02. The reason the effect is small is that the residual chiral symmetry of staggered quarks guarantees that the renormalization is multiplicative, so that the shifts in the uand d-quark masses are small. The shift in the s-quark mass is much larger; however, its effect cancels in between On the other hand, for quantities such as (M 2 K 0 ) γ , the EM effect on the squared K 0 mass itself, the fractional systematic error from not including renormalization effects is at least an order of magnitude larger than for . One must keep also in mind that the estimate in Eq. (30) is qualitative, and could easily be off by a factor of 3 or more if C is larger or smaller than naively expected.
We now proceed to more detailed discussion of perturbative renormalization, which con-verts bare quark masses to MS renormalized masses at some convenient scale, here taken to be µ = 2 GeV. Only a one-loop determination is available in the literature. For staggered quarks in QCD, the renormalized MS mass is given at this order in terms of the bare mass m(a) at lattice spacing a by [68] where α V (q * ) is the strong coupling in the V scheme [69] evaluated at scale q * , and b is a constant depending on the details of the staggered action. We have neglected discretization In order to find the corresponding EM renormalization for staggered quarks, we merely have to remove the overall SU(3) Casimir factor of 4/3 from Z (2) m and to replace α V (q * ) with α EM = e 2 /(4π). Issues such as the proper scheme and scale q * for α EM are irrelevant since α EM is so small compared to α S , and hence runs very slowly. Because we do not include EM corrections to the QCD tadpole factors in the asqtad action, we take b = 2.27 [68], which corresponds to the case of asqtad-like smearing without tadpole improvement. The one-loop EM renormalization is then where q is the charge of the quark in units of e.
The EM renormalization first affects ∆M 2 xy at NLO in χ PT. To include one-loop renormalization in the chiral fit at this order, we simply add to Eq. (17). Note that changes in µ can then be absorbed in the chiral fits by changes in the NLO LECs: κ 5 and (if discretization effects are important) κ 1 , Eq. (24). After the fit, the effect of Eq. (34) is removed from the result. This procedure is equivalent to readjusting the bare quark masses so that the renormalized masses have the desired value, so that, in particular, m MS u (µ) = m MS d (µ). As discussed below in Sec. VII, the net result is that including the one-loop EM renormalization would shift by 0.03, with small variations depending on the details of the fit. This is consistent with (but somewhat larger than) the order-of-magnitude estimate of the effect made above. Based on this small shift, which is significantly less than the other systematic errors in our result, our approach in preliminary calculations [12][13][14]27] was to omit renormalization in the central value, and simply include an estimate of the effect in the systematic errors. However, Eq. (33) will get strong corrections starting at two loops, i.e., O(α S e 2 ), and experience from pure-QCD quark mass renormalization suggests that we would need the corrections through O(α 2 S e 2 ) to be able to be confident of the coefficient of e 2 at the few percent level. 6 We are thus only able to take Eq. (33) as a qualitative estimate of the EM renormalization effect in the MS scheme.
In the absence of high-order perturbative calculations, a nonperturbative determination of the EM renormalization is necessary to get reliable results. As we will see below, such a nonperturbative approach yields an estimate for the effect of EM renormalization on of approximately 0.07, a bit more than twice as large as the one-loop perturbative estimate.
The nonperturbative method we use has been proposed by the BMW Collaboration [71].
The idea is to compare the masses of neutral π 0 -like mesons constructed from uū quarks and dd quarks with quark-line connected propagators only (no intermediate states with only gluons and/or photons are allowed).
We first introduce the needed connected correlators for arbitrary valence quarks x and y.
The connected xx and yȳ correlators are explicitly constructed in PQQCD by introducing additional valence flavors x and y with q x = q x , m x = m x and q y = q y , m y = m y . The connected correlators are then where disconnected contributions are absent since x and x are different quarks, so x cannot contract withx (and similarly for y andȳ ). We let M xx and M yy be their masses. These mesons are each of the form discussed in rule 2 above Eq. (23): neutral mesons composed of two different, but equally charged, quarks. The EM contributions to M xx and M yy must therefore be proportional to B 0 q 2 x e 2 (m x + m x ) and B 0 q 2 y e 2 (m y + m y ), respectively, where we have inserted the factor of B 0 to put these contributions in units of squared meson mass.
For q x = 2/3, m x ∼ m u , and q y = −1/3, m y ∼ m d , the contributions are of order α EM M 2 π . This is much smaller than the effect of isospin violation on the squared mass difference for approximately physical mass of the quarks, since the quark mass difference is of the same order as the masses themselves.
To lowest nontrivial order in α EM , we may therefore define an isospin limit by adjusting the bare masses m u and m d such that . This is not by itself a sufficient renormalization condition, however, since it does not fix the overall scale of the light quark masses. We can do that by demanding that the renormalized mass of the u and d quarks is the same as their mass in pure isospin-symmetric QCD, the theory onto which we are matching our QCD+(quenched)QED theory. Since chiral symmetry requires that the EM effects on the mass of the physical π 0 are also of order α EM M 2 π , the pion mass in pure QCD may be taken to have the experimental mass of the π 0 , M π 0 ,expt . This leads to a nonperturbative EM renormalization condition. In the QCD+(quenched)QED theory we adjust the bare masses m u and m d to enforce We call the renormalization scheme defined by this condition the "BMW scheme." A related "Dashen scheme" has been introduced by the QCDSF Collaboration [16]. In their scheme, the masses of connected uū, dd and ss mesons are all set equal at a symmetric point.
We define the mass m l as the common u, d mass such that the charged pion in our pure QCD simulations has mass (M π ) QCD . Therefore, Eq. (37) may be enforced by setting and choosing δ u and δ d so that the EM contributions to M 2 uu and M 2 dd vanish: Recall that (M 2 ) γ is defined as the difference between the squared mass of the meson composed of charged quarks with that composed of uncharged quarks, but with the same renormalized masses. In Eq. (39) the EM renormalized mass is m l , so that the neutral-quark (pure QCD) subtraction terms in the definition of (M 2 uu ) γ and (M 2 dd ) γ (see Eq. (15)) are equal to (M 2 π ) QCD . Thus Eq. (38) should be interpreted as defining the bare masses m u and m d such that the EM renormalized mass of each quark is m l .
The condition Eq. (39) must then be rewritten in terms of ∆M 2 uu and ∆M 2 dd , the EM effects at fixed bare mass (see Eq. (16)), which are the quantities we directly compute and fit in our simulations. With the bare mass fixed at m u in ∆M 2 uu , and at m d in ∆M 2 dd , the charged-quark terms in ∆M 2 uu and ∆M 2 dd are the same as in (M 2 uu ) γ and (M 2 dd ) γ , respectively, but the neutral quark subtraction terms are different. Within the approximation that in pure QCD, we may easily correct for the changed subtraction terms and rewrite Eq. (39) as After a chiral fit to the data for ∆M 2 xy (m x , m y ), we solve these conditions iteratively for δ u and δ d at each lattice spacing, or in the fit extrapolated to the continuum. Iteration is in principle necessary because m u and m d depend nonlinearly on δ u and δ d , respectively, at with¯ 3 = 2.81(64) [2]. Systematic errors associated with the value of B are included in our error analysis in Sec. VII.
The residual chiral symmetry of staggered quarks implies that quark mass normalization is multiplicative. That means that once we know δ d , we can use it to renormalize any charge-1/3 quark. In particular, in this scheme the bare strange quark mass m S whose EM renormalized mass is m s , the known physical strange mass in pure QCD, is Once the strange quark mass has been renormalized, we may compute (M 2 K 0 ) γ , the EM effect on the neutral kaon, from where B s and B l are the derivatives of (M 2 K ) QCD with respect to m s and m l , respectively. Unfortunately, because a large fraction of ∆M 2 K 0 is unphysical, and removed when constructing (M 2 K 0 ) γ in the renormalization step, the resulting systematic error in (M 2 K 0 ) γ (or equivalently K 0 , Eq. (5)) is relatively large (∼ 35%). The result is particularly sensitive to the uncertainty in the derivative B s .
A final renormalization scheme that we have tried consists of simply setting to zero after the chiral fit the two LECs, κ 5 and λ 6 , that are dominated by unphysical renormalization effects at O(α EM ) and O(α 2 EM ), respectively. Interestingly, this "LEC scheme" gives results for the central fit that are extremely close to those obtained from the BMW scheme: differs only by 0.03%; K 0 , by 0.2%. However, the results from different chiral fits vary much more with the LEC scheme than with the BMW one; this is especially true of K 0 , which can differ by more than 100% as we change the details of the fit, or the ranges of valence masses and charges included. For this reason we do not consider the LEC scheme further here.

D. The Neutral Pion
The mass of the (partially quenched) π 0 comes from the correlator where x is an up-type valence quark with q x = 2/3, y is a down-type valence quark with q y = −1/3, and we work in the isospin limit m x = m y . (For simplicity, all quark masses in this subsection should be interpreted as renormalized masses.) This true π 0 has quarkline disconnected EM contributions because q x = q y . As mentioned in the introduction, such disconnected contributions would be costly to compute numerically, so we drop them.
We define the squared mass M 2 "π 0 " as a simple average of the squared masses coming from the two connected correlators, one for x and one for y, obtained from Eqs. (35) and (36), respectively: It is then easy to see that chiral symmetry implies that M 2 "π 0 " vanishes in the (twoflavor) chiral limit. That is because both M 2 xx and M 2 yy are of the form discussed in rule 2 above Eq. (23): neutral mesons composed of two different, but equally charged, quarks.
The EM contributions to their masses must therefore be proportional to e 2 (m x + m x ) = e 2 (m y + m y ) ∝ e 2 M 2 π . Chiral symmetry also implies that the EM contributions to the true M 2 π 0 must be proportional to e 2 M 2 π , but the reasoning is slightly different because M 2 π 0 is not of the form M 2 xx with x and x different flavors. The spontaneously broken chiral symmetry associated with the π 0 is diagonal and is not broken explicitly by the also-diagonal quarkcharge matrix Q. Hence the EM contribution to its mass must vanish as usual in the two-flavor chiral limit. We may make a rough estimate of the size of (M 2 π 0 ) γ by using the chiral logarithm contribution calculated in [22], [21] and Λ χ = m ρ = 0.77 GeV. This gives a magnitude of about 30 MeV 2 .
The π + has totally different behavior from either the "π 0 " or the π 0 . Since its chiral symmetry is broken explicitly by the quark charges, ∆M 2 π + is nonvanishing in the two-flavor chiral limit at leading order, and equal to e 2 ∆ EM . At NLO, Eqs. (20) and (24) show that there are both a chiral log and an analytic contribution (from the κ 2 term) proportional to e 2 M 2 K . We may estimate the size of ∆M 2 π + from the LO term, Eq. (18), and the NLO chiral logarithm contribution proportional to e 2 M 2 K in the continuum limit. This gives ∆M 2 π + ≈ 1050 MeV 2 . Alternatively, since ∆M 2 π + is so much larger than ∆M 2 π 0 , and since the u-d quark mass difference contributes so little to the π + -π 0 splitting, we may simply Since both ∆M 2 π 0 and ∆M 2 "π 0 " are O(α EM M 2 π ), the error due to the simulation of the "π 0 " rather than the π 0 is also O(α EM M 2 π ). We estimate the size of this systematic error in Sec. VII D.

IV. QED IN FINITE VOLUME
With the noncompact realization of QED on the lattice, which we use, it is necessary to drop some zero-modes in a finite volume in order to have a convergent path integral. In particular, the action in Coulomb gauge for the zero component of the vector potential, A 0 , is Since the A 0 mode with spatial momentum k = 0 has vanishing action, it must be dropped. Similarly, the action for the spatial components A i is 1 Here only the mode with 4-momentum k µ = 0 must be dropped, and that is what we do. This version of QED in FV was first introduced by Duncan, Eichten and Thacker [8]; following the nomenclature in Borsanyi et al. [72], we call the resulting theory QED T L . Summarizing, Hayakawa and Uno, in their calculation of EM FV effects in χ PT [73], introduce a different FV action, called QED L , in which they drop all all modes with k = 0, both for A 0 and for The difference between Eqs. (46) and (47) is solely in the last line of each, in the treatment of A when k = 0. This difference implies that the FV effects in the MILC calculations are different from those computed in Ref. [73].
To make explicit the difference between our set-up (QED T L ) and that of Ref. [73] (QED L ), we give the Coulomb gauge photon propagator in each case: [QED L and QED T L ] The violation of Gauss's Law induced by the absence of the k = 0 A 0 mode makes it possible to have net charges on a FV torus with periodic boundary conditions [73]. But Gauss's Law has no implications for the spatial modes A i , so does not distinguish between Eqs. (48) and (49).
Borsanyi et al. [72] have independently studied QED in FV, using both the QED L and QED T L versions. They define QED L by This is in fact a partial gauge specification, because spatially-independent, but timedependent, gauge transformations would violate the µ = 0 condition A 0 (k 0 , k = 0) = 0 (written here in momentum space). One can bring any EM gauge field that satisfies Eq. (51) into Coulomb gauge, as was assumed in writing Eq. (48). The necessary gauge transformation is, in momentum space: Unlike Eq. (51), this definition is gauge invariant, as can be immediately seen from Eq. (52).
Equation (54) can be put into a special Coulomb gauge that satisfies Eq. (46) by the transformation: Thus the two definitions of QED T L , Eqs. (46) and (54), are equivalent.

V. FINITE VOLUME EFFECTS IN CHIRAL PERTURBATION THEORY
Before discussing the FV calculations, it is important to make some remarks on the literature. The first calculation for the FV EM effects on pseudoscalar meson masses that we are aware of was performed by Hayakawa and Uno [73]. They worked in QED L exclusively, and used χ PT at one-loop. Again for QED L , Davoudi and Savage [74] showed, using nonrelativistic effective field theory, that the leading 1/L and 1/L 2 terms found in Ref. [73] are in fact universal, independent of the internal structure of the particle of interest. They related higher order terms directly to the structure, parameterized in terms of EM multipole moments and polarizabilities, and extended the calculations to include spin-1/2, as well as spin-0, particles. Shortly after Ref. [74] appeared, Borsanyi et al. [72], and our own work [14] independently completed the FV calculations for QED T L . Where they overlap, the results of Ref. [72] and Ref. [14] agree for spin-1/2. The issue involved is in fact quite subtle, but it seems to have been resolved [75,76] in favor of the result in [72].
In FV, defined here by spatial extent L and temporal extent T , the momentum components take on discrete values with n i (i = 1, 2, 3) and n 0 integers. Through NLO in χ PT, the meson mass squared in FV may then be calculated simply by replacing the momentum integrals in the diagrams of Because the Feynman diagrams are divergent, it is as usual convenient to perform the renormalization in infinite volume, and, in FV, calculate only the difference between the momentum sums and the integrals. This difference, if treated carefully, is finite and does not require renormalization. We thus stipulate that the EM effect ∆M 2 xy defined in Eq. (16) is the appropriately renormalized infinite-volume result, and write where δ FV is the complete NLO FV correction, δ meson FV is the contribution from the meson tadpole, Fig. 4(c), and δ γ FV is the contribution from photon loops, Fig. 4(a,b). The factors e 2 q 2 xy m 2 have been taken out of δ γ FV for convenience. For notational simplicity in this section, m will denote the tree-level mass of the meson of interest in the absence of EM; ultimately we put m 2 = χ xy,5 in the results. With the factor of m 2 removed, δ γ FV is dimensionless, and hence is a function of mL and mT (or T /L) only, rather than m, L, T separately.
The FV effects from the meson tadpole come from pions that loop around the volume, and hence the effect is suppressed by a factor of exp(−mL). Because of this suppression, δ meson FV is of negligible size on our ensembles, < ∼ 0.2%. However, since the calculation of the effect is completely standard, it is straightforward to include it. In the notation of Ref. [77], we just have to make the substitution ln(m 2 /Λ 2 ) → δ 1 (mL), where δ 1 is a sum over Bessel functions, to obtain the FV correction. From Eq. (20), this gives In contrast to the meson tadpole effects, the FV effects from photon diagrams, parameterized by δ γ FV , are large: ∼ 5-20%, depending on the ensemble and valence masses. Since the results are nontrivial, we describe the calculation in some detail, starting with the sunset diagram, Fig. 4 where we have omitted an overall factor of e 2 q 2 xy . A linear term in k 0 in the numerator has been dropped because k 0 and −k 0 contributions cancel for both the infinite-volume integral and the FV sum.
Since I s goes to a constant as k 0 → ∞, the difference between the sum and integral over k 0 (not to mention the integral itself) is divergent, so the FV effect from this diagram alone (in Coulomb gauge) is not well defined. However, once this diagram is combined with the photon tadpole, the problem goes away. What is needed is in fact only the D 00 contribution to the tadpole, which has the integrand 1/ k 2 . Adding this to Eq. (61), gives where the "hat" on the subscript s indicates that the sunset diagram has been modified by a piece of the photon tadpole. It is useful to keep the rest of the tadpole separate, because it gives different contributions in the QED L and QED T L cases, unlike Iŝ.
The FV effect on the self energy coming from Eq. (62) is where the prime on the summation symbol means that the k = 0 term is dropped, but there is no restriction on k 0 . As in Eq. (59), we take out a factor of m 2 to make δŝ FV dimensionless.
From Eq. (48), the remaining (spatial) components of the photon tadpole in QED L give the integrand and corresponding FV effect In QED T L , there is an extra contribution coming from the nonzero value of D ij when k = 0 but k 0 = 0, see Eq. (49).
where we have used the well-known result ∞ n=1 1/n 2 = π 2 /6 [79]. When T is infinite, we can obtain the correction to the meson mass-squared by evaluating the self energy at p 0 = im. For finite T , however, this prescription is not obviously correct, and indeed is wrong in some cases. Here, we will simply assume that we may use the prescription, and leave it to the Appendix to explain the point in detail and show that plugging p 0 = im into the integrand in Eq. (62) gives the desired answer.
It now is necessary only to evaluate the difference of sums and integrals given in Eqs. (63) and (65). This can be done straightforwardly using an importance-sampling integration program such as VEGAS [80]. The sum may be treated as an integral by defining the "finitevolume integrand" at the arbitrary point k as the average of the infinite-volume integrand at the 16 corners of the FV hypercube containing k, weighted appropriately by the distances in each direction to the corners. For example, ifk is the closest point in the sum "below" k (k µ < k µ ) then the weight of the integrand atk is . When a corner is a special point (e.g., k = 0, k 0 arbitrary) that should be dropped from the sum, we simply put in 0 for the integrand there. One could also use the value at the closest corner of the FV hypercube rather than the weighted average, but the resulting integrand has discontinuities on the midplanes of the hypercube, and the numerical integration therefore has larger errors.
We have checked that our result for δ γ FV , the sum of the sunset and the photon tadpole diagrams, agrees with that of Ref. [73] in the QED L case. In Unlike Ref. [73] and the present calculation, Davoudi and Savage [74] and Borsanyi et al [72] do not compute the FV effects in the context of χ PT, but instead work first with the universal terms that describe a point-like particle, and then consider corrections coming from the particle structure. Aside from the contribution from the meson tadpole, Eq. (60), which is suppressed by exp(−mL), the one-loop χ PT calculation is in fact identical to the point-like approximation of Refs. [72,74] because there are no corrections to the photonmeson vertices or internal meson lines in Fig. 4(a), (b). In QED T L , for point-like mesons, Ref. [72] finds where the last term is what we call δ t,+ FV , Eq. (67), and the other terms come from the asymptotic expansion of δ γ FV in QED L . The constant κ is defined in Ref. [72] by where θ 3 (u, q) = +∞ n=−∞ q n 2 e i2nu is a Jacobi theta function. By numerical integration, one finds κ ≈ 2.8373. An equivalent definition of κ in Ref. [73] is The equivalence of Eqs. (71) and (72) follows from the identity [82]  which can easily be proved using the Poisson summation formula.
In Fig. 6 we compare our results for QED T L with the asymptotic form Eq. (70). For mL ≥ 3.8, which describes the unitary points in our data used in the final analysis, the differences with the asymptotic form are negligible. However, a few valence points in that analysis have mL > ∼ 2.9, for which the differences ( < ∼ 6%) are important to include. In our test of FV effects described later in the section, we have points as low as mL = 2.7 and aspect ratio of T /L = 5.33 for which the differences are a bit bigger, ≈ 7%. For convenience, we use our full results everywhere in the analysis, even where the differences with the asymptotic form are negligible.
We emphasize here that the term δ t,+ FV = mT /(4(mL) 3 ) in Eq. (67) indicates that the large-volume limit is rather subtle in QED T L . The result is acceptable if the limit L → ∞ is taken before T → ∞, or if the limits are taken together at fixed aspect ratio T /L, but not if the limit T → ∞ is taken first. In other words, the QED T L set-up is not well defined in finite spatial volume at zero temperature. This fact has also been pointed out by Borsanyi et al . [72]. They make the further point that QED T L violates reflection positivity because the constraint required to set the single k µ = 0 mode of A i to zero involves the square of the integral over all space-time of A i . Although many actions used in lattice QCD violate reflection positivity, one might worry that in this case the violation leads to problems with defining or isolating the lowest states in correlation functions. Reference [72] did have problems from close excited states in extracting masses in pure quenched QED T L . In our QCD plus quenched QED T L simulations, however, this does not seem to be a problem. As illustrated in Fig. 7, we find no significant differences between the qualities of plateaus in correlation functions in QCD+QED T L versus those for QCD alone. The example shown is for a putative "worst case" in our data because the aspect ratio T /L = 5.33 has the largest value, and L is the smallest. See also the plots for our ensemble with a ≈ 0.045 fm and T /L = 3, shown in Fig. 2. Again, no significant differences in plateau quality between QCD+QED T L and pure QCD are visible.
To test our understanding of the FV effects, we have generated ensembles with a wide range of spatial sizes at β = 6.76 (a ≈ 0.12 fm) with sea-quark masses m l = 0.01 and m s = 0.05 (see Table I). In Fig. 8 we show fits, for two different meson masses on these ensembles, to our calculated FV correction, given by Eqs. (58) and (59) is not visible on this scale. This means that the FV correction used here is the same as in the point-like approximation for the mesons. The shape of the fit curves are completely determined by the FV calculation; the only free parameter in each fit is the overall height of the curve given by the value in infinite volume. The theory gives a good description of the data, and we use it to correct the data for FV effects. We estimate the remaining systematic error associated with FV effects in Sec. VII B.
One can now understand why it was difficult to observe FV effects directly in the data set available in Ref. [12]. At that time, we had only the L = 20 and L = 28 ensembles to compare. From Fig. 8, one sees that the minima of the curves are in this region of L or close to it, and therefore the difference expected between these volumes is small compared to the statistical errors in the data.

OLATIONS
In this section, we first discuss the quantities that have been determined from pure QCD computations, and are used here as inputs to the chiral-discretization fits. We then show (a small subset of) the data we fit, both before and after FV corrections. Finally, we describe the fits themselves.

A. Inputs
In addition to the lattice values of r 1 /a that set the relative scales, we need other latticedependent quantities as inputs to the QCD+QED calculations. Table V lists the values of these quantities for one ensemble at each of our (approximate) lattice spacings. The first three columns serve to identify the ensembles. Columns four and five give the light and strange physical quark masses in r 1 units, which are determined from chiral fits to pure QCD lattice data [4]. These masses are "physical" in the sense that they have been determined by demanding that the π and K mesons take their (isospin-averaged) experimental values in absence of EM. 8 They are, however, bare masses, in that no renormalization (perturbative or otherwise) has been applied.
Two errors are shown for the masses. The first is the systematic error coming from the chiral extrapolation. It is determined by comparing the results of fits that include chiral logarithms through NNLO (plus higher order analytic terms) and those that include the chiral logarithms only through NLO. Other changes in the fits give similar estimates of the errors. The second error in the masses comes from the uncertainty in the absolute scale, i.e., the error in the physical value of r 1 .
In the final row of the table, "cont." stands for "continuum." It is convenient for us to view the continuum not as the β → ∞, a → 0 limit, but as another ensemble with fixed β and a, in which all discretization effects have been extrapolated away. In other words, we view the continuum as a lattice with a perfect action. This allows us to continue to 8 There is an apparent circularity here, in that we are computing in this paper the EM effects on the π and K masses. In practice, we have used earlier, phenomenological estimates of EM effects (see Ref. [81]) to remove them at this stage. We can iterate to make the calculation self-consistent, but it is unnecessary, because the EM effects make only a small change in the estimates of the strange and isospin-averaged light masses.  Table V. This is true even though we take out the renormalization factors, defined for the continuum to be the same as those of the β = 7.08, m l /m s = 0.0031/0.031 ensemble. Such errors would be important if we wanted to extract quark masses or B 0 in a continuum scheme such as MS.
However the renormalization errors are irrelevant here and not included in Table V because only the renormalization-invariant products B 0 m l and B 0 m s enter into the results from our χ PT fits. This is illustrated in Fig. 9, which shows these products computed from the values in Table V  in the splittings results in changes of a few percent from the listed ones. We include these changes in our fitting routines, even though they are smaller than the current statistical errors on the splittings. One can make the adjustment simply by multiplying the listed value by the ratio (r 1 /a) 2 listed /(r 1 /a) 2 unlisted , with the r 1 /a values taken from Table I. For the two a ≈ 0.06 fm ensembles mentioned in the previous paragraph, the adjustment in splittings is about 2%.
B. FV corrections to our data Figure 10 shows a small subset of our data for ∆M 2 xy , plotted as a function of the meson mass, r 2 1 M 2 xy , before and after correction for FV effects. The subset consists of charge ±e unitary or nearly-unitary points, as described in more detail in the figure caption. Because the correction due to photon diagrams is proportional to M 2 xy , see Eq. (59), the absolute FV effect is larger for kaon-like points (right-hand half of the plot) than for pion-like points (left-hand half). The correction ranges from 0.0013 to 0.0021 for kaons and 0.0005 to 0.0009 for pions. Even the fractional correction is generally larger for kaons than for pions since the LO contribution to ∆M 2 xy itself is the mass-independent quantity ∆ EM (the Dashen term, Eq. (14)), which has no FV correction. The correction varies from 10% to 16% for kaons, and from 6% to 12% for pions.
Strictly speaking, the full FV correction to ∆M 2 xy depends on the chiral fit, because the FV effect of the meson tadpole δ meson FV , Eq. (60), depends on the fit parameter ∆ EM . However, this dependence would not be visible in Fig. 10, because the exponentially suppressed meson tadpole FV corrections are very small compared with those from the photon diagrams, which are independent of the fit parameters.
Because the FV corrections depend, at least in principle, on the parameters of the fit, we fit uncorrected (raw) data for ∆M 2 xy to a chiral fit form that includes the FV NLO adjustment δ F V in Eq. (58). However, we will always present the results of chiral fits after a posteriori correction to infinite volume of both the data and the fit lines. This allows us to present results obtained from different volumes in an accessible fashion, and also facilitates comparison to experiment.

C. Fits
We fit various subsets of the data to the chiral forms described in Sec. III B, with the FV corrections appropriate to each ensemble added on. The chiral forms include discretization effects, so from now on we will refer to the fits as chiral-discretization fits. The complete data set, which includes a ≈ 0.12, 0.09, 0.06, and 0.045 fm ensembles, and quark charges 0, ±e/3, ±2e/3, ±e, ±4e/3, ±2e, is based on a total of 11,654 configurations and has 2978 data points for ∆M 2 xy . Without the a ≈ 0.12 fm ensembles, which are often omitted from our fits, the data set has 2166 points based on 6040 configurations. Because points from the same ensemble but with difference valence masses and/or quark charges are highly correlated, and because the number of points is not very much less than the number of configurations, the full covariance matrix is nearly singular and has many poorly determined low eigenvalues.
Fits with acceptable p values to the whole data set are therefore not possible. However, once the data is thinned to a more reasonable number of points in comparison to the number of configurations (∼ 250 to 450 points), acceptable fits are possible. For fits with up to about 350 points, we are able to include the complete covariance matrix, with no modifications.
For fits with more points than that, statistical and roundoff errors typically lead to a small number of negative eigenvalues (up to about 10) in the covariance matrix. We remove such eigenvalues with SVD when finding the inverse covariance matrix used in the fitting procedure. For every dropped eigenvalue, we reduce the number of degrees of freedom by 1 in computing the p value of the fit. Our central fit, with 264 points, has no negative (and therefore no dropped) eigenvalues; the alternative fits used in estimating the errors of the chiral-continuum extrapolation do include some with dropped negative eigenvalues.
When determining the p value of a given fit, we take into account the fact that the sample covariance matrix is used, rather than the exact covariance matrix that would be computed from an infinite number of configurations in our ensembles. We make the leading corrections in 1/N , where N is the total number of (independent) configurations in our sample [83].
We fit the thinned data to the LO+NLO S χ PT form (6 parameters; Eqs. (18), (20) and (24)), plus generic discretization terms at LO and NLO (6 parameters; Eqs. (26) and (27)), and NNLO analytic terms (16 parameters; Eq. (25)). The higher order analytic terms, which include discretization terms, are necessary because our statistical errors in ∆M 2 xy are ∼ 0.3% for charged mesons and ∼ 1.0% for neutral mesons. In addition, as described at the end of Sec. III B, we must include at least the N 3 LO term λ 6 , Eq. (28), to obtain chiraldiscretization fits with acceptable p values. When we include data with charges greater than physical, analytical terms of order α 2 EM are also necessary to obtain acceptable fits. Our central fit includes data from the a ≈ 0.09, 0.06, and 0.045 fm ensembles, and quark charges 0, ±e/3, and ±2e/3. As explained following Eq. (24), we fix to zero the NLO analytic parameter κ 2 , which describes sea-quark mass dependence at NLO, and leave the LO parameter ∆ EM unconstrained. The generic discretization parameter corresponding to κ 2 , called ψ 2 , is also fixed to zero. The fit thus has a total of 27 parameters.
Except for the NLO parameter κ 5 , all NLO and NNLO parameters, as well as the N 3 LO term λ 6 , are constrained in the central fit by Bayesian priors with a Gaussian width of 3 around 0. As discussed following Eq. (24), the usual χ PT expectation would be that these parameters are O(1); we believe constraining them with a prior width of 3 is reasonable given that it is known that the size of the chiral corrections to are relatively large. The width for κ 5 is taken to be a factor of 10 larger still, in recognition of the fact that it gets large unphysical contributions from EM quark-mass renormalization. The width for the generic discretization parameters ψ i is 0.044, which implies a 1-σ deviation of 5.1% at a ≈ 0.09 fm, 2.5% at a ≈ 0.06 fm, and 1.4% at a ≈ 0.045 fm.
The purpose of the Bayesian priors is to (loosely) enforce χ PT behavior, as well as to stabilize the fit to lattice-spacing dependence, for which there are many parameters and several directions in parameter space not well constrained by the data. For the generic lattice spacing dependence, we can write the errors as (aΛ) 2 , where Λ is a discretization scale, Λ ≈ 540 MeV, 10 which we judge is large enough to be conservative. In any case, the effects of increasing the prior widths by factors of 3 or 10 (or in many cases, removing the Bayesian constraints entirely) is included in the systematic errors, as discussed in Sec. VII.
The central fit includes points with meson masses up to about 635 MeV. When masses significantly higher than that are included, it is difficult to fit the data to χ PT forms, even with the NNLO analytic terms in the fit function. Some alternative chiral-discretization fits that are used to estimate systematic errors include data up to about 660 MeV, but their p values are rather poor (10 −4 to 10 −3 ). Other alternatives reduce the maximum meson mass included; the lowest maximum is about 540 MeV. We do not go below this because, in order to be able to interpolate to the physical kaon with controlled errors, it is necessary   In Fig. 11, the blue, light green, and dark green curves show the quality of the fit to the a ≈ 0.09, 0.06, and 0.045 fm data, respectively. The points at a ≈ 0.12 fm (red and magenta) are not included in the fit, but the dashed red curves show that the fit does reasonably well in predicting the data at this lattice spacing. It is more difficult to extrapolate to larger lattice spacing than to smaller lattice spacing, since larger lattice spacing may be sensitive to higher-order terms that are either not included in the fit or not well determined on finer lattices.
For the neutral mesons (Fig. 12), the discretization errors, as well as the sea-mass dependence, are quite small, since points from different lattice spacings and sea-mass values line up very well. Further, as required by chiral symmetry, ∆M 2 xy vanishes in the chiral limit. It is also noteworthy that the curvature in the fit lines is small, as may be deduced from the small difference between the curves and the dashed black line, which is straight.
There are no chiral logarithms for neutral particles at NLO, and the NNLO logs are not included in our fit function. There is a contribution from the NNLO analytic term that is quadratic in valence masses and can contribute to neutral mesons (the ρ 11 term in Eq. (25)), but it is rather small: ρ 11 = 0.52 (3). All our alternative chiral-discretization fits preserve these simple features, which are enforced by the lattice data. Therefore we only show the charged-meson plots for the alternative fits below.
The black curves in Fig. 11 show the fit after setting valence and sea masses equal, adjusting m s to its physical value, extrapolating to the continuum, and adjusting the sea charges to their physical values using NLO χ PT. The last adjustment vanishes identically for pions and is very small for kaons. The brown kaon curve (barely visible under the black kaon curve) shows the value before adjustment, i.e., with vanishing sea-quark charges. From the black curves for the π + and K + , we subtract the corresponding black curves for the neutral mesons "π 0 " and K 0 shown in Fig. 12, 11 giving the solid purple curves, whose values at the physical point for each meson (indicated by the vertical dashed-dotted lines) are the physical results.
The solid purple curve in Fig. 11 includes all chiral terms through NNLO (and with the N 3 LO term λ 6 ). We also show the LO contribution alone (the mass-independent horizontal dashed-dotted purple line) and the LO+NLO contributions (the dashed purple curves). In this fit the LO contribution has the value r 2 1 e 2 ∆ EM = 0.00189 (12); this is about 60% of the value 0.00315 that would be necessary to give the full experimental pion splitting at LO.
As expected from the fact that , which measures higher order contributions to (M 2 ) γ , is of order 1, the NLO contributions are relatively large, especially for the kaons or heavier-thanphysical pions. The NNLO contributions are clearly much smaller than the NLO ones for physical kaons, and negligible, or nearly so, for physical pions. Thus, after an anomalously large NLO contribution, SU(3) χ PT appears to converge reasonably well.
One may wonder whether this picture of the convergence of χ PT is strongly influenced by the Bayesian priors that constrain NLO and NNLO LECs in the fit. In fact, the priors on physical LECs (those whose contributions do not vanish in the continuum limit) have 11 More precisely Fig. 12 only shows the dd contribution to the "π 0 ." almost no effect on the convergence or the results. If we remove all prior constraint on these physical 12 LECs (κ 2 , κ 3 , κ 4 , κ 5 , ρ 6 , ρ 7 , ρ 8 , ρ 9 , ρ 10 , ρ 11 , ρ 12 , ρ 13 , ρ 14 , ρ 1 , ρ 2 , λ 5 ), the fit and results are almost unaffected: changes by only 0.03%. This is however dependent on our setting parameter κ 2 = 0 as discussed above; the separation between LO and NLO contributions can be drastically altered if κ 2 is allowed to vary. Figure 13 shows two alternative chiral-discretization fits to the same set of data points as the central fit. Both of these fits have reasonable p values. In (a) the NNLO parameters ρ 6 and ρ 7 are set to zero, in addition to the NLO parameter κ 2 . These NNLO parameters play a role that is similar to κ 2 : the corresponding analytic terms depend on the sea-quark masses and are nonzero in the chiral limit. No major changes from the central fit are visible, but the LO continuum contribution (dashed-dotted purple curve) is slightly higher than for the central fit (here, r 2 1 e 2 ∆ EM = 0.00202(4)), and all the curves (fit lines as well as extrapolations) are correspondingly higher in the chiral limit. The value of , however, is only 0.002 below that of the central fit. One somewhat more obvious change is that the predictions for the a ≈ 0.12 fm points (dashed red curves) are somewhat worse than in the central fit.
In Fig. 13(b), κ 2 is not fixed, but is constrained by our standard prior for physical LECs, 0 ± 3. The LO parameter is now also constrained by priors with central value r 2 1 e 2 ∆ EM = 0.003 and width 0.001. The posterior value is almost two sigma lower, r 2 1 e 2 ∆ EM = 0.0012(3), now less than 40% of the value that would be necessary to give the experimental pion splitting at LO. Nevertheless, the final results (sum of all chiral orders) from this fit are quite close to those of the central fit, as can be seen by comparing the solid purple curves in Figs. 11 and 13(b). Indeed, the value of coming from this alternative fit is just 0.02 below that in the central fit. The fit lines to the data at fixed lattice spacings and sea masses are also very similar in Figs. 11 and 13(b).
In Fig. 14, we show a third alternative fit to the same data points as the central fit. Here, we have put a very tight prior on ∆ EM , r 2 1 e 2 ∆ EM = 0.0031 ± 0.0001) to force the LO χ PT contribution to be close to the experimental pion splitting. The posterior value is about two sigma below this, r 2 1 e 2 ∆ EM = 0.0029(1). The chiral LECs all have priors 0 ± 3, including κ 2 , which is allowed to vary, but now has a negative posterior value in order to reduce the 12 The LO physical LEC ∆ EM is never constrained unless nonzero κ 2 is allowed. We also include κ 5 and λ 6 among the "physical" LECs even though they get unphysical contributions from EM renormalization, because they do not vanish in the continuum limit.  13. Two alternative chiral-discretization fits. The data included in the fits, as well as the meaning of symbols and curves is the same as for the central fit, Fig. 11. Fit (a) sets parameters ρ 6 and ρ 7 to zero, as well as κ 2 . The differences with Fig. 11 are small: all curves are slightly higher in the chiral limit, and the predictions for the a ≈ 0.12 fm points (dashed red curves) are noticeably higher. Fit (b) does not fix the parameter κ 2 (or ρ 6 and ρ 7 ) to zero. The main difference from pion chiral limit of the fit to something that is better tolerated by the data. The p value of the fit (p = 0.035) is significantly less than the other fits that we have considered so far, but is still acceptable. The resulting value of is 0.024 higher than that of our central fit, and is in fact the largest positive deviation from the central value of all the alternative fits we have considered.
The relative contributions in the continuum of various orders in the chiral expansion as predicted by the fits are also sensitive to the parameters that control lattice spacing dependence (κ 1 , ρ 1 , ρ 2 , ρ 3 , ρ 4 , ρ 5 , ψ 0 , and ψ i ). As mentioned above, the fit becomes unstable if these parameters are completely unconstrained. If the priors widths are widened but not 14. An alternative chiral-discretization fit to the same data as for the central fit. The meaning of symbols and curves is the same as for the Fig. 11. This fit puts a strict prior on ∆ EM to force it to be very near to the value that would give the physical pion splitting at LO. Note that the higher chiral orders reduce the pion chiral limit significantly below the LO contribution. eliminated, the effects on the results are controlled (and included in the systematic error estimates), but the division between LO and NLO contributions can again be significantly changed. Thus the division between orders in χ PT shown in Fig. 11 is at best very rough.
The final results are nevertheless much more stable than the individual χ PT orders, as we have already seen in comparing the fits in Figs. 11, 13, and 14. This remains true even for the more extreme divisions between orders considered below.
It is not surprising that inclusion of the a ≈ 0.12 fm ensembles leads to difficulties with the fits. The taste-breaking effects at this lattice spacing produce large discretization errors, and the fact that the physical strange quark is about 35% below the simulated strange mass gives further problems. The smallest meson-mass maximum that allows us to interpolate to the kaon is approximately 645 MeV for the Goldstone meson, and about 750 MeV for the RMS taste meson. For low masses, the taste effects are even worse: while the lowest available Goldstone mass is about 275 MeV, this corresponds to an RMS taste mass of about 465 MeV. Thus, when we add in the a ≈ 0.12 fm ensembles, the chiral-discretization fits have various undesirable features. Figure 15 shows two examples of such fits. Figure 15(a) is rather similar to the central fit: ∆ EM is unconstrained but κ 2 is fixed to zero. Despite the fact that we have increased the prior widths of the LECs controlling lattice spacing dependence (κ 1 , ρ 1 , ρ 2 , ρ 3 , ρ 4 , ρ 5 ) to 40, and the width of the generic variation parameters to 0.11, the fit is poor (p = 0.0005). Nevertheless is only 0.01 below that of the central fit.
Fits with reasonable p values that include the a ≈ 0.12 fm ensembles are possible. In Fig. 15(b) we allow κ 2 to vary, and put a relatively loose prior on ∆ EM (r 2 1 e 2 ∆ EM = 0.003 ± 0.001), as well as dramatically increasing the prior widths of the parameters that control lattice spacing dependence. We now obtain p = 0.098. However, this fit has a negative value of ∆ EM , which implies an extreme breakdown of χ PT, as well as very large discretization LECs (κ 1 ≈ −34, ρ 1 ≈ −70, ρ 2 ≈ 33, λ 2 ≈ 16). It may very well be justified to drop this fit on these grounds. To be conservative, however, we keep it in estimating the systematic errors. Indeed, it is the fit that gives a value of that is furthest away from our central value (0.082 lower) out of all the chiral-discretization alternatives we consider.
Adding in points with quark charges that are greater than the physical ones leads to problems with the chiral-discretization fits that are similar to those we have when adding in the a ≈ 0.12 fm ensembles. This may be because the higher charges bring in greater discretization errors. Indeed, there is evidence [12] that taste violations from photon exchange begin to be important when the charges increase above their physical values. For data with quark charges 0, ±1/3, ±2/3, ±1, and ±4/3, a fit like the central fit (but including all the e 4 p 2 LECs (λ i in Eq. (28)) and with somewhat larger priors (0 ± 5) on LECs) has p = 0.005, and an that is 0.03 below that of the central fit. A fit with κ 2 not fixed to zero, and very loose priors on LECs and generic discretization parameters, has p = 0.21. However ∆ EM is negative, and discretization terms are very large. Both features are quite similar to those seen in Fig. 15(b). In this case, is 0.065 below that of the central fit.

VII. SYSTEMATIC ERRORS AND RESULTS OF EM CALCULATION
Our calculation has the following significant sources of systematic errors: (1) chiralcontinuum uncertainties from the extrapolations to the physical light quark mass and to a = 0, (2) finite-volume (FV) effects, (3) systematic issues involved in the EM renormalization, (4) effects of using the "π 0 ," which does not include quark-line disconnected contributions, instead of the true π 0 , (5) errors in the physical value of r 1 , the quantity we use to set the scale, (6) uncertainties in the physical values of the quark masses after extrapolation to the continuum, and (7) effects of EM quenching. In the following subsections, we discuss each source of error in turn. For the two EM quantities we calculate, and K 0 , Table VI lists central values and statistical errors from the fit shown in Fig. 11, and each systematic error.
The separation between EM and isospin-violating effects is dependent on the scheme, which enters through the EM renormalization of quark masses. Our calculation is performed in the BMW scheme [71], described in Sec. III C. For some purposes, it may be useful to gauge the effects of changing to another reasonable scheme. In Sec. VII H, we estimate such scheme dependence for and K 0 .

A. Chiral-continuum error
We determine this error by considering a wide range of alternative chiral-discretization fits, with various priors and/or parameters set to zero, and apply them to various subsets of the data: different maximum meson masses included, different thinning, omitting or including the coarsest (a ≈ 0.12 fm) ensembles, and omitting or including quark charges greater than the physical charges. Several of these fits have been presented in Sec. VI C.
To be conservative we include fits that have p values as small as 10 −5 , as well as ones that have very large discretization terms and/or exhibit extreme breakdown of χ PT (e.g., negative ∆ EM ). Altogether, 89 fits are included. We take the largest positive and negative differences from the value in the central fit as the error. For , this gives a positive error of +0.024, coming from the fit in Fig. 14, and a negative error of −0.082, coming from the fit in Fig. 15(b). For K 0 , the maximum positive and negative differences are comparable, so we average them and quote a symmetric error of 0.002.

B. Finite-volume error
To estimate the systematic error associated with the FV correction we use (a residual FV error), we examine the deviations of the fit lines from the data in Fig. 8. For neutral mesons there are no chiral logarithms at NLO, and hence no FV effects at this order. There will be FV effects at higher orders, but they are very likely to be much smaller than our other systematic errors in K 0 , which are quite large. We therefore do not include a residual FV error for K 0 in Table VI.

C. EM renormalization error
We use the BMW scheme, as defined by Eq. (37) and as implemented by Eq. (40), to perform nonperturbative EM renormalization of the uand d-quark masses. For , this is sufficient, since the renormalization of the s-quark mass cancels in the difference (M 2 K + − M 2 K 0 ) γ . However s-quark mass renormalization is crucial for obtaining K 0 . We extend the renormalization to the s quark using Eq. (42).
We can implement Eq. (40) to high accuracy from our chiral fits, so the only significant systematic errors in the scheme come from the errors in our values of the derivatives of the squared meson masses with respect to quark mass: B ≡ ∂M 2 π /∂m l , B l = ∂M 2 K /∂m l , and B s = ∂M 2 K /∂m s . We only need these quantities for physical quark masses and in the continuum limit, since we perform the renormalization after the chiral-discretization fit and its extrapolations. For B, we have the SU(2) χ PT result, Eq. (41), which is quite precise: the error from¯ 3 = 2.81(64) [2] results in a 0.4% error in B. Corrections from NNLO should be even smaller, since the NLO correction is already only 2%. For B l and B s , SU(3) χ PT would be needed, but the higher order corrections, as well as the uncertainty in the relevant LEC, are large. Instead, we extract these quantities from our lattice data, and make a simple extrapolation (linear in a 2 ) to the continuum. We find B s /B = 0.974 (15), B l /B = 0.946 (19), where we give the results in terms of the central value of B (errors in B should not be added on to these values). The error in B l is small enough that the resulting error in is small compared to other systematic errors; is independent of B s . The total renormalization error on is 0.002.
For K 0 , we find a renormalization error of 0.012. The error is dominated by the uncertainty coming from B s , and would therefore benefit from increased precision in this quantity.
A significant improvement in B s could be obtained from a dedicated pure QCD calculation with several closely spaced strange-quark masses around the physical value at each lattice spacing. However, the fact that K 0 has an uncontrolled quenched-EM error means that one cannot decrease the overall error very much without also going to dynamical QED simulations (or equivalent approaches to include the effects of sea-quark charges at order α EM ). D. Error from dropping disconnected π 0 diagrams As described in Sec. III D, we simulate a "π 0 " in which quark-line disconnected diagrams are dropped, rather than the physical π 0 . The difference is O(α EM M 2 π ). We may estimate the size of this effect by noting that the disconnected contributions are solely responsible for the chiral logarithm term found in Ref. [22]. The connected contributions, which are equal to (M 2 uu ) γ or (M 2 dd ) γ , have no chiral logarithms at NLO. Indeed, there are no NLO chiral logarithms for any neutral meson that has only connected contributions, such as the neutral kaon. In Sec. III D, we estimated that the chiral logarithm term in (M 2 π 0 ) γ as approximately 30 (MeV) 2 . Using the result from our central fit for e 2 ∆ EM instead of the value from Ref. [21], gives a smaller result of 25 (MeV) 2 .
In estimating the error on , we also need the pion splitting, which appears in the denominator. The experimental value is 1261 (MeV) 2 , but we can put the denominator on the same footing as the error in the numerator by using instead the value obtained from LO, namely e 2 ∆ EM . With the value of ∆ EM from Ref. [21], we get about 900 (MeV) 2 for the LO pion splitting. Taking this smaller value for the denominator and the larger estimate of the error in numerator, we get a conservative estimate of 0.034 for the error in . This value is independent of what is assumed for ∆ EM , since it cancels between the numerator and denominator.
Another approach to estimating the "π 0 " error would be to compare with , defined by Eq. (3). In the experimental value of the pion splitting appears in the numerator instead of the computed value of the pion EM splitting, so is independent of how we treat the π 0 . From the discussion in Sec. I, we expect that, in the absence of statistical or other systematic errors, = + m , where m = 0.04(2) [2]. However, chiral-discretization errors play a significant role here, since there is a partial cancelation of errors in when we subtract Indeed, the chiral-discretization error for is a factor of about 4 larger than for . If we ignore this problem and just focus on the central fit, − ( + 0.04) = 0.089. This is slightly smaller than the expected error (0.091) from the addition in quadrature of the chiral-discretization and "π 0 " errors on , and the m error.
Because there are also likely to be some residual FV errors in the difference − ( + 0.04), the result suggests that the errors we have already included are reasonable and do not need to be increased further.
The calculation of the EM effect for the K 0 is independent of the treatment of the π 0 , so there is no corresponding error in K 0 .

E. Scale error
The absolute scale of our ensembles is set by r 1 = 0.3117 (22) fm [46]. To find the induced error in our results, we rerun the analysis with r 1 changed by 1 σ. In doing so, it is necessary to include the changes, caused by the scale, in the physical quark masses in the continuum limit. The scale error in these masses is given in Table V. Note that the estimates of the quark masses move in the same direction as r 1 because the quark masses are adjusted to reproduce the experimental values of the meson masses multiplied by r 1 .
The resulting scale errors are very small: 0.001 in and 0.0002 in K 0 . In each case, the effect of changing the scale itself is largely cancelled by the scale changes in the quark masses.
Only the denominators, which come from the squared experimental splitting multiplied by r 2 1 , are affected by the change in the scale itself, and only the numerators are affected by the changes in quark masses.

F. Quark mass error
To find the errors coming from our values of the physical quark masses, we rerun the analysis with the continuum physical mass values 13 given in Table V  Errors arising from the other inputs in Table V

G. Quenched EM error
For , the effect of having quenched the EM interactions is controlled at NLO in SU (3) χ PT, per the argument of Ref. [21]. This is because effects that depend on the sea-quark charges and unknown LECS are independent of valence-quark charges and therefore cancel in (M 2 K + − M 2 K 0 ) γ and in (M 2 π + − M 2 "π 0 " ) γ -see Eq. (23). Errors arise at NNLO, in which cross terms between valence and sea charges can first appear in analytic terms, which have unknown LECs. Examples of such terms are ones proportional to q xy (µ x − µ y )(q u µ u + q d µ d + 13 The values of physical quark masses at nonzero lattice spacings affect our results only through the values of r 1 /a, which are extrapolated to these quark masses in our mass-independent scale-setting scheme. The effects on the final results are negligible. q s µ s ) or (q x + q y )(µ x + µ y )(q u µ u + q d µ d + q s µ s ). From our central fit, the calculated effect of turning on the sea quark charges is 0.040, or 8.2% of the 0.486 NLO contribution for neutral sea quarks. Assuming the quenching effects on NNLO would be of this same size, an estimate of the electroquenching error is 8.2% of the 0.250 NNLO contribution, or 0.020. It may be, however, that the electroquenching effect at NLO is anomalously small. In particular, there is no effect on "pions" (mesons with degenerate quarks) at this order. We therefore follow a more conservative approach and take the full value of the NLO sea-quark charge effect, namely 0.040, as the error estimate for .
As explained at the end of Sec. III A, the electroquenching error in K 0 is uncontrolled, in the sense that it is not computable at lowest nontrivial χ PT order. We can get a rough  [71], since the sum of sea-quark charges vanishes, and quark mass factors must be included to get a nonzero result. Since diagram (2) cancels for , the double suppression of electroquenching effects may explain why the contribution of sea-quark charges is only 0.04 at NLO. However, diagram (2) does not cancel for K 0 , so we have only the 1/N c suppression. We therefore take 1/3 of the central value, namely 0.012, as the electroquenching error in K 0 .

H. Scheme dependence
It may be helpful for some purposes to estimate the changes that would be induced in our results if we changed to a different scheme for EM renormalization. For example, in a pure-QCD calculation that relies on our results to remove EM effects from physical quantities that are used to set the quark masses or scale, it would be useful to know how much the results could change in a different scheme for separating EM from isospin-violating effects.
In addition to the BMW scheme, we have tried renormalizing the quark masses using the MS scheme at scale µ = 2 GeV. Unfortunately, we have only a 1-loop determination of the renormalization, and this may suffer from rather large perturbative errors, as we remarked in Sec. III C. Nevertheless, comparison of the MS scheme at 1 loop, Eq. (33), with that of the BMW scheme, Eq. (42), gives at least a rough estimate of how much the results may change over various choices of "reasonable" schemes.
With the MS scheme and the central fit, we obtain = 0.814 (12), where the error is statistical only. This suggests a scheme dependence of ∼ 0.04 in . The small dependence is in accordance with the general discussion at the beginning of Sec. III C. The corresponding result for the neutral kaon is K 0 =0.365(2) giving a scheme dependence of ∼ 0.330. Note that, if the two-loop corrections from QCD make a comparable contribution to the 1-loop EM renormalization as they do in the pure QCD, asqtad case [70], the value of K 0 in the MS scheme at 2 GeV would be reduced by a factor of order 3.
The large dependence on scheme for K 0 is not surprising, since Adding the systematic errors in Table VI in quadrature, we find The result for K 0 implies (M 2 A preliminary value for (M 2 K 0 ) γ was reported in Ref. [13]. That result did not yet take into account EM quark-mass renormalization and is thus not reliable.  (17)  The procedure for finding m u /m d is described in detail in Ref. [26]. Very briefly, the essential steps are: 1. Use the pion mass and decay constant to fix the lattice spacing and average light quark mass, m l = (m u + m d )/2, on each ensemble. Here we use the physical π 0 mass, since this has small electromagnetic contributions. This mass is also adjusted for QCD finite size effects.
2. Find the tuned strange quark mass on each ensemble by matching 2M 2 K − M 2 π . In this step the lattice masses use the average light quark mass computed in the first step, and the input M K is the average of the K 0 and K + masses after subtracting the electromagnetic contributions parameterized by and K 0 .
3. Use the derivative of the lattice M 2 K with respect to the light valence quark mass and the difference between the K 0 and K + masses after removing electromagnetic contributions to find m d − m u and hence m u /m d on each ensemble. The most important differences between this analysis and that of Ref. [26] are the extension of the 0.06 fm ensemble to 895 lattices and the addition of an ensemble with a ≈ 0.04 fm. Figure 16 shows the values of m u /m d for each ensemble, and the continuum extrapolation.
With the addition of this data at small lattice spacings, we now choose to omit the 0.15 fm data from our central fit, and use the fit including this ensemble as one of our alternative fits for estimating the systematic error due to the choice of continuum extrapolation. We take the range of all of these continuum extrapolations as our estimate of the systematic error coming from the choices made in our continuum extrapolation.
Using the MILC HISQ (2+1+1)-flavor QCD ensembles and the values of and K 0 given in Eqs. (74) and (75), and following the approach described in Ref. [26], we obtain m u /m d = 0.4529 (48) The errors on the quantity are, in order, the statistical error and the errors from choices in the continuum extrapolation, from , from K 0 , from finite volume in the pure QCD calculation, and from the error in the experimental value of M K 0 − M K + [84]. The finitevolume effects are taken to be the difference between a NLO staggered chiral perturbation theory calculation and a nonstaggered calculation at NNLO for M π and F π and NLO for M K and F K . We note that the result in Eq. (76) should be considered an update to the result quoted in Ref. [85], m u /m d = 0.4556(55) stat ( +114 −67 ) syst (13) ∆M K . The current result includes newly generated 2+1+1 HISQ configurations at a ≈ 0.06 fm and 0.04 fm, as well as all our configurations at a ≈ 0.09 fm. Reference [85], which focused on physics for quarks heavier than m c , included only the subset of configurations at a ≈ 0.09 fm for which we have generated propagators for those heavy quarks. The smaller statistical error of the current result reflects the larger data set used. Our procedures for estimating systematic errors, however, actually give slightly larger values in the current analsyis than in Ref. [85]. To this level of precision, and within the scheme we are using, our EM errors in m u /m d come only from and not from K 0 , despite the large relative error in the latter quantity.
The errors in K 0 do, however, have an effect on the errors in m s and in ratios such as m s /m l [85,86].

IX. CONCLUSIONS AND OUTLOOK
Using the three-flavor MILC asqtad configurations, we have computed the EM quantities and K 0 , which parameterize the EM contribution to the K + -K 0 mass splitting, and to the K 0 mass itself, respectively. Our results are given in Eqs. (74) and (75). A comparison of our result for with those of other groups (and our preliminary result, labeled as MILC 16) is shown in Fig. 17 [18], and RBC 07 [9]. Note that our current value for m u /m d is plotted in Fig. 18 with the u, d, s,  Our result for m u /m d is consistent with those from most other groups, but lies on the low side of the range of results. From Fig. 16 one can see that the low continuum value from our data set is due to the results from the two finest lattice spacings, a ≈ 0.06 fm and a ≈ 0.04 fm. The latter is finer than the finest of the ensembles used by the other groups, which has a ≈ 0.054 fm. Because discretization errors depend on the lattice action, however, it is unclear at this point whether the difference in available lattice spacings is relevant to the apparent differences seen in Fig. 18.
While the electroquenching errors for are under control, these errors are uncontrolled for most quantities, for example, K 0 . To move beyond the electroquenched approximation, we have developed a dynamical EM code [87] and are beginning to generate unquenched QCD+QED ensembles. These ensembles will be crucial to our efforts to obtain precise results for the hadronic contributions to (g − 2) µ , as well as for calculations such as the proton-neutron mass difference and improvements in the result for K 0 .
Appendix: Obtaining the mass from the self-energy at finite T For infinite T , the standard procedure to get the correction to the squared mass is to evaluate the Euclidean self energy σ(p 0 ) at p 0 = im. In this appendix, we show that that method does not in general give the right answer at finite T . In particular, σ(im) is dependent on the routing of the loop momentum through the diagram. Nevertheless, we show that the particular momentum routing chosen in Sec. V does allow us to extract the mass correction from σ(im) because the natural continuation of σ(p 0 ) away from the Matsubara frequencies 2πn/T happens to be particularly simple.
To introduce our notation and approach, we first review the usual procedure when the time extent T is infinite. The momentum-space Euclidean propagator has the form where m is the Lagrangian mass, σ ∞ is the self energy, the subscript ∞ indicates that T is infinite, and we have taken the case of vanishing spatial momentum, p = (p 0 , 0), for simplicity.
To find the physical mass, we Fourier transform to position space where C is a constant, and p 2 0 = −M 2 is the location of the single-particle pole When T is finite, the calculation of the mass correction changes in two crucial ways. First of all, the integral over p 0 in Eq. (A.2) becomes a sum over p 0 = 2π /T , where runs over the integers. The self energy σ(p 0 ) and henceG(p 0 ) are moreover only well-defined for these discrete values of p 0 . We may continue these functions to σ cont (p 0 ) andG cont (p 0 ), defined on the full complex p 0 plane, but the continued functions are not unique. Second, the internal loop energy (for example, k 0 in Eq. (63)) in the determination of σ(p 0 ) is itself discrete, so that σ(p 0 ) is not the same function of p 0 as σ ∞ (p 0 ), even on the discrete points p 0 = 2π /T .
These two changes interact in interesting ways, with the result that the procedure to obtain the squared-mass correction by evaluating σ cont (p 0 ) at p 0 = im is not valid in general.
We discuss the discrete sum over p 0 first. To extract the mass, we need to compute G cont (τ ) = dp 0 2π e ip 0 tG cont (p 0 ). (A.8) AlthoughG cont (p 0 ) is not unique, it is straightforward to check that another continuation constructed by adding a function that vanishes at p 0 = 2π /T , such as sin(p 0 T /2), will not change G(t), although it does of course change G cont (τ ). This still leaves open the question of howG cont (p 0 ) should be chosen. For now, we simply state that we should choosẽ G cont (p 0 ) so that G cont (τ ) is strongly damped for large τ . By a standard theorem of Fourier transformations, we can accomplish this ifG cont (p 0 ) and all its derivatives are continuous and absolutely integrable over the real p 0 line [88].
If G cont (t) is exponentially damped for mt 1, we can, in practical situations, neglect most or all of the n = 0 terms in Eq. (A.7). A standard approach is just to include n = −1 in addition to n = 0, so that we include a backward propagating meson in our fit Ansatz for G(t): The fit for mt 1 will then effectively isolate the first contribution, from G cont (t), and extract the corrected mass M from its exponential decay. If Jordan's lemma applies to the Fourier transform and if the only single-particle pole inG(p 0 ) is the usual one near p 0 = im, then the correction to the squared mass is indeed just σ cont (im). We will see below, however, that this will not be true in general.
So we are led to consideration of the finite-T self energy in momentum space σ(p 0 ), and how it may be continued away from the special values p 0 = 2π /T to σ cont (p 0 ). A natural choice for σ cont (p 0 ) is simply the result of doing the loop energy/momenta sums for arbitrary external p 0 , instead of only for the special values. For example, we can perform the sum on first term on left-hand side Eq. (63) for any p 0 . Because the resulting self-energy function and its derivatives obey the continuity and integrability conditions mentioned above, G cont (t) will automatically be exponentially damped 15 as desired.
An undesirable, but unavoidable, feature of this continuation σ cont (p 0 ) is that it depends on the routing of the external momentum p 0 through the diagram. The dependence on routing vanishes when p 0 = 2π /T because the loop energy may be shifted by this amount.
But away from these special points, there is no reason for σ cont (p 0 ) to be independent of the routing; we have checked this dependence numerically for σ cont (p 0 ) = 1 L 3 T k 0 , k Iŝ − d 4 k (2π) 4 Iŝ, (A. 10) with Iŝ the photon-sunset integrand given by Eq. (62). Here we have considered the difference between the sum and the integral, rather than the sum itself, to avoid having to cut off the sum over k, which is irrelevant to the current discussion. 16 Further, the dependence on momentum routing persists when σ cont is evaluated at p 0 = im, which indicates that the rule relating the mass correction to σ cont (im) cannot be true in general. We emphasize that this is a problem with the rule, rather than some fundamental problem with the definition of the mass correction itself: The finite-T propagator G(t) is of course completely independent of the routing.
To examine this issue further, we consider the two obvious possible momentum routings in the sunset diagram. Routing A, which we used in Sec. V, has p − k on the photon line and k on the internal meson line. Routing B has k on the photon line and p − k on the internal meson line. With p = (p 0 , 0) and k = 0, , (A.11) where Eq. (A.11) is copied from Eq. (62). In both cases we have added on the 00 component of the tadpole, which is independent of the external momentum. The linear term in k 0 in the numerator of I B s cannot be dropped since the denominator is not symmetric under k 0 → −k 0 . When p 0 = 2π /T , I Â s and I B s clearly give the same result for σ(p 0 ), as can be seen by shifting the summation variable k 0 → k 0 + p 0 in I B s (and then dropping a linear term in k 0 in the numerator of I Â s ). Extracting the mass correction is easy for routing A. We can see from Eq. (A.11) that σ A cont (p 0 ) has the simple form α + βp 2 0 , where α and β are independent of p 0 . Therefore, G A cont (p 0 ) has a simple pole close to p 0 = im, and Jordan's lemma allows us to close the contour as usual in the upper half plane (for t > 0) for the Fourier transform ofG A cont (p 0 ). This determines the squared-mass correction to be σ A cont (im), as was assumed in Sec. V. Extracting the mass correction in the case of routing B is more subtle. To see the relation between the self energy from I B s and I Â s when p 0 is not at a special point, we use the Poisson summation formula to write where n runs over the integers. We can now make the shift k 0 → k 0 + p 0 in both integrals, converting I B s into I Â s . Differences remain, however, from the resulting phase e inT p 0 and from the term in the numerator linear in k 0 , which gives a nonvanishing contribution when n = 0. The difference between the self-energies is then ∆σ(p 0 ) = − 2 L 3 n≥1 k e −ω k nT k 2 sin 2 (nT p 0 /2) k 2 + m 2 − p 2 0 ω k + p 0 sin(nT p 0 ) , (A.14) where ∆σ ≡ σ B cont − σ A cont , and ω k ≡ k 2 + m 2 . Because of the additional factors of sin 2 (nT p 0 /2) and sin(nT p 0 ), which blow up for large imaginary p 0 , the analytic properties of σ B cont are not standard, and we must reexamine the usual assumptions that go into finding the mass. To simplify the algebra we take mT 1 and consider p 0 = iy with y > ∼ m. We can therefore neglect the exponentially falling terms exp(−ynT ) from the sine functions and keep only the growing ones. The sum over n then immediately gives ∆σ(iy) = 1 2L 3 k (ω k + y) 2 k 2 ω k (e T (ω k −y) − 1) .
The values of ω k for each discrete value of k therefore determine singularities in ∆σ. As y approaches a discrete value of ω k from below, ∆σ goes to +∞, and then comes up from −∞ as y increases above ω k . Because the self-energy varies over the full range (−∞, ∞),G B cont will have a pole near each of the singularities in ∆σ. Figure 19 shows how this occurs for three choices of mT and mL. The equation for the poles is y 2 = m 2 + σ B cont (iy), which we find from the crossings of the curves (y/m) 2 and 1 + ∆σ(iy)/m 2 , where we have neglected the difference between ∆σ and σ B cont . This difference is σ A cont , which just gives a relatively small correction to the terms (y/m) 2 and 1, and does not change the qualitative picture.
In the plots, we have included all the k values in the sum in Eq. (A.15) that contribute significantly in the region of y/m shown.
The left-hand plot (mT = 10, mL = 5) shows that, in addition to the "normal" pole close to y = m, there are anomalous poles close the singular values in ∆σ where y = ω k , for some k. As L increases, the possible values of k get closer, and the poles get denser.
We observe this feature in the middle plot (mT = 10, mL = 10). As L → ∞, the poles pile up at y = m. On the other hand, as mT gets smaller, the residues of the singularities in ∆σ increase like 1/(mT ). This can lead to the particularly strange situation where the normal pole inG cont close to y = m disappears, as shown in the right-hand plot (mT = 5, mL = 10).
Because there are many poles in the propagator, and often many of them are close to p 0 = im, the quantity σ B cont (im) has no direct relation to the mass correction. Nevertheless, the finite-T propagator G(t) is always well-defined, and in principle one could always extract the mass from G(t) numerically. The relation between the self-energy at p 0 = im and the mass correction is however problematic, and it seems unlikely in most cases that the dependence of the self-energy on p 0 will be simple enough to relate the mass-correction to the self energy at p 0 = im, as we did for σ A cont (im). It is worth making contact here with the argument given in Ref. [72]  where the prime on the sum on n indicates that n = 0 should be omitted; it is cancelled by the infinite-T subtraction. They then argue that I(k 0 , k, im) has no poles on the real k 0 axis and is infinitely differentiable, with all of its derivatives integrable, which implies that δσ cont (im) vanishes faster than any power of 1/T as T → ∞. This argument explains why the QED L FV correction δ γ,QED L FV shows negligible dependence on T for the values of mT relevant to Fig. 5. Indeed, using routing A, the unique single-particle pole inG cont near p 0 = im implies that the leading T -dependence in QED L is suppressed by a factor of exp(−mT ). 17 However, the Borsanyi et al. argument does not apply in general for routings that generate complicated p 0 dependence away from the discrete points 2π /T . In particular, the argument cannot be use to conclude that routing-dependent differences in σ cont (im) are 17 Note that the extra term δ t,+ FV (mL, mT ) for QED T L , given in Eq. (67), is not negligible for any of our data.
similarly suppressed by exp(−mT ) and therefore negligible for values of mT used in our computation. To see this, we look at a simple example with I(k 0 , k, p 0 ) = 1 (k 0 + p 0 ) 2 + k 2 + m 2 . (A.17) With p 0 = iy and y ≤ ω k (for fixed k), there is a simple pole in the upper half plane at k 0 = i(ω k − y). When y = m, the n ≥ 1 terms in the sum give contributions (after integration over k 0 ) proportional to exp(−nT (ω k − m)), while the n ≤ −1 contributions are more highly suppressed for large T since the pole in the lower half-plane is further from the axis. Because k = 0, all the terms in the sum over k indeed decay exponentially with T .
The rate of decay, however, can be very small for large L, because the lowest momenta have magnitude 2π/L. Thus it is not obvious that the difference between finite T and infinite T can be neglected, even if one just focusses on σ cont (im). More importantly, there are poles in σ cont (iy) for y = ω k (for some k) as the k 0 pole in I moves down to the real axis. These poles mean that −p 2 0 = y 2 = m 2 + σ cont (iy) can have multiple solutions, so there are multiple poles in the momentum space propagatorG(p 0 ), as we have seen in Fig. 19. If the higher poles are close to y = m (as in Fig. 19 (middle)), or the y ≈ m pole is absent entirely (as in Fig. 19 (right)), σ cont (im) will have little to do with the finite-T mass correction. Unfortunately, it is likely that the generic case will be like routing B rather than routing A -it seems to be an accident that with routing A no p 0 dependence appears in the denominator of our integrand, so that σ A cont (p 0 ) is a simple (quadratic) polynomial in p 0 . Finally, as an estimate of how important these effects are for the actual simulation data, we study the routing dependence of the self energy at p 0 = im, coming from the photon sunset graph and 00 component of the photon tadpole. (As elsewhere in this Appendix, the spatial part of the photon tadpole is not included because it has no routing dependence.) In Fig. 20, we plot the ratio of ∆σ(im)/σ A cont (im) vs. mL for the data shown in Fig. 8 above. As expected from the above discussion, the dependence increases with mL for fixed mT , and decreases with mT for fixed mL. Note that, even though mT is large, the routing dependence is not negligible for much for our data, and approaches 50% for the largest values of mL.