Edinburgh Explorer Isovector electromagnetic form factors of the nucleon from lattice QCD and the proton radius puzzle

We present results for the isovector electromagnetic form factors of the nucleon computed on the coordinated lattice simulations ensembles with N f ¼ 2 þ 1 flavors of O ð a Þ -improved Wilson fermions and an O ð a Þ -improved vector current. The analysis includes ensembles with four lattice spacings and pion masses ranging from 130 up to 350 MeVand mainly targets the low- Q 2 region. In order to remove any bias from unsuppressed excited-state contributions, we investigate several source-sink separations between 1.0 and 1.5 fm and apply the summation method as well as explicit two-state fits. The chiral interpolation is performed by applying covariant chiral perturbation theory including vector mesons directly to our form factor data, thus avoiding an auxiliary parametrization of the Q 2 dependence. At the physical point, we obtain μ ¼ 4 . 71 ð 11 Þ stat ð 13 Þ sys for the nucleon isovector magnetic moment, in good agreement with the experimental value and h r 2 M i ¼ 0 . 661 ð 30 Þ stat ð 11 Þ sys fm 2 for the corresponding square radius, again in good agreement with the value inferred from the ep -scattering determination [Bernauer et al. , Phys. Rev. Lett. 105 , 242001 (2010)] of the proton radius. Our estimate for the isovector electric charge radius, h r 2 E i ¼ 0 . 800 ð 25 Þ stat ð 22 Þ sys fm 2 , however, is in slight tension with the larger value inferred from the aforementioned ep -scattering data, while being in agreement with the value derived from the 2018 CODATA average for the proton charge radius.


I. INTRODUCTION
The internal structure of the nucleon still poses many open questions. Not only is the composition of its spin and momentum not completely understood [1][2][3][4], but even its size is subject to significant uncertainty arising from discrepancies between different determinations: there is a decade-old inconsistency [5,6] between the electric charge radius of the proton as obtained from ep-scattering [hr 2 p i 1=2 ¼ 0.879ð8Þ fm [7]] in good agreement with the value hr 2 p i 1=2 ¼ 0.8758ð77Þ fm from hydrogen spectroscopy [8] on the one hand, and the most accurate determination from the spectroscopy of muonic hydrogen [hr 2 p i 1=2 ¼ 0.84087ð39Þ fm [9]] on the other. This significant discrepancy, which has been dubbed the "proton radius puzzle" [10], has given rise to a variety of initiatives to better determine the proton radius. Recent measurements of hr 2 p i 1=2 using (electronic) hydrogen spectroscopy [11][12][13] mostly tend to give somewhat smaller values, while the newest determinations from ep-scattering give different results, albeit with still rather large uncertainties: the A1 collaboration at MAMI uses an initial-state radiation setup [14] to achieve very small momentum transfers and finds a large value hr 2 p i 1=2 ¼ 0.870ð28Þ fm [15], while the PRad experiment at Jefferson Lab [16] has reported a small result of hr 2 p i 1=2 ¼ 0.831ð14Þ fm [17]. An upgrade of PRad has recently been proposed [18], and a new ep-scattering experiment, MAGIX, is being prepared at Mainz [19]. To complement the result from muonic hydrogen spectroscopy with a result from μp scattering, the MUSE collaboration aims to measure the μp scattering cross section to subpercent accuracy [20]; a similar experiment is in preparation at CERN by the COMPASS++/AMBER collaboration [21]. At present, a significant tension remains between the determinations based on muonic hydrogen and ep-scattering, respectively. A variety of possible theoretical explanations have been proposed, but so far there has not been decisive evidence in favor of any particular explanation [10,22]. It remains notable that dispersive analyses favor a smaller value of the proton charge radius [23][24][25][26], and in fact have done so since before the muonic measurements [27].
In the context of scattering experiments, the proton charge radius is determined from the derivative of the electric form factor G E ðQ 2 Þ at Q 2 ¼ 0. To determine this derivative accurately, high-quality data at very low momentum transfer Q 2 and/or a parametrization that remains trustworthy over a large range of Q 2 are required. In order to gain a proper understanding of the origin of the proton radius puzzle and associated questions, theoretical determinations of nucleon structure observables, and in particular of the electromagnetic form factors of the nucleon, from first principles are required. Lattice QCD calculations are therefore instrumental in predicting the nucleon charge radii from QCD. This has generated lively activity on this topic within the lattice QCD community [28][29][30][31][32][33][34][35][36][37][38][39][40][41], although at present, the precision of lattice results is not yet sufficient to rule out either the electronic or the muonic result for the proton radius.
At large momentum transfers Q 2 , the electromagnetic form factors are the subject of another puzzle: while polarizationtransfer experiments [42] find that the ratio of electric and magnetic form factor of the proton, μ p G E;p ðQ 2 Þ=G M;p ðQ 2 Þ, decreases roughly linearly for large Q 2 [43][44][45][46][47], experiments based on the Rosenbluth separation formula [48] find that it is roughly constant and of order 1 (albeit with rapidly increasing errors at large Q 2 , where G E;p contributes little to the total cross section) [49][50][51]. This discrepancy has been explained theoretically as the result of two-photon exchange contributions to the cross-section measurements [52][53][54], but the situation is not yet completely clarified [55].
In this paper, we present our lattice QCD-based determination of the isovector electromagnetic form factors of the nucleon from the N f ¼ 2 þ 1 coordinated lattice simulations (CLS) ensembles [56]. We use two state-of-the-art methods, known as the summation method and two-state fits to extract G E and G M from Euclidean correlation functions for a range of momentum transfers Q 2 ≲ 1 GeV 2 . Given the limited Q 2 range of our data, we focus on extracting the electric and magnetic charge radii and the magnetic moment of the nucleon using a variety of methods to check for consistency. Extrapolated to the physical point, our results favor a small value of the electric charge radius, although the present accuracy is not sufficient to make a decisive statement in this regard.
The paper is organized as follows: in Sec. II we present the ensembles and operators used in our simulations; Sec. III describes the methods we use to account for the presence of excited-state contaminations in our data, and Sec. IV the methods we employ to parametrize the form factor data on each lattice ensemble, while Sec. V gives the details for the extrapolation of our results to the continuum and infinite-volume limits at the physical pion mass. Our conclusions and a discussion of the results are contained in Sec. VI. For completeness and ease of reference, we provide tables with the values of all measured form factors in Appendix A. Appendix B lists the priors we use to stabilize our two-state fits, while the results of both dipole and z-expansion fits on each ensemble are given in Appendix C. Appendices D and E give the results of the extrapolations to the physical point using two variants of chiral perturbation theory. We consider the ratio G M ðQ 2 Þ=G E ðQ 2 Þ in Appendix F.

II. LATTICE SETUP
We use the CLS N f ¼ 2 þ 1 ensembles [56] that have been generated with nonperturbatively OðaÞ-improved Wilson fermions [57,58] and the tree-level improved Lüscher-Weisz gauge action [59]. In order to prevent topological freezing [60], the fields obey open boundary conditions in time [61], with the exception of ensemble E250 which uses periodic boundary conditions in the time direction. The reweighting factors needed to correct for the treatment of the strange quark determinant during the gauge field generation are obtained using the method of Ref. [62]. See Table I for a list of ensembles used in this work, which cover the range of lattice spacings from 0.050 to 0.086 fm. Our setup in this work is identical to that used in our paper on the isovector charges and momentum fractions of the nucleon [63], to which we refer the reader for further details.
We obtain the matrix element of the vector current through the ratio [65,66] R J μ ðt; t s ; qÞ where the nucleon two-and three-point-functions are given by where t s and t refer to the time slice of the nucleon sink and the current operator insertion, respectively. In our setup, where the nucleon at the sink is at rest, i.e., for a momentum transfer q the initial and final nucleon states have momenta Our interpolating operator for the proton is built using Gaussian-smeared [67] quark fieldsq using spatially APE-smeared [68] gauge links in the covariant Laplacian Δ and tuning the parameters κ G and N G so that a smearing radius r G ∼ 0.5 fm [69] is realized. For the projection matrix Γ we use Furthermore we use the improved conserved vector current with the improvement coefficient c cs V from [70]. The improvement term is implemented using the symmetric lattice derivative, i.e., To compute the three-point functions, we employ extended propagators in the "fixed-sink" method, requiring additional inversions for each value of t s studied while allowing the momentum transfer to be varied via a phase factor at the point of the current insertion [71]. As in [63], we apply the truncated solver method with bias correction [72][73][74] to reduce the cost of the inversions. The number of highprecision and low-precision measurements carried out on each gauge configuration is indicated in Table I. We have optimized the solver parameters with respect to relative cost over error reduction on a subset of configurations, as described in Ref. [75]. For E250 we increase statistics with increasing source sink separations, i.e., the values for N LP and N HP in Table I refer to the two largest source sink separations on that ensemble. For the polarization given in Eq. (7) the asymptotic value for the spectral decomposition of Eq. (1) reads where G E 1 and G M are the isovector electric and magnetic Sachs form factors, respectively, with G E ð0Þ ¼ 1. The   TABLE I. Overview of ensembles used in this study. The quoted errors on the pion and nucleon masses include the error from the scale setting [64]. N HP and N LP denote the number of high-precision (HP) and low-precision (LP) measurements on all of the N CFG configurations for each value of the source-sink t s , respectively. For E250 N HP and N LP refer to the number of sources used for the two largest values of t s ¼ 1.3; 1.4 fm while for each of the smaller values t s ¼ 1.2 fm and t s ¼ 1.0 fm the number of sources is reduced by a factor of 2 and 4, respectively. In addition we have extracted the electric form factor from the spatial components of the current matrix element, however the extracted values are less accurate compared to Eq. (10b). Note that in this case one obtains a system of equations similar to G M .
FIG. 1. Effective form factors for ensemble D200 (upper panel) and E250 (lower panel). In each panel, the first row corresponds to the smallest nonvanishing momentum in the given ensemble, i.e., Q 2 ¼ 0.089, 0.040 GeV 2 for D200 and E250, respectively, and the second row corresponds to Q 2 ∼ 0.5 GeV 2 . For the four available source-sink separations t s , the effective form factors are displayed as a function of the current insertion time t, offset to the midpoint between nucleon source and sink. The curves represent the two-state fits in their respective fit intervals. The gray band and black data point correspond to the estimate for the ground-state matrix element for the summation and two-state method, respectively. The data points are displaced for better visibility. extraction of the magnetic form factor via Eq. (10c) amounts to solving a system of linear equations, since, in general, several different choices of q produce the same value of Q 2 . Consequently there are several possibilities for obtaining an estimate of G M ðQ 2 Þ. We use the following estimator, averaging over all momenta q contributing to the same Q 2 . The resulting effective form factors for every source-sink separation for the first nonzero momentum and a momentum close to 0.5 GeV 2 on the ensembles D200 and E250 are shown in Fig. 1. Unless stated otherwise, errors are computed using the jackknife method on binned data with a bin size of two for all ensembles except E250, where the spacing between two analyzed configurations in terms of molecular dynamics time is twice as large compared to e.g., D200 or C101 to begin with, and consequently no binning is applied for E250. For the conversion to physical units we use the lattice spacing determination of [64].

III. EXCITED-STATE SYSTEMATICS
Baryonic correlation functions suffer from a strong exponential growth of the relative statistical noise when the distance in Euclidean time between operators is increased [76]. Therefore, for the typical source-sink separations in current lattice calculations of baryon structure observables, it cannot be guaranteed that contributions from excited states are sufficiently suppressed. Evidently, special care is required to avoid any bias from unwanted excited-state contributions [77,78]. Predominantly, two approaches have been widely adopted to address this problem: the summation method [41,[79][80][81][82][83] and multistate fits [41,[84][85][86][87].
While the former is (in its simplest incarnation) a straightforward method to apply, the latter is more involved as one is forced to make specific assumptions and/or parameter choices, regarding e.g., fit windows, at various steps of the analysis. In this section we give details on our implementation of the two respective methods and discuss how errors related to methodology are incorporated in the final results. The form factor values obtained with both methods are collected in Appendix A for all ensembles.
For the two-state fits of the effective form factors, we use priors obtained from an analysis of the two-point functions. We fit the two-point function with the ansatz and extract the energy gap between ground (E 0 ) and first excited state (E 1 ), as well as the ratio of the respective overlaps, i.e., Δðp 2 Þ ¼ E 1 ðp 2 Þ − E 0 ðp 2 Þ and ρðp 2 Þ ¼ c 1 ðp 2 Þ=c 0 ðp 2 Þ. In practice, the gaps and ratios depend on the choice of fit ranges in time, especially the starting time slice. We therefore repeat the fits for different starting time slices and obtain our best estimate as a weighted average over the region where the results have stabilized (see Fig. 2).
For the nonlinear exponential fits we use the VARPRO method [88], which only needs initial guesses for the energy levels. Monitoring the ground state energy, we find that the extraction works well for all ensembles for momenta up to 1 GeV 2 , see Fig. 3. The results, which are used in the subsequent analysis, are given in Appendix B for all ensembles.
For the asymptotic limit of the ratio in Eq. (1) we obtain R as ðt; t s ; FIG. 2. Energy levels (left) and overlap factors (right) extracted from the zero-momentum nucleon two-point function on ensemble D200, for the ground state (blue) and the first excited state (red). All quantities are given in lattice units.
where r 00 is proportional to the ground state matrix element 2 G E=M ðQ 2 Þ, and the last two terms in the first line come from the expansion of the two-point functions in Eq. (1). For each value of the momentum q 2 , the gap Δðq 2 Þ and the terms proportional to ρ are universal, and we therefore proceed by fitting the electric and magnetic effective form factors simultaneously. The fits are performed up to a maximum transfer momentum of about 1 GeV 2 . The fits to the effective form factors are stabilized using Gaussian priors for Δ and ρ (see Appendix B), whose central values are set to the results of the fits to the twopoint function. We monitor the impact of our particular choice for the priors on the extracted form factor values by varying the width of the priors in all fits. To this end we multiply the errors of Δ and ρ by a factor between 1 and 5.
The associated fit results are labeled 1x; 2x; …; 5x to reflect their dependence on the prior width. In order for the prior to be effective, we constrain the width to maximally half the mean value. The idea is to strike a balance between the statistical accuracy of our extracted values and any potential bias introduced by the priors. Therefore we choose the final values for the two-state method to come from the analysis with the maximum prior width that (a) gives values compatible within errors with all determinations based on a smaller width, and (b) maintains an acceptable error increase. We made a rather conservative choice for the latter, allowing the error to increase by factors between 2 and 4 for prior widths 2x to 5x. For the ensembles in this work, it turns out that, for Q 2 ≤ 0.5 GeV 2 , the final values come from an analysis with prior width 5x. In addition to the above analysis, we perform fits to the summed correlators. The summation method takes advantage of the fact that in the ratios of Eq. (13), when summed over time slices in between source and sink, the contributions from excited states are parametrically suppressed. Summing Eq. (13) over t, omitting t skip time slices at both ends, we obtain 3 For the effective form factors we may write the summed ratios as where in our analysis we use t skip ¼ 2a. The summation method crucially depends on the computation of observables for multiple source-sink separations. However, since the signal of the correlators deteriorates with larger time separations, we are limited in the number of available t s . Our current data does not deviate significantly from linear behavior (see Fig. 4) for any of the ensembles and we therefore fit Eq. (14) in the asymptotic limit with only ground state contributions, i.e., where C E=M is an irrelevant offset.
The results of the two methods are shown together in Fig. 5 for the near-physical ensemble E250. In order to assess systematics associated with excited-state effects, we apply all subsequent analyses to the data obtained by both methods, hence we distinguish between summation and explicit two-state fits in the following.

IV. PARAMETRIZATION OF THE Q 2 DEPENDENCE
Since the magnetic moment μ is defined as the intercept of G M and the electric and magnetic radii are determined by the slope of the form factors at zero momentum transfer, a description of the Q 2 dependence is necessary. In analogy to [41] we perform two analyses. In the first analysis we parametrize the Q 2 dependence using either a dipole or a z-expansion ansatz [90] and subsequently perform chiral and continuum extrapolations. In the second analysis we use covariant baryon chiral perturbation theory (BChPT) [91][92][93][94] to fit the available form factor data for all ensembles simultaneously. The latter approach combines the chiral extrapolation and the fit to the Q 2 dependence, i.e., without intermediary extraction of the radii using a separate ansatz of the Q 2 behavior for each ensemble. We use the expressions in [94] as they include vector meson degrees of freedom, e.g., ρ-mesons, in order to extend the description of the form factors to larger values of Q 2 .
For the dipole fits we use the ansatz where a E ¼ 1. The dipole mass and a M are the fit parameters. The dipole fit is performed separately for the electric and the magnetic form factor, thus the dipole mass is allowed to be different for G E and G M . The fits are performed with cuts in Q 2 of 0.6 GeV 2 and 0.9 GeV 2 , and the corresponding results are collected in Appendix C.
A model-independent description of the Q 2 dependence of G E;M can be obtained by employing the z-expansion [90]. The form factors may be decomposed as FIG. 4. Dependence of the summed ratios for the electric (left) and magnetic (right) effective form factors on the source-sink separation Eq. (15). Data is shown for the first nonzero momentum on D200, i.e., Q 2 ¼ 0.089 GeV 2 , together with a linear fit using Eq. (16).
FIG. 5. Comparison of the summation (blue circles) and two-state (red triangles) method for the priors in Appendix B on ensemble E250. The black band, which corresponds to the parametrization of [89], is displayed to enable a first comparison to phenomenology. The continuum extrapolation of the lattice data is discussed in Sec. V. with The parameter t 0 is the value of −Q 2 which is mapped to z ¼ 0. On each ensemble we set t cut ¼ 4M 2 π , where M π denotes the pion mass of the respective ensemble. In general we stabilize the fits using priors, where the coefficients are constrained using Gaussian distributions. The mean and width of these distributions are chosen such that coefficients do not change drastically with increasing order of the expansion. To that end we first perform unconstrained fits to order k ¼ 1. For all subsequent fits with orders k > 1 in the z-expansion we add Gaussian priors for the first two coefficients, centered around the means of the unconstrained fit, with widths of 5 times the corresponding error estimate. For the remaining coefficients we choose Gaussian priors centered around zero with twice the maximum of the fitted coefficients up to first order. 4 Throughout we always enforce G E ð0Þ ¼ 1. Additionally we performed fits with stronger constraints on the large k behavior of the coefficients a k coming from the falloff of the Sachs form factors for large spacelike momentum transfer [95]. These conditions may be implemented in the form of sum rules [96] For a given range in Q 2 the optimal value for t 0 is (see Ref. [90]) given by We perform the fits with t 0 ¼ 0 GeV 2 and t 0 ¼ t opt 0 ð0.6 GeV 2 Þ for momentum cuts of Q 2 ≤ 0.6 GeV 2 and Q 2 ≤ 0.9 GeV 2 ; the results are given in Appendix C.
The dependence of the extracted electric and magnetic radii on the maximum order of the z-expansion is shown in Fig. 6. We find that the fits stabilize around k max of 5 for the ansatz with weaker assumptions about the bounds on the z-expansion parameters. Fits using the large-Q 2 constraints of Eq. (20) in general converge more slowly, i.e., for larger values of k max ; once plateaued, they however give compatible results. Since we are interested in the low-Q 2 region, we choose the fits without imposing Eq. (20). In Fig. 7 we show the extracted values for data with Q 2 ≤ 0.9 GeV 2 for the summation method. We see that the values obtained from the different parametrizations of the Q 2 dependence are consistent, while the dipole ansatz gives smaller errors. We demand that at least four nonvanishing values of Q 2 enter the dipole and z-expansion fits, respectively. This implies, for instance, that ensemble H105 is excluded from the analysis when Q 2 ≤ 0.6 GeV 2 is applied.
In the direct approach based on covariant BChPT, we fit the form factor data for both G E and G M on all ensembles simultaneously, and thus the intermediate parametrization of the Q 2 dependence is avoided. While the ensembles are treated as statistically independent, we do take the correlation among different Q 2 and between G E and G M within an ensemble into account. Statistical errors are derived from the covariance matrix estimated in the least-square fits that yield the results for charge radii, the magnetic moment and other FIG. 6. Dependence of the extracted electric and magnetic radii on the maximum order k of the z-expansion for t 0 ¼ 0 (circle) and t 0 ¼ −0.190 GeV 2 (diamond) for the two-state method with (filled symbols) and without (empty symbols) constraints of Eq. (20) on ensemble D200. fit parameters. Since this method unifies the description of the Q 2 and the M 2 π dependence, it is presented in Sec. V.

V. CHIRAL AND CONTINUUM EXTRAPOLATION
In this section we present our chiral and continuum extrapolation, in order to arrive at results for the isovector electric and magnetic radii, as well as for the magnetic moment. We adopt two strategies, the first of which follows up on the intermediate results of Sec. IV and is presented in Sec. VA. The second achieves a simultaneous description of the Q 2 and the M 2 π dependence of the results of Sec. III and is presented in Sec. V B. The model-averaging procedure used to arrive at our final results is explained in Sec. V C.
A. HBChPT extrapolation of the radii and the magnetic moment As detailed in Sec. IV, we perform fits to the Q 2 dependence of the Sachs form factors using a dipole and z-expansion ansatz. In this way we obtain estimates for the magnetic moment and the electromagnetic radii on each ensemble. For the chiral and continuum extrapolation of this dataset we perform uncorrelated simultaneous fits based on heavy baryon chiral perturbation theory (HBChPT), i.e., Eqs. (41), (45) and (47) of Ref. [28], using the physical nucleon mass. We apply momentum cuts to the data between Q 2 ≤ 0.6 GeV 2 and Q 2 ≤ 0.9 GeV 2 as well as a pion mass cut of M π ≤ 0.28 GeV. We fit the low-energy constants κ 0 , E 1 , B 10 and B c2 , while setting the remaining constants to their phenomenological values, 5 i.e., The results for these fits extrapolated to the value of the pion mass in the isospin limit of QCD (134.8 MeV [97]) are given in Appendix D. We have analyzed variations of the fit function, amending the formulas with lattice spacing terms of Oða 2 Þ or including finite volume effects 6 [99], i.e., Here, the subscripts HB, a and L are used to distinguish the estimates in HBChPT from the coefficients describing FIG. 7. Dipole (red squares) and z-expansion results (blue circles) for the radii and the magnetic moment, obtained with the summation method for a momentum cut of 0.9 GeV 2 . For better visibility, dipole and z-expansion data points corresponding to the same ensemble have been slightly separated from each other horizontally.
lattice artifacts and finite-volume effects, respectively. Fits to the HBChPT expression fail if they are applied to the charge radii and the magnetic moment determined from the dipole ansatz. Even excluding the results from ensemble E250, which are hardest to accommodate in the fit to the dipole data, does not improve the HBChPT description significantly. For the z-expansion extraction on the other hand, a good fit can be achieved for momentum cuts between Q 2 ≤ 0.6 GeV 2 and Q 2 ≤ 0.9 GeV 2 using HBChPT with and without lattice artifacts as parametrized in Eq. (23). We find simultaneous fits of finite lattice spacing and finite volume dependence to be unstable. Let us stress again that we only include ensembles with at least four data points remaining after momentum cuts are applied. This effectively limits the lower bound on the cut in Q 2 , which still allows for a chiral and continuum extrapolation, to roughly 0.6 GeV 2 . We also note that, even though for each ensemble a different number of data points enters in the z-expansion, the relative weight in the HBChPT fit does not reflect the density of available Q 2 points at low momentum transfers.
In this sense the two-step process, first performing z-expansion fits and subsequently extrapolating using HBChPT, masks the relative paucity of data points at small momentum transfer for some ensembles.

B. Direct BChPT fits
As an alternative to the intermediate determination of the Q 2 dependence via z-expansion or dipole fits, we perform direct fits of the covariant BChPT expressions of [94] to our form factor data. In this way we obtain a combined description of the Q 2 and the M π dependence. The fit procedure is the same as described in Ref. [41], i.e., we use the full expression of Ref. [94], without applying any expansions in Q 2 or M π and without including explicit Δ degrees of freedom. We note that the fit depends linearly on the four LECs d 6 ,c 6 , d x and G ρ . For further details on the LECs, we refer to Table III in Ref. [41]. It turns out that an important advantage of this approach to extracting the electromagnetic radii compared to the combined z-expansion and HBChPT analysis is its stability against considerably lowering the momentum cuts applied.
For the direct fits we obtain results for various momentum cuts between Q 2 ≤ 0.3 GeV 2 and Q 2 ≤ 0.6 GeV 2 for both the summation method and the two-state method. The applicability of the chiral EFT expressions with respect to the pion mass is not easily established [100][101][102][103], and we choose a rather conservative upper limit of 290 MeV, i.e., excluding S201 and heavier ensembles. Since we are also interested in the lattice spacing dependence, we choose the lower limit of the pion mass cut such that we keep ensembles at three lattice spacings, i.e., the lowest cut is equal to the pion mass on J303. In order to assess the dependence on the cut we choose intermediate steps such that one ensemble is removed at a time, i.e., N200 and H105. The resulting cuts for the pion mass read M π ≤ ½0.27; 0.28; 0.29 GeV. We perform the fits with and without terms parametrizing the lattice spacing or finite volume dependence, Fits leaving κ L as a free parameter are unstable, and we therefore fix κ L to the value from HBChPT [99], 7 i.e., As a cross-check, we perform fits where the lattice artifacts enter multiplicatively, i.e., In total we have six models, i.e., without any lattice artifact and either including discretization or finite volume effects, for the additive parametrization of Eq. (24) or the multiplicative one of Eq. (26).
Within our statistical errors, discretization and finitevolume effects are hardly significant, implying that for most cuts the corresponding coefficients are compatible with zero. 8 We perform simultaneous correlated fits of G E ðQ 2 Þ and G M ðQ 2 Þ for each model with the given dataset and cuts applied for every ensemble. The errors for the fit parameters are estimated using derivatives of the χ 2 function. The direct method leads to more stable results in comparison to the two-step procedure in which fits to the z-expansion are performed first, followed by the chiral and continuum extrapolations using HBChPT. While the two methods give consistent results, the direct method has smaller errors, especially for the magnetic form factor (see Fig. 8). Moreover, direct fits allow for more stringent cuts in the momentum transfer, and the analysis is more driven by data in the low Q 2 region, where the radii and magnetic moment are defined. Finally, the influence of priors for the 7 Note that in Ref. [99] f is used instead of F π and that we are using the expression for the isovector magnetic moment. 8 Similar to the HBChPT fits we find simultaneous fits of finite volume and lattice spacing dependence are not stable for all applied cuts and we do not include them in our final estimate. z-expansion is eliminated in the direct fits. For these reasons we restrict the following presentation of the final results to the direct method, however noting that the same procedure applied to the HBChPT extractions gives consistent results, albeit with larger errors.
The quality of the direct covariant BChPT description is illustrated in Fig. 9, where we present a typical fit to the extracted form factors for summation data, corresponding to the model without lattice artifacts or finite-volume corrections. The data is described rather well over the fit range in Q 2 for all ensembles, already suggesting that, within our statistical accuracy, lattice artifacts are not discernible. Additionally, Fig. 9 illustrates the very different density of low-Q 2 data points for each ensemble. For the magnetic moment, most recent lattice determinations [32,37,86,105] lie below the experimental value. For our most chiral ensemble, E250, we observe (top right panel of Fig. 9) that the direct fit lies somewhat above the data points for the magnetic form factor, while still being compatible within the uncertainties. Thus the possible presence of a non-negligible source of systematic error in calculations of the magnetic form factor at the physical point merits further investigation.
From phenomenological dipole fits to experimental data, the ratio of the electric and magnetic form factor is known to show a rather constant behavior over a large range of Q 2 . Indeed we observe similar behavior in our data (see Fig. 10), where the two-state and summation data are rather flat with the exception of J303 showing signs of a light upward slope. This pattern motivates a linear extrapolation of the ratio for the magnetic moment up to Q 2 ≤ 0.6 GeV 2 (Q 2 ≤ 0.29 GeV 2 for E250), where the results are given in Table XVI and shown in Fig. 10 (right). It is reassuring to see that our fit, while not using these points, does reproduce the estimates from a linear extrapolation of the ratio rather well.

C. Model average and final result
As we have no a priori preference for the results from the summation method or two-state fits, we will treat both datasets on an equal footing. We obtain our final estimates and total errors from averages over fit models and kinematic cuts using weights derived from the Akaike information criterion (AIC) [106,107]. In this context the momentum and pion mass cuts applied can be reinterpreted in terms of a model selection problem [108]. One may introduce for each would-be-cut data point an additional fit FIG. 9. The summation-method data points for the Sachs form factors, and the blue band describing the corresponding direct covariant BChPT fit with momentum cut Q 2 ≤ 0.4 GeV 2 , pion mass cut 0.28 GeV and without lattice artifacts. The data point for G M ð0Þ is obtained from a linear fit to the ratio of G M and G E and is not used in the direct covariant ChPT fit. The fit depends linearly on the four LECs d 6 ;c 6 ; d x and G ρ (cf. [41]).
parameter that matches the respective data point exactly, thus giving no contribution to the χ 2 . The corresponding data point is effectively excluded from the least-square fit, while the model weight is decreased via the penalty term for additional parameters in the AIC. The AIC reads where χ 2 min;i denotes the minimum of the weighted least square, i.e., including covariance, for the ith model, n f the number of fit parameters, and n c the number of cut data points. In this way we obtain a criterion that takes the goodness of fit into account. At the same time it penalizes increasing the number of fit parameters while it favors including more actual data points [41]. For the weighting of different models on the same input dataset we use i.e., we normalize the AIC obtained for all models for summation and two-state data separately. Finally, we apply a flat weight function to the estimates from summation and two-state fits. In the final analysis we only include fits with a p-value larger than one percent, i.e., a total of 50 variations is considered. We adopt the procedure from [109], which we briefly sketch in the following, for estimating the systematic and statistical error of the model-averaged values. To that end we treat the modelaveraged estimate as a random variable with the following cumulative distribution function (CDF): i.e., the weighted sum of Gaussian distributions where the mean x i and variance σ 2 i is given by the best estimate and fit error of each model, and the weight w i is obtained as explained above. This effectively smoothens the otherwise rugged distribution of model postdictions and allows for a more robust estimate of the distribution parameters. An illustration of the smoothening is given in Fig. 11, where we show the CDF of Eq. (29) together with the corresponding distribution based on the individual w i only, i.e., without taking the error on the variation into account. 9 The final value and total error are easily read off from the distribution in Eq. (29) as the median, and the 1-σ percentiles, respectively. Under the assumptions that a rescaling of all errors σ i only affects the statistical error but not the systematic one, we can further separate the statistical and systematic errors, cf. [109].
In our previous work based on N f ¼ 2 ensembles [41], we used the spread in the central values as an estimate of systematic errors. While this procedure is robust, it is also very conservative and susceptible to overestimating the true error due to systematics. Therefore, in order to not be overly conservative and still be able to incorporate systematic errors in a robust way, we adopt the above model averaging procedure using AIC weights and obtain as our final results κ ¼ 3.71 AE 0.11 AE 0.13; where the first and second errors refer to statistical uncertainty and the total systematic error, respectively. One may even proceed further and estimate the individual contributions for every variation to the total systematic error. That is achieved by building the CDF in Eq. (27) not over all variations but rather first iterating over a particular feature, e.g., a momentum cut, and performing the analysis for every variant of that feature separately. From this we then build a secondary CDF like Eq. (27) and extract the variation-specific systematic error. Repeating this analysis for all variations we obtain the following systematic error budget: FIG. 10. The left panel shows the ratio of the magnetic and electric form factors for summation method data. The right panel shows the direct covariant BChPT result for the pion-mass dependence of the magnetic moment as a blue shaded area, together with the data points obtained from a linear extrapolation to Q 2 ¼ 0 of the summation method data on the left panel. The black diamond corresponds to the PDG value [104]. Note that the displayed data points were not used in obtaining the covariant BChPT result. 9 Note that this corresponds to replacing the Gaussian distributions in Eq. (29) with δ functions. δκ ¼ 0.11 exc AE 0.03 artifacts AE 0.04 Q 2 AE 0.02 m π AE 0.02 method ð¼ 0.13Þ; where the subscript "exc" refers to using either summation method or two-state fits, while "method" labels the fit ansatz for lattice artifacts based on either Eq. (24) or (26). We note that, due to correlations, the individual terms added in quadrature (as indicated on each line by the number in brackets) need not exactly reproduce the total error of Eq. (30). For the magnetic moment and the electric radius, the dominant source of systematic error remains the excited state contribution, while for the magnetic radius all systematic effects considered here have comparable size. Moreover, in the current analysis the magnetic radius is least affected by systematic errors.
In Fig. 12 we compare our work to recent lattice determinations and to the phenomenological values for the isovector magnetic moment and the isovector electromagnetic radii. While we postpone the comparison of our results with phenomenological determinations of the radii to Sec. VI, we remark that our value for the magnetic moment is in good agreement with the experimentally precisely known difference of proton and neutron magnetic moments. As for the comparison with other lattice calculations, we note that our estimate is compatible with the determinations from [105,110], while there is a sizable difference to the values from [32,37,86]. We stress that the difference is not related to the issue of preferring direct fits to the form factor data over the more conventional route via the z-expansion, as the latter shows a trend to higher values for the radius for our data. Our error estimates for the statistical and systematic errors are comparable in size with the other lattice determinations. For the isovector magnetic moment we see good agreement with phenomenology and [37,110]. We note that the missing data point for Q 2 ¼ 0 complicates the extraction of the low-Q 2 observables in most recent lattice determinations. Especially the z-expansion fits, at least for orders n ≥ 2, tend to overfit the dependence of the form factor at low Q 2 . In order to remedy this, either priors are introduced or mock data points at Q 2 ¼ 0, e.g., linear extrapolations of the ratio of the isovector form factors, are used to stabilize the description. We note that the direct approach, in this sense, has less freedom and by itself allows for considerably less variation in the form factors at low Q 2 (see Fig. 9). We believe this to be responsible, in large part, for the small errors we find in the isovector magnetic radius.

VI. CONCLUSIONS
We have calculated the isovector electromagnetic form factors of the nucleon in lattice QCD with dynamical up, down and strange quarks. The electromagnetic radii and the magnetic moment have been extracted accounting for systematic effects due to excited states, finite volume and nonzero lattice spacing. Our final estimates are listed in Eq. (30), with a detailed systematic error budget given in Eq. (31).
As an important benchmark, we reproduced the experimental value of the magnetic moment with an overall precision of 3.6%. The precision of the present calculation is significantly higher than that of our earlier study in twoflavor QCD [41], especially concerning the magnetic properties. For the isovector electric charge radius, our result is in good agreement with the phenomenological estimate inferred from the 2018 CODATA recommended value of the proton radius. 10 By contrast, after adding all errors in quadrature, we find a 2.4σ tension with the result from ep-scattering [7]. For the isovector magnetic radius, on the other hand, our result agrees well with the value inferred from the ep-scattering based determination [7], and exhibits a sizable tension with the other collected world data [96]. For ease of comparison, we translate our estimate for the isovector hr 2 E i into a result for the proton radius with the help of the experimental determination of the (squared) neutron charge radius, hr 2 n i ¼ −0.1161ð22Þ fm 2 [104]. After combining all errors we obtain hr 2 p i 1=2 ¼ 0.827ð20Þ fm, where the error is completely dominated by the uncertainties of the lattice calculation.
The excited-state contributions remain the dominant source of systematic error for the isovector vector form factors. A promising strategy to tame this issue, besides increasing source-sink separations, is to further stabilize multistate fits of the three-point functions by performing a dedicated study of the excitation spectrum. Moreover, an analysis of the excited-state contributions in chiral effective theory, as has been done for the axial form factor [111], and the expression for the finite volume dependence, would be highly desirable to improve the assessment of the related systematic errors. Our analysis also shows that in order to significantly improve on the error for the radii and the magnetic moment, more points at smaller Q 2 , i.e., at large volumes and close to physical pion mass are necessary. We plan to extend our analysis to such ensembles as they become available. Calculations for this project were partly performed on the HPC clusters "Clover" and "HIMster2" at the Helmholtz Institute Mainz, and "Mogon 2" at Johannes Gutenberg-Universität Mainz. 10 The central value for the latter is very close, in comparison to its uncertainty of 2.3‰, to that extracted from muonic hydrogen [9], which is yet 4.9 times more precise.
Additional computer time has been allocated through Projects HMZ21, HMZ23, and HMZ36 on the supercomputer systems "JUQUEEN" and "JUWELS" at John von Neumann Institute for Computing, Jülich. The authors also gratefully acknowledge the Gauss Centre for Supercomputing e.V. [112] for funding this project by providing computing time on the GCS Supercomputer HAZEL HEN at Höchstleistungsrechenzentrum Stuttgart [113] under Project GCS-HQCD. Our programs use the QDP þ þ library [114] and deflated SAP þ GCR solver from the openQCD package [115], while the contractions have been explicitly checked using [116]. We are grateful to our colleagues in the CLS initiative for sharing the gauge field configurations on which this work is based.

APPENDIX A: FORM FACTOR DATA
In this Appendix, we present the results of extracting the isovector electromagnetic form factors of the nucleon with either the summation method or the two-state method, both described in Sec. III, for every gauge ensemble listed in Table I.
The summation-method results quoted below are obtained from a fit using Eq. (16) as fit ansatz. For the two-state method we use the procedure outlined in Sec. III. G E and G M for ensemble H105. G E and G M for ensemble N302.

APPENDIX B: PRIORS
The two-state method, as we apply it, requires a certain amount of prior information in order to stabilize the fits. In this Appendix, we summarize the priors, which are extracted from the nucleon two-point functions, for the ratio of overlaps ρ and for the energy difference between ground and first excited state; for details see Sec. III. The energy gap is given in lattice units.

APPENDIX C: DIPOLE AND z-EXPANSION RESULTS
This Appendix presents the results for the magnetic moment and the electromagnetic radii for the dipole and z-expansion fits for all ensembles, both for the summation and two-state data given in Appendix A, for a momentum cut of Q 2 ≤ 0.9 GeV 2 . The pion mass indicated in the first column of the tables uniquely identifies the corresponding gauge ensemble via Table I.

APPENDIX D: HBChPT FITS
Here we summarize the results for the physical values of the magnetic moment μ ¼ κ þ 1 and the electromagnetic radii of the HBChPT fits to the z-expansion data, applying a pion mass cut M π ≤ 0.28 GeV, as explained in Sec. V. Note that the  Oðe −m π L Þ value for AIC is not corrected for the cut data points. The last column indicates which of the corrections appearing in Eq. (23) is included in the fit.

APPENDIX E: COVARIANT BChPT FITS
Here we summarize the results for the physical values of the magnetic moment μ ¼ κ þ 1 and the electromagnetic radii of the direct covariant ChPT fits as discussed in Sec. V, applying pion mass cuts of M π ≤ ½0.27; 0.28; 0.29 GeV. Note that the  value for AIC is not corrected for the cut data points. The entries with and without an asterisk in the last column indicate which of the corrections appearing respectively in Eqs. (26) and (24) is included in the fit.

APPENDIX F: RATIO G M =G E
A common observation among lattice determinations of the isovector form factors is that the ratio of the magnetic and electric form factor exhibits a rather flat behavior. One may therefore hope to extract the magnetic moment as the intercept