Uncertainty analysis and order-by-order optimization of chiral nuclear interactions

Chiral effective field theory (chiEFT) provides a systematic approach to describe low-energy nuclear forces. Moreover, chiEFT is able to provide well-founded estimates of statistical and systematic uncertainties -- although this unique advantage has not yet been fully exploited. We fill this gap by performing an optimization and statistical analysis of all the low-energy constants (LECs) up to next-to-next-to-leading order. Our optimization protocol corresponds to a simultaneous fit to scattering and bound-state observables in the pion-nucleon, nucleon-nucleon, and few-nucleon sectors, thereby utilizing the full model capabilities of chiEFT. We study the effect on other observables by demonstrating error-propagation methods that can easily be adopted by future works. We employ mathematical optimization and implement automatic differentiation to attain efficient and machine-precise first- and second-order derivatives of the objective function with respect to the LECs. We use power-counting arguments to estimate the systematic uncertainty that is inherent to chiEFT and we construct chiral interactions at different orders with quantified uncertainties. Statistical error propagation is compared with Monte Carlo sampling showing that statistical errors are in general small compared to systematic ones. In conclusion, we find that a simultaneous fit to different sets of data is critical to (i) identify the optimal set of LECs, (ii) capture all relevant correlations, (iii) reduce the statistical uncertainty, and (iv) attain order-by-order convergence in chiEFT. Furthermore, certain systematic uncertainties in the few-nucleon sector are shown to get substantially magnified in the many-body sector; in particlar when varying the cutoff in the chiral potentials. The methodology and results presented in this Paper open a new frontier for uncertainty quantification in ab initio nuclear theory.


I. INTRODUCTION
Uncertainty quantification is essential for generating new knowledge in scientific studies. This insight is resonating also in theoretical disciplines, and forward error propagation is gaining well-deserved recognition. For instance, theoretical error bars have been estimated in various fields such as neurodynamics [1], global climate models [2], molecular dynamics [3], density functional theory [4], and high-energy physics [5].
In this paper, we present a systematic and practical approach for uncertainty quantification in microscopic nuclear theory. For the first time, we provide a common statistical regression analysis of two key frameworks in theoretical nuclear physics: ab initio many-body methods and chiral effective field theory (χEFT). We supply a set of mathematically optimized interaction models with known statistical properties so that our results can be readily applied by others to explore uncertainties in related efforts.
The ab initio methods for solving the many-nucleon Schrödinger equation, such as the no-core shell model (NCSM) [6] and the coupled cluster (CC) approach [7], are characterized by the use of controlled approximations. This provides a handle on the error that is associated with the solution method itself. Over the past decade there has been significant progress in firstprinciples calculations of bound, resonant, and scattering states in light nuclei [6,[8][9][10][11][12] and medium-mass nuclei [7,[13][14][15][16]. The appearance of independently confirmed and numerically exact solutions to the nuclear many-body problem has brought forward the need for an optimized nuclear interaction model with high accuracy, quantified uncertainties, and predictive capabilities.
χEFT is a powerful and viable approach for describing the low-energy interactions between constituent nucleons [17,18] -a cornerstone for the microscopically grounded description of the atomic nucleus and its properties. Most importantly, the inherent uncertainty of the χEFT model can be estimated from the remainder term of the underlying momentum-expansion of the effective Lagrangian. We refer to this error as a systematic model uncertainty.
We use the common term low-energy constants (LECs) to denote the effective parameters of a nuclear interaction model. Indeed, for the description of atomic nuclei, the numerical values of the LECs play a decisive role. In the χEFT approach, the LECs can in principle be connected to predictions from the underlying theory of quantum chromodynamics (QCD), see e.g. Ref. [19]. However, the currently viable approach to accurately describe atomic nuclei in χEFT requires that the LECs are con-arXiv:1506.02466v3 [nucl-th] 27 Jan 2016 strained from experimental low-energy data. The bulk of this fit data traditionally consists of cross sections measured in nucleon-nucleon (NN ) scattering experiments. Most often, this data is parameterized in terms of phase shifts [20,21]. However, experimental data comes with error bars, which implies that a thorough statistical error analysis of the constructed nuclear Hamiltonian can only be performed when fitting directly to nuclear scattering observables. This optimization procedure gives rise to statistical uncertainties on the LECs.
In general, the determination of the LECs constitutes an extensive nonlinear optimization problem. That is, the relatively large number of parameters makes it challenging to find optimal values such that the experimental fit data is best reproduced. Various methods and objective functions have been used to solve this problem for a wide array of available nuclear-interaction models [22][23][24][25][26][27][28]. More often than not, the parameters of the models were fitted by hand. Mathematical optimization algorithms were only recently introduced in this venture by Ekström et al. [29] and by Navarro Perez et al. [30]. First attempts to investigate the statistical constraints on the LECs of mathematically optimized interactions have recently been performed in the NN sector with coarsegrained δ-shell interactions [31][32][33][34] and with χEFT NN interactions [35,36].
The so called power-counting scheme of the χEFT approach offers a systematically improvable description of NN , three-nucleon (NNN ), and pion-nucleon (πN ) interactions. It provides a consistent framework in which LECs from the effective πN Lagrangian also govern the strength of pion-exchanges in the NN potential and of long-and intermediate-range NNN forces. This implies that πN scattering data can be used to constrain some LECs that enter the chiral nuclear Hamiltonian.
Furthermore, χEFT offers an explanation for the appearance of many-nucleon interactions, such as NNNdiagrams, and the fact that they provide higher-order corrections in the hierarchy of nuclear forces. Still, effective NNN forces are known to play a prominent role in nuclear physics [8,37,38]. Most often, the LECs that are associated with the NNN terms have been determined relative to existing NN Hamiltonians. These LECs are optimized against a few select binding energies, excitation energies, or other properties of light nuclei.
The extended approach that is presented here is conceptually consistent with χEFT in the sense that the NN +NNN Hamiltonian is constrained from a simultaneous mathematical optimization to NN and πN scattering data, plus observables from NNN bound states including the electroweak process responsible for the β-decay of 3 H. Furthermore, we include the truncation error of the chiral expansion to take systematic theoretical errors into account. If correctly implemented, the truncation error of an observable calculated in this scheme should decrease systematically with increasing order in the χEFT expansion. Indeed, we will show that the resulting propagated uncertainties of a simultaneous fit are smaller and exhibit a more obvious convergence pattern compared to the traditional separate or sequential approaches that have been published so far.
Below, we summarize the work presented in this Paper by listing three specific objectives: • Establish a systematic framework for performing mathematical optimization and uncertainty quantification of nuclear forces in the scheme of χEFT.
Our approach relies on the simultaneous optimization of the effective nuclear Hamiltonian to lowenergy πN , NN , and NNN data with the inclusion of experimental as well as theoretical error bars.
• Demonstrate methods to propagate the statistical errors in the order-by-order optimized nuclear Hamiltonian to various nuclear observables and investigate the convergence of the chiral expansion.
• Deliver optimized chiral interactions with welldefined uncertainties and thoroughly introduce the accompanying methodological development such that our results can be easily applied in other calculations.
Our paper is organized as follows: In Section II we introduce the methodology. We start with the construction of the nuclear potential from χEFT and proceed to the calculation of observables and the optimization of parameters. In particular, we introduce automatic differentiation for numerically exact computation of derivatives, and we discuss the error budget and error propagation. The results of our analysis, for potentials at different orders in the chiral expansion and using different optimization strategies, is presented in Section III. We study the order-by-order convergence, the correlation between parameters, and we present first results for few-nucleon observables with well-quantified statistical errors propagated via chiral interactions. The consequences of our findings in the few-and many-body sectors are discussed in Section IV, and in Section V we present an outlook for further work.

II. METHOD
In this section we give an overview of the nuclear χEFT that we employ to construct a nuclear potential (Sec. II A). The optimal values for LECs are not provided by χEFT itself; they need to be constrained from a fit to data. For completeness we will summarize the well-known methods to calculate the relevant experimental observables: NN scattering cross sections (Sec. II B), NN 1 S 0 effective range parameters (Sec. II C) and bound state properties for A ≤ 4 nuclei using the Jacobi-coordinate no-core shell model (Sec. II D). We also present the objective function (Sec. II E), the optimization algorithm (Sec. II F), and the formalism for the statistical regression (Sec. II G).
A. The nuclear potential from χEFT The long-range part of the nuclear interaction in χEFT is governed by the spontaneously broken chiral symmetry of QCD and mediated by the corresponding Goldstoneboson; the pion (π). This groundbreaking insight [39] enables a perturbative approach to the description of phenomena in low-energy nuclear physics [40]. High-energy physics that is not explicitly important is accounted for through a process of renormalization and regularization with an accompanying power counting scheme. The expansion parameter is defined as Q/Λ χ , where Q is associated with the external momenta (soft scale) and Λ χ ≈ M ρ (hard scale), with M ρ ≈ 800 MeV the mass of the rho meson. The benefit of a small-parameter expansion is that higher orders contribute less than lower orders. If the series is converging, an estimate of the magnitude of the truncation error is given by the size of the remainder.
The chiral order of a Feynman diagram is governed by the adopted power-counting scheme. Given this, any chiral order ν ≥ 0 in the expansion will be identified with a finite set of terms proportional to (Q/Λ χ ) ν . In this work we have adopted the standard Weinberg power-counting (WPC) which is obtained from the assumptions of naive dimensional analysis. For the scattering of two or more nucleons without spectator particles, ν is determined by (see e.g. Ref. [18]) where A is the number of nucleons and L is the number of pion loops involved. The sum runs over all vertices i of the considered diagrams and ∆ i is proportional to the number of nucleon fields and pion-mass derivatives of vertex i. ∆ i ≥ 0 for all diagrams allowed by chiral symmetry. In Fig. 1 we show the different interaction diagrams that enter at various orders. For the NN system, contributions at ν = 1 vanish due to parity and timereversal invariance. Also, we consider nucleons and pions as the only effective degrees of freedom and ignore possible nucleon excitations, i.e., we use the so-called deltaless version of χEFT.
The interaction due to short-range physics is parameterized by contact terms, which also serve to renormalize the infinities of the pion loop integrals. The order-byorder expansion of this zero-range contribution is also organized in terms of increasing powers of Q/Λ χ . Due to parity, only even powers of ν are non-zero. Furthermore, the contact terms of order ν = 0 contribute only to partial waves with angular momentum L = 0, i.e. S−waves, whereas ν = 2 contact terms contribute up to P -waves. In general, the contact interaction at order ν acts in partial waves with L ≤ ν/2. Following Eq. (1), the terms in the χEFT expansion, up to third order, are given by a sum of contact interactions V ct and one-plus two-pion exchanges, denoted by V 1π and V 2π , respectively: The superscript indicates the separate chiral orders ν = 0, 2, 3, referred to as leading-order (LO), next-to-leading order (NLO), and next-to-next-to-leading order (NNLO). For detailed expressions, see e.g. Ref. [27]. The threenucleon interaction, V NNN , contains three different diagrams as shown in Fig. 1. These correspond to two-pion exchange, one-pion exchange plus contact, and a pure NNN contact term. Insofar, the analytical expressions for the NN potential have been derived up to fifth order (N4LO) [41,42]. The partial-wave decomposition for the NNN interaction at NNLO is well known [43], while the N3LO contribution was published very recently [44]. Note that the connected four-nucleon diagrams also appear at this higher order. In the present work we limit ourselves to NNLO for completeness. The strengths of the terms in the χEFT interaction are governed by a set of LECs. These parameters play a central role in this work, and we discuss in detail how they are constrained from measured data. In general, for each chiral order there will appear a new set of LECs. For the nuclear interactions used in this work, see Eq. (2), the corresponding LECs are denoted Furthermore, there are additional constants that must be determined before making quantitative predictions in χEFT. Here, we set the axial-vector coupling constant to the experimentally determined value of g A = 1.276 [45] for LO, whereas for the higher orders we use the renormalized value of g A = 1.29 to account for the Goldberger-Treiman discrepancy [27]. At all orders we use F π = 92.4 MeV [27]. All other physical constants, such as nucleon masses and the electric charge, are taken from CO-DATA 2010 [46], except the pion masses for which we have used the values from the Particle Data Group [47].
Note that LECs that determine the sub-leading πN interaction vertices occur in both the NN interaction and the two-pion-exchange part of the NNN , see Refs. [27,43]. Besides offering this pion-vertex link between the NN and the NNN interaction, the πN interaction model of χEFT allows to describe πN scattering processes. Consequently, experimental πN scattering data can be used to constrain the long-range part of the nuclear interaction. The lowest order terms of the effective πN Lagrangian have ν = 1 and are free from LECs, besides g A and F π . At order ν = 2 the LECs c 1 , c 2 , c 3 , and c 4 enter. Higher-order πN LECs, such as d 1 + d 2 , d 3 , d 5 and d 14 − d 15 , enter at ν = 3 while e 14 to e 18 appear at ν = 4. In total, there are 13 LECs in the πN Lagrangian up to fourth order.
The different masses and charges of the up and down quarks give rise to isospin-violating effects [18,27]. There are both short-and long-range isospin-violating effects. The long-range effects are of electromagnetic (EM) origin and for this contribution we use the well-known set of potentials where C1 is the static Coulomb potential, C2 the relativistic correction to the Coulomb potential [48], VP is the vacuum polarization potential [49], and MM the magnetic-moment interaction [50]. The long-range effects become increasingly important as the scattering energy approaches zero; consequently we include all the above long-range effects at all orders in the chiral expansion. We also consider short-range isospin-breaking mechanisms. At NLO, theC1 S0 contact is split into three charge-dependent terms:C . At this order, and above, we also take the pion-mass splitting into account in one-pion exchange terms [18].
An effective field theory often has to handle more than one expansion parameter. In our case, the nucleon mass, is the proton (neutron) mass, provides such an extra scale and the use of the heavy-baryon chiral perturbation theory introduces relativistic corrections with factors of 1/M N . We count these corrections as Q/M N ≈ (Q/Λ χ ) 2 [40,51]. This choice implies that no relativistic corrections appear in the NN sector up to the order considered in this paper.
To regularize the loop integrals that are present in the two-pion exchange diagrams we employ spectral function regularization (SFR) [52] with an energy cutof Λ = 700 MeV. The nuclear interaction is calculated perturbatively in χEFT. A nuclear potential that can be used for bound and scattering states is obtained by iterating the terms of the chiral expansion in the Lippmann-Schwinger or Schrödinger equation [53]. We employ the minimal-relativity prescription from Ref. [54] to obtain relativistically-invariant potential amplitudes. The ultraviolet divergent Lippmann-Schwinger equation also require regularization. We remove high-momentum contributions beyond a cutoff energy Λ by multiplying the NN and NNN interaction terms with standard (non-local) regulator functions f NN (p) and f NNN (p, q), respectively, and where p and q are the Jacobi momenta of the interacting nucleons. In this work, we mainly use Λ = 500 MeV and n = 3. However, we also explore the consequences of varying Λ in steps of 25 MeV between 450 − 600 MeV. The canonical power-counting, i.e. WPC, and the nonperturbative renormalization of nuclear χEFT in its current inception is currently under some debate [55,56]. In relation to this it should be stressed that our implementation of statistical regression methods and gradient-based optimization methods furnishes an independent framework to extract well-founded estimates of the uncertainties in theoretical few-nucleon physics and a tool to assess the convergence properties of χEFT.

B. Nuclear scattering
The NN scattering observables are calculated from the spin-scattering matrix M [57,58]. This is a 4 × 4 matrix in spin-space that operates on the initial state to give the scattered part of the final state. Thus, M is related to the conventional scattering matrix S by M = 2π ip (S − 1), where p is the relative momentum between the nucleons. The decomposition of M into partial waves is given by (see e.g. [20]) where the big parentheses are Wigner 3j-symbols, s (s ) and m (m ) are initial (final) total spin and spin projection, respectively, J = L + s is the total relative angular momentum andL (Ĵ) is 2L + 1 (2J + 1). The quantization axis is taken along the direction of the incoming nucleon and θ gives the center-of-mass scattering angle. The S-matrix for the scattering channel with angular momentum J can be parameterized by the Stapp phase shifts [59], for the coupled triplet channel, and for the (coupled) singlet-triplet channel with L = J.
The spin-singlet (S = 0) phase shift is denoted by δ L=J , the spin-triplet (S = 1) phase shift by δ L,J , while J represents the triplet-channel mixing angle and γ J is the spin-flip mixing angle [60] (γ J = 0 for pp scattering).
In practice, the infinite sums in Eq. (7) are truncated at L, L ≤ L max . Calculations that involve long-ranged EM effects require L max ≥ 1000 in order to reach convergence, while L max = 30 is sufficient for the part coming from the short-ranged nuclear interaction. This leads to a natural separation of the terms in Eq. (7), see e.g. Ref. [50]. In brief, all EM amplitudes are calculated independently in Coulomb Distorted-Wave Born Approximation (CDWBA) using Vincent-Phatak matching [61] to handle the difficulties of the Coulomb interaction in momentum space. For the 1 S 0 channel, the C2 and VP interactions are strong enough that a small correction to the bare phase shifts is needed, resulting in is the phase shift of the Coulomb and the chiral NN interactions computed in CDWBA, ρ 0 (τ 0 ) is the C2 (VP) phase shifts in CDWBA, and∆ 0 is a correction calculated by interpolating between the values tabulated by Bergervoet et al. [62]. In principle,∆ 0 is dependent on the interaction model for the strong force; this effect has been shown to be very small [62] and was not considered here.
We compute the VP phase shifts, τ L , in CDWBA using the variable-phase method [63]. The values we obtain agree with the ones that are tabulated by Bergervoet et al. [62]. The VP amplitude is calculated in the firstorder approximation derived by Durand [49] using the expansion parameter X ≡ 4m 2 e /(T lab M p (1−cos(θ))), where m e is the electron mass. We find that X 0.031 for all scattering data that is employed in this work. The MM amplitude for np and pp scattering is given by Stoks 1 [50].
The Stapp phase shifts are calculated from the realvalued free reaction matrix R [64], which is defined through a Lippman-Schwinger type equation [64] where V is the potential, µ the reduced mass, and P denotes the Cauchy principal value. Due to parity and time-reversal invariance, the scattering matrix M has six linearly independent elements. We employ the Saclay parameterization [57], with complex amplitudes a to f , to express where q = p − p is the momentum transfer, k = (p + p)/2 and r = q × k. For identical particles, f will be zero. Expressions for the scattering observables in terms of the Saclay parameters can be found in Ref. [57] for identical particles and in Ref. [58] for the more general case of non-identical particles.
For the theoretical description of the πN scattering observables we use the fourth order χEFT expressions according to Refs. [65,66]. A detailed description of the EM amplitudes that we employ are given in Refs. [67][68][69][70] C. Effective range parameters The effective-range expansion (ERE) of low-energy phase shifts [71] provides parameters that can be directly compared to experimentally inferred values. The ERE can be expressed in the general form The functions A(p) and B(p) depend on the choice of included long-range EM effects and δ LR LR+NN is the phase shift of the total nuclear potential (long-range plus strong NN ) relative to the phase shift of only the long-range part.
For nn and np scattering we have A(p) = 0 and B(p) = 1 [71] since there are no EM effects. The corresponding ERE parameters are denoted a N nn , r N nn , a N np and r N np . For pp scattering we calculate ERE parameters for the nuclear plus Coulomb potential, i.e., using the phase shifts δ (CDWBA) C1+NN . The expressions for A C (p) and B C (p) can be found in Refs. [62,71]. The corresponding ERE parameters are denoted a C pp and r C pp . In practice, the ERE parameters are determined using a linear least-squares fit to 20 equally-spaced phase shifts in the T lab = 10 − 100 keV range.

D. Few-nucleon observables
We employ the Jacobi-coordinate version of the NCSM [72] to compute bound-state observables for 2,3 H and 3,4 He. Apart from binding energies and radii we also compute the deuteron quadrupole moment, Q( 2 H), and the comparative half-life for the triton f T 1/2 ( 3 H).
In the NCSM, observables and wave functions are obtained from the exact solution of the eigenvalue problem H |ψ = E |ψ . In this work, the nuclear Hamiltonian H is given by where T ij are relative kinetic energies while V ij and V ijk are the NN and NNN interactions, respectively. In our calculations we use the isoscalar approximation as presented in Ref. [73]. The model-space dimension is determined from the maximal number of allowed harmonicoscillator (HO) excitations N max . We obtain essentially converged results in a HO basis with oscillator energy ω = 36 MeV and model-space dimension N max = 40(20) for A = 3(4).
The experimentally measured electric-charge radius can be related to the theoretically calculated pointproton radius through the relation [74] where r 2 p (r 2 n ) is the proton (neutron) charge meansquared radius and Z (N ) is proton (neutron) number. Furthermore, is the Darwin-Foldy correction [75] and ∆r 2 includes effects of two-body currents and further relativistic corrections. We use r p = 0.8783(86) fm and r 2 n = −0.1149(27) fm 2 [76]. For all nuclei, we use ∆r 2 = 0.
Precise results for electroweak observables depend on two-body nuclear currents and relativistic effects. χEFT provides a consistent framework for including such corrections and for deriving quantum-mechanical currents, such as the electroweak one, from the same Lagrangian as the nuclear force. We follow the approach by Gazit et al. [77] and compute the triton half-life from the reduced matrix element for E A 1 , the J = 1 electric multipole of the axial-vector current This matrix element is proportional to c D , the LEC that also determines the strength of the NN − πN diagram of the NNN interaction. As a consequence, the triton half-life provides a further constraint of the nuclear force. The experimentally determined comparative halflife, f T 1/2 = 1129.6 ± 3 s [78], leads to an empirical value for E A 1 = 0.6848 ± 0.0011 [77]. For the deuteron quadrupole moment we choose, instead, to fit to the theoretical value obtained from the high-precision meson-exchange NN model CD-Bonn, Q d = 0.27 [24], with a 4% error bar that more than well covers the spread in values using other NN potential models [18].

E. Objective function
Using the methods to compute observables outlined above, the vector α of numerical values for the LECs at a given order in χEFT is constrained using experimental data. This is accomplished by minimizing an objective function defined as where O theo i and O exp i denote the theoretical and experimental values of observable O i in the pool of fit data M, and the total uncertainty σ i determines the weight of the residual, r i . The optimal set of LECs α is defined from We wish to explore the physics capabilities and limitations of nuclear χEFT by forming different objective functions and subsequently probing the precision and accuracy of each one in a statistical regression analysis [79]. At each chiral order (LO, NLO, or NNLO) we compare two different strategies of minimization: simultaneous (sim) and separate (sep). In the "separate" approach we first optimize the sub-leading πN LECs (c i , d i , e i ) using πN data. Subsequently, we optimize the NN contact potential of the nuclear interaction using NN scattering data, and finally (at NNLO) the NNN interaction is determined by fitting c D and c E to the known binding energies and radii of 3 H and 3 He, and the comparative β-decay half life of 3 H. Besides the first-ever application of novel derivative-based optimization techniques to this problem, the "separate" approach is very similar to the conventional procedure to constrain the description of the nuclear interaction. In contrast, with the "simultaneous" approach we optimize all the LECs up to a specific-order in χEFT at the same time with respect to NN and πN scattering data as well as experimentally determined bound-state observables in the two-and threenucleon systems: 2,3 H and 3 He. At LO and NLO, the NN interaction does not involve any sub-leading πN amplitudes, nor are there any NNN force terms. Therefore, at these orders the sim-potentials are optimized using only NN scattering data and the binding energy, radius, and quadrupole moment of the deuteron. A summary of the data types that were included in the objective function for each potential is given in Table I. Table I. Objective functions for the various nuclear interactions in this work. Included data types are marked with 'X'. For sequential optimization, the subscript 'i' indicates at what stage the model is optimized to that data. Excluded data-types are indicated with '-'.

Scattering data
nn ERE bound-state data Potential NN πN The bulk of the experimental data consists of NN and πN scattering cross sections. For the NN data we take the SM99 database [21] entries with laboratory scattering energies T max Lab ≤ 290 MeV, i.e. the pion-production threshold, which constitutes a natural limit of applicability for χEFT. This results in N norm = 148. However, we also explore the consequences of varying T max Lab between 125-290 MeV. Unless otherwise stated, our canonical choice is T max Lab = 290 MeV and Λ = 500 MeV. As there is no neutronneutron scattering data, we use the neutron-neutron 1 S 0 scattering length a N nn = −18.95 (40) fm [18] and effective range r N nn = 2.75 (11) fm [80] to constrain the parameter C (nn) 1 S0 at order NLO. For the πN scattering observables we employ the database from the Washington Institute group [81], here referred to as the WI08 database. The πN data consists mainly of differential cross sections and some singly-polarized differential cross sections for the processes π ± + p → π ± + p and π − + p → π 0 + n. Unfortunately, the WI08 database contains very little data at low scattering energies, which would have been preferred to constrain the low-energy theory of χEFT. In fact, there is no scattering data below T lab = 10.6 MeV. For this reason, we include all data up to lab energy T lab = 70 MeV and keep all terms up to, and including, ν = 4 when calculating πN observables. A lower chiral order does not give a reasonable description of the data. This results in N (πN) data = 1347 data points including N (πN) norm = 110 normalization data. At the optimum, it is usually assumed that the residuals are normally distributed, and that they are all independent of each other. If so, then χ 2 (α ) will comply with a chi-squared distri- where N α denotes the number of LECs (i.e., the number of model parameters). In turn, this allows for a standard regression analysis. These rather strong assumptions of both the model and the data are only approximately fulfilled, mainly due to the inherent systematic error in χEFT.
The distribution of residuals, r i , for the NNLOsim potential, which will be thoroughly introduced in Sec. III A, is shown in Fig. 2. It is clear that the residuals are not entirely normally distributed, with a skewness of −0.38(3) and excess kurtosis of 5.39 (6). The main reason for this deviation can be traced to the inclusion of a systematic error in the fit. This can produce a consistent over-or underestimate of observables, resulting in a non-zero skewness. A non-zero excess kurtosis indicates that the model error sometimes overestimates the uncertainty and in other cases underestimates it, causing a too sharp peak near zero in the histogram in Fig. 2. We stress that the deviations from normality does not invalidate the use of χ 2 (α) as an objective function to fit the parameters; it just indicates that the minimizer α will not be a maximum-likelihood estimator. In fact, we find that when optimizing NNLOsim using NN scattering data up to 125 MeV only, to avoid large model errors, the skewness and excess kurtosis of the NN scattering residuals are significantly reduced; −0.01(6) and 0.6(1), respectively. Still, the propagated uncertainties are very similar in these two cases Thus, the minimization and subsequent regression analysis of the χ 2 (α) function will provide valuable insights into both the model and the data [79].

Total error budget
For each residual, the total uncertainty σ 2 is divided into an experimental part and a theoretical part The experimental uncertainty (statistical or systematic) is provided by the experimenter. Here, we focus on estimating the theoretical uncertainty. As a first step, we identify three different components: (1) the numerical error originating in finite computational precision, (2) the method error due to mathematical approximations in the solution of the bound-state or scattering problem, (3) the model error that is inherent to the truncation of the momentum expansion in χEFT.
The numerical error is the smallest one and several new technical developments, such as automatic differentiation for computing derivatives, allow us to generally ignore σ 2 numerical . However, some elements of the statistical analysis can potentially become numerically unstable if the relative errors are too small. In particular, this concerns the computation of the covariance matrix through the inversion of the Hessian (33). For this reason we impose a minimum relative uncertainty of 0.01%. In practice this requirement only affects the error of the deuteron binding energy.
Regarding the method error, the only significant contributions come from truncating the NCSM model space and from the use of the isoscalar approximation in calculations of bound-state observables. Indeed, for all scattering cross sections we include sufficiently many partial waves to construct an exact scattering matrix. We estimate the method error of the NCSM calculations using a simple exponential extrapolation, E(N max ) = E ∞ + a exp(−bN max ), for a range of different χEFT potentials. However, the uncertainties from the isoscalar approximation dominate the truncation error by an order of magnitude. We therefore use the uncertainties presented in Ref. [73] as our method error.
In practice, we combine the method errors with the experimental ones to obtain the resulting weight of each bound-state observable in the optimization, see Table II. In certain cases, the method error is comparative to, or larger than, the experimental error. The model errors can be labeled as systematic and are the most difficult to assess. We follow the most naive χEFT estimate and associate a truncation error with the effect of excluded higher-order Feynman diagrams. The χEFT expansion up to a given chiral order ν includes all diagrams that scale as (Q/Λ χ ) ν where Q ∈ {p, m π }. The remainder of the diagrams could a priori be assumed proportional to (Q/Λ χ ) ν+1 .
For bound-state properties it is not straightforward to associate a relevant and system-dependent momentum scale; therefore, we will not include systematic theoretical errors for these observables. Scattering observables, on the other hand, have a well-defined center-of-mass momentum. As described in section II B, M -matrix elements are the fundamental quantities that are needed to calculate NN scattering observables and can be parameterized by the complex-valued Saclay amplitudes a to f . Similarly, the πN non-spin-flip and spin-flip amplitudes g ± and h ± determine the πN scattering observables [66]. Therefore, from the above scaling argument we introduce a model error in the scattering amplitudes of the form (20) where C NN and C πN are two overall constants that need to be determined.
We assume that both the real and the imaginary parts of the Saclay amplitudes a − e scale in this manner. The nuclear force does not contribute to the f amplitude so we do not impose a model error in that amplitude. Since the order of magnitude of each scattering amplitude is the same, we assign the same constant of proportionality to all of them, see e.g. Fig. 3. The same argument applies to the πN amplitudes.
We set Q = p to capture the increasing uncertainty in the model as the energy increases. The definition Q = max{p, m π } [17] seems to have a comparatively small impact on the theoretical predictions of the model as discussed further in Sec. III C.
To determine C NN and C πN we use the statistical guiding principle that χ 2 /N dof for both NN and πN scattering should be 1 if the objective function χ 2 follows a chisquared distribution and all errors have been correctly accounted for. This leads to an iterative process where first the C x constants are updated, then the LECs are optimized using the previously determined C x , and so on until the values of the constants have stabilized. This usually requires no more than three iterations.

F. Optimization algorithms
The minimization of χ 2 (α), Eq. (17), is a nonlinear optimization problem. In this work we have employed three different non-linear least-squares minimization methods at different stages during the optimization: POUNDerS [83], Levenberg-Marquardt (LM) and Newton's method. POUNDerS is part of the TAO package [84] and is a so-called derivative-free method. As the label indicates, it does not require the computation of any derivatives. This makes it very attractive for use with applications where differentiation is a formidable task; e.g. nuclear energy density optimization [85] and previous optimizations of chiral interactions [29,35,86]. However, in this work, we have managed to make significant progress in the optimization problem by implementing automatic differentiation, which enables us to extract machine-precise derivatives of the objective function. Consequently, the whole class of derivative-based optimization algorithms becomes readily available. The convergence rate is increased considerably with the LM method that employs first-order derivatives of the residuals with respect to the LECs. A further improvement can be achieved with Newton's method that uses also the second-order derivatives. At LO, the presence of only two LECs to parameterize the potential makes it a trivial task to minimize the corresponding objective functions. However, already at the next order, NLO, the optimization requires quite an effort. There are 11 LECs, and in order to provide a reasonable start vector α 0 of numerical values for these we make an initial fit to the NN scattering phase-shifts published by the Nijmegen group [20]. At NNLO there is a total of 26 LECs, since we also need to include all the 13 πN LECs up to order ν = 4. Also at this order we carry out an initial fit to NN phase-shifts before proceeding with the optimization of the complete objective function. The optimization with respect to scattering observables in the πN sector could proceed without any fits to phase shifts.
There is always a risk of getting trapped in local minima and the success of the minimization strongly depends on the starting point α 0 . Extensive searches were performed to search for a global minimum, which is described in more detail in Sec. III A.

Automatic differentiation
First-and second-order derivatives of χ 2 (α) with respect to the LECs are needed during the minimization process and the subsequent statistical regression analysis, i.e., we need to compute The straightforward numerical approach is to approximate the nth-order derivatives with finite differences. The general idea is to form appropriate linear combinations of M function evaluations in the vicinity of the point of interest. There are, however, a number of issues with this method. First, it is prone to large numerical errors since differences of large, almost equal, numbers are needed. Second, the result can be very sensitive to the choice of step size. Furthermore, it is also a computationally demanding method since the number of required function evaluations grows quickly with the number of dependent variables and order of the derivative. For instance, a third-order, finite-difference calculation of first and second derivatives with respect to all 26 LECs requires M = 3653 function evaluations. For these reasons, we abandon finite-difference methods and employ instead forward-mode automatic differentiation (AD). The basic idea of AD is the following: A computer implementation for calculating the observables, or any computational algorithm for that matter, will consist of a chain of simple (or intrinsic) mathematical operations; e.g. addition and multiplication, elementary functions such as sin and exp, and matrix operations. Therefore, by repeatedly employing the chain rule, derivatives with respect to the LECs can be calculated alongside the usual function evaluations. Using AD, the derivatives of Eq. (21) can actually be computed to machine precision, which is far beyond the precision of any reasonable finite-difference scheme. This accomplishment is illustrated in Fig. 4, where also the dependence on the step size for the finite difference method is shown for comparison. We implement forward-mode AD using the Rapsodia computational library [87]. For the calculation of first and second derivatives with respect to N α different LECs, Rapsodia requires a total of derivative calculations. For N α = 26, this results in M = 702, thus considerably more efficient than the finitedifference approach. Furthermore, all calculations that do not depend on the LECs are performed only once, compared to the brute-force implementation of the finitedifference scheme that requires a full calculation for every function evaluation. Furthermore, since all LECs enter linearly in the momentum-space formulation of the chiral potential it is very easy to calculate the derivatives of the potential with respect to the LECs. Thus, the only workhorses in our calculations are the R-matrix evaluation (matrix inversion) of the scattering process and the solution to the NCSM eigenvalue problem (matrix diagonalization) as we will discuss next.
To solve for the two-nucleon R-matrix (11) at a given on-shell scattering energy we use the well-known method of Ref. [88]. It recasts the Lippmann-Schwinger equation into a matrix equation where I is the identity matrix, V is the two-nucleon potential, and Z is a simple diagonal matrix defined in Ref. [88]. The R-matrix is easily obtained after inverting (I + V Z) using e.g. LU factorization. First-and secondorder derivatives of the R-matrix with respect to LECs α x and α y are easily obtained using the AD technology and the same LU factorization, We also use the fact that many derivatives are exactly zero, for example the πN LECs d i and e i do not appear in the formalism for NN scattering at the present chiral orders. The computational overhead of AD in terms of wall time is very small. On a single computational node, the calculation of all first-and second-order derivatives of the 4450 NN scattering observables with respect to the 26 LECs at NNLO only takes twice as long as computing just the central values.
It is straightforward, but slightly more costly, to apply the AD technology to the NCSM diagonalization of the nuclear Hamiltonian H for A ≤ 4. If the eigenvalue spectrum is non-degenerate, the first-order derivatives of the ground-state energy E 0 and wave function |ψ 0 with respect to the LEC α x are given by [89] Higher-order derivatives are simply obtained by repeated differentiation. For bound-state observables, the computational overhead in terms of wall time is slightly larger than for twobody scattering since we must compute all eigenvalues and eigenvectors of H. The calculation of all first and second derivatives for all 26 LECs at NNLO for the A = 3 observables is approximately 20 times slower than just calculating the central values.

G. Uncertainty Quantification
We employ well-known methods from statistical regression analysis to study the sensitivities and quantify the uncertainties at the optimum χ 2 (α ), see e.g. Dobaczewski et al. [79]. The N α × N α covariance matrix Cov(α ) defines the permissible variations ∆α in the LECs that maintain an objective function value such that where T is some chosen tolerance. We can assume rather small variations ∆α, and therefore truncate a Taylor expansion of the objective function at the second order The N α parameters x can be viewed as "rotated" LECs. They are very convenient since they are independent of each other, which simplifies the previous equation and gives 1 2 where T 1 is the limit to use when considering only variations in one parameter and keeping the others fixed. If χ 2 (α) follows a chi-squared distribution, then x 2 i D χ 2 ,ii /2 will also follow a chi-squared distribution with one degree of freedom, meaning that the 1σ confidence level is given by T 1 = 1, and x i ∼ N (0, 2/D χ 2 ,ii ). In practice, χ 2 (α ) will only be an approximate chi-squared distribution, which modifies T 1 slightly. Here we set T 1 = χ 2 (α )/N dof which corresponds to a rescaling of the χ 2 (α )-function [79], The covariance matrix is then given by where Σ is the diagonal matrix with the vector of variances, σ 2 , of the rotated LECs, on the diagonal. Since T 1 only affects Cov with a constant factor, correlations remain invariant under changes in T 1 .

Error propagation
Starting from the covariance matrix Cov(α ) we can propagate the statistical uncertainties in the LECs to any observable O A , and compute the linear correlation coefficient between any two observables O A and O B . To this aim, it is most convenient to use the rotated and independent LEC representation x defined above. Each LEC x i is normally distributed with zero mean. Next we use a quadratic approximation of the observable O A , where J A is the Jacobian vector of partial derivatives, J A,i = ∂O A ∂αi , H A is the corresponding Hessian matrix, and the tilde notation in the last line indicates the similarly rotated Jacobian and Hessian. The corresponding statistical expectation value E(·) is given by Finally, we define the covariance of O A and O B by where • denotes the Hadamard product. The statistical uncertainty of an observable O A is then given by σ A ≡ Cov(A, A). This approximation of the covariance is valid as long as the quadratic approximations (29) and (34) are valid and the normalized objective function can be assumed to follow a chi-squared distribution. Using a linear approximation, the probability distribution for an observable O A will follow the well-known Gaussian form. However, for the quadratic approximation there is no such analytic expression. Instead, it is easy to reconstruct the probability distribution numerically by using Eq. (34) with a large sample of parameter sets.

III. RESULTS
In this section we discuss our results from the optimization of χEFT at LO, NLO, and NNLO (Sec. III A), the subsequent error propagation (Sec. III B), as well as an expanded discussion on the implications and advantages of a simultaneous optimization protocol (Sec. III C.) In particular we discuss the important consequences of correlations between the LECs in the case of simultaneous versus separate optimization strategies.

A. Optimization
With all the necessary tools in place we can perform the fits to experimental data. For all cases we implicitly assume that the LECs are of natural size [39] by choosing starting points in this region of the parameter space. We did not in any other way force the LECs to be natural. A possible problem in multi-parameter optimization is the existence of several local minima. At LO, with just two parameters, there is only one minimum. However, Table III. Comparison of different minima at various chiral orders. NN -LECs are optimized using only NN scattering data (at NNLO, the πN LECs are fixed). The minima are equally good for A = 2 observables, but differ significantly in A = 3 bound-state properties, calculated here without a three-body force. The last row corresponds to parameters and results (with NN forces only) of the simultaneously optimized NNLOsim interaction. TheC LECs are in units of 10 4 GeV −2 . The scattering χ 2 /N dof shown are for data up to 125 MeV without model errors included. at NLO we find four local minima. They correspond to combinations of two optima in the 1 S 0 channel and two optima in the coupled 3 S 1 − 3 D 1 channel. As shown in Table III, all four combinations describe scattering data and the deuteron properties equally well, thus making them indistinguishable from this point of view. Furthermore, a similar set of minima exists at NNLO when fitting the πN and NN data separately. A theoretical argument can provide partial guidance in the choice between these parameter sets. The nuclear interaction will have an approximate Wigner SU(4) symmetry [90] due to the large scattering lengths in the Swaves, which impliesC1 S0 ≈C3 S1 . This approximate constraint rules out the second and third of the four candidate NLO and NNLO minima in Table III. Furthermore, we might argue that the fourth minimum (NLO-4 and NNLO-4, respectively) is the physical one since its C LECs most resemble the values obtained at LO. This is not a strong justification since LECs are allowed to vary between orders. In the end, it does turn out that both NLO-4 and NNLO-4 are indeed close to the single minimum that exists in the simultaneous NNLO optimization.
A much more interesting difference between the four minima occurs in the few-nucleon sector. It turns out that minima 1-3 give significant underbinding of the triton. Since the measured ground-state energy is -8.48 MeV, these results imply that three-nucleon forces, which appear at NNLO, would have to contribute 5-6 MeV of the missing binding energy. This difference is smaller for the NLO-4 and NNLO-4 minima and they most likely represent the physical minima. This is also more in line with the power-counting arguments that the three-nucleon force should be weaker than the twonucleon force, see e.g. Ref [91]. Furthermore, with the subsequent addition of the NNN terms at NNLO (as it is done in the sequential optimization strategy) it turns out that only the NNLO-4 minimum allows to reproduce all A = 3 observables within one standard deviation. For these reasons, NLO-4 and NNLO-4 define the NN -only parts of the NLOsep and NNLOsep potentials, respectively.
The values for the LECs of our optimized potentials at LO, NLO, and NNLO are tabulated in the Supplemental Material [92] together with their estimated statistical uncertainties. The statistical uncertainty of the ith LEC, i.e. Cov(α * ) ii , is a measure of how much this particular parameter can change while maintaining a good description of the fitted data, as detailed in section II G. That is, the uncertainty for a given LEC represents its maximal variation while assuming that all other LECs are fixed at the χ 2 minimum. Note, however, that the LECs really cannot be varied independently of each other due to mutual correlations. A full error analysis requires a complete covariance matrix as we demonstrate below.
The appearance of NNN diagrams and sub-leading terms from the πN sector does not occur until NNLO in our chiral expansion. This implies small differences between the separately and simultaneously optimized interactions at lower orders. The deuteron properties are included in the optimization of LOsim and NLOsim, but not in LOsep and NLOsep. We find that the statistical χ 2 values (not including the model errors) with respect to NN scattering data are almost identical for LOsim and LOsep, and so are the values of the LECs. The small value of σ exp+method for the deuteron binding energy constrains the statistical error forC3 S1 in LOsim correspondingly. For the contact potential at NLO there are three LECs that operate in the deuteron channel, more than in any other NN partial wave. The presence of mutual correlations cannot be neglected. This explains why the individual statistical errors for the LECs in NLOsim and NLOsep in this channel are similar and larger than at LO. The covariances will also impact the value for the forward error in the deuteron binding energy, discussed further in Sec. III B.
We find that the description of the pp scattering data is not influenced much by the inclusion of the deuteron in the optimization, while the agreement with np data is notably worse above 35 MeV. At this order it is mainly the C3 S1 and C1 P1 LECs that have changed, see Ref. [92], which only affect np scattering.
As previously mentioned, the χEFT interaction becomes significantly more involved at NNLO as NNN and sub-leading πN terms enter at that order. The simultaneous optimization of all data listed in Table I leads to the construction of the NNLOsim interaction. The consequences of the simultaneous approach are dramatic. First of all, we find a single optimum as this strategy eliminates all but one of the local minima that were obtained in the sequential optimization. Moreover, a possible concern turns out to be unwarranted: an improved overall description of scattering data does not detoriate the description of different subsets. In fact, the result is quite the opposite. With the simultaneous-optimization strategy we find that the description of the pp scattering data is actually significantly improved. For scattering energies T lab ≤ 290 MeV the statistical χ 2 (pp) /N dof = 9.1 for NNLOsim compared to χ 2 (pp) /N dof = 26 for NNLOsep, not including the model error. At the same time, the χ 2 for np scattering and πN scattering are similar for the two potentials. Measured np scattering cross sections are characterized by larger uncertainties and it is therefore not surprising that this data remains well described. However, it is noteworthy that the NNLOsim potential reaches a better description of the NN data while maintaining a description of the πN data that is comparable to the one of NNLOsep. Keep in mind that NNLOsep is separately optimized to the πN scattering data. In the simultaneous optimization protocol we are effectively introducing additional constraints on the c i LECs via the NN data set. One could be concerned that the shortrange NN physics would impact and worsen the description of the long-range pion physics. It is not unlikely that we would have seen such unphysical effects if the πN database would have been more comprehensive. The existing πN data does not constrain all directions in the πN LEC parameter space, which allows for large variations in the parameter values and a better description of the pp scattering data with NNLOsim. The χ 2 /N dof for NN and πN scattering up to different T max lab are presented in Fig. 5.
The predominant advantage of the simultaneous optimization is the correct treatment of correlations. Although the uncertainties of the LECs presented in the Supplemental Material [92] are similar for NNLOsep and NNLOsim, the propagated statistical errors of observables can be several orders of magnitude larger for NNLOsep due to missing correlations, see Sec. III B. To visualize the correlations between all LECs, we plot the linear correlation matrix in Fig. 6. The linear correlation between two LECs, or any observables A and B, indicates their linear relationship and is defined as the normalized covariance, Cov(A, B)/(σ A σ B ). This quantity assumes values between −1 (fully anti-correlated) and +1 (fully correlated). A positive (negative) value for the correlation indicates that a larger value for A most likely requires a larger (smaller) value for B. The correlation coefficients between LECs that belong to different objective functions are zero. For NNLOsep this implies that the correlation matrix is block-diagonal in terms of the πN , NN , and NNN sectors. For NNLOsim, however, such inter-block correlations are revealed. In addition, we observe an increase of the correlations within each group. This can be traced to the fact that the πN LECs, c 1 , c 3 and c 4 , occur in the description of NN -, πN -, and NNNdata. The failure to capture these correlations within the sequential optimization approach, such as with the NNLOsep potential, will induce very large propagated statistical errors. In conclusion, simultaneous optimization is key for a realistic forward propagation of parametric uncertainties.

B. Error propagation
Statistical errors and covariances between computed observables are calculated under the assumption that each observable depends quadratically on the LECs in the vicinity of the minimum, see Eq. (34). Our estimate of the statistical uncertainty, σ A , of an observable, O A , rests on this assumption, which also explains why we have asymmetric error bars. We have performed extensive Monte Carlo samplings to verify the validity and necessity of using the second-order approximation. A linear truncation is more common. In particular, we compare the probability density function for various observables obtained from: (i) Monte Carlo samplings of the multivariate Gaussian spanned by the covariance matrix, (ii) the quadratic approximation, and (iii) the linear approximation of Eq. (34). The Monte Carlo calculations use 10 5 sets of normally distributed LEC vectors.
The probability distributions for the scattering lengths a C pp and a N nn for the potentials NNLOsep and NNLOsim are shown in Fig. 7. Note that these results are predictions since the scattering lengths are not included in the objective function at NNLO. The statistical errors for a C pp and a N nn obtained in the Monte Carlo calculations with the NNLOsim potential are small and well reproduced already by the corresponding linear approximation, as expected. With NNLOsep, the errors are much larger and require at least a quadratic approximation for the forward error. The uncertainties of the ERE parameters differ quite a lot between these two potentials. It is important to remember that for the NNLOsim potential, all LECs are constrained by πN , NN as well as NNN data. Hence, in the error analysis, the LECs that fulfill χ 2 scaled ( p) ≈ N dof will provide a reasonable description of most scattering data. The πN LECs for NNLOsep on the other hand, are constrained only by the πN -data and the missing statistical correlations allow for wide permissible ranges for the NN scattering lengths.
It is possible to explore correlations between any pair of observables by looking at joint probability distributions. As an example, we plot the statistical distribution of binding energies of 4 He and corresponding radii of the deuteron for the NNLO potentials in Fig. 8. The contour lines indicate the regions that encompass 68% (1σ) and 95% (2σ) of the probability density. It is remarkable that the quadratic approximation (dashed lines) reproduces even the fine details of the full calculation (solid lines) for the NNLOsim interaction. Again, the magnitude of variations is strikingly large for NNLOsep, but the quadratic approximations does rather well in reproducing them. In particular, we see a large improvement when going from a linear (dotted lines) to a quadratic dependence on the LECs. This even captures the departure from the standard first-order ellipse.
We present final results for bound-state observables in few-body systems (A = 2 − 4) as well as ERE parameters in Table IV for the LO, NLO and NNLO potentials. Observables that were part of the respective objective function are indicated by a white background, while en- tries with grey background are predictions. Note that the errors that are given in this table do not include a model error from the χEFT truncation, only the propagated statistical uncertainties as described in Sec. II G. It is therefore difficult to make strong conclusions regarding the order-by-order convergence but we certainly observe improved predictions when going to higher orders. Note that LO results in general are characterized by small statistical uncertainties since very little freedom is allowed with just two parameters. At NNLO we observe large statistical errors in the predictions following the sequential approach, e.g., with NNLOsep the statistical error for E( 4 He) is more than 10 MeV.
Energies and radii of few-nucleon systems are well reproduced by NNLOsim as shown in Table IV, with the deuteron radius being the possible exception. This can be traced back to omitted relativistic effects. For the deuteron, ∆r 2 has been estimated to be of the size 0.013 fm 2 [93] and 0.016 fm 2 [94].
We have also extracted correlations between other observables in the few-nucleon sector. As expected, for both NNLOsep and NNLOsim there exist a significant correlation between the D-state probability and the quadrupole moment of the deuteron. More interestingly, at the present optima the triton β-decay half-life does not correlate strongly with any other bound-state observable in Table IV. This corroborates the importance of using this observable to constrain nuclear forces, as was done already in Ref. [77].
Total uncertainties (statistical plus estimated model error from the χEFT truncation) are shown for scattering observables in Fig. 9. The statistical errors are typically very small compared to the model error for the LOsim, NLOsim and NNLOsim potentials, Fig. 9(b,c,d,e). Clear signatures of an order-by-order convergence are seen in the NN scattering observables as illustrated by the np total cross section and the differential cross section that are shown in Fig. 9(b,c). The same convergence is not seen when using the sequentially optimized potentials as illustrated in Fig. 9(a). In this case, the statistical errors are of the same order of magnitude as the model errors, and the NNLO error band is even wider than the NLO band. Note that πN scattering is only described with the NNLOsep and NNLOsim interactions.

C. Optimization protocol
We have demonstrated that the statistical uncertainties of χEFT, if all correlations are accounted for, will induce rather small errors in the predictions of observables. This reflects the fact that most of the few-nucleon data is precise and diverse enough to constrain a statistically meaningful χEFT description of the nuclear interaction. Also note that we only included experimentally observable data in the objective function.
The existence of strong correlations between the LECs require a complete determination of the corresponding covariance matrix, not just the diagonal entries. For this, it is necessary to employ the so-called simultaneous optimization protocol. To further demonstrate this point, we carried out error propagations with NNLOsim while neglecting the off-diagonal correlations between the LECs. The statistical uncertainty of the binding energy in 4 He grew with a factor ∼ 90 compared with the fully informed model. Neglecting the statistical correlations will also obscure the desired convergence pattern of χEFT. Indeed, for the separately optimized potentials there were no signs of convergence in the description of, e.g., np scattering data.
If the experimental database of πN scattering cross sections would be complete, then it would be possible to separately constrain, with zero variances, the corresponding LECs. Only this scenario would render it unnecessary to include the πN scattering data in the simultaneous objective function. Implicitly, this scenario also assumes a perfect theory, i.e. that the employed χEFT can account for the dynamics of pionic interactions. Of course, reality lies somewhere in between, and a simultaneous optimization approach is preferable in the present situation. There exists ongoing efforts where the πN sector of χEFT is extrapolated and fitted separately in the unphysical kinematical region where it exhibits a stronger curvature with respect to the data [96].
Overall, the importance of applying simultaneous optimization is most prominent at higher chiral orders, since the sub-leading πN LECs enter first at NNLO. In fact, the separately optimized NNLOsep potential contains a large systematic uncertainty by construction. We find that the scaling factor for the NN scattering model error, C NN , decreases from 1.6 to 1.0 mb 1/2 when going from NNLOsep to the simultaneously optimized NNLOsim. This implies that the separate, or sequential, optimization protocol introduces additional artificial systematic errors not due to the chiral expansion but due to incorrectly fitted LECs. This scenario is avoided in a simultaneous optimization. The scaling factor for the πN scattering model error, C πN , remains at 3.6 mb 1/2 for both NNLOsep and NNLOsim.
The size of the model error is determined such that the overall scattering χ 2 /N dof is unity, which means that it depends on the observables entering the optimization. We can explore the stability of our approach by re-optimizing NNLOsim with respect to different truncations of the input NN scattering data. To this end we adjust the allowed T max lab between 125-290 MeV in six steps. It turns out that our procedure for extracting the model error is very stable. The resulting normalization constants C NN vary between 1.0 mb 1/2 and 1.3 mb 1/2 as shown in Fig. 10(a).
To see the corresponding effect on predicted observables we consider the np total cross section at laboratory scattering energy T lab = 300 MeV. The model errors vary between 4.8 mb and 6.1 mb, and the calculated cross sections vary between 36.5 mb and 42.7 mb, see Fig. 10(b).  The measured value is 34.563(174) mb [21,97]. We note that the size of the estimated model error is comparable with the variation in the predictions due to changing T max lab . Throughout the analysis, the model error for scattering observables was assumed to scale with momentum p according to Eq. (20). However, the soft scale Q in χEFT is set by max{p, m π } and it can be argued that the model error should be implemented as It turns out that resolving these two momentum scales has a small impact on the estimated model errors. As an illustration, the predictions of the 4 He binding energy changes by just ∼ 20 keV (less than 0.1%). In fact, this effect is much smaller than the impact of changing the T max lab cutoff in the experimental NN scattering database.

IV. EXTENDED ANALYSIS OF SYSTEMATIC UNCERTAINTIES
In nuclear physics the theoretical uncertainties very often dominate over the experimental ones. In particular, this is true for the systematic error. Therefore, it is crucial to establish a credible program for assessing the error budget of any prediction or analysis of experimental information. Thus, we focus our attention on the convergence and missing physics in χEFT. In particular, we discuss consequences for predictions of bound-state observables in heavier nuclei such as 4 He and 16 O. It would be valuable to estimate the systematic uncertainty of predicted bound-state observables -due to the momentum-dependent χEFT uncertainty σ model in Eq. (20). However, the explicit momentum dependence is integrated over when solving the non-relativistic Schödinger equation. Thus, a clear connection to the momentum-expansion is lost.
As demonstrated already in Fig. 10, the variations in model predictions obtained from different truncations of the input data (including only NN scattering data with T lab ≤ T max lab ) is a good first approximation of the expected model uncertainty. To get a more complete picture of the systematic uncertainty, we now also vary the regulator cut-off parameter Λ in the range 450−600 MeV in steps of 25 MeV. For each combination of T max lab and Λ we perform a simultaneous optimization of the LECs, which results in a family of 42 NNLO interactions -i.e., 42 sets of LECs that each comes with statistical uncertainties. It is clear from Table V that the statistical uncertainties of the LECs are smaller than the overall shifts induced by varying T max Lab and the cutoff Λ. All sets of LECs at LO, NLO, NNLO that were obtained in this work are listed in the Supplemental Material [92]. Furthermore, each set is accompanied by its own covariance matrix, also avaliable for download. In the following discussion we use this family of potentials to estimate the systematic uncertainty.
First, we would like to emphasize that all sets of simultaneously optimized LECs provide an almost equally good description of all A ≤ 4 data. Some of the πN LECs display large variations, but the χ 2 /N dof (without model error) for the πN data is within 2.28(4) for all of these potentials. The sub-leading πN LECs become more positive when N N scattering data at higher energies is included, and c 1 in particular carries a larger (relative) statistical uncertainty than the others. It is noteworthy that for a given T max Lab , and up to 1σ precision, the πN LECs exhibit Λ-independence. The NNN LECs, c D and c E , tend to depend less on T max Lab at larger values of Λ. However, they always remain natural. It is also interesting to note that the tensor contact, C E1 , is insensitive to Λ-variations but strongly dependent on the T max Lab cut. It was shown in Fig. 6 that C E1 and c 4 correlate strongly. This effect can already be expected from the structure of the underlying expression for the NNLO interaction.
To gauge the magnitude of model variations in heavier nuclei we computed the binding energies of 4 He and 16 O using the previously mentioned family of 42 NNLO potentials. The resulting binding energies for 4 He and 16 O, computed in the NCSM and CC, respectively, are shown in Fig. 11. The NCSM calculations were carried out in a HO model space with N max = 20 and ω = 36 MeV. The CC calculations were carried out in the so-called Λ−CCSD(T) approximation [7] in 15 major oscillator shells with ω = 22 MeV. The largest energy difference when going from 13 to 15 oscillator shells was 3.6 MeV (observed for Λ = 600 MeV). For our purposes, this provides well-enough converged results. The NNN force was truncated at the normal-ordered two-body level in the Hartree-Fock basis.
The E( 4 He) predictions vary within a ∼ 2 MeV range. For E( 16 O) this variation increases dramatically to ∼ 35 MeV. Irrespective of the discrepancy with the mea- Lab . The third column shows the maximum statistical uncertainty of each LEC, which almost exclusively come from the Λ = 450 MeV and T max Lab = 125 MeV NNLOsim potential. For a given LEC, the statistical uncertainty is rather similar for different potentials.Ci are in units of 10 4 GeV −2 , Ci in units of 10 4 GeV −4 , cD and cE are dimensionless while ci, di and ei are in units of GeV −1 , GeV −2 and GeV −3 , respectively.
±0.0020  4 He and a few hundred keV for 16 O. These uncertainties are obtained from the quadratic approximation with the computed Jacobian and Hessian for 4 He, while a brute-force Monte Carlo simulation with 2.5 × 10 4 CC calculations was performed for 16 O. This massive set of CC calculations employed the doubles approximations in 9 major oscillator shells. We conclude that the statistical uncertainties of the predictions for E( 4 He) and E( 16 O) at NNLO are much smaller than the variations due to changing Λ or T max Lab . However, this is only true for simultaneously optimized potentials. For the separately optimized NNLO potential (NNLOsep) the statistical uncertainty of the E( 4 He) prediction is five times larger than the observed variations due to changing Λ and T max Lab .

V. OUTLOOK
The extended analysis of systematic uncertainties presented above suggests that large fluctuations are induced in heavier nuclei (see Fig. 11). Furthermore, while predictions for 4 He are accurate over a rather wide range of regulator parameters, the binding energy for 16 O turns out to be underestimated for the entire range used in this study. In fact, there is no overlap between the theoretical predictions and the experimental results, even though the former ones have large error bars.
Based on our findings we recommend that continued efforts towards an ab initio framework based on χEFT should involve additional work in, at least, three different directions: 1. Explore the alternative strategy of informing the model about low-energy many-body observables.
2. Diversify and extend the statistical analysis and perform a sensitivity analysis of input data.
3. Continue efforts towards higher orders of the chiral expansion, and possibly revisit the power counting.
Let us comment briefly on these research directions. The poor many-body scaling observed in Fig. 11 was pragmatically accounted for in the construction of the socalled NNLO sat potential presented in Ref. [35] where also heavier nuclei were included in the fit. The accuracy of many-body predictions was shown to be much improved, but the uncertainty analysis is much more difficult within such a strategy. Secondly, to get a handle on possible bias in the statistical analysis due to the choice of statistical tech-nique, it is important to apply different types of optimization and uncertainty quantification methods. Various choices exist, such as e.g. Lagrange multiplier analysis [99], Bayesian methods [100], or Gaussian process modeling [101,102]. In general, stochastic modeling with Monte Carlo simulations offer a straightforward and versatile approach. This tool is also indispensable for computing the posterior probabilities in Bayesian inference. The Monte Carlo results for A ≤ 4 observables that were presented in this work consist of 10 5 sampling points over a multivariate Gaussian parameter space. With our current implementation, the computational cost for sampling all A ≤ 4 observables presented in this work is very low -less than 8000 CPU hours. As such, the present work shows great promise also for future stochastic applications.
Furthermore, the computational framework that we have presented here, and our present implementation, is not limited to any particular type of regulator function or flavour of chiral expansion. Moreover, the handling of a larger number of LECs, as would be the consequence of working at a higher chiral order, should be relatively straightforward and we don't foresee any computational bottlenecks.
Finally, the magnitude of the systematic uncertainties that were observed in this work suggest the need to further explore and improve the theoretical underpinnings of the chiral expansion of the nuclear interaction.