Sparsening Algorithm for Multi-Hadron Lattice QCD Correlation Functions

Modern advances in algorithms for lattice QCD calculations have steadily driven down the resources required to generate gauge field ensembles and calculate quark propagators, such that, in cases relevant to nuclear physics, performing quark contractions to assemble correlation functions from propagators has become the dominant cost. This work explores a propagator sparsening algorithm for forming correlation functions describing multi-hadron systems, such as light nuclei, with reduced computational cost. The algorithm constructs correlation functions from sparsened propagators defined on a coarsened lattice geometry, where the sparsened propagators are obtained from propagators computed on the full lattice. This algorithm is used to study the low-energy QCD ground-state spectrum using a single Wilson-clover lattice ensemble with $m_{\pi} \approx 800$ MeV. It is found that the extracted ground state masses and binding energies, as well as their statistical uncertainties, are consistent when determined from correlation functions constructed from sparsened and full propagators. In addition, while evidence of modified couplings to excited states is observed in sparsened correlation functions, it is demonstrated that these effects can be removed, if desired, with an inexpensive modification to the sparsened estimator.


I. INTRODUCTION
Lattice Quantum Chromodynamics (QCD) provides an ab-initio method for predicting the low-lying spectrum, structure, and reactions of hadrons and nuclei from the dynamics of their constituent quarks and gluons. In practice, this is a computationally demanding task, and requires the use of state-of-the-art supercomputers, as well as the development of increasingly sophisticated numerical algorithms [1]. Continued progress, especially toward understanding the properties of increasingly heavy nuclei, will require further advances in both hardware and algorithms .
A prototypical lattice QCD calculation proceeds in three stages. First, Monte Carlo importance sampling techniques are applied to the QCD path integral, generating a Markov chain of representative configurations of the gauge field. This ensemble generation is an expensive task that is often undertaken as a community effort [1], with the same gauge field configurations shared between many physics calculations. In the second phase, the lattice Dirac operator is repeatedly inverted for each gauge field configuration to determine quark propagators nonperturbatively. Finally, these quark propagators are contracted together to form correlation functions describing the physics of particular states of interest. The effects of the lattice regularization on physical observables, as well as the effects of potential mistunings in the bare input quark masses, can be systematically removed by repeating this procedure to generate a series of simulations with different lattice spacings, simulation volumes, and masses. One can then perform controlled interpolations and extrapolations to the infinite volume, continuum, physical quark mass limit, to provide QCD predictions which can be directly compared to experimental results where they exist, or to make predictions for quantities that cannot be accessed experimentally.
Generating gauge field ensembles and quark propagators is common to lattice calculations of many different physical quantities, and has historically dominated the cost of these calculations. As a result, improving the efficiency of algorithms used for gauge field generation, such as hybrid Monte Carlo (HMC) [40], as well as the sparse matrix inverters needed to compute quark propagators, has been a major focus of algorithmic research and software development. Multigrid algorithms [41][42][43][44][45][46][47][48][49][50], which exploit the local coherence of QCD by inverting a cheaper approximation to the Dirac operator defined on a coarsened lattice, have been particularly successful in accelerating gauge field generation and propagator inversions in recent years, leading to O(10 − 100) fold improvements in the efficiencies of these tasks. Similar ideas have also found success as a technique for reducing the memory footprint of eigenvectors of the lattice Dirac operator [51]. As a result of these advances, the cost of the contraction stage of lattice QCD calculations targeting nuclei has become relatively more expensive, and has, in some cases, become the dominant cost of the entire calculation. New efforts to address this situation and improve the efficiency of contractions are needed.
This work investigates the feasibility of an algorithm exploiting local coherence to reduce the numerical cost of computing correlation functions of single hadrons and light nuclei, based on sparsening. Section II begins by discussing a simple prescription for sparsening, and details the construction of multi-hadron correlation functions such as those that describe the properties of light nuclei. Section III A examines the impact of sparsening on hadronic correlation functions, demonstrating that the ground-state energies extracted from these correlation functions are unaltered within the statistical resolution of this calculation, and Section III B introduces an improved estimator for controlling modifications of the couplings to excited states introduced by sparsening. Finally, Section III C examines the impact of sparsening on the ground states of light nuclei.

II. METHODOLOGY
Naively, the quark contractions required to form correlation functions describing manybody systems require prohibitively large computational resources in general, since the number of quark contractions grows exponentially with the number of quark fields. A number of lattice QCD collaborations have instead used more efficient "baryon block" algorithms [3,19,[52][53][54][55], which have enabled first-principles calculations of the spectra and matrix elements of light nuclei, for example in Refs. . These algorithms work by first constructing partially-contracted "blocks" from quark propagators S: describing the propagation from x 0 = ( x 0 , t 0 ) to x = ( x, t) of a baryon with quantum numbers b and momentum p. Here a i and c i are combined spin-color-flavor indices, and, in many cases, particular choices of the weights w (a 1 ,a 2 ,a 3 ),k b corresponding to interpolating operators with the correct transformation properties to project onto the wavefunctions of many-body states of interest are known [52]. Expressing the contractions for multi-hadron nuclear correlation functions in terms of nucleon-level blocks can often significantly reduce the computational cost by improving the projection onto the state of interest, especially if the blocks are stored and re-used between different calculations [4,5].
The dominant cost of assembling the baryon blocks defined by Eq. (1) is associated with the Fourier transforms (FTs) used to project onto states with definite momenta. These FTs are a natural target of a multigrid-type algorithm, since spatially blocking the lattice by a factor of N reduces the number of modes by a factor of N 3 . The local coherence of QCD implies that blocked and unblocked calculations should result in the same values of lowenergy hadronic observables, up to uncertainties from statistical sampling and discretization effects, provided that N a m −1 π , where a is the lattice spacing and m π is the mass of the lightest hadronic state (pion). In this work, a particularly simple spatial blocking procedure for quark propagators is explored: the lattice is blocked uniformly in all spatial directions, and the value of the propagator evaluated on the first site of each block is used to define sparsened propagators on the coarsened lattice. While in principle one could imagine exploring more sophisticted blocking procedures -such as a renormalization-group-based block average, or a projection onto the coarsened lattice defined by blocked low-mode eigenvectors of the Dirac operator -such a study is left for future work.
In the following sections, we distinguish between full correlation functions constructed from quark propagators defined on the full lattice and sparsened correlation functions constructed from sparsened propagators defined on a coarsened sublattice Λ 3 (N ) ⊂ Λ 3 , as described above, with O † and O appropriate creation and annihilation operators for the state of interest, and x 0 ∈ Λ 3 (N ). Ultimately, the effect of sparsening, as it has been implemented in this work, is to modify the structure of the interpolating operator used at the sink. Since any choice of interpolating operator with the correct quantum numbers is equally valid for probing a given state in the lattice theory, this implementation of sparsening is guaranteed to preserve the values of physical observables, such as the finite volume energy spectrum, but can, however, modify the relative overlaps onto the ground and excited states in a particular channel. In the approach explored in this work, sparsening is expected to modify couplings to excited states at short Euclidean time separations, since Eq. (3) can be understood as an incomplete momentum projection over a subset of the allowed lattice modes. The degree to which the couplings to excited states are modified by sparsening, as well as the degree to which it impacts the statistical uncertainties of observables such as hadron energies, are empirical questions that are explored in the next section.

III. RESULTS
Results are reported for the low-energy QCD spectrum computed on a single 32 3 × 48 lattice ensemble with the Wilson-clover fermion action [56] and Lüscher-Weisz gauge action [57]. This ensemble was generated using three degenerate flavors of quarks with masses tuned to the strange quark mass, leading to m π ≈ 806 MeV, and a lattice spacing a ≈ 0.145 fm determined by Υ spectroscopy [9]. Throughout, lattice momenta p are specified in terms of the dimensionless wavenumber n, where p = 2π n/L and L = 32 is the spatial extent of the lattice. The correlation functions described in this work are computed from Gaussiansmeared propagators constructed using 30 iterations of APE smearing [58] at the source and sink with radius ρ = 4.35 in lattice units, and sparsened according to the procedure described in Section II. Further details can be found in Ref. [9]. The distribution of source positions throughout the spacetime volume is varied depending on the quantity being studied: in Section III A all source locations with | x 0 |/a ≤ 12, where the components of x 0 are multiples of 4 and t 0 /a = 12, are included, while in Section III C, the source locations are randomly distributed throughout the four-dimensional spacetime volume. In all cases measurements were performed on 900 independent gauge field configurations.
Throughout this work the lattice has been blocked by a factor of N = 4 lattice units in the spatial directions to define sparsened propagators and correlation functions. This blocking is chosen to be consistent with the expected scale of spatial correlations in hadronic twopoint functions, (am π /2) −1 ≈ 3.4, in the lattice units of the ensemble used for this study. In Ref. [59], for example, it has been demonstrated that the nucleon correlation function approximately factorizes as where R N (t) and θ N (t) denote the magnitude and phase of the nucleon two-point function, respectively. An analogous factorization is expected to hold for other hadronic states. In addition, in the context of this work, we have numerically studied the magnitude of correlations in hadronic two-point functions as the spatial locations of the quark propagators used to compute these two-point functions are varied, and find results that are consistent with (am π /2) −1 as the relevant scale; the interested reader is referred to Ref. [60] for additional detail.

A. Sparsened Hadronic Correlation Functions
The viability of the proposed sparsening algorithm as a cost reduction technique for lattice QCD calculations depends primarily on the degree to which it preserves the precision with which matrix elements and the low-energy spectrum of QCD can be extracted, as well as the speedup of computing quark contractions that it enables. While the blocking described in the preceding paragraph was designed to preserve long-distance physics, the incomplete momentum projection implied by Eq. (3) effectively alters the lattice interpolating operator at the sink, and therefore alters the overlap onto the different hadronic states in the QCD spectrum. This section explores the practical ramifications of modifying the sink structure by comparing results extracted from the pion, ρ meson, nucleon, and ∆ baryon two-point functions computed using either full or sparsened propagators and for all lattice momenta with | n| ≤ √ 5. While sparsening offers no significant calculational speedup for correlation functions describing single hadrons, these are the simplest and most statistically precise quantities available to study in lattice QCD, and thus a natural starting point to examine the impact of sparsening. Similar studies of more complicated matrix elements involving these states are deferred to future work.

Consistency of Full and Sparsened Two-Point Correlation Functions
To understand correlations between measurements, linear regressions of the sparsened two-point correlation functions, Eq. (3), against the corresponding full two-point correlation functions, Eq. (2), are computed and summarized in Table I and Figure 1. For each gauge field configuration used in the calculation, the source location-averaged sparse data is plotted against the source location-averaged full data for a fixed choice of the Euclidean time separation. This procedure is repeated for sink times t/a ∈ {4, 8, 12}, where the range is chosen to overlap with both the short-time, excited state-dominated regime as well as the late-time, ground state-dominated regime observed in the effective mass plots shown in Figure 2. Results are shown for hadrons at rest; similar correlations are observed for hadrons with non-zero momenta, which were studied for all states with | n| ≤ √ 5.    Table I.
The full and sparsened data sets are observed to be statistically consistent, as evidenced by regression intercepts which are consistent with zero and regression slopes which are consistent with unity. The degree to which the full data can be described by a linear function of the sparse data is summarized by the coefficient of determination where y α and y are the full correlation function computed on a gauge field configuration indexed by α and the ensemble average, respectively, and f α is a linear function of the corresponding sparse correlation function. The extreme limits R 2 = 0 and R 2 = 1 correspond to no relationship and a perfect linear relationship, respectively. Somewhat larger R 2 values are observed for the mesons than for the baryons -this is expected since the baryon signals are contaminated with more statistical noise. Figure 2 depicts the effective energy function

Effective Energies
of hadrons with lattice momenta | n| ∈ {0, √ 2, 2}. At large Euclidean time separations, t/a 1, the effective mass asymptotically approaches the energy of the ground state with the quantum numbers of the interpolating operator from which it is constructed. It is observed that the full and sparsened effective energies reach consistent plateaux for t/a 8 -in particular, both the asymptotic value of the effective energy, and the range of Euclidean times over which the effective energy signal exhibits a stable plateau, are consistentsuggesting that the proposed sparsening preserves the energy and signal quality arising from the ground state contribution.
At small Euclidean time separations t/a 6, the effective mass is contaminated by contributions from higher energy excited states, which are exponentially suppressed in t. In this regime, there are statistically significant deviations between the full and sparsened results; this deviation is further emphasized in Figure 3, which shows the correlated ratio of the full and sparsened effective mass signals. From Figures 2 and 3, it is possible to infer a few effects of sparsening on the coupling to the excited state spectrum of QCD: first, that excited state contamination is more prominent in the sparsened signals, as evidenced by the larger deviations from the asymptotic ground state plateau at early times in Figure 2, and second, that these deviations are also more prominent when projecting onto states with higher lattice momenta, as observed in Figure 3.

Ground State Energy Extraction
Ground state energies can be extracted from two-point correlation functions by determining the parameters β minimizing where is the covariance matrix describing correlations between time slices, C α ( p, t) is the full or sparsened two-point function computed on a fixed gauge field configuration indexed by α and averaged over all source locations, and f (t; β) is an appropriate fit ansatz. In this notation T fit denotes the range of time separations included in the fit, · · · α denotes an ensemble average across measurements on independent gauge field configurations, and C( p, t) = C α ( p, t) α . For simplicity, T fit is chosen to lie inside the asymptotic plateau regions exhibited in Figure  2, where the correlation function is saturated by the ground state contribution, and the ansätze are simple, single-exponential forms, with where T /a = 48 is the temporal extent of the lattice. The results of these fits are summarized in Table II: it is observed for all hadron species and momenta that the extracted ground state energies are consistent -both in terms of the central values and statistical uncertainties - whether the fits are performed to full or sparsened correlation functions.

B. Excited States in Sparsened Correlation Functions
In the preceding subsections, it was observed that while sparsening can modify the couplings to excited states in the early Euclidean time regime of lattice two-point correlators, this had no statistically significant effect on the extraction of ground-state hadron energies. It is conceivable, however, that in other calculations which are more sensitive to shortdistance effects -for example, spectroscopic calculations using a large, variational basis of interpolating operators to extract ground and excited state energies -modified couplings to higher energy states may be more of a concern, especially if the couplings are enhanced or if couplings to additional excited states not present in the full correlation functions are induced. One way to systematically control this is to modify the form of the sparsened  TABLE II. Summary of fits to extract the ground state energies of the pion, ρ meson, nucleon, and ∆ baryon. T fit denotes the range of Euclidean times included in the fit in lattice units, | n| is the wavenumber describing the total momentum carried by the hadron, aE is the extracted ground state energy, χ 2 /dof is obtained by minimizing Eq. (7), and κ(Σ) denotes the condition number of the covariance matrix (Eq. (8)). The first set of fit results (middle three columns) are from fits to full correlation functions (Eq. (2)), while the second set of fit results (rightmost three columns) are from fits to sparsened correlation functions (Eq. (3)). The statistical uncertainties of fitted quantities are computed using the jackknife resampling technique.
estimator for the two-point correlation function: For this construction to be useful in practice, one must be able to determine the modified estimator precisely at low cost. The proposed idea, similar in spirit to the all-mode averaging technique introduced in Ref. [61], is to compute the inexpensive, sparsened correlation functions by averaging over N sparse independent propagator source locations x 0 ∈ Λ sparse , as well as the full correlation functions by averaging over a smaller subset of N ∆ propagator source locations with Λ ∆ ⊂ Λ sparse . One can then form the estimator of Eq. (10), where the second term interpolates between Eq. (3) (N ∆ = 0) and Eq. (2) (N ∆ = N sparse ). This can reduce the additional excited state contamination while still leading to significant cost reductions in practice, provided the observed differences can be effectively removed when N ∆ N sparse . Figure 4 shows the ratio of the full two-point correlator (Eq. (2)) to the modified sparse estimator C (Eq. (10)) as a function of N ∆ , with t/a = 3 held fixed. The full set of source locations is used for Λ sparse (N sparse = 123), and, for each value of N ∆ , a random subset Λ ∆ is drawn independently to compute the correction term. While, in this study, a small choice of N ∆ is sufficient to remove the additional excited state contamination observed in the sparsened correlation functions, this comes at the cost of inflating the statistical error. As N ∆ is increased the correlated ratio asymptotically approaches a regime where it is consistent with unity and no inflation of the statistical uncertainty is observed. However, the reader should be cautioned against inferring too much from the dependence on N ∆ in Figure 4. In this study Λ ∆ has been drawn from a collection of closely spaced, and thus highly correlated, propagators, with all sources on a single time slice. If the propagator source locations were instead distributed randomly throughout the lattice, it is likely that the modified estimator would converge more quickly in N ∆ . Verifying this conjecture is left for future work.

C. Sparsened Nuclear Correlation Functions
While the results for single hadron correlation functions described in the previous section are encouraging for the use of the sparsening technique proposed in this work, the necessary quark contractions are also inexpensive, and thus there is no clear scenario where this technique might be useful in practice. Computing correlation functions for nuclear systems composed of multiple hadrons, however, quickly becomes computationally challenging, and requires the use of more sophisticated techniques such as the baryon block algorithm [3,19,[52][53][54][55] described in Section II. A typical calculation involves constructing and combining many such blocks for different choices of source and sink smearings and locations, which is often the dominant cost in the entire workflow. This is further compounded in calculations which employ background field methods to compute matrix elements involving current insertions, since one must also compute blocks for multiple values of the background field strength [29]. By drastically reducing the cost of building baryon blocks, sparsening can either help to reduce the overall computational cost of such calculations, or else enable the use of a much larger basis of interpolating operators, smearings, and background fields at fixed computational cost.
This section investigates the effects of sparsening on the extraction of ground state energies of more complicated bound states consisting of multiple nucleons, in analogy to Section III A 3. The states considered include the 1 S 0 and 3 S 1 N N bound states 1 (dinucleon and deuteron, respectively), and the 3 He and 4 He isotopes of Helium, and have been previously studied in Refs. [9,11,32]. Two classes of fits are performed. In the first, the energies of these states are extracted directly by fitting an exponential ansatz to the Euclidean time dependence of the two-point correlation function. In the second, the binding energies of each state are instead extracted from an exponential fit to a suitable correlated ratio: for a bound state of A nucleons the ratio used is where C A (t) is the multi-hadron two-point correlation function, C N (t) is the single nucleon two-point correlation function, and ∆E ≡ E A − AE N is the binding energy. The advantage of the ratio R A (t) is that it naturally accounts for the strong correlations in the statistical fluctuations of C A (t) and C N (t), which must be taken into account to properly determine the statistical uncertainty of ∆E. In addition, t must be chosen sufficiently large that both C A (t) and C N (t) are ground state dominated. Figure 5 depicts the effective masses, effective binding energies -computed from Eqs. (6) and (11) -and correlated ratios of the full and sparsened effective energy signals, for the dinucleon, the deuteron, 3 He, and 4 He. Consistent with the single hadron case, the multihadron ground state plateaus agree within statistics between the full and sparsened data, and the correlators exhibit percent-scale deviations of the correlated ratios of full to sparse from 1 At the heavy, SU (3)-symmetric quark mass point used in these calculations, the dinucleon state is observed to be bound [9], unlike in nature. unity in the early time, excited-state dominated regime. Likewise, in the summaries of fits to the ground state energies and binding energies detailed in Tables III and IV, respectively, there are again no observable discrepancies between fits to the full data and fits to the sparsened data, in terms of both the energies extracted and their statistical uncertainties.   3)). The sparsened data has been shifted slightly along the time axis for clarity.  N N ( 1 S 0 )), the deuteron (N N ( 3 S 1 )), 3 He, and 4 He. T fit denotes the range of Euclidean times included in the fit in lattice units, aE is the extracted ground state energy, χ 2 /dof is obtained by minimizing Eq. (7), and κ(Σ) denotes the condition number of the covariance matrix (Eq. (8)). The first set of fit results (middle three columns) are from fits to full correlation functions (Eq. (2)), while the second set of fit results (rightmost three columns) are from fits to sparsened correlation functions (Eq. (3)). The statistical uncertainties of fitted quantities are computed using the jackknife resampling technique. These results are consistent with previous determinations of these quantities [9].   N N ( 1 S 0 )), the deuteron (N N ( 3 S 1 )), 3 He, and 4 He. The notation is otherwise the same as that of Table III.

IV. CONCLUSIONS
This work has introduced an algorithm for reducing the numerical resources required to compute multi-hadron correlation functions in lattice QCD simulations, based on sparsening. It has been demonstrated that a relatively simple prescription for sparsening -uniformly blocking the lattice in the spatial directions, and taking the value of the propagator evaluated at the first site in each block to define sparsened propagators and correlation functionsis sufficient to preserve the ground state energies and uncertainties extracted from a lattice QCD simulation. It has also been noted that this sparsening procedure alters the couplings to excited states observed at early Euclidean times for single-and multi-hadron correlation functions; however, a simple modification of the sparsified correlation functions can efficiently remove these modified excited state effects, if desired. Since sparsening differentially distorts the UV components of correlation functions, it is not surprising that it modifies the overlaps onto excited states at early Euclidean times.
The sparsening techniques that are presented here enable O(10−100) fold speedups in the contraction stage of lattice calculations of nuclear physics. This factor will further increase as the continuum limit is approached, since it is possible to block more aggressively as the scale of the lattice cutoff grows in comparison to the scale of the systems being studied. Future work will explore the application of sparsening to more complicated observables, such as three-point functions describing the gluonic structure of light nuclei [33].