Scaling properties of multiscale equilibration

We investigate the lattice spacing dependence of the equilibration time for a recently proposed multiscale thermalization algorithm for Markov chain Monte Carlo simulations. The algorithm uses a renormalization-group matched coarse lattice action and prolongation operation to rapidly thermalize decorrelated initial configurations for evolution using a corresponding target lattice action defined at a finer scale. Focusing on non-topological long-distance observables in pure SU(3) gauge theory, we provide quantitative evidence that the slow modes of the Markov process, which provide the dominant contribution to the rethermalization time, have a suppressed contribution toward the continuum limit, despite their associated timescales increasing. Based on these numerical investigations, we conjecture that the prolongation operation used herein will produce ensembles that are indistinguishable from the target fine-action distribution for a sufficiently fine coupling at a given level of statistical precision, thereby eliminating the cost of rethermalization.


I. INTRODUCTION
Multiscale methods have been applied successfully in a variety of ways to facilitate Markov chain Monte Carlo (MCMC) simulations in lattice quantum chromodynamics (QCD). These applications range from Dirac operator inversion [1][2][3][4] to evaluation of correlation functions and other observables [5][6][7][8], and have resulted in significant increases in computational efficiency, and reductions in uncertainties of stochastically estimated observables. Implementation of a multiscale algorithm for gauge field updating in lattice QCD, however, remains an open challenge despite some early progress for simpler theories [9][10][11][12][13][14].
Recently, we have introduced a multiscale thermalization algorithm, based on the multigrid concepts of prolongation and restriction, and inspired by renormalization-group flows, which offers the ability to rapidly initialize an ensemble of configurations for subsequent parallel evolution [15]. The benefits of using this algorithm in a QCD context are severalfold. First, it enables the generation of ensembles with well-distributed topological charge in parameter regimes where topological freezing is problematic (namely, when the lattice spacing is less than 0.05 fm). Although the resulting distribution is not correctly sampled according to the fine path integration measure, formally, the deviations from the correct distribution are of order the lattice spacing, provided the topology is correctly sampled on the coarse lattice. These lattice artifacts can be investigated using multiple prolongations and in principle corrected by subsequent reweighting. Second, with reduced thermalization overhead, evolving fine ensembles with multiple streams becomes practical and can lead to reduced communication overhead, thereby further enhancing the efficiency of gauge field generation. Finally, since the coarse level evolution is inexpensive, it is practical to generate fully decorrelated configurations by such an algorithm, implying greater statistical power of the resulting ensemble compared to those generated conventionally at similar cost. Multiscale thermalization had been successfully demonstrated for both quenched [15] and two-flavor [16] QCD.
The main focus of the present study is to quantitatively investigate the scaling properties of multiscale thermalization as a function of the lattice spacing, under the assumption that all coarse and fine lattice pairs have been properly matched using renormalization-group matching conditions. In addition, this study examines whether multiscale thermalization techniques might be useful for circumventing the problem of critical slowing down. In conventional MCMC simulations, the autocorrelation times associated with long-distance observables typically scale polynomially as τ int ∼ 1/a z for a fixed physical volume, where z is a dynamical exponent and a is the lattice spacing. For local algorithms, the updating is typically diffusive, implying a dynamical exponent z ∼ 2, although for QCD, the scaling can be far worse due to topological freezing. For example, between a ∼ 0.1 fm and a ∼ 0.05 fm, the dynamical exponent is around z ∼ 5 for pure gauge theory, using both hybrid Monte Carlo (HMC) [17] and heat bath (HB) [15] algorithms. Similar scaling behavior has been observed in Ref. [17] for gauge theories with dynamical fermions, and in both cases the topological tunneling rate is expected to become exponentially suppressed farther toward the continuum limit. In addition to multiscale thermalization, a variety of other approaches have been proposed and studied for addressing topological freezing [18][19][20][21][22] and critical slowing down [23,24] in lattice QCD [25].
Setting aside the issue of topological freezing (e.g., by considering gauge evolution within a fixed frozen topological sector, or by applying open boundary conditions [18], as it is possible in approaching zero-temperature physics), the computational cost of conventional simulations of gauge theories nonetheless grow rapidly as the continuum limit is approached due to autocorrelations in the Markov process. By comparison, in multiscale thermalization, the relevant timescale for attaining decorrelated configurations is no longer the autocorrelation time associated with the Markov process, but rather the rethermalization time required to equilibrate a prolongated coarse ensemble of decorrelated configurations. (Re)thermalization and autocorrelation timescales are both tied to the slow eigenmodes of the Markov transition amplitude that defines the fine scale evolution. However, in multiscale thermalization, the starting ensemble is drawn from a prolongated coarse distribution (P prolongated ), which, by design of the matched coarse action, has very good overlap with the targeted fine distribution. This implies that the initial prolongated fine distribution is nearly orthogonal to the slowest mode(s) of evolution, 1 χ n (n = 1, 2, . . .), and therefore it is possible that only the highly excited modes of evolution control the rethermalization time. Under such conditions, it was numerically observed that the rethermalization timescale can be significantly shorter than the associated autocorrelation timescale for fine evolution [15].
If the lattice spacing dependence of the rethermalization time scales better than that for autocorrelation times in a conventional approach [e.g., τ R = O(1/a zR ) with a rethermalization exponent z R < 2, for nontopological observables], or the overlap of prolongated fine distributions onto slow modes decreases sufficiently fast with a, then multiscale thermalization could offer a new strategy for addressing the problem of critical slowing down. From a theoretical standpoint, whether or not rethermalization times scale better than autocorrelation times is a nontrivial question, TABLE I. Ensemble parameters considered in this work. Note that the lattice spacings provided below are approximate, and based on numerical estimates for w0.4/a, the physical value of the Sommer scale (taken to be r0 = 0.5 fm), and the continuum conversion factor between r0 and reference scale w0.4 determined in Ref. [32]. since it involves not only understanding the spectral properties of the transition probability matrix of the Markov process, but also its density of eigenmodes. Although in general not much can be said theoretically about the scaling properties of either with lattice spacing (though our expectation is that the slow modes are generally diffusive for local updating schemes, and thus have quadratic scaling with inverse lattice spacing ), heuristic and perturbative arguments suggest that the overlap factors arising from multiscale thermalization will diminish with lattice spacing for gauge theories in the continuum limit, since configurations become locally smooth, and thus the interpolation of coarse gauge fields performed prior to the rethermalization step becomes increasingly accurate [15].
In this work, we provide numerical evidence that corroborates the heuristic argument that the coupling to the slowest mode decreases with lattice spacing for the case of pure SU (3) gauge theory. However, we find that the coupling does not decrease at the exponential rate needed to realize an improved rethermalization exponent for lattice spacings in the regime a ∈ [0.02, 0.06] fm that we study. Rather, the coupling diminishes approximately quadratically with lattice spacing. Despite this finding, the numerical results suggest that there exists a lattice spacing beyond which the unthermalized bias associated with excited modes of the Markov process becomes negligible for a given desired level of statistics.

II. METHODS AND RESULTS
We study the scaling behavior of the rethermalization timescale, as probed by the average Yang-Mills action density E(t) evaluated at a Wilson flow time t = w 2 0.4 /4, for four prolongated ultrafine ensembles; following Ref. [26], the scale w 0.4 is defined by The targeted fine ensembles correspond to a single fixed physical volume of 1.92 fm, and the lattice spacings 0.06, 0.04, 0.03, and 0.02 fm. These ensembles are initially prepared by interpolating fully decorrelated coarse ensembles that have been generated using nonperturbatively matched coarse actions. All coarse and fine ensembles were generated using the Wilson gauge action; ensemble parameters used for this study are summarized in Table I. The ensembles were generated using a combination of MCMC algorithms as follows. 16 3 , 24 3 and 32 3 ensembles were initially generated using the Cabibbo-Marinari HB algorithm [27] with ten over-relaxation sweeps [28] per HB update (one update attempt per link per sweep). These configurations were subsequently prolongated and rethermalized to produce decorrelated fine 32 2 , 48 3 , and 64 3 ensembles. The 48 3 ensemble was once again prolongated and rethermalized to produced a decorrelated fine 96 3 ensemble. In all cases, prolongation of the coarse configurations was performed by gauge field interpolation [29][30][31], following the staged approach described in Ref. [15]. Rethermalization of all prolongated ensembles was performed using the HB algorithm (ten update attempts per link per sweep) without over-relaxation.
The coarse and fine actions were consistently matched using the scale parameter w 0.4 , determined in part by using the parametrization with respect to the gauge coupling β provided in Ref. [32]. Note that that study quotes 0.5% uncertainties in the parametrization over the interval β ∈ [6.3, 7.5], and the coarsest two ensembles lie outside this window (for the coarsest ensemble, we performed the nonperturbative tuning independently). We have validated the scale settings for all but the finest ensemble; results are provided in Table II and compared with results provided in Ref. [32]. For the ensembles with L/a = 32, 48, 64, we find agreement within approximately 2% of the values predicted by the parametrization of Ref. [32], and therefore assume comparable uncertainties in the scale for our finest ensemble, corresponding to L/a = 96. In order to make fair comparisons of the rethermalization behavior at different lattice spacings, and ultimately extract reliable scaling properties, it is important to maintain accurate matching between coarse and fine lattices (i.e., between L/a = 16 ∼ 32, L/a = 24 ∼ 48, L/a = 32 ∼ 64 and L/a = 48 ∼ 96); here we find agreement in the scale setting parameter to within the quoted statistical uncertainties of Table II. The prolongated configurations were evolved toward equilibrium using a total of τ max MCMC updates (we validate the choice of τ max in a subsequent analysis by demanding that τ max be greater than the observed rethermalization time by a factor of approximately 5-10), with intermediate measurements of the Wilson flowed plaquette action density. Uncorrelated measurements were performed at rethermalization times τ n + 1/2 , wherê and n = 1, . . . , N meas . Each measurement was performed using disjoint subsets of configurations, each of size M cfg drawn from the given ensemble. The values of τ max , N meas and M cfg are provided in Table III for each ensemble considered; note that in each case the total number of decorrelated configurations generated by the end of the rethermalization is N cfg = N meas M cfg . Between N meas = 20 − 70 uncorrelated measurements were made in total, with each measurement performed on ensembles of size M cfg = 10 (9 for the L/a = 96 ensemble). In order to understand the rethermalization time dependence of our prolongated ensembles, we modeled the longdistance observable O ≡ (w 2 0.4 /4) 2 E(w 2 0.4 /4) by a single-exponential fit function of the form where c 0 , c R , and τ R are fit parameters and τ represents the number of MCMC updates. Least-squares fits were performed using ensemble estimates of O taken over the entire MCMC time range (i.e., τ n +1/2 for n = 1, . . . , N meas ); fit results are provided in Table IV    indicate that the model provides an acceptable description of the data. In particular, except for perhaps the coarsest ensemble, there is little statistically meaningful evidence for exponential contamination beyond the leading "excited state." Given the numerical validity of the single-exponential fits, the fit coefficients may be identified with the timescale τ R = τ n and product of overlap factors c R = c n ≡ O|χ n χ n |P prolongated , associated with the Markov process for some mode n > 0 (or some combination of modes with timescales too close to resolve), where the latter provides indirect insight into the strength of the coupling between the prolongated configuration distribution and excited modes of evolution. Fig. 2 shows a plot of the lattice spacing dependence of the extracted rethermalization times, τ R and overlap factors, c R . A fit to these data yields and log 10 c R = −0.749(53) − 1.89(9) log 10 w 0.4 a , respectively. From the observed lattice spacing dependence of τ R , the rethermalization scaling exponent appears consistent with z R = 2, which is typical for diffusive processes. Although the coupling strength, c R , appears to diminish quadratically with the lattice spacing, disambiguating the lattice spacing dependence of the individual overlap factors appearing in Eq. 5 is not possible given the present data. However, under the reasonable assumption that the overlap factor O|χ n has a nonzero continuum limit 2 and the same mode is dominant at each lattice spacing, we find clear evidence of decoupling of the prolongated distribution from the dominant (nontopological) mode in the continuum limit.

III. DISCUSSION
In previous studies of both short-and long-distance observables in pure SU (3) gauge theory [15] and SU (2) gauge theory with heavy dynamical fermions [16], it was established that a properly matched and prolongated coarse ensemble can yield vanishing overlap onto the slowest nontopological modes of the Markov process for both HB and HMC algorithms. In this study, based on the quality of our single-exponential fits to the long-distance observable O with respect to the rethermalization time, we find little evidence for any but a single, approximately diffusive, mode of evolution (hereafter referred to as a "rethermalization mode") contributing to the rethermalization of prolongated ensembles. The degrees of freedom of the prolongated configurations that incorrectly describe the target fine probability distribution dominantly couple to the rethermalization mode, but with a strength that diminishes quadratically with the lattice spacing. For a given level of statistical precision, the findings suggest that at a sufficiently fine lattice spacing rethermalization may become unnecessary.
Specifically, even in the regime τ τ R , and for a given fixed level of statistical precision, the bias associated with the rethermalization mode contamination is negligible provided c R σ/ √ N conf , where σ is the standard deviation of the measured observable distribution, and N conf is the number of statistically independent configurations used in the measurement. In the case of the action density at flow time w 2 0.4 /4, the fitted form in Eq. (7) predicts this to occur at a lattice spacing a ∼ 0.01 fm (20 GeV) at the present level of statistics. Since c R ∼ a 2 , maintaining this condition at a higher level of statistics demands that the lattice spacing be reduced, scaling like a ∼ N −1/4 conf for a fixed physical volume. This observation and the corresponding scale at which rethermalization becomes unnecessary may be specific to the operator being studied. However, our previous study [15] has shown rethermalization times are relatively insensitive over a large class of long-distance operators.
Further suppression of the bias associated with rethermalization mode contamination may be possible by improving the scaling properties of rethermalization mode coupling to the prolongated distribution. It is not clear, however, whether the quadratic dependence of c R observed in Eq. 7 arises out of geometrical considerations (noting that the current interpolation scheme involves mapping coarse configurations to a fine grid and then locally minimizing the Wilson action with respect to the remaining gauge links, the interpolation from coarse to fine is expected to only be valid up to quadratic corrections in the lattice spacing) or some other source of lattice spacing dependence (e.g., the quality and consistency of the matching at different scales). If the origin for this scaling is indeed geometrical, then a higher order prolongation scheme could yield a higher order dependence of c R on the lattice spacing.
Focusing on interpolation-based prolongation operations, and assuming the renormalization-group matching produces coarse links that are properly sampled from the standpoint of the fine action, then the most probable values for the remaining fine links are the ones that minimized the classical continuum equations of motion-up to a statistical variance that must diminish with lattice spacing. This suggests that a higher order interpolation scheme might involve minimizing a classically improved lattice gauge action, subject to the constraint that coarse links be held fixed. A practical realization for such a scheme might involve two passes: in the first pass, an interpolation is performed locally using the lower order scheme as in this study, and in the subsequent pass, a global minimization of the improved gauge action is performed subject to the coarse link constraints. Use of the lower order interpolated gauge links as a starting point (e.g., in an iterative optimization scheme) would help to ensure that the desired global minimum is reached.
Whether or not such an improved interpolation scheme yields an improved prolongation operation, and whether or not an improved prolongation operation yields improved scaling of the overlap of the resulting configuration distribution onto the observed rethermalization mode remain interesting and open questions. Regardless of the outcome, and given the observed decrease in contamination from unwanted rethermalization modes, use of prolongated coarse, matched ensembles to rapidly thermalize fine ensemble streams with well-sampled topology appears to be increasingly feasible as the continuum limit is approached.