Perturbative Removal of a Sign Problem

This paper presents a method for alleviating sign problems in lattice path integrals, including those associated with finite fermion density in relativistic systems. The method makes use of information gained from some systematic expansion -- such as perturbation theory -- in order to accelerate the Monte Carlo. The method is exact, in the sense that no approximation to the lattice path integral is introduced. Thanks to the underlying systematic expansion, the method is systematically improvable, so that an arbitrary reduction in the sign problem can in principle be obtained. The Thirring model (in 0 + 1 and 1 + 1 dimensions) is used to demonstrate the ability of this method to reduce the finite-density sign problem.


I. INTRODUCTION
Lattice Monte Carlo methods are able to provide nonperturbative access to observables in quantum field theories. They are unique in this respect for many strongly coupled theories. Under certain circumstances, such as at finite density of relativistic fermions and the Hubbard model away from half-filling, lattice methods are made dramatically less efficient by the so-called sign problem. This sign problem is a central obstacle to first-principles calculations in many regimes of strongly coupled theories, including ab initio studies of the nuclear equation of state.
In lattice field theory, spacetime is treated as discrete and observables are obtained from the high-dimensional lattice path integral. Lattice field theory is ordinarily used to study a system in thermal equilibrium, and the partition function is written as Z = DA e −S(φ) , where S is the (Euclidean) action and the integral is taken over all configurations of a field A. Observables are given by various derivatives of the logarithm of the partition function. These derivatives are ordinarily sampled by importance sampling, which hinges on the treatment of the normalized Boltzmann factor e −S /Z as a probability distribution. For some systems, including those with a finite density of relativistic fermions, the action S is complex, and this is not possible -this is the sign problem.
Importance sampling commonly takes a polynomial amount of time in the spacetime volume being simulated (although this is proven only in a few cases [1][2][3]). Importance sampling can be modified to work even where S is complex, but at the cost of efficiency. In this modification, the "quenched" Boltzmann factor |e −S |/Z is treated as a probability with respect to which sampling is performed. Ordinary expectation values are obtained in terms of quenched expectation values: O = Oe −iSI Q / e −iSI Q . The loss of efficiency comes primarily from the denominator. The average of the exponential of the imaginary part of the action, often termed the "average phase", is equal to the ratio of the physical * scott.lawrence-1@colorado.edu to quenched partition functions Z/Z Q , and characteristically scales like e −βV . Resolving this exponentially small quantity, by averaging many quantities of unit magnitude, requires ∼ e 2βV samples; thus the reweighting procedure incurs an exponential cost in the volume. This failure affects a wide variety of models, and a correspondingly wide variety of methods have been proposed to mitigate it: complex Langevin [4], the density of states method [5], canonical methods [6,7], reweighting methods [8], series expansions in the chemical potential [9], fermion bags [10], field complexification [11], and analytic continuation from imaginary chemical potentials [12].
In this paper we will examine a new method, inspired by two observations: first, that the partition function is unchanged if a function that integrates to zero is added to the Boltzmann factor, and second, that lattice methods can encounter a fatal sign problem even in regimes under good control by perturbation theory (or any other systematic expansion). To any fixed order in perturbation theory, the sign problem can be (non-uniquely) identified with some oscillating part of the Bolzmann factor which integrates to zero, and this part can then be subtracted off, without changing the partition function or any observables. In fact, we will see that this subtraction can be performed in such a way that even the nonperturbative partition function and observables remain unchanged. Where the model is under good control by perturbation theory, meaning that the partition function is well-approximated by the integral of a perturbative expansion of the integrand, this subtraction is nearly the entire sign problem. In regimes where perturbation theory is a poor approximation, we may hope to isolate and remove a single component of the sign problem, thereby improving the efficiency of the necessary nonperturbative calculation.
The method described in this paper exhibits two favorable characteristics worth noting before we begin. Firstly, it is an exact method, in the sense that the modified form of the partition function is precisely equal to the original, physical form. As a consequence, all observables retain their physical values, and the only errors are statistical ones associated to the sampling process. This is true regardless of the quality of the systematic expansion used: the removal of the sign problem is approximate, but the observables computed are exact. Secondly, although the removal of the sign problem is approximate, it is systematically improvable. If a certain order in perturbation theory does not yield a sufficiently moderate sign problem, a higher order can in principle be used. As long as the expansion converges (on the lattice), a sufficiently high order is guaranteed to remove the sign problem to any desired degree. Of course, an exponential cost is associated with going to higher orders in most expansions, and it is to be expected that this property of systematic improvability is not a practical way to solve many problems, as it merely trades one exponential cost for another. Nevertheless, this is an unusual and promising combination.
This paper uses the Thirring model [13] in 0 + 1 and 1 + 1 dimensions as a testbed for the method of subtractions. This model has frequently been used, in varying dimensions, to test methods for treating the fermion sign problem in the past, including complexification [14,15] and complex Langevin [16].
In the next section, the general method of subtractions is described in detail, with an emphasis on subtractions that are constructed via some systematic expansion. In Sec. III, the heavy-dense limit is used to construct a subtraction for the Thirring model in 0 + 1 dimensions. This is extended in Sec. IV, where the 1 + 1-dimensional Thirring model is treated with a variety of expansions. A nonperturbative method of optimizing subtractions is described in Sec. V. Finally we conclude in Sec. VI, discussing in particular a relation between this method and the method of field complexification.

II. GENERAL METHOD
For brevity, let us write the Boltzmann factor as f (A) ≡ e −S(A) , so that the unmodified form of the partition function is Z = DA f (A). If we let g(A) be some function which integrates to 0 (e.g. a total derivative of a function with appropriate behavior on the boundary of configuration space), then the numerical value of the partition function is unmodified by the subtraction of g(A) from the Boltzmann factor: The quenched partition function, and therefore the average phase σ ≡ Z/Z Q , is generically changed by this operation. Therefore, a suitable g(A) may improve the sign problem. In fact, a subtraction always exists which removes the sign problem entirely: This particular subtraction is unusable in practice, as computing it requires exact knowledge of the partition function. Indeed, using this subtraction is equivalent to performing the entire computation analytically. Particularly in the case where g is constructed from a perturbative expansion (described below) this method can be thought of as splitting the path integrand into a few terms, and integrating some analytically. In the case of the ideal subtraction of Eq. (2), the entire path integral is performed analytically.
Once a subtraction is selected, it remains to compute an observable. We must express O (an expectation value over f ) as an expectation value taken over the dis- This equation is correct, but not useful for computing the expectation value, as the measurement of the modified observable encounters a signal-to-noise problem comparable to the original sign problem. This is particularly clear in the case of O = 1, where the numerator is equal to DA f (A), the highly oscillatory integral we wanted to avoid in the first place.
The previous approach corresponds to treating g as constant in µ. Instead, take g to vary with µ, in such a way that g = 0 for any value of µ. The desired expectation value is now which does not necessarily (and does not in practice, as we will see) suffer from the same magnitude of signal-tonoise problem. We now discuss how to construct a suitable subtraction g(A) in a systematic manner. Suppose a perturbative expansion of f (A) is available, such that the partition functions at low order are readily (perhaps analytically) obtained. Defining Z n = DA f n (A), we can construct a wide variety of functions which integrate to 0 and approximate various parts of the original Boltzmann factor. It is often convenient to pick (some linear combination of) The factor of the free theory Boltzmann factor is somewhat arbitrary -any function of A with unit integral will do. This procedure does not depend on the precise nature of the systematic expansion. Our first application of this method (in Sec. III) will use the heavy-dense limit to construct a subtraction, instead of an expansion around free field theory.
Because the subtracton was constructed from a systematic expansion, g n naturally depends on µ. Applying Eq. (4) to this construction, the physical expectation value of O is given by Note that it is not in general true that ∂ ∂µ f n = Of n , nor is it generally true that the same derivative of log Z n yields a perturbative expectation value.
In deriving this expression, we have chosen for convenience not to let f 0 , and therefore Z 0 , vary with µ. This, like the precise manner of constructing the subtraction, is an arbitrary choice. We will not, in this paper, explore the question of what the optimal construction of a modified observable is.
Of course, even after the subtraction, a residual sign problem typically remains, which is addressed by reweighting.

III. QUANTUM MECHANICS
In this section we demonstrate the method on a 0 + 1dimensional variant of the Thirring model. Described in [15,17], this model is defined by the lattice action Here m is the bare mass and g a coupling constant; we are implicitly working in units where the lattice spacing is 1, so that the number of sites is equal to the inverse temperature β. The sign problem, created by the chemical potential µ, is portrayed in Fig. 1; the average phase decays exponentially with the inverse temperature, and so the cost of calculations increases exponentially with the same. A suitable subtraction is provided by the heavy-dense limit of µ → ∞. The Dirac matrix can be expanded via the polymer representation [18], and the dominant term of det K in the limit of large µ is We will use the leading-order term as our subtraction: Integrating over all fields yields the leading-order partition function (Here and throughout, I ν (·) denotes the modified Bessel function of the first kind, of order ν.) For the scaling factor f 0 (A)/Z 0 in Eq. (6) we could simply choose (2π) −β , but it is convenient in this case to use the bosonic part of the Boltzmann factor: The observable we will focus on is the number density, defined as n = β −1 ∂ ∂µ log Z. In order to measure this observable with the subtraction method, we need the µderivatives of f 1 and Z 1 as per Eq. (7). Happily, in this case they are particularly simple: ∂ ∂µ f 1 = f 1 and ∂ ∂µ Z 1 = Z 1 . This reflects the fact that, in the heavy-dense limit, the density is 1 regardless of temperature.
To summarize, before performing the subtraction, the partition function was written Z = e −S with the action S defined by Eq. (8). The modified form of the partition function is where the scaling factor f 0 and its integral Z 0 are defined by Eq. (12), and the subtraction is constructed from the heavy-dense term f 1 and its integral Z 1 , given in Eqs. (10) and (11).
While numerically identical, this form is hoped to have a reduced sign problem. The density is given by the expectation value, taken in the subtracted ensemble, The results of this procedure are shown in Fig. 1. Specially in 0 + 1 dimensions, the sign problem is no longer  [19]. The center plot shows the average phase, again as a function of µ, for the same parameters. On the right is the average phase for µ = 1.8 as a function of inverse temperature β. exponential in the volume, but rather improves slightly as β is increased. This is not to be expected to hold true for higher dimensional theories. In general, the exponential difficulty of the sign problem will not be removed by the subtraction method, but merely ameliorated. (In the case of the particular model at hand, it is possible to construct a subtraction that entirely removes the sign problem, but only because the entire partition function is analytically known.) Lastly, note that all data points in Fig. 1 are constructed from 10 3 samples. The data points calculated with the subtraction have much smaller error bars (for µ = 2.0, the error bar width is ∼ 10 −14 ) even than the sign-free µ = 0 data point without the subtraction; this procedure has improved the signal-to-noise ratio in addition to reducing the sign problem. In the limit of the ideal subtraction of Eq. 2, there is no variance remaining in the observable, and a single measurement yields the exact answer.

IV. FIELD THEORY
We now move to the 1 + 1-dimensional Thirring model with staggered fermions. The lattice action of this model is [14] S = x,ν=0,1 with the Dirac matrix now defined by where as before m is the bare mass, g the coupling, and µ the chemical potential. The staggered fermions are defined by η 0 = (−1) δ0x 0 and η 1 = (−1) x0 . As in the 0 + 1-dimensional model, a sign problem is created at µ = 0.
The first subtraction procedes from the same heavydense limit we used for the quantum mechanical model above. As before, we define f 0 = e 2 g 2 x,ν cos Aν (x) . The leading-order term in the heavy-dense expansion is which, when integrated over all fields, yields the partial partition function At this order in the heavy-dense expansion, everything takes the form of L copies of the quantum mechanical model above. In particular, the µ-derivatives of f 1 and Z 1 are Lf 1 and LZ 1 , respectively. The results of simulating with the leading-order heavydense subtraction, on a 12×6 lattice, are shown in Fig. 2. Without the subtraction, the sign problem falls to be indistinguishable from 0 ( 10 −2 ) by µ ≈ 0.8; after the subtraction, the sign problem is manageable from µ = 0 through lattice saturation.
At the next order in the heavy-dense limit, the number of diagrams in the polymer representation is exponential in β. Therefore, it is not practical (barring another way of computing the NLO heavy-dense partition function) to use this expansion at higher orders. Another expansion to consider is the hopping expansion. However this expansion is also not practical for the purpose of removing a sign problem, as the lowest-order term in the hopping expansion that has a sign problem is at order κ β .
At small g, the auxiliary field is pegged to A ∼ 0 by the cos A term in the action. As a result, it is possible to construct a "weak-coupling" expansion for the lattice Thirring model described here by Taylor expanding det K[A] in the fields A. The term first-order in A makes a particularly convenient subtraction: as it is odd in A, it integrates to 0, and the corresponding partial partition function Z 1 vanishes. The subtracted integrand of the partition function is where K 0 is K evaluated at A = 0, and f 0 = e 2 g 2 x,ν cos Aν (x) as usual. Fig. 3 shows the magnitude of the sign problem on a 6 × 6 lattice, as a function of the squared coupling constant, with and without this subtraction. A systematic improvement is visible at small values of the coupling; at sufficiently large value of g 2 , the subtraction is no longer guaranteed to help.

V. NONPERTURBATIVE OPTIMIZATION
So far, we have described how a suitable subtraction can be engineered with the aid of a systematic expansion, such as the weak coupling or heavy-dense limit. Subtractions constructed in this manner need not be optimal, and it may be profitable to consider other possibilities. In this section we will see that it is possible to efficiently perform a nonperturbative optimization on a family of ansatz subtractions to find the one with the largest average phase. The method discussed here was used in a very similar form for optimizing manifolds of integration [20], and has been applied (in one form or another) to several different field theories [21][22][23].
Suppose we have a continuous family of actions S α (the parameter α may have many components), such that the partition function Z = e −Sα does not depend on α. This is exactly the case if α defines a subtraction, or as in [20], a manifold of integration. Although the partition function has no dependence on α, the quenched partition function and therefore the sign problem may. In general, computing the sign problem for any fixed α is computationally expensive. We would like to invest computational resources efficiently, performing a simulation with the value of α that has the mildest sign problem. However, finding such a value appears hard: it certainly isn't feasible to do a grid search, resolving the sign problem for each value of α, in order to find the best one.
Consider performing gradient ascent on the logarithm of the average phase. Arbitrarily picking some initial α, we would like to calculate ∂ ∂α log Z ZQ(α) , which specifies the direction in which we should move. If we were to calculate this by finite differencing, we would need to resolve the sign problem at both α and α+ǫ, an expensive proposition. However, observe that has the form of a derivative of the logarithm of the quenched partition function, and the contribution of the physical Z cancels entirely. The direction which most quickly alleviates the sign problem is a quenched expectation value, which can be computed without encountering a sign problem. With this observation in hand, we see that it is possible to begin with a family of subtractions g α , and perform an efficient, sign-free gradient descent to find the optimal subtraction in that family. At this point, a (comparatively expensive) Monte Carlo can be performed, with high statistics to counter the remaining sign problem.
One motivation for this method stems from the "weakcoupling" subtraction of the previous method. The sub- traction g = f 1 −Z 1 defined by Eq. (19) can be multiplied by an arbitrary coefficient α, so that the integrand of the partition function is modified by In the previous section, the coefficient used was implicitly 1; as shown in Fig. 4, it turns out that this is not the optimal coefficient. The optimization procedure described above can be used to optimize this coefficient at scale. Note that for the example shown here, the full-magnitude first-order subtraction makes the sign problem worse at g 2 = 0.3. However, nonperturbative optimization can reverse this, making the first-order subtraction useful even at this relatively large coupling.

VI. DISCUSSION
The method of subtractions described in this paper allows practical mitigation of sign problems associated to finite fermion density and real-time observables. The method is exact in the sense that it makes no additional approximations in the partition function. Furthermore, the removal of the sign problem, although only approximate, is systematically improvable.
This method is not unrelated to prior work. In particular, the method of field complexification [11] may be seen as a specific strategy for constructing a subtraction 1 . In that method, the original domain of the path integral -R N , say -is expanded to a complex space of twice the (real) dimension. In this case, the expanded space would be C N . By Cauchy's integral theorem, the path integral 1 In fact, the subtraction method was initially inspired by an attempt to extend the method of field complexification to the case of path integrals with discrete domains of integration.
can now be performed over any N -real-dimensional manifold M ⊂ C N obtained by a smooth deformation from R N (and with mild constraints at infinity, when the complex space is unbounded). Typically the new manifold is parameterized by the real plane via a functionφ mapping field configurations φ ∈ R N to field configurations on M, so that the deformed path integral is written The difference between the two integrands is zero, and so can be viewed as a subtraction. Of course, in this view, every modification to the path integral that leaves the integration domain unchanged is a special case of the subtraction method. We can also go a step further and note that the difference between the two integrands is a total derivative. Concretely, in one dimension, the difference between the two Boltzmann factors is It is notable that a well-chosen subtraction can resolve a sign problem even in cases where no manifold can. A simple example of a sign problem unremovable by any choice of manifold is the one-dimensional integral (which is to be considered a mock partition function) The sign problem associated to this partition function becomes arbitrarily bad as ǫ is taken towards 0. This sign problem was shown in [24] to be unremovable by any choice of integration contour. In fact, the original integration domain S 1 has a more mild sign problem than any other choice of domain. In this case, it's particularly easy to see that a subtraction of cos θ completely removes the sign problem, where no manifold can. Thus the method of subtractions is strictly more powerful than that of complexification.
The manifold used in [20,21] to improve the sign problem of the Thirring model in 1 + 1 and 2 + 1 dimensions was motivated (post-hoc) by the leading-order term in the heavy-dense expansion. In [25] it was shown that a manifold of that form can entirely remove the sign problem coming from that leading-order term. This choice of manifold is therefore equivalent to a subtraction constructed from that term.
The complexification method has been applied to realtime observables through the lattice Schwinger-Keldysh formalism [26]. The determination of real-time observables on the lattice remains a largely unexplored area. Future work should be able to apply the subtraction method to real-time calculations through the same formalism.
The success of the method described in this paper depends on the availabilty of a systematic expansion in which the sign problem can be seen. We have seen that several options exist for the Thirring model. Examining and making use of such expansions in other models is a critical next step.
We noted in Sec. III that in addition to improving the sign problem, the signal-to-noise ratio associated with the modified observable was improved from the one associated with the original observable. This was not explored further in this paper, but it suggests that the same or a similar method could be deployed explicitly for treating expensive signal-to-noise problems. It is not entirely surprising that this should be possible, as the closely related complexification method has recently been applied to noisy observables in Abelian gauge theory and complex scalar field theory [27].