Exponential reduction of the sign problem at finite density in the 2+1D XY model via contour deformations

We study the 2+1 dimensional XY model at nonzero chemical potential $\mu$ on deformed integration manifolds, with the aim of alleviating its sign problem. We investigate several proposals for the deformations, and considerably improve on the severity of the sign problem with respect to standard reweighting approaches. We present numerical evidence that the reduction of the sign problem is exponential both in $\mu^2$ and in the spatial volume. We also present a new approach to the optimization procedure based on reweighting, that sensibly reduces its computational cost.


I. INTRODUCTION
Euclidean quantum field theories with a finite chemical potential generally suffer from a complex action problem: the path integral weights are complex, and therefore cannot be interpreted as the Boltzmann weights of a classical statistical mechanical system. In QCD, this complex action problem is a severe roadblock for first principles understanding of the physics of neutron stars, supernovae, as well as heavy ion collisions at lower collision energies. In some theories the sign problem can be solved by a reformulation of the theory in different variables, such that in the new variables the weights are manifestly real and positive [1][2][3][4][5]. This has not been achieved in QCD so far.
The existence of a sign problem does not make simulations completely impossible. In the presence of a complex action problem, simulations can still be carried out by standard Monte Carlo methods in the phase-quenched (PQ) theory, with Boltzmann weights proportional to e −S , or -assuming that the grandcanonical partition function is real -the sign-quenched (SQ) theory, with weights proportional to Re e −S . After such simulations have been carried out, the ratio of the simulated and target partition functions, as well as the expectation values of different operators O, can in principle be reconstructed via the formulas: where we introduced the phase of the complex action θ as e −S = e −S e iθ . This leads to large cancellations when the phases e iθ have large fluctuations, a problem that is generally referred to as a sign problem. The severity of the sign problem can be measured by the partition function ratios Z/Z PQ and Z/Z SQ , and as long as these quantities are under numerical control one can reconstruct expectation values in the desired theory reliably. In fact, reweighting from the phase and sign quenched theories is starting to become feasible in QCD [6][7][8][9]. However, the range of applicability -both in the chemical potential and in the physical volume -of such an approach is severely constrained by the sign problem, which requires an exponential increase of the statistics both as a function of the volume V and of the chemical potential µ. It is therefore desirable to develop methods that either solve or at least alleviate the sign problem. Even if only the second goal is achieved, this could still drastically increase the range of parameters that reweighting methods can reach. A set of methods that try to deal with the sign problem is based on complexification of the fields. There are two broadly defined approaches of this type. In the first onecomplex Langevin [10][11][12][13][14] -an N -dimensional integration over real fields is enlarged to a 2N -dimensional integral over the real components of the complexified fields. In the second one -contour deformations -the integral still remains N -dimensional, but the integration manifold is deformed to a different manifold of the same dimension. In this paper, we pursue this second approach.
In most cases of interest, the path integral weights are holomophic functions of the fields. 1 In such a case, any integration manifold that is in the same homology class as the undeformed manifold leads to the same partition function [17]. However, the phase and sign quenched integrands are not holomorphic, and therefore the phase and sign quenched partition functions are not invariant under such contour deformations. Thus, it could be possible to bring closer to unity the ratios Z/Z PQ and Z/Z SQ , measuring the severity of the sign problem, by such contour deformations. One could then perform simulations in the contour-deformed phase or sign quenched theory, and perform a reweighting via Eq. (1) to get results in the target theory.
There are different ways to deform integration contours to make sign problems milder. First, there are methods based on Lefschetz thimbles [17][18][19][20][21]. Second, there are the more ad hoc path optimization methods which we pursue here. The main idea of these methods is to parametrize the deformed integration manifold by some finite number of parameters, which are then adjusted to make the sign problem as mild as possible. In the context of the sign problem at nonzero chemical potential, the path optimization method was applied to a one-dimensional oscillating integral [22], the 0+1D φ 4 theory [23], the 0+1D PNJL model [24], 0+1D QCD [25], the 1+1D φ 4 model [26], the 1+1D Thirring model [27], the 2+1D Thirring model [28], and Bose gases of several dimensions [29]. Other applications of the sign optimization method include improving the signal-to-noise ratio of noisy observables at zero chemical potential in 0+1D scalar field theory and 1+1D U(1) gauge theory [30] and in 1+1D SU(2) and SU(3) gauge theory [31], and reduction of the sign problem in 1+1D U(1) gauge theory with complex coupling constants [32].
In this paper we apply the path optimization method to the 2+1 dimensional XY model. The choice of the model is mainly motivated by the fact that it shares several technical features with QCD, so that insight obtained here can hopefully be applied there as well. First, the integration variables take values in a compact space, which requires a slightly different treatment of contour deformations than non-compact integration domains do. Second, this theory has Roberge-Weiss periodicity at imaginary chemical potential [33,34]. Third, the model also shares the property of QCD that the complex Langevin approach fails for small coupling β [35,36]. Fourth, like in lattice QCD, the dependence of the action on the chemical potential is non-linear.
Finally, like in QCD, the effects of a chemical potential should saturate at large µ, where the temporal dependence of the relevant field configurations tends to become trivial and µ essentially drops out of expectation values; in this limit the sign problem becomes mild.
Differently from QCD, however, the XY model can be rewritten in the worldline formulation to be free of a sign problem [3,37,38], allowing for direct simulations using a worm algorithm, and making explicit comparisons possible with the path optimization method.
Our goals in this work are the following. First, we investigate how much improvement can be made to the severity of the sign problem in this model by very simple ansätze for the contour deformations. Second, we study the chemical potential and volume dependence of the improvement on the sign problem, and so on the statistics required for reliable reweighting, achieved with the optimized contours. Third, we wish to see whether the optimization procedure can be performed more efficiently, avoiding a full Monte Carlo simulation at each step of the iteration searching for the optimum, by simply reweighting to the different contours using the same fixed ensembles.
The results for all three of these inquiries turn out to be quite encouraging. First -as we will see -a rather drastic improvement can be achieved in the severity of the sign problem, even with relatively simple ansätze. Second, we present numerical evidence that the improvement is exponential, i.e., it reduces the exponent of the severity of the sign problem. And third, the optimization of the contour is feasible with reweighting only. This leads to a radical improvement in the cost of the optimization procedure itself.
The plan of the paper is the following. In Section II we briefly discuss the contour deformation approach to the 2+1 dimensional XY model at finite chemical potential. In Section III we provide details on the optimization procedure and on the different parametrizations used to ameliorate the severity of the sign problem. In Section IV we illustrate the chemical potential and volume dependence of the achieved improvement. In Section V we present a modified optimization procedure based on reweighting to different contours from a fixed ensemble, that allows us to reduce computational costs. We summarize our conclusions in Section VI.

II. CONTOUR DEFORMATIONS FOR 2+1D XY MODEL AT NONZERO CHEMICAL POTENTIAL
The action of the 2+1D XY model with nonzero chemical potential [35,37,38] is where the sum runs over all lattice sites and directions, with 0 identified as the temporal direction. Periodic boundary conditions are imposed in every direction. The partition function, can be interpreted as a complex contour integral in each ϕ x with endpoints at −π and π, and so as an integral over For each ϕ x , we will consider contours in the strip of the complex plane satisfying −π ≤ Re ϕ x ≤ π, with the points −π + i Im ϕ x and π + i Im ϕ x identified, i.e., a cylinder. As long as the contours remain smooth on this space, i.e., if they are smooth curves from −π + ix to π + ix for some x ∈ R, the partition function remains unchanged. This can be shown by connecting the original and the shifted contours using line segments perpendicular to the real axis, whose contributions cancel out thanks to periodicity of the integrand in Re ϕ x . Allowed contour deformations for a single field variable are illustrated in Fig. 1.
Denoting the deformed manifold M, and parametrizing it with real parameters t x , we have for any M satis- fying the requirements above where J is the Jacobian matrix, with elements and the effective action is defined as S eff = S − ln det J.
Exploiting the reality of the partition function we can write it in a manifestly real form, where S R eff and S I eff are the real and imaginary parts of the effective action. This enables us to make use of the sign reweighting approach [8,9,39,40] by using the absolute value of the integrand, | cos S I eff | e −S R eff ≥ 0, as a weight in importance sampling. This method has a slightly milder sign problem than the phase reweighting method [9,39]. The corresponding expectation values will be denoted as . . . SQ,µ . The severity of the sign problem is measured by the average sign, While the numerator of this expression is invariant under changing integration contours, the denominator is instead altered. Hence, deforming contours leaves the target partition function unchanged but it has an effect on the severity of the sign problem.

III. OPTIMIZATION OF THE INTEGRATION MANIFOLD
In this paper we work at β = 0.4, that for µ = 0 is in the disordered phase of the model [35,37,38]. As a preliminary check, we performed a scan in µ using the sign problem-free worldline formulation and the worm algorithm of Ref. [37]. Using a standard finite size scaling analysis, we observe a phase transition at µ 2 c ≈ 0.54. Our goal is to find integration contours that give a larger ε SQ,µ than without contour deformation. As such, we need to maximize the expectation value of the sign, or alternatively, minimize a cost function, with respect to some coefficients p i that parametrize the contours. As our cost function, we choose the ratio of the number of configurations with negative and positive cos S I eff , Its gradient can be computed easily, where . . . ± means averaging only over configurations with positive/negative cos S I eff , and We perform optimization using a simple gradient descent algorithm. At every optimization step we update coefficients by subtracting the gradient of the cost function with a multiplication factor, where p (j) is the vector of coefficients obtained at the jth step, F (j) is the value of the gradient obtained using p (j) , and α j is The algorithm terminates when the relative change of the coefficients decreases below a prescribed tolerance value. We note that since we always start the optimization near the undeformed integration manifold, our procedure explores only the vicinity of the original contour, and could not detect an optimum lying beyond a "potential barrier".
We parametrize the integration manifold by the real parts of the field variables on the different lattice sites, denoted by t x . Periodicity of the action in the real part of ϕ x restricts the imaginary part of ϕ x on the deformed manifold to be a Fourier series in t x , with coefficients that in general depend on x and on the other field variables. As our first attempt, we parametrized each complex ϕ x using only the corresponding t x . However, the sign problem could not be improved this way, as using this parametrization the original contour turned out to be the local minimum of the cost function. In retrospect, this is not surprising, since any nontrivial such parametrization breaks the symmetry of the model under a common shift t x → t x + c. This suggests to restrict to a Fourier series in sines and cosines of t x − t y . We then experimented with different ansätze that couple neighbouring sites in simple ways, all of which could improve the sign problem, but by quite different amounts.

A. Ansätze without temporal translational invariance
Since the chemical potential affects directly only the interaction between temporal nearest neighbors, it is reasonable to consider deformed manifolds that depend only on their difference. Given the translation invariance of the system, it is also natural to expect that the optimal deformation is also translationally invariant. For more generality, we allow for a non-translationally invariant optimum in the temporal direction, using the same coefficients for every ϕ x with the same temporal coordinate x 0 . We later check our assumptions on spatial translation invariance and on neglecting spatial neighbors. We then considered parametrizations of the following general form, leading to the following effective action: We used the choices K = 1 and 2 for the cut-off on the Fourier series, which we will denote with (A, B, K = 1) and (A, B, K = 2), respectively. Performing optimization on a lattice of size Ω ≡ N 0 N 1 N 2 = 8 3 for β = 0.4 we find that the sign problem is substantially improved. In each iteration 10 6 configurations were generated to compute the gradient. The initial values for the Fourier coefficients were chosen to be zero for the smallest simulated µ 2 value, and for each subsequent µ 2 to be equal to the final values obtained in the previous optimization round. Figure 2 shows the average sign achieved with optimization for K = 1 and 2, along with the unoptimized results for comparison. There is significant improvement in the sign problem even when using only a first-order Fourier series. When going up to second order there is a marginal increase in ε SQ,µ . The optimal values of the Fourier coefficients for µ 2 = 0.15 and K = 2 are listed in Tabs. I and II of Appendix A.
The coefficients of the constant and of the sine terms are two-three orders of magnitude smaller than the coefficients of the cosine terms; they also fluctuate around zero as a function of x 0 , with a standard deviation larger than their average, while the coefficients of the cosine terms have roughly the same value on every time slice, with A 1,x0 < 0 and A 2,x0 > 0. This remains true at every simulated value of µ 2 . This suggests that there is little gain in allowing for x 0 -dependent Fourier coefficients. Furthermore, carrying out the optimization on Ω = 8 × 4 2 , 8 × 6 2 and 8 × 10 2 lattices we have found that the values of the optimal coefficients are close to those obtained on the Ω = 8 3 lattice.

B. Ansätze with a triangular Jacobian matrix
With the appearance of additional parameters the calculation of the Jacobian generally becomes more involved. Especially interesting from the computational point of view are the parametrizations which keep the Jacobian simple. For this reason, we considered ansätze with a triangular Jacobian matrix (see also Ref. [31]), that have the benefit of having a very simple form of the effective action : As we show below, this can be achieved at the price of losing translational invariance. This is expected to reduce the amount of improvement that can be gained by path deformation, especially at larger µ [29]. Nonetheless, there may be a trade off with computational costs when the Jacobian gets very complicated, in particular in systems where the dependence on µ becomes milder at large µ. We considered 3 different ansätze of this type. First, we set where Here, we constrained the ϕ x on the slices x i = N i − 1 for i = 0, 1, 2 to stay independent of the real parts of their neighbors in the ith direction, so as to make the Jacobian matrix triangular. Except for these points we used global Fourier coefficients, i.e., the same coefficients across all the other lattice sites. This leads to the cancellation of the constant term since the action depends only on the difference of the field variables. The parametrization Eq. (16) allows us to check the consequences of explicitly breaking temporal translation invariance, and to assess whether the optimal choice of contours is affected by the inclusion of spatial neighbors in the parametrization. We carried out optimization with this ansatz on lattices of sizes Ω = 8 × 4 2 , 8 × 6 2 , 8 3 and 8 × 10 2 with β = 0.4. The optimal coefficients are shown in Tab. III of Appendix A for µ 2 = 0.15 and K = 2. We find that the optimal values ofĀ k,i andB k,i for i = 1, 2 andB k,0 are much smaller than those ofĀ k,0 , which in turn are in agreement with those obtained for A k,x0 using the ansatz of Eq. (13).
This suggests that the contribution of spatial neighbors can be neglected in the parametrization. Once again, we observed that the optimal coefficients are to a good approximation independent of the spatial size of the system.
We have also investigated the effect of including the second nearest neighbor in the temporal direction in the parametrization of ϕ x . Once again, constraints were imposed similarly to Eq. (16) to make the Jacobian simple, and global coefficients were used, We omitted the sine terms since their coefficients remained close to zero with the previously tested parametrizations. Optimization was performed using Eq. (18) with two setups: using K = 2 and adjusting a 1 and a 2 with b k s set to zero, later referred to as (a 1 , a 2 ) optimization; or using K = 1 and adjusting both a 1 and b 1 , that we call (a 1 , b 1 ) optimization.
In Fig. 3 we compare the optimized results obtained with the parametrizations of Eqs. (16) and (18) with the unoptimized results, and with the optimized results previously obtained using Eq. (13). Comparing results obtained with Eq. (16) for K = 2 and the (a 1 , a 2 ) parametrization, it is clear that including spatial neighbors leads only to a marginal improvement, at the cost of a much more complicated Jacobian. It is also apparent that including the second-order term a 2 gives a larger increase in the optimized ε SQ,µ in the range of the simulated µ values, than including the first-order term b 1 . Hence, going to second order in the Fourier series is more important than including second neighbors in the parametrization.
It is also clear that our parametrizations with a triangular Jacobian matrix lead to a significant but not overwhelming loss in the improvement of the sign problem compared to the ansatz of Eq. (13). This discrepancy can probably be attributed to the constraints introduced in Eqs. (16) and (18). This is similar to the conclusions of Ref. [29] for Bose gases. In our model, parametrizations with the triangular Jacobian are not significantly cheaper to simulate the fully translationally invariant ansatz, so there is no gain in breaking translational invariance at the boundaries of the lattice, due to the sizeable difference in the improvement achieved. In other models, there might be a less obvious trade-off.
With all of our two-parameter ansätze, scans of ε SQ,µ in the space of Fourier coefficients reveal a rather simple landscape. There is a single optimum located on a small plateau where the average sign changes slowly. As an example, scans of the average sign in the space of coefficients a 1 and a 2 are shown in Appendix B.

C. Fully translationally invariant ansatz
Among our ansätze, the one that achieved the greatest improvement on the sign problem with the least amount of parameters was given by a translationally invariant version of Eq. (13), with K = 2, the constant and sine coefficients set to zero, and A k,x0 = A k for all x 0 . This will be denoted as the (A 1 , A 2 ) parametrization.

IV. VOLUME AND CHEMICAL POTENTIAL DEPENDENCE OF THE SIGN PROBLEM ON OPTIMIZED MANIFOLDS
As a sanity check, we calculated the average action density as We first present the unoptimized and the (A 1 , A 2 )optimized average action density in Fig. 4. In both cases 10 8 configurations were used. In the left panel of Fig. 4 we also compare with results from the sign problem-free worldline formalism. We see good agreement between the different predictions. Predictions with the (a 1 , a 2 )optimized scheme also agree in the range where the sign problem of the scheme is manageable. In the right panel of Fig. 4 we also compare with analytic continuation from µ 2 < 0 with polynomial ansätze of increasing order. The expansion converges rather slowly, even at small chemical potentials, way before the phase transition to the ordered phase. The significant improvement of the sign problem achieved by path optimization is clearly visible in the left panel of Fig. 5, where we compare the average sign in the unoptimized case and in the optimized case with parametrizations (a 1 , a 2 ) and (A 1 , A 2 ). The ratio of this quantity between our best parametrization and the unoptimized case is already of order 10 2 at µ 2 = 0.3, beyond which the unoptimized approach fails. For the (A 1 , A 2 ) parametrization, the path optimization method instead works well also deep in the ordered phase at µ 2 > µ 2 c ≈ 0.54. An exponential fit ε SQ,µ ∼ e −C (µ) µ 2 Notice that since the sign problem should become mild at large µ, we expect the average sign to reach a minimum as a function of µ 2 (at fixed volume), and then increase. This probably explains the flattening of the (A 1 , A 2 ) curve, starting from around µ 2 ≈ 0.4, which is likely related to the transition to the ordered phase. An eyeball estimate leads to expect several orders of magnitude of improvement in the central region where the sign problem is at its strongest.
In the right panel of Fig. 5 we show the dependence of the average sign on the volume for the unoptimized and for the (a 1 , a 2 ) and (A 1 , A 2 )-optimized cases. A clear exponential decrease ε SQ,µ ∼ e −C (V ) V is visible, with C (V ) unopt ≈ 0.0073, C (V ) (a1,a2) ≈ 0.0032, and C (V ) (A1,A2) ≈ 0.0031. The improvement of the sign problem by path optimization is exponential in the volume, reducing the "badness" C (V ) of the volume scaling by 50%.

V. OPTIMIZATION WITH REWEIGHTING
Generating new configurations at every optimization step can be computationally too expensive in more complicated models. A possible way to decrease the computational cost is to generate a set of configurations before starting the optimization procedure, and use the same set at every step to compute the gradient of the cost function through reweighting. This would allow us to save time on the generation of configurations. When computing expectation values we need to reweight from the original weights, r = | cos S I eff,0 | e −S R eff,0 , to the new weights, where S eff,0 is evaluated on a fixed set of configurations, e.g., those obtained in the optimization at the previous µ 2 value (or along the real axis for the smallest µ 2 ), and S eff,1 is what we get with the updated values of the p i contour coefficients. The cost function is computed as and its gradient with respect to the contour coefficients is where . . . r denotes averaging with respect to the original weights r. One downside of this method is that it might introduce an overlap problem which we need to monitor throughout the optimization procedure. This can be done by looking at the numerical results for the denominators in Eq. (21). When these fall below a prescribed tolerance level and the overlap problem becomes too severe, we generate a new set of configurations before proceeding with the updates. Note that this overlap problem in the optimization by no means can bias the final results, since even on unoptimal contours the integral is guaranteed to be the same by the multi-dimensional Cauchy theorem. On the other hand, it could lead to a loss in the improvement on the sign problem: that is why the second step of the procedure -generating new config-urations when the overlap problem becomes too severeis useful.
As it can be seen in Fig. 6, this modified optimization method yielded similarly good results when compared to the procedure used previously, which required the generation of new configurations at every step of the iteration. Here we used the parametrization of Eq. (13). We observed a good agreement between the values of the coefficients obtained with the simple and the modified optimization. The optimal values for β = 0.4, Ω = 8 3 and µ 2 = 0.15 are shown in Tab. IV of Appendix A.

VI. DISCUSSION
In this paper we studied the path optimization method for reducing the severity of the sign problem in the 2 + 1 dimensional XY model with nonzero chemical potential. We used simple parametrizations for the complexified field variables. We have shown that the optimized manifold exhibits an explicit temporal translational invariance. Exploiting this property allows us to use significantly fewer optimization parameters.
Furthermore, we have found that the optimal choice of contours appears to be independent of the spatial size of the lattice. Such a feature can be utilized to make the optimization procedure computationally less expensive as it would be sufficient to find the optimal contours for a small lattice. Then simulations can be carried out for larger lattices using the same contours.
We have shown numerical evidence that the reduction of the sign problem is exponential both in the chemical potential and the volume -i.e. it considerably reduces the exponents characterizing its severity. This was achieved without changing the number of parameters with the volume, keeping the number of optimizable parameters at a small fixed value.
On the optimized integration manifolds, it was possible to simulate also on the other side of the transition to the ordered phase.
We have also demonstrated that it is sufficient to generate configurations only at the start of the optimization procedure. Then, the same set of configurations is used to compute the gradient of the cost function at each step. As the contours are updated, it is necessary to reweight from the generated distribution to the one corresponding to the new contours. When the overlap between the two distributions significantly decreases, it is preferable to generate new configurations, in order not to lose optimizing power. With this approach the computation time of the optimization can be significantly reduced as compared to a method where new configurations are generated at every iteration, with no significant loss in the reduction of the severity of the sign problem.

Appendix A: Optimal coefficient values
Here we present the optimal Fourier coefficients obtained at β = 0.4 and µ 2 = 0.15 on Ω = 8 × 4 2 , 8 × 6 2 , 8 3 and 8 × 10 2 lattices for K = 2. The optimal values for the parametrization described by Eq. (13) are given in Tabs. I and II while Tab. III shows the ones for Eq. (16). We compare the coefficients obtained with the reweighting technique introduced in Sec. V to those that were found using simple optimization in Tab. IV.        Figure 7 shows the average sign over the space of coefficients a 1 and a 2 of the parametrization described by Eq. (18) in the (a 1 , a 2 ) setup with β = 0.4 and µ 2 = 0.15. Each scan was performed at different spatial sizes with temporal size fixed at N 0 = 8.