Controlling Complex Langevin simulations of lattice models by boundary term analysis

One reason for the well known fact that the Complex Langevin (CL) method sometimes fails to converge or converges to the wrong limit has been identified long ago: it is insufficient decay of the probability density either near infinity or near poles of the drift, leading to boundary terms that spoil the formal argument for correctness. To gain a deeper understanding of this phenomenon, in a previous paper we have studied the emergence of such boundary terms thoroughly in a simple model, where analytic results can be compared with numerics. Here we continue this type of analysis for more physically interesting models, focusing on the boundaries at infinity. We start with abelian and non-abelian one-plaquette models, then we proceed to a Polyakov chain model and finally to high density QCD (HDQCD) and the 3D XY model. We show that the direct estimation of the systematic error of the CL method using boundary terms is in principle possible.


I. INTRODUCTION
Complex Langevin simulations are a very general method which can in principle be applied to any model with complex action, allowing an analytic continuation into the complexification of the original configuration space. The setup is straightforward and needs no preliminary steps, such as model dependent design or approximations. These features motivate the work to ensure the reliability of Complex Langevin simulations, since the resulting stochastic processes in the complexified configuration space require care due to their mathematical subtleties.
This paper extends to realistic lattice models the study of boundary terms [1] which occur in some Complex Langevin (CL) simulations and have the undesired effect of spoiling correctness. We thereby aim at the estimation of possible systematic errors and correction of the results.
We briefly collect some necessary definitions to make this paper self-contained. For more details we refer to [1] as well as to earlier papers such as [2][3][4].
The complex Langevin (CL) process defines a time dependent probability density P (t) on the complexification M c of the original configuration space M, so we sometimes write it as P (x, y; t) where x stands for the real and y for the imaginary part of the configuration variables. For notational simplicity we assume that M and M c are flat, with coordinates x and x + iy, respectively; we will indicate the necessary changes for the non-flat case later.
P (x, y; t) obeys the Fokker-Planck equation (FPE) ∂ t P (x, y, t) = L T P, with where S is the action entering the integration measure ρ = e −S in the partition function; (1) determines the time dependent expectation values of holomorphic observables O via This is to be compared with the 'correct evolution' computed using an evolution of the complex density ρ(t) on the original real configuration space M determined by the PDE ('complex FPE') ∂ t ρ(x; t) = L T c ρ(x; t) , Correctness of the CL evolution means then equality of Eqs (3) and (4). Equality and hence correctness of the evolution depends on (1) equality at t = 0, which can be easily arranged; (2) the absence of boundary terms both at infinity and near poles of the drift. For the models we study here poles are either absent or far away from the P distribution and do not play a relevant role. Correct convergence for t → ∞ depends in addition on existence and uniqueness (independence of the initial conditions) of the limit which depends on the spectrum of L T c being located in the left half of the complex plane with a simple eigenvalue at the origin; the latter property is closely related to ergodicity of the CL process; this is a problem for stochastic processes in general.
The study of possible boundary terms uses a function F O interpolating between the two evolutions F O satisfies such that correctness of the evolution is guaranteed if .

II. TWO VERSIONS OF BOUNDARY TERMS
A. Boundary term as integral over the surface As discussed in [1], the left hand side of (9) is really a boundary term. We also found there that typically this derivative is maximal at τ = 0, so we focus on We rewrite this as an explicit boundary term, still assuming M and M c as flat. Suppressing the configuration arguments x, y and introducing a cutoff Y on the imaginary part y in (7), we define and with (1,5) we get the boundary term Here we assumed that the x integration is unproblematic because of periodicity or fast decay so that the ∇ 2 x terms cancel by partial integration, otherwise there would also be some x boundary terms, see e.g. in [5] where the stationary distribution was found to be P (x, y) ∼ (x 2 +y 2 ) −3/2 . In this case one could calculate x boundary terms by introducing a cutoff also on the x coordinates in eq. (12).
After some trivial manipulations (see [1]), involving (assumed unproblematic) integration by parts in x and the Cauchy-Riemann equations, with the derivatives acting on everything to the right, so we are integrating a divergence. This is equal to the surface integral where n is the outer normal to the surface |y| = Y and dS the surface element on |y| = Y . Of course it is not necessary to choose the cutoff Y in the form |y| ≤ Y as we have done here; it is only necessary that the family of cutoffs restricts y to compact sets which exhaust the full space as we send Y → ∞. Finally we take the limit t → ∞ to extract B(Y ) in the stationary state.

B. Boundary term as a volume integral
To explain the principle we assume again that the configuration space M is flat. Later we will see what has to be changed for the more interesting case of M being a compact group manifold.
Proceeding as in [1] we determine B via a limiting procedure with Y as before and still assuming that the real directions are either compact or have sufficient falloff to avoid any boundary terms there. Y will be sent to ∞ later (cf. [1]). Now we process the term as follows: evaluating (15) we find The t → ∞ limit of the first term is zero as the process reaches equilibrium. The second term can be nonzero, spoiling correctness. So we have to study Vanishing of B(∞) is just the old 'consistency condition' or 'convergence condition' (CC), discussed in [3], which signals stationarity. We now describe briefly the changes to be made in the case where the configuration space M is a compact group. Without loss of generality we may think of M as a space of unitary matrices and M c a space of complex invertible matrices. Each matrix M ∈ M c has a polar decomposition with U unitary and R = √ M † M positive. We introduce a 'unitarity norm' U N (not a norm in the mathematical sense) to measure the distance of a M ∈ M from the unitary subspace; a simple choice uses and defines U N for a lattice model as n(M ) divided by the number of links or the maximum of n(M ) over the links. The boundary term is given by where dM is Haar measure on M. Also the operator L c has a slightly different form (see [6]): where the operators D i are invariant vector fields on M, acting as derivations in the directions of a basis of the Lie algebra of M.

A. U(1) one plaquette model
We revisit the one plaquette model with regularization with x ∈ R as investigated in [1]. We calculate the boundary terms for the observable O = exp(ix) using the volume integral formulation, see [1] for surface integration. The boundary terms in this case arise by integrating with the measure P (x, y; t = ∞) using the cutoff Y , as written in eq.(17). At s = 0 there is a boundary term persisting for Y → ∞, as can be inferred from Fig. 1. At large Y we have more and more fluctuations, so the result becomes submerged in the noise. Finally, for large enough Y , with our limited statistics and necessarily finite Langevin time, we get the same result as without cutoff, since no points outside the cutoff region are sampled by the CL process and hence also not discarded. This corresponds to taking the limits in the opposite order, i.e. first Y → ∞ and then t → ∞. The CC, expressing equilibrium, have to be fulfilled in this limit, albeit with potentially very large fluctuations. This is seen in the last red and green data points.
For sufficiently large s we see that the boundary term converges to a value consistent with 0 and, as found in [1], the CL simulation gives the correct results of the regularized model within errorbars.
In Sec. IV A we show that the systematic error of the CL result is directly related to the boundary term measured here, and it can be estimated using the CL simulation alone.

B. SU (3) one plaquette model and Polyakov chain
Now we investigate the holomorphic Polyakov chain for possible boundary terms. The chain is defined via where c ± = β + κexp(±µ), L is the Polyakov loop and the U i are SU (3) matrices associated to the N links, analytically continued in the CL process to SL(3, C). S is a holomorphic action, hence deviations of the CL result should only come from boundary terms at infinity. The model has a gauge symmetry that makes all N values equivalent, but it presents a good test-bed for simulation methods. We simulate this model in two different ways.

Gauge fixing at N = 1
Here we use the gauge symmetry to diagonalize the matrix U . Since for U ∈ SL(3, C) det U = 1, this means that there are only two degrees of freedom. A single link now reads and the action becomes In addition one has to include the reduced Haar measure, which adds to the action the term This term is not holomorphic and leads to poles in the drift; these are, however, located at the boundary of the domain specified in (26) and therefore cannot lead to ergodicity problems; they also do not lead to boundary terms (cf. [7]). This non-holomorphicity created by gauge fixing is innocuous. The boundary term for Tr U arises from the integrand The expression (14) for the boundary term can be used here straightforwardly. We calculate it for this model explicitly by defining a surface on the complex manifold spanned by ω 1 and ω 2 as the boundary of the compact domain Y ≤ Y cut with Y = max(|Imω 1 |, |Imω 1 |). The boundary term reads (dropping t and τ dependence for briefness sake), defining x = (Re ω 1 , Re ω 2 ) T and y = (Im ω 1 , Im with the surface element dS y . This integral can be 'measured' in the CL simulation. The measurement becomes harder with increasing Y as the statistics deteriorates. We carried out a simulation for β = i, κ = 0 = µ; the exact result for the Polyakov loop is Tr U = −0.664 + 0.793i, whereas the simulation yields Tr U = −0.4809(6) + 0.5968(5)i, which is clearly not correct, i.e. boundary terms are to be expected. Fig. 2 left shows the boundary terms for this case, computed both in the surface and volume forms. We also did a run for β = 2 , κ = 0.1 , µ = 1 (see fig. 2, right), where the simulation yields Tr U = 2.0955(13) which is consistent with the exact result Tr U = 2.0957.

Polyakov chain with N > 1 and gauge cooling
For N > 1 we could use of course gauge fixing to reduce the model to N = 1. It is more instructive, however, to leave all the link degrees of freedom and study the effect of having gauge degrees of freedom and that of the gauge cooling [6] on the presence or absence of boundary terms; gauge cooling reduces the unitarity norm by non-compact gauge transformations.
We use the volume form of the boundary terms in the following. The unitarity norm U N used here is the average of n(U j ) (see (19)) over the links. There are many sets of parameters for which CL without gauge cooling does not Real part, from volume integration Imaginary part, from volume integration Real part, from surface integration Imaginary part, from surface integration give correct results. As an illustration we choose β = 2.0, κ = 0.1, µ = 1.0, where the exact result is Tr L = 2.0957. The boundary term integrand reads where the index a refers to the standard basis of the SU (3) Lie algebra given by the Gell-Mann matrices λ a , i = 1, . . . 8 and j numbers the link in the chain.  With gauge cooling the model is expected to give the correct results as the boundary terms go to zero. Thus it is no surprise that the exact value of Tr L = 2.0957 is consistent with the CL simulation yielding Tr L = 2.0961 (9). One can clearly see that without gauge cooling boundary terms develop. The simulation yields Tr L = 6.09(2) − 0.04(1)i, which is far off from the exact value.
For the full chain with parameters as above, there is a dependence on the step size in the value to which the boundary terms asymptotically seem to converge. B(∞) whose vanishing is the consistency condition (see equation (17)) fluctuates strongly in the trajectories (between -100 and 100) hence even a tiny dependence effect is enhanced, see Fig 4, where we used fixed step size and the Euler-Maruyama discretization for the updates. The step size dependence goes with slope one in the double log plot, consistent with a linear dependence on as expected; note, however, that the boundary term has a stepsize correction several orders of magnitude larger than the Polyakov loop itself. HDQCD was introduced originally in [8] for Wilson fermions and in [9] for staggered fermions. Later developments include [10][11][12][13]. A first complex Langevin study was performed in [14]; later gauge cooling as introduced in [6] and further developed in [15] has been used.
Complex Langevin for HDQCD produces a strong step size dependence when using the Euler-Maruyama discretization, just as in the Polyakov loop model of the previous section. Hence, to get away with larger step sizes, we use an improved updating method for the rest of this paper [16]. The boundary term for the Polyakov loop has the same form as in the Polyakov chain, see Eq.(31).
The boundary term for the plaquette looks similar; writing the plaquette as Tr P ≡ Tr U 0 U 1 U −1 2 U −1 3 , the boundary term integrand is: Note that these formulas are the same for HDQCD and full QCD, the difference of the two theories are in the drift terms. For HDQCD correct results are accessible via reweighting, at least for not too large lattices. Here we use for the cutoff the 'unitarity norm' defined as The results for the spatial plaquette average are shown in the left panel of Fig. 5; we only show the plateau region of the boundary terms and leave out the region of very large unitarity norms because of large error bars. Boundary terms are present even at β = 6.0, though they become quite small in magnitude as β increases. Note that in an earlier publication [6] it was observed that the CL results are correct within errors above β ≥ 5.8. Here we collect averages in the long time stationary phase of the system where a small deviation develops also above β ≥ 5.8. For these β values at moderate Langevin times one can see essentially correct results before the rise of the unitarity norm signals the buildup of the boundary terms measured here. This issue will be discussed in detail in an upcoming publication [17]. In the right plot of figure 5 we show the criterion from [18], which also shows that for all β CL is unreliable though for larger β the tail in the distribution shrinks considerably. Thus both criteria are consistent. The boundary terms are directly related to the proof of convergence and lead to a quantitative estimation of the magnitude of the error, see in Sec. IV C.
Note that in HDQCD the determinant is a product over spatial sites of 'local determinants'. In Fig. 6 we show the histograms of the local determinants in the measure of HDQCD for β = 5.5 and for β = 6.0 in the CL simulation. One observes that the distributions are far away from zero, therefore at these parameter sets the zeroes of the measure on the complex manifold should have no measurable effect (such an effect is expected close to the critical chemical potential µ cr = −ln(2κ) [19]). The boundary terms for the Polyakov loop appear, however, consistent with zero inside (albeit large) statistical errors, even at the lower β values where the average differs sizeably from the reweighting result. We shall discuss this aspect further in Sec. IV C. Finally, we revisit the 3D XY model in which complex Langevin famously fails already for small imaginary parts of the action [20]. The CL application to this model was analyzed carefully in [20]. It is of particular interest here because it shows that the occurrence of boundary terms depends on the observable considered.
The action reads Since there are no poles in this model the wrong convergence in complex Langevin can only come from boundary terms at infinity. We investigate two observables, the action density In the case of the action density as a function of µ 2 it has been shown that for small β, CL produces a discontinuity at µ 2 = 0 [20]. We also show this in Fig. 7, where we compare CL simulations with a worldline formulation [21], which leads to correct results and thus is used as a benchmark for CL. The discontinuity of the CL results for the action density can be understood as follows (see [20]): at imaginary µ, including µ = 0, when using real fields initially ('cold start'), the imaginary part of all drift terms are zero (even in the presence of rounding errors on the computer) and thus the configuration remains real at all Langevin times. The process is thus equivalent to a real Langevin process, producing correct results and no boundary terms. For real nonzero µ, no matter how small, the process will always wander into the complexified configuration space, converging to an equilibrium distribution extending into the complexification and boundary terms can appear.
For µ = 0, however, there is a subtlety: a cold start will produce a real Langevin process and yield a result smoothly connected to those for purely imaginary µ, as stated above; on the other hand, starting with an initial configuration with nonzero imaginary parts ('hot start'), the process will explore the complexified configuration space, converge to an equilibrium distribution not supported on the real subspace, producing a result smoothly connected to those for real µ = 0 and boundary terms can appear. To make sure that we test the boundary terms in the complexified distribution we always use a positive chemical potential below.
Note however that in the number density there is no apparent discontinuity at µ = 0 (which reflects the fact that the real part of the density is proportional to sinh(µ)), unlike in the action density, see Fig. 7. The discontinuity in the action density disappears for larger β and complex Langevin apparently leads to correct results. We will investigate this further by means of boundary terms below. In the formulas below we will use the shorthands For the boundary term of the action density we need with For the number density we need with K(x) as before and We computed the boundary terms for both observables for β = 0.2, 0.7, 0.9 and µ 2 = 10 −6 , 0.1, 0.2; the results are shown in Fig. 8.
In the case of µ 2 = 0.1, 0.2 we find that the boundary terms for both observables are largest for β = 0.2 as expected.
The action density has non-vanishing boundary terms for all three values of µ and β = 0.2 and 0.7. There is still a tiny, barely visible boundary term even for β = 0.9. Hence, we conclude that in this model the action density always has some boundary terms which can become arbitrarily small as β increases. The observables for β = 0.9, however, can be regarded as correct for all practical purposes. The number density, on the other hand, has no boundary terms at µ 2 = 10 −6 , for all the three β values studied. This demonstrates that the inclusion of the observable is crucial in the computation of boundary terms. The correctness of the CL evolution does not only depend on the distribution of the drift. The vanishing of the boundary terms for the number density is consistent with the lack of an apparent jump in Fig. 7.
Note that in Fig. 8 we again only show the boundary term up to the end of the plateau-like region. For larger values of Y huge error bars start to appear due to statistical outliers which typically lead to large values in the boundary term. Those outliers also sometimes lead to sudden jumps and larger errorbars, see e.g. the red curve in the center right plot of Fig. 8. The identification of a plateau-like region is enough to identify a boundary term since the limit Y → ∞ can be taken by extrapolation. Values beyond the plateau like region, where the error bars become very large should be discarded. The last point, which includes all Y should always be consistent with zero, since this is nothing but the 'consistency condition' from [3], signifying that the process has equilibrated. We checked that this is the case in all our simulations.
Finally we also look at the drift criterion from [18]. Fig. 9 shows the histogram of the absolute value of the drift. The drift criterion predicts that for β = 0.2 results are wrong, the same is true for β = 0.7, 0.9 for µ 2 > 10 −6 . For µ 2 = 10 −6 and β = 0.7 the tail is strongly suppressed, suggesting a small error of CL, while for β = 0.9 there is no tail at all, suggesting that CL is correct here. While this criterion does show the same sensitivity and also signals slightly wrong convergence for β = 0.7, 0.9, it does not take into account the observable and thus cannot find that the number density for µ 2 = 10 −6 is actually correct for all β, while the action density is not. Since this criterion relies on the interpretation of the behavior of a distribution in the region of small values and large errors, it is more of a qualitative nature and would not allow a quantitative estimate of the deviation of the CL results from the exact ones.

IV. ESTIMATION OF THE SYSTEMATIC ERROR OF CL FROM BOUNDARY TERM ANALYSIS
The systematic error of the CL result is given by Calculating this difference would allow us to get the exact result, however generally F (t, τ ) is not directly accessible for τ > 0, except for simple toy models.

The time evolved observable is
Assuming that the spectrum of L c is discrete and contained in the open left half plane -except for a simple eigenvalue at zero -we have where a 0 is independent of z and ω 0 = 0; Reω n > 0 for n > 0.
We are interested in a 0 which gives the correct expectation value: In general we have with A n (t) = dxdyP (x, y; t)a n (x + iy).
To relate F O (t, 0) to F O (t, t) we use a simplified ansatz based on the first two terms in (49): This ansatz in consistent with the assumption that the τ derivative of F (t, τ ) is maximal at τ = 0. In [1] we have calculated F (t, τ ) for the U (1) one plaquette model, and we have seen that this ansatz is a good description of the full F O (t, τ ). This leads to where we denote by A 1 the limit of A 1 (t) for large t.
We can access the constants A 1 , ω 1 at τ = 0 by calculating where B 1 is what we called the boundary term above. Using the ansatz one sees that B 1 = −ω 1 A 1 , B 2 = ω 2 1 A 1 , and finally the systematic error of CL (44) is given by A 1 = B 2 1 /B 2 . One can show that in the CL process the boundary terms are calculated by with a reasoning similar to that leading to eq. (17). Having an estimate of the systematic error allows us to calculate the corrected CL result: In the following Tables the column 'CL error' (systematic error) is calculated as: 'CL error' = 'CL' − 'correct' where 'correct' comes from other calculations considered as providing correct results such as direct integration, reweighting (for a mild sign problem) and the worldline setup. Thus ideally the agreement between columns 'CL error' and B 2 1 /B 2 signals that the ansatz (51) describes the evolution of F (t, τ ) well and the corrected CL result will be accurate.

A. U(1) one-plaquette-model
In Fig. 10 we show the imaginary part of boundary term B 2 as a function of the cutoff Y . The corresponding formula is shown in App. A. Note the similarity with Fig. 1, except for the inverted sign and much larger fluctuations present in B 2 (Y ). As shown in Appendix B of [1], it is expected that ω 1 ≈ 1 for this model, which amounts to B 1 ≈ −B 2 . In the Table I we show estimations of the error of the CL method for several parameter values. One notes that whitin errors, this method yields the correct value of the systematic error due to boundary terms. Note that using this estimate for the systematic error to correct the CL results we get the exact result within statistical errorbars.

B. The 3D XY model
Next, we analyze the systematic error of the CL method in the XY model. A straightforward application of L c yields the observables for boundary terms B 2 of the action density and the number density, similarly to the derivation of B 1 in Sec. III D, see in App. A. We show the resulting B 2 in Fig. 11, and extract the value of B 1 and B 2 via fits of a constant to the plateau region in Figs. 8 and 11, the results are shown in table II. Note that the errors on B 2 are rather large which is due to the larger statistical fluctuations of B 2 (Y ) as well as a varying fitting range to the data in Fig. 11. Note that there might be an additional systematic error, since B 1 reaches its asymptotic value at Y = 10, while the fitting range for B 2 was chosen chosen approximately starting at Y = 7 up to Y = 11. Hence, it is possible that B 2 has not yet reached its asymptotic value. For β = 0.2 our estimate of the systematic error is close to the measured systematic error of the CL method, where statistically significant deviations can arise due to the lack of a stable plateau region in B 2 before the signal becomes too noisy as well as the ansatz (51) not describing F (t, τ ) well enough. For β = 0.7 the deviation of CL from worldline is already small, hence a high precision is needed. For this reason we do not investigate β = 0.9 here.

C. HDQCD
As the numerical estimation of B 2 is quite expensive due to large fluctuations and finite stepsize effects, we restricted ourselves here to the calculation of B 1 (see also Sec. III C) which for the spatial plaquette average seems to give an upper bound of the systematic error of CL. A more detailed analysis is delegated to a follow up study including also full QCD.
In Table. III we show the boundary term for the spatial plaquette variable. One observes that the value of B 1 is roughly a factor of 10 higher than the error of the CL approach, therefore it can be used as an indicator of the magnitude of the systematic error of the CL approach. The boundary terms of the Polyakov loop appear much smaller than those of the spatial plaquette (consistent with zero inside large statistical errors), in spite of the averages deviating significantly from the reweighting result. This might signal that the ansatz (51) is too simple (and correspondingly the assumption that the maximal slope of F (t, τ ) is at τ = 0 may not be valid) for the Polyakov loop observable, or that there are strong stepsize effects at play. This issue is currently under investigation.

V. CONCLUSIONS
We have analyzed the emergence of boundary terms responsible for failure of the CL method for various models, from one-plaquette and Polyakov loop models to high density QCD (HDQCD) and the XY model. We used two mathematically equivalent versions: 'surface' and 'volume' and we found that numerically they agree wherever both can be computed. The 'volume' version turns out to be preferable for numerical simulation of HDQCD and the XY model. The vanishing/non-vanishing of those terms signals correctness/failure of the CL simulations. Y=max(Im(φ 2 )) action density, µ 2 =1e-6 Our analysis should give a quantitative estimate for the deviation of the CL method. In practice one must rely on a truncated ansatz for the calculation of the interpolating function between CL and correct results and thus the numerical costs in some cases might be very high. The drift criterion [18], on the other hand, is easier to use, but it is of a qualitative nature.
We show that in case the boundary terms are nonzero, one can estimate the error of the CL result at the cost of measuring a 'higher order' boundary term observable. The estimation uses an ansatz for the F (t, τ ) function interpolating between CL and correct results. This allows the calculation of the "corrected CL" value, which in the case of the U (1) one plaquette model gives the correct result to a high accuracy. In case of the 3d XY model studied here it allows to estimate the size of the systematic error with reasonable accuracy. In case of HDQCD the boundary term for the spatial plaquette variable allows an estimation of the order of magnitude of the systematic error that the CL approach has due to nonzero boundary terms. A detailed analysis of further observables such as the Polyakov loop average and the higher order boundary terms in HDQCD as well as full QCD are currently under investigation. In the case of the U (1) one plaquette model defined in eq. (22) the observable for B 2 is given by L 2 c e ikx =ke ikx k 3 + 2ks + 2ik 2 sx + is 2 x − ks 2 x 2 + (1 + 2k 2 + s + 2iksx)β sin(x) +kβ 2 sin 2 (x) + β cos(x)(−2ik + sx − iβ sin(x)) (A1) In the XY model, we look at action observable first.
where we have introduced the notation A...E for the terms appearing the the last line. For easier readability, we also introduce the shorthand: Next we look at the density, given by n = x n x = iβ x sin(φ +0 ). For the boundary terms we need =2∇ 4 x n + 2∇ x n∇ 2 x K x + 2∇ 2 x n∇ x K x + 2iβ 2 sin(φ +0 ) cos(φ +0 ) + 2iβ 2 sin(φ −0 ) cos(φ −0 ) + 2K x ∇ 3 x n (A15) where the derivatives of n x are given by