Complex Langevin: Boundary terms at poles

We discuss the problem of possible boundary terms at poles of the drift in the complex Langevin method, which spoil correctness of the method. For the simplest, however paradigmatic cases we can find complete answers. Lessons for more generic cases as well as open mathematical problems are discussed.


Introduction
The complex Langevin (CL) method has been studied for almost forty years [1,2], but it still has unsolved mathematical aspects. We have given a formal justification of the method in [3,4], pointing out already the possible failure of the justification due to unwanted boundary terms. In [5] and [6] we studied these boundary terms in great detail for some models, leading to verifiable criteria for correctness and even to the computation of corrections in cases of failure. All these were boundary terms at infinity, resulting from slow decay of the probability at infinity, since we were dealing with holomorphic drifts.
A particularly thorny issue is the problem of meromorphic drift arising from zeroes of the complex density defining the models. We have presented a detailed study of this issue in [7], with the emphasis on numerical analysis of models, from the simplest one-dimensional case to QCD.
In this note I am beginning a mathematical analysis of the boundary terms at poles of the drift, focusing on the simplest model already studied in [7], the so-called one-pole model. In fact, I start with a further simplification of that model in which only the pole term of the drift is kept; this allows to clarify the issue by carrying out explicit calculations. A heuristic justification for this simplification is the realization that near the pole, this term dominates the drift, so neglecting the rest of the drift should be a good approximation of the CL process while it is spending time in the neighborhood of the poles. In this simple approximation the linkage of the failure of the CL method with the appearance of boundary terms at the poles becomes manifest.
In [7] it was incorrectly claimed that those boundary terms had been found in the long time equilibrium limit; this error was pointed out by L. L. Salcedo; see the erratum to [7]. So here we move away from equilibrium and consider the short time evolutions, where we do find indeed the sought for boundary terms.
For the benefit of the reader we briefly recapitulate the general idea of the formal justification of the CL method and show where boundary terms may arise at poles of the drift, invalidating the formal justification.
We we consider a complex density on R ρ(x) = exp(−S(x)) , dxρ(x) = 1 , where ρ extends to an entire analytic function, but with possible zeroes (in which case S is of course multivalued).
The complex Langevin equation (CLE) in the form used here is where dw is the Wiener process normalized as and the drift is given by The drift thus is univalued but has simple poles at the zeroes of ρ. The average of a generic holomorphic observable O is denoted by where P z 0 is the probability density on C produced by the CL process starting at z 0 = x 0 + iy 0 and running for time t; the time evolution of P is given by the Fokker-Planck equation (FPE). We say that the CL process yields correct results if O ∞,z 0 agrees with the 'correct' expectation value of the same observable defined as In [3,4] it was shown that correctness is assured if the so-called interpolating function is independent of the parameter τ ∈ [0, t]. Here O(z; τ ) is the solution of the initial value problem The interpolation property follows from from which correctness (7) can be deduced. The left hand side of (11), via integration by parts, is equal to a boundary term, arising from possible slow decay at infinity as well as from poles of the drift. Details are found for instance in [3,7].
2 The need to consider the evolution before reaching equilibrium In [5,6] we found boundary terms at infinity by considering the equilibrium distributions. But we could not find boundary terms near poles that way, because the equilibrium distribution P (x, y; t = ∞) of the probability density was always found to vanish at least linearly at the poles of the drift, so holomorphic observables could not lead to boundary terms at the pole, as we will see.
The argument goes as follows: for simplicity let's assume that there is a pole at the origin; for the boundary term arising in equilibrium and for τ = 0 (see [7]) consider Using integration by parts and the Cauchy-Riemann equations (12) is (where L T is the Fokker-Planck operator, see [3]), since the first term of the left hand side vanishes in equilibrium. B is a boundary term. Now, since O is holomorphic, L c O has at most a simple pole at the origin, stemming from the pole in the drift. Since P vanishes linearly at the origin, the integrand of (12) is bounded in the region of integration, hence the boundary term vanishes for → 0. If we consider the time evolution for finite time t, the boundary term now is given by and the first term of this expression is no longer zero. In fact, below we will give an example where the second term of (14) vanishes, but there is a boundary term arising solely from the first term of (14).
The main difficulty is now to understand the L c evolution (9) of observables in the presence of poles. We focus on the simplest model, dubbed one-pole model in [7]. Since the equilibrium distribution does not lead to a boundary term, in this note we focus on the short time evolution, and we do indeed find boundary terms there.

The one-pole model
The action for the one-pole model can be written (after shifting the contour of integration) as with n p a positive integer. The drift of the CL process is then given by the real and imaginary parts of and the complex Langevin operator determining the evolution of holomorphic observables is where we wrote D z for d dz and we later use the same symbol for the partial derivative. 4 The "pure pole model": β = 0 Since we are not analyzing the equilibrium distribution we have the freedom to study systems that do not possess one; this leads to the consideration of the one-pole model for β = 0. This is the absolutely simplest model having a pole in the drift. Since z p plays no role, we also set if equal to zero.
We compare the finite time evolution of the probability density under the CL process with the evolution of the observables under (9) or equivalently the semigroup exp(tL c ). The two evolutions should be consistent if there are no boundary terms (see [3,4,7]).
For β = 0 the system could be treated by a real Langevin process. We will nevertheless study the system in the complex domain by choosing a complex starting point for the Langevin process.
4.1 n p = 2 The Langevin operator for this special case is For n p = 2 there is a simplification, pointed out already in [7]: L c is related to the Langevin operator with zero drift by a similarity transformation: and hence exp( with the integral kernel Here x is to be understood as an integration variable along the real axis. The evolution of a holomorphic observable O(z) is thus given by For the observables O k (z) ≡ z k , k = 1, . . . 4 and k = −1 this yields and More generally, it is easy to prove inductively that for any k ≥ −1 O k (z; t) is a polynomial in t; for even k it is holomorphic, while for odd k it is meromorphic with a simple pole at z = 0. So exp(tL c ) applied to polynomial observables is indeed given by the exponential series, which actually terminates after finitely many terms.

Comparison with the FPE evolution
To study the FPE evolution, we have to resort to numerical simulation. We proceed by running 10000 trajectories of the CL process all with a fixed starting point z 0 , stopping after Langevin times t = 0.01, 0.1, 0.5, 2.0.
Comparing the FPE results with those of the L c evolution, we find there is good agreement for the even powers but drastic disagreement for the odd ones, except for very small times.
In Fig.1 we show two plots comparing the evolution of the even powers O 2 (z 0 ; t) and O 4 (z 0 ; t) with the corresponding results O 2 t,z 0 and O 4 t,z 0 based on the FPE, for starting points z 0 = 0.5i, z 0 = 0.5+0.5i and z 0 = 1.5 + 0.5i.
The agreement between the FPE and L c evolutions for even powers corresponds to the fact that in this case there are no 1/z terms appearing, hence no boundary terms at the origin.
The opposite is true for the odd powers, there is strong disagreement, indicating the presence of a boundary term. As an example we show in Fig.2 the comparison of the FPE and L c evolutions for the observable O −1 and the same three starting points z 0 = 0.5i, z 0 = 0.5 + 0.5i and z 0 = 1.5 + 0.5i; some data of the comparison are compiled in Table 1.

Interpolating function and boundary term
From the failure of agreement for the odd powers it is easy to see that the interpolating function F O k (t, τ ) is not independent of τ when k is odd. We choose as the simplest case O −1 = 1/z; according to (29) O −1 (z; τ ) = 1/z, independent of τ , so according to (8) From Fig.2 and Table 1 it is clear that this is not independent of τ , and As discussed in Section 2, this is a boundary term, and it can only be due to the pole at z = 0, because for finite time P z 0 shows strong (Gaussian) decay. It could be evaluated also directly as a boundary term, but this is not necessary.
We can also see that there is no boundary term for even observables, such as O 2 (z) = z 2 . While it is difficult to evaluate the τ derivative directly because it involves L T P (x, y : t − τ ), we can compute the interpolating function for different values of τ to see that it is constant: Let's take for instance t = 2. and t − τ = 0.01, 0.1, 0.5, 2.0. We then find, using (26,28) In Table 2 we present the values of these quantities, showing independence of τ within the errors.

Remarks on general n p > 0
We have This operator still leaves the linear space spanned by the even nonnegative powers z 2 , ≥ 0 invariant, for any integer n p > 0. The odd powers z 2 −1 for ≥ n p /2 span an invariant linear space as well.
But for n p odd, iterating the application of L c to O k will not terminate. For n p even, however, it does terminate at the power z 1−np ; the observable is an eigenvector with eigenvalue 0. As for n p = 2, for k even O k t,z 0 and O k (z 0 ; t) agree , whereas for the odd powers they disagree. Likewise we find that exp(tL c )O k is a polynomial in t; for k = 2 it will be a polynomial in z, whereas for k = 2 − 1 it will be a rational function of z with the lowest negative power being z 1−np .
As an example, we consider three observables for the case n p = 4: we find by a simple calculation Some data comparing O k (z; t) with O k t,z 0 for the case n p = 4 are compiled in Table 3.
5 β > 0 and z p = 0 The Langevin operator is now 11.75(0) 11.7509 (3) As pointed out in [7], the similarity transformation, after restricting to the real axis yields in this case essentially the Hamiltonian of a harmonic oscillator Note that H F P is not positive on L 2 (R); it has exactly one negative eigenmode: In [7] it is explained that this problem disappears if one considers H F P on L 2 (R + ) with Dirichlet boundary conditions at 0; then only the odd eigenvectors contribute. Because ρ(x) vanishes at x = 0 the (real) Langevin process avoids the origin.
In any case, because of (42) we can use Mehler's formula [8] exp to obtain the kernel for exp(tL c ): We define so that bσ = exp(−2βt) .
Thus (45) becomes We now replace x by z, considering it as a complex variable by analytic continuation. We consider again the observables O k (z) ≡ z k , k = 1, . . . 4 and k = −1; the integrals can be carried out analytically and yield and As for β = 0, the expressions for the odd powers are now meromorphic functions of z, with simple poles at z = 0. For the even powers we have polynomials in z.
Consistency of these results with the earlier ones for β → 0 is easily verified, using We can now also consider the limit t → ∞, using For k odd O k (z; t) grows exponentially in t, whereas for k even which are the correct expectation values.
We can again compare O k (z 0 ; t) and O k t,z 0 for finite times; qualitatively the situation is not different from the case β = 0, see Fig. 3, except that now the limit t → ∞ can be considered. We find again agreement for k even and disagreement for k odd; for t = 2 the expectation values of the even powers have almost reached the infinite time limit.
As before, for k odd there is strong disagreement between O k (z 0 ; t) (which has a finite limit for t → ∞) and O k t,z 0 (which grows exponentially); this implies that the interpolating function has a nonzero slope, signaling a boundary term.

General n p > 0
For this case we can still give a rather complete analysis, even though we do not have the benefit of the Mehler fomula. We have so the even and odd subspaces are still invariant under L c . For a holomorphic observable O(z), given by a convergent power series we find The dual action on the coefficients is thus (L T c ) n = (n + 2)(n + 1 + n p )a n+2 − 2βna n .
There are two choices to start the recursion: (a) at n = 0 (b) at n = 1 − n p and the recursion will stop at n = −k. So for fixed k only finitely many a n will be different from 0 in both cases.
The semigroup exp(tL c ) applied to z k , k = 1−n p , . . . , 1, 2, . . . will thus be given by a polynomial in z and 1/z. We give a simple example: which generalizes (51) to general n p . We now distinguish two cases: 6 The general case: n p > 0, β > 0, z p = 0 This case can no longer be analyzed explicitly and analytically, whether z p is real or not. It was studied numerically in [7]. Here we discuss some mathematical subtleties arising in this case.
First we want to formulate a mathematical conjecture that might seem plausible, but is in general not correct: Conjecture 1: Let K(z) be meromorphic in C, holomorphic in a domain G ⊂ C and O(z) also holomorphic in G. Then there is a solution to the initial value problem (9), which is jointly holomorphic in (t, z) for z in any simply connected subset of G and t in a neighborhood of But Conjecture 1 is wrong even for the case of K being an entire function, in fact even for K = 0. The following counterexample is due to Sofya Kovalevskaya (1875): it was found in Wikipedia: Proof: O(z; t) is given, using the heat kernel, by A closed analytic form of this could be given in terms of error functions, but it is not needed. The non-analyticity can be seen either by looking at a power series ansatz in t and z for the solution, which diverges for any t = 0, or by looking at the result for z = 0, which is This is clearly not analytic in t at t = 0.
On the other hand we can express O(z; t), using Fourier transformation, as showing that for any t ∈ C with Re t > 0, O(z; t) is an entire function of z.
Returning to our one-pole model, −L c is still conjugate to the Hamiltonian H F P : via the similarity transformation (40). Inserting the Hamiltonian becomes with and For nonreal z p this is a non-hermitian operator and we do not know much about its spectrum; in fact to give the term a precise meaning we would first have specify the space in which H F P operates. A simple choice is L 2 (|ρ(x)|dx).
But let us try to find the action of the semigroup exp(tL c ) on powers of z. We have L c z n = n(n + n p − 1)z n−2 − 2βnz n − 2βnz p z n−1 and hence the dual action on the Taylor coefficients a n of O is (L T c a) n = (n + 2)(n + 1 + n p )a n+2 − 2βna n − 2β(n + 1)a n+1 z p . (76) The z p term mixes the even and odd subspaces.
(76) has the structure of a downward recursion; iterating this recursion will produce nonvanishing coefficients a n with n arbitrarily large negative. Unfortunately this will lead to coefficients a n growing factorially for n → ∞.
for the observable O 2 (z) = z 2 and the parameters given in the caption. It is seen clearly that there is a linear increase, indicating that |a −n (t)| grows roughly like n n/2 , so the Laurent expansion diverges for any z.
We interprete this situation as follows: it is analogous to the one in the counterexample above: the semigroup exp(tL c ) applied to the powers z k is not analytic in t at t = 0, so it cannot be constructed via the exponential series. In the counterexample there was a simple way out of this dilemma: just avoid observables with poles. Here there is also a subspace of observables that produces a convergent Laurent series (see below). But for pure powers we cannot construct the semigroup by means of the exponential series. We expect that the solutions to the initial value problem (9) exist and are meromorphic in z, but in general nonanalytic in t at t = 0, just as in the counterexample above.
So the analyticity properties of the solution to the initial value problem for a general holomorphic observable, where it exists, are not as simple as Conjecture 1 would suggest, undermining the formal justification of the CL method.
But as stated above, it is possible to find a linear subspace of observables that do not suffer from this disease; this subspace is obtained as a deformation of the even subspace for z p = 0. It can be obtained as the linear span of the eigenvectors of L c to nonpositive eigenvalues, which in turn are deformations of the eigenvectors obtained for z p =0.
The dual action L T c (76) on the coefficients leads to the eigenvalue equation (n + 2)(n + 1 + n p )a n+2 − 2βna n − 2β(n + 1)a n+1 z p = λa n . (79) This can again be rewritten as an upward recursion a n+2 = 2β (n + 2)(n + 1 + n p ) [(n + k)a n + (n + 1)z p a n+1 ] , The recursion has to start at n = 0; the lowest coefficients are For z p = 0 the recursion no longer terminates, but it produces a sequence decaying roughly like 1/Γ(n/2), thus defining an entire function of z: it is not hard to prove by induction a bound of the form |a n | < 1 Γ(n/2 + 1) |a 0 | (2|β|(1 + |z p |)) n/2 For observables from this space we do not expect any boundary terms and the CL method should work.
We checked the correctness of the CL method numerically for the example O 2 , the eigenvector of L c obtained for k = −2 from the upward recursion (80). We ran the recursion up to n = 50, when the coefficients a n are below 10 −26 . In Table 4 we compare the CL results with those of the L c evolutions. The parameters are given in the caption.
The table shows that the CL results are correct, possibly with a small truncation error for t = 2.
It should be noted that the recursion (80) works for any k ∈ R, leading to an entire function of z. Of course k > 0 should be excluded because it does not lead to convergence fore t → ∞. But does this mean that the whole positive real axis belongs to the spectrum? This question is not well posed without specifying the space (Hilbert space, Banach space or a more general topological vector space) in which the problem is posed.  Table 4: CL results for the time evolution of the observable O 2 , which is an eigenvector with eigenvalue −4β for n p = 2, β = 1, z p = i; the exact results are obtained as exp(−4βt)O 2 (z 0 ) with z 0 = 0.5i.
In a slightly different way, a subspace of entire functions invariant under L c is obtained by forming linear combinations of observables for various values of k. This means that the second condition of (82) is no longer enforced, but the condition a 1 = 0 still holds. The invariance under L c requires that a 3 is given as to preserve this relation under L c enforces a similar linear relation between a 5 and a 4 . Continuing this kind of reasoning, we learn that for any > 0 a 2 +2 is a fixed multiple of a 2 with a factor that goes to zero at least linearly with z p . The coefficients still obey the bound (83), so this defines the subspace V + of entire functions invariant under L c ; the elements of this subspace will not give rise to boundary terms and the CL process produces correct results for them.
It would be nice to find a similar invariant subspace, consisting of functions holomorphic in C \ {0}, and which reduces to the odd subspace for z p = 0. We could not find such a space because of the the convergence problems of the Laurent expansion discussed above. But it is clear that for observables O / ∈ V + the formal justification of the CL method fails and boundary ptoblems are to be expected..

Conclusions and open problems
The first conclusion is that to find boundary terms at poles, it is not sufficient to look at the equilibrium distribution; it is necessary to study the short time evolution.
The second point is that quite likely the Conjecture 1 (analyticity in t at t = 0) is in general not correct for the simple observables like powers of z. It does seem to hold, however, for a subspace of holomorphic observables; for the one-pole model such a subspace has been constructed in Section 6; unfortunately for lattice models it seems very hard imitate this construction.
Finally, for the pure pole model we established explicitly the existence of boundary terms at the pole by analyzing the short time evolution. Since in the vicinity of the pole the pole term alway dominates, the pure pole model should give a good approximation of the CL process for more general models; thus we expect such boundary terms generally, provided the CL process comes arbitrarily close to the pole.
An open question concerns the analyticity properties that can be expected for the solutions of the general initial value problem (9), given the analyticity properties of the drift and the initial value O(z; 0). Since Conjecture 1 failed, inspired by Kovalevskaya's counterexample, we formulate something weaker: Conjecture 2: Let K(z) be meromorphic in C, holomorphic in a domain G ⊂ C and O(z) also holomorphic in G. Then there is a solution O(z; t) to (9) which for t > 0 is holomorphic in z for z in any simply connected subset of G.
Remark: Conjcture 2 of course implies that O(z; t) can only have isolated singularities at the poles of K(z); these may be poles or essential singularities, but could also be branch points. Conjecture 2 has implicitly been assumed to be true for instance in [7]. It certainly would be worth knowing if it can be converted into a theorem.
Our experience in the previous section unfortunately suggests the following: Conjecture 3: Let K(z) be meromorphic in C, holomorphic in a domain G ⊂ C and O(z) also holomorphic in G. Then 'generically' there is no solution O(z; t) to (9) that is holomorphic jointly in (t, z) for z in simply connected subsets of G and t in a neighborhood of R + .
The trouble is the lack of analyticity in t at t = 0, just as found in the example above. Of course the term 'generically' is a bit vague; in our model we found that for Conjecture 3 holds for z p = 0 and O / ∈ V + , but not for z p = 0 and not for z p = 0, O ∈ V.
Acknowledgment: I would like tho thank Gert Aarts, Manuel Scherzer, Denes Sexty and Nucu Stamatescu for the long and fruitful collaboration on the CL method. Clearly this note is an outgrowth of that collaboration. I am also grateful to Denes Sexty for useful comments on this manuscript.
A The importance of using the correct function space We want to highlight a subtlety of defining the the semigroup exp(tL c ) or equivalently the initial value problem (9) by looking at a simple example. What we said about the spectrum is equally valid for the semigroup: without fixing the space in which we search for solutions, the initial value problem is not well posed.
In (29) we gave the solution as But there is a second solution: The difference lies in the analyticity properties in t: while for any fixed t ∈ R + (86) is holomorphic in z in the whole complex plane C, it has an essential singularity in t at t = 0 for any fixed z. Furthermore, for |Re z| < |Im z|, the limit t → 0 does not exist, so in this domain (86) does not solve the initial value problem.
Since the CL does not avoid the region |Re z| < |Im z|, for the solution (86) the argument for correctness fails. On the other hand, the real Langevin process, occurring for real starting points, never moves into the dangerous region and reproduces the second solution for ρ(x) restricted to the half line x ≥ 0.
On the other hand, as noted before, the solution (29) is meromorphic in z for fixed t, but holomorphic in t for fixed z, and solves the initial value problem correctly; here it is the pole at the origin that invalidates the formal argument by giving rise to a boundary term.
So it is essential to specify at least the analyticity domains of the solutions when attempting to construct the solution of the initial value problem (9). The real and complex cases demand different spaces.