The sign problem in full configuration interaction quantum Monte Carlo: Linear and sub-linear representation regimes for the exact wave function

We investigate the sign problem for full configuration interaction quantum Monte Carlo (FCIQMC), a stochastic algorithm for finding the ground state solution of the Schr\"odinger equation with substantially reduced computational cost compared with exact diagonalisation. We find $k$-space Hubbard models for which the solution is yielded with storage that grows sub-linearly in the size of the many-body Hilbert space, in spite of using a wave function that is simply linear combination of states. The FCIQMC algorithm is able to find this sub-linear scaling regime without bias and with only a choice of Hamiltonian basis. By means of a demonstration we solve for the energy of a 70-site half-filled system (with a space of $10^{38}$ determinants) in 250 core hours, substantially quicker than the $\sim$10$^{36}$ core hours that would be required by exact diagonalisation. This is the largest space that has been sampled in an unbiased fashion. The challenge for the recently-developed FCIQMC method is made clear: expand the sub-linear scaling regime whilst retaining exact on average accuracy. This result rationalizes the success of the initiator adaptation (i-FCIQMC) and offers clues to improve it. We argue that our results changes the landscape for development of FCIQMC and related methods.


Introduction.-
Exact methods for solving the Schrodinger equation are used at the forefront of understanding in condensed matter physics [1][2][3][4] and in molecular quantum chemistry [5][6][7]. However, exact parameterizations of the many-body wave function for a general system of interacting fermions scale exponentially with the system size, i.e. O(e N ). Quantum Monte Carlo (QMC) techniques attempting to determine these parameters are hindered by their values being either positive or negative, causing more pronounced variability than a set of parameters with a single sign. Unconstrained algorithms [8] therefore also scale exponentially. Constrained algorithms, in which the sign of the wavefunction parameters are fixed (e.g. by a trial wavefunction), can be polynomially-scaling but at the cost of a bias [9][10][11][12]. The (unsolved) challenge to find a constraint on the signs of a wave function, whilst still reproducing the exact result, is usually termed the fermion sign problem. For some QMC methods and specific systems this constraint can be imposed exactly, abstracting away the fundamental complexity of the problem. Overcoming the sign problem is vital for the accurate treatment of real systems.
Here, we investigate the sign problem of a recentlydeveloped QMC method developed for use in finite molecular basis sets: full configuration interaction QMC (FCIQMC) [5]. This is the direct (ground-state) QMC analogue of exact diagonalisation, finding the exact lowest-energy solution for a finite Hilbert space with an exponential number of states using a walker-based algorithm, where the Hilbert space is a set of Slater determinants and grows exponentially with the number of fermions. Since it does not impose the signs of the wave function in advance, this method does in general have a sign problem [13], and the cost of the storage of the exacton-average wave function has been shown to scale linearly in the size of the Hilbert space for a series of atomic systems [14][15][16]. Inspired by recent interest in highthroughput data driven informatics [17] we here study 378 systems with a plateau using a high-throughput approach. In so doing, we extend the information available about this method considerably.
As shown below, we discover a regime of the k-space 1D Hubbard model where the amount of information required to store the ground-state wave vector has sublinear scaling with size of the Hilbert space states. This is achieved during the simulation in the presence of a sign problem, assuming a linear wave function ansatz, without requiring any information or bias beyond the Hamiltonian. This regime is smoothly connected to the more typical linear-scaling regime including situations in which there are more walkers required than the number of states. We use this to build a conceptual map based on the system's parameter space (Hubbard U and size of Hilbert space) for the regions of scaling for this promising QMC technique. We relate these findings back to FCIQMC and its initiator adaptation, providing concrete insight for the development of these methods. We discuss whether, in light of this, QMC for the FCI problem could receive routine use for the treatment of correlated electrons in many more realistic contexts. . These two panels describe how we are able to identify plateaus. In (a), the decrease in U is shown to obscure the plateau. The bold, black (red online), dashed annotation indicates the U = 0.5 population growth averaged over 100 random number seeds. In (b), we show that the population at the plateau corresponds to the maximum value of histogram of the population (although, in general, variable bin widths were used). These two panels share a common key.
ground-state solution of the imaginary time Schrodinger equation in the limit of τ → ∞ if |Ψ(0) has a non-zero overlap with the ground-state. The configuration interaction wavefunction ansatz |Ψ = i c i |D i is used, where {|D i } is a set of Slater determinants of size N det formed from N elec electrons and N orb spin-orbitals. A first-order Euler finite-difference approximation to the imaginarytime Schrodinger equation gives where H ij = D i |Ĥ − S|D j and an energy offset ('shift'), S, has been introduced in order to conserve normalisation. For a sufficiently small timestep [13], δτ , the coefficients tend to the ground-state of the Hamiltonian matrix. Although FCIQMC is essentially a stochastic version of the power method [13,18], the core algorithm has inspired a wide range of developmental advances and new methods [19]. The wavefunction coefficients are discretized by representing them with a set of signed walkers [20]. In each timestep, the set of walkers is considered each in turn and Eq. (1) sampled according to unbiased rules [14]. Since the simulation allows the sign of a site to change, because the off-diagonal matrix elements are of different signs, the arithmetic that occurs on a site to determine its overall sign involves a process termed annihilation, removing pairs of walkers which belong to the same determinant but have opposite signs. The annihilation step preserves the expected distribution of walkers and crucially prevents the growth of exponential noise and collapse onto the sign-problem-free ground state of the matrix defined The sign problem arises in FCIQMC because the signs of the FCI coefficients are not known in advance. If they were known, it is postulated that the sign problem could be removed by factorization [21][22][23]. In contrast the ma-trixH has coefficients that are all of the same sign in much the same way as a bosonic wavefunction has the same sign in its value for all particle coordinates. This is the determinant space analogue of the real-space bosonic solution for FCIQMC [24].
A typical simulation contains four distinct phases [25]. Initially the shift is held constant (typically to a meanfield energy) and the population of walkers grows exponentially. The population spontaneously stops growing and enters the plateau phase at a system-dependent population, during which the ground-state sign structure emerges. The population spontaneously begins to grow again at an exponential rate, albeit slower than before. The shift is then varied to keep the population approximately constant. Above the plateau, statistics can be accumulated that are demonstably from the exact solution [14]; the post-plateau population is a stochastic representation of the exact wave function. The first three phases can be seen in Fig. 1(a).
Plateau determination.-The plateau is therefore a very powerful conceptual feature of FCIQMC. Phenomenologically, the plateau provides an unambiguous signal of how hard the sign problem is because it represents the minimum storage cost for an on-average exact representation of the FCI vector [13,14]. Computationally, this number of walkers determines the dominant scaling bottleneck, for both memory and computer time, of the method since each Monte Carlo iteration loops over this list. The stochastic sampling of the propagator will also contribute to the noise of the simulation but this is pre-multiplied by the length of the main vector.
Crucially therefore (and uniquely in projector QMC methods) the plateau provides an unambiguous measure of the sign problem in FCIQMC: by comparing the plateau height against the number of determinants in the 10 2 10 3 10 4 10 5 10 6 10 7 10 8 10 9 Size of Space Hilbert space, we obtain a measure for how 'hard' a system is for FCIQMC.
In order to study plateau heights, it is important to establish a unique and reproducible definition accounting for variations seen in Fig. 1(a), where plateaus are obscured by becoming 'shoulder'-shaped or being overwhelmed by stochastic noise. The plateau can be thought of as the walker population that the simulation spends the most time at. To find this, the relative frequency that a certain population window appears in the simulation is computed and the maximum of this distribution found. The histogram of the logarithm of the population rather than that of the population is used in order to handle the exponential growth in population. The plateau signal is shown for various values of U in Fig. 1(b). The disadvantage of this approach is that for some values of U , this can lead to overestimation of the plateau for some runs as multiple peaks compete. This can be circumvented by changing the bin width, and this must sometimes be interpreted manually. This procedure is discussed in more detail in the Supplementary Material, where each plateau can also be verified by inspection [27].
Plateau analysis.-We now consider the 1D translationally invariant (k-point) one-band Hubbard model, for the parameter range U = 0.5, 0.75, 1.0, 2.0, 4.0, and 8.0 for N s = 12, 14, 16, 18, 20, and 22 sites per simulation cell. We explore a wide range of doping levels (4 to 2N s − 4) for consistent simulation parameters [28]. All calculations were performed with the HANDE QMC code [29]. We focus on 1D systems because shell-filling effects were anticipated to make interpretation substantially more difficult in higher dimensions. Although this system is only of one dimension, the range of parameters encompasses a wide range of correlation regimes.
All of the plateaus we have found are plotted in Fig. 2(a). We can use this to probe the different scaling laws, based on N plat = βN γ det , where the exponent γ is defined by the tangent to the curve at a given N det . Some of the trends described below are slight and to aid readers a larger version of the graph can be found in Supplemental Material.
Linear in N det and in U . -This is the conventional scaling regime that has been previously observed. Three diagonal-running parallel lines (N det ∝ N plat ) fit the data from U = 2.0, 4.0, and 8.0 at high N det . The behaviour of the gradient with U is consistent with the plateau being linear in U (as shown in Ref. 13). Overall, therefore, γ = 1 and β ∝ U . We note in passing that these trends are remarkably consistent as doping and the particle number are changed.
The bold red line, almost coincident with most of the U = 4.0 data set, shows the line of N det = N plat where on average we store the same number of integers as the size of the space. The grey shading indicates where we would expect, therefore, to store less information than the full wave vector in order to obtain the solution via an exact diagonalisation (or FCI); above this line in the unshaded region the memory requirement is comparable to FCI. Although this is true for storage, the computational time is still expected to be linear in the size of the space, and this (being the upper limit of the scaling here) is better than many diagonalisation routines.
Sub-linear in N det , non-linear in U . -At sufficiently small system sizes, sub-linear scaling (γ < 1) is observed for all U except U = 8.0. The region of this reduced scaling depends on U , and extends to larger sys-tem sizes for smaller U . The lowest measurable exponent observed is γ ≈ 0.1 for U = 0.5.The reduced exponent is surprising for two reasons. The first is that the FCI wave function is apparently representable with storage that is sub-linear with the size of the space. The second is that the projector algorithm here is able to find this minimal representation with no additional information than the Hamiltonian, and in particular no biasing.
Non-linear in N det and in U . -In the intermediate region between these two regimes, there is a polynomial region in N det (γ < 1) as the scaling law seems to return to the original linear scaling regime (with no shift) and crosses through γ = 1. This is most prominently seen by careful examination of U = 2.0, which only deviates from the conventional scaling laws slightly. The return to γ = 1 appears around N det = 10 5 . The dashed, red line shows where we would expect the U = 1.0 scaling to be if the linear scaling with N det and U continued, which our data set never reaches. Nonetheless, the limiting scaling at high N det seems to be linear in N det and U .
Sign problem diagram.-These scaling relationships are summarized with respect to the system parameters N det and U in Fig. 2(b) The tie-lines we have plotted are made by hand, and are estimates limited by the breadth of our data set. In particular, sharp lines should be considered as estimates and not definitive. To the best knowledge of the authors, this provides the most comprehensive summary of known information about the sign problem, and scaling, in FCIQMC.
We also observe that as the system size is raised, the method returns to linear scaling in the size of the space and exponential scaling in particle number. This is interesting because it seems like it is a reverse to what might be expected to happen. As the system gets closer to the thermodynamic limit, quantities such as the energy become extensive (i.e. scale linearly with N elec ). As the correlation length is increasingly well-contained within the simulation cell, we would expect the problem to become easier and of improved scaling (due to self-averaging).
Conclusion & Discussions.-Our principal conclusion is the discovery of a regime of the k-space Hubbard model where the exact ground state can be stored with sub-linear representation cost. This exact-on-average representation requires no prior knowledge beyond onthe-fly access to the Hamiltonian matrix elements, and we believe this finding to be widely significant [30]. By means of a practical demonstration of the significance of this regime, we can find the ground-state energy for the half-filled 70-site system for U = 0.1 in 250 core hours (E = −87.418564(7) a.u.). The size of space is 10 38 determinants, and this is the largest unbiased simulation to date. By way of comparison, we estimate exact diagonalisation would take 10 36 core hours, based on known scaling laws and calculations from smaller system sizes using the algorithm implemented in ALPS [31,32]. This poses the question: how many more, larger, systems are avail-able for study that have simply not yet been attempted?
The low-scaling regime, occurring in a greater range of systems at low U , seems co-incident with the weakly interacting, or weakly correlated, regime. Although this is a tempting conclusion to draw, this is not a link that we have the scope to explore in detail here. This is in part due to the sign problem being representationdependent.In particular, the 1D Hubbard model is a toy model, not only because it is already solvable at polynomial cost [33,34], but also because a transformation to the real space basis set leaves it sign problem free for FCIQMC [13]. Nevertheless, where there exist large expanses of the Slater determinant space that are redundant, FCIQMC should be able to find them, but that sparsity must exist to be found.
Finding such representations is greatly facilitated by our study here. This is first and foremost because we demonstrate the potential benefits to be found, but also for the resources this study provisions for development of FCIQMC. We start a database of plateaus, semiautomated plateau height analysis and a practical understanding as to what might happen to the plateau or sign problem with further development. We hope that these concerns are placed at the forefront of FCIQMC development. One such route of promise and significance is the discussion of symmetry breaking and restoration in the context of QMC techniques [35][36][37][38]. Another is the adaptation of the core algorithm to other Fock space QMC methods [18,[39][40][41], but in the wider context of other quantum chemical methods it is important to know whether there is a sign problem at all [40,[42][43][44][45].
The appearance of a sub-linear regime is interesting because it mirrors some evidence that the initiator adaptation has this scaling in its wave function representation [16,46]. The initiator adaptation (i-FCIQMC) imposes a population dependence on H ij , zeroing some elements that are considered outside the currently wellsampled space. This greatly enlarges the size of systems that can be sampled, up to 10 108 determinants to date [47], but at the cost of a systematically improvable bias. In this context, therefore, i-FCIQMC is an approximate (but systematically improvable) method that expands the sub-linear scaling rather than this scaling being unique to this approximation. This strongly implies there are further improvements that can be made to expand this reduced scaling still further.
To the wider community in QMC methods, we hope this shows that FCIQMC provides interesting phenomenology and therefore something else to offer beyond FCI-quality energies. The analysis we present here argues that a sign problem that is easy to detect is potentially more informative than an error that is unquantifiable. It also demonstrates that FCIQMC does have the potential to solve large systems, which is surely required for its application in condensed matter physics, provided that its sign problem can be controlled. This puts emphasis back onto understanding and solving the sign problem, which is also a more universal effort.
Acknowledgements.-We are grateful to Alex Thom for comments on the manuscript and early access to Ref. 48