Efficient calculation of unbiased atomic forces in ab initio Variational Monte Carlo

Ab initio quantum Monte Carlo (QMC) is a state-of-the-art numerical approach for evaluating accurate expectation values of many-body wavefunctions. However, one of the major drawbacks that still hinders widespread QMC applications is the lack of an affordable scheme to compute unbiased atomic forces. In this study, we propose a very efficient method to obtain unbiased atomic forces and pressures in the Variational Monte Carlo (VMC) framework with the Jastrow-correlated Slater determinant ansatz, exploiting the gauge-invariant and locality properties of its geminal representation. We demonstrate the effectiveness of our method for H$_2$ and Cl$_2$ molecules and for the cubic boron nitride crystal. Our framework has a better algorithmic scaling with the system size than the traditional finite-difference method, and, in practical applications, is as efficient as single-point VMC calculations. Thus, it paves the way to study dynamical properties of materials, such as phonons, and is beneficial for pursuing more reliable machine-learning interatomic potentials based on unbiased VMC forces.


I. INTRODUCTION
Ab initio quantum Monte Carlo (QMC) [1] is a stateof-the-art numerical approach for evaluating the expectation values of many-body wavefunctions.It usually provides extremely accurate energies.To date, QMC has been successfully applied to various materials for which other electronic structure methods, such as the Density Functional Theory (DFT), lose predictive power.Examples are molecular crystals [2], two-dimensional materials [3][4][5], superconductors [6], and materials at extreme pressures [7][8][9][10][11].Despite several successful applications done so far and the recent development of sophisticated QMC packages [12][13][14][15][16], this technique is not as widely used as other established electronic structure methods.If compared with DFT [17], one of the main QMC drawbacks is the lack of an efficient and affordable scheme to compute atomic forces consistent with the derivatives of the total energy with respect to atomic positions (a.k.a.unbiased atomic forces).This problem is relevant in the construction of machine learning potentials (MLPs), which need large datasets, where energy and forces are computed with the method of choice.Recently, some QMC-driven MLPs have been reported [18][19][20][21][22], where the availability of unbiased forces and pressures has been a major concern.
There are two main real-space QMC frameworks, the variational Monte Carlo (VMC) and the fixed-node diffusion Monte Carlo (FN-DMC) methods [1].In this study, we focus on VMC because the forces computation within the FN-DMC framework is much more difficult and it is still a highly debated topic [23][24][25][26][27][28][29][30].Let R α be the atomic position of the nucleus α.The atomic force act- where Ψ T is the variational wavefunction, ⟨A⟩ indicates the quantum average of the local operator A over the VMC sampling of |Ψ T | 2 , E L is the so-called local energy (E L ≡ ĤΨ T /Ψ T ), with E ≡ ⟨E L ⟩, and {p 1 , • • • , p Np } is the set of N p variational parameters included in the Ψ T ansatz.Eqs.(1a), (1b), and (1c) are called the Hellmann-Feynman (HF), Pulay, and variational terms, respectively.One usually ignores Eq. (1c) when evaluating atomic VMC forces, resulting in The long-standing problem of obtaining a statistically meaningful F VMC α value with a finite variance and at the same cost as the VMC energy evaluation has been solved by the zero-variance zero-bias principle [31] together with the space-warp transformation [32] and reweighting techniques [24,31,[33][34][35][36]. Hereafter, we will denote F VMC α as regular VMC force (Eq.2).
Neglecting Eq. (1c) is justified only when the system is at its variational minimum for all parameters (i.e., ∂E/∂p i = 0, ∀i) or when the variational parameters, which are implicitly dependent on the atomic positions, accidentally or by construction become positionindependent (i.e., dp i /dR α = 0, ∀i); otherwise, F VMC α can be biased.This bias is referred to as self-consistency error [36,37].
In this Letter, we propose a method to obtain unbiased atomic forces and pressures that does not increase the computational complexity of the VMC energy calculation, by supplementing the regular VMC force with a suitable variational term, computed by exploiting the gauge-invariant and locality properties of the antisymmetrized geminal power (AGP) ansatz [38].For assessment, we demonstrate that the potential energy surfaces (PESs) of the H 2 and Cl 2 molecules, and the equation of state (EOS) of the cubic Boron Nitride (cBN) are consistent with the forces and pressure obtained by our proposed method.

II. ILLUSTRATING THE PROBLEM
For the sake of clarity, we present the case of the PES of a dimer expressed as a function of the interatomic distance R, while the present discussion can be applied for any other system.Fig. 1 shows a schematic picture of several PESs.Let E exact (a) be the exact PES of the dimer.E exact is the ultimate goal of any electronic structure calculation, but it is unknown except for nodeless ground states.The best possible PES within a given Ψ T ansatz is E fullopt (b), yielded by a VMC calculation with the fully optimized Ψ T .This is achievable for rather small systems by optimization methods suitable for noisy data [39,40], but becomes impractical for larger ones.Therefore, a good compromise between accuracy and computational efficiency is E JSD (c), obtained by the Jastrow correlated Slater determinant (JSD) ansatz, with one-body molecular orbitals (MOs) computed by DFT for each interatomic distance R. The JSD is the most common VMC ansatz: only the Jastrow factor is optimized at the VMC level, while the DFT MOs are kept frozen in the Slater determinant (SD).However, in this case, the VMC force F VMC is not consistent with the slope of E JSD , because the variational parameters included in the SD are not at their VMC minima.Instead, F VMC corresponds to the slope of E biased JSD (d), where the DFT MOs obtained at R = R ′ are used artificially for all R, such that dp i /dR = 0, ∀i.In this work, we propose an efficient method to obtain atomic forces and pressures that are unbiased, namely consistent with the slope of E JSD (c).
As long as the Jastrow factor is at its variational minimum, the contribution to the bias comes only from the SD part.This suggests that a straightforward solution for correcting the bias is to compute the variational term , where p SD i are SD variational parameters.In the following, we introduce a method to evaluate these terms by combining DFT with VMC gradients calculations.

III. METHOD TO OBTAIN UNBIASED ATOMIC FORCES
We begin by introducing the AGP representation [38,41] of the SD ansatz made of MOs.The general AGP ansatz for a system of N e electrons is written as where Â is the antisymmetrization operator and g is the so-called geminal function g and, c i,k are the AO coefficients obtained by a DFT calculation.If the Hilbert space is restricted to the occupied states, i.e.M = N e /2 for spin-unpolarized systems [42], the resultant AGP is equivalent to the SD ansatz.In other words, the SD ansatz can be treated as a special case of the more general AGP wavefunction.We assume that Ψ T is real for the sake of conciseness; thus, the variational parameters are also real.However, our method can be readily generalized to complex Ψ T [43].In this work, the geminal function is constructed from the MOs obtained from a DFT calculation, and then converted to the AO representation, namely f (r l , r m ) = L,L i,j λ i,j ψ i (r l )ψ j (r m ), with λ i,j = k c i,k c j,k .Thus, the variational term needed for correcting the self-consistency error reads which is what one should compute to get unbiased atomic forces in the JSD ansatz, where λ i,j are directly obtained by DFT calculations.As discussed later, the geminal representation also allows one to compute unbiased forces and pressures beyond the JSD ansatz by optimizing a part of λ i,j in the JAGP ansatz at the VMC level.The first factor in the terms summed in Eq. 3, i.e. ∂E/∂λ i,j , used for optimizing Ψ T and often dubbed as generalized force, can be efficiently computed by VMC [43].The second factor, the total derivative dλ i,j /dR α , can be numerically evaluated using the finite-difference method (FDM), i.e. dλ i,j /dR α ∼ /2∆R α , or can be obtained by solving the coupled perturbed Hartree-Fock (CPHF) or Kohn-Sham (CPKS) equations [44], or the linear response equations [45].The second factor is N times more time-consuming than the single-point DFT calculation, and this is regardless of the number of variational parameters.Indeed, to compute the correction terms for the geometry R ≡ {R 1 , . . ., R N }, one needs at least 3N -times HF/DFT calculations, where 3 is the number of Cartesian components.In this study, we employed the finitedifference approach because the gauge-invariant property of the AGP, inherited from its close relation with the re-duced one-body density matrix [46], allows one to construct a robust workflow to compute the second factor.Indeed, thanks to the gauge invariant property of the AGP, one does not suffer from (i) the global phase (or sign) indetermination of MOs, nor from (ii) their possible degeneracy.As for (i), a global phase θ rotating the k-th MO (Φ k → e iθ Φ k ), which is reduced to a global sign e iθ = ±1 in case of a real Ψ T , does not affect the total energy, but prevents the calculation of orbital derivatives, dc i,k /dR α , based on finite differences.Indeed, the global phase (or sign) is sometimes inconsistent between DFT outcomes with different atomic displacements.Instead, the sign flip is not problematic in the AGP representation, because the relation λ i,j = k c i,k c j,k implies that λ i,j is invariant under an MO sign change [47].As for (ii), when two (or more) MOs are degenerate, c Rα+∆Rα i,k and c Rα−∆Rα i,k might have very different values due to the presence of the other degenerate MOs.Nevertheless, it is straightforward to show that an MOs degeneracy does not affect the uniqueness of λ i,j , making λ i,j independent of the choice of the particular DFT implementation for degenerate MOs.Thus, by exploiting the AO representation of the AGP wavefunction, one can always devise a well-defined method to compute dλ i,j /dR α , which will be superior to the calculation of dc i,k /dR α .
By combining the first and second factor in Eq. 3, the variational term can be cast in a form suitable for a VMC estimate, as follows [48]: where we made apparent the dependence of the local operators on the total electronic coordinate x, sampled by VMC, to distinguish them from constant values.In Eq. 4, O i,j (x) = ∂ ln Ψ T (x)/∂λ i,j , and Ōi,j ∼ ⟨O i,j (x)⟩.We remark that O i,j (x) can be efficiently computed in a VMC calculation using the adjoint algorithmic differentiation [34], and the divergences of the generalized forces can be cured by reweighting methods [33,49].It is extremely important that the variational term is evaluated in a covariance form of random variables to reduce its fluctuations [39,50].In addition, the expression in Eq. 4 implies that if the variational wavefunction is an exact eigenstate of the Hamiltonian, F c α vanishes regardless of the VMC sample, because the local energy coincides with the corresponding eigenvalue E. Indeed, the zerovariance property holds in this expression, which is another way to recover the Hellmann-Feynman theorem.

IV. APPLICATIONS TO H2 AND CL2 MOLECULES
We determine the interatomic force of the H 2 and Cl 2 molecules, taken as first examples to assess the accuracy of our method.The ccECPs [51][52][53][54] accompanied with the uncontracted cc-pVDZ basis sets were employed for H 2 and Cl 2 molecules.For Cl 2 , the Hecore ccECP was employed.DFT-MOs were prepared by PySCF v2.0.1 [55,56] with the LDA-PZ exchangecorrelation functional [57], and then converted to the TurboRVB wavefunction format [41] using the Turbo-Genius package [58] via TREX-IO files [59].The inhomogeneous one-body, the two-body, and the three-body Jastrow factors [41] were added to the SD with frozen DFT MOs and optimized using the linear method [39,40] implemented in TurboRVB [41].The second factor, dλ i,j /dR α , was numerically evaluated using the displacements ∆R = ±0.001Å along the molecular bond direction.
The simple H 2 molecule highlights the importance of removing the self-consistency error in the forces calculation by adding the variational force term to the regular VMC expression.In H 2 , the JSD ansatz with DFT MOs is, in principle, exact if the Jastrow factor is converged in the basis set [60].Indeed, the wavefunction is nodeless, so the difficulty of finding the optimal variational state can be fully transferred to the Jastrow factor determination.Thus, H 2 allows one to study different situations, from a poor to a refined Jastrow factor.In this study, we examined a small [1s] and a large [4s2p1d] basis set expansion, as a poor and refined Jastrow factor, respectively.For the former, Fig. 2(a) shows that the DFT parameters are not optimal at the VMC level; thus, the self-consistency error is present.The equilibrium distance obtained from the PES (0.7344(2) Å) and the one from regular VMC forces (0.7392(1) Å) are reported in Tab.I. Fig. 2 and Tab.I demonstrate that the self-consistency error is mitigated by the proposed force correction F c , which gives a bond distance of 0.7341(1) Å, compatible with the one derived from the PES.Fig. 2(b) shows that in the case of a refined Jastrow factor the self-consistency error is instead negligible, because the larger Jastrow expansion compensates for the DFT determinant, and all variational parameters are optimal.Thus, the regular VMC force is already consistent with the derivative of the PES, and the corresponding force correction eventually vanishes, as reported in Tab.I.The H 2 example is illustrative of the capability of the variational term in Eq. 3 to correct the force bias due not only to the frozen DFT MOs, but also to an underconverged Jastrow factor.show that the self-consistency error is more significant for Cl 2 (Z eff = 15) than H 2 (Z eff = 1).This is consistent with the seminal work by Tiihonen et al. [37], reporting that the self-consistency error increases with the effective nuclear charge.Fig. 2 and Tab.I illustrate that the proposed force correction works also for heavier molecules.

V. APPLICATION TO CUBIC BORON NITRIDE
Not only atomic forces, but also pressures can be corrected in solids using the same method, just by replacing dλ i,j /dR α with dλ i,j /dV .To demonstrate it, we computed the cBN EOS.The ccECPs [51][52][53][54] with accompanying uncontracted cc-pVDZ basis sets were used for the cBN calculation.The linear dependency of the basis sets is solved at the DFT level by cutting basis set elements with exponents smaller than 0.20 a.u.This is crucial to suppress the statistical errors on atomic forces and pressures for periodic systems in QMC calculations [62].The 2×2×2 conventional supercell (256 valence electrons in the simulation cell) with k = Γ was employed.DFT-MOs were prepared by the built-in DFT module implemented in TurboRVB [41] with the LDA-PZ exchange-correlation functional [57].Then, the inhomogeneous one-body, the two-body, and the three-body Jastrow factors [41] were added to the SD with frozen DFT molecular orbitals.[3s1p] Jastrow basis sets were employed for B and N atoms.The Jastrow factor was optimized using the linear method [39,40] implemented in TurboRVB [41] for each volume.The second factor, dλ i,j /dV , was numerically evaluated using the builtin DFT module with volume variations ∆V = ±0.3%.Fig. 3 shows the cBN EOS, its volume derivative, the regular VMC pressures, and the corrected pressures.The obtained equilibrium lattice parameters and volumes are reported in Tab.II.It is apparent that the selfconsistency error in pressure is ∼ 5 GPa, constant over the whole volume range.Our method gives corrections that bring the estimated pressures very close to the exact values for all volumes, as shown in Fig. 3.This result illustrates the possibility to successfully correct not only atomic forces but also pressures in large systems with the explicit evaluation of the variational pressure term.
B N FIG. 3. cBN EOS (solid green curve), its volume derivative (dashed green curve), the regular VMC pressure (red diamonds), and the corrected pressure (purple squares).The vertical dashed lines represent equilibrium volumes obtained by fitting the EOS and pressures with the Vinet forms [63].

VI. DISCUSSION
We first compare our method with the FDM, which is the traditional way to obtain unbiased atomic forces in the VMC framework.The main drawback of the FDM is that it requires at least 3N independent VMC runs to compute all 3N force components, preventing its use in routine VMC calculations.Instead, our proposed method requires just a single VMC run to compute all 3N regular VMC forces, together with all O i,j terms that appear in the expression for F c α of Eq. 4. This is thanks to the algorithmic differentiation [34].As we mentioned before, the other terms in Eq. 4, namely dλ i,j /dR α , are computed by FDM using DFT as the driver, thus leading to DFT calculations N -times more time-consuming than a single DFT run.However, since the DFT cost is negligible compared to VMC, and it is mainly fast Fourier transform bound, with a favorable O(N 2 log N ) scaling for a single run, the resulting algorithmic cost of our method is superior to the FDM evaluation of atomic VMC forces.
Next, we discuss the scaling of the variance of the variational term F c α with respect to N .The variance of the local energy E L scales with N e [43], while the variance of the logarithmic derivatives O i,j is O(1) [34].Thus, the variance of F c α is bound by O(L 2 N e ), where the factor of L 2 comes from the double summation over the extended basis set elements [64].The geminal representation needs the L 2 summation instead of the LN e summation of the SD representation for the JSD ansatz.However, at variance with the SD representation, the geminal allows one to exploit the locality of the λ i,j matrix.In other words, one can neglect |dλ i,j /dR α | with small absolute values, obtained deterministically by DFT calculations.For instance, the percentage of elements such that |dλ i,j /dR α | / max |dλ i,j /dR α | ≤ 0.01% is 38.5 %, 45.0 %, and 66.0 % for 1 × 1 × 1 (8 atoms), 2 × 2 × 2 (64 atoms), and 3 × 3 × 3 (216 atoms) cBN supercells, respectively, demonstrating that the larger a system becomes, the more terms can be neglected thanks to the locality.In this way, the summation in Eq. 4 can be reduced from L 2 to L terms, by lowering the size-scaling of the F c α variance.Since L and N e are proportional to N , the scaling of the variance of our method with respect to N is bound by O(N 2 ) in the N → ∞ limit, which is just N -times larger than the variance of the regular VMC force calculation, O(N ) [34].However, in our H 2 , Cl 2 , and cBN calculations, we got the same error bars on the regular VMC forces and on the corrected ones with the same statistics.This points to a very small prefactor ε in the O(N 2 ) variance term, such that in the total variance, Var(F α ) = Var(F VMC α )+Var(F c α ) ≈ O(N )+εO(N 2 ), the O(N 2 ) contribution can be neglected for any affordable N in VMC calculations.
Finally, we emphasize the extensibility of the geminal representation employed here, which allows one to readily generalize the method proposed in this work from the JSD to the more general JAGP ansatz.A practically way to go beyond the JSD ansatz for a large system is to optimize only a subset of the variational λ i,j parameters.The partially optimized λ i,j matrix will normally have a larger rank than the one corresponding to the SD wavefunction, therefore including AGP correlations.The subset of λ i,j is chosen again based on the AGP locality.Indeed, only the variational parameters λ i,j corresponding to atoms at a distance smaller than a reasonable cutoff can be optimized, while those with distance larger than the cutoff are kept fixed [43].In this situation, only the fixed λ i,j must enter in Eq. 3, thus correcting the force bias in the JAGP ansatz.In principle, our approach can also be extended to more general antisymmetric wavefunctions, with the only caveat that, in order to compute the dp i /dR α derivatives, one has to consistently use the same auxiliary framework employed to initialize the antisymmetric part of the VMC wavefunction.

VII. CONCLUDING REMARKS
In this work, we analyzed the bias seriously affecting the regular VMC expression F VMC α .We then proposed a method to efficiently and robustly compute the missing contribution, i.e. the variational term F c α , to completely remove that bias for a JSD ansatz with DFT one-body orbitals, the most common wavefunction in ab initio VMC calculations, usually the best compromise between accuracy and computational cost.We demonstrated that the correction works very well for the systems that have been tested here, namely the equilibrium geometry of H 2 and Cl 2 molecules, and the EOS evaluation of the cubic Boron Nitride.Unbiased atomic forces within a JSD ansatz, which is in general much cheaper to optimize than more refined Ψ T s, will be particularly useful to generate VMC datasets for MLPs construction, which would otherwise be affected by the self-consistency error.Thus, our approach has the potential to open up new horizons for VMC applications, also in the context of machine learning.Finally, the same scheme can be extended to more elaborated wavefunctions, once a suitable auxiliary method is used to generate their antisymmetric part.

FIG. 1 .
FIG. 1. Schematic picture of PESs as a function of the dimer bond length R. (a) The exact PES, not accessible in practice.(b) The best possible PES obtained in the VMC framework by minimizing all variational parameters of ΨT.(c) The PES obtained with optimized Jastrow factor and Slater MOs yielded by DFT at each point R, whose slope at R ′ is exactly given by the VMC force F VMC supplemented by the variational term F c , as proposed in this study.(d) The PES obtained with frozen DFT orbitals computed at R ′ , whose slope corresponds to F VMC without the additional term F c .

2 FIG. 2 .
FIG. 2. (a) and (b): H2 PESs (solid green curves), their numerical derivatives (dashed green curves), regular VMC forces (red diamonds), and corrected forces (purple squares) obtained with (a) small [1s] and (b) large [4s2p1d] Jastrow basis sets.The PESs and forces are computed from 0.30 Å to 2.00 Å with 18 equally spaced datapoints plus 5 additional datapoints (0.55 Å, 0.65 Å, 0.75 Å, 0.85 Å, and 0.95 Å).The vertical dashed lines represent equilibrium bond lengths obtained by fitting the PES (forces) with a polynomial of 11th (10th) order.(c):Cl2 PES (solid green curve), its numerical derivative (dashed green curve), regular VMC forces (red diamonds), and corrected forces (purple squares).The PES and forces are computed from 1.50 to 2.80 Å with 14 equally spaced datapoints.The vertical dashed lines represent equilibrium bond lengths obtained by fitting the PES (forces) with a polynomial of 6th (5th) order for energies (forces).In all panels, only the region in the vicinity of the equilibrium geometry is drawn.The plotted forces are Fx acting on the left atom of each dimer, where the x axis is aligned with the direction of the molecular bond.

Fig. 2 (
c) shows the PES of the Cl 2 molecule, as yielded by a [3s1p] Jastrow basis set.Tab.I reports the equilibrium geometries obtained from the PES, regular VMC force, and corrected force.The Figure and Table

TABLE I .
The equilibrium bond distances req ( Å) of the H2 and Cl2 molecules obtained from the PESs, the regular VMC force, and the corrected force.The corresponding PESs are shown in Fig.2.
a These values are taken from Ref.61.

TABLE II .
Equilibrium lattice parameters and volumes per atom obtained by fitting the EOS, and from the regular VMC pressure and the corrected one.Zero point energy and temperature effects are not included.