Density matrix quantum Monte Carlo

We present a quantum Monte Carlo method capable of sampling the full density matrix of a many-particle system at finite temperature. This allows arbitrary reduced density matrix elements and expectation values of complicated non-local observables to be evaluated easily. The method resembles full configuration interaction quantum Monte Carlo but works in the space of many-particle operators instead of the space of many-particle wave functions. One simulation provides the density matrix at all temperatures simultaneously, from $T=\infty$ to $T=0$, allowing the temperature dependence of expectation values to be studied. The direct sampling of the density matrix also allows the calculation of some previously inaccessible entanglement measures. We explain the theory underlying the method, describe the algorithm, and introduce an importance-sampling procedure to improve the stochastic efficiency. To demonstrate the potential of our approach, the energy and staggered magnetization of the isotropic antiferromagnetic Heisenberg model on small lattices, the concurrence of one-dimensional spin rings, and the Renyi $S_2$ entanglement entropy of various sublattices of the $6 \times 6$ Heisenberg model are calculated. The nature of the sign problem in the method is also investigated.


I. INTRODUCTION
Quantum Monte Carlo (QMC) methods are well established as vital tools in the study of complex manybody quantum systems, often providing highly accurate results. However, the use of QMC methods in the calculation of quantum information measures has been somewhat limited, largely due to their inability to sample reduced density matrices involving more than a few particles or lattice sites. Entanglement, well established as an important concept in quantum information theory, has recently become a subject of active research in the condensed matter community, with various entanglement measures proving useful when studying how ground states change near quantum phase transitions [1]. In small systems the Lanczos method can be used to calculate exact entanglement measures, and in one-dimensional systems the density matrix renormalization group method is applicable [2]. However, the study of entanglement in large systems in more than one dimension is less straightforward. Relationships between reduced density matrices and spin correlation functions [3] allow QMC methods to access some entanglement measures in certain situations [4]. Moreover, Sandvik recently introduced a QMC method formulated in the valence-bond basis [5] that allowed certain new entanglement entropy measures to be evaluated [6,7]. Hastings et al. showed that Renyi S 2 entanglement entropy can also be calculated [2]. However, in general, the inability of QMC methods to directly access the density matrix and reduced density matrices of systems has hindered their use in this area.
Projector QMC methods such as diffusion Monte Carlo [8] (DMC) and Green's function Monte Carlo [9,10] (GFMC) grant access to zero-temperature properties by stochastically applying a projection operator to some initial state such that a stochastic sampling of the ground-state wave function is eventually achieved. The fixednode approximation [8] often allows accurate results to be obtained in systems with sign problems, with the accuracy of such results depending on the quality of the nodal surface used. Finite-temperature methods take a different approach. Path-integral Monte Carlo (PIMC) methods express the partition function, Z = Tr(e −βH ), as a sum over contributions from various paths in imaginary time [11]. With an appropriate update procedure, the paths can be sampled with the correct probabilities, thus allowing finite-temperature expectation values to be obtained. The stochastic series expansion (SSE) method [12] has much in common with this approach but begins by performing a Taylor expansion of the density matrix. The resulting terms in the expansion are then sampled in a similar manner. These methods also allow access to ground-state properties. However, the sign problem is, in many cases, insurmountable at low temperatures.
The full configuration interaction quantum Monte Carlo (FCIQMC) method recently introduced by Booth, Thom and Alavi [13] is a projector method for studying zero-temperature properties, and, as such, has much in common with DMC and GFMC. However, unlike DMC and GFMC, where the sampling of the ground-state wave function is performed in real space, FCIQMC samples the wave function in a discrete basis. Crucially, no prior knowledge of the nodal structure of the groundstate wave function is required to reach the exact ground state. Rather, the sign problem manifests itself in the large but system-specific population of quantum Monte Carlo walkers required in order for the ground state of the Hamiltonian to emerge [14] from the background noise. The system sizes accessible to FCIQMC are limited by the amount of memory available to store these walkers. However, the method has proven highly successful in many chemical systems, reducing the memory needed to achieve FCI-quality results by several orders of magnitude [15][16][17][18][19]. This has led to much interest in this di-rection and research into fundamental improvements and new applications of the algorithm continues [16,20].
This article presents a closely-related QMC method, which we call density matrix quantum Monte Carlo (DMQMC). Like the path-integral and SSE methods, DMQMC allows finite-temperature results to be calculated. However, it uses a projection approach to achieve this and thus has more in common with zero-temperature QMC methods. DMQMC was inspired by FCIQMC and shares many of its features.
In DMQMC, rather than sampling the components of the wave function in a discrete basis, the elements of the density matrix are sampled instead. It is then a simple task to calculate expectation values of arbitrary operators, even those that do not commute with the Hamiltonian. Such expectation values are usually difficult to calculate using other QMC methods [21]. Moreover, the ability to directly sample the density matrix means that many quantum information measures are accessible. These advantages cannot be expected to come without drawbacks, and, indeed, the systems to which we have successfully applied DMQMC to date are small by the standards of other finite-temperature methods. However, the potential uses of a direct stochastic sampling of the density matrix are such that DMQMC deserves investigation. This paper demonstrate the use of DMQMC by studying the isotropic antiferromagnetic Heisenberg model, but DMQMC is a general method and is applicable to both bosonic and fermionic systems.
Section II summarizes the FCIQMC method, setting the stage for the description of DMQMC in Section III. In Section IV an importance-sampling procedure is introduced. The DMQMC method is then applied to the isotropic antiferromagnetic Heisenberg model in Section V and used to calculate the energy and staggered magnetization of small square lattices and the concurrence of one-dimensional rings. The sign problem in DMQMC is investigated on the 4×4 triangular lattice in Section VI. We discuss our results and offer some concluding remarks in Section VII. Hartree atomic units are used throughout.

II. FULL CONFIGURATION INTERACTION QUANTUM MONTE CARLO
The DMQMC algorithm was formulated in analogy with the FCIQMC method. The two methods share many of the same features and it is often useful to compare them. We therefore begin by providing a brief summary of the FCIQMC method; for more detailed discussions readers are referred to Refs. (13) and (14).
Consider the imaginary-time Schrödinger equation The general solution to this equation is for some initial wave function |ψ(τ = 0) . If the initial wave function has a non-zero ground-state component, c 0 (0), it is easy to see that |Ψ(τ ) will become proportional to the ground state in the limit of large τ , where |E 0 is the ground state and E 0 the ground-state energy. The factor of e −E0τ can be removed by choosing the zero of energy such that E 0 = 0. In practice, since E 0 is usually unknown, we solve where we have definedT = −(Ĥ − S1). The energy shift S is adjusted slowly during the simulation to keep the normalization approximately constant. The long-time average of S gives another measure of the ground-state energy. The above theory is common to all projector methods; the difference is how they achieve the evolution to the ground state. FCIQMC works in a discrete basis of kets |X i , which are normally Slater determinants for fermions or permanents for bosons. The components of the wave function in this basis are represented stochastically by a collection of markers. Following Anderson [22,23], we refer to these markers as "psips". Each psip has an associated sign, q = ±1, which we refer to as its "charge", and resides on a particular basis state |X i (or on "site i"). The net psip charge on a basis state is interpreted as the amplitude of that state in the expansion of the wave function. At any point in a simulation, the distribution of psip charges does not need to provide an accurate representation of the wave function. Rather, the FCIQMC method only requires that the expectation values of the site charges represent the ground state [17]. Thus, at any point in the simulation, the memory required to sample the wave function may be many orders of magnitude smaller than the memory required to store the whole state.
Booth and co-workers [13] introduced an algorithm to evolve the population of psips according to the imaginary-time Schrödinger equation. This can be summarized as follows. For each time step ∆τ we loop over the entire population of psips and perform the following steps: 1. Spawning: Allow a psip with charge q i on site i to attempt to spawn onto a connected site j, where T ij = 0 and i = j, with probability |T ji |∆τ . If the spawning attempt is successful, a psip is born at site j with charge q j = sign(T ji )q i .

2.
Diagonal death/cloning: Each psip has the chance to either clone or die with probability |T ii |∆τ . The consequence of a successful death/cloning event depends on the sign of the diagonal matrix element: if T ii > 0 the psip is cloned; otherwise the psip is removed from the simulation.
3. Annihilation: Pairs of psips on the same site with opposite charges cancel out ("annihilate") and are removed from the simulation, leaving a population of only a single charge type on each site.
The FCIQMC algorithm samples the solution of a firstorder Euler finite-difference approximation to Eq. (4). Hence, the distribution of psips gives a stochastic representation of the wave function and, as τ → ∞, the psips settle down to sample the ground-state wave function. The psip annihilation step does not alter the total charge on a site. However, it has been shown to be vital in order for the true ground-state wave function to emerge in systems with sign problems [14]. Similar and more complex walker cancellation mechanisms have been attempted in continuum DMC and GFMC calculations with less success [24][25][26][27][28][29]. Walker cancellation in FCIQMC works better because psips are more likely to encounter each other in a discrete and finite Hilbert space.
The ability of FCIQMC to tackle the sign problem through the annihilation step is perhaps its most significant advantage. Annihilation leads to the characteristic population dynamics and allows the true ground state to emerge without any knowledge of the nodal structure of the ground-state wave function [14]. At the start of an FCIQMC simulation, the shift is held constant at a value large enough to ensure that the psip population grows exponentially. Eventually, when a system-specific population of psips is reached, the annihilation rate becomes equal to the spawning rate and the population spontaneously plateaus. During this plateau period the ground state of the correct Hamiltonian emerges, after which the population begins to grow again. The population can then be controlled by varying the shift. The psip population at the plateau must be a small fraction of the number of basis states in the Hilbert space in order for FCIQMC to be more (memory) efficient than an exact diagonalization.
The ground-state energy in FCIQMC can be calculated using a "projected estimator" based on the the expression Here |Ψ T is an appropriate trial function with components ψ T i in the many-particle basis, |X i , and χ i is a component of the exact ground state in this basis. The ground-state energy is obtained by averaging the numerator and denominator separately, with the total psip charge on each site, q tot i , used in the place of corresponding exact amplitude, χ i .
Calculating the ground-state expectation value of an operatorÔ that does not commute with the Hamiltonian is more difficult because a projected estimator cannot be used. Instead, assuming that |E 0 is real, it is necessary to evaluate where O ij = X i |Ô |X j . Although q tot i = χ i , the expectation value of a product is not equal to the product of the expectation values: q tot i q tot j = χ i χ j . This means that χ i χ j cannot be obtained by averaging the products of the instantaneous psip weights. One could in principle average q tot i over many iterations to obtain χ i before multiplying χ i and χ j , but this would involve storing a number for every basis function, which is unfeasible due to memory limitations.
This problem is not easy to overcome and there is currently no way to evaluate general expectation values exactly within the FCIQMC framework. Indeed, the calculation of general expectation values in other Monte Carlo methods is often a difficult task.

III. DENSITY MATRIX QUANTUM MONTE CARLO
We now show how an FCIQMC-like dynamics can be used to sample both finite-temperature and ground-state density matrices. We first consider the thermal density matrix and how it can be evolved as a function of inverse temperature by solving the symmetrized Bloch equation. We then draw upon analogies with FCIQMC to formulate the DMQMC algorithm before discussing the calculation of estimators for a general quantum mechanical observable. This section ends with an explanation of how to sample a reduced density matrix in order to calculate estimators of entanglement measures.

A. Theory
Since the psip population (and hence the normalization) varies during a quantum Monte Carlo simulation, it is convenient to work with the unnormalized thermal density matrixρ whereĤ is the Hamiltonian operator and β = 1/k B T is the inverse temperature. The canonical partition function Z(β) is given by: Differentiatingρ(β) with respect to β shows that it obeys both the Bloch equation, and the symmetrized Bloch equation, The symmetrized version turns out to be more useful for our purposes. Consider the general solution to Eq. (10),ρ and let |E i denote an energy eigenstate with energy E i . Without loss of generality, we assume that E 0 = 0. By expandingρ(β = 0) aŝ and applying Eq. (11) it is seen thatρ(β) becomes proportional to the ground-state density matrix as β → ∞: A similar analysis shows that the ground-state density matrix is not reached if a general initial density matrix is propagated using the unsymmetrized Bloch equation. Hence, in cases where ground-state properties are desired, Eq. (10) is more useful. If we wish to sample thermal properties at any finite temperature T , we must also impose the correct initial condition on Eq. (10). The most convenient is at β = 0, wherê Eqs. (10) and (14) provide everything required to stochastically sample the thermal density matrix. As in FCIQMC we introduce a collection of psips, which in DMQMC occupy not basis states |X i but basis operators |X i X j | of the expansion of the density operatorρ(β) in some convenient basis set. At the start of a simulation the psips are distributed randomly along the diagonal of the infinite-temperature density matrix ρ ij (β = 0) = δ ij . The psips then evolve under a set of rules so that they sample the matrix elements ρ ij (β) in the chosen basis at each iteration. In order to prevent the population of psips from exploding we introduce an energy shift, S, and letĤ →Ĥ − S1 as in FCIQMC. Eq. (10) thus becomes whereT = −(Ĥ − S1) is the "update matrix".

B. Algorithm
We use the following algorithm to evolve a collection of psips according to Eq. (15). For a single step in inverse temperature of ∆β, we loop over the entire population of psips and perform the following steps: 1. Spawning along columns of the density matrix: Allow a psip with charge q ij on site (i, j) to attempt to spawn onto connected sites (k, j), where T ik = 0 and i = k, with probability 1 2 |T ik |∆β. If the spawning attempt is successful, a psip is born at (k, j) with charge q kj = sign(T ik )q ij .
2. Spawning along rows of the density matrix: Similarly, allow a psip with charge q ij on site (i, j) to attempt to spawn onto connected sites (i, k) with probability 1 2 |T jk |∆β. If the spawning attempt is successful, a psip is born at site (i, k) with charge q ik = sign(T jk )q ij .
3. Diagonal death/cloning: If there is a psip on site (i, j) and T ii + T jj < 0, then with probability p d = 1 2 |T ii + T jj |∆β that psip is killed and removed from the simulation; if T ii + T jj > 0, the psip is cloned (i.e., a new psip is created on the same site and with the same charge) with probability p d .

Annihilation:
Pairs of psips inhabiting the same site (i, j) with opposite charges annihilate and are removed from the simulation, leaving a population of only a single charge type on each site.
The first three steps describe a stochastic algorithm to sample the solution of a first-order Euler finite-difference approximation to Eq. (10). The distribution of psip charges at β + ∆β is thus proportional to the density matrix at this inverse temperature, provided that the distribution of charges at β was correct.
The DMQMC method shares many similarities with FCIQMC, namely: • Annihilation does not alter the expected (normalized) psip distribution but serves to overcome the sign problem [13,14].
• The underlying finite-difference approximation is is the largest eigenvalue of the Hamiltonian matrix [14]. This is a sufficient condition to ensure correct projection onto the exact ground state, but the finite value of ∆β leads to an error of O(∆β) in the density matrix at temperatures greater than zero. It is therefore necessary to check that finitetemperature results have converged with respect to ∆β.
• The familiar shift-update algorithm already used in DMC [30] and FCIQMC [13] simulations is employed to modify S and thus to control the population. The shift, S, is adjusted according to where A is the number of β-steps between shift updates, ζ is a shift damping parameter, and N p (β) is the total number of psips at the inversetemperature β. During simulations, ζ is chosen carefully to prevent large fluctuations in S.
• Rather than attempting all possible spawning events from a psip on a given site, it is computationally efficient to attempt just one (or a small number) of the many possible spawning events and reweight the acceptance probabilities accordingly [13].
• The algorithm is highly parallelizable -only the annihilation step requires communication between CPU cores and so using a large psip population is a viable option.
As with all projection methods, the ground state is approached as β → ∞. Once convergence to the ground state has been attained the density matrix no longer depends on β (remember that S is chosen to keep the normalization fixed) and the statistical errors of measured ground-state expectation values can easily be reduced by averaging over imaginary time. A single simulation is therefore sufficient to obtain accurate ground-state expectation values. The estimation of finite-temperature properties is more difficult because the inverse temperature β changes continuously as the simulation progresses; it is not possible to hold the simulation at a specific temperature whilst statistics are accumulated.
With access to a stochastic sampling of the unnormalized density matrix at a given temperature, the expectation value of any quantum mechanical observable,Ô, at that temperature can be calculated using The numerator and denominator must be sampled and averaged separately at each temperature to achieve the desired statistically accuracy. Hence a simulation involves repeatedly evolving the density matrix from β = 0 to some chosen maximum value of β, a process we call a "β-loop". This also gives us the freedom to allow Eq. (14) to be satisfied exactly only on average, which greatly reduces the memory demands. Each β-loop is initialized with a different random number seed and with psips randomly distributed with uniform probability along the diagonal of the density matrix. Statistics are accumulated over a sufficiently large number of β-loops in order to obtain the desired statistical accuracy. As each β-loop is completely independent, performing multiple β-loops is an embarrassingly parallel task and each β-loop gives statistically independent data points. A primary concern is that averages over an impractically large number of β-loops may be required. Indeed, it is not uncommon to have to average over O(10 6 ) iterations in ground-state calculations. In practice, however, reaching a sufficiently large number of β-loops does not appear to be a problem. The crucial difference from the ground-state case is that each β-loop provides statistically independent data whereas ground-state calculations need to overcome inherent correlations between nearby iterations. Furthermore, as will be shown, the quality of estimates obtained using a given number of β-loops is better at high temperature than at low temperature. Another advantage is that a single simulation can provide data across the entire temperature range being studied.
The energy shift, S, is varied in order to control the normalization. This, however, introduces a bias: if the shift is varied by ∆S, then the psip population is effectively altered by a factor e ∆β∆S . As a result, configurations in which the average energy of the psip population happens to be less negative than usual are effectively given too large a weight. This problem is particularly severe in DMQMC because the results at each temperature are obtained by averaging over separate β-loops, and thus, for each quantity contributing to a given estimator, the corresponding shift (and hence population) profiles can be very different.
The population bias can be greatly reduced using a method suggested by Umrigar et al. in the context of DMC [30], in which each sampled quantity proportional to the psip population is multiplied by the factor where B is some chosen number of factors andB = min( β ∆β , B). By multiplying by this factor, we remove the lastB factors of e ∆β∆S introduced by varying the shift. As B → ∞ the population control bias should be completely removed. The population control bias can also be reduced by using a larger population of psips.
Another concern is that the number of elements in the density matrix is the square of the dimension of the Hilbert space of many-particle states. At first glance it seems that, without dramatically increasing the number of psips, the density matrix would be very poorly sampled. However, the dimension of the Hilbert space, D, rises exponentially with the number of particles or sites, N , so that D ∝ e αN and thus D 2 ∝ e α(2N ) . The doubling of the exponent implies that a DMQMC simulation for an N/2-site lattice model requires approximately the same number of psips as an N -site FCIQMC simulation to achieve the same sampling quality. Moreover, DMQMC estimators for operators that do not commute with the Hamiltonian often have significantly smaller variance than the forward-walking or other estimators required to evaluate the ground-state expectation values of such operators in FCIQMC.

C. Entanglement measures
It straightforward to obtain a stochastic representation of any reduced density matrix (RDM) from a stochastic representation of the full density matrix. Reduced density matrices are required to calculate entanglement measures such as the concurrence described in Appendix A.
An RDM can be sampled by "tracing out" unwanted psips. Consider a composite quantum system C, which can be partitioned into two subsystems A and B so that H C = H A ⊗ H B , where H A (H B ) denotes the Hilbert space of system A (B). The RDM ρ A that describes sublattice A is defined by taking the partial trace of the full density matrix, ρ C , over all the sites on sublattice B: Our implementation of DMQMC represents the manyparticle basis functions as bit strings [31], where each bit refers to the state of a single spin. To evaluate ρ A we construct a mask, I B , which only has bits set which correspond to spins in subsystem B. The (i, j) density matrix element of ρ C contributes to ρ A if the result of the logical AND operation of the i string with the I B mask is identical to that of the j string. The corresponding element in the RDM can be found by taking AND with an analogous I A mask.
We calculate the desired entanglement measure from the unnormalized stochastic RDM along with its trace at the desired temperature, as discussed in Appendix A for the case of concurrence. The entanglement measure and trace are averaged over multiple β-loops, or, for groundstate properties, over many iterations starting from a sufficiently low temperature.
Depending on the specific Hamiltonian, psip population and temperature, it may be that only a few psips survive the masking process to contribute to the RDM. In such cases it is necessary to increase the psip population or accumulate the RDM over several β-loops before making a single estimate. The use of translational and other symmetries can improve the quality of the sampling of an RDM, although symmetry was not exploited in this work.
Any reduced density matrix can be obtained using DMQMC, but this does not necessarily allow any entanglement measure to be calculated. For example, calculating the von Neumann entropy involves taking the logarithm of the RDM, which requires the full RDM to be represented and diagonalized. This places a limit on the size of the reduced subsystem but not on the full system. Fortunately, many useful entanglement measures involve subsystems containing only a small number of spins.

IV. IMPORTANCE SAMPLING
This article considers the application of DMQMC to the S = 1/2 antiferromagnetic Heisenberg model, where J > 0 and the i, j implies that the summation is over nearest-neighbor pairs of spins only. Periodic boundary conditions are applied. We work in the standard basis set where the many-spin states are tensor products of the one-spin eigenstates, |↑ and |↓ , of theŜ z operator. The algorithm described in section II allows a stochastic sampling of the exact finite-temperature density matrix, assuming ∆β is sufficiently small. However, for the antiferromagnetic Heisenberg model, the sampling method described is found to be insufficient for all but the smallest systems. The ground-state wave function is highly delocalized over the states in the basis set, and thus the ground-state density matrix, ρ = |E 0 E 0 |, has many off-diagonal elements with magnitudes comparable to the diagonal elements. As a result, when the density matrix is sampled via the DMQMC algorithm, only a small fraction of psips reside on or near diagonal elements. Estimators for the expectation values of most operators of interest only receive contributions from psips on or near the diagonal, so very few psips contribute and these estimators suffer from large statistical errors at low temperatures. Importance sampling can greatly improve the sampling quality.
We start by defining the excitation level between basis states |X i and |X j to be the smallest number of pairs of opposite spins that must be flipped in order to reach |X i from |X j . For the Heisenberg model, Eq. (20), the excitation level can change by at most ±1 in a single application of the Hamiltonian.
One straightforward way to improve the quality of sampling is to reduce the probability of psips spawning far from the diagonal of the density matrix. Psips that do reside on higher excitation levels are given a correspondingly larger weight, so that expectation values of operators are unchanged. However, the increase in the population of low-weight psips near the diagonal of the density matrix reduces the stochastic error in neardiagonal expectation values.
With this motivation in mind, we define the following importance-sampling procedure; a more rigorous formulation is given in Appendix B. Every time a psip on excitation level γ attempts to spawn a new psip on excitation level δ, the probability of successful spawning is altered by a factor P γδ . Thus, if a psip on a diagonal element attempts to spawn a new psip onto the first excitation level, the probability of successful spawning is altered by a factor P 01 , where P 01 < 1. The first excitation level will thus be occupied by P 01 times as many psips as it would have been with unaltered spawning probabilities. The reduced (relative) population must be accounted for by giving all psips in the first excitation level a weight of W 1 = 1/P 01 when evaluating estimators. The number of spawning events from the first to the second excitation level is also altered by a factor P 01 (due to the reduced number of psips at level 1) and the chance of successful spawning for each attempt is multiplied by the factor P 12 . Hence, the weight given to the second excitation level must be W 2 = 1/P 01 P 12 . Furthermore, since there are P 01 times as many psips on the first excitation level, the probability of spawning from the first excitation level to the diagonal elements must be enhanced by a factor 1/P 01 in order to achieve consistent spawning dynamics.
In general P γδ = 1/P δγ and the weight given to a psip  [32]. A single β-loop was carried out using an initial population of 10 5 psips on the diagonal elements. The results shown in (a) were obtained without importance sampling. The results shown in (b) were obtained using importance sampling to ensure that every excitation level had a roughly equal number of psips at zero temperature.
on excitation level γ is The estimator for an expectation value, Ô , is thus where E(i, j) is the excitation level between |X i and |X j , andρ ij is the importance-sampled density matrix. The expected value of the importance-sampled psip charge on site (i, j) is proportional toρ ij . There is clearly some freedom in the numerical values chosen for the factors P γδ . In this study we adjust P γδ such that all excitation levels have similar psip populations in the ground state. Whilst this choice is not necessarily optimal, it successfully increases the quality of ground-state estimates by many orders of magnitude whilst still allowing the entire density matrix to be sampled. Figures 1(a) and 1(b) show the fractions of psips on the first four excitation levels for the 4 × 4 square Heisenberg lattice as obtained without and with importance sampling. Both simulations used an initial population of 10 5 psips and a single β-loop. It is clear that importance sampling greatly assists in keeping a non-negligible population on the lower excitation levels.
In practice, the values of P γδ are very small for small γ and δ and decrease in magnitude as the lattice size increases. In order to avoid an unnecessarily large suppression of psip spawning from the diagonal at high temperatures, we introduce the weights W γ and probabilities P γδ gradually from β = 0 until they reach their desired final values at β target . For β < β target , the weight at excitation level γ is set equal to W β/βtarget γ ; for β > β target , the weight is held constant. The value of β target is chosen to be the imaginary time at which the ground state is deemed to have been reached. However, the value of this parameter is not critical for the quality of the sampling.

V. RESULTS
For this first study of DMQMC we focus primarily on reproducing some exact and well-established results. It is generally the case that QMC simulations of the antiferromagnetic Heisenberg model on square lattices of even dimensions do not suffer from the sign problem. In the FCIQMC and DMQMC methods this is because all psips spawned on any specific site have the same sign and no annihilation occurs [14]. This system is therefore particularly convenient to study.
In order to calculate real finite-temperature properties, it is necessary to include contributions from all M S (total spin) subspaces. The Heisenberg Hamiltonian conserves M S , so different subspaces can be studied using separate simulations, which is an embarrassingly parallel computational task. Combining results for different values of M S is straightforward but requires additional calculations and reveals nothing of interest about the performance of DMQMC. We therefore present results for the M S = 0 subspace only.
We first consider the example of the 4×4 lattice, which is small enough that an exact diagonalization can be performed, thereby allowing a direct check of our DMQMC results. Figure 2(a) shows the energy as a function of temperature using an initial population of 100 psips at the start of each β-loop and accumulating statistics over 1000 such loops. The shift was allowed to vary throughout the simulation and there were typically 700-800 psips in the simulation by the end of each β-loop. No importance sampling was applied, and so statistical flucuations increase with inverse temperature as explained in Sec. IV. The agreement with the exact results is very good. Increasing the initial population to 10 5 psips (Fig. 2(b)) reduces the statistical errors such that the agreement with the exact results is essentially perfect.
The square of the staggered magnetization is represented by the operator where x i and x j denote the coordinates of the square lattice. This operator does not commute with the Hamiltonian. Its expectation value, M 2 , is plotted as a function of β in Fig. 3 for an 8×8 lattice; a ground-state  The importance-sampling procedure described in the main text was applied. The DMQMC value of M 2 is plotted every 50th iteration. Error bars, where not visible, are smaller than the size of the marker. The ground-state value of M 2 obtained from an SSE simulation [33] is plotted for comparison.
value obtained using the SSE method [33] is also shown. The dimension of the M S = 0 Hilbert space of an 8×8 Heisenberg Hamiltonian is approximately 1.83 × 10 18 , so the density matrix sampled in this simulation has approximately 3.36 × 10 36 elements. We also considered the application of DMQMC to ground-state calculations, where statistics are accumulated over a single β-loop after the ground state has been reached. A relatively modest simulation of a 6×6 lattice using 6.5 × 10 6 psips gave a ground-state energy of −0.67888 (24)JN and a value of M 2 equal to 0.20985 (7), where errors were obtained via a blocking analysis [34]. These values agree with exact results [35] of −0.678872JN and 0.20983 for the ground-state energy and staggered magnetization, respectively, to within statistical errors [36]. Remarkably, despite the fact that [M 2 ,Ĥ] = 0, the statistical error in the value of M 2 was smaller than that of the energy. It was not necessary to reweight these results using Eq. (18) as the psip population was large enough to render such biases negligible.
The ground-state concurrence, C gs , for neighboring spins (qubits) on an antiferromagnetic Heisenberg ring was studied by Wootters and O'Connor in 2001 [37]. They showed that, for an even number of spins, C gs has a simple relationship with the ground-state energy, The exact results calculated from this formula provide a useful test of our DMQMC estimates of C gs , which are obtained from the sampled reduced density matrix (see Appendix A).
The DMQMC estimates of C gs are presented in Table I, along with Wootters and O'Connor's exact analytic values for up to N = 10 sites. A ring with N = 36 sites was also studied, using FCIQMC to calculate E 0 and then Eq. (24) to obtain a value of C gs for comparison with the DMQMC results. For lengths up to N = 10, the calculation of the concurrence can easily be carried out using other methods and provides a straightforward test of the DMQMC algorithm. The N = 36 chain is far from trivial, and it is promising that such accurate results can be obtained.
We note that the DMQMC estimates of C gs were obtained by sampling the reduced density matrix for a single pair of spins. Due to translational invariance, these estimates could have been improved by sampling the reduced density matrix for every neighboring pair and combining the results.

VI. THE SIGN PROBLEM IN DMQMC
As discussed in Sec. II, the annihilation step in FCIQMC leads to the characteristic population dynamics, whereby the sign problem can only be overcome once a critical psip population (the 'plateau') has been exceeded. Since the annihilation steps in FCIQMC and DMQMC are identical, we would expect DMQMC to possess similar population dynamics. The annihilation rate for a given psip population in DMQMC will be significantly smaller than in FCIQMC because the number of density matrix elements is the square of the number of basis functions. As such, it is to be expected that the plateau height for a given Hamiltonian will be higher in DMQMC than in FCIQMC. TABLE I. DMQMC estimates of the ground-state concurrence, Cgs, for antiferromagnetic spin rings containing N sites in the absence of an external magnetic field. Each ring correponds to a Hilbert space of D basis functions. The density matrix was sampled using Np psips, and statistics accumulated over N l β-loops. Exact results for the energy and concurrence for rings up to N = 10 are taken from Ref. (37). An FCIQMC calculation was used to find the energy of the N = 36 chain, from which an "exact" value of Cgs was determined using Eq. (24). The full density matrix of the N = 36 chain has approximately D 2 = 8.24 × 10 19 elements.
To investigate this, we considered the antiferromagnetic Heisenberg model on a 4×4 triangular lattice, for which an exact diagonalization is easily carried out. The triangular lattice is the archetypal example of a frustrated lattice and has a severe sign problem. We carried out a single β-loop and allowed the population to grow with a fixed shift whilst simultaneously investigating the accuracy of the DMQMC energy estimate. Figure 4(a) demonstrates that the population plateaus as expected. Figure 4(b) shows the accuracy of the energy estimate throughout the plateau period.
At high temperatures accurate results are obtained. This is also the case in other finite-temperature methods, where it is generally found that the sign problem is less severe for small β; indeed, there is no sign problem at all at infinite temperature. However, at lower temperatures the energy estimates suffer large fluctuations, and it is not until the population exceeds the plateau that a good agreement with exact results is once again obtained. The plateau height occurs at ∼ 1.35 × 10 8 psips. For comparison, the plateau population in an FCIQMC calculation of the same system is at ∼ 1.25 × 10 4 psips. We find (in this case) that the DMQMC plateau height is approximately the square of the plateau height in FCIQMC, matching the increase in the size of the space being sampled. It is possible to obtain accurate results for the entire temperature range simply by starting the simulation with an initial population greater than that of the plateau. Moreover, even if the plateau cannot be reached due to memory restrictions, one can nevertheless systematically reach lower temperatures by increasing the population of psips.
The sign problem in the antiferromagnetic Heisenberg model on a triangular lattice is severe in both FCIQMC and DMQMC [14]. However, the efficiency of the annihilation procedure in FCIQMC varies substantially with the system studied and with the basis set used [14]. Fur- thermore, the initiator approximation (i-FCIQMC) proposed by Cleland et al. [16], has been shown to reduce the memory requirements for FCIQMC calculations by several orders of magnitude in many cases. In i-FCIQMC, spawning events onto previously unoccupied sites are forbidden unless the psip population of the parent site exceeds a threshold. This increases the annihilation rate relative to FCIQMC and ameliorates the sign problem at the expense of introducing a systematically improvable approximation.
We have yet to investigate the DMQMC equivalent of the initiator method. It is expected that ground-state properties will be available, but it is not yet clear to what extent the modified spawning will affect finite-temperature results.

VII. DISCUSSION
This article has described DMQMC, a quantum Monte Carlo method that allows direct sampling of the finitetemperature and ground-state density matrices in a discrete basis. The validity of the method has been verified by reproducing exact and well-established results for some small systems, including the calculation of the concurrence of one-dimensional spin rings by directly sampling reduced density matrices. In all cases investigated, DMQMC has proved capable and accurate.
The introduction of an importance-sampling procedure allows larger lattices to be investigated -the largest system that we have simulated successfully to date is a 10×10 antiferromagnetic Heisenberg model. Larger systems could easily be tackled at high temperatures, but it is unlikely that our very simple approach to importance sampling will allow simulations of lattices of more than 10×10 sites over the entire range of temperatures from infinity to zero. If a better importance-sampling procedure can be devised, as we believe likely, many more avenues will be opened.
Like FCIQMC, DMQMC uses an annihilation procedure that allows it to overcome the sign problem if a system-specific population of psips is reached. In FCIQMC the sign problem can often be ameliorated, for example by changing to a more appropriate basis set [14,38] or by applying the initiator approximation [16]. Given the similarities between DMQMC and FCIQMC, it is likely that these ideas and future developments in FCIQMC will also apply to DMQMC. Due to the relative youth of FCIQMC, the rate of theoretical and algorithmic improvements is rapid [16,20].
The lattices studied with DMQMC so far are small by the standards of path-integral quantum Monte Carlo methods. However, the unique and defining feature of DMQMC is that it samples the full density matrix. Given the prominent role of the density matrix in the study of quantum information measures and the wide applicability of DMQMC, we hope that DMQMC will prove useful enough to justify further development.

ACKNOWLEDGMENTS
We are grateful for enlightening discussions with Prof. Terry Rudolph and Tom Poole. This work made use of the Imperial College London High Performance Computing facilities. N.S.B. was supported during early stages of this work through an Undergraduate Research Opportunities Scholarship in the Centre for Doctoral Training on Theory and Simulation of Materials at Imperial College funded by EPSRC under grant number EP/G036888/1. over indices. Multiplying Eq. (B1) by ρ T ij yields the following evolution equation forρ ij : ) .
(B6) The above differential equation is entirely analogous to Eq. (15). As such, the finite-difference version of Eq. (B6) can be simulated in an almost identical manner to the standard DMQMC algorithm: the extra factors of ρ T simply act to alter the spawning probabilities. Consider the case where the excitation level of (i, j) is γ and excitation level of (k, j) is γ − 1. The probability that a spawning attempt from (k, j) to (i, j) is successful is altered by the factor where Eq. (21) has been used to simplify the expression. Similarly, when spawning in the opposite direction, the probability is altered by the reciprocal of this factor. Spawning events that do not alter the excitation level are unaffected.