Diagrammatic Monte Carlo Method for Impurity Models with General Interactions and Hybridizations

We present a diagrammatic Monte Carlo method for quantum impurity problems with general interactions and general hybridization functions. Our method uses a recursive determinant scheme to sample diagrams for the scattering amplitude. Unlike in other methods for general impurity problems, an approximation of the continuous hybridization function by a finite number of bath states is not needed, and accessing low temperature does not incur an exponential cost. We test the method for the example of molecular systems, where we systematically vary temperature, interatomic distance, and basis set size. We find that the method is ideal for quantum impurity problems with a large number of orbitals but only moderate correlations.


I. INTRODUCTION
Quantum impurity models, originally introduced to describe magnetic impurities such as iron or copper atoms with partially filled d-shells in a non-magnetic host material [1], have since found applications in nanoscience as representations of quantum dots and molecular conductors [2], and in surface science to understand the adsorption of atoms on surfaces [3,4].In addition, they form the central part of embedding theories such as the dynamical mean field theory (DMFT) [5,6] and its variants [7][8][9][10][11][12][13][14][15][16][17][18][19], as well as the self-energy embedding theory (SEET) [20][21][22], where they describe the behavior of a few 'strongly correlated' orbitals embedded into a weakly correlated or non-interacting background of other orbitals.These methods promise a systematic route for the simulation of strongly correlated quantum many-body problems [23].
While the original formulation of a quantum impurity model [1] only describes a single correlated orbital coupled to a non-interacting environment, in general the impurities occurring in the context of surface science and embedding theories contain many orbitals with general four-fermion interactions and few symmetries.The timedependent hybridization function describing the hopping between the impurity and its environment is typically such that it cannot be diagonalized for all frequencies at once.Solving quantum impurity problems, i.e. obtaining the impurity Green's function given an impurity Hamiltonian and a hybridization function, requires the use of numerical methods.
A wide range of such methods exist.Hamiltonianbased methods, such as exact diagonalization [24][25][26][27][28] and its variants [29], configuration-interactions [30], or coupled cluster theory [31,32], solve the impurity problem by mapping the impurity problem onto a system with a local Hamiltonian and a finite number of auxiliary 'bath' states chosen to fit the time-dependent hybridization function.The methods are limited to a relatively small set of strongly interacting sites or break down at moderate correlation strength.The bath fitting, which typically approximates a continuous bath dispersion by a non-linear fit to a small number of delta-function peaks, introduces additional approximations.Numerical renormalization techniques [33,34] overcome this issue by providing an almost continuous bath density of states but are in turn limited to a few orbitals in highly symmetrical situations.
A complementary approach is given by Monte Carlo techniques such as the continuous-time quantum Monte Carlo methods [35].These methods are based on a stochastic sampling of the terms in a diagrammatic expansion of the partition function.For particle-hole symmetric systems with on-site density-density interactions, interaction expansion methods [36][37][38] can solve systems with hundreds of strongly correlated orbitals [39].Away from particle hole symmetry and at low temperature, they are typically limited to around eight orbitals, and their naive adaptation to general four-fermion operator terms suffers from a severe sign problem.In contrast, a partition function expansion in the hybridization [40][41][42] is able to work with general local Hamiltonians of up to five orbitals, but is similarly restricted to diagonal hybridization functions.A reformulation [43] in terms of 'inchworm' diagrams [44] overcomes the restriction of diagonal hybridizations, but so far remains limited to impurities with up to three orbitals.
There is therefore a need for impurity solver methods that can treat the problems of embedding theory and surface science, where several orbitals with general interactions and hybridizations occur.Diagrammatic Monte Carlo (DiagMC) methods [45][46][47][48][49][50], which expand physical observables rather than partition functions, along with efficient ways of evaluating the resulting diagrammatic series [51][52][53][54][55][56], are promising.While these methods suffer from other limitations, including divergences of the series in the strong correlation regime, they do not require to approximate the hybridization function by a fit, and are not based on a diagonalization of the local Hamiltonian.
In this paper, we show a formulation of the DiagMC method for impurity problems with general interactions and hybridizations.We test the method on the example of molecular systems, for which a broad range of very mature Hamiltonian methods exist.From the point of view of the algorithmic formulation, the molecular systems exhibit the full complexity of general impurity problems.The only difference between molecules and quantum impurities is that the latter are formulated with a time-dependent hybridization, rather than an instantaneous hopping.This hybridization function modifies the bare propagator but otherwise leaves the system and our algorithmic approach invariant.Applications to molecular systems therefore form an ideal testbed for impurity solver methods of this type.
We carefully analyze the convergence behavior of the diagrammatic expansion and the computational cost of the method as a function of varying temperature, basis sets, intermolecular distance, and system size.We emphasize that we do not intend to present our method as a viable method for quantum chemistry systems without retardation effects.Rather, we exploit the rigorous and controlled framework of molecular simulations to generate a series of test cases that illustrate various parameter regimes in quantum impurities.This paper will proceed as follows.In Sec.II we introduce the computational problem, the diagrammatic formulation, and the algorithmic description.In Sec.III we present applications to molecular systems.Finally, Sec.IV presents conclusions.Appendices A through E present technical details useful for implementing our algorithm and reproducing our results.

A. Partition function expansion
We describe molecular electrons using the following Hamiltonian: where a, b, c, d denote spin-orbitals, 1, . . ., N .We employ second quantization: c a and c † a annihilates and creates, respectively, an electron in the spin-orbital a.The noninteracting term H 0 is parametrized by the one-electron integrals h ab = [a|h|b], whereas the interacting term H V is parametrized by the antisymmetrized two-electron integrals U abcd = [ab|cd] − [ad|cb].We note that explicit antisymmetrization, U abcd = −U adcb = U cdab , avoids ambiguities in the diagrammatic expansions below [57].We orthonormalize the basis, {c † a , c b } = δ ab , as we empirically found this to improve the error bars in the subsequent Monte Carlo procedure.For completeness, we compiled the explicit expressions for h and U in Appendix A.
As we are going to perform series expansions later, it is convenient to introduce an expansion parameter ξ into the Hamiltonian: The non-interacting case is given by H ξ=0 , whereas H ξ=1 recovers the full Hamiltonian (1).We are primarily interested in calculating finite temperature observables such as energies, densities, as well as the spectral function and other electronic correlation functions.We start with the grand-canonical partition function: where β = 1/T is the inverse temperature, µ denotes the chemical potential, and N = a c † a c a is the density operator.Expanding Eq. ( 3) about ξ = 0 within the interaction picture [58] yields the Dyson series: where is the non-interacting partition function, τ denotes imaginary (Euclidean) time, • 0 denotes the non-interacting expectation value: and T indicates path ordering in imaginary time.We note that for molecules, both H 0 and H V are bounded and thus away from zero temperature, the series expansion for the partition function ( 4) is absolutely convergent for all ξ .Inserting Eq. (1) into Eq.( 4) yields: In order to simplify our notation we combine four spinorbitals a, b, c, d and an imaginary time τ into a single We also introduce the following shorthands: Eq. (7b) emphasizes the fact that expectation value in Eq. ( 6) corresponds to the sum over all disconnected and connected Feynman diagrams with vertices V = (v 1 , . . ., v k ), while Eq.(7a) just corresponds to the sum over all internal degrees of freedom of the diagrams.With these substitutions, Eq. ( 6) simplifies to: To evaluate Eq. ( 8), we first introduce the noninteracting Green's function: we can use Wick's theorem to write Eq. (7b) as: where G is a 2k × 2k matrix in which the rows (columns) correspond to the 2k annihilation (creation) operators.
Introducing the column and row indices α, β, . . .such that we define the matrix elements The full matrix can be written in a block-diagonal form: where each 2 × 2 block is given by Eqs. ( 8) and (10) serve as the basis of continuous-time quantum Monte Carlo (CT-QMC), also known in the context of lattice models as diagrammatic determinantal quantum Monte Carlo (DDQMC): one generates random configurations (v 1 . . .v k ) and evaluates the corresponding weight by computing the determinant [35,36].

B. Free energy expansion
While the partition function expansion can be efficiently computed as determinants (with scaling O(k 3 )) and the series is guaranteed to converge, it is also plagued by the negative sign problem, which is expected to worsen exponentially as the system size is increased or the temperature reduced.The sign problem is typically manageable in Hubbard model calculations, where it only stems from negative determinant contributions.In contrast, the sign problem is particularly severe in molecules, where both Coulomb interaction terms and determinants generate negative coefficients.
In order to overcome these difficulties, we move to the grand potential Ω, defined as Ω ξ serves as a cumulant-generating function for correlations functions [59] and its power series in ξ is given by: where Ω 0 is defined as Z 0 = exp(−βΩ 0 ).The symbol D c indicates that unlike in Eq. ( 8), the sum is to be performed over connected Feynman diagrams only.Using an inclusion-exclusion formula, similar to the algorithm introduced in Ref. 51, D c can be defined recursively: A proof is given in Appendix D. Eqs. ( 17) and ( 10) allow the computation of connected diagrams as a hierarchy of determinants at a cost of O(2 k k 3 ).We note that even in simple cases, the convergence radius R of the series ( 16) is finite, with the value of R depending on h, U , and β.Whenever R < 1, an orderby-order summation of the series will thus not work.We will discuss strategies to extend the convergence radius in Sec.II E.
For convergent series (R > 1), one can employ the Di-agMC algorithm to sample the series (16) by generating random vertices and computing the weight using the recursion (17).One observes that the statistical variance diverges exponentially with diagrammatic order k, which requires truncation of the series to a finite order k max .

C. Scattering amplitude expansion
Other than free energy, we are primarily interested in thermal correlation function of some operators (X 1 , . . ., X m ): (18) in particular the single-particle Green's function: FIG. 1. Schematic example of diagrams up to order 2. Diagrams shown here should be understood as 'labeled' diagrams as described in Ref. [59].Duplicate diagrams with the same topology are not shown.In the expansion of Z ξ , the red diagram is an example of disconnected diagram, which is absent in the expansion of Ω due to linked cluster theorem.
One can write down a diagrammatic expansion for the Green's function similar to Eq. ( 16) and a corresponding recursion relation [51].We instead choose to perform the expansion for a vertex-like object.
In the case of the expansion of the free energy, the corresponding one-particle vertex is the scattering amplitude M , defined as: (20) where multiplication is to be understood as matrixmatrix multiplication in spin-orbitals.Sampling a oneparticle vertex is advantageous because it is independent of the choice of 'external legs' and thus allows measurements of both imaginary time-dependent quantities (G and Σ) and fixed-time quantities (density, kinetic energy, etc.) in the same simulation.
M arises naturally as a functional derivative of the grand potential: We prove this relation in Appendix B. Roughly, Eq. ( 21) expresses the fact that by removing one line from a (closed) free-energy diagram, we get an interaction correction to the Green's function, which is exactly what the scattering amplitude encodes.Combining Eq. ( 21) with Eq. ( 16) yields a series expansion for M : We thus need to evaluate the functional derivative of the recursion relation (17).
We start with the derivative of the sum of all diagrams D(V), where we rely on the following identity: where A is an n × n matrix, adj(A) denotes the n × n adjugate matrix of A, and A ᾱ β is the (n − 1) × (n − 1) submatrix of A with the α-th row and β-th column removed.The adjugate matrix adj A can be computed in O(n 3 ) time.In numerical computations, however, care must be taken because A may be singular while adj A is still meaningful [60].We elaborate on the numerical calculation in Appendix E. Combining Eq. ( 10) with Eq. ( 23), we have for 0 < τ ≤ β, where a α , b β , and τ α(β) takes the same meaning as in Eq. ( 12), and we have defined the 2k × 2k matrix which includes all connected and disconnected amputated diagrams in which internal legs corresponding to c † aα (τ α ) and c b β (τ β ) are removed.For the functional derivative of a connected free-energy diagram (26), the sum over all diagrams in Eq. ( 24) with amputated legs A(V) needs to be replaced with the sum over connected diagrams with amputated legs A c (V): for 0 < τ ≤ β.The expansion of M (22) can now be expressed in terms of A c as The sum over connected amputated diagrams A c (V) can be built up from an inclusion-exclusion technique similar to Eq. ( 17).Defining v α and v β as vertices where the α-th and β-th operators are located, respectively, diagrams in [A(V)] αβ can always be partitioned to a connected part which contains v α and v β , and the disconnected vacuum diagrams, i.e.
where α , β are row and column indices within S that correspond to the row and column indices α, β in V. Extracting the term with S = V, we have the recursion (29) This partitioning process is illustrated in Fig. 2. Since M captures the interaction correction to the Green's function which starts at the first order in interaction, the zeroth order contribution A c (∅) = 0.For each fixed V, we apply Eq. ( 29) to recursively to compute A c (V), which in turn yields M following Eq.( 26).Algorithmically, Eq. ( 29) can be evaluated by following Algorithm 1.As with Eq. ( 17), Algorithm 1 runs in O(2 k k 3 ) time.

Ac(S) ← Recursion(S, G [S,S]
). return Ac(V).The electron self-energy Σ relates the Green's function G to the non-interacting propagator g via the Dyson's equation The expansion of the self-energy Σ can be interpreted as 'one-particle irreducible' (1-PI) amputated diagrams, which stay connected even when any single propagator line is removed (cf.Fig. 1).The self-energy is thus not directly sampled, M and Σ are related to each other by Replacing G with Eq. ( 20), we have where X(iν) denotes the Fourier transform of X(τ ) (X = Σ, M, . ..) and iν is a fermionic Matsubara frequency.
The one-and two-body contribution to the electronic energy follow from Eqs. (30) and ( 20): Here ρ ij ≡ c † i c j is the electron density matrix.Note that H 0 does not include the Hartree and Fock terms of the interaction.See also Appendix C.

E. Hartree-Fock shifted Hamiltonian
In systems with significant electron-electron correlations where E V has significant contribution to the full energy E, the perturbation expansions in Eqs. ( 16) and (22) may not converge at ξ = 1.
In order to achieve better convergence by starting from a 'better' non-interacting solution such that H 0 is closer to H, we change the partition of the Hamiltonian H = H 0 +H V by adding physically-motivated counterterms to H 0 and subtracting the same terms from H V .Such an approach is referred to the 'α-shift' [36] or as the 'shiftedaction' [61] in the action formalism.
We start by adding the simplest counterterm in the quadratic form to H 0 and subtract it from H V , such that The total Hamiltonian H = H 0,α + H V,α is unchanged, whereas the perturbation expansion of H ξ = H 0,α + ξH V,α can be controlled by choosing different α.The counterterm need not be quadratic in general.Though quadratic choices are convenient in the determinantal setup, recursion schemes have been developed for general counterterms [54].
The shifted non-interacting propagator can be seen as a Green's function with an a priori selfenergy α.
In the molecular context, a significant contribution to electron correlations can be obtained by the Hartree-Fock approximation.We therefore choose α to be the Hartree-Fock self-energy, i.e.Σ HF .Σ HF is given by the self-consistent equations at finite temperature Here f (A) = [1 + exp(βA)] −1 is the matrix-valued Fermi distribution function, and µ is the chemical potential which may be adjusted so that the total number of electrons in the system is adjusted to charge neutrality.Diagrammatically, the Hartree-Fock shift renormalizes the propagators lines to g α , and an additional effective two-point vertex α has to be included in diagrams.The effective vertex α cancels any diagram which has at least one vertex connecting to itself with exactly one propagator line.This removes all 'tadpole' diagrams in expansions of G and M , as well as that of Ω except for the first order diagram whose vertex connects to itself with two propagator lines.Fig. 3 illustrates the cancellation of such diagrams.
Given a specific set of vertices V, the removal of all tadpole diagrams is achieved by replacing the G matrix (13) defined on internal vertices V with: i.e. by setting all 2 × 2 diagonal blocks (corresponding to self-connections of vertices) to zero, and replacing bare propagators with g α .Using the modified definition of G in Eqs. ( 10) and ( 25), one can carry out the same recursive calculations in Eq. ( 26) to obtain corresponding connected quantities.
Note that this introduces a bias in the free-energy evaluation by setting the first order contribution (the 'dumbbell' diagram) to zero, which needs to be corrected: In the remainder of this paper, we will always use a Hartree-Fock counterterm and omit the α subscripts.

F. Monte Carlo integration of diagrammatic series
Evaluations of diagrammatic series, such as Eqs.( 16) and (27), can be formally summarized as where X is the physical variable (G, M , ...), V = (v 1 , . . ., v k ) denotes space time indices of internal vertices, and C the contribution of each fixed configuration of V to X.Here we take the 'physical' value of the coupling constant ξ = 1.To perform a Monte Carlo integral, we introduce a cutoff k max of the expansion order, and an a priori probability distribution of vertex space-time indices p(V) such that In addition, we require that p(V) > 0 whenever C(V) = 0.The order-k max approximation to X can be estimated stochastically as with a large number N of Monte Carlo samples {V i } generated following distribution p(V).
Since the Green's function G, self-energy Σ, as well as the total electronic energy E = E 0 + E V can all be derived from the scattering matrix M using Eqs.( 20), (32), (33), it is sufficient to only keep track of the amputated diagrams A c (V) and obtain all other observables as derived quantities.Table I summarizes some of these measurements.In our implementation, we only measure the energy and M (iν n ) with fermionic Matsubara frequencies iν n on the fly, and construct Ĝ(iν n ) and Σ(iν for each frequency, where symbols with 'hats' represent quantities in frequency representation as matrices in spinorbital indices.Resampling techniques such as the jackknife or the bootstrap are applied to avoid biased error estimations. For efficient Monte Carlo simulations, it is important to choose the a priori distribution p(V) to achieve importance sampling, such that the simulation samples more frequently when |C(V)| is large and less frequently otherwise.Since we measure multiple observables in one simulation, we need to define such a distribution that works for all measurements.We find in practice that the following choices provides efficient samplings for most measurements: where • denotes the Frobenius norm of a matrix, and (V) is the energy measurement defined in Table where denotes a convolution in τ .p E performs well for the energy measurements, whereas p A is more robust when measurement of M is needed.
At high expansion order k max , the normalization factors W A and W E are difficult to calculate analytically.Instead, we measure an auxillary quantity whose exact value can be calculated analytically, and normalize all other measurements against it.For example, we can normalized against the second-order contribution to the total energy Here we have chosen p E as the a priori distribution.Any other measurements can now be estimated as Similar relations apply when we use other choices of a priori distributions or normalization measurements.Once p(V) is defined, we generate Monte Carlo samples as a Markov chain via the Metropolis-Hastings algorithm.From each configuration V i , a new configuration V j is proposed following some proposal probability distribution w prop (V j |V i ).To ensure detailed balance, an acceptance ratio R is calculated after each proposal as The proposal V i → V j is accepted with probability This ensures the detailed balance of the Markov process, i.e. where which guarantees samples obtain the equilibrium distribution p(V) after thermalization.
For molecular systems, we design the following set of updates which lead to an ergotic random walk in the configuration space for all systems we investigate in Sec.III.The new indices a , b , c , d , and τ can be proposed by some a priori probability p ins .The proposal probability distribution for this update from order k to k + 1 is  Vertex shift in time or orbitals are self-balancing moves, hence the acceptance ratios shares the same form Vertex splitting and merging are mutually inverse updates.The acceptance ratios are therefore There is considerable freedom in choosing p ins .For all systems we study in this work, we choose p ins such that where p orb (a , b , c , d ) is uniformly distributed if the inserted indices can form non-zero propagator connections and zero otherwise, and where τV = 1 k k i=1 τ k is the average time coordinate of the existing vertices, and we choose ϕ(τ ) as a function in [0, β] which has more weight near τ = 0 and β but still non-negligible weight in between.Since the Hartree-Fock propagators decay exponentially away from 0 and β, this makes sure that the new vertex are more likely to stay close to existing vertices so that the resulting configuration has sizable contribution.In our implementation, we define as a Lorentzian distribution where λ is an estimation of the overall energy scale of the system proportional to e.g. the standard deviation of the Hartree-Fock energy levels.

A. Series convergence
We first present a test of our method on a minimal molecular system: H 2 in the STO-6g basis set [62].Two hydrogen atoms are placed at distance r and finite temperature T = 1/β.The basis set only contains the 1s orbital in each atom.This setup allows us to easily perform exact diagonalization (ED) calculations of the full molecular Hamiltonian at any temperature, such that exact benchmark results for our CDet results are available.
In Fig. 4, we compare the total energy E tot = E +E nuc , where E nuc is the Coulomb energy of the nuclei, from CDet with order truncation k max up to 6 to the ED energy at T = 50 −1 E h , both as a function of r.Around equilibrium distance r ≈ 1.4 a 0 , the CDet energy converges well to the ED solution.The system moves to the strongly correlated regime (i.e. a regime far from the Hartree-Fock solution), as we 'stretch' the molecule by increasing r.At r > 2.0 a 0 we start to observe significant systematic deviation at k max = 6.Since the kinetic energy of electrons moving between two atoms is significantly reduced as we increase r but the long-range Coulomb repulsion between electrons changes slowly, the electron-electron interaction becomes more important at larger r, and hence it is expected that the perturbation expansion becomes more difficult to converge.This setup is standard in quantum chemistry [63] and is similar in spirit to lattice model setups in which a metal-toinsulator transition is induced by gradually increasing an on-site interaction.
Analytically, the convergence behavior is determined by the properties of the expanded quantity (e.g.E[ξ]) as a function of the coupling constant ξ on the complex plane, similar to the convergence analysis for many-body perturbation theory (MBPT) calculations at T = 0 [64][65][66][67][68].We evaluate the electron energy E[ξ] for complex values of ξ near ξ = 0 using ED at r = 1.4,2.0, 2.8, and 3.6 a 0 , following where the H α is the Hartree-Fock counterterm introduced in Sec.II E. One can show via a straight forward substitution that E [1] gives the 'physical' electron energy and E[0] recovers the Hartree-Fock energy.Figure 5.a shows the interaction correction E[ξ] − E[0] to the total energy, where the black dot represents the physical value at ξ = 1.Since the convergence radius of the power series around ξ = 0 is determined by the singularity (pole or branch cut) closest to the origin, the series is convergent at the 'physical' point ξ = 1 if and only if there are no singularities in the unit circle (dashed circles in Fig. 5).At r = 1.4 a 0 , all poles are far outside the unit circle, indicating a rapidly convergent series.As we increase r, poles move closer to the unit circle at r = 2.0 a 0 , implying a slower convergence of the series, and finally enter the unit circle at r = 2.8 a 0 and 3.6 a 0 , resulting in divergent series at ξ = 1.
The analytic properties are reflected directly in the convergence behavior of the CDet results.For a direct comparison, we calculate the contribution of each order k to the total energy E (k) tot up to k max = 8 for the same values of r, as shown in Fig. 6.At r = 1.4 a 0 , E (k) tot quickly converges to zero at k > 4. At r = 2.0 a 0 , we observe tendency to converge at k = 8 but non-zero systematic deviations remain.For r = 2.8 a 0 and 3.6 a 0 , no signs of convergence are observed up to k = 8.
The CDet approach can be applied to different temperatures without adding significant computational cost, as we will show in Sec.III B. This is fundamentally different from methods such as CT-QMC, where reaching lower T is only possible at an exponential cost away from half filling [35].In Fig. 7, we show the temperature dependence of the CDet total energy for H 2 , STO-6g at k max = 6, from T = 10.0 −1 E h down to T = 500.0−1 E h , in comparison to the ED solution at T = 0.All calculations use the same algorithmic setup and the same number of Monte Carlo steps.Convergence to the zerotemperature solution is observed as T decreases, while the stochastic error estimation does not change significantly.Systematic deviations can be observed at similar locations (r > 2.0 a 0 ) for different temperatures, indicating similar convergence behavior for the same system at different temperature.This can be shown by the temperature dependence of the analytic structure of E[ξ], as plotted in Fig. 5.b.As temperature is reduced, the spacing of the poles along the imaginary direction decreases proportionally, but the real-axis locations of the vertical 'walls' of poles stay almost unchanged, which leads to similar convergence radii at different temperature.
The Hartree-Fock shifted action plays an important role in achieving better series convergence in CDet. Figure 8 compares the ED analytic structure of the total energy E[ξ] with and without the Hartree-Fock shift.Without the shift, even for the equilibrium distance r = 1.4 a 0 (usually considered 'weakly correlated'), there are poles deep inside the unit circle, implying a highly divergent series at ξ = 1.In contrast, the Hartree-Fock shift pushes the poles away from the origin, which leads to a convergent series as seen in Fig. 4 and Fig. 6.
The CDet approach gives access to dynamic quantities, such as the Green's function G and the self-energy Σ, through the scattering amplitude M .The left column of Fig. 9 shows the CDet measurement of M (iω n ) in Matsubara frequency space up to k max = 6 for H 2 , tot of each order k to total energy.H2, STO-6g, T = 50 −1 E h .Convergence is observed at bond lengths r ≤ 2.0 a0 but not at r ≥ 2.8 a0.STO-6g at r = 1.4 a 0 and T = 50 −1 E h .As we increase the expansion order, CDet results gradually converge to the ED solution (black lines), and at order 6 we observe only a small systematic error due to order truncation.The CDet self-energy Σ is calculated from M following Eq.(45).Both quantities exhibit similar behavior, as shown in the right column of Fig. 9.At order 3 and higher, the real part of Σ(iω n ) takes non-zero value at high-frequency limit, corresponding to the cor- rection to the frequency-independent Hartree-Fock selfenergy Σ HF .The CDet Green's function, derived from M following Eq.( 44), is shown in Fig. 10.Good agreement with ED is observed at k max = 6 on the top panel, where both the Monte Carlo error estimation and the systematic error due to order truncation is much smaller than the symbol size.The bottom panel shows convergence of CDet Green's function to ED by increasing k max , with a small but visible systematic deviation at low frequency when k max = 6.The generality of our CDet implementation allows a straightforward extension to much larger basis sets.Going beyond the minimal basis, we compute the CDet total energy of H 2 using cc-pVDZ and cc-pVTZ basis sets with 10 and 28 orbitals in total, respectively, and compare to the ED solution as shown in Fig. 11.For r < 2.0 a 0 , CDet gives decent convergence to ED at k max = 4, with both stochastic and systematic error below 1 mE h .The 2 s and 2 p orbitals added by cc-pVDZ basis and 3 s, 3 p  Finally, we extend our method to bigger molecules by adding more hydrogen atoms to the system.We consider a chain of 10 hydrogen atoms on a straight line with equal spacing r, the same benchmark system used in Ref. 57.At minimal basis STO-6g, all ten 1 s orbitals contribute equally to the active space of 10 electrons.Compared to H 2 with cc-pVDZ, which has the same number of orbitals, H 10 with STO-6g has more orbitals relevant to electron correlations, and the cost of CDet is higher (for a detailed analysis see Sec.III B).The left column of Fig. 12 plots the CDet total energy up to k max = 4 in comparison to ED solution at T = 50 −1 E h .Convergence within 5 mE h is achieved at k max = 4 for r < 2.4 a 0 , and systematic deviations are evident for r > 2.4 a 0 .Similar behavior can be found in the the zero-temperature coupled cluster (CCSD) result (dotted lines), as both methods rely on the perturbative expansions of electron-electron interactions in different forms.The computational cost becomes much higher as we go to a bigger basis for H 10 .With cc-pVDZ, there are 50 atomic orbitals in total, with potential excitations to the empty orbitals from all 10 electrons.As shown in the right column Fig. 12, CDet still agrees with the reference method (MRCI+Q data from Ref. 57 at T = 0) for small values of r, but the Monte Carlo errors are significantly larger.Although our generic implementation has achieved decent extensibility without fine tuning for each specific system, more efficient Monte Carlo estimators and sampling schemes as well as analytical resummation techniques should advance the limit of CDet to more complex systems.

B. Analysis of computational cost
The computational cost of a Markov chain Monte Carlo simulation, measured as the computational time needed for reaching a result for observable X within a desired accuracy ∆X, is determined by three factors.First, the cost of each individual update, which is O(k 3 2 k ) for a configuration at expansion order k according to Algorithm 1.Second, the number of configuration updates needed to reach an independent sample by transversing a Markov chain of potentially correlated configurations, described by the integrated autocorrelation time τ int .Finally, the variance Var(X) of the estimator of the quantity of interest (Table I To assess the computational cost of our CDet implementation for reaching a certain uncertainty level, as well as how the effort changes with respect to temperature, choice of basis set, and system size, we perform a series of simulations of convergent series for the hydrogen chain H n with the same Monte Carlo updates and measurements for a fixed number of Markov chain iterations. In Fig. 13, we show estimates of autocorrelation effects, actual computational costs, and stochastic uncertainties in total energy, as functions of temperature T , the number of orbitals N orb , or the number of hydrogen atoms n in log-log plots.We rescale the y-values by an arbitrary factor to emphasize the respective scaling of these quantities in the same plot.Figure 13.a shows the temperature dependence of CDet simulations of a fixed system (H 2 , STO-6g, r = 1.4 a 0 ) at k max = 6.We observe that the simulation time does not change significantly as we decrease temperature, indicating similar distributions of the expansion order (usually tilted to the highest order).The error estimate in total energy follows almost the same tendency as the factor of the autocorrelation effect √ 2τ int + 1, indicating the underlying energy estimator does not have strong temperature dependence.The autocorrelation effect shows a slow power-law increase as temperature is lowered, implying that our Monte Carlo updates remain efficient at low temperature.
A similar analysis is shown in Fig. 13.b for the basis set dependence of the same system (H 2 , r = 1.4 a 0 ) at fixed temperature.We perform CDet simulations with k max = 4 for basis sets STO-6g, cc-pVDZ, and cc-pVTZ, with 2, 10, and 28 atomic orbitals, respectively.As we add more 'virtual' orbitals to the system, the computational time increases slowly, and the autocorrelation time even decreases as the additional orbitals improve the connectivity of Monte Carlo configurations.However, the stochastic error shows a different trend from the autocorrelation effect and increases (a fit with a power law results in ∼ N 1.22 orb ), meaning that the additional orbitals introduce more diagrammatic configurations with alternating signs that lead to stronger Monte Carlo fluctuations.
As we increase the systems size in Fig. 13.c by adding more hydrogen atoms, the stochastic error (normalized by the system size n) increases with a much faster power law (fitted ∼ n 3.48 ), while the computational time and autocorrelation barely changes.The additional atoms add not only more orbitals to the system, but also valence electrons which directly participate in electron excitations.These electrons introduce additional diagrammatic configurations on a faster scale than those introduced by virtual orbitals as in Fig. 13.b, and cancelling their effects via Monte Carlo becomes more expensive.
Thus, through the empirical analysis above, we have shown that for convergent series, the computational cost of our CDet implementation is not very sensitive to changes in temperature or basis sets, but depends strongly on the size of the system, or more specifically, on the number of valence electrons directly participating in electron excitations.

IV. CONCLUSION
In conclusion, we have presented a diagrammatic Monte Carlo method for quantum impurity models with general interactions and hybridizations.We have tested the method at the example of molecular systems, which present a systematic way of changing correlation strength, system size, basis size, and temperature.
Our method is formulated in the language of Green's functions and self-energies.As a grand-canonical finitetemperature method, it is able to describe systems with particle number fluctuations and excited states.However, similar to other perturbative methods, the diagrammatic series breaks down in the strong correlation regime.This breakdown is clearly evident in the order-by-order convergence of the series and, as we have shown in detail, can be traced back to the pole structure of the diagram series.
Our method fills a crucial need of impurity solvers able to treat general four-fermion interaction and general offdiagonal hybridizations in large multi-orbital problem.It should therefore find applications in moderately correlated real-material simulations such as those occurring in DMFT [5,6] and SEET [20][21][22][23].
Further methodological progress, such as the use of higher order counterterms [54], better integration methods [69], complex conformal mapping techniques [70][71][72], and other types of Monte Carlo updates will expand the accessible parameter regime of the method and may make simulations in the strongly correlated regime possible.
Expanding compound indices x, y, . .., Eq. (B2) can be rewritten as In practice, we usually work with the time-translational invariant functions M (τ ) and g(τ ) instead of their twovariable form.To that effect, we consider for 0 < τ ≤ β,

FIG. 2 .
FIG.2.Schematic illustration of the recursive removal of disconnected amputated diagrams.Empty boxes stand for the contribution of all diagrams (D or A), and filled ones for that connected diagrams only (Ac).Symbols inside boxes denote the set of vertices included in each component.The top relation shows all partitions of [A(V)] αβ into a subset fully connected to the amputated legs and a disconnected complement set.It is reorganized as the bottom relation which recursively defines Ac(V) by removing all disconnected components.

G 10 :
[S,S] is the submatrix of G whose rows and columns correspond to the subset S. Same definition applies to [Ac(V)] [S,S] .Subtract Ac(S)D(V\S) from [Ac(V)] [S,S] .
end function D. Observables from scattering amplitude

FIG. 3 .
FIG. 3. Schematic example of diagram cancellations due to the Hartree-Fock counterterm.Here we show all secondorder Green's function diagrams generated by the counterterm, where the red circle indicates the counterterm α, each introduces a factor of −1.Terms in each dashed curve cancel each other, leaving only the last term.

2 .
Vertex merging: Pick two random vertices v 1 = (a, b, c , d ; τ ) and v 2 = (a , b , c, d; τ ) and merge them into v = (a, b, c, d; τ ).The proposal probability distribution from order k + 1 to k is

4 .
Vertex shift in time: Update the time label τ of a vertex v to a new value τ .Vertex shift in orbitals: Update one of the orbital labels a, b, c, d of a vertex v to a random new value.

FIG. 5 .
FIG. 5. Analytic structure of electron total energy evaluated with ED as a function of the complex coupling constant ξ.H2, STO-6g.Colors represent complex phases and brightness indicates the magnitude (see color bars).The black dot at ξ = 1 represents the 'physical' result.The dashed black circle indicates minimum convergence radius necessary for the perturbation series to converge at ξ = 1.a) Effect of changing r at fixed temperature T = 50 −1 E h .At r = 1.4 a0 and r = 2.0 a0, no singularity is visible in the unit circle and the series is convergent at ξ = 1.At r = 2.8 a0 and 3.6 a0, poles appear in the unit circle, resulting in a divergent series at ξ = 1.b) Effect of changing T at fixed r = 1.4 a0.The real-axis locations of the vertical 'walls' of poles does not change significantly as temperature decreases, while the imaginary-axis spacing of the poles decreases proportionally with T .

FIG. 8 .FIG. 9 .
FIG. 8. Analytic structure of electron total energy as a function of a complex coupling constant ξ with and without Hartree-Fock shift.Left (right) panel presents values of E[ξ] − E[0] evaluated on the complex plane for H2, STO-6g, T = 50 −1 E h and r = 1.4 a0 with (without) Hartree-Fock shift.

FIG. 10 .
FIG. 10.CDet Green's function in comparison to ED. H2, STO-6g, T = 50 −1 E h , r = 1.4 a0 Top panel: values of Ĝ(iωn) at orbital 1. CDet results at kmax = 6 are plotted as symbols and ED values as lines.Error bars are indicated but much smaller than symbol size.Bottom panel: deviations of CDet results from ED at different kmax.Solid (dashed) lines represent real (imaginary) part of Ĝ11(iωn).Shadings indicate stochastic uncertainties of CDet.

FIG. 11 .
FIG. 11.Total energy Etot with Monte Carlo errors for H2 with cc-pVDZ (left column) and cc-pVTZ(right column) basis sets, T = 50 −1 E h .Top panels: comparison of ED and CDet at different kmax.Bottom panels: difference between ED and CDet.

and 3 d
orbitals by cc-pVTZ basis are mostly unoccupied, and the most electron excitation occur near the lowest 1 s orbitals.Consequently, the convergence behavior and computational cost of CDet do not change significantly from the minimal basis STO-6g.

kFIG. 12 .
FIG.12.Total energy Etot with Monte Carlo errors for H10 with STO-6g (left column) and cc-pVDZ (right column) basis.ED results are used as reference for STO-6g and MRCI+Q (T = 0) from Ref. 57 for cc-pVDZ.Top panels: comparison of reference data and CDet at different kmax at finite temperature T = 50 −1 E h , along with ED and CCSD results at T = 0 for STO-6g basis.Bottom panels: difference between CDet and reference data at finite temperature, (for STO-6g) in comparison to difference between CCSD and ED at zero temperature.

FIG. 13 .
FIG.13.Empirical cost analysis of CDet simulations of hydrogen chain Hn at r = 1.4 a0.In each panel, all simulations are carried out using the same setup of Monte Carlo updates and number of iterations.We estimate the contribution of integrated autocorrelation time τint to the stochastic error (blue), the computational cost (orange), and the total stochastic uncertainty of energy ∆Etot (green) for each simulation, and scale them to the same range on double-logarithmic plots.(a) Temperature dependence, H2, STO-6g, kmax = 6.(b) Basis set dependence, H2, T = 50 −1 E h , kmax = 4. (c) System size dependence, Hn, cc-pVDZ, T = 50 −1 E h , kmax = 4.

Appendix C :
Thermal expectation value of the electron energyThe one-body energy is straightforward:E 0 = H 0 = ab h ab c † a c b = ab h ab ρ ab .(C1)Expression for the two-body energy term can be derived in multiple ways such as using the equation of motion or the Schwinger-Dyson equation.Here we provide a simple derivation following Ref.75.We introduce a coupling constant ξ to the action defined in Eq. (B4) such that S ξ = S 0 + ξS V and ξ → 1 recovers the 'physical' results.Now we havedZ ξ dξ ξ=1 = − D[c, c]S V e −S0−ξS V ξ=1 )c c (τ )c d (τ )c b (τ ) = −Zβ H V .(C2)Introducing a change of variables such that c → c/ξ 1/4 and c → c/ξ 1/4 , thenD[c, c] = lim α =ξ + Tr[I]/2 D[c, c] (C3)where the indices α denote states at each discretized time point on the integration path, Tr[I] = dxδ(x, x) in which x is the compound spacetime index, and the plus sign on the exponent is due to the nature of Grassmann integrals.The partition function is unaffected by the change of variables, which now takes the formZ ξ = ξ Tr[I]/2 D[c, c]e −ξ −1/2 S0−S V ., c]g −1 (x ,x)c(x )c(x)e −S = Tr[I] 2 Z − Z 2 g −1 (x , x)G(x, x ) = Z 2 Tr[I − g −1 G] (C5)

TABLE I .
Measurements for physical observables.Symbols with 'hats' denote quantities in Matsubara frequency representations.The imaginary time convolution is defined as [f * g](τ ) =