Dual Bethe-Salpeter equation for the multi-orbital lattice susceptibility within dynamical mean-field theory

Dynamical mean-field theory describes the impact of strong local correlation effects in many-electron systems. While the single-particle spectral function is directly obtained within the formalism, two-particle susceptibilities can also be obtained by solving the Bethe-Salpeter equation. The solution requires handling infinite matrices in Matsubara frequency space. This is commonly treated using a finite frequency cut-off, resulting in slow linear convergence. A decomposition of the two-particle response in local and non-local contributions enables a reformulation of the Bethe-Salpeter equation inspired by the dual boson formalism. The re-formulation has a drastically improved cubic convergence with respect to the frequency cut-off, facilitating the calculation of susceptibilities in multi-orbital systems considerably. This improved convergence arises from the fact that local contributions can be measured in the impurity solver. The dual Bethe-Salpeter equation uses the fully reducible vertex which is free from vertex divergences. We benchmark the approach on several systems including the spin susceptibility of strontium ruthenate Sr$_2$RuO$_4$, a strongly correlated Hund's metal with three active orbitals.


I. INTRODUCTION
Electronic correlations are present in many interesting quantum materials, including Mott insulators [1], ruthenates [2], iron pnictides [3] and cuprates [4].In these materials, the interaction between electronic quasiparticles is strong, leading to the breakdown of the independent quasi-particle picture.The theoretical description of this quantum many-body problem is hard and generally requires approximations.The Dynamical mean-field theory (DMFT) [5,6] approximation is one prominent example, which quantitatively explains many experimentally observed phenomena [2], although there are also situations where it is insufficient [7,8].
The properties of a system of correlated electrons can be described in terms of expectation values of the form ⟨c † α c β ⟩, ⟨c † α c β c † γ c δ ⟩ and so on, where c α , c † α stand for the annihilation and creation operator of an electron and α is a combined time τ , space R, and spin-orbital a label α ≡ (a, R, τ ).Here ⟨c † α c β ⟩ is called a single-particle quantity, ⟨c † α c β c † γ c δ ⟩ a two-particle quantity, and so on.With the increase in the number of operators and labels in the correlation function, these many-particle quantities become increasingly complicated.In practice, expectation values with more than two pairs of operators are often (but not always [9,10]) out of reach, but both the single-particle and two-particle quantities are experimentally accessible.A simplification of the structure is possible for translational invariant systems in equilibrium where only relative time and position matters.Whence, the number of space-time (or momentum-frequency) labels in the correlation function can be reduced by one.Another name for the two-particle quantity is the gener- alized susceptibility χ, see Fig. 1, since these correlation functions describe the linear response of the electronic system to an external field, according to the fluctuationdissipation theorem.
Experimentally, the susceptibilities are accessible via techniques such as inelastic neutron scattering (INS) [11], electron energy loss spectroscopy (EELS) [12] or resonant inelastic x-ray scattering (RIXS) [13].Roughly speaking, these experiments consist of shooting a probing particle with a known energy E in and momentum k in at the correlated material of interest, and measuring the distribution S over energy E out and momentum k out of the outgoing probing particle.The energy and momentum difference between the ingoing and outgoing particle is transferred to the sample, i.e. the energy ω = E in −E out and momentum q = k in − k out are absorbed by the correlated material, and large magnitudes of S indicate that the material supports (collective) excitations at (ω, q).Thus, the scattering amplitude S measured by these spectroscopic techniques is proportional to the susceptibility χ(ω, q) of the material, up to further scattering matrix element and interaction effects, see Fig. 1.Thus, the experimental spectrum S only depends on the transferred energy and momentum (ω, q), while the generalized electronic susceptibility χ is a four-index object, which depends on three frequencies and momenta due to time-space translation symmetry.To get to the experimentally relevant susceptibility, two of the frequencies and momenta of the generalized susceptibility need to be traced out.To understand the meaning of this trace, consider the momentum labels q, k, k ′ of the generalized susceptibility.These labels correspond to an electronic transition k → k + q and another electronic transition k ′ + q → k ′ , see Fig. 1.In the theoretical description, it is possible to consider/calculate the likelihood of such a transition as a function of both k and k ′ , as well as q, but in the experiment, only the momentum transfer q is observable and there is no way to determine the momenta of the electronic states involved in the transition, and the trace takes the form Thus, there is a tremendous loss of information when going from the full generalized susceptibility to a particular traced-out version that is actually experimentally observable.
In this work, we will show that this loss of information can actually be used to our advantage when we want to calculate the susceptibility in DMFT [6].The usual formulation of the Bethe-Salpeter equation in DMFT proceeds by first calculating the generalized susceptibility of the material, based on the generalized susceptibility of the auxiliary impurity model, and only taking the trace over momenta and fermionic frequencies at the end.The disadvantage of this is that this formulation of the Bethe-Salpeter equation converges only linearly with respect to the number of internal frequencies used in the calculation, especially for local scattering processes.Some techniques have been developed to correct for the known high-frequency asymptotics [14][15][16][17][18][19], but converging with respect to the frequency grid remains a challenge for reaching experimentally relevant and physically interesting temperatures.
An alternative formulation [20], which can be derived using the dual boson approach [21], takes advantage of the distinction between local and non-local fluctuations in DMFT.By using the local single-frequency and twofrequency susceptibilities of the auxiliary impurity model (measured directly in the impurity solver), it is possible to efficiently take into account all local scattering processes, regardless of their internal frequency.The resulting dual Bethe-Salpeter equation (DBSE) then only has to account for non-local scattering processes [22][23][24][25][26], which converge cubically with respect to the number of fermionic frequencies used in the calculation.This is a substantial improvement that enables calculations at lower temperature or in more complex systems, since the computational complexity of the DBSE (and BSE) scales with the number of orbitals N o and the momentum dis-cretization N k , as well as the number of fermionic frequencies N ν according to O(N 3 ν N 6 o N k ).This dual boson based approach has previously been applied to charge [27], S z [25] and full spin [28] susceptibility of single-orbital models, and a multi-orbital implementation of this scheme using a channel decomposition is available in the AbinitioDΓA project [29].Here, we present an open-source implementation of the dual Bethe-Salpeter equation in the two particle response function toolbox (TPRF [30]), based on the toolbox for interacting quantum systems (TRIQS [31]).We provide a detailed benchmark of the static spin susceptibility of Sr 2 RuO 4 .

II. THEORY
We will consider lattice models in thermodynamic equilibrium at finite temperature, using the imaginary time and Matsubara frequency formalism.We start with a brief overview of the susceptibility and how it is usually calculated in DMFT, including our notation for describing it.Some familiarity with these concepts is assumed, we refer the interested reader to the literature [6,7,[32][33][34] for more details.A discussion of the relationship with previous works can be found in Sec.V.

A. Notation
The single-particle Green's function is defined as , where τ ∈ [0, β) is the imaginary time, a and b are combined spin-orbital labels and R is a lattice vector.We use • to denote indices that appear on creation operators, which becomes very relevant for many-particle Green's functions.The Fourier transform of G a b(τ, R) is G a b(ν, k), where ν is a fermionic Matsubara frequency and k is the momentum in the first Brillouin Zone.
Similarly, the generalized susceptibility χ ābcd (ω, q) is the Fourier transform of Here, ω is a bosonic Matsubara frequency.For clarity, we use q for the bosonic momentum and k for fermionic momenta, both take values in the first Brillouin Zone.Note that the order of barred operators differs between the single-particle Green's function and the susceptibility.The notation of the single-particle Green's function is conventional.For the susceptibility there are many notations found in the literature, we follow the notation of TPRF [30], see also Appendix A.
The spin-orbital labels can be contracted to go from the generalized susceptibility to a particular physical susceptibility, χ AB (ω, q) = abcd A ab B cd χ ābcd (ω, q), where A and B are matrices in spin-orbital space.As an example, the S z component of the spin susceptibility is obtained as χ S z S z (ω, q) = abcd σz ab σz cd χ ābcd (ω, q), where σz is the appropriate Pauli matrix.
The static susceptibility χ AB (ω = 0, q = 0) can be interpreted in linear response theory.Define the operators Â = ab A ab c † a c b and B = cd B cd c † c c d , which we assume to be Hermitian.Then, the application of a spatially homogeneous field λ B, in the sense that the Hamiltonian changes as Ĥ → Ĥ − λ B, leads to a change in ⟨ Â⟩, where the zero field response is d⟨ Â⟩/dλ λ=0 = χ AB (ω = 0, q = 0).This definition can be generalized to spatially inhomogeneous fields (finite q) by considering enlarged unit cells [35], while finite ω corresponds to time dependent applied fields.

B. Bethe-Salpeter equation in Dynamical
Mean-Field Theory In DMFT, the susceptibility can be calculated using the Bethe-Salpeter equation [6].Here, we introduce the equations without orbital, frequency and momentum labels to emphasize the structure of the equations, explicit formulas including labels are given later.To distinguish the physical susceptibility χ of the system from the auxiliary impurity quantities in DMFT, it is commonly called the lattice susceptibility.In general, the Bethe-Salpeter equation for χ takes the form where the bubble χ 0 = G * G denotes the independent propagation of a particle-hole pair according to the Green's function G given by the DMFT approximation.Γ is the irreducible vertex, which is assumed to be local in the DMFT approximation.Both G and Γ can be extracted from the solution of DMFT's auxiliary impurity problem.The Bethe-Salpeter Equation ( 2) is a self-consistent equation for χ, which can be inverted to give χ as a function χ 0 and Γ.
In DMFT, it is also possible to discuss the susceptibility of the auxiliary impurity problem, X, which has it's own Bethe-Salpeter equation Here, the impurity bubble X 0 = g * g is defined in analogy with the lattice bubble, and g is the Green's function of the impurity model, see also Eq. ( 13).Importantly, within the DMFT approximation the lattice and impurity vertices Γ are identical.The frequency, momentum and orbital dependence of the products in Eqs. ( 2) and ( 3) is here supressed for readability, c.f. Sec.II F.

C. Locality
Locality is a central concept in DMFT.The DMFT self-consistency condition guarantees that g is the local part of G, which is often expressed as g = G(R = 0) = This implies that X 0 is the local part of χ 0 as well, since the bubble is a direct product in real space However, X is not the local part of χ [24], since the Bethe-Salpeter equation (2) allows for processes where the electron-hole pair moves to a different site and back.Physically, X corresponds to the linear response of the impurity model while keeping the dynamical mean-field fixed, whereas χ corresponds to linear response where the mean-field is self-consistently adjusted.
Although X is not identical to (the local part of) χ, it might still be a much better starting point for the calculation of χ than χ 0 , since X already contains local vertex corrections in the form of Eq. (3).We should stress that X can be extracted directly from the impurity model, without needing to use the identity Eq. (3).Thus, the central idea of the dual Bethe-Salpeter equation [21,22] is to express χ as X and some corrections.Finally, we note that this idea has also been used to decompose the polarization [26].

D. Dual Bethe-Salpeter equation: idea
As mentioned above, it makes sense to pull apart the lattice bubble χ 0 into two parts, the impurity bubble X 0 and a remainder χ0 , We are looking for an analogous separation of χ, in terms of the impurity susceptibility X and a correction term χ, derived below.Insertion of the Bethe-Salpeter equations [Eqs.
(2) and (3)] gives The result can be further simplified by writing it in terms of the impurity reducible vertex F , defined as which gives the relations Here, we have introduced L to compactify the notation further.
Hence, the DMFT+DBSE approach gives the lattice susceptibility χ in terms of the impurity susceptibility X and the dual correction term see Eqs. ( 22) and ( 23) for the complete frequency, momentum and orbital structure.Note that this formulation of the lattice susceptibility only uses the reducible vertex F and is therefore free from the issues related to divergences that occur in Γ [36][37][38][39].
Both formulations of the Bethe-Salpeter equation, Eqs. ( 2) and ( 11), are matrix equations, and the matrix multiplication involves a formally infinite sum over fermionic frequencies.In practice, this sum has to be cut off at some N ν .In Sec.III, we will show that the Eq. ( 11) converges more rapidly with respect to N ν , making it superior for practical implementations.Before studying the convergence, however, we will first introduce how the calculation is done in practice, with all relevant orbital and frequency labels and prefactors.

E. Impurity correlators
To achieve the improved scaling with N ν , it is central that the three constituents of the right-hand side of the DBSE equation (11), X, L and F are obtained from the impurity model directly, as described below, from independent Monte Carlo measurements.Note that, while it is possible to obtain the components with zero or one fermionic frequency (X and L) using sums over frequencies of the generalized two-particle Green's function (g (4)  defined below), doing so would introduce its own N νcutoff errors and would make the scaling with the size of the frequency box worse.
The impurity susceptibility X ābcd (ω) is the Fourier transform to Matsubara frequency of Eq. ( 1) for the impurity model, i.e., without a spatial label R.
Similarly, for the fermion-boson vertex L, we start with the impurity correlation function g , where ν is a fermionic and ω a bosonic Matsubara frequency.L is the amputated connected component of g (3) , i.e., Here, g is the single-particle Green's function of the impurity model.To solve Eq. ( 12) for L, it is convenient to express it in terms of the impurity bubble, so that Eq. ( 12) becomes This is a tensorial equation in orbital space while being diagonal in both the fermionic and the bosonic Matsubara frequency.L is obtained by multiplication with the inverse of X 0 , TPRF's implementation of these operations is used and this is the reason for the factor −1/T in our definition of L.
The fermion-boson vertex has some useful properties, which have been discussed in detail for single-orbital DMFT [40].First, the impurity susceptibility X is equal to g (3) traced over ν.Secondly, L describes the linear response of the self-energy to an external field, Thirdly, for a non-interacting impurity model, ābcd (ω, ν) = g dā (ν)g bc (ν+ω)−g bā (ν)g dc (τ = 0 + )δ ω,0 .This expression for L is consistent with the linear response of Eq. (15).It can also be shown more generally that L asymptotically goes to a constant in the limit of large ν, keeping ω fixed.
Finally, L has a symmetry with respect to complex conjugation.In general, For real Hamiltonians, i.e., a Hamiltonian where all matrix elements are real in the chosen orbital representation, which includes all examples studied here, there is the additional symmetry relation This formula is obtained by reversing the arrow of all fermionic lines (time-reversal), and re-arranging the diagram to bring it back into the customary form.These formulas are proven in Appendix A 3.
From our definition in Eq. ( 12), L has units of energy since T has units of energy, g(ν) has units of inverse energy and g(τ ) is dimensionless.For the linear response, it is convenient to define the operators Â and B in a dimensionless way (this includes the important examples of N and Ŝi ), so λ has units of energy and the right-hand side of Eq. ( 15) is dimensionless.
Another ingredient for the dual Bethe-Salpeter equation is the reducible impurity vertex, F ābcd , which is the connected, amputated two-particle correlation function of the impurity model.This object is used in many DMFT-based theories [7], and a variety of notations is used in the literature.We use the same convention as TPRF for the particle-hole channel, i.e., In practice g and g (4) are obtained from the impurity solver and F is calculated using this relation by subtracting the disconnected components from g (4) and then amputating the four attached single particle Green's function legs by applying four g −1 terms.

F. Dual Bethe-Salpeter equation in multi-orbital systems
The lattice Green's function G in DMFT is defined by where Σ is the self-energy of the auxiliary impurity model.We define the non-local single-particle Green's function G as where G a b(ν, R) is the Fourier transform of G a b(ν, k) to real space.Corresponding to the non-local Green's function G, there is a "bubble" χ0 , (21) This object is Fourier transformed back to momentum space, giving χ0 ābcd (ω, ν, ν ′ , q).Evaluating the bubble in real space and Fourier transforming is more efficient than doing the calculation directly in momentum space [41,42].We should note that χ0 is the nonlocal part of χ 0 and χ0 (ω, ν, ν ′ , R = 0) = 0. Since both Green's functions in Eq. ( 21) have the same |R|, there are no cross-terms with one local and one non-local part of the Green's function.
From χ0 and F , the dual or non-local Bethe-Salpeter equation can now be solved for χ by inversion, using the reducible vertex F obtained from Eq. (18).Finally, to get the physical susceptibility, we have to combine the impurity susceptibility with the non-local correction, adding fermion-boson vertices at the end, as sketched in Fig. 2, Using the symmetry Eq. ( 16) of L, this can be rewritten as For real Hamiltonians, using the symmetry Eq. ( 17) of L, this can be rewritten as Equations ( 24) and ( 25) perform better in terms of frequency windows when ω is large, since ν + ω is replaced by ±ν.For real Hamiltonians, Eq. ( 25) also has the advantage that it only uses a single bosonic frequency ω.
On the other hand, Equation ( 23) is obviously invariant under an orbital basis transformation (counting barred and non-barred occurences of orbital labels on both sides of the equation), while this is not obvious to see in Equations (24) and (25).The entire procedure is summarized in Fig. 3.The computationally non-trivial step in this procedure is the inversion of the DBSE in Eq. (22).It corresponds to a geometric series with kernel χ0 F , which is divergent if one of the eigenvalues of the kernel is equal to +1, signaling a phase transition.Thus, analysis of this kernel can give valuable insight into the leading electronic fluctuations [39].

G. Comparison of dual and usual Bethe-Salpeter equation
The usual Bethe-Salpeter equation in DMFT is based on the Green's function G instead of G, and the vertex Γ instead of F .The susceptibility is then obtained directly as the output of the Bethe-Salpeter equation, i.e., without needing Eq. ( 23).Diagrammatically, it corresponds to the repeated scattering of electron-hole pairs propagating with the DMFT Green's function and interacting with the irreducible vertex Γ.
Comparing the two formulations of the Bethe-Salpeter equation, the dual boson based approach can be interpreted as a resummation of the original Bethe-Salpeter equation where terms in the geometric series are grouped together if subsequent vertices Γ lie at the same lattice site.This occurs in the following way: If all vertices Γ involve only a single lattice site, the diagram is a part of X.Otherwise, all vertices before the first change of site are grouped into the L on the left and all vertices Diagrammatic representation of elements of the dual Bethe-Salpeter equation.At the top, the orbital and frequency definition of L is illustrated.In our definition, L(ω, ν) implies that the incoming electron emits energy ω to the bosonic degree of freedom.At the bottom, the non-local contribution to the susceptibility is drawn.In this process, energy ω and momentum q comes in from the left, is transferred to an electron-hole pair that propagates non-locally ( χ) and is then emitted.In the triangle on the left, since the electron absorbs energy, the corresponding argument of the first L in Eq. ( 23) is −ω.
after the last change of site are grouped into the L on the right.In between, all vertices Γ that occur between two changes of site are grouped into a single F .To avoid double-counting of diagrams, the dual bubble χ0 is used to enforce a change of sites between every vertex.
This construction provides for a one-to-one correspondence of the original Bethe-Salpeter series and the new, dual Bethe-Salpeter with all elements expressed in terms of Γ and bubbles.In the derivation, X and L are expressed using local Bethe-Salpeter equations, ( 9)- (10), but in practice they are obtained directly from the impurity solver, see Fig. 3.This eliminates cutoff errors associated with the local Bethe-Salpeter equation.

H. Implementation details
After converging an ordinary DMFT self-consistency loop using TRIQS and the CTHYB solver [43], we use w2dynamics [44] to measure the two-particle correlation functions with one, two and three frequencies.Due to different conventions the output of the w2dynamics solver is then transformed to the TPRF two-particle convention, see Appendix A for details.After post-processing the two-particle correlation functions to get X, L and F , the dual Bethe-Salpeter equation is evaluated and χ(ω, q) is extracted as in Sec.II F. This evaluation is the main contribution of the present manuscript.
The DMFT+DBSE calculation is compute bound by the measurements of the impurity correlation functions (using QMC).The dual Bethe-Salpeter equation solver is memory bound, since we use fast Fourier transforms to evaluate the lattice bubble susceptibility χ 0 as direct products in real space and imaginary time (R, τ ).However, the DBSE equation is diagonal in ω and q and can be evaluated one (ω, q) at a time.All computationally non-trivial parts of the DBSE calculation are performed using the two-particle response function toolbox (TPRF) [30] with routines implemented in C++, using hybrid OpenMP+MPI parallelization.

III. CONVERGENCE WITH RESPECT TO THE NUMBER OF FERMIONIC FREQUENCIES
An important benefit of the dual version of the susceptibility calculation is the improved convergence with respect to the number of fermionic frequencies N ν .Whereas the traditional Bethe-Salpeter equation scales as N −1 ν , the new implementation generically scales as N −3 ν , as discussed below (see also [26]).We discuss the general case first and then give a few examples that are analytically tractable.

A. Cubic scaling of DBSE
Since we are interested in the formal asymptotic scaling for large fermionic frequency, we eventually have ω ≪ ν and we can set ω = 0 for the analysis.In practice, for finite ω, the asymptotic scaling sets in at larger N ν , and one typically needs N ν > ∼ βω/(2π).In the analysis below, we drop the orbital labels, which are not relevant for the asymptotic analysis.
At large fermionic frequency, the single-particle Green's function behaves as G(iν, k) ∼ 1 iν + O(ν −2 ).In other words, the leading term is k-independent and hence completely local, and non-local parts of the Green's function decay at least as (iν) −2 .Thus, G(iν, k) from Eq. ( 20) decays as (iν) −2 by construction [45], since it is the Green's function with the local part removed.The dual bubble, Eq. ( 21) thus scales as χ0 (ω = 0, ν, ν, q) ∼ (iν) −4 .It will turn out that this intermediate result is similar to the final result, so it is useful to have a look at the cutoff error here.With a fermionic frequency box of size N ν , the contribution from frequencies outside the box scales as ν>Nν χ0 (ν) ∼ ν>Nν (iν) −4 ∼ ∞ Nν dx/x 4 ∼ N −3 ν , i.e. the anticipated cubic scaling.Of course, the DBSE expression for the susceptibility is more complicated than just the dual bubble, so we need to analyze the effect of vertex corrections.
In the final expression for the susceptibility, Eq. ( 23), X is independent of N ν , since it is measured directly in the impurity solver.The vertex L(ω, ν), which (through g (3) ) is also measured in the impurity solver, is asymptotically constant in ν [40], so we can focus our attention on χ, Eq. ( 22).The vertex F is asymptotically constant as a function of the fermionic Matsubara frequency [18,46].The Bethe-Salpeter equation can be expanded as a geometric series, where we only write fermionic frequency labels, For the dependence of the susceptibility on the fermionic frequency box size N ν , we are interested in the error , where χNν denotes that the fermionic frequency sums on the right-hand side of Eq. ( 26) are also cut off at N ν .The first term decays as δ νν ′ (iν) −4 , which leads to N −3 ν as before.For the second term, we need to distinguish between three cases [17,18] for the cut-off error: (1) ν < N ν , ν ′ > N ν (2) ν > N ν , ν ′ < N ν and (3) ν > N ν , ν ′ > N ν .For (1), we know that χ0 ∼ (iν) −4 so the asymptotic contribution is of order N −3 ν .( 2) is similar by symmetry.For (3), we have an error Going to the third term in (26) or even higher order terms, more cases need to be distinguished, since it is now also possible for intermediate frequencies to be large.The leading contributions are always of order N −3 ν and come from the cases where precisely one fermionic frequency is larger than N ν .Cases with more large fermionic frequencies have corresponding higher powers of N −3 ν .For the traditional BSE, the analysis proceeds in a very similar way, but the difference is that χ 0 is used instead of χ0 , and it scales as ν −2 only, leading to a cutoff error of N −1 ν .The same scaling holds for the local BSE.In fact, one can see that the trivial 1 iν asymptote of the Green's function is responsible for the slow scaling of the BSE.However, it contributes in a non-trivial way: the BSE is a series of ladder diagrams, and all orders in the series contribute to the N ν -asymptote via processes where a single rung of the ladder has ν > N ν .

B. Example: The non-interacting dimer
Consider the Hubbard dimer in the limit U = 0, i.e., two atoms with a hopping amplitude t.In that case, the Hamiltonian has two eigenvalues ϵ(k) = ±t and all calculations can be done straightforwardly.Define sgn(k) = sgn(ϵ k ), so ϵ k = t sgn(k). 1 The non-local bubble susceptibility at q = 0 and iω = 0 is From the final expression, it is clear that this frequencydiagonal matrix element decays as (iν) −4 .Given a fermionic frequency box of size N ν , the estimated total contribution from frequencies outside of the box scales as ∞ Nν dx/x 4 ∝ N −3 ν .Thus, we find cubic convergence.For the dimer, it is easy to verify that the other value of q has the same magnitude and only differs in the sign.Thus, it also has the same scaling.
For completeness, the conventional evaluation of the susceptibility in the dimer, at ω = 0 and q = 0, involves This matrix element decays as (iν) −2 , so the total cutoff error scales as N −1 ν , i.e., linearly.The difference between χ 0 and χ0 is given by the impurity susceptibility, which satisfies However, this fermionic frequency sum is not performed explicitly in the fast evaluation of the susceptibility.Instead, X is measured directly in the impurity solver.

C. Tight-binding lattice model
The scaling in a tight-binding lattice model (the noninteracting Hubbard model) is the same as in the corresponding non-interacting dimer, with the difference that the momentum sums cannot be done analytically.A numerical illustration of the scaling is given in Figure 4 for a square lattice with two orbitals per site.Note that the non-interacting model with orbital-diagonal hopping is actually just a sum over independent single-orbital models, so the orbital physics is trivial, but it is a useful test of the general implementation.As a reference, Figure 4 uses the exact χ(q = 0, ω = 0) = dm/dh obtained using finite differences from the magnetization m(h), given by the Fermi-Dirac distribution for non-interacting fermions.To achieve consistency, the same momentum space discretization should be used in the BSE and the linear response.

D. Atomic limit
In the atomic limit, t k = 0, we have χ(ω, q) = X(ω) and the new formulation gives the exact result regardless of the number of fermionic frequencies.Indeed, the dual bubble vanishes in the atomic limit, χ0 = 0. On the other hand, the conventional formulation of the Bethe-Salpeter equation does depend on the finite box size in the usual linear way, since the local Bethe-Salpeter equation of the Hubbard atom is non-trivial [47].

IV. BENCHMARKS
To benchmark the dual BSE formulation, we apply it to two strongly correlated models, namely the singleband Hubbard model on the square lattice and the threeband effective low-energy model for strontium ruthenate (Sr 2 RuO 4 ).

A. The single-band Hubbard model
The single band Hubbard model on the square lattice is a canonical system for the study of strong correlations.
Here we benchmark the static spin susceptibility, computed using dual BSE, against the standard BSE and the extrapolated zero field response from self consistent DMFT calculations in an applied magnetic field.The calculations are performed in the strongly correlated regime with nearest neighbor hopping t = 1 and local Hubbard interaction U = 10 in the paramagnetic state at inverse temperature β = 1.
Figure 5a shows the convergence of the DBSE static susceptibility (orange markers) using fermionic frequency windows in the range 4 ≤ N ν ≤ 30, giving five digits of accuracy and a quantitative agreement with the applied field result (green diamond).In fact, the discrepancy between the extrapolated DBSE result (orange open circle) and the applied field (green diamond) can be attributed to MC errors and the extrapolation to zero field in the linear response calculation (as shown in Fig. 5c).This should be contrasted with the BSE results in Fig. 5b where the raw data (blue squares) only give one digit of accuracy, requiring extrapolation to N ν → ∞ to achieve any quantitative agreement (blue open square).
The expected cubic scaling with the size of the frequency window N ν is clearly visible in the DBSE result (Fig. 5a) allowing us to reach a higher accuracy than the conventional BSE with linear convergence.

B. Three-band Hubbard model for strontium ruthenate
While computing the lattice susceptibility within DMFT for the single band Hubbard model to higher accuracy is a feat, the main motivation for implementing the DBSE is to push the boundaries of applicability of the method to material realistic models in order to connect to experimental spectroscopies.This generally requires DMFT treatment of multi-orbital models.For example, the correlated electrons in transition metal compounds can often be described by effective low-energy Hubbard models with two to five orbitals per unit cell.
One such strongly correlated material is strontium ruthenate (Sr 2 RuO 4 ) which has a substantial momentum-dependence in its spin susceptibility due to the presence of incommensurate spin fluctuations, as FIG. 6. Upper panel: Static spin susceptibility χS z Sz (q) of strontium ruthenate Sr2RuO4 at T = 464 K along the pseudo-tetragonal high-symmetry path (Γ-X-M -Γ).Results for Nν = 5, 10, 20, and 40 fermionic frequencies are shown for the DMFT+BSE method (dashed lines) and our dual BSE method (DMFT+DBSE) (solid lines).Lower panel: Convergence close to the incommensurate peak at q = (1/3, 1/3, 0) as a function of 1/Nν for 3 ≤ Nν ≤ 40.Extrapolations to the infinite frequency limit Nν → ∞ are also shown (lines), using the N −3 ν scaling (solid line) for the DMFT+DBSE results (circles) and the N −1 ν scaling (dashed line) for the DMFT+BSE results (squares).As an independent reference also the zerofield extrapolated self-consistent DMFT result from Ref. 35 is shown (magenta diamond marker).
seen in INS experiments [48,49] and reproduced by DMFT+BSE calculations [35,50,51].The combination of orbital and momentum structure is responsible for the observed phenomena, with both strong local fluctuations and sharp features in momentum space.This makes Sr 2 RuO 4 suitable as a benchmark for our dual Bethe-Salpeter equation implementation.Here, we focus on the frequency convergence of the implementation, the physics of the spin susceptibility in this material is discussed at length in Ref. 35.
Following Ref. 35, we construct a low-energy model of Sr 2 RuO 4 from the three bands crossing the Fermi level with Ru-4d t 2g symmetry.The corresponding Wannier model was fit using Wannier90 [52] on a 10 3 grid to the density functional theory band structure calculated with GPAW [53,54], using the PBE exchange correlation functional [55], with a 20 3 k-point grid, 600 eV plane wave cutoff, and 25 meV Methfessel-Paxton smearing [56], using the experimental crystal structure of Sr 2 RuO 4 at 100 K [57].The interaction in the low energy three band t 2g model was modeled using the rotationally invariant Kanamori interaction, with Hubbard U = 2.3 eV and Hund's J = 0.4 eV and the self-consistent dynamical mean field calculations were performed at T = 464 K using TRIQS [31] and TRIQS/cthyb [43] on a 16 3 k-point grid.The two-particle quantities were measured using W2Dynamics [58] worm sampling [59] using 4 • 10 9 samples and a total compute budget of 110'000 core hours on the EuroHPC Joint Undertaking system LUMI.The static spin-susceptibility χ SzSz (q, ω = 0) was finally computed on the level of DMFT+BSE and DMFT+DBSE using the Two-Particle Response Function toolbox TPRF [30] using a 48 3 k-point grid.
Figure 6 shows the resulting susceptibility along a high-symmetry path through the Brillouin Zone.Results are shown for both the traditional implementation of the Bethe-Salpeter equation (BSE) and our dual implementation (DBSE).In both cases, relatively small frequency boxes are used (N ν = 5, 10, 20, 40).For the conventional BSE, the slow N −1 ν convergence is observed, and extrapolation is needed to obtain qualitatively correct results.On the other hand, the DBSE yields qualitative results already for the smallest box size of N ν = 5, which is a truly remarkable improvement.Enlarging the frequency box, there is a small enhancement of the momentumdependence, but no overall qualitative change.This is in stark contrast with the conventional BSE, where enlarging the fermionic frequency box increases the susceptibility across the entire Brillouin Zone.The origin of this difference is clear: local spin fluctuations are the major contribution to χ(q) and the dual Bethe-Salpeter equation includes all of these in one go via the impurity susceptibility, independent of the size of the fermionic frequency box.
For the computational efficiency of the approach, the cost of obtaining the two-particle correlation functions of the impurity model is also important.For this particular case, the worm sampling [59] used here (as implemented in w2dynamics) is orders of magnitude more efficient than the partition function sampling used in Ref. [35].Together, the improved convergence of the dual Bethe-Salpeter equation and the efficiency of the worm sampling make the calculation of DMFT lattice susceptibilities in multi-orbital models a routine task instead of a gargantuan undertaking.

C. Three-band Hubbard model for strontium vanadate
Strontium vanadate (SrVO 3 ) is another famous correlated material [60,61], with potential application as a transparent conductor [62].Galler et al. [29,63] have previously investigated the lattice susceptibility of SrVO 3 and provided a tight-binding Hamiltonian and reference data for the DMFT susceptibility [29], which makes it a useful test case for our current implementation.We have These calculations are performed with the Hamiltonian of Galler et al. [29], where the susceptibility was also calculated using the BSE (cf.their Fig.B 19), although their code is also capable of using the DBSE.Lower panel: fermionic frequency convergence of the BSE and DBSE.
performed a self-consistent DMFT calculation based on their Hamiltonian (U = 5, J = 0.75 eV, β = 10 eV, 20 × 20 × 20 k-points), calculated the necessary impurity correlation functions using w2dynamics [44] and subsequently calculated the lattice susceptibility using our implementation of the BSE and DBSE. Figure 7 shows the obtained magnetic susceptibility at ω = 0.For N ν = 30, our results obtained using the BSE are consistent with Fig. B.19 of Ref. [29], which uses the same frequency box.We also performed susceptibility calculations at several N ν ≤ 30 using our code and that of Ref. [29], and found relative deviations of approximately 2% for the magnetic susceptibility at q = 0.This is quite small given that we are comparing calculations based on independent Monte Carlo solver runs, and that the details of the implementation differ.As before, Figure 7 shows that the frequency convergence of the DBSE is so efficient that there is no visual difference between N ν = 10, 20, 30, for a parameter set where the usual BSE has difficulty to reach convergence.Our converged DBSE results are qualitatively consistent with the results in Fig. 6 of Ref. [29], where a much larger frequency box N ν = 200 was used.Thus, also for SrVO 3 , we find that the DBSE greatly improves the convergence.

V. DISCUSSION
Here, we provide an overview of the previous literature and of similarities and differences between our implementation and other recent works.
Early works showed that the locality of the self-energy in infinite dimensions also leads to simplifications on the two-particle level [64][65][66].This led to a decomposition of the susceptibility into local and nonlocal parts [20] similar to the formulas implemented here.
A new view on momentum-resolved, dynamic susceptibilities was given by the dual boson method [21] for correlated systems.It was shown [22] that the dual boson expression for the susceptibility is fully equivalent to the usual DMFT+BSE susceptibility under appropriate assumptions: the equivalence holds when the interaction is fully local, no non-local self-energy diagrams are included in the dual theory and the same impurity model is used in both approaches.This is an example of the more general phenomenon that DMFT is included in the wider set of dual approximations [45,67,68].
The dual formulation of DMFT, which makes a clear distinction between local and non-local processes, has certain advantages.Two major motivations for dual theories [45] are the faster decay with frequency and the good behavior in the strongly correlated limit.Regarding the fast decay, the anti-commutation relation {c a , c † b } = δ ab determines the (iν) −1 -coefficient of the Green's function, which is therefore one for the local, diagonal part of the Green's function and zero otherwise.For the calculation of the susceptibility, the faster decay of the Green's function directly leads to a faster decay of the non-local bubble χ0 and forms the basis of the proof of cubic convergence in Sec.III.For the second point, we have seen in Sec.III D that the non-local contribution to the susceptibility vanishes in the atomic limit, which suggests that this contribution will generically be small in strongly correlated systems.This is indeed seen in the large momentum-independent background of the magnetic susceptibility in Sr 2 RuO 4 (Fig. 6 and Ref. [35]) and SrVO 3 (e.g., Figure B.19 in Ref. [29] and our Fig.7).
A similar decomposition of the susceptibility based on local and non-local contributions was implemented in the AbinitioDΓA [29,63,69] code by Galler et al. [29].The AbinitioDΓA code is set-up to use SU(2) symmetry and a decomposition into a charge and magnetic channel, but should be able to output full rank-4 susceptibilities with minor modifications.This AbinitioDΓA code is capable of reading measured versions of the impurity susceptibility and fermion-boson vertex, and should have improved frequency convergence in that case.However, in their example calculations, the one-frequency and two-frequency correlation functions are obtained by tracing out frequencies in g (4) (ω, ν, ν ′ ), which does not lead to the improved convergence.For example, their Figure 6 and Figure B. 19 are calculated with with 200 and 30 fermionic frequencies and show clear difference, a sign that the calculation is not converged at 30.
Krien [26] has presented a further improvement of the dual boson scheme based on the polarization instead of the susceptibility.The essential point is that in the present formulation, both χ0 and the vertex corrections have cutoff errors at the 1/N 3 ν level.For the vertex corrections, these originate in processes where there is a single high-energy fermion propagator in the ladder.These attach to the vertex F , which is asymptotically constant and given by impurity processes which are reducible with respect to the bare interaction Û .By moving from the susceptibility to the polarization, this type of Û -reducible diagrams is to be excluded, so the associated Û -irreducible vertex F i decays asymptotically, instead of being constant.Thus, high-order diagrams in the Bethe-Salpeter series converge with additional powers of N ν .As a whole, Krien's formulation still depends on the cutoff as 1/N 3 ν , but only via L χ0 L. By using a larger frequency cut-off for L than for F , which is beneficial since L is typically much cheaper to calculate, one can thus reduce the impact of frequency cutoffs further.Krien's work deals with the single-orbital Hubbard model using charge and spin channels, but a generalization of his approach to generic rank-4 interaction tensors seems possible.Since the structure of this approach is similar, the basic tools provided by current implementation of the DBSE in TPRF could be used to implement Krien's formula: one needs to replace the input by their irreducible counterparts, i.e., F by F i , L by L i and X by X i = π(ω), then one calculates the polarization via a DBSE structurally similar to the current one, and finally one transforms the obtained polarization to a susceptibility.
We should note that the improved frequency scaling we obtain is due to the dual Bethe-Salpeter reformulation in combination with a simple frequency cut-off.The improved scaling originates from applying the frequency cut-off on a different set of two-particle objects, with faster decaying high frequency asymptotics, than the terms in the standard DMFT Bethe-Salpeter equation.Regarding the general treatment of the Bethe-Salpeter equation, going beyond the drastic hard frequency cutoff in Matsubara space, there are several interesting approaches that incorporate the asymptotic frequency dependence in different ways [14][15][16][17][18][19] or that use a compact representation of the generic two frequency behavior [70,71].Generally, the dual Bethe-Salpeter equation would benefit from these refined approaches, further improving the convergence.

VI. CONCLUSION
We have presented an efficient implementation of the DMFT susceptibility for general multi-orbital systems based on the dual boson formalism.This algorithm re-duces the scaling with the inverse number of frequencies from linear to cubic, leading to a dramatic speed-up of the calculation.A comparison with linear response shows that convergence with respect to the frequency box is much easier to achieve, leading to several digits of accuracy in the resulting static susceptibility in our example, even before extrapolation with respect to the size of the frequency box.
The implementation of the dual Bethe-Salpeter equation is based on the TRIQS library [31] (in particular TPRF [30]) and can in principle be used together with any impurity solver that provides the required two-particle correlation functions.An interface for the w2dynamics solver is provided with the implementation.We have found the worm sampling of two-particle correlation functions in w2dynamics to be very efficient for the susceptibilities studied here.Compared to the usual Bethe-Salpeter equation, one needs to compute two additional impurity correlation functions with zero and one fermionic frequencies, respectively.These are generally similar in computational cost to the vertex with two fermionic frequencies that is needed in both versions of the Bethe-Salpeter equation.However, Monte Carlo sampling issues have been observed for situations where the observables in the susceptibility do not commute with the Hamiltonian [72].The difficulties associated with the solution of the impurity model remain the main bottleneck in the calculation of susceptibilities, and the more efficient convergence of the Bethe-Salpeter equation reduces the requirements on that bottleneck.This efficient implementation of the Bethe-Salpeter equation can be useful for the study of magnetic [35,73] and other susceptibilities [74,75] in systems with multiple correlated orbitals in the unit cell.In particular, the better scaling with respect to the number of fermionic Matsubara frequencies makes it possible to reach lower temperatures at a given computational cost.
In numerical comparisons, we have found that the actual output data of both solvers are related as   which differs from Eq. (A2) by a factor β2 .This is the relation used in our implementation.It should be noted that factors of β are not always written explicitly in parts of the literature, which can make it harder to translate equations from one notation to another.

FIG. 5 .
FIG. 5. Static magnetic susceptibility χ(0, 0) (q = 0, ω = 0) in the square lattice Hubbard model as a function of the number of frequencies Nν .The dual BSE result (DBSE) (orange) is compared with the standard BSE (blue) and the linear response result from self-consistent DMFT in an applied field extrapolated to zero field (DMFT+Field) (green).Panel (a): Infinite frequency extrapolation (open markers) of calculations for several values of Nν (filled markers) with 1/N 3 ν frequency scaling on the x-axis showing the DBSE asymptotic 1/N 3 ν scaling.Panel (b): Same as panel (a) but with 1/Nν scaling on the x-axis showing the BSE asymptotic 1/Nν scaling.Panel (c): Magnetization M curve in applied magnetic field B from self-consistent DMFT, used for zero-field extrapolation (green).

FIG. 7 .
FIG.7.Magnetic susceptibility of SrVO3.Upper panel: Susceptibility along a path between high-symmetry points, the solid lines are the DBSE and the dashed lines are the BSE.These calculations are performed with the Hamiltonian of Galler et al.[29], where the susceptibility was also calculated using the BSE (cf.their Fig.B 19), although their code is also capable of using the DBSE.Lower panel: fermionic frequency convergence of the BSE and DBSE.