Quantum Monte Carlo simulation of spin-boson models using wormhole updates

We present an exact quantum Monte Carlo method for spin systems coupled to dissipative bosonic baths which makes use of nonlocal wormhole updates to simulate the retarded spin-flip interactions originating from an off-diagonal spin-boson coupling. The method is closely related to the stochastic series expansion and extends the scope of the global directed-loop updates to nonlocal moves through a world-line configuration. We test our method for the U(1)-symmetric two-bath spin-boson model, where the off-diagonal components of a spin-$1/2$ particle are coupled to identical independent baths with power-law spectra, and get a precise estimate of the critical coupling between the critical and the localized phase. Our method applies to impurity systems and lattice models in any spatial dimension coupled to bosonic modes with arbitrary spectral distributions.


I. INTRODUCTION
The development of global updating schemes for Monte Carlo methods [1][2][3][4] has enabled high-precision numerical studies of critical phenomena that appear in a wide range of strongly correlated physics. For quantum systems, the worm and directed-loop algorithms [5][6][7] have become the state of the art to simulate either bosonic or spin models, respectively. While these methods stochastically sample the perturbation expansion of the partition function, the worm/directed-loop updates change the world-line configurations along a closed loop that is constructed in an extended configuration space. The computational cost to construct these loops scales only linearly in system size and inverse temperature and therefore allows for efficient simulations in any spatial dimension, as long as the sign problem is absent. Consequently, it is highly desirable to extend these algorithms to new classes of models which cannot be simulated efficiently otherwise.
A major challenge in computational many-particle theory is the simulation of quantum systems coupled to local bosonic modes that do not obey particle-number conservation. Among the most studied examples are lattice models with a local coupling to phonons. Matrix product state (MPS) based approaches require large bond dimensions to deal with the unbound bosonic Hilbert space of each mode, making them less efficient than for purely electronic or spin models; however, optimized basis sets can lead to notable improvements [8]. Quantum Monte Carlo (QMC) simulations often suffer from long autocorrelation times when the bosons are sampled in first quantization [9]. The absence of particle-number conservation in their second-quantized form so far inhibited a simple and efficient formulation of the worm/directedloop updates (for exceptions see Refs. 10 and 11), leaving inefficient local updates as the only available option to simulate the bosons [12,13]. The difficulties of direct boson sampling can be avoided if the Hamiltonian is quadratic in the bosonic fields. Then, the bosons can be integrated out exactly using the path integral and we obtain a retarded interaction in the system's degrees of freedom. A recent generalization of the directed-loop algorithm to retarded interactions [14] has been shown to overcome the autocorrelation problem and was successfully applied to phonon-coupled systems in one [15,16] and two dimensions [17].
The coupling to bosonic modes plays an important role in many areas of quantum physics, e.g., in solid-state systems bosonic excitations appear in the form of quasiparticles like phonons or magnons, trapped-ion quantum simulators rely on long-range interactions mediated by the vibrational modes of the ions [18,19], and in quantum optics the atoms interact with the quantized electromagnetic field [20,21]. A central problem is the dissipation of a quantum system when connected to a bosonic bath with an infinite number of degrees of freedom [22]. One of its simplest realizations is the spin-boson model : a twolevel system that is coupled to a continuum of bosonic modes. In this model, the competition between bathinduced localization and delocalization due to an applied magnetic field leads to a quantum phase transition whose critical properties had only been resolved with the development of an efficient cluster QMC method that simulates the coupling to the bath in terms of a diagonal retarded interaction [23]. A variational MPS approach revealed that the coupling of two noncommuting spin components to independent baths can lead to even more complex phase diagrams [24] including a critical phase [25,26] that emerges due to frustration effects in the decoherence mechanism [27]; however, generalizations of the MPS approach to multiple baths are limited by the large dimensions of local bosonic Hilbert spaces. Lattice models with onsite dissipation have been simulated using classical Monte Carlo methods for long-range interactions [28,29]; full QMC simulations have only been applied to diagonal spin-boson couplings using the worm algorithm [30]. So far, we are still lacking efficient QMC updating schemes for more generic spin-boson interactions that work on an equal footing for impurity and lattice models.
Quantum impurity models are at the core of dynamical mean-field theory where the coupling to an infinite bath mimics the interaction effects with neighboring particles on a lattice [31]. The need for efficient impurity solvers has motivated the development of continuous-time QMC methods for fermions where the coupling to the bath is treated in terms of a retarded interaction [32][33][34]. The most prominent of these methods is based on a hybridization expansion and samples fermionic world-line configurations [33]. The method avoids the negative sign problem by combining the fermionic bath propagators into a determinant which only allows for local Monte Carlo updates and eventually leads to a cubic scaling in inverse temperature. Here, worm sampling is only used to obtain improved estimators for the Green's function [35]. The method has been extended to include bosonic baths [36] to solve bosonic impurity problems [37], spinboson models [38], or Bose-Fermi Kondo models [38][39][40][41], but the Monte Carlo sampling remained restricted to local updates, even in the absence of the fermionic determinant in pure spin-boson models [38,41]. The selfconsistent solution of spin-boson impurity models, e.g., plays an important role for spin glasses [42][43][44][45] and extensions of dynamical mean-field theory to spin systems [46]. Bosonic modes have also been included in the interactionexpansion QMC method which can be applied to fermionboson models on finite lattices [47][48][49].
In this paper, we introduce a continuous-time QMC method for dissipative spin models that combines the advantages of impurity solvers with the global worm/directed-loop updates developed for lattice models. Our method is based on the stochastic series expansion (SSE) [50] with global directed-loop updates [7] and their recent generalization to retarded interactions [14]. In contrast to phonon-coupled systems, the spin-boson models considered below do not conserve the totalŜ z quantum number, which inhibits the local construction of the worm/directed-loop updates. To overcome this difficulty, we introduce the new wormhole update: the two operators of the nonlocal interaction in imaginary time can act as sources and sinks for the worm/directed loop and therefore allow for nonlocal moves through a world-line configuration. It turns out that the rules for constructing the directed loops are equivalent to regular spin models with nearest-neighbor interactions-the lattice-site indices are just replaced with imaginary-time variableswhich allows for an easy transfer of the methodology developed in Ref. 7. The time dependence of the retarded interaction does not affect the construction of the loops because it is sampled during the diagonal updates [14]; the bath propagator can be sampled efficiently for any spectral distribution of the bath modes. In the following, we formulate our QMC method for impurity models but the wormhole updates can be trivially extended to spin-boson models on a finite lattice, as we will discuss at the end of our paper. We demonstrate the efficiency of our QMC method for the two-bath spin-boson model. Our method reaches significantly lower temperatures than previous QMC approaches [38,41], which allows us to get a precise estimate of the quantum critical coupling between the critical and the localized phase that is in excellent agreement with previous results of an MPS based approach [24]. In future, our QMC method will enable efficient simulations of a variety of quantum impurity and lattice models coupled to bosonic modes like photons, phonons, magnons, etc., as they appear, e.g., in quantum optics, trapped-ion simulators, or solidstate systems.
Our paper is organized as follows. In Sec. II, we define the spin-boson models, in Sec. III, we show how retarded interactions can be derived from the interaction picture, in Sec. IV, we introduce our QMC method, in Sec. V, we present our results, in Sec. VI, we discuss extensions of our method, and in Sec. VII, we conclude.

II. SPIN-BOSON MODELS
We consider a generic spin-boson model of the form that can be split into a spin partĤ s , a bosonic partĤ b , and a spin-boson interactionĤ sb . For now,Ĥ s describes a local spin-1 2 degree of freedomŜ = 1 2σ whereσ , ∈ {x, y, z}, are the Pauli matrices (we set ̵ h = 1). The bosonic bath is given by a sum of harmonic oscillators, whereâ † µ (â µ ) creates (annihilates) a boson in a state µ⟩ with frequency ω µ . Below, the superindex µ will refer to a continuum of oscillators and different components of the bath modes, but it could also label the lattice sites of a finite system. The spin-boson interaction couples the bath to a spin operatorˆ µ [Ŝ α ] that is model dependent and also contains the spin-boson coupling constant. In the following, we define two types of spin-boson models that fit into the generic form.
The original spin-boson model is given by a two-level system which is coupled to a bosonic bath via the z component of the spin, i.e., The magnetic field h x induces a finite tunneling between the two levels, whereas the coupling to a continuous bath with a mode-dependent coupling constant γ q will localize the spin. The spin-boson model has been generalized to an interaction with up to three baths, i.e., where each bath couples to a different spin component . We set γ qx = γ qy and refer to this model as the XXZ spinboson model due to its similarity to the XXZ model (this will become apparent when we introduce the retarded interactions further below). For γ qz = 0 we recover the two-bath spin-boson model and for γ qx = γ qy = γ qz the Bose Kondo model. In the absence of the magnetic field, the former model has a U(1) symmetry whereas the latter is SU(2) symmetric. Further details on the U(1)-symmetric case will be discussed in Sec. V. A review of impurity properties can be found in Ref. 51. Another Hamiltonian that is often studied in quantum optics is the Jaynes-Cummings model [52] where the quantized electromagnetic field interacts with a two-level system via the spin-flip operatorsŜ ± =Ŝ x ± iŜ y . We consider a coupling to a continuum of bath modes. The physical properties of the impurity are fully determined by the spectral functions for each bath, We assume that J (ω) has a power-law form with exponent s, where s = 1 corresponds to an ohmic bath and 0 < s < 1 to the sub-ohmic regime. Here, α is the spin-bath coupling constant which we use in the following and we choose the frequency cutoff ω c = 1 as the unit of energy.

III. INTERACTION PICTURE AND RETARDED INTERACTIONS
The spin-boson models introduced in the previous section are quadratic in the bosonic operators, such that the bath can be integrated out exactly. Typically, we use the path-integral formalism to derive a retarded interaction in the system's degrees of freedom. To avoid the intricacies of the spin path integral, we will show how retarded interactions arise from a perturbation expansion in the interaction picture. In this way, we will see how the difficulties of previous QMC approaches are overcome that directly simulate the bosons. Our formalism is very similar to the hybridization expansion used for fermionic impurity models [33,34].
We split the Hamiltonian of the full system,Ĥ =Ĥ 0 + V , into the unperturbed partĤ 0 and the perturbation V . The Dyson expansion of the partition function reads where all operatorsV (τ ) = e τĤ0V e −τĤ0 are ordered according to their imaginary-time variable, i.e., β ≥ τ 1 ≥ τ 2 ≥ ⋅ ⋅ ⋅ ≥ τ m ≥ 0 (β = 1 T is the inverse temperature; we set k B = 1). For our derivation, it is convenient to introduce the time-ordering operatorT τ and rewrite Eq. (9) in the form For the generic spin-boson model in Eq. (1), we identifŷ similar to the hybridization expansion used for fermionic impurity models [34]. For our derivation of the retarded spin interaction, we setĤ s = 0 to simplify the notation. Later, we can includeĤ s inV without loss of generality.
To distinguish between regular and adjoint operators in Eq. (11), we introduced the superscript c and its oppositē c. With this, the interaction expansion becomes The trace splits into separate contributions for spins and bosons. Because the boson particle-number is conserved, only gives a nonzero contribution if m = 2n and the number of creation and annihilation operators is the same. The bosonic trace can be further simplified using Wick's theorem. For example, consider the expectation value with operators ordered as follows: Here, S n is the symmetric group of order n and π ∈ S n is a permutation of n objects. We introduced the free-boson and fulfills D(ω, τ + β) = D(ω, τ ). The left-hand side of Eq. (13) represents only one possible combination of choosing {c 1 , . . . , c n } such that we obtain the same number of creation and annihilation operators. In total, there are 2n n combinations. Inserting Eq. (13) into Eq. (12), we obtain (we define τ ′ k = τ n+k ) The sum over all permutations can be evaluated by relabeling the variables, which gives another factor of n! in Eq. (15). Eventually, the perturbation expansion can be recast as After reexponentiation of the resulting interaction term in Eq. (16), the partition function becomes As a result, the spin-boson coupling has generated a retarded spin-spin interaction that is mediated by the free-boson propagator (14). Note that for our choice ofĤ 0 the time evolution of the spins is trivial becauseĤ 0 does not include any spin operators. However, we still need the τ labels for the time ordering of the spins in the perturbation expansion as well as for the time dependence of the boson propagators. Finally, we want to specify the retarded spin interactions for the spin-boson models introduced in the previous section. For the XXZ spin-boson model we get whereas the Jaynes-Cummings Hamiltonian leads tô Here, we have already taken the continuum limit for the bosonic bath. We explicitly see that the spectral function defined in Eq. (7) completely determines the effect of the bath on the spin subsystem. For the power-law spectrum in Eq. (8), the frequency average of D(ω, τ ) induces a long-range interaction in imaginary time that vanishes as 1 τ s+1 for ω c τ ≫ 1. Because the retarded interaction for the XXZ spin-boson model is symmetric under τ ↔ τ ′ , we have replaced D(ω, τ ) with the symmetrized propagator

IV. QUANTUM MONTE CARLO METHOD
A. Basic ideas of the Monte Carlo method Before we describe the mathematical formalism of our QMC method, we first want to introduce the main ideas that underlie our approach. We use the framework of the directed-loop algorithm [7] which was originally formulated in the SSE representation [50] and recently generalized to retarded interactions [14]. We have shown in the previous section that a generic spin-boson model of the form of Eq. (1) can be mapped to a spin problem with a retarded interaction. In this way, we can avoid the difficulties of direct boson sampling which often occurs when the particle number is not conserved. As the resulting retarded spin interaction in Eq. (18) does not conserve the totalŜ z , either, we will introduce the nonlocal wormhole update to provide an efficient sampling using the directed-loop algorithm.
As a starting point for the formulation of our method, we consider the partition function in Eq. (17), where the bosonic bath has been integrated out to obtain a retarded spin interaction. Following Ref. 14, we expand the time-ordered exponential in the full interactionĤ ret , i.e., we perform an interaction expansion around the trivial spin partĤ 0 = 0, to obtain Eq. (16). Such an expansion in the full Hamiltonian is the characteristic feature that leads to the SSE representation, even if we start from the interaction representation. Then, the time evolution of the spin operatorsŜ (τ ) becomes trivial and their τ labels are only required to perform the time ordering. For equal-time (i.e., time-independent) Hamiltonians, the imaginary-time integrals in the resulting expansion can be calculated exactly, leading to an exact mapping to the SSE representation [10,12], which does not have an explicit time dependence. In the presence of retarded interactions, which is the focus of this paper, imaginary-time variables have to be sampled explicitly in our Monte Carlo scheme to take the time-dependence of the boson propagator correctly into account. It was shown in Ref. 14 that the exact sampling of the continuous times only enters during the diagonal updates (as discussed below), so that the remaining parts of the directed-loop algorithm can be formulated as in the SSE representation.
We first want to discuss how the trace over the spin operators is related to the original SSE formulation. The time-ordered product of spin operators corresponds to the operator sequence in the SSE representation. From our derivation in Sec. III, it is clear that the time ordering does not lead to any negative signs. The trace over the spins, Tr s • = ∑ α ⟨α • α⟩, runs over the initial state α⟩ which is usually chosen in theŜ z basis. We as- sume that the operators inĤ ret are non-branching, i.e., cp µp (τ p ) α p ⟩ ∼ α p−1 ⟩ uniquely propagates a basis state to another basis state; here we define the time ordering as in Eq. (9) (with a decreasing propagation index) and set α 0 ⟩ = α 2n ⟩ = α⟩. The imaginary-time evolution of the initial state can be visualized using a world-line picture (note that a world line is defined by the time dependence of the propagated spin state and not by the operators). Because the propagated state α p ⟩ only changes when an operator is applied, it is sufficient to find a graphical representation of the interaction vertex. The possible vertex types forĤ ret are illustrated in Fig. 1. Each vertex consists of two operators, the boson propagator, and the spin states before and after the operators are applied, which are represented by two black bars, a dashed line, and four legs, respectively. For our spin-1 2 models, the legs are illustrated as filled (open) circles and correspond toŜ z eigenstates with eigenvalue 1 2 (−1 2). In total, there are six vertex types for the retarded interactions in Eqs. (19) and (20): four diagonal vertices which leave the world-line configuration unchanged and two off-diagonal vertices where the spin states are flipped. The latter correspond toŜ − (τ )Ŝ + (τ ′ ) andŜ + (τ )Ŝ − (τ ′ ). A typical world-line configuration that includes both types of vertices is shown in Fig. 2(a). For a more formal definition of the interaction vertices and the configuration space of our QMC method see Sec. IV B.
We use Markov chain Monte Carlo sampling to evaluate the sum over all world-line configurations stochastically. To formulate an ergodic algorithm, we need two types of updates: the diagonal updates and the directedloop updates.
The main purpose of the diagonal updates is to change the average expansion order, to replace existing vertices with new ones, and-in addition to the original SSE method-to sample the imaginary-time variables. Because diagonal operators do not change the world-line configuration, we propose to add or remove diagonal vertices using the Metropolis scheme defined in Sec. IV C. An example for the addition or the removal of such a ver- The efficiency of our Monte Carlo method relies on the directed-loop updates [7], a global updating scheme that changes an extensive subset of world-line segments along a closed loop. The loop is constructed as follows: First, we pick a random starting time to insert a pair of spinflip operators. One of these operators is identified as the head of the loop that can move through imaginary time as a world-line discontinuity, whereas the other operator becomes the tail of the loop that remains fixed at the starting time. Because the time evolution for the spin operators is trivial (Ĥ 0 does not contain any spin terms), the head will proceed until it reaches the leg of an operator that is part of a vertex. In the conventional directedloop update, we would choose an exit leg of a local vertex with a probability determined by the directed-loop equations, a local version of detailed balance. It turns out that this is still true for the nonlocal vertex of the retarded interaction. As a result, the loop head entering the vertex in Fig. 2(c) via the left leg of the left operator can either bounce and reverse its direction of propagation, continue straight through the left operator, or exit at the right operator. In Fig. 2(d), we selected the last option which we call a wormhole update because the loop head uses the connection established by the free-boson propagator as a wormhole to instantly tunnel between well-separated points in imaginary time-the name is also chosen in analogy to the worm algorithm where the loop is called a worm [5]. Once the loop exits the leg of a vertex, it will propagate to the entrance leg of the next vertex and the same procedure will apply again until the loop head returns to its tail, as depicted in Fig. 2(e). When the loop is closed, we flip all spin configurations along the loop, as shown in Fig. 2(f). Thereby, diagonal vertices can be transformed into off-diagonal ones and vice versa. Because the directed loop is able to touch an extensive number of vertices, it represents a global update move. The computational effort to construct the loop scales linearly in the number of touched vertices. Further details on the construction of the directed loops are given in Sec. IV D.
With the diagonal and directed-loop updates described above, our QMC algorithm samples the entire diagrammatic expansion of diagonal and off-diagonal operators and therefore fulfills ergodicity. The diagonal updates sample the expansion order n by adding and removing diagonal vertices at arbitrary imaginary times, whereas the directed-loop updates transform diagonal operators into off-diagonal ones and vice versa. The latter transformation reaches all vertex types with a nonzero weight, as described in the original formulation of the directed-loop updates [7].
Altogether, we combine the framework of retarded interactions-that is commonly used for impurity solvers to avoid the difficulties of direct boson sampling [34]with the global directed-loop updates originally developed for lattice models [7]. The local construction of the wormhole updates is possible for bosonic baths with positive-definite propagators, whereas fermionic baths require the numerically-expensive calculation of a determinant to avoid the negative-sign problem [33]. The formulation of our algorithm is closely related to the standard formulation of the directed-loop algorithm [7]. However, the nonlocality in imaginary time requires some changes in the implementation which we will discuss in the Appendix. Although we have developed the wormhole updates in the framework of the directed-loop algorithm, their underlying ideas can be transferred to related QMC methods based on worm/directed-loop updates.

B. Configuration space and vertex weights
A Monte Carlo configuration C = {n, C n , α⟩} is defined by the expansion order n, the ordered vertex list C n = {ν 1 , . . . , ν n }, and the initial state α⟩ in the localŜ z basis. Each vertex variable ν = {t int , a, ω, τ, τ ′ } includes a frequency ω and two imaginary-time values τ and τ ′ .
The operator type a distinguishes between diagonal and off-diagonal parts of the vertex, whereas the interaction type t int labels different kinds of interactions-the latter is only relevant if we combine the spin-boson models with additional interactions, as discussed in Sec. VI. The interaction vertex for our spin-boson models is given bŷ Here, we have rescaled the free-boson propagator and the spectral density, such that the former becomes a probability distribution in τ and the latter in ω. For the power-law spectrum in Eq. (8), we obtain J (ω) = s ω −s c ω s−1 . The normalization factor for the spectral density can be absorbed into the operator partĥ a (τ, τ ′ ).
Our Monte Carlo method samples the perturbation expansion of the partition function, Z = Z b ∑ C W (C), as in Eq. (16). The total weight of a configuration is given by is the weight of a single vertex. The expectation value of the operator part W [ĥ a (τ, τ ′ )] is fully determined by the weight W v of one of the six vertex types v in Fig. 1.
For the XXZ spin-boson model, the off-diagonal operator part is given bŷ whereas the diagonal part becomeŝ Here, we have defined the couplings λ = 2α ω c s which include the normalization of the spectral function. Note that we have dropped unit operators at times τ and τ ′ , e.g., C corresponds to C1(τ )1(τ ′ ). We can include the magnetic field h z in the retarded interaction because ∫ dω J (ω) ∫ dτ ′ P (ω, τ − τ ′ ) = 1. We have added a constant C toĥ 2 (τ, τ ′ ) to obtain positive weights. For the different vertex types v defined in Fig. 1, the weights are given by We choose C ≥ max λz 4 , hz 2 − λz 4 to get positive weights. The weights W v for the XXZ spin-boson model are equivalent to the ones for the ferromagnetic XXZ model with a nearest-neighbor spin exchange [7].
For the Jaynes-Cummings model, we havê The vertex weights are similar to the ones in Eq. (27) but λ z = 0 and W 5 = 0.

C. Diagonal updates
The diagonal updates involve adding or removing a single vertexĥ 2 (τ, τ ′ ) using the Metropolis-Hastings algorithm. The Metropolis acceptance probability is determined by the acceptance ratio which depends on the Monte Carlo weight W (C) defined in Eq. (23) and the proposal probabilities T 0 (C → C ′ ) specified in the following.
We propose the addition of a new vertex with probability density where only the first time variable is chosen randomly, but the frequency and the second time variable are sampled from J (ω) and P (ω, τ − τ ′ ), as explained below. Note that there are n + 1 possibilities to insert the new vertex into the ordered list C n+1 . We included an additional probability p t int to pick an interaction type t int from a set of interaction terms, as further discussed in Sec. VI; for now, p t int = 1. For the removal of a randomly chosen diagonal vertex, we get where n 2 is the number of all diagonal vertices in C n . With the ratio of the Monte Carlo weights in Eq. (23), for the addition and removal of a vertex, respectively. Because we included J (ω) and P (ω, τ − τ ′ ) in the proposal probabilities, they drop out of the acceptance rates. In this way, we ensure high acceptance probabilities for all parameters.

D. Directed-loop and wormhole updates
We have introduced the ideas behind the directed-loop updates and their extension to wormhole updates already in Sec. IV A. In the following, we want to elaborate on their mathematical formulation.
The directed-loop updates use an extended configuration space to connect two regular Monte Carlo configurations that require an extensive number of local changes. they drop out of the directed-loop equations [14]. Hence, the latter can be formulated for the vertex weights W v as for the original method [7]. In the extended configuration space, each vertex is assigned an entrance and exit leg for the directed loop and the corresponding weights become W v (l in , l out ). The directed-loop equations, correspond to a local version of detailed balance and the conservation of probability. Here, the vertex typev is related to v by flipping the spins along the assigned loop segment. If the directed loop enters a vertex at leg l 1 , W v (l 1 , l 2 ) W v gives the probability to exit at leg l 2 . Before we explain how to solve the directed-loop equations, we want to take another look at the vertex structure depicted in Fig. 1. As for regular spin models on a finite lattice (e.g., Heisenberg or XXZ models), each vertex has four legs. We have argued before that the nonlocality of our vertex does not play a role in the directed-loop equations. We find that the vertices in Fig. 1 map to the ones for regular spin models with nearest-neighbor interactions if we take the subvertex at τ ′ and place it to the right of the subvertex at τ [as illustrated in Fig. 3(a)]. Hence, we can use the same techniques as for regular spin models to solve the directed-loop equations. Moreover, the vertex weights for the XXZ spin-boson model in Eq. (27) are equivalent to the ones for the ferromagnetic XXZ spin model on a finite lattice. As a result, the directed-loop equations have the same solution for both cases. In particular, loops can be constructed deterministically in the SU(2) symmetric case [6].
For reasons of completion, we give a short outline on how to solve the directed-loop equations analytically for the spin-boson models considered in this paper. In more general situations, the directed-loop equations can be solved using linear programming techniques [53]. Con-sider the assignment table illustrated in Fig. 3(b). It is constructed in such a way that each row corresponds to Eq. (39) and the off-diagonal elements are related by Eq. (38). The bounce probabilities are given on the diagonal. For the assignments chosen in Fig. 3(b), we obtain the set of equations which we can solve for a, b, and c as follows: Our guiding principle is to minimize the weights b 1 , b 2 , and b 3 for the bounce moves as they reduce the efficiency of the algorithm. If we plug in the weights for the XXZ spin-boson model, we get an explicit solution, We can always obtain positive solutions if we adjust the constant shift C and one of the bounce weights b 1 or b 2 .
In the absence of an external magnetic field h z , we can obtain bounce-free solutions for λ z ≤ λ xy . For the SU(2) symmetric case, i.e., λ z = λ xy , we can choose C = λ z 4 to construct loops where the entrance leg exactly determines the exit leg, as it is also the case for the Heisenberg model [6]. The directed-loop equations have to be solved for all possible assignment tables. Similar solutions can be derived for the Jaynes-Cummings interaction.

E. Observables
The calculation of observables is equivalent to standard world-line QMC methods. In the following, we only give a short overview over the most common ones.
The properties of the spin system can be accessed from the imaginary-time correlation functions and the corresponding susceptibilities Here, Ω m = 2πm β, m ∈ Z, are the bosonic Matsubara frequencies. We also define the static susceptibilities χ = χ (iΩ 0 ). The z components of these observables do not change the world-line configurations and can be obtained directly from the propagated state α p ⟩, whereas the off-diagonal components can be accessed by tracking the propagation of the directed loop during the directedloop updates. The latter is possible because the loop head and tail are identified with spin-flip operatorsdepending on the model, it might be more convenient to identify them either withσ x operators or withŜ + and S − operators. In contrast to the SSE representation, we have direct access to the imaginary-time values of each vertex which simplifies the calculation of time-displaced observables for both cases. For a detailed discussion of the diagonal and off-diagonal measurements see Refs. 12 and 54. The properties of the bath cannot be accessed directly from the world-line configurations because the bosons have been integrated out. However, bosonic observables can be derived from higher-order spin-spin correlation functions with the help of generating functionals and eventually be recovered from the distribution of vertices [55]. For example, ⟨n⟩ = −β⟨Ĥ s +Ĥ sb ⟩ relates the average expansion order to the energy of the spin subsystem. Estimators for the bosonic energy, the boson propagator, or the specific heat have been derived in Refs. 15 and 55 for a single bath frequency. They can be generalized to a continuous bath, but one has to define an additional mapping from discrete frequencies to the continuum following Refs. 56 and 57 leaving some ambiguity for the bath properties.

V. RESULTS
To demonstrate the efficiency of our QMC method, we consider the quantum phase transition between the critical and the localized phase in the two-bath spin-boson model for h = 0, s = 0.8, and as a function of the spinboson coupling α. For these parameters, the critical coupling has been determined using a variational MPS approach that is based on a Wilson chain with a logarithmic discretization for the bosonic bath [24,58]. We show that our QMC method reaches low-enough temperatures to distinguish the two phases via the spin susceptibility and get a precise estimate of the critical coupling from a finite-size analysis.
For bath exponents 0 < s < 1, the perturbative renormalization group has predicted an intermediate-coupling fixed point where partial screening of the impurity leads to a critical phase with power-law spin-spin correlations C xy (τ ) ∼ 1 τ 1−s [25,26,[59][60][61]. Numerical simulations revealed that the critical phase is only stable for s * < s < 1 [s * ≈ 0.76] and α < α c beyond which a local moment is formed that fulfills lim τ →∞ C xy (τ ) = m 2 loc > 0 [24,58]. For a detailed discussion of the phase diagram, the fixedpoint structure, and the critical properties of the two- bath spin-boson model see Ref. 58. The QMC method developed in this paper allows us to approach the two phases from finite temperatures. Figure 4(a) illustrates the emergence of a local moment in the order parameter C xy (β 2) for α > α c , whereas C xy (β 2) scales to zero for α < α c . The critical and the localized phases can also be distinguished by the low-temperature response of the static spin susceptibilities χ xy and χ z , as shown in Fig. 4(b). For α = 0 or T ω c ≫ 1, we have χ xy = χ z = 1 4T . For low-enough temperatures and deep in the localized phase, χ xy approaches a Curie law, χ xy = m 2 loc T , with a finite local moment m loc . When entering the critical phase, χ xy clearly deviates from the Curie behavior and slowly converges to χ xy ∼ T −s ; we do not show χ xy for α < 0.7 because all graphs are parallel to χ xy at α = 0.7 but intersect with the others which re- duces their distinguishability. On the other side, χ z also shows a power-law dependence with an exponent that is steadily reduced as α increases and eventually becomes zero in the localized phase. Although our QMC method reaches temperatures as low as T ω c = 10 −7 , C xy (β 2) and χ xy still experience finite-temperature effects which leads to a slow convergence towards the expected T → 0 behavior. Our QMC method is powerful enough to determine a precise estimate of the critical coupling. To this end, Figs. 5(a) and 5(b) show the rescaled susceptibility T s χ xy and the susceptibility ratio respectively. The latter is a generalization of the correlation ratio and its inverse can be related to a finitesize estimator of the correlation length in imaginary time [41]. For T → 0, both observables remain finite within the critical phase, whereas T s χ xy diverges and R xy goes to zero in the localized phase. We can estimate the critical coupling from a finite-size analysis of T s χ xy and R xy ; here the inverse temperature plays the role of the system size, but in the imaginary-time direction. For each pair of data sets with temperatures {T, T 10}, we extract the crossing points α * (T ) and collect them in Fig. 5(c). For low-enough temperatures, the crossings are expected to follow a power-law behavior α * (T ) = α c + A T e , from which we can estimate α c using a least-squares fit. From the crossings of T s χ xy we estimate α c = 0.76213 (6); the nonmonotonic temperature dependence of α * (T ) for R xy does not allow for a reliable fit, but the data is consistent with the estimate from T s χ xy . Consistent crossings can also be obtained from the order-parameter ratio C xy (β 2) C xy (β 4), but larger statistical errors prohibited a precise estimation of α c . Our estimate of the critical coupling is in good agreement with the previous MPS result α MPS c = 0.765058 [58]; note that the latter has a systematic error of up to 1% due to the logarithmic discretization of the bosonic bath in the Wilson chain [58]. Taking this systematic MPS error into account, our QMC estimate gives a more precise estimate of the critical coupling. In a follow-up study [62], we also determine the correlation-length exponent at this quantum phase transition, which is in excellent agreement with the MPS result.
Finally, we want to emphasize again that the wormhole updates and the efficient sampling of the boson propagator allow us to reach temperatures as low as T ω c = 10 −7 -this is several orders of magnitude lower than in previous QMC studies which only reached T ω c ≈ 10 −4 [38,41]. Based on our algorithmic developments, we are able to resolve the characteristic low-temperature behavior of the critical and the localized phases in Fig. 4, in particular the difference between χ xy ∼ T −s and χ xy ∼ T −1 , and get a precise estimate of the critical coupling in Fig. 5. A systematic study of the closely-related SU(2)symmetric spin-boson will be presented in Ref. 62. The directed-loop updates work efficiently for all parameter regimes considered in this paper, because the average length of the loops is given by χ xy and therefore diverges with the characteristic temperature scale of the problem. For the lowest temperatures and strongest couplings, we have reached expansion orders of approximately 20 million vertices.

VI. GENERALIZATIONS TO OTHER INTERACTION TYPES
We have introduced our QMC method for the XXZ spin-boson model and the Jaynes-Cummings model. In the following, we discuss how other interaction types can be included for impurity models and how the wormhole updates can be applied to lattice models.
A. Impurity models

Magnetic field in the xy plane
For the impurity models discussed before, it might be of interest to apply a magnetic field also in the xy plane. We can decompose the additional field as follows: Because the retarded spin-boson interactions in Eqs. (19) and (20) always contain a pair ofŜ + andŜ − operators, the magnetic-field terms in the perturbation expansion have to appear in pairs as well and therefore do not lead to a sign problem. We can treat the xy magnetic field as an additional interaction type. To formulate a full updating procedure, we add the diagonal term −C ∫ dτ1(τ ) to the interaction vertex. It is convenient to choose the constant prefactor equal to h xy 2 where h xy = h 2 x + h 2 y . Then, all vertex weights have the same absolute value, which simplifies the solution of the directed-loop equations. If the directed loop enters one of these vertices, we want to transform a unit operator into a spin-flip operator or vice versa which prohibits the loop head from propagating any further locally. However, we can choose a random magnetic-field vertex in our world-line configuration (including the original vertex) and continue the construction of the loop from there. The exit leg has to be chosen in such a way that it is consistent with the propagating operator and the world-line configuration. The construction of the loop is completed when the loop head returns to its original starting point. Including the magnetic field in this way has the advantage that we can measure ⟨h xŜx + h yŜy ⟩ = ⟨n xy,1 ⟩ β from the number of off-diagonal magnetic-field vertices in the perturbation expansion.

XYZ spin-boson model
Our discussion of the spin-boson model in Eq. (5) was restricted to the XXZ case. Here, we want to mention that one can also simulate the full problem with three independent couplings λ . Then, the off-diagonal vertex in Eq. (25) has to be replaced bŷ The new termsŜ + (τ )Ŝ + (τ ′ ) andŜ − (τ )Ŝ − (τ ′ ) have to be considered as additional vertex types v, i.e., v = 7 and v = 8. With these vertex types present, the directed loop can exit all four legs of a vertex and the solution of the directed-loop equations becomes a bit more complicated than before. As long as we do not apply an additional field in the xy plane, the second term in Eq. (47) will not lead to negative weights because it always has to appear in pairs in the perturbation expansion.

Original spin-boson model
For completeness, we also want to mention how to simulate the original spin-boson model in Eq. (4). Because the bosonic bath only couples to the z component of the spin, we cannot immediately apply the wormhole updates as defined above. To obtain a retarded spin-flip interaction, we can formulate our QMC method in theŜ x basis. This corresponds to a rotation of the spin operators using the Hadamard matrix (which exchangesŜ x ↔Ŝ z in the absence ofŜ y ). The interaction part becomeŝ and leads to positive Monte Carlo weights for C ≥ h x 2. The off-diagonal term now contains two additional vertex types. As a result, the directed loops can exit all of the four vertex legs and the directed-loop equations can be solved using linear-programming techniques [53].
For the spin-boson model, a cluster algorithm had already been formulated in Ref. 23 in terms of a retarded interaction betweenŜ z operators. The spin-boson model has close similarities with the long-range Ising model in a transverse magnetic field which can be simulated efficiently in the SSE representation [63]. It might also be possible to extend this method to long-range interactions in imaginary time.

B. Lattice models
We have developed our QMC method for a single spin coupled to a dissipative bosonic bath, but the wormhole updates trivially extend to lattice models. In the simplest case, each lattice site is coupled to an independent bath, so that the interaction vertex only gets an additional site index i. The details of our method stay exactly the same, we only have to draw a random i ∈ [1, L] when proposing the addition of a diagonal vertex, which leads to an extra factor of 1 L in Eq. (32). To couple the spins at different lattice sites, we can include any interaction vertex that can be simulated in the SSE representation without a sign problem, e.g., an XXZ spinexchange coupling between nearest neighbors. The additional vertexĤ transfers into the interaction picture aŝ H = ∫ dτĤ(τ ). The spin operators obtain dummy time variables that are sampled during the diagonal updates to determine their position in the world-line configuration. The diagonal updates have to be formulated using the Metropolis scheme in Sec. IV C, whereas the probability tables for the directed-loop updates stay the same. Note that the vertex structure need not be the same for different interaction types, as long as the loop updates can be formulated for each type individually. Eventually, the wormhole updates introduced in this paper allow us to study the effects of dissipation on a variety of lattice models. While the worm algorithm can be applied when a diagonal operator couples to a bosonic bath [30], our formulation with directed loops allows us to simulate more generic couplings that also include offdiagonal terms. Results for a Heisenberg chain coupled to an ohmic bath will be presented elsewhere [64].
We have introduced the wormhole updates for bosonmediated interactions that are local in space but nonlocal in imaginary time. In general, the wormhole updates can be applied to long-range interactions in space or time, as long as the Monte Carlo weights are positive. Examples arise in quantum optics where global bosonic modes couple to the spins at different sites, e.g., in the Dicke model [65] or in trapped-ion simulators [18]-for the latter, the absence of a sign problem has to be assessed from case to case. However, we expect that polariton models like the Jaynes-Cummings-Hubbard model will introduce a sign problem in our formulation because integrating out the bosonic hoppingsâ † iâj leads to a nonlocal boson propagator that has negative contributions for i ≠ j [66]; in that case, it is better to simulate the bosons directly [67,68]. Moreover, we can design long-range interactions in space and time that are not related to a spin-boson Hamiltonian but can be simulated with the wormhole updates.

VII. CONCLUSIONS
We developed an exact QMC method to simulate spin systems coupled to dissipative bosonic baths in terms of retarded spin interactions. Our method makes use of the directed-loop updates which were originally formulated in the SSE representation [7] and recently generalized to retarded interactions [14]. To include the retarded spinflip interactions into the framework of the directed-loop updates, we introduced the nonlocal wormhole updates which allow the loop to tunnel between the two opera-tors of a vertex. The formulation of the directed-loop updates remains as simple as in the SSE representation [7] since the time-dependence of the retarded interaction is fully accounted for in the diagonal updates [14]. Our method applies to any spectral distribution of the bath, which can be sampled efficiently-along with the timedependence of the boson propagator-during the diagonal updates. The ideas developed in this paper are also applicable to related QMC methods like the worm algorithm [5] which make use of worm/directed-loop updates to sample world-line configurations.
We demonstrated the efficiency of our QMC method for the two-bath spin-boson model. Our method reached significantly lower temperatures than previous QMC approaches [38,41], which allowed us to identify the characteristic finite-temperature response of the spin susceptibility in the critical and the localized phase. To estimate the quantum critical coupling between these two phases, we compared the finite-size behavior of different observables. Our estimated α c is in good agreement with previous results from a variational MPS study [24,58].
In future studies, our QMC method will enable us to determine the phase diagrams of spin-boson models coupled to multiple baths [62]. Open questions include the quantum critical properties of these models and their relation to spin systems with long-range interactions in space. Spin-boson interactions also play an important role in the more complex Bose-Fermi Kondo model which is a central model for our understanding of the quantum critical behavior appearing in heavy-fermion systems [69]. For the local spin-boson couplings considered in this paper, the wormhole updates can be combined with any lattice model that can be simulated already in the SSE representation, which will allow us to study the effects of dissipation on finite systems [64]. We expect that our QMC method will be able to simulate more general lattice models coupled to bosonic modes which describe light-matter interactions in quantum optics or trappedion systems-spin-boson models can be realized, e.g., in quantum simulators [70]. In general, the wormhole updates not only apply to long-range interactions in imaginary time but also in space. Our algorithmic developments therefore open up a route to study a completely new class of models within the framework of the wellestablished directed-loop algorithm. sary data structures. In the following, we will give a quick overview over the modifications that become necessary in the interaction representation due to the nonlocality in imaginary time.
In the SSE representation, the vertices are saved in an operator string with a fixed length that is traversed sequentially during the diagonal updates; in this process, unit operators are exchanged with diagonal ones and vice versa. In our formulation, the vertices are saved in an unsorted list where each element has a structure that contains all vertex variables. For the spin-boson vertices defined in this paper, each vertex has variables ν = {t int , v, ω, τ, τ ′ }; note that it is more useful to save the vertex type v than the operator type a. Before we start a block of diagonal updates, we once go through the vertex list and set up a new list that includes all time variables at lattice site i where an off-diagonal operator is applied. After sorting this list, we can access the spin configuration at each position in the world-line configuration from the knowledge of the initial state α⟩. Although sorting algorithms require O(β log β) operations for each site and are mathematically more expensive than the directed-loop updates, the construction of the loops remained the most expensive task for all temperatures accessible to our simulations. To add a diagonal vertex, we can access the spin configurations from the sorted list, draw the vertex variables as described in Sec. IV C, calculate the Metropolis acceptance, and eventually add the vertex as element n + 1 of the vertex list. To remove a diagonal vertex, we randomly pick a diagonal vertex, exchange it with element n saved in the vertex list, and then reduce the integer n that counts the number of vertices by one. Although the diagonal updates change the number of vertices, we can work with a fixed length for the vertex list and only take into account the first n elements.
The implementation of the directed-loop updates is very similar to Ref. 7. To create the doubly-linked vertex list, we first traverse through the vertex list again and make separate lists for the time variables and subvertex labels-the latter is a combination of the vertex number and a label for operator 1 or 2. We create an index list for the time variables to sort the list of subvertex labels. From this we can setup the linked vertex list. The construction of the directed loops then follows the description in Ref. 7. When we enter a vertex of type v at leg l i , we can determine the exit leg l e from a probability table. Even for the wormhole moves, the combination of the vertex variable and the exit leg exactly determines the label of the exit vertex in the linked vertex list. During the construction of the loop, we also update the vertex configurations in the vertex list. Note that the correct measurement of the off-diagonal timedisplaced correlation functions requires that we choose a random starting time for the directed loop. After we have finished a block of directed-loop updates, we also update the initial state α⟩, as described in Ref. 7.
The number of proposals during the diagonal updates and the number of constructed loops in the directed-loop updates get adjusted during the Monte Carlo warmupin such a way that every vertex is touched at least once on average-and remain fixed afterwards. We start the warmup from a randomly chosen α⟩ and an empty vertex list. To speed up the warmup procedure, we use a beta-doubling scheme. For the measurement of certain observables, we might have to sort parts of the vertices again.