Exact dynamics of non-additive environments in non-Markovian open quantum systems

When a quantum system couples strongly to multiple baths then it is generally no longer possible to describe the resulting system dynamics by simply adding the individual effects of each bath. However, capturing such multi-bath system dynamics has up to now required approximations that can obscure some of the non-additive effects. Here we present a numerically-exact and efficient technique for tackling this problem that builds on the time-evolving matrix product operator (TEMPO) representation. We test the method by applying it to a simple model system that exhibits non-additive behaviour: a two-level dipole coupled to both a vibrational and an optical bath. Although not directly coupled, there is an effective interaction between the baths mediated by the system that can lead to population inversion in the matter system when the vibrational coupling is strong. We benchmark and validate multi-bath TEMPO against two approximate methods - one based on a polaron transformation, the other on an identification of a reaction coordinate - before exploring the regime of simultaneously strong vibrational and optical coupling where the approximate techniques break down. Here we uncover a new regime where the quantum Zeno effect leads to a fully mixed state of the electronic system.


I. INTRODUCTION
Open quantum systems are often significantly coupled to more than one kind of environment. However, the combined influence of these different environments is generally more than the sum of their individual parts: it is thus crucial to account for non-additive effects -that is, effects that originate from an interplay between two or more competing environments [1][2][3][4][5].
Examples of non-additive behaviour include those seen or predicted in optically active quantum systems that are strongly influenced by their vibrational environments, which are ubiquitous in condensed matter and molecular physics. Not only do vibrational interactions lead to complex dynamical behaviour, they are central in determining the optical and electronic properties of a system. For example, they are thought to play a key role in light harvesting and energy transfer [6][7][8][9][10]. Another example occurs in molecular nanojunctions, where the combined effect of the leads and vibrational environments is non-additive [11,12], and using an additive treatment can even lead to a violation of the Carnot bound on efficiency [13]. Using additive treatments can lead to other unphysical predictions too, for example anomalous emission of photons from the groundstate in regimes of strong light-matter coupling [14]. Further, it can miss key dynamical and steady state behaviour, such as the Franck-Condon blockade observed in quantum transport [15,16].
However, typical standard perturbative methods such as Redfield theory do make an implicit additive approximation, such that the impact of each individual environment is mutually independent. Such a treatment will fail in general once the magnitude of the system-bath interaction Hamiltonians for each environment become strong, when non-Markovian effects are important. A number of techniques have been developed to capture non-additive and strong coupling effects, while maintaining the conceptual simplicity of Born-Markov master equations, such as the polaron transformation [17][18][19][20][21][22] and its extensions [23][24][25]; methods based on the pseudomode approach [26][27][28] and its generalisation [29][30][31]; and the reaction coordinate (RC) mapping [4,32,33]. Even though these methods can access non-perturbative regimes, they are typically limited to a particular parameter regime due to the approximations required in their derivation.
To overcome these kinds of limitations, tensor network techniques have become increasingly useful in reducing the computational cost and time of numerically exact calculations of open quantum system dynamics [34]. Although originally formulated to more efficiently represent many-body states with area law entanglement [35,36] they are of more general use in problems which can be cast in a form where the compression of large tensors  1. (a) Cartoon of the model considered in this paper: a two-level dimer coupled to both a vibrational bath and a photon field. (b) schematic tensor network representation of our technique for non-perturbative simulation of the dynamics. By considering how the system evolves timestep-by-timestep there is a contribution from both the time-local free system propagators and the influence tensors for each environment which account for any non-Markovian effects induced by the interactions.
gives a significant numerical advantage. For example, chain-mapping techniques [37,38] involve transforming the topology of system-bath couplings from a star to a 1D chain with nearest-neighbour couplings; this can then be efficiently represented as a spatial matrix product state (MPS). An alternative approach is to construct an MPS in time able to efficiently capture non-Markovian correlations in an equivalent fashion that spatial MPS capture spatial correlations [39,40]. The resulting time-evolving matrix product operator (TEMPO) method provides an efficient approach for calculating the exact dynamics of an open quantum system interacting with a complex environment.
In this work we describe how to formulate multi-bath TEMPO, and so describe a numerically exact approach for capturing the effects of strongly coupling environments that are non-additive. As an example calculation using this new approach, we investigate the nonequilibrium behaviour of a two-level quantum emitter interacting with both vibrational and optical environments. We find that multi-bath TEMPO allows us to explore parameter regimes inaccessible to approximate techniques. We first confirm the behaviour predicted in Ref. [4], where non-additive effects lead to a population inversion if the optical temperature and vibrational coupling strength are large enough. We then test multi-bath TEMPO in the previously inaccessible regime of simultaneously strong optical and vibrational coupling. Here we find the steady state population of both states of the emitter is fixed at one half regardless of temperature; we associate this with a quantum Zeno effect arising from the strong optical coupling. This serves to not only prove the importance of non-additive effects in the behaviour of open quantum systems, but also reveals new unexpected behaviour in the nonequilibrium steady state of the system. By comparing TEMPO to the polaron and RC theories, we gain analytic insight into the otherwise purely numerical approach, as well as determine when such approximate techniques are accurate.
In the next section we will describe the multi-bath TEMPO technique in detail. In Section III we will set out the example calculation, introducing the model of a single dipole coupled to both an optical and vibrational bath. We will also show, in a simplified description, the mechanism by which population inversion can occur. In Section IV, we summarise the polaron transformation and RC mapping techniques, which we will compare to the exact results obtained from multiple-bath TEMPO in Section V. We explore when the approximate techniques are accurate, and present results that can only be captured using our new approach. Our conclusions are drawn in Section VI.

II. MULTIPLE-BATH TEMPO
In this section we present a derivation of the TEMPO algorithm extended to simulate a system coupled to multiple baths with non-commuting operators. A key step in the original path integral framework [39,41,42] is to insert the complete eigenbasis of the coupling operator between each timestep of a discretized influence functional. A different approach must be taken in the general case of multiple coupling operators that do not share an eigenbasis, and this is discussed in Ref. [43]. However, a viable method of contracting the resulting influence tensor is still required, and we provide this here. In particular, we enable efficient contraction of this influence tensor by decomposing it into contributions from each environment, allowing us to construct the influence tensor for each independently. This is distinct from the independent calculation, and subsequent addition, of dynamical generators. In combining the separate influence tensors the total, non-additive effect of the environments is faithfully recovered provided this is done on a fine enough timescale. In Fig. 1 we show this process schematically for the specific model we will consider from Sec. III A of this paper: a two level system coupled to two environments. This process is made tractable by employing efficient tensor network representations [35,36] of both the separate environmental contributions and the combined effect of all these environments and the internal system Hamiltonian [40,44].

A. Multi-bath influence functional
The generic Hamiltonian we consider is where H S is the (arbitrary) system Hamiltonian, and and are the interaction and free bath Hamiltonians, respectively. Here a αq (a † αq ) is the annihilation (creation) operator corresponding to mode q in bath α with frequency ω αq . The arbitrary system operator s α couples to mode q in bath α with strength g αq . The baths can be continuous or discrete; in this paper we consider two continua. We assume that the initial state is separable into system and bath terms with the baths each initially in a Gaussian state e.g. thermal equilibrium at temperatures T α .
We work in a representation where the d × d density matrix ρ, an operator on the Hilbert space H spanned by basis vectors {|e i }, is mapped to a d 2 vector in Liouville space L = H ⊗ H spanned by {|e i ⊗ |e j }. Formally this mapping corresponds to the following: For convenience we proceed in the interaction picture with respect to H S and H Bα . The von Neumann equation is then where we have introduced the Liouvillian superoperators . The exact solution of the von Neumann equation can be written as where ← − T signifies that superoperators be time-ordered from right to left and Λ(t) is the propagator. We now factorise the propagator into a product of exponentials for each bath, which is possible because integrals of operators under time ordering commute. We define · α as the expectation taken with respect to the initial state of bath α and set L Iα α = 0 for all α without loss of generality. Each bath can now be traced out independently using exp(X) = exp X 2 /2 , true for any variable X with a Gaussian distribution and zero mean. With this, and using idempotency of time ordering we trace out the baths in Eq. (9) to leave a dynamical map for the reduced system alone, Here we have enacted the time-ordering of the bath operators within the exponents in the first line, gaining a factor of 2, and have defined the system superoperatorvalued influence functionals, F α [s L α , s R α ]. The system superoperator valued expectations of the interaction Liouvillians in Eq. (10) are readily evaluated. We obtain where s ± α = s L α ± s R α and where C α R (t) and C α I (t) are the real and imaginary parts of the autocorrelation function of bath α 1 .

B. Time-ordering using tensor contractions
Here we present the formalism required to cast Eq. (10) into a tensor network, closely following that introduced in [43]. To simplify our presentation we combine the influence functionals for each bath in Eq. (10) into a single 1 A similar derivation that we are describing here also holds for interactions of the form H I ∼ s † a + sa † , for both fermionic and bosonic environments. In both cases the only substantive difference is in the form of Eq. (11).
multi-bath influence function. Now we discretise the double integrals appearing in this influence functional onto a grid of timesteps, t k = k∆, where 0 ≥ k ≥ N . This allows us to write where is a system superoperator-valued quantity that determines the influence of timestep t k on t k . Operators nonlocal in space generate spatial correlations when acting on a state and here we can equivalently think of the time non-local superoperators I k−k (t k , t k ) as generating non-Markovianity. The approximation made in Eq. (13) is equivalent to having made a Trotter splitting on the full system-baths propagator in Eq. (8), prior to tracing over the baths. The error incurred by the approximation then depends upon the order of the splitting. In practice we use a second-order splitting which incurs an error O(∆ 3 ). We note that since the superoperators in the products in Eq. (13) do not commute in general, the products seem ambiguous. In fact, demanding consistency of Eq. (13) with making a Trotter splitting in Eq. (8) fixes the order of multiplication of the superoperators to be from right to left with increasing k and k . Whether the k product or the k product is performed first is irrelevant.
We now need a prescription for enforcing time-ordering in Eq. (13). If each I k−k (t k , t k ) factorised into a product of time-local operators this would be trivial, since we could just arrange the overall product of operators in order, e.g.
To illustrate how we enforce time ordering generally we now introduce some tensor network notation. A rank-n tensor is represented as a node with n legs and contractions are indicated by connecting legs. Operators are rank-2 tensors so Eq. (15) has the following tensor network representation, where i, j, i , and j label the components of the matrices A(t k ) and B(t k ), i.e. [A(t k )] j i and [B(t k )] j i . On the lefthand side of Eq. (16) we can write the two disconnected rank-2 tensors as a single rank-4 (4-leg) tensor by using a tensor product, C(t k , t k ) = A(t k ) ⊗ B(t k ). The tensor C(t k , t k ) would then be represented by a single node, Comparing with Eq. (16), we see that enforcing time-ordering on this tensor would correspond to contracting two of the legs together; either the legs labelled by i and j or those labelled by j and i .
We can formalise the use of a tensor product to combine time-local operators by way of a history Hilbert space [45,46]. If A(t k ) and B(t k ) both act on a Hilbert space H spanned by a basis {|e i } then we can define a history space, H t k ⊗ H t k , spanned by {|e i ⊗ |e i } on which their tensor product acts. Each subspace of the history space is identical to H and their corresponding bases are each identical to {|e i }. In order to see how to write operators that depend on multiple time arguments in the history Hilbert space, let us consider an operator-valued function of the time-local operators A(t k ) and B(t k ), which we call f (A(t k )B(t k )). This object exists in the familiar Hilbert space H, and would have the usual matrix (i.e. rank-2 tensor) representation. It can be converted into an operator that acts on the history space by simply changing the regular product into a tensor product as follows: where The operator D(t k , t k ) is a rank-4 tensor which is represented in the same way as C(t k , t k ) in Eq. (17). Crucially, to enforce time ordering on D(t k , t k ), i.e. to evaluate ), we simply connect together two of its legs, exactly as we did for C(t k , t k ).
The expression we wish to time order in Eq. (13) is a function of superoperators acting at N + 1 different points in time. Thus, we construct the history space, L H , as a product of N + 1 copies of the Liouville space, L, in which the vectorised system density matrix, |ρ S , is defined. If the full discretised influence functional is converted to a superoperator on L H it is represented diagrammatically as a rank-2(N + 1) tensor, the influence tensor, where each of the time points gives rise to a pair of legs. Time ordering is then enforced by connecting pairs of legs arising from consecutive time points, leaving two unconnected which are the two legs of the resulting dynamical map which is a rank-2 tensor. So if we let N = 3 then Eq. (13) has the following tensor network representation = Λ S (20) where each pair of legs on opposing sides of the influence tensor are those arising from a single subspace of L H and we have arranged the legs so that right most pair correspond to L t0 , the next pair along correspond to L t1 , and so on. Rather than representing Eq. (13) as a contraction of a single, large tensor in L H , we note that each individual I k−k (t k , t k ) can be represented either as a rank-4 tensor in the subspace L t k ⊗ L t k (for k = k ) or as a rank-2 tensor in the subspace L t k (for k = k ). This means that the rank-2(N + 1) influence tensor can be decomposed into a network of rank-2 and rank-4 tensors. For example, the string of tensor contractions corresponding to N = 3 and a fixed k = 0 in Eq. (13) would be represented as where contractions occur between the legs which act commonly on L t0 . The full influence tensor in Eq. (13) with N = 3, prior to time ordering, would then be written as where we have highlighted in green a single string of contractions occurring through the L t2 subspace of L H . The dangling legs along at the bottom of (22)  In the case where [s α , s α ] = 0 for all α, α the nodes in (22) can be expanded in terms of the complete, shared eigenbasis of the system operators s α . Upon enforcing time ordering it is then possible to recover the standard path sum representation of the influence functional [43] and the network can be contracted efficiently using the original TEMPO algorithm [39], or by other means [44].
In the general case that the s α do not commute enforcing time ordering upon and contracting (22) efficiently has so far not been possible. We present a method that achieves this in the following section.
It should be pointed out that the influence tensor we have derived above is in fact a process tensor [40,44,47]. A process tensor is a mapping from the space of possible inputs to a system at a discrete set of times, for example a measurement or a preparation, to output states. Therefore not only can (22) be used to calculate the dynamical map for the system but also all of its correlation functions [44,48]. Recently it has been shown that this further allows the calculation of all correlation functions of the baths as well [49]. Thus, the tensor representation for the influence functional introduced here provides a way to completely characterise the full system-baths dynamics.

C. Matrix product form of the influence tensor
Here we show how the influence tensor derived in the previous section can be efficiently put into the form of a matrix product operator (MPO). The first step is to convert each row of tensors in (22) individually into an MPO. This can be done via singular value decompositions (SVD) using the following routine: Here, we start with a pair of rank-4 tensors connected by a single leg as shown in (a). The desired outcome is shown in (d) with the dimension corresponding to the blue highlighted leg being stored by the right-hand tensor. This corresponds to an alternate decomposition of the rank-6 tensor computed from contracting the bond in (a). The steps in (23) show how to switch to this representation without constructing the full tensor. In (b) we indicate with a dashed line that the legs of the tensor are partitioned into two groups; these are combined to form a matrix. We then perform an SVD on this matrix to arrive at step (c). Moving from (c) to (d) we contract the sub-network indicated by the dashed box and in doing so arrive at the desired representation.
We can now use (23) to move the legs on the left of each row in (22), which puts each row in MPO form. For the bottom row this looks like: Having done this for each row, the full influence tensor can now be computed by iteratively contracting together the product of resulting MPOs, using SVDs to maintain an efficient representation: The result is a MPO representation of the influence tensor, where the vertical legs on each node correspond to the input (below) and output (above) legs of a single subspace of the full history space. That is, the rightmost node takes a state at t 0 as an input from below and outputs a state at t 1 . The next node to the left would take this state from t 1 to t 2 and so on. Enforcing time ordering on the influence tensor to obtain the dynamical map, as in Eq. (20), is now straightforward. Doing this and acting the dynamical map on an initial state to obtain a final state at time t 4 would look like where the (rank-1) initial state is represented as a red circle.
There is one further step we can take to make the calculation more efficient. Instead of combining the individual influence functionals for each bath and then discretising the results, as in Eq. (13), we can discretise each F α [s L α , s R α ] in Eq. (10) separately, and obtain a MPO influence tensor for each of them individually using the procedure outlined above. The single MPO used to find the dynamical map in (26) is then calculated by contracting together the MPOs for each α. In the case that there are just two baths, α 1 and α 2 , the influence tensor MPO used in (26) would be written: Here we have returned from the system interaction picture to the Schrödinger picture. The result of this is that the nodes in the bulk of the network are now defined in terms of time-independent system superoperators, e.g. s L α (t) = s L α , and free system propagators, rank-2 tensors represented as white circles, are applied in each subspace of the history space. The appearance of the free system propagators can be understood using the language of process tensors: in between the periods of time where the system interacts with the baths (the process) the operations we perform on the system are simply those of free evolution.
Calculating the influence tensor MPOs for each bath individually and then combining them does not result in a final MPO with lower bond dimension than if we had first combined the bath influence functionals and then directly calculated the full multi-bath MPO. This is to be expected since each method of calculation results in the same final MPO, up to the errors associated with Trotterization and SVD truncation. However, we find that the individual MPO calculations for each bath are computationally less intensive than the alternative so we can do multiple easier calculations rather than a single intensive one.
The method we have introduced here can be taken even further by breaking down the MPOs for each bath into a series of MPOs representing the individual modes that make up the bath. This approach is similar in spirit to other recent works [34,50] where the process tensor can be constructed in a purely numerical manner, even when the environment is not Gaussian and the influence functional cannot be calculated analytically in closed form. Finally, we stress that process tensors calculated using different approaches for each environment can be combined like this, provided time is discretized in the same way. If the system coupling operators for different environments are related by a unitary transformation then the resulting process tensors can be transformed in the same way.

D. Computing all dynamical maps efficiently
The methodology we have presented so far enables us to calculate the dynamical map between two times, t 0 and t N . In this section we explain how to extract dynamical maps between t 0 and all times t < t N during a single computation and give details on how we maximise the efficiency of the method. Generally we aim to minimise the size of the tensors that must be stored while computing the dynamical maps. There are two steps we take towards this: we enforce time ordering on-thefly during construction of the full multi-bath influence tensor and we keep contributions from each single-bath influence functional separate until absolutely necessary. We found this second point particularly helpful since combining influence tensors for non-commuting environments leads to a significant increase in bond dimension. The influence tensor is really a tensor representation of the integrand of a path sum which contains all possible trajectories the systems takes while interacting with its environment. Contracting together two single-bath influence tensors combines two sets of such trajectories. If the couplings to these baths commute then many of these trajectories will be degenerate and the bond dimension remains low, otherwise a larger bond dimension will be needed to store the combination of trajectories.
To recover a dynamical map from time t 0 to some time t n from an influence tensor constructed with a final time t N > t n we need to discard from the tensor all correlations between pairs of time points that straddle the time t n . We do this using a (normalised) vectorized identity matrix, d −1/2 |1 , a rank-1 tensor which we call a trace cap since 1|O = Tr[O] for any vectorized system operator O. The trace cap is a null vector of any commutator superoperator, e.g s − α |1 = 0, and upon inspection of the form of Eq. (11), which is the exponent of an influence funcional, we can see that the result of applying the trace caps to the influence tensor at time t k is to erase any temporal correlation that is generated between this time and all earlier times t < t k . In practice we use the trace cap as follows: where the trace caps are shown as green semicircles and we have extracted the dynamical map up to time t 3 from an influence tensor with N = 3. Trace caps can be used in this way either on the full multi-bath influence tensor, as we did above, or on the single-bath influence tensors. For instance, if we write the influence tensor in (28) explicitly as a combination of two single-bath influence tensors and free system propagators the trace caps are applied as follows: .
To see how we enforce time ordering on-the-fly, note that all the information required to calculate a dynamical map to time t n < t N is contained in the first n rows from the bottom in the influence tensor in the first diagram of (25). So, if we want to calculate the dynamical map from t 0 to t 2 we first need to contract together the first two rows, enforce time ordering only between the t 0 and t 1 nodes, (30) then apply trace caps to the rest of the MPO. Before applying the trace caps we store a copy of the MPO so that it can be used for further computation but the time ordering step here is permanent, since the legs that were contracted in this step, (30), have no further use in the calculation. Thus, the MPO we use to continue the calculation has only N nodes, one fewer than the initial MPO we began with. Continuing to obtain the dynamical map to t 3 , we contract our stored MPO with the third row in (25), use Eq. (30), then apply trace caps. After this we have stored an MPO that has N − 1 nodes. In this way the MPO we store gets smaller as we proceed with the calculation.
Similarly to how we enforce time ordering as described above, we also combine single-bath influence tensors onthe-fly. To calculate the dynamical map from t 0 to t n , we need to contract n rows of (25) for each single-bath influence tensor, then we only need to combine the n nodes of the resulting MPOs corresponding to times t ≤ t n−1 before applying time ordering on these nodes. After storing the the remaining nodes of these separate MPOs, those for t > t n−1 , trace caps are used as in (29) to obtain the dynamical map. In practice we did not contract the initial state with the influence tensor while calculating the dynamical maps. This allowed us to optimise our choice of initial state in order to find the steady state in as few time steps as possible, which, since we considered two baths in our calculations, meant we also had to store a rank-4 tensor during computations. In summary, the following schematic tensor network shows which tensors we have stored after calculating the first k dynamical maps: Tensor carrying dynamical information up to t k Each MPO is that computed by contracting the lower k rows in Eq. (25). The dynamical map to time t k is found by contracting these MPOs into the rank-4 tensor and then contracting trace caps into all of their remaining legs.

E. Partial coarse graining
There are three convergence parameters associated with each process tensor: the discretisation timestep ∆, the singular value threshold χ for truncation and the memory length K which dictates how many timesteps of system history are kept to capture the non-Markovian influence. In practice this final approximation corresponds to setting I k>K = 1 ⊗ 1. For the two baths considered in this paper the timescales of the different bath correlation functions are two orders of magnitude apart. Naïvely the timestep appears to be fixed by the fastest bath (the optical bath in this case) and on this timescale there is no possibility of making a memory cutoff for the slower (vibrational) bath.
Considering how the separate process tensors are contracted it might be thought that a larger timestep could be used for the slower bath with the same time-ordering contractions applied, i.e. with the faster bath being internally contracted, as in (30), several times between contractions with the slower bath. We found that while this approach converges to the correct result the error grows too quickly with increasing disparity in timesteps to be useful. We did, however, find a solution that enabled us to propagate to much later times. Comparing Figures 2(a) and (b) we illustrate how doubling the timestep manifests in both the discretisation of Eq. (13) and the resulting tensor network. Figure 2(c) shows how we can discretise the inner and outer integrals of Eq. (13) with different timesteps. We will refer to the timesteps of the inner and outer integrals as the history ∆ h and propagation ∆ p timesteps respectively. For certain parameter regimes we found that we achieved convergence with factors ∆ h /∆ p on the order of 100 which allowed propagation lengths on the order of 1000 with no memory cutoff for the vibrational environment.

A. Hamiltonian
As an example application of multi-bath TEMPO, we consider a single two-level dipole that is coupled to both an optical and vibrational environment. It has recently been shown that the two baths in this system are nonadditive, and when treated rigorously can lead to population inversion of the two-level dipole [4]. Therefore, this is the ideal proving ground for testing our approach.
The Hamiltonian describing this system can be partitioned into system (S), vibrational (V ) and optical (O) parts as H = H S + H V + H O . The system part, describes the electronic states of the dipole with transition energy δ. The vibrational part has two contributions: where the first term describes the energy of the bath alone and the second is the electron-phonon interaction term. The vibrational bath is composed of phonons of wavevector k and energy ω k , with ladder operators b k and b † k . Each of these modes couples to the dipole with strength g k as described by the interaction term. The effect of the bath on the system is fully characterised by the phonon spectral density J V (ω) = k |g k | 2 δ(ω − ω k ), the exact functional form of which will be considered in later sections. This interaction induces a displacement of the excited electronic manifold of the dipole which leads to Franck-Condon physics, that is, the non-trivial overlap of the vibrational wavefunctions between the manifolds [4,51].
Finally, within the electric dipole approximation, the optical contribution to the Hamiltonian is given by: where we have introduced the photon ladder operators a q and a † q , corresponding to photons with energy ν q and wavevector q. This interaction term describes absorption or emission of a photon along with creation or annihilation of an exciton in the dipole. The light matter coupling strength f q leads to the optical spectral density J O (ν) = q |f q | 2 δ(ν − ν q ). The form of the optical spectral density is dependent on the gauge chosen for the light-matter interaction [52,53]. As we will see, whether or not it is possible for this model to exhibit population inversion is determined by the form of J O . In this paper we will use where α > 0 is the light matter coupling strength constant. We have also introduced a phenomenological cutoff ν c which is required for the influence functional in TEMPO to converge, and is also physically justified for finite systems [53,54]. The Ohmicity of the spectral density p is determined by the gauge choice and takes a positive value. In this work, we shall consider the two values of p corresponding to two gauges that are most common to non-relativistic quantum electrodynamics, the p = 1 (Coulomb) and (p = 3) (multipolar) gauges [52,53].

B. Population inversion mechanism
To give an intuitive understanding of the population inversion mechanism and demonstrate its reliance on non-additivity of the two environments we first consider a simplified four level model shown in Fig. 3. This model consists of two electronic states separated by energy δ, interacting with a single vibrational mode with frequency ω and coupling strength g. We then make three key approximations: we first assume that only two vibrational states contribute to the dynamics of the system, the zeroth and n th levels; second, the vibrational environment is assumed to be at zero temperature; and third, vibrational processes occur on a timescale much faster than optical transitions, such that photon emission and absorp-tion only occur from the |e0 and |g0 states, as shown in Fig. 3.
Writing down a simple rate equation for the populations of the four level system, and solving for the steady states, we find that the ratio of the steady state populations of the electronic states are For population inversion we require that P > 1. Using Fermi's Golden Rule, we find the rates where W j = S j exp(−S)/j! are the Franck-Condon factors accounting for the overlap of the displaced vibrational levels between manifolds differing in number by j = 0, n, and S = (|g| /ω) 2 is the dimensionless Huang-Rhys parameter. We have also introduced the emission and absorption optical spectral densities When the vibrational coupling is weak (S ≈ 0) the manifolds are displaced only slightly such that W 0 W n . In this limit where we have inserted Eq. (35) for J O . This approaches 1 from below as T O → ∞ and so population inversion does not occur. Clearly then, strong vibrational coupling is key to population inversion and in this limit we instead find Since W n is maximised for n = S (here approximating n as continuous) we should replace n with S in Eq. (40). From Eq. (40), we then see that the reason we need strong vibrational coupling is first to suppress optical transitions ∝ W 0 , i.e. to enhance the transition pathway identified by coloured labels in Fig. 3, and second to increase the asymmetry in the transition energy of the decay and excitation rates along this pathway. This asymmetry can be enhanced in two additional ways: increasing either T O or p to make J E and J A more sensitive to energy changes. Since population inversion is most easily achieved at high optical temperatures when N O 1, we focus on this limit and find Here we see one very important prediction of the model: no matter how strong the vibrational coupling, population inversion cannot occur for p ≤ 1. Since this is true at infinite optical temperature when inversion is easiest, this must be true for all temperatures. This result will be emphasised when we compare multi-bath TEMPO to the reaction coordinate and polaron calculations, introduced in the next section, across all parameter regimes in Section V. Eq. (41) is independent of the optical coupling strength. There are two reasons for this. First, the Fermi Golden rule rates are only valid for small optical coupling. Second, we have assumed that the vibrational rates are significantly faster than the optical rates. In the next section, we will derive the master equations for the polaron and RC techniques. The polaron technique similarly treats the optical coupling to second order and assumes that the vibrational bath is displaced instantaneously. This is the same as assuming that vibrational rates are large enough to lead to effectively instantaneous transitions. The RC technique relies on the weak lightmatter coupling but does not assume that the vibrational bath displaces instantaneously, and in Section V we find that population inversion decreases as optical lifetime becomes comparable to the displacement timescale.

IV. APPROXIMATE TECHNIQUES
In this section we will give a brief introduction to the two approximate techniques, a master equation that exploits the polaron transformation and the RC mapping. Both approaches are capable of capturing nonperturbative electron-phonon effects, however, they operate in two distinct regimes as will be discussed below.

A. Polaron theory
Polaron theory accounts for strong electron-phonon interactions through a unitary transformation to the global Hamiltonian, such that This transformation dresses the excitonic degrees of freedom with vibrational modes of the environment forming a polaron, and providing a transformed basis in which to do perturbation theory. This has the advantage of providing a simple analytic expression for the master equation, while still accounting for strong system-environment interactions [17,55]. The resulting Hamiltonian in the polaron frame reads: where the transformation has removed the linear vibrational coupling. The dipole transition operators are now dressed with the displacement operators B = exp [−G], such that B † σ + (Bσ − ) creates (destroys) a polaron with energy δ P = δ − λ, where λ = ∞ 0 dω J V (ω)/ω is the reorganisation energy. The mathematical details of this transformation are well documented and can, for example, be found in references [17-23, 51, 55-57].
To describe the dynamics generated from Eq. (43) we derive a Born-Markov master equation in which the optical and vibrational degrees of freedom are captured to second order in the polaron frame. In this particular example, the populations are independent of the coherences of the system, leading to a simple master equation of the form:ρ where ρ ee and ρ gg are the excited and ground state populations in the lab frame. We have also defined the excitation γ ↑ = γ(−δ P ) and decay γ ↓ = γ(δ P ) rates in terms of the polaron rate function where the phonon correlation function is defined in terms of the Gibbs state of the vibrational environment, ρ V , and the displacement operator in the interaction picture, B(t). The phonon correlation function is characterised by the phonon propagator where T V is the temperature of the vibrational bath and we have taken a continuum limit for the phonon modes [17-19, 21-23, 51, 55-57]. When comparing TEMPO with polaron theory, we shall take the spectral density where S is the dimensionless Huang-Rhys parameter and ω c is the phonon cut-off frequency. The optical contribution to the rate is given by the correlation function where ρ O is the Gibbs state of the optical environment and A q (t) is the operator f q a † q + f * q a q in the interaction picture. Taking the continuum limit over the optical modes, we obtain an expression in terms of the absorption and emission effective spectral densities (50) where the vibrational influence is contained within [58][59][60] The non-additive nature of the vibrational and optical interactions is now clear because Eq. (50) is a convolution of optical functions with a vibrational function. Eq. (50) can be solved numerically for the optical and vibrational spectral densities considered in this paper.

B. Reaction coordinate
An alternative semi-analytic approach for describing the model outlined in Section III A is the RC method.
Here key environmental degrees of freedom are included in an enlarged system Hamiltonian by way of a normal mode mapping [33,[61][62][63][64]. The remaining environmental degrees of freedom are then captured through a residual environment, which can be treated to second order using a Born-Markov master equation [32,64]. This treatment allows one to access strong coupling and highly non-Markovian regimes with the conceptual simplicity of a master equation, and little computational cost [65]. In this section we shall give an outline of the RC mapping and the subsequent derivation of the master equation. For more details we refer the reader to [64].
We shall start by restricting the phonon environment to the case of an underdamped spectral density, of the form: which is one of the spectral densities for which the RC mapping is exact [32]. Here α UD is the reorganisation energy, ω 0 is the frequency of the spectral density's peak and Γ is its width. Using a collective coordinate mapping, the vibrational contribution to the Hamiltonian given in Eq. (33) becomes Here we have introduced the annihilation (creation) operators a (a † ) and c k (c † k ) for the RC and residual environment respectively. The mapped parameters can be written in terms of the unmapped spectral density, where η = πα UD ω 0 /2, Ω = ω 0 , and the residual environment is described by the spectral density J Res (ω) = k |h k | 2 δ(ω − ν k ) = Γω/2πΩ. Since the collective coordinate mapping acts only on the modes of the vibrational environment, leaving the system and optical degrees of freedom unchanged, then in the mapped frame we have: H =H S + H Res + H O , where we have defined the augmented system Hamiltonian as A key advantage of the RC method is that we can treat the augmented system non-perturbatively, while treating the residual environment to second order, thus effectively capturing non-Markovian effects. To do this, we first apply the single mode polaron transformation to Hamiltonian U = exp −ν −1 ησ + σ − (a † − a) . This transformation diagonalizes the augmented system Hamiltonian such that H S = UH S U † = δσ + σ − + Ωa † a. In this transformed frame, we obtain a master equation of the form ∂ t ρ(t) = L[ρ(t)], with the Liouvillian: where ρ(t) is the density operator for the augmented system. Here K Res is a superoperator representing the action of the residual environment [64]: where A = a † + a − 2(η/ν)σ + σ − , and we have introduced the rate operator: where the sum runs over the eigensplittings of the augmented system Hamiltonian, such that λ ∈ {±Ω, 0}, with the corresponding transition operators A(0) = −2(η/ν)σ + σ − , A(Ω) = a, and A(−Ω) = a † . Crucially, unlike standard perturbative treatments, the RC is capable of accounting for non-additive effects when multiple environments are present [4]. In the case of an optical environment, this is done by explicitly deriving the optical contribution to the master equation in the eigenbasis of the augmented system. This results in a dissipator of the form: where we have defined the rate operators corresponding to absorption χ ↑ = jk σ x jk γ ↑ RC (λ jk ) |ψ j ψ k |, and emission χ ↓ = jk σ x jk γ ↓ RC (λ jk ) |ψ j ψ k |. The rates for these processes are given as: where Θ(x) is the Heaviside step function. Here, the steady state is independent of the Lamb shifts, allowing them to be neglected in the calculations we will show.
If we compare the optical transition rates calculated in the RC model (Eqs. (59) and (60)), and those found in polaron theory in Eq. (50), we see some striking similarities, with vibronic contributions entering directly into the expressions for the rates. However, a key difference between the RC and polaron approach is that the displacement of the phonon environment in polaron theory is static, which leads to a breakdown of the perturbative master equation when the system dynamics approach the correlation time of the phonon environment [17]. In contrast, no such approximation is made in RC theory, and the displacement of the RC may vary throughout the system evolution, such that the method is valid in regimes of fast system dynamics. This will be significant in regimes where the system light-matter coupling dominates over the vibrational interaction.

V. NON-ADDITIVE PHYSICS IN ALL COUPLING REGIMES
With the methods described above, we may now explore the effects of non-additivity on the nonequilibrium steady state of the TLE. To do this we will separate the analysis into three regimes of optical coupling strength: weak, moderate and strong. Weak coupling is defined as the thermalization timescale being shorter than the optical lifetime, and moderate coupling the reverse of this. In the strong coupling regime, a second order truncation of the optical coupling breaks down and we use multi-bath TEMPO to explore this novel regime of lightmatter coupling. In each regime we shall compare the steady state population behaviour predicted by the analytic models against multi-bath TEMPO. Analysing the steady state exposes the most striking consequences of the non-additive effects arising from our model. Further it has the added advantage of circumventing the awkward problem of defining equivalent initial states for a fair comparison of the short time dynamics, however, of course the dynamics are also available in all three approaches.
We present our comparisons for variable temperature of the optical field in Fig. 4 and as a function of light-matter coupling strength in Fig. 5. For all calculations, we use the optical spectral density defined in Eq. (35). For comparison with the polaron theory (Figs. 4(a) and 5(a)), we use the super-Ohmic vibrational spectral density given in Eq.

A. Weak optical coupling
For small light-matter coupling strength, we see excellent agreement between both approximate methods and TEMPO, as is apparent from Fig. 4. This is expected as both theories can non-perturbatively account for electron-phonon interactions, and the second order treatment of the optical coupling is sufficient for small α. Most striking from these figures is that for small α all approaches predict a population inversion at large optical temperatures. This is as we expect from the work of Maguire et al [4], as well as the four level analysis in Sec. III B, where vibrational states induce an asymmetry in the optical rates, suppressing optical decay while enhancing the converse absorption process.
Using the polaron theory, we can also see that this behaviour persists across a full range of vibrational cou-plings, agreeing well with TEMPO in Fig. 5(a). This is with the exception of the very largest couplings where there were difficulties with getting converged results from TEMPO. As well as the inherent difficulties associated with simulating stronger couplings, there is also a significant drop in the population transfer rates meaning that longer propagation times are necessary to reach steady state. This suppression with strong vibrational coupling occurs because the system tries to excite into high lying vibrational states, requiring high energy photon modes to be thermally populated to complete the transition. We have included error bars that correspond to the change in our steady state prediction if we end the propagation 40 timesteps earlier to give an idea of how much we would expect the prediction to vary if we were to propagate out further. Taking these errors into account, this suggests that polaron theory predicts accurate steady state populations over the full range of Huang-Rhys parameters.
We may also address the important prediction from the four level model in Sec. III B that population inversion is conditional on the Ohmicity of the optical field. This is confirmed by polaron theory and TEMPO in Fig. 5(a), where we see that population inversion never occurs when p = 1, but does for p = 3. Here we also found that a smaller value of α was required for p = 1 in order to match the weak-coupling result from polaron theory.

B. Moderate optical coupling
At moderate optical coupling, where the optical timescales approach those of the phonon processes, we begin to see the population inversion predicted by TEMPO reduce. We attribute this to a dynamical undressing of the optical transition, where an emission and absorption of a photon occurs faster than polarons can be created or destroyed during transitions between the ground and excited state. This is most apparent when comparing the polaron and TEMPO calculations in Fig. 4(a), where the steadystate population predicted by polaron theory is independent of the light-matter coupling strength α. This can be seen from the polaron frame master equation in Eq. (44), where the steady state population is dependent only on the ratio of γ ↑ /γ ↓ , which has no α dependence. Physically, this is a consequence of the static state dependent displacement (Eq. (42)) applied to the phonon environment, regardless of the system dynamics. This and the approximations used to derive the master equation in Eq. (44), mean that within this theory it is always the case that a polaron is formed during an optical transition, regardless of how quickly this transition occurs. This results in the steady state population being completely insensitive to the light-matter coupling strength. This is a common issue for polaron-type master equations, and may be overcome by modifying the displacement through a variational treatment [24,25]. However, this is not possible when the two baths are out of equilibrium with each other.
In contrast, since the RC is included nonperturbatively in the augmented system, its state may respond to the increased optical transition rate, such that in regimes that polaron formation is not energetically favourable the TLE remains undressed by phonon modes. This flexibility in the theory means that RC continues to be in good agreement with TEMPO for moderate optical coupling strengths (α = 0.01), as seen in Fig. 4(b).

C. Strong optical coupling
For large light-matter coupling strengths the secondorder treatment use to derive the interaction with the optical field in both approximate techniques breaks down. As can be seen in Fig. 5(b), this leads to significant deviation between the RC method and multi-bath TEMPO, where at large α TEMPO predicts a trend towards σ + σ − = 0.5 independent of optical temperature T O .
In the strong coupling limit a system-environment interaction can be thought of as a continuous measurement process where the system is consistently projected onto the eigenbasis of the coupling operator; the ultra-strong coupling limit (α → ∞) can therefore be described using the quantum Zeno effect (QZE) [66,67]. In this limit the total system plus environments evolve under an effective Hamiltonian given by: where P i is a projection onto the ith eigenstate of the optical interaction. Explicitly this is defined as: where These eigenstates can be partitioned into system and bath components and written as where {|+ , |− } are the eigenstates of σ x and |x i the eigenstates of q f q a † q + f * q a q . However given that none of the other parts of the Hamiltonian act on the optical degrees of freedom we only need to consider how the system component of P i affects the system operators in the Hamiltonian. This projection takes σ + σ − → I and leaves σ x unchanged such that we are left with an independent boson model. The steady state of H QZE in this case corresponds to a mixture of the eigenstates of σ x ; any such mixture corresponds to a population of 0.5.
We can also note that the Hamiltonian is symmetric under the unitary transformation that leads to: We would expect the steady state to also obey this symmetry meaning the σ x eigenstates must be equally populated. We can then conclude that in the strong coupling limit we expect the system steady state to correspond to a completely mixed state, i.e. (|+ +| + |− −|)/2. We have confirmed this prediction for the rightmost points of Fig. 5(b) by checking the corresponding system density matrix.

VI. CONCLUSIONS
We have developed a new, exact technique based on tensor networks for finding the dynamics of a system which is coupled to multiple non-commuting baths. Our method can be used to model a broad range of couplings between the system and multiple baths described by arbitrary spectral densities. This has allowed us to probe previously unexplored regimes of simultaneous strong electron-phonon and light-matter coupling, highlighting the importance of proper treatment of non-additive effects in nonequilibrium open quantum systems. In particular, we illustrate this by considering the incoherent population inversion predicted by Maguire et al [4], which emerges in regimes of strong electron-phonon coupling.
Using multi-bath TEMPO, and backed by the analytic polaron and reaction coordinate approaches, we have highlighted two regimes of light-matter coupling: the first is a regime of dynamical undressing, where the light-matter coupling is sufficiently large for optical emission and absorption processes to occur on the same timescales as phonon interactions, preventing the formation of polaron-like states. The second regime occurs when the dynamics are dominated by the light-matter interaction, which leads to the optical environment projecting the system into the so-called Zeno subspace, causing decoherence with respect to the basis of σ x . This results in steady state populations that are independent of the temperature of the optical field.
The novel behaviour described above is just one example of non-additive physics that emerge in nonequilibrium open quantum systems. Multi-bath TEMPO provides a powerful and versatile methodology for studying such phenomena, allowing the simulation of systems that have thus far been intractable to any other techniques. Furthermore, the general formulation of TEMPO presented can be applied to fermionic environments, where non-Markovian and non-additive behaviour are expected to play an important role in the dynamical and steady state behaviour of the system of interest [3,5].