Locally accurate tensor networks for thermal states and time evolution

Tensor network methods are routinely used in approximating various equilibrium and non-equilibrium scenarios, with the algorithms requiring a small bond dimension at low enough time or inverse temperature. These approaches so far lacked a rigorous mathematical justification, since existing approximations to thermal states and time evolution demand a bond dimension growing with system size. To address this problem, we construct PEPOs that approximate, for all local observables, $i)$ their thermal expectation values and $ii)$ their Heisenberg time evolution. The bond dimension required does not depend on system size, but only on the temperature or time. We also show how these can be used to approximate thermal correlation functions and expectation values in quantum quenches.

Tensor network methods are routinely used in approximating various equilibrium and nonequilibrium scenarios, with the algorithms requiring a small bond dimension at low enough time or inverse temperature. These approaches so far lacked a rigorous mathematical justification, since existing approximations to thermal states and time evolution demand a bond dimension growing with system size. To address this problem, we construct PEPOs that approximate, for all local observables, i) their thermal expectation values and ii) their Heisenberg time evolution. The bond dimension required does not depend on system size, but only on the temperature or time. We also show how these can be used to approximate thermal correlation functions and expectation values in quantum quenches.

I. INTRODUCTION
The classical simulation of quantum many-body systems is an important challenge for many different fields, including condensed matter physics, quantum chemistry, quantum information and high energy physics. Approximating generic settings efficiently is widely believed to be impossible, due to the exponential growth of the Hilbert space dimension with the system size. However, many situations of interest do not occur on generic regions of the Hilbert space, but are rather confined to the "physical corner" of it . This can then be covered by appropriate variational ansätze, with tensor networks being the most prominent example.
Indeed, tensor network methods based on the DMRG algorithm [1] are routinely used for the simulation of many important physical situations. Most prominently, they are used for low energy properties in one and even two dimensions, with great success [2]. They are also widely used in the approximation of finite temperature phenomena [3][4][5][6][7][8][9][10][11][12][13], and in the simulation of dynamics [8,[14][15][16][17][18][19][20][21][22][23][24][25][26] for short times. This allows for the computation of properties on large system sizes in many situations of interest. These methods are supported by a series of mathematically rigorous results. For low energies, it is known that ground states of gapped models in 1d have good matrix product state (MPS) approximations [27][28][29], and that these approximations can be found efficiently with explicit algorithms [30][31][32] (see [33,34] for current progress in two dimensions). For thermal states ∝ e −βH , it is known that they can be approximated in any dimension by tensor networks if β is not too large [35][36][37][38]. Similar results are also known for the unitary time evolution at short times e −itH [38][39][40]. All of these previous works aim at approximating the whole ground state, thermal state or unitary, respectively. This can be achieved with a bond dimension that grows with system size. * alvaro.alhambra@mpq.mpg.de However, for many physical applications, such as calculating local order parameters, one does not necessarily require a full global approximation, but just a tensor network that describes the relevant local properties well. The success of existing numerical implementations suggests that a much smaller bond dimension, independent of system size, is required in this case.
This problem has been previously explored for ground states: that such local approximations exist in 1d gapped models has been shown in a mathematically rigorous way. First, with matrix product operators (MPOs) [41,42] and more recently with MPS [43][44][45], as well as with projected entangled pairs (PEPS) for 2D ground states with an area law [44,45] (see [46] for a perspective). For thermal states and time evolution, previous results indicate that it is possible to simulate specific local properties in an efficient way [37,40]. However, it was previously not known whether there exist particular tensor networks that approximate all the local properties of a system with a bond dimension independent of system size.
Here we address this question, by constructing tensor networks with a provably small bond dimension that approximate, for any local operator A and in any spatial dimension: • Thermal expectation values A β ≡ Tr A e −βH Z .
• The Heisenberg time evolution e −itH Ae itH .
By linearity, they also approximate extensive sums of local observables A = 1 N N x=0 A x . The results hold for Hamiltonians H that are short-ranged, but not necessarily translation-invariant. The bond dimension has a similar dependence on β and t as previous global approximations, but it now does not grow with system size.
Notably, our constructions are explicit, and give rise to algorithms that can in principle be implemented in practice. While these are likely less efficient or more cumbersome to implement than other known methods used in practice, the advantage is that we have performance guarantees. These do not currently exist for most algorithms used in practice, such as the paradigmatic DMRG algorithm. Theoretical guarantees for certain algorithms support the fact that state-of-the-art methods give accurate results. This is because said guarantees show that the methods target quantities that can in principle be computed efficiently.
We prove these guarantees with the aid of previous results on global approximations [35][36][37][38][39], combined with ideas that allow us to exploit the locality of the problem. For thermal states, this is the principle usually known as the local indistinguishability [37,[47][48][49], which relies on the clustering of correlations [49,50]. For time evolution it is the Lieb-Robinson bound [51,52]. We also introduce a tensor network construction of a linear map that outputs different PEPO approximations for the unitary dynamics of local operators depending on their support, which may be of independent interest.
We then show how these results allow us to compute quantities of interest. We focus on approximations to auto-correlation functions (such as current operators in transport problems [53][54][55][56][57][58][59]), and time-dependent local expectation values in quantum quenches [21,25,[60][61][62]. These two are particularly relevant to current experiments in quantum simulation platforms such as cold atoms, superconducting qubits or trapped ions, since they are some of the most easily measurable and informative quantities.
The paper is structured as follows. First, we explain the definitions and the setting in Sec. II. Then, we show our result for local thermal states in Sec. III, and for time evolution in Sec. IV. We explain the impact of our results for correlation functions and quantum quenches in V, and conclude. The technical proofs and further background are placed in the Appendices.

II. SETTING AND DEFINITIONS
Throughout this work, the notions of approximation used are in terms of closeness in 1-norm or trace norm X 1 for quantum states and their PEPO approximations [63], and the operator norm X for operators. The big-O notation indicates that a quantity f = O(n) is such that for some constant c, f ≤ cn.Õ(n) indicates polylogarithmic corrections f ≤ cn × polylog(n), and o(n) that the scaling is strictly smaller than linear in n.
We focus on systems governed by a local Hamiltonian H = x h x with uniformly bounded, short range interactions max x h x ≤ h between N particles of small local dimension. The interactions have the graph structure of a d-dimensional lattice Λ with growth constant γ and maximum degree z. We denote the small connected regions we focus on as R, which have a maximum length k, such that |R| ≤ k d (that is, the small region can be embedded on a hypercube of length k).
The operators that approximate e −itH and e −βH are MPOs and their higher-dimensional generalization PE-POs [64] which are operator generalizations of MPS and PEPS, respectively. In d = 1, an MPO M D of bond dimension D can be written simply as [3,64,65] On the other hand, PEPOs are defined in terms of the interaction graph with edges {e} ∈ E and vertices {v} ∈ V as [36,64,66] is an operator acting on vertex v, z(v) is its degree and e v 1 , ..., e v z (v) are the vertices going through it. See [36,64] for more detailed descriptions.
We also introduce the notion of a PEPO map, which appears in one of our main results (Result 3). This is a linear map M(A) that takes a PEPO A as an input, and outputs another PEPO with a potentially larger bond dimension. Schematically, it can be understood as That is, it can be written as a product of PEPOs acting on each of the two physical indices of A, and with an additional "physical" index (in the figure, orange and dashed) that is contracted, such that

III. LOCAL APPROXIMATIONS TO THERMAL STATES
We start with local approximations to thermal states. Let R be a small region as defined in Sec. II. The local thermal state is then Tr Λ\R [ e −βH Z ], and the marginal of the PEPO approximationρ k is Tr Λ\R [ρ k ].
A first idea to approximate the marginals could be to simply consider a product i Tr Λ\Ri [ e −βH Z ] for some choice of adjacent regions {R i }. However, this clearly yields a large error in the regions that lie within two adjacent R i . Since we want to approximate all of them at once, we need a scheme that has no preferred partition of the lattice {R i }. This is possible with a uniform average over large enough partitions, and aided by local indistinguishability, which states that the marginal of large thermal states cna be approximated by the marginals of much smaller thermal states (see Lemma 1 for the precise statement).
One key assumption we need is the decay of correlations where the optimization is over arbitrary observables separated by a distance l on the lattice, and ε(l) is a function decreasing with l.
The most prominent decay is exponential ε(l) ≤ e −l/ξ , which defines the thermal correlation length ξ. This has been proven for translation invariant chains [50,67], and there is strong evidence that it holds in any 1d thermal state [50,68]. In higher dimensions it only holds above a finite threshold temperature β * , as shown in [37,69], which also give a bound on the correlation length. However, we can also consider the polynomial decay ε(l) ≤ R l d+1 for some constant R > 0. This might correspond to the behaviour at certain thermal phase transitions.
The decay of correlations is required to simulate local properties without a system size dependence, since one needs to be able to isolate them from distant regions. This is only possible if the correlations decay sufficiently fast, so that regions of width ∼ O(ξ) can be approximated independently of the rest.
With this, we now show the main result of this section.

Result 1.
There is an explicit construction of a PEPÕ ρ k such that for any region R on the lattice Λ it locally approximates the thermal state The bond dimension is bounded as follows.
• For d = 1, assuming correlations decay exponentially, it is a MPO with bond dimension which is quasilinear in (k + ξ)/ for any β O(1).
The bond dimension in both cases grows with k, ξ, β and −1 , as expected. This result implies a good approximation in any local expectation value, since ||Tr Λ\R Fig. 1 for an illustration.
The proof is shown in Appendix B. It draws inspiration from previous results on local approximations of pure states [41][42][43][44]. The PEPO here is the uniform average of tensor products of approximations to local thermal states. These lie on consecutive hypercubes of a given size l d 0 , which span the whole lattice. The average is taken over all the different l d 0 partitions of the lattice (that is, the different displacements of a given partition into hypercubes, see Fig. 4).
Each of the PEPOs on the hypercubes approximates Tr Λ\R [ e −βH Z ] well for a given partition provided that R is far from the boundary between adjacent hypercubes. That this happens for most partitions is guaranteed by a result from [49], which shows that any local marginal of a thermal state does not depend on the regions far away from it (here how "far away" is determined by the thermal correlation length). This is exactly the idea behind local indistinguishability [49], and the related concept of locality of temperature [37,[70][71][72][73], which states that i) subsystems of thermal states of quantum Hamiltonians are robust to distant perturbations and that ii) they are close to the marginals of the thermal state of their vicinity (see [37] for further discussions on this idea).
The smaller PEPOs within the hypercubes can then be taken to be any of the existing global approximations to thermal states. The current best estimates are given in [38] for 1d, which is D ≤ exp(O β log(l 0 / ) ) and in [36] for higher dimensions, which is D ≤ . In the proof, one has to choose l 0 small to keep the bond dimension controlled, but still large enough such that the error from the local indistinguishability estimate is O( ). This leads to Eq. (5) (by choosing l 0 ∝ k+ξ ) and Eq. (6) (by choosing l 0 ∝ max{k −d , dξ d }). Let us comment on the algorithmic implications of Result 1. To build the MPO/PEPO here one just needs to construct them as given by the prescriptions of [38] and [36] respectively. In [38], the 1-dimensional approximation is defined as a product of the Taylor expansion of operators e −βHj , where H j is the Hamiltonian in a small region. Similar algorithms have already appeared in the literature [10,11]. In [36], the higher dimensional approximation is based on the linked cluster expansion [35], which can in principle also be implemented numerically [26,74]. Standard MPO/PEPO results [2,64] guarantee that these approximations can be computed via an algorithm with run-time poly (D, N ).

IV. LOCAL APPROXIMATIONS TO TIME EVOLUTION
We now focus on efficient approximations to the Heisenberg time evolution e −itH Ae itH . The existing results on global approximations [36][37][38][39] show that the bond dimension of a PEPO M t that approximates e −itH − M t ≤ must grow with system size. We again drop this dependence when our target is the Heisenberg evolution of local operators. The key idea is to use the Lieb-Robinson bound [51], which states that the evolution e −itH Ae itH is restricted to a certain "light-cone" much smaller than the whole system. The evolution in this light-cone can be though of as generated by the Hamiltonian H restricted to the vicinity of A.
To simulate this evolution, the first idea could be to simply reduce the problem to simulating the Lieb-Robinson light-cone exactly, which only requires a unitary in a region of size ∝ O(v LR t + log( −1 )), with v LR is the Lieb-Robinson velocity. While this can be done with a bond dimension independent of system size, as previously pointed in [40], it would only be a good approximation for observables in a specific small region.
Here we show how the idea of using the effective lightcones of the evolution can be pushed further in order to build a single tensor network that approximates the Lieb-Robinson light-cone of any local operator. The statements for one and higher dimensions differ significantly, and are presented separately.

A. One dimension
In one dimension, our main result essentially shows that the previous global approximation scheme from [39] simulates well any Lieb-Robinson lightcone, and that in order to do so one only requires a bond dimension depending on the size of the lightcone, and not on N . The result is as follows.

Result 2.
For any operator A with support on a small region R, the Heisenberg time evolution is well approximated as where M t k is an MPO with bond dimension That is, the bond dimension scales polynomially in k, −1 with a constant degree, and exponentially in time, which is consistent with the expected linear growth in entanglement along a generic time evolution [75][76][77]. The numerator in the polynomial of Eq. (8) corresponds to the size of the light-cone that needs to be approximated. See Fig. 2 for an illustration.
The proof is shown in Appendix C 1. The construction is the same as that of [39], which shows that e −itH can be approximated by a quantum circuit of depth two, in which the size of the gates grows with t, −1 and system size N . To drop the system size dependence we give an argument based on the Lieb-Robinson bound which shows that to simulate the ligh-cone of A one just needs to approximate the effective region of size ∝ O(v LR t + log( −1 )). The important point is that this can be done such that the same MPO simulates the light-cone of any local operator. For the argument to hold, it is crucial that the MPO of [39] is a depth-2 quantum circuit, which is impossible in higher dimensions [78].
This MPO can also be implemented in practice [40], following the explicit construction of [39], which consists on the subsequent application of two local Hamiltonian evolutions. Thus, Result 2 also guarantees an efficient 1d algorithm for short times. The result naturally extends to the simulation of any extensive sum of local operators by linearity.
The exponential scaling in time originates from the error in the Lieb-Robinson bound. For systems with different lightcones, the growth in the bond dimension may be much smaller. For instance, many-body localized systems with a "zero velocity" Lieb-Robinson bound ∝ te −l [79,80] can instead be approximated with a bond dimension growing polynomially in time [40].

B. Higher dimensions
For higher dimensions, we resort to the idea in Sec. III of using partitions of the lattice into hypercubes of length l 0 . The simple approach taken here is to first show that e −itH Ae itH is close to the evolution of an effective Hamiltonian in the hypercube, and then approximate that effective evolution with a PEPO of small bond dimension. This PEPO can be constructed with the cluster expansion results from [36,37], adapted to real time evolution (as described in Appendix A 2). The bond dimension of The approximation is then accurate if i) the hypercube is large enough and ii) A is sufficiently far away from the boundary between hypercubes. However, to construct a scheme that applies to local operators A in any region, we need a tensor network that implements the approximation of a hypercubes conditioned on the support of A. This conditioning requires something more involved than just acting with a single PEPO and its adjoint: a tensor map as described in Sec. II. In Appendix C 2 we construct a map that implements a PEPO M p among a possible list as M p AM † p , where M p is determined by the region R that supports A. Since the map is linear, it can also act on extensive sums of local operators.
Given that there are l d 0 possible partitions into hypercubes, we show that this can be done with a tensor network of bond dimension O(l 3d 0 D 2 l0 ). Choosing l 0 =Õ k + v LR t + log 1 then yields the following result.

Result 3.
There is an explicit construction of a linear PEPO map M t k (·) such that, for any operator A with support on a small region R, the Heisenberg time evolution is well approximated as If A is a product of Pauli matrices, the map has bond dimension whereas for arbitrary A the bond dimension is The proof is shown in C 2, where we also explain how to construct the tensor network that applies a given PEPO conditioned on the support of the input A. This construction of a tensor network map that acts differently depending on some feature of the input (here, the nontrivial support of the observable) has, as far as we know, not previously appeared in the literature. We believe that this or similar schemes have the potential for further applications.

V. APPLICATIONS
Our results apply to a wide range of physical situations. For instance, Result 1 directly shows that it is possible to compute local thermal averages of arbitrary local operators, such as the energy or average magnetization. They can, however, also be used to guarantee approximations of more complex objects. We now elaborate on two of them.

A. Correlation functions
A quantity that appears in many relevant situations, mostly pertaining to linear response theory [81], is the 2-point correlation function where A is often taken to be an extensive operator. By this, we mean that it has uniform support throughout the lattice as where each term has bounded norm A x ≤ A and acts on at most k consecutive sites. This is the central object in various areas of quantum dynamics, such as the study of quantum transport, for which A is taken to be a current operator [59] of the relevant conserved quantities.
In the next result, we show how our constructions serve to approximate it. We focus in the one dimensional case, for which we again assume that Eq. (3) holds with an exponential decay. In higher dimensions Eq. (12) involves a contraction which is typically computationally hard [82,83] (although this is potentially not a problem in physically relevant contexts, see [48]). The result is as follows.
Result 4. The correlation function of an extensive observable A in one dimension can be approximated as where k = O ξ log 1 + v LR t + k , and the operatorsρ k and M t k are as defined in Results 1 and 2. The proof is shown in Appendix D. In simple terms, Result 4 implies that A(t)A β can well be approximated by a contraction of MPOs of bond dimension at most See Fig. 3 for an illustration. The biggest drawback of this result is the fast growth in time t, which is nevertheless expected in general. This does not necessarily prevent the algorithms from reaching interesting timescales [84], and is likely an overestimation for many important situations at late times. Nevertheless, we believe that this result mathematically justifies the success of previous tensor network approaches to computing correlation functions [53][54][55][56][57][58][59].

B. Quantum quenches
In quantum quenches, one starts with a pure initial state |Φ . This is often an easy-to-prepare state, such as a ground state of a gapped model, or a product state. Then, the Hamiltonian is suddenly switched to some arbitrary H, and the subsequent time evolution is tracked through expectation values of local observables A(t) ≡ Ψ| A(t) |Ψ . This time evolution can be simulated with the results in Sec. IV, which give an upper bound on the bond dimension required (e.g. Eq. (8)).
This upper bound, however, grows very fast with time, and may become too large at relevant timescales such as the thermalization or Thouless times [85]. This is so even if the local marginals of the evolved state are simply described by a thermal state or a GGE, which depends on very few parameters. In those cases, it is expected that local evolution at late times can also be approximated with a small bond dimension [60,61,[86][87][88].
One of the main results of [43] (applicable in 1d) can help in this setting: the local marginals of any state |Φ(t) on k sites can be approximated with an MPS |Ψ with bond dimension D ≤ exp (k/ ). One can then potentially simulate A(t) in 1d with a bond dimension It is not clear whether an efficient algorithm to find |Ψ exists [43]. However, numerical schemes for approximating A(t) at long times with low bond dimension have already been devised [60,61].

VI. CONCLUSIONS
We have shown how to construct tensor network representations of thermal states and time evolution, in a way that local observables are well approximated. This allows us to provably achieve a bond dimension independent of system size in all cases, which contrasts with what is achieved by previous global approximations. These results have implications on the tensor network simulation of various equilibrium and out of equilibrium situations, and help to mathematically justify the success of previous numerical results.
Since they hold for any local Hamiltonian, and any timescale and temperature, they are likely not tight in particular cases of interest. For instance, when simulating the long time dynamics of a system that has already thermalized, it seems likely that only a much smaller bond dimension (perhaps independent of time) is required. This intuition is present in previous specific algorithms [60,61].
In all of the proofs, except for Result 2, the constructions involve an average over "local approximations", which we expect to be unnecessary in practice. This also applies to previous results for ground state approximations [29,[42][43][44]. It would be interesting to find arguments to circumvent this proof idea, perhaps akin to the proof of Result 2 or to other features such as the Markov property of thermal states [89,90]. For 1d, this question was dealt with in [91], where it was shown that there exists an MPO that reproduces the local expectation values of the thermal state with bond dimension D ≤ expÕ(β 2/3 + β log(k/ )). This significantly improves on the result of Eq. (5), at the price of not giving an explicit construction of the MPO. It may also be possible to obtain a better higher dimensional generalization of Result 2 that does not require a tensor map, as Result 3 does.
Appendix A: Review of the cluster expansion approximation to thermal and real time evolution in arbitrary dimension We briefly review the main result of [36], which provides PEPO approximations for global thermal states in arbitrary spatial dimensions via the cluster expansion [35,37]. We then explain how an analogous statement holds for the operator e itH .

Thermal state at any temperature
Let us recall the definitions from Sec. II: H = x h x is a k-local Hamiltonian on an arbitrary d-dimensional lattice, with up to K ∝ N terms, and such that max x ||h x || ≤ h ∝ O(1), and the interaction graph has degree at most z. We also define the lattice growth constant as γ.
It was shown in [35] that there exists an operatorρ, defined in terms of the cluster expansion (see [37] for a detailed proof), which is a good approximation to the thermal state for high temperatures. The statement is as follows. Let β * be a constant such that γe (2z−1)β * h (e β * h − 1) < 1 (that is, β * ∼ 1/hd). If β ≤ β * , then where x ≡ γe (2z−1)βh (e βh − 1) < 1. Here, L is a free parameter that determines the size of the clusters in the approximation. Importantly, Eq. A1 also holds for the norm ||...|| 2M if we change the temperature to β = β 2M , such that Then, Proposition 1 in [36] allows us to approximate the thermal state at any temperature.
it follows that To choose an -close approximation in 1-norm at any temperature, we need to set M large enough such that β 2M ≤ β * ∼ 1 hd , which amounts to M = O(βhd). Then, for large L the error in Eq. (A2) is Moreover, it was shown in [36] thatρ is a PEPO with bond dimension D ≤ e L , and thus (ρ †ρ ) 2M has D ≤ e 2M L . We conclude that to achieve an error one requires a bond dimension D ≤ e O(βd log βdN ) . (A6)

Real time evolution
It can be easily seen that the proof of Eq. (A1) from [37], and the cluster expansion analysis, also hold for an approximation of e −itH in operator norm, simply if one substitutes β with |t|. This just follows from the observation that all the steps in the proof of [37] remain unchanged if one changes all the norms to operator norms, and that there is no further fundamental differences between the operators e −βH and e −itH . In the same way, one also has the analogue of Proposition 1 for operator norms, which allows us to extend the approximation for arbitrarily long times.
From this observations we conclude that there exists an PEPOŨ t with bond dimension D such that Here we show the main result regarding local approximations to thermal states. First, we need the following key assumption for the thermal state ρ = e −βH /Z. Definition 1 (Clustering of correlations). The state ρ on a lattice system has (l)-clustering of correlations if where X has support of region A only and Y on region B only, and l ≤ dist (A, B).
The following result on local indistinguishability shows that marginals of thermal states when tracing out a big region are well approximated by the marginal of the thermal state of a much smaller lattice. This is the key ingredient to guarantee the faithfulness of local approximations.

Lemma 1.
[Theorem 4, [49]] Let H be a local bounded Hamiltonian, β an inverse temperature and ρ AB = e −βH / Tr e −βH . Let AB 1 B 2 be a separation of the lattice such that B 1 shields A from B 2 by a distance of at least l. Let ρ AB1 be the Gibbs state on region AB 1 only (that is, with the terms of the Hamiltonian H that have support on AB 1 only). If the system has (l)-clustering of correlations, then where C > 0 and c 1 , c 2 > 0 are constant and |∂B 2 | is the size of the boundary between B 1 and B 2 .
This lemma relies on the technique of quantum belief propagation from [92], from which the contribution c 1 e −c2l arises. See also Theorem 4 in [37] for a similar statement at high temperatures only.
To prove the main result of the section, we need to first define the partitions G p of the d-dimensional lattice Λ into hypercubes of length l 0 : such that |Λ which is sub-polynomial in l 0 . In higher dimensions, the result from [36] We also need that each of these PEPOs has trace 1, which can be achieved with a small price in the precision as For simplicity, we will assume Tr ρ(Λ which has bond dimension D ≤ D l0 × l d 0 . We now show this is a good approximation to the marginal of any region R ∈ Λ of maximum length k < l 0 . First, we separate the terms in the sum over p by whether R lies strictly inside one of the hypercubes Λ (i) l0,p or not, where i p denotes the hypercube Λ (ip) l0,p on which R lies, for a given p (see Fig. 4). Here, we just used the triangle inequality and the trivial bound ||ρ − σ|| 1 ≤ 2.
The remaining terms can be bounded with Lemma 1. For a given lattice partition p, let l be the minimum distance between R and the boundary of Λ (ip) l0,p . Then, where in the first line we used the triangle inequality, in the second the fact that the PEPOs have trace 1 by assumption and in the third Lemma 1 (with |∂B 2 | = 2dl d−1

0
) and the definition of . Given that the area of a hypercube of edge l is 2dl d−1 , We now focus on the two different types of decay of the function ε(l/2): exponential and polynomial. Exponential: In this case ε(l/2) ≤ e −l/2ξ . This has been shown in 1d translation invariant chains [50,67], and it is believed to hold for all 1d systems [50,68]. It has also been shown for higher dimensional systems at high temperature using the cluster expansion technique in [37]. There, it is shown that, given β * ≡ log (1 + 1 + 4/γ)/2 /2h, for every β < β * , it holds that where the correlation length is ξ(β) = log[γe 2βh (e 2βh − 1)] −1 .
Starting from the sum in (B15), then (1) . In higher dimensions, it is which is polynomial in k and −1 .
Polynomial: Now assume ε(l/2) ≤ R l d+1 for any d (note that this assumption is likely not necessary in 1d). Then the following sum converges so we simply need to choose l 0 = C max{k −d , dR } for some constant C > 0. The resulting bond dimension is thus the same as Eq. (B18), (B19) with d replaced with R. This is based on the main result of [39], and also relies on the Lieb-Robinson bound [51,52].
Lemma 2 (Lieb-Robinson bound). Let A be a local observable on k sites and H = x h x a uniformly bounded Hamiltonian with finite interaction range. Then there exist constants c, v LR ≥ 0 such that for all X with l := dist(A, X c ) ≥ 2d − 1 we have where H X contains only the terms of H away from A by a distance at most l. That is The result of [39] is that the time evolution e −itH on N particles can be approximated with a matrix product and D t ≤ e O(|t|)+O(log(N/ )) .
The operator M t N is in fact a quantum circuit of depth 2, where the gates act on |Ω N | ≡ O(|t|)+O (log(N/ )) adjacent qubits. That is, let {Ω j } be a partition of the chain into sets of size |Ω N |, and let {Ω j } be the same partition, displaced by an amount |Ω N |/2. Then Our local approximation is M t N (l 0 ), which is defined in the same way as M t N except for the fact that the partition is into much smaller sets of adjacent sites, each of which has length |Ω l0 | = O(|t|) + O (log(l 0 / )) instead of |Ω N |. Here l 0 is a free parameter such that |Ω l0 | < l 0 < N , otherwise independent of N . See Fig. 5 for an illustration of this and the other definitions in the proof.
The goal is to bound the norm ||e −itH Ae itH − M t N (l 0 )A(M t N (l 0 )) † ||, for any operator A with support on at most k adjacent sites. First, with the triangle inequality and the Lieb-Robinson bound, where H

(l)
A is the Hamiltonian containing the terms of H that are a distance l away from A, and such that l 0 −k−|Ω l0 | ≤ l < l 0 − k. Now, let Ω * ±l0 be the nearest sets at a distance strictly greater than l from A, from the left and the right. The key is to notice that, since M t l0 is a depth-2 circuit, there exists a unitary U l0 ≡ U −l0 ⊗ U +l0 acting on Ω * ±l0 such that U l0 M t N (l 0 ) = U ⊗M t l0 for some U with support at distance strictly larger than l 0 from A (see Fig. 5). By construction, the supports of U l0 and H l A do not overlap. 5. Schematic illustration of the proof of the local approximation to time evolution. The green area represents the effective Lieb-Robinson lightcone of A, and the UΩ j , V Ω j are the gates on |Ω l 0 | sites that constitute the approximation. A unitary can be applied to regions Ω * ±l 0 to have the circuit act independently on the region of length 2l0, and such that the resulting unitary on that region is an approximation to e −itH l 0 A which, when acting on A, gives a good approximation to e −itH Ae itH .
The MPO M t l0 is the approximation of the unitary e −itH (l 0 ) A , on a region of size ∝ l 0 , which allows us to use Eq. C3. That is In the second line we have used the unitary invariance of the norm, in the third line the fact that U l0 decouples the region at a distance l from the rest, and in the fourth line the Lieb-Robinson bound again, to relate e −itH (l) A to e −itH (l 0 ) A . Thus, since |Ω l0 | l 0 , if we choose l 0 = O (k + v LR t + log( 1 ) , we achieve the approximation error Finally, since M t N (l 0 ) ≡ M t k is by definition a depth-2 circuit with gates acting on |Ω l0 | sites, the bond dimension required to represent it exactly is

Higher dimensions
For local approximations in higher dimensions, we cannot straightforwardly generalize the proof of the 1d case because we cannot approximate the unitary evolution with a depth-2 circuit, since in higher dimensions one requires depth at least d + 1 (see Appendix B of [78]).
Instead, we use an argument similar to that for thermal states in Appendix B. We use the global approximation of e −itH described in Appendix A to approximate the evolution of the Hamiltonian within the hypercube that has the local operator in the middle. Since this will only work for individual regions, we have to condition the particular partition used on the support of the input, which as we show can be done with a tensor map as defined in Sec. II.
Let us start with the approximation within each of the hypercubes. Lemma 2 guarantees that to simulate the evolution of a local observable e −itH Ae itH where supp(A) ⊂ R we only need the Hamiltonian of a smaller region H X such that l := dist(A, X c ) ≥ 2d − 1.
We now define the same set of hypercubes covering the lattice as in Appendix B (also see Fig. 4) Again the parameter p defines e.g one of the corners of the hypercubes, or any other parameter that determines the whole partition. It can adopt l d 0 values. To simulate A(t) we then define the PEPO where M For later convenience we also define the p = 0 as a special case, To approximate A(t) we shall choose the partition G p so that there exists a hypercube Λ (ip) l0,p such that dist(A, \Λ (ip) l0,p ) ≥ l = l 0 /2 − k (that is, the region R is in the middle of the hypercube). This yields an error 2 + cl d−1 e vt−l A , from which we have to then choose l 0 =Õ k + v LR t + log 1 to get 3 A .
However, we want a PEPO that approximates any A, no matter the region in which it lies. This can be achieved by constructing a linear map M t k (A) made of tensor contractions such that when applied to A it results on the operator M = I. Thus, on top of being able to implement the right partition p, we have to make sure that the support beyond the lightcone is trivial.
As we now show, all these requirements can be achieved with a bond dimension D 2 p × (l 0 + k) 3d , and so we obtain that is, exponential in t (as expected) and polynomial in k, −1 now with a degree growing with t. We now explain how to construct this PEPO map M t k (A). Let us first limit ourselves to the cases where A is a product of Pauli matrices A = x σ (j) x with j ∈ {0, 1, 2, 3}, which stand for, respectively, the Pauli {I, X, Y, Z} (the generalization to higher local dimensions is straightforward). By definition the only non-identity elements in A have support on a small connected region R of size k. The result extends to arbitrary local operators by linearity.
One dimension: We first consider the one-dimensional case, which serves as an illustrative example, and then extend it to larger dimensions. If we simplify the picture by putting the two physical indices of the output and the input together, the tensor network can be drawn schematically as follows.

Physical indices
Virtual indices Output

Input
We have three virtual indices with different functions: p is the index that determines the partition into hypercubes from Eq. (C14) to use, and as such is fixed by the position of the non-identity Pauli matrices in A (i.e. the region R). It can adopt l 0 + 1 different values. For a given p, index r contains the information about the tensors from the MPO M (l0) p that are applied to the Pauli, and thus has bond dimension given by Eq. (C16). Finally, s = {s 1 , s 2 } are the indices that determine the support of the output MPO, ensuring that it is trivial outside the ligthcone, as required.
The r index is thus determined by the MPOs M (l0) l0 . The p, s indices work as follows: starting from the left of the chain, s 1 is zero until it reaches a non-trivial Pauli input at site x R . At that point it turns into (l 0 + k)/2, and then decreases by 1 at each tensor until it reaches a value we label as 0 , which indicates the end of the lightcone at the right. Beyond that point, the desired outcome M t k (A) must have trivial support, which we enforce by setting the output tensor to be the identity M (l0) 0 = I. The point x R also determines the particular value of p = p * , which then sets the PEPO M (l0) p * to act within the lightcone. The index p * is chosen in such a way that the boundary of the hypercubes is (l 0 − k)/2 away from the left and (l 0 + k)/2 from the right of x R . This sets the right part of the lightcone. For the left side of the lightcone, we use index s 2 . It starts with value (l 0 − k)/2 at x R , and decreases by 1 to the left, while being 0 all the way to the right. The values from (l 0 − k)/2 to 1 to the left of x R allow us to set the output tensor to be that of M (l0) p * ) † . Then, it goes from 1 to another 0 , after which the output of the tensor is the identy, thus setting the end of the lightcone to the right.
We now illustrate this with the particular tensors. The following combinations of indices are the ones that add to the final result. First, the ones that contribute to the lightcone region are shown in Fig. 6 as follows. Then, the ones required to set the output beyond the lightcone to be the identity are shown in Fig. 8 as the following. To summarize, p * determines the right set of hypercubes, and s 1 , s 2 indicates how far we are to starting point of R, from the right and the left. The tensors apply M (l0) p * to A (so that the output is M i,p * ) † as desired. This is illustrated in Fig. 9. Higher dimensions: This can be done by establishing the same index configurations as for 1d for every individual direction. Now, we have two s indices s (q) 1 , s (q) 2 for each dimension q ∈ {X, Y, ...}, as well as one index p (q) for each. Let us begin with a particular spatial direction, say q = X, with coordinates adopting values x ∈ {0, 1, ..., L X } (L X being the length in that direction). Starting from the first position x = 0, the index s X 1 is 0 until any non-trivial Pauli appears at the input, with an X coordinate x R . It is possible that there is more than one non-trivial input with the same initial coordinate x R , but this does not affect the scheme. At x R , the index s X 1 is then again changed to (l 0 + k)/2, and s X 2 is changed to (l 0 − k)/2. Beyond x R , s X 1 decreases along the positive X direction, and s X 2 decreases along the negative X direction (while being 0 in the positive direction). When either s X 1 = 0 , s X 2 = 0 , the tensor acts as the identity M first non-trivial Pauli(s) determines the position of the hypercubes in the X direction, as p X = p X * in the same way as in one dimension. We repeat the same scheme for every dimension. This is illustrated in 2d in Fig. 10 as follows. In the figure, the σ represent sites at which the input is not the identity but any other Pauli matrix. The green shaded region is the square region R in which the observable A has support. The index s X 1 grows to the right, s X 2 decreases to the left, and s Y 1 , s Y 2 respectively grow and decrease downwards. p * has two components of l0 + 1 possible values each and is fixed across the entire lattice. We only show virtual indices, and omit the input and output legs of the tensors explicitly for simplicity.
With this, we obtain a fixed partition into hypercubes as determined by p * = {p q * }, and also a set of indices s whose non-zero value indicate where the output light-cone lies. That is, whenever any one of the 2 × d indices s is 0 , we set the local tensor to be that of M Given this scheme, the map outputs M (l0) i,p * AM (l0) † i,p * with the right {i, p * } chosen, such that R is in the middle of the corresponding hypercube. From our discussion above, this is an -close approximation to e −itH Ae itH in operator norm. The whole map involves, on each side of the input A, a virtual index r with dimension D p , and two additional virtual indices p, s with dimension (l 0 + 1) d , ((l 0 + k)/2) 2d respectively, to connect the tensors at each site. Thus, the bond dimension can be taken as D 2 p × (l 0 + k) 3d , as stated above. Finally, our result assumed that A was a product of Paulis. Since the map is linear and any local operator can be written as a linear sum of up to 4 k Pauli matrices, the result follows by replacing → 4 −k in the bound for the bond dimension Eq. (C18).
Using Result 2, we have that | M t k A x M t † k β A y β − A x (t) β A y β | ≤ 3 ||A|| 2 . Repeated applications of the Lieb-Robinson bound yield | A x (t) β A y β − A x (t)A y β | ≤ 2 ||A|| 2 , and so we obtain For the pairs that are nearby, let us now constrain further k ≥ ξ + 2L. This means that, given Eq. (D7) we still need to cover the pairs x, y such that dist(x, y) ≤ O(v LR t + k + ξ log 1/ ).
This can clearly be done by choosing k = O((v LR t + k + ξ) log 1/ ), which is consistent with k ≥ ξ + 2L. Then, Result 1 implies that and from Result 2, | M t k A x M t † k A y β − e −itH A x e itH A y β | ≤ 3 ||A|| 2 . This thus shows that Eq. (D8) holds for all pairs x, y regardless of their distance. The final result follows by approximating each term in the sum e −itH Ae itH A β = 1 N 2 x, y e −itH A x e itH A y β .