Classical simulation of short-time quantum dynamics

Recent progress in the development of quantum technologies has enabled the direct investigation of dynamics of increasingly complex quantum many-body systems. This motivates the study of the complexity of classical algorithms for this problem in order to benchmark quantum simulators and to delineate the regime of quantum advantage. Here we present classical algorithms for approximating the dynamics of local observables and nonlocal quantities such as the Loschmidt echo, where the evolution is governed by a local Hamiltonian. For short times, their computational cost scales polynomially with the system size and the inverse of the approximation error. In the case of local observables, the proposed algorithm has a better dependence on the approximation error than algorithms based on the Lieb-Robinson bound. Our results use cluster expansion techniques adapted to the dynamical setting, for which we give a novel proof of their convergence. This has important physical consequences besides our efficient algorithms. In particular, we establish a novel quantum speed limit, a bound on dynamical phase transitions, and a concentration bound for product states evolved for short times.


I. INTRODUCTION
The study ofthe dynamics of quantum many-body models is a highly active area of research in quantum information science, both from the perspective of physics and of computation. Probing dynamics provides access to a wealth of physical phenomena and can enable the solution of hard computational problems. Existing quantum simulators are already allowing us to explore quantum dynamics of high complexity. To assess the potential for quantum speed-up, it is important to understand the reach of classical methods for these tasks. Unless all quantum computations can be classically simulated, i.e. BPP = BQP, classical algorithms will be unable to approximate quantum dynamics in an arbitrary setting. Nevertheless, there exist restricted regimes in which efficient classical simulation is possible. An important special case is evolution for short times, during which the quantum information will not spread much. Tensornetwork methods illustrate this point, as they provide provably efficient means of simulating unitary dynamics in one spatial dimension for short times [1,2].
In this work, we characterize the computational complexity of short-time dynamics under a local Hamiltonian more generally. Locality in this context means that the Hamiltonian can be written as a sum of operators supported on small subsystems. We do not require geometric locality and we do not restrict the Hamiltonian to a finitedimensional lattice. Instead, we only impose that every term in the Hamiltonian overlaps with a constant number of other terms. These conditions are satisfied by a wide class of physically relevant Hamiltonians, including parent Hamiltonians of quantum LDPC error-correcting codes [3]. Dynamics governed by Hamiltonians of this * dominik.wild@mpq.mpg.de † alvaro.alhambra@csic.es type are amenable to cluster expansions, which have long been used in both classical and quantum statistical mechanics of lattice models [4][5][6][7][8], leading to results such as the uniqueness of Gibbs states [9,10], efficient approximation schemes for partition functions [11,12], the decay of correlations [13][14][15], and concentration bounds [16,17]. Despite their long history, cluster expansions have typically been applied to equilibrium properties, while dynamics have been rarely considered. The paper is structured as follows. In the remainder of the introduction, we summarize the main results and give an overview of their implications for the complexity of quantum dynamics. In Sec. II, we define the cluster notation used throughout. Sec. III discusses the results and algorithm for local observables. In Sec. IV, we describe corresponding results for the Loschmidt echo. Physical consequences of these results are discussed in Sec. V. We conclude in Sec. VI with further remarks and open questions. The main text provides an overview of all the proof techniques, while technical details that are less crucial to the understanding are placed in appendices.

A. Summary of results
Our first main result concerns the dynamics of a fewbody observable A under a local Hamiltonian H.
? BQP-complete [21] TABLE I. The computational complexity of computing the quantities in the leftmost column with additive error ε = 1/poly(n), where n is the system size, for different times. All expectation values are with respect to product initial states. The constants t * and t * L are independent of the system size but depend on the details of the problem. Entries highlighted in blue are results from this work. Question marks denote regimes where the computational complexity is unknown.
The scaling with the time t is rather unfavorable, but the computational cost is independent of system size. Moreover, for any constant value of t, the run time has a polynomial dependence on 1/ε, which is an improvement over all previously known algorithms, such as those based on Lieb-Robinson bounds. The cluster expansion can thus be seen as an alternative approach to analyzing the effective locality and light-cone structure of manybody dynamics.
Our second result characterizes the complexity of computing the Loschmidt echo, tr(e −itH ρ).

Result 2. (Informal version of Theorem 12) Given a lo-
cal Hamiltonian H, a product state ρ, and |t| < t * L , there exists an algorithm that approximates log tr(e −iHt ρ) up to additive error ε with run time of at most where t * L is a positive constant. A key insight of this work is that when ρ is a product state, the objects analyzed in Results 1 and 2 fit the framework of cluster expansions that are commonly applied to the partition function tr(e −βH ) [11,17,22]. We give a novel proof of the convergence of these expansions based on the counting of trees.
In both algorithms, the cluster expansion enables an efficient grouping of the terms of a Taylor series by clusters of subsystems. Crucially, we show that only connected clusters contribute. Because the number of connected clusters grows at most exponentially with the size of the cluster, we can establish the convergence of the cluster expansion at short times by bounding the magnitude of the individual terms. The computational cost of the approximation algorithms follows by estimating the cost of computing a truncated cluster expansion while controlling the truncation error. For local observables, we are able to extend the algorithm beyond the radius of convergence of the cluster expansion using analytic continuation. The doubly exponential dependence on the evolution time t in Result 1 is a direct consequence of the analytic continuation scheme.
The above results have several important physics implications. Result 2 implies that dynamical phase transitions [23] cannot occur at times t ≤ t * L for local Hamiltonians and product initial states. In addition, it es-tablishes a novel quantum speed limit [24] that is independent of system size, in stark contrast to previous results for general initial states [25,26]. A generalization of Result 2 to the multi-Hamiltonian Loschmidt echo tr(ρ l e −itH (l) ), with H (l) local Hamiltonians, allows us to prove Gaussian concentration bounds of local observables on states evolved for a short time, a case not covered by previous results [27][28][29]. More precisely, we show that the probability of measuring H (2) in the evolved product state ρ(t) = e −itH (1) ρe itH (1) away from the mean tr(ρ(t)H (2) ) by δ is suppressed by e O(−δ 2 /n) .

B. Complexity of dynamics
Our main results have nontrivial consequences for the computational complexity of short-time quantum dynamics, which are summarized in Table I. For observables, it is known that approximating A(t) with additive error ε = 1/poly(n) up to times t = poly(n) is BQP-complete. This follows from the fact that determining the state of a single qubit at the output of a circuit with poly(n) gates is BQP-complete, combined with the existence of local Hamiltonians that simulate arbitrary quantum computations [18]. At the same time, Theorem 6 shows that we can compute A(t) classically with a similar error with computational cost poly(n) as long as t = O(1). This indicates a transition in the complexity of simulating local observables as the system evolves. The exact nature of this transition and the computational complexity of the intermediate regime t ∼ polylog(n), where the cluster expansion fails to be efficient, remains an open problem.
For the Loschmidt echo, there are two meaningful notions of approximation. The first one is with a small multiplicative error, which is equivalent to a small additive error in the logarithm. For this, Theorem 12 shows that there is an efficient approximation algorithm for times |t| < t * L with polynomial cost in both n and 1/ε. Unlike in Theorem 6, it is not possible to extend this result to arbitrary times. If we take ρ ∝ I, L(t) becomes an imaginary-time partition function. Approximating this for |t| = O(1) even with an O(1) multiplicative error has been shown to be #P-hard for 2-local, classical Ising models [20]. Hence, there is only a constant gap between the times accessible with our algorithm and the #P-hard regime, where an efficient algorithm is unlikely to exist.
The complexity of the analogous problem for real partition functions and the thermodynamic free energy has been recently considered in Ref. [30].
We may also consider the weaker additive approximation to the Loschmidt echo. Since |L(t)| ≤ 1, Theorem 12 also implies that we can efficiently approximate the Loschmidt echo to an ε-additive error for |t| < t * L . Calculating the Loschmidt echo for circuits of polynomial size with additive error ε = 1/poly(n) is BQP-complete [21]. To the best of our knowledge, the intermediate regime has not been explored.

A. Hamiltonian
We consider a set of n spins, V . Each spin v ∈ V is associated with a local Hilbert space The total Hilbert space is formed by the tensor product space H = v∈V H v . We call a subset of spins X ⊆ V a subsystem. For any linear operator A on H, we denote its support by X = supp(A), i.e., X is the smallest subsystem in V on which A acts nontrivially.
Next, we formally define the notion of local Hamiltonians. Given a set of subsystems S, we write a Hamiltonian H as where each λ X is a real coefficient and h X is a Hermitian operator acting on the subsystem X such that supp(h X ) = X. The coefficients satisfy |λ X | ≤ 1 and are chosen such that h X = 1, where · is the operator norm. A Hamiltonian is called k-local if it is a sum of terms that act on at most k sites or, equivalently, |X| ≤ k for all X ∈ S.
To characterize the connectivity of the Hamiltonian, we define the associated interaction graph G [22]. Given a set of subsystems S, the interaction graph G is a simple graph with vertex set S. There is an edge between two vertices X and Y if the respective subsystems overlap. We denote the maximum degree of the interaction graph by d. Throughout this work, we only consider k-local Hamiltonians for which, in addition, d is independent of the system size n. Each local term in the Hamiltonian therefore only overlaps with a constant number of other terms, which includes many physically relevant cases such as Hamiltonians with finite-range interactions. We point out that the number of terms |S| in these Hamiltonians increases at most linearly with the number of spins n.

B. Clusters
We define a cluster as a nonempty multiset of subsystems from S. Here, multiset refers to a set with possibly repeated elements but without ordering. We use boldfont letters V, W, . . . to denote clusters. We call the number of times a subsystem X appears in a cluster W the multiplicity µ W (X). If X is not contained in W, then µ W (X) = 0. The size |W| = X∈S µ W (X) of a cluster is the number of subsystems that it contains, including their multiplicity. The set of all clusters of size m is denoted by C m and the set of all clusters by C = m≥1 C m . We associate with every cluster W a simple graph G W , the so-called cluster graph. The vertices of G W correspond to the subsystems in W, with repeated subsystems also appearing as repeated vertices. Two distinct vertices X and Y are connected by an edge if and only if the respective subsystems overlap, i.e., X ∩ Y = ∅. We say that a cluster W is connected if and only if G W is connected. We use the notation G m for the set of connected clusters of size m, and G = m≥1 G m for the set of all connected clusters. The following statement concerning the number of connected clusters is an essential ingredient of our algorithms.
Lemma 1 (Proposition 3.6 of Haah et al. [22]). Given a subsystem X ∈ S, the number of clusters in G m that contain X is bounded from above by (ed) m . Moreover, there exists a deterministic classical algorithm with run time exp(O(m log d)) that outputs a list of all such clusters.
The classical algorithm is given as Algorithm 1 in Section 3.4 of Haah et al. [22].
The union W = V 1 ∪ V 2 of two clusters V 1 and V 2 is defined as the union of the multisets, adding all multiplic-ities such that µ W (X) = µ V1 (X)+µ V2 (X) for all X ∈ S. Another set of clusters of special interest is formed by the clusters connected to the support of a few-body operator A. We say that a cluster W is completely connected to A if and only if the cluster graph G W∪{supp(A)} is connected, where we assume for simplicity that A acts on a subsystem contained in the Hamiltonian such that supp(A) ∈ S. We denote the set of such clusters of size m by G A m , and G A = m≥1 G A m . The different sets of clusters are illustrated in Fig. 1.
Before proceeding, we introduce further notation related to clusters. It is sometimes convenient to assign a (nonunique) ordering to the subsystems in a cluster.
For W ∈ C m , we then write W = {W 1 , W 2 , . . . , W m }. We make frequent use of the shorthands λ W = X∈S λ µ W (X) X and W! = X∈S µ W (X)!. We often take derivatives with respect to the parameters of the Hamiltonian for all subsystems contained in a cluster W. To this end, we define the cluster derivative D W , which acts on any function of the Hamiltonian parameters λ = {λ X : X ∈ S} as Here, the subscript λ = 0 means to set λ X = 0 for all X ∈ S after taking the derivatives. Hence, the cluster derivative isolates the contribution from the monomial λ W .
For any function f (λ), we define its cluster expansion as the multivariate Taylor-series expansion in λ. With the above notation, the cluster expansion can be concisely written as Our goal is to establish conditions under which the cluster expansion converges to f (λ) for different functions of interest. We illustrate the above concepts by an example in Appendix B 1.

C. Cluster partitions
A partition P of a cluster W is a multiset of clusters We are particularly interested in partitions where every element is a connected cluster. We refer to these as partitions of W into connected subclusters. The set of all such partitions is denoted by P c (W).
We introduce several quantities characterizing cluster partitions. We use tildes to distinguish these from similar quantities describing clusters. The multiplicityμ P (V) is defined as the number of times the cluster V appears in the partition P . The size |P | = V∈Gμ P (V) is the number of clusters in P , including their multiplicities. We also use the shorthand For every partition P ∈ P c (W), we define a simple graphG P , called the partition graph of P . The vertices ofG P are the clusters in P . Two clusters V, V ∈ P are connected by an edge if and only if they overlap, that is, there exist subsystems X ∈ V and Y ∈ V such that X ∩ Y = ∅. Alternatively, we may obtainG P from G W as follows. For every V ∈ P , we merge the corresponding vertices in G W into a single vertex and remove all loops. If any of the remaining edges are repeated, they are reduced to a single edge. Figure 2 shows an example of a partition graph and illustrates its connection to the cluster graph.

A. Cluster expansion
We consider the expectation value of an observable A for an initial product state ρ evolving under a local Hamiltonian H. After time t, we have Due to the dependence of H on the parameter set λ, we may think of A(t) as a function of λ. The corresponding cluster expansion is given by For W ∈ C m , the cluster derivative can be explicitly evaluated as where the sum runs over all permutations of the indices {1, 2, . . . , m}.
For simplicity, we assume that the support of A is contained in the set of subsystems S of the local supports of the Hamiltonian, although this constraint can be readily relaxed. We establish the convergence of the cluster expansion for short times in two steps. First, we show that only clusters that are completely connected to A contribute to the sum.
Second, we need to bound the magnitude of the cluster derivative when it does not vanish. Note that there are at most 2 m in the nested commutators of Eq. (9). Repeated application of the triangle inequality then yields tr By combining these observations, we are able to prove convergence of the cluster expansion for short times: Proposition 3. Consider an operator A for which supp(A) ∈ S and let |t| < t * = 1/(2ed). Then, Proof. Consider the error of neglecting all clusters of size m > M from the cluster expansion. Since |λ X | ≤ 1 for all X ∈ S, we can bound this error by (13) By assumption, supp(A) ∈ S, which allows us to obtain every cluster in G A m by starting from a cluster W ∈ G m+1 that contains X = supp(A) and reducing the multiplicity µ W (X) by 1. It follows from Lemma 1 that G A m ≤ (ed) m+1 . Substituting this bound into Eq. (13) yields a geometric series, which converges when |t| < t * . Convergence of this series implies the convergence of the cluster expansion such that f A (t) = A(t) and leads to the error bound in the lemma.
We highlight that Proposition 3 holds for any quantum state ρ. The restriction to product states only becomes relevant when bounding the computational cost. In addition, the proposition is valid for complex values of t.

B. Computation for short times
We now discuss the computational cost of estimating A(t) for a time |t| < t * up to additive error ε A . For simplicity, we only give the asymptotic scaling of the algorithm with ε and |t|/t * and suppress the dependence on constant parameters such as the locality k or the connectivity d of the Hamiltonian. Moreover, we exclude issues of finite numerical precision, which have been addressed rigorously by Haah et al. [22], from our considerations.
Our approximation algorithm computes the cluster expansion for all clusters up to size M , whose value is determined by ε and |t|/t * . For all m ≤ M , we proceed in two steps: (i) Enumerate all the connected clusters in G A m . (ii) Compute and add the contributions of every cluster.
Step (i) can be carried out by using the algorithm in Lemma 1 to first compute clusters of size m + 1 containing X = supp(A) and by subsequently reducing the multiplicity of X by one. The run time is exp(O(m)). The computational effort of step (ii) is dominated by the evaluation of the sum over nested commutators in Eq. (9). We show in Appendix A that this can be done for a product state in time exp(O(m)) by suitably grouping the m! terms of the sum. Together, the two steps lead to the following run time of the approximation algorithm.
Proposition 4. Let ρ be a product state and |t| < t * = 1/(2ed). There exists an algorithm that outputs an estimatef A (t) with run time Proof. According to Proposition 3, truncating the cluster expansion at order M > log ed (1−|t|/t * )ε / log t * |t| leads to an error that is bounded from above by ε A . Following the above discussion of enumerating the clusters and computing their contributions, we see that the cluster expansion truncated at order M can be evaluated in time exp(O(M )). Picking the smallest integer M that guarantees the desired error bound yields Eq. (14).

C. Computation for arbitrary times
The convergence result of Proposition 3 is independent of the system size n. Hence, A(t) remains analytic in the thermodynamic limit for all complex values of t that satisfy |t| < t * . Given any t 0 ∈ R, we may write A(t) = tr e iH(t−t0) Ae −iH(t−t0) ρ , where ρ = e −iHt0 ρe iHt0 is another quantum state. This shows that A(t) is analytic on a disk in the complex plane of radius t * around any point on the real axis. Stated differently, A(t) is analytic for all complex values of t satisfying |Im(t)| < t * . This analytic structure provides a strategy to compute A(t) for a product state ρ for all t ∈ R and any system size n by means of analytic continuation.
While there are many approaches to analytic continuation, we pick here a concrete method that employs a function φ(z) that maps a disk onto an elongated region along the real axis. For some R > 1 and w > 0, we assume that φ(z) satisfies the following three properties: We show below, using an explicit example, that such a function exists.
Next, we define f (z) = A(tφ(z)) , where t ∈ R is the time at which we want to evaluate A(t) . It follows from property (ii) that f (1) = A(t) . Because A(t) is analytic when |Im(t)| < t * , properties (i) and (iii) together ensure that f (z) is analytic on D R , provided that w|t| < t * . These relations are illustrated in Fig. 3.
Next, we compute the Taylor series of f (z) at z = 0 up to order M . With where we used the fact that φ 0 = 0. We can obtain A l t l from the cluster expansion as  To bound the truncation error of the Taylor series, we make use of the following standard result in complex analysis (see, e.g., Proposition 18 of reference [12]).
Lemma 5. We denote by D R the closed disk of radius R centered at z = 0 in the complex plane, i.e., D R = {z ∈ C| |z| ≤ R}. Let f (z) be a complex function that is bounded by |f (z)| ≤ F and analytic for all z ∈ D R . Given α < 1, for all z ∈ D αR , the error of approximating f (z) by a truncated Taylor series of order M is bounded by Combining this lemma with the above considerations for computing the Taylor series yields an algorithm to estimate A(t) for all t ∈ R. Theorem 6. Given t > 0, there is an algorithm that outputs the estimatef A (t) with run time such that for any product state ρ Proof. We bound the truncation error of the Taylor series of f (z) at z = 1 by letting α = 1/R in Lemma 5. This yields By the same argument we used to show that A(z) is analytic for all |Im(z)| < t * , Proposition 3 implies that for all z satisfying |Im(z)| < t * . Since |Im(φ(z))| ≤ w, assuming wt/t * < 1, we have Hence, The parameters R and w cannot be chosen entirely at will owing to the constraints on φ(z). We do not attempt to address this issue in general but instead consider the . This function clearly satisfies the requirements (i)-(ii). For requirement (iii), we assume that the branch of the logarithm is chosen such that Recalling that w < t * /t, we separately let wt/t * = η for some η < 1. We can always , allowing us to replace R by R in Eq. (22) at the cost of a factor of 2. For the particular choice of φ(z), we thus obtain (23) It follows from this expression that truncating at order M > e πt/2ηt * log 2ed 1−η 1 ε e πt/2ηt * guarantees an error ε M (1) ≤ ε A . Thus, our outputf A (t) is the Taylor series of f (z) at z = 1 up to the smallest integer satisfying the lower bound on M . Since the computational cost is exponential in M , the theorem follows by setting η = 1/2.
The above approach yields an algorithm to estimate A(t) with a computational cost that scales polynomially with 1/ε for any fixed real time t. The cost, however, has a doubly exponential dependence on t/t * , which renders this approach unsuitable for practical computations at long times. The chain of disks, another common method of analytic continuation, also yields a doubly exponential dependence of the computational cost on the time t [32]. We conjecture that this scaling is an unavoidable consequence of analytic continuation. More efficient continuation algorithms may be available if the analytic domain of A(t) extends along the imaginary direction beyond a constant. This is, however, not possible in general because there exist local observables and Hamiltonians for which A(t) becomes nonanalytic in the thermodynamic limit at a constant imaginary time [33]. This also indicates that our convergence result is optimal up to an improvement of t * by a constant factor. We remark that the above procedure can be adapted to not only yield expectation values with product initial states, but also an M -local approximation of the operator A(t).

D. Comparison with the Lieb-Robinson bound
The Lieb-Robinson bound [34] offers an alternative method for computing the time evolution of local operators. It implies that where C and ξ are constants, and v is the Lieb-Robinson velocity. The Hamiltonian H l = X∈S: dist(X,A)≤l λ X h X contains all local terms within the graph distance dist(X, A) between the operators X and A on the interaction graph G.
To compute A(t) to within additive error ε A , it suffices to compute e −iH l t Ae iH l t on a region of radius l = vt + ξ log(C/ε). In a lattice in D dimensions, the computational cost of performing exact diagonalization on this region is exponential in [vt + ξ log(C/ε)] D = ξ D log D e vt/ξ (C/ε) . For D > 1, this yields an algorithm whose cost is superpolynomial in both e vt/ξ and 1/ε, as opposed to the polynomial dependence on 1/ε in Theorem 6. The difference is even more marked in expander graphs, for which the number of sites at a distance l grows as exp(O(l)). The Lieb-Robinson method has a run time that is exponential in poly(1/ε) and doubly exponential in t for such graphs.

A. Cluster expansion
We now focus on the Loschmidt echo where ρ is a product state on the n qubits. This is an important quantity in the study of dynamics of quantum systems. It appears in diverse contexts such as quantum chaos [35], as the characteristic function of the local density of states [36], and in algorithms for quantum simulation [37]. As we discuss below, it is also a key quantity in the description of dynamical phase transitions and other relevant phenomena. We consider the logarithm log L(t), whose multivariate Taylor expansion can be written in terms of cluster derivatives as Our goal is to establish sufficient criteria for the convergence of this expansion. Working with the logarithm of L(t) greatly reduces the number of clusters involved since only connected ones contribute: If ρ is a product state, then for any disconnected cluster W, Proof. Since W / ∈ G, there exists a decomposition W = W A ∪ W B such that the cluster graphs G W A and G W B are disconnected components of G W . We define H A = X∈W A λ X h X and H B = X∈W B λ X h X , where each subsystem is included at most once in the sum, even if it appears with higher multiplicity in the cluster. Clearly, For ρ a product state, we further have and thus The first term vanishes because H A is independent of λ X for any X ∈ W B . Similarly, the second term is independent of λ X for X ∈ W A .
Hence, we can restrict the sum in Eq. (26) to connected clusters.
We next bound the cluster derivative using the bounded connectivity of the cluster and partition graphs. Many properties of a graph G are captured by the Tutte polynomial T G (x, y) (see, e.g., [38]). For instance, T G (1, 1) is the number of spanning trees (or spanning forests if the graph is not connected), T G (2, 1) is the number of forests and T G (2, 2) is 2 |E| , where |E| is the number of edges.
Using the Tutte polynomial, the cluster derivative of log L(t) can be concisely expressed as in the following lemma.
Lemma 8. For any W ∈ G m , the cluster derivative of log L(t) can be written as where we introduced the symmetrized expectation value (32) and the combinatorial factor We prove this lemma in Appendix B 2. A similar statement has been reported by Mann and Helmuth [11], with Helmuth, Perkins, and Regts pointing out the relation to the Tutte polynomial [39].
We make use of Lemma 8 to derive the following upper bound on the cluster derivative. Proposition 9. Let W ∈ G m . Then, We defer the proof to Appendix B 4. In rough terms, it proceeds by observing that | h V s | ≤ 1 and 0 ≤ TG P (1, 0) ≤ TG P (1, 1), where TG P (1, 1) is equal to the number of spanning trees ofG P . The sum of these trees over all partitions can then be bounded by the number of spanning trees of the original cluster graph G W . This number is smaller than an exponential in m times W!, yielding Eq. (34).
Having bounded each term in the cluster expansion, it remains to bound the number of terms, i.e. the number of connected clusters of size m. This is done with Lemma 1 and the fact that there are |S| subsystems on the lattice such that the total number is bounded by |S|(ed) m .
The main result of this section is the following theorem.
Theorem 10. The logarithm of the Loschmidt echo, log L(t), is analytic for |t| < t * L = 1/[2e 2 d(d + 1)] and the truncation error of the cluster expansion can be bounded by Proof. Lemma 7 and Propositions 9, together with the bound on the number of clusters, imply that The result then follows by considering the weight of the terms of the Taylor expansion with m > M .
The overall argument of this section mirrors the steps of previous results on Gibbs states [17,22]. Our new technical contributions are the expression for the cluster derivative in Lemma 8 and the bound in Proposition 9, for which we use a novel proof strategy based on counting trees. We note that convergence results similar to Theorem 10 can also be proved using the general framework of abstract polymer models [8,10,11].

B. Computation of the Loschmidt echo
The above Taylor approximation allows for an efficient classical estimation of the Loschmidt echo for short times. Theorem 10 guarantees that to approximate log L(t) it suffices to calculate the terms in the series up to some order M . Recall that the m th order term is given by As in Sec. III B, we need two steps to calculate these: (i) enumerate all the connected clusters in G m and (ii) compute and sum the cluster derivative for every connected cluster according to Eq. (37). Lemma 1 addresses the first step. For the second one, we need to bound the cost of computing cluster derivatives. Related bounds have previously been stated in several works [11,22,40], where the computation hase proceeded either by directly differentiating log L(t) or by summing the terms of an expansion similar to that in Lemma 8. Here we pursue the latter approach, stating the computational cost in the following proposition. As in Sec. III, we ignore complications arising due to finite numerical precision. With these ingredients, the main result is as follows.
Theorem 12. For times |t| < t * L = 1/[2e 2 d(d+1)], there exists a classical algorithm with run time Proof. Theorem 10 implies that the truncation error of the cluster expansion is smaller than ε when keeping terms up to M > log For a fixed t < t * L , the run time is polynomial in 1/ε as well as in the number of terms |S|, which is proportional to the system size n. The output approximates the Loschmidt echo by exp[f L (t)] with a multiplicative error Unlike in the case of local observables, it is in general not possible to analytically continue the Loschmidt echo beyond t * L because the zeros of L(t), and thus nonanalyticities of log L(t), may be located anywhere in the complex plane, including the real axis.

C. Generalized Loschmidt echo
We now show that similar results hold for a Loschmidt echo with multiple Hamiltonians, defined as L(t 1 , t 2 , . . . , t K ) = L({t l }) = tr K l=1 e −iH (l) t l ρ . (40) We assume that each of the K Hamiltonians {H (l) } satisfies the same conditions as in Sec. II, so that they can all be written as We define the set of labeled subsystems S K = {(X, l) : X ∈ S, l ∈ {1, 2, . . . K}}, where the additional index keeps track of the Hamiltonian. The corresponding interaction graph G K may be viewed as K copies of the original interaction graph G, where vertices are connected if the subsystems overlap, independent of which copy they belong to. Note that if the maximum degree of the original interaction graph G was d, then the maximum degree of G K is K(d + 1) − 1.
A cluster W is now defined as a multiset of elements from S K , with the set of clusters of size m denoted by C K m and the set of connected clusters by G K m . The multiplicities of a subsystem and Hamiltonian label in a cluster are denoted by µ W ((X, l)). The cluster graph G K W is again constructed by connecting subsystems that overlap, independent of the Hamiltonian with which they are associated.
With this notation, the logarithm of the Loschmidt echo permits the cluster expansion Here, we introduced natural generalizations of our l))!, and the cluster derivative .
(43) As we show in Appendix C, the results of Sec. IV A carry over directly, as long as we take into account the increased maximum degree of the interaction graph G K . In particular, this means that for W ∈ G K m , where m l is the number of terms in W associated with H (l) . Note the additional factor of K, which effectively reduces the threshold time by 1/K. The analysis of the computational cost is similar to that in Section IV B, as detailed in Appendix C. Hence, we obtain an analog of Theorems 10 and 12.
The logarithm of the generalized Loschmidt echo, log L({t l }), is analytic for τ < 1 and its Taylor series converges as Moreover, there exists a classical algorithm with run time We highlight again that the computational cost scales polynomially with the number of Hamiltonian terms |S| and the inverse error 1/ε. The number of Hamiltonians K enters in the shortened threshold time t * L /K and in the exponent in Eq. (46) as a computational overhead.

V. FURTHER IMPLICATIONS
Besides leading to efficient classical algorithms, the convergence of the cluster expansion has important physical consequences, which we discuss below.

A. Concentration bounds
Given a quantum state ρ, the outcome of a measurement of an observable A is a random variable. Assuming a projective measurement onto the eigenspaces of A, the probability of outcome x is given by Pr(x) = tr[Π(x)ρ], where Π(x) is the projector onto the eigenspace of A with eigenvalue x. A concentration bound is an upper bound on the probability Pr[|x − tr(ρA)| ≥ δ], i.e., the probability that x deviates from its mean tr(ρA) by more than δ. We will focus on the Hamiltonian H = X λ X h X as the observable. The energy distribution of a state plays an important role in thermalization and equilibration [42,43] and its concentration properties place limitations on the performance of variational quantum algorithms [29,44].
As a warm-up example, suppose the Hamiltonian terms h X act on distinct single sites and ρ is a product state. Then, the measurement outcome of H is equal to the sum of the measurement outcomes of h X , which are independent random variables. The Chernoff-Hoeffding bound for independent random variables implies that We note that the denominator in the exponent is proportional to the system size n such that deviations from the mean energy of order √ n are strongly suppressed. This simple argument fails when the terms h X in the Hamiltonian overlap or when ρ is not a product state because the outcomes of the measurements h X are no longer independent. Nevertheless, similar bounds hold for local Hamiltonians and sufficiently weakly correlated states. A number of proof techniques have been employed to establish such results [27,28,45], including the cluster expansion [40] (see also, e.g., Ref. [16] for results on large deviations).
To illustrate the method, we use the cluster expansion to give a concise proof of a concentration bound for the energy of product states, reproducing the main result of [27]. It follows from a standard argument (see, e.g., Corollary 1 in [40]) that the probability of obtaining a measurement outcome x that is greater than the average energy by at least δ is bounded according to Pr [x − tr(ρH) ≥ δ] ≤ e −δν tr ρ e ν(H−tr(ρH)) . (48) Next, we apply Theorem 10 with t = iν, where 0 ≤ ν < t * L . Taking M = 1, the theorem implies that log tr(ρe νH ) − ν tr(ρH) ≤ |S| Substituting into Eq. (48) yields Choosing ν = δ(t * L ) 2 /4|S| (which is always possible since δ/|S| ≤ 1 and t * L < 1), and applying the same bound for Pr(x − tr[ρH] ≤ δ), we obtain We observe that this bound has the same dependence on δ and the system size as the bound for the special case in Eq. (47). Theorem 13 enables us to extend the bound beyond product states. For short times, we can consider log tr e −itH (1) ρe itH (1) e νH (2) − ν tr e −itH (1) ρe itH (1) H (2) , from which we obtain the following concentration bound.

Corollary 14.
Let H (1) and H (2) be local Hamiltonians, let ρ be a product state, and let |t| ≤ t * L /7. The probability that a projective measurement of the state e −itH (1) ρe itH (1) onto the eigenbasis of H (2) yields a value x that deviates from the expectation value tr[e −itH (1) ρe itH (1) H (2) ] by at least δ is bounded from above by The proof is shown in Appendix D. The corollary shows that after evolving a product state for a short time under a local Hamiltonian, the energy distribution with respect to a (possibly different) local Hamiltonian is concentrated around the mean. Previous results along these lines cover many relevant cases but not time-evolved product states [17,[27][28][29]45]. Since Theorem 13 works for any number of Hamiltonians, this corollary can also be extended to states of the form e −itH (1) · · · e −itH (K) ρe itH (K) · · · e −itH (1) , which often feature in variational quantum algorithms, for a correspondingly shorter threshold time.

B. Dynamical phase transitions
Dynamical phase transitions (DPTs) may be viewed as a real-time analog of thermal phase transitions [46]. The cluster expansion naturally constrains the time at which they can appear. Let us consider an infinite sequence of local Hamiltonians H n on n particles under the assumptions of Section II and product states |Φ = ⊗ n i=1 |φ i , such that g n (t) ≡ log Φ|e −itHn |Φ . A DPT occurs when the following function is nonanalytic [23]: The following result is a consequence of Theorem 10.
Corollary 15. G(t) is analytic for t < t * L , and thus DPTs can occur only at later times.
The proof is shown in Appendix E. The time scale t * L is hence a universal lower bound on the time at which dynamical phase transitions occur. Nonanalyticities can appear in the logarithm of the Loschmidt echo at times t ∼ O(1) as can be seen from the simple case of noninteracting spins. This has also been demonstrated analytically for particular interacting models in one dimension [47]. This shows that t * L in Theorem 10 can be increased at most by a constant factor. Note also that Theorem 13 allows us to extend this corollary to some Floquet systems. The absence of dynamical phase transitions at short times is analogous to the fact that thermal phase transitions can only occur above some threshold inverse temperature β * that depends on the details of the system.

C. Quantum speed limits
A quantum speed limit (QSL) is a bound on the time t QSL that it takes for a state Φ evolving under a Hamiltonian H to become orthogonal to itself. Formally, The best-known general limits are the Mandelstam-Tamm and the Margolus-Levitin bounds [25,26]. When combined, they read where H = Φ|H|Φ , (∆H) 2 = Φ|(H − H ) 2 |Φ , and we assume that all eigenvalues of H are positive. In many-body systems, however, we typically have that ∆H ∼ n 1/2 and H ∼ n, so that the bound vanishes with system size. A simple consequence of Theorem 10 gives a significant improvement on the bound.

Corollary 16.
Let H be local and let |Φ be a product state. Then, t QSL ≥ t * L .
Proof. Theorem 10 shows that the logarithm of the fidelity is analytic for t < t * L . Since log(x) is nonanalytic at x = 0, this means that | Φ|e −itH |Φ | > 0 for t < t * L .
Alternatively, the QSL also follows from the explicit lower bound in Eq. (39). By truncating the cluster expansion at order M = 2, we obtain the lower bound for all |t| < t * L . The dependence on |t|/t * L in the first exponential is better than in Theorem 10 because all odd orders in the cluster expansion are purely imaginary and therefore do not contribute to the absolute value This result shows that the well-known QSLs of Eq. (56) do not give very tight bounds for product states evolving under local Hamiltonians. Let us remark that even if the fidelity does not become zero at early times, it does generically quickly become exponentially small in system size [48,49], which can also be seen from the upper bound in Eq. (39).

VI. SUMMARY AND OUTLOOK
We showed that the cluster expansion of many dynamical quantities converges at short times, yielding efficient classical approximation algorithms as a by-product. We described the implications for the complexity of quantum dynamics and discussed consequences for concentration bounds, which are linked to the performance of variational quantum algorithms [29,50], dynamical phase transitions, and quantum speed limits.
The proof strategy of our main results is based on counting the number of clusters that participate and bounding their individual contributions to the sum. This last step diverges from established convergence proofs of cluster expansions in the literature on abstract polymer models [5,8,10,51], which are based on iterative arguments. We follow more closely recent papers with results on Gibbs states [17,22,40], although we use an alternative expression for the cluster derivative involving the Tutte polynomial of the partition graph, which may be of independent interest.
Our work opens the door to many future research directions. It will be interesting to explore the optimality of our algorithms. For example, is it possible to improve the time dependence of Theorem 6 to the one given by the Lieb-Robinson bound (i.e. e O(t D ) in D dimensions) while retaining the polynomial dependence the inverse approximation error? Similarly, we may ask whether it is possible to extend the concentration bound, Corollary 14, to longer times. Related bounds on moments of the dis-tribution have already been shown to hold for times up to O(log n) [52]. One could also address in this context whether a sharp breakdown of Gaussian concentration occurs at longer times, which may be related to dynamical phase transitions.
From a numerical perspective, our algorithms should be compared in practice to existing approaches such as the closely related numerical linked-cluster expansion [53], which has also been used to approximate quantum dynamics [54][55][56][57]. Other related methods are cluster expansions with tensor-network representations [58,59] and schemes based on operator-basis expansions [60][61][62]. While our approach can be adapted to evolution under local Lindbladians by vectorizing the density operator, there is no obvious way in which noise improves convergence of the cluster expansion. It is unclear whether this happens for particular noise models, which could have significant implications on the classical simulation of noisy quantum circuits [63][64][65] and on the assessment of quantum advantage of noisy, intermediate-scale quantum (NISQ) simulators [66].
We have shown that the cluster expansion is useful not only for the study of systems in thermal equilibrium, but also for dynamical problems. The cluster expansion enables us to establish the classical approximability of continuous dynamics for short times, similar to previous results for quantum circuits [67]. It complements other locality-based methods such as those derived from Lieb-Robinson bounds, which have led to important results in both dynamical and equilibrium systems [68]. We hope that our work stimulates further research into these techniques and into how they can help us understand other aspects of quantum many-body problems. inclusion-exclusion argument to rewrite each sum over permutations using the identity (A2) Hence, We highlight that the sum over the subsets of [m] involves 2 m terms as opposed to the m! terms of the sum over the permutations in S m . For a k-local Hamiltonian, the sums over h Wi result in an operator that has support on at most km spins. The writing down of these operators on the relevant subspace can be achieved in time poly(d km ) = exp(O(m)). We can also raise them to the k th power with similar computational effort by diagonalizing the operators and powering the eigenvalues. Multiplying the resulting operators by A and ρ and taking the trace again incurs a computational cost with the same asymptotic dependence on m. Finally we need to perform the sums over I, I , and J . There is a total of 4 m terms in these sums such that the overall computational effort indeed scales as exp(O(m)).
of nonoverlapping terms factorize. For example, the term at second order simplifies to We recognize the two sums as the two distinct types of connected clusters in Fig. 4(c), whereas all disconnected clusters cancel. Our formalism of the cluster expansion enables efficient book-keeping of these cancellations at arbitrary order.

Proof of Lemma 8
Formally, the cluster expansion of the Loschmidt echo is given by Note that the sum at this point includes both connected and disconnected clusters. The cluster derivative of L(t) can be written as where we remind the reader that the symmetrized expectation value is defined by in which m = |W|. The symmetrized expectation value has the property that it factorizes when W is disconnected. Denote by P c,max (W) ∈ P c (W) the partition of W into its maximal connected components. Then, The cluster expansion of the Loschmidt echo becomes At this point, we take the logarithm, which we again express in terms of its formal Taylor series, Combining Eq. (B8) and Eq. (B9) yields (B10) We can rearrange the sums to first sum over all clusters W before considering decompositions of W into connected clusters V: Lemma 7 allowed us to impose that W be connected. The coefficient C(P ) can be determined by considering the different ways in which the partition P can arise from the clusters W 1 , W 2 , . . . , W n in Eq. (B10), such that P = n i=1 P c,max (W i ). Different elements of P can belong to the same "parent" cluster W i if they do not overlap. It is hence possible to construct assignments of all V ∈ P to parent clusters W 1 , W 2 , . . . , W n from a proper coloring of the partition graphG P with exactly n colors. The vertices colored with the first color form W 1 , the second color gives W 2 , and so on. From this argument, we find that where χ * G P (n) is the number of proper colorings ofG P with exactly n colors. The combinatorial factor P ! removes overcounting that occurs when P contains clusters with multiplicity greater than 1 because permuting the colors of repeated clusters has no effect on W 1 , W 2 , . . . W n .
To complete the proof of Lemma 8, we make use of the following combinatorial property of graphs, which we prove in Appendix B 3.
Lemma 17. Given a connected graph G = (V, E), we denote by χ * G (n) the number of proper colorings of G with exactly n colors. Let T G (x, y) be the Tutte polynomial of G. Then, By applying this lemma to C(P ), we obtain log L(t) = m≥1 W∈Gm P ∈Pc(W) Finally, taking the cluster derivative of Eq. (B14) yields which, using Eq. (B4), can be readily brought into the form of the expression in Lemma 8.

Proof of Lemma 17
To prove Lemma 17, we introduce three more short lemmas. The first one relates the number of colorings that use exactly k colors to the chromatic polynomial.
Lemma 18. Given a graph G, let χ * G (k) denote the number of proper colorings of G that use exactly k colors. Moreover, let χ G (k) be the chromatic polynomial, that is, the number of proper colorings with up to k colors. Then, Proof. This lemma follows from a standard inclusion-exclusion argument. Alternatively, we can prove the statement by direct calculation as follows. The chromatic polynomial χ G (k) can be computed by picking j ≤ k colors and adding the contributions from χ * G (j): We substitute this expression into the right-hand side of Eq. (B16) and exchange the order of the sums: The Tutte polynomial T G (x, 0) is a polynomial in x of degree at most |V | − 1. Therefore, T G (1 − k, 0) is a polynomial in k of the same maximum degree, which allows us to write for some coefficients a n . By substituting into Eq. (B27), we get Lemma 20 allows us to simplify the expression to This is the desired expression since a 0 = T G (1, 0).

Proof of Proposition 9
We start from the expression for the cluster derivative in Lemma 8:

Recall the symmetric expectation value
ρ and the combinatorial factor N P (W) = W!/ P ! V∈P V! . We rewrite the sum over cluster partitions of W as a sum of graph partitions of the cluster graph G W . Here, a graph partition refers to a partition of the vertices. We only consider partitions into connected subgraphs, meaning that the subsets of vertices in the partition induce connected subgraphs on the original graph. For every graph partition of G W into connected subgraphs, there exists exactly one corresponding cluster partition P ∈ P c (W). On the other hand, for every cluster partition there are exactly N P (W) = W!/P ! V∈P V! equivalent graph partitions. This combinatorial factor arises because repeated subsystems are indistinguishable at the level of the cluster but give rise to distinct vertices in the cluster graph. With this, where, in a slight abuse of notation, P c (G W ) is the set of partitions of the cluster graph G W into connected components and P ∈ P c (W) is the cluster partition corresponding to P . By observing that | h V s | ≤ 1, we obtain where the last inequality follows from the fact that the Tutte polynomial TG P (x, y) has positive coefficients. We bound this sum in two steps, starting with the following lemma.
Lemma 21. Given a connected cluster W ∈ G m , P ∈Pc(G W ) Proof. TG P (1, 1) counts the number of spanning trees ofG P , so that P ∈Pc(G W ) TG P (1, 1) = P ∈P c (G W ) spanning trees ofG P

(B34)
Given a spanning tree of the cluster graph G W , consider a bicoloring of the edges into blue and red, and delete the red ones. This separates the edges into disconnected components, each of which induces a connected subgraph on G W . These subgraphs define a partition P ∈ P c (G W ), with its corresponding partition graphG P . Moreover, the deleted red edges can be identified with a spanning tree ofG P . Hence, any bicoloring of a spanning tree of G W describes a term in the double sum on the right-hand side of Eq. (B34). Conversely, for every term in the sum, we can find at least one bicolored spanning tree. This procedure is illustrated in Fig. 5. It follows that the sum is bounded from above by the number of bicolored trees of G W . The number of edges of each spanning tree is |W| − 1 = m − 1, so that there are always 2 m−1 distinct bicolorings for every tree. The total number of spanning trees is T G W (1, 1), which completes the proof.
We next bound the number of spanning trees of T G W (1, 1) given the connectivity of the cluster graph.
Lemma 22. Consider a set of supports S such that the maximum degree of the associated interaction graph G is d. Given a connected cluster W ∈ G m , Proof. Let us fix W 1 ∈ W as the root of a spanning tree. Any spanning tree can be constructed by picking for each vertex W 2 , W 3 , . . . W m one of the edges incident on it. The number of choices is γ( is the degree of the vertex associated with W k in G W . It follows that This product of degrees can be bounded as in the proof of Proposition 3.8 in [22]. The degree of a vertex associated with W k can be written as where N (W k ) is the set of neighbors of W k in the interaction graph G. We sum the degrees over all distinct subsystems that appear in W: The double sum is bounded by dm because each µ W (Y ) appears in it at most d times. Finally, we bound 1 W! γ(W 1 )γ(W 2 ) · · · γ(W m ) = X∈S : µ W (X)>0 The bound on the cluster derivative follows from the two lemmas and Eq. (B32).
The convergence of the cluster expansion for this generalized Loschmidt echo follows in an analogous fashion to Theorem 10.
To prove the bound on the computational cost in Theorem 13, we closely follow the proofs of Proposition 11 and Theorem 12, keeping in mind that the degree of the relevant interaction graph is K(d + 1) − 1. Throughout, we only keep the dependence on K explicit, while suppressing the dependence on d = O(1).
The computational cost of step (i) of the proof of Proposition 11 is modified to exp(O(m log K)).
Step (ii) remains unchanged as the computational cost of evaluating the Tutte polyonomial only depends on the number of vertices. In step (iii), we have to evaluate Eq. (C2) instead of the simpler symmetrized expectation value. Nevertheless, by rewriting the sums over permutations using Eq. (A2), this can still be carried out in time exp(O(n)). Hence, the cluster derivative of the generalized Loschmidt echo can be computed in time exp(O(m log K)). By Lemma 1, the run time of the algorithm to enumerate the clusters is |S| × exp(O(m log K)). We conclude that the truncated cluster expansion of the generalized Loschmidt echo that includes all clusters up to size M can be computed with a total run time |S| × exp(O(M log K)). Choosing M > log |S| (1−K l |t l |/t * L )ε / log t * L K l |t l | completes the proof.

Appendix D: Proof of concentration bound
Let us assume tr(ρe itH (1) H (2) e −itH (1) ) = 0 for simplicity. We write the cluster expansion of log L({t, ν, −t}) ≡ log tr(ρe itH (1) e νH (2) e −itH (1) ) as f L ({t, ν, −t}) = log tr(ρe itH (1) e νH (2) e −itH (1) where we have three Hamiltonians. However, note that since f L ({t, 0, −t}) = 0, only the clusters with at least two terms h X from H (2) contribute. This means that the smallest power of ν is 2. That is, where m 2 is the number of subsystems in W associated with H (2) . Now, given Eq. (44) where in the second line we write the sum over combinations with at least two terms from H (2) (that is, the number of ways in which one can arrange m objects in three boxes, with at least two objects in one of them). In the third line, we bound this sum by m 2 times the number of ways of arranging m − 2 elements in three boxes. Finally, we use Now take |t| ≤ t * L /7, so that 1 − 3(2|t|+ν) . We obtain log tr(ρ(t)e νH (2) ) ≤ 5|S| (3ν/t * L ) 2 (1/7 − 3ν/t * L ) 3 .
Appendix E: Proof of Corollary 15 From Eq. (26) we have the series expansion Theorem 10 shows that g n (t) is analytic for t < t * L , and any finite n, such that We apply Tannery's theorem to the sequence { W∈Gm λ W nW! D W log L n (t)}. This implies that we can place the limit inside the sum as That G(t) is analytic follows from the fact that lim n→∞ W∈Gm