Quantum chaos and the complexity of spread of states

We propose a measure of quantum state complexity defined by minimizing the spread of the wave-function over all choices of basis. Our measure is controlled by the "survival amplitude" for a state to remain unchanged, and can be efficiently computed in theories with discrete spectra. For continuous Hamiltonian evolution, it generalizes Krylov operator complexity to quantum states. We apply our methods to the harmonic and inverted oscillators, particles on group manifolds, the Schwarzian theory, the SYK model, and random matrix models. For time-evolved thermofield double states in chaotic systems our measure shows four regimes: a linear "ramp" up to a "peak" that is exponential in the entropy, followed by a "slope" down to a "plateau". These regimes arise from the same physics producing the slope-dip-ramp-plateau structure of the Spectral Form Factor. Specifically, the complexity slope arises from spectral rigidity, distinguishing different random matrix ensembles.


I. INTRODUCTION
In classic work, Kolmogorov proposed that the complexity of a sequence could be defined as the length of the shortest Turing machine program producing it [1]. In information theory, Rissanen likewise suggested that the complexity of an ensemble of messages can be defined as their minimal codelength, averaged over the ensemble [2]. Similarly, the complexity of a problem class can be measured by the size (depth, width, or number of gates) of the smallest circuit, defined in terms of a fixed gate set, that performs the computation. The fundamental idea here is that the complexity of an object, or class of objects, should be understood as the number of simple components required to assemble it. As such, there is an ambiguity: the measure of complexity depends on the basis of simple components. For this reason, most discussions of complexity in computer science deal with scaling laws, and seek to establish that, for reasonable choices of basis, these laws are universal up to polynomial factors. Similar ambiguities arise in attempts to measure the complexity of quantum processes. For example, consider Nielsen's definition of the complexity of the time evolution operator $U(t) = e^{-iHt}$ as the minimal distance in the unitary group between the identity and $U(t)$ [3]. This definition requires the choice of a "complexity metric" on the group manifold, penalizing "hard" vs. "easy" operations, typically defined in terms of their degree of locality. Recent studies [4-11] investigate such choices of complexity metric for describing physical time evolution, with consequences for separating integrability and chaos.
In physical systems it is also interesting to quantify the complexity of individual states. A reasonable definition should regard many-body factorized Gaussian wavefunctions as "simple", while widely dispersed, non-locally entangled states are "complex". Dynamically speaking, we expect chaos to produce inherently complex states, perhaps even with complexity increasing over exponential durations, e.g., for processes associated to black hole formation [12]. It would be natural to quantify the complexity of a quantum state in terms of the spread of the wavefunction over some fixed basis. But such schemes face an ambiguity: what basis should we pick?
We propose a measure of quantum state complexity that avoids this ambiguity. Suppose we start with an initial state, and allow it to spread through time evolution over some basis. Following the spirit of Kolmogorov, we define the complexity of the state by minimizing the spread of the wavefunction over all possible bases. We show that this minimum is uniquely attained, throughout a finite time interval for continuous evolution, and for all time for discrete evolution, by an orthonormal basis produced by the Lanczos recursion method [13]. We will call it "spread complexity".
The notion of spread complexity is controlled by the "survival amplitude" for a state to remain unchanged, a fact we use to develop efficient computational methods. We apply our techniques analytically to particle dynamics on group manifolds, and numerically to the Schwarzian theory, SYK model, and random matrix models. Applied to chaotic systems including the SYK model and matrix models, we show that complexity displays four dynamical regimes: a linearly increasing ramp ending in a peak, followed by a downward slope terminating in a plateau. The durations of the ramp and slope, and the heights of the peak and plateau, are exponential in the entropy. These effects can be compared to the characteristic slope-dip-ramp-plateau structure of the spectral form factor (SFF) [14,15]. The complexity slope, like the SFF ramp, arises from spectral rigidity [14,[16][17][18][19], and thus differs between Gaussian Unitary, Orthogonal, and Symplectic random matrix ensembles.
Some alternative approaches to state complexity based on Nielsen's geometric methods appear in [7,8,20-26], and proposals targeted to CFTs include [27,28]. A definition in terms of wave-function spread with respect to the so-called "computational basis" was considered in [29], and a proposal based on group cohomology is in [30]. An interesting direction based on discrete time evolution and random circuits appears in [31].

II. SPREAD COMPLEXITY OF QUANTUM STATES

Consider a quantum system with a time-independent Hamiltonian $H$, evolving by the Schrödinger equation

$$ i\,\partial_t\, |\psi(t)\rangle = H\, |\psi(t)\rangle. \qquad (1) $$

The solution $|\psi(t)\rangle = e^{-iHt}\,|\psi(0)\rangle$ has a formal power series expansion

$$ |\psi(t)\rangle = \sum_{n=0}^{\infty} \frac{(-it)^n}{n!}\, |\psi_n\rangle, \qquad (2) $$

where $|\psi_n\rangle = H^n |\psi(0)\rangle$. The Gram-Schmidt procedure applied to the $|\psi_n\rangle$ generates an ordered, orthonormal basis $\mathcal{K} = \{|K_n\rangle : n = 0, 1, 2, \cdots\}$ for the part of the Hilbert space explored by time development of $|\psi(0)\rangle \equiv |K_0\rangle$. The basis $\mathcal{K}$, sometimes called the Krylov basis in the recent literature, may have fewer elements than the dimension of the Hilbert space, depending on the dynamics and the choice of initial state.
We expect that more complex time evolution will spread $|\psi(t)\rangle$ more widely over the Hilbert space relative to the initial state $|\psi(0)\rangle$. To quantify this idea, we define a cost function relative to a complete, orthonormal, ordered basis $B = \{|B_n\rangle : n = 0, 1, 2, \cdots\}$ for the Hilbert space:

$$ C_B(t) = \sum_n c_n\, |\langle \psi(t) | B_n \rangle|^2 \equiv \sum_n c_n\, p_B(n, t), \qquad (3) $$

where the $c_n$ are a positive, increasing sequence of real numbers, and the $p_B(n, t)$ are the probabilities of being in each basis vector. Completeness of the basis $B$, together with the unitarity of time evolution, namely $\sum_n p_B(n, t) = 1$, implies that the cost of a wavefunction increases if it spreads deeper into the basis. We will generally take $c_n = n$, so that the cost measures the average depth in the basis of the support of $|\psi(t)\rangle$.
We could try to define a natural notion of complexity as the minimum of this cost function over all bases $B$:

$$ C(t) = \min_B\, C_B(t). \qquad (4) $$

At a time $t_0$, any basis with $|B_0\rangle = |\psi(t_0)\rangle$ will minimize (4), achieving $C(t_0) = c_0$. So performing a minimization at each time separately is trivial, and does not provide any information about the spreading dynamics. We instead seek a "functional minimization" encoding information about the spread of the state over a finite amount of time. We will show that, under a reasonable choice of functional minimization, there is an essentially unique basis minimizing (4) across a finite time domain.
To motivate our functional minimization, let $C_B^{(m)} = d^m C_B(t)/dt^m |_{t=0}$ denote the derivatives of the cost function at $t = 0$. If $C_{B_1}^{(m)} = C_{B_2}^{(m)}$ for $m < k$ and $C_{B_1}^{(k)} < C_{B_2}^{(k)}$, then $C_{B_1}(t) < C_{B_2}(t)$ in a domain $0 < t \le \tau$ for some $\tau$. We want to find the basis that minimizes the cost in this sense in the vicinity of $t = 0$. We can formalize this condition in terms of the sequence of derivatives of the cost function at $t = 0$,

$$ S_B = \left( C_B^{(0)},\, C_B^{(1)},\, C_B^{(2)},\, \cdots \right). \qquad (5) $$

We write $S_{B_1} < S_{B_2}$ if there is some $k$ such that $C_{B_1}^{(m)} = C_{B_2}^{(m)}$ for $m < k$ and $C_{B_1}^{(k)} < C_{B_2}^{(k)}$. In what follows, we say that an ordered basis $B$ is a complete Krylov basis $\mathcal{K}_c$ if its initial elements are the Krylov basis in the correct order. In more detail, say the Krylov basis has $K$ vectors. $K$ might be smaller than the Hilbert space dimension, so in such cases the usual Krylov basis does not span the full Hilbert space. We call $B$ a complete Krylov basis if $|B_n\rangle = |K_n\rangle$ for $n = 0, \cdots, K-1$; the rest of the basis is unspecified for the purposes of this definition. This defines a class of bases for which the number of unspecified elements is the dimension of the Hilbert space minus the dimension of the Krylov basis. We will prove that any complete Krylov basis $\mathcal{K}_c$, as defined above, minimizes the derivative sequence $S_B$, and hence has a lower cost than any other basis, at least in the vicinity of $t = 0$.
Theorem 1 For any basis $B$, $S_{\mathcal{K}} \le S_B$, with equality only for the complete Krylov bases $B = \mathcal{K}_c$.

Proof:
We will prove the theorem by induction, showing that any orthonormal basis $B$ whose first $N$ elements coincide with the Krylov basis satisfies $S_B < S_{B'}$ for all bases $B'$ whose first $N$ elements do not coincide with $\mathcal{K}$.
The first element of the Krylov basis is $|K_0\rangle = |\psi(0)\rangle$. Suppose now that the first element of $B$ is $|B_0\rangle = |\psi(0)\rangle$. Then the zeroth derivative of the cost is $C_B^{(0)} = C_B(0) = \sum_n c_n\, |\langle \psi(0)|B_n\rangle|^2 = c_0$, because the $|B_{n>0}\rangle$ are orthogonal to $|\psi(0)\rangle$. Any basis $B'$ which does not include $|\psi(0)\rangle$ as its first element will have a higher cost because, from (3), $C_{B'}(0)$ is a weighted average of the $c_{n \ge 0}$, and hence larger than $c_0$ since $c_n$ increases with $n$.
To prove the induction step we must evaluate time derivatives of the cost,

$$ C_B^{(m)} = \sum_n c_n\, p_B^{(m)}(n, 0), \qquad p_B^{(m)}(n, 0) \equiv \frac{d^m p_B(n, t)}{dt^m}\Big|_{t=0}. $$

Applying the derivatives to the right side of (3) and using (1) gives

$$ p_B^{(m)}(n, 0) = \sum_{k=0}^{m} \binom{m}{k}\, \langle \psi(0)| (iH)^{m-k} |B_n\rangle\, \langle B_n| (-iH)^{k} |\psi(0)\rangle. \qquad (6) $$

Now, let us assume that the first $N$ elements of $B$ coincide with the first $N$ elements of $\mathcal{K}$, i.e. $|B_i\rangle = |K_i\rangle$ for $i = 0, 1, \cdots, N-1$. Following (6), this implies:

Lemma 1 For $n \ge N$ and $m \le 2N - 1$, $p_B^{(m)}(n, 0) = 0$.

Proof: For $k < N$, $H^k |\psi(0)\rangle$ is a linear combination of $|B_0\rangle, \ldots, |B_{N-1}\rangle$, since these vectors equal the first $N$ elements of the Krylov basis. Orthogonality of the basis $B$ then implies that $\langle \psi(0)| H^k |B_n\rangle = \langle B_n| H^k |\psi(0)\rangle^* = 0$ for any $n \ge N$ and $k < N$. For $m \le 2N - 1$ and any integer $k$, either $m - k$ or $k$ is less than $N$. Since every term in (6) involves either $\langle \psi(0)| H^{m-k} |B_n\rangle$ or $\langle B_n| H^{k} |\psi(0)\rangle$, we conclude that $p_B^{(m)}(n, 0) = 0$.

Lemma 2 $C_B^{(2N)}(0) \ge C_{\mathcal{K}}^{(2N)}(0)$, with equality when $\mathcal{K}$ contains precisely $N$ vectors, in which case $B$ is a complete Krylov basis, or when $|B_N\rangle$ also equals $|K_N\rangle$ up to a phase factor.
Proof: Since the first $N$ basis vectors agree between $B$ and $\mathcal{K}$, Lemma 1 has already shown that for $n \ge N$, $p_B^{(m)}(n, 0) = 0$ for $m < 2N$. Examination of (6) shows that for $n \ge N$ there is a single non-vanishing term in $p_B^{(2N)}(n, 0)$, namely $k = N$:

$$ p_B^{(2N)}(n, 0) = \binom{2N}{N}\, |\langle B_n| H^N |\psi(0)\rangle|^2. \qquad (7) $$

Let $|X\rangle$, which is not necessarily normalized, be the component of $H^N |\psi(0)\rangle$ orthogonal to $|B_0\rangle, \ldots, |B_{N-1}\rangle$. By the definition of the Krylov basis $\mathcal{K}$, $|X\rangle \propto |K_N\rangle$. Due to orthogonality, for $n \ge N$ we have

$$ \langle B_n| H^N |\psi(0)\rangle = \langle B_n|X\rangle. \qquad (8) $$

By completeness of the basis, $\sum_n \langle X|B_n\rangle \langle B_n|X\rangle = \langle X|X\rangle$. If $|X\rangle = 0$, then $\mathcal{K}$ contains only $N$ vectors, $B$ is a complete Krylov basis, and $C_B^{(2N)} = C_{\mathcal{K}}^{(2N)}$. Otherwise,

$$ C_B^{(2N)} - C_{\mathcal{K}}^{(2N)} = \binom{2N}{N} \left[ \sum_{n=N}^{D-1} c_n\, |\langle B_n|X\rangle|^2 - c_N\, \langle X|X\rangle \right] \ge 0, \qquad (9) $$

where $D$ is the dimension of the full Hilbert space, which could be infinite. In the last line we used the fact that $c_n$ is increasing, that $\sum_{n \ge N} \langle X|B_n\rangle \langle B_n|X\rangle = \langle X|X\rangle$, and that the first $N$ basis vectors of $B$ and $\mathcal{K}$ are equal. Equality is achieved only when $|B_N\rangle \propto |X\rangle \propto |K_N\rangle$, up to a phase; otherwise the inequality is strict.

Given these lemmas, suppose that a basis $B$ coincides with the Krylov basis $\mathcal{K}$ up to phases in the first $N$ basis elements, and deviates thereafter. Lemma 1 tells us that the first $2N$ derivatives of the cost function are the same as those of the Krylov basis, because the other basis elements contribute zero. Lemma 2 tells us that if $|B_N\rangle$ is not $|K_N\rangle$ up to a phase, then the $2N$th derivative will be larger. Since the first $2N$ derivatives are equal and the $2N$th derivative of $C_B(t)$ is larger, $S_B > S_{\mathcal{K}}$, completing the proof of the theorem.
Corollary 1 Any cost function of the form (3), defined in terms of an increasing, positive sequence $c_n$ and a basis $B$, is minimized near $t = 0$ by a complete Krylov basis $\mathcal{K}_c$. Thus the associated spread complexity function (4) is

$$ C(t) = \sum_n c_n\, p_{\mathcal{K}}(n, t) \qquad (10) $$

for $t$ in the vicinity of $0$. We have arrived at a basis-independent definition of the complexity, relative to the initial condition, of a quantum state evolving continuously in time.
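The content of Theorem 1 can be checked numerically. The sketch below is our own illustration (not from the paper): a random Hermitian matrix stands in for a physical Hamiltonian, the Krylov basis is built by Gram-Schmidt, and its cost (3) with $c_n = n$ at a small time is compared against that of another ordered basis sharing only the first element.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
D = 6

# Toy Hermitian "Hamiltonian" and a generic initial state (illustrative only).
A = rng.normal(size=(D, D)) + 1j * rng.normal(size=(D, D))
H = (A + A.conj().T) / 2
psi0 = rng.normal(size=D) + 1j * rng.normal(size=D)
psi0 /= np.linalg.norm(psi0)

def krylov_basis(H, psi0):
    """Gram-Schmidt on H^n |psi(0)> with full reorthogonalization; rows are |K_n>."""
    V = [psi0]
    v = psi0
    for _ in range(len(psi0) - 1):
        w = H @ v
        for u in V:
            w = w - np.vdot(u, w) * u
        nw = np.linalg.norm(w)
        if nw < 1e-10:
            break
        v = w / nw
        V.append(v)
    return np.array(V)

def cost(basis, psi_t):
    """C_B(t) = sum_n c_n |<B_n|psi(t)>|^2 with c_n = n."""
    p = np.abs(basis.conj() @ psi_t) ** 2
    return float(np.sum(np.arange(len(p)) * p))

K = krylov_basis(H, psi0)

# An alternative ordered basis sharing only the first element (up to phase):
# complete |psi(0)> to an orthonormal basis by QR.
M = np.eye(D, dtype=complex)
M[:, 0] = psi0
Q, _ = np.linalg.qr(M)
B = Q.T                     # rows are |B_n>; B_0 is |psi(0)> up to a phase

t = 0.02                    # small time, where Theorem 1 applies
psi_t = expm(-1j * H * t) @ psi0
assert cost(K, psi_t) <= cost(B, psi_t) + 1e-12
```

Since both bases begin with $|\psi(0)\rangle$, their costs agree at $t = 0$ and first differ at order $t^2$, where the Krylov basis wins; at a generic small time the inequality is strict.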

A. Minimization for discrete time evolution
The above results can be extended to show that, for discrete time evolution, the Krylov basis minimizes the cost (3) for all times. Suppose the discrete time evolution is given by $U_n |\psi(0)\rangle = |\psi_n\rangle$, for a sequence of unitaries $U_n$ with $U_0 = 1$ and $n = 0, 1, \cdots$. We define the Krylov basis by choosing $|K_0\rangle = |\psi_0\rangle$ and then recursively orthogonalizing each $|\psi_n\rangle$ against all the $|K_j\rangle$ with $j < n$. As in the continuous time proof, the basis minimizing the cost must begin with the initial state, i.e., $|\psi_0\rangle = |K_0\rangle$. Now assume the first $N$ vectors of a certain basis $B$ agree with the Krylov basis, namely $|B_i\rangle = |K_i\rangle$ for $i = 0, \cdots, N-1$. By assumption, the costs of both bases are then the same until discrete time $n = N - 1$. Now we can decompose the next state into a part belonging to the Krylov subspace spanned by the $|K_i\rangle$ for $i = 0, \cdots, N-1$, and a part perpendicular to it. Since the bases are defined up to phases, we necessarily have something of the form

$$ |\psi_N\rangle = \gamma\, |K_N\rangle + |\chi\rangle, $$

where $|K_N\rangle$ is the next element of the Krylov basis by definition, and $|\chi\rangle$ can be expanded in terms of the $|K_i\rangle$ for $i = 0, \cdots, N-1$. A basis different from the Krylov one would necessarily not include $|K_N\rangle$. Therefore, the cost at discrete time $N$ would be larger, since we would have to express $|K_N\rangle$ in the new basis, which would require at least two vectors. Since the contribution to the cost from the part $|\chi\rangle$ is the same in both bases, the cost must increase when we divide $|K_N\rangle$ into several contributions, since $c_n$ is a strictly increasing function of $n$. This completes the proof that the Krylov basis minimizes the cost function for all times. In this argument we have ignored irrelevant phases in the choice of basis elements.
These types of discrete unitary evolutions arise naturally when considering quantum circuits. Given a computational task that produces a certain target state from a certain input state, every protocol performing the task has an assigned quantum state complexity, as defined here. It would be interesting to analyze the complexity of known quantum protocols in this light.

B. Complexity as the exponential of an entropy
It is natural to quantify the spread of the wavefunction as the exponential of the entropy of the probability distribution of weights in an orthonormal basis $B$. This provides an alternative definition of complexity,

$$ \tilde{C}_B(t) = e^{S_B(t)}, \qquad S_B(t) = -\sum_n p_B(n, t) \log p_B(n, t), \qquad (13) $$

where $S_B(t)$ is the Shannon entropy of the basis weight distribution. Complexity defined in this way measures the minimum Hilbert space dimension required to store the probability distribution of basis weights. We can again eliminate the basis ambiguity by defining quantum state complexity as the minimum over all choices of basis. In fact, this entropic definition of complexity is also minimized in the Krylov basis. To show this, suppose that $B$ does not contain the entire Krylov basis. Then for some $N$, the first $N$ elements of the Krylov basis are in $B$, up to phase factors, and the $(N+1)$th element is not present. Since the entropy function is independent of the order of the basis, we can let these be the first $N$ elements of the basis. Therefore, for $n < N$ we have $p_B(n, t) = p_{\mathcal{K}}(n, t)$ for all $t$. So to see the difference between the entropies we just need to analyze $p_B(n, t)$ for $n \ge N$. Now, by Lemma 1, for $n \ge N$ the first $2N$ derivatives of the probability vanish; more concretely, $p_B^{(m)}(n, 0) = d^m p_B(n, t)/dt^m|_{t=0} = 0$ for $n \ge N$ and $m < 2N$. Expanding $p_B(n, t)$ for $n \ge N$ as a Taylor series in $t$ around $t = 0$, the first non-vanishing term is

$$ p_B(n, t) = p_B^{(2N)}(n, 0)\, \frac{t^{2N}}{(2N)!} + O(t^{2N+1}). \qquad (14) $$

The difference in entropy between two bases that agree in the first $N$ Krylov vectors lies in the sum $-\sum_{n \ge N} p_n \log p_n$. So we now introduce the expansion (14) into this entropy sum, and split the logarithm of $p_n$ to obtain two sums, the first involving $\log(t^{2N}/(2N)!)$ and the second involving $\log p_B^{(2N)}(n, 0)$. The first sum, after dropping terms of $O(t^{2N+1} \log t)$ coming from the corrections in (14), is

$$ -\frac{t^{2N}}{(2N)!} \log\!\left(\frac{t^{2N}}{(2N)!}\right) \sum_{n \ge N} p_B^{(2N)}(n, 0). \qquad (15) $$

From the proof of Lemma 2 above, Eq. (8) shows that

$$ \sum_{n \ge N} p_B^{(2N)}(n, 0) = \binom{2N}{N}\, \langle X|X\rangle. $$
Hence this first term in the sum will not be affected by the remaining elements of the basis beyond the first N elements that were assumed to be the same as those of the Krylov basis.
The second sum is

$$ -\sum_{n \ge N} p_B(n, t) \log p_B^{(2N)}(n, 0) \simeq -\frac{t^{2N}}{(2N)!} \sum_{n \ge N} q_n \log q_n, \qquad q_n \equiv p_B^{(2N)}(n, 0). \qquad (16) $$

For this sum, note that $f(x) = x \log x$ is a strictly convex function for $x > 0$ with $f(0) = 0$, and is therefore superadditive: $f(x + y) \ge f(x) + f(y)$. Since the total weight $Q = \sum_{n \ge N} q_n = \binom{2N}{N} \langle X|X\rangle$ is fixed, superadditivity implies $\sum_{n \ge N} q_n \log q_n \le Q \log Q$, i.e. $-\sum_{n \ge N} q_n \log q_n \ge -Q \log Q$. The right-hand side is attained by the Krylov basis, for which the entire weight $Q$ sits on the single element $|K_N\rangle$. Due to the strict convexity, the inequality is strict except when some element of the basis is proportional to $|X\rangle \propto |K_N\rangle$.
Given two functions of the form $f_i(t) = \alpha_i t^{2N} + O(t^{2N+1} \log t)$, if $\alpha_1 < \alpha_2$ then $f_1(t) < f_2(t)$ for sufficiently small $t > 0$. Since the first sum (15) is the same for both the Krylov basis and $B$, and the difference of the second sums (16) has the form $\alpha\, t^{2N} + O(t^{2N+1} \log t)$ with $\alpha > 0$, there exists some $t_0$ such that $S_{\mathcal{K}}(t) < S_B(t)$ for $0 < t < t_0$. We conclude that the Krylov basis also minimizes complexity when defined in terms of the entropy of the spread of the initial state over a basis. Following the same line of reasoning, we can also prove that the participation ratio associated with a given basis $B$, defined as

$$ P_B(t) = \frac{1}{\sum_n p_B(n, t)^2}, \qquad (17) $$

is also minimized by the Krylov basis for small times.
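Both the entropic complexity and the participation ratio can be probed numerically in the same way as the cost function. The sketch below is our own illustration (the random Hermitian matrix and the alternative basis are assumptions, not from the paper): at a small time, the Krylov basis gives both a smaller $e^{S}$ and a smaller participation ratio than a generic basis sharing the first element.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(5)
D = 6
A = rng.normal(size=(D, D)) + 1j * rng.normal(size=(D, D))
H = (A + A.conj().T) / 2
psi0 = rng.normal(size=D) + 1j * rng.normal(size=D)
psi0 /= np.linalg.norm(psi0)

def krylov_basis(H, psi0):
    """Gram-Schmidt on H^n |psi(0)>; rows of the result are the |K_n>."""
    V = [psi0]
    v = psi0
    for _ in range(len(psi0) - 1):
        w = H @ v
        for u in V:
            w = w - np.vdot(u, w) * u
        nw = np.linalg.norm(w)
        if nw < 1e-10:
            break
        v = w / nw
        V.append(v)
    return np.array(V)

def probs(basis, psi_t):
    return np.abs(basis.conj() @ psi_t) ** 2

def exp_entropy(p):
    """Entropic complexity e^{S}, S = -sum_n p_n log p_n."""
    p = p[p > 1e-300]                  # drop exact zeros
    return float(np.exp(-np.sum(p * np.log(p))))

def participation(p):
    """Participation ratio 1 / sum_n p_n^2."""
    return float(1.0 / np.sum(p ** 2))

K = krylov_basis(H, psi0)

# Alternative orthonormal basis sharing |B_0> = |psi(0)> up to a phase.
M = np.eye(D, dtype=complex)
M[:, 0] = psi0
Q, _ = np.linalg.qr(M)
B = Q.T

t = 0.02
psi_t = expm(-1j * H * t) @ psi0
pK, pB = probs(K, psi_t), probs(B, psi_t)
assert exp_entropy(pK) <= exp_entropy(pB) + 1e-9
assert participation(pK) <= participation(pB) + 1e-9
```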

III. COMPUTING SPREAD COMPLEXITY
Following Corollary 1, to calculate the spread complexity we must derive the Krylov basis $\mathcal{K}$. We can do this via the Lanczos algorithm [13], which recursively applies the Gram-Schmidt procedure to the states $|\psi_n\rangle = H^n |\psi(0)\rangle$ to generate an orthonormal basis $\mathcal{K} = \{|K_n\rangle : n = 0, 1, 2, \cdots\}$:

$$ |A_{n+1}\rangle = (H - a_n)\, |K_n\rangle - b_n\, |K_{n-1}\rangle, \qquad |K_n\rangle = b_n^{-1}\, |A_n\rangle. \qquad (18) $$

The Lanczos coefficients $a_n$ and $b_n$ are defined as

$$ a_n = \langle K_n| H |K_n\rangle, \qquad b_n = \langle A_n | A_n \rangle^{1/2}, \qquad (19) $$

with $b_0 \equiv 0$ and $|K_0\rangle = |\psi(0)\rangle$ being the initial state.
Observe that the Lanczos algorithm (18) implies that

$$ H\, |K_n\rangle = a_n\, |K_n\rangle + b_n\, |K_{n-1}\rangle + b_{n+1}\, |K_{n+1}\rangle. \qquad (20) $$

This means that the Hamiltonian becomes a tridiagonal matrix in the Krylov basis. For finite-dimensional systems, this is known as the "Hessenberg form" of the Hamiltonian.

A. Krylov basis from the Hessenberg form
Numerically stable algorithms for computing the Hessenberg form of a matrix, using Householder reflections instead of the Gram-Schmidt procedure, are commonly implemented in libraries like SciPy [32,33] and Mathematica. There are two caveats. First, the "initial state" used in these implementations is typically fixed at $(1, 0, 0, \ldots)^T$. To start with an arbitrary initial state, we must first perform a change of basis so that the desired initial vector has these special coordinates. Second, the off-diagonal values $b_n$ are sometimes negative in these implementations. This amounts to a choice of phase in the definition of the Krylov basis. Taking the absolute value of all the off-diagonal elements resolves this issue, and is equivalent to multiplying some of the vectors of the new basis by $-1$, which does not change the physics. From the Hessenberg form of the Hamiltonian we can directly read off the Lanczos coefficients: the $a_n$ are the diagonal elements, and the $b_n$ are the entries above the diagonal. The wavefunction in the Krylov basis can be obtained by exponentiating the Hessenberg form and applying the result to the initial state. This procedure has the advantage of being numerically stable.
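A minimal sketch of this procedure using SciPy (the random Hermitian matrix is a stand-in for a physical Hamiltonian, and the variable names are ours): rotate the initial state to $(1, 0, \ldots, 0)^T$, call `scipy.linalg.hessenberg`, strip the phases of the off-diagonal entries, and check that the resulting tridiagonal matrix reproduces the survival amplitude of the original problem.

```python
import numpy as np
from scipy.linalg import hessenberg, expm, qr

rng = np.random.default_rng(2)
D = 8

# Toy Hermitian Hamiltonian and a generic normalized initial state.
A = rng.normal(size=(D, D)) + 1j * rng.normal(size=(D, D))
H = (A + A.conj().T) / 2
psi0 = rng.normal(size=D) + 1j * rng.normal(size=D)
psi0 /= np.linalg.norm(psi0)

# Step 1: change of basis so the initial state becomes (1, 0, ..., 0)^T,
# since the library's Hessenberg reduction implicitly fixes that start vector.
M = np.eye(D, dtype=complex)
M[:, 0] = psi0
U, R = qr(M)
U[:, 0] *= R[0, 0]                 # unit-phase fix: now U[:, 0] equals psi0
Hrot = U.conj().T @ H @ U          # H in the rotated basis

# Step 2: Hessenberg form; for Hermitian input it is tridiagonal.
T = hessenberg(Hrot)
a = np.real(np.diag(T))            # Lanczos coefficients a_n
b = np.abs(np.diag(T, 1))          # |off-diagonal|: the phase convention fix

# Check: the survival amplitude computed from the tridiagonal form matches
# the amplitude <psi(0)| e^{-iHt} |psi(0)> computed in the original basis.
Ttri = np.diag(a) + np.diag(b, 1) + np.diag(b, -1)
t = 0.7
S_orig = np.vdot(psi0, expm(-1j * H * t) @ psi0)
S_tri = expm(-1j * Ttri * t)[0, 0]
assert abs(S_orig - S_tri) < 1e-8
```

Taking absolute values of the off-diagonal entries is harmless here because a diagonal phase rotation of the Krylov vectors leaves the $(0,0)$ matrix element of any function of the tridiagonal matrix unchanged.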

B. Krylov basis from the survival amplitude
We can also devise a more general method for computing the Lanczos coefficients which remains valid for infinite dimensional systems and the large N limit of finite dimensional systems. We start by showing how to compute the Lanczos coefficients from the "survival amplitude", i.e., the amplitude that the state at time $t$ is the same as the state at time zero. Defining the expansion of the evolving state in the Krylov basis as

$$ |\psi(t)\rangle = \sum_n \psi_n(t)\, |K_n\rangle, \qquad (21) $$

the survival amplitude is just

$$ S(t) = \langle \psi(t) | \psi(0) \rangle = \psi_0(t)^*, \qquad (22) $$

where we recall that $|K_0\rangle = |\psi(0)\rangle$. The survival amplitude is also the moment-generating function for the Hamiltonian in the initial state:

$$ S(t) = \sum_{n=0}^{\infty} \mu_n\, \frac{t^n}{n!}, \qquad \mu_n = \langle K_0| (iH)^n |K_0\rangle. \qquad (23) $$

Figure 1. Top: "Markov chain" representation of $iH$. Bottom: "Unwrapping" of the Markov chain so that "time" goes from left to right. In every vertical column of nodes, the bottom node corresponds to $|K_0\rangle$, the first node above corresponds to $|K_1\rangle$, and so on. The sum of the weights of the blue and red paths gives $\langle K_0| (iH)^2 |K_0\rangle$.

The particular form of the action of the Hamiltonian $H$ in the Krylov basis (20) can be conveniently represented by an un-normalized "Markov chain" with transition weights given by the Lanczos coefficients, as shown in the upper panel of Fig. 1. The action of $iH$ on $\sum_n d_n |K_n\rangle$ is then equivalent to the action of the chain transition matrix on a chain state vector $(d_0, d_1, \cdots)$. If we start with the vector $(1, 0, 0, \cdots)$ and iterate the chain $n$ times, the weight of the $i$th node will be the weight of $|K_i\rangle$ in the state $(iH)^n |K_0\rangle$. Thus, after $n$ iterations of the chain, the weight of $|K_0\rangle$ will be the moment $\mu_n = \langle K_0| (iH)^n |K_0\rangle$.
If we start with a state localized on |K 0 , it is convenient to "unwrap" the Markov chain as shown in the bottom panel of Fig. 1. In this representation, the nodes of the j th vertical column represent the chain after j iterations of the transition matrix. In each column, labeled by j, the bottom node (in row 0) corresponds to |K 0 , the first node above (in row 1) corresponds to |K 1 , and so on. The transition weights w(e) of edges e between columns represent the action of iH defined in (20). We define the weight of a path of concatenated edges P = {e 1 , e 2 , · · · } as the product of the included edge weights: w(P ) = e∈P w(e). Finally, we define the weight of the node 0 to be 1, and the weight of any other node as a sum of the weights of all paths from 0 to that node. By construction, the weights in the n th column are the amplitudes K j |(iH) n |K 0 , and, specifically, the weight of the bottom node (labeled n) computes the moments µ n = K 0 |(iH) n |K 0 .
For example, we have

$$ \mu_1 = i a_0, \qquad \mu_2 = (i a_0)^2 + (i b_1)^2, $$

since there is only one path from node 0 to node 1, with weight $i a_0$, and there are two paths from node 0 to node 2: one with weights $i a_0, i a_0$ and one with weights $i b_1, i b_1$.
Computing the values of $\mu_0, \ldots, \mu_n$ from the $a_n, b_n$ using this path sum takes $O(n^2)$ operations. The weighted path sum from node 0 to some node $X$ in the graph is the sum of the weighted path sums of all nodes with a transition to $X$, multiplied by the weight of the transition edge. Initializing the weighted path sum of node 0 to 1 and performing this operation layer by layer gives the values we need on the bottom nodes of each column.
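This layer-by-layer path sum is easy to implement. The sketch below is our own code: it iterates the chain transition matrix on the vector $(1, 0, 0, \cdots)$ and reproduces the example weights $\mu_1 = i a_0$ and $\mu_2 = (i a_0)^2 + (i b_1)^2$.

```python
import numpy as np

def moments_from_lanczos(a, b, nmax):
    """mu_n = <K_0|(iH)^n|K_0> by iterating the chain transition matrix:
    each step sends w_j -> i * (a_j w_j + b_j w_{j-1} + b_{j+1} w_{j+1}),
    i.e. a layer-by-layer path sum on the unwrapped chain (O(n^2) total).
    Here b[j] stores b_{j+1} in the text's numbering (b[0] = b_1)."""
    K = len(a)
    w = np.zeros(K, dtype=complex)
    w[0] = 1.0
    mu = [1.0 + 0j]                      # mu_0 = 1 for a normalized state
    for _ in range(nmax):
        wn = np.zeros_like(w)
        for j in range(K):
            s = a[j] * w[j]
            if j > 0:
                s += b[j - 1] * w[j - 1]
            if j < K - 1:
                s += b[j] * w[j + 1]
            wn[j] = 1j * s
        w = wn
        mu.append(w[0])                  # weight of the bottom node
    return mu

# The example from the text, with a_0 = 2 and b_1 = 3:
mu = moments_from_lanczos([2.0, 1.0, 0.5], [3.0, 1.5], nmax=2)
assert np.isclose(mu[1], 2j)             # mu_1 = i a_0
assert np.isclose(mu[2], -(2.0**2 + 3.0**2))   # mu_2 = -(a_0^2 + b_1^2)
```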
Suppose now that we are given the survival amplitude $S(t)$, or can compute it through other means. By taking derivatives we can compute the moments $\mu_0, \ldots, \mu_n$. From this data we can calculate the Lanczos coefficients by using the Markov chain described above. Specifically, suppose we have already calculated $a_0, \ldots, a_{k-1}$ and $b_1, \ldots, b_k$, and are given the odd moment $\mu_{2k+1}$. There is a unique path in the unwrapped Markov chain from node 0 to node $2k+1$ that passes through an edge with weight $i a_k$ (see the example in Fig. 2). This follows because any path from 0 to $2k+1$ must traverse precisely $2k+1$ edges, since every step necessarily progresses one column to the right. This means that no path can rise to a row higher than $k$, because the need to descend back to row 0 would make the path too long. For the same reason, a path that reaches row $k$ must have precisely $k$ upward diagonal and $k$ downward diagonal edges, allowing a single horizontal edge in the path. The only way to have this edge in the $k$th row is to start with $k$ diagonal upward edges, then go one step horizontally in the $k$th row, and then descend $k$ steps diagonally.
By similar reasoning, the remaining paths between nodes 0 and $2k+1$ lie below the $k$th row and hence only include edges with weights $a_0, \ldots, a_{k-1}, b_1, \ldots, b_k$. Thus we can compute the path sum for trajectories from node 0 to node $2k+1$ that do not go through the edge with weight $i a_k$, and subtract this sum from $\mu_{2k+1}$. The remainder is the weight of the excluded path, namely $i^{2k+1}\, b_1^2 \cdots b_k^2\, a_k$. Since we know the $b_k$'s by assumption, we may divide them out, leaving us with $a_k$.

Figure 2. Top: Except for the path in red, the weights of every path from node 0 to node 3 can be computed with knowledge just of $a_0$ and $b_1$. The weight of the red path can be computed by subtracting the weights of every other path from $\mu_3$, and can then be used to compute $a_1$. Bottom: Except for the path in red, the weights of every path from node 0 to node 4 can be computed with knowledge just of $a_0$, $a_1$, and $b_1$. The weight of the red path can be computed by subtracting the weights of every other path from $\mu_4$, and can then be used to compute $b_2$.
Likewise, the even moments $\mu_{2k}$ allow us to extract the values of the $b_k$. The only path from node 0 to node $2k$ that goes through an edge of weight $i b_k$ has path weight $i^{2k}\, b_1^2 \cdots b_k^2$ (see Fig. 2). The weights of all the other paths can be computed using only $a_0, \ldots, a_{k-1}$ and $b_1, \ldots, b_{k-1}$.
To summarize, we can compute the Krylov basis and Lanczos coefficients efficiently through the following algorithm: (1) compute the survival amplitude, and use it to extract the moments of the Hamiltonian in the initial state; (2) apply the recursive algorithm above to systematically compute the Lanczos coefficients to the desired order. This procedure is potentially sensitive to the accumulation of rounding error, due to the repeated divisions needed to compute $a_n$ and $b_n$ from their products. In our numerical analyses we avoided this instability by using the mpmath [34] library to perform computations to arbitrary precision.
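The moments determine the Lanczos coefficients however one organizes the recursion. The sketch below (our own code, not the paper's) takes a route equivalent to the path-sum algorithm above: Gram-Schmidt run abstractly in "moment space", where every inner product reduces to a moment, checked against Lanczos run directly on a small toy Hermitian matrix. Like the path-sum method, it involves repeated divisions and benefits from high precision at larger orders.

```python
import numpy as np

def lanczos_from_moments(m):
    """Recover Lanczos coefficients (a_n, b_n) from the moments
    m[k] = <psi|H^k|psi>.  Krylov vectors are stored as polynomial
    coefficients c, with |v> = sum_k c_k H^k |psi>, so that all inner
    products reduce to the moments.  Needs m[0..2K] for K coefficients a_n;
    numerically delicate, so use high precision (e.g. mpmath) for large K."""
    K = (len(m) - 1) // 2
    L = K + 1
    def ip(c, d):                          # <v_c|v_d> from the moments
        return sum(np.conj(c[k]) * d[l] * m[k + l]
                   for k in range(len(c)) for l in range(len(d))
                   if c[k] != 0 and d[l] != 0)
    def applyH(c):                         # multiplication by H shifts powers
        out = np.zeros_like(c)
        out[1:] = c[:-1]
        return out
    a, b = [], []
    v_prev = np.zeros(L + 1, dtype=complex)
    v = np.zeros(L + 1, dtype=complex)
    v[0] = 1.0 / np.sqrt(m[0].real)
    bn = 0.0
    for n in range(K):
        Hv = applyH(v)
        a.append(ip(v, Hv).real)
        if n == K - 1:
            break
        w = Hv - a[-1] * v - bn * v_prev
        bn = np.sqrt(ip(w, w).real)
        b.append(bn)
        v_prev, v = v, w / bn
    return np.array(a), np.array(b)

# Toy Hermitian matrix and initial state to compare against direct Lanczos.
rng = np.random.default_rng(3)
D = 4
A = rng.normal(size=(D, D)) + 1j * rng.normal(size=(D, D))
H = (A + A.conj().T) / 2
psi = rng.normal(size=D) + 1j * rng.normal(size=D)
psi /= np.linalg.norm(psi)

mom = np.array([np.vdot(psi, np.linalg.matrix_power(H, k) @ psi)
                for k in range(2 * D + 1)])
a_m, b_m = lanczos_from_moments(mom)

# Direct Lanczos with full reorthogonalization, for comparison.
a_d, b_d, V = [], [], []
v, v_prev, bn = psi, np.zeros(D, dtype=complex), 0.0
for n in range(D):
    V.append(v)
    w = H @ v
    a_d.append(np.vdot(v, w).real)
    w = w - a_d[-1] * v - bn * v_prev
    for u in V:
        w = w - np.vdot(u, w) * u
    bn = np.linalg.norm(w)
    if bn < 1e-10:
        break
    b_d.append(bn)
    v_prev, v = v, w / bn

assert np.allclose(a_m, a_d, atol=1e-6)
assert np.allclose(b_m, b_d, atol=1e-6)
```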

C. Wave function and Complexity
Above, we described an algorithm for computing the Krylov basis $\mathcal{K}$ and the associated Lanczos coefficients from the survival amplitude. To apply our definition of spread complexity to a time-evolving state we must expand it in $\mathcal{K}$ as

$$ |\psi(t)\rangle = \sum_n \psi_n(t)\, |K_n\rangle, \qquad (26) $$

where unitarity requires $\sum_n |\psi_n(t)|^2 \equiv \sum_n p_n(t) = 1$. Applying the Schrödinger equation (1) to this expression, and then the Lanczos recursion in the form (20), gives

$$ i\, \partial_t\, \psi_n(t) = a_n\, \psi_n(t) + b_n\, \psi_{n-1}(t) + b_{n+1}\, \psi_{n+1}(t). \qquad (27) $$

The survival amplitude is simply the complex conjugate of $\psi_0(t)$, see (22). Thus, given $\psi_0(t) = S(t)^*$ and the Lanczos coefficients, (27) defines an algebraic procedure for computing all the $\psi_n(t)$: we start by noting that $b_0 = 0$ and use $\psi_0(t)$ and its time derivative in (27) to compute $\psi_1(t)$; then, given $\psi_0(t)$ and $\psi_1(t)$, we can compute $\psi_2(t)$, and so on.
Finally, given the $\psi_n(t)$ we apply our definition of complexity in (3), (4):

$$ C(t) = \sum_n n\, |\psi_n(t)|^2 = \sum_n n\, p_n(t), \qquad (28) $$

where we took the complexity coefficients in the cost function (3) to be $c_n = n$. With this definition, spread complexity measures the average depth of the support of a time-evolving state in the Krylov basis. Formally, this quantity is the expectation value in the evolving state $|\psi(t)\rangle$ of a "complexity operator"

$$ \hat{C} = \sum_n n\, |K_n\rangle \langle K_n|, \qquad (29) $$

such that the spread complexity reads $C(t) = \langle \psi(t)| \hat{C} |\psi(t)\rangle$. Below we will also consider the entropic definition (13), which can likewise be calculated from the $p_n = |\psi_n|^2$; it can be understood as the exponential of the entropy of the algebra generated by the complexity operator. See [35] for the definition of the entropy of an operator algebra.
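The algebraic procedure of (27) can be carried out directly on Taylor coefficients. The sketch below is our own worked example on a small toy tridiagonal matrix: it generates $\psi_1(t)$ and $\psi_2(t)$ from $\psi_0(t) = S(t)^*$ and the Lanczos coefficients, and compares both the amplitudes and the resulting spread complexity $C(t)$ with exact matrix exponentiation.

```python
import numpy as np
from math import factorial
from scipy.linalg import expm

def psi_chain_taylor(a, b, c0):
    """Taylor coefficients of psi_n(t) from those of psi_0(t) = S(t)^*,
    using psi_{n+1} = (i d/dt psi_n - a_n psi_n - b_n psi_{n-1}) / b_{n+1}.
    Here b[n] stores b_{n+1} (so b[0] = b_1); the truncation order shrinks
    by one at each level, since differentiation consumes one coefficient."""
    def deriv(c):
        return np.array([(k + 1) * c[k + 1] for k in range(len(c) - 1)])
    psis = [np.asarray(c0, dtype=complex)]
    prev = np.zeros(len(c0) + 1, dtype=complex)
    for n in range(len(b)):
        cur = psis[-1]
        b_prev = b[n - 1] if n > 0 else 0.0
        nxt = (1j * deriv(cur) - a[n] * cur[:-1]
               - b_prev * prev[:len(cur) - 1]) / b[n]
        psis.append(nxt)
        prev = cur
    return psis

# Toy Lanczos coefficients defining a 3x3 tridiagonal Hamiltonian.
a = np.array([0.3, -0.4, 0.1])
b = np.array([0.8, 0.5])                       # b_1, b_2
T = np.diag(a) + np.diag(b, 1) + np.diag(b, -1)

# psi_0(t) = <K_0| e^{-iTt} |K_0> as a truncated Taylor series.
M = 8
Tpow = [np.linalg.matrix_power(T, m) for m in range(M)]
c0 = np.array([(-1j) ** m * Tpow[m][0, 0] / factorial(m) for m in range(M)])
psis = psi_chain_taylor(a, b, c0)

# Check the generated amplitude psi_1 against the exact matrix elements.
d1 = np.array([(-1j) ** m * Tpow[m][1, 0] / factorial(m) for m in range(M)])
assert np.allclose(psis[1], d1[:len(psis[1])], atol=1e-10)

# Spread complexity C(t) = sum_n n |psi_n(t)|^2 from the series,
# compared with exact evolution at a small time.
t = 0.1
p = np.array([abs(sum(ck * t ** k for k, ck in enumerate(c))) ** 2
              for c in psis])
C = float(np.sum(np.arange(len(p)) * p))
psi_t = expm(-1j * T * t)[:, 0]
C_exact = float(np.sum(np.arange(3) * np.abs(psi_t) ** 2))
assert abs(C - C_exact) < 1e-6
```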

D. Survival, TFD and the partition sum
We will find it illuminating to study the growth of spread complexity for Thermo-Field Double (TFD) states. These are defined as follows. Consider a Hamiltonian $H$ acting on a Hilbert space $\mathcal{H}$, with eigenstates $|n\rangle$ and eigenvalues $E_n$. To purify the thermal ensemble we construct the TFD state on two copies of the system:

$$ |\psi_\beta\rangle = \frac{1}{\sqrt{Z_\beta}} \sum_n e^{-\beta E_n / 2}\, |n, n\rangle, \qquad Z_\beta = \sum_n e^{-\beta E_n}. \qquad (30) $$

Notice that the TFD and its time evolution are contained within the subspace spanned by $\{|n, n\rangle\}$. As a result, the finite-dimension algorithm for computing the Lanczos coefficients need only work within this small subspace, simplifying numerical evaluations. The maximum dimension of the explored Hilbert space in this time evolution is therefore the dimension of the original Hilbert space $\mathcal{H}$.
In the AdS/CFT correspondence, such TFD states are dual to the eternal black hole [36]. The spectrum of the theory is conveniently packaged in the analytically continued partition function

$$ Z(\beta + it) = \sum_n e^{-(\beta + it) E_n}, \qquad (31) $$

and the related spectral form factor

$$ \mathrm{SFF}(t) = \frac{|Z(\beta + it)|^2}{Z(\beta)^2}. \qquad (32) $$

These time-dependent quantities have been extensively studied in random matrix theory and quantum gravity [14,15], for example to explore chaotic behavior. The interesting feature for us is that the survival amplitude for the time-evolved TFD state has a simple expression in terms of the partition function:

$$ S(t) = \langle \psi_\beta(t) | \psi_\beta \rangle = \frac{Z(\beta - it)}{Z(\beta)}, \qquad |S(t)|^2 = \frac{|Z(\beta + it)|^2}{Z(\beta)^2} = \mathrm{SFF}(t). \qquad (33) $$

The spectral form factor is then the survival probability of a dynamical process, corresponding to the evolution of the TFD. We can use this fact to extract the probabilities of the Krylov basis states. The fact that the survival probability of the TFD is the SFF has been used in [37] in relation to quantum speed limits; it would be interesting to understand these speed limits from the present spread complexity perspective. Given this survival amplitude, the moments in (23) are thermal expectation values of the Hamiltonian. In holographic theories, the partition function and the energy moments have simple geometric duals, and, at least in 2d gravity [38-43], there are non-perturbative definitions of these quantities. Since spread complexity is a functional of the survival amplitude, the relation of the latter to the partition function provides a path towards understanding the relation between quantum complexity, geometry, and quantum gravity, and perhaps the conjectures relating complexity in quantum field theory to spatial volumes and actions in a dual theory of gravity [44-46]. Likewise, the relation between the spectrum of the Hamiltonian and the dynamics of complexity in TFD states provides a bridge from the classification of phases of quantum matter via the associated partition functions, to a novel characterization in terms of the dynamics of quantum complexity.
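The identification of the TFD survival amplitude with the analytically continued partition function is straightforward to verify numerically. In the sketch below (our conventions; a random spectrum stands in for a physical model) the TFD is built explicitly on the doubled Hilbert space, evolved one-sidedly, and its overlap with the initial state is compared against the partition function route.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(4)
D = 5
A = rng.normal(size=(D, D)) + 1j * rng.normal(size=(D, D))
H = (A + A.conj().T) / 2
E, W = np.linalg.eigh(H)           # spectrum and eigenvectors
beta = 1.3

# TFD state on the doubled Hilbert space: sum_n e^{-beta E_n/2} |n, n> / sqrt(Z).
Z = np.sum(np.exp(-beta * E))
tfd = np.zeros(D * D, dtype=complex)
for n in range(D):
    tfd += np.exp(-beta * E[n] / 2) * np.kron(W[:, n], np.conj(W[:, n]))
tfd /= np.sqrt(Z)

# One-sided evolution with H (x) 1, and the overlap with the initial TFD.
HL = np.kron(H, np.eye(D))
t = 0.9
S_direct = np.vdot(tfd, expm(-1j * HL * t) @ tfd)   # <psi_beta|psi_beta(t)>

# Partition function route: this overlap equals Z(beta + it)/Z(beta),
# the complex conjugate S(t)^* in the convention of (22).
S_Z = np.sum(np.exp(-(beta + 1j * t) * E)) / Z
assert abs(S_direct - S_Z) < 1e-8
```

Either sign convention gives the same survival probability, $|S(t)|^2 = |Z(\beta + it)|^2 / Z(\beta)^2$, which is the SFF.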
Finally, although we have shown that complexity dynamics in the TFD state depends only on the spectrum, if we start with a general quantum state $|\psi(0)\rangle$, complexity growth will depend both on the spectrum and on the structure of the energy eigenstates. Indeed, for a general initial state the survival amplitude is

$$ S(t) = \sum_n |\langle n | \psi(0) \rangle|^2\, e^{i E_n t}, \qquad (34) $$

and depends on the overlaps between the evolving state and the eigenstates.

IV. RELATION TO KRYLOV COMPLEXITY
Our approach to state complexity is related to the notion of Krylov operator complexity, which was put forward in [47] and developed in [48-63], based on the Lanczos approach [13] to operator dynamics in many-body systems. This approach starts with a Hamiltonian $H$ and a time-dependent Heisenberg-picture operator $\mathcal{O}(t)$, determined by

$$ \mathcal{O}(t) = e^{iHt}\, \mathcal{O}\, e^{-iHt} \qquad (35) $$

in terms of the initial operator $\mathcal{O}(0) \equiv \mathcal{O}$. Taylor expanding in time we obtain

$$ \mathcal{O}(t) = \sum_{n=0}^{\infty} \frac{(it)^n}{n!}\, \tilde{\mathcal{O}}_n, \qquad \tilde{\mathcal{O}}_n = \mathcal{L}^n \mathcal{O} = [H, [H, \cdots [H, \mathcal{O}] \cdots ]], \qquad (36) $$

where $\mathcal{L} = [H, \,\cdot\,]$ is the Liouvillian superoperator that produces the commutator with the Hamiltonian. To define a notion of support of the operator we need to introduce an inner product: at a technical level, we need to endow the operator algebra with the structure of a Hilbert space. The choice of such an inner product is one of the potential ambiguities of this approach. Concretely, write $|\mathcal{O})$ for the vector in an auxiliary Hilbert space corresponding to the operator $\mathcal{O}$, and consider the following family of inner products [13]:

$$ (\mathcal{O}_1 | \mathcal{O}_2)_\beta^{g} = \int_0^\beta d\lambda\, g(\lambda)\, \big\langle e^{\lambda H}\, \mathcal{O}_1^\dagger\, e^{-\lambda H}\, \mathcal{O}_2 \big\rangle_\beta, \qquad (42) $$

where $\langle \cdot \rangle_\beta$ denotes the thermal expectation value and we require $g(\lambda) \ge 0$, $g(\beta - \lambda) = g(\lambda)$, and $\int_0^\beta d\lambda\, g(\lambda) = 1$. Once we have chosen an inner product, the Lanczos approach starts from the $|\tilde{\mathcal{O}}_n)$ and derives an orthonormal basis $|\mathcal{O}_n)$, the Krylov basis, by Gram-Schmidt orthogonalization. The Krylov basis is defined recursively as

$$ |A_n) = \mathcal{L}\, |\mathcal{O}_{n-1}) - b_{n-1}\, |\mathcal{O}_{n-2}), \qquad |\mathcal{O}_n) = b_n^{-1}\, |A_n), \qquad (43) $$

where we have defined the Lanczos coefficients

$$ b_n = (A_n | A_n)^{1/2}. \qquad (44) $$

This iterative process has the initial conditions $b_0 \equiv 0$ and $|\mathcal{O}_0) = |\mathcal{O})$, the initial operator; for Hermitian initial operators and the symmetric inner products above, the diagonal coefficients vanish, which is why only the $b_n$ appear. The Lanczos coefficients produced in this way are in general different from those produced by the Hamiltonian of the theory acting on states in the original Hilbert space (18). We can now expand the time-dependent operator in the Krylov basis as

$$ |\mathcal{O}(t)) = \sum_n i^n\, \varphi_n(t)\, |\mathcal{O}_n). \qquad (45) $$

The amplitudes $\varphi_n(t)$ can be shown to satisfy the following Schrödinger-type equation:

$$ \partial_t\, \varphi_n(t) = b_n\, \varphi_{n-1}(t) - b_{n+1}\, \varphi_{n+1}(t). \qquad (46) $$

With this equation and the Lanczos coefficients $b_n$ we can solve for the amplitudes $\varphi_n(t)$ with initial condition $\varphi_n(0) = \delta_{n0}$. The Lanczos approach suggests a natural measure of operator complexity, dubbed Krylov complexity in [47].
It is defined as the average spread of the operator in the Krylov basis,

$$ K(t) = \sum_n n\, |\varphi_n(t)|^2. \qquad (47) $$

The relation of Krylov complexity to the present spread complexity is transparent once we have chosen the inner product. Such a choice maps Heisenberg evolution of the operator to Schrödinger evolution in the auxiliary Hilbert space, with the Liouvillian acting as the Hamiltonian. Applying our approach to such a Hilbert space and the pertinent initial state $|\mathcal{O})$, our notion of spread complexity becomes Krylov complexity. In other words, any Krylov complexity can be understood as quantum state complexity in a certain auxiliary Hilbert space. However, state complexity, as we defined it, is more general, and gives more fine-grained information about the quantum dynamics. Notice that operator growth, at least as it is conventionally defined, only sees one set of Lanczos coefficients, namely the $b_n$, while for quantum states we typically have both sets of Lanczos coefficients $a_n$ and $b_n$. Equivalently, not all quantum state complexities can be understood as Krylov complexities (at least not in a conventional manner). In this sense, our approach generalizes Krylov complexity.
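The auxiliary-Hilbert-space construction can be made concrete by vectorizing operators. The sketch below is our own illustration at infinite temperature, where the inner product (42) reduces to the normalized trace $(\mathcal{O}_1|\mathcal{O}_2) = \mathrm{Tr}(\mathcal{O}_1^\dagger \mathcal{O}_2)/\mathrm{Tr}\,\mathbf{1}$; the toy Hamiltonian and operator are assumptions. It builds the Liouvillian as a matrix, runs the Lanczos recursion, and checks that the Liouvillian is tridiagonal with vanishing diagonal in the operator Krylov basis, so only the $b_n$ appear, as stated in the text.

```python
import numpy as np

rng = np.random.default_rng(6)
D = 4
A = rng.normal(size=(D, D)) + 1j * rng.normal(size=(D, D))
H = (A + A.conj().T) / 2

# Liouvillian as a superoperator on row-major-vectorized operators:
# vec([H, O]) = (H (x) 1 - 1 (x) H^T) vec(O).
I = np.eye(D)
Lsup = np.kron(H, I) - np.kron(I, H.T)

# Hermitian initial operator, normalized in the infinite-temperature
# inner product (O1|O2) = Tr(O1^dag O2) / D.
O = rng.normal(size=(D, D))
O = (O + O.T) / 2
o = O.flatten().astype(complex)
o /= np.sqrt(np.vdot(o, o).real / D)

def ip(x, y):                    # infinite-temperature inner product
    return np.vdot(x, y) / D

# Lanczos recursion with the Liouvillian; full reorthogonalization
# also removes the (vanishing) diagonal components.
V, v, v_prev, bn, bs = [], o, np.zeros_like(o), 0.0, []
for n in range(D * D):
    V.append(v)
    w = Lsup @ v
    w = w - bn * v_prev
    for u in V:
        w = w - ip(u, w) * u
    bn = np.sqrt(ip(w, w).real)
    if bn < 1e-10:               # operator Krylov space exhausted
        break
    bs.append(bn)
    v_prev, v = v, w / bn

# The Liouvillian in this operator Krylov basis: tridiagonal, zero diagonal.
Vm = np.array(V)
Lk = np.array([[ip(u, Lsup @ w) for w in Vm] for u in Vm])
assert np.allclose(np.diag(Lk), 0.0, atol=1e-8)
assert np.allclose(Lk, np.diag(bs, 1) + np.diag(bs, -1), atol=1e-8)
```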
As developed in [47,51], many different notions of operator complexity that have recently appeared in the literature can be simply understood as notions of the spread of the operator wavefunction, but with respect to different choices of orthonormal basis. We have shown that our measure of state complexity is distinguished in that it minimizes the spread of the operator wavefunction over all choices of basis, eliminating any basis ambiguity. This idea can be applied to operator complexity as well.
But this Wightman inner product choice is arbitrary and calls for a deeper understanding. In fact, [51] showed that different choices of inner product can be related to each other in a simple manner, but that Krylov complexity behaves differently with respect to each choice. Following the philosophy of the present paper, we could further minimize Krylov complexity over the choice of inner product (42). Although we have not performed this minimization, Ref. [51] actually shows that the Wightman inner product (51) corresponds to the slowest growth of Krylov complexity.
Summarizing, the correct Lyapunov exponent arises precisely after minimizing over all possible choices of the ambiguous inner product, and over all choices of basis. This gives further support to our guiding principle that complexity should be defined via a minimization over the possible ambiguous choices.
Note that Ref. [47] argued that Krylov complexity also bounds from above (instead of from below) some other notions of operator complexity. The bound in [47] is in fact consistent with our results: Ref. [47] demonstrates their bound for a certain set of highly constrained operator complexity definitions that do not meet the criteria we have set out, such as monotonicity of the c n coefficients. An interesting example demonstrating that the notion of spread complexity we defined is a lower bound appears in the results of Ref. [29], where complexity was defined in terms of the spread of the wave-function with respect to the so-called "computational basis", e.g., in a spin model, the basis that diagonalizes all the spin operators in a chosen direction. This notion was found to saturate to its maximum value at a time of O(N ), where N is the number of spins. By contrast, as we will see below, in the Krylov basis the spread saturates at a time of O(e N ). In other words, the spread of the wavefunction in the Krylov basis is much slower.

V. ANALYTICAL MODELS
We will consider a class of models in which the spread complexity can be computed analytically by exploiting techniques developed recently in the context of operator complexity [56]. Suppose that the Hamiltonian belongs to the Lie algebra of a symmetry group: where L + and L − are raising and lowering ladder operators, and L 0 belongs to the Cartan subalgebra of the Lie algebra (see [65] for examples of such theories). The identity term contributes only a phase to the time evolution and so does not affect the associated quantum complexity, but it can be used to set the ground state energy. The coefficients α and γ are model-dependent; their meaning will become clearer in the specific examples.
Comparing with the action of the Hamiltonian in the Krylov basis (20), namely we see that, if the initial state is a highest weight state, the Krylov basis states furnish a representation of the symmetry group. In other words, Eq. (53), which provides a solution to the Lanczos recursion method by putting the Hamiltonian in tridiagonal form, also guarantees that the Krylov basis states form a representation of the symmetry. Moreover, since the action of the ladder operators and the elements of the Cartan subalgebra are fixed by symmetry, we can read off the Lanczos coefficients immediately Unitary evolution with the Hamiltonian (52) acting on a highest weight state is determined, up to the irrelevant phase δ, by a generalized Lie group displacement operator D(ξ, ξ 0 ) for ξ = −iαt, its conjugate ξ̄, and ξ 0 = −iγt. When ξ 0 = 0 this is a conventional displacement operator [66,67]. Thus, we can understand the action of the Hamiltonian as producing generalized coherent states. The amplitudes ψ n (t) of the time-evolved state in the Krylov basis |K n ⟩ are obtained by expanding these states in an orthonormal basis. The link with coherent states allows us to geometrize the notion of spread complexity following [56,58]. Below we study motion on SL(2,R), SU(2) and the Heisenberg-Weyl group. We will see that the a n coefficients can dramatically change state complexity growth. For example, suppose the b n grow linearly with n. Then systems with different a n can have very different complexity growth patterns, such as quadratic or periodic. In fact, systems with b n ∼ n and a n = 0 will have exponentially growing complexity, as we see by analogy with the operator growth analysis in [47].

A. A particle moving in SL(2,R)
We start with SL(2,R), a group previously studied in the context of operator growth in the SYK model [56].
Here we will realize it as the symmetry controlling time evolution of the TFD state of the harmonic oscillator.
As discussed above, for time evolution starting from a highest weight state, the states |h, n⟩ can be interpreted as Krylov basis elements (56), i.e., |h, n⟩ → |K n ⟩.
For some choices of the coefficients in the Hamiltonian (56) this is the time evolution of an oscillator. To see this, consider the harmonic oscillator with frequency ω, and neglect the vacuum energy since it has no effect on the dynamics. The spectrum and partition function are leading to the TFD survival amplitude (36) This expression is generally complex, but its norm is periodic in t. For the inverted harmonic oscillator, ω → −iω, which is strictly speaking not stable but which we treat as an analytic continuation of the stable oscillator, the norm decays to zero as time increases. The moments can be easily computed as where A(n, k) and A n [t] are the Euler numbers and polynomials respectively, with the convention µ 0 = 1. Starting with A 0 (t) = A 1 (t) = 1, the next non-trivial Euler polynomials are Using the Lanczos algorithm described above, we can now compute the two sets of Lanczos coefficients These results match the SL(2,R) Lanczos coefficients (60) if we pick the representation h = 1/2 with Hamiltonian coefficients This identification maps the initial TFD state for the oscillator to The time evolution of the TFD state is now understood as a time-dependent generalized coherent state on the group manifold in the sense of (55): Using the Baker-Campbell-Hausdorff (BCH) relation, the unitary evolution operator can be rewritten as where we have defined The key quantity is By using (66) to choose α and γ, which quantify the growth rates of the b n 's and a n 's respectively, we can describe the standard or inverted harmonic oscillator. The transition between these scenarios occurs when γ = 2α. 
Using the BCH relation (69), a general time-dependent coherent state is given by Using h = 1/2 for the harmonic oscillator gives and the Krylov basis is defined as The amplitudes satisfy the Schrodinger equation (27) and it is simple to verify that The complexities for the standard and inverted (ω → −iω i ) oscillators become, respectively, We see that complexity is periodic in time for a standard oscillator, but grows exponentially for the unstable oscillator, results that make intuitive sense. At large times we can approximate the inverted harmonic oscillator by where λ = ω i and t * = (2/ω i ) log (2 sin(βω i /2)). For general representations h we can compute the probability p n to be with complexity When γ > 2α the square root is imaginary and complexity is periodic, in analogy with the standard oscillator, while when γ < 2α the system has exponentially growing complexity like the unstable oscillator. At the transition point ω = 0 between the standard and inverted oscillators, γ = 2α and we have a n = γn = 2αn = 2b n for large n. Taking the ω → 0 limit in (79) from either above or below, we find that the complexity grows quadratically in time Strictly speaking, when ω = 0 the theory is free and the partition function is not well defined, but we think of this setting as an analytic continuation of the stable oscillator. We can give a second, more general argument for this behavior. We place the sites of the 1d chain in (27) on the real line with spacing ∆x n = 1/√(b n ); the evolution can be rewritten as
\[
i\partial_t \psi_n = b_{n+1}\psi_{n+1} + a_n\psi_n + b_n\psi_{n-1} = \frac{\tfrac{b_{n+1}}{b_n}\psi_{n+1} + 2\psi_n + \psi_{n-1}}{\Delta x_n^2} + (a_n - 2b_n)\psi_n . \quad (83)
\]
In the large n limit, where a n − 2b n approaches a constant V 0 and b n+1 /b n ≈ 1, this equation simplifies to We notice that this bears some similarity to the discretization of the second derivative of a function. For large n, the spacing ∆x n → 0, and the ratios ∆x n /∆x n+1 → 1. 
We instead choose a Krylov basis with phases of −1 at odd sites; in this new basis we can smoothly interpolate the values of ψ n : ψ(x n ) = (−1)^n ψ n , where x n = Σ k≤n ∆x k . We can then approximate (84) in the new basis by the Schrodinger equation of a free particle on a line A free particle travels at a constant velocity, so we expect x ∼ t in the large n limit. Since b n ∼ n at large n, the spacing ∆x n ∼ 1/√n and the position of site n is x n ∼ √n. We therefore expect √n ∼ t, and thus n ∼ t².
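This n ∼ t² scaling is easy to verify by exact evolution of a truncated chain. As an illustrative stand-in for the free limit (our choice, not from the text) we use a n = 2n + 1 and b n = n: in units β = 1 these are the Jacobi coefficients of an exponential spectral density, for which one can check that the Krylov probabilities are geometric and C(t) = t² exactly.

```python
import numpy as np

N = 300                                  # chain truncation
n = np.arange(N)
a = 2 * n + 1.0                          # a_n = 2n + 1, so a_n - 2b_n = 1
b = n.astype(float)                      # b_n = n, the free-limit slope

# Tridiagonal chain Hamiltonian acting on the Krylov amplitudes psi_n
T = np.diag(a) + np.diag(b[1:], 1) + np.diag(b[1:], -1)
w, V = np.linalg.eigh(T)                 # exact evolution via diagonalization

def complexity(t):
    """C(t) = sum_n n |psi_n(t)|^2 with psi_n(0) = delta_{n0}."""
    psi = V @ (np.exp(-1j * w * t) * V[0])   # V[0] = V^T e_0 (V is real)
    return float(n @ np.abs(psi) ** 2)
```

With these coefficients the complexity tracks t² until the wavefront feels the truncation at n = N.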

Entropic complexity and variance:
To further characterize the spread of the wavefunction across the Krylov basis (78), we can compute the entropic notion of the complexity in Sec. II B and the variance of the distribution. In the context of operator growth, these were studied in [48,58] respectively, where they were dubbed K-entropy and K-variance.
For h = 1/2 we can compute the entropy in the Krylov basis analytically where Thus for general γ and α the entropy grows linearly at late times. When γ = 2α, which describes the free limit of the oscillator, the entropy at late times shows slower, logarithmic growth This implies a quadratic growth of the entropic definition of complexity similar to our original definition in (81), but with a parametrically smaller growth rate for large scaling dimension h. Similarly, the normalized variance is The variance is also sensitive to the representation h and shows that the distribution is sharply localized around the mean for "heavy" states. Around γ = 2α, the variance approaches its large time limit more slowly, i.e., as t −2 instead of the generic exponential approach of (90).
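Given the Krylov-basis probabilities p n , both diagnostics are straightforward to evaluate numerically; the helper functions below (our names) compute the K-entropy and the normalized variance for any distribution.

```python
import numpy as np

def k_entropy(p):
    """Entropic complexity: Shannon entropy of the Krylov distribution."""
    p = p[p > 0]                          # 0 log 0 = 0 convention
    return float(-(p * np.log(p)).sum())

def k_variance(p):
    """Normalized variance (<n^2> - <n>^2) / <n>^2 of the distribution."""
    n = np.arange(len(p))
    mean = float(n @ p)
    return float((n ** 2 @ p - mean ** 2) / mean ** 2)
```

For example, for a geometric distribution p n = (1 − z)z^n the entropy is −ln(1 − z) − z ln z/(1 − z) and the normalized variance is 1/z.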
These results make sense because all the representations of SU(2) are finite dimensional. Thus, complexity growth is bounded above by the dimension of the representation. For the time evolution of highest weight states we have shown that the support of the state in the Krylov basis oscillates with time, causing the complexity to be periodic. This evolution does not show the behavior expected in chaotic theories, where the state should remain spread over the complete Hilbert space for a long period of time, leading to a plateau in the complexity.

C. A particle moving in the Heisenberg-Weyl group
Finally, consider the Hamiltonian built from the operators of the Heisenberg-Weyl algebra The action of these operators on a representation is given by Following the procedure described above, we identify the Lanczos coefficients a n = ωn + δ, b n = λ √ n .
Moreover, the unitary evolution operator can be decomposed as where Our time-evolving state is then This leads to Krylov basis coefficients (26) ψ n (t) = e^{α 1 (t)} α^n that satisfy the Schrodinger equation (27) with a n and b n given by (102). The survival amplitude is The complexity of the TFD state governed by this symmetry is For ω ≠ 0 the complexity oscillates even though there is an infinite dimensional Hilbert space. This is because the inclusion of the ωN term in the Hamiltonian (99) produces a potential energy for excitations, effectively bounding the system. This fact is reflected in the a n coefficients, which grow with n when ω ≠ 0 and ultimately produce the oscillating complexity. When ω → 0, so that the Hamiltonian does not contain a bounding potential for excitations, a n → constant, and the complexity grows quadratically in time without bound.
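The two regimes can be checked by exact diagonalization of a truncated chain built from the coefficients (102). As a cross-check (ours, consistent with the coherent-state picture), one expects C(t) = (2λ/ω)² sin²(ωt/2) for ω ≠ 0, which reduces to λ²t² as ω → 0.

```python
import numpy as np

def hw_complexity(lam, omega, t, N=80):
    """Spread complexity for the Heisenberg-Weyl chain
    a_n = omega*n, b_n = lam*sqrt(n); the constant delta only contributes
    a global phase and is dropped."""
    n = np.arange(N)
    T = (np.diag(omega * n.astype(float))
         + np.diag(lam * np.sqrt(n[1:]), 1)
         + np.diag(lam * np.sqrt(n[1:]), -1))
    w, V = np.linalg.eigh(T)
    psi = V @ (np.exp(-1j * w * t) * V[0])   # evolved amplitudes psi_n(t)
    return float(n @ np.abs(psi) ** 2)
```

With omega = 0 the complexity grows without bound as λ²t², while any omega ≠ 0 makes it oscillate, bounded by (2λ/ω)².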

VI. COMPUTATIONAL MODELS
We now seek to apply our methods to three 0 + 1 dimensional systems with chaotic dynamics: the Schwarzian theory, random matrix models, and the SYK model. The Schwarzian theory appears in the low energy approximation of the SYK model and of particular matrix models, and also in the boundary description of Jackiw-Teitelboim (JT) gravity [68][69][70][71][72][73][74][75][76][77][78][79]. Matrix models and their relation to two dimensional gravity have been known for a long time [14,[16][17][18][19], but new holographic gravity applications have appeared recently [38][39][40][41][42][43]. As such, all of these theories are known to be related via dualities to gravity in 1+1 dimensions, in particular to JT gravity. In such dualities, eternal black holes in the gravity theory are related to the TFD state of the dual 0 + 1 dimensional system. As we described above, the survival amplitude, and hence the growth of spread complexity, of TFD states is related to the analytically continued partition function (36). Previous work [14,15] has shown that the spectral form factor (and hence the partition sum) of the models considered here initially slopes from a normalized value of 1 down to an exponentially small magnitude called the dip. The dip value and its time of occurrence are not universal in chaotic systems; for the Gaussian Unitary Ensemble the dip time is of O(√N). After the dip, the spectral form factor grows linearly, a regime called the ramp, until it plateaus, after a time exponential in the entropy of the system, at a value that can be computed from the partition function. The ramp is a feature that can be traced back to the universal statistics of random matrices and chaotic models. The plateau persists until the appearance of Poincaré recurrences in a finite size system. Below we will use the relation between the survival amplitude in the TFD state and the partition function (36) to demonstrate related effects in the state complexity. 
More concretely, we find that the spread complexity shows an initial linear ramp that persists for a time exponential in the system size until a peak, followed by a slope down to a plateau. These four regimes arise from the same physics that drives the slope, dip, ramp and plateau of the spectral form factor. The linear growth and plateau demonstrate the behavior expected for complexity in chaotic systems [12]. However, the peak and slope are new features uncovered by our analysis. The complexity peak has a height a·e^S , and occurs when the system reaches the "farthest" states in the Hilbert space. The subsequent downward slope brings the complexity to a lower plateau value b·e^S . The peak and slope are controlled by spectral rigidity, a universal feature seen in the energy levels of matrix models, and presumably of chaotic systems in general.

A. The Schwarzian theory
The Schwarzian theory is 0+1 dimensional [68,72,[76][77][78][79], and the thermal Euclidean theory is defined by the action where τ ∼ τ + β, c is the inverse coupling constant (with dimensions of inverse energy), F (τ ) is the degree of freedom, and the brackets stand for the Schwarzian derivative The partition function of the Schwarzian theory is given by [72,[76][77][78]] where a is a non-universal constant that depends on the regularization scheme; it drops out of appropriately normalized physical observables. The density of states is For this system, the moments of the Hamiltonian in the TFD state are equal to the moments in the thermal ensemble, and can be calculated as where M (a, b, c) is Kummer's confluent hypergeometric function. In the semiclassical limit c/β ≫ 1, the moments at fixed n become where E is the semiclassical one-point function We can combine this with the second moment in (115) to show that the variance is much smaller than the mean squared, σ E ² = 2E/β ≪ E², i.e., this is indeed a semiclassical limit.
We can compute the survival probability from the analytically continued partition sum. For small times t ≪ β we have This type of survival probability was studied in [56], where it was established that a n = 0, b n = σ E √n. In our case (117) is an approximation, and thus we expect some non-zero a n and corrections to these values of b n . We will evaluate these numerically below. However, if we simply keep the leading dependence (118), then we know from [56] that the corresponding state complexity growth is Namely, complexity grows quadratically in time. Our arguments have shown this in the semiclassical limit for sufficiently small n, which is a good approximation at small times.
We can now consider the opposite limit of the moments, namely we let n grow to infinity while keeping c/β fixed. This limit becomes relevant at sufficiently late times where the power series expansion of the unitary time evolution operator e −iHt requires many terms to provide a good approximation. In this limit, Kummer's hypergeometric function scales as This means that the scaling of the moments with n is just controlled by the Gamma function factor in (114): In our algorithm in Sec. III, the µ n moments were related to polynomials of O(n) involving a k and b k with k < n. Thus, to achieve the n n growth of the moments, the a n , b n , or both must scale linearly with n.
The previous observations tell us that we should expect two regimes for both sets of Lanczos coefficients and for the spread complexity. In the first regime we expect a square-root law for the b n 's and quadratic growth of complexity. In the second regime we expect linear behavior of the Lanczos coefficients. As we have seen in Sec. V, the growth of complexity then depends sensitively on how a n and b n both grow, including their coefficients. Fig. 3 uses (114) to numerically evaluate both sets of Lanczos coefficients for different values of c/β, and displays the two regimes mentioned earlier: at large n both grow linearly. The relative rate of growth is depicted in Fig. 4, which shows that at large n, a n = 2b n .
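The map from moments to Lanczos coefficients can be sketched with the standard monic orthogonal-polynomial recursion. As a hypothetical check of the claims above (our example, not the Schwarzian moments (114) themselves) we feed it moments µ n = n!, an exponential spectral density in units β = 1 with the same factorial growth: both coefficient sets come out linear in n, with a n = 2b n + 1.

```python
import numpy as np
from math import factorial

def moments_to_lanczos(m, nmax):
    """Recover a_n, b_n from moments m[k] = <H^k> (m[0] = 1) via the
    three-term recursion for monic orthogonal polynomials:
        p_{n+1}(x) = (x - a_n) p_n(x) - b_n^2 p_{n-1}(x).
    Requires len(m) >= 2 * nmax."""
    def dot(p, q):                       # <p(H)psi | q(H)psi> from moments
        return sum(pc * qc * m[i + j]
                   for i, pc in enumerate(p) for j, qc in enumerate(q))
    a, b = [], [0.0]
    p_prev, p = [], [1.0]                # p_{n-1} and p_n, low degree first
    for n in range(nmax):
        xp = [0.0] + p                   # coefficients of x * p_n(x)
        an = dot(xp, p) / dot(p, p)
        a.append(an)
        if n == nmax - 1:
            break
        new = list(xp)                   # build p_{n+1}
        for i, c in enumerate(p):
            new[i] -= an * c
        for i, c in enumerate(p_prev):
            new[i] -= b[n] ** 2 * c
        b.append(float(np.sqrt(dot(new, new) / dot(p, p))))
        p_prev, p = p, new
    return np.array(a), np.array(b)
```

This textbook recursion is known to be ill-conditioned at large n; for production use one tridiagonalizes directly, but for a handful of coefficients it makes the moment-coefficient dictionary explicit.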
This was precisely the relation between the Lanczos coefficients obtained for the TFD of the harmonic oscillator in the free limit (80). Recall that this was a particular limit of motion in the SL(2,R) group. Thus, at large times, when the wavefunction also has substantial support on Krylov basis elements |K n ⟩ for large n, the time evolution of the TFD in the Schwarzian theory can be approximated by motion in the SL(2, R) group. In fact, for any n the SL(2, R) description becomes increasingly accurate in the semiclassical limit as we let c/β grow. Thus, we see that in the infinite c/β limit, where the Schwarzian theory is well described by AdS 2 [68, 72-74, 78, 79], the evolution of the TFD is well described by motion in the SL(2, R) group at all times, and the Hamiltonian and complexity operator (29) close an SL(2, R) "complexity algebra" acting in the physical Hilbert space, which in turn furnishes a representation of the group. This notion of complexity algebra has been recently studied for operator growth [56].
In Sec. V A we found that when a n = 2b n ∼ n complexity grows quadratically in time (81). Thus the large n result in (122) implies that at large times the Schwarzian theory has quadratically growing complexity. In (119) we found this quadratic growth at small times also. Thus we expect quadratic growth at all times. In Fig. 5 we confirm these expectations numerically. Fig. 5b shows that the rate of growth is controlled by the variance of the energy, namely, As we discussed, in the semiclassical limit these relations become exact because the SL(2, R) description becomes accurate at all times, and we can use the analytical results of the previous section.

B. Random Matrices
A basic conjecture states that the fine-grained structure of the spectrum of a quantum chaotic Hamiltonian is well approximated by the statistics of random matrices [80,81] (see the reviews [14,[16][17][18]). Then, given the energy-time uncertainty principle, we expect that aspects of the long time dynamics in chaotic systems will be well described by the statistics of nearby eigenvalues of the Hamiltonian in the random matrix approximation. For example, this was shown to be the case for the spectral form factor of the SYK model [15]. Therefore, since we seek to understand universal aspects of black holes and more general chaotic systems, it is natural to start by considering random Hamiltonians. We will study all three universality classes: the Gaussian Unitary Ensemble (GUE), the Gaussian Orthogonal Ensemble (GOE), and the Gaussian Symplectic Ensemble (GSE).
A random matrix theory is defined by specifying the probability of finding a particular instance of a matrix in a given ensemble. The GUE is an ensemble of Hermitian N × N matrices H ij with Gaussian measure For numerical computations, we choose units in which E 0 = 1. Then Z GUE = 2^{N/2} π^{N²/2} is the partition function of the matrix model and normalizes the probability distribution. We now perform the following procedure. We take an instance of a random Hamiltonian from the GUE, compute its eigenvalues, and construct the TFD state (32). Next we study the unitary evolution on one side of the TFD by applying the recursion method described earlier. We repeat this computation for different instances of the random Hamiltonian, different values of N , and different values of β.
To solve the recursion method described earlier we use known numerically stable algorithms for computing the Hessenberg form of any given N ×N Hamiltonian [32,33]. As discussed in Sec. III, these algorithms use Householder reflections instead of the Gram-Schmidt procedure. From the Hessenberg form we can read off the Lanczos coefficients. To compute the wavefunction in the Krylov basis, we exponentiate the Hessenberg form to obtain the matrix representing unitary evolution, and apply this matrix to the initial state. This procedure directly provides the wavefunction in the Krylov basis. From the wavefunction, we can compute the probability of being in any given Krylov basis state, and hence the complexity.
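The numerical pipeline just described can be sketched with standard routines (scipy's `hessenberg` implements the Householder reduction). The helper names are ours, and we work in the energy eigenbasis, where the one-sided TFD amplitudes are proportional to e^{−βE k /2}.

```python
import numpy as np
from scipy.linalg import hessenberg, qr

rng = np.random.default_rng(0)

def gue(N):
    """One instance of an N x N GUE Hamiltonian."""
    A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
    return (A + A.conj().T) / 2

def krylov_chain(H, beta):
    """Hessenberg (here: tridiagonal) form of H in a basis whose first
    vector is the one-sided TFD amplitude vector."""
    E = np.linalg.eigvalsh(H)
    psi0 = np.exp(-beta * E / 2).astype(complex)
    psi0 /= np.linalg.norm(psi0)
    Q, _ = qr(psi0[:, None], mode="full")   # unitary whose 1st column is psi0
    M = Q.conj().T @ (E[:, None] * Q)       # H rotated so that |K_0> = e_0
    return hessenberg(M)                    # a_n = diag, b_n = |subdiagonal|

def complexity(T, t):
    """C(t) = sum_n n p_n(t) from the Krylov-chain evolution of e_0."""
    w, V = np.linalg.eigh(T)
    psi = V @ (np.exp(-1j * w * t) * V.conj().T[:, 0])
    n = np.arange(len(w))
    return float(n @ np.abs(psi) ** 2)
```

Householder reflections leave the first basis vector untouched, so the survival amplitude and the Krylov probabilities are preserved; at β = 0 the first off-diagonal entry b 1 equals the energy standard deviation, which controls the early quadratic growth C(t) ≈ σ_E² t².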
Both sets of Lanczos coefficients show two behaviors as a function of n. For the a n there is a linearly growing regime, followed by a near-plateau (Fig. 6). The plateau for a n is approximately at zero, up to fluctuations. The transition from the ramp in n to the plateau in n occurs at n ≪ N (Fig. 6); we numerically confirmed that the transition indeed occurs at n of O(1) in the large N limit (Fig. 8). The fact that most of the a n ≈ 0 is a significant simplification for analytical methods.
For the b n we again see a sharp ramp at small values of n followed by a gradual decay with a slope of O(1/N ) to zero as n → N (Figs. 7, 9). The transition occurs at n ∼ O(1) (Fig. 9), and in fact in the large N limit the decay to zero is so slow that for any fixed interval of n, b n is approximately constant. The decay to zero as n → N occurs because the b n are hopping coefficients in the one-dimensional Krylov-basis chain (27). Thus, for any finite system size, b n must vanish when we reach the edge of the chain. Similar behavior for the b n has been found in the context of operator growth [48,53,55], except that the transition between the ramp and approximate plateau happens here at n ∼ O(1) while for operator growth it occurs at n ∼ O(log N ), namely at the order of the entropy.
Figure 10. At small n (between zero and the transition to plateau behavior) and large enough β, a n − 2b n is approximately constant. As shown in Figs. 9 and 8, a n and b n grow linearly, so we find ourselves in the free limit of the harmonic oscillator (80). The color bar indicates the value of β for each curve.
To understand the behavior of spread complexity, as described in Sec. V, we need to find the precise relation between the rate of growth of the a n and the rate of growth of the b n . Fig. 10 shows that, in the range of small n where both Lanczos coefficients grow linearly, a n + d = 2b n with d a constant, like in the Schwarzian theory. This relation between a n and b n manifests as the first plateau in the plot of a n − 2b n in Fig. 10. The second plateau in this figure at larger n occurs because a n and b n are both changing very slowly in this regime.
Recall from (2) and (27) that at short times, the time-evolving state has most of its support on Krylov basis elements |K n ⟩ with small n. As discussed above, a n + d = 2b n ∼ n in this range, just as in the free limit of the particle moving in the SL(2,R) group (80). In analogy, we expect that complexity grows quadratically at early times. At later times the time evolution will acquire support on Krylov basis elements with larger n. As we discussed above, in the large N limit a n = 0 beyond some n of O(1), and b n is roughly constant over any fixed interval of n. Using these conditions, the Schrodinger equation in the Krylov basis (27) becomes a free wave equation in one dimension, whose solutions are plane waves moving at constant speed. This implies that the mean position in the Krylov basis, and hence the complexity, grows linearly with time. This is the same regime as the one found in [48,53,55] for operator growth at large times. This regime was also found in the context of Nielsen's complexity in [5], and recently using random quantum circuits in [31].
These regimes of complexity growth are confirmed in Fig. 11, where we see a transition from initial quadratic growth to linear growth at a time of order β 3/2 (recall that we are working in units where E 0 = 1). In the quadratic growth regime, we checked numerically that, just like in the Schwarzian theory, the growth rate is controlled by the variance in energy which is of order 1/β 2 .
As we discussed above, although the b n are approximately constant over any finite interval in n in the large N limit, over intervals of O(N ) they do gradually decay to zero. This is because the Lanczos algorithm must halt when we reach the dimension of the Hilbert space. This means the support of the state in the Krylov basis cannot keep growing, but it is possible for the support to narrow back again. This means that, at large times, spread complexity should reach a maximum and then may decay or plateau. For chaotic systems we indeed expect the maximum in the complexity to be of O(N ) and a plateau at this order as well.
The dark hued curves in Fig. 12 show how state complexity changes in a variety of GUE ensembles until times of order the size of the Hilbert space (t/N ∼ O(1)). It is immediately clear that the complexity dynamics displays a characteristic overall structure: a linear ramp for times that are exponentially large in the entropy, followed by a peak, and then a downward slope to saturation at an exponentially large plateau. The onset times and heights of the peak and plateau in the complexity are O(N ), i.e., exponentially large in the entropy of the system. The initial linear ramp and plateau were conjectured for chaotic systems by [12]. We propose that the subsequent peak overshooting the plateau, followed by a downward slope are also universal characteristics of the complexity dynamics.
The dynamics of state complexity that we have displayed is related to the behavior of the spectral form factor in chaotic theories [14,15]. Recall that we showed that the spectral form factor computes the survival probability of the TFD state, namely This is also the time-dependent probability of the first state in the Krylov basis, i.e., the initial TFD state. Fig. 13 shows the spectral form factor as a function of time. The dark blue line, corresponding to the GUE ensemble, shows a downward slope that lasts until the dip (i.e., the minimum), followed by an upward ramp and then a plateau. The plateau occurs because the system has a finite, discrete spectrum. From our state evolution perspective, the time development of the TFD explores the state space broadly, and at late times the distribution over the Krylov basis is uniform after some coarse-graining over time. See [82] for a tractable model of why such coarse-graining is needed. The shape of the ramp is determined by the universal statistics of random matrices [14]. In particular, it depends on the universal correlations between eigenvalues, a structure known as spectral rigidity.
Figure 12. Dark hues: state complexity for GUE ensembles with N ∈ {1024, 1280, 1536, 1792, 2048, 2560, 3072, 3584, 4096}. Complexity grows linearly to a peak, followed by a downward slope to a plateau. Light hues: an ensemble with the same density of states as the GUE, but without correlations between eigenvalues; in this case the curves plateau without reaching a peak followed by a downward slope.
Figure 13. Dark blue: the GUE ensemble of random matrices displays a ramp followed by a plateau in the spectral form factor. Light blue: for an ensemble with the same density of states as the GUE but with no correlations between eigenvalues, the spectral form factor displays a plateau without a ramp.
We can argue that the ramp, peak, slope and plateau of complexity are analogous to the slope, dip, ramp and plateau of the spectral form factor. With regard to the plateau this analogy is transparent. Complexity achieves the plateau when the Krylov basis probabilities reach approximate stationarity after some coarse-graining in time. More precisely, as we will see below, the probabilities actually continue to fluctuate in the plateau region, but they are stationary after time averaging, or after averaging over a small neighbourhood of n, the Krylov basis index. The spectral form factor is one of these basis state probabilities: it is the probability of staying in the initial TFD state. Thus it must saturate as well. With regard to the peak and downward slope in the complexity, the analogy is more subtle. As shown in [15], the dip (i.e., minimum) in the spectral form factor occurs at a time of O(√N), while the complexity peak occurs at a time of O(N ) (see Fig. 12). Indeed, since complexity grows linearly, it can only achieve a size of O(N ), as required for the state to acquire support everywhere on the Hilbert space, at a time of O(N ). Thus, the time of the dip in the spectral form factor and the time of the peak in the complexity do not scale in the same way with N . Of course, since the complexity is a sum over all the Krylov basis probabilities, only one of which is directly related to the spectral form factor, the time scales in these quantities need not be the same.
Nevertheless, we can show that the downward complexity slope after the peak likely arises from eigenvalue correlations, just like the ramp in spectral form factor after the dip. To show this, we construct an ensemble with the same density of states as the GUE, but without spectral correlations. To do so we can draw eigenvalues at random from the Wigner semicircle distribution, rather than from a specific random Hamiltonian. Alternatively, we can take sufficiently many instances of random Hamiltonians with their associated spectra, and then construct another spectrum by randomly sampling eigenvalues from the different Hamiltonians [83]. For the present discussion it is sufficient, and simpler, to draw eigenvalues from the Wigner semicircle distribution. Complexity growth of the TFD state assuming such a spectrum is plotted in light hues in Fig. 12. We see the peak and downward slope disappear. A related effect can be seen in the spectral form factor, where the dip and ramp disappear (light blue line in Fig. 13). This is a strong hint that the downward complexity slope after the peak is controlled by spectral correlations. Further evidence arises from the GOE and GSE ensembles considered below.
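This comparison is simple to reproduce: the spectral form factor needs only the spectrum, and an uncorrelated spectrum with the GUE density can be drawn from the Wigner semicircle by rejection sampling. The sketch below (our names and normalization) sets up both ensembles.

```python
import numpy as np

rng = np.random.default_rng(1)

def sff(E, beta, t):
    """|Z(beta + it)|^2 / Z(beta)^2: the survival probability of the TFD."""
    z = np.exp(-(beta + 1j * t) * E).sum()
    return float(np.abs(z) ** 2 / np.exp(-beta * E).sum() ** 2)

def gue_eigs(N):
    """Correlated spectrum: eigenvalues of one GUE instance."""
    A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
    return np.linalg.eigvalsh((A + A.conj().T) / 2)

def semicircle_eigs(N):
    """Uncorrelated levels drawn from the Wigner semicircle on
    [-2 sqrt(N), 2 sqrt(N)] (the GUE density for this normalization),
    via rejection sampling."""
    R, out = 2 * np.sqrt(N), []
    while len(out) < N:
        x = rng.uniform(-R, R, size=4 * N)
        keep = rng.uniform(0, 1, size=4 * N) < np.sqrt(1 - (x / R) ** 2)
        out.extend(x[keep])
    return np.array(out[:N])
```

At β = 0 both spectra give SFF = 1 at t = 0 and a late-time plateau near 1/N; the difference between them shows up in the dip-and-ramp region of the spectral form factor, and correspondingly in the peak and slope of the complexity (Fig. 12).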
We can also characterize the spread of the wavefunction across the Krylov basis in terms of the entropic definition of complexity (31) or the variance of the distribution of probabilities of the basis states. These quantities are displayed in Figs. 14 and 15 and also show a ramp, a peak, slope, and plateau.
It is also illuminating to examine the explicit form of the wavefunction in the Krylov basis at different moments of time. Figs. 16 and 17 show the spread of the wavefunction over the Krylov basis for β = 0, 5 for a range of times from t = 0 until late times when the complexity has plateaued. At t = 0 the wavefunction is localized on the initial TFD state, which is also the first Krylov basis element. The dynamics then looks like a probability shockwave that starts on the initial state and propagates outward to higher basis elements, leaving a tail of probability behind. For high temperatures (β → 0), the probability is initially concentrated at the shockwave front, while for intermediate and low temperatures, the probability distribution over the Krylov basis is more concentrated in the middle of the distribution. But in both cases, when the shockwave reaches the last Krylov basis vector, it is far from being stationary. The wave bounces back, and this gives rise to the downward slope after the peak in state complexity. In the entropic definition of complexity, there is also a downward slope after the bounce of the shockwave (Fig. 14) for most temperatures. However, at infinite temperature the probability is so concentrated at the shockwave front that the distribution actually continues to spread after bouncing from the edge of the Krylov chain, so that the entropic complexity does not show a peak and downward slope in this limit (dark blue line in Fig. 14).

Figure 16. Snapshots of the probability distribution in the Krylov basis of the time-evolved TFD for a range of times, as specified above each panel. This plot corresponds to β = 0, N = 4096 and the GUE ensemble. The horizontal axis shows the index of the Krylov basis elements from 1 to 4096, and the y-axis shows the probability that the initial state has evolved into the given basis state. At t = 0 the y-axis runs from 0 to 1 and all the probability weight is on the initial state. At t = 40000 the mean probability is 1/4096. The scale of each panel is thus arranged to better show the spread of the wavefunction over the Krylov basis.

Figure 17. Snapshots of the probability distribution in the Krylov basis of the time-evolved TFD for a range of times, as specified above each panel. This plot corresponds to β = 5, N = 4096 and the GUE ensemble. The horizontal axis shows the index of the Krylov basis elements, and the y-axis shows the probability that the initial state has evolved into the given basis state. The y-axis scales differ in each panel (see caption of Fig. 16 for an explanation of this choice).
We can repeat our computations for the GOE ensemble, defined as an ensemble of real symmetric N × N matrices H with Gaussian measure, and the GSE ensemble, defined as an ensemble of N × N Hermitian quaternionic matrices with Gaussian measure. The details of the computation are the same as for the GUE ensemble. As reviewed above, these ensembles mainly differ in the specific universal correlation functions between nearby and far-away energy eigenvalues. In fact, as described in [14], spectral rigidity of the matrix ensembles, related to the correlations of far-away energy eigenvalues, controls the ramp in the spectral form factor. The shape of this ramp, and particularly the way it transitions to the plateau, strongly depends on the particular matrix ensemble (see Fig. 10 in [14]). In particular, the transition to the plateau is sharp for the GUE, smooth for the GOE, and displays a kink for the GSE. This was also verified recently for SYK models [15]. We thus might expect that the universality classes of matrix models will also differ in the dynamics of complexity. Figs. 18 and 19 plot the quantum state complexity of the time-evolved TFD over an exponentially large period of time for the GOE and GSE ensembles, with the same values of N and β as those used for the GUE ensemble. The three ensembles clearly differ, in parallel to the differences that appear in the ramp structure of the spectral form factor in these three cases: the transition to the plateau is sharper for the GUE, smoother for the GOE, and displays a kink for the GSE. In particular, following the behavior of the spectral form factor, and controlled by spectral rigidity, the wavefunction of the TFD state in the Gaussian Symplectic Ensemble, and hence the complexity, bounces twice before reaching approximate stationarity. This suggests that the complexity peak and slope originate in the spectral rigidity of the random matrix model.
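Drawing one instance from each ensemble can be sketched as follows. We realize the Hermitian quaternionic GSE matrix in its standard 2N × 2N complex form H = [[A, B], [-B̄, Ā]], with A Hermitian and B antisymmetric, whose eigenvalues come in Kramers-degenerate pairs; the function name and normalizations are ours:

```python
import numpy as np

def sample_ensemble(N, kind, rng):
    """One instance from the GOE / GUE / GSE with Gaussian measure.
    The GSE is returned as a 2N x 2N quaternion-self-dual Hermitian matrix."""
    if kind == "GOE":
        A = rng.standard_normal((N, N))
        return (A + A.T) / 2.0                      # real symmetric
    if kind == "GUE":
        A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
        return (A + A.conj().T) / 2.0               # complex Hermitian
    if kind == "GSE":
        A = sample_ensemble(N, "GUE", rng)          # Hermitian block
        C = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
        B = (C - C.T) / 2.0                         # antisymmetric block
        return np.block([[A, B], [-B.conj(), A.conj()]])
    raise ValueError(kind)
```

The doubly degenerate GSE spectrum is the microscopic origin of the distinctive kink in its transition to the plateau.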

C. SYK Model
The conjectures of [14, 16-18, 80, 81] tell us that random matrices are good approximations for the structure of the fine-grained spectrum of chaotic systems. As argued above, the energy-time uncertainty relation then tells us that these approximations will work well for describing dynamical processes at large time scales. But the detailed density of states in specific models can differ strongly from the universal statistics of random matrices, and the differences manifest themselves in different thermalization/relaxation processes at small times. We might then worry that the quantum state complexity we have defined will differ dramatically for specific chaotic Hamiltonians as compared to the random matrix models.
To examine this possibility we considered the SYK model, which is simultaneously an interesting many-body quantum system and an accurate model of the dynamics of 2d quantum black holes [68-72, 78, 79]. The SYK model is a model of N Majorana fermions ψ_i, i = 1, · · · , N, which we normalize so that {ψ_i, ψ_j} = δ_ij. The model is defined by an ensemble of Hamiltonians H = Σ_{1≤i<j<k<l≤N} J_{ijkl} ψ_i ψ_j ψ_k ψ_l, where the couplings J_{ijkl} are real, independently distributed, Gaussian random variables with zero mean and unit variance. These conventions were used in [84]. Notice that the dimension of the Hilbert space here is 2^{N/2}. Below we will normalize physical quantities by this dimension.
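Assuming a standard Jordan-Wigner representation (our choice; the text does not fix a representation), N Majorana fermions with {ψ_i, ψ_j} = δ_ij act on a 2^{N/2}-dimensional Hilbert space, and one SYK instance can be assembled as a dense matrix:

```python
import numpy as np
from itertools import combinations

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def kron_list(ops):
    out = np.array([[1]], dtype=complex)
    for o in ops:
        out = np.kron(out, o)
    return out

def majoranas(n):
    """n (even) Majorana operators on n//2 qubits with {psi_i, psi_j} = delta_ij,
    built from Jordan-Wigner strings: Z...Z X I...I and Z...Z Y I...I, over sqrt(2)."""
    q = n // 2
    psis = []
    for k in range(q):
        for P in (X, Y):
            psis.append(kron_list([Z] * k + [P] + [I2] * (q - k - 1)) / np.sqrt(2))
    return psis

def syk_hamiltonian(n, rng):
    """One SYK instance: sum over quartets i<j<k<l of real Gaussian couplings
    (zero mean, unit variance, as in the conventions quoted in the text)."""
    psis = majoranas(n)
    dim = 2 ** (n // 2)
    H = np.zeros((dim, dim), dtype=complex)
    for (i, j, k, l) in combinations(range(n), 4):
        H += rng.standard_normal() * psis[i] @ psis[j] @ psis[k] @ psis[l]
    return H
```

The product of four distinct anticommuting Hermitian Majoranas is itself Hermitian, so H is Hermitian with purely real couplings, as required.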
We now follow the same steps as before. We draw instances of the Hamiltonian from the ensemble and construct the TFD state. Each instance gives a particular 2^{N/2} × 2^{N/2} matrix, and we apply the algorithm for computing its Hessenberg form [32, 33], from which we read off the Lanczos coefficients. We then exponentiate the Hessenberg form, apply it to the initial state, and compute the wavefunction in the Krylov basis. From the wavefunction, we then compute the probabilities of the various Krylov basis elements, and hence the complexity.
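For a Hermitian matrix the Hessenberg form is tridiagonal and coincides with the output of the Lanczos recursion, so the pipeline can be sketched with a textbook Lanczos iteration (with full reorthogonalization for numerical stability); the function names are ours, and we evolve by exact diagonalization rather than by exponentiating the Hessenberg form:

```python
import numpy as np

def lanczos(H, psi0, K):
    """Tridiagonalize H in the Krylov basis generated from psi0.
    Returns diagonal coefficients a_n, off-diagonal b_n, and the basis vectors."""
    basis = [psi0 / np.linalg.norm(psi0)]
    a, b = [], []
    for n in range(K):
        w = H @ basis[n]
        a.append(np.vdot(basis[n], w).real)
        for u in basis:                      # full reorthogonalization
            w = w - np.vdot(u, w) * u
        bn = np.linalg.norm(w)
        if bn < 1e-12 or n == K - 1:         # Krylov space exhausted, or done
            break
        b.append(bn)
        basis.append(w / bn)
    return np.array(a), np.array(b), np.array(basis)

def spread_complexity(H, psi0, K, t):
    """C(t) = sum_n n |<K_n| e^{-iHt} |psi0>|^2, evolving by diagonalizing H."""
    a, b, basis = lanczos(H, psi0, K)
    evals, evecs = np.linalg.eigh(H)
    psi_t = evecs @ (np.exp(-1j * evals * t) * (evecs.conj().T @ psi0))
    p = np.abs(basis.conj() @ psi_t) ** 2    # Krylov-basis probabilities
    return float(np.sum(np.arange(len(p)) * p))
```

At t = 0 the state sits entirely on the first Krylov element, so the complexity vanishes and then grows as probability flows to higher n.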
There is an important subtlety relating the SYK model to matrix models [85]: the SYK model falls into the GUE, GOE, or GSE universality class, depending on the number of Majorana fermions and hence the nature of the particle-hole symmetry in the model. Concretely, for N mod 8 equal to 2 or 6 the GUE appears, for N mod 8 equal to 0 the GOE appears, and for N mod 8 equal to 4 the GSE appears. As we described above, the dynamics of complexity depends on the specific ensemble. Thus, for clarity, we choose SYK models in the GUE universality class and plot the results for N = 14, 18, 22, 26. The other cases, related to the GOE and GSE ensembles, proceed similarly and are not shown here. We have checked that the results in those cases are consistent with the observations in the previous section about complexity and universality, and with the results of [85].
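The stated N mod 8 pattern is simple enough to encode directly (a convenience function of ours):

```python
def syk_universality_class(n_majorana):
    """Random-matrix universality class of the SYK model as a function of N mod 8."""
    r = n_majorana % 8
    if r in (2, 6):
        return "GUE"
    if r == 0:
        return "GOE"
    if r == 4:
        return "GSE"
    raise ValueError("N must be even")
```

In particular all of N = 14, 18, 22, 26 land in the GUE class, which is why these values are used below.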
Figs. 20 and 21 show the global structure of the Lanczos coefficients. As before, there are two regimes: one showing linear growth of the Lanczos coefficients, and one in which the a_n and the b_n are approximately zero and constant, respectively. In Figs. 22 and 23 we focus on the small-n regime, to verify that the transition between these behaviors occurs at n ∼ O(1), as before. One can again verify that in the linear regime a_n + d = 2b_n ∼ n, and the dynamics resembles that of the free oscillator discussed in earlier sections.
Since the behavior of the Lanczos coefficients parallels that of the GUE matrix model, we also expect the dynamics of complexity to behave similarly. We verify this in Fig. 24, where we analyze temperatures β = {0, 1, 2, 5, 10} with N = {14, 18, 22, 26} as above. We clearly see the same regimes as for the matrix model ensembles: an initial period of quadratic growth in time, followed by a linear ramp which ends at a peak; then an approximately linear downward slope which ends as complexity equilibrates at a plateau. Given the results of the previous section, the presence of the slope signals that the SYK model shares the universal Hamiltonian eigenvalue statistics of random matrix models, as shown previously in [15, 85] using other methods. We also see from the shape of the slope and the transition to the plateau that we are analyzing SYK models in the GUE universality class.
It is interesting to compare these results with the analysis in [29]. There, a similar notion of complexity based on the spread of the wavefunction, but in the computational basis, was studied for SYK. The resulting complexity saturates at an exponentially large value at a time of the order of the entropy of the system, instead of at a time exponentially large in the entropy. This shows the importance of the minimization over all bases put forward in this article. It also gives hope that in actual physical situations, the minimization performed in Sec. II remains valid for sufficiently long times.

D. Black holes from collapse
Above, we analyzed complexity growth in the TFD states of several theories that are related to two-dimensional gravity. The TFD states of these theories are related to the physics of eternal black holes. We showed that the Schwarzian theory has a TFD state complexity that grows quadratically in time for all times, while the matrix model and SYK theories display a transition to linear growth, followed eventually by a peak, a downward slope, and a plateau.
It is interesting to instead consider states that model black holes made from collapse. In this case, the energy band of the system is fixed, although we do not know the precise microstate; thus, we are in the microcanonical ensemble, rather than the canonical ensemble. The latter is the ensemble modeled by the conventional TFD after a partial trace. To model a black hole formed by collapse we can consider a "microcanonical TFD", entangling only states in a small energy window, all weighted with roughly the same amplitude, namely one over the microcanonical degeneracy. This limits the range of energies in which to compute energy moments.
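A sketch of this microcanonical TFD: equal amplitudes on all eigenstates within a window of width ΔE around a central energy E₀ (the names and window convention are ours), together with the resulting survival amplitude that controls the moments:

```python
import numpy as np

def microcanonical_tfd_amplitudes(eigs, E0, dE):
    """Amplitudes c_n of a 'microcanonical TFD': equal weight, one over the
    square root of the microcanonical degeneracy, on states with |E_n - E0| < dE/2."""
    eigs = np.asarray(eigs, dtype=float)
    window = np.abs(eigs - E0) < dE / 2.0
    d = np.count_nonzero(window)                 # microcanonical degeneracy
    return np.where(window, 1.0, 0.0) / np.sqrt(d)

def survival_amplitude(c, eigs, t):
    """C(t) = sum_n |c_n|^2 e^{-i E_n t}, the survival amplitude of the state."""
    return np.sum(np.abs(c) ** 2 * np.exp(-1j * np.asarray(eigs, dtype=float) * t))
```

Restricting the window restricts the range of energies entering the moments of C(t), which is the point of the construction.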
Alternatively, we can consider a modified Gibbs ensemble, in which we introduce a maximum energy E_max in the definition of the TFD, so that only eigenstates with E_n ≤ E_max contribute, weighted by e^{-βE_n/2} and normalized by the truncated partition function Z_{E_max} = Σ_{E_n ≤ E_max} e^{-βE_n}. Following the previous examples, the moments are expected to have characteristic behaviors in different ranges. Suppose E_max ≥ E and n ≪ Eβ, where E is the thermal energy. Then, in a theory with a semiclassical limit, we expect that the moments will factorize and can be written in terms of the variance σ² of the energy in the thermal state. Equivalently, this corresponds to small times t ≪ β, where we can approximate the survival probability of |ψ_t⟩ = e^{-iHt}|ψ⟩ by |C(t)|² ≈ e^{-σ²t²}, with a time-dependent phase e^{iφt}. This implies that for sufficiently short times the system effectively behaves as if it is a particle moving in the Weyl-Heisenberg group (see [56] and above). As we discussed in Sec. V A, this regime is equivalently found within the SL(2, R) paradigm. Then the b_n's satisfy (118) and Krylov complexity grows quadratically in time (119).
The next regime appears if there is a hierarchical separation E_max ≫ E. In such cases we can consider moments for which E_max β ≫ n ≫ Eβ ∼ S(E). We can understand the behavior of the moments from the moment integral ∫ dE e^{S(E)−βE} E^n, which implies that the b_n's and/or the a_n's grow linearly for sufficiently large n, as discussed earlier. In this regime we cannot say for sure what the behavior of complexity is, since to do so we need to find the ratio between both sets of Lanczos coefficients. In the models studied above we always found a_n + d = 2b_n, with d a constant, which implies quadratic growth of complexity, at least for an intermediate range of times. We may conjecture that this quadratic growth is universal, but we do not have an argument establishing this. Finally, whenever there is an upper bound to the energy E_max, there is a further regime in which n ≫ E_max β. The behavior of the moments in this regime is very simple, since the moment integral ∫ dE e^{S(E)−βE} E^n concentrates on the upper limit. Using this fact, the moments scale as E_max^n. Following our reasoning above, it follows that both a_n and b_n must saturate to a constant value, implying that complexity grows linearly in time, with a rate of O(E_max). A similar phenomenon was noted for operator growth complexity in [48, 53, 55]. For systems describing black holes made from collapse we would expect E_max to be of the order of the thermal energy E, which in turn should be the mass of the black hole. In this case, the intermediate window of time with model-dependent complexity growth, which depends on the precise relation between the a_n and b_n coefficients, is absent. We are left with two universal regimes: the initial quadratic growth of complexity, controlled by the variance of the energy in the initial state, and a late regime in which complexity grows linearly at a rate controlled by the mass of the black hole. This linear growth lasts for times exponentially large in the entropy.
As argued before, at those times we expect that the dynamics is well approximated by that of a random matrix, and displays a peak, followed by a downward slope terminating at the plateau.

VII. DISCUSSION
We have provided a definition of quantum state complexity that avoids the basis ambiguity present in previous work by minimizing over all choices of basis. Usually, quantities defined through minimization are very difficult to compute. Surprisingly, we showed that the minimization required by our definition can be carried out explicitly via the Lanczos method for representing quantum dynamics, and the associated Krylov basis. We showed that the quantum state complexity we defined is intimately related to basic physical quantities such as the survival amplitude and the energy spectrum. We exploited these relations to explore how the quantum complexity of unitary evolution behaves in chaotic systems, and, in particular, how it is related to spectral statistics [80, 81]. In chaotic systems, we found linear growth of complexity for an exponentially long time, followed by a plateau, as expected from the conjecture in [12], and we further extended the conjecture by demonstrating finer-grained dynamics at exponentially long time scales. Specifically, the growth overshoots the final plateau, reaches a peak, and then decays back. Using matrix models as examples, we showed that the form of the decaying slope, and the smoothness with which it merges with the plateau, are controlled by universal statistics, namely spectral rigidity. Indeed, the form of this complexity decay to the plateau distinguishes between the three universality classes of matrix models: the Gaussian Unitary, Gaussian Orthogonal, and Gaussian Symplectic ensembles.
We have demonstrated an explicit connection between the dynamics of quantum state complexity and quantum chaos. It would be interesting to similarly analyze integrable systems, especially ones with finite Hilbert spaces where our methods can be applied most easily. This would help to construct a bridge from the classification of phases of quantum matter via partition functions, to a novel characterization of these phases in terms of the dynamics of quantum complexity.
Another important avenue is to explore the structural complexity of Hamiltonian eigenstates. This complexity lies at the heart of the black hole information paradox [86], as explored some time ago in [87]. It is also central to the Eigenstate Thermalization Hypothesis [88][89][90] that seeks to explain why the energy eigenstates of some many-body systems may be well-described by thermal ensembles. Indeed, the Eigenstate Complexity Hypothesis of [6] shows that a quantum system in which energy eigenstates cannot be easily transported into each other through simple operations will have linear complexity growth for an exponentially long time, in Nielsen's geometric sense of geodesic lengths [3]. In this vein it is worthwhile to analyze the relations between eigenstate complexity and the Eigenstate Thermalization Hypothesis in SYK, expanding on [91][92][93][94].
Finally, it would be interesting to use these results, in particular the time-evolved TFD wavefunction, to better understand the physics of the black hole interior, expanding on [95], and perhaps the conjectures relating complexity in quantum field theory to spatial volumes and action in a dual theory of gravity [44-46, 96]. Concretely, recent work has suggested that it might be possible to understand the black hole interior volume as a notion of phase space volume in a dual mechanical theory [97]. Quantum complexity, as we have defined it, is by construction the minimal notion of phase space volume explored by the system, since Hilbert space equipped with the canonical metric and symplectic structure can be understood as a phase space, and quantum dynamics becomes classical dynamics in this phase space [98], a feature exploited for quantum complexity in [8]. In this context, it is natural to explore the connections between our approach and the ideas in [97]. A related promising path is to apply the geometric approach to Krylov complexity developed recently in [56] to the TFD quantum state complexity, making the connections between symmetries, geometry and complexity manifest. It would also be interesting to study how the black hole interior may be sensitive to the quantum chaotic universality class of the underlying theory, since we have shown that the exponential-time dynamics of complexity differs between these classes. Another avenue is to analyze in the present light other types of interior processes, such as the recently considered collisions in the black hole interior [99-101].