Dynamical quantum ergodicity from energy level statistics

Ergodic theory provides a rigorous mathematical description of chaos in classical dynamical systems, including a formal definition of the ergodic hierarchy. How ergodic dynamics is reflected in the energy levels and eigenstates of a quantum system is the central question of quantum chaos, but a rigorous quantum notion of ergodicity remains elusive. Closely related to the classical ergodic hierarchy is a less-known notion of cyclic approximate periodic transformations [see, e.g., I. Cornfield, S. Fomin, and Y. Sinai, Ergodic Theory (Springer-Verlag New York, 1982)], which maps any"ergodic"dynamical system to a cyclic permutation on a circle and arguably represents the most elementary form of ergodicity. This paper shows that cyclic ergodicity generalizes to quantum dynamical systems, and provides a rigorous observable-independent definition of quantum ergodicity. It implies the ability to construct an orthonormal basis, where quantum dynamics transports any initial basis vector to have a sufficiently large overlap with each of the other basis vectors in a cyclic sequence. It is proven that the basis, maximizing the overlap over all such quantum cyclic permutations, is obtained via the discrete Fourier transform of the energy eigenstates. This relates quantum cyclic ergodicity to energy level statistics. The level statistics of Wigner-Dyson random matrices, usually associated with quantum chaos on empirical grounds, is derived as a special case of this general relation. To demonstrate generality, we prove that irrational flows on a 2D torus are classical and quantum cyclic ergodic, with spectral rigidity distinct from Wigner-Dyson. Finally, we motivate a quantum ergodic hierarchy of operators and discuss connections to eigenstate thermalization. This work provides a general framework for transplanting some rigorous concepts of ergodic theory to quantum dynamical systems.


Background and motivation
Ergodic theory [1][2][3] concerns itself with a study of the statistical properties of classical dynamical systems, centered around a mathematically precise classification of classical dynamics into different levels of randomness called the ergodic hierarchy [4,5] (see Fig. 1). These levels, such as ergodic, mixing, K-mixing and others [1][2][3][4][5] (in order of increasing randomness, discussed in more detail in Sec. 2.1), can be used to motivate different elements of classical statistical mechanics [4,5]: ergodicity justifies the use of the microcanonical ensemble, and mixing the approach to thermal equilibrium, while K-mixing is responsible for chaotic dynamics.  [4,5]. Each successive subset denotes a level of higher guaranteed randomness within the preceding level. λ indicates the maximal Lyapunov exponent, whose nonzero value is a defining signature of chaos [4,5]. (Reused from Ref. [6]) In quantum mechanics, on the other hand, it has remained unclear if any pertinent "ergodic" dynamical properties can even be defined [7]. Instead, our present understanding of quantum statistical mechanics is founded on a much less precise, but empirically successful, connection to the statistical properties of random matrices [7,8]. Direct contact with the thermalization of observables is made through a comparison of the energy eigenstates (or eigenvectors) in a "physical" basis of a system with random eigenvectors, via the Eigenstate Thermalization Hypothesis (ETH) [9][10][11][12][13][14][15][16] and related approaches [17][18][19][20][21][22][23][24][25]. While observables that satisfy ETH show thermalization behaviors resembling ergodicity and mixing [14,15], every system also has observables that do not satisfy ETH, and it is not well understood from first principles which of these are to be regarded as the "physical" observables -except as determined empirically [14,15,22].
Such observable-dependent ambiguities are avoided in the comparison of the statistics of energy eigenvalues (i.e. level statistics) of a system with those of random matrices, at the apparent cost of direct dynamical relevance to thermalization. This approach is based on the observation that on quantization, typical classically non-ergodic systems show highly fluctuating energy spectra with Poisson (locally uncorrelated) level statistics [26], while typical classically chaotic systems show rigid spectra with the local level statistics of Wigner-Dyson random matrices [27][28][29][30] (within the eigenspaces of conserved quantities associated with all classical symmetries preserved on quantization; emergent quantum symmetries and conserved quantities must also be accounted for, e.g., in certain quantizations of billiards on arithmetic domains and cat maps [7]). A semiclassical "periodic orbit" argument for Wigner-Dyson level statistics soon followed [31,32] (with further developments in, e.g., Refs. [33][34][35][36][37]), assuming semiclassical contributions from a "uniform" distribution of isolated periodic orbits in the classical phase space of a K-mixing system [7]. However, Wigner-Dyson statistics has also been seen numerically [38][39][40] on quantization of merely mixing [39,40] and merely ergodic [38,40] systems without K-mixing/chaos, and even without periodic orbits [38,40,41]. These observations highlight the need for a theoretical understanding of spectral rigidity with regard to ergodicity, without relying on K-mixing or periodic orbits (as also anticipated in Ref. [41]).
Similar trends of spectral rigidity being associated with "chaotic" behavior have been observed analytically (often using basis-dependent analogues of periodic orbit theory, for ensembles of systems) and numerically in fully quantum many-body systems [14,[42][43][44][45][46][47][48][49][50][51][52][53][54][55], where judgements of the "chaoticity" of a system have been largely based on intuition. It remains unclear exactly what kind of ergodic dynamics is represented by spectral rigidity especially in such fully quantum systems, in addition to those with a classical limit. While correlation functions of local observables have been rigorously characterized in a manner similar to the ergodic hierarchy in the specific case of dualunitary quantum circuits [56][57][58], no direct link to level statistics is known even in this case. Further, the only known dynamical signature of spectral rigidity -a (gradually weakening) suppression of extremely small (vanishing in the thermodynamic limit) quantum fluctuations representing recurrences at late times, called the "ramp" [42], appearing in autocorrelation functions [49,59] and quantum simulation protocols designed specifically to measure spectral rigidity [60,61] -is a subleading effect showing no transparent connection to the ergodic hierarchy. The fundamental open problem indicated here, of understanding the precise relationship between ergodicity and the basis-independent energy level statistics of a general quantum dynamical system, forms the central motivation for this work.

Summary of results
In this work, we develop an approach that provides a first general answer to the above problem within the fully quantum regime, by introducing suitable ergodic properties (independent of "physical" bases or observables) in the Hilbert space of a quantum dynamical system and deriving their formal connection to energy level statistics. Qualitatively, our approach is based on the following observations: 1. Cyclic permutations of a discrete set of states are the only invertible discrete processes (in other words, permutations of states) whose repeated action is ergodic, i.e., every initial state evolves into every other state in the set; 2. A quantum cyclic permutation of an orthonormal basis of states is a unitary operator whose eigenvalues are regularly spaced (roots of unity); 3. Spectral rigidity, as usually studied in quantum chaos, is essentially a measure of how close a given energy spectrum is to a regularly spaced spectrum. Thus, by defining a quantum version of ergodicity in terms of "closeness" to cyclic permutation dynamics in a finite-dimensional Hilbert space (further justified by a similar approach to classical ergodicity in the mathematical literature [2,62,63]), we show that spectral rigidity is most directly a measure of this form of ergodicity in any quantum dynamical system. In physical terms, the suppression of recurrences of initial states due to spectral rigidity, indicated by the "ramp" in quantum fluctuations, is what allows certain initial states to "cyclically" visit other states under unitary dynamics (over the time scale of the ramp) due to the conservation of probability. This "cyclic" form of ergodicity encoded in the energy level statistics is, remarkably, quite distinct from the familiar quantum thermalization process of the spreading of an initial state over a "physical" basis (which occurs before most of the ramp [43]) associated with random eigenstates.
Formally, a rigorous treatment of classical ergodicity in terms of closeness to cyclic permutations of discretized cells in a continuum phase space (or "cyclic approximate periodic transformations") was developed in Refs. [62,63] (see also Refs. [1,2,64,65] for reviews and related results). Central to this treatment are a discrete counterpart to ergodicity and a discrete prerequisite for strong mixing (see Sec. 2.2 for the precise relationship), which we respectively call cyclic ergodicity and aperiodicity. An abstract "quantization" of these properties, where the discretized phase space cells correspond to pure states in an orthonormal basis (qualitatively representing the smallest Planck-sized phase space cells allowed by quantum mechanics), yields the desired quantum notions of ergodicity. These extrapolate the classical notions to fully quantum dynamics, rather than relying on any specific quantization techniques developed for various classical systems. Such an abstract approach is justified, and perhaps even necessitated, by the established difficulty [24,25] of relating energy levels to ergodicity in a "physical" basis (see Sec. 1.1). Correspondingly, our quantum definitions are in the context of a general autonomous (i.e. time-independent) unitary quantum dynamical system and do not rely on a classical limit.
Some key physical takeaways from our approach are: 1. Classical cyclic ergodicity and aperiodicity generalize to quantum mechanics in a surprisingly direct way (unlike the continuum ergodic/chaotic properties of Fig. 1) as fundamental quantum dynamical properties in the Hilbert space, including in systems without a classical limit. This allows a general observable-independent definition of quantum cyclic ergodicity and aperiodicity.
2. Quantum cyclic ergodicity requires the existence of an orthonormal basis where every timeevolving basis state "visits" every other (fixed) basis state successively in a cyclic sequence; quantum cyclic aperiodicity requires the existence of an orthonormal basis where no timeevolving basis state "visits" its original self; both requirements apply within a specified range of times. Here, a state "visits" another if their overlap exceeds that between two generic randomly chosen states.
3. Spectral rigidity directly characterizes the presence of quantum cyclic ergodicity and aperiodicity in a system, rather than conventional ergodicity, mixing or K-mixing as believed in the "quantum chaos conjecture" [28,30]. This clarifies the precise dynamical role played by level statistics in relation to the ergodic hierarchy, beyond a generic "signature of quantum chaos" [7].
Thus, this work provides a system-independent framework that addresses the general connection between ergodic (quantum) dynamics and energy level statistics, in a way that captures the observable-independence of the latter. We further provide both analytical and numerical evidence for the applicability of this framework to different physically relevant types of energy level statistics: Wigner-Dyson (seen near-universally in intuitively "quantum chaotic" systems) and Poisson (seen near-universally in "non-ergodic" systems) as associated with random matrix theory [7], and also the non-random-matrix-like spectral rigidity of quantized Kolmogorov-Arnold-Moser (KAM) tori (classically ergodic, non-mixing systems with no periodic orbits) and fluctuating spectra in rational tori (classically non-ergodic collections of periodic orbits) which occur as subsets of classically integrable systems [5]. A depiction of the resulting logical relationships is shown in Fig. 2.  Figure 2: Depiction of the set of quantum dynamical systems in terms of the observable-independent ergodic properties proposed in this work, namely quantum cyclic ergodicity and aperiodicity. One distinct type of energy level statistics exemplifying each possibility for the presence or absence of these properties is shown (with dashed boundaries). The location of each such example in this diagram is analytically justified in the course of the main text.

Structure of this paper
The rest of this paper is organized as follows. The basic theory of observable-independent quantum ergodic properties and their connections to energy level statistics is set up in Secs. 2, 3, and 4. Sec. 2 reviews some necessary aspects of classical ergodic theory, including the use of cyclic permutations to "discretize" a dynamical system [1,2,62,63], and defines cyclic ergodicity and cyclic aperiodicity (Definitions 2.1, 2.2) as discrete, primitive properties related to ergodicity and mixing that can be extended to quantum mechanics. Sec. 3 motivates and defines their analogues in quantum mechanics (Definitions 3.1, 3.2, 3.3). Sec. 4 shows that quantum cyclic ergodicity and aperiodicity in a system are quantitatively determined, through discrete Fourier transforms of the energy eigenstates 1 (Eq. (21)), by specific measures of energy level statistics: namely, mode fluctuations [39,66,67] (Eq. (37)) and the spectral form factor (SFF) [7] (Eq. (38)). Remarkably, Wigner-Dyson spectral rigidity emerges naturally at this stage, as a direct consequence of some simple dynamical conditions (Proposition 4.2); this allows an intuitive visualization of random matrix statistics in terms of ergodic Hilbert space dynamics (Fig. 6). The subsequent sections are concerned with demonstrating applications to different situations of physical interest. Sec. 5 considers "typical" systems, relating quantum cyclic ergodicity to the "ramp" in the SFF via the conservation of probability (Sec. 5.3), and provides detailed analytical (Secs. 5.3, 5.4) and numerical (Fig. 7) evidence that quantum systems with Wigner-Dyson spectra are (quantum cyclic) ergodic and aperiodic, while those with Poisson spectra are aperiodic but not ergodic in the appropriate subspaces -covering the forms of level statistics associated with random matrices [7,8]. Sec. 6 considers 2D linear flows on tori, for which we are able to explicitly construct cyclic permutations, and analytically prove both classical and quantum cyclic ergodicity and non-aperiodicity (Theorem 6.1) as well as higher-than-Wigner-Dyson spectral rigidity (Eq. (68)) for every 2D KAM torus (where the linear flow has an irrational frequency ratio), with an analytical argument that the non-ergodic and non-aperiodic rational tori (with a rational frequency ratio) have comparable, but not identical, spectral rigidity to Poisson. This covers some physically interesting systems with atypical level statistics, and suggests a wider applicability of cyclic ergodicity and aperiodicity in characterizing quantum dynamics than random matrix theory, while establishing the possibility of analytical proofs of spectral rigidity for individual systems in this framework (which has been considered a mathematically and conceptually challenging problem [24,25]). Finally, Sec. 7 discusses some insights about the connection between the above eigenvalue-based dynamical properties and thermalization as determined by eigenstates that may be gained from cyclic permutations, in a largely semi-qualitative manner that may motivate future rigorous work.

A short review of classical ergodic theory
In classical ergodic theory [1][2][3][4], one is concerned with dynamics on a phase space (or a smaller region of interest, such as an energy shell of a Hamiltonian system) P, with an operator T t : P → P that evolves points in the space by the time t (which may be a continuous or discrete variable, corresponding to flows or maps). The main questions of interest are which regions of phase space are explored over time by an initial point, and how rapidly a typical point explores these regions. These questions are conveniently posed when there is a measure µ(A) ⩾ 0 defined for subsets A ⊆ P that is preserved by time evolution, µ(T t A) = µ(A) (in Hamiltonian dynamics, this measure is given by the phase space volume A d n qd n p). An important feature of such systems is guaranteed by the Poincaré recurrence theorem [1][2][3]: for any A ⊆ P such that µ(A) > 0, almost every point in A eventually returns to A, each within some (long) finite time (i.e. with the exceptions 1 We emphasize that the fact that the connection to level statistics is realized in specific bases, unambiguously specified relative to the energy eigenbasis, does not reduce the basis-independence of the definitions. This is in the same sense in which the energy eigenvalues are basis-independent, even though a unitary time evolution operator is only diagonal in the energy eigenbasis. forming a set of measure zero). Given such a measure, how well an initial point explores the phase space is generally expressed through correlation functions of various sets, the behavior of which is classified into the ergodic hierarchy [4,5]. In what follows, we normalize µ(P) = 1.

The classical ergodic hierarchy
We first ask whether almost all initial points explore every region of nonzero measure in P. If so, the dynamics is said to be ergodic in P. If not, P can be decomposed into (say) M subsets that are invariant under T, i.e. P = M j=1 P j (each with a measure induced by µ), such that the dynamics is ergodic within each P j . Ergodicity in P can be shown to be equivalent [1][2][3][4][5] to time-averaged correlations (e.g. presence of the system in a region B ⊆ P) becoming independent of the initial condition (e.g. starting in the region A ⊆ P) with only measure zero exceptions: These measure zero exceptions could be isolated periodic orbits or other closed surfaces of lower dimension than P. Here, we use dt either as a continuous integration measure or that corresponding to a discrete sum, depending on the domain of t.
Mixing is a property in which time evolution eventually becomes uncorrelated with initial conditions, and represents how rapidly typical points explore a phase space region P on which time evolution is ergodic. Generically, a system is mixing if any ensemble A with sufficiently many 'neighboring' initial states (i.e. of nonzero measure) in P eventually spreads out equally according to certain criteria over every region of P (with deviations from this behavior allowed for a vanishing fraction of times in the case of weak mixing). The simplest such criterion is expressed in terms of two element correlation functions eventually becoming uncorrelated [1][2][3][4][5]: and is conventionally merely called mixing (weak mixing corresponds to the limit converging except at a vanishing fraction of times, instead of exactly for all t → ∞ [3][4][5]). This can be extended to higher order correlation functions [2,68], and the dynamics is said to be K-mixing when all higher order correlation functions become uncorrelated in the above sense [2,4,5], which leads to chaotic behavior, e.g. nonzero Lyapunov exponents [4,5]. These criteria form a hierarchy in the sense that K-mixing implies mixing, which implies ergodicity [4,5]. Additional levels of randomness may also be considered [4,5]; see Fig. 1 for a depiction of the hierarchy of Ref. [4]. It is interesting to note that if one defines a unitary operator U T induced by T on phase space functions f(x ∈ P) via U t T f(x) ≡ f(T t x) (Koopman and von Neumann's Hilbert space representation of classical mechanics [1][2][3][63][64][65]), some of these properties can be translated to those of the eigenvalues and eigenfunctions of U T , whose direct extensions to quantum mechanics have been previously considered [69]. For instance, ergodicity translates to non-degenerate eigenvalues with eigenfunctions of uniform magnitude, and weak mixing to a continuous spectrum with no non-constant eigenfunction, of U T [3]. For a discrete and finite quantum spectrum that corresponds to phase spaces or energy shells of finite measure by Weyl's law [7], the eigenvalues are almost always non-degenerate (i.e. are non-degenerate or can be made so by infinitesimal perturbations) and the spectrum is necessarily discrete, prompting us to seek alternate avenues in which the above properties are at best emergent in the classical limit.

Discretizing ergodicity with cyclic permutations
We eventually want to understand how quantum mechanics with its discrete set of energy levels can lead to ergodic and mixing behaviors, defined classically for continuous systems. A useful bridge between the continuum and discrete descriptions is offered by the technique of discretizing an arbitrary dynamical system in terms of cyclic permutations, which have been studied as "cyclic approximate periodic transformations" in Refs. [1,2,[62][63][64][65]. Here, we discuss and adapt the elements of this framework that are most relevant for our purposes, following Ref. [63], leading to the definition of cyclic ergodicity and cyclic aperiodicity respectively as a discrete version of ergodicity and a discrete prerequisite for strong mixing. These ideas are then illustrated with simple examples, which are directly used later in the study of quantized KAM tori in Sec. 6.
A simple motivation for studying cyclic permutations is as follows. All invertible maps on a discrete set of states act as permutations of these states. Every permutation can be decomposed into a set of cyclic permutations, each acting on a separate subset of states. Among these, the only ergodic permutations -under whose repeated action every discrete state visits every other state -are cyclic permutations of all states with no further decomposition into subsets. This suggests the strategy of probing the ergodicity of an invertible autonomous dynamical system by comparing its dynamics to the repeated action of a cyclic permutation on some discretized states, in the continuum limit.
To this end, let C = {C k } n−1 k=0 be a decomposition of the phase space P into a large number of µ-disjoint (i.e. with measure zero intersection) closed sets of identical measure, µ(C k ) = 1/n; we will implicitly consider C to be part of a well-defined sequence of such decompositions, with n → ∞ through a subset of the positive integers. Introduce a time evolution operator T C on P that effects a cyclic permutation of the C k , i.e. T C C k = C k+1 (with (n − 1) + 1 ≡ 0 i.e. the addition is modulo n). For some choice of discrete time step t 0 , we would like to see if successive actions of T t 0 on the C k closely resemble the cyclic permutation effected by T C . Thus, as a measure of how well T C approximates T t 0 over a single time step, we define the single-step error of the permutation (differing from that in Ref. [63] by the factor of 1/2): where The error measures the fraction of T t 0 C k that lies outside C k+1 , averaged over all initial sets C k . We will often directly call C the cyclic permutation in place of T C for brevity, as any T C that cyclically permutes the elements of C has the same error. We also note that we can choose t 0 to depend on n, in particular with t 0 → 0 as n → ∞ for a flow with a continuous time variable, so that the discrete steps T pt 0 with p ∈ Z approach the continuous time flow T t with t ∈ R as the decomposition becomes more refined with increasing n. As a simple example, the rotation T t θ 0 = θ 0 + ωt (modulo 2π, which will be left implicit in what follows) on a 1D circle with angular coordinate θ ∈ [0, 2π) is approximated by the n-element cyclic permutation C comprising of equal intervals of the circle C k = {θ ∈ [2πk/n, 2π(k + 1)/n]}, with zero error if t 0 (n) = 2π/(nω). In contrast, higher dimensional ergodic systems typically take infinitely long to explore all of P, and we expect t 0 n → ∞ in addition to t 0 → 0 in such cases. A schematic of a generic cyclic permutation is depicted in Fig. 3. The single-step error ϵ C (t 0 ) can serve as a probe of ergodicity. Refs. [62,63] showed that for a given dynamical system, the existence of any cyclic permutation with error ϵ C (t 0 ) < 2/n guarantees ergodicity as long as all nonzero measure sets can be constructed as unions of the C k in the n → ∞ limit, while ϵ C (t 0 ) < 1/n with t 0 n → ∞ rules out (strong) mixing; a version of their proof is recounted in Appendix A for the interested reader. However, these results allow room for cyclic permutations with a larger single-step error to lead to ergodicity or prohibit mixing. Here, we define cyclic ergodicity (a notion implicit in their proofs) and cyclic aperiodicity (based on an observation in Ref. [63]) as more fundamental discrete dynamical properties based on cyclic permutations that can be used to partly characterize the ergodic and mixing nature of dynamical systems as described below.

Classical cyclic ergodicity and aperiodicity
Cyclic ergodicity is a discrete counterpart to ergodicity in a continuous system, that directly formalizes the notion of "(almost) all points visiting the neighborhood of every other point" in P; the additional cyclic ordering of the "points" ensures that the discretized dynamical system is invertible (and first-order/"memoryless"). In terms of the decomposition C, we want every C k to visit (i.e. have nonzero intersection with) every other C k ′ at least once during its future and past time evolution in steps of t 0 . The cyclic structure (in the ordering of the C k ′ s for all C k s) can also be motivated as follows: if a given initial set C j intersects almost all of C k after one step of T t 0 and then C ℓ after two steps, C k must intersect almost all of C ℓ after one step. Formally, we have the following definition that considers a C k to visit another if their fractional overlap (overlap divided by 1/n, the measure of each C k ) is larger than a given precision η(n) as n → ∞, such as η(n) = n −1 (depicted schematically in Fig. 4): Classical cyclic ergodicity). A cyclic permutation C shows cyclic ergodicity with a given precision η(n) > 0 iff any element C j ∈ C sequentially intersects a sufficiently large fraction of every other C k ∈ C at least once under (future and past) time evolution: where p represents the number of integer steps of time evolution in units of t 0 .
For a given dynamical system, if there exists a sequence (implicitly labeled by n) of cyclic permutations C such that 1. C shows cyclic ergodicity for at least one choice of η(n), and 2. C satisfies the generating property: every nonzero measure set in P contains at least one C k in the n → ∞ limit, then it follows rigorously that every nonzero measure set visits every other, and T t 0 is therefore ergodic in P (see Refs. [2,63] and Appendix A; we also recall that generally t 0 → 0 for a continuous flow). It is also convenient to call the system cyclic ergodic in P for a given η(n), without reference to the generating property, if it admits a sequence of cyclic ergodic C with precision η(n) (mainly to anticipate its quantum counterpart in Sec. 3). For strong mixing (Eq. (2)), we require any initial region e.g. C k to become spread out over all of P as t → ∞. This requires that each C k on average intersects no more than a vanishing fraction of itself in the n → ∞ limit (so that T t C k is not preferentially distributed in C k for almost all C k ), for any t with t → ∞. Correspondingly, we call a system cyclic aperiodic if every sequence of cyclic permutations satisfies cyclic aperiodicity (a necessary condition for strong mixing): Definition 2.2 (Classical cyclic aperiodicity). A cyclic permutation C shows cyclic aperiodicity iff T t C k never returns to intersect a non-vanishing fraction of C k on average, at all times later than t(n): for every t(n) → ∞ as n → ∞.
In light of these definitions, Refs. [62,63] effectively show that ε C (t 0 ) < 2/n for a cyclic permutation implies cyclic ergodicity, while ε C (t 0 ) < 1/n with t 0 n → ∞ implies a violation of cyclic aperiodicity (note that the reverse implication in both cases is not always true), as the error generated in each step is not sufficient to lead to zero overlap of T pt 0 C j with C j+p by p = n/2 (thereby maintaining cyclic ergodicity) or p = n (thereby maintaining cyclic ergodicity and violating cyclic aperiodicity) respectively; see Appendix A. Thus, the existence of a sequence of cyclic permutations C with ε C (t 0 ) < 2/n or ε C (t 0 ) < 1/n for a dynamical system (satisfying the generating property, with t 0 n → ∞) respectively implies that it is ergodic, or ergodic and not strongly mixing.

Examples
Example 1: A simple illustration of these ideas is afforded by the example of rotations on a circle. A continuous rotation T t θ 0 = θ 0 + ωt is ergodic (any initial point covers the entire circle) and not-mixing (the angular width of any initial region is preserved); as discussed after Eq. (3), the n-element cyclic permutation C with C k = [2πk/n, 2π(k + 1)/n] and t 0 = 2π/(ωn) (6) approximates this system with ϵ C (t 0 ) = 0. C then shows cyclic ergodicity, and violates cyclic aperiodicity as t → ∞ due to its periodicity, thereby implying ergodic and non-mixing behavior. It is also worth considering an additional static degree of freedom, e.g. a cylinder Cyl = {(θ, z)} with z ∈ [0, 1] and T t z = z, in which case T is not ergodic on Cyl. A physical example of this type is the 1D harmonic oscillator of frequency ω, with conserved amplitude A (with, say, z = tanh A) and phase θ(t). Here, the cyclic permutation C comprised of "lengthwise strips" C k = C k × [0, 1] (where C k is given by Eq. (6)) is cyclic ergodic for any given 0 < η(n) ⩽ 1; however, arbitrary nonzero measure sets in Cyl (particularly those that do not span the entire range of z for all θ, e.g. sets with z ∈ [0, 1/2]) do not contain any C k . Thus, C does not satisfy the generating property, remaining consistent with the non-ergodicity of T on Cyl in spite of its cyclic ergodicity. Example 2: Somewhat more nontrivial is the discrete rotation on the circle, in steps of the angle 2πα with 0 ⩽ α < 1, which is readily seen to be ergodic only for irrational α and decomposes into infinitely many periodic orbits for rational α (and is mixing in neither case). Here, the construction of cyclic permutations relies on the approximation of α by a rational number [70]. If α = (m + δ n )/n where m, n are coprime integers and |δ n | < 1/2, we can construct an n-element cyclic permutation given by the intervals C k = [2πkm/n, 2π(km + 1)/n] and t 0 = 1, with error ϵ C (t 0 = 1) = |δ n |.
For all irrational α, we can find an infinite sequence of coprime m, n with n → ∞ satisfying by Dirichlet's and Hurwitz' theorems on Diophantine approximations [71]. The error then satisfies ϵ C (t 0 ) < 1/n in any such sequence, establishing ergodicity (as the C k can be used to construct any finite interval as n → ∞) as well as non-aperiodicity by the bounds [2,63] discussed in Sec. 2.2.1 (see also Appendix A), with the latter implying the absence of mixing (as t 0 n → ∞). This leaves the case of rational α = p/q with coprime p and q, for which it is useful to consider the regions B k = [2πkp/q, 2π(kp + 1)/q]; each point within any such B k lies on a different periodic orbit and therefore cannot visit another point in the same B k . Split any such B k into two nonzero measure regions B k1 and B k2 which consequently never visit each other. Any cyclic permutation satisfying the generating property must possess some elements C k1 ⊆ B k1 and C k2 ⊆ B k2 as n → ∞, which cannot visit each other by the preceding discussion; thus, no cyclic permutation satisfying the generating property can be cyclic ergodic for rational α.
In summary, this subsection discussed the connection between certain properties of discretized classical dynamics and levels of the ergodic hierarchy. In particular, the existence of at least one cyclic permutation satisfying cyclic ergodicity guarantees ergodicity (among regions of the phase space containing a discretized element), while mixing requires that every cyclic permutation satisfies aperiodicity (in the infinite time limit). We note that it is not generally known how to establish such properties for classical cyclic permutations except in cases with a sufficiently small single-step error [2,62,63,70] (including the examples discussed above), but we will see that this difficulty is largely mitigated at a formal level after quantization.

Dynamical quantum ergodicity and cyclic permutations
This section aims to find parallels to the discretized classical ergodic properties of the previous section in quantum mechanics, which we propose as a starting point for a precise study of quantum ergodicity. In Sec. 3.1, we define quantum cyclic ergodicity and aperiodicity for cyclic permutations of orthonormal pure states in the Hilbert space, and subsequently the ergodicity and aperiodicity of a quantum system in any subspace of its Hilbert space -which is such that the energy levels of the system encode all the relevant properties. In Sec. 3.2, we briefly discuss the formal quantum analogues of the classical error bounds (see Sec. 2.2) which can be used to constrain the time evolution of a cyclic permutation based entirely on the single-step error in certain cases; a more general characterization of quantum cyclic permutations requires a quantitative study of their connection to energy level statistics, which is taken up in Sec. 4. and D (correspondingly, possibly degenerate) eigenvalues e −iE n t : The time variable t can be chosen to be continuous or discrete, with E n respectively corresponding to the eigenvalues of a Hamiltonian or eigenphases of a Floquet map. Without loss of generality, we will use terminology associated with Hamiltonians in what follows. It is worth noting that such an autonomous unitary evolution is itself never ergodic in the Hilbert space (even after restricting to normalized states), but always has D conserved quantities |⟨E n |ψ(t)⟩| 2 for a time-evolving state |ψ(t)⟩. Moreover, all systems with rationally incommensurate/generic energy levels (including the vast majority of those considered both "quantum chaotic" and "integrable/non-ergodic") have no further conserved quantities, and consequently have the exact same number of ergodic subsets in the Hilbert space [14,15]. Thus, a different conceptual basis is necessary to define and understand quantum ergodicity in an observable-independent manner, while maintaining a connection to some meaningful notion of ergodicity. The main takeaway from this section is that suitably defined quantum cyclic permutations of pure states are a promising candidate for this purpose, allowing a natural quantization of cyclic ergodicity and aperiodicity.
Before embarking on a detailed discussion of quantum cyclic permutations (which may be defined in systems with or without a classical limit), we mention a semiclassical motivation for considering orthonormal pure state cyclic permutations. Weyl's law [7] (generally used for semiclassical calculations of the density of states) effectively assigns to each phase space region A a number of orthonormal pure states P(A) ∝ µ(A) in the semiclassical regime; see also Refs. [72,73] for a related "quantum measure algebra". With A = C k in an n-element classical cyclic permutation C, we have µ(C k ) = (1/n) → 0 classically as n → ∞, suggesting that it is natural to associate the smallest number P(C k ) = 1 of pure states with each C k , i.e. to consider pure state cyclic permutations in the fully quantum description to represent the classical n → ∞ limit. The invertibility of the cyclic permutation in the discretized classical system translates to the unitarity of the associated quantum cyclic permutation of an orthonormal basis.

Pure state cyclic permutations for quantum dynamics
We work in an invariant subspace Σ d ⊆ H (an 'energy subspace') spanned by any subset of suitably relabeled eigenstates {|E n ⟩} d−1 n=0 ;Û H (t) will henceforth refer to the restriction of the time evolution operator to Σ d unless specified otherwise. In practice, Σ d may be chosen depending on convenience to be e.g. in most cases, an energy shell of a physical system spanned by all levels with energies in a range [E, E + ∆E] (which is most likely to show "ergodicity" for any width ∆E less than the energy scale to which spectral rigidity extends as discussed in Sec. 4.2), or the restriction of such a shell to a subspace with fixed values of conserved quantities showing spectral rigidity in systems with additional symmetries. The main question of physical interest is whether a physical system is ergodic within such a (restricted) energy shell. However, the formal considerations of this section apply quite generally to any energy subspace.
We seek pure state cyclic permutations that approximateÛ H (t) within this energy subspace. To this end, let C = {|C k ⟩} d−1 k=0 be an orthonormal basis spanning Σ d with the unitary cycling operatorÛ C |C k ⟩ = |C k+1 ⟩. The eigenvalues ofÛ C are necessarily distinct d-th roots of unity, n=0 . It is convenient to introduce the p-step persistence amplitudes of U H (pt 0 ) relative to the action ofÛ p C , for some choice of t 0 ; these satisfy z k (p, t 0 ) ∈ [0, 1], and represent the overlap amplitude between the time evolved |C k ⟩ and the original |C k+p ⟩. Then, we say thatÛ C approximatesÛ H (t 0 ) with p-step error where Z n = {0, . . . , n − 1}. A pure state approximation scheme for unitaries has been constructed in Ref. [65], in analogy with certain classical non-cyclic transformations (indirectly related to classical cyclic permutations [74]), to formalize results on e.g. the (non-)degeneracy of the classical unitary U T in classical ergodic theory [1,3,63,74]. As we will see in Sec. 4.1, the construction of pure state cyclic permutations as above allows us to go much further, and tackle non-trivial measures of the level statistics ofÛ H (t) that can e.g. distinguish between Wigner-Dyson and Poisson statistics, seen respectively in typical "quantum chaotic" and "non-ergodic" systems.
In analogy with the definitions for classical cyclic permutations (Eqs. (4) and (5)), we can define cyclic ergodicity and cyclic aperiodicity for these pure state quantum cyclic permutations as below (see Fig. 5 for a schematic depiction, and Fig. 6 in Sec. 4.2 for examples with exact numerical data).

Definition 3.1 (Quantum cyclic ergodicity).
A pure state quantum cyclic permutation C shows cyclic ergodicity with precision η(d) iff Eq. (13) states that (C shows cyclic ergodicity iff) any initial state |C k ⟩ ∈ C "visits" all the other elements of C sequentially with sufficiently large overlap (i.e. z k (p, t 0 ) > η(d)), at least once in its future and past evolution. Similarly, in place of a vanishing average self-intersecting fraction for classical cyclic aperiodicity in Eq. (5), we define quantum cyclic aperiodicity in terms of a sufficiently small mean overlap amplitude of a pure state in C with itself, with an additional restriction to a time interval (t 1 , t 2 ) with 0 < t 1 < t 2 : Definition 3.2 (Quantum cyclic aperiodicity). A pure state quantum cyclic permutation C shows cyclic aperiodicity in a time interval (t 1 , C 0 Haar-random persistence (a) Cyclic ergodicity and aperiodicity.
is some monotonic function andÛ p C is extrapolated to non-integer p in some convenient manner (e.g. connecting the |C k ⟩ along some smooth path). The basis vectors |C k ⟩ are depicted by arrows representing the corresponding axes. Each of (5a) and (5b) may be loosely regarded as a "quantization" of the respective classical versions in Fig. 4. The trajectory indicates the persistence amplitude z 0 (p, t 0 ) of the initial state |C 0 ⟩ (1 at the outermost boundary, 0 at the center) in the radial direction up to |p| = 5 (visible trajectory) and beyond ("Haar-random persistence"). The region of "Haar-random persistence" refers to z 0 (p, t 0 ) ≲ O(d −1/2 ), which includes all typical Haar random states by canonical typicality [17,18]. Consequently, this region has by far the largest (Haar) volume in Σ d , while the depicted outer regions of non-random overlap with the |C k ⟩ together form a relatively tiny fraction of the space.
The question of interest is now the choice of precision η(d) and the time interval (t 1 , t 2 ) that are most useful in physical situations. An important consideration for the former is suggested by canonical typicality [17,18], which here refers to the observation that a typical "Haar random" [8] pure state |ψ⟩ in Σ d (by which we mean a normalized state ⟨ψ|ψ⟩ = 1 that is randomly chosen with respect to the Haar/"uniform" distribution, invariant under unitary transformations, in the space of such states) has what we call a "random" overlap with every pure state in a given orthonormal basis , in a fairly strong sense (i.e. n j=1 |⟨ϕ k j |ψ⟩| 2 ≈ n/d for large n, appearing equally distributed [17,18] over large collections of the |ϕ k ⟩). Thus, the action ofÛ H (t 0 ) on any generic choice of cyclic permutation C ⊂ Σ d , such thatÛ H (t 0 )|C k ⟩ looks Haar random, is overwhelmingly likely to have p-step persistence amplitudes of z k (p; t 0 ) ∈ O(d −1/2 ) for almost all p in any system. Consequently, the simplest nontrivial definition of cyclic ergodicity for a quantum cyclic permutation (i.e. one not automatically satisfied by a generic cyclic permutation in every system) would be one that requires a larger overlap than this "trivial" random value to consider a given state as having "visited" another state. Thus, the most physically useful choices of η(d) should formally satisfy η(d) ∈ O(d ε−1/2 ) for any ε > 0 to account for all "non-trivial" overlaps and η(d) / ∈ O(d −1/2 ) to exclude trivial overlaps as d → ∞, while taking values physically regarded as "somewhat larger than" d −1/2 for finite d. For any such choice, we use the asymptotic notation: Now we consider the physically appropriate time interval (t 1 , t 2 ) (which stands in contrast to allowing/requiring infinitely long times in classical aperiodicity). The quantum recurrence theorem [75] (Poincaré recurrence for the flow φ n (t) = φ n (0) − E n t of phases φ n = arg⟨E n |ψ(t)⟩ in the energy eigenbasis) guarantees that aperiodicity will eventually be violated for every cyclic permutation after some extremely long time (possibly exponentially large in d; see e.g. [76] for a related discussion of recurrences), and we also do not want to rule out recurrences for small times t ∼ O(1)t 0 when the state is still "near" the initial state. For this reason, are physically appropriate choices, in which case we will write in place of |t| ∈ (t 1 , t 2 ) in Eq. (14). To summarize, Definitions 3.1 and 3.2 rigorously apply for any finite d, while our choice of precision η(d) and aperiodicity time interval (t 1 , t 2 ) suitable for physics are informed by (and mathematically rigorous in) the d → ∞ limit. We emphasize that it is cyclic ergodicity and aperiodicity -among classically relevant dynamical properties -that have direct quantum dynamical counterparts as given by Definitions 3.1 and 3.2, allowing an observable-independent definition of quantum ergodicity in Definition 3.3 below. While "continuum" (or conventional) ergodicity and mixing are classically fundamental and observable-independent, any attempted quantization appears to acquire a strong dependence on the choice of "physical" observables [14,15,22] due to the necessary involvement of eigenstates (this is briefly discussed further in Sec. 7.1). Consequently, we will often drop the "cyclic" qualifier for quantum cyclic ergodicity and aperiodicity, and treat them as the fundamental quantum ergodic properties of interest in the remainder of this paper.
It is now convenient to define when a quantum system is ergodic or aperiodic in an energy subspace Σ d . For ergodicity, we require the existence of an ergodic cyclic permutation in Σ d , with the range of the time step t 0 restricted by t 0 d < T , for T small enough to avoid the time scale at which quantum recurrences may unavoidably occur in any system. For aperiodicity, we note that a cyclic permutation composed of the energy eigenstates, e.g. |C k ⟩ = |E k ⟩, violates aperiodicity (and ergodicity) with the maximum possible value (1) for the mean overlap in Eq. (14); we are therefore able to construct non-aperiodic cyclic permutations (those that violate Eq. (14)) for any quantum system and cannot require all cyclic permutations to be aperiodic (as a prerequisite for mixing), a situation which has no classical analogue in the absence of superpositions. On the other hand, the aperiodicity of a cyclic permutation C is governed by the following inequality: which follows from writing the trace in the {|C k ⟩} basis and using the triangle inequality for complex numbers (we will see in Sec. 4.1 that this inequality is saturated by some cyclic permutation in every system). Thus, systems with sufficiently large (≫ O(d 1/2 )) trace ofÛ H (t) would possess no cyclic permutation satisfying aperiodicity. Correspondingly, it is natural to call a quantum system aperiodic if it admits an aperiodic cyclic permutation. We then have the following definitions of ergodicity and aperiodicity, pertaining to a dynamical (i.e. time-domain) version of quantum ergodicity (which is distinct from the use of "quantum ergodicity" in the mathematical literature to refer to the delocalization of energy eigenstates in a given basis [24,25]):

Definition 3.3 (Ergodicity and aperiodicity of a quantum system).
We call a quantum system (dynamically) ergodic with precision η(d) in the energy subspace Σ d within a time T > 0, if it admits at least one cyclic permutation that is ergodic with precision η(d), satisfying t 0 d < T . Similarly, the system is While this definition is stated in general terms, we will assume the physically motivated precision and time interval specified by Eqs. (15), (16) and (17) in all physical applications considered later.
As a simple example, every system is always ergodic and not aperiodic in any subspace Σ 1 consisting of a single energy level. For typical quantum systems, we will implicitly assume a choice of T that is as large as possible while being much less than the quantum recurrence time scale for a generic quantum system, e.g. evolution by a (Haar) random unitary in Σ d . In general, identifying which energy subspaces Σ d of the system satisfy these properties would provide an observable-independent characterization of the ergodicity of a quantum system. In anticipation of Sec. 4.1, we emphasize that Definition 3.3 is insensitive to unitary transformations of Σ d (which would simply map cyclic permutations in Σ d to each other), and consequently independent of any observables or measurement bases that one may consider; whether a system is ergodic or aperiodic in the above sense is then determined entirely by the unitary invariants ofÛ H (t 0 ) within this subspace: the energy eigenvalues {E n } d−1 n=0 .

Quantum bounds from the single-step error
Similar to the classical case, we can rigorously prove ergodicity or disprove aperiodicity for a cyclic permutation given just the single-step error ε C (1, t 0 ). In the quantum case, this is made possible by noting that the single-step deviationÛ ∆ =Û † CÛH (t 0 ) of time evolution from a cyclic permutation is unitary, corresponding to a (complex) rotation in the Hilbert space Σ d ; it can thus be described by effective angles of deviation θ k (p) = arccos z k (p, t 0 ) of eachÛ H (pt 0 )|C k ⟩ from U p C |C k ⟩. A simple application of the triangle inequality in Σ d (see Appendix B.1) shows that the p-step angle of deviation cannot exceed the sum of the corresponding single-step angles (with the bound being saturated for a 2D rotation of a vector in successive steps of the θ k (p)): Using ε C (p, t 0 ) = sin 2 θ k (p), it follows from here that ε C (1, t 0 ) < π 2 /d 2 implies ergodicity and ε C (1, t 0 ) < π 2 /(4d 2 ) implies non-aperiodicity for C (for large d); we emphasize that these are one-way implications. These results are of interest to the extent that they are the direct extension of the classical bounds of Refs. [62,63] (see Sec. 2.2) to pure state cyclic permutations; however, they are similarly restricted in their applicability to special systems that admit sufficiently small single step errors as in the classical case. In the next section, we will show that an analysis in terms of the energy level statistics of Σ d allows one to characterize ergodicity and aperiodicity in much more general terms in the quantum case.

Optimal cyclic permutations and energy level statistics
In this section, we explicitly identify how specific measures of energy level statistics determine the ergodicity and aperiodicity of a quantum system. Sec. 4.1 constructs certain cyclic permutations directly related to the energy eigenstates (specifically, their discrete Fourier transforms) and establishes results concerning their optimality for determining ergodicity and aperiodicity. Building on these results, Sec. 4.2 shows that ergodicity is directly determined by the mode fluctuation distribution [39,66,67] of the energy levels, while aperiodicity is directly determined by the spectral form factor [7].

Optimizing ergodicity and aperiodicity with Discrete Fourier Transforms of energy eigenstates 4.1.1 Discrete Fourier Transforms of energy eigenstates
To establish the ergodicity or aperiodicity of a general system in a subspace Σ d , given the corresponding energy levels {E n } d−1 n=0 , it is convenient to explicitly identify "optimal" cyclic permutations in Σ d which are the "most likely" to be ergodic and/or aperiodic in a system. A special role in this regard is played by the set of cycling operators which are diagonal in the energy eigenbasis: where q : Z d ↔ Z d ranges over all permutations of the indices n ∈ Z d = {0, . . . , d − 1}. This corresponds to cyclic permutations C which can be written as a discrete Fourier transform (DFT) of the energy eigenstates, for arbitrary phases φ n that don't influence the persistence amplitudes. In Sec. 4.1.2, we show that the minimum value ε min (p, t 0 ) of ε C (p, t 0 ) for a given p, among all cyclic permutations C in Σ d , occurs when the correspondingÛ C has the form in Eq. (20) by 1. Theorem 4.1 when ε min (p, t 0 ) < 2/d ("small" errors), and 2. a heuristic argument for "generic" energies {E n } d−1 n=0 when ε min (p, t 0 ) > 2/d ("large" errors). This shows that an ergodic system is most likely to possess an ergodic cyclic permutation among the set satisfying Eqs. (20) and (21). Further, in Sec. 4.1.3, we prove that the system is aperiodic in Σ d if and only if any and all cyclic permutations of the form in Eq. (21) are aperiodic. Thus, the "most likely" aperiodic cyclic permutation is also given by the above set. The involvement of energy eigenstates in Eqs. (20) and (21) naturally connects ergodicity and aperiodicity to functions of the energy levels {E n } d−1 n=0 . Later, in Sec. 4.2, we discuss how these properties are determined by concrete measures of energy level statistics.
We note that Eq. (21) allows an intuitive interpretation of why DFTs of energy eigenstates correspond to optimal cyclic permutations: states of this form (particularly when q sorts the E q(n) in ascending order with n) are the closest one can get to defining approximate "time eigenstates" |C k ⟩ with quantized "time eigenvalue" k (of a fictitious "time operator" conjugate to the energy in Σ d , keeping in mind the Fourier relation between e.g. canonically conjugate position and momentum variables in quantum mechanics [77]). Time evolution byÛ H (t 0 ) should then ideally cause the time eigenvalue k to be incremented by 1 in such a state, essentially functioning as a cyclic permutation of the |C k ⟩ (up to small errors caused by discretization). In other words,Û H (t 0 ) ≈Û C with arguably minimal error for such "time" eigenstates.

Optimizing ergodicity via persistence amplitudes
The optimality of cyclic permutations satisfying Eq. (20) for small errors is formalized by the following theorem. Theorem 4.1 (Optimal cyclic permutations). If the system (in some energy subspace Σ d ) admits some cyclic permutation C ′ with p-step error ε C ′ (p, t 0 ) ⩽ (2/d) for a given p and t 0 , then ε C (p, t 0 ) attains its minimum value among all cyclic permutations for a cyclic permutation C, whose cycling operatorÛ C satisfies Here,Ŷ is any fixed Hermitian operator (which effectively selects a unique eigenbasis ofÛ H (t 0 ) if the latter is degenerate). In particular, the global minimum of the error is achieved by one suchÛ C for every choice ofŶ.
Outline of proof. The proof of this statement is outlined below in the special case of p = 1 (witĥ U H ≡Û H (t 0 )); the full proof may be found in Appendix B.2. We note that unitary transformationsV on Σ d , being the most general orthonormality preserving linear transformations, can generate all possible cyclic permutations from a given C via |C k ⟩ → V|C k ⟩, which induces a transformationÛ C →VÛ CV † . Our initial objective is to identify the global maximum of the 1-step mean persistence, over all cyclic permutations. We have |Tr(Û † CÛH )|/d ⩽ z(1, t 0 ) (as is evident from expressing the trace in the C basis); on the other hand, there always exists a unitary transformation |C k ⟩ → |C ′′ k ⟩ = e iφ k |C k ⟩ to a basis C ′′ , leaving the z k (1, t 0 ) unchanged, such that |Tr(Û † C ′′Û H )|/d = z(1, t 0 ). Thus, a cyclic permutation that maximizes the trace overlap also necessarily maximizes z(1, t 0 ), with the same maximum value: max The maxima of the trace overlap can occur only when it is stationary with respect to small variations ofÛ C -effected byV = e iX for all small, HermitianX. Correspondingly, imposing stationarity where α = arg[Tr(Û † CÛH )], as a necessary condition for a given cyclic permutation C with cycling operatorÛ C to maximize both |Tr(Û † CÛH )| and z(1, t 0 ). We then have two distinct cases of interest.
1.F = 0: Solutions to Eq. (25) withF = 0 (equivalently, those satisfying Eq. (22)) exist in every system, corresponding precisely to the set of cyclic permutations satisfying Eq. (20). For such of the mean persistence occurs at such a solution, it must also have the smallest error by Eq. (12), for any cyclic permutation.
2.F ̸ = 0: The status of solutions withF ̸ = 0 is much less clear, with respect to whether they exist in a given system, and if so, whether such solutions can include the global maximum of z(1, t 0 ). To analyze this further, we writeÛ for HermitianÂ,B. We then have the following properties whenÛ C is a solution to Eq. (25): with the first following from the definition of e iÂ and e iB as products of the same two operators ordered differently (which must have the same eigenvalues [78]), the second from the definition of α, and the third fromF =F † . From these, we can show that anyF ̸ = 0 solution must have ε C (1, t 0 ) > (2/d); this is done in Appendix B.2, but here we describe a more intuitive argument showing that a sufficiently small error rules outF ̸ = 0. IfÂ,B and the error are "infinitesimal" (which is allowed by Eqs. (28), (29)), we haveF ≈ i(Â −B) = 0 to leading order by Eq. (30). In other words, the differenceF between two unitaries e iÂ , e iB sufficiently close to 1 must be (almost) anti-Hermitian, while Eq. (25) requires thatF be Hermitian; the two requirements are only consistent forF = 0.
On extending these considerations to general p (see Appendix B.2), we get Theorem 4.1.
Based on the above outline, we can also describe a heuristic argument for why we expect Eq. (20) to include the optimal cyclic permutation for "generic"Û H (t 0 ).F = 0 solutions to Eq. (25) always exist without imposing any constraints on the eigenvalues ofÛ ∆ =Û † CÛH = e iB+iα whatsoever; but for Eq. (30) to not reduce toÂ =B (and therefore,F = 0) requires sinB to have at least one degeneracy (on account of Eq. (28)). We expect that this additional fine-tuning of the eigenvalues required forF ̸ = 0 makes it unlikely for the optimal cyclic permutation to occur with this condition for a genericÛ H (t 0 ). If this is the case, Eq. (20) generically includes the optimal cyclic permutation for small or large error 2 .

Optimizing aperiodicity
This shows that the DFT cyclic permutations of Eq. (21) all saturate the lower bound of Eq. (18). Thus, the aperiodicity of any (and every) one of these DFT cyclic permutations is a necessary and sufficient condition for the system to be aperiodic in Σ d .

Ergodicity, aperiodicity and energy level statistics
In a DFT basis, the p-step persistence amplitudes defined in Eq. (11) are equal, z k (p, t 0 ) = z(p, t 0 ), and the persistence amplitudes can be expressed directly in terms of the energy eigenvalues E n : The corresponding p-step errors are given by , as per Eq. (12), among which the global minimum ε min (p, t 0 ) is attained for some choice of q (if this minimum is less than (2/d), as guaranteed by Theorem 4.1). We will regard the z(p, t 0 )[q] as measures of the energy level statistics within Σ d , in particular the deviation of the (permuted) energy levels from a regularly spaced spectrum. Namely, let represent the deviation of the q(n)-th level in a rescaled spectrum from the integer n. The persistence as a function of time as given by where f(∆; t 0 , q) is the probability density function of the ∆ n (t 0 , q) in the d → ∞ limit (or sufficiently large d).
Intuitively, the persistence at any time p would be maximized when the ∆ n are minimized. A practically reasonable choice of t 0 and q to estimate the global minimum of the 1-step error, for uniform density of states Ω(Σ d ) = (d − 1)/(E max − E min ) (i.e. appearing uniform over large energy windows [7] within Σ d ), is one in which the rescaled levels t 0 dE q(n) /(2π) are each close to the n-th integer. In other words, t 0 ≈ 2πΩ(Σ d )/d, with q chosen to be the sorting permutation q that sorts the energy levels in ascending order of the phase (E n t 0 mod 2π): (essentially, E q(0) is the lowest energy level in Σ d , E q(1) the next higher level and so on until the highest level E q(d−1) , for this choice of t 0 ), ensuring that t 0 dE q(n) /(2π) remains reasonably close to q(n) for some choice of t 0 . For a given t 0 , it is shown in Appendix B.3 that Eq. (34) is indeed maximized at p = 1 when q = q, among a certain class of "small" permutations when ∆ n ≪ d. In other words, the sorting permutation is a (discrete version of a) local minimum for the error. In this case, the ∆ n are essentially what have been called mode fluctuations [39,66,67] in the spectrum 3 ; the Gaussianity of their distribution has been conjectured to be a "signature of chaos" [66,67]. A minor, but important, technical distinction between ∆ n and conventional mode fluctuations is that there is no unfolding [7,8] -a convenient modification of the energy levels to make Ω(Σ d ) appear uniform over large energy scales while preserving shorter range correlations -prior to calculating the ∆ n . Such a procedure, while useful for combining short or medium range level statistics from different parts of the spectrum for improved statistical quality in numerical studies, manually alters the long-range structure of the spectrum and does not preserve the dynamics of the system in the time domain. This makes unfolding unsuitable for any analytical approach aiming to relate the unmodified dynamics of a system to its energy levels, including the present study. Given this qualifier, Eq. (34) naturally states that the Fourier components of mode fluctuation distributions, obtained without unfolding, directly determine the optimal persistence of cyclic permutations. As we will see more quantitatively in the discussion around Eq. (41), this means that the persistence may be large enough to show cyclic ergodicity only when Σ d is contained in a sufficiently narrow energy shell, within which larger scale variations in Ω(Σ d ) can readily be neglected.
Another relevant (and extensively studied) measure is the spectral form factor (SFF) [7] (here, of the energy levels within Σ d ), defined by The SFF is usually the central analytically tractable quantity in studies of "quantum chaotic" systems, as far as level statistics is concerned [7,32]. Excluding O(1) transients at early times (the "slope" [42]), a high degree of spectral rigidity in such systems is indicated by significantly suppressed late-time quantum fluctuations [81] in K(t), over a length of time, to well below its "natural" yet small average value ⟨K(t)⟩ t = d −1 seen over the longest time scales, e.g. to around for Wigner-Dyson statistics (these suppressed fluctuations form the "ramp" [42]). For low spectral rigidity, there is weaker or no suppression, e.g. for Poisson spectra, virtually all late-time fluctuations in K(t) oscillate strongly around d −1 .
Calculations of K(t) in several "quantum chaotic" systems (with various approximations) suggest that the low magnitude of the ramp is determined by generic randomness properties and low levels of recurrence/periodicity of certain physical processes, rather than any direct notion of ergodicity -e.g. an appropriate "uniform" 4 and minimally correlated distribution of strictly isolated periodic orbits in systems with a chaotic classical limit [7,[31][32][33], analogous properties of closed Feynman paths in random Floquet systems [44,45,82], or small return probabilities in diffusive processes [33,83,84]. Indeed, from Eqs. (31) and (36), it is clear that the SFF is most directly associated with aperiodicity. Nevertheless, in Sec. 5, we will show how the behavior of the SFF ramp influences cyclic ergodicity within Σ d -connecting an observable-independent notion of quantum ergodicity to these somewhat better understood recurrence properties of specific physical processes in some systems, while remaining applicable to more general systems.
In terms of the measures z(p, t 0 )[q] and K(t), we have the following direct requirements on the energy level statistics within Σ d , for ergodicity (Eq. (13)) and aperiodicity (Eq. (14)) as per Definition 3.3: 1. Ergodicity in Σ d : is a sufficient condition for ergodicity if satisfied at all |p| ⩽ d/2 for at least one choice of q (usually q = q) by Eq. (32). It is also necessary that Eq. (37) be satisfied by at least one q = q p for each |p| ⩽ d/2 in a cyclic ergodic system, if the heuristic argument for the large error version of Theorem 4.1 holds.
of its similarity to ergodicity [7]. It is worth emphasizing that in spite of the mathematical similarity, the two are logically and conceptually distinct, partly due to the fact that isolated periodic orbits are of measure zero in Eq. (1); as noted in Ref. [31], KAM tori are ergodic systems that do not satisfy this sum rule.
A further question is if the simplest i.e. p = 1 measures of level statistics can be used to study these dynamical properties for all times p, say for a given q.To enable this, we first have the formal bound of Eq. (19), which can be expressed in terms of ε C (1, t 0 ) as a lower bound on the decay of the persistence (via z = cos θ): contributions. In general, by the reciprocal relation of Fourier variables, a slow decay of the persistence corresponds to a narrow distribution f(∆; t 0 , q) e.g. as measured by the variance σ 2 ∆ ≡ ⟨f 2 ⟩ ∆ − ⟨f⟩ 2 ∆ , which is related to z(1, t 0 ) via Eq. (34). When q is a sorting permutation, a low σ 2 ∆ essentially implies a high rigidity of the spectrum . While the precise connection between ergodicity, aperiodicity and σ 2 ∆ depends on the functional form of f(∆; t 0 , q), it is natural to pay special attention to the case of an ideal Gaussian distribution of mode fluctuations, as has been seen in typical "quantum chaotic" systems [39,66,67]. In this case, we get z(p, t 0 ) ≈ e −γp 2 (up to random fluctuations) and ε C (1, t 0 ) = 2γ = 4π 2 σ 2 ∆ /d 2 (for ε C ≪ 1) from Eq (34). More significant is the following proposition, which follows from substituting the Gaussian form of z(p, t 0 ) in Eq. (37) This form of σ 2 ∆ is precisely that of the Wigner-Dyson circular random matrix ensembles -COE, CUE, CSE respectively corresponding to maximal ensembles of symmetric, unconstrained, or selfdual unitaries [7,8] -for α = 1, √ 2, 2, which also satisfy Eq. (38). Moreover, in typical "quantum chaotic" systems, the energy levels show Wigner-Dyson statistics (in particular, possessing the spectral rigidity of the circular Wigner-Dyson ensembles and Gaussian mode fluctuations) [7,8,39,66,67], if Σ d is chosen to be a sufficiently narrow energy shell spanning [E, E + ∆E] with ∆E ≲ 2π/t ramp (E); here, t ramp (E) can be directly identified [43] as the time beyond which the ramp appears in the SFF around the energy E after eliminating the effect of early-time transients as described below 5 . This ramp time is system specific, e.g. ranging from t ramp = 1 in some quantized 0 (a) Cyclic ergodicity and aperiodicity (staying closer to the boundary than O(d −1/2 ) for more than one and less than two full rotations); data for a single Circular Unitary Ensemble (CUE [7,8]) random matrix with E n spanning [0, 2π).   [17,18], and is where the trajectory typically remains for long times with the exception of occasional recurrences near the boundary. See Fig. 7 for a different depiction of similar data.
chaotic maps [85], O(1) to growing logarithmically with system size in some Floquet many-body systems [43][44][45]47], or the time scale of diffusion in some disordered systems [43]. In general, the definite width ∆E of Σ d introduces a large sinc 2 (t∆E/2) transient (see also Ref. [86]) owing to the Fourier sum in Eq. (36), due to which the SFF effectively takes the following form when ∆E ≲ 2π/t ramp : Thus, by choosing t 0 = 2π/∆E (= 2πΩ(Σ d )/d, in agreement with our earlier estimate after Eq. (34); we also note that this is the minimum time scale for nontrivial dynamics in an energy window of width ∆E, suggested by the energy-time uncertainty principle) so that integer steps pt 0 coincide with the zeros of the transient, we can eliminate the influence of this transient and obtain cyclic permutations that are directly determined by the intrinsic spectral rigidity of the system represented by the ramp (this will be justified further in Sec. 5). In addition, this means that the period t 0 d of such cyclic permutations is given precisely by the Heisenberg time t H ≡ 2πΩ(Σ d ) of Σ d , at which the individual energy levels typically dephase completely marking the end of the ramp [7]. Overall, this proposition suggests that all energy shells of width ∆E ≲ 2π/t ramp in systems with Wigner-Dyson level statistics are ergodic and aperiodic with t 0 = 2π/∆E (by Definition 3.3), as the sorted DFT cyclic permutation is both ergodic and aperiodic, which is anticipated by the numerical data in Fig. 6 for CUE and Poisson level statistics. From a dynamical standpoint, that the ergodicity of such systems only holds within thin energy shells is not surprising -this just reflects the fact that Hamiltonian systems with a classical limit are only ergodic in thin energy shells, and not over phase space volumes covering a wide range of energies.

Cyclic permutations for typical systems
In this section, we study the behavior of DFT cyclic permutations for a "typical" system with sufficiently random fluctuations in the energy levels. Based on a general decomposition ofÛ H (pt 0 ) into a periodic part that follows the cyclic permutation and orthogonal random fluctuations in Sec. 5.1, we motivate a Gaussian estimate for the time dependence of persistence amplitudes for sufficiently early p in typical systems in Sec. 5.2. In Sec. 5.3, we show how the Gaussian estimate can be used to derive a lower bound for the error from the SFF for the system sampled at discrete times pt 0 given by Eq. (52), directly connecting cyclic ergodicity to the size of quantum fluctuations in the SFF ramp (which is often analytically tractable). In Sec. 5.4, we discuss detailed analytical and numerical evidence for Proposition 4.2, showing that in the ideal case where the Gaussian estimate remains valid for longer times (pt 0 ≲ t H ) in an individual system, cyclic ergodicity and aperiodicity are equivalent to requiring the spectral rigidity of energy levels to be in the range spanned by the Wigner-Dyson circular ensembles: COE, CUE and CSE.

Periodic and random parts of time evolution
For a cyclic permutation C with cycling operatorÛ C that commutes withÛ H (i.e. a cyclic permutation of a DFT basis), the p-step persistence probability z 2 (p, t 0 ) is given by the analogue of the SFF for the error unitaryÛ ∆ =Û −1 CÛH (t 0 ): To study the development of the persistence over time, it is convenient to write a general expression forÛ p ∆ in terms of the p-step errors ε C (p, t 0 ). On account of [Û ∆ ,Û C ] = 0, we have ⟨C k+ℓ |Û H |C j+ℓ ⟩ = ⟨C k |Û H |C j ⟩, due to whichÛ p ∆ can be expressed simply in terms of powers of for some phases ϕ ∆ (p) and complex error coefficients ν m (p). UnitarityÛ † ∆Û∆ =Û ∆Û † ∆ =1 translates to nonlinear constraints on the ν m (p): where ν 0 (p) ≡ 0, and As a matter of nomenclature, we call the first term proportional to1 in Eq. (43) the "periodic part", and the remaining terms involvingÛ m C (orthogonal to the periodic part) the "random part", of time evolution. This is because the former becomes a term proportional toÛ p C inÛ H (pt 0 ) which is periodic in p, while we expect the ν m (p) to generally (but not necessarily) look "random". In fact, (a subset of) the ν m (p) are directly related to the SFF ofÛ H (pt 0 ) within the subspace Σ d , via: and the expectation of randomness in the ν m (p) reflects the randomness in the SFF ramp [44,49,81] (more precisely, particularly in the phases of Tr[Û H (t)]). Additionally, K(pt 0 ) serves as a (rather weak) lower bound for the p step errors. In particular, ε C (p, t 0 ) = O(1) if K(pt 0 ) = O(1), establishing the impossibility of finding cyclic permutations that are reasonably close toÛ H (t 0 ), when t 0 ≪ t ramp i.e. in the early-time "slope" regime of the SFF. To refine this bound, we will need a generic expression for the p-dependence of the right hand side, derived in the following subsection.

Gaussian estimate for persistence amplitudes
Using Eq. (43), one can readily express the persistence at arbitrary time p in terms of the 1-step parameters ε C (1, t 0 ) and ν m (1). The resulting expression involves a complicated multinomial expansion in the ν m (1) (with p s representing binomial coefficients), which is hard to extract general predictions out of. To simplify the expression, we invoke a heuristic argument that relies on the expected randomness of the ν m .
Specifically, we assume that each ν m (1) is well described by an ensemble of complex numbers with a fixed magnitude and random phases, subject to the constraints Eq. (44) and Eq. (45). Further, if one neglects O( ε C (p, t 0 )) corrections to the ν m , Eq. (45) essentially becomes Thus, pairings of ν m (1) and ν −m (1) in Eq. (47) have a definite phase and generate contributions that potentially interfere constructively, while the remaining random terms add out of phase. This suggests following a strategy similar to methods based on the pairing of closed Feynman paths in studies of generic semiclassical [7,32,35,36,87] and quantum [44,82] "chaotic" systems: we evaluate the contribution from terms dominated by pairings of ν m (1) and ν −m (1) with at most one free ν m k (1), assuming (with no proof beyond the above argument) that the remaining terms are negligible. As is common with these methods, other contributions would eventually dominate at large enough times, when ε C (p, t 0 ) is sufficiently large and ν m (p) is sufficiently random, invalidating Eq. (48) for such p.
The assumed dominance of paired error coefficients can be used to derive a general form of U p ∆ for small p, and from there an estimate for z(p, t 0 ) using a recurrence relation; this is detailed in Appendix C, with numerical evidence for error coefficient pairing. For ε C (1, t 0 ) ≪ 1 and In other words, time evolution for small p simply manifests as a relative growth of the random part in comparison to the periodic part, up to an overall phase. This gives a simple Gaussian expression for the persistence amplitude (in the same regime of small error and time): The second (linear) term in the exponent is negligible until |p| ∼ 1/ε C (1, t 0 ), and we will simply drop it in further calculations. The Gaussian follows the sinusoidal lower bound in Eq. (39) rather closely, suggesting that typical cyclic permutations are surprisingly close to saturating the lower bound. In other words,Û p ∆ remains close to a 2D rotation in Hilbert space, until a time p ∼ 1/ ε C (1, t 0 ) when the cyclic permutation develops a large (∼ 1) error.

Lower bound on the error from the SFF
Now we are in a position to quantitatively analyze the connection between the SFF ramp and the persistence of cyclic permutations. The 1-step error coefficients ν m (1) can be related to the SFF K(pt 0 ) in the p ≪ 1/ ε C (1, t 0 ) regime, using Eqs. (43), (46) and (49): Summing over p = −p to p excluding 0, the left hand side can be at most 1 on account of the normalization constraint, Eq. (44). Expanding z 2 (p, t 0 ) = 1 − O(ε C (1, t 0 )p 2 ) and using K(t) = K(−t), we get p p=1 K(pt 0 ) 1 Every term on the left hand side is positive. Considering only the first term and choosing the largest possible p for which the second term is negligible then gives a reasonably restrictive lower bound on ε C (1, t 0 ). Correspondingly, we take p = 1/(M ε C (1, t 0 )) where M is some large number satisfying M = O(1) ⩾ 1.
Eq. (52) is the main relation of interest connecting the recurrence properties represented by the SFF to cyclic ergodicity via ε C (1, t 0 ) (assuming typical ν m (p)). It demonstrates that the suppression of recurrences indicated by the small magnitude of the SFF ramp is essential for the dynamics to be able to closely follow a cyclic permutation, essentially due to the conservation of probability (Eq. (44)). As it involves only integer steps p ∈ Z of time pt 0 , we see directly that choosing t 0 = 2π/∆E for an energy shell prevents the sinc 2 (t∆E/2) transient in Eq. (41) from influencing ergodicity.
We can also derive explicit bounds for specific cases. As such sums of the SFF over time are generally self-averaging (i.e. fluctuations of the ramp average out to give a more steady sum) [44], we replace K(pt 0 ) with a smooth power law expression for the ramp: K(t) = λt γ for t 0 ⩽ t ≪ t 0 d, γ ⩾ 0, and with λ ≪ 1, which accounts for the behavior of a wide variety of systems 6 ; all such systems have K(t 0 ⩽ t = O(t 0 d)) ≲ O(d −1 ) and are consequently aperiodic. Evaluating the sum in Eq. (52) for this power law (Appendix C.3) gives the following constraints on the error: where ζ(z) is the Riemann zeta function. Now we consider the most important (i.e. typical) cases of practical interest. Poisson statistics [7] corresponds to λ = d −1 and γ = 0, for which we obtain Together with the conditions for Eq. (22), this implies that every (DFT and non-DFT) cyclic permutation for a system with Poisson level statistics has ε C (1, t 0 ) > (2/d). As long as z(p, t 0 ) is not drastically different from a Gaussian in p, as expected from the typicality considerations of Sec. 5.2, it follows that the persistence decays to the Haar random value by p ≲ O(d 1/2 ) ≪ d/2, 6 In the following sense: γ = 0 and λ = d −1 corresponds to generic integrable systems with Poisson statistics [7,26]; γ = 1 corresponds to generic "chaotic" systems when λ ∈ O(d −2 ) [7,8], and those with macroscopic conserved quantities for larger magnitudes of λ [48,84]; integer γ > 1 with λ ∈ O(d −2 ) corresponds to tensor products of γ independent chaotic systems, as well as the γ-particle sectors of single-particle chaotic systems with λ ∈ O(γ!d −2 ) (for large d), in which the many-particle SFF shows an exponential ramp [50][51][52]. and no DFT cyclic permutation is even remotely close to being ergodic for a typical system with Poisson statistics. On the other hand, the circular Wigner-Dyson ensembles [7,8] have γ = 1 and λ = 2/(βd 2 ) with β = 1, 2, 4 for COE, CUE, CSE respectively. With t 0 = 1 (= 2πΩ/d), the error satisfies These relations encode the following property: any system admits cyclic permutations with large error, but only sufficiently rigid spectra can admit cyclic permutations with small error, quantifying the discussion in Sec. 4.2. For instance, if a system is known to have a cyclic permutation with error smaller than (2/d), we can rule out Poisson statistics for that system.

Spectral rigidity for ergodic, aperiodic systems with almost exactly Gaussian persistence amplitudes
From the viewpoint of the Gaussian estimate, an idealized situation is when the persistence amplitude z(p, t 0 ) remains exactly Gaussian as it decays all the way through to the random state (order of magnitude) value z(p, t 0 )], we can solve for g 1 corresponding to ergodic or non-aperiodic evolution by imposing: where α = 2 for ergodicity and α = 1 for non-aperiodicity (from Eqs. (13), (14)), while c is some O(1) positive constant. From Eq. (34), we also obtain a Gaussian distribution for mode fluctuations given some g 1 (assuming that the DFT cyclic permutation under discussion corresponds to a level permutation function q), with variance σ 2 ∆ = g 2 1 d 2 /(4π 2 ). Requiring ergodicity and aperiodicity therefore gives: This amounts to a derivation of Eq. (40). The logarithmic growth of the variance of mode fluctuations with the dimension d of the energy subspace is a direct consequence of the Gaussianity of the persistence. In less idealized situations, it is possible to have a non-Gaussian tail in Eq. (50), for p ≳ 1/ ε C (1, t 0 ), even if the Gaussian estimate holds for smaller times. It is worth noting that non-Gaussian tails at long times would show up as non-Gaussianities near ∆ ≈ 0 in the mode fluctuation distribution; such deviations from Gaussianity are largely determined by the complicated correlations between the errors ν m , partly encoded in the fluctuations of the SFF K(pt 0 ). The main takeaway here is instead the extremely specific numerical range α ∈ (1, 2], of the coefficient multiplying the logarithm, demanded by ergodicity and aperiodicity. For non-Gaussian tails, one would have a similarly specific range of some other parameter.

Now we consider Wigner-Dyson random matrix ensembles as well as individual systems with
Wigner-Dyson level statistics within energy subspaces Σ d with ∆E ≲ 2π/t ramp . There is numerical evidence that the mode fluctuation distribution is exactly Gaussian [39,66,67] (as well as an analytical proof of Gaussianity for a closely related measure, number fluctuations [88]) especially near ∆ ≈ 0, suggestive of an almost Gaussian persistence even at late times. In Refs. [79,80], the leading behavior of the variance σ 2 ∆ (there called ∆ * ) for Wigner-Dyson ensembles has been shown to be equal to that of the (spectrum or ensemble averaged) spectral rigidity parameter ∆ 3 (d) [8,32,89] -measuring the variance of the "spectral staircase" around a best fit straight line -when t 0 = 2π/∆E is determined by the slope of the straight line and q = q is the sorting permutation. Moreover, ∆ 3 (d) can be calculated exactly [8,32,87] by using the appropriate Wigner-Dyson ensemble averaged SFF K β (t) for the energy subspace.
In fact, the leading contribution for large d comes only from the ramp at t ≪ t H , given by K β (t) ≈ t/(βπΩd) with β = 1, 2, 4 respectively for COE, CUE and CSE (much like in the derivation of Eq. (55)). The result is a logarithmic dependence of σ 2 ∆ on d to leading order (for t 0 = 1 and q being the sorting permutation), This precisely corresponds (via σ 2 ∆ = g 2 1 d 2 /(4π 2 ) ) to an error that saturates the lower bound in Eq. (55), providing an important sanity check. Comparing this with Eq. (58) (or Eq. (40)), we see that the Wigner-Dyson ensembles span exactly the range of allowed coefficients for ergodic, aperiodic systems with a Gaussian persistence. CUE is well within this range, whereas COE is at the upper bound and barely ergodic while CSE is at the lower bound and barely aperiodic (here, it is worth noting that for CSE, we have considered only one non-degenerate half of the doubly-degenerate spectrum as is conventional [7,8]).
It remains to be verified that z(p, t 0 ) for Wigner-Dyson random matrix ensembles is indeed well approximated by a Gaussian all the way until p ∈ [d/2, d] as suggested by the ideal Gaussian distribution of mode fluctuations, so that the identification between Eq. (59) and Eq. (58) can be made with some confidence. We provide numerical support for this statement in Fig. 7 for d = 2048.

Cyclic ergodicity and spectral rigidity in 2D KAM tori
In this section, we study the quantum ergodic properties of linear flows on 2D tori, a class of non-mixing dynamical systems for which the framework of quantum cyclic permutations allows an analytical proof of spectral rigidity when the system is ergodic -thereby rigorously demonstrating the link between cyclic ergodicity and spectral rigidity at the level of an individual system, without relying on ensembles or heuristic arguments. It is convenient to classify these tori into the ergodic KAM tori (with an irrational frequency ratio) and non-ergodic rational tori (with a rational frequency ratio). For some physical context, we note that both occur as invariant subsets of classically integrable systems, with almost all KAM tori remaining stable under small perturbations [5,90]. It is particularly interesting to note that KAM tori, though ergodic, possess neither periodic orbits nor random matrix (-like) level statistics as required in semiclassical explanations of spectral rigidity [7,32,33]; the results of this section (together with Sec. 5 concerning random matrix level statistics) therefore suggest a wider applicability of cyclic permutations as a way to understand spectral rigidity in terms of ergodic properties.
In Sec. 6.1, we discuss a simple quantization procedure for these tori, motivated by the standard quantization of integrable systems. In Sec. 6.2, building on the example of irrational rotations discussed in Sec. 2.2.2, we prove that all 2D KAM tori are both classical and quantum cyclic ergodic (if suitably quantized), while all rational tori are non-ergodic. Finally in Sec. 6.3, we show using the corresponding cyclic permutations and Theorem 4.1 that all 2D KAM tori possess much higher spectral rigidity than Poisson and even Wigner-Dyson statistics (as indicated by a smaller ε C (1, t 0 ) or σ 2 ∆ ), and verify this result numerically. We note that this remains consistent with the Berry-Tabor conjecture of Poisson statistics in typical integrable systems [26], as the latter typically contain several uncorrelated KAM tori overlapping with each other in an energy shell, leading to enhanced spectral fluctuations (similar to Refs. [55,84]) and low spectral rigidity of the integrable system.

Quantization of linear flows on a torus
The Hamiltonian of a linear flow with frequencies ω = (ω x , ω y ) on a 2D torus is given by with angle variables θ = (θ x , θ y ) ∈ [0, 2π) 2 conjugate to the action variables J = (J x , J y ). The equation of motion of the linear flow is T t θ = θ + ωt.
The ergodicity of this system on the 2D phase space P J = {(θ x , θ y )} with fixed J is characterized [1,2] by the ratio α = ω y /ω x . When α is irrational, the dynamics is ergodic on this phase space, and the system may be called a KAM torus 7 [5,91]; but when α is rational, P J decomposes into an infinite number of invariant ergodic and periodic subsets, which share the same period. In both cases, there is no mixing. We emphasize that the system of Eq. (60) is different from a free particle moving on a torus. The latter is never ergodic in its phase space, but possibly (depending on initial conditions) merely visits all points among its position coordinates with conserved momentum. The Hamiltonian of the free particle is quadratic in J rather than linear, and its level statistics has been found to be Poissonian [92,93] (see also Ref. [67] for its mode fluctuations) in accordance with the Berry-Tabor conjecture [26] for fully integrable systems.
The quantization of Eq. (60) can be motivated by considering how such tori occur as invariant subsets of integrable systems [5,26,90]. The Hamiltonian of a 2D integrable system with constants of motion I = (I x , I y ) is given by H = H(I), with quantization restricting eachÎ k ∈ N 0 to the non-negative integers due to the periodicity of the conjugate angle variables. Classically, expanding I = I 0 + J in the neighborhood of I 0 (say, with J x , J y ⩾ 0) gives Eq. (60) to leading order (up to an I 0 -dependent constant) with ω(I 0 ) = [∂H/∂I](I 0 ). Thus, a natural way to quantize Eq. (60) in terms of operatorsĴ,θ is to restrict the eigenvalues J x , J y ofĴ x ,Ĵ y each to a finite (but large) set of consecutive integers; up to additive constants in H, this is equivalent to working in the Hilbert space Σ d spanned by with d = d x d y . Of particular interest is the following manifold of θ-states corresponding to points on the torus: We can construct infinitely many complete orthonormal bases for Σ d in this manifold, each comprised of d such states, such as {(θ x , θ y ) = (2πn x /d x , 2πn y /d y ) : n x ∈ Z d x , n y ∈ Z d y }. It is straightforward to see that time evolution reproduces the classical equation of motion in this manifold:Û which provides a natural way to map classical cyclic permutations onto Σ d .

Classical and quantum cyclic ergodicity
In this and the next subsection, we will see how cyclic permutations can be used to rigorously prove spectral rigidity for KAM tori. First, we show that all 2D KAM tori admit a classical cyclic permutation (sequence) that is cyclic ergodic and not aperiodic. This is made possible by identifying the discretized flow on the torus as a sequence of irrational rotations, allowing us to use the cyclic permutations constructed for the latter in Sec. 2.2.2. Using Eqs. (62) and (63) to adapt this construction to a quantum cyclic permutation of an orthonormal basis in Σ d , we will show the following: Theorem 6.1 (Quantum cyclic ergodicity and non-aperiodicity of 2D KAM tori). For every linear flow on a 2D torus with irrational frequency ratio α = ω y /ω x , there exists an infinite sequence of energy subspaces Σ d with d → ∞ (such that d x , d y → ∞) in which the system is ergodic and not aperiodic, within any time T > min(2πd y /ω x , 2πd x /ω y ).
Thus, the quantum dynamical properties of 2D KAM tori reflect their classically cyclic ergodic and non-aperiodic nature, which is further consistent with their ergodicity and lack of mixing.

Classical cyclic permutation
For the classical construction, we first observe that for T x ≡ 2π/ω x (the period along the θ x direction), the map T rT x (θ x , θ y ) = (θ x , θ y + 2παr) with r ∈ Z is a discrete rotation of any θ x = const. circle in the θ y direction, with α = ω y /ω x . Writing α = (m y + δ n y )/n y as in Sec. 2.2.2 with coprime integers m y and n y , we know that every rotation admits an n y -element cyclic permutation C y with error |δ n y | by Eq. (8). To adapt this to the torus, we divide the θ x direction into n x elements and consider time steps in units of t 0 = T x /n x , such that each step captures a 1/n x fraction of the error generated in the full rotation by T x = n x t 0 (see Fig. 8a). Formally, our cyclic permutation C   For these parameters, we have δ n y ≈ 0.704/n y < 1/n y (and likewise with d y ), making both the classical and quantum cyclic permutation ergodic and non-aperiodic. Trajectories starting from θ = (0, 0) are depicted up to p = 2d x = 14, which corresponds to r = 2 steps of the irrational rotation T rT x θ y at θ x = 0.
for the 2D torus consists of: It is easily verified that the intersection of the C qn x (for q ∈ Z) with θ x = 0 gives precisely the cyclic permutation of Eq. (8) for the rotation T rT x . From the discussion of irrational rotations in Sec. 2.2.2, we can find a sequence of coprime m y , n y → ∞ such that |δ n y | < 1/n y for all irrational α. Thus, we have ϵ C (t 0 ) < 1/n with n → ∞ for all n x , establishing the existence of a classical cyclic permutation showing cyclic ergodicity and non-aperiodicity (by the bounds discussed in Sec. 2.2) for all 2D KAM tori.
There are two straightforward extensions of this construction that are worth mentioning: 1. choosing t 0 = 2π/(ω y n y ) and considering irrational rotations in the θ y direction gives essentially similar results, and 2. for rational α, the non-ergodicity of the flow implies that no cyclic permutation with the C k reducing to (neighborhoods of) points in the n → ∞ limit is ergodic or aperiodic, as can be shown by suitably adapting the corresponding discussion for rational rotations in Sec. 2.2.2.

Quantum cyclic permutation
We note that among the θ-states of Eq. (62), any two states |θ⟩, A pure state cyclic permutation corresponding to Eq. (64) can then be constructed using these states, by formally setting d x = n x , d y = n y and considering the set of points given by the d "lower" corners of the classical C k in Eq. (64) (see also Fig. 8b), namely: and t 0 = 2π/(ω x d x ). From Eqs. (63) and (65), it is clear that [Û H (t),Û C ] = 0 as bothÛ H (t) andÛ C are additive maps on the torus; consequently, the above cyclic permutation corresponds to a DFT of the energy eigenstates. Using Eqs. (62) and (63), it is straightforward to evaluate the p-step persistence probabilities z 2 (p, t 0 ) = |⟨C k+1 |Û H (pt 0 )|C k ⟩| 2 in terms of the frequency ratio α = (m y + δ d y )/d y . We obtain the following exact expression for p ̸ = 0: To discuss ergodicity and aperiodicity in the quantum case, we need a slightly more refined implication of Eq. (9) concerning Diophantine approximations than in the classical case, namely that every irrational α admits an infinite sequence of m y , d y with |δ d y | < c/d y for some constant c < 1 (this constant ensures that |δ d y | does not even approach 1/d y asymptotically, as d y → ∞).
Using (sin x) 2 ⩽ x 2 in the denominator of Eq. (66), we get that z 2 (pt 0 ) ⩾ sinc 2 [pπδ d y /d x ]; as the smallest zeros of the right hand side occur at |p| = d x /δ d y , z 2 (p, t 0 ) is guaranteed to remain Θ(1), i.e., no less than O(1) for any d y belonging to the above sequence, until some |p| > d x d y /c > d (as c < 1). It follows that z 2 (p, t 0 ) ≫ O(d −1 ) for |p| ⩽ d, making the constructed cyclic permutation ergodic and non-aperiodic (the latter via z 2 (±d, t 0 ) = K(±t 0 d)). As this is a DFT cyclic permutation, it follows from the considerations of Sec. 4.1.3 that no cyclic permutation in Σ d is aperiodic. On using these results in Definition 3.3 with t 0 d = 2πd y /ω x , while noting that analogous results would hold if we swap the θ x and θ y directions, we directly get Theorem 6.1. When α = m/ℓ is rational with coprime m and ℓ, the motion in |θ⟩-space is periodic with period T α = 2πℓ/ω x . In particular, we have immediately ruling out aperiodicity in any Σ d over any time T > T α . Further, from Eq. (46), we get z k (p, t 0 ) = 0 for every cyclic permutation in some Σ d with pt 0 = rT α , which rules out ergodicity for any cyclic permutation with t 0 = rT α /p with (p ∈ Z) ⩽ d/2. Ruling out ergodicity for more general t 0 would require a more detailed study of spectral properties.   Having established the classical and quantum cyclic ergodicity of 2D KAM tori, as well as nonaperiodicity in both the irrational and rational cases, we now turn to an analytical and numerical study of the spectral rigidity of the quantized systems, with the main results depicted in Fig. 9. The distribution of the energy levels depends sensitively on α (via H = (J x + J y α)ω x for J x ∈ Z d x , J y ∈ Z d y ), and this dependence has been studied numerically and analytically in the context of the 2D harmonic oscillator in e.g. Refs. [26,94,95]. Their main analytical result is that the nearestneighbor level spacings S ∈ {E q(k+1) − E q(k) } within sufficiently small energy windows takes the form of at most 3 delta functions, P(S) = 3 j=1 c j δ(S − S j ) with c j , S j ⩾ 0; this behavior is shown in our context in Fig. 9a. This distribution is quite unlike the smooth level spacing distributions of Wigner-Dyson (compare with Fig. 7) or Poisson level statistics [7], and these results leave open the question of how the spectrum may be quantitatively characterized in direct relation to some dynamical property of the system.

Spectral rigidity
The cyclic permutation of Eq. (65) provides precisely such a characterization, via the persistence probabilities in Eq. (66). As discussed in Sec. 4.2, these quantities are themselves measures of energy level statistics when the cyclic permutation is a DFT of energy eigenstates, due to Eq. (32). The associated map q that permutes the energy levels before comparing them to regularly spaced levels is determined by the choice of m y ; for convenience, we denote this map q m y for the cyclic permutation of Eq. (65). Then, setting d x = d y for convenience and using Eq. (66) with δ d y < 1/d y to obtain the single-step DFT error where d belongs to the infinite sequence occurring in Theorem 6.1 (with d x = d y ), for every irrational α. By the discussion in Sec. 5.3, we expect ε C (1, t 0 )[q] ≳ O(d −1 ) for Poisson statistics, and ε C (1, t 0 )[q] ≳ O(d −2 ln d) for Wigner-Dyson statistics, for all q; Eq. (68) thus constitutes an analytical proof that all 2D KAM tori have significantly "more rigid" spectra than both Wigner-Dyson and Poisson statistics in the appropriate energy subspaces. In Appendix D, we argue that rational tori are non-ergodic over a range of t 0 and show spectral rigidity of the same order of magnitude as Poisson statistics. Another measure of spectral rigidity is the sorted DFT single-step error ε C (1, t 0 )[q], which is most directly related to the conventionally studied mode fluctuations according to Sec. 4.2. Given our choice of t 0 , the permutation q = q corresponds to a direct sorting of the "wrapped" energy levels: 2π which are essentially the eigenphases of the time evolution operatorÛ H (t 0 = 2π/[ω x d x ]). Indeed, while the density of states is highly nonuniform with linearly growing and decaying parts when expressed in terms of H(J x , J y ), it naturally appears uniform in terms of the wrapped energies E (J x ,J y ) at t 0 (or the eigenphases ofÛ H (t 0 )); see Figs. 9b, 9c. It is this "wrapping" that allows the spectrum to have energy levels close to a regularly spaced spectrum (and correspondingly high spectral rigidity) as per the discussion in Sec. 4.2.
We should also expect the analytically calculated DFT error of Eq. (68) to not be less than the sorted DFT error (see also the corresponding discussion in Sec. 4.2): We numerically verify this to be the case for Σ d (with d x = d y ) both belonging to and outside the infinite sequence to which Theorem 6.1 applies; however, we specifically appear to have for several Σ d including all those belonging to the infinite sequence (i.e. with |δ d y | < 1/d y ). This is shown in Fig. 9d. Further, we find that the persistence amplitudes are also in close numerical agreement with Eq. (66) for Σ d in this sequence, as seen in Fig. 9e. Finally, the SFF is shown in Fig. 9f, with a large O(1) peak expected at p = ±d from Eq. (66).
In conclusion, we have analytically proven and numerically verified that all quantized 2D KAM tori admit an infinite sequence of energy subspaces that are cyclic ergodic and non-aperiodic, with an exact expression for the spectral rigidity that is of a higher order of magnitude than even Wigner-Dyson level statistics (i.e. has lower error). On the other hand, quantized rational tori were shown to be non-aperiodic and argued (in Appendix D) to have spectral rigidity comparable to Poisson statistics, making them unlikely to be ergodic in any subspace. These quantum properties directly reflect the discretized classical ergodic properties of these systems. Some additional numerical evidence (outside the scope of this paper) further indicates that it is possible, and perhaps even typical, to have cyclic ergodic and non-aperiodic subspaces for KAM tori even when Σ d is not in the infinite sequence for which we are able to prove these properties analytically. These subspaces often show an almost ideal Gaussian persistence and an apparent O(d −2 ln d) error (e.g., for α = √ 2), but we do not yet have a quantitative understanding of this behavior.

Discussion: Dynamical ergodicity and thermalization
In classical dynamical systems, it is usually unnecessary to differentiate between two equivalent notions of ergodicity. The first is "dynamical ergodicity", by which we mean the dynamical behavior where almost all initial states in phase space visit the neighborhood of every other state. The second is "thermal ergodicity", referring to the initial-state-independence of time-averaged expectation values of any observable, as quantified by Eq. (1). The direct equivalence [1,3] between the dynamical and thermal property is at the foundation of classical statistical mechanics [4].
The analogous picture for quantum dynamical systems is the main subject of this section, which we will approach in a semi-qualitative manner. As noted in Sec. 1, ETH has been proposed as a quantum analogue (or at least, a sufficient condition) for the thermal ergodicity of physical observables, and is largely a statement about energy eigenstates in some "physical" basis (with minimal conditions on the energy eigenvalues themselves, namely non-degeneracy and rational incommensurability) [14][15][16]. The detailed study of quantum cyclic permutations in the previous sections strongly suggests that quantum cyclic ergodicity, which directly decodes spectral rigidity into a dynamical property in Hilbert space, is the appropriate quantum counterpart to dynamical ergodicity. The open problem of the connection between dynamical and thermal properties in quantum statistical mechanics is therefore essentially equivalent (through the framework of cyclic permutations) to that of the connection between the observable-independent energy eigenvalues and observable-(or basis-)dependent energy eigenstates of a quantum system.
The close correspondence between quantum and classical cyclic ergodicity suggests that one may explore these connections in systems with a classical limit, and apply it to a more general class of quantum systems that show spectral rigidity and ETH. In particular, we may treat a quantum cyclic permutation C in some Σ d as representing a fictitious phase space (or energy shell) P[C] in systems with or without a classical limit. By Weyl's law [7], an ensemble of µ(A)d pure states in C represents a region P[C] of measure µ(A) (see Sec. 3, and Refs. [72,73]). Such an ensemble corresponds to a mixed state of the form where C(A) ⊆ Z d is a set of indices of size |C(A)| = µ(A)d (with |·| denoting the cardinality of a finite set). Interestingly, while such a connection between mixed states and phase space regions could in principle be used to quantitatively connect the quantum error ε C (1, t 0 ) to that of the classical cyclic permutations, it is not clear how to do this in practice. This is because the dominant contribution to the classical error is in the lack of overlap of these mixed states, which is determined more strongly by their support in Hilbert space [96,97] than the much smaller contribution from the error in the overlap of the constituent pure states, except in the trivial µ(C k ) → 1/d pure state limit.
It is also worth noting that not all P[C] correspond to the actual phase space P in systems with a classical limit, but likely only those that satisfy the generating property with respect to classical phase space regions (see Sec. 2.2). At a strictly formal level, any n-element partition B k (n) of P can be used to define non-degenerate observables b[B k (n)] that take distinct values on each B k (n) as n → ∞ (e.g. b is essentially θ for the tori of Sec. 6); such observables admit an orthonormal eigenbasis B = {|B k ⟩} by the standard procedure of quantization, such as the postulates in Ref. [77]. The quantum analogue of the generating property then appears to be for some such B, and is therefore an observable-dependent eigenstate property, unlike cyclic ergodicity.

Quantum analogues of classical ergodicity and mixing
If such a generating C has error ε C (1, t 0 ), we can approximateÛ H (pt 0 ) ≈Û p C whenever |p| ≪ 1/ ε C (1, t 0 ) (from the bound of Eqs. (19), (39)). As we will see below, this approximation implies that late-time quantum dynamics in the fictitious phase space basis P[C] can be described in nearly classical terms up to extremely long times for sufficiently rigid spectra (e.g. t H / ln(t H /t ramp ) with t 0 = t ramp , t 0 d = t H for Wigner-Dyson spectra), standing in contrast to rapid "quantum" thermalization processes such as scrambling that cause an initial state to rapidly spread in some "local" basis over much shorter times (closer to t ramp ) without any significant influence of spectral rigidity beyond t > t ramp [43,45,56,59,84,86,98].
Using the above approximation with Eq. (71) for A, B ⊆ P, we have In this way, thermal properties of fictitious phase space regions A,B can be represented over this time scale as simple correlation functions of subsets C(A), C(B) of Z d under relative shifts, which may obey e.g. Eq.
(1) or Eq. (2). We further expect that not all such regions A, B are "physically relevant" in the classical limit: some may correspond to e.g. scattered points or regions that do not contain the neighborhood of any point in the appropriate physical variables (such as the highlighted set {C k } 14 k=0 in Fig. 8). The definition of such regions requires considerably higher resolution (of measure ∼ 1/d) in the physical variables than classically accessible, and arguably become unphysical in the classical limit. Thus, if all physical regions A, B, . . . satisfy Eq. (1) we have the emergence of effectively classical ergodic behavior, and if they further satisfy Eq. (2) we have mixing behavior, in the fictitious phase space P[C].
We note that such statements, about the physicality of fictitious phase space regions with respect to observables, is again an observable-dependent eigenstate property and not determined by spectral rigidity. This suggests that the classical ergodic hierarchy is significantly encoded in the energy eigenstates (with spectral rigidity determining the time scale of applicability), while it is their discretized counterparts (cyclic ergodicity and aperiodicity) that are fully encoded in the energy eigenvalues. Such a conclusion is consistent with the mathematical observation [64] that classical cyclic permutation errors cannot fully identify classical mixing behaviors, which instead requires an "incongruity" of the C k when expressed in terms of the physical variables in the n → ∞ continuum limit (essentially, the C k must be arranged erratically in phase space such that non-constant eigenfunctions f m̸ =0 (x ∈ P) of the classical unitary U T ≈ U T C , which satisfy f m (x ∈ C k ) ≈ exp(−2πikm/n) for 0 < |m| ≪ n, do not exist as smooth, well-behaved functions of the phase space coordinates x in the n → ∞ limit; see also Sec. 2.1). An identical conclusion is suggested by the numerical observation [39,40] that similar spectral properties may be seen in mixing and merely ergodic systems, and one may require explicitly identified physical variables to distinguish them (see also a related discussion of physical bases and the role of eigenstates in Ref. [22]). Nevertheless, in Sec. 7.2, we argue that a connection between spectral rigidity and eigenstate properties can be made if we introduce an appropriate dynamical criterion to identify physical observables.

Poincaré recurrences and eigenstate thermalization
It is natural to ask if we can characterize a physical phase space P[C] directly based on some physical property, without relying on observables such as B (which may only be possible to identify in quantum systems with an explicit phase space description [99], and even then, may be complicated by the non-classical behavior of e.g. Wigner quasiprobability distributions [100][101][102], especially for small µ(C k )). In this subsection, we consider a quantum analogue of Poincaré recurrences as a candidate for this role, noting that it is satisfied by the physical phase space in all classical systems -ergodic and non-ergodic 8 .
We introduce the following ad hoc definition based on the classical statement of the Poincaré recurrence theorem [1][2][3]

(see also Sec. 2): any subspace H(A) ⊆ Σ d with projectorΠ(A) (and density matrixρ(A) =Π(A)/ Tr[Π(A)]) is Poincaré recurrent if there exists an orthonormal basis A
for H(A), such that any pure state |ψ⟩ ∈ A returns to have a larger-than-random overlap with the subspace at some time t, with t 0 ≪ |t| ≲ O(t 0 d): We note the similarity of the restriction on the range of t to that in the definition of cyclic aperiodicity (Eq. (14)). For simplicity, we assume that it is sufficient to consider DFT cyclic permutations as representing any (sub-)regions in phase space. We also assume that the persistence amplitude is the only greaterthan-random component of any state with respect to a DFT basis of interest (in other words, none of the terms ε , as is typically the case for e.g. Wigner-Dyson or Poisson statistics (partly due to Eq. (46)). Let t R (C) then represent the randomization time of a DFT cyclic permutation C -the smallest time for which we have z(t R /t 0 , t 0 ) ∈ O(d −1/2 ), which generically decreases with increasing ε C (1, t 0 ).
The time t R determines the minimum dimension of Poincaré recurrent subspaces H(A) that have the diagonal form in Eq. (71). Specifically, only subspaces with are guaranteed to be Poincaré recurrent. Non-aperiodic cyclic permutations have t R (C) > t 0 d and every subspace is recurrent; for ergodic ones, t R (C) > t 0 d/2, and any subspace with H(A) ⩾ 2 (in other words, every subspace that is not a pure state) is recurrent. If we impose Poincaré recurrence for all regions of the form of Eq. (71) containing more than one pure state (µ(A) > 1/d) 9 , as a criterion to identify a physical phase space P[C] = P, it follows that the only DFT cyclic permutations that can satisfy this requirement are ergodic ones. In other DFT bases, Poincaré recurrence fails for certain regions A with 2/d ⩽ µ(A) < t 0 /t R (C). Thus, if we want to construct a fictitious phase space for some energy subspace Σ d that satisfies Poincaré recurrence for arbitrarily small regions (of greater than the smallest volume 1/d), there are two possibilities given our assumptions: This is the key component of our argument: imposing an ad hoc Poincaré recurrence requirement to identify a physical phase space basis P[C] = P forces this basis to be a collection of ergodic cyclic permutations. Thus, we obtain a quantum analogue of the classically more trivial statement that the phase space P is either itself ergodic or decomposes into ergodic subsets [1][2][3].
In either case, the projectorsΠ(A) can be written aŝ for some random d m × d m Hermitian matrix R kj (m) with O(1) matrix elements (with weak correlations ensuringΠ 2 =Π). On the other hand, the statement of ETH for an n-dimensional projectorΠ may be motivated by random matrix arguments (e.g. similar to Refs. [14,16,61]; see also Ref. [103] for a discussion of the role of highly degenerate projectors), giving in the full energy subspace Σ d . We see that Eq. (77) and Eq. (78) are guaranteed to agree for M = 1. For M > 1, the first (diagonal) terms of Eq. (77) and (78) can only agree through a statistically unlikely coincidence n m /d m = n/d; even in the rare instance that this holds, the second (fluctuation) term of the former typically has matrix elements of size O[( √ Mn)/d] (or zero when k, j correspond to different Σ d m (m)), and ETH can be satisfied only for M ∈ O(1).
We have essentially argued (under some simplifications, e.g. focusing on DFT bases with typical random parts), that a quantum correspondence between dynamical ergodicity (equivalently, spectral rigidity) and thermal ergodicity (essentially, observable-dependent ETH or eigenstate properties) can be established within the framework of cyclic permutations, given a single additional principle to identify physical observables. In particular, given Poincaré recurrence, ETH is satisfied by generic projectors onto "physical" phase space regions (which then show quantum analogues of thermal ergodicity and mixing, albeit over long times t ≫ t ramp ) only if the system is cyclic ergodic in Σ d . It would be interesting to see if such an argument can be made more rigorous, and suitably generalized to systems without a classical limit where a similar correspondence between eigenvalue and eigenstate statistics is observed [14] but not yet understood. To do so, it would be necessary to determine if there's any link between a Poincaré recurrence requirement for the fictitious phase space observables considered here (relevant for t ≳ t ramp dynamics), and the local or few-body observables (relevant for t ≲ t ramp ) that are the subject of conventional ETH [10-12, 14-16, 104].

Conclusions
We have identified fully quantum dynamical properties in the Hilbert space that resemble discretized classical ergodic properties, namely cyclic ergodicity and aperiodicity, and shown how these are determined by precise measures of energy level statistics, thereby introducing genuine dynamical elements into the study of quantum ergodicity and "chaos". These properties were illustrated with four physically relevant types of level statistics: Wigner-Dyson, Poisson, and that of 2D KAM and rational tori. A key takeaway is that we can isolate the precise aspects of random matrix behavior -an exact Gaussian distribution of mode fluctuations with variance σ 2 ∆ in the range of Wigner-Dyson spectral rigidity -that are sufficient conditions for cyclic ergodicity, identifying which of the numerous measures of level statistics [8] are physically important (from the present viewpoint). However, random matrix behavior is not necessary for cyclic ergodicity, as typified by 2D KAM tori. Our study of the latter included an analytical proof of spectral rigidity in an infinite sequence of subspaces, demonstrating that cyclic permutations may offer an intuitive path towards the open problem of rigorously connecting spectral and ergodic properties in individual systems [24,25].
Our dynamical construction also clarifies that spectral rigidity should be most directly associated with (quantum) cyclic ergodicity and aperiodicity, rather than stronger notions of chaos as is the norm in the literature [7]. This provides a general theoretical framework for justifying numerical observations [38][39][40] in systems with a non-chaotic, ergodic classical limit. At the same time, the precise relationship between these eigenvalue-based "cyclic" dynamical properties and more familiar (semi-)classical properties (generally involving both eigenvalues and eigenstates in the classical limit) merits further investigation, perhaps based on a classical limit of Eqs. (46) and (52) that relate quantum cyclic ergodicity to mean recurrence properties (such as, possibly, the semiclassical effect of periodic orbits [7,31,32]) in a DFT cyclic permutation. As argued in Sec. 7, the classical limit may also provide inspiration for identifying fully quantum connections between dynamical (or eigenvalue) and thermal (or eigenstate) properties in arbitrary quantum systems.
The approach developed here may generalize to "quantizing" other dynamical properties of physical interest. Connections between quantum dynamical properties and a suitably defined Kolmogorov-Sinai (KS) entropy (a measure of chaos closely related to Lyapunov exponents [5], see e.g. Ref. [105] for candidate definitions) may be accessible through non-cyclic permutations, which are related to the classical KS entropy in Refs. [2,63]. Generalizing further to non-unitary dynamical structures could help associate precise dynamical properties with the recently observed analogues of spectral rigidity in autonomous dissipative quantum systems, whose evolution is generated by non-unitary Liouvillian superoperators [106][107][108]. Finally, the precise understanding of the spectral rigidity of 2D KAM tori in terms of their dynamical properties obtained in Sec. 6 may serve as a starting point to adapt elements of KAM theory [5,91] -the classical study of the development of ergodicity under perturbations to integrable systems -to fully quantum systems. The rapid development of ergodicity in many-particle systems even under small perturbations, while not well understood in quantum mechanics, is believed to be essential for the applicability of statistical mechanics [15,109].
The number of cycling operators N(d, p) is given by the greatest common divisor of p and d; in particular, N(d, p) = 1 when p and d are coprime, including p = 1. This is most easily seen in the eigenvalue structure ofÛ p C , which consists of N(d, p) identical (degenerate) sets of distinct [d/N(d, p)]-th roots of unity. It is also convenient to consider twisted versions ofÛ p C , in which each cycle acquires an additional phase α j (w): It is worth noting that the twisting functional w affects only the eigenvalues ofÛ p C , lifting the degeneracy for most values of the α j (w), while preserving at least one complete orthonormal set of its eigenvectors. Also, the p-step persistence amplitudes z k (p, t 0 ) are invariant under the action of w.

B.2.1 Optimizing the error via the trace inner product
Due to its non-negativity, the minimum persistence at a given p is bounded by the mean persistence at that time: We also have the inequality, for any w, where the right hand side is just the mean persistence at p, expanded out. Let us assume that for every C and given a p, there exists a unitaryV C and a twisting functional w such that If this holds, then on account of Eq. (90) and the invariance of the z k (p, t 0 ) under the action of w, where w is chosen so that Eq. (91) is satisfied for some C that maximizes the right hand side of Eq. (92). This follows asV C becomes a special case ofV, and the only freedom to vary the orthonormal basis C is through its reorientations in Hilbert space -precisely given by all possible unitary transformationsV ∈ U(d) acting on the energy subspace Σ d . Now, we need to establish that Eq. (91) is indeed valid, and identify w. It is convenient to consider the two cases of nondegenerate and degenerateÛ p C separately.
1. Case 1: |p| and d are coprime. In this case,Û p C is itself a cycling operator. We separate the persistence inner product into an amplitude and phase, Let ϕ(p, t 0 ) = d−1 k=0 ϕ k (p, t 0 ). Define a new cyclic permutation C ′ with basis vectors where (j+1)p=k j=−1 ϕ jp = ϕ −p + ϕ 0 + ϕ p + . . . + ϕ k−p is a sum over the index with steps of size p, and subtracting ϕ(p, t 0 )/d from each term ensures the single-valuedness of the phases in the new basis. This induces a unitary transformationÛ C →Û C ′ =V CÛCV † C (whereÛ C ′ is required to satisfy Eq. (93) with the |C k ⟩ replaced by |C ′ k ⟩), such that We see that Eq. (91) is then satisfied for a twisting functional w with α 1 (w) = −ϕ(p, t 0 )/d (however, this phase is inconsequential in this case, being absorbed by the absolute value in Eq. (92)).
2. Case 2: |p| and d have a nontrivial common factor. For this case, we can ensure that the analogue of Eq. (91) for each [d/N(d, p)]-element cycle is satisfied following the procedure leading up to Eq. (95), with the total phase ϕ(p, t 0 ) replaced by that corresponding to the respective cycle, ϕ j (p, t 0 ). Then, it follows that Eq. (91) is also satisfied overall forÛ p C with a twisting functional w given by α j (w) = −ϕ j (p, t 0 )/d.
Thus, from Eq. (92), we can maximize the mean persistence by maximizing the magnitude of the trace with respect to reorientationsÛ C →VÛ CV † . In Sec. B.2.2, this maximum is shown to occur for as long as f p (Û C ) ⩾ d(d − 2) at some such point. IfÛ H (pt 0 ) and w{Û p C } both have nondegenerate eigenvalues, each has a unique set of d eigenvectors corresponding to the respective eigenvectors ofÛ H (t 0 ) andÛ C . Eq. (97) then implies that both sets of eigenvectors are identical, andÛ C must commute withÛ H (t 0 ) to achieve a local extremum of the mean persistence.
When there are degeneracies (in any ofÛ H (t 0 ),Û H (pt 0 ) or w{Û p C }), we can nevertheless reach a similar conclusion by infinitesimally breaking the degeneracies. We can defineÛ H(δ u ) = U H (pt 0 )e iδ uŶ where δ u → 0 andŶ is any finite Hermitian operator (i.e. with finite matrix elements in any orthonormal basis), such thatÛ H(δ u ) has nondegenerate eigenvalues when δ u ̸ = 0. Similarly, we define w (δ w ) by α j (w (δ w ) ) = α j (w) + δ w γ j with δ w → 0, with the γ j chosen so as to ensure the nondegeneracy of the eigenvalues of w{Û p C } (essentially, infinitesimally twisting any degenerate e iα j (w)Û C,j (p), e iα k (w)Û C,k (p), . . . relative to each other). Re-expressing Eq. (92) in terms of these variables, gives where O(δ u , δ w ) consists of terms of the form (δ u ) a (δ w ) b y ab with a, b ⩾ 1. As with Eq. (97), the solution to the maximization on the left hand side must be among its local extrema, given by Now, each nondegenerate operatorÛ H(δ u ) (pt 0 ) and w (δ w ) {Û p C } † has a unique set of d eigenvectors, which the above equation asserts are identical. We can chooseŶ and γ j to break the degeneracy of U H (pt 0 ) and w{Û p C } in any desired way, i.e. to pick any complete orthonormal subset of each set of eigenvectors. By Eq. (98), any such choice is equally good for maximizing the mean persistence in the δ u , δ w → 0 limit. In particular, we can pick w (δ w ) {Û p C } † so that its eigenvectors are identical to those ofÛ C ; similarly, we can chooseŶ so that the eigenvectors ofÛ H(δu) (pt 0 ) are identical to any complete orthonormal set of eigenvectors ofÛ H (t 0 ). In other words, any choice of degeneracy breaking in the neighborhood of degenerate operators only infinitesimally affects the local extrema of the left hand side of Eq. (98).
Thus, the right hand side of Eq. (89) attains its global maximum when the eigenvectors ofÛ C are fixed to be any complete orthonormal set of eigenvectors ofÛ H (t 0 ), with the only freedom remaining in the assignment of the distinct eigenvalues ofÛ C to these eigenvectors. This can be concisely expressed as follows: the global maximum of the mean persistence occurs among the solutions to lim for any HermitianŶ. For anyÛ C satisfying this property, all the z j (p, t 0 ) are equal at any given p. It follows that min j z j (p, t 0 ) is also maximized, and the p-step error minimized, by the samê U C that maximizes the mean persistence. From the requirement f p (Û C ) ⩾ d(d − 2), we get the condition ε C (p, t 0 ) ⩽ (2/d) on such a minimum of the error.

B.2.2 Local extrema, and the global maximum for large persistence amplitudes
For simplicity, letÛ 1 = w{Û p C } andÛ 2 =Û H (pt 0 ). We seek stationary points of the real valued function (from Eq. (96)) Tr Û † 1Û2 (101) with respect to small reorientations ofÛ 1 byV, to first order. This would yield all the local maxima and minima (as well as saddle and inflection points) of the function except the global minima when the function attains the value 0, where it is not differentiable. We writeV = e iX withX near 0, and require the phase of the O(X) term in Tr[VÛ † 1V †Û 2 ] to be orthogonal to the phase of the O(1) term (so that the first variation corresponds only to a change in phase and not in magnitude; alternatively, one could directly extremize the square of Eq. (101)). This gives Tr X Û † 1 ,Û 2 = c(X) Tr Û † 1Û2 for all HermitianX (102) with c(X) required to be a real-valued function, for the stationary points. As can be verified by imposing this for each independent degree of freedom in the matrix elements ofX, this requires whereF is some traceless Hermitian operator, and α 12 is the phase of Tr(Û † 1Û2 ). Up to this point, the unitarity ofÛ 1 andÛ 2 played no role. Now, we use the fact that their products are unitary, and write e −iα 12Û † 1Û2 = e iÂ 12 , andÛ 2 e −iα 12Û † 1 = e iÂ 21 , for HermitianÂ 12 andÂ 21 . Formally defining sines and cosines of Hermitian operators through their Taylor series (which are also Hermitian), Eq. (103) then gives cosÂ 12 − cosÂ 21 + i sinÂ 12 − sinÂ 21 =F.
The Hermiticity ofF demands that the anti-Hermitian part of the left hand side vanishes, giving sinÂ 12 = sinÂ 21 .
Let {a(k)} d−1 k=0 be the eigenvalues ofÂ 12 andÂ 21 (which must have identical eigenvalues up to irrelevant shifts of 2π, as products of two unitaries have the same eigenvalues irrespective of the order [78]). As long as it is known that a(k) ∈ [−π/2, π/2], the sine is invertible andÂ 21 =Â 12 (In fact, one getsÂ 12 =Â 21 and thereforeF = 0 for "generic" values of a(k) such that sinÂ 21 is not degenerate, i.e. the set {a(k), π + a(k)} is non-degenerate). Consequently, a(k) ∈ [−π/2, π/2], ∀ k =⇒ e −iα 12Û † 1 ,Û 2 = 0 at a stationary point. The vanishing commutator on the right side of the implication is precisely the condition of Eq. (97). The question of interest is now if there's a simple way to guarantee the restriction on a(k) in Eq. (107). To see that there is, we note that [Tr(e iÂ 12 )] ∈ R by the definition of α 12 , which implies k cos a(k) = Tr e iÂ 12 , k sin a(k) = 0.
Let us maximize the multivariable function b[a(k)] = k cos a(k) with fixed a(0) (and free a(k ̸ = 0)) subject to the constraint in Eq. (109) (and implicitly, non-negativity) using e.g. the method of Lagrange multipliers. The stationary points of b[a(k)] occur at a(k ̸ = 0) = c + πζ k , with ζ k ∈ {0, 1}, for some constant c. The global maximum of b[a(k)] corresponds to c ∈ [−π/2, π/2] and ζ k = 0 ∀k. Imposing Eq. (109) to fix c in terms of a(0), we get k cos a(k) ⩽ b max [a(0)] ≡ cos a(0) + (d − 1) 1 − sin 2 a(0) This is a monotonically decreasing function of |a(0)| in its full domain [0, π] for d ⩾ 2. In particular, if |a(0)| > π/2, then it is guaranteed that b[a(k)] < b max [π/2] = d(d − 2). Re-expressing b[a(k)] in terms of the trace of the relevant unitaries, we then have Combined with the implication in Eq. (107), it follows that maxima for which the trace is no smaller than d(d − 2) occur for cycling operators that commute with time evolution, i.e. when [Û † 1 ,Û 2 ] = 0. Such commuting operators remain local extrema of the trace in other cases, but it is unclear in the present analysis if the global maximum is among them. For comparison with the following subsection, we note that a(k) = −2πp∆ k /d, where ∆ k are the mode fluctuations used elsewhere (see Eq. (34)) in the main text.

B.3 Decrease of persistence for small permutations of sorted energy levels
When ∆ n ≪ d, assuming that the energies E n have been shifted by some additive constant so that k ∆ k = 0, we have (representing d times the persistence amplitude as per Eq. (34)) For simplicity, we assume that the levels are already sorted i.e. E n < E m when n < m. This further implies ∆ n − ∆ m > −|n − m|.
Any permutation q(n) can be broken up [8] into a set of cyclic permutations q r (n), each involving a subset of N r levels E[q r ] = {E r k } N r −1 k=0 . For the rest of the argument, we will require (where the subtraction of r is on Z (linear), and not on Z d (circular or modulo d)) for each q r ; permutations q satisfying this are what we refer to as "small" permutations. This will ensure that Eq. (113) remains valid under these permutations without discrete shifts of some of the ∆ k by multiples of (2π).
The new mode fluctuations after permutation are given by for each cycle q r . It follows that the mean is preserved, i.e.
Our goal is to show that the variance of the ∆ ′ k is larger than that of the ∆ k , which would translate to a decreased persistence by Eq. (113). We have In general, the r k+1 are not in any simple (e.g. ascending or descending) order. We can split each difference [r k+1 − r k ] in the first term on the right hand side into a sum of differences of the r ℓ lying between (and inclusive of) them: where ζ k = sgn[r k+1 − r k ], and the r j are chosen to be sorted according to j. On including terms with different values of k, each interval (r j+1 − r j ) occurs in an equal number of terms with positive ζ k = +1 (r k+1 ⩾ r j+1 ) and negative ζ k = −1 (r k+1 ⩽ r j ). We can arbitrarily pair each positive term r + with a negative term r − , and use Eq. (114) for the difference ∆ r + − ∆ r − noting that r + > r − . This amounts to replacing the equality with ⩾, and each ∆ r k+1 with −r k+1 , in Eq. (118). We therefore obtain The second equation follows from simplifying the first. Adding all such equations from each q r together, we get This shows that sorting the energy levels corresponds to the maximum persistence at p = 1 for a given t 0 and small ∆ k , at least among other possibilities that can be obtained as small permutations of the sorted levels. This is more like a discrete version of a local extremum. It would be interesting to check if "larger" permutations not subject to Eq. (115) would lead to significantly better maxima; this is unlikely to be the case without some non-intuitive conspiracy between distant energy levels.
where ε p ≡ ε C (p, t 0 ) and g 1 = ε 1 /(1 − ε 1 ). We note that g 1 is also the coefficient that occurs on the right hand side of Eq. (45). The p-th power of the error unitary iŝ with Θ(x) = 1 if x is true and 0 otherwise. Each term with fixed r in (123) represents a sum over paths for the transition amplitude from any |C k ⟩ to |C k+r ⟩. Now, we apply the assumption of error coefficient pairing, by considering only terms where as many m j as possible are each paired with a corresponding m k = −m j . For even s in Eq. (124), restricting to such pairings necessarily implies that m 1 + . . . + m s = 0. For odd s, it is not possible to pair all error coefficients and a free error coefficient remains, whose index must necessarily be r if the remaining coefficients are paired. Schematically (in the sense that we avoid explicitly enumerating the possible pairings), for non-negative integer u, In the second line, we have accounted for s = 2u + 1 different ways of choosing the unpaired coefficient, and used s p s = (p + 1 − s) p s−1 . For a given u, the sum over pairings and coefficients within the braces in Eqs. (125) and (126) are identical, irrespective of the value of r. Treating g 1 as a formally independent parameter that we can take partial derivatives with respect to, we can further replace (p − 2u) with (p − g 1 ⃗ ∂/∂g 1 ) acting on its right in Eq. (126), which moves all the u dependence to inside the braces. For even p, this means that each sum over s in Eq. (123) -which is naturally restricted to even s for r = 0 and odd s for r ̸ = 0 after pairing -produces coefficients for all r that are identical except for the operators outside the braces in Eqs. (125) and (126). If the time dependence is sufficiently slow, the result for odd p can be extrapolated (to a good approximation) in any convenient way between those for p ± 1. Thus, we have the approximate form The function h(p, g 1 ) originates in the sum over pairings within the braces of Eqs. (125) and (126); from the above expression, it is formally related to the persistence amplitude at p by z(p, t 0 ) = (1 − ε 1 ) p/2 h p, ε 1 1 − ε 1 . (128)

C.2 Gaussian estimate
The persistence amplitude at p + 1 can be expressed in terms of the coefficients inÛ p ∆ andÛ 1 ∆ as follows: Substituting the appropriate expressions for ε p and ν p from Eq. (127), we get Now, we assume that the second term within the absolute value is smaller than the first, and ph ≫ g 1 ∂h/∂g 1 ; both will be justified retroactively. Further defining which happens to measure the goodness of the approximation in Eq. (48), we are led to z(p + 1, t 0 ) ≈ (1 − ε 1 ) 1/2 1 − g 2 1 pν C z(p, t 0 ).
We see that the smallness of the second term in Eq. (130) and ph ≫ g 1 ∂h/∂g 1 are both satisfied when p ≪ 1/g 1 , i.e. when the persistence amplitude is still close to 1.

C.3 Minimum error constraints from the SFF
Substituting the form K(t) = λt γ in Eq. (52) and dropping subleading terms in ε 1 = ε C (1, t 0 ) gives For γ ∈ [0, 1), the left hand side is dominated by small p and is independent of M. Replacing 1/(M √ ε 1 ) → ∞, we obtain where ζ(x) is the Riemann zeta function. In particular, for γ = 0 and λ = 1/d (Poisson statistics), we have ε ⪆ π 2 /(3d) ∈ O(1/d). For γ > 1, it is instead the terms with larger p that dominate. Using the leading term in Faulhaber's formula for the sum (formula (0.121) in Ref. [110]; equivalent to replacing the sum with an integral), we have Figure 10: Comparison ofν m (p) withν m (1), using magnitudes |ν m (p)| 2 , |ν m (1)| 2 and residuals |ν m (1) −ν m (p)| 2 for d = 2048. The residuals are predicted to be negligible compared to the magnitudes at the same m for p ≪ 525, which these plots are in good agreement with even when p is a considerable fraction of 525.
in general. This argument means that for rational tori, we should expect (based on Eqs. (32) and (34) which is at least of the same order of magnitude of spectral fluctuations as Poisson level statistics. Given that the 1-step error for any cyclic permutation must be ≳ O(d −1 ) in this case by Theorem 4.1, it follows that rational tori are unlikely to possess any ergodic cyclic permutation in Σ d . We also note that an analytically solvable illustration of this argument is given by the sorted DFT cyclic permutation for t 0 = 2π/(ω x d x ) when d x = d y , in which the (wrapped) energy levels are exactly √ d-fold degenerate.