Chaos and subdiffusion in the infinite-range coupled quantum kicked rotors

We map the infinite-range coupled quantum kicked rotors over an infinite-range coupled interacting bosonic model. In this way we can apply exact diagonalization up to quite large system sizes and confirm that the system tends to ergodicity in the large-size limit. In the thermodynamic limit the system is described by a set of coupled Gross-Pitaevskij equations equivalent to an effective nonlinear single-rotor Hamiltonian. These equations give rise to a power-law increase in time of the energy with exponent $\gamma\sim 2/3$ in a wide range of parameters. We explain this finding by means of a master-equation approach based on the noisy behaviour of the effective nonlinear single-rotor Hamiltonian and on the Anderson localization of the single-rotor Floquet states. Furthermore, we study chaos by means of the largest Lyapunov exponent and find that it decreases towards zero for portions of the phase space with increasing momentum. Finally, we show that some stroboscopic Floquet integrals of motion of the noninteracting dynamics deviate from their initial values over a time scale related to the interaction strength according to the Nekhoroshev theorem.


I. INTRODUCTION
The kicked rotor is a paradigmatic model in classical Hamiltonian dynamics. This simple model has been widely used to numerically study the development of chaos when integrability is broken, in accordance with the Kolmogorov-Arnold-Moser theorem [1][2][3]. In this singledegree-of-freedom model there can be weak chaos when the integrability breaking is small and there are still many manifolds of integrable-like dynamics (conserved tori) in phase space. In this case of weak chaos, the energy does not increase in time, while in the case of strong chaos the dynamics is ergodic and diffusive in phase space and the energy linearly increases in time. Considering the case of many coupled kicked rotors, on the opposite, there are not enough conserved tori and the dynamics eventually shows a diffusive behaviour with energy linearly increasing in time [1,2]. Nevertheless, the time scale after which this behaviour appears is very long and exponentially large with the inverse of the perturbation from integrability [4,5]. This is a particular case of a general theorem due to Nekhoroshev [6][7][8]. So, a classical Hamiltonian system slightly perturbed from integrability shows a thermal ergodic behaviour after a long nonthermal transient, in strict analogy with the well-known quantum prethermalization [11].
The single kicked rotor is very interesting also from the quantum point of view. Indeed this model shows a lack of correspondence between classical and quantum behaviour. Differently from other cases [12][13][14][15][16][17][18], the kicked rotor shows a diffusive dynamics with linearly increasing energy in the strong classical regime which has no counterpart in the quantum domain. When the model is quantized due to quantum interference the classical energy increase stops after a transient. This behaviour is known as quantum dynamical localization [19,20,22], and has been interpreted as a dynamical disorder-free analog of Anderson localization [23][24][25] and has also been experimentally observed [26,27].
It is interesting to understand if dynamical localization persists in the case of many interacting rotors. It has been argued for different types of interactions, that for any set of parameters there is a threshold in system size beyond which the system becomes ergodic and the energy increases in time without a bound [28,29]. So, in contrast with other models [30][31][32], there is no dynamical localization in the many-body limit. Nevertheless, the system does not become equivalent to the classical model. This is clear in the case of infinite-range coupling where the thermodynamic-limit dynamics can be exactly computed [29] and one finds that the energy increases as a power law with exponent γ < 1 (signaling subdiffusion in momentum space): Quantum effects make the classical linear energy increase slower. A power-law increase of the energy has also been observed in the quantum few-rotor interacting case [33] and in many non-linear generalizations of the single quantum kicked rotor [34][35][36] and related disordered models [37][38][39]. On the opposite, in the classical chaotic many-rotor case, the energy increases linearly in time [4,5]. One can observe a linear increase of the energy also in a single quantum kicked rotor if the kicking amplitude is modulated by a noise [40].
In this work we focus again on the quantum infiniterange coupled quantum kicked rotors. We restrict to the subspace even under all the site-permutation transformations, we apply a method used before in another infiniterange coupled context [41], and map the model over a bosonic infinite-range interacting model over a lattice. From one side this fact allows us to apply exact diagonalization for larger system sizes and larger truncations with respect to what previously done. In this way we can probe ergodicity by means of the average level spacing ratio for larger system sizes N and find further confirmation for the generalized tendency towards ergodicity for increasing N previously demonstrated [29].
This bosonic mapping is convenient also because allows to show that, in the limit N → ∞, the model is described by a system of Gross-Pitaevskii equations. In the limit of vanishing interactions, these equations are equivalent to the Schrödinger equation of the single kicked rotor represented in the momentum basis. They are exact in the limit N → ∞, and we can show that in general they are equivalent to the Schrödinger dynamics of the non-linear single kicked rotor effective Hamiltonian found in [29]. The energy increases in time as a power law with exponent γ < 1 (sub-linear way) and the power-law exponent appears to be constant in a wide range of parameters of the model and consistent with the value γ ∼ 2/3 (in agreement with [29]).
We can analytically explain the value of this exponent by considering the nonlinear modulation of the kick in the effective Hamiltonian as a noise. This allows us to write a master equation for the density matrix. We move to the Floquet basis [42] and apply a coarse graining in time as in [43]; exploiting momentum-space Anderson localization of the Floquet states we get a diffusion equation for the momentum-eigenstates occupations. From that, we get a self-consistent differential equation for the energy expectation providing the power-law increase with exponent 2/3. Moreover, we predict that the coarse-grained squared non-linear modulation of the kick depends on time as t −1/3 and we numerically verify this fact. Our approach can be applied also to the case of a kicking modulated by a noise with properties invariant under time translations. This gives rise to a diffusive behaviour of the energy, in agreement with [40].
Our master-equation approach is possible thanks to the chaotic behaviour of the Gross-Pitaevskii equations. In order to probe the chaoticity properties of these equations, we evaluate the largest Lyapunov exponent, which gives a measure of the rate of exponential divergence of nearby trajectories [44]. We see that, when we consider parts of the phase space with larger and larger momentum, the largest Lyapunov exponent decreases as a power law towards 0. So, for increasing momentum, the system is still chaotic but becomes asymptotically regular in the limit of infinite momentum. This is in agreement with the results of the average level spacing ratio at finite size suggesting full ergodicity in the large-size limit.
The largest Lyapunov exponent results are relevant also for finite N 1. Here, due to Heisenberg principle, the relevant dynamical variables have an initial uncertainty of order 1/ √ N . For finite N we can use exponential divergence of nearby trajectories to show that the Gross-Pitaevskii equations are valid for a time increasing as log N , with a coefficient equal to the inverse of the relevant Lyapunov exponent. So, in the limit of infinite momentum, the Gross-Pitaevskii equations tend to be valid for an infinite time, as one naively would expect, being in that limit the integrability-breaking interaction term vanishingly small compared with the undriven term. Moreover, the validity of the Gross-Pitaevskii equations for a time logarithmic in N in the case of a chaotic dynamics is a fact generally valid in bosonic mean field dynamics. This is relevant for instance in the context of time crystals [45]. Chaos in this model obeys the Nekhoroshev theorem [6]. In the noninteracting case the dynamics is the Schrödinger dynamics of the single rotor which behaves in an integrable way and the resulting constants of motion of the stroboscopic dynamics are related to the Floquet states. Turning on the interaction, the Schrödinger dynamics becomes the nonlinear Gross-Pitaevskii one, integrability is broken and the Floquet constants of motion are no more conserved. They deviate in time from their initial value over a time scale exponential in the inverse of the integrability-breaking interaction term.
The paper is organized as follows. In Section II we introduce the model Hamiltonian and we perform the mapping to the infinite-range bosonic model. We describe all the details of the construction of this bosonic representation in Appendix A. In Sec. II A we perform exact diagonalization on this Hamiltonian, with an appropriate truncation, and by means of the average level spacing ratio we show the existence of a generalized tendency to ergodicity for increasing system size. In Sec. III we show that in the limit N → ∞ the dynamics of the bosonic Hamiltonian is described by a system of Gross-Pitaevskii equations completely equivalent to a non-linear single-rotor effective Hamiltonian. In Sec. IV we use these equations to numerically study the evolution of the energy. We find that it increases in time with a power law. We analytically explain the power-law exponent γ = 2/3 valid in a wide range of parameters by using a master-equation approach. In Sec. V we study the Lyapunov exponent and show that the rate of exponential divergence of the trajectories tends to zero for increasing considered values of the momentum. In Sec. VI we show that the time scale over which the vanishing-perturbation integrals of motion deviate from their initial value obeys the Nekhoroshev theorem.

II. HAMILTONIAN AND MAPPING TO THE BOSONIC MODEL
So, this is the quantum infinite-range coupled kicked rotorĤ where we define [46] l, l cos(θ l −θ l ) (2) (here we have replaced the physical coupling K withkK, in order to somewhat simplify the subsequent formulae). The commutation relations are valid, withk related to the physical parameters of the Hamiltonian (one arrives at Eq. (1) after an appropriate rescaling [29]). In all the paper we will focus on the stroboscopic dynamics, looking at the system in the instant n + , that's to say immediately after the n th kick (in the text we will omit the superscript + ). Most importantly this model is invariant under all the sitepermutation transformations. The subspace even under all these transformations is therefore an eigenspace of the Hamiltonian, a point which will be crucial in the next section. We restrict to the subspace even under all the permutation transformations. Using the methods explained in [41] we get the effective Hamiltonian (see Appendix A) whereb m are bosonic operators obeying the commuta- and we definen m ≡b † mbm . The bosons obey the constraint mn m = N . Here m mark the discrete single-particle momentum eigenvalues; with this mapping we represent the dynamics in terms of occupations of these levels. In the next subsection we are going to discuss the ergodicity properties of the model by means of the level statistics.

A. Average level spacing ratio
This bosonic representation is quite convenient from a technical point of view because it is possible to apply exact diagonalization to the Hamiltonian Eq. (4) for system sizes and truncations of the Hilbert space significantly larger than those considered in [21,29]. A very important object in a periodically-driven dynamics is the time-evolution operator over one period which for the Hamiltonian in Eq. (1) iŝ We report the expression of V (θ) and N l=1p 2 l in the bosonic representation in Eq. (A9). We further restrict to the subspace even under the m → −m symmetry [47]. In this subspace, we computeÛ , diagonalize it, and get the many-body Floquet quasienergies µ α as the phases of the eigenvalues ofÛ [48,49]. Of course, this is possible only imposing a truncation to the Hilbert space, restricting to the states for which |m| ≤ M . In order to probe ergodicity of the system, we can evaluate the average level spacing ratio r [50] defined as where the quasienergies are restricted to the first quasienergy Brillouin zone [51,52] [−π, π] (they are periodic of period 2π) and taken in increasing order. N M is the dimension of the truncated Hilbert space. It is important that on the subspace even under all the permutations we have imposed the further constraint of being even under the m → −m symmetry. In this way we are restricting to an irreducible invariant subspace of the Hamiltonian, a condition required in order that the level spacing distribution (and the related ratio r) is a meaningful ergodicity indicator [53]. When the driven system is ergodic, i.e. locally thermalizing with T = ∞ [54], the Floquet operatorÛ belongs to the circular orthogonal ensemble (COE) of symmetric unitary matrices (because of the time-reversal invariance) [33,55,56,59]. In this case, the level spacing distribution is of the COE type and the average level spacing ratio acquires the value r COE 0.5269. A level spacing distribution of the Poisson type corresponds to an integrable dynamics [60] and is observed for instance in the single kicked rotor which is dynamically localized [20,29] and behaves in an integrable-like way breaking the classical ergodicity (one can contstruct infinite integrals of motion local in momentum space which deeply affect energy absorption). It corresponds to an average level spacing ratio r P 0.386.
We show r versus K for a fixed in Fig. 1. For each N we choose M so that r has attained convergence (fixing N , we see a quite fast convergence of r with the truncation M of the Hilbert space). We see that from a certain K on, r attains the COE value: The system becomes ergodic. For increasing N the values of r generically increases and r attains the COE value for smaller and smaller values of K. This suggests a tendency towards ergodicity for increasing size. This is in agreement with the analytical predictions of [29] of a completely ergodic system in the limit N → ∞.

III. GROSS-PITAEVSKII EQUATIONS IN THE
N → ∞ LIMIT

A. Derivation
We start by considering the limit N → ∞ where the dynamics is described by effective classical equations. Everything is based on the definitionβ m ≡b m / √ N and on the observation that and exploit the fact that in the limit N → ∞ theβ m are uncorrelated, due to the vanishing commutators in this limit [Eq. (6)]. By defining β m (t) = ψ(t)|β m |ψ(t) , in the N → ∞ limit we get from the Heisenberg equations the following system of Gross-Pitaevskii equations The energy per rotor of the unkicked part of the model can be written as We see that for = 0 this system of equations is equivalent to the Schrödinger equation for the wave function of a single rotor in the momentum-basis representation [56,58]. The Gross-Pitaevskii equations can be obtained from the effective classical Hamiltonian In order to get the Gross-Pitaevskii equations, one writes These Poisson brackets will have a relevant role in the analysis related to the Nekhoroshev theorem in Sec. VI. In the limit M → ∞ we can actually map the semiclassical model Eq. (7) into the self-consistent single-particle model studied in [29]. In order to do that, we rewrite Eq. (7) in a sort of continuum-limit approximation. Identifying x ≡ m and ψ(x) ≡ β m , expanding to all orders in 1/m as in [62], we get where we have exploited that ∞ −∞ |ψ(x, t)| 2 = 1 and used the formal relation ψ(x+1) = exp d dx ψ(x), involving all the perturbative orders [62]. Introducing the appropriate coordinate and momentum operatorŝ such that [θ,p] = ik, [63] we can rewrite this formula as with the effective self-consistent Hamiltonian This is exactly the effective mean-field Hamiltonian following Eq. (1), found in [29] through a much more involuted analysis. We can easily see that, if the initial state is symmetric underθ → −θ, this symmetry will be preserved during the time-evolution, so that ψ t | sinθ|ψ t = 0. Consequently is purely real. Thus, the effective self-consistent Hamiltonian Eq. (16) acquires the form This effective Hamiltonian will be very relevant for us in the next section, where we will use it as the starting point to construct a master-equation model for explaining the energy subdiffusion with exponent 2/3.

IV. ENERGY SUBDIFFUSION
We numerically solve equations (7) by means of a fourth order Runge-Kutta method with adaptive time step scheme [64,65]. We initialize them in the symmetrized m = 0 state for the rotors, therefore we choose β m (0) = 1 for m = 0 0 otherwise . and we can compare the results with those of the mean field analysis performed in a different way in [29]. In order to implement them we have to impose a truncation, fixing some M > 0 and restricting to the values of m such that m ≤ M . We show some plots of e(t) versus t with stroboscopic time (t is integer) for an interacting case with = 0 in Fig. 2. The horizontal line corresponds to the T = ∞ value of the unkicked energy in the truncated subspace. This quantity can be readily evaluated as In Fig. 2 we can see convergence towards e T =∞ , and therefore thermalization. Most importantly, we see a power-law increase in time of the energy, still in agreement with the findings of [29]. We can see this power-law increase until saturation sets in, but with M → ∞ it would last forever, as one can easily convince by looking at Fig. 2. We show some examples of power-law increase for different choices of parameters in Fig. 3. The curves with ∈ [0.1, 1] are consistent at long times with a power-law increase of the form This numerical result was already found in [29] (see Figs. 9 and 10 of that paper). So, in a wide range of , the exponent of the power law appears to be independent of the choice of the parameters. In the following we provide an analytical argument for explaining this finding. Before deriving the equation giving rise to momentum subdiffusion, we explain the rationale behind our analysis. The key observation is that energy absorption is purely driven by the fluctuation of the self-consistent field F (t), in fact, if F (t) were constant, the system would be in a dynamical-localized phase with an asymptotically constant energy. We can then expect that the evolution with the effective Hamiltonian Eq. (14) gives a d dt p 2 t proportional to the variance of F (t) over time -as we will show in Sec. IV B. Furthermore, as the momentumrange m t = p 2 t /k grows in time, the variance of F (t) decreases towards 0 as 1/m t (see Sec. IV A). As a consequence the diffusion in momentum is slowed down as the wave-function spreads, giving rise to the subdiffusive behaviour observed above. Other numerical parameters:k = 1.7.

A. Statistical properties of F (t)
While the non-linear equation Eq. (7) is hard to characterize analytically, a simplifying assumption is to consider F (t) as an effective noise. This is reasonable because, as a consequence of the chaotic diynamics in Eq. (7), F (t) shows random oscillations symmetric around 0, as we can see in Fig. 4. Before starting our analysis we discuss which minimal set of self-consistency properties should the effective noise have in order to adequately describe F (t).
Firstly, looking at Eq. (15), we can split the sums over m into pieces, each of one including O(ξ loc ) adjacent ms, where ξ loc is the single-rotor localization length (in momentum space). It is now reasonable to assume that the contribution of each of these pieces will be largely uncorrelated with the other ones. Then, when the state is spread over a momentum range m t ξ loc , by the central limit theorem, the sum F (t) will be distributed like a Gaussian with mean zero.
To estimate the variance of the Gaussian noise, we use the normalization condition m |β m | 2 = 1 to infer that the magnitude of |β m | scales like 1/ √ m t for m m t , and is approximately zero otherwise. Then summing over a region of length 2ξ loc aroundm, such that |m| m t , we approximately have assuming perfect correlation inside that region. Considering, instead, pieces from different regions as uncorrelated, we have that Estimating the localization length in terms of the singlerotor parameter (see e.g. Ref. [25]) we then have Finally, since the evolution of β is chaotic (see Sec. V), we will assume that the autocorrelation of F (t) decays in time over a finite scale τ . These assumptions are consistent with the numerical solution of the equation of motion. We show an example in Fig. 4, where we can see that the random oscillations are symmetric around 0 and F (t) is short-range correlated in time.

B. Effective master equation
We start by considering the self-consistent Hamiltonian Eq. (16) and we defineĤ 0 (t) ≡p  Treating F (t) as a Gaussian noise, we can derive from Eq. (21) a master equation similar to the ones resulting from the coupling to an environment [42]. We write first the evolution equation for the density matrixρ ψ Integrating Eq. (22) once we get Substituting this formula into Eq. (22) and averaging over the Gaussian noise (the assumption of Gaussianity allows to neglect the correlations betweenV I andρ ψ I ) we get where · · · marks the average over the Gaussian ensemble and we have definedρ I (t) = ρ ψ I (t) . At this point we apply a coarse-graining average over a mesoscopic timescale ∆t τ, 1. Furthermore, we assume that averages of functions of F (t) over time can be substituted with the corresponding averages over the Gaussian ensemble producing the noise (see Sec. IV A). We expect this step to be valid as long as the system is effectively ergodic on the mesoscopic time-scale ∆t, which is reasonable in this case, given that the system is chaotic (see Sec. V). Thus, we have where (· · · )(t) marks the average over the coarse-graining time interval [t − ∆t, t + ∆t] and g(t − t ) is a function which decays to 0 when |t − t | > τ . Integrating over dt and applying the coarse-graining average, the δ 1 functions disappear and give rise to an average over integer times, multiple of the kicking period (stroboscopic times). The kicking period 1 is much smaller than ∆t, the resolution in time after coarse graining, then we can approximate the average over stroboscopic times with an average over continuous times. Moreover, assuming τ ∆t, we can substitute t with t because the function g(t − t ) acts as a delta function at the level of time resolution given by ∆t. Defining where the coefficient g results from the integration of g(t − t ) and putting everything together, we get d dtρ where, due to the presence of the δ 1 (t), the coarsegraining average results being over discrete values of time (t 1 is integer) and we have approximated the coarsegrained derivative asρ I (t+∆t)−ρ I (t−∆t) 2∆t d dtρ I (t) because we are interested in phenomena occurring over timescales larger than ∆t. Moreover, σ(t) is already coarse grained and the coarse-graining average does not affect it.
Finally, since we are interested in the asymptotic properties at large times, we can assume that m t 1. Consequently, by Eq. (20), we can assume that σ(t) 1. This assumption is crucial for our analysis since it allows us to make a separation-of-timescales approximation [42]. With this assumption, we can see from Eq. (27) that the noise induces significant changes inρ I (t) over a time scale t r much larger than the typical order-1 timescale t 0 of the dynamics induced byĤ 0 . So, if we take t 0 ∆t t r , ρ I (t) in Eq. (27) is not affected by the coarse-graining average. In essence,ρ I (t) can be approximated by its coarse-grained average,ρ I (t) ρ I (t). This fact will have important consequences in the following analysis of subdiffusion.

C. Subdiffusion
In order to derive subdiffusion, we need to consider the dynamics of the operatorÔ = |m m| where |m is an eigenstate of the operatorp with eigenvaluekm and m ∈ Z. Its expectation is the occupation probability of the state |m and we will eventually find a diffusion equation for these occupations.
We can expand this operator in the basis of the noninteracting Floquet modes [48,49]. They are defined in terms of the noninteracting Floquet states which are solutions of the noninteracting Schrödinger equation ik∂ t |ψ(t) =Ĥ 0 (t) |ψ(t) which are periodic up to a phase |ψ j (t) = e −iµj t |φ j (t) . The periodic part |φ j (t) are the Floquet modes. We have definedÛ 0 (t) as the time-evolution operator ofĤ 0 from time 0 to time t, and we findÛ So, we get the expression where O i j (t) ≡ φ j (t)|Ô|φ i (t) is a periodic quantity. Now let us write its expectation at time t in the interaction representation as where we have used Eqs. (28) and (29). We apply the coarse-graining average. Thanks to the separation of timescales, it does not act onρ I (t). It acts over many periods and, assuming that the µ j are incommensurate with the driving frequency 2π [43], we get Being O i j (t) periodic, O i i does not depend on the coarsegrained time. Using Eq. (27) and applying the cyclic property of trace, we get Expanding in the Floquet-mode basisV = k l V k l (t) |φ k (t) φ l (t)| [as in Eq. (29)] we obtain (we report the detailed derivation from Eq. (32) in Appendix B). Dynamical localization implies Anderson localization of the Floquet modes |φ k (0) in the momentum basis {|m } [23,24,29]. Moreover,V is local in momentum space because it only connects |m with |m + 1 . These two facts imply that |V k i | 2 is local (or short range) and is nonvanishing only if |j − k| is smaller than some localization length λ. At this point we can substituteÔ = |m m| and define p m (t) ≡ |m m|(t) = m|ρ I (t)|m . Concerning this quantity, Anderson localization of Floquet states has another important consequence. If we make a coarse-graining average in momentum space, then φ k (0)|ρ I (t)|φ k (0) can be approximated by m |ρ I (t)|m for some m , and we get where C m m is a coupling local in m and m . In order to better specify the form of this local coupling, we consider that, thanks to the coarse graining in momentum space, m can be assumed continuous. A constraint comes from the normalization condition m p m (t) = 1, which is automatically enforced by writing the right-hand side of Eq. (34) as a total derivative in m We then conclude that plus higher derivatives term, whose contribution goes to 0 as the time goes on. This is a diffusion equation. If we initialize with a p m (0) symmetric for m → −m, the dynamics will preserve this symmetry. This is our case because we initialize with p m (0) = δ m 0 . Therefore, the quantity m 2 t ≡ m m 2 p m (t) is the variance of the p m (t) distribution. Notice that e(t) is proportional to m 2 t being e(t) =k Notice that if σ(t) = const., we find a diffusive behaviour of the energy, with e(t) ∝ m 2 t linearly increasing in time, in agreement with the results known in literature for a driving undergoing a noise with properties invariant under time translations [40].
In the case of our dynamics, Eq. (26) is valid and then we need to estimate F 2 (t) = Var [F (t)]. Applying Eq. (20), then we then have d dt Solving this simple differential equation we find This time dependence has been observed for long times and ∈ [0.1, 1] (see Fig. 3 and Ref. [29]). This theory gives rise to another prediction which we can numerically check. In fact, combining the scaling of the energy with Eq. (20), we find Whenever there is power-law increase of the energy with exponent 2/3, this is exactly the time dependence of σ(t) that we numerically observe (see Fig. 5). So, σ(t) decreases with time and then the coarse-graining approximation improves when larger and larger times are considered. This fact explains why the curves in Fig. 3 tend towards a power law with exponent 2/3 at large times.
We now discuss when we expect our analysis to break down. This is an important point, for instance fitting the curves for = 0.52 in Fig. 3 for t ≥ 3000 we find a value of the power-law exponent γ 0.72. Starting from smaller values of t one would get even larger values of γ. The point is that the curves slowly tend towards the power-law behaviour with exponent 2/3. The reason is that, as already stressed, our analysis heavily relies on the assumption σ(t) 1. Before this condition is valid, in the case = 0.52, one must wait a very long transient time. Plugging Eq. (20) into the definition of σ(t), we have that Thus, for σ(t) 1 to hold we must have m t 2 K 4 . This implies that for large K, there could be a very long transient regime where the energy absorption could follow a different scaling in time. This could explain the linear energy increase found in [29] for large K and , which might be just a transient behaviour. For 1, in contrast, the chaotic behaviour of the system becomes apparent after a transient exponentially large in 1/ , as a consequence of the Nekhoroshev theorem [6] (see Sec. VI). In this transient the system looks like integrable and shows dynamical localization. This fact could explain the dynamical localization without energy increase that we observe in Figs

V. LARGEST LYAPUNOV EXPONENT
Chaos was relevant in the last section in order to treat F (t) as an effective noise. We concentrate here in characterizing the chaotic properties of this model and we evalu-ate the largest Lyapunov exponent [44] λ(T ) as a stroboscopic average over T periods, and study its convergence with T , using the method explained in [66]. We compare two trajectories with nearby initializations, one with β 0 (0) = 1, β m =0 (0) = 0, another with β 0 (0) = √ 1 − δ 2 , β 1 (0) = δ, β m =0, 1 = 0. We choose δ = 10 −10 and we study the rate of exponential divergence in each period in the following way. At the end of the first period the initial distance √ 2δ has become d 1 . Before evolving over another period, we leave (. . . β m (n) . . .) unchanged and modify (. . . β m (n) . . .): We move it along the line joining it with (. . . β m (n) . . .) so that the distance between the two trajectories become again δ. We iterate this procedure over T periods obtaining the sequence of distances d 1 , d 2 , . . ., d T . The largest Lyapunov exponent is evaluated as the average rate of exponential divergence This quantity reaches a limit λ over a finite T whenever the phase space is a bounded set. The phase space is bounded for any finite M . For each choice of M we take T so that we have reached convergence, and so we find that λ decays as a power law with 2M +1 (Fig. 6). We also fit log(λ) versus log(2M +1) with a straight line and the slope of the fitting line is −0.91 ± 0.05. Therefore, in the limit M → ∞ the system should tend towards a regular behaviour with vanishing λ.
This looks like a paradox for a thermalizing system (see Fig. 2), but we should not worry so much. We plot λ(T ) versus T for different values of the truncation M in Fig. 7. We see that λ(T ) decreases as a power law with T , until it saturates to a plateau decreasing with 2M + 1 as a power law (see Fig. 6). For M → ∞ the power law decay would last forever: the phase space is not bounded, the dynamics wanders away towards m → ∞ and the system explores parts of the phase space with larger and larger m and smaller and smaller exponential divergence of the trajectories. This looks reasonable: the ratio of the perturbation inducing chaos and the unkicked part of the Hamiltonian is /m 2 , so for large m there is less chaos.
For any finite T , λ(T ) is nonvanishing because the dynamics is restricted to finite values of m and gives a measure of the average Lyapunov exponent in that range of m. We can see that when you probe parts of the phase space with larger m, you get a smaller value of the Lyapunov exponent. We notice that λ(T ) relaxes to the plateau at the same time when the dynamics saturates to the maximum possible attainable value of m and the energy in Fig. 2 attains the T = ∞ value at finite M .
An important information is that different time regimes in Fig. 7 correspond to different ranges of m: The larger T , the larger m, the smaller the largest Lyapunov exponent λ(T ). We can see this fact more explicitly if we plot λ(T ) versus m(T ) ≡ 2e(T )/k, as we do in Fig. 8. We find further confirmation of this fact if we evaluate the largest Lyapunov exponent considering different initializations. Beyond the one considered up to now, we take another one where for one trajectory β m (0) = 1/ √ 2M + 1 ∀ m and for the other In the second initialization scheme the energy expectation is equal to the T = ∞ value from the beginning and then the dynamics explores large values of m from the beginning. We show an example of the situation in Fig. 9. In the upper panel we show the evolution of λ(T ). We see that with the second initialization λ(T ) is small from the beginning, consistently with the fact that the relevant values of m are larger and our assumption of a λ m decreasing with m. For both initializations λ(T ) reaches the same limit, and this occurs at the time where the energy has reached the T = ∞ value for both the initializations (lower panel of Fig. 9).

N 1
A positive Lyapunov exponent has an important implication for the dynamics at N 1 but finite. The β m have fluctuations of order 1/ √ N has we can see in Eq. (6). Because of chaotic dynamics, this initial uncertainty increases exponentially fast in time with a rate given by the Lyapunov exponent. The larger m, the smaller the Lyapunov exponent, with a dependence λ m resembling the one in Fig. 8. So, the Gross-Pitaevskii equations Eq. (7) are valid until the initial uncertainty becomes of order 1. This occurs for a time t such that e λmt / √ N ∼ 1, that's to say For m → ∞ our results suggest that λ m → 0, and so in that limit the Gross-Pitaevskii equations are true for a time tending to infinity, even for finite N .

A. Nekhoroshev theorem in a nutshell
Nekhoroshev theorem [6] applies when an integrable system is perturbed with a term breaking the integrability. A classical integrable system is such if it has as many integrals of motion I j as degrees of freedom and these integrals of motion are in involution, this gives to the dynamics peculiar regularity properties [61]. Perturbing the integrable system with an integrability breaking term, these quantities are no more conserved and they depend on time: I j (t) deviate from their initial value. Let us call the strength of the perturbation. Nekhoroshev theorem says that there are two positive real numbers a, b such that |I for some positive constant C. This is a very important information because it tells us that for 1 the I j (t) are approximately conserved for a time exponentially large in the inverse perturbation, that's why the Nekhoroshev theorem is also called "classical prethermalization". We are going to show that it is valid also for our Gross-Pitaevskii equations.

B. Conserved quantities at = 0
Let us start focusing on = 0. In this case, Eqs. (7) coincide with the Schrödinger equation of the single kicked rotor in the momentum basis (see for instance [56]). Now we are going to show that this noninteracting model, if probed stroboscopically in time, shows infinitely many conserved quantities local in m. They can be constructed by means of the Floquet diagonalization of Eq. (7) at = 0. This procedure is equivalent to evaluating the single-rotor Floquet modes [48,49] in momentum basis. To find them, one gets from Eq. (28) the eigenvalue equationÛ 0 (1) |φ j (0) = e −iµj t |φ j (0) and solves it, expandingÛ 0 (1) and |φ j (0) in the momentum eigenbasis. More specifically, we are focusing here on the time immediately after the kick, so we consider the formÛ 0 (1 + ) = e −iK cosθ e −ip 2 /(2k) . The eigenvalues are of the form e −iµj ; the eigenvectors are the Floquet modes and they have the form V j ≡ . . . U m−1 j U m j U m+1 j . . . T (in the notation of Sec. IV C we have |φ j (0) = m U m j |m ). From one side we see that if we prepare the system in the condition β m (0) = U m j ∀ m, the evolution reduces to β m (n) = e −inµj U m j being V j a Floquet mode of the single-rotor dynamics. From the other side, we see that the quantity with n integer evolves as O j (n) = e iµj n O j (0), so that |O j | 2 is conserved, whatever are the initial conditions in the β m . We notice that the |O j | are local in m due to the Anderson localization of the Floquet states [23,24,29] Moreover, we can show that the |O j | are in involution, that's to say their Poisson bracket vanishes. This can be easily seen by evaluating the Poisson bracket {|O j | 2 , |O j | 2 } and showing that it vanishes by using the elementary Poisson brackets Eq. (10) and the orthonormality condition m U * m j U m j = δ j j . This is an important remark because it implies that the = 0 system is an integrable system in the classical Hamiltonian sense [61]. From one side the integrals of motion are as many as the degrees of freedom (imposing a truncation one sees that the j are as many as the m). From the other side the Poisson brackets of the integrals of motion vanish so that they are in involution. So, if we apply a perturbation to it, the Nekhoroshev theorem should be valid, as we are going to explicitly show now.

C. Nekhoroshev estimate for > 0
It is very interesting to study the time dependence of |O j (t)| >0 when > 0 is considered. Let us initialize the system with the condition β m (0) = 1/ √ 2M + 1. We consider We study the evolution in time of this quantity. In order to get rid of any influence of the initial state, we average over N diso initial conditions taken as β m (0) = e −iφm / √ 2M + 1 with φ m ∈ [0, 2π] uniformly distributed random variables and call the average δ j, (t). We show some examples of evolution of in Fig. 10. We see a initial power-law increase followed by a saturation to a plateau. We can fit the power law by means of a linear fit of the bilogarithmic plot log η j, (t) = A j, log t + B j, .
We average both A j, and B j, over j and get We plot A and B versus in Fig. 11. We see two important properties which will be important in the following. The first one is that A is almost constant in and equals ∼ 0.53. The second one, emphasised by the bilogarithmic plot, is that −B decays in as a power law, − B ∼ C/ a with a = −0.118 ± 0.006 . . . . (51) where a has been numerically obtained through the linear fit of the bilogarithmic plot. Being the errorbars quite small in relation to the values of A and B , we can focus on an average value of η defined as We give an estimate of the time when the deviation from the initial value gets significant enough. Consistently with the Nekhoroshev analysis, we estimate this time t * as the time when the condition η (t * ) = b is verified. If b > 0 is large enough, so that the resulting t * is still in the regime of power-law increase, using Eq. (49) we get the result We have noticed that A 0.53. So, let us choose any b > 0.6. Moreover, taking a C slightly larger than C in Eq. (51), we have −B ≤ C / a , as we can easily see in the lower panel of Fig. 11. In this way we have that η (t), the average deviation of the O j from the = 0 conserved values, is smaller or equal than b when where we have used Eq. (51) and defined C = C /0.52.

VII. CONCLUSIONS
In conclusion we have studied the dynamics of the infinite-range coupled quantum kicked rotors. By mapping it over a model of interacting bosons we have performed exact diagonalization for quite large system sizes and truncations. We have analyzed the average level spacing ratio and we have found a generalized tendency towards ergodicity for increasing system sizes N , in agreement with previous analytical demonstrations.
Then we have moved to the thermodynamic limit where the model is described by a system of Gross-Pitaevskii equations which reduces to the Schrödinger equation of the non-interacting model when the magnitude of the interaction term vanishes. For = 0, these equations are equivalent to the dynamics of a single-rotor nonlinear effective Hamiltonian. This system gives rise to a power-law increase of the energy in time with exponent γ ∼ 2/3 in a wide range of parameters. We have been able to analytically explain this exponent by using a master equation approach based on the noisy behaviour of the nonlinear modulation of the kicking in the effective Hamiltonian. We have also applied a coarse graining in time and in momentum space, and thanks to the localization of the single-rotor Floquet states we were able to write a diffusion equation for the occupation probabilities of the momentum eigenstates whose solution gave rise to the exponent 2/3. Moreover, we have predicted that the nonlinear modulation of the kicking, squared and coarse-grained in time, showed a power-law time dependence of the form ∼ t −1/3 which we have numerically verified. Remarkably, considering a the kick modulated by a noise with properties invariant under time translation, we could get the diffusion of the energy found in [40].
We have shown that the Gross-Pitaveskij equations we have found are equivalent to a previously existing meanfield analysis of this model and we have studied chaos in them by means of the largest Lyapunov exponent. This quantity is a measure of chaos being an estimate of the rate of exponential divergence of nearby trajectories. We find that it decreases towards zero as a power law when looking to portions of the phase space with larger and larger momentum. From this fact follows that, for finite N , the time of validity of the Gross-Pitaevskii equations diverges with the momentum. Indeed, we have shown that this time is proportional to log N and inversely proportional to the rate of exponential divergence of the trajectories.
Later, we have considered the limit of = 0 and used Floquet theorem to construct integrals of motion of the stroboscopic dynamics. These integrals of motion are as many as the degrees of freedom and are in involution with each other, so the system is integrable in a classical sense. Taking 0 < 1 this integrability is broken and the integrals of motion deviate from their initial value. We have verified that they do so in a timescale exponentially large in the inverse and consistent with the predictions of the Nekhoroshev theorem.
Future perspectives include the study of the model where the infinite-range interactions are replaced by longrange power-law interactions. In this case the full permutation symmetry is broken and a strong qualitative change of the Hilbert space structure occurs [67] which might turn the subdiffusion into diffusion. On the other hand, one might expect that the 2/3-exponent powerlaw behaviour is preserved, as long as the mean-field description is valid and chaos still gives rise to local correlations in time. This long-range case could be numerically investigated by means of a matrix product state description and a time-dependent variational principle (MPS-TDVP, used e.g. in [68] for a different power-law interacting model). Moreover, it will be interesting to consider the case of nonuniform initialization, and see if the dynamics is described by a system of equations of the nonlinear Schrödinger equation form Eq. 13, which might still lead to energy subdiffusion. Other prospects of future work will be the analysis of other invariant subspaces of the Hamiltonian, the application of this analysis to the infinite-range coupled version of the kicked Bose-Hubbard chain [32], and the extension of the masterequation approach to other models.
In this appendix we discuss the bosonic mapping used in the text. This mapping was invented by [41] for a similar infinite-range coupled model. Let us consider a system of N rotors, and we explicitly work out the representation of the operator In the subspace even under all the permutation symmetries, we can indeed write the two parts of the Hamiltonian Eq. (1) as where V (θ) is defined in Eq. (2). We start considering the nested commutator in Eq. (32). ExpandingV in the Floquet-mode basis aŝ V = k l V k l (t) |φ k (t) φ l (t)| with V k l (t) time-periodic with period 1 we get where we have applied Eq. (28). Exploiting the orthonormality of the Floquet-mode basis we can evaluate in a lengthy but straightforward way the double commutator and get At this point we perform the coarse-graining average. We average over ∆t which lasts many periods and we assume that the µ k are incommensurate with the driving frequency 2π [43]. In this way we obtain Here |V k i | 2 does not depend on the coarse-grained time because |V k i (t)| 2 is periodic of period 1 and the averaging time ∆t spans many periods. Substituting this expression into Eq. (32) we get Eq. (33).