Exponentially slow heating in short and long-range interacting Floquet systems

Francisco Machado ,1 Gregory D. Kahanamoku-Meyer ,1 Dominic V. Else,2,3 Chetan Nayak,2,4 and Norman Y. Yao 1,5 1Department of Physics, University of California, Berkeley, California 94720, USA 2Department of Physics, University of California, Santa Barbara, California 93106, USA 3Department of Physics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA 4Station Q, Microsoft Research, Santa Barbara, California 93106-6105, USA 5Materials Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA

Periodic driving is a ubiquitous tool for the controlled manipulation of quantum systems. Classic examples abound in the context of magnetic resonance spectroscopy, where a broad class of dynamical decoupling pulse sequences have been developed to suppress unwanted interactions, not only within a system's own degrees of freedom, but also with an external environment [1][2][3][4][5][6][7]. Periodic driving has also become a staple in the engineering toolshed of both condensed matter and atomic physics, enabling the realization of topological insulators from nominally trivial band structures [8][9][10][11][12] and the generation of synthetic gauge fields for neutral atoms [13][14][15]. When a generic system with many degrees of freedom is periodically driven, it typically absorbs energy from the driving field and heats up to an infinite-temperature state [16][17][18][19][20][21][22][23][24][25], a process called thermalization [26]. However, when the driving frequency is high, the Floquet system can only absorb energy from the drive by creating multiple local excitations-an inefficient process that results in an extremely long thermalization time [27][28][29][30][31][32][33]. The system does eventually thermalize, but during the time interval before this occurs, it settles into a "prethermal" state [34][35][36][37][38] that exhibits the hallmarks of thermal equilibrium, albeit at a lower entropy than the true infinite-temperature thermal state. In this paper, we characterize and elucidate the mechanism of Floquet thermalization.
Using massively parallel Krylov subspace methods, we explore the late-time dynamics of periodically driven spin chains with both short and long-ranged interactions. In both cases, seminal recent results [28][29][30][31][32] have proven that the thermalization time, τ * , increases at least exponentially with the frequency of the drive [40]. We demonstrate that such bounds are indeed tight. To this end, our results are consistent with those of [41], which also observed slow heating; but additionally, by directly observing the exponential scaling of the thermalization time, we can extract the effective energy (a) As the driving frequency is increased, one observes an exponential enhancement in the timescale at which the system approaches infinite temperature as diagnosed by the energy density, D (0) eff /L → 0. Inset: For smaller system sizes, full thermalization to infinite temperature is never observed even at late times. (b) The same exponentially slow thermalization is seen in the timescale where the half-chain entanglement entropy reaches its infinite-temperature value, scale controlling the Floquet heating rate. This is enabled by going to sufficiently large system sizes such that there is a clear separation of scales between the local bandwidth, the driving frequency, and the global many-body bandwidth [42]; indeed, for small system sizes, moderate driving frequencies are already above the many-body bandwidth, and the system is trivially blocked from heating up to infinite temperature, Fig. 1(a) inset.
Moreover, we demonstrate that at high frequency, the halfchain entanglement entropy, S L/2 (t ), quickly reaches a plateau value consistent with a prethermal state before saturating to its infinite-temperature value at exponentially late times [28][29][30][31][32]. On this prethermal plateau, there is an emergent time-independent Hamiltonian, D eff , that is conserved and generates the time evolution of the system at stroboscopic times t = mT (where T is the period of the drive). For shortrange interactions, these observations are in direct agreement with recent rigorous proofs [28][29][30][31][32].
Intriguingly, for long-range interacting systems (with power law 1 < α < 2), we also observe the emergence of a time-independent D eff that generates the dynamics of the system until exponentially late times. This result extends beyond previous analytical bounds, where the lack of a Lieb-Robinson bound with a polynomial light cone for long-ranged systems d < α < 2d (where d is the spatial dimension) precluded the study of this regime. Besides suggesting the existence of tighter Lieb-Robinson bounds with a power-law light cone, our results are of particular relevance to experiments in isolated quantum optical systems of atoms, ions, and molecules, where strong interactions often take the form of long-range Coulomb, dipolar, or van der Waals couplings [43][44][45][46].
Model and probes. We analyze one-dimensional spin chains whose Floquet evolution is governed by a Hamiltonian with power-law interactions: Fig. 1(b), inset] induce a "bangbang" protocol as considered in previous studies on quantum thermalization [20,22,47], σ γ i are Pauli operators and ω = 2π/T is the driving frequency [28][29][30][31][32]. We also consider a short-range interacting model, H s (t ), realized by truncating the Ising interaction in H (t ) to nearest and next-nearest neighbor. Throughout this work we consider the param- The inclusion of nonzero J x and h x ensures the static part of the Hamiltonian is not trivially diagonal in the σ z i basis, controlling the timescale τ D eff of the approach to the prethermal plateau, as well as the local energy scale (for more details see Appendix F). We emphasize that our results are not sensitive to the particular choice of parameters nor to the details of our driving protocol, the nature of the couplings, and symmetries of the Hamiltonian, and the same phenomenology occurs with different choices of parameters and long-range interactions (for additional data see Appendices B, F, and G).
To characterize the Floquet thermalization dynamics, we begin with two diagnostics (Fig. 1). First, we use the increase of the energy averaged over a period of the drive: [48]; we note that D (0) eff is actually the first term in an expansion for the prethermal Hamiltonian, D eff =D (0) eff +D (1) eff /ω+D (2) eff /ω 2 + · · · , which contains a large but finite number of terms [28][29][30][31][32]. To set notation, let us also define D n eff as the truncation of D eff to nth order in 1/ω. As a second diagnostic, we investigate the growth of the half-chain entanglement entropy as a function of time: Exponentially slow thermalization. We directly compute the Floquet evolution of up to L = 26 spins using massively parallel Krylov subspace techniques [49][50][51]. We consider initial product states with spins polarized alongẑ and control the energy density of the initial state by varying the number of equally spaced domain walls that are present. We begin with the short-range model, H s (t ), and compute the time evolution of D (0) eff (t ) /L for L = 20 spins at a variety of driving frequencies (significantly larger than the local energy scales of the Hamiltonian but smaller than the global manybody bandwidth). Unlike with the small-size (L = 12) exactdiagonalization results, Fig. 1(a) inset, one observes a clear approach to infinite temperature ( D (0) eff /L → 0) at late times for a wide range of frequencies, Fig. 2(a), allowing us to study the effect of frequency in the infinite-temperature thermalization. Later in this work, we further verify that indeed L = 20 captures both the large system size thermalization dynamics (Fig. 5) and long-range nature of the interactions (Fig. 6). We define the thermalization time τ * En as the time at which the energy density is halfway from its initial value to its infinite-temperature value, D (0) eff (τ * En ) = 0.5 D (0) eff (t = 0) . For both low, Fig. 2(a), and high, Fig. 2(b), temperature initial states, one observes an exponential enhancement of τ * En as a function of increasing driving frequency.
To further probe the exponentially slow heating of the system, we investigate the growth of the half-chain entanglement entropy as a function of time. We expect the evolution of S L/2 (t ) to be characterized by three distinct regimes: an initial growth period beginning from S L/2 (0) = 0; an intermediate plateau where the entropy reaches its prethermal value, S P L/2 ; and a final plateau once the system has fully thermalized to infinite temperature, S T =∞ L/2 = [L ln(2) − 1]/2 [39]. This is indeed borne out by the numerics, Figs. 2(c) and 2(d). The timescale τ * S L/2 at which the entropy is halfway from its prethermal plateau value to its infinite-temperature value gives us an alternate estimation of the thermalization time , and has the virtue of not relying upon the choice of operator (such as D n eff ) used to probe the state of the system. For both low, Fig. 2(c), and high, Fig. 2(d), temperature initial states, one observes an exponentially long heating timescale consistent with that extracted from D (0) eff /L. To this end, Fig. 3(a) shows just how well τ * S L/2 fits an exponential dependence for a variety of different initial states. Let us emphasize that, as the system leaves the prethermal plateau and heats toward its final infinite-temperature state, the entanglement entropy closely follows the expected thermal value, suggesting that the system evolves between different global thermal states with respect to the prethermal Hamiltonian D eff (for additional data see Appendix D). There is a second timescale in the problem, namely, the time τ D eff at which the entanglement entropy reaches its prethermal plateau value S P L/2 as depicted in Figs. 2(c) and 2(d). This is the time at which the system globally establishes the prethermal equilibrium-like Gibbs state of D eff and is expected to be greater than the local thermalization time of D eff by a factor of order ∼L. The value of the plateau entropy, S P L/2 , depends on the inverse temperature of the prethermal ensemble, β eff , which in turn can be directly estimated using the energy density, , of the initial state: To quantitatively verify this relationship, we perform imaginary-time evolution of random initial states (infinitetemperature-like states) in order to estimate the entanglement entropy of the thermal state (for details see Appendix D). In the case of short-range interactions, this approach predicts S P L/2 = 4.34 and S P L/2 = 5.13 for low and high temperature initial states, respectively, both in excellent agreement with the numerically observed plateau, Figs. 2(c) and 2(d).
We now turn to the long-range interacting model, H (t ), with power law α = 1.25, where we again compute D (0) eff (t ) /L and S L/2 (t ). As mentioned above, recent results have proven exponentially slow heating in Floquet systems with power-law interactions [28,29]. The intuition is that the system still requires many rearrangements, each with a few-body (albeit long-ranged) interaction [28,29], in order to absorb energy ω from the drive. Indeed, for both low, Figs. 2(e) and 2(g), and high, Figs. 2(f) and 2(h), temperature initial states, we observe exponentially slow heating times as a function of frequency, analogously to the short-range case.
A few remarks are in order. First, the approach of the entanglement entropy to its prethermal plateau can exhibit a "shoulder" with slow growth, which only flattens into a true plateau for larger frequencies. This phenomenon can occur for both short-range and long-range interactions, Figs. 2(d) and 2(g). Much like the short-range case, for long-range interactions the prethermal plateau is in excellent agreement with the value computed via imaginary-time evolution, S P L/2 = 4.31 and S P L/2 = 5.74 for low and high temperature initial states, respectively (for details see Appendix D). Second, while both the short and long-range systems exhibit exponentially slow thermalization, there is a clear quantitative difference between the heating rates in the two cases.
We directly extract the energy scale controlling the exponentially slow heating (i.e., the effective local bandwidth) by fitting τ * S L/2 to τ * S L/2 ∼ e ω/J eff , as depicted in Fig. 3. Motivated by the results in Fig. 2, we do not consider the ω = 6 data, as they do not exhibit an approach to the prethermal plateau for any initial state. In the case of short-range interactions, both low and high temperature initial states give J s eff ≈ 0.5 ± 0.1. For the long-range interacting model, one finds a larger value J eff ≈ 0.9 ± 0.1. Intriguingly, these heating rates yield a ratio, J eff /J s eff ≈ 1.8 ± 0.2, which is consistent with the ratio of the average strength of the Ising interactions emanating from each site, [ |i − j| −1.25 ]/[1 + 2 −1.25 ] ≈ 1.6. We note that the prefactor of the exponential in τ * is larger for initial states near the edges of the spectrum, which could arise from the smaller density of states there (for additional data see Appendix E).
Long-range prethermal effective Hamiltonian. We now demonstrate that the time-independent prethermal FIG. 3. The thermalization time τ * S L/2 as a function of driving frequency for both (a) short and (b) long-range interactions. The slope provides a direct estimate of J eff , the energy scale controlling the slow thermalization dynamics. The extracted J eff is largely independent of initial state (different colors) and is consistent with its interpretation as an effective local energy scale of the system. Initial states near the edge of the spectrum exhibit slightly larger τ * S L/2 , which can be qualitatively attributed to a reduction of the density of states at these energies.
Hamiltonian D eff is indeed the generator of Floquet dynamics at stroboscopic times up to τ * . Here, we focus on the more surprising long-range case, leaving the short-range case for Appendix H. Unlike the question of slow heating, a proof of the existence of a time-independent D eff that approximately generates the dynamics of local observables in the prethermal regime may need to employ Lieb-Robinson bounds, for which the tightest possible bounds for long-range interacting systems may not yet have been found for d < α < 2d [52][53][54][55][56][57][58]. As mentioned above, we observe not only the same exponentially slow approach to the maximum entropy [consistent with D (0) eff (t ) /L] but also the presence of a prethermal plateau (for both low and high temperature initial states), indicative of the existence of D eff even for long-range interacting systems, Figs. 2(g) and 2(h). Such result is not expected to hold generically in the d < α regime, where superextensivity of the energy and loss of locality leads to a breakdown of the analytical understanding of prethermalization (for details see Appendix G).
Further evidence for the existence of a time-independent D eff comes from comparing the system's evolution under the full Floquet unitary, to evolutions under truncations of the Magnus expansion: (2) eff /ω 2 + · · · at leading order (D 0 eff ), at second order (D 2 eff ), and at fourth order (D 4 eff ). In Fig. 4 Inspection reveals two essential features: a short-time plateau whose value depends on both n and ω, followed by linear growth at late times that seems to converge for the different truncation orders. To understand these features, we note that there are two contributions to δ n (t ). First, even at short times, one expects a finite discrepancy to arise simply from the fact that the nth-order Magnus approximation D n eff still differs from D eff (e.g., by terms such as D (n+1) eff /ω n+1 + D (n+2) eff /ω n+2 + · · · ). As a result, by either increasing n or ω, the effect of higher-order terms in the expansion is decreased, and so is the height of the early-time difference plateau. Indeed, measuring the plateau height h p as a function of frequency, we find that it is consistent with Second, since D eff approximates the full Floquet evolution only up to a timescale τ * ∼ e ω/J eff , one expects the exponentially slow accumulation of errors, δ ∼ te −ω/J eff . Indeed, this linear growth of δ n (t ) is observed, Fig. 4(a), enabling an independent extraction of J eff . In particular, by plotting the slope of the late-time growth of δ n (t ) as a function of the frequency, Fig. 4(b), one obtains J eff ≈ 0.88 ± 0.05 consistent with that calculated via the entanglement entropy in Fig. 3.
Discussion of numerical methods. Throughout our discussions, we have emphasized the importance of considering sufficiently large systems sizes to ensure that the thermalization behavior we observe is generic and indicative of the thermodynamic limit. Here, we expand upon this point and present additional results carefully quantifying finite-size effects in our numerics. At the same time, we detail the methodology 033202-4 .0, 0.19, 0.21, 0.17, 0.13, 1.25, 8} and a Néel-like initial state with a domain wall every two spins. After the system has approached the prethermal state (t > τ D eff ∼ 300), the dynamics for different system sizes L 18 converge to a simple exponential decay-the associated decay rate captures the thermalization timescale of the system. used to extract the various thermalization timescales and their associated uncertainties.
To this end, we consider the impact of system size on the thermalization dynamics of an initial product state (Fig. 5). Focusing on energy density D (0) eff (t ) /L and entanglement entropy S L/2 (t ), we observe significant finite-size effects for L < 18; even at moderate driving frequencies the system fails to exhibit a simple universal approach to infinite temperature and the late-time dynamics is characterized by the presence of fluctuations and a plateau. As one considers larger systems sizes, L 18, such features are greatly reduced and the dynamics converge; indeed, both quantities approach their infinite-temperature value as a simple exponential. The characterization of the dynamics via a single thermalization timescale demonstrates that our results capture the underlying heating dynamics due to the drive and are not limited by the finite system size. Additional data are summarized in Appendix C.
Having established that the system size considered in our analysis does not affect the observed thermalization dynamics, we now demonstrate that it is also large enough to capture the long-range nature of the interactions, Eq. (1). Unlike the short-range case, long-range interactions can induce a finitetemperature ordered phase in one-dimensional systems which leads to distinct late-time thermalization dynamics. In Fig. 6, we probe this distinction by studying the magnetization dynamics, , under the static Hamiltonian D (0) eff , where, for power law α < 2, a ferromagnetic phase exists near the edge of the spectrum [59]. Indeed, when considering α = 1.13, M(t ) remains nonzero at late timeswhich is consistent with an approach toward a spontaneously symmetry broken equilibrium state with nonzero net magnetization. In stark contrast, when considering the short-range case, M(t ) quickly decays to zero-which is consistent with the lack of an ordered phase. Such difference demonstrates that the dynamics are sensitive to the long-range nature of the interactions at the system size considered. Finally, we detail our methodology for measuring the different quantities extracted from our numerics: (1) Heating timescale τ * -For both energy density and entanglement entropy we measure the heating timescale τ * (τ * En and τ * S L/2 , respectively) as the time at which the quantity is midway between its prethermal and final, infinite-temperature value. To estimate the uncertainty, we also measure the times τ * min and τ * max at which each quantity is 35% and 65% between its prethermal and infinite-temperature value, respectively; the uncertainty is taken as the larger of the two deviations, |τ * − τ * min | and |τ * − τ * max |. (2) Late-time error slope m δ n -We divide the late-time linear regime into six equally sized regions, and within each region we perform a linear fit to δ n to extract a slope. The final value m δ n is given by the average of the six extracted slopes with an uncertainty given by twice the standard deviation.
(3) Initial error plateau h n p -Here h n p is given by the average deviation δ n within the time window between 2J −1 and 5J −1 , while the uncertainty is twice the standard deviation.
Conclusion. Despite their ubiquity, periodically driven Floquet systems have generally not shown distinct phases of matter. This is largely due to their tendency to heat up to infinite temperature, except in certain exceptional cases, such as free-fermion systems (e.g., topological insulators [8][9][10][11][12]) and strongly disordered one-dimensional (and possibly, two-dimensional) systems in the many-body localized phase [60][61][62][63][64][65]. In the high-frequency limit, however, we have shown that there is an exponentially long time interval during which a system may, as it would in true thermal equilibrium, realize phases of matter and phase transitions between them, including certain phases that do not exist in undriven systems [31,66,67]. In this Appendix we compute the prethermal effective Hamiltonian D eff for our periodically driven system, Eq. (1). This time-independent Hamiltonian is the approximate generator of stroboscopic time evolution until τ * . We obtain D eff by approximating the time evolution under one period, U f , by a truncated Magnus expansion, leading to a representation of D eff as a power series in the period of the drive T = 2π/ω.
Consider the time evolution under the Hamiltonian described in Eq. (1): where The term E can be thought of as a magnetic field with a square wave time profile in theŷ andẑ directions. As before, we define the analogous short-range model H s (t ) by truncating the Ising interaction to nearest and next-nearest neighbor which defines the short-range version of D eff . The evolution under a period can be succinctly written as U f can now be cast as the exponential of an effective Hamiltonian: Upon algebraic simplification, D eff can be computed as a sum of products of D and E : 25} and a Néel-like initial state with a domain wall every two spins. We complement the data in Fig. 5 by considering additional frequencies of the drive ω ∈ {7, 9}. As before, we observe that for L 18, the dynamics of the two quantities converges toward a simple exponential approach to their infinite-temperature value. (e), (f) Fitting to the decay rate for this late-time regime (t > τ D eff = 300) we extract a consistent thermalization rate for L 18 across the different frequencies.
evolution can be written as Eq. (A1)] and thus applies to both the short and long-range models.

APPENDIX B: PRETHERMALIZATION IN OTHER LONG-RANGE INTERACTING MODELS
So far in this work we have focused our analysis on the long-range Ising model, Eq. (1). To demonstrate the generality of the phenomena explored, we present further data on two different spin models, the long-range XYZ model and the longrange anisotropic XZ model. Despite being distinct from the long-range Ising model, similar parameter regimes lead to the same dynamical features for both models: a quick frequencyindependent approach to a prethermal plateau, followed by an exponentially late approach to the final, infinite-temperature state. Similarly to our previous analysis, in Fig. 7, we highlight these features by computing the dynamics of D (0) eff , the zeroth order term of the prethermal Hamiltonian, and S L/2 , the halfchain entanglement entropy.
In Figs. 7(a) and 7(b), we consider the long-range XYZ model: where where

APPENDIX C: FINITE-SIZE SCALING
In this Appendix, we present additional numerical data on the effect of finite size on the thermalization dynamics. In Fig. 8, we consider the thermalization dynamics for different system sizes and additional driving frequencies. Much like Fig. 5 (which corresponds to ω = 8), we observe that increasing the system size beyond L = 18 leads to a universal simple exponential approach of both energy density and entanglement entropy to their late-time infinite-temperature value. By extracting the timescale associated with this simple exponential approach, we observe that they are consistent across different system sizes for each frequency.

APPENDIX D: APPROACH TO THE GIBBS STATE
In this Appendix we provide further evidence that the entropy plateau observed corresponds to the Gibbs state of the prethermal Hamiltonian.
We begin by outlining the procedure by which we compute properties of the thermal Gibbs state. First, we obtain a pure state that is close to the infinite-temperature thermal state (for any local property). Appealing to quantum typicality we use a random state in the full Hilbert state [69]: where c n are complex Gaussian random variables. We confirm that the deviation of the expectation value of D (0) eff , S L/2 , and 033202-7 FIG. 9. Estimate of the thermal half-chain entanglement entropy, as computed using the imaginary-time evolution of 10 different random initial states. σ α i between |ψ 0 and the infinite-temperature state is always smaller than 5 × 10 −3 .
We then perform imaginary-time evolution under D (0) eff using Krylov subspace methods to obtain an approximation of the thermal state at inverse temperature β. We motivate this approach by noting that |ψ 0 will also correspond to a random vector in the energy basis. Performing imaginarytime evolution then reweighs the coefficients of each energy eigenstate by the correct Boltzmann factor: where c n are complex Gaussian random variables. As we "cool" the system, we further populate the low-energy states. Because the factor e −βE (E ), where (E ) is the density of states, is sharply peaked at the energy of the thermal ensemble at temperature β, the greatest contribution to the state's population will come from this value of the energy. By the eigenstate thermalization hypothesis, we expect this state to correctly capture the true thermal state at that energy density.
To ensure that the measured quantities are indeed robust, we average any quantity of interest across ten random initial states, and use the associated standard deviation as the uncertainty. We repeat the same procedure for the short-range version of D (0) eff . This yields the entanglement entropy for negative energy density (Fig. 9).
Using the initial energy density of the system (which is approximately conserved as the system approaches the prethermal state), one immediately obtains the expected entropy of the prethermal regime. These estimates are plotted in Fig. 2, in great agreement with the observed dynamics.
One can also explore the entire evolution of the system. In Fig. 10, we plot the path of the evolution in D (0) eff /L-S L/2 space, and compare it to the thermal entropy curve obtained in Fig. 9.
We can observe the early time approach to the prethermal state where the system's entropy increases at fixed energy density. Afterward, the heating from the drive increases both the energy density and the entropy of the system, as it ap-  Fig. 2, frequency ω = 9, and both short (a) and long (b) range models. The system begins in a product state, hence S L/2 (t = 0) = 0. Until it reaches the prethermal state, entropy increases while the energy density remains constant. Upon reaching the prethermal state, the slow thermalization toward the infinite-temperature state begins, increasing both entropy and energy density. We observe that the path followed during heating is close to the thermal entropy curve, suggesting that the state remains close to a Gibbs ensemble state as it approaches infinite temperature. The largest deviation occurs for the initial state of lower energy density in the long-range model. proaches the infinite-temperature state. During the heating, the system follows closely the thermal entropy, suggesting that it remains in a Gibbs ensemble throughout the entire heating. The greatest deviation occurs for the lower initial energy density in the long-range system. This behavior seems generic for low-energy states, while higher-energy states more closely follow the thermal entropy. We leave the study of this deviation for future work.

APPENDIX E: τ * AS A FUNCTION OF INITIAL ENERGY DENSITY
The existence of a prethermal regime has been proven as a lower bound in the timescale it takes the expectation value of the prethermal Hamiltonian to approach its infinitetemperature value [27,31,32]. The generality of this approach leaves an open question: what is the effect of different initial states in the thermalization timescale of a system? In this Appendix we attempt to shed some light onto this question by analyzing how the thermalization timescale, τ * , changes as a function of energy density of the initial state for both short and long-range interacting systems.
We estimate τ * in two different ways-using the evolution of the entanglement entropy, S L/2 (t ), and of the energy density, D (2) eff /L. First, we estimate τ * to be the time τ * S L/2 when the entanglement entropy is halfway between its prethermal plateau S P L/2 and its final value of [L ln(2) − 1]/2 [39]: We take S P L/2 as the value S L/2 (t ) when we observe the system has reached a plateau at frequency ω = 9 033202-8 FIG. 11. Entanglement entropy evolution for the full Hamiltonian for short and long-range models. The thermalization of the system with respect to D eff leads to the emergence of a plateau in the entanglement entropy. We define the time at which this plateau begins as τ D eff . Across the different initial states considered, we estimate this timescale for short and long-range models as τ s D eff = 300 and τ D eff = 200, respectively.
( Fig. 11): where we have used τ D eff = 300, 200 for the short and longrange models, respectively. Second, we estimate τ * to be the time τ * En when the energy is halfway between its initial value and its infinite-temperature value D n eff (t ) → 0: Equation (E3) contains the ambiguity as to which order one should consider D eff . Performing the analysis with different D n eff , one observes no change in the results, so we choose D 2 eff . We now analyze how τ * varies for different initial states. We consider initial product states with spins polarized alonĝ z and control the energy density by varying the number of equally spaced domain walls. In Figs. 12(a) and 12(c), we consider τ * for the short-range interacting system at different frequencies using both entanglement entropy, Fig. 12(a), and energy density, Fig. 12(c). In both cases we observe the qualitatively similar behaviors. As a function of frequency, we observe an exponential dependence across the entire set of initial states, as expected from the state-independent proofs [27,31,32]. We also observe no large dependence on the energy density, except near the center and at the edges of the spectrum.
For the former, the closeness of S P L/2 and the initial energy density to their infinite-temperature values limits our ability to correctly estimate τ * . For the latter, a lower density of states is expected to decrease the rate at which the system is able to absorb energy from the drive leading to an increase in τ * .
In Figs. 12(b) and 12(d), we perform the analogous analysis for the long-range interacting system. Again we observe the same qualitative behavior when estimating τ * using the entanglement entropy, Fig. 12(b), and the energy density, Fig. 12(d). Moreover, both short and long-range interacting systems present the same overall qualitative features. We note, however, two important differences between the two. In the long-range model there is a more pronounced increase in τ * En . In both cases we observe an overall independence of τ * on the initial state except near the center and edges of the spectrum, where we believe our estimation scheme and the change in density of states lead to the deviations, respectively. While the long-range model exhibits the same qualitative behavior as the short-range one, there is a larger increase of τ * at the edge of the spectrum and a smaller dependence with the frequency of the drive, in agreement with the analysis presented in Fig. 3. near the edges of the spectrum. This is in agreement with our understanding that this phenomenon arises from a density of states effect, since the long-range model has a smaller density of states near the edge of the spectrum. Moreover, the frequency has a smaller impact on τ * in the long-range model across the entire spectrum, which is consistent with the results presented in Fig. 3 for a few different initial states.

APPENDIX F: ROLE OF J x IN THE THERMALIZATION DYNAMICS
In this Appendix, we explore the role of J x in determining the thermalization timescales, τ D eff and τ * . Because τ D eff corresponds to the thermalization timescale of the system with respect to the prethermal Hamiltonian D eff , it is sensitive to the details of D eff , which, unfortunately, depends on the frequency of the drive, as per Eq. (A6). Nevertheless, much of the intuition can be obtained by considering its zeroth-order term, D (0) eff , which is frequency independent (here we focus on the long-range case, but our discussion carries over to the short-range interacting case too).
Given the form of the Hamiltonian in Eq. (1), the presence of nonzero J x and h x is crucial to break the integrability of the system and enable the system to approach the prethermal plateau. More specifically, the strength of J x (and also h x ) controls how far away from integrability D eff is, and thus should also control how fast the system will approach the equilibrium state. Indeed, by increasing J x , we observe the 033202-9 for initial state |↑↓↑↓↓↑↓↑↑↓↑↓↓↑↓↑↑↓↑↓ . We note that the qualitative features of the heating dynamics, captured by both energy density (top panels) and entanglement entropy (bottom panels), are the same at the different values of the coupling J x . The main distinction corresponds to the quantitative value of the different thermalization timescales, τ D eff and τ * . By increasing J x , D eff is farther away from integrability and a faster thermalization is expected; as a result, the system approaches the prethermal plateau at earlier times-τ D eff is smaller (bottom panels). At the same time, the local energy scale of D eff is increased, leading to a faster late-time approach to the infinite-temperature state; indeed we observe an increase of the extracted J eff from the exponentially slow thermalization. Despite the qualitative agreement with the physical picture developed in this work, the precise connection between microscopic coupling strengths and the extracted J eff remains elusive. Shaded early-time region is plot in a linear timescale to emphasize S L/2 (t = 0) = 0. decrease in timescale at which the system approaches the equilibrium state (Fig. 13).
At the same time, J x should also affect the late-time thermalization timescale τ * , as it modifies the local energy scale of the system. An increase in J x leads to an increase of the local energy scale and thus contributes to a faster approach to the infinite-temperature state. Although this behavior is already clear from the dynamics of the energy density and entanglement entropy, Fig. 13 left and middle panels, we quantify this result by studying the thermalization timescale of the entanglement entropy (following the analysis in Fig. 8). In the right panel of Fig. 13, we highlight the exponential dependence of the thermalization timescale in the frequency of the drive, as well as the increase in the extracted local energy scale J eff .

APPENDIX G: EXPONENTIALLY LONG THERMALIZATION FOR α < d
Throughout this work, we have focused on the regime d < α < 2d for the power law of the long-range interactions. In this Appendix we briefly discuss the regime of very long power laws, α < d, where the the energy becomes superextensive and locality breaks down. These two properties lead to the breakdown of the long-lived prethermal behavior: the first ensures that the local energy scale J local diverges in the thermodynamic limit, so the condition ω J local cannot be satisfied in the thermodynamic limit, while the second suggests that there cannot exist a prethermal Hamiltonian that correctly captures the dynamics of local observables (this fact has been used to explain the lack of such Hamiltonian in classical systems [70]).
However, for any finite-sized system, even at α < d, there is still a finite local energy scale which can control the energy absorption rate from the drive. As such, one expects the thermalization timescale to grow exponentially with the frequency of the drive. This intuition can be observed in simulations of the dynamics of energy density and entanglement entropy for α = 0.63 < d (Fig. 14). Unfortunately such behavior is a purely finite-size phenomenon, as increasing the system size leads to an ever-increasing local energy scale and thus an everdecreasing thermalization timescale. This can be observed by computing the dynamics of the system for different size L, Fig. 15 (left panel).  Fig. 2. We believe the extension of these results to the very long-range case α < d arises from finite-size effects that imbue the system with a finite local energy scale, as further evinced in Fig. 15. In the left panel, we consider the case of the unnormalized long-range interaction, which has a divergent local energy scale in the thermodynamic limit. Indeed we observe the reduction of the thermalization timescale, in agreement with the increase of the local energy scale. Crucially, when the Ising interaction strength is divided by the appropriate size scaling, L 1−α , which ensures the extensivity of the system, we observe a thermalization timescale that is independent of the system size-further emphasizing that the crucial ingredient for observing the exponentially long heating timescale is the existence of a finite local energy scale. To match thermalization timescales for the two cases, we set J = 3.0 ∼ 20 1−0.63 in the normalized dynamics.
To further corroborate this interpretation, we consider the case of the normalized interactions. More specifically, given a power law α < d, the local energy scale of the system diverges with system size as L 1−α ; by dividing the interaction strength by ∝ L 1−α , we obtain an extensive system with a finite local energy scale, even in the thermodynamic limit. For such interactions, the thermalization timescale should not depend on the size of the system size, as confirmed by explicit computations in Fig. 15 (right panel).
Finally, let us emphasize that, based on the current understanding of the prethermal regime, the lack of power-law light-cone Lieb-Robinson bounds for such systems precludes them from having an effective Hamiltonian in the prethermal regime. Unfortunately, the numerical investigation in this regime is much harder due to the sizable finite-size effects. We leave this study for future work.

APPENDIX H: PRETHERMAL EFFECTIVE HAMILTONIAN FOR SHORT-RANGE INTERACTING SYSTEM
In Fig. 4, we analyzed the role of D eff as the approximate generator of stroboscopic time evolution for the long-range interacting model. In this Appendix we supplement those results by studying the short-range model using O = D (0) eff /L as well as considering other local operators in both short and long-range models. Analogously to the results presented in Fig. 4  In Fig. 16(a) we observe the same qualitative behavior as in the long-range interacting system analyzed in Fig. 4(a). In particular we observe the same initial plateau originating from the difference between D n eff and D eff as well as the late-time linear regime. Immediately, one notices that for the same range of frequencies, the linear regime of δ n occurs at later times corresponding to a slower linear growth. In Figs. 16(b) and 16(c) we quantify these aspects by analyzing the frequency dependency of both the slope of the linear regime and the height of the plateaus. From the linear slope, we extract an effective interaction strength J s eff = 0.6 ± 0.1 which is in agreement with the results from the analysis presented in Fig. 3, J s eff = 0.5 ± 0.1. Regarding the plateau height we observe a power-law dependence with frequency h p ∼ ω −γ (n) similar to that in the long-range case, but with larger values of γ (n).
We now demonstrate that the results presented in Fig. 4 and Fig. 16 extend to other local operators of the system. In particular we will consider the operators σ z i , σ x i , σ z i σ z i+1 , and σ x i σ x i+1 at site i, by measuring the errors δ z i , δ x i , δ zz i , and δ xx i , respectively, defined between time evolution under D n eff and the full Hamiltonian. We then define δ z , δ x , δ zz , and δ xx as the average of the errors over all the sites of the chain. In analogy to Fig. 4 and Fig. 16, we observe the emergence of a late-time linear regime for all the considered operators, as shown in Fig. 17 (short-range) and Fig. 18 (long-range).  Fig. 4 and Fig. 16, we observe a late-time linear regime, corresponding to the linear accumulation of error. At early times we observe that a complex behavior arises from the thermalization dynamics to D n eff . For t > τ D eff , as extracted from Fig. 11, the error follows the same behavior as in Fig. 4(a) and Fig. 16(a) when O = D (0) eff /L. Despite the initial behavior, an increase in n leads to an earlier onset of the linear regime, as expected from D eff being the approximate generator of time evolution, consistently with previous theoretical results.
The rate of growth of the linear regime decreases with the increase of both n and ω, which is consistent with D eff being the approximate generator of time evolution. However, unlike the case when O = D (0) eff /L, the early-time behavior has a more complex structure. In these cases, we do not expect these local operators to be approximately conserved, so we observe different, operator-dependent thermalization dynamics.
To corroborate that the early-time behavior is due to differences in short-time thermalization dynamics, we estimate τ D eff as the time at which the system approaches S P L/2 . After FIG. 18. Difference in local operators, σ z i (a), σ x i (b), σ z i σ z i+1 (c), and σ x i σ x i+1 (d), when evolved under the full evolution and D n eff of the long-range model. We observe a qualitatively similar picture to the results presented in Fig. 17. This provides further evidence of D eff being the approximate generator of stroboscopic time evolution. Similarly to Fig. 17, the error dynamics becomes simpler for t > τ D eff , as extracted from Fig. 11. this time, we observe that most of the error is given by an initial plateau followed by a linear regime. For t > τ D eff , extracted in Fig. 11, the system has thermalized to D eff , so the error is dominated initially by the difference in the thermal expectation of O with respect to D eff or D n eff , until the linear growth in error from the difference between D eff and the full evolution dominates.
Finally, we emphasize that the agreement we observe for the long-range model in Fig. 18 between the evolution under the full Hamiltonian and the different orders of D eff at different frequencies provides further evidence of the existence of a prethermal effective Hamiltonian given by D eff that approximately describes the time evolution of our system, even though no formal proofs exist for power-law interactions at present.